Re: Pulling Snapshots from Kafka, Log compaction last compact offset

2015-05-13 Thread Jonathan Hodges
Very good points, Gwen. I hadn't thought of the Oracle Streams case of dependencies. I wonder if GoldenGate handles this better? The tradeoff of these approaches is that each RDBMS has its own proprietary way of exposing this CDC information. I guess GoldenGate can be a standard interface across RDBMSs, but

Re: Pulling Snapshots from Kafka, Log compaction last compact offset

2015-05-10 Thread Gwen Shapira
Hi Jonathan, I agree we can have topic-per-table, but some transactions may span multiple tables and will therefore get applied partially out of order. I suspect this can be a consistency issue and create a state that is different from the state in the original database, but I don't have good

Re: Pulling Snapshots from Kafka, Log compaction last compact offset

2015-05-10 Thread Hisham Mardam-Bey
With mypipe (MySQL - Kafka) we've had a similar discussion re: topic names and preserving transactions. At this point:
- Kafka topic names are configurable allowing for per db or per table topics
- transactions maintain a transaction ID for each event when published into Kafka
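
As a rough illustration of why a per-event transaction ID helps when topics are split per table, here is a minimal sketch of a consumer-side regrouping step. It does not use mypipe or a Kafka client; the event shape and the field names `txn_id`, `seq`, `table`, and `op` are assumptions made up for this example, not mypipe's actual message format.

```python
from collections import defaultdict


def group_by_transaction(events):
    """Regroup events read from several per-table topics into whole transactions.

    Each event is assumed (hypothetically) to carry a 'txn_id' and a 'seq'
    giving its position inside the transaction.
    """
    grouped = defaultdict(list)
    for event in events:
        grouped[event["txn_id"]].append(event)
    # Restore the original statement order within each transaction.
    return {
        txn_id: sorted(txn_events, key=lambda e: e["seq"])
        for txn_id, txn_events in grouped.items()
    }


# Events as they might arrive when two per-table topics are read interleaved.
events = [
    {"txn_id": "t1", "seq": 2, "table": "orders", "op": "insert"},
    {"txn_id": "t2", "seq": 1, "table": "users", "op": "update"},
    {"txn_id": "t1", "seq": 1, "table": "users", "op": "insert"},
]

transactions = group_by_transaction(events)
```

With the transaction ID attached, a consumer can at least detect that the `orders` and `users` events of `t1` belong together. Knowing when a transaction is complete would need extra information, such as an event count or a commit marker, which this sketch does not model.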

Re: Pulling Snapshots from Kafka, Log compaction last compact offset

2015-05-02 Thread Jonathan Hodges
Hi Gwen, As you said, I see Bottled Water and Sqoop managing slightly different use cases, so I don't see this feature as a Sqoop killer. However, I did have a question on your comment that the transaction log or CDC approach will have problems with very large, very active databases. I get that

Pulling Snapshots from Kafka, Log compaction last compact offset

2015-04-30 Thread Jan Filipiak
Hello Everyone, I am quite excited about the recent example of replicating PostgreSQL changes to Kafka. My view on the log compaction feature had always been a very sceptical one, but now, with its great potential exposed to the wide public, I think it's an awesome feature. Especially when

Re: Pulling Snapshots from Kafka, Log compaction last compact offset

2015-04-30 Thread Gwen Shapira
I feel a need to respond to the Sqoop-killer comment :) 1) Note that most databases have a single transaction log per db and in order to get the correct view of the DB, you need to read it in order (otherwise transactions will get messed up). This means you are limited to a single producer