[ 
https://issues.apache.org/jira/browse/SAMZA-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509232#comment-14509232
 ] 

Martin Kleppmann commented on SAMZA-212:
----------------------------------------

I've finally got some usable code! I ended up implementing it as a 
Postgres-to-Kafka bridge, so Samza can integrate with it simply by 
consuming from Kafka.
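
As a rough illustration of what a downstream consumer of such a change stream might do, the sketch below replays change events into an in-memory copy of a table. It assumes each message is keyed by the row's primary key, carries the new row as its value, and uses a missing value to signal a delete. The event shape and the names `apply_change` and `replay` are hypothetical and not taken from the Bottled Water code; the Kafka client and message decoding are left out.

```python
from typing import Dict, Iterable, Optional, Tuple

Row = Dict[str, object]
# Assumed event shape: (primary key, new row), with None meaning "row deleted".
ChangeEvent = Tuple[str, Optional[Row]]


def apply_change(table: Dict[str, Row], event: ChangeEvent) -> None:
    """Apply one change event to an in-memory copy of the table."""
    key, row = event
    if row is None:
        # Delete: drop the row if we have it.
        table.pop(key, None)
    else:
        # Insert or update: the latest value for a key wins.
        table[key] = row


def replay(events: Iterable[ChangeEvent]) -> Dict[str, Row]:
    """Rebuild the table contents by applying events in order."""
    table: Dict[str, Row] = {}
    for event in events:
        apply_change(table, event)
    return table
```

In a real job, the loop in `replay` would be replaced by whatever delivers messages from the topic, for example a Samza task's message handler.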

Blog post describing the project: 
http://blog.confluent.io/2015/04/23/bottled-water-real-time-integration-of-postgresql-and-kafka/
Code: https://github.com/confluentinc/bottledwater-pg

The option to implement a direct Samza StreamConsumer for Postgres (without 
going via Kafka) is still there, but it would require writing JNI bindings for 
the Bottled Water client library. This is planned 
(https://github.com/confluentinc/bottledwater-pg/issues/2) but not implemented 
yet.

If anyone wants to give it a try, I'd love to hear how you get on!

> Explore using PostgreSQL changelog as input stream
> --------------------------------------------------
>
>                 Key: SAMZA-212
>                 URL: https://issues.apache.org/jira/browse/SAMZA-212
>             Project: Samza
>          Issue Type: New Feature
>            Reporter: Martin Kleppmann
>            Assignee: Martin Kleppmann
>              Labels: project
>
> It would be good to be able to capture changes to a PostgreSQL database, and 
> use them as an input to Samza (like SAMZA-200 but for Postgres instead of 
> MySQL).
> At a high level, it seems like the options are:
> * [Creating a trigger|https://gist.github.com/fbeauchamp/9879820], and using 
> [LISTEN|http://www.postgresql.org/docs/9.3/static/sql-listen.html]/[NOTIFY|http://www.postgresql.org/docs/9.3/static/sql-notify.html]
>  to send changes to an external process. 
> ([Example|https://www.evernote.com/shard/s17/sh/cfbad111-2725-41ef-bf4c-58ac13a026d6/1a6f422f7e64831d7b2d7995a49ee48e])
> * [Logical WAL 
> decoding|http://www.anarazel.de/2ndquadrant/pgcon-2013-05-23/], which has 
> been 
> [committed|http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=commitdiff;h=5eeedd55b2d7e53b5fdcdab6a8e74bb666d75bcc]
>  and should be in the upcoming Postgres 9.4 release.
> [@selenamarie|https://twitter.com/selenamarie] [points 
> out|https://twitter.com/selenamarie/status/450373310367268864]: "problem with 
> it is that LISTEN/NOTIFY are stateless. if you want reliable transport, will 
> need logical replication".
> [This Twitter 
> conversation|https://twitter.com/selenamarie/status/450360318204448768] has 
> some more detail.
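
The point about stateless notifications can be illustrated with a toy model. This is a simplified sketch, not PostgreSQL code: `NotifyChannel` and `ChangeLog` are hypothetical classes that only mimic the two delivery styles. A notification reaches only the listeners connected at the moment it is sent, whereas a retained log lets a consumer resume from the position it last read.

```python
from typing import List, Tuple


class NotifyChannel:
    """Stateless delivery: only currently connected listeners get a message."""

    def __init__(self) -> None:
        self.listeners: List[List[str]] = []

    def listen(self) -> List[str]:
        inbox: List[str] = []
        self.listeners.append(inbox)
        return inbox

    def unlisten(self, inbox: List[str]) -> None:
        # Compare by identity so that two empty inboxes are not confused.
        self.listeners = [l for l in self.listeners if l is not inbox]

    def notify(self, payload: str) -> None:
        for inbox in self.listeners:
            inbox.append(payload)


class ChangeLog:
    """Stateful delivery: entries are retained and read by position."""

    def __init__(self) -> None:
        self.entries: List[str] = []

    def append(self, payload: str) -> None:
        self.entries.append(payload)

    def read_from(self, position: int) -> List[str]:
        return self.entries[position:]


def demo() -> Tuple[List[str], List[str]]:
    channel, log = NotifyChannel(), ChangeLog()

    def write(change: str) -> None:
        channel.notify(change)
        log.append(change)

    inbox = channel.listen()
    from_log: List[str] = []

    write("insert 1")
    from_log += log.read_from(len(from_log))

    channel.unlisten(inbox)       # the consumer goes offline
    write("update 1")             # this change is sent while it is away
    new_inbox = channel.listen()  # the consumer reconnects

    write("delete 1")
    from_log += log.read_from(len(from_log))  # resume from the saved position

    return inbox + new_inbox, from_log
```

Calling `demo()` returns the changes seen through each path: the notification path misses the change written while the consumer was disconnected, while the log path sees all three.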



