[
https://issues.apache.org/jira/browse/SAMZA-212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14509232#comment-14509232
]
Martin Kleppmann commented on SAMZA-212:
----------------------------------------
I've finally got some usable code! I ended up implementing it as a
Postgres-to-Kafka bridge, so integrating with Samza is simply a matter of
consuming from Kafka.
Blog post describing the project:
http://blog.confluent.io/2015/04/23/bottled-water-real-time-integration-of-postgresql-and-kafka/
Code: https://github.com/confluentinc/bottledwater-pg
The option to implement a direct Samza SystemConsumer for Postgres (without
going via Kafka) is still there, but it'll require writing JNI bindings for
the Bottled Water client library. This is planned
(https://github.com/confluentinc/bottledwater-pg/issues/2) but not implemented
yet.
If anyone wants to give it a try, I'd love to hear how you get on!
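
To make "consuming from Kafka" a little more concrete, here is a minimal
sketch of what a downstream consumer might do with a stream of row changes:
fold them into the latest state per primary key. The event shape used here
(a key plus a row, with None standing for a delete) is an assumption made for
illustration, not Bottled Water's documented wire format, and the name
apply_changes is hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical change event: (primary key, row), where row is None for a delete.
# This shape is assumed for illustration only.
ChangeEvent = Tuple[str, Optional[dict]]


def apply_changes(events: Iterable[ChangeEvent]) -> Dict[str, dict]:
    """Fold an ordered stream of change events into the latest row per key."""
    table: Dict[str, dict] = {}
    for key, row in events:
        if row is None:
            # Delete: drop the key, ignoring keys that were never seen.
            table.pop(key, None)
        else:
            # Insert or update: the most recent row for a key wins.
            table[key] = row
    return table


if __name__ == "__main__":
    events = [
        ("1", {"name": "alice"}),
        ("2", {"name": "bob"}),
        ("1", {"name": "alice smith"}),  # update to key 1
        ("2", None),                     # delete of key 2
    ]
    print(apply_changes(events))
```

In a real Samza job the same fold would live in the task's message handler,
with the state kept in a local store rather than a plain dict.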
> Explore using PostgreSQL changelog as input stream
> --------------------------------------------------
>
> Key: SAMZA-212
> URL: https://issues.apache.org/jira/browse/SAMZA-212
> Project: Samza
> Issue Type: New Feature
> Reporter: Martin Kleppmann
> Assignee: Martin Kleppmann
> Labels: project
>
> It would be good to be able to capture changes to a PostgreSQL database and
> use them as an input to Samza (like SAMZA-200 but for Postgres instead of
> MySQL).
> At a high level, it seems like the options are:
> * [Creating a trigger|https://gist.github.com/fbeauchamp/9879820], and using
> [LISTEN|http://www.postgresql.org/docs/9.3/static/sql-listen.html]/[NOTIFY|http://www.postgresql.org/docs/9.3/static/sql-notify.html]
> to send changes to an external process.
> ([Example|https://www.evernote.com/shard/s17/sh/cfbad111-2725-41ef-bf4c-58ac13a026d6/1a6f422f7e64831d7b2d7995a49ee48e])
> * [Logical WAL
> decoding|http://www.anarazel.de/2ndquadrant/pgcon-2013-05-23/], which has
> been
> [committed|http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=commitdiff;h=5eeedd55b2d7e53b5fdcdab6a8e74bb666d75bcc]
> and should be in the upcoming Postgres 9.4 release.
> [@selenamarie|https://twitter.com/selenamarie] [points
> out|https://twitter.com/selenamarie/status/450373310367268864]: "problem with
> it is that LISTEN/NOTIFY are stateless. if you want reliable transport, will
> need logical replication".
> [This Twitter
> conversation|https://twitter.com/selenamarie/status/450360318204448768] has
> some more detail.
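
The point quoted above about LISTEN/NOTIFY being stateless can be illustrated
with a toy simulation. This is a sketch in plain Python, not PostgreSQL code:
NotifyChannel and ChangeLog are hypothetical stand-ins for a notification
channel and for a retained change log with a confirmed read position (roughly
what logical replication offers). It shows only why a consumer that
disconnects loses changes in the first model and not in the second.

```python
class NotifyChannel:
    """Stateless delivery: a change reaches only listeners connected right now."""

    def __init__(self):
        self.listeners = []

    def listen(self, callback):
        self.listeners.append(callback)

    def unlisten(self, callback):
        self.listeners.remove(callback)

    def notify(self, payload):
        for callback in list(self.listeners):
            callback(payload)


class ChangeLog:
    """Stateful delivery: changes are retained until the consumer confirms them."""

    def __init__(self):
        self.entries = []
        self.confirmed = 0  # number of entries the consumer has acknowledged

    def append(self, payload):
        self.entries.append(payload)

    def read_pending(self):
        return self.entries[self.confirmed:]

    def confirm(self, count):
        self.confirmed += count


def simulate():
    channel, log = NotifyChannel(), ChangeLog()
    seen_via_notify, seen_via_log = [], []
    on_notify = seen_via_notify.append

    def write(change):
        # Every database change goes to both delivery mechanisms.
        channel.notify(change)
        log.append(change)

    def drain_log():
        pending = log.read_pending()
        seen_via_log.extend(pending)
        log.confirm(len(pending))

    channel.listen(on_notify)
    write("insert 1")
    drain_log()

    # The consumer goes away while two more changes are written.
    channel.unlisten(on_notify)
    write("update 1")
    write("delete 1")

    # The consumer comes back and one more change is written.
    channel.listen(on_notify)
    write("insert 2")
    drain_log()

    return seen_via_notify, seen_via_log


if __name__ == "__main__":
    via_notify, via_log = simulate()
    print("notify:", via_notify)
    print("log:   ", via_log)
```

The notification consumer never learns about the changes made while it was
away, whereas the log consumer resumes from its confirmed position and sees
all of them.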