[ 
https://issues.apache.org/jira/browse/SAMZA-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Kleppmann updated SAMZA-200:
-----------------------------------

    Assignee:     (was: Martin Kleppmann)

> Explore using MySQL changelog as input stream
> ---------------------------------------------
>
>                 Key: SAMZA-200
>                 URL: https://issues.apache.org/jira/browse/SAMZA-200
>             Project: Samza
>          Issue Type: New Feature
>            Reporter: Martin Kleppmann
>
> Samza is designed with good support for database changelogs, but the current 
> open source release is mostly centered around Kafka. It would be good to have 
> out-of-the-box support for some common databases, such as MySQL, as well.
> [Databus|http://www.socc2012.org/s18-das.pdf?attredirects=0] is LinkedIn's 
> change capture tool, but the current open source release focuses mainly on 
> Oracle. There is an open source release of [Databus for 
> MySQL|https://github.com/linkedin/databus/wiki/Databus-for-MySQL], but it's a 
> proof-of-concept implementation, not the one used by LinkedIn in production. 
> (The one used by LinkedIn requires a patched version of MySQL.) The open 
> source Databus uses [Open 
> Replicator|https://code.google.com/p/open-replicator/] to connect to a MySQL 
> server as a slave, and parses the binlog to find any inserts, updates or 
> deletes.
> I played around a bit with Open Replicator today and got it working: a 
> small Scala program that receives a real-time feed of all changes happening 
> in a MySQL database. However, I have some doubts about the quality of the 
> library (the code is not very good, it has only very cursory tests, the 
> original maintainer hasn't touched it for 18 months, and there are reports of 
> nasty bugs, e.g. blowing up on any negative number). There don't seem to be 
> any better Java binlog parsers out there. But I did skim the source of Open 
> Replicator, and it's not too complicated, so it seems quite feasible to write 
> a MySQL binlog parser ourselves.
> This is still very much at an exploratory stage, but I think it could be 
> really cool to have database changelog support easily available in Samza.
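
To make the idea in the description concrete, here is a rough sketch of how row changes decoded from a binlog (inserts, updates, deletes) might be turned into a keyed changelog stream. It is only an illustration under stated assumptions: `RowChange`, `ChangelogMessage`, `to_changelog`, and `replay` are hypothetical names invented for this sketch, not part of Open Replicator, Databus, or Samza, and the convention of representing a delete as a message with an empty value is an assumption rather than something the issue specifies.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Iterator, Optional, Tuple

Row = Dict[str, Any]


@dataclass(frozen=True)
class RowChange:
    """Hypothetical decoded binlog row event (not a real library type)."""
    table: str
    kind: str                      # "insert", "update" or "delete"
    before: Optional[Row] = None   # row image before the change (None for inserts)
    after: Optional[Row] = None    # row image after the change (None for deletes)


@dataclass(frozen=True)
class ChangelogMessage:
    """Hypothetical keyed message; value None marks a deleted row (assumed convention)."""
    key: Tuple[str, Any]           # (table name, primary key value)
    value: Optional[Row]


def to_changelog(changes: Iterable[RowChange],
                 primary_keys: Dict[str, str]) -> Iterator[ChangelogMessage]:
    """Map row changes to messages keyed by (table, primary key)."""
    for change in changes:
        pk = primary_keys[change.table]
        if change.kind == "insert":
            yield ChangelogMessage((change.table, change.after[pk]), change.after)
        elif change.kind == "update":
            # If the primary key itself changed, retire the old key first.
            if change.before is not None and change.before[pk] != change.after[pk]:
                yield ChangelogMessage((change.table, change.before[pk]), None)
            yield ChangelogMessage((change.table, change.after[pk]), change.after)
        elif change.kind == "delete":
            yield ChangelogMessage((change.table, change.before[pk]), None)
        else:
            raise ValueError(f"unknown change kind: {change.kind!r}")


def replay(messages: Iterable[ChangelogMessage]) -> Dict[Tuple[str, Any], Row]:
    """Rebuild the latest table contents by applying messages in order."""
    state: Dict[Tuple[str, Any], Row] = {}
    for msg in messages:
        if msg.value is None:
            state.pop(msg.key, None)
        else:
            state[msg.key] = msg.value
    return state
```

Keying each message by table and primary key means a consumer can rebuild the current contents of a table by replaying the stream in order, which is the property that makes a database changelog usable as an input stream.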



--
This message was sent by Atlassian JIRA
(v6.2#6252)
