Not everyone has it their way like Frank Sinatra. For various reasons, folks 
need changes in Cassandra duplicated to a Kafka topic for further processing, 
especially if the new system's owner doesn't own the whole platform.

There are various ways to do this, but each one has consequences you have to 
deal with.

1. Kafka Connect using Landoop's current source connector, which does “ALLOW 
FILTERING” on tables and sends the changes to a Kafka topic. Then you can 
process them using Kafka Streams, a Kafka Connect sink, or the Kafka Consumer 
API.

2. CDC to Kafka. If the CDC feed comes from commit logs, you may see 
duplicates, because every replica node logs the same mutation.

3. Triggers to Kafka. This is the only way I know of right now to send 
once-only messages to Kafka for every mutation that Cassandra receives. This 
could be problematic because you may fail to send a message to Kafka, and you 
only get one chance to send it.
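
To make the duplicate problem in option 2 concrete, here is a rough sketch of 
how a consumer might drop repeated copies of a mutation before publishing. It 
is not taken from any connector. The MutationDeduper class, the mutation 
fields (table, key, write_ts, node) and the bounded window are all assumptions 
made up for illustration.

```python
import hashlib
from collections import OrderedDict


class MutationDeduper:
    """Drops repeated copies of a mutation reported by different replicas.

    Hypothetical helper: it remembers only the last `max_entries`
    fingerprints, so a duplicate that arrives very late can still get through.
    """

    def __init__(self, max_entries=10000):
        self.max_entries = max_entries
        self._seen = OrderedDict()  # fingerprint -> None, oldest first

    @staticmethod
    def fingerprint(mutation):
        # Assumed identity of a mutation: table + partition key + write
        # timestamp. The reporting node is deliberately left out.
        raw = "{table}|{key}|{write_ts}".format(**mutation)
        return hashlib.sha1(raw.encode("utf-8")).hexdigest()

    def is_new(self, mutation):
        fp = self.fingerprint(mutation)
        if fp in self._seen:
            return False
        self._seen[fp] = None
        if len(self._seen) > self.max_entries:
            self._seen.popitem(last=False)  # evict the oldest fingerprint
        return True


def dedupe(mutations, deduper):
    """Return only the mutations the deduper has not seen before."""
    return [m for m in mutations if deduper.is_new(m)]
```

Two distinct writes to the same key with the same timestamp would collide 
under this fingerprint, so a real implementation would need a stronger 
identity than the one assumed here.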

Ideally you’ll want to do what Jon suggested and source the event from Kafka 
for all subsequent processes, rather than process in Cassandra and then create 
the event in Kafka.
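
A minimal sketch of that Kafka-first shape, using an in-memory list as a 
stand-in for a single-partition topic so it runs without a broker. EventLog, 
Consumer and the event fields are hypothetical names for illustration, not 
any Kafka client API.

```python
class EventLog:
    """In-memory stand-in for a Kafka topic with a single partition."""

    def __init__(self):
        self._events = []

    def append(self, event):
        self._events.append(event)
        return len(self._events) - 1  # offset of the appended event

    def read_from(self, offset):
        return self._events[offset:]


class Consumer:
    """Reads the log from its own offset and hands each event to a handler."""

    def __init__(self, log, handler):
        self.log = log
        self.handler = handler
        self.offset = 0

    def poll(self):
        batch = self.log.read_from(self.offset)
        for event in batch:
            self.handler(event)
            self.offset += 1  # advance only after the handler returns
        return len(batch)


# The application writes the event to the log once. Everything downstream,
# including the Cassandra write, consumes from the log independently.
log = EventLog()
cassandra_table = {}   # stand-in for a Cassandra table
notifications = []     # stand-in for any other subscriber


def write_to_cassandra(event):
    cassandra_table[event["key"]] = event["value"]


cassandra_writer = Consumer(log, write_to_cassandra)
notifier = Consumer(log, notifications.append)

log.append({"key": "user:42", "value": "active"})
cassandra_writer.poll()
notifier.poll()
```

Each consumer tracks its own offset, so the Cassandra writer and any other 
subscriber can fall behind or replay without affecting each other.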

Rahul Singh
Chief Executive Officer
m 202.905.2818

Anant Corporation
1010 Wisconsin Ave NW, Suite 250
Washington, D.C. 20007

We build and manage digital business technology platforms.
On Sep 10, 2018, 3:58 AM -0400, Dinesh Joshi <dinesh.jo...@yahoo.com.invalid>, 
wrote:
> > On Sep 9, 2018, at 6:08 AM, Jonathan Haddad <j...@jonhaddad.com> wrote:
> >
> > There may be some use cases for it.. but I'm not sure what they are.  It 
> > might help if you shared the use cases where the extra complexity is 
> > required?  When is writing to Cassandra, which then dedupes and writes to 
> > Kafka, a preferred design over using Kafka and simply writing to Cassandra?
>
> From my reading of the proposal, it seems to bring functionality similar to a 
> MySQL binlog-to-Kafka connector. This is useful for many applications that 
> want to be notified when certain (or any) rows change in the database, 
> primarily for an event-driven application architecture.
>
> Implementing this in the database layer means there is a standard approach to 
> getting a change notification stream. Downstream subscribers can then decide 
> which notifications to act on.
>
> LinkedIn's databus is similar in functionality - 
> https://github.com/linkedin/databus However, it is for heterogeneous datastores.
>
> > > On Thu, Sep 6, 2018 at 1:53 PM Joy Gao <j...@wepay.com.invalid> wrote:
> > > >
> > > >
> > > > We have a WIP design doc that goes over this idea in detail.
> > > >
> > > > We haven't sorted out all the edge cases yet, but would love to get some 
> > > > feedback from the community on the general feasibility of this 
> > > > approach. Any ideas/concerns/questions would be helpful to us. Thanks!
> > > >
>
> Interesting idea. I did go over the proposal briefly. I concur with Jon that 
> adding more use cases would clarify this feature's potential.
>
> Dinesh
