Really excited to see this hit the ML, James.

As author of the base CDC (get your stones ready for throwing :D) and someone
moderately involved in the CEP here, I definitely welcome any questions. CDC is a
*thorny problem* in a multi-replica distributed system like this.

On Fri, Sep 27, 2024, at 5:40 PM, James Berragan wrote:
> Hi everyone,
> 
> Wiki: 
> https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-44%3A+Kafka+integration+for+Cassandra+CDC+using+Sidecar
> 
> We would like to propose this CEP for adoption by the community.
> 
> CDC is a common technique in databases but right now there is no 
> out-of-the-box solution to do this easily and at scale with Cassandra. Our 
> proposal is to build a fully-fledged solution into the Apache Cassandra 
> Sidecar. This comes with a number of benefits:
> - Sidecar is an official part of the existing Cassandra eco-system.
> - Sidecar runs co-located with Cassandra instances and so scales with the 
> cluster size.
> - Sidecar can access the underlying Cassandra database to store CDC 
> configuration and the CDC state in a special table.
> - Running in the Sidecar does not require additional external resources to 
> run.
> 
> We anticipate the core CDC module will be pluggable and re-usable; it is 
> available for review here: 
> https://github.com/apache/cassandra-analytics/pull/87. The remaining Sidecar 
> code will follow.
> 
> As a reminder, please keep the discussion here on the dev list vs. in the 
> wiki, as we’ve found it easier to manage via email.
> 
> Sincerely,
> James Berragan
> Bernardo Botella Corbi
> Yifan Cai
> Jyothsna Konisa
