I fully agree with your idea. A general, pluggable CDC framework in Spark would fill a real gap for integrating operational databases with lakehouse formats using Structured Streaming.
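To make the sink side of that concrete, here is a rough sketch in plain Python (deliberately not Spark API) of what "applying" a change stream to a keyed table could mean. Everything in it is an assumption for illustration: the `ChangeEvent` shape, the `apply_changes` helper, and the `seq` ordering column (standing in for something like a log sequence number) are hypothetical names, not an existing or proposed interface.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record emitted by a pluggable CDC source."""
    op: str                               # "insert", "update", or "delete"
    key: int                              # primary key of the affected row
    seq: int                              # assumed non-negative ordering column
    row: Optional[Dict[str, Any]] = None  # new row image; None for deletes


def apply_changes(events: Iterable[ChangeEvent]) -> Dict[int, Dict[str, Any]]:
    """Fold change events into the latest row per key (latest seq wins)."""
    state: Dict[int, Dict[str, Any]] = {}
    last_seq: Dict[int, int] = {}
    for ev in events:
        # Skip duplicates and late arrivals that are older than what we applied.
        if ev.seq <= last_seq.get(ev.key, -1):
            continue
        last_seq[ev.key] = ev.seq
        if ev.op == "delete":
            state.pop(ev.key, None)
        else:
            state[ev.key] = ev.row
    return state
```

In a real implementation the same idea would presumably be expressed as a keyed upsert/delete against the Iceberg or Delta sink inside each micro-batch; the sketch only shows the ordering and delete handling that the framework would have to get right.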
I also believe it should integrate seamlessly with declarative pipelines, allowing users to declare intent (source, tables, sink, apply semantics) while Spark manages the underlying streaming jobs.

On Sat, 28 Feb 2026, 15:20, Nimrod Ofek <[email protected]> wrote:

> I think that one is only for Delta tables. I mean something more general,
> with multiple pluggable sources, like Flink CDC, supporting CDC for SQL
> Server, MySQL, PostgreSQL, Delta and Iceberg for starters.
> I think probably processing them with something like Spark Structured
> Streaming, supporting CDC for various data sources and general databases.
>
> While Iceberg and Delta can be read from various engines, other data
> sources like MySQL, SQL Server etc. can't, so to share such tables one needs
> to have an easy way to transform those tables to Iceberg/Delta for data
> lakes (you can't read it all the time from the operational database).
>
> Thanks,
> Nimrod
>
> On Sat, 28 Feb 2026, 15:54, Ángel Álvarez Pascua <
> [email protected]> wrote:
>
>> You mean something like AutoCDC from Databricks?
>> https://docs.databricks.com/aws/en/ldp/cdc
>>
>> On Sat, 28 Feb 2026, 10:47, Nimrod Ofek <[email protected]> wrote:
>>
>>> Hi all,
>>>
>>> I would like to start a discussion about the possibility of
>>> implementing a Change Data Capture (CDC) feature within Apache Spark,
>>> similar to the existing, competing Flink CDC functionality
>>> <https://nightlies.apache.org/flink/flink-cdc-docs-master/docs/connectors/flink-sources/overview/>
>>> .
>>>
>>> I believe integrating such a feature would significantly enhance Spark's
>>> capabilities for real-time data integration and ETL processes. I would
>>> appreciate the opportunity to discuss how we might approach this proposal.
>>>
>>> Thank you for your time and consideration.
>>>
>>> Best regards,
>>> Nimrod
