Hi Andrew,

Thanks for sharing the SPIP, Does that mean the MERGE statement would be
deprecated? Also I think there was a small typo I have suggested in the
doc.

Regards,
Vaibhav

On Fri, Mar 27, 2026 at 10:15 AM DB Tsai <[email protected]> wrote:

> +1
>
> DB Tsai  |  https://www.dbtsai.com/  |  PGP 42E5B25A8F7A82C1
>
> On Mar 26, 2026, at 6:08 PM, Andreas Neumann <[email protected]> wrote:
>
> Hi all,
>
> I’d like to start a discussion on a new SPIP to introduce Auto CDC
> support to Apache Spark.
>
>    - SPIP Document:
>    
> https://docs.google.com/document/d/1Hp5BGEYJRHbk6J7XUph3bAPZKRQXKOuV1PEaqZMMRoQ/
>    -
>
>    JIRA: <https://issues.apache.org/jira/browse/SPARK-55668>
>    https://issues.apache.org/jira/browse/SPARK-5566
>
> Motivation
>
> With the upcoming introduction of standardized CDC support
> <https://issues.apache.org/jira/browse/SPARK-55668>, Spark will soon have
> a unified way to produce change data feeds. However, consuming these
> feeds and applying them to a target table remains a significant challenge.
>
> Common patterns like SCD Type 1 (maintaining a 1:1 replica) and SCD Type 2
> (tracking full change history) often require hand-crafted, complex MERGE
> logic. In distributed systems, these implementations are frequently
> error-prone when handling deletions or out-of-order data.
> Proposal
>
> This SPIP proposes a new "Auto CDC" flow type for Spark. It encapsulates
> the complex logic for SCD types and out-of-order data, allowing data
> engineers to configure a declarative flow instead of writing manual MERGE
> statements. This feature will be available in both Python and SQL.
> Example SQL:
> -- Produce a change feed
> CREATE STREAMING TABLE cdc.users AS
> SELECT * FROM STREAM my_table CHANGES FROM VERSION 10;
>
> -- Consume the change feed
> CREATE FLOW flow
> AS AUTO CDC INTO
>   target
> FROM stream(cdc_data.users)
>   KEYS (userId)
>   APPLY AS DELETE WHEN operation = "DELETE"
>   SEQUENCE BY sequenceNum
>   COLUMNS * EXCEPT (operation, sequenceNum)
>   STORED AS SCD TYPE 2
>   TRACK HISTORY ON * EXCEPT (city);
>
>
> Please review the full SPIP for the technical details. Looking forward to
> your feedback and discussion!
>
> Best regards,
>
> Andreas
>
>
>

Reply via email to