+1 (non-binding)

-- 
LNC

On Fri, Apr 3, 2026, 5:15 PM Shixiong Zhu <[email protected]> wrote:

> +1
>
>
> On Fri, Apr 3, 2026 at 5:03 PM Mich Talebzadeh <[email protected]>
> wrote:
>
>> +1
>>
>> Dr Mich Talebzadeh,
>> Data Scientist | Distributed Systems (Spark) | Financial Forensics &
>> Metadata Analytics | Transaction Reconstruction | Audit & Evidence-Based
>> Analytics
>>
>>    view my Linkedin profile
>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>
>>
>>
>>
>>
>> On Fri, 3 Apr 2026 at 23:00, Andreas Neumann <[email protected]> wrote:
>>
>>> Hi Spark devs,
>>>
>>> I'd like to call a vote on the SPIP: *Auto CDC Support for Apache Spark*
>>>
>>> Motivation
>>>
>>> With the upcoming introduction of standardized CDC support
>>> <https://issues.apache.org/jira/browse/SPARK-55668>, Spark will soon
>>> have a unified way to produce change data feeds. However, consuming these
>>> feeds and applying them to a target table remains a significant challenge.
>>>
>>> Common patterns like SCD Type 1 (maintaining a 1:1 replica) and SCD
>>> Type 2 (tracking full change history) often require hand-crafted,
>>> complex MERGE logic. In distributed systems, these implementations are
>>> frequently error-prone when handling deletions or out-of-order data.
>>>
>>> Proposal
>>>
>>> This SPIP proposes a new "Auto CDC" flow type for Spark. It
>>> encapsulates the complex logic for SCD types and out-of-order data,
>>> allowing data engineers to configure a declarative flow instead of writing
>>> manual MERGE statements. This feature will be available in both Python
>>> and SQL.
>>>
>>> Example SQL:
>>>
>>> -- Produce a change feed
>>> CREATE STREAMING TABLE cdc.users AS
>>> SELECT * FROM STREAM my_table CHANGES FROM VERSION 10;
>>>
>>> -- Consume the change feed
>>> CREATE FLOW flow
>>> AS AUTO CDC INTO
>>>   target
>>> FROM stream(cdc_data.users)
>>>   KEYS (userId)
>>>   APPLY AS DELETE WHEN operation = "DELETE"
>>>   SEQUENCE BY sequenceNum
>>>   COLUMNS * EXCEPT (operation, sequenceNum)
>>>   STORED AS SCD TYPE 2
>>>   TRACK HISTORY ON * EXCEPT (city);
>>>
>>>
>>> *Relevant Links:*
>>>
>>>    - SPIP Document:
>>>      https://docs.google.com/document/d/1Hp5BGEYJRHbk6J7XUph3bAPZKRQXKOuV1PEaqZMMRoQ/
>>>    - Discussion Thread:
>>>      https://lists.apache.org/thread/j6sj9wo9odgdpgzlxtvhoy7szs0jplf7
>>>    - JIRA: <https://issues.apache.org/jira/browse/SPARK-55668>
>>>      https://issues.apache.org/jira/browse/SPARK-56249
>>>
>>> *The vote will be open for at least 72 hours.* Please vote:
>>>
>>> [ ] +1: Accept the proposal as an official SPIP
>>> [ ] +0
>>> [ ] -1: I don't think this is a good idea because ...
>>>
>>> Cheers -Andreas
>>>
>>>
>>>
