+1 (non-binding)

Thanks,
Cheng Pan



> On Mar 4, 2026, at 09:59, John Zhuge <[email protected]> wrote:
> 
> +1 (non-binding)
> 
> Thanks for the contribution!
> 
> 
> On Tue, Mar 3, 2026 at 5:50 PM Burak Yavuz <[email protected]> wrote:
>> +1!
>> 
>> On Tue, Mar 3, 2026 at 5:48 PM Szehon Ho <[email protected]> wrote:
>>> +1, look forward to it (non binding)
>>> 
>>> Thanks
>>> Szehon
>>> 
>>> On Tue, Mar 3, 2026 at 5:37 PM Anton Okolnychyi <[email protected]> wrote:
>>>> +1 (non-binding)
>>>> 
>>>> On Tue, Mar 3, 2026 at 5:07 PM Mich Talebzadeh <[email protected]> wrote:
>>>>> +1 
>>>>> 
>>>>> Dr Mich Talebzadeh,
>>>>> Data Scientist | Distributed Systems (Spark) | Financial Forensics & 
>>>>> Metadata Analytics | Transaction Reconstruction | Audit & Evidence-Based 
>>>>> Analytics
>>>>> 
>>>>> View my LinkedIn profile: 
>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>> 
>>>>> On Wed, 4 Mar 2026 at 00:57, Gengliang Wang <[email protected]> wrote:
>>>>>> Hi Spark devs,
>>>>>> 
>>>>>> I'd like to call a vote on the SPIP: Change Data Capture (CDC) Support
>>>>>> 
>>>>>> Summary: 
>>>>>> 
>>>>>> This SPIP proposes a unified approach to CDC by adding a CHANGES SQL 
>>>>>> clause and corresponding DataFrame/DataStream APIs that work across 
>>>>>> DSv2 connectors.
>>>>>> 
>>>>>> 1. Standardized User API
>>>>>> 
>>>>>> SQL:
>>>>>> 
>>>>>> -- Batch: What changed between version 10 and 20?
>>>>>> SELECT * FROM my_table CHANGES FROM VERSION 10 TO VERSION 20;
>>>>>> 
>>>>>> -- Streaming: Continuously process changes
>>>>>> CREATE STREAMING TABLE cdc_sink AS
>>>>>> SELECT * FROM STREAM my_table CHANGES FROM VERSION 0;
>>>>>> 
>>>>>> DataFrame API:
>>>>>> 
>>>>>> spark.read
>>>>>>   .option("startingVersion", "10")
>>>>>>   .option("endingVersion", "20")
>>>>>>   .changes("my_table")
>>>>>> 
>>>>>> 2. Engine-Level Post Processing
>>>>>> 
>>>>>> Under the hood, this proposal introduces a minimal Changelog interface 
>>>>>> for DSv2 connectors. Spark's Catalyst optimizer will take over the CDC 
>>>>>> post-processing, including:
>>>>>> 
>>>>>> Filtering out copy-on-write carry-over rows.
>>>>>> Deriving pre-image/post-image updates from raw insert/delete pairs.
>>>>>> Computing net changes.
>>>>>> 
>>>>>> 
>>>>>> Relevant Links:
>>>>>> SPIP Doc: 
>>>>>> https://docs.google.com/document/d/1-4rCS3vsGIyhwnkAwPsEaqyUDg-AuVkdrYLotFPw0U0/edit?usp=sharing
>>>>>> Discuss Thread: 
>>>>>> https://lists.apache.org/thread/dhxx6pohs7fvqc3knzhtoj4tbcgrwxts
>>>>>> JIRA: https://issues.apache.org/jira/browse/SPARK-55668
>>>>>> 
>>>>>> The vote will be open for at least 72 hours. Please vote:
>>>>>> 
>>>>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>> 
>>>>>> [ ] +0
>>>>>> 
>>>>>> [ ] -1: I don't think this is a good idea because ...
>>>>>> 
>>>>>> 
>>>>>> Thanks,
>>>>>> Gengliang Wang
> 
> 
> 
> --
> John Zhuge
