+1 DB Tsai | https://www.dbtsai.com/ | PGP 42E5B25A8F7A82C1
> On Mar 4, 2026, at 3:44 PM, Dongjoon Hyun <[email protected]> wrote: > > +1 > > Dongjoon. > > On 2026/03/04 23:39:58 Qiegang Long wrote: >> +1 >> >> Qiegang >> >> On Tue, Mar 3, 2026 at 7:57 PM Gengliang Wang <[email protected]> wrote: >> >>> Hi Spark devs, >>> >>> I'd like to call a vote on the SPIP*: Change Data Capture (CDC) Support* >>> >>> *Summary:* >>> >>> This SPIP proposes a unified approach by adding a CHANGES SQL clause and >>> corresponding DataFrame/DataStream APIs that work across DSv2 connectors. >>> >>> 1. Standardized User API >>> >>> SQL: >>> >>> -- Batch: What changed between version 10 and 20? >>> >>> SELECT * FROM my_table CHANGES FROM VERSION 10 TO VERSION 20; >>> >>> -- Streaming: Continuously process changes >>> >>> CREATE STREAMING TABLE cdc_sink AS >>> >>> SELECT * FROM STREAM my_table CHANGES FROM VERSION 0; >>> >>> DataFrame API: >>> >>> spark.read >>> >>> .option("startingVersion", "10") >>> >>> .option("endingVersion", "20") >>> >>> .changes("my_table") >>> >>> 2. Engine-Level Post Processing Under the hood, this proposal introduces >>> a minimal Changelog interface for DSv2 connectors. Spark's Catalyst >>> optimizer will take over the CDC post-processing, including: >>> >>> - >>> >>> Filtering out copy-on-write carry-over rows. >>> - >>> >>> Deriving pre-image/post-image updates from raw insert/delete pairs. >>> - >>> >>> Computing net changes. >>> >>> >>> *Relevant Links:* >>> >>> - *SPIP Doc: * >>> >>> https://docs.google.com/document/d/1-4rCS3vsGIyhwnkAwPsEaqyUDg-AuVkdrYLotFPw0U0/edit?usp=sharing >>> - *Discuss Thread: * >>> https://lists.apache.org/thread/dhxx6pohs7fvqc3knzhtoj4tbcgrwxts >>> - *JIRA: *https://issues.apache.org/jira/browse/SPARK-55668 >>> >>> >>> *The vote will be open for at least 72 hours. *Please vote: >>> >>> [ ] +1: Accept the proposal as an official SPIP >>> >>> [ ] +0 >>> >>> [ ] -1: I don't think this is a good idea because ... >>> >>> Thanks, >>> Gengliang Wang >>> >> > > --------------------------------------------------------------------- > To unsubscribe e-mail: [email protected] >
signature.asc
Description: Message signed with OpenPGP
