To add to Gengliang's point: the TableCatalog would load the Changelog knowing
the version range being scanned. This allows the connector to traverse the
commit history and detect whether that range contains any CoW operation. In
other words, it is not a blind table-level flag; it is specific to the
changelog range being requested.
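To make that concrete, here is a minimal sketch of range-scoped capability
derivation. The names (Commit, RangeScopedChangelog, the operation strings)
are illustrative assumptions, not part of the proposed DSv2 API; a real
connector would read commit metadata from its transaction log.

```java
import java.util.List;

// Hypothetical model of a connector's commit log. Only the commits that fall
// inside the requested [startVersion, endVersion] range drive the capability.
public class RangeScopedChangelog {
    public record Commit(long version, String operation) {}

    private final List<Commit> commitsInRange;

    public RangeScopedChangelog(List<Commit> history, long startVersion, long endVersion) {
        this.commitsInRange = history.stream()
            .filter(c -> c.version() >= startVersion && c.version() <= endVersion)
            .toList();
    }

    // Only copy-on-write rewrites can introduce carry-over rows. If the
    // scanned range has no CoW commit, the connector reports false and Spark
    // can skip carry-over removal for this particular scan.
    public boolean containsCarryoverRows() {
        return commitsInRange.stream()
            .anyMatch(c -> "COPY_ON_WRITE".equals(c.operation()));
    }
}
```

The point of the sketch is that the flag is computed per requested range, so
two scans over the same table can legitimately report different capabilities.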

On Wed, Mar 4, 2026 at 09:17 Gengliang Wang <[email protected]> wrote:

> Thanks for the follow-up — appreciate the rigor.
>
> *1.* *Capability Naming*: The naming is intentional —
> representsUpdateAsDeleteAndInsert() mirrors the existing
> SupportsDelta.representUpdateAsDeleteAndInsert()
> <https://github.com/apache/spark/blob/master/sql/catalyst/src/main/java/org/apache/spark/sql/connector/write/SupportsDelta.java#L45>
> in the DSv2 API. When it returns false, it means the connector's change
> data already distinguishes updates from raw delete/insert pairs, so there
> is nothing for Catalyst to derive.
>
> *2.* *Partition-Level CoW Hints*: A table-level flag is sufficient for
> the common case. If a connector has partitions with mixed CoW behavior and
> needs finer-grained control, it can simply return containsCarryoverRows() =
> false and handle carry-over removal internally within its ScanBuilder — the
> interface already supports this. There is no need to complicate the
> Spark-level API for an edge case that connectors can solve themselves.
>
> *3. Audit Discoverability*: The SPIP proposes only two options in the
> WITH clause (deduplicationMode and computeUpdates) — this is a small,
> well-documented surface, not a hidden knob. Adding an ALL CHANGES grammar
> modifier introduces its own discoverability problem: it implies the table
> retains a complete history of all changes, which is not guaranteed — most
> formats discard old change data after vacuum/expiration. A SQL keyword that
> suggests completeness but silently returns partial results is arguably
> worse for compliance engineers than an explicit option with clear
> documentation.
>
>
>
> On Tue, Mar 3, 2026 at 11:17 PM vaquar khan <[email protected]> wrote:
>
>> Thanks Gengliang for the detailed follow-up. While the mechanics you laid
>> out make sense on paper, I'm focused on how this will actually play out in
>> production.
>>
>> 1. Capability Pushdown vs. Format Flag
>> Returning representsUpdateAsDeleteAndInsert() = false just signals that
>> the connector doesn't use raw delete/insert pairs. It doesn't explicitly
>> tell Catalyst, "I already computed the pre/post images natively, trust my
>> output and skip the window function entirely." Those are semantically
>> different. A dedicated supportsNativePrePostImages() capability method
>> would close this gap much more cleanly than overloading the format flag.
>>
>> 2. CoW I/O is a Table-Level Binary
>> The ScanBuilder delegation is a fair point, but containsCarryoverRows()
>> is still a table-level binary flag. For massive, partitioned CoW tables
>> that have carry-overs in some partitions but not others, this interface
>> forces Spark to apply carry-over removal globally or not at all. A
>> partition-level or scan-level hint is a necessary improvement for
>> mixed-mode CoW tables.
>>
>> 3. Audit Discoverability
>> I agree deduplicationMode='none' is functionally correct, but my concern
>> is discoverability. A compliance engineer or DBA writing SQL shouldn't need
>> institutional knowledge of a hidden WITH clause option string to get
>> audit-safe output. Having an explicit ALL CHANGES modifier in the grammar
>> is crucial for enterprise adoption and auditing.
>>
>> I am highly supportive of the core architecture. Items 1 and 3 are
>> production blockers I'd like addressed in the SPIP document; Item 2 is a
>> real limitation but could reasonably be tracked as a follow-on
>> improvement. Happy to cast my +1 once 1 and 3 are clarified.
>>
>> Regards,
>> Viquar Khan
>>
>> On Wed, 4 Mar 2026 at 00:37, Gengliang Wang <[email protected]> wrote:
>>
>>> Hi Viquar,
>>>
>>> Thanks for the detailed review — all three concerns are already accounted
>>> for in the current SPIP design (Appendix B.2 and B.6).
>>>
>>> 1. Capability Pushdown: The Changelog interface already exposes
>>> declarative capability methods — containsCarryoverRows(),
>>> containsIntermediateChanges(), and representsUpdateAsDeleteAndInsert().
>>> The ResolveChangelogTable rule only injects post-processing when the
>>> connector declares it is needed. If Delta Lake already materializes
>>> pre/post-images natively, it returns representsUpdateAsDeleteAndInsert()
>>> = false and Spark skips that work entirely. Catalyst never reconstructs
>>> what the storage layer already provides.
>>>
>>> 2. CoW I/O Bottlenecks: Carry-over removal is already gated on
>>> containsCarryoverRows() = true. If a connector eliminates carry-over rows
>>> at the scan level, it returns false and Spark does nothing. The connector
>>> also retains full control over scan planning via its ScanBuilder, so I/O
>>> optimization stays in the storage layer.
>>>
>>> 3. Audit Fidelity: The deduplicationMode option already supports none,
>>> dropCarryovers, and netChanges. Setting deduplicationMode = 'none'
>>> returns the raw, unmodified change stream with every intermediate state
>>> preserved. Net change collapsing happens when explicitly requested by
>>> the user.
>>>
>>> On Tue, Mar 3, 2026 at 10:27 PM Yuming Wang <[email protected]> wrote:
>>>
>>>> +1, really looking forward to this feature.
>>>>
>>>> On Wed, Mar 4, 2026 at 1:57 PM vaquar khan <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi everyone,
>>>>>
>>>>> Sorry for the late response; I know the vote is actively underway, but
>>>>> reviewing the SPIP's Catalyst post-processing mechanics raised a few
>>>>> systemic design concerns we need to clarify to avoid severe performance
>>>>> regressions down the line.
>>>>>
>>>>> 1. Capability Pushdown: The proposal has Catalyst deriving
>>>>> pre/post-images from raw insert/delete pairs. Storage layers like Delta
>>>>> Lake already materialize these natively. If the Changelog interface lacks
>>>>> state pushdown, Catalyst will burn CPU and memory reconstructing what the
>>>>> storage layer already solved.
>>>>>
>>>>> 2. CoW I/O Bottlenecks: Mandating Catalyst to filter "carry-over" rows
>>>>> for CoW tables is highly problematic. Without strict connector-level row
>>>>> lineage, we will be dragging massive, unmodified Parquet files across the
>>>>> network, forcing Spark into heavy distributed joins just to discard
>>>>> unchanged data.
>>>>>
>>>>> 3. Audit Fidelity: The design explicitly targets computing "net
>>>>> changes." Collapsing intermediate states breaks enterprise audit and
>>>>> compliance workflows that require full transactional history. The SQL
>>>>> grammar needs an explicit ALL CHANGES execution path.
>>>>>
>>>>> I fully support unifying CDC, and this SPIP is the right direction, but
>>>>> abstracting it at the cost of storage-native optimizations and audit
>>>>> fidelity is a dangerous trade-off. We need to clarify how physical
>>>>> planning will handle these bottlenecks before formally ratifying the
>>>>> proposal.
>>>>>
>>>>> Regards,
>>>>> Viquar Khan
>>>>>
>>>>> On Tue, 3 Mar 2026 at 20:09, Cheng Pan <[email protected]> wrote:
>>>>>
>>>>>> +1 (non-binding)
>>>>>>
>>>>>> Thanks,
>>>>>> Cheng Pan
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mar 4, 2026, at 09:59, John Zhuge <[email protected]> wrote:
>>>>>>
>>>>>> +1 (non-binding)
>>>>>>
>>>>>> Thanks for the contribution!
>>>>>>
>>>>>>
>>>>>> On Tue, Mar 3, 2026 at 5:50 PM Burak Yavuz <[email protected]> wrote:
>>>>>>
>>>>>>> +1!
>>>>>>>
>>>>>>> On Tue, Mar 3, 2026 at 5:48 PM Szehon Ho <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> +1, look forward to it (non binding)
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Szehon
>>>>>>>>
>>>>>>>> On Tue, Mar 3, 2026 at 5:37 PM Anton Okolnychyi <
>>>>>>>> [email protected]> wrote:
>>>>>>>>
>>>>>>>>> +1 (non-binding)
>>>>>>>>>
>>>>>>>>> On Tue, Mar 3, 2026 at 5:07 PM Mich Talebzadeh <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> +1
>>>>>>>>>>
>>>>>>>>>> Dr Mich Talebzadeh,
>>>>>>>>>> Data Scientist | Distributed Systems (Spark) | Financial
>>>>>>>>>> Forensics & Metadata Analytics | Transaction Reconstruction | Audit &
>>>>>>>>>> Evidence-Based Analytics
>>>>>>>>>>
>>>>>>>>>>    view my Linkedin profile
>>>>>>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>
>>>>>>>>>>
>>>>>>>>>> On Wed, 4 Mar 2026 at 00:57, Gengliang Wang <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Spark devs,
>>>>>>>>>>>
>>>>>>>>>>> I'd like to call a vote on the SPIP*: Change Data Capture (CDC)
>>>>>>>>>>> Support*
>>>>>>>>>>>
>>>>>>>>>>> *Summary:*
>>>>>>>>>>>
>>>>>>>>>>> This SPIP proposes a unified approach by adding a CHANGES SQL
>>>>>>>>>>> clause and corresponding DataFrame/DataStream APIs that work across 
>>>>>>>>>>> DSv2
>>>>>>>>>>> connectors.
>>>>>>>>>>>
>>>>>>>>>>> 1. Standardized User API
>>>>>>>>>>> SQL:
>>>>>>>>>>>
>>>>>>>>>>> -- Batch: What changed between version 10 and 20?
>>>>>>>>>>> SELECT * FROM my_table CHANGES FROM VERSION 10 TO VERSION 20;
>>>>>>>>>>>
>>>>>>>>>>> -- Streaming: Continuously process changes
>>>>>>>>>>> CREATE STREAMING TABLE cdc_sink AS
>>>>>>>>>>> SELECT * FROM STREAM my_table CHANGES FROM VERSION 0;
>>>>>>>>>>>
>>>>>>>>>>> DataFrame API:
>>>>>>>>>>> spark.read
>>>>>>>>>>>   .option("startingVersion", "10")
>>>>>>>>>>>   .option("endingVersion", "20")
>>>>>>>>>>>   .changes("my_table")
>>>>>>>>>>>
>>>>>>>>>>> 2. Engine-Level Post Processing Under the hood, this proposal
>>>>>>>>>>> introduces a minimal Changelog interface for DSv2 connectors.
>>>>>>>>>>> Spark's Catalyst optimizer will take over the CDC post-processing,
>>>>>>>>>>> including:
>>>>>>>>>>>
>>>>>>>>>>>    - Filtering out copy-on-write carry-over rows.
>>>>>>>>>>>    - Deriving pre-image/post-image updates from raw
>>>>>>>>>>>    insert/delete pairs.
>>>>>>>>>>>    - Computing net changes.
>>>>>>>>>>>
>>>>>>>>>>> *Relevant Links:*
>>>>>>>>>>>
>>>>>>>>>>>    - *SPIP Doc: *
>>>>>>>>>>>    
>>>>>>>>>>> https://docs.google.com/document/d/1-4rCS3vsGIyhwnkAwPsEaqyUDg-AuVkdrYLotFPw0U0/edit?usp=sharing
>>>>>>>>>>>    - *Discuss Thread: *
>>>>>>>>>>>    https://lists.apache.org/thread/dhxx6pohs7fvqc3knzhtoj4tbcgrwxts
>>>>>>>>>>>    - *JIRA: *https://issues.apache.org/jira/browse/SPARK-55668
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> *The vote will be open for at least 72 hours. *Please vote:
>>>>>>>>>>>
>>>>>>>>>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>>>>>>>
>>>>>>>>>>> [ ] +0
>>>>>>>>>>>
>>>>>>>>>>> [ ] -1: I don't think this is a good idea because ...
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Gengliang Wang
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> John Zhuge
>>>>>>
>>>>>>
>>>>>>
