Szehon Ho created SPARK-57331:
---------------------------------

             Summary: Support native CDC changelog sources as input to AutoCDC flows
                 Key: SPARK-57331
                 URL: https://issues.apache.org/jira/browse/SPARK-57331
             Project: Spark
          Issue Type: Improvement
          Components: Declarative Pipelines
    Affects Versions: 5.0.0
            Reporter: Szehon Ho


Allow AutoCDC (SCD Type 1) flows in Spark Declarative Pipelines to consume 
Spark native CDC (Changelog) sources directly. When a flow's source is detected 
as a native Changelog read, AutoCDC derives its CDC semantics (delete 
detection, metadata-column exclusion, and update_preimage filtering) from the 
changelog contract instead of requiring the user to map them by hand. This 
also includes validation that the changelog read is configured for unambiguous, 
deterministic reconciliation (computeUpdates for delete+insert updates, 
carry-over removal, and _commit_version sequencing). The initial scope is 
Scala only; the Python and Connect APIs are deferred.
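
The semantics described above can be illustrated with a small model in plain
Python. This is only a sketch under stated assumptions, not the proposed Spark
implementation and not a Spark API: the function names are hypothetical, rows
are modeled as dicts, and the metadata column names _change_type and
_commit_timestamp and the change types insert, delete and update_postimage are
assumed (only _commit_version and update_preimage are named in this issue).

```python
from collections import defaultdict

# Assumed changelog metadata columns; they are excluded from the target rows.
METADATA_COLUMNS = {"_change_type", "_commit_version", "_commit_timestamp"}


def check_unambiguous(changes, key):
    """Reject a raw delete+insert pair for one key within one commit version.

    Hypothetical check: it illustrates why such pairs need to be presented as
    updates before the changes can be reconciled deterministically.
    """
    seen = defaultdict(set)
    for row in changes:
        seen[(row[key], row["_commit_version"])].add(row["_change_type"])
    for (key_value, version), change_types in seen.items():
        if {"delete", "insert"} <= change_types:
            raise ValueError(
                f"key {key_value!r} has both delete and insert in version {version}"
            )


def apply_changelog_scd1(target, changes, key):
    """Return a new SCD Type 1 target after applying the changelog rows."""
    check_unambiguous(changes, key)
    result = dict(target)
    # Sequence by commit version so the newest change per key wins.
    for row in sorted(changes, key=lambda r: r["_commit_version"]):
        change_type = row["_change_type"]
        if change_type == "update_preimage":
            continue  # The old image carries no new state for SCD Type 1.
        if change_type == "delete":
            result.pop(row[key], None)  # Delete detection.
            continue
        # insert or update_postimage: upsert without the metadata columns.
        result[row[key]] = {
            column: value
            for column, value in row.items()
            if column not in METADATA_COLUMNS
        }
    return result


# Rows are deliberately out of order to show the _commit_version sequencing.
changes = [
    {"id": 2, "name": "b", "_change_type": "delete", "_commit_version": 3},
    {"id": 1, "name": "a2", "_change_type": "update_postimage", "_commit_version": 2},
    {"id": 1, "name": "a", "_change_type": "update_preimage", "_commit_version": 2},
    {"id": 1, "name": "a", "_change_type": "insert", "_commit_version": 1},
    {"id": 2, "name": "b", "_change_type": "insert", "_commit_version": 1},
]
result = apply_changelog_scd1({}, changes, key="id")
print(result)  # {1: {'id': 1, 'name': 'a2'}}
```

In this model the user supplies only the key column; delete handling, column
exclusion and preimage filtering all follow from the changelog rows themselves,
which is the hand-mapping the issue proposes to remove.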



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
