[ 
https://issues.apache.org/jira/browse/SPARK-57150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated SPARK-57150:
-----------------------------------
    Labels: pull-request-available  (was: )

> AutoCDC SCD1 Out-of-order Event Convergence Tests
> -------------------------------------------------
>
>                 Key: SPARK-57150
>                 URL: https://issues.apache.org/jira/browse/SPARK-57150
>             Project: Spark
>          Issue Type: Sub-task
>          Components: Declarative Pipelines
>    Affects Versions: 4.3.0
>            Reporter: Anish Mahto
>            Priority: Major
>              Labels: pull-request-available
>
> A key feature of SDP's AutoCDC implementation is that it reconciles events
> that arrive out of order with respect to their sequence value. This support
> adds significant complexity to the reconciliation logic, because it requires
> stateful tracking across microbatches in the auxiliary table, and that logic
> is prone to breaking as the implementation evolves.
>
> Introduce an A/B-style test suite that runs the implementation on both a
> sequence-sorted, single-microbatch event stream and a shuffled,
> multi-microbatch stream of the same events. If out-of-order processing is
> correct, the SCD1 implementation should produce the same target table for
> both runs.
>
> Data is randomly generated, but with a constant seed for reproducibility.
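
The convergence property described in the issue can be illustrated with a small, self-contained model. This is only a sketch in plain Python, not the Spark implementation: the names Event, generate_events, apply_batch and run_pipeline are hypothetical, the "auxiliary table" is reduced to a dict holding the highest sequence seen per key, and sequence values are assumed to be unique so that ties never arise.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    key: int
    seq: int               # sequencing value used to order changes
    value: Optional[str]   # None models a delete


def generate_events(n: int, num_keys: int, seed: int) -> List[Event]:
    """Randomly generate events with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    seqs = rng.sample(range(n * 10), n)  # unique sequence values (assumption)
    events = []
    for seq in seqs:
        key = rng.randrange(num_keys)
        value = None if rng.random() < 0.2 else f"v{seq}"
        events.append(Event(key, seq, value))
    return events


def apply_batch(target: Dict[int, str], aux: Dict[int, int],
                batch: List[Event]) -> None:
    """Apply one microbatch with SCD1 (latest value wins) semantics.

    `aux` stands in for the auxiliary table: it remembers the highest
    sequence seen per key, including deletes, so that late-arriving older
    events cannot resurrect or overwrite newer state.
    """
    for e in batch:
        if e.seq <= aux.get(e.key, -1):
            continue  # stale event, already superseded
        aux[e.key] = e.seq
        if e.value is None:
            target.pop(e.key, None)
        else:
            target[e.key] = e.value


def run_pipeline(batches: List[List[Event]]) -> Dict[int, str]:
    target: Dict[int, str] = {}
    aux: Dict[int, int] = {}
    for batch in batches:
        apply_batch(target, aux, batch)
    return target


def split(events: List[Event], size: int) -> List[List[Event]]:
    return [events[i:i + size] for i in range(0, len(events), size)]


events = generate_events(n=200, num_keys=10, seed=42)

# Run A: every event in one microbatch, sorted by sequence.
sorted_target = run_pipeline([sorted(events, key=lambda e: e.seq)])

# Run B: the same events, shuffled and spread over many microbatches.
shuffled = list(events)
random.Random(7).shuffle(shuffled)
shuffled_target = run_pipeline(split(shuffled, 16))

assert sorted_target == shuffled_target
```

In this model, dropping the aux bookkeeping (in particular for deletes) is what would make run B diverge from run A, which is the kind of regression the proposed suite is meant to catch.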



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
