Daishuyuan commented on PR #4341:
URL: https://github.com/apache/flink-cdc/pull/4341#issuecomment-4728537434

   The current DWS SinkV2 implements an application-level two-phase commit with 
staging tables and the SinkV2 committer. The writer writes records into 
checkpoint-scoped staging tables and seals them into committables on 
checkpoints or schema barriers. The committer then merges the latest staged 
rows into the target table by primary key inside a DWS transaction and records 
a commit marker, making retries and recovery idempotent. For schema changes, 
the runtime now provides a FlushEvent-aware sink writer hook, so the DWS writer 
can commit data written with the old schema before the schema barrier is 
applied.
   The trade-off is that this is not native XA/database-level 2PC in DWS; it is 
an application-level 2PC. The upside is that it does not depend on native DWS 
distributed transaction support, fits Flink SinkV2's committer retry/recovery 
model, and handles schema barriers. The cost is that target tables need primary 
keys, extra staging and commit marker tables are required, schema flush 
performs a synchronous commit, and legacy native DWS client batching/retry 
options are kept only as compatibility options in SinkV2 rather than being used 
for write tuning.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to