GabrielBBaldez commented on PR #11057:
URL: https://github.com/apache/seatunnel/pull/11057#issuecomment-4681062483

   Pushed the rework — thanks again, this is much better for it.
   
   Instead of short-circuiting the incremental phase, `snapshot` mode is now 
implemented as **`initial` snapshot + a forced bounded `LATEST` stop**:
   
   - `StartupMode.SNAPSHOT` resolves to a null startup offset (so the snapshot 
phase runs) and routes through `HybridSplitAssigner`, exactly like `initial`.
   - `IncrementalSource` forces `StopConfig` to `LATEST` for snapshot mode. 
Since the incremental split is only created *after* the snapshot phase 
completes, `offsetFactory.latest()` resolves to the change-log position at 
snapshot completion. The existing bounded-stop machinery (the binlog reader 
already terminates at a non-`NEVER` stop offset) then runs the catch-up from 
the snapshot low watermark up to that boundary and finishes.
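
To make the two bullets above concrete, here is a minimal sketch of the mode resolution. It is a simplified model, not the actual SeaTunnel code: the class `SnapshotModeSketch`, its enums, and its method names are all hypothetical and only mirror the behaviour described in this comment.

```java
// Hypothetical, simplified model of the behaviour described above.
// None of these types are the real SeaTunnel classes.
final class SnapshotModeSketch {
    enum StartupMode { INITIAL, SNAPSHOT, LATEST }
    enum StopMode { NEVER, LATEST }
    enum Boundedness { BOUNDED, UNBOUNDED }

    /** INITIAL and SNAPSHOT both begin with a snapshot phase. */
    static boolean runsSnapshotPhase(StartupMode startup) {
        return startup == StartupMode.INITIAL || startup == StartupMode.SNAPSHOT;
    }

    /** Snapshot mode forces a LATEST stop, whatever stop mode was configured. */
    static StopMode effectiveStopMode(StartupMode startup, StopMode configured) {
        return startup == StartupMode.SNAPSHOT ? StopMode.LATEST : configured;
    }

    /** Any stop mode other than NEVER makes the source bounded. */
    static Boundedness boundedness(StopMode stop) {
        return stop == StopMode.NEVER ? Boundedness.UNBOUNDED : Boundedness.BOUNDED;
    }

    public static void main(String[] args) {
        for (StartupMode mode : StartupMode.values()) {
            StopMode stop = effectiveStopMode(mode, StopMode.NEVER);
            System.out.println(mode + " -> stop=" + stop + ", " + boundedness(stop));
        }
    }
}
```

In this model the only place `SNAPSHOT` differs from `INITIAL` is `effectiveStopMode`, which is the point of the rework.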
   
   So **Issue 2** is resolved by reusing the exact catch-up path `initial` uses 
— the snapshot/change-log consistency window is closed by the same 
`IncrementalSplitAssigner.createIncrementalSplit` watermark logic, not a 
bespoke path. The only difference from `initial` is the bounded stop. 
(`getBoundedness` reports `BOUNDED` via that stop; `HybridSplitAssigner` is 
back to its original form.)
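
For illustration, the bounded catch-up can be pictured as replaying a window of the change log. This is a toy model only: `BoundedCatchUpSketch` and `catchUp` are hypothetical names, change-log positions are plain `long` values in ascending order, and treating both ends of the window as inclusive is an assumption of the sketch, not a statement about the real reader.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a bounded catch-up; not the actual SeaTunnel reader.
final class BoundedCatchUpSketch {
    /**
     * Returns the change-log positions replayed by the catch-up phase.
     *
     * @param changeLog    positions in ascending order (assumption)
     * @param lowWatermark position where the snapshot phase started
     * @param stopOffset   "latest" position resolved at snapshot completion
     */
    static List<Long> catchUp(List<Long> changeLog, long lowWatermark, long stopOffset) {
        List<Long> consumed = new ArrayList<>();
        for (long position : changeLog) {
            if (position < lowWatermark) {
                continue; // written before the snapshot began; already in the snapshot
            }
            if (position > stopOffset) {
                break; // past the boundary: a bounded job stops here
            }
            consumed.add(position);
        }
        return consumed;
    }

    public static void main(String[] args) {
        List<Long> log = List.of(5L, 10L, 15L, 20L, 25L);
        System.out.println(catchUp(log, 10L, 20L));
    }
}
```

Changes at positions beyond `stopOffset` are never consumed in this model, which is the property the snapshot e2e checks.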
   
   On **Issue 3**: because the during-snapshot reconciliation is now the same 
code as `initial` (already covered by `testMysqlCdcCheckDataE2e`), I kept the 
snapshot e2e deterministic. It asserts bounded completion, that the snapshot is 
captured, and that changes made after the boundary are not consumed. I did not 
add a timing-dependent "mutate during the snapshot scan" test, since those tend 
to be flaky on the shared runners. Happy to add one if you'd prefer it; just 
wanted to avoid introducing flakiness.
   
   **Issue 1** is also done — the MongoDB-CDC commit is split out, so this PR 
is MySQL-only now. Marked ready for re-review. 🙏

