GabrielBBaldez opened a new pull request, #11057: URL: https://github.com/apache/seatunnel/pull/11057
### Purpose of this pull request

Closes #11036.

Adds a `snapshot` startup mode to `MySQL-CDC`: a bounded bootstrap job that reads the snapshot of the captured tables and then finishes on its own, without entering the incremental/binlog phase. Useful for one-time backfill, initial warehouse/table bootstrap, and controlled migration stages.

### What changed

**Option surface (`MySqlIncrementalSourceOptions`)**

- `startup.mode` now accepts `snapshot` (choices: `initial`, `snapshot`, `earliest`, `latest`, `specific`, `timestamp`), with the description explaining the bounded semantics.

**Runtime (`connector-cdc-base`)**

- `StartupMode` gains a `SNAPSHOT` constant; `StartupConfig.getStartupOffset` treats it like `INITIAL` (the snapshot phase needs no stream offset).
- `HybridSplitAssigner` gains a `snapshotOnly` flag: the existing snapshot split planning/reading path is fully reused, but once the snapshot phase completes no incremental split is handed out and `waitingForCompletedSplits()` no longer considers the incremental assigner, so the enumerator's existing logic signals no-more-splits and the readers finish.
- `IncrementalSource#getBoundedness` reports `BOUNDED` in snapshot mode, letting the job run as `BATCH` and finish naturally instead of idling in a streaming state.
- Fail-fast validation at source creation for incompatible combinations: `startup.mode = snapshot` together with `stop.mode != never`, or with `startup.specific-offset.file` / `startup.specific-offset.pos` / `startup.timestamp`, is rejected with a clear message.

**Checkpoint / finish semantics**

- Snapshot-only jobs checkpoint through the unchanged `HybridPendingSplitsState` path, so checkpointing during the snapshot keeps working. On restore the assigner is rebuilt with the same snapshot-only flag (derived from config), so a restarted job resumes the snapshot phase instead of re-entering it or falling into streaming.
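
For readers who want a concrete picture of the fail-fast validation and the boundedness rule described above, here is a minimal, self-contained sketch. It is not the code from this PR: `SnapshotModeSketch`, its methods, and the plain `Map<String, String>` option holder are hypothetical stand-ins, and the defaults (`initial` for `startup.mode`, `never` for `stop.mode`) are assumptions. Only the option keys and the rejected combinations come from the description.

```java
import java.util.Locale;
import java.util.Map;

/** Illustrative sketch only; class and method names are hypothetical, not SeaTunnel's. */
final class SnapshotModeSketch {

    enum StartupMode { INITIAL, SNAPSHOT, EARLIEST, LATEST, SPECIFIC, TIMESTAMP }

    enum Boundedness { BOUNDED, UNBOUNDED }

    // Options that only make sense when a stream (binlog) phase exists.
    private static final String[] STREAM_ONLY_KEYS = {
        "startup.specific-offset.file",
        "startup.specific-offset.pos",
        "startup.timestamp"
    };

    static StartupMode startupMode(Map<String, String> options) {
        // Assumption: "initial" is used when startup.mode is not set.
        String raw = options.getOrDefault("startup.mode", "initial");
        return StartupMode.valueOf(raw.toUpperCase(Locale.ROOT));
    }

    /** Rejects option combinations that contradict a snapshot-only job. */
    static void validate(Map<String, String> options) {
        if (startupMode(options) != StartupMode.SNAPSHOT) {
            return; // other modes are outside the scope of this sketch
        }
        // Assumption: "never" is used when stop.mode is not set.
        String stopMode = options.getOrDefault("stop.mode", "never");
        if (!"never".equalsIgnoreCase(stopMode)) {
            throw new IllegalArgumentException(
                "startup.mode = snapshot requires stop.mode = never, got: " + stopMode);
        }
        for (String key : STREAM_ONLY_KEYS) {
            if (options.containsKey(key)) {
                throw new IllegalArgumentException(
                    "startup.mode = snapshot cannot be combined with " + key);
            }
        }
    }

    /**
     * Snapshot mode is reported as bounded so the job can run as BATCH.
     * Simplification: a real source may take other settings into account.
     */
    static Boundedness boundedness(Map<String, String> options) {
        return startupMode(options) == StartupMode.SNAPSHOT
            ? Boundedness.BOUNDED
            : Boundedness.UNBOUNDED;
    }
}
```

Under these assumptions, `validate(Map.of("startup.mode", "snapshot"))` passes, while adding `startup.timestamp` or a `stop.mode` other than `never` throws `IllegalArgumentException` before any reader is created.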

**Tests**

- `HybridSplitAssignerTest#testSnapshotOnlyFinishesAfterSnapshotPhase`: with a completed snapshot phase, the snapshot-only assigner returns no next split and is not waiting (job can finish), while the default hybrid behavior on the same state keeps waiting to hand out the incremental split.
- `MySqlIncrementalSourceFactoryTest#testSupportedStartUpModes`: asserts the supported startup modes, including `snapshot`.
- New e2e case `testMysqlCdcSnapshotOnlyStartupMode` (+ `mysqlcdc_snapshot_only.conf`, `BATCH` job): seeds the source table, runs the job synchronously and asserts it exits 0 on its own, asserts the sink equals the snapshot, then mutates the source after completion and asserts the sink is unchanged (no binlog consumption).

**Docs (EN + ZH)**

- `startup.mode` option row updated and a "Snapshot-only bootstrap" section added with a `BATCH` example and the incompatible-options note.

### Scope notes (matching the issue's non-goals)

- No GTID-based startup, no skip-events/skip-rows trimming, no dynamic newly-added table capture, no schema evolution policy changes.

### Verification

- `mvn install -pl connector-cdc-base,connector-cdc-mysql`: all module tests pass locally (JDK 11), including the new ones.
- `mvn test-compile -pl connector-cdc-mysql-e2e`: the e2e module compiles; the new IT runs in CI (needs Docker).
- `mvn spotless:apply` is clean.

### Check list

* [x] Changed code is covered by tests, or does not need tests for a stated reason
* [ ] If any new Jar binary package is added in your PR, please add a License Notice according to the [New License Guide](https://github.com/apache/seatunnel/blob/dev/docs/en/contribution/new-license.md)
* [x] If necessary, please update the documentation to describe the new feature. https://github.com/apache/seatunnel/tree/dev/docs

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
