vsantonastaso commented on PR #10799: URL: https://github.com/apache/seatunnel/pull/10799#issuecomment-4480918147
> By the way, I would like this PR to narrow its scope to a clean **staged foundation step**, rather than attempt the full per-connector Debezium isolation in one go. Concretely: > > ### Keep in this PR > 1. Per-connector `<debezium.version>` POM property and the `provided`-scope shift on `connector-cdc-base` Debezium dependencies. > 2. Per-connector `*EmbeddedDatabaseHistory` / `*SchemaHistoryAdapter` for schema-storage isolation — this is a real correctness improvement at a single Debezium version and is the right direction. > 3. The `DebeziumAdapter` SPI interface as a **target shape**, defined in base but not yet relied on by the runtime wiring. Treat it as the contract that future PRs will fully wire up. > 4. The `incompatible-changes.md` entries you have added (English + Chinese). > > ### Revert from this PR (please move to a follow-up) > 1. The `TableId → TableIdentifier` field-type migration across `SnapshotSplit` / `IncrementalSplit` / `CompletedSnapshotSplitInfo` / `SnapshotPhaseState`. > 2. The custom `SourceSplitSerializer` / `SnapshotSplitSerializer` / `IncrementalSplitSerializer` / `PendingSplitsStateSerializer` / `CompletedSnapshotSplitInfoSerializer`. > 3. `MigrationObjectInputStream` / `LegacyTableId` and the `readObject(GetField)` overrides on the state classes. > 4. The runtime-path rewiring in `*SourceFetchTaskContext` that currently routes through the adapter and then immediately unwraps via `getDelegate()`. Keep the existing direct Debezium calls intact in this PR. > > ### Why this split > Those reverted pieces only become necessary once base is **actually** Debezium-neutral and connector plugins have classloader isolation. Until then, changing the checkpoint binary format buys real upgrade risk for existing users without delivering the per-connector independence the PR description claims. Decoupling them lets us: > > * Land the unambiguous wins now with zero wire-format risk > * Do the full isolation work as a single coherent follow-up where the cost is justified by reaching the stated goal > > ### Suggested PR description update > Please also update the PR title and description to reflect a "staged foundation" scope, e.g. > > > `[Feature][Connector-V2] Foundation for per-connector Debezium version management (staged)` > > and add a sentence noting that full multi-version isolation is a follow-up. > > ### Path B remains acceptable > If you would prefer to land the full isolation story in this PR instead of staging, the bar is: > > 1. Remove **all** `io.debezium.*` imports from `connector-cdc-base/src/main/java`. > 2. Move patched Debezium-side classes (`HistorizedRelationalDatabaseConnectorConfig`, `ChangeEventQueue`, `HeartbeatFactory`, `DefaultHeartbeatConnectionProvider`) into per-version sibling modules. > 3. Make `JdbcDataSourceDialect` / `JdbcSourceFetchTaskContext` / `TableIdentifierImpl` Debezium-neutral end to end. > 4. Wire `SeaTunnelChildFirstClassLoader` to treat `io.debezium.**` as strictly child-first. > 5. Provide a runnable proof-of-concept: one CDC connector on Debezium 1.9.x and another on Debezium 3.x, both loadable in the same JVM without `LinkageError`. > > Path A is meaningfully smaller and lower-risk, so I would recommend it, but Path B is acceptable if you would rather complete the story end-to-end. > > ### Next step > You can also start a short `dev@` thread so the community can align on the architectural direction before we land anything that touches the CDC SPI surface. > > Happy to review either path quickly once the direction is chosen. Thanks again for driving this — the underlying problem is real and worth solving properly. Hi @davidzollo, thanks again for the detailed review and for taking the time to lay out both paths so clearly. I agreed with path A. The PR has been narrowed to the staged foundation as you described. I updated the PR title and the description now explicitly calls out what is deferred and why. The checkpoint format, `JdbcSourceFetchTaskContext` API, and all runtime behaviour are identical to `dev`. No migration needed for existing users. The full runtime wiring + classloader isolation is the clear next step. I am ready to start that conversation on dev@ or in a follow-up issue once this lands. I am waiting for your review. Thanks -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
