nzw921rx commented on issue #10596:
URL: https://github.com/apache/seatunnel/issues/10596#issuecomment-4056083773

   Hi @davidzollo, I'd like to take on this issue.
   
   ### About me
   
   I've been working with SeaTunnel in production, building custom Transform 
and Sink plugins for our data pipeline, including:
   
   - A multi-table-aware **DataSnapshot Transform** (extends 
`AbstractMultiCatalogMapTransform`)
   - A **DataDiff Sink** (implements `SupportMultiTableSink`)
   - **KMS encryption** and **data guard** transforms on top of CDC sources 
(`MySQL-CDC`)
   
   ### My understanding of this issue
   
   Add a `sample-sharding.enable` (CDC) / `split.sample-sharding.enable` (JDBC) 
boolean option (default `true`) to let users explicitly disable sampling-based 
sharding. When set to `false`, the system falls back to unevenly-sized chunk 
splitting regardless of shard count.
   
   Key changes would be in:
   
   | Module | File | Change |
   |--------|------|--------|
   | CDC | `JdbcSourceOptions` / `BaseSourceConfig` | Add the new option |
   | JDBC | `JdbcSourceOptions` / `JdbcSourceConfig` | Add the new option |
   | CDC | `AbstractJdbcSourceChunkSplitter.evenlyColumnSplitChunks()` | Guard the sampling path |
   | JDBC | `DynamicChunkSplitter.evenlyColumnSplitChunks()` | Same guard |
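
Roughly, the guard I have in mind looks like the sketch below. The class, enum, and method names are hypothetical and are not existing SeaTunnel code. I'm also assuming the sampling path is currently chosen when the shard count exceeds a threshold; the real splitters may have additional branches that are left out here.

```java
/** Hypothetical sketch of the proposed guard; not actual SeaTunnel code. */
public class SampleShardingGuard {

    /** Chunking strategies the splitter can choose between in this sketch. */
    public enum ChunkStrategy {
        SAMPLING_BASED,
        UNEVENLY_SIZED
    }

    /**
     * Picks a strategy for a table whose key column is not evenly distributed.
     *
     * @param shardCount estimated number of shards for the table
     * @param sampleShardingThreshold assumed threshold above which sampling is used today
     * @param sampleShardingEnabled value of the proposed boolean option (default true)
     */
    public static ChunkStrategy chooseStrategy(
            int shardCount, int sampleShardingThreshold, boolean sampleShardingEnabled) {
        // The new flag is checked first, so disabling it skips sampling
        // no matter how large the shard count is.
        if (sampleShardingEnabled && shardCount > sampleShardingThreshold) {
            return ChunkStrategy.SAMPLING_BASED;
        }
        return ChunkStrategy.UNEVENLY_SIZED;
    }

    public static void main(String[] args) {
        // Same shard count and threshold, with the option on and then off.
        System.out.println(chooseStrategy(5000, 1000, true));
        System.out.println(chooseStrategy(5000, 1000, false));
    }
}
```

With the option left at its default, the behavior should stay the same as today; only an explicit `false` changes which path is taken.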
   
   I'll also add corresponding **unit tests** and update the **documentation**.
   
   Could you please assign this issue to me? Thanks!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

