yzeng1618 opened a new pull request, #10310: URL: https://github.com/apache/seatunnel/pull/10310
### Purpose of this pull request

When running the HBase source on Flink with checkpoint/savepoint, the restore path calls `addSplitsBack()` and then `registerReader()`. The enumerator used to re-enumerate all table splits in `registerReader()` and reassign them, so the job re-scanned the table and the read count could roughly double after restore.

<img width="1686" height="301" alt="image" src="https://github.com/user-attachments/assets/45990a27-1424-42d6-8adb-a8afcdd55599" />

### Does this PR introduce _any_ user-facing change?

Yes. Restoring from checkpoint/savepoint now resumes from the remaining splits without re-scanning the whole table, so duplicate reads drop to the expected at-least-once behavior. Fresh runs are unchanged.

### How was this patch tested?

- Added unit tests in `HbaseSourceSplitEnumeratorTest` to cover restore assignment and multi-parallel init.
- Not run locally (skipped by request).

### Check list

* [ ] If any new Jar binary package is added in your PR, please add a License Notice according to the [New License Guide](https://github.com/apache/seatunnel/blob/dev/docs/en/contribution/new-license.md)
* [ ] If necessary, please update the documentation to describe the new feature. https://github.com/apache/seatunnel/tree/dev/docs
* [ ] If necessary, please update `incompatible-changes.md` to describe the incompatibility caused by this PR.
* [ ] If you are contributing the connector code, please check that the following files are updated:
  1. Update [plugin-mapping.properties](https://github.com/apache/seatunnel/blob/dev/plugin-mapping.properties) and add new connector information in it
  2. Update the pom file of [seatunnel-dist](https://github.com/apache/seatunnel/blob/dev/seatunnel-dist/pom.xml)
  3. Add ci label in [label-scope-conf](https://github.com/apache/seatunnel/blob/dev/.github/workflows/labeler/label-scope-conf.yml)
  4. Add e2e testcase in [seatunnel-e2e](https://github.com/apache/seatunnel/tree/dev/seatunnel-e2e/seatunnel-connector-v2-e2e/)
  5. Update connector [plugin_config](https://github.com/apache/seatunnel/blob/dev/config/plugin_config)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
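
For readers following the restore path described under "Purpose of this pull request", the idea can be sketched roughly as below. This is a minimal, self-contained illustration and not the code in this PR: the class `RestoreAwareEnumeratorSketch`, its fields, the use of plain strings as splits, and the round-robin assignment are all hypothetical. It also assumes that `addSplitsBack()` is called for each reader on restore, possibly with an empty list; the real enumerator may detect a restore differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these names come from the SeaTunnel code base.
class RestoreAwareEnumeratorSketch {
    private final List<String> allSplits;   // stand-in for the table's region splits
    private final int parallelism;          // number of reader subtasks
    private final Map<Integer, List<String>> pending = new HashMap<>();
    private boolean enumerated = false;     // true once splits are known

    RestoreAwareEnumeratorSketch(List<String> allSplits, int parallelism) {
        this.allSplits = allSplits;
        this.parallelism = parallelism;
    }

    // Called on restore with the splits a reader had not finished.
    void addSplitsBack(List<String> splits, int subtaskId) {
        if (!splits.isEmpty()) {
            pending.computeIfAbsent(subtaskId, k -> new ArrayList<>()).addAll(splits);
        }
        // Assumption: any call here means state was restored, so the
        // full table must not be enumerated again.
        enumerated = true;
    }

    // Called when a reader registers; returns the splits assigned to it.
    List<String> registerReader(int subtaskId) {
        if (!enumerated) {
            // Fresh run: discover every split once, round-robin over readers.
            for (int i = 0; i < allSplits.size(); i++) {
                pending.computeIfAbsent(i % parallelism, k -> new ArrayList<>())
                        .add(allSplits.get(i));
            }
            enumerated = true;
        }
        List<String> assigned = pending.remove(subtaskId);
        return assigned == null ? new ArrayList<>() : assigned;
    }

    public static void main(String[] args) {
        List<String> table = Arrays.asList("s0", "s1", "s2", "s3");

        // Fresh run: both readers receive their share of the whole table.
        RestoreAwareEnumeratorSketch fresh = new RestoreAwareEnumeratorSketch(table, 2);
        System.out.println("fresh reader 0: " + fresh.registerReader(0));
        System.out.println("fresh reader 1: " + fresh.registerReader(1));

        // Restore: only the unfinished split is handed out again.
        RestoreAwareEnumeratorSketch restored = new RestoreAwareEnumeratorSketch(table, 2);
        restored.addSplitsBack(Arrays.asList("s2"), 0);
        restored.addSplitsBack(new ArrayList<>(), 1);
        System.out.println("restored reader 0: " + restored.registerReader(0));
        System.out.println("restored reader 1: " + restored.registerReader(1));
    }
}
```

The design point is that enumeration of the full table happens at most once per enumerator lifetime, and a restore marks it as already done, so `registerReader()` only hands out what was added back.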
