642933588 opened a new issue, #8621: URL: https://github.com/apache/seatunnel/issues/8621
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

When using the TidbCDC feature of SeaTunnel, I found a potential bug in the `getSplitOwner` method. When this method allocates shards, it may assign multiple shards to the same reader, resulting in the loss of some shards.

`private static int getSplitOwner(String splitId, int numReaders) { return (splitId.hashCode() & Integer.MAX_VALUE) % numReaders; }`

### SeaTunnel Version

2.3.8

### SeaTunnel Config

```conf
This method takes the hash code of the splitId modulo numReaders to determine which reader owns the shard.
Collisions are possible: different splitIds may produce the same hash code, and different hash codes may yield the same result after the modulo operation.
Either case causes multiple shards to be assigned to the same reader.
During actual shard processing, this may lead to some shards not being processed correctly, resulting in shard loss.
```

### Running Command

```shell
1. Configure SeaTunnel to use the TidbCDC feature.
2. Prepare multiple different splitIds for shard allocation testing.
3. Run the relevant code and observe the allocation results of the getSplitOwner method.
4. Run the test multiple times; multiple splitIds may be assigned to the same reader.
```

### Error Exception

```log
Expected behavior
Each splitId should be evenly assigned to different readers so that all shards are processed correctly without shard loss.

Actual behavior
Multiple splitIds may be assigned to the same reader, causing some shards not to be processed and resulting in shard loss.
```

### Zeta or Flink or Spark Version

_No response_

### Java or Scala Version

_No response_

### Screenshots

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!
### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
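
### Minimal reproduction sketch

The allocation described in this issue can be checked in isolation with a small standalone class. This is only a sketch: `getSplitOwner` uses the same expression quoted in the report, while the class name `SplitOwnerDemo`, the helper `assign`, and the sample split IDs are hypothetical and not taken from the SeaTunnel code base. Real TidbCDC split IDs may look different. The sketch shows only which reader index each split ID maps to; it does not show what the connector does with the splits afterwards.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class SplitOwnerDemo {

    // Same expression as the getSplitOwner method quoted in this issue.
    static int getSplitOwner(String splitId, int numReaders) {
        return (splitId.hashCode() & Integer.MAX_VALUE) % numReaders;
    }

    // Hypothetical helper: groups split IDs by the reader index they map to.
    static Map<Integer, List<String>> assign(List<String> splitIds, int numReaders) {
        Map<Integer, List<String>> byReader = new TreeMap<>();
        for (String splitId : splitIds) {
            int owner = getSplitOwner(splitId, numReaders);
            byReader.computeIfAbsent(owner, k -> new ArrayList<>()).add(splitId);
        }
        return byReader;
    }

    public static void main(String[] args) {
        // Hypothetical split IDs. "Aa" and "BB" have the same String.hashCode(),
        // so they map to the same reader for any numReaders.
        List<String> splitIds = Arrays.asList("Aa", "BB", "split-0", "split-1");
        int numReaders = 4;

        Map<Integer, List<String>> byReader = assign(splitIds, numReaders);
        byReader.forEach((reader, splits) ->
                System.out.println("reader " + reader + " -> " + splits));

        // Readers missing from the map received no split at all.
        System.out.println("readers without splits: " + (numReaders - byReader.size()));
    }
}
```

Running `main` prints the split IDs grouped by reader index, which makes it easy to see when several split IDs share a reader while other readers receive none.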
