Hi devs, I'd like to initiate a discussion about [FLIP-537: Enumerator with Global Split Assignment Distribution for Balanced Split Assignment] [1], which addresses critical limitations in our current Kafka connector split distribution mechanism.
As documented in [FLINK-31762] [2], several scenarios currently lead to uneven Kafka split distribution, causing reader delays and performance bottlenecks. The core issue is that the enumerator has no visibility into how splits are distributed after assignment.

This FLIP does two things:

1. ReaderRegistrationEvent enhancement: SourceOperator should send a ReaderRegistrationEvent carrying the metadata of its assigned splits after startup, to ensure state consistency.
2. Implementation in the Kafka connector: resolve imbalanced splits and make the enumerator aware of existing assignments during recovery. The enumerator will always choose the least-assigned subtask; see the FLIP [1] for the reasoning.

Any questions regarding this FLIP are welcome. Looking forward to hearing from you.

Best,
Hongshun

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-537%3A+Enumerator+with+Global+Split+Assignment+Distribution+for+Balanced+Split+assignment
[2] https://issues.apache.org/jira/browse/FLINK-31762
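
P.S. To make the "least-assigned subtask" idea from point 2 concrete, here is a minimal, self-contained sketch. It is only an illustration and not the FLIP-537 or Kafka connector code: the class name LeastAssignedSketch, the method assign, and the split ids are all hypothetical. It assumes the enumerator already knows how many splits each reader owns (for example, because readers reported them on registration) and that ties go to the lowest subtask index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration only; not the actual FLIP-537 implementation. */
class LeastAssignedSketch {

    /**
     * Assigns each new split to the subtask that currently owns the fewest splits.
     *
     * @param currentCounts number of splits each subtask already owns (missing = 0)
     * @param newSplits     ids of newly discovered splits (hypothetical ids)
     * @param parallelism   number of reader subtasks
     * @return new split ids grouped by target subtask index
     */
    static Map<Integer, List<String>> assign(
            Map<Integer, Integer> currentCounts, List<String> newSplits, int parallelism) {
        // Copy the known per-subtask counts into a working array.
        int[] counts = new int[parallelism];
        for (int i = 0; i < parallelism; i++) {
            counts[i] = currentCounts.getOrDefault(i, 0);
        }

        Map<Integer, List<String>> result = new TreeMap<>();
        for (String split : newSplits) {
            // Linear scan for the least-loaded subtask; ties go to the lowest index.
            int target = 0;
            for (int i = 1; i < parallelism; i++) {
                if (counts[i] < counts[target]) {
                    target = i;
                }
            }
            counts[target]++;
            result.computeIfAbsent(target, k -> new ArrayList<>()).add(split);
        }
        return result;
    }

    public static void main(String[] args) {
        // Assumed state after recovery: subtask 0 owns 3 splits, subtask 1 owns 1, subtask 2 owns none.
        Map<Integer, Integer> counts = new HashMap<>();
        counts.put(0, 3);
        counts.put(1, 1);
        System.out.println(assign(counts, Arrays.asList("t-0", "t-1", "t-2", "t-3"), 3));
    }
}
```

In this example the four new splits all go to subtasks 1 and 2, leaving the per-subtask totals at 3, 3 and 2. Without the counts reported by the readers, the enumerator would have no way to know that subtask 0 is already the most loaded.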