doc-laowu opened a new issue, #34890: URL: https://github.com/apache/beam/issues/34890
### What happened?

The Beam pipeline consumes from Kafka and runs on the Flink runner (FlinkRunner). After the job recovers from a checkpoint, it re-consumes Kafka messages that were already processed.

I investigated the cause: Flink's SourceOperator#open restores the checkpointed FlinkSourceSplits from the readerState, but SourceCoordinator#start then also invokes FlinkSourceSplitEnumerator#start, which recomputes the splits and sends them to the SourceOperator again via AddSplitEvent. They are added to FlinkSourceReaderBase#sourceSplits a second time, so the FlinkUnboundedSourceReader is created repeatedly for the same splits, and the Kafka messages are consumed in duplicate. (A minimal sketch of the restore-aware behavior I would expect is at the end of this issue.)

### Issue Priority

Priority: 2 (default / most bugs should be filed as P2)

### Issue Components

- [ ] Component: Python SDK
- [x] Component: Java SDK
- [ ] Component: Go SDK
- [ ] Component: Typescript SDK
- [ ] Component: IO connector
- [ ] Component: Beam YAML
- [ ] Component: Beam examples
- [ ] Component: Beam playground
- [ ] Component: Beam katas
- [ ] Component: Website
- [ ] Component: Infrastructure
- [ ] Component: Spark Runner
- [x] Component: Flink Runner
- [ ] Component: Samza Runner
- [ ] Component: Twister2 Runner
- [ ] Component: Hazelcast Jet Runner
- [ ] Component: Google Cloud Dataflow Runner
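For illustration only, below is a minimal, self-contained Java sketch (not the actual Beam or Flink classes; `Split`, `Enumerator`, and the partition names are hypothetical stand-ins) of the pattern I would expect: an enumerator that knows whether it was restored from a checkpoint and, if so, skips the initial split computation, because the readers already recovered their splits from their own state.

```java
import java.util.ArrayList;
import java.util.List;

public class RestoreAwareEnumeratorSketch {

  /** Stand-in for a source split (e.g. a Kafka topic partition). */
  static class Split {
    final String id;
    Split(String id) { this.id = id; }
    @Override public String toString() { return "Split(" + id + ")"; }
  }

  /** Simplified enumerator: only discovers splits on a fresh start. */
  static class Enumerator {
    private final List<Split> pendingSplits = new ArrayList<>();
    private final boolean restoredFromCheckpoint;

    Enumerator(List<Split> checkpointedSplits) {
      this.restoredFromCheckpoint = checkpointedSplits != null;
      if (restoredFromCheckpoint) {
        // Only splits that were still unassigned at checkpoint time come back
        // here; splits already handed to readers live in the readers' own state.
        pendingSplits.addAll(checkpointedSplits);
      }
    }

    void start() {
      if (restoredFromCheckpoint) {
        // Do NOT recompute the full split set here, otherwise every reader
        // receives its splits a second time via AddSplitEvent and consumes
        // the same Kafka partitions twice.
        return;
      }
      // Fresh start: perform the initial split discovery.
      pendingSplits.add(new Split("kafka-topic-partition-0"));
      pendingSplits.add(new Split("kafka-topic-partition-1"));
    }

    List<Split> pendingSplits() { return pendingSplits; }
  }

  public static void main(String[] args) {
    Enumerator fresh = new Enumerator(null);
    fresh.start();
    System.out.println("Fresh start assigns: " + fresh.pendingSplits());

    // After restore, readers already hold their splits; the enumerator
    // must not enumerate them again.
    Enumerator restored = new Enumerator(new ArrayList<>());
    restored.start();
    System.out.println("Restored start assigns: " + restored.pendingSplits());
  }
}
```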