gaoyunhaii commented on pull request #14831: URL: https://github.com/apache/flink/pull/14831#issuecomment-774809273
Hi @pnowojski, thanks a lot for the deep thoughts! I also like the idea of letting the upstream and downstream tasks do a handshake before exiting. I only have some minor additions:

1. I agree that we could do a two-step handshake. Since we already have `EndOfPartitionEvent` as the last event, I think we could introduce a new event before it, such as `EndOfInputEvent` (or some other name). Namely, change `EndOfPartitionEvent` to `EndOfInputEvent` and `CloseEvent` to `EndOfPartitionEvent`. We could also handle all of this logic in `StreamTask`.

2. If a task receives an RPC trigger, it must already have processed all the `EndOfInputEvent`s and might only have some pending `CheckpointBarrier`s to process. In this case the task could bookkeep the received RPC checkpoint triggers and, after all the `EndOfPartitionEvent`s are processed, trigger all the bookkept checkpoints.

3. > I assume you need this behaviour also in the current version? @gaoyunhaii what are you doing if there is a race condition and some Task finishes while trigger checkpoint RPC is already on its way to it?

   Yes, indeed. If a task finishes before it receives the trigger message, in the first version we would directly decline the checkpoint ([FLINK-21246](https://issues.apache.org/jira/browse/FLINK-21246)). For more fine-grained handling, we could recompute and re-trigger the following tasks ([FLINK-21081](https://issues.apache.org/jira/browse/FLINK-21081)).

I'll open a new PR based on this solution.
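To make the bookkeeping in point 2 more concrete, here is a minimal sketch. `PendingCheckpointTriggers` and its method names are hypothetical and not existing Flink classes. It also assumes that a trigger arriving after all partitions have ended can be fired immediately, which the comment above does not specify.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;

/** Hypothetical helper for illustration only; not part of Flink. */
final class PendingCheckpointTriggers {
    private final int numberOfInputChannels;
    private final LongConsumer triggerAction;
    private final List<Long> bookkept = new ArrayList<>();
    private int endOfPartitionSeen;

    PendingCheckpointTriggers(int numberOfInputChannels, LongConsumer triggerAction) {
        this.numberOfInputChannels = numberOfInputChannels;
        this.triggerAction = triggerAction;
    }

    /** Called when a checkpoint trigger arrives over RPC. */
    void onRpcTrigger(long checkpointId) {
        if (allPartitionsEnded()) {
            // Assumption: nothing is left to wait for, so trigger right away.
            triggerAction.accept(checkpointId);
        } else {
            // Barriers may still be pending on some channel: remember the trigger.
            bookkept.add(checkpointId);
        }
    }

    /** Called once per input channel when its EndOfPartitionEvent is processed. */
    void onEndOfPartition() {
        endOfPartitionSeen++;
        if (allPartitionsEnded()) {
            // All inputs are drained: fire the bookkept triggers in arrival order.
            for (long checkpointId : bookkept) {
                triggerAction.accept(checkpointId);
            }
            bookkept.clear();
        }
    }

    private boolean allPartitionsEnded() {
        return endOfPartitionSeen >= numberOfInputChannels;
    }
}
```

With two input channels, triggers for checkpoints 5 and 6 that arrive early would be held until the second `onEndOfPartition()` call and then fired in that order.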
