gaoyunhaii commented on pull request #14820:
URL: https://github.com/apache/flink/pull/14820#issuecomment-854710138
Hi @dawidwys, thanks a lot for the suggestions!
From my point of view, the main blocker for performing the checkpoint directly is
that we have to subsume or wait for the previous checkpoints. For example, suppose we have
```
A ---|
|--> C
B ---|
```
We may encounter a case like:
1. A emits Barrier 5.
2. A finishes and emits EndOfPartition, but C has not received it yet.
3. B finishes and emits EndOfPartition, but C has not received it yet.
4. The CheckpointCoordinator notifies C of checkpoint 6.
In this case we either wait for checkpoint 5 to finish and then trigger
checkpoint 6, or subsume checkpoint 5 and trigger checkpoint 6. But in
both cases, only the `CheckpointBarrierHandler` knows that checkpoint 5 exists,
so it seems we have to notify the `CheckpointBarrierHandler` in some way?
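To make the subsuming option concrete, here is a minimal sketch, assuming the handler gets notified of the coordinator's trigger. It is a toy model only: the class `SubsumeOnTriggerSketch` and the methods `onBarrier` and `onCoordinatorTrigger` are hypothetical names for illustration, not the actual `CheckpointBarrierHandler` API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not Flink's actual CheckpointBarrierHandler API.
public class SubsumeOnTriggerSketch {
    static final long NONE = -1L;

    long pendingId = NONE;                        // checkpoint whose barriers are still being aligned
    final List<String> events = new ArrayList<>();

    // The first barrier of a checkpoint starts alignment; this toy ignores further barriers.
    void onBarrier(long checkpointId) {
        if (pendingId == NONE) {
            pendingId = checkpointId;
            events.add("align " + checkpointId);
        }
    }

    // Hypothetical notification for a checkpoint triggered by the coordinator.
    void onCoordinatorTrigger(long checkpointId) {
        if (pendingId != NONE && pendingId < checkpointId) {
            events.add("abort " + pendingId);     // subsume the older pending checkpoint
        }
        pendingId = NONE;
        events.add("perform " + checkpointId);
    }

    public static void main(String[] args) {
        SubsumeOnTriggerSketch c = new SubsumeOnTriggerSketch();
        c.onBarrier(5);                           // barrier 5 from A reaches C
        c.onCoordinatorTrigger(6);                // the coordinator notifies C of checkpoint 6
        System.out.println(c.events);
    }
}
```

The point of the sketch is that `pendingId` lives inside the handler, so whoever performs checkpoint 6 has to go through it (or ask it) to learn about checkpoint 5.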
One possible alternative might be to let the `CheckpointBarrierHandler` first
check on alignment whether the checkpoint is outdated and, if so, abort it. The
drawback of this method is that the `CheckpointBarrierHandler` might do some
useless work, and we would have two entry points for `performCheckpoint()` that are
not fully independent.
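A rough sketch of this alternative, under the assumption that the task remembers the highest checkpoint id it has performed. The names `OutdatednessCheckSketch`, `triggerFromCoordinator`, `onAlignmentFinished` and `latestPerformedId` are hypothetical and only illustrate the idea; they are not existing Flink classes or methods.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not Flink's actual classes or method signatures.
public class OutdatednessCheckSketch {
    long latestPerformedId = -1L;                 // highest checkpoint id performed so far
    final List<String> events = new ArrayList<>();

    // Entry point 1: the coordinator triggers the checkpoint directly.
    void triggerFromCoordinator(long checkpointId) {
        performCheckpoint(checkpointId);
    }

    // Entry point 2: the handler has finished aligning the barriers of a checkpoint.
    void onAlignmentFinished(long checkpointId) {
        if (checkpointId <= latestPerformedId) {
            events.add("abort " + checkpointId);  // outdated: the alignment work was wasted
            return;
        }
        performCheckpoint(checkpointId);
    }

    private void performCheckpoint(long checkpointId) {
        latestPerformedId = checkpointId;
        events.add("perform " + checkpointId);
    }

    public static void main(String[] args) {
        OutdatednessCheckSketch c = new OutdatednessCheckSketch();
        c.triggerFromCoordinator(6);              // checkpoint 6 is performed first
        c.onAlignmentFinished(5);                 // alignment of checkpoint 5 completes afterwards
        System.out.println(c.events);
    }
}
```

Both entry points read and write `latestPerformedId`, which is the coupling mentioned above.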