AHeise commented on a change in pull request #15728:
URL: https://github.com/apache/flink/pull/15728#discussion_r621205945



##########
File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/DefaultCheckpointPlanCalculator.java
##########
@@ -111,7 +112,10 @@ public void setAllowCheckpointsAfterTasksFinished(boolean 
allowCheckpointsAfterT
                                         ? calculateAfterTasksFinished()
                                         : calculateWithAllTasksRunning();
 
-                        checkTasksStarted(result.getTasksToTrigger());
+                        checkTasksStarted(
+                                isUnalignedCheckpoint
+                                        ? result.getTasksToWaitFor()
+                                        : result.getTasksToTrigger());

Review comment:
   > In my opinion, only the first point makes sense. It is better to have the first checkpoint sooner rather than later. But I still don't understand why it matters, because UC, which we want to be the primary mode, doesn't support such behaviour. So my position is to have the same behaviour for both checkpoint types. If we think the delay in starting the first checkpoint is crucial, then we should support it for UC (maybe not in this ticket, but in general); but if we think it is not so important, then we can remove this support from AC.
   
   UC will not have this issue as strongly, since the barrier travels pretty much instantly as soon as recovery finishes. The slowest part during UC (the biggest state) will probably also be the part that takes longest to recover.
   
   The worst part is that, since UC barriers overtake in-flight data, we will cancel all checkpoints until the job has fully recovered. This can put a huge load on the checkpoint storage and on I/O in general. Hence, we absolutely need this change for UC.
   
   > However, there is also one more case which I would be worried about. What if recovery is the thing that's causing this huge backpressure?
   
   The big question here is: if I/O is the bottleneck, wouldn't we prolong recovery and increase backpressure by writing data to checkpoint storage at the same time as we recover from it? I'm assuming there is no clear answer to that. I'm also reluctant to shift the responsibility to the user, because I don't see how any non-expert could estimate it.
   
   That is, I'm assuming both ways are fine, each with its own disadvantages, and we should just pick one. When in doubt, pick the one that preserves the old behavior.
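   To make the distinction in the diff concrete, here is a minimal, hypothetical sketch. The types and names below (`Task`, `Plan`, `firstNotStarted`) are simplified stand-ins, not Flink's actual classes; it only illustrates why the started-task check should cover `tasksToWaitFor` for unaligned checkpoints but only `tasksToTrigger` for aligned ones:

```java
import java.util.List;

// Hedged sketch with made-up types; illustrates the guard from the diff,
// not Flink's real DefaultCheckpointPlanCalculator.
public class CheckpointPlanSketch {
    record Task(String name, boolean started) {}
    record Plan(List<Task> tasksToTrigger, List<Task> tasksToWaitFor) {}

    /** Returns the name of a not-yet-started task, or null if the checked set is all running. */
    static String firstNotStarted(Plan plan, boolean isUnalignedCheckpoint) {
        // Unaligned barriers overtake in-flight data, so every task we wait on
        // must already be running; aligned checkpoints only need the trigger set.
        List<Task> toCheck =
                isUnalignedCheckpoint ? plan.tasksToWaitFor() : plan.tasksToTrigger();
        for (Task t : toCheck) {
            if (!t.started()) {
                return t.name();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Task source = new Task("source", true);
        Task sink = new Task("sink", false);
        Plan plan = new Plan(List.of(source), List.of(source, sink));

        // Aligned: only the trigger set (source) must be running, so the check passes.
        if (firstNotStarted(plan, false) != null) {
            throw new AssertionError("aligned check should pass");
        }
        // Unaligned: the wait-for set includes the not-yet-started sink, so it fails.
        if (!"sink".equals(firstNotStarted(plan, true))) {
            throw new AssertionError("unaligned check should flag the sink");
        }
        System.out.println("ok");
    }
}
```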




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]
