yanghua commented on a change in pull request #8322: [FLINK-12364] Introduce a 
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#discussion_r284070033
 
 

 ##########
 File path: 
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
 ##########
 @@ -435,6 +439,12 @@ public boolean triggerCheckpoint(long timestamp, boolean 
isPeriodic) {
                        triggerCheckpoint(timestamp, checkpointProperties, 
null, isPeriodic, false);
                        return true;
                } catch (CheckpointException e) {
+                       try {
+                               long latestGeneratedCheckpointId = 
getCheckpointIdCounter().getAndIncrement();
 
 Review comment:
   > BTW, I have one more important additional point, which is the 
`numUnsuccessfulCheckpointsTriggers` in checkpoint coordinator, which 
absolutely sounds like something that should now be moved into the failure 
manager, wdyt?
   
   After we provided `CheckpointFailureManager`, IMO the 
`numUnsuccessfulCheckpointsTriggers` is not valuable. Currently, it is 
incremented in the trigger lock of method `triggerCheckpoint`. The domain of 
trigger lock is the subset of the trigger phase which been 
`CheckpointFailureManager` understood. So the counting is not correct. And it 
is just for logging purpose. So I suggest we could remove it in this PR or in 
the third step (next step). If the logging is really necessary, we could do it 
again after we implemented the new counting logic based on checkpoint id 
sequence. 
   
   What's your opinion?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

Reply via email to