StefanRRichter commented on a change in pull request #8322: [FLINK-12364] 
Introduce a CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#discussion_r283750480
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
 ##########
 @@ -435,6 +439,12 @@ public boolean triggerCheckpoint(long timestamp, boolean isPeriodic) {
                        triggerCheckpoint(timestamp, checkpointProperties, null, isPeriodic, false);
                        return true;
                } catch (CheckpointException e) {
+                       try {
+                               long latestGeneratedCheckpointId = getCheckpointIdCounter().getAndIncrement();
 
 Review comment:
   What exactly do you mean? Is it that the access to the id-generator does 
not happen under the lock? Then how about just rewriting the trigger checkpoint 
method:
   The catch does not look well designed anyway. Instead, you could report 
the exception to the failure manager rather than throwing it, or catch and 
report while still under the lock inside triggerCheckpoint. There is no real 
reason to catch outside the method/lock scope.
   BTW, I have one more important point: the 
`numUnsuccessfulCheckpointsTriggers` counter in the checkpoint coordinator 
sounds like something that should now be moved into the failure manager, 
wdyt?
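
   To make the suggestion concrete, here is a rough, self-contained sketch of the shape I have in mind. It is not the actual Flink code: `FailureManagerSketch`, `CoordinatorSketch`, `handleTriggerFailure`, `handleTriggerSuccess`, and the `simulateFailure` flag are hypothetical names used only for illustration, and resetting the counter on success is an assumption. The point is that the id is generated, and the failure is caught and reported, while still holding the lock, and that the unsuccessful-trigger counter lives in the failure manager.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical stand-in for the failure manager; not Flink's actual class. */
class FailureManagerSketch {
    // Counter that would move out of the coordinator (assumed semantics).
    private int numUnsuccessfulCheckpointTriggers;
    private long lastFailedCheckpointId = -1L;

    /** Records a failed trigger attempt. Called while the coordinator holds its lock. */
    void handleTriggerFailure(long checkpointId, Exception cause) {
        numUnsuccessfulCheckpointTriggers++;
        lastFailedCheckpointId = checkpointId;
    }

    /** Assumption: a successful trigger resets the counter. */
    void handleTriggerSuccess() {
        numUnsuccessfulCheckpointTriggers = 0;
    }

    int getNumUnsuccessfulCheckpointTriggers() {
        return numUnsuccessfulCheckpointTriggers;
    }

    long getLastFailedCheckpointId() {
        return lastFailedCheckpointId;
    }
}

/** Hypothetical, heavily simplified coordinator; only the locking shape matters. */
class CoordinatorSketch {
    private final Object lock = new Object();
    private final AtomicLong checkpointIdCounter = new AtomicLong(1);
    private final FailureManagerSketch failureManager;

    CoordinatorSketch(FailureManagerSketch failureManager) {
        this.failureManager = failureManager;
    }

    /** Returns true if the trigger succeeded; failures are reported, not rethrown. */
    boolean triggerCheckpoint(boolean simulateFailure) {
        synchronized (lock) {
            // The id is generated under the lock, so the report below
            // always refers to the attempt that actually failed.
            long checkpointId = checkpointIdCounter.getAndIncrement();
            try {
                if (simulateFailure) {
                    throw new IllegalStateException("simulated trigger failure");
                }
                failureManager.handleTriggerSuccess();
                return true;
            } catch (IllegalStateException e) {
                // Catch and report inside the lock scope instead of outside the method.
                failureManager.handleTriggerFailure(checkpointId, e);
                return false;
            }
        }
    }
}
```

   With this shape the caller only looks at the boolean result, and the coordinator no longer needs its own failure counter.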
