yanghua commented on a change in pull request #8322: [FLINK-12364] Introduce a
CheckpointFailureManager to centralized manage checkpoint failure
URL: https://github.com/apache/flink/pull/8322#discussion_r284102152
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
##########
@@ -435,6 +439,12 @@ public boolean triggerCheckpoint(long timestamp, boolean
isPeriodic) {
triggerCheckpoint(timestamp, checkpointProperties,
null, isPeriodic, false);
return true;
} catch (CheckpointException e) {
+ try {
+ long latestGeneratedCheckpointId =
getCheckpointIdCounter().getAndIncrement();
Review comment:
> What I meant is removing the `try-catch(CheckpointException)`around the
`triggerCheckpoint` call completely and exchange the `throw`s inside the method
by direct calls to the failure handler.
I have accepted all the suggestion, except this one, because when I was
refactoring, I found the complexity may exceed our imagination, some obstacle:
1)
```
public boolean triggerCheckpoint(long timestamp, boolean isPeriodic) {
```
This method returns a boolean value, so it needs a binary state, when we
catch `CheckpointException` we return false and when we get `PendingCheckpoint`
we return true. If the real triggerCheckpoint method does not throw
`CheckpointException` how two distinguish binary state?
2)The triggerCheckpoint's boolean return value has been used in so many test
case to judge whether a checkpoint is successful or not.
3) The `triggerSavepointInternal` method need to catch the
`CheckpointException` to return a completed exceptionally for the future
object, that means `triggerSavepointInternal` also needs a binary state for
`CompletableFuture<CompletedCheckpoint>`
4) I have implemented `CheckpointIDCounter#get`, after providing this
function. I think report failure centralized would be better than scattered
code, just like we provide a unified `PendingCheckpoint#abort` before.
What do you think?
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services