akalash commented on a change in pull request #18845:
URL: https://github.com/apache/flink/pull/18845#discussion_r811311861
##########
File path:
flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
##########
@@ -1385,6 +1385,7 @@ private CompletedCheckpoint
addCompletedCheckpointToStoreAndSubsumeOldest(
checkpointsCleaner.cleanCheckpointOnFailedStoring(completedCheckpoint,
executor);
}
+ reportFailedCheckpoint(completedCheckpoint, exception);
Review comment:
I agree that it is better to keep both 'success' and 'fail' reporting in
one place. It is why I wanted to move it to `completePendingCheckpoint` but it
is not possible since `finalizeCheckpoint` can be called independently so
'fail' reporting should be inside of `finalizeCheckpoint`. So right now we
report about the fail from `finalizeCheckpoint` and
`addCompletedCheckpointToStoreAndSubsumeOldest` and also we report about
'success' from `completePendingCheckpoint`. I don't really like it but I don't
think that moving 'succes' reporting to
`addCompletedCheckpointToStoreAndSubsumeOldest` somehow improve the situation
since the reporting should be the last step in the finalization of the
checkpoint but moving this step into the method which is the last for now but
can be changed in future seems not so promising
##########
File path:
flink-runtime/src/test/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinatorTest.java
##########
@@ -2083,6 +2094,12 @@ public void testTriggerAndConfirmSimpleSavepoint()
throws Exception {
assertEquals(pending.getCheckpointId(), success.getCheckpointID());
assertEquals(2, success.getOperatorStates().size());
+ AbstractCheckpointStats actualStats =
+
statsTracker.createSnapshot().getHistory().getCheckpointById(checkpointId);
+
+ assertEquals(checkpointId, actualStats.getCheckpointId());
+ assertEquals(CheckpointStatsStatus.COMPLETED, actualStats.getStatus());
+
Review comment:
No, this test doesn't fail in master. I just notice that we have no test
which checks the success scenario. So this extra condition just checks that we
switch status to `COMPLETED` when the checkpoint is successful.
Yes, the next commit adds the actual regression test which fails with the
current master. Technically, we can add that test to this commit but instead of
`FAILED` status we should check for NOT `COMPLETED` or `IN_PROGRESS` status
since this commit has one more bug which is resolved in next commit.(but I
decided to add the correct test at once to next commit)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]