rkhachatryan commented on a change in pull request #12611:
URL: https://github.com/apache/flink/pull/12611#discussion_r439007115



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/checkpoint/CheckpointCoordinator.java
##########
@@ -538,36 +538,45 @@ private void startTriggeringCheckpoint(CheckpointTriggerRequest request) {
                                                                        coordinatorsToCheckpoint, pendingCheckpoint, timer),
                                                        timer);
 
-                       CompletableFuture.allOf(masterStatesComplete, coordinatorCheckpointsComplete)
-                               .whenCompleteAsync(
-                                       (ignored, throwable) -> {
-                                               final PendingCheckpoint checkpoint =
-                                                       FutureUtils.getWithoutException(pendingCheckpointCompletableFuture);
-
-                                               if (throwable == null && checkpoint != null && !checkpoint.isDiscarded()) {
-                                                       // no exception, no discarding, everything is OK
-                                                       final long checkpointId = checkpoint.getCheckpointId();
-                                                       snapshotTaskState(
-                                                               timestamp,
-                                                               checkpointId,
-                                                               checkpoint.getCheckpointStorageLocation(),
-                                                               request.props,
-                                                               executions,
-                                                               request.advanceToEndOfTime);
-
-                                                       coordinatorsToCheckpoint.forEach((ctx) -> ctx.afterSourceBarrierInjection(checkpointId));
-
-                                                       onTriggerSuccess();
-                                               } else {
-                                                               // the initialization might not be finished yet
-                                                               if (checkpoint == null) {
-                                                                       onTriggerFailure(request, throwable);
+                       FutureUtils.assertNoException(

Review comment:
       Thanks for the update.
   
   I'm also not sure about `assertNoException`, which calls `System.exit` internally. That can cause problems because:
   - other jobs might be running in the same process, and they would be killed too
   - depending on the setup, the cause can be hard to find afterwards, e.g. buffered logs can be lost
   - any cleanup is skipped
   
   Instead, we could notify `CheckpointFailureManager` so that it terminates only this job.
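   
   A rough, standalone sketch of the idea, not actual Flink code: `JobFailureHandler` and `failJobOnException` are hypothetical names standing in for whatever hook `CheckpointFailureManager` would expose. The point is only that an unexpected exception in the future is routed to a per-job handler instead of exiting the JVM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class JobScopedFailureSketch {

    /** Hypothetical per-job failure hook; NOT the real CheckpointFailureManager API. */
    interface JobFailureHandler {
        void failJob(String jobId, Throwable cause);
    }

    /**
     * Reports an exceptional completion of the future to the handler of a single job.
     * Unlike a System.exit-based assertion, other jobs in the process keep running
     * and normal cleanup can still happen.
     */
    static <T> CompletableFuture<T> failJobOnException(
            CompletableFuture<T> future, String jobId, JobFailureHandler handler) {
        return future.whenComplete((value, throwable) -> {
            if (throwable != null) {
                handler.failJob(jobId, throwable);
            }
        });
    }

    public static void main(String[] args) {
        // Record which jobs were asked to fail (assumption: jobs are identified by a string id).
        List<String> failedJobs = new ArrayList<>();
        JobFailureHandler handler = (jobId, cause) -> failedJobs.add(jobId);

        CompletableFuture<Void> trigger = new CompletableFuture<>();
        failJobOnException(trigger, "job-1", handler);

        // Simulate an unexpected error while triggering the checkpoint.
        trigger.completeExceptionally(new IllegalStateException("trigger failed"));

        System.out.println(failedJobs);
    }
}
```

   In the sketch only the job that owns the failed future is reported; the process itself is left alone.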
   What do you think?



