tillrohrmann commented on a change in pull request #14798:
URL: https://github.com/apache/flink/pull/14798#discussion_r570084862



##########
File path: flink-runtime/src/main/java/org/apache/flink/runtime/scheduler/SchedulerBase.java
##########
@@ -522,6 +527,7 @@ protected ComponentMainThreadExecutor getMainThreadExecutor() {
     protected void failJob(Throwable cause) {
         incrementVersionsOfAllVertices();
         executionGraph.failJob(cause);
+        getTerminationFuture().thenRun(() -> archiveGlobalFailure(cause));

Review comment:
   I mean the case where a failure happens, the job goes into the `FAILING`
state and tries to cancel the tasks, and then the user cancels the job because
it is taking too long. The job will then go into the `CANCELING` state, which
results in the `CANCELED` state once all tasks have terminated.
   
   I think you are right that we should still record the failure cause.
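
   To make the sequence concrete, here is a minimal toy sketch of it. This is not Flink's scheduler code: `ToyJob`, `allTasksTerminated()` and the `archivedFailure` field are hypothetical stand-ins, and the state transitions are simplified to the ones mentioned above. It only illustrates that a callback registered on the termination future in `failJob` still fires when the job ends up `CANCELED` rather than `FAILED`.

```java
import java.util.Optional;
import java.util.concurrent.CompletableFuture;

/** Toy model of the fail-then-cancel scenario; not Flink's actual scheduler. */
public class FailureThenCancelSketch {

    // Simplified subset of job states; names follow the ones used in the comment.
    enum JobStatus { RUNNING, FAILING, FAILED, CANCELING, CANCELED }

    static class ToyJob {
        private JobStatus status = JobStatus.RUNNING;
        private final CompletableFuture<JobStatus> terminationFuture =
                new CompletableFuture<>();
        // Hypothetical stand-in for whatever archiveGlobalFailure(cause) stores.
        private Throwable archivedFailure;

        void failJob(Throwable cause) {
            if (status != JobStatus.RUNNING) {
                return;
            }
            status = JobStatus.FAILING;
            // Same shape as the line under review: record the cause once the
            // job reaches any terminal state, not only FAILED.
            terminationFuture.thenRun(() -> archivedFailure = cause);
        }

        void cancel() {
            // A user cancel while FAILING moves the job to CANCELING.
            if (status == JobStatus.RUNNING || status == JobStatus.FAILING) {
                status = JobStatus.CANCELING;
            }
        }

        void allTasksTerminated() {
            if (status == JobStatus.FAILING) {
                status = JobStatus.FAILED;
            } else if (status == JobStatus.CANCELING) {
                status = JobStatus.CANCELED;
            } else {
                return;
            }
            terminationFuture.complete(status);
        }

        JobStatus getStatus() {
            return status;
        }

        Optional<Throwable> getArchivedFailure() {
            return Optional.ofNullable(archivedFailure);
        }
    }

    public static void main(String[] args) {
        ToyJob job = new ToyJob();
        job.failJob(new RuntimeException("task failure")); // RUNNING -> FAILING
        job.cancel();                                      // FAILING -> CANCELING
        job.allTasksTerminated();                          // CANCELING -> CANCELED

        String cause = job.getArchivedFailure()
                .map(Throwable::getMessage)
                .orElse("none");
        System.out.println(job.getStatus() + " / " + cause);
        // prints "CANCELED / task failure"
    }
}
```

   In this toy version the failure cause is kept even though the terminal state is `CANCELED`, which is the behaviour discussed above.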



