[ https://issues.apache.org/jira/browse/FLINK-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15148480#comment-15148480 ]
ASF GitHub Bot commented on FLINK-3396:
---------------------------------------

Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1633#discussion_r52998947

    --- Diff: flink-runtime/src/main/scala/org/apache/flink/runtime/jobmanager/JobManager.scala ---
    @@ -1073,57 +1073,73 @@ class JobManager(
               // execute the recovery/writing the jobGraph into the SubmittedJobGraphStore asynchronously
               // because it is a blocking operation
               future {
    -            try {
    -              if (isRecovery) {
    -                executionGraph.restoreLatestCheckpointedState()
    -              }
    -              else {
    -                val snapshotSettings = jobGraph.getSnapshotSettings
    -                if (snapshotSettings != null) {
    -                  val savepointPath = snapshotSettings.getSavepointPath()
    +            val restoreStateSuccess =
    +              try {
    +                if (isRecovery) {
    +                  executionGraph.restoreLatestCheckpointedState()
    --- End diff --

    Had an offline discussion with Stephan. He agrees with you that the failure in this case is too hard. I'll undo that change by ACK'ing the submission earlier.


> Job submission Savepoint restore logic flawed
> ---------------------------------------------
>
>                 Key: FLINK-3396
>                 URL: https://issues.apache.org/jira/browse/FLINK-3396
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>             Fix For: 1.0.0
>
>
> When savepoint restoring fails, the thrown Exception fails the execution
> graph, but the client is not informed about the failure.
> The expected behaviour is that the submission should be acked with success or
> failure in any case. With savepoint restore failures, the ack message will be
> skipped.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
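
For illustration only, the behaviour the issue description asks for (the submission is acked with success or failure in any case) might be sketched roughly as below. This is not the code from the pull request: `SubmitAck`, `SubmitSuccess`, `SubmitFailure`, `AckSketch` and `submitAndAck` are hypothetical names, not Flink's actual messages or JobManager methods.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical ack messages, invented for this sketch (not Flink classes).
sealed trait SubmitAck
case object SubmitSuccess extends SubmitAck
final case class SubmitFailure(cause: Throwable) extends SubmitAck

object AckSketch {
  // Runs a restore step that may throw and always returns an ack,
  // so the caller is informed about success and failure alike.
  def submitAndAck(restore: () => Unit): SubmitAck =
    Try(restore()) match {
      case Success(_) => SubmitSuccess
      case Failure(t) => SubmitFailure(t) // failure is reported, not swallowed
    }
}
```

In this sketch a failing restore step turns into a `SubmitFailure` carrying the exception instead of silently skipping the ack. Note that `Try` only catches non-fatal exceptions.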