[ 
https://issues.apache.org/jira/browse/FLINK-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15150365#comment-15150365
 ] 

ASF GitHub Bot commented on FLINK-3396:
---------------------------------------

GitHub user uce opened a pull request:

    https://github.com/apache/flink/pull/1656

    [FLINK-3396] [runtime] Fix JobGraph submission and client ACK logic

    A failure when recovering savepoint state could lead to the job
    submission not being ACKed. For detached submissions, this could
    have led to a submission timeout even though the job eventually
    started running.
    
    Moreover, a failure to restore savepoint state could lead to a job
    graph skipping the submitted graph store for HA.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uce/flink 3396-submit

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1656.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1656
    
----
commit 3847e304c93a30c18560719d8f169ca424d734e6
Author: Ufuk Celebi <[email protected]>
Date:   2016-02-12T21:09:29Z

    [hotfix] Rename UnrecoverableException to SuppressRestartsException

commit 92e900b3266b93204ea84050d189f00bbbca6678
Author: Ufuk Celebi <[email protected]>
Date:   2016-02-16T16:49:13Z

    [FLINK-3396] [runtime] Fix JobGraph submission and client ACK logic
    
    A failure when recovering savepoint state could lead to the job
    submission not being ACKed. For detached submissions, this could
    have led to a submission timeout even though the job eventually
    started running.
    
    Moreover, a failure to restore savepoint state could lead to a job
    graph skipping the submitted graph store for HA.

----


> Job submission Savepoint restore logic flawed
> ---------------------------------------------
>
>                 Key: FLINK-3396
>                 URL: https://issues.apache.org/jira/browse/FLINK-3396
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Ufuk Celebi
>            Assignee: Ufuk Celebi
>             Fix For: 1.0.0
>
>
> When restoring from a savepoint fails, the thrown exception fails the 
> execution graph, but the client is not informed of the failure.
> The expected behaviour is that the submission is always acked, with either 
> success or failure. With savepoint restore failures, the ack message is 
> currently skipped.
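
The expected behaviour described in the issue can be illustrated with a minimal, self-contained Java sketch. It is not the code from the pull request: the names `SubmissionAckSketch`, `Client`, `SavepointRestore`, `RecordingClient` and `submit` are hypothetical and only show the idea that every submission gets exactly one reply, even when the savepoint restore step throws.

```java
import java.util.ArrayList;
import java.util.List;

public class SubmissionAckSketch {

    /** Hypothetical stand-in for the reply channel back to the submitting client. */
    interface Client {
        void ack(String jobId);
        void fail(String jobId, Throwable cause);
    }

    /** Hypothetical stand-in for the savepoint restore step, which may throw. */
    interface SavepointRestore {
        void restore(String jobId) throws Exception;
    }

    /** Client that records every reply so the behaviour can be inspected. */
    static final class RecordingClient implements Client {
        final List<String> replies = new ArrayList<>();

        @Override
        public void ack(String jobId) {
            replies.add("ACK " + jobId);
        }

        @Override
        public void fail(String jobId, Throwable cause) {
            replies.add("FAIL " + jobId + ": " + cause.getMessage());
        }
    }

    /**
     * Runs the restore step and always answers the client:
     * a failure reply if the restore throws, a success ACK otherwise.
     */
    static void submit(String jobId, SavepointRestore restore, Client client) {
        try {
            restore.restore(jobId);
        } catch (Exception e) {
            // The restore failed: report it instead of silently skipping the reply.
            client.fail(jobId, e);
            return;
        }
        client.ack(jobId);
    }

    public static void main(String[] args) {
        RecordingClient client = new RecordingClient();
        // Restore succeeds: the client receives a success ACK.
        submit("job-1", id -> { }, client);
        // Restore fails: the client receives a failure reply rather than nothing.
        submit("job-2", id -> { throw new Exception("savepoint not found"); }, client);
        System.out.println(client.replies);
    }
}
```

The point of the sketch is only the control flow: the reply to the client does not depend on the restore step completing normally, so a detached submission would not run into a timeout while waiting for an ACK that never arrives.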



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
