[ https://issues.apache.org/jira/browse/FLINK-6708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16023176#comment-16023176 ]

ASF GitHub Bot commented on FLINK-6708:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/3982

    [FLINK-6708] [yarn] Harden FlinkYarnSessionCli to handle GetClusterStatusResponse exceptions

    This PR is based on #3981.
    
    This PR hardens the FlinkYarnSessionCli by handling exceptions that occur when
    retrieving the GetClusterStatusResponse. If an exception is thrown instead of a
    response being returned, the CLI does not fail but retries the retrieval the next time.
    
    cc @rmetzger.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink hardenYarnSession

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3982.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3982
    
----
commit 72ce39a1752cc19669f003b70cc2708852a06ac5
Author: Till Rohrmann <[email protected]>
Date:   2017-05-24T15:59:51Z

    [FLINK-6646] [yarn] Let YarnJobManager delete Yarn application files
    
    Previously, the YarnClusterClient decided when to delete the Yarn application files.
    This is problematic because the client does not know whether a Yarn application
    is being restarted or terminated. As a result, the files were always deleted. This
    prevents Yarn from restarting a failed ApplicationMaster, effectively thwarting
    Flink's HA capabilities.
    
    This PR changes the behaviour such that the YarnJobManager deletes the Yarn files
    if it receives a StopCluster message. That way, we can be sure that the Yarn files
    are deleted only if the cluster is intended to be shut down.
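
    The rule described in this commit message can be sketched as follows. This is
    only an illustration under stated assumptions, not the code from the commit:
    the class StopClusterCleanupSketch, the Message enum and the onMessage method
    are hypothetical names, and a local directory stands in for the Yarn
    application files.

```java
import java.io.File;

// Hypothetical sketch; none of these names are taken from the Flink code base.
class StopClusterCleanupSketch {

    // Stand-in for the messages a job manager might receive (assumption).
    enum Message { STOP_CLUSTER, JOB_STATUS_CHANGED }

    /**
     * Deletes the application files only when an explicit stop request arrives.
     * Returns true if a deletion was triggered, false otherwise.
     */
    static boolean onMessage(Message message, File applicationFiles) {
        if (message != Message.STOP_CLUSTER) {
            // Any other message (for example during a restart) leaves the files alone.
            return false;
        }
        deleteRecursively(applicationFiles);
        return true;
    }

    // Deletes a file, or a directory together with everything below it.
    static void deleteRecursively(File file) {
        File[] children = file.listFiles(); // null when 'file' is not a directory
        if (children != null) {
            for (File child : children) {
                deleteRecursively(child);
            }
        }
        file.delete();
    }
}
```

    The point of the sketch is that the component receiving the stop request, not
    the client, owns the cleanup decision, so files survive a restart of a failed
    ApplicationMaster.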

commit 9227539f97e6dbc77c5367b8c555b4ba0b2ad06d
Author: Till Rohrmann <[email protected]>
Date:   2017-05-24T16:26:57Z

    [FLINK-6708] [yarn] Harden FlinkYarnSessionCli to handle GetClusterStatusResponse exceptions
    
    This PR hardens the FlinkYarnSessionCli by handling exceptions that occur when
    retrieving the GetClusterStatusResponse. If an exception is thrown instead of a
    response being returned, the CLI does not fail but retries the retrieval the next time.
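
    A minimal sketch of the hardening described above, assuming the CLI polls the
    cluster status in a loop. The names StatusPollSketch, ClusterStatus and
    pollStatus are hypothetical and are not taken from FlinkYarnSessionCli; the
    Supplier stands in for whatever call retrieves the GetClusterStatusResponse.

```java
import java.util.function.Supplier;

// Hypothetical sketch; not the actual FlinkYarnSessionCli implementation.
class StatusPollSketch {

    // Stand-in for the status reply (assumption, not Flink's class).
    static final class ClusterStatus {
        final int taskManagers;

        ClusterStatus(int taskManagers) {
            this.taskManagers = taskManagers;
        }
    }

    /**
     * Polls the status maxPolls times. A failed poll is reported and skipped,
     * so the loop carries on and simply tries again on the next iteration.
     * Returns the number of polls that produced a status.
     */
    static int pollStatus(Supplier<ClusterStatus> fetcher, int maxPolls) {
        int successfulPolls = 0;
        for (int i = 0; i < maxPolls; i++) {
            try {
                ClusterStatus status = fetcher.get();
                System.out.println("Number of connected TaskManagers: " + status.taskManagers);
                successfulPolls++;
            } catch (RuntimeException e) {
                // Do not fail the session; report the problem and retry next time.
                System.err.println("Could not retrieve the cluster status: " + e.getMessage());
            }
        }
        return successfulPolls;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Simulated fetcher that fails on every other call.
        int ok = pollStatus(() -> {
            if (calls[0]++ % 2 == 0) {
                throw new IllegalStateException("status request timed out");
            }
            return new ClusterStatus(2);
        }, 4);
        System.out.println("Successful polls: " + ok);
    }
}
```

    Catching the exception inside the loop body, rather than around the whole
    loop, is what lets a single failed retrieval pass without ending the session.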

----


> Don't let the FlinkYarnSessionCli fail if it cannot retrieve the ClusterStatus
> ------------------------------------------------------------------------------
>
>                 Key: FLINK-6708
>                 URL: https://issues.apache.org/jira/browse/FLINK-6708
>             Project: Flink
>          Issue Type: Improvement
>          Components: YARN
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: Till Rohrmann
>            Assignee: Till Rohrmann
>            Priority: Minor
>
> The {{FlinkYarnSessionCli}} should not fail if it cannot retrieve the 
> {{GetClusterStatusResponse}}. This would harden Flink's Yarn session.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
