[ https://issues.apache.org/jira/browse/FLINK-2790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946639#comment-14946639 ]

ASF GitHub Bot commented on FLINK-2790:
---------------------------------------

Github user uce commented on a diff in the pull request:

    https://github.com/apache/flink/pull/1213#discussion_r41374384
  
    --- Diff: flink-runtime/src/test/scala/org/apache/flink/runtime/testingUtils/TestingCluster.scala ---
    @@ -103,18 +103,17 @@ class TestingCluster(
           instanceManager,
           scheduler,
           libraryCacheManager,
    -      _,
           executionRetries,
           delayBetweenRetries,
           timeout,
           archiveCount,
    -      leaderElectionService) = JobManager.createJobManagerComponents(config)
    +      leaderElectionService) = JobManager.createJobManagerComponents(
    +      config,
    --- End diff --
    
    The indentation of the arguments is off here.


> Add high availability support for Yarn
> --------------------------------------
>
>                 Key: FLINK-2790
>                 URL: https://issues.apache.org/jira/browse/FLINK-2790
>             Project: Flink
>          Issue Type: Sub-task
>          Components: JobManager, TaskManager
>            Reporter: Till Rohrmann
>             Fix For: 0.10
>
>
> Add master high availability support for Yarn. The idea is to let Yarn
> restart a failed application master in a new container. For that, we set the
> number of application retries to a value greater than 1.
> From Yarn version 2.4.0 onwards, it is possible to reuse already started
> containers for the TaskManagers, thus avoiding unnecessary restart delays.
> From Yarn version 2.6.0 onwards, it is possible to specify an interval in which
> the number of application attempts has to be exceeded in order to fail the
> job. This prevents long-running jobs from eventually depleting all
> available application attempts.
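
The effect of the attempt interval described in the issue can be illustrated with a small, self-contained sketch. It is a simplified model, not Yarn's or Flink's actual implementation: the names `HaSettings` and `attemptsExhausted` are hypothetical, and the rule that the application fails once the failures counted inside the window reach the maximum number of attempts is an assumption made for illustration.

```scala
// Simplified model of windowed attempt counting. Assumption for illustration
// only; this is not Yarn's actual implementation.
object AttemptWindowSketch {

  /** Hypothetical settings mirroring the options described in the issue. */
  final case class HaSettings(
      maxAppAttempts: Int,              // must be greater than 1 for a restart to happen
      validityIntervalMs: Option[Long]) // failure window; None means all failures count

  /** Returns true if the application should be failed at time `nowMs`. */
  def attemptsExhausted(
      settings: HaSettings,
      failureTimesMs: Seq[Long],
      nowMs: Long): Boolean = {
    val counted = settings.validityIntervalMs match {
      // Only failures that happened inside the window are counted.
      case Some(interval) => failureTimesMs.count(t => t > nowMs - interval)
      // Without a window, every failure ever seen is counted.
      case None => failureTimesMs.size
    }
    counted >= settings.maxAppAttempts
  }

  def main(args: Array[String]): Unit = {
    val hour = 60L * 60L * 1000L
    // Three failures spread over 20 hours of a long-running job.
    val failures = Seq(0L, 10 * hour, 20 * hour)
    val now = 20 * hour

    val unbounded = HaSettings(maxAppAttempts = 3, validityIntervalMs = None)
    val windowed = unbounded.copy(validityIntervalMs = Some(hour))

    println(attemptsExhausted(unbounded, failures, now))
    println(attemptsExhausted(windowed, failures, now))
  }
}
```

In this model, the long-running job without a window has used up its three attempts, while the same job with a one-hour window has only one counted failure and keeps running.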



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
