[ 
https://issues.apache.org/jira/browse/FLINK-6174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15941521#comment-15941521
 ] 

ASF GitHub Bot commented on FLINK-6174:
---------------------------------------

Github user WangTaoTheTonic commented on the issue:

    https://github.com/apache/flink/pull/3599
  
    I don't think it's a good idea, as it does not solve the "split brain" issue 
either.
    
    The key problem is that `LeaderLatch` in Curator is too sensitive to the 
state of its connection to ZooKeeper (it revokes leadership when the connection 
to ZooKeeper is temporarily broken). Probably the best way is to offer a 
"duller" (less sensitive) `LeaderLatch`, which could also be used in a 
standalone cluster.
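
    One way to picture a less sensitive latch is a wrapper that tolerates a 
short connection loss before giving up leadership. The Java sketch below is 
only an illustration under stated assumptions: `TolerantLeadership`, its 
`ConnectionEvent` enum and the grace period are hypothetical names, not 
Curator's or Flink's API, and the caller is assumed to feed in connection 
events and timestamps itself.

```java
import java.time.Duration;

/** Hypothetical sketch of a "duller" leadership tracker; not Curator's LeaderLatch. */
final class TolerantLeadership {
    /** Assumed event kinds, loosely mirroring what a ZooKeeper client reports. */
    enum ConnectionEvent { CONNECTED, SUSPENDED, RECONNECTED, LOST }

    private final long graceMillis;
    private boolean leader;
    private long suspendedAtMillis = -1; // -1 means the connection is not suspended

    TolerantLeadership(Duration grace) {
        this.graceMillis = grace.toMillis();
    }

    /** Called when the underlying election grants leadership. */
    void grantLeadership() {
        leader = true;
        suspendedAtMillis = -1;
    }

    /** Feeds a connection event observed at the given time. */
    void onEvent(ConnectionEvent event, long nowMillis) {
        switch (event) {
            case SUSPENDED:
                // Remember when the outage started, but keep leadership for now.
                if (suspendedAtMillis < 0) {
                    suspendedAtMillis = nowMillis;
                }
                break;
            case CONNECTED:
            case RECONNECTED:
                // Revoke only if the outage already exceeded the grace period.
                tick(nowMillis);
                suspendedAtMillis = -1;
                break;
            case LOST:
                // The connection is considered gone; give up leadership at once.
                leader = false;
                suspendedAtMillis = -1;
                break;
        }
    }

    /** Periodic check: revokes leadership once a suspension outlasts the grace period. */
    void tick(long nowMillis) {
        if (leader && suspendedAtMillis >= 0 && nowMillis - suspendedAtMillis > graceMillis) {
            leader = false;
        }
    }

    boolean isLeader() {
        return leader;
    }

    public static void main(String[] args) {
        TolerantLeadership latch = new TolerantLeadership(Duration.ofSeconds(5));
        latch.grantLeadership();
        latch.onEvent(ConnectionEvent.SUSPENDED, 1_000);
        latch.onEvent(ConnectionEvent.RECONNECTED, 4_000);
        System.out.println("leader after short outage: " + latch.isLeader());
    }
}
```

    A short suspension followed by a reconnect leaves leadership untouched, 
while a suspension longer than the grace period, or a `LOST` event, revokes it. 
To avoid two leaders at once, the grace period would presumably have to stay 
below the ZooKeeper session timeout.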
    
    I did the same work in our own private Spark release; let me see if it can be 
reused.


> Introduce a leader election service in yarn mode to make JobManager always 
> available
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-6174
>                 URL: https://issues.apache.org/jira/browse/FLINK-6174
>             Project: Flink
>          Issue Type: Improvement
>          Components: JobManager
>            Reporter: Tao Wang
>            Assignee: Tao Wang
>
> Currently in YARN mode, if we use ZooKeeper as the high-availability choice, 
> it will create an election service that obtains a leader through ZooKeeper 
> election.
> When the ZooKeeper leader crashes or the connection between the JobManager 
> and the ZooKeeper instance is broken, the JobManager's leadership is revoked 
> and a Disconnect message is sent to the TaskManager, which cancels all 
> running tasks and makes them wait for the connection between JM and ZK to be 
> rebuilt.
> In YARN mode, we have one and only one JobManager (AM) at a time, and it 
> should always be the leader instead of being elected through ZooKeeper. We 
> can introduce a new leader election service in YARN mode to achieve that.
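
The proposal in the issue description can be sketched as an election service 
that grants leadership to its single contender as soon as it starts and never 
revokes it because of ZooKeeper connection changes. This is a minimal Java 
sketch under assumptions: `AlwaysLeaderElectionSketch` and `SketchContender` 
are hypothetical, simplified stand-ins and not Flink's actual interfaces.

```java
import java.util.UUID;

/** Hypothetical election service that makes its only contender the leader right away. */
final class AlwaysLeaderElectionSketch {
    private SketchContender contender;
    private UUID sessionId;

    /** Registers the single contender and grants it leadership immediately. */
    synchronized void start(SketchContender newContender) {
        if (contender != null) {
            throw new IllegalStateException("election service already started");
        }
        contender = newContender;
        sessionId = UUID.randomUUID();
        newContender.grantLeadership(sessionId);
    }

    /** Leadership ends only when the service itself is stopped. */
    synchronized void stop() {
        if (contender != null) {
            contender.revokeLeadership();
            contender = null;
            sessionId = null;
        }
    }

    /** True while the service runs and the given id is the current leader session. */
    synchronized boolean hasLeadership(UUID candidateSessionId) {
        return contender != null && sessionId.equals(candidateSessionId);
    }

    public static void main(String[] args) {
        AlwaysLeaderElectionSketch election = new AlwaysLeaderElectionSketch();
        election.start(new SketchContender() {
            @Override
            public void grantLeadership(UUID id) {
                System.out.println("granted leadership, session " + id);
            }

            @Override
            public void revokeLeadership() {
                System.out.println("leadership revoked");
            }
        });
        election.stop();
    }
}

/** Hypothetical, simplified contender callback; stands in for the JobManager. */
interface SketchContender {
    void grantLeadership(UUID leaderSessionId);

    void revokeLeadership();
}
```

With such a service the JobManager would keep its leadership across a 
temporary ZooKeeper outage, since nothing in the service reacts to connection 
state. Whether that is safe depends on YARN really running only one 
JobManager (AM) at a time, as the description assumes.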



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
