[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036846#comment-17036846
 ] 

Mate Szalay-Beko edited comment on ZOOKEEPER-2164 at 2/14/20 9:39 AM:
----------------------------------------------------------------------

[~suhas.dantkale] I think this issue with the 0.0.0.0 hostnames needs some more 
thought. This earlier ticket may be relevant: 
[https://issues.apache.org/jira/browse/ZOOKEEPER-107] - so maybe this 
behaviour is intentional and we want to allow the new peers to send their 
addresses. There is also the config property 'quorumListenOnAllIPs'; maybe we 
can solve your problem with that?
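
For reference, a minimal zoo.cfg sketch of that property. The hostnames and ports below are placeholders, not taken from this ticket, and whether this actually removes the need for 0.0.0.0 in the server list is still an open question here. As documented, 'quorumListenOnAllIPs=true' makes the server accept peer connections on all local interfaces instead of only the address given in its own server.N line:

```
# zoo.cfg fragment (hypothetical hosts, for illustration only)
quorumListenOnAllIPs=true
server.1=zk1.example.com:2888:3888
server.2=zk2.example.com:2888:3888
server.3=zk3.example.com:2888:3888
```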

Anyway, it might be worth a discussion on the dev list. On the user list, there 
is also a case where someone is running dockerized ZooKeeper and claims that 
they have to use the 0.0.0.0 address.


> fast leader election keeps failing
> ----------------------------------
>
>                 Key: ZOOKEEPER-2164
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2164
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection
>    Affects Versions: 3.4.5
>            Reporter: Michi Mutsuzaki
>            Assignee: Mate Szalay-Beko
>            Priority: Major
>             Fix For: 3.7.0, 3.5.8
>
>
> I have a 3-node cluster with sids 1, 2 and 3. Originally 2 is the leader. 
> When I shut down 2, 1 and 3 keep going back to leader election. Here is what 
> seems to be happening.
> - Both 1 and 3 elect 3 as the leader.
> - 1 receives votes from 3 and itself, and starts trying to connect to 3 as a 
> follower.
> - 3 doesn't receive votes for 5 seconds because connectOne() to 2 doesn't 
> time out for 5 seconds: 
> https://github.com/apache/zookeeper/blob/41c9fcb3ca09cd3d05e59fe47f08ecf0b85532c8/src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java#L346
> - By the time 3 receives votes, 1 has given up trying to connect to 3: 
> https://github.com/apache/zookeeper/blob/41c9fcb3ca09cd3d05e59fe47f08ecf0b85532c8/src/java/main/org/apache/zookeeper/server/quorum/Learner.java#L247
> I'm using 3.4.5, but it looks like this part of the code hasn't changed for a 
> while, so I'm guessing later versions have the same issue.
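
The race described in the report can be sketched as a small timing model. This is only an illustration, not ZooKeeper code: the function name, the retry count and the wait between retries are assumptions made up for the example; only the 5 second blocking connect comes from the description above. It assumes each refused connection attempt returns immediately, so the follower's budget is just the waits between attempts.

```python
def follower_gives_up_first(leader_blocked_ms, follower_attempts, retry_wait_ms):
    """Return True if the follower runs out of connection attempts
    before the elected leader stops being blocked.

    Assumption: a refused attempt fails instantly, so the follower
    only spends time in the waits between attempts.
    """
    follower_budget_ms = (follower_attempts - 1) * retry_wait_ms
    return follower_budget_ms < leader_blocked_ms


# Leader blocked for 5 s on the connect to the dead peer (from the report);
# 5 attempts with 1 s between them are illustrative values only.
print(follower_gives_up_first(5000, 5, 1000))
```

Under these assumed numbers the follower gives up before the leader is ready, which matches the loop of repeated elections in the report. Shortening the blocking connect or giving the follower a longer retry budget would flip the result in this model.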



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
