[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17037667#comment-17037667
 ] 

Mate Szalay-Beko edited comment on ZOOKEEPER-2164 at 2/15/20 11:56 PM:
-----------------------------------------------------------------------

[~suhas.dantkale]

Actually, after adding some extra logs and analyzing them, I realized that the 
issue I found before and reproduced is indeed caused by the 0.0.0.0 addresses 
(I just mixed up the configs, and indeed I used wildcard addresses in the 
config files).  Sorry for misleading you...

Your root cause analysis is totally correct. I have a fix that solves this
issue. It is actually quite easy: sending the address in the initial message
was introduced in 3.5.0 (ZOOKEEPER-107), and the 3.4 versions never used this
field. For backward compatibility (needed during rolling upgrade), 3.5 still
has a version of {{QuorumCnxManager.connectOne()}} that needs no election
address and instead uses the last known address to initiate the connection. So
the solution can be simply to call this method if the address is a wildcard
address (0.0.0.0), which can be checked using
{{InetAddress.isAnyLocalAddress()}}.
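
To illustrate the receiver-side check, here is a minimal, self-contained sketch. It is not the actual patch: the class and both helper methods are hypothetical names invented for this example, and port 3888 and the 10.0.0.x addresses are placeholder values. The only real API used is the JDK's {{InetAddress.isAnyLocalAddress()}}. In the real fix, the "fall back" branch would correspond to calling the {{connectOne()}} variant that takes no election address.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class WildcardAddressCheck {

    // Hypothetical helper (not ZooKeeper code): an election address received in
    // the initial message is usable only if it resolved and is not a wildcard
    // such as 0.0.0.0 (IPv4) or :: (IPv6).
    static boolean isUsableElectionAddress(InetSocketAddress received) {
        InetAddress addr = received.getAddress(); // null when unresolved
        return addr != null && !addr.isAnyLocalAddress();
    }

    // Hypothetical helper: pick the address to connect back to. A wildcard
    // address from the peer is ignored in favor of the last known address.
    static InetSocketAddress chooseAddress(InetSocketAddress received,
                                           InetSocketAddress lastKnown) {
        return isUsableElectionAddress(received) ? received : lastKnown;
    }

    public static void main(String[] args) {
        // IP literals are parsed directly, so no DNS lookup happens here.
        InetSocketAddress lastKnown = new InetSocketAddress("10.0.0.2", 3888);
        InetSocketAddress wildcard = new InetSocketAddress("0.0.0.0", 3888);
        InetSocketAddress concrete = new InetSocketAddress("10.0.0.3", 3888);

        System.out.println("peer sent wildcard, connecting to: "
                + chooseAddress(wildcard, lastKnown).getHostString());
        System.out.println("peer sent concrete address, connecting to: "
                + chooseAddress(concrete, lastKnown).getHostString());
    }
}
```

Note that the sender still puts 0.0.0.0 into the message in this sketch; only the receiver decides to ignore it.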

Still, we have to verify that this change is compatible with dynamic reconfig
(I think it is) and that it also works with rolling upgrade. (I also had the
idea to not even send the 0.0.0.0 in the first place, but then I think we would
hit parsing errors during rolling upgrades, so it is best to still send it and
just filter it out on the receiver side.) Also, the same change will not work
on both the 3.5 and 3.6 branches, as we have the MultiAddress feature added for
3.6 and we use a slightly different message format / internal representation of
addresses.

Anyway, as you were the one who found this issue in the first place, let me
know if you wish to take it over and work on it. I think it is a change that
will require some discussion within the community. Otherwise I will push my PR
and do the rest of the work.

BTW: I don't think that this would be something that can be verified by unit 
tests. Even using 0.0.0.0 in the unit tests would always work (would be similar 
to 127.0.0.1), as we are executing everything on a single machine.

Still, it is a question for me whether this ticket was originally about this
issue or not. Some of the comments seem to indicate that people were hitting
the 0.0.0.0 issue, but the original description mentions ZooKeeper 3.4.5, and
that cannot be the issue you and I were talking about here. I still have to
look into that.



> fast leader election keeps failing
> ----------------------------------
>
>                 Key: ZOOKEEPER-2164
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2164
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: leaderElection
>    Affects Versions: 3.4.5
>            Reporter: Michi Mutsuzaki
>            Assignee: Mate Szalay-Beko
>            Priority: Major
>             Fix For: 3.7.0, 3.5.8
>
>
> I have a 3-node cluster with sids 1, 2 and 3. Originally 2 is the leader. 
> When I shut down 2, 1 and 3 keep going back to leader election. Here is what 
> seems to be happening.
> - Both 1 and 3 elect 3 as the leader.
> - 1 receives votes from 3 and itself, and starts trying to connect to 3 as a 
> follower.
> - 3 doesn't receive votes for 5 seconds because connectOne() to 2 doesn't 
> time out for 5 seconds: 
> https://github.com/apache/zookeeper/blob/41c9fcb3ca09cd3d05e59fe47f08ecf0b85532c8/src/java/main/org/apache/zookeeper/server/quorum/QuorumCnxManager.java#L346
> - By the time 3 receives votes, 1 has given up trying to connect to 3: 
> https://github.com/apache/zookeeper/blob/41c9fcb3ca09cd3d05e59fe47f08ecf0b85532c8/src/java/main/org/apache/zookeeper/server/quorum/Learner.java#L247
> I'm using 3.4.5, but it looks like this part of the code hasn't changed for a 
> while, so I'm guessing later versions have the same issue.


