[
https://issues.apache.org/jira/browse/ZOOKEEPER-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17108982#comment-17108982
]
Keli Wang edited comment on ZOOKEEPER-3830 at 5/16/20, 10:22 AM:
-----------------------------------------------------------------
After reproducing this with a 5-node cluster, I think the key log line is:
{code:java}
2020-05-16 17:06:59,514 [myid:6] - INFO
[QuorumPeer[myid=6](plain=[0:0:0:0:0:0:0:0]:2186)(secure=disabled):Leader@1296]
- Have quorum of supporters, sids: [ [2, 3, 4, 6],[2, 3, 4] ]; starting up and
setting last processed zxid: 0x1a00000000
{code}
node6's lastSeenQuorumVerifier doesn't contain node6 itself, so the
Leader#allowedToCommit field is false after node6 becomes leader.
In the original 5-node cluster, lastSeenQuorumVerifier contains only 5
members. Every follower gets lastSeenQuorumVerifier from the current leader
when the leader sends the NEWLEADER packet. So after node6 started, it got a
lastSeenQuorumVerifier with 5 members from the leader, and this
lastSeenQuorumVerifier doesn't contain node6 itself.
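To make the sequence above concrete, here is a minimal toy model of it. It is
only a sketch, not ZooKeeper's actual code: the Verifier and Peer classes, the
onNewLeader method, and the server ids 1-5 for the original cluster are all
assumptions made for illustration. It shows only how a new node can end up
with a last-seen membership view that does not include itself.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the state described above; NOT ZooKeeper's real classes. */
public class QuorumViewSketch {

    /** Hypothetical stand-in for a quorum verifier: just a set of voting server ids. */
    static final class Verifier {
        final Set<Long> votingMembers;

        Verifier(Set<Long> members) {
            this.votingMembers = new TreeSet<>(members);
        }

        boolean contains(long sid) {
            return votingMembers.contains(sid);
        }
    }

    /** Hypothetical peer that tracks only its id and its last-seen membership view. */
    static final class Peer {
        final long myId;
        Verifier lastSeen;

        Peer(long myId, Verifier ownConfig) {
            this.myId = myId;
            this.lastSeen = ownConfig; // starts from the config file it was launched with
        }

        /** Models the follower taking the leader's view when NEWLEADER arrives. */
        void onNewLeader(Verifier fromLeader) {
            this.lastSeen = fromLeader;
        }

        boolean lastSeenContainsSelf() {
            return lastSeen.contains(myId);
        }
    }

    public static void main(String[] args) {
        // Assumed ids: the original cluster is 1-5, the new node is 6.
        Verifier oldView = new Verifier(new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L, 5L)));
        Verifier newView = new Verifier(new TreeSet<>(Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L)));

        Peer node6 = new Peer(6L, newView);
        System.out.println("before NEWLEADER, contains self: " + node6.lastSeenContainsSelf());

        // The old leader still runs with the 5-member config and sends that view.
        node6.onNewLeader(oldView);
        System.out.println("after NEWLEADER, contains self: " + node6.lastSeenContainsSelf());
    }
}
```

In this model the second check is false, which corresponds to the situation in
which node6 later becomes leader with allowedToCommit set to false.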
> After add a new node, zookeeper cluster won't commit any proposal if this new
> node is leader
> --------------------------------------------------------------------------------------------
>
> Key: ZOOKEEPER-3830
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3830
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.5.6, 3.5.7, 3.5.8
> Environment: Zookeeper 3.5.8
> JDK 1.8
> Reporter: Keli Wang
> Priority: Major
> Attachments: reproduce-zkclusters.tar.gz
>
>
> I have a ZooKeeper cluster with 3 nodes; node3 is the leader of the cluster.
>
> {code:java}
> server.1=node1
> server.2=node2
> server.3=node3 # current leader
> {code}
> With dynamic reconfiguration disabled, I scaled this cluster to 4 nodes in 2
> steps:
> # Start node4 with the new config; node4 joins as a follower.
> # Modify the config and restart node1, node2 and node3 one by one.
> The new cluster config is:
> {code:java}
> server.1=node1
> server.2=node2
> server.3=node3
> server.4=node4 # current leader
> {code}
> After the restarts, node4 is the leader of this cluster, but I can no longer
> connect to the cluster using zkCli.
> If I restart node4, node3 becomes the new leader, and I can connect to the
> cluster using zkCli again.
> After some digging, I found that node4's Leader#allowedToCommit field is
> false, so this cluster won't commit any new proposals.
>
> I have attached a ZooKeeper cluster that reproduces this problem. The cluster
> in the attachment can run on a single machine.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)