[ https://issues.apache.org/jira/browse/ZOOKEEPER-4816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
mutu updated ZOOKEEPER-4816: ---------------------------- Component/s: leaderElection > A follower can not join the cluster for 30s seconds > --------------------------------------------------- > > Key: ZOOKEEPER-4816 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4816 > Project: ZooKeeper > Issue Type: Bug > Components: leaderElection > Affects Versions: 3.10.0 > Reporter: mutu > Priority: Critical > Attachments: system1_FLE.log, system1_hang.log, system2_FLE.log, > system2_hang.log, system3_FLE.log, system3_hang.log > > > We encounter a strange scenario. When we set up the cluster of zookeeper(3 > nodes totally), the third node is stuck in ({*}sealStream{*}) serializing the > snapshot to the local disk. However, the leader election is executed > normally. After the election, the third node is elected as the leader. The > other two nodes fail to connect with the leader. Hence, the first and second > nodes restart the leader election, finally the second node is elected as the > leader. At this time, the third node still act as the leader. There are two > leaders in the cluster. The first node can not join the cluster for 30s. > The logs of the first node are as following. > {code:java} > 2024-07-01 02:35:28,223 [myid:] - INFO > [WorkerReceiver[myid=1]:o.a.z.s.q.FastLeaderElection$Messenger$WorkerReceiver@391] > - Notification: my state:LOOKING; n.sid:2, n.state:LOOKING, n.leader:2, > n.round:0x2, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, > n.config version:0x02024-07-01 02:35:28,253 [myid:] - INFO > [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.FastLeaderElection@996] > - Notification time out: 12800 ms2024-07-01 02:35:28,254 [myid:] - INFO > [WorkerReceiver[myid=1]:o.a.z.s.q.FastLeaderElection$Messenger$WorkerReceiver@391] > - Notification: my state:LOOKING; n.sid:1, n.state:LOOKING, n.leader:2, > n.round:0x2, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, > n.config version:0x02024-07-01 02:35:28,287 [myid:] - INFO > [WorkerReceiver[myid=1]:o.a.z.s.q.FastLeaderElection$Messenger$WorkerReceiver@391] > - Notification: my state:LOOKING; n.sid:3, n.state:LEADING, n.leader:3, > n.round:0x1, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, > n.config version:0x02024-07-01 02:35:28,288 [myid:] - INFO > [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.FastLeaderElection@1205] > - Oracle indicates not to follow2024-07-01 02:35:29,493 [myid:] - WARN > [NIOWorkerThread-20:o.a.z.s.NIOServerCnxn@397] - Close of session 0x0{code} > During this procedure, the client can not connect with any nodes of the > cluster. > Runtime logs are attached. > The root cause is the serializing the snapshot blocks the status modification > of the third node? > > Sometimes, this phenomenon will disappears. The configuration is the same as > above. The second node is stuck in sealStream due to fail-slow disk. > {code:java} > // node1 > 2024-07-01 02:19:21,942 [myid:] - INFO > [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.QuorumPeer@920] > - Peer state changed: looking2024-07-01 02:19:21,943 [myid:] - WARN > [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.QuorumPeer@1602] > - PeerState set to LOOKING2024-07-01 02:19:21,943 [myid:] - INFO > [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.QuorumPeer@1471] > - LOOKING2024-07-01 02:19:21,943 [myid:] - INFO > [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.FastLeaderElection@946] > - New election. My id = 1, proposed zxid=0x02024-07-01 02:19:21,980 [myid:] > - INFO > [WorkerReceiver[myid=1]:o.a.z.s.q.FastLeaderElection$Messenger$WorkerReceiver@391] > - Notification: my state:LOOKING; n.sid:1, n.state:LOOKING, n.leader:1, > n.round:0x2, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, > n.config version:0x02024-07-01 02:19:22,012 [myid:] - INFO > [WorkerReceiver[myid=1]:o.a.z.s.q.FastLeaderElection$Messenger$WorkerReceiver@391] > - Notification: my state:LOOKING; n.sid:3, n.state:FOLLOWING, n.leader:2, > n.round:0x1, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, > n.config version:0x02024-07-01 02:19:22,026 [myid:] - INFO > [WorkerReceiver[myid=1]:o.a.z.s.q.FastLeaderElection$Messenger$WorkerReceiver@391] > - Notification: my state:LOOKING; n.sid:2, n.state:LEADING, n.leader:2, > n.round:0x1, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, > n.config version:0x02024-07-01 02:19:22,026 [myid:] - INFO > [QuorumPeer[myid=1](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.QuorumPeer@906] > - Peer state changed: following {code} > {code:java} > // node3 > 2024-07-01 02:19:22,780 [myid:] - INFO > [QuorumPeer[myid=3](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.QuorumPeer@920] > - Peer state changed: looking2024-07-01 02:19:22,780 [myid:] - WARN > [QuorumPeer[myid=3](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.QuorumPeer@1602] > - PeerState set to LOOKING2024-07-01 02:19:22,780 [myid:] - INFO > [QuorumPeer[myid=3](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.QuorumPeer@1471] > - LOOKING2024-07-01 02:19:22,781 [myid:] - INFO > [QuorumPeer[myid=3](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.FastLeaderElection@946] > - New election. My id = 3, proposed zxid=0x02024-07-01 02:19:22,819 [myid:] > - INFO > [WorkerReceiver[myid=3]:o.a.z.s.q.FastLeaderElection$Messenger$WorkerReceiver@391] > - Notification: my state:LOOKING; n.sid:3, n.state:LOOKING, n.leader:3, > n.round:0x2, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, > n.config version:0x02024-07-01 02:19:22,838 [myid:] - INFO > [WorkerReceiver[myid=3]:o.a.z.s.q.FastLeaderElection$Messenger$WorkerReceiver@391] > - Notification: my state:LOOKING; n.sid:1, n.state:FOLLOWING, n.leader:2, > n.round:0x1, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, > n.config version:0x02024-07-01 02:19:22,849 [myid:] - INFO > [WorkerReceiver[myid=3]:o.a.z.s.q.FastLeaderElection$Messenger$WorkerReceiver@391] > - Notification: my state:LOOKING; n.sid:2, n.state:LEADING, n.leader:2, > n.round:0x1, n.peerEpoch:0x0, n.zxid:0x0, message format version:0x2, > n.config version:0x02024-07-01 02:19:22,850 [myid:] - INFO > [QuorumPeer[myid=3](plain=0.0.0.0:2181)(secure=disabled):o.a.z.s.q.QuorumPeer@906] > - Peer state changed: following {code} > But the leader election notification is strange. Node1 follows node2 since > node3 follows node2. However, Node3 follows node2 since node1 follows node2. > This condition cannot stand. > Moreover, this two scenarios are both stuck in sealStream. But there are > different results. Is there a data race? > Are there any comments to figure out this issues? I will very appreciate them. -- This message was sent by Atlassian Jira (v8.20.10#820010)