I have a 5-node Zookeeper 3.4.6 cluster across 3 data centres (2 zookeepers in 
each “main” DC, and a 5th in a 3rd DC for quorum). I see that the two nodes in 
one DC have regular “issues” where they get kicked out of the cluster and the 
ZooKeeperServer process stops for a few minutes until the node rejoins. I’d 
like to know a couple of things, if someone could please point me in the 
direction of the relevant docs I’d greatly appreciate it.

1.) Is it expected behaviour that when a node is kicked from the cluster, it 
will not be allowed to re-join for a period ? From the logs below I can see 
that re-establishing a valid cluster took around 15 minutes.
2.) It appears that the leader closes connections to the affected followers 
after a “transaction timeout” occurs. Where would I find out what this timeout 
is ? Is this the same thing as a session timout (e.g. The default of 20 * 
tickTime) ?
3.) Where can I find the definition of the different fields in the election log 
messages (I.e. What are “n.round”, “n.zxid”, “n.state” and so on) ? I can make 
an educated guess, which seems to point towards a possible network partition 
event (see below). This should help me start to narrow down the issue. For 
reference, below are some extracts from the relevant logs:

On the leader, I see a message similar to the following :

2015-06-24 04:18:40,152 [myid:5] - WARN  
[QuorumPeer[myid=5]/0:0:0:0:0:0:0:0:2181:LearnerHandler@687] - Closing 
connection to peer due to transaction timeout.
2015-06-24 04:18:40,152 [myid:5] - WARN  
[LearnerHandler-/10.93.167.74:36874:LearnerHandler@646] - ******* GOODBYE 
/10.93.167.74:36874 ********
2015-06-24 04:18:40,153 [myid:5] - WARN  
[LearnerHandler-/10.93.167.74:36874:LearnerHandler@658] - Ignoring unexpected 
exception
java.lang.InterruptedException
        at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireInterruptibly(AbstractQueuedSynchronizer.java:1219)
        at 
java.util.concurrent.locks.ReentrantLock.lockInterruptibly(ReentrantLock.java:340)
        at 
java.util.concurrent.LinkedBlockingQueue.put(LinkedBlockingQueue.java:338)
        at 
org.apache.zookeeper.server.quorum.LearnerHandler.shutdown(LearnerHandler.java:656)
        at 
org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:649)

And on the affected follower :

2015-06-24 04:18:40,160 [myid:3] - WARN  
[QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@89] - Exception when 
following the leader
java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at 
org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
        at 
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
        at 
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103)
        at 
org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:153)
        at 
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:85)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)
2015-06-24 04:18:40,160 [myid:3] - INFO  
[QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@166] - shutdown called
java.lang.Exception: shutdown Follower
        at 
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:166)
        at 
org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:790)

Following this, the follower has a lot of the following :

2015-06-24 04:26:24,615 [myid:3] - INFO  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] - Accepted 
socket connection from /0:0:0:0:0:0:0:1:42079
2015-06-24 04:26:24,616 [myid:3] - WARN  
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@362] - Exception 
causing close of session 0x0 due to java.io.IOException: ZooKeeperServer not 
running


And a few election messages. These are interesting as (from a guess of what the 
fields are in these messages), I see what looks like a network partition issue :

2015-06-24 04:26:27,689 [myid:3] - INFO  
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message 
format version), 3 (n.leader), 0x10009b227 (n.zxid), 0x2 (n.round), LOOKING 
(n.state), 3 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:26:27,691 [myid:3] - INFO  
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message 
format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING 
(n.state), 4 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)


Assuming that “n.leader” means what this zookeeper thinks the leader is, it 
looks like there are two leaders – 3 (the affected follower) and 5 (the actual 
leader that the rest of the cluster recognises) . And a little while later on 
the follower, I see a successful election. It looks like there’s also some 
shaking out of quorum here – almost as if it’s trying to recover from a near 
split-brain scenario (of course, it’s possible my interpretation of these logs 
is way off…):


2015-06-24 04:34:27,707 [myid:3] - INFO  
[QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@849] - Notification 
time out: 60000
2015-06-24 04:34:27,712 [myid:3] - INFO  
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message 
format version), 3 (n.leader), 0x10009b227 (n.zxid), 0x2 (n.round), LOOKING 
(n.state), 3 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,714 [myid:3] - INFO  
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message 
format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING 
(n.state), 4 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,719 [myid:3] - INFO  
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message 
format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING 
(n.state), 1 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,719 [myid:3] - INFO  
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message 
format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING 
(n.state), 1 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,721 [myid:3] - INFO  
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message 
format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING 
(n.state), 2 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,722 [myid:3] - INFO  
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message 
format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), FOLLOWING 
(n.state), 2 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,728 [myid:3] - INFO  
[WorkerSender[myid=3]:QuorumCnxManager@193] - Have smaller server identifier, 
so dropping the connection: (5, 3)
2015-06-24 04:34:27,762 [myid:3] - INFO  
[nftserentzk01.slu.skycdc.com/10.93.167.74:3888:QuorumCnxManager$Listener@511] 
- Received connection request /10.105.6.117:38358
2015-06-24 04:34:27,764 [myid:3] - INFO  
[WorkerReceiver[myid=3]:FastLeaderElection@597] - Notification: 1 (message 
format version), 5 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LEADING (n.state), 
5 (n.sid), 0x1 (n.peerEpoch) LOOKING (my state)
2015-06-24 04:34:27,765 [myid:3] - INFO  
[QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer@784] - FOLLOWING
2015-06-24 04:34:27,765 [myid:3] - INFO  
[QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@162] - Created server 
with tickTime 2000 minSessionTimeout 4000 maxSessionTimeout 40000 datadir 
/zkdata/version-2 snapdir /zkdata/version-2
2015-06-24 04:34:27,765 [myid:3] - INFO  
[QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Follower@63] - FOLLOWING - LEADER 
ELECTION TOOK - 943889


Are my assumptions correct here, or am I barking up the wrong tree ?

Many thanks in advance,

Mark Dastmalchi-Round
DevOps Engineer,  Sky Search
BSkyB | Great West House | GW1 - 4th floor

Information in this email including any attachments may be privileged, 
confidential and is intended exclusively for the addressee. The views expressed 
may not be official policy, but the personal views of the originator. If you 
have received it in error, please notify the sender by return e-mail and delete 
it from your system. You should not reproduce, distribute, store, retransmit, 
use or disclose its contents to anyone. Please note we reserve the right to 
monitor all e-mail communication through our internal and external networks. 
SKY and the SKY marks are trademarks of Sky plc and Sky International AG and 
are used under licence. Sky UK Limited (Registration No. 2906991), Sky-In-Home 
Service Limited (Registration No. 2067075) and Sky Subscribers Services Limited 
(Registration No. 2340150) are direct or indirect subsidiaries of Sky plc 
(Registration No. 2247735). All of the companies mentioned in this paragraph 
are incorporated in England and Wales and share the same registered office at 
Grant Way, Isleworth, Middlesex TW7 5QD.

Reply via email to