Yeah: " fsync-ing the write ahead log in SyncThread:2 took 13600ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide"
looks suspicious. Either too much contention for IO or a big GC pause would cause a timeout writing (or reading) to/from the leader which would then kick the follower out. -rgs On 12 June 2015 at 07:46, Flavio Junqueira <[email protected]> wrote: > Do you happen to know what happened on the leader side? The fsync warning > is also a bit odd... > -Flavio > > > On Friday, June 12, 2015 3:30 PM, "[email protected]" < > [email protected]> wrote: > > > > Hi! > I have a question: zookeeper is inexplicable shutdown. Do you know why? > My log information is as follows: > > 2015-06-12 13:03:54,536 [myid:2] - INFO > [NIOWorkerThread-8:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.3:27426 which had sessionid 0x14dada66bde0004 > 2015-06-12 13:03:54,998 [myid:2] - WARN > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@92] - Exception when > following the leader > java.io.EOFException > at java.io.DataInputStream.readInt(DataInputStream.java:392) > at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) > at > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83) > at > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:103) > at org.apache.zookeeper.server.quorum.Learner.readPacket(Learner.java:160) > at > org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:88) > at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:991) > 2015-06-12 13:03:55,000 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:MBeanRegistry@119] - Unregister > MBean > [org.apache.ZooKeeperService:name0=ReplicatedServer_id2,name1=replica.2,name2=Follower] > 2015-06-12 13:03:55,001 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:Follower@197] - shutdown called > java.lang.Exception: shutdown Follower > at org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:197) > at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:995) > 2015-06-12 13:03:55,001 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:LearnerZooKeeperServer@161] - > Shutting down > 2015-06-12 13:03:55,001 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@443] - shutting > down > 2015-06-12 13:03:55,002 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FollowerRequestProcessor@135] - > Shutting down > 2015-06-12 13:03:55,003 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:CommitProcessor@357] - Shutting > down > 2015-06-12 13:03:55,003 [myid:2] - INFO > [FollowerRequestProcessor:2:FollowerRequestProcessor@106] - > FollowerRequestProcessor exited loop! > 2015-06-12 13:03:55,003 [myid:2] - INFO > [CommitProcessor:2:CommitProcessor@192] - CommitProcessor exited loop! > 2015-06-12 13:03:55,004 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@476] - > shutdown of request processor complete > 2015-06-12 13:03:55,005 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:MBeanRegistry@119] - Unregister > MBean > [org.apache.ZooKeeperService:name0=ReplicatedServer_id2,name1=replica.2,name2=Follower,name3=InMemoryDataTree] > 2015-06-12 13:03:55,005 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@191] - > Shutting down > 2015-06-12 13:03:57,841 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.2:62539 > 2015-06-12 13:03:57,843 [myid:2] - WARN > [NIOWorkerThread-1:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:03:57,843 [myid:2] - INFO > [NIOWorkerThread-1:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.2:62539 (no session established for client) > 2015-06-12 13:04:01,017 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.2:62542 > 2015-06-12 13:04:01,022 [myid:2] - WARN > [NIOWorkerThread-2:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:01,023 [myid:2] - INFO > [NIOWorkerThread-2:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.2:62542 (no session established for client) > 2015-06-12 13:04:01,121 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.3:27432 > 2015-06-12 13:04:01,122 [myid:2] - WARN > [NIOWorkerThread-3:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:01,122 [myid:2] - INFO > [NIOWorkerThread-3:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.3:27432 (no session established for client) > 2015-06-12 13:04:01,506 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.1:63948 > 2015-06-12 13:04:01,507 [myid:2] - WARN > [NIOWorkerThread-4:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:01,507 [myid:2] - INFO > [NIOWorkerThread-4:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.1:63948 (no session established for client) > 2015-06-12 13:04:04,137 [myid:2] - WARN [SyncThread:2:FileTxnLog@335] - > fsync-ing the write ahead log in SyncThread:2 took 13600ms which will > adversely effect operation latency. See the ZooKeeper troubleshooting guide > 2015-06-12 13:04:04,138 [myid:2] - WARN > [SyncThread:2:SendAckRequestProcessor@64] - Closing connection to leader, > exception during packet send > java.net.SocketException: Socket closed > at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:116) > at java.net.SocketOutputStream.write(SocketOutputStream.java:153) > at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) > at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) > at org.apache.zookeeper.server.quorum.Learner.writePacket(Learner.java:146) > at > org.apache.zookeeper.server.quorum.SendAckRequestProcessor.flush(SendAckRequestProcessor.java:62) > at > org.apache.zookeeper.server.SyncRequestProcessor.flush(SyncRequestProcessor.java:186) > at > org.apache.zookeeper.server.SyncRequestProcessor.run(SyncRequestProcessor.java:113) > 2015-06-12 13:04:04,139 [myid:2] - INFO > [SyncThread:2:SyncRequestProcessor@169] - SyncRequestProcessor exited! > 2015-06-12 13:04:04,139 [myid:2] - WARN > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@1038] - PeerState set > to LOOKING > 2015-06-12 13:04:04,139 [myid:2] - INFO > [QuorumPeer[myid=2]/0:0:0:0:0:0:0:0:2181:QuorumPeer@913] - LOOKING > 2015-06-12 13:04:04,664 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.3:27435 > 2015-06-12 13:04:04,665 [myid:2] - WARN > [NIOWorkerThread-6:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:04,666 [myid:2] - INFO > [NIOWorkerThread-6:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.3:27435 (no session established for client) > 2015-06-12 13:04:04,707 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.2:62545 > 2015-06-12 13:04:04,708 [myid:2] - WARN > [NIOWorkerThread-5:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:04,708 [myid:2] - INFO > [NIOWorkerThread-5:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.2:62545 (no session established for client) > 2015-06-12 13:04:04,818 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.1:63952 > 2015-06-12 13:04:04,818 [myid:2] - WARN > [NIOWorkerThread-7:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:04,819 [myid:2] - INFO > [NIOWorkerThread-7:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.1:63952 (no session established for client) > 2015-06-12 13:04:05,024 [myid:2] - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@682] - Notification: 2 > (message format version), 3 (n.leader), 0x8000002c9 (n.zxid), 0x2 > (n.round), LOOKING (n.state), 3 (n.sid), 0x8 (n.peerEPoch), LOOKING (my > state)100000000 (n.config version) > 2015-06-12 13:04:05,288 [myid:2] - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@682] - Notification: 2 > (message format version), 3 (n.leader), 0x8000002c9 (n.zxid), 0x2 > (n.round), LOOKING (n.state), 3 (n.sid), 0x8 (n.peerEPoch), LOOKING (my > state)100000000 (n.config version) > 2015-06-12 13:04:05,690 [myid:2] - INFO > [WorkerReceiver[myid=2]:FastLeaderElection@682] - Notification: 2 > (message format version), 3 (n.leader), 0x8000002c9 (n.zxid), 0x2 > (n.round), LOOKING (n.state), 3 (n.sid), 0x8 (n.peerEPoch), LOOKING (my > state)100000000 (n.config version) > 2015-06-12 13:04:06,459 [myid:2] - INFO [NIOServerCxnFactory.AcceptThread: > 0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory$AcceptThread@296] - Accepted > socket connection from /172.16.20.2:62548 > 2015-06-12 13:04:06,460 [myid:2] - WARN > [NIOWorkerThread-8:NIOServerCnxn@372] - Exception causing close of > session 0x0: ZooKeeperServer not running > 2015-06-12 13:04:06,460 [myid:2] - INFO > [NIOWorkerThread-8:NIOServerCnxn@1006] - Closed socket connection for > client /172.16.20.2:62548 (no session established for client) > > Some configuration: > clientPort=2181 > dataDir=/opt/zookeeper > syncLimit=2 > tickTime=2000 > initLimit=5 > server.1=kafka1:2888:3888:participant > server.2=kafka2:2888:3888:participant > server.3=kafka3:2888:3888:participant > > thinks! > > > > > [email protected] > > > > >
