Hi Youngseok,

Don't rotate transaction logs yourself. Snapshots and transaction logs are ZooKeeper's on-disk data, not log4j output, so logrotate or Log4j rolling should not touch them. At the very minimum, you need to keep the most recent snapshot file and the most recent transaction log file. You can use snapCount to control the size of the transaction log files. However, be aware that a smaller snapCount means more frequent snapshotting, which affects ZooKeeper performance. Do test it before changing snapCount in production.
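
To make this concrete, here is a rough sketch of the relevant settings. The autopurge lines are the ones already in your zoo.cfg. The snapCount value of 50000 is only a hypothetical number to try in a test cluster, not a recommendation. The admin guide lists snapCount as the Java system property zookeeper.snapCount with a default of 100000, so I show it as a JVM flag in a comment.

```
# zoo.cfg fragment (sketch under the assumptions above)
dataDir=/home/zookeeper/data

# Let ZooKeeper itself purge old snapshots and transaction logs.
# snapRetainCount is the number of recent snapshots (and matching logs) to keep; 3 is the minimum.
autopurge.snapRetainCount=3
# purgeInterval is in hours; 0 disables the purge task.
autopurge.purgeInterval=1

# snapCount is documented as a Java system property (default 100000).
# Hypothetical test value, passed as a JVM flag rather than set here:
#   -Dzookeeper.snapCount=50000
```

With a setup like this, ZooKeeper's own purge task handles retention, and the only external rotation is for zookeeper.out.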
http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_advancedConfiguration

For zookeeper.out, it's really up to you to decide how much data to retain. Personally, I would keep at least a day's worth of INFO logs.

On Wed, Mar 26, 2014 at 3:36 PM, Jung Young Seok <[email protected]> wrote:
> I have a couple of questions regarding log and snapshot management.
>
> I use the auto purge feature, so I keep only 3 snapshots and 3 transaction logs.
> The ZooKeeper log (zookeeper.out) is rotated daily.
>
> What I would like to do is keep the log and snapshot files as small as
> possible.
>
> Would it be okay to manage snapshot files and transaction logs with
> logrotate.d or Log4j rolling so that they do not grow beyond 50MB?
>
> Thank you in advance.
> Youngseok
> On Mar 24, 2014, at 4:43 PM, "Rakesh R" <[email protected]> wrote:
>
>>
>> From the latest log shared by YoungSeok,
>>
>> [1] I could see the LearnerHandler fails to get a Leader.ACKEPOCH response
>> from the Followers and is failing with the following exception.
>>
>> 2014-03-19 17:29:19,312 [myid:3] - INFO [LearnerHandler-/10.0.33.1:58547:LearnerHandler@263] - Follower sid: 1 :info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@3c966db5
>> 2014-03-19 17:29:19,314 [myid:3] - INFO [LearnerHandler-/10.0.33.129:49810:LearnerHandler@263] - Follower sid: 2 :info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@466b56b
>> 2014-03-19 17:29:19,475 [myid:3] - ERROR [LearnerHandler-/10.0.33.1:58547:LearnerHandler@562] - Unexpected exception causing shutdown while sock still open
>> java.io.EOFException
>>     at java.io.DataInputStream.readInt(DataInputStream.java:392)
>>     at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>>     at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>>     at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
>>     at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:290)
>>
>> Hi Michi, it would be great if you could help me understand ZK-1697
>> better. As I understand it, that issue concerns Leader.ACK; am I correct?
>> If so, I am confused by the Leader.ACKEPOCH exception on the
>> LearnerHandler side [1]. From what I've seen in the code, Leader.ACKEPOCH
>> is sent to the Leader at the time of Learner#registerWithLeader().
>>
>> Thanks in advance.
>> Rakesh
>>
>> -----Original Message-----
>> From: Jung Young Seok [mailto:[email protected]]
>> Sent: 24 March 2014 10:20
>> To: [email protected]
>> Cc: [email protected]
>> Subject: Re: [Zookeeper] Zookeeper Cluster broken due to snapshot
>> corrupted error
>>
>> I'm not sure if my issue is related to
>> https://issues.apache.org/jira/browse/ZOOKEEPER-1697
>> but I think I should try upgrading ZooKeeper to version 3.4.6 (stable).
>>
>> I'm just hoping version 3.4.6 will prevent my issue from happening again.
>>
>> Thank you for your answer.
>> Have a great day.
>> >> Best Regards, >> Youngseok Jung >> >> >> 2014-03-24 13:05 GMT+09:00 Michi Mutsuzaki <[email protected]>: >> >> > I wonder if this is related to ZOOKEEPER-1697. >> > >> > https://issues.apache.org/jira/browse/ZOOKEEPER-1697 >> > >> > --Michi >> > >> > On Sun, Mar 23, 2014 at 6:15 PM, Jung Young Seok >> > <[email protected]> wrote: >> > > I've added zookeeper log (192.168.161.1). >> > > The time that the log was written look different but you might ignore >> it. >> > > Logs on 192.168.161.1 had been repeated with below pattern. >> > > >> > > Thank you for your asking. >> > > >> > > >> > ---------------------------------------------------------------------- >> > ---------------------------------------------------------------------- >> > ---------------------------------------------------------- >> > > 2014-03-19 17:28:06,105 [myid:3] - INFO >> > > [LearnerHandler-/10.0.33.129:49809:LearnerHandler@395] - Sending >> > > DIFF >> > > 2014-03-19 17:28:07,414 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41252 >> > > 2014-03-19 17:28:07,415 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:07,415 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41252 (no session >> > established for >> > > client) >> > > 2014-03-19 17:28:12,173 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41255 >> > > 2014-03-19 17:28:12,174 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > 
ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:12,174 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41255 (no session >> > established for >> > > client) >> > > 2014-03-19 17:28:14,558 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41258 >> > > 2014-03-19 17:28:14,559 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:14,559 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41258 (no session >> > established for >> > > client) >> > > 2014-03-19 17:28:18,585 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41261 >> > > 2014-03-19 17:28:18,586 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:18,586 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41261 (no session >> > established for >> > > client) >> > > 2014-03-19 17:28:20,067 [myid:3] - WARN >> > > [LearnerHandler-/10.0.33.1:58546:Leader@574] - Commiting zxid >> > 0xc500000000 >> > > from /10.0.161.1:2888 not first! 
>> > > 2014-03-19 17:28:20,067 [myid:3] - WARN >> > > [LearnerHandler-/10.0.33.1:58546:Leader@576] - First is 0x0 >> > > 2014-03-19 17:28:20,068 [myid:3] - INFO >> > > [LearnerHandler-/10.0.33.1:58546:Leader@598] - Have quorum of >> > supporters; >> > > starting up and setting last processed zxid: 0xc500000000 >> > > 2014-03-19 17:28:22,312 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@490] - Shutting down >> > > 2014-03-19 17:28:22,312 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@496] - Shutdown >> > > called >> > > java.lang.Exception: shutdown Leader! reason: Only 1 followers, need 1 >> > > at >> > > org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:496) >> > > at >> > org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:471) >> > > at >> > > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:75 >> > > 3) >> > > 2014-03-19 17:28:22,313 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@419] - >> > > shutting down >> > > 2014-03-19 17:28:22,320 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:SessionTrackerImpl@225] - >> > Shutting >> > > down >> > > 2014-03-19 17:28:22,320 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:PrepRequestProcessor@743] - >> > > Shutting down >> > > 2014-03-19 17:28:22,321 [myid:3] - INFO [ProcessThread(sid:3 >> > > cport:-1)::PrepRequestProcessor@143] - PrepRequestProcessor exited >> loop! >> > > 2014-03-19 17:28:22,321 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ProposalRequestProcessor@88 >> > > ] - Shutting down >> > > 2014-03-19 17:28:22,322 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:CommitProcessor@181] - >> > > Shutting down >> > > 2014-03-19 17:28:22,322 [myid:3] - INFO >> > > [CommitProcessor:3:CommitProcessor@150] - CommitProcessor exited loop! 
>> > > 2014-03-19 17:28:22,322 [myid:3] - INFO >> > > >> > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader$ToBeAppliedRequestProc >> > essor@655 >> > ] >> > > - Shutting down >> > > 2014-03-19 17:28:22,322 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@415] >> > > - shutdown of request processor complete >> > > 2014-03-19 17:28:22,323 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@175] - >> > > Shutting down >> > > 2014-03-19 17:28:22,323 [myid:3] - INFO >> > > [SyncThread:3:SyncRequestProcessor@155] - SyncRequestProcessor exited! >> > > 2014-03-19 17:28:22,325 [myid:3] - WARN >> > > [LearnerHandler-/10.0.33.1:58546:LearnerHandler@575] - ******* >> > > GOODBYE >> > > /10.0.33.1:58546 ******** >> > > 2014-03-19 17:28:22,326 [myid:3] - WARN >> > > [LearnerHandler-/10.0.33.129:49809:LearnerHandler@575] - ******* >> > > GOODBYE >> > > /10.0.33.129:49809 ******** >> > > 2014-03-19 17:28:22,327 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING >> > > 2014-03-19 17:28:22,328 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading >> > > snapshot >> > > /home/zookeeper/data/version-2/snapshot.c200000001 >> > > 2014-03-19 17:28:22,332 [myid:3] - INFO >> > > [Thread-140:Leader$LearnerCnxAcceptor@309] - exception while >> > > shutting >> > down >> > > acceptor: java.net.SocketException: Socket closed >> > > 2014-03-19 17:28:24,004 [myid:3] - INFO >> > > [SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited >> > loop! 
>> > > 2014-03-19 17:28:27,398 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41264 >> > > 2014-03-19 17:28:27,399 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:27,399 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41264 (no session >> > established for >> > > client) >> > > 2014-03-19 17:28:34,987 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41267 >> > > 2014-03-19 17:28:34,988 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:34,988 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41267 (no session >> > established for >> > > client) >> > > 2014-03-19 17:28:35,218 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@740] - >> > > New election. 
My id = 3, proposed zxid=0xc200000001 >> > > 2014-03-19 17:28:35,219 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:35,420 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:35,420 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - >> > > Notification time out: 400 >> > > 2014-03-19 17:28:35,821 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:35,822 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - >> > > Notification time out: 800 >> > > 2014-03-19 17:28:36,623 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:36,623 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - >> > > Notification time out: 1600 >> > > 2014-03-19 17:28:36,800 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING >> > > (n.state), >> > 1 >> > > (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:37,096 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING >> > > (n.state), 
>> > 2 >> > > (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:37,097 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING >> > > (n.state), >> > 2 >> > > (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:38,698 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:38,698 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - >> > > Notification time out: 3200 >> > > 2014-03-19 17:28:38,700 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING >> > > (n.state), >> > 1 >> > > (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:38,705 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING >> > > (n.state), >> > 2 >> > > (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:39,408 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41270 >> > > 2014-03-19 17:28:39,409 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:39,409 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41270 (no session >> > established for >> > > client) >> > > 2014-03-19 17:28:41,906 [myid:3] - INFO >> > > 
[WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:41,906 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - >> > > Notification time out: 6400 >> > > 2014-03-19 17:28:42,390 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41273 >> > > 2014-03-19 17:28:42,390 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:42,391 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41273 (no session >> > established for >> > > client) >> > > 2014-03-19 17:28:44,729 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41276 >> > > 2014-03-19 17:28:44,730 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:44,730 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41276 (no session >> > established for >> > > client) >> > > 2014-03-19 17:28:48,307 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - >> > > Notification time out: 12800 >> > > 2014-03-19 17:28:48,308 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 
(n.round), LOOKING >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:49,840 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 1 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING >> > > (n.state), 1 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:49,841 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING >> > > (n.state), 1 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) >> > > 2014-03-19 17:28:50,042 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer@750] - LEADING >> > > 2014-03-19 17:28:50,042 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@162] - >> > > Created server with tickTime 2000 minSessionTimeout 4000 >> > > maxSessionTimeout 40000 datadir /home/zookeeper/data/version-2 >> > > snapdir >> > > /home/zookeeper/data/version-2 >> > > 2014-03-19 17:28:50,042 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@345] - LEADING - >> > > LEADER ELECTION TOOK - 27714 >> > > 2014-03-19 17:28:50,045 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading >> > > snapshot >> > > /home/zookeeper/data/version-2/snapshot.c200000001 >> > > 2014-03-19 17:28:50,540 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41279 >> > > 2014-03-19 17:28:50,541 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:50,541 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41279 (no session >> > 
established for >> > > client) >> > > 2014-03-19 17:28:51,406 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 2 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING >> > > (n.state), 2 (n.sid), 0xc5 (n.peerEPoch), LEADING (my state) >> > > 2014-03-19 17:28:51,406 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING >> > > (n.state), 2 (n.sid), 0xc5 (n.peerEPoch), LEADING (my state) >> > > 2014-03-19 17:28:53,526 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41282 >> > > 2014-03-19 17:28:53,526 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:53,527 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41282 (no session >> > established for >> > > client) >> > > 2014-03-19 17:28:59,322 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41285 >> > > 2014-03-19 17:28:59,323 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:28:59,323 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41285 (no session >> > established for >> > > client) >> > > 2014-03-19 17:29:00,253 [myid:3] - INFO >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@240] - >> > Snapshotting: >> > > 
0xc200000001 to /home/zookeeper/data/version-2/snapshot.c200000001 >> > > 2014-03-19 17:29:04,860 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:29:04,860 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41288 (no session >> > established for >> > > client) >> > > 2014-03-19 17:29:11,031 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41291 >> > > 2014-03-19 17:29:11,032 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:29:11,032 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41291 (no session >> > established for >> > > client) >> > > 2014-03-19 17:29:16,490 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted socket connection from /10.0.160.243:41294 >> > > 2014-03-19 17:29:16,491 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:29:16,491 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41294 (no session >> > established for >> > > client) >> > > 2014-03-19 17:29:19,064 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197] >> > > - Accepted 
socket connection from /10.0.160.243:41297 >> > > 2014-03-19 17:29:19,065 [myid:3] - WARN >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - >> > Exception >> > > causing close of session 0x0 due to java.io.IOException: >> > > ZooKeeperServer >> > not >> > > running >> > > 2014-03-19 17:29:19,065 [myid:3] - INFO >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - >> > > Closed socket connection for client /10.0.160.243:41297 (no session >> > established for >> > > client) >> > > 2014-03-19 17:29:19,312 [myid:3] - INFO >> > > [LearnerHandler-/10.0.33.1:58547:LearnerHandler@263] - Follower sid: >> 1 : >> > > info : >> > org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@3c966db5 >> > > 2014-03-19 17:29:19,314 [myid:3] - INFO >> > > [LearnerHandler-/10.0.33.129:49810:LearnerHandler@263] - Follower sid: >> > 2 : >> > > info : >> > > org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@466b56b >> > > 2014-03-19 17:29:19,475 [myid:3] - ERROR >> > > [LearnerHandler-/10.0.33.1:58547:LearnerHandler@562] - Unexpected >> > exception >> > > causing shutdown while sock still open java.io.EOFException >> > > at java.io.DataInputStream.readInt(DataInputStream.java:392) >> > > at >> > > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) >> > > at >> > > >> > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPack >> > et.java:83) >> > > at >> > > >> > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java: >> > 108) >> > > at >> > > >> > org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.j >> > ava:290) >> > > 2014-03-19 17:29:19,476 [myid:3] - WARN >> > > [LearnerHandler-/10.0.33.1:58547:LearnerHandler@575] - ******* >> > > GOODBYE >> > > /10.0.33.1:58547 ******** >> > > 2014-03-19 17:29:19,476 [myid:3] - ERROR >> > > [LearnerHandler-/10.0.33.129:49810:LearnerHandler@562] - Unexpected >> > > exception causing shutdown while sock still open >> > > 
java.io.EOFException >> > > at java.io.DataInputStream.readInt(DataInputStream.java:392) >> > > at >> > > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) >> > > at >> > > >> > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPack >> > et.java:83) >> > > at >> > > >> > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java: >> > 108) >> > > at >> > > >> > org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.j >> > ava:290) >> > > 2014-03-19 17:29:19,477 [myid:3] - WARN >> > > [LearnerHandler-/10.0.33.129:49810:LearnerHandler@575] - ******* >> > > GOODBYE >> > > /10.0.33.129:49810 ******** >> > > 2014-03-19 17:29:21,757 [myid:3] - INFO >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 1 >> > > (n.leader), 0xc200000001 (n.zxid), 0x128 (n.round), LOOKING >> > > (n.state), 1 (n.sid), 0xc5 (n.peerEPoch), LEADING (my state) >> > > >> > > >> > > >> > > 2014-03-23 12:05 GMT+09:00 Michi Mutsuzaki <[email protected]>: >> > > >> > >> Hi Youngseok, >> > >> >> > >> Could you post the log file from 192.168.161.1? The log file you >> > >> posted indicates that 192.168.33.1 is not able to connect to >> > >> 192.168.161.1. >> > >> >> > >> Thanks! >> > >> --Michi >> > >> >> > >> >> > >> On Fri, Mar 21, 2014 at 12:14 AM, Jung Young Seok >> > >> <[email protected]> wrote: >> > >> > Dear Zookeeper usergroup members, >> > >> > >> > >> > I have some questions. >> > >> > >> > >> > We're currently use Zookeeper 3.4.5 with clustering 3 nodes. >> > >> > We got zookeeper service stopped all of sudden so client wasn't >> > >> > able >> > to >> > >> > connect to zookeeper server. >> > >> > In that situation, zookeepers couldn't elect leader each other. >> > >> > >> > >> > Then I restarted zookeeper service (all of them) but could't >> > >> > elect leader and be follower. >> > >> > So I rebooted linux but same happened. 
(I lost zookeeper log here >> > >> > t.t) When I removed snapshot files in data directory, the >> > >> > zookeeper worked okay. >> > >> > I have uploaded my zookeeper snapshot here >> > >> > - >> > >> > >> > https://s3-ap-northeast-1.amazonaws.com/zookeeper-logs/data_org_b1.tar >> > >> > >> > >> > If I push the snapshot into data directory, zookeeper clustering >> > >> > fail reappears again. >> > >> > >> > >> > My question is >> > >> > 1. why the snapshot was corrupted all of sudden? >> > >> > 2. Is there any way I can avoid this snapshot corruption issue? >> > >> > >> > >> > I've attached zoo.cfg and some of error log. >> > >> > >> > >> > I'd be happy if I get any opinion. >> > >> > Thank You. >> > >> > >> > >> > Best Regards >> > >> > Youngseok Jung >> > >> > >> > >> > >> > >> > #zoo.cfg (pretty much default setting) >> > >> > tickTime=2000 >> > >> > initLimit=10 >> > >> > syncLimit=5 >> > >> > dataDir=/home/zookeeper/data >> > >> > clientPort=2181 >> > >> > >> > >> > server.1=192.168.33.1:2888:3888 >> > >> > server.2=192.168.33.129:2888:3888 >> > >> > server.3=192.168.161.1:2888:3888 >> > >> > autopurge.snapRetainCount=3 >> > >> > autopurge.purgeInterval=1 >> > >> > >> > >> > >> > >> > #Some of error log >> > >> > 2014-03-19 17:56:24,737 [myid:1] - INFO >> > >> > [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 2 >> > >> > (n.leader), 0xc600000001 (n.zxid), 0x144 (n.round), LEADING >> > (n.state), 2 >> > >> > (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state) >> > >> > 2014-03-19 17:56:24,737 [myid:1] - WARN >> > >> > [WorkerSender[myid=1]:QuorumCnxManager@368] - Cannot open channel >> > to 3 >> > >> > at >> > >> > election address /10.0.161.1:3888 >> > >> > java.net.ConnectException: Connection refused >> > >> > at java.net.PlainSocketImpl.socketConnect(Native Method) >> > >> > at >> > >> > >> > >> > >> > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.jav >> > a:339) >> > >> > at >> > >> > >> > >> > >> > 
java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketI >> > mpl.java:200) >> > >> > at >> > >> > >> > >> > >> > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java: >> > 182) >> > >> > at >> java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) >> > >> > at java.net.Socket.connect(Socket.java:579) >> > >> > at >> > >> > >> > >> > >> > org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumC >> > nxManager.java:354) >> > >> > at >> > >> > >> > >> > >> > org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxMa >> > nager.java:327) >> > >> > at >> > >> > >> > >> > >> > org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$Worker >> > Sender.process(FastLeaderElection.java:393) >> > >> > at >> > >> > >> > >> > >> > org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$Worker >> > Sender.run(FastLeaderElection.java:365) >> > >> > at java.lang.Thread.run(Thread.java:724) >> > >> > 2014-03-19 17:56:25,537 [myid:1] - INFO >> > >> > [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] >> > >> > - Notification time out: 1600 >> > >> > 2014-03-19 17:56:25,538 [myid:1] - INFO >> > >> > [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 1 >> > >> > (n.leader), 0xc200000001 (n.zxid), 0x145 (n.round), LOOKING >> > (n.state), 1 >> > >> > (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state) >> > >> > 2014-03-19 17:56:25,540 [myid:1] - INFO >> > >> > [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 2 >> > >> > (n.leader), 0xc600000001 (n.zxid), 0x144 (n.round), LEADING >> > (n.state), 2 >> > >> > (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state) >> > >> > 2014-03-19 17:56:25,540 [myid:1] - WARN >> > >> > [WorkerSender[myid=1]:QuorumCnxManager@368] - Cannot open channel >> > to 3 >> > >> > at >> > >> > election address /10.0.161.1:3888 >> > >> > java.net.ConnectException: Connection refused >> > >> > at java.net.PlainSocketImpl.socketConnect(Native Method) >> > >> > 
at >> > >> > >> > >> > >> > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.jav >> > a:339) >> > >> > at >> > >> > >> > >> > >> > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketI >> > mpl.java:200) >> > >> > at >> > >> > >> > >> > >> > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java: >> > 182) >> > >> > at >> java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) >> > >> > at java.net.Socket.connect(Socket.java:579) >> > >> > at >> > >> > >> > >> > >> > org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumC >> > nxManager.java:354) >> > >> > at >> > >> > >> > >> > >> > org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxMa >> > nager.java:327) >> > >> > at >> > >> > >> > >> > >> > org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$Worker >> > Sender.process(FastLeaderElection.java:393) >> > >> > at >> > >> > >> > >> > >> > org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$Worker >> > Sender.run(FastLeaderElection.java:365) >> > >> > at java.lang.Thread.run(Thread.java:724) >> > > >> > > >> > >>
