Thank you, Michi.

Does snapshotting mean dumping the in-memory snapshot data to disk?
A small snapCount would mean each snapshot takes little time, which means a shorter stop-the-world pause. Do I understand this correctly?

Thank you in advance.

On Mar 27, 2014 at 8:01 AM, "Michi Mutsuzaki" <[email protected]> wrote:

> Hi Youngseok,
>
> Don't rotate transaction logs yourself. At the very minimum, you need
> to keep the most recent snapshot file and the most recent transaction
> log file. You can use the snapCount to control the size of the
> transaction log files. However, be aware that a smaller snapCount means
> more frequent snapshotting, which affects ZooKeeper performance. Do
> test it before changing snapCount in production.
>
> http://zookeeper.apache.org/doc/r3.4.6/zookeeperAdmin.html#sc_advancedConfiguration
>
> For zookeeper.out, it's really up to you to decide how much data to
> retain. Personally I would keep at least a day's worth of INFO log.
>
> On Wed, Mar 26, 2014 at 3:36 PM, Jung Young Seok
> <[email protected]> wrote:
> > I have a couple of questions regarding log and snapshot management.
> >
> > I use the auto purge feature, so I keep only 3 snapshots and 3 transaction logs.
> > The ZooKeeper log (zookeeper.out) is rotated daily.
> >
> > What I would like to do is keep the log and snapshot files as small as
> > possible.
> >
> > Would it be okay to manage snapshot files and transaction logs with
> > logrotate.d or Log4j rolling so that they do not grow beyond 50MB?
> >
> > Thank you in advance.
> > Youngseok
> > On Mar 24, 2014 at 4:43 PM, "Rakesh R" <[email protected]> wrote:
> >
> >>
> >> From the latest log shared by YoungSeok,
> >>
> >> [1] I could see the LearnerHandler fails to get a Leader.ACKEPOCH response
> >> from the Followers and is failing with the following exception.
> >>
> >> 2014-03-19 17:29:19,312 [myid:3] - INFO [LearnerHandler-/10.0.33.1:58547:LearnerHandler@263] - Follower sid: 1 :info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@3c966db5
> >> 2014-03-19 17:29:19,314 [myid:3] - INFO [LearnerHandler-/10.0.33.129:49810:LearnerHandler@263] - Follower sid: 2 :info : org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@466b56b
> >> 2014-03-19 17:29:19,475 [myid:3] - ERROR [LearnerHandler-/10.0.33.1:58547:LearnerHandler@562] - Unexpected exception causing shutdown while sock still open
> >> java.io.EOFException
> >>     at java.io.DataInputStream.readInt(DataInputStream.java:392)
> >>     at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
> >>     at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
> >>     at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)
> >>     at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:290)
> >>
> >> Hi Michi, it would be great if you could help me understand more about ZK-1697. As I
> >> understand it, this is talking about Leader.ACK; am I correct? If so, I
> >> got confused by seeing the Leader.ACKEPOCH exception on the LearnerHandler side
> >> [1]. From what I've seen in the code, Leader.ACKEPOCH would be sent to the Leader
> >> at the time of Learner#registerWithLeader().
> >>
> >> Thanks in advance.
> >> Rakesh
> >>
> >> -----Original Message-----
> >> From: Jung Young Seok [mailto:[email protected]]
> >> Sent: 24 March 2014 10:20
> >> To: [email protected]
> >> Cc: [email protected]
> >> Subject: Re: [Zookeeper] Zookeeper Cluster broken due to snapshot
> >> corrupted error
> >>
> >> I'm not sure if my issue is related to
> >> https://issues.apache.org/jira/browse/ZOOKEEPER-1697
> >> but I think I should try upgrading ZooKeeper to version 3.4.6 (stable).
> >>
> >> I'm just hoping version 3.4.6 will prevent my issue from happening again.
> >> > >> Thank you for your answer. > >> Have a great day. > >> > >> Best Regards, > >> Youngseok Jung > >> > >> > >> 2014-03-24 13:05 GMT+09:00 Michi Mutsuzaki <[email protected]>: > >> > >> > I wonder if this is related to ZOOKEEPER-1697. > >> > > >> > https://issues.apache.org/jira/browse/ZOOKEEPER-1697 > >> > > >> > --Michi > >> > > >> > On Sun, Mar 23, 2014 at 6:15 PM, Jung Young Seok > >> > <[email protected]> wrote: > >> > > I've added zookeeper log (192.168.161.1). > >> > > The time that the log was written look different but you might > ignore > >> it. > >> > > Logs on 192.168.161.1 had been repeated with below pattern. > >> > > > >> > > Thank you for your asking. > >> > > > >> > > > >> > ---------------------------------------------------------------------- > >> > ---------------------------------------------------------------------- > >> > ---------------------------------------------------------- > >> > > 2014-03-19 17:28:06,105 [myid:3] - INFO > >> > > [LearnerHandler-/10.0.33.129:49809:LearnerHandler@395] - Sending > >> > > DIFF > >> > > 2014-03-19 17:28:07,414 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41252 > >> > > 2014-03-19 17:28:07,415 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:07,415 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41252 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:12,173 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41255 > >> > > 2014-03-19 17:28:12,174 [myid:3] - WARN > >> > 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:12,174 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41255 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:14,558 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41258 > >> > > 2014-03-19 17:28:14,559 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:14,559 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41258 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:18,585 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41261 > >> > > 2014-03-19 17:28:18,586 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:18,586 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41261 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:20,067 [myid:3] - WARN > >> > > [LearnerHandler-/10.0.33.1:58546:Leader@574] - Commiting zxid > >> > 0xc500000000 > >> > > from /10.0.161.1:2888 not 
first! > >> > > 2014-03-19 17:28:20,067 [myid:3] - WARN > >> > > [LearnerHandler-/10.0.33.1:58546:Leader@576] - First is 0x0 > >> > > 2014-03-19 17:28:20,068 [myid:3] - INFO > >> > > [LearnerHandler-/10.0.33.1:58546:Leader@598] - Have quorum of > >> > supporters; > >> > > starting up and setting last processed zxid: 0xc500000000 > >> > > 2014-03-19 17:28:22,312 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@490] - Shutting > down > >> > > 2014-03-19 17:28:22,312 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@496] - Shutdown > >> > > called > >> > > java.lang.Exception: shutdown Leader! reason: Only 1 followers, > need 1 > >> > > at > >> > > org.apache.zookeeper.server.quorum.Leader.shutdown(Leader.java:496) > >> > > at > >> > org.apache.zookeeper.server.quorum.Leader.lead(Leader.java:471) > >> > > at > >> > > org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:75 > >> > > 3) > >> > > 2014-03-19 17:28:22,313 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@419] - > >> > > shutting down > >> > > 2014-03-19 17:28:22,320 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:SessionTrackerImpl@225] - > >> > Shutting > >> > > down > >> > > 2014-03-19 17:28:22,320 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:PrepRequestProcessor@743] > - > >> > > Shutting down > >> > > 2014-03-19 17:28:22,321 [myid:3] - INFO [ProcessThread(sid:3 > >> > > cport:-1)::PrepRequestProcessor@143] - PrepRequestProcessor exited > >> loop! > >> > > 2014-03-19 17:28:22,321 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ProposalRequestProcessor@88 > >> > > ] - Shutting down > >> > > 2014-03-19 17:28:22,322 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:CommitProcessor@181] - > >> > > Shutting down > >> > > 2014-03-19 17:28:22,322 [myid:3] - INFO > >> > > [CommitProcessor:3:CommitProcessor@150] - CommitProcessor exited > loop! 
> >> > > 2014-03-19 17:28:22,322 [myid:3] - INFO > >> > > > >> > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader$ToBeAppliedRequestProc > >> > essor@655 > >> > ] > >> > > - Shutting down > >> > > 2014-03-19 17:28:22,322 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FinalRequestProcessor@415] > >> > > - shutdown of request processor complete > >> > > 2014-03-19 17:28:22,323 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:SyncRequestProcessor@175] > - > >> > > Shutting down > >> > > 2014-03-19 17:28:22,323 [myid:3] - INFO > >> > > [SyncThread:3:SyncRequestProcessor@155] - SyncRequestProcessor > exited! > >> > > 2014-03-19 17:28:22,325 [myid:3] - WARN > >> > > [LearnerHandler-/10.0.33.1:58546:LearnerHandler@575] - ******* > >> > > GOODBYE > >> > > /10.0.33.1:58546 ******** > >> > > 2014-03-19 17:28:22,326 [myid:3] - WARN > >> > > [LearnerHandler-/10.0.33.129:49809:LearnerHandler@575] - ******* > >> > > GOODBYE > >> > > /10.0.33.129:49809 ******** > >> > > 2014-03-19 17:28:22,327 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer@670] - LOOKING > >> > > 2014-03-19 17:28:22,328 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading > >> > > snapshot > >> > > /home/zookeeper/data/version-2/snapshot.c200000001 > >> > > 2014-03-19 17:28:22,332 [myid:3] - INFO > >> > > [Thread-140:Leader$LearnerCnxAcceptor@309] - exception while > >> > > shutting > >> > down > >> > > acceptor: java.net.SocketException: Socket closed > >> > > 2014-03-19 17:28:24,004 [myid:3] - INFO > >> > > [SessionTracker:SessionTrackerImpl@162] - SessionTrackerImpl exited > >> > loop! 
> >> > > 2014-03-19 17:28:27,398 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41264 > >> > > 2014-03-19 17:28:27,399 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:27,399 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41264 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:34,987 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41267 > >> > > 2014-03-19 17:28:34,988 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:34,988 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41267 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:35,218 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@740] - > >> > > New election. 
My id = 3, proposed zxid=0xc200000001 > >> > > 2014-03-19 17:28:35,219 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:35,420 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:35,420 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - > >> > > Notification time out: 400 > >> > > 2014-03-19 17:28:35,821 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:35,822 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - > >> > > Notification time out: 800 > >> > > 2014-03-19 17:28:36,623 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:36,623 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - > >> > > Notification time out: 1600 > >> > > 2014-03-19 17:28:36,800 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING > >> > > (n.state), > >> > 1 > >> > > (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:37,096 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 
0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING > >> > > (n.state), > >> > 2 > >> > > (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:37,097 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING > >> > > (n.state), > >> > 2 > >> > > (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:38,698 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:38,698 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - > >> > > Notification time out: 3200 > >> > > 2014-03-19 17:28:38,700 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING > >> > > (n.state), > >> > 1 > >> > > (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:38,705 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x126 (n.round), FOLLOWING > >> > > (n.state), > >> > 2 > >> > > (n.sid), 0xc4 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:39,408 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41270 > >> > > 2014-03-19 17:28:39,409 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:39,409 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed 
socket connection for client /10.0.160.243:41270 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:41,906 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:41,906 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - > >> > > Notification time out: 6400 > >> > > 2014-03-19 17:28:42,390 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41273 > >> > > 2014-03-19 17:28:42,390 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:42,391 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41273 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:44,729 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41276 > >> > > 2014-03-19 17:28:44,730 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:44,730 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41276 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:48,307 [myid:3] - INFO > >> > > 
[QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - > >> > > Notification time out: 12800 > >> > > 2014-03-19 17:28:48,308 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 3 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:49,840 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 1 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 1 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:49,841 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 1 (n.sid), 0xc5 (n.peerEPoch), LOOKING (my state) > >> > > 2014-03-19 17:28:50,042 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:QuorumPeer@750] - LEADING > >> > > 2014-03-19 17:28:50,042 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:ZooKeeperServer@162] - > >> > > Created server with tickTime 2000 minSessionTimeout 4000 > >> > > maxSessionTimeout 40000 datadir /home/zookeeper/data/version-2 > >> > > snapdir > >> > > /home/zookeeper/data/version-2 > >> > > 2014-03-19 17:28:50,042 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:Leader@345] - LEADING - > >> > > LEADER ELECTION TOOK - 27714 > >> > > 2014-03-19 17:28:50,045 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileSnap@83] - Reading > >> > > snapshot > >> > > /home/zookeeper/data/version-2/snapshot.c200000001 > >> > > 2014-03-19 17:28:50,540 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41279 > >> > > 2014-03-19 17:28:50,541 [myid:3] - WARN > >> > > 
[NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:50,541 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41279 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:51,406 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 2 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 2 (n.sid), 0xc5 (n.peerEPoch), LEADING (my state) > >> > > 2014-03-19 17:28:51,406 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 3 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x127 (n.round), LOOKING > >> > > (n.state), 2 (n.sid), 0xc5 (n.peerEPoch), LEADING (my state) > >> > > 2014-03-19 17:28:53,526 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41282 > >> > > 2014-03-19 17:28:53,526 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:53,527 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41282 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:28:59,322 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41285 > >> > > 2014-03-19 17:28:59,323 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > 
causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:28:59,323 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41285 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:29:00,253 [myid:3] - INFO > >> > > [QuorumPeer[myid=3]/0:0:0:0:0:0:0:0:2181:FileTxnSnapLog@240] - > >> > Snapshotting: > >> > > 0xc200000001 to /home/zookeeper/data/version-2/snapshot.c200000001 > >> > > 2014-03-19 17:29:04,860 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:29:04,860 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41288 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:29:11,031 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41291 > >> > > 2014-03-19 17:29:11,032 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:29:11,032 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41291 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:29:16,490 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41294 > >> > > 2014-03-19 17:29:16,491 [myid:3] - WARN 
> >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:29:16,491 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41294 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:29:19,064 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@197 > ] > >> > > - Accepted socket connection from /10.0.160.243:41297 > >> > > 2014-03-19 17:29:19,065 [myid:3] - WARN > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@354] - > >> > Exception > >> > > causing close of session 0x0 due to java.io.IOException: > >> > > ZooKeeperServer > >> > not > >> > > running > >> > > 2014-03-19 17:29:19,065 [myid:3] - INFO > >> > > [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - > >> > > Closed socket connection for client /10.0.160.243:41297 (no session > >> > established for > >> > > client) > >> > > 2014-03-19 17:29:19,312 [myid:3] - INFO > >> > > [LearnerHandler-/10.0.33.1:58547:LearnerHandler@263] - Follower > sid: > >> 1 : > >> > > info : > >> > org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@3c966db5 > >> > > 2014-03-19 17:29:19,314 [myid:3] - INFO > >> > > [LearnerHandler-/10.0.33.129:49810:LearnerHandler@263] - Follower > sid: > >> > 2 : > >> > > info : > >> > > org.apache.zookeeper.server.quorum.QuorumPeer$QuorumServer@466b56b > >> > > 2014-03-19 17:29:19,475 [myid:3] - ERROR > >> > > [LearnerHandler-/10.0.33.1:58547:LearnerHandler@562] - Unexpected > >> > exception > >> > > causing shutdown while sock still open java.io.EOFException > >> > > at java.io.DataInputStream.readInt(DataInputStream.java:392) > >> > > at > >> > > > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) > >> > > at > >> > > 
> >> > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPack > >> > et.java:83) > >> > > at > >> > > > >> > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java: > >> > 108) > >> > > at > >> > > > >> > org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.j > >> > ava:290) > >> > > 2014-03-19 17:29:19,476 [myid:3] - WARN > >> > > [LearnerHandler-/10.0.33.1:58547:LearnerHandler@575] - ******* > >> > > GOODBYE > >> > > /10.0.33.1:58547 ******** > >> > > 2014-03-19 17:29:19,476 [myid:3] - ERROR > >> > > [LearnerHandler-/10.0.33.129:49810:LearnerHandler@562] - Unexpected > >> > > exception causing shutdown while sock still open > >> > > java.io.EOFException > >> > > at java.io.DataInputStream.readInt(DataInputStream.java:392) > >> > > at > >> > > > org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63) > >> > > at > >> > > > >> > org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPack > >> > et.java:83) > >> > > at > >> > > > >> > org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java: > >> > 108) > >> > > at > >> > > > >> > org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.j > >> > ava:290) > >> > > 2014-03-19 17:29:19,477 [myid:3] - WARN > >> > > [LearnerHandler-/10.0.33.129:49810:LearnerHandler@575] - ******* > >> > > GOODBYE > >> > > /10.0.33.129:49810 ******** > >> > > 2014-03-19 17:29:21,757 [myid:3] - INFO > >> > > [WorkerReceiver[myid=3]:FastLeaderElection@542] - Notification: 1 > >> > > (n.leader), 0xc200000001 (n.zxid), 0x128 (n.round), LOOKING > >> > > (n.state), 1 (n.sid), 0xc5 (n.peerEPoch), LEADING (my state) > >> > > > >> > > > >> > > > >> > > 2014-03-23 12:05 GMT+09:00 Michi Mutsuzaki <[email protected]>: > >> > > > >> > >> Hi Youngseok, > >> > >> > >> > >> Could you post the log file from 192.168.161.1? The log file you > >> > >> posted indicates that 192.168.33.1 is not able to connect to > >> > >> 192.168.161.1. 
> >> > >>
> >> > >> Thanks!
> >> > >> --Michi
> >> > >>
> >> > >> On Fri, Mar 21, 2014 at 12:14 AM, Jung Young Seok
> >> > >> <[email protected]> wrote:
> >> > >> > Dear Zookeeper usergroup members,
> >> > >> >
> >> > >> > I have some questions.
> >> > >> >
> >> > >> > We currently use ZooKeeper 3.4.5 in a 3-node cluster.
> >> > >> > The ZooKeeper service stopped all of a sudden, so clients weren't able to
> >> > >> > connect to the ZooKeeper server.
> >> > >> > In that situation, the ZooKeeper servers couldn't elect a leader among themselves.
> >> > >> >
> >> > >> > Then I restarted the ZooKeeper service (all of them), but they still couldn't
> >> > >> > elect a leader and become followers.
> >> > >> > So I rebooted Linux, but the same thing happened. (I lost the ZooKeeper log here
> >> > >> > t.t) When I removed the snapshot files in the data directory,
> >> > >> > ZooKeeper worked okay.
> >> > >> > I have uploaded my ZooKeeper snapshot here:
> >> > >> > https://s3-ap-northeast-1.amazonaws.com/zookeeper-logs/data_org_b1.tar
> >> > >> >
> >> > >> > If I put the snapshot back into the data directory, the ZooKeeper cluster
> >> > >> > failure reappears.
> >> > >> >
> >> > >> > My questions are:
> >> > >> > 1. Why was the snapshot corrupted all of a sudden?
> >> > >> > 2. Is there any way I can avoid this snapshot corruption issue?
> >> > >> >
> >> > >> > I've attached zoo.cfg and some of the error log.
> >> > >> >
> >> > >> > I'd be happy to get any opinions.
> >> > >> > Thank you.
> >> > >> > > >> > >> > Best Regards > >> > >> > Youngseok Jung > >> > >> > > >> > >> > > >> > >> > #zoo.cfg (pretty much default setting) > >> > >> > tickTime=2000 > >> > >> > initLimit=10 > >> > >> > syncLimit=5 > >> > >> > dataDir=/home/zookeeper/data > >> > >> > clientPort=2181 > >> > >> > > >> > >> > server.1=192.168.33.1:2888:3888 > >> > >> > server.2=192.168.33.129:2888:3888 > >> > >> > server.3=192.168.161.1:2888:3888 > >> > >> > autopurge.snapRetainCount=3 > >> > >> > autopurge.purgeInterval=1 > >> > >> > > >> > >> > > >> > >> > #Some of error log > >> > >> > 2014-03-19 17:56:24,737 [myid:1] - INFO > >> > >> > [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: > 2 > >> > >> > (n.leader), 0xc600000001 (n.zxid), 0x144 (n.round), LEADING > >> > (n.state), 2 > >> > >> > (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state) > >> > >> > 2014-03-19 17:56:24,737 [myid:1] - WARN > >> > >> > [WorkerSender[myid=1]:QuorumCnxManager@368] - Cannot open > channel > >> > to 3 > >> > >> > at > >> > >> > election address /10.0.161.1:3888 > >> > >> > java.net.ConnectException: Connection refused > >> > >> > at java.net.PlainSocketImpl.socketConnect(Native Method) > >> > >> > at > >> > >> > > >> > >> > > >> > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.jav > >> > a:339) > >> > >> > at > >> > >> > > >> > >> > > >> > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketI > >> > mpl.java:200) > >> > >> > at > >> > >> > > >> > >> > > >> > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java: > >> > 182) > >> > >> > at > >> java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > >> > >> > at java.net.Socket.connect(Socket.java:579) > >> > >> > at > >> > >> > > >> > >> > > >> > org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumC > >> > nxManager.java:354) > >> > >> > at > >> > >> > > >> > >> > > >> > org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxMa > >> > nager.java:327) > 
>> > >> > at > >> > >> > > >> > >> > > >> > org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$Worker > >> > Sender.process(FastLeaderElection.java:393) > >> > >> > at > >> > >> > > >> > >> > > >> > org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$Worker > >> > Sender.run(FastLeaderElection.java:365) > >> > >> > at java.lang.Thread.run(Thread.java:724) > >> > >> > 2014-03-19 17:56:25,537 [myid:1] - INFO > >> > >> > [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] > >> > >> > - Notification time out: 1600 > >> > >> > 2014-03-19 17:56:25,538 [myid:1] - INFO > >> > >> > [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: > 1 > >> > >> > (n.leader), 0xc200000001 (n.zxid), 0x145 (n.round), LOOKING > >> > (n.state), 1 > >> > >> > (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state) > >> > >> > 2014-03-19 17:56:25,540 [myid:1] - INFO > >> > >> > [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: > 2 > >> > >> > (n.leader), 0xc600000001 (n.zxid), 0x144 (n.round), LEADING > >> > (n.state), 2 > >> > >> > (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state) > >> > >> > 2014-03-19 17:56:25,540 [myid:1] - WARN > >> > >> > [WorkerSender[myid=1]:QuorumCnxManager@368] - Cannot open > channel > >> > to 3 > >> > >> > at > >> > >> > election address /10.0.161.1:3888 > >> > >> > java.net.ConnectException: Connection refused > >> > >> > at java.net.PlainSocketImpl.socketConnect(Native Method) > >> > >> > at > >> > >> > > >> > >> > > >> > java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.jav > >> > a:339) > >> > >> > at > >> > >> > > >> > >> > > >> > java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketI > >> > mpl.java:200) > >> > >> > at > >> > >> > > >> > >> > > >> > java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java: > >> > 182) > >> > >> > at > >> java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392) > >> > >> > at java.net.Socket.connect(Socket.java:579) > >> > >> > 
at > >> > >> > > >> > >> > > >> > org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumC > >> > nxManager.java:354) > >> > >> > at > >> > >> > > >> > >> > > >> > org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxMa > >> > nager.java:327) > >> > >> > at > >> > >> > > >> > >> > > >> > org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$Worker > >> > Sender.process(FastLeaderElection.java:393) > >> > >> > at > >> > >> > > >> > >> > > >> > org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$Worker > >> > Sender.run(FastLeaderElection.java:365) > >> > >> > at java.lang.Thread.run(Thread.java:724) > >> > > > >> > > > >> > > >> >
