Hi Youngseok,

Could you post the log file from 192.168.161.1? The log file you posted indicates that 192.168.33.1 is not able to connect to 192.168.161.1.

Thanks!
--Michi

On Fri, Mar 21, 2014 at 12:14 AM, Jung Young Seok <[email protected]> wrote:
> Dear Zookeeper usergroup members,
>
> I have some questions.
>
> We're currently using Zookeeper 3.4.5 in a 3-node cluster.
> The zookeeper service stopped all of a sudden, so clients weren't able to
> connect to the zookeeper server.
> In that situation, the zookeepers couldn't elect a leader among themselves.
>
> Then I restarted the zookeeper service (all of them), but they couldn't
> elect a leader or become followers.
> So I rebooted linux, but the same thing happened. (I lost zookeeper log here t.t)
> When I removed the snapshot files in the data directory, zookeeper worked okay.
> I have uploaded my zookeeper snapshot here
> - https://s3-ap-northeast-1.amazonaws.com/zookeeper-logs/data_org_b1.tar
>
> If I put the snapshot back into the data directory, the zookeeper clustering
> failure reappears.
>
> My questions are:
> 1. Why was the snapshot corrupted all of a sudden?
> 2. Is there any way I can avoid this snapshot corruption issue?
>
> I've attached zoo.cfg and some of the error log.
>
> I'd be happy to get any opinions.
> Thank You.
>
> Best Regards
> Youngseok Jung
>
>
> #zoo.cfg (pretty much default setting)
> tickTime=2000
> initLimit=10
> syncLimit=5
> dataDir=/home/zookeeper/data
> clientPort=2181
>
> server.1=192.168.33.1:2888:3888
> server.2=192.168.33.129:2888:3888
> server.3=192.168.161.1:2888:3888
> autopurge.snapRetainCount=3
> autopurge.purgeInterval=1
>
>
> #Some of error log
> 2014-03-19 17:56:24,737 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 2 (n.leader), 0xc600000001 (n.zxid), 0x144 (n.round), LEADING (n.state), 2 (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:56:24,737 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@368] - Cannot open channel to 3 at election address /10.0.161.1:3888
> java.net.ConnectException: Connection refused
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     at java.net.Socket.connect(Socket.java:579)
>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:354)
>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:327)
>     at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:393)
>     at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:365)
>     at java.lang.Thread.run(Thread.java:724)
> 2014-03-19 17:56:25,537 [myid:1] - INFO [QuorumPeer[myid=1]/0:0:0:0:0:0:0:0:2181:FastLeaderElection@774] - Notification time out: 1600
> 2014-03-19 17:56:25,538 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 1 (n.leader), 0xc200000001 (n.zxid), 0x145 (n.round), LOOKING (n.state), 1 (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:56:25,540 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@542] - Notification: 2 (n.leader), 0xc600000001 (n.zxid), 0x144 (n.round), LEADING (n.state), 2 (n.sid), 0xc6 (n.peerEPoch), LOOKING (my state)
> 2014-03-19 17:56:25,540 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@368] - Cannot open channel to 3 at election address /10.0.161.1:3888
> java.net.ConnectException: Connection refused
>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:339)
>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:200)
>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:182)
>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
>     at java.net.Socket.connect(Socket.java:579)
>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:354)
>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:327)
>     at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:393)
>     at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:365)
>     at java.lang.Thread.run(Thread.java:724)
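
Since the WARN entries above are "Connection refused" on an election port, it may help to check from each node whether the election ports listed in zoo.cfg are reachable at all. The following is only a rough sketch, not part of ZooKeeper: the function names `parse_servers` and `can_connect` are made up for illustration, and the config path is whatever you pass on the command line. It reads the `server.N=host:port:port` lines and tries a plain TCP connect to each election port.

```python
import re
import socket
import sys

# Matches lines such as: server.3=192.168.161.1:2888:3888
SERVER_LINE = re.compile(r"^server\.(\d+)=([^:\s]+):(\d+):(\d+)")


def parse_servers(cfg_text):
    """Return {server_id: (host, quorum_port, election_port)} from zoo.cfg text."""
    servers = {}
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if match:
            sid, host, quorum_port, election_port = match.groups()
            servers[int(sid)] = (host, int(quorum_port), int(election_port))
    return servers


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical usage: python check_election_ports.py /path/to/zoo.cfg
    with open(sys.argv[1]) as cfg_file:
        servers = parse_servers(cfg_file.read())
    for sid, (host, _quorum_port, election_port) in sorted(servers.items()):
        status = "reachable" if can_connect(host, election_port) else "NOT reachable"
        print(f"server.{sid} election port {host}:{election_port} is {status}")
```

A refused connection here only tells you that nothing is accepting connections on that address and port (server not running, or listening elsewhere); the log from that server is still needed to see why.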
