Suzuki,

You should upgrade to 3.4.2. You are hitting https://issues.apache.org/jira/browse/ZOOKEEPER-1333, which was fixed in 3.4.2.

mahadev

On Thu, Jan 12, 2012 at 10:21 PM, Suzuki, Takaaki <[email protected]> wrote:
> Hi
>
> We are using three zookeeper node.
> but, one zookeeper node broken short while ago.
> Something wrong remaining zookeeper node.
> We can't relaunch zookeeper process with remaining node.
>
> How to recover this issue? Any ideas?
> And, I share with you config and log information.
>
> -----config-----
> # http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html
> # The number of milliseconds of each tick
> tickTime=2000
> # The number of ticks that the initial
> # synchronization phase can take
> initLimit=10
> # The number of ticks that can pass between
> # sending a request and getting an acknowledgement
> syncLimit=5
> # the directory where the snapshot is stored.
> dataDir=/var/lib/zookeeper
> # Place the dataLogDir to a separate physical disc for better performance
> # dataLogDir=/disk2/zookeeper
> # the port at which the clients will connect
> clientPort=2181
>
> # specify all zookeeper servers
> # The fist port is used by followers to connect to the leader
> # The second one is used for leader election
> #server.1=zookeeper1:2888:3888
> #server.2=zookeeper2:2888:3888
> #server.3=zookeeper3:2888:3888
>
> server.1=192.168.100.4:2888:3888
> server.1=192.168.100.5:2888:3888
> server.2=192.168.100.6:2888:3888
>
> ---log---
> 2012-01-13 15:07:51,065 [myid:] - INFO [main:QuorumPeerConfig@101] - Reading configuration from: /etc/zookeeper/zoo.cfg
> 2012-01-13 15:07:51,072 [myid:] - WARN [main:QuorumPeerConfig@287] - No server failure will be tolerated. You need at least 3 servers.
> 2012-01-13 15:07:51,072 [myid:] - INFO [main:QuorumPeerConfig@334] - Defaulting to majority quorums
> 2012-01-13 15:07:51,075 [myid:2] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
> 2012-01-13 15:07:51,076 [myid:2] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
> 2012-01-13 15:07:51,076 [myid:2] - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled.
> 2012-01-13 15:07:51,123 [myid:2] - INFO [main:QuorumPeerMain@127] - Starting quorum peer
> 2012-01-13 15:07:51,138 [myid:2] - INFO [main:NIOServerCnxnFactory@110] - binding to port 0.0.0.0/0.0.0.0:2181
> 2012-01-13 15:07:51,152 [myid:2] - INFO [main:QuorumPeer@914] - tickTime set to 2000
> 2012-01-13 15:07:51,153 [myid:2] - INFO [main:QuorumPeer@934] - minSessionTimeout set to -1
> 2012-01-13 15:07:51,153 [myid:2] - INFO [main:QuorumPeer@945] - maxSessionTimeout set to -1
> 2012-01-13 15:07:51,153 [myid:2] - INFO [main:QuorumPeer@960] - initLimit set to 10
> 2012-01-13 15:07:51,164 [myid:2] - INFO [main:FileSnap@83] - Reading snapshot /var/lib/zookeeper/version-2/snapshot.0
> 2012-01-13 15:07:51,352 [myid:2] - ERROR [main:QuorumPeerMain@89] - Unexpected exception, exiting abnormally
> java.lang.NullPointerException
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.processTransaction(FileTxnSnapLog.java:203)
>     at org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:150)
>     at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:223)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.loadDataBase(QuorumPeer.java:418)
>     at org.apache.zookeeper.server.quorum.QuorumPeer.start(QuorumPeer.java:410)
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.runFromConfig(QuorumPeerMain.java:151)
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:111)
>     at org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:78)
>
> Thanks!
>
> Suzuki
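
A side note on the quoted config, separate from the ZOOKEEPER-1333 fix: as pasted, it declares `server.1` twice and has no `server.3`. If the real zoo.cfg looks the same (rather than this being a typo in the mail), that may be why the log warns "You need at least 3 servers", since a repeated key would normally leave only two distinct server entries. Below is a minimal Python sketch for spotting repeated server ids. The function name `find_duplicate_server_ids` and the `SAMPLE` text are illustrative only and are not part of ZooKeeper.

```python
import re
from collections import defaultdict

# Matches active (uncommented) lines such as "server.1=host:2888:3888".
SERVER_LINE = re.compile(r"^server\.(\d+)\s*=\s*(\S+)")


def find_duplicate_server_ids(cfg_text):
    """Return {server_id: [addresses]} for every id declared more than once.

    Hypothetical helper for illustration; it only inspects the text of a
    zoo.cfg-style file and does not talk to ZooKeeper.
    """
    seen = defaultdict(list)
    for line in cfg_text.splitlines():
        stripped = line.strip()
        # Skip blank lines and comments, e.g. "#server.1=zookeeper1:2888:3888".
        if not stripped or stripped.startswith("#"):
            continue
        match = SERVER_LINE.match(stripped)
        if match:
            seen[int(match.group(1))].append(match.group(2))
    return {sid: addrs for sid, addrs in seen.items() if len(addrs) > 1}


# Server lines copied from the quoted config above.
SAMPLE = """\
#server.1=zookeeper1:2888:3888
server.1=192.168.100.4:2888:3888
server.1=192.168.100.5:2888:3888
server.2=192.168.100.6:2888:3888
"""

if __name__ == "__main__":
    for sid, addrs in find_duplicate_server_ids(SAMPLE).items():
        print(f"server.{sid} is declared {len(addrs)} times: {addrs}")
```

To check a real file, read it first, for example `find_duplicate_server_ids(open("/etc/zookeeper/zoo.cfg").read())`, and give each server its own id if anything is reported.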
