Thanks all for the responses, but I still couldn't figure out why it's not working. If I had configured the cluster incorrectly, it should have given an error in the first place. When I kill the leader it fails, and likewise when I kill a follower and try to start it again it doesn't work either, but the other nodes in the cluster work fine.
When I kill the leader, I see the following error in one of the followers:

2014-06-13 09:35:37,215 [myid:1] - WARN [QuorumPeer[myid=1]/0.0.0.0:2181:Learner@233] - Unexpected exception, tries=1, connecting to /129.79.247.5:2888
java.net.ConnectException: Connection refused
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:351)
    at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:213)
    at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:200)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:432)
    at java.net.Socket.connect(Socket.java:529)
    at org.apache.zookeeper.server.quorum.Learner.connectToLeader(Learner.java:225)
    at org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:71)
    at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:786)

I can see that 129.79.247.5 is the other follower, so something is wrong. What I do not understand is why this does not happen when I start the cluster in the first place: when I start the cluster initially, it finishes the voting process successfully, one node becomes the leader, and the rest become followers.

Regards
Lahiru

On Thu, Jun 12, 2014 at 9:56 PM, James A. Robinson <[email protected]> wrote:

> On Thu, Jun 12, 2014 at 4:47 PM, Cameron McKenzie <[email protected]>
> wrote:
>
> > This is not correct, 3 is a minimum for redundancy. If 1 goes down, the
> > other 2 can still form a quorum (as there are more than half of them
> > remaining).
>
> Thank you, it's good to know this -- I must have gotten confused
> about the way the quorum logic worked at some point.
>
> Jim

--
System Analyst Programmer
PTI Lab
Indiana University
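
For reference, the majority rule quoted above in this thread ("more than half of them remaining") can be sketched as below. This is only an illustration of the arithmetic, not ZooKeeper code: the function names `has_quorum` and `tolerable_failures` are hypothetical helpers made up for this example.

```python
# Minimal sketch of the strict-majority rule discussed in this thread.
# Both helpers are hypothetical; they are not part of any ZooKeeper API.

def has_quorum(ensemble_size: int, live_servers: int) -> bool:
    """Return True if the live servers are more than half of the ensemble."""
    return live_servers > ensemble_size // 2


def tolerable_failures(ensemble_size: int) -> int:
    """Return how many servers can be down while a majority remains."""
    return (ensemble_size - 1) // 2


if __name__ == "__main__":
    # Three-server ensemble with one server killed: two remain.
    print(has_quorum(3, 2))       # True
    # Three-server ensemble with two servers killed: one remains.
    print(has_quorum(3, 1))       # False
    print(tolerable_failures(3))  # 1
```

Under this rule a three-server ensemble should survive the loss of any single server, leader or follower, so a failure after killing just one node points at something other than the ensemble size, for example the connectivity problem shown in the log.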
