Re: Receive timed out error while starting zookeeper server

2010-06-27 Thread Ted Dunning
Are you aware that ZooKeeper has no real concept of master and slave (at
least not by default)?

Are you actually starting servers on all of the machines in your cluster?

On Sat, Jun 26, 2010 at 6:53 AM, Peeyush Kumar wrote:

> I have a 6 node cluster (5 slaves and 1 master). I am trying to
> start the zookeeper server on the cluster. When I issue this command:
> $ java -cp zookeeper.jar:lib/log4j-1.2.15.jar:conf \
> org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg
> I get the following error:
> 2010-06-26 18:09:17,468 - INFO  [main:quorumpeercon...@80] - Reading configuration from: conf/zoo.cfg
> 2010-06-26 18:09:17,483 - INFO  [main:quorumpeercon...@232] - Defaulting to majority quorums
> 2010-06-26 18:09:17,545 - INFO  [main:quorumpeerm...@118] - Starting quorum peer
> 2010-06-26 18:09:17,585 - INFO  [QuorumPeer:/0.0.0.0:2179:quorump...@514] - LOOKING
> 2010-06-26 18:09:17,589 - INFO  [QuorumPeer:/0.0.0.0:2179:leaderelect...@154] - Server address: master.cf.net/192.168.1.1:2180
> 2010-06-26 18:09:17,589 - INFO  [QuorumPeer:/0.0.0.0:2179:leaderelect...@154] - Server address: slave01.cf.net/192.168.1.2:2180
> 2010-06-26 18:09:17,792 - WARN  [QuorumPeer:/0.0.0.0:2179:leaderelect...@194] - Ignoring exception while looking for leader
>


Re: Receive timed out error while starting zookeeper server

2010-06-27 Thread Patrick Hunt


On 06/26/2010 06:53 AM, Peeyush Kumar wrote:

> I have a 6 node cluster (5 slaves and 1 master). I am trying to


You typically want an odd number of servers, given that ZK works by majority
(even is fine, but not optimal). So 5 would be great (7 is a bit of overkill).
3 is fine too, but 5 allows you to take 1 server down for "scheduled
maintenance" and still survive an unexpected failure w/o impact to service
availability.
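
To make the majority arithmetic concrete, here is a minimal Java sketch. The class and method names (QuorumMath, majority, toleratedFailures) are made up for illustration and are not part of ZooKeeper; the only assumption is that a quorum is a strict majority of the configured servers.

```java
/** Majority-quorum arithmetic for an ensemble (illustrative helper, not a ZooKeeper API). */
final class QuorumMath {
    /** Smallest number of servers that forms a strict majority of n. */
    static int majority(int n) {
        return n / 2 + 1;
    }

    /** Number of servers that can be down while a majority is still up. */
    static int toleratedFailures(int n) {
        return n - majority(n);
    }

    public static void main(String[] args) {
        // Compare ensemble sizes 3 through 7.
        for (int n = 3; n <= 7; n++) {
            System.out.printf("servers=%d majority=%d tolerated failures=%d%n",
                    n, majority(n), toleratedFailures(n));
        }
    }
}
```

For 5 servers this gives a majority of 3 and 2 tolerated failures, which is the "one down for maintenance plus one unexpected failure" case. 6 servers need a majority of 4 and still tolerate only 2, so the sixth server adds no extra fault tolerance.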


In your exception I see "DatagramSocket", which is unusual: current versions
of ZK servers use TCP for connections by default. What version of ZK are you
running? As Lei suggested, please include your config file so that we can
review that as well (if you are overriding electionAlg, this might be part of
the problem).


Most likely there is either a config problem or a firewall that's blocking
communication between the servers. Try verifying server-to-server connectivity
on the ports you've selected.
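
One rough way to run that check from each machine is a small Java program like the sketch below. PortCheck and canConnect are hypothetical names, and the hosts and ports are whatever you pass on the command line. Note that it only tests TCP: if the election really is running over UDP, as the DatagramSocket in the trace suggests, a successful TCP connect does not prove the election traffic gets through.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Checks whether TCP connections to a list of host:port pairs succeed (illustrative sketch). */
final class PortCheck {
    /** Returns true if a TCP connection to host:port is established within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Refused, timed out, unknown host, or otherwise unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Usage: java PortCheck host:port [host:port ...]
        for (String arg : args) {
            int colon = arg.lastIndexOf(':');
            if (colon < 0) {
                System.out.println(arg + " -> skipped (expected host:port)");
                continue;
            }
            String host = arg.substring(0, colon);
            int port = Integer.parseInt(arg.substring(colon + 1));
            System.out.println(arg + " -> "
                    + (canConnect(host, port, 2000) ? "reachable" : "NOT reachable"));
        }
    }
}
```

Run it on every server against every other server, for example `java PortCheck slave01.cf.net:2180`, while the ZK process on the target machine is up. A port that is refused from one machine but reachable locally usually points at a firewall or a bind-address problem.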


Patrick


> start the zookeeper server on the cluster. When I issue this command:
> $ java -cp zookeeper.jar:lib/log4j-1.2.15.jar:conf \
> org.apache.zookeeper.server.quorum.QuorumPeerMain zoo.cfg
> I get the following error:
> 2010-06-26 18:09:17,468 - INFO  [main:quorumpeercon...@80] - Reading configuration from: conf/zoo.cfg
> 2010-06-26 18:09:17,483 - INFO  [main:quorumpeercon...@232] - Defaulting to majority quorums
> 2010-06-26 18:09:17,545 - INFO  [main:quorumpeerm...@118] - Starting quorum peer
> 2010-06-26 18:09:17,585 - INFO  [QuorumPeer:/0.0.0.0:2179:quorump...@514] - LOOKING
> 2010-06-26 18:09:17,589 - INFO  [QuorumPeer:/0.0.0.0:2179:leaderelect...@154] - Server address: master.cf.net/192.168.1.1:2180
> 2010-06-26 18:09:17,589 - INFO  [QuorumPeer:/0.0.0.0:2179:leaderelect...@154] - Server address: slave01.cf.net/192.168.1.2:2180
> 2010-06-26 18:09:17,792 - WARN  [QuorumPeer:/0.0.0.0:2179:leaderelect...@194] - Ignoring exception while looking for leader
> java.net.SocketTimeoutException: Receive timed out
>  at java.net.PlainDatagramSocketImpl.receive0(Native Method)
>  at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:136)
>  at java.net.DatagramSocket.receive(DatagramSocket.java:725)
>  at org.apache.zookeeper.server.quorum.LeaderElection.lookForLeader(LeaderElection.java:170)
>  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:515)
> 2010-06-26 18:09:17,794 - INFO  [QuorumPeer:/0.0.0.0:2179:leaderelect...@154] - Server address: slave02.cf.net/192.168.1.3:2180
> 2010-06-26 18:09:17,995 - WARN  [QuorumPeer:/0.0.0.0:2179:leaderelect...@194] - Ignoring exception while looking for leader
> java.net.SocketTimeoutException: Receive timed out
>  at java.net.PlainDatagramSocketImpl.receive0(Native Method)
>  at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:136)
>  at java.net.DatagramSocket.receive(DatagramSocket.java:725)
>  at org.apache.zookeeper.server.quorum.LeaderElection.lookForLeader(LeaderElection.java:170)
>  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:515)
> 2010-06-26 18:09:17,996 - INFO  [QuorumPeer:/0.0.0.0:2179:leaderelect...@154] - Server address: slave03.cf.net/192.168.1.4:2180
> 2010-06-26 18:09:18,197 - WARN  [QuorumPeer:/0.0.0.0:2179:leaderelect...@194] - Ignoring exception while looking for leader
> java.net.SocketTimeoutException: Receive timed out
>  at java.net.PlainDatagramSocketImpl.receive0(Native Method)
>  at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:136)
>  at java.net.DatagramSocket.receive(DatagramSocket.java:725)
>  at org.apache.zookeeper.server.quorum.LeaderElection.lookForLeader(LeaderElection.java:170)
>  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:515)
> 2010-06-26 18:09:18,200 - INFO  [QuorumPeer:/0.0.0.0:2179:leaderelect...@154] - Server address: slave04.cf.net/192.168.1.5:2180
> 2010-06-26 18:09:18,401 - WARN  [QuorumPeer:/0.0.0.0:2179:leaderelect...@194] - Ignoring exception while looking for leader
> java.net.SocketTimeoutException: Receive timed out
>  at java.net.PlainDatagramSocketImpl.receive0(Native Method)
>  at java.net.PlainDatagramSocketImpl.receive(PlainDatagramSocketImpl.java:136)
>  at java.net.DatagramSocket.receive(DatagramSocket.java:725)
>  at org.apache.zookeeper.server.quorum.LeaderElection.lookForLeader(LeaderElection.java:170)
>  at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:515)
> 2010-06-26 18:09:18,402 - INFO  [QuorumPeer:/0.0.0.0:2179:leaderelect...@154] - Server address: slave05.cf.net/192.168.1.6:2180
> 2010-06-26 18:09:18,604 - WARN  [QuorumPeer:/0.0.0.0:2179:leaderelect...@194] - Ignoring exception while looking for leader
> java.net.SocketTimeoutException: Receive timed out
>  at java.net.PlainDatagramSocketImpl.receive0(Native Method)
>  at java.net.PlainDatagramSocketImpl.receive(Plain

Re: Receive timed out error while starting zookeeper server

2010-06-27 Thread Lei Zhang
Can you show your zoo.cfg? How many ZooKeeper servers do you intend to have
in the quorum? Did you start the ZooKeeper daemon on each of the servers?
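
As a quick way to answer the "how many servers" question from the config itself, the Java sketch below lists the server.N entries in a zoo.cfg-style text. EnsembleLister is a hypothetical helper, not a ZooKeeper class, and the sample config in main is invented for illustration: the hostnames are borrowed from the log above, while the dataDir, the ports, and every other value are placeholders, not the poster's actual settings. It relies on zoo.cfg being in Java properties format.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Lists the server.N entries found in zoo.cfg-style text (illustrative sketch). */
final class EnsembleLister {
    /** Returns the configured servers keyed by server id N, in ascending id order. */
    static Map<Integer, String> servers(String cfgText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(cfgText));
        Map<Integer, String> servers = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith("server.")) {
                int id = Integer.parseInt(key.substring("server.".length()));
                servers.put(id, props.getProperty(key));
            }
        }
        return servers;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder config: every value here is an assumption for the example.
        String cfg = String.join("\n",
                "tickTime=2000",
                "dataDir=/var/zookeeper",
                "clientPort=2181",
                "initLimit=5",
                "syncLimit=2",
                "server.1=master.cf.net:2888:3888",
                "server.2=slave01.cf.net:2888:3888",
                "server.3=slave02.cf.net:2888:3888");
        Map<Integer, String> servers = servers(cfg);
        System.out.println("servers in ensemble: " + servers.size());
        servers.forEach((id, addr) -> System.out.println("  server." + id + " = " + addr));
    }
}
```

Each machine listed this way needs its own running server process, and each one also needs a myid file in its dataDir containing its own N, otherwise it cannot tell which server.N entry it is.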