Hi Check sorry for the late response. It looks like the leader election is binding to the loopback instead of 0.0.0.0. Maybe the hostname is configured to resolve to 127.0.1.1 locally?
[Thread-1:QuorumCnxManager$Listener@504] - My election bind port: machineA.host.com/127.0.1.1:3888 On Wed, Feb 11, 2015 at 1:49 PM, Check Peck <[email protected]> wrote: > Adding zookeeper dev mailing to this. Has anyone seen this issue before? > > > On Wed, Feb 11, 2015 at 9:56 AM, Check Peck <[email protected]> wrote: > >> Can anyone help me on this? Has anyone seen these kind of issues? >> >> On Tue, Feb 10, 2015 at 4:26 PM, Check Peck <[email protected]> >> wrote: >> >>> I have also verified there is no firewall issue. Does anyone know what is >>> this error all about and how we can resolve this? >>> >>> On Tue, Feb 10, 2015 at 9:20 AM, Check Peck <[email protected]> >>> wrote: >>> >>>> I am trying to setup 5 node zookeeper ensemble manage through Exhibitor. >>>> I have 5 machines and on each machine I will be running exhibitor and >>>> zookeeper. Below is my zoo.cfg file which is generated by exhibitor. >>>> >>>> #Auto-generated by Exhibitor - Mon Feb 09 10:18:35 GMT-07:00 2015 >>>> #Mon Feb 09 10:18:35 GMT-07:00 2015 >>>> server.3=machineC.host.com\: >>>> 2888\:3888 >>>> server.2=machineB.host.com\:2888\:3888 >>>> server.1=machineA.host.com\:2888\:3888 >>>> initLimit=10 >>>> syncLimit=5 >>>> maxClientCnxns=21000 >>>> clientPort=2181 >>>> tickTime=2000 >>>> dataDir=/opt/zookeeper/data >>>> dataLogDir=/opt/zookeeper/data >>>> server.5=machineD.host.com\:2888\:3888 >>>> server.4=machineE.host.com\:2888\:3888 >>>> >>>> As soon as I am starting zookeeper through Exhibitor config pannel, I >>>> can see all the five machines in my control panel but they all are yellow >>>> which means "ZooKeeper is running, but can't communicate with the rest of >>>> the ensemble" and in my Exhibitor logs, I am seeing these which has some >>>> ERROR in it. >>>> >>>> dev >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Exhibitor >>>> started [main] >>>> INFO org.mortbay.log Logging to >>>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via >>>> org.mortbay.log.Slf4jLog [main] >>>> INFO org.mortbay.log jetty-6.1.x [main] >>>> INFO org.mortbay.log Started [email protected]:8080 [main] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog State: not >>>> serving [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> down/not-serving waiting 30004 of 40000 ms before restarting >>>> [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting >>>> down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>>> stop instance [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>>> start/restart ZooKeeper [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill >>>> attempted result: 0 [ActivityQueue-0] >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: JMX enabled by default [pool-2-thread-1] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process >>>> started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh >>>> [ActivityQueue-0] >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg >>>> [pool-2-thread-1] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: Starting zookeeper ... STARTED [pool-2-thread-2] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> down/not-serving waiting 30005 of 40000 ms before restarting >>>> [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting >>>> down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>>> stop instance [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>>> start/restart ZooKeeper [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill >>>> attempted result: 0 [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process >>>> started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh >>>> [ActivityQueue-0] >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: JMX enabled by default [pool-2-thread-1] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2] >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg >>>> [pool-2-thread-1] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: Starting zookeeper ... STARTED [pool-2-thread-2] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> down/not-serving waiting 30004 of 40000 ms before restarting >>>> [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting >>>> down/not-serving ZooKeeper after 60014 ms pause [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>>> stop instance [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>>> start/restart ZooKeeper [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill >>>> attempted result: 0 [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process >>>> started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh >>>> [ActivityQueue-0] >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: JMX enabled by default [pool-2-thread-3] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2] >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg >>>> [pool-2-thread-3] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: Starting zookeeper ... STARTED [pool-2-thread-2] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> down/not-serving waiting 30005 of 40000 ms before restarting >>>> [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting >>>> down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>>> stop instance [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>>> start/restart ZooKeeper [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill >>>> attempted result: 0 [ActivityQueue-0] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process >>>> started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh >>>> [ActivityQueue-0] >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: JMX enabled by default [pool-2-thread-2] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-3] >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg >>>> [pool-2-thread-2] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> Server: Starting zookeeper ... STARTED [pool-2-thread-3] >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>>> down/not-serving waiting 30004 of 40000 ms before restarting >>>> [ActivityQueue-0] >>>> >>>> And in my zookeeper logs, I am seeing these - >>>> >>>> 2015-02-09 00:11:19,355 [myid:] - INFO [main:QuorumPeerConfig@103] >>>> - Reading configuration from: >>>> /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg >>>> 2015-02-09 00:11:19,365 [myid:] - INFO [main:QuorumPeerConfig@340] >>>> - Defaulting to majority quorums >>>> 2015-02-09 00:11:19,368 [myid:1] - INFO >>>> [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3 >>>> 2015-02-09 00:11:19,368 [myid:1] - INFO >>>> [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0 >>>> 2015-02-09 00:11:19,369 [myid:1] - INFO >>>> [main:DatadirCleanupManager@101] - Purge task is not scheduled. >>>> 2015-02-09 00:11:19,379 [myid:1] - INFO [main:QuorumPeerMain@127] >>>> - Starting quorum peer >>>> 2015-02-09 00:11:19,397 [myid:1] - INFO >>>> [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181 >>>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@959] - >>>> tickTime set to 2000 >>>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@979] - >>>> minSessionTimeout set to -1 >>>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@990] - >>>> maxSessionTimeout set to -1 >>>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@1005] - >>>> initLimit set to 10 >>>> 2015-02-09 00:11:19,431 [myid:1] - INFO >>>> [Thread-1:QuorumCnxManager$Listener@504] - My election bind port: >>>> machineA.host.com/127.0.1.1:3888 >>>> 2015-02-09 00:11:19,440 [myid:1] - INFO >>>> [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumPeer@714] - LOOKING >>>> 2015-02-09 00:11:19,441 [myid:1] - INFO >>>> [QuorumPeer[myid=1]/0.0.0.0:2181:FastLeaderElection@815] - New >>>> election. My id = 1, proposed zxid=0x0 >>>> 2015-02-09 00:11:19,443 [myid:1] - INFO >>>> [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 >>>> (message format version), 1 (n.leader), 0x0 (n.zxid), 0x1 (n.round), >>>> LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state) >>>> 2015-02-09 00:11:19,445 [myid:1] - WARN >>>> [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 2 >>>> at election address machineB.host.com/10.52.81.211:3888 >>>> java.net.ConnectException: Connection refused >>>> at java.net.PlainSocketImpl.socketConnect(Native Method) >>>> at >>>> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) >>>> at >>>> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) >>>> at >>>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) >>>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384) >>>> at java.net.Socket.connect(Socket.java:546) >>>> at >>>> org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) >>>> at >>>> org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341) >>>> at >>>> org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449) >>>> at >>>> org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430) >>>> at java.lang.Thread.run(Thread.java:679) >>>> 2015-02-09 00:11:19,449 [myid:1] - WARN >>>> [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 3 >>>> at election address machineC.host.com/10.57.78.941:3888 >>>> java.net.ConnectException: Connection refused >>>> at java.net.PlainSocketImpl.socketConnect(Native Method) >>>> at >>>> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) >>>> at >>>> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) >>>> at >>>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) >>>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384) >>>> at java.net.Socket.connect(Socket.java:546) >>>> at >>>> org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) >>>> at >>>> org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341) >>>> at >>>> org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449) >>>> at >>>> org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430) >>>> at java.lang.Thread.run(Thread.java:679) >>>> 2015-02-09 00:11:19,450 [myid:1] - WARN >>>> [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 4 >>>> at election address machineD.host.com/10.59.576.12:3888 >>>> >>>> I am running Exhibitor 1.5.3 and Zookeeper 3.4.6. Is there anything >>>> wrong I am doing? I have googled it for this ERROR and I was not able to >>>> find anything concrete. I have also verified that it is able to generate >>>> myid successfully in each machine. >>>> >>>> Is this known issue? I have seen other people also having same issue >>>> after I search on the google? >>>> >>> >>> >>
