Adding zookeeper dev mailing to this. Has anyone seen this issue before?
On Wed, Feb 11, 2015 at 9:56 AM, Check Peck <[email protected]> wrote: > Can anyone help me on this? Has anyone seen these kind of issues? > > On Tue, Feb 10, 2015 at 4:26 PM, Check Peck <[email protected]> > wrote: > >> I have also verified there is no firewall issue. Does anyone know what is >> this error all about and how we can resolve this? >> >> On Tue, Feb 10, 2015 at 9:20 AM, Check Peck <[email protected]> >> wrote: >> >>> I am trying to setup 5 node zookeeper ensemble manage through Exhibitor. >>> I have 5 machines and on each machine I will be running exhibitor and >>> zookeeper. Below is my zoo.cfg file which is generated by exhibitor. >>> >>> #Auto-generated by Exhibitor - Mon Feb 09 10:18:35 GMT-07:00 2015 >>> #Mon Feb 09 10:18:35 GMT-07:00 2015 >>> server.3=machineC.host.com\: >>> 2888\:3888 >>> server.2=machineB.host.com\:2888\:3888 >>> server.1=machineA.host.com\:2888\:3888 >>> initLimit=10 >>> syncLimit=5 >>> maxClientCnxns=21000 >>> clientPort=2181 >>> tickTime=2000 >>> dataDir=/opt/zookeeper/data >>> dataLogDir=/opt/zookeeper/data >>> server.5=machineD.host.com\:2888\:3888 >>> server.4=machineE.host.com\:2888\:3888 >>> >>> As soon as I am starting zookeeper through Exhibitor config pannel, I >>> can see all the five machines in my control panel but they all are yellow >>> which means "ZooKeeper is running, but can’t communicate with the rest of >>> the ensemble" and in my Exhibitor logs, I am seeing these which has some >>> ERROR in it. >>> >>> dev >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Exhibitor >>> started [main] >>> INFO org.mortbay.log Logging to >>> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via >>> org.mortbay.log.Slf4jLog [main] >>> INFO org.mortbay.log jetty-6.1.x [main] >>> INFO org.mortbay.log Started [email protected]:8080 [main] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog State: not >>> serving [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> down/not-serving waiting 30004 of 40000 ms before restarting >>> [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting >>> down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>> stop instance [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>> start/restart ZooKeeper [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill >>> attempted result: 0 [ActivityQueue-0] >>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: JMX enabled by default [pool-2-thread-1] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process >>> started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh >>> [ActivityQueue-0] >>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg >>> [pool-2-thread-1] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: Starting zookeeper ... STARTED [pool-2-thread-2] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> down/not-serving waiting 30005 of 40000 ms before restarting >>> [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting >>> down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>> stop instance [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>> start/restart ZooKeeper [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill >>> attempted result: 0 [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process >>> started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh >>> [ActivityQueue-0] >>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: JMX enabled by default [pool-2-thread-1] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2] >>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg >>> [pool-2-thread-1] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: Starting zookeeper ... STARTED [pool-2-thread-2] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> down/not-serving waiting 30004 of 40000 ms before restarting >>> [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting >>> down/not-serving ZooKeeper after 60014 ms pause [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>> stop instance [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>> start/restart ZooKeeper [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill >>> attempted result: 0 [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process >>> started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh >>> [ActivityQueue-0] >>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: JMX enabled by default [pool-2-thread-3] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2] >>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg >>> [pool-2-thread-3] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: Starting zookeeper ... STARTED [pool-2-thread-2] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> down/not-serving waiting 30005 of 40000 ms before restarting >>> [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting >>> down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>> stop instance [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to >>> start/restart ZooKeeper [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill >>> attempted result: 0 [ActivityQueue-0] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process >>> started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh >>> [ActivityQueue-0] >>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: JMX enabled by default [pool-2-thread-2] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-3] >>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg >>> [pool-2-thread-2] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> Server: Starting zookeeper ... STARTED [pool-2-thread-3] >>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper >>> down/not-serving waiting 30004 of 40000 ms before restarting >>> [ActivityQueue-0] >>> >>> And in my zookeeper logs, I am seeing these - >>> >>> 2015-02-09 00:11:19,355 [myid:] - INFO [main:QuorumPeerConfig@103] >>> - Reading configuration from: >>> /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg >>> 2015-02-09 00:11:19,365 [myid:] - INFO [main:QuorumPeerConfig@340] >>> - Defaulting to majority quorums >>> 2015-02-09 00:11:19,368 [myid:1] - INFO >>> [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3 >>> 2015-02-09 00:11:19,368 [myid:1] - INFO >>> [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0 >>> 2015-02-09 00:11:19,369 [myid:1] - INFO >>> [main:DatadirCleanupManager@101] - Purge task is not scheduled. >>> 2015-02-09 00:11:19,379 [myid:1] - INFO [main:QuorumPeerMain@127] >>> - Starting quorum peer >>> 2015-02-09 00:11:19,397 [myid:1] - INFO >>> [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181 >>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@959] - >>> tickTime set to 2000 >>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@979] - >>> minSessionTimeout set to -1 >>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@990] - >>> maxSessionTimeout set to -1 >>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@1005] - >>> initLimit set to 10 >>> 2015-02-09 00:11:19,431 [myid:1] - INFO >>> [Thread-1:QuorumCnxManager$Listener@504] - My election bind port: >>> machineA.host.com/127.0.1.1:3888 >>> 2015-02-09 00:11:19,440 [myid:1] - INFO >>> [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumPeer@714] - LOOKING >>> 2015-02-09 00:11:19,441 [myid:1] - INFO >>> [QuorumPeer[myid=1]/0.0.0.0:2181:FastLeaderElection@815] - New >>> election. My id = 1, proposed zxid=0x0 >>> 2015-02-09 00:11:19,443 [myid:1] - INFO >>> [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 >>> (message format version), 1 (n.leader), 0x0 (n.zxid), 0x1 (n.round), >>> LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state) >>> 2015-02-09 00:11:19,445 [myid:1] - WARN >>> [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 2 >>> at election address machineB.host.com/10.52.81.211:3888 >>> java.net.ConnectException: Connection refused >>> at java.net.PlainSocketImpl.socketConnect(Native Method) >>> at >>> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) >>> at >>> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) >>> at >>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) >>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384) >>> at java.net.Socket.connect(Socket.java:546) >>> at >>> org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) >>> at >>> org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341) >>> at >>> org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449) >>> at >>> org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430) >>> at java.lang.Thread.run(Thread.java:679) >>> 2015-02-09 00:11:19,449 [myid:1] - WARN >>> [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 3 >>> at election address machineC.host.com/10.57.78.941:3888 >>> java.net.ConnectException: Connection refused >>> at java.net.PlainSocketImpl.socketConnect(Native Method) >>> at >>> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327) >>> at >>> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193) >>> at >>> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180) >>> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384) >>> at java.net.Socket.connect(Socket.java:546) >>> at >>> org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368) >>> at >>> org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341) >>> at >>> org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449) >>> at >>> org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430) >>> at java.lang.Thread.run(Thread.java:679) >>> 2015-02-09 00:11:19,450 [myid:1] - WARN >>> [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 4 >>> at election address machineD.host.com/10.59.576.12:3888 >>> >>> I am running Exhibitor 1.5.3 and Zookeeper 3.4.6. Is there anything >>> wrong I am doing? I have googled it for this ERROR and I was not able to >>> find anything concrete. I have also verified that it is able to generate >>> myid successfully in each machine. >>> >>> Is this known issue? I have seen other people also having same issue >>> after I search on the google? >>> >> >> >
