Let me try this out and I will report back on how it goes.

On Thu, Feb 12, 2015 at 12:11 AM, Michi Mutsuzaki <[email protected]> wrote:
> I would try commenting out this line from /etc/hosts:
>
> 127.0.1.1 machineA.dc.host.com machineA
>
> I guess nslookup doesn't check /etc/hosts.
>
> On Wed, Feb 11, 2015 at 11:25 PM, Check Peck <[email protected]>
> wrote:
> > In /etc/hosts, this is what I found -
> >
> > 127.0.0.1 localhost
> > 127.0.1.1 machineA.dc.host.com machineA
> >
> > # The following lines are desirable for IPv6 capable hosts
> > ::1 ip6-localhost ip6-loopback
> > fe00::0 ip6-localnet
> > ff00::0 ip6-mcastprefix
> > ff01::3 ip6-allnodes
> > ff05::2 ip6-allrouters
> >
> > 10.22.45.11 machineA.dc.host.com machineA
> >
> > And nslookup on the same machine results back -
> >
> > check@machineA:~$ nslookup machineA.dc.host.com
> > Server: 10.171.127.141
> > Address: 10.171.127.141#53
> >
> > Name: machineA.dc.host.com
> > Address: 10.22.45.11
> >
> > Does anything look off?
> >
> > On Wed, Feb 11, 2015 at 11:08 PM, Michi Mutsuzaki <[email protected]>
> > wrote:
> >>
> >> There might be an entry in /etc/hosts. You can do 'nslookup
> >> machineA.host.com' to see what ip address it resolves to. You can also
> >> try using ip addresses instead of hostnames in zoo.cfg.
> >>
> >> On Wed, Feb 11, 2015 at 11:02 PM, Check Peck <[email protected]>
> >> wrote:
> >> > Thanks Michi for the help. How can I verify this thing out?
> >> >
> >> > Is there any way which I can check in my Ubuntu machines?
> >> >
> >> > On Wed, Feb 11, 2015 at 10:58 PM, Michi Mutsuzaki <[email protected]>
> >> > wrote:
> >> >>
> >> >> Hi Check sorry for the late response.
> >> >>
> >> >> It looks like the leader election is binding to the loopback instead
> >> >> of 0.0.0.0. Maybe the hostname is configured to resolve to 127.0.1.1
> >> >> locally?
> >> >>
> >> >> [Thread-1:QuorumCnxManager$Listener@504] - My election bind port:
> >> >> machineA.host.com/127.0.1.1:3888
> >> >>
> >> >> On Wed, Feb 11, 2015 at 1:49 PM, Check Peck <[email protected]>
> >> >> wrote:
> >> >> > Adding the ZooKeeper dev mailing list to this. Has anyone seen this
> >> >> > issue before?
> >> >> >
> >> >> > On Wed, Feb 11, 2015 at 9:56 AM, Check Peck <[email protected]>
> >> >> > wrote:
> >> >> >
> >> >> >> Can anyone help me with this? Has anyone seen this kind of issue?
> >> >> >>
> >> >> >> On Tue, Feb 10, 2015 at 4:26 PM, Check Peck <[email protected]>
> >> >> >> wrote:
> >> >> >>
> >> >> >>> I have also verified there is no firewall issue. Does anyone know
> >> >> >>> what this error is about and how we can resolve it?
> >> >> >>>
> >> >> >>> On Tue, Feb 10, 2015 at 9:20 AM, Check Peck <[email protected]>
> >> >> >>> wrote:
> >> >> >>>
> >> >> >>>> I am trying to set up a 5-node ZooKeeper ensemble managed through
> >> >> >>>> Exhibitor. I have 5 machines, and on each machine I will be running
> >> >> >>>> Exhibitor and ZooKeeper. Below is my zoo.cfg file, which is
> >> >> >>>> generated by Exhibitor.
> >> >> >>>>
> >> >> >>>> #Auto-generated by Exhibitor - Mon Feb 09 10:18:35 GMT-07:00 2015
> >> >> >>>> #Mon Feb 09 10:18:35 GMT-07:00 2015
> >> >> >>>> server.3=machineC.host.com\:2888\:3888
> >> >> >>>> server.2=machineB.host.com\:2888\:3888
> >> >> >>>> server.1=machineA.host.com\:2888\:3888
> >> >> >>>> initLimit=10
> >> >> >>>> syncLimit=5
> >> >> >>>> maxClientCnxns=21000
> >> >> >>>> clientPort=2181
> >> >> >>>> tickTime=2000
> >> >> >>>> dataDir=/opt/zookeeper/data
> >> >> >>>> dataLogDir=/opt/zookeeper/data
> >> >> >>>> server.5=machineD.host.com\:2888\:3888
> >> >> >>>> server.4=machineE.host.com\:2888\:3888
> >> >> >>>>
> >> >> >>>> As soon as I start ZooKeeper through the Exhibitor config panel, I
> >> >> >>>> can see all five machines in my control panel, but they are all
> >> >> >>>> yellow, which means "ZooKeeper is running, but can't communicate
> >> >> >>>> with the rest of the ensemble". In my Exhibitor logs, I am seeing
> >> >> >>>> the following, which includes some ERROR entries.
> >> >> >>>>
> >> >> >>>> dev
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Exhibitor started [main]
> >> >> >>>> INFO org.mortbay.log Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog [main]
> >> >> >>>> INFO org.mortbay.log jetty-6.1.x [main]
> >> >> >>>> INFO org.mortbay.log Started [email protected]:8080 [main]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog State: not serving [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper down/not-serving waiting 30004 of 40000 ms before restarting [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to stop instance [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to start/restart ZooKeeper [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill attempted result: 0 [ActivityQueue-0]
> >> >> >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: JMX enabled by default [pool-2-thread-1]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh [ActivityQueue-0]
> >> >> >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg [pool-2-thread-1]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Starting zookeeper ... STARTED [pool-2-thread-2]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper down/not-serving waiting 30005 of 40000 ms before restarting [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to stop instance [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to start/restart ZooKeeper [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill attempted result: 0 [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh [ActivityQueue-0]
> >> >> >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: JMX enabled by default [pool-2-thread-1]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2]
> >> >> >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg [pool-2-thread-1]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Starting zookeeper ... STARTED [pool-2-thread-2]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper down/not-serving waiting 30004 of 40000 ms before restarting [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting down/not-serving ZooKeeper after 60014 ms pause [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to stop instance [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to start/restart ZooKeeper [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill attempted result: 0 [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh [ActivityQueue-0]
> >> >> >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: JMX enabled by default [pool-2-thread-3]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2]
> >> >> >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg [pool-2-thread-3]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Starting zookeeper ... STARTED [pool-2-thread-2]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper down/not-serving waiting 30005 of 40000 ms before restarting [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to stop instance [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to start/restart ZooKeeper [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill attempted result: 0 [ActivityQueue-0]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh [ActivityQueue-0]
> >> >> >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: JMX enabled by default [pool-2-thread-2]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-3]
> >> >> >>>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg [pool-2-thread-2]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Starting zookeeper ... STARTED [pool-2-thread-3]
> >> >> >>>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper down/not-serving waiting 30004 of 40000 ms before restarting [ActivityQueue-0]
> >> >> >>>>
> >> >> >>>> And in my zookeeper logs, I am seeing these -
> >> >> >>>>
> >> >> >>>> 2015-02-09 00:11:19,355 [myid:] - INFO [main:QuorumPeerConfig@103] - Reading configuration from: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
> >> >> >>>> 2015-02-09 00:11:19,365 [myid:] - INFO [main:QuorumPeerConfig@340] - Defaulting to majority quorums
> >> >> >>>> 2015-02-09 00:11:19,368 [myid:1] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
> >> >> >>>> 2015-02-09 00:11:19,368 [myid:1] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
> >> >> >>>> 2015-02-09 00:11:19,369 [myid:1] - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled.
> >> >> >>>> 2015-02-09 00:11:19,379 [myid:1] - INFO [main:QuorumPeerMain@127] - Starting quorum peer
> >> >> >>>> 2015-02-09 00:11:19,397 [myid:1] - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
> >> >> >>>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@959] - tickTime set to 2000
> >> >> >>>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@979] - minSessionTimeout set to -1
> >> >> >>>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@990] - maxSessionTimeout set to -1
> >> >> >>>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@1005] - initLimit set to 10
> >> >> >>>> 2015-02-09 00:11:19,431 [myid:1] - INFO [Thread-1:QuorumCnxManager$Listener@504] - My election bind port: machineA.host.com/127.0.1.1:3888
> >> >> >>>> 2015-02-09 00:11:19,440 [myid:1] - INFO [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumPeer@714] - LOOKING
> >> >> >>>> 2015-02-09 00:11:19,441 [myid:1] - INFO [QuorumPeer[myid=1]/0.0.0.0:2181:FastLeaderElection@815] - New election. My id = 1, proposed zxid=0x0
> >> >> >>>> 2015-02-09 00:11:19,443 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)
> >> >> >>>> 2015-02-09 00:11:19,445 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 2 at election address machineB.host.com/10.52.81.211:3888
> >> >> >>>> java.net.ConnectException: Connection refused
> >> >> >>>>         at java.net.PlainSocketImpl.socketConnect(Native Method)
> >> >> >>>>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
> >> >> >>>>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
> >> >> >>>>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
> >> >> >>>>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
> >> >> >>>>         at java.net.Socket.connect(Socket.java:546)
> >> >> >>>>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
> >> >> >>>>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
> >> >> >>>>         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
> >> >> >>>>         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
> >> >> >>>>         at java.lang.Thread.run(Thread.java:679)
> >> >> >>>> 2015-02-09 00:11:19,449 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 3 at election address machineC.host.com/10.57.78.941:3888
> >> >> >>>> java.net.ConnectException: Connection refused
> >> >> >>>>         at java.net.PlainSocketImpl.socketConnect(Native Method)
> >> >> >>>>         at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
> >> >> >>>>         at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
> >> >> >>>>         at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
> >> >> >>>>         at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
> >> >> >>>>         at java.net.Socket.connect(Socket.java:546)
> >> >> >>>>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
> >> >> >>>>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
> >> >> >>>>         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
> >> >> >>>>         at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
> >> >> >>>>         at java.lang.Thread.run(Thread.java:679)
> >> >> >>>> 2015-02-09 00:11:19,450 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 4 at election address machineD.host.com/10.59.576.12:3888
> >> >> >>>>
> >> >> >>>> I am running Exhibitor 1.5.3 and ZooKeeper 3.4.6. Am I doing
> >> >> >>>> anything wrong?
> >> >> >>>> I have googled this ERROR and was not able to find anything
> >> >> >>>> concrete. I have also verified that the myid file is generated
> >> >> >>>> successfully on each machine.
> >> >> >>>>
> >> >> >>>> Is this a known issue? From searching on Google, I have seen other
> >> >> >>>> people having the same issue.
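
One way to check the /etc/hosts theory from this thread is to look at what the system resolver sees rather than what nslookup reports, since nslookup queries the DNS server directly. The sketch below is only an illustration and not something posted in the thread: the function names loopback_entries and resolved_addresses are made up here, and the sample text is the IPv4 part of the /etc/hosts content quoted above. It assumes a typical Linux setup where getaddrinfo consults /etc/hosts before DNS.

```python
import ipaddress
import socket

# IPv4 lines of the /etc/hosts file quoted in the thread.
SAMPLE_HOSTS = """\
127.0.0.1 localhost
127.0.1.1 machineA.dc.host.com machineA
10.22.45.11 machineA.dc.host.com machineA
"""


def loopback_entries(hosts_text, hostname):
    """Return the loopback IPs that hosts_text maps to hostname."""
    hits = []
    for line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        if hostname not in names:
            continue
        try:
            if ipaddress.ip_address(ip).is_loopback:
                hits.append(ip)
        except ValueError:
            # Skip anything that is not a plain IP address.
            continue
    return hits


def resolved_addresses(hostname):
    """Return the addresses the system resolver gives for hostname.

    Assumption: on a typical Linux host this consults /etc/hosts
    first, so it can differ from what nslookup prints.
    """
    infos = socket.getaddrinfo(hostname, None)
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Offline check against the sample text from the thread.
    print(loopback_entries(SAMPLE_HOSTS, "machineA.dc.host.com"))
    # prints ['127.0.1.1']
```

On the affected machine, calling resolved_addresses("machineA.dc.host.com") and passing the real contents of /etc/hosts to loopback_entries should show whether the hostname maps to 127.0.1.1 locally, which would be consistent with the "My election bind port" log line.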
