The internalBlockUntilConnectedOrTimedOut() method will timeout after connectionTimeoutMs.
It seems like it should be more than that since connectionTimeoutMs will be the time it takes to check only 1 node. Thoughts? On Fri, Jul 29, 2016 at 2:55 PM, Benjamin Jaton <[email protected]> wrote: > Adding the end of the log: > > 570438 [main] DEBUG org.apache.curator.RetryLoop - Retry policy not > allowing retry > 570438 [main-EventThread] INFO > org.apache.curator.framework.state.ConnectionStateManager - State change: > SUSPENDED > Exception in thread "main" > org.apache.zookeeper.KeeperException$ConnectionLossException: > KeeperErrorCode = ConnectionLoss for / > at org.apache.zookeeper.KeeperException.create(KeeperException.java:99) > at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1956) > at > org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:310) > at > org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:299) > at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108) > at > org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:295) > at > org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:287) > at > org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:34) > at > testzkclient.TestCuratorReadOnlyClient.main(TestCuratorReadOnlyClient.java:25) > > On Fri, Jul 29, 2016 at 2:49 PM, Benjamin Jaton <[email protected]> > wrote: > >> I was making a mistake in the ZK test, the ZK client works properly ( I >> was not waiting for the watcher to give me a ConnectedReadOnly event ). >> >> So I retried with zk 3.5.2 and curator 2.11.0, I have more logs (see >> below) >> >> 10.11.9.70:2181 is the functioning node, >> 10.11.157.4:2181 and 10.11.157.210:2181 don't exist. >> >> I stopped capturing at 300sec. >> >> This line: >> 15043 [main] DEBUG org.apache.curator.CuratorZookeeperClient - >> blockUntilConnectedOrTimedOut() end. isConnected: false >> is not right. Why did it stop blocking? >> Is there a way to get more info? >> >> 0 [main] INFO org.apache.curator.framework.imps.CuratorFrameworkImpl - >> Starting >> 4 [main] DEBUG org.apache.curator.CuratorZookeeperClient - Starting >> 4 [main] DEBUG org.apache.curator.ConnectionState - Starting >> 4 [main] DEBUG org.apache.curator.ConnectionState - reset >> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:zookeeper.version=3.5.2-alpha-1753710, built on 07/21/2016 >> 16:24 GMT >> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment: >> host.name=w-rli09-ben >> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:java.version=1.8.0_11 >> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:java.vendor=Oracle Corporation >> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:java.home=/usr/local/apps/jdk1.8.0_11/jre >> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:java.class.path=/home/benji/workspace-trunk/testzkclient/bin:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/commons-cli-1.2.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/jackson-core-asl-1.9.11.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/jackson-mapper-asl-1.9.11.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/javacc.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/jetty-6.1.26.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/jetty-util-6.1.26.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/jline-2.11.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/log4j-1.2.17.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/netty-3.10.5.Final.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/servlet-api-2.5-20081211.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/slf4j-api-1.7.5.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/slf4j-log4j12-1.7.5.jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/zookeeper-3.5.2-alpha.jar:/home/benji/workspace-trunk/apache-curator-3.2.0/guava-16.0.1.jar:/home/benji/workspace-trunk/apache-curator-3.2.0/curator-client-2.11.0.jar:/home/benji/workspace-trunk/apache-curator-3.2.0/curator-framework-2.11.0.jar >> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib >> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:java.io.tmpdir=/tmp >> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:java.compiler=<NA> >> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment: >> os.name=Linux >> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:os.arch=amd64 >> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:os.version=3.8.0-44-generic >> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment: >> user.name=benji >> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:user.home=/home/benji >> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:user.dir=/home/benji/workspace-trunk/testzkclient >> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:os.memory.free=378MB >> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:os.memory.max=5358MB >> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >> environment:os.memory.total=381MB >> 16 [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client >> connection, connectString=10.11.9.70:2181,10.11.157.4:2181, >> 10.11.157.210:2181 sessionTimeout=45000 >> watcher=org.apache.curator.ConnectionState@1ff8b8f >> 27 [main] DEBUG org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer >> is 4194304 >> 39 [main] DEBUG org.apache.curator.CuratorZookeeperClient - >> blockUntilConnectedOrTimedOut() start >> 39 [main-SendThread(10.11.157.210:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Opening socket connection to server >> 10.11.157.210/10.11.157.210:2181. Will not attempt to authenticate using >> SASL (unknown error) >> *15043 [main] DEBUG org.apache.curator.CuratorZookeeperClient - >> blockUntilConnectedOrTimedOut() end. isConnected: false* >> Successfully established the connection with ZooKeeper >> 15058 [main-SendThread(10.11.157.210:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 15023ms for sessionid 0x0 >> 15058 [main-SendThread(10.11.157.210:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 15023ms for sessionid 0x0, closing socket connection and >> attempting reconnect >> 15059 [main-SendThread(10.11.157.210:2181)] DEBUG >> org.apache.zookeeper.ClientCnxnSocketNIO - Ignoring exception during >> shutdown input >> java.net.SocketException: Socket is not connected >> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >> at sun.nio.ch.Net.translateException(Net.java:177) >> at sun.nio.ch.Net.translateException(Net.java:183) >> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:404) >> at >> org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:198) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1325) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1245) >> Caused by: java.nio.channels.NotYetConnectedException >> at >> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:777) >> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:402) >> ... 3 more >> 15062 [main-SendThread(10.11.157.210:2181)] DEBUG >> org.apache.zookeeper.ClientCnxnSocketNIO - Ignoring exception during >> shutdown output >> java.net.SocketException: Socket is not connected >> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >> at sun.nio.ch.Net.translateException(Net.java:177) >> at sun.nio.ch.Net.translateException(Net.java:183) >> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:412) >> at >> org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:205) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1325) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1245) >> Caused by: java.nio.channels.NotYetConnectedException >> at >> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:794) >> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:410) >> ... 3 more >> 15169 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Opening socket connection to server >> 10.11.9.70/10.11.9.70:2181. Will not attempt to authenticate using SASL >> (unknown error) >> 15170 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Socket connection established, >> initiating session, client: /10.11.12.4:44742, server: >> 10.11.9.70/10.11.9.70:2181 >> 15173 [main-SendThread(10.11.9.70:2181)] DEBUG >> org.apache.zookeeper.ClientCnxn - Session establishment request sent on >> 10.11.9.70/10.11.9.70:2181 >> 15180 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Session establishment complete on server >> 10.11.9.70/10.11.9.70:2181, sessionid = 0x10003fa439a0060, negotiated >> timeout = 40000 (READ-ONLY mode) >> 15182 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Checking server >> 10.11.157.4/10.11.157.4:2181 for being r/w. Timeout 200 >> 15187 [main-EventThread] INFO >> org.apache.curator.framework.state.ConnectionStateManager - State change: >> READ_ONLY >> 142434 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Unable to read additional data from >> server sessionid 0x10003fa439a0060, likely server has closed socket, >> closing socket connection and attempting reconnect >> 142538 [main-EventThread] INFO >> org.apache.curator.framework.state.ConnectionStateManager - State change: >> SUSPENDED >> 142545 [main] DEBUG org.apache.curator.RetryLoop - Retry-able exception >> received >> org.apache.zookeeper.KeeperException$ConnectionLossException: >> KeeperErrorCode = ConnectionLoss for / >> at >> org.apache.zookeeper.KeeperException.create(KeeperException.java:99) >> at >> org.apache.zookeeper.KeeperException.create(KeeperException.java:51) >> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1956) >> at >> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:310) >> at >> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:299) >> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108) >> at >> org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:295) >> at >> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:287) >> at >> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:34) >> at >> testzkclient.TestCuratorReadOnlyClient.main(TestCuratorReadOnlyClient.java:25) >> 143211 [main-SendThread(10.11.157.210:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Opening socket connection to server >> 10.11.157.210/10.11.157.210:2181. Will not attempt to authenticate using >> SASL (unknown error) >> 147546 [main] DEBUG org.apache.curator.RetryLoop - Retrying operation >> 156556 [main-SendThread(10.11.157.210:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 14019ms for sessionid 0x10003fa439a0060 >> 156557 [main-SendThread(10.11.157.210:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 14019ms for sessionid 0x10003fa439a0060, closing socket >> connection and attempting reconnect >> 156557 [main-SendThread(10.11.157.210:2181)] DEBUG >> org.apache.zookeeper.ClientCnxnSocketNIO - Ignoring exception during >> shutdown input >> java.net.SocketException: Socket is not connected >> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >> at sun.nio.ch.Net.translateException(Net.java:177) >> at sun.nio.ch.Net.translateException(Net.java:183) >> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:404) >> at >> org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:198) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1325) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1245) >> Caused by: java.nio.channels.NotYetConnectedException >> at >> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:777) >> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:402) >> ... 3 more >> 156557 [main-SendThread(10.11.157.210:2181)] DEBUG >> org.apache.zookeeper.ClientCnxnSocketNIO - Ignoring exception during >> shutdown output >> java.net.SocketException: Socket is not connected >> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >> at sun.nio.ch.Net.translateException(Net.java:177) >> at sun.nio.ch.Net.translateException(Net.java:183) >> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:412) >> at >> org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:205) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1325) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1245) >> Caused by: java.nio.channels.NotYetConnectedException >> at >> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:794) >> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:410) >> ... 3 more >> 157544 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState - >> Connection timed out for connection string (10.11.9.70:2181, >> 10.11.157.4:2181,10.11.157.210:2181) and timeout (15000) / elapsed >> (15006) >> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = >> ConnectionLoss >> at >> org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197) >> at >> org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88) >> at >> org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:116) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:835) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) >> at java.util.concurrent.FutureTask.run(FutureTask.java:266) >> at >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) >> at >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:745) >> 158163 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Opening socket connection to server >> 10.11.9.70/10.11.9.70:2181. Will not attempt to authenticate using SASL >> (unknown error) >> 158164 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Socket connection established, >> initiating session, client: /10.11.12.4:45070, server: >> 10.11.9.70/10.11.9.70:2181 >> 158164 [main-SendThread(10.11.9.70:2181)] DEBUG >> org.apache.zookeeper.ClientCnxn - Session establishment request sent on >> 10.11.9.70/10.11.9.70:2181 >> 158166 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Session establishment complete on server >> 10.11.9.70/10.11.9.70:2181, sessionid = 0x10003fa439a0061, negotiated >> timeout = 40000 (READ-ONLY mode) >> 158166 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Checking server >> 10.11.157.4/10.11.157.4:2181 for being r/w. Timeout 400 >> 158167 [main-EventThread] INFO >> org.apache.curator.framework.state.ConnectionStateManager - State change: >> READ_ONLY >> 285409 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Unable to read additional data from >> server sessionid 0x10003fa439a0061, likely server has closed socket, >> closing socket connection and attempting reconnect >> 285510 [main] DEBUG org.apache.curator.RetryLoop - Retry-able exception >> received >> org.apache.zookeeper.KeeperException$ConnectionLossException: >> KeeperErrorCode = ConnectionLoss for / >> at >> org.apache.zookeeper.KeeperException.create(KeeperException.java:99) >> at >> org.apache.zookeeper.KeeperException.create(KeeperException.java:51) >> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1956) >> at >> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:310) >> at >> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:299) >> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108) >> at >> org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:295) >> at >> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:287) >> at >> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:34) >> at >> testzkclient.TestCuratorReadOnlyClient.main(TestCuratorReadOnlyClient.java:25) >> 285516 [main-EventThread] INFO >> org.apache.curator.framework.state.ConnectionStateManager - State change: >> SUSPENDED >> 286312 [main-SendThread(10.11.157.210:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Opening socket connection to server >> 10.11.157.210/10.11.157.210:2181. Will not attempt to authenticate using >> SASL (unknown error) >> 290511 [main] DEBUG org.apache.curator.RetryLoop - Retrying operation >> 295516 [Curator-Framework-0] INFO >> org.apache.curator.framework.state.ConnectionStateManager - State change: >> LOST >> 295517 [Curator-Framework-0] ERROR >> org.apache.curator.framework.imps.CuratorFrameworkImpl - Background >> operation retry gave up >> org.apache.zookeeper.KeeperException$ConnectionLossException: >> KeeperErrorCode = ConnectionLoss >> at >> org.apache.zookeeper.KeeperException.create(KeeperException.java:99) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:728) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:857) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) >> at java.util.concurrent.FutureTask.run(FutureTask.java:266) >> at >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) >> at >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:745) >> 295518 [Curator-Framework-0] ERROR >> org.apache.curator.framework.imps.CuratorFrameworkImpl - Background retry >> gave up >> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = >> ConnectionLoss >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:838) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) >> at java.util.concurrent.FutureTask.run(FutureTask.java:266) >> at >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) >> at >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:745) >> 299659 [main-SendThread(10.11.157.210:2181)] WARN >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 14149ms for sessionid 0x10003fa439a0061 >> 299659 [main-SendThread(10.11.157.210:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Client session timed out, have not heard >> from server in 14149ms for sessionid 0x10003fa439a0061, closing socket >> connection and attempting reconnect >> 299660 [main-SendThread(10.11.157.210:2181)] DEBUG >> org.apache.zookeeper.ClientCnxnSocketNIO - Ignoring exception during >> shutdown input >> java.net.SocketException: Socket is not connected >> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >> at sun.nio.ch.Net.translateException(Net.java:177) >> at sun.nio.ch.Net.translateException(Net.java:183) >> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:404) >> at >> org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:198) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1325) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1245) >> Caused by: java.nio.channels.NotYetConnectedException >> at >> sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:777) >> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:402) >> ... 3 more >> 299660 [main-SendThread(10.11.157.210:2181)] DEBUG >> org.apache.zookeeper.ClientCnxnSocketNIO - Ignoring exception during >> shutdown output >> java.net.SocketException: Socket is not connected >> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >> at sun.nio.ch.Net.translateException(Net.java:177) >> at sun.nio.ch.Net.translateException(Net.java:183) >> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:412) >> at >> org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:205) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1325) >> at >> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1245) >> Caused by: java.nio.channels.NotYetConnectedException >> at >> sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:794) >> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:410) >> ... 3 more >> 300520 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState - >> Connection timed out for connection string (10.11.9.70:2181, >> 10.11.157.4:2181,10.11.157.210:2181) and timeout (15000) / elapsed >> (15004) >> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = >> ConnectionLoss >> at >> org.apache.curator.ConnectionState.checkTimeouts(ConnectionState.java:197) >> at >> org.apache.curator.ConnectionState.getZooKeeper(ConnectionState.java:88) >> at >> org.apache.curator.CuratorZookeeperClient.getZooKeeper(CuratorZookeeperClient.java:116) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:835) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:809) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:64) >> at >> org.apache.curator.framework.imps.CuratorFrameworkImpl$4.call(CuratorFrameworkImpl.java:267) >> at java.util.concurrent.FutureTask.run(FutureTask.java:266) >> at >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) >> at >> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >> at >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) >> at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) >> at java.lang.Thread.run(Thread.java:745) >> 301077 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Opening socket connection to server >> 10.11.9.70/10.11.9.70:2181. Will not attempt to authenticate using SASL >> (unknown error) >> 301078 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Socket connection established, >> initiating session, client: /10.11.12.4:45385, server: >> 10.11.9.70/10.11.9.70:2181 >> 301078 [main-SendThread(10.11.9.70:2181)] DEBUG >> org.apache.zookeeper.ClientCnxn - Session establishment request sent on >> 10.11.9.70/10.11.9.70:2181 >> 301081 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Session establishment complete on server >> 10.11.9.70/10.11.9.70:2181, sessionid = 0x10003fa439a0062, negotiated >> timeout = 40000 (READ-ONLY mode) >> 301082 [main-SendThread(10.11.9.70:2181)] INFO >> org.apache.zookeeper.ClientCnxn - Checking server >> 10.11.157.4/10.11.157.4:2181 for being r/w. Timeout 800 >> 301082 [main-EventThread] INFO >> org.apache.curator.framework.state.ConnectionStateManager - State change: >> READ_ONLY >> >> >> On Tue, Jul 12, 2016 at 4:11 PM, Benjamin Jaton <[email protected] >> > wrote: >> >>> Cameron, you seem to be right, the simple zookeeper client also fails >>> sometimes. >>> I will bring it up to the ZK team, thank you! >>> >>> On Tue, Jul 12, 2016 at 3:09 PM, Cameron McKenzie < >>> [email protected]> wrote: >>> >>>> I'm at a bit of a loss then. Looking at the stack trace, it's in the ZK >>>> code (not the Curator code) that it appears to be blocking. Maybe try and >>>> create a test that removes Curator and just uses the ZK client directly and >>>> see if you still have the same problem? >>>> >>>> On Wed, Jul 13, 2016 at 7:54 AM, Benjamin Jaton < >>>> [email protected]> wrote: >>>> >>>>> Yes, I mean the ping goes through properly >>>>> >>>>> On Tue, Jul 12, 2016 at 2:50 PM, Cameron McKenzie < >>>>> [email protected]> wrote: >>>>> >>>>>> Are the 'dead' hosts reachable? I seem to recall having problems with >>>>>> the ZK client blocking for a long time if you had a host configured that >>>>>> was resolvable but not reachable. >>>>>> >>>>>> On Wed, Jul 13, 2016 at 4:30 AM, Benjamin Jaton < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> More to the story: >>>>>>> >>>>>>> While it consistently works on the same machine than the sole >>>>>>> running ZK node when I specify only 2 of the 3 servers in the connection >>>>>>> string, >>>>>>> it doesn't work when I specify 3 hosts: >>>>>>> >>>>>>> >>>>>>> 2016-07-12 11:23:46 INFO o.a.z.ZooKeeper:716 - Initiating client >>>>>>> connection, >>>>>>> connectString=QA-E8WIN11:2181,QA-E8WIN12:2181,QA-E8WIN13:2181 >>>>>>> sessionTimeout=45000 watcher=org.apache.curator.ConnectionState@c86b9e3 >>>>>>> 2016-07-12 11:23:46 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>> connection to server QA-E8WIN13/10.11.8.230:2181. Will not attempt >>>>>>> to authenticate using SASL (unknown error) >>>>>>> 2016-07-12 11:24:01 INFO o.a.z.ClientCnxn:1251 - Client session >>>>>>> timed out, have not heard from server in 15018ms for sessionid 0x0, >>>>>>> closing >>>>>>> socket connection and attempting reconnect >>>>>>> Successfully established the connection with ZooKeeper >>>>>>> 2016-07-12 11:24:01 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>> connection to server QA-E8WIN12/10.11.8.232:2181. Will not attempt >>>>>>> to authenticate using SASL (unknown error) >>>>>>> 2016-07-12 11:24:16 INFO o.a.z.ClientCnxn:1251 - Client session >>>>>>> timed out, have not heard from server in 15000ms for sessionid 0x0, >>>>>>> closing >>>>>>> socket connection and attempting reconnect >>>>>>> 2016-07-12 11:24:16 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not attempt >>>>>>> to authenticate using SASL (unknown error) >>>>>>> 2016-07-12 11:24:16 INFO o.a.z.ClientCnxn:980 - Socket connection >>>>>>> established, initiating session, client: /10.11.8.236:50423, >>>>>>> server: QA-E8WIN11/10.11.8.236:2181 >>>>>>> 2016-07-12 11:24:16 INFO o.a.z.ClientCnxn:1400 - Session >>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>>>>>> sessionid = 0x10003c8af64001b, negotiated timeout = 45000 (READ-ONLY >>>>>>> mode) >>>>>>> 2016-07-12 11:24:16 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 200 >>>>>>> 2016-07-12 11:24:16 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>> State change: READ_ONLY >>>>>>> 2016-07-12 11:24:37 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 400 >>>>>>> 2016-07-12 11:24:58 INFO o.a.z.ClientCnxn:1251 - Client session >>>>>>> timed out, have not heard from server in 42023ms for sessionid >>>>>>> 0x10003c8af64001b, closing socket connection and attempting reconnect >>>>>>> 2016-07-12 11:24:58 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>> State change: SUSPENDED >>>>>>> 2016-07-12 11:25:00 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not attempt >>>>>>> to authenticate using SASL (unknown error) >>>>>>> 2016-07-12 11:25:00 INFO o.a.z.ClientCnxn:980 - Socket connection >>>>>>> established, initiating session, client: /10.11.8.236:50428, >>>>>>> server: QA-E8WIN11/10.11.8.236:2181 >>>>>>> 2016-07-12 11:25:00 INFO o.a.z.ClientCnxn:1400 - Session >>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>>>>>> sessionid = 0x10003c8af64001c, negotiated timeout = 45000 (READ-ONLY >>>>>>> mode) >>>>>>> 2016-07-12 11:25:00 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 800 >>>>>>> 2016-07-12 11:25:00 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>> State change: READ_ONLY >>>>>>> 2016-07-12 11:25:21 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 1600 >>>>>>> 2016-07-12 11:25:42 INFO o.a.z.ClientCnxn:1251 - Client session >>>>>>> timed out, have not heard from server in 42011ms for sessionid >>>>>>> 0x10003c8af64001c, closing socket connection and attempting reconnect >>>>>>> 2016-07-12 11:25:42 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>> State change: SUSPENDED >>>>>>> 2016-07-12 11:25:44 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not attempt >>>>>>> to authenticate using SASL (unknown error) >>>>>>> 2016-07-12 11:25:44 INFO o.a.z.ClientCnxn:980 - Socket connection >>>>>>> established, initiating session, client: /10.11.8.236:50435, >>>>>>> server: QA-E8WIN11/10.11.8.236:2181 >>>>>>> 2016-07-12 11:25:44 INFO o.a.z.ClientCnxn:1400 - Session >>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>>>>>> sessionid = 0x10003c8af64001d, negotiated timeout = 45000 (READ-ONLY >>>>>>> mode) >>>>>>> 2016-07-12 11:25:44 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 3200 >>>>>>> 2016-07-12 11:25:44 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>> State change: READ_ONLY >>>>>>> 2016-07-12 11:26:05 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 6400 >>>>>>> 2016-07-12 11:26:26 INFO o.a.z.ClientCnxn:1251 - Client session >>>>>>> timed out, have not heard from server in 42003ms for sessionid >>>>>>> 0x10003c8af64001d, closing socket connection and attempting reconnect >>>>>>> 2016-07-12 11:26:26 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>> State change: SUSPENDED >>>>>>> Exception in thread "main" >>>>>>> org.apache.zookeeper.KeeperException$ConnectionLossException: >>>>>>> KeeperErrorCode = ConnectionLoss for / >>>>>>> at >>>>>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:99) >>>>>>> at >>>>>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:51) >>>>>>> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1753) >>>>>>> at >>>>>>> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:310) >>>>>>> at >>>>>>> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:299) >>>>>>> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108) >>>>>>> at >>>>>>> org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:295) >>>>>>> at >>>>>>> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:287) >>>>>>> at >>>>>>> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:34) >>>>>>> at test.Test.main(Test.java:25) >>>>>>> >>>>>>> >>>>>>> That seems to suggest that adding 1 host to check somehow messes up >>>>>>> something in the timeouts. >>>>>>> I am suspicious of those: >>>>>>> 2016-07-12 11:25:44 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 3200 >>>>>>> They seem to be "blocking" the existing connection we have to the >>>>>>> healthy ZK server, which goes stale. Is that possible? >>>>>>> >>>>>>> >>>>>>> On Tue, Jul 12, 2016 at 11:17 AM, Benjamin Jaton < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> I did not use the zkCli tool but the ZK java API directly. And >>>>>>>> other tools like ZooInspector (contrib). They were run from a remote >>>>>>>> machine. >>>>>>>> I ran it on the same machine and the one read only server is >>>>>>>> running properly, I can browse it using those methods. >>>>>>>> >>>>>>>> Trying to make sense of those client-server interactions is >>>>>>>> difficult but here is what I got: >>>>>>>> >>>>>>>> Client: >>>>>>>> 2016-07-12 10:55:01,508 INFO org.apache.zookeeper.ClientCnxn:1138 >>>>>>>> - *Opening socket connection to server QA-E8WIN11*/10.11.8.236:2181. >>>>>>>> Will not attempt to authenticate using SASL (unknown error) >>>>>>>> 2016-07-12 10:55:01,512 INFO org.apache.zookeeper.ClientCnxn:980 - >>>>>>>> *Socket >>>>>>>> connection established*, initiating session, client: / >>>>>>>> 10.11.12.4:60586, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>> 2016-07-12 10:55:01,521 INFO org.apache.zookeeper.ClientCnxn:1400 >>>>>>>> - *Session establishment complete on server QA-E8WIN11*/ >>>>>>>> 10.11.8.236:2181, *sessionid = 0x10003c8af640007*, negotiated >>>>>>>> timeout = 45000 (READ-ONLY mode) >>>>>>>> 2016-07-12 10:55:01,523 INFO org.apache.zookeeper.ClientCnxn:1297 >>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. >>>>>>>> Timeout 200 >>>>>>>> 2016-07-12 10:55:01,528 INFO >>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - *State >>>>>>>> change: READ_ONLY* >>>>>>>> 2016-07-12 10:57:08,800 INFO org.apache.zookeeper.ClientCnxn:1253 >>>>>>>> - *Unable to read additional data from server sessionid >>>>>>>> 0x10003c8af640007, likely server has closed socket, closing socket >>>>>>>> connection and attempting reconnect* >>>>>>>> 2016-07-12 10:57:08,903 INFO >>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - *State >>>>>>>> change: SUSPENDED* >>>>>>>> >>>>>>>> >>>>>>>> Server: >>>>>>>> 2016-07-12T10:55:56,468 [myid:] - INFO [ProcessThread(sid:1 >>>>>>>> cport:-1)::ZooKeeperServer@678] - *Established session >>>>>>>> 0x10003c8af640007 with negotiated timeout 45000* for client / >>>>>>>> 10.11.12.4:60586 >>>>>>>> (...) >>>>>>>> 2016-07-12T10:56:41,756 [myid:] - INFO >>>>>>>> [SessionTracker:ZooKeeperServer@384] - *Expiring session >>>>>>>> 0x10003c8af640007, timeout of 45000ms exceeded* >>>>>>>> >>>>>>>> >>>>>>>> Note: Don't worry about the time being about 1 min off between the >>>>>>>> machines. >>>>>>>> >>>>>>>> What I don't understand is why we are losing the read-only >>>>>>>> connection we successfully established. ZK ro server says it timed out? >>>>>>>> >>>>>>>> Now interestingly, when I run it on the same machine it works and >>>>>>>> completes pretty quickly: >>>>>>>> >>>>>>>> >>>>>>>> 2016-07-12 10:43:49 INFO o.a.z.ZooKeeper:716 - Initiating client >>>>>>>> connection, connectString=QA-E8WIN11:2181,QA-E8WIN12:2181 >>>>>>>> sessionTimeout=45000 watcher=org.apache.curator.ConnectionState@e4487af >>>>>>>> 2016-07-12 10:43:49 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>>> connection to server QA-E8WIN12/10.11.8.232:2181. Will not attempt >>>>>>>> to authenticate using SASL (unknown error) >>>>>>>> Successfully established the connection with ZooKeeper >>>>>>>> 2016-07-12 10:44:10 WARN o.a.z.ClientCnxn:1257 - Session 0x0 for >>>>>>>> server null, unexpected error, closing socket connection and attempting >>>>>>>> reconnect >>>>>>>> java.net.ConnectException: Connection timed out: no further >>>>>>>> information >>>>>>>> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) >>>>>>>> ~[?:1.8.0_91] >>>>>>>> at >>>>>>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) >>>>>>>> ~[?:1.8.0_91] >>>>>>>> at >>>>>>>> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355) >>>>>>>> ~[zookeeper-3.5.0-alpha.jar:3.5.2-alpha-SNAPSHOT-1702518] >>>>>>>> at >>>>>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1236) >>>>>>>> [zookeeper-3.5.0-alpha.jar:3.5.2-alpha-SNAPSHOT-1702518] >>>>>>>> 2016-07-12 10:44:11 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not attempt >>>>>>>> to authenticate using SASL (unknown error) >>>>>>>> 2016-07-12 10:44:11 INFO o.a.z.ClientCnxn:980 - Socket connection >>>>>>>> established, initiating session, client: /10.11.8.236:50147, >>>>>>>> server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>> 2016-07-12 10:44:11 INFO o.a.z.ClientCnxn:1400 - Session >>>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>>>>>>> sessionid = 0x10003b315b30012, negotiated timeout = 45000 (READ-ONLY >>>>>>>> mode) >>>>>>>> 2016-07-12 10:44:11 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 200 >>>>>>>> 2016-07-12 10:44:11 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>>> State change: READ_ONLY >>>>>>>> 2016-07-12 10:44:32 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>> QA-E8WIN11/10.11.8.236:2181 for being r/w. Timeout 400 >>>>>>>> Done. >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Jul 11, 2016 at 4:42 PM, Cameron McKenzie < >>>>>>>> [email protected]> wrote: >>>>>>>> >>>>>>>>> It looks like a connection issue based on the logs. When you >>>>>>>>> confirmed that the connection was ok via the zkCli, was that running >>>>>>>>> on the >>>>>>>>> same host as the read only server? Can you run it from the same host >>>>>>>>> that >>>>>>>>> you're running the Java client from? >>>>>>>>> >>>>>>>>> On Tue, Jul 12, 2016 at 9:40 AM, Benjamin Jaton < >>>>>>>>> [email protected]> wrote: >>>>>>>>> >>>>>>>>>> Well I was trying to let it run and it eventually fails, with a >>>>>>>>>> stack: >>>>>>>>>> >>>>>>>>>> Successfully established the connection with ZooKeeper >>>>>>>>>> 2016-07-11 16:28:21,943 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1251 - Client session timed out, >>>>>>>>>> have not >>>>>>>>>> heard from server in 22519ms for sessionid 0x0, closing socket >>>>>>>>>> connection >>>>>>>>>> and attempting reconnect >>>>>>>>>> 2016-07-11 16:28:22,046 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 - Opening socket connection to >>>>>>>>>> server >>>>>>>>>> QA-E8WIN11/10.11.8.236:2181. Will not attempt to authenticate >>>>>>>>>> using SASL (unknown error) >>>>>>>>>> 2016-07-11 16:28:22,050 INFO org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>> 10.11.12.4:54702, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>> 2016-07-11 16:28:22,058 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1400 - Session establishment >>>>>>>>>> complete on >>>>>>>>>> server QA-E8WIN11/10.11.8.236:2181, sessionid = >>>>>>>>>> 0x100024f09360080, negotiated timeout = 45000 (READ-ONLY mode) >>>>>>>>>> 2016-07-11 16:28:22,061 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1297 - Checking server QA-E8WIN12/ >>>>>>>>>> 10.11.8.232:2181 for being r/w. Timeout 200 >>>>>>>>>> 2016-07-11 16:28:22,066 INFO >>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>>>>>>>>> change: READ_ONLY >>>>>>>>>> 2016-07-11 16:30:29,312 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1253 - Unable to read additional >>>>>>>>>> data from >>>>>>>>>> server sessionid 0x100024f09360080, likely server has closed socket, >>>>>>>>>> closing socket connection and attempting reconnect >>>>>>>>>> 2016-07-11 16:30:29,416 INFO >>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>>>>>>>>> change: SUSPENDED >>>>>>>>>> 2016-07-11 16:30:31,031 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 - Opening socket connection to >>>>>>>>>> server >>>>>>>>>> QA-E8WIN11/10.11.8.236:2181. Will not attempt to authenticate >>>>>>>>>> using SASL (unknown error) >>>>>>>>>> 2016-07-11 16:30:31,032 INFO org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>> 10.11.12.4:55156, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>> 2016-07-11 16:30:31,034 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1400 - Session establishment >>>>>>>>>> complete on >>>>>>>>>> server QA-E8WIN11/10.11.8.236:2181, sessionid = >>>>>>>>>> 0x100024f09360083, negotiated timeout = 45000 (READ-ONLY mode) >>>>>>>>>> 2016-07-11 16:30:31,034 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1297 - Checking server QA-E8WIN12/ >>>>>>>>>> 10.11.8.232:2181 for being r/w. Timeout 400 >>>>>>>>>> 2016-07-11 16:30:31,035 INFO >>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>>>>>>>>> change: READ_ONLY >>>>>>>>>> 2016-07-11 16:32:38,335 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1253 - Unable to read additional >>>>>>>>>> data from >>>>>>>>>> server sessionid 0x100024f09360083, likely server has closed socket, >>>>>>>>>> closing socket connection and attempting reconnect >>>>>>>>>> 2016-07-11 16:32:38,441 INFO >>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>>>>>>>>> change: SUSPENDED >>>>>>>>>> 2016-07-11 16:32:40,425 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 - Opening socket connection to >>>>>>>>>> server >>>>>>>>>> QA-E8WIN11/10.11.8.236:2181. Will not attempt to authenticate >>>>>>>>>> using SASL (unknown error) >>>>>>>>>> 2016-07-11 16:32:40,426 INFO org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>> 10.11.12.4:55637, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>> 2016-07-11 16:32:40,428 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1400 - Session establishment >>>>>>>>>> complete on >>>>>>>>>> server QA-E8WIN11/10.11.8.236:2181, sessionid = >>>>>>>>>> 0x100024f09360087, negotiated timeout = 45000 (READ-ONLY mode) >>>>>>>>>> 2016-07-11 16:32:40,428 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1297 - Checking server QA-E8WIN12/ >>>>>>>>>> 10.11.8.232:2181 for being r/w. Timeout 800 >>>>>>>>>> 2016-07-11 16:32:40,428 INFO >>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>>>>>>>>> change: READ_ONLY >>>>>>>>>> 2016-07-11 16:34:47,615 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1253 - Unable to read additional >>>>>>>>>> data from >>>>>>>>>> server sessionid 0x100024f09360087, likely server has closed socket, >>>>>>>>>> closing socket connection and attempting reconnect >>>>>>>>>> 2016-07-11 16:34:47,716 INFO >>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>>>>>>>>> change: SUSPENDED >>>>>>>>>> 2016-07-11 16:34:49,557 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 - Opening socket connection to >>>>>>>>>> server >>>>>>>>>> QA-E8WIN11/10.11.8.236:2181. Will not attempt to authenticate >>>>>>>>>> using SASL (unknown error) >>>>>>>>>> 2016-07-11 16:34:49,558 INFO org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>> 10.11.12.4:56094, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>> 2016-07-11 16:34:49,560 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1400 - Session establishment >>>>>>>>>> complete on >>>>>>>>>> server QA-E8WIN11/10.11.8.236:2181, sessionid = >>>>>>>>>> 0x100024f09360089, negotiated timeout = 45000 (READ-ONLY mode) >>>>>>>>>> 2016-07-11 16:34:49,560 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1297 - Checking server QA-E8WIN12/ >>>>>>>>>> 10.11.8.232:2181 for being r/w. Timeout 1600 >>>>>>>>>> 2016-07-11 16:34:49,560 INFO >>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>>>>>>>>> change: READ_ONLY >>>>>>>>>> 2016-07-11 16:36:56,895 INFO >>>>>>>>>> org.apache.zookeeper.ClientCnxn:1253 - Unable to read additional >>>>>>>>>> data from >>>>>>>>>> server sessionid 0x100024f09360089, likely server has closed socket, >>>>>>>>>> closing socket connection and attempting reconnect >>>>>>>>>> Exception in thread "main" 2016-07-11 16:36:56,996 INFO >>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>>>>>>>>> change: SUSPENDED >>>>>>>>>> org.apache.zookeeper.KeeperException$ConnectionLossException: >>>>>>>>>> KeeperErrorCode = ConnectionLoss for / >>>>>>>>>> at >>>>>>>>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:99) >>>>>>>>>> at >>>>>>>>>> org.apache.zookeeper.KeeperException.create(KeeperException.java:51) >>>>>>>>>> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1753) >>>>>>>>>> at >>>>>>>>>> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:310) >>>>>>>>>> at >>>>>>>>>> org.apache.curator.framework.imps.GetDataBuilderImpl$4.call(GetDataBuilderImpl.java:299) >>>>>>>>>> at >>>>>>>>>> org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108) >>>>>>>>>> at >>>>>>>>>> org.apache.curator.framework.imps.GetDataBuilderImpl.pathInForeground(GetDataBuilderImpl.java:295) >>>>>>>>>> at >>>>>>>>>> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:287) >>>>>>>>>> at >>>>>>>>>> org.apache.curator.framework.imps.GetDataBuilderImpl.forPath(GetDataBuilderImpl.java:34) >>>>>>>>>> at TestCuratorReadOnly.main(TestCuratorReadOnly.java:23) >>>>>>>>>> >>>>>>>>>> On Mon, Jul 11, 2016 at 4:39 PM, Cameron McKenzie < >>>>>>>>>> [email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> So, it appears that the initial connection occurs ok, but then >>>>>>>>>>> something goes wrong >>>>>>>>>>> >>>>>>>>>>> 2016-07-11 15:49:15,712 INFO org.apache.zookeeper.ClientCnxn:1253 >>>>>>>>>>> - Unable to read additional data from server sessionid >>>>>>>>>>> 0x100024f0936005b, >>>>>>>>>>> likely server has closed socket, closing socket connection and >>>>>>>>>>> attempting >>>>>>>>>>> reconnect >>>>>>>>>>> >>>>>>>>>>> Then a reconnection occurs ok, and then the cycle starts again. >>>>>>>>>>> >>>>>>>>>>> Can you see where in the stack the getData() call is hanging? >>>>>>>>>>> >>>>>>>>>>> On Tue, Jul 12, 2016 at 9:35 AM, Benjamin Jaton < >>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>> >>>>>>>>>>>> The blockUntilConnectedOrTimedOut() does return, we see the >>>>>>>>>>>> line: >>>>>>>>>>>> "Successfully established the connection with ZooKeeper" >>>>>>>>>>>> which happens right after. >>>>>>>>>>>> >>>>>>>>>>>> The getData() is where it hangs. >>>>>>>>>>>> >>>>>>>>>>>> We see this first: >>>>>>>>>>>> 2016-07-11 15:47:08,420 INFO >>>>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - >>>>>>>>>>>> State >>>>>>>>>>>> change: READ_ONLY >>>>>>>>>>>> >>>>>>>>>>>> and then we see it again 2 minutes later: >>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO >>>>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - >>>>>>>>>>>> State >>>>>>>>>>>> change: READ_ONLY >>>>>>>>>>>> >>>>>>>>>>>> and then again 2 minutes after that: >>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO >>>>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - >>>>>>>>>>>> State >>>>>>>>>>>> change: READ_ONLY >>>>>>>>>>>> >>>>>>>>>>>> Then I stopped it. >>>>>>>>>>>> >>>>>>>>>>>> On Mon, Jul 11, 2016 at 4:13 PM, Cameron McKenzie < >>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> hey Ben, >>>>>>>>>>>>> From the logs it looks like the connection is established at >>>>>>>>>>>>> the end? >>>>>>>>>>>>> >>>>>>>>>>>>> Are you saying that the blockUntilConnectedOrTimedOut() does >>>>>>>>>>>>> not return when this occurs? >>>>>>>>>>>>> cheers >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Tue, Jul 12, 2016 at 9:10 AM, Benjamin Jaton < >>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> ZK is 3.5.1-alpha >>>>>>>>>>>>>> Curator is 2.11.0 >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Jul 11, 2016 at 4:01 PM, Benjamin Jaton < >>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> I have a 3 nodes ZK cluster , readonly mode is enabled. >>>>>>>>>>>>>>> 2 nodes are down, so the third one is in read-only (verified >>>>>>>>>>>>>>> by using the ZK API manually). >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I'm using this piece of code: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Builder curatorClientBuilder = >>>>>>>>>>>>>>> CuratorFrameworkFactory.builder() >>>>>>>>>>>>>>> .connectString("QA-E8WIN11:2181,QA-E8WIN12:2181") >>>>>>>>>>>>>>> .sessionTimeoutMs(45000).connectionTimeoutMs(15000) >>>>>>>>>>>>>>> .retryPolicy(new RetryNTimes(3, 5000)).canBeReadOnly(true); >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> CuratorFramework client = curatorClientBuilder.build(); >>>>>>>>>>>>>>> client.start(); >>>>>>>>>>>>>>> client.getZookeeperClient().blockUntilConnectedOrTimedOut(); >>>>>>>>>>>>>>> System.out.println("Successfully established the connection >>>>>>>>>>>>>>> with ZooKeeper"); >>>>>>>>>>>>>>> client.getData().forPath("/"); >>>>>>>>>>>>>>> System.out.println("Done."); >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> When curator pick the host that is UP first, it goes through >>>>>>>>>>>>>>> very quickly. When it picks the host that is down first >>>>>>>>>>>>>>> (QA-E8WIN12), it >>>>>>>>>>>>>>> never goes through. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Am I missing something here? >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Below is the log: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2016-07-11 15:46:45,733 INFO >>>>>>>>>>>>>>> org.apache.curator.framework.imps.CuratorFrameworkImpl:235 - >>>>>>>>>>>>>>> Starting >>>>>>>>>>>>>>> 2016-07-11 15:46:45,743 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:zookeeper.version=3.5.1-alpha-1748895, built on >>>>>>>>>>>>>>> 06/17/2016 >>>>>>>>>>>>>>> 22:24 GMT >>>>>>>>>>>>>>> 2016-07-11 15:46:45,743 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client environment: >>>>>>>>>>>>>>> host.name=w-rli09-ben >>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:java.version=1.8.0_11 >>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:java.vendor=Oracle >>>>>>>>>>>>>>> Corporation >>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:java.home=/usr/local/apps/jdk1.8.0_11/jre >>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:java.class.path=(..) >>>>>>>>>>>>>>> 2016-07-11 15:46:45,746 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:java.io.tmpdir=/tmp >>>>>>>>>>>>>>> 2016-07-11 15:46:45,746 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:java.compiler=<NA> >>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client environment: >>>>>>>>>>>>>>> os.name=Linux >>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:os.arch=amd64 >>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:os.version=3.8.0-44-generic >>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client environment: >>>>>>>>>>>>>>> user.name=benji >>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:user.home=(..) >>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:user.dir=(..) >>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:os.memory.free=385MB >>>>>>>>>>>>>>> 2016-07-11 15:46:45,748 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:os.memory.max=5358MB >>>>>>>>>>>>>>> 2016-07-11 15:46:45,748 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 - Client >>>>>>>>>>>>>>> environment:os.memory.total=397MB >>>>>>>>>>>>>>> 2016-07-11 15:46:45,750 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:716 - Initiating client >>>>>>>>>>>>>>> connection, >>>>>>>>>>>>>>> connectString=QA-E8WIN11:2181,QA-E8WIN12:2181 >>>>>>>>>>>>>>> sessionTimeout=45000 >>>>>>>>>>>>>>> watcher=org.apache.curator.ConnectionState@3e96bacf >>>>>>>>>>>>>>> 2016-07-11 15:46:45,769 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 - Opening socket >>>>>>>>>>>>>>> connection to server >>>>>>>>>>>>>>> QA-E8WIN12/10.11.8.232:2181. Will not attempt to >>>>>>>>>>>>>>> authenticate using SASL (unknown error) >>>>>>>>>>>>>>> Successfully established the connection with ZooKeeper >>>>>>>>>>>>>>> 2016-07-11 15:47:08,300 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1251 - Client session timed >>>>>>>>>>>>>>> out, have not >>>>>>>>>>>>>>> heard from server in 22534ms for sessionid 0x0, closing socket >>>>>>>>>>>>>>> connection >>>>>>>>>>>>>>> and attempting reconnect >>>>>>>>>>>>>>> 2016-07-11 15:47:08,403 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 - Opening socket >>>>>>>>>>>>>>> connection to server >>>>>>>>>>>>>>> QA-E8WIN11/10.11.8.236:2181. Will not attempt to >>>>>>>>>>>>>>> authenticate using SASL (unknown error) >>>>>>>>>>>>>>> 2016-07-11 15:47:08,406 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:980 - Socket connection >>>>>>>>>>>>>>> established, >>>>>>>>>>>>>>> initiating session, client: /10.11.12.4:47004, server: >>>>>>>>>>>>>>> QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>>>>>>> 2016-07-11 15:47:08,413 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1400 - Session establishment >>>>>>>>>>>>>>> complete on >>>>>>>>>>>>>>> server QA-E8WIN11/10.11.8.236:2181, sessionid = >>>>>>>>>>>>>>> 0x100024f0936005b, negotiated timeout = 45000 (READ-ONLY mode) >>>>>>>>>>>>>>> 2016-07-11 15:47:08,415 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1297 - Checking server >>>>>>>>>>>>>>> QA-E8WIN12/ >>>>>>>>>>>>>>> 10.11.8.232:2181 for being r/w. Timeout 200 >>>>>>>>>>>>>>> 2016-07-11 15:47:08,420 INFO >>>>>>>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 >>>>>>>>>>>>>>> - State >>>>>>>>>>>>>>> change: READ_ONLY >>>>>>>>>>>>>>> 2016-07-11 15:49:15,712 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1253 - Unable to read >>>>>>>>>>>>>>> additional data from >>>>>>>>>>>>>>> server sessionid 0x100024f0936005b, likely server has closed >>>>>>>>>>>>>>> socket, >>>>>>>>>>>>>>> closing socket connection and attempting reconnect >>>>>>>>>>>>>>> 2016-07-11 15:49:15,815 INFO >>>>>>>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 >>>>>>>>>>>>>>> - State >>>>>>>>>>>>>>> change: SUSPENDED >>>>>>>>>>>>>>> 2016-07-11 15:49:17,313 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 - Opening socket >>>>>>>>>>>>>>> connection to server >>>>>>>>>>>>>>> QA-E8WIN11/10.11.8.236:2181. Will not attempt to >>>>>>>>>>>>>>> authenticate using SASL (unknown error) >>>>>>>>>>>>>>> 2016-07-11 15:49:17,314 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:980 - Socket connection >>>>>>>>>>>>>>> established, >>>>>>>>>>>>>>> initiating session, client: /10.11.12.4:47295, server: >>>>>>>>>>>>>>> QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1400 - Session establishment >>>>>>>>>>>>>>> complete on >>>>>>>>>>>>>>> server QA-E8WIN11/10.11.8.236:2181, sessionid = >>>>>>>>>>>>>>> 0x100024f0936005f, negotiated timeout = 45000 (READ-ONLY mode) >>>>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1297 - Checking server >>>>>>>>>>>>>>> QA-E8WIN12/ >>>>>>>>>>>>>>> 10.11.8.232:2181 for being r/w. Timeout 400 >>>>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO >>>>>>>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 >>>>>>>>>>>>>>> - State >>>>>>>>>>>>>>> change: READ_ONLY >>>>>>>>>>>>>>> 2016-07-11 15:51:24,607 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1253 - Unable to read >>>>>>>>>>>>>>> additional data from >>>>>>>>>>>>>>> server sessionid 0x100024f0936005f, likely server has closed >>>>>>>>>>>>>>> socket, >>>>>>>>>>>>>>> closing socket connection and attempting reconnect >>>>>>>>>>>>>>> 2016-07-11 15:51:24,713 INFO >>>>>>>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 >>>>>>>>>>>>>>> - State >>>>>>>>>>>>>>> change: SUSPENDED >>>>>>>>>>>>>>> 2016-07-11 15:51:26,467 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 - Opening socket >>>>>>>>>>>>>>> connection to server >>>>>>>>>>>>>>> QA-E8WIN11/10.11.8.236:2181. Will not attempt to >>>>>>>>>>>>>>> authenticate using SASL (unknown error) >>>>>>>>>>>>>>> 2016-07-11 15:51:26,468 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:980 - Socket connection >>>>>>>>>>>>>>> established, >>>>>>>>>>>>>>> initiating session, client: /10.11.12.4:47576, server: >>>>>>>>>>>>>>> QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1400 - Session establishment >>>>>>>>>>>>>>> complete on >>>>>>>>>>>>>>> server QA-E8WIN11/10.11.8.236:2181, sessionid = >>>>>>>>>>>>>>> 0x100024f09360061, negotiated timeout = 45000 (READ-ONLY mode) >>>>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO >>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1297 - Checking server >>>>>>>>>>>>>>> QA-E8WIN12/ >>>>>>>>>>>>>>> 10.11.8.232:2181 for being r/w. Timeout 800 >>>>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO >>>>>>>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 >>>>>>>>>>>>>>> - State >>>>>>>>>>>>>>> change: READ_ONLY >>>>>>>>>>>>>>> (...) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>>> Ben >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >
