bump On Fri, Jul 29, 2016 at 3:06 PM, Benjamin Jaton <[email protected]> wrote:
> The internalBlockUntilConnectedOrTimedOut() method will timeout after > connectionTimeoutMs. > > It seems like it should be more than that since connectionTimeoutMs will > be the time it takes to check only 1 node. > > Thoughts? > > On Fri, Jul 29, 2016 at 2:55 PM, Benjamin Jaton <[email protected]> > wrote: > >> Adding the end of the log: >> >> 570438 [main] DEBUG org.apache.curator.RetryLoop - Retry policy not >> allowing retry >> 570438 [main-EventThread] INFO >> org.apache.curator.framework.state.ConnectionStateManager >> - State change: SUSPENDED >> Exception in thread "main" >> org.apache.zookeeper.KeeperException$ConnectionLossException: >> KeeperErrorCode = ConnectionLoss for / >> at org.apache.zookeeper.KeeperException.create( >> KeeperException.java:99) >> at org.apache.zookeeper.KeeperException.create( >> KeeperException.java:51) >> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1956) >> at org.apache.curator.framework.imps.GetDataBuilderImpl$4. >> call(GetDataBuilderImpl.java:310) >> at org.apache.curator.framework.imps.GetDataBuilderImpl$4. >> call(GetDataBuilderImpl.java:299) >> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108) >> at org.apache.curator.framework.imps.GetDataBuilderImpl. >> pathInForeground(GetDataBuilderImpl.java:295) >> at org.apache.curator.framework.imps.GetDataBuilderImpl. >> forPath(GetDataBuilderImpl.java:287) >> at org.apache.curator.framework.imps.GetDataBuilderImpl. >> forPath(GetDataBuilderImpl.java:34) >> at testzkclient.TestCuratorReadOnlyClient.main( >> TestCuratorReadOnlyClient.java:25) >> >> On Fri, Jul 29, 2016 at 2:49 PM, Benjamin Jaton <[email protected] >> > wrote: >> >>> I was making a mistake in the ZK test, the ZK client works properly ( I >>> was not waiting for the watcher to give me a ConnectedReadOnly event ). >>> >>> So I retried with zk 3.5.2 and curator 2.11.0, I have more logs (see >>> below) >>> >>> 10.11.9.70:2181 is the functioning node, >>> 10.11.157.4:2181 and 10.11.157.210:2181 don't exist. >>> >>> I stopped capturing at 300sec. >>> >>> This line: >>> 15043 [main] DEBUG org.apache.curator.CuratorZookeeperClient - >>> blockUntilConnectedOrTimedOut() end. isConnected: false >>> is not right. Why did it stop blocking? >>> Is there a way to get more info? >>> >>> 0 [main] INFO org.apache.curator.framework.imps.CuratorFrameworkImpl - >>> Starting >>> 4 [main] DEBUG org.apache.curator.CuratorZookeeperClient - Starting >>> 4 [main] DEBUG org.apache.curator.ConnectionState - Starting >>> 4 [main] DEBUG org.apache.curator.ConnectionState - reset >>> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:zookeeper.version=3.5.2-alpha-1753710, built on 07/21/2016 >>> 16:24 GMT >>> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment: >>> host.name=w-rli09-ben >>> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:java.version=1.8.0_11 >>> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:java.vendor=Oracle Corporation >>> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:java.home=/usr/local/apps/jdk1.8.0_11/jre >>> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:java.class.path=/home/benji/workspace-trunk/ >>> testzkclient/bin:/home/benji/workspace-trunk/zookeeper-3.5. >>> 2/build/lib/commons-cli-1.2.jar:/home/benji/workspace- >>> trunk/zookeeper-3.5.2/build/lib/jackson-core-asl-1.9.11. >>> jar:/home/benji/workspace-trunk/zookeeper-3.5.2/build/ >>> lib/jackson-mapper-asl-1.9.11.jar:/home/benji/workspace- >>> trunk/zookeeper-3.5.2/build/lib/javacc.jar:/home/benji/ >>> workspace-trunk/zookeeper-3.5.2/build/lib/jetty-6.1.26.jar:/ >>> home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/ >>> jetty-util-6.1.26.jar:/home/benji/workspace-trunk/ >>> zookeeper-3.5.2/build/lib/jline-2.11.jar:/home/benji/ >>> workspace-trunk/zookeeper-3.5.2/build/lib/log4j-1.2.17.jar:/ >>> home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/ >>> netty-3.10.5.Final.jar:/home/benji/workspace-trunk/ >>> zookeeper-3.5.2/build/lib/servlet-api-2.5-20081211.jar:/ >>> home/benji/workspace-trunk/zookeeper-3.5.2/build/lib/ >>> slf4j-api-1.7.5.jar:/home/benji/workspace-trunk/ >>> zookeeper-3.5.2/build/lib/slf4j-log4j12-1.7.5.jar:/home/ >>> benji/workspace-trunk/zookeeper-3.5.2/build/zookeeper-3.5.2-alpha.jar:/ >>> home/benji/workspace-trunk/apache-curator-3.2.0/guava-16. >>> 0.1.jar:/home/benji/workspace-trunk/apache-curator-3.2.0/ >>> curator-client-2.11.0.jar:/home/benji/workspace-trunk/ >>> apache-curator-3.2.0/curator-framework-2.11.0.jar >>> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:java.library.path=/usr/java/packages/lib/amd64:/ >>> usr/lib64:/lib64:/lib:/usr/lib >>> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:java.io.tmpdir=/tmp >>> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:java.compiler=<NA> >>> 12 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment: >>> os.name=Linux >>> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:os.arch=amd64 >>> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:os.version=3.8.0-44-generic >>> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client environment: >>> user.name=benji >>> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:user.home=/home/benji >>> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:user.dir=/home/benji/workspace-trunk/testzkclient >>> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:os.memory.free=378MB >>> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:os.memory.max=5358MB >>> 13 [main] INFO org.apache.zookeeper.ZooKeeper - Client >>> environment:os.memory.total=381MB >>> 16 [main] INFO org.apache.zookeeper.ZooKeeper - Initiating client >>> connection, connectString=10.11.9.70:2181,10.11.157.4:2181,10.11.157. >>> 210:2181 sessionTimeout=45000 watcher=org.apache.curator. >>> ConnectionState@1ff8b8f >>> 27 [main] DEBUG org.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer >>> is 4194304 >>> 39 [main] DEBUG org.apache.curator.CuratorZookeeperClient - >>> blockUntilConnectedOrTimedOut() start >>> 39 [main-SendThread(10.11.157.210:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Opening socket connection to server 10.11.157.210/10.11.157.210:2181. >>> Will not attempt to authenticate using SASL (unknown error) >>> *15043 [main] DEBUG org.apache.curator.CuratorZookeeperClient - >>> blockUntilConnectedOrTimedOut() end. isConnected: false* >>> Successfully established the connection with ZooKeeper >>> 15058 [main-SendThread(10.11.157.210:2181)] WARN >>> org.apache.zookeeper.ClientCnxn >>> - Client session timed out, have not heard from server in 15023ms for >>> sessionid 0x0 >>> 15058 [main-SendThread(10.11.157.210:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Client session timed out, have not heard from server in 15023ms for >>> sessionid 0x0, closing socket connection and attempting reconnect >>> 15059 [main-SendThread(10.11.157.210:2181)] DEBUG >>> org.apache.zookeeper.ClientCnxnSocketNIO >>> - Ignoring exception during shutdown input >>> java.net.SocketException: Socket is not connected >>> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >>> at sun.nio.ch.Net.translateException(Net.java:177) >>> at sun.nio.ch.Net.translateException(Net.java:183) >>> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:404) >>> at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup( >>> ClientCnxnSocketNIO.java:198) >>> at org.apache.zookeeper.ClientCnxn$SendThread.cleanup( >>> ClientCnxn.java:1325) >>> at org.apache.zookeeper.ClientCnxn$SendThread.run( >>> ClientCnxn.java:1245) >>> Caused by: java.nio.channels.NotYetConnectedException >>> at sun.nio.ch.SocketChannelImpl.shutdownInput( >>> SocketChannelImpl.java:777) >>> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:402) >>> ... 3 more >>> 15062 [main-SendThread(10.11.157.210:2181)] DEBUG >>> org.apache.zookeeper.ClientCnxnSocketNIO >>> - Ignoring exception during shutdown output >>> java.net.SocketException: Socket is not connected >>> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >>> at sun.nio.ch.Net.translateException(Net.java:177) >>> at sun.nio.ch.Net.translateException(Net.java:183) >>> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:412) >>> at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup( >>> ClientCnxnSocketNIO.java:205) >>> at org.apache.zookeeper.ClientCnxn$SendThread.cleanup( >>> ClientCnxn.java:1325) >>> at org.apache.zookeeper.ClientCnxn$SendThread.run( >>> ClientCnxn.java:1245) >>> Caused by: java.nio.channels.NotYetConnectedException >>> at sun.nio.ch.SocketChannelImpl.shutdownOutput( >>> SocketChannelImpl.java:794) >>> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:410) >>> ... 3 more >>> 15169 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Opening socket connection to server 10.11.9.70/10.11.9.70:2181. Will >>> not attempt to authenticate using SASL (unknown error) >>> 15170 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Socket connection established, initiating session, client: / >>> 10.11.12.4:44742, server: 10.11.9.70/10.11.9.70:2181 >>> 15173 [main-SendThread(10.11.9.70:2181)] DEBUG >>> org.apache.zookeeper.ClientCnxn >>> - Session establishment request sent on 10.11.9.70/10.11.9.70:2181 >>> 15180 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Session establishment complete on server 10.11.9.70/10.11.9.70:2181, >>> sessionid = 0x10003fa439a0060, negotiated timeout = 40000 (READ-ONLY mode) >>> 15182 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Checking server 10.11.157.4/10.11.157.4:2181 for being r/w. Timeout >>> 200 >>> 15187 [main-EventThread] INFO >>> org.apache.curator.framework.state.ConnectionStateManager >>> - State change: READ_ONLY >>> 142434 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Unable to read additional data from server sessionid 0x10003fa439a0060, >>> likely server has closed socket, closing socket connection and attempting >>> reconnect >>> 142538 [main-EventThread] INFO >>> org.apache.curator.framework.state.ConnectionStateManager >>> - State change: SUSPENDED >>> 142545 [main] DEBUG org.apache.curator.RetryLoop - Retry-able exception >>> received >>> org.apache.zookeeper.KeeperException$ConnectionLossException: >>> KeeperErrorCode = ConnectionLoss for / >>> at org.apache.zookeeper.KeeperException.create( >>> KeeperException.java:99) >>> at org.apache.zookeeper.KeeperException.create( >>> KeeperException.java:51) >>> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1956) >>> at org.apache.curator.framework.imps.GetDataBuilderImpl$4. >>> call(GetDataBuilderImpl.java:310) >>> at org.apache.curator.framework.imps.GetDataBuilderImpl$4. >>> call(GetDataBuilderImpl.java:299) >>> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108) >>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>> pathInForeground(GetDataBuilderImpl.java:295) >>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>> forPath(GetDataBuilderImpl.java:287) >>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>> forPath(GetDataBuilderImpl.java:34) >>> at testzkclient.TestCuratorReadOnlyClient.main( >>> TestCuratorReadOnlyClient.java:25) >>> 143211 [main-SendThread(10.11.157.210:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Opening socket connection to server 10.11.157.210/10.11.157.210:2181. >>> Will not attempt to authenticate using SASL (unknown error) >>> 147546 [main] DEBUG org.apache.curator.RetryLoop - Retrying operation >>> 156556 [main-SendThread(10.11.157.210:2181)] WARN >>> org.apache.zookeeper.ClientCnxn >>> - Client session timed out, have not heard from server in 14019ms for >>> sessionid 0x10003fa439a0060 >>> 156557 [main-SendThread(10.11.157.210:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Client session timed out, have not heard from server in 14019ms for >>> sessionid 0x10003fa439a0060, closing socket connection and attempting >>> reconnect >>> 156557 [main-SendThread(10.11.157.210:2181)] DEBUG >>> org.apache.zookeeper.ClientCnxnSocketNIO >>> - Ignoring exception during shutdown input >>> java.net.SocketException: Socket is not connected >>> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >>> at sun.nio.ch.Net.translateException(Net.java:177) >>> at sun.nio.ch.Net.translateException(Net.java:183) >>> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:404) >>> at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup( >>> ClientCnxnSocketNIO.java:198) >>> at org.apache.zookeeper.ClientCnxn$SendThread.cleanup( >>> ClientCnxn.java:1325) >>> at org.apache.zookeeper.ClientCnxn$SendThread.run( >>> ClientCnxn.java:1245) >>> Caused by: java.nio.channels.NotYetConnectedException >>> at sun.nio.ch.SocketChannelImpl.shutdownInput( >>> SocketChannelImpl.java:777) >>> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:402) >>> ... 3 more >>> 156557 [main-SendThread(10.11.157.210:2181)] DEBUG >>> org.apache.zookeeper.ClientCnxnSocketNIO >>> - Ignoring exception during shutdown output >>> java.net.SocketException: Socket is not connected >>> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >>> at sun.nio.ch.Net.translateException(Net.java:177) >>> at sun.nio.ch.Net.translateException(Net.java:183) >>> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:412) >>> at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup( >>> ClientCnxnSocketNIO.java:205) >>> at org.apache.zookeeper.ClientCnxn$SendThread.cleanup( >>> ClientCnxn.java:1325) >>> at org.apache.zookeeper.ClientCnxn$SendThread.run( >>> ClientCnxn.java:1245) >>> Caused by: java.nio.channels.NotYetConnectedException >>> at sun.nio.ch.SocketChannelImpl.shutdownOutput( >>> SocketChannelImpl.java:794) >>> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:410) >>> ... 3 more >>> 157544 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState >>> - Connection timed out for connection string (10.11.9.70:2181, >>> 10.11.157.4:2181,10.11.157.210:2181) and timeout (15000) / elapsed >>> (15006) >>> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = >>> ConnectionLoss >>> at org.apache.curator.ConnectionState.checkTimeouts( >>> ConnectionState.java:197) >>> at org.apache.curator.ConnectionState.getZooKeeper( >>> ConnectionState.java:88) >>> at org.apache.curator.CuratorZookeeperClient.getZooKeeper( >>> CuratorZookeeperClient.java:116) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> performBackgroundOperation(CuratorFrameworkImpl.java:835) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> backgroundOperationsLoop(CuratorFrameworkImpl.java:809) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> access$300(CuratorFrameworkImpl.java:64) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl$4. >>> call(CuratorFrameworkImpl.java:267) >>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) >>> at java.util.concurrent.ScheduledThreadPoolExecutor$ >>> ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) >>> at java.util.concurrent.ScheduledThreadPoolExecutor$ >>> ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >>> at java.util.concurrent.ThreadPoolExecutor.runWorker( >>> ThreadPoolExecutor.java:1142) >>> at java.util.concurrent.ThreadPoolExecutor$Worker.run( >>> ThreadPoolExecutor.java:617) >>> at java.lang.Thread.run(Thread.java:745) >>> 158163 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Opening socket connection to server 10.11.9.70/10.11.9.70:2181. Will >>> not attempt to authenticate using SASL (unknown error) >>> 158164 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Socket connection established, initiating session, client: / >>> 10.11.12.4:45070, server: 10.11.9.70/10.11.9.70:2181 >>> 158164 [main-SendThread(10.11.9.70:2181)] DEBUG >>> org.apache.zookeeper.ClientCnxn >>> - Session establishment request sent on 10.11.9.70/10.11.9.70:2181 >>> 158166 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Session establishment complete on server 10.11.9.70/10.11.9.70:2181, >>> sessionid = 0x10003fa439a0061, negotiated timeout = 40000 (READ-ONLY mode) >>> 158166 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Checking server 10.11.157.4/10.11.157.4:2181 for being r/w. Timeout >>> 400 >>> 158167 [main-EventThread] INFO >>> org.apache.curator.framework.state.ConnectionStateManager >>> - State change: READ_ONLY >>> 285409 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Unable to read additional data from server sessionid 0x10003fa439a0061, >>> likely server has closed socket, closing socket connection and attempting >>> reconnect >>> 285510 [main] DEBUG org.apache.curator.RetryLoop - Retry-able exception >>> received >>> org.apache.zookeeper.KeeperException$ConnectionLossException: >>> KeeperErrorCode = ConnectionLoss for / >>> at org.apache.zookeeper.KeeperException.create( >>> KeeperException.java:99) >>> at org.apache.zookeeper.KeeperException.create( >>> KeeperException.java:51) >>> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1956) >>> at org.apache.curator.framework.imps.GetDataBuilderImpl$4. >>> call(GetDataBuilderImpl.java:310) >>> at org.apache.curator.framework.imps.GetDataBuilderImpl$4. >>> call(GetDataBuilderImpl.java:299) >>> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108) >>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>> pathInForeground(GetDataBuilderImpl.java:295) >>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>> forPath(GetDataBuilderImpl.java:287) >>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>> forPath(GetDataBuilderImpl.java:34) >>> at testzkclient.TestCuratorReadOnlyClient.main( >>> TestCuratorReadOnlyClient.java:25) >>> 285516 [main-EventThread] INFO >>> org.apache.curator.framework.state.ConnectionStateManager >>> - State change: SUSPENDED >>> 286312 [main-SendThread(10.11.157.210:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Opening socket connection to server 10.11.157.210/10.11.157.210:2181. >>> Will not attempt to authenticate using SASL (unknown error) >>> 290511 [main] DEBUG org.apache.curator.RetryLoop - Retrying operation >>> 295516 [Curator-Framework-0] INFO >>> org.apache.curator.framework.state.ConnectionStateManager >>> - State change: LOST >>> 295517 [Curator-Framework-0] ERROR >>> org.apache.curator.framework.imps.CuratorFrameworkImpl >>> - Background operation retry gave up >>> org.apache.zookeeper.KeeperException$ConnectionLossException: >>> KeeperErrorCode = ConnectionLoss >>> at org.apache.zookeeper.KeeperException.create( >>> KeeperException.java:99) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> checkBackgroundRetry(CuratorFrameworkImpl.java:728) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> performBackgroundOperation(CuratorFrameworkImpl.java:857) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> backgroundOperationsLoop(CuratorFrameworkImpl.java:809) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> access$300(CuratorFrameworkImpl.java:64) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl$4. >>> call(CuratorFrameworkImpl.java:267) >>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) >>> at java.util.concurrent.ScheduledThreadPoolExecutor$ >>> ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) >>> at java.util.concurrent.ScheduledThreadPoolExecutor$ >>> ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >>> at java.util.concurrent.ThreadPoolExecutor.runWorker( >>> ThreadPoolExecutor.java:1142) >>> at java.util.concurrent.ThreadPoolExecutor$Worker.run( >>> ThreadPoolExecutor.java:617) >>> at java.lang.Thread.run(Thread.java:745) >>> 295518 [Curator-Framework-0] ERROR >>> org.apache.curator.framework.imps.CuratorFrameworkImpl >>> - Background retry gave up >>> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = >>> ConnectionLoss >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> performBackgroundOperation(CuratorFrameworkImpl.java:838) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> backgroundOperationsLoop(CuratorFrameworkImpl.java:809) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> access$300(CuratorFrameworkImpl.java:64) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl$4. >>> call(CuratorFrameworkImpl.java:267) >>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) >>> at java.util.concurrent.ScheduledThreadPoolExecutor$ >>> ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) >>> at java.util.concurrent.ScheduledThreadPoolExecutor$ >>> ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >>> at java.util.concurrent.ThreadPoolExecutor.runWorker( >>> ThreadPoolExecutor.java:1142) >>> at java.util.concurrent.ThreadPoolExecutor$Worker.run( >>> ThreadPoolExecutor.java:617) >>> at java.lang.Thread.run(Thread.java:745) >>> 299659 [main-SendThread(10.11.157.210:2181)] WARN >>> org.apache.zookeeper.ClientCnxn >>> - Client session timed out, have not heard from server in 14149ms for >>> sessionid 0x10003fa439a0061 >>> 299659 [main-SendThread(10.11.157.210:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Client session timed out, have not heard from server in 14149ms for >>> sessionid 0x10003fa439a0061, closing socket connection and attempting >>> reconnect >>> 299660 [main-SendThread(10.11.157.210:2181)] DEBUG >>> org.apache.zookeeper.ClientCnxnSocketNIO >>> - Ignoring exception during shutdown input >>> java.net.SocketException: Socket is not connected >>> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >>> at sun.nio.ch.Net.translateException(Net.java:177) >>> at sun.nio.ch.Net.translateException(Net.java:183) >>> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:404) >>> at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup( >>> ClientCnxnSocketNIO.java:198) >>> at org.apache.zookeeper.ClientCnxn$SendThread.cleanup( >>> ClientCnxn.java:1325) >>> at org.apache.zookeeper.ClientCnxn$SendThread.run( >>> ClientCnxn.java:1245) >>> Caused by: java.nio.channels.NotYetConnectedException >>> at sun.nio.ch.SocketChannelImpl.shutdownInput( >>> SocketChannelImpl.java:777) >>> at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:402) >>> ... 3 more >>> 299660 [main-SendThread(10.11.157.210:2181)] DEBUG >>> org.apache.zookeeper.ClientCnxnSocketNIO >>> - Ignoring exception during shutdown output >>> java.net.SocketException: Socket is not connected >>> at sun.nio.ch.Net.translateToSocketException(Net.java:143) >>> at sun.nio.ch.Net.translateException(Net.java:177) >>> at sun.nio.ch.Net.translateException(Net.java:183) >>> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:412) >>> at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup( >>> ClientCnxnSocketNIO.java:205) >>> at org.apache.zookeeper.ClientCnxn$SendThread.cleanup( >>> ClientCnxn.java:1325) >>> at org.apache.zookeeper.ClientCnxn$SendThread.run( >>> ClientCnxn.java:1245) >>> Caused by: java.nio.channels.NotYetConnectedException >>> at sun.nio.ch.SocketChannelImpl.shutdownOutput( >>> SocketChannelImpl.java:794) >>> at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:410) >>> ... 3 more >>> 300520 [Curator-Framework-0] ERROR org.apache.curator.ConnectionState >>> - Connection timed out for connection string (10.11.9.70:2181, >>> 10.11.157.4:2181,10.11.157.210:2181) and timeout (15000) / elapsed >>> (15004) >>> org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = >>> ConnectionLoss >>> at org.apache.curator.ConnectionState.checkTimeouts( >>> ConnectionState.java:197) >>> at org.apache.curator.ConnectionState.getZooKeeper( >>> ConnectionState.java:88) >>> at org.apache.curator.CuratorZookeeperClient.getZooKeeper( >>> CuratorZookeeperClient.java:116) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> performBackgroundOperation(CuratorFrameworkImpl.java:835) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> backgroundOperationsLoop(CuratorFrameworkImpl.java:809) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl. >>> access$300(CuratorFrameworkImpl.java:64) >>> at org.apache.curator.framework.imps.CuratorFrameworkImpl$4. >>> call(CuratorFrameworkImpl.java:267) >>> at java.util.concurrent.FutureTask.run(FutureTask.java:266) >>> at java.util.concurrent.ScheduledThreadPoolExecutor$ >>> ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180) >>> at java.util.concurrent.ScheduledThreadPoolExecutor$ >>> ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) >>> at java.util.concurrent.ThreadPoolExecutor.runWorker( >>> ThreadPoolExecutor.java:1142) >>> at java.util.concurrent.ThreadPoolExecutor$Worker.run( >>> ThreadPoolExecutor.java:617) >>> at java.lang.Thread.run(Thread.java:745) >>> 301077 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Opening socket connection to server 10.11.9.70/10.11.9.70:2181. Will >>> not attempt to authenticate using SASL (unknown error) >>> 301078 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Socket connection established, initiating session, client: / >>> 10.11.12.4:45385, server: 10.11.9.70/10.11.9.70:2181 >>> 301078 [main-SendThread(10.11.9.70:2181)] DEBUG >>> org.apache.zookeeper.ClientCnxn >>> - Session establishment request sent on 10.11.9.70/10.11.9.70:2181 >>> 301081 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Session establishment complete on server 10.11.9.70/10.11.9.70:2181, >>> sessionid = 0x10003fa439a0062, negotiated timeout = 40000 (READ-ONLY mode) >>> 301082 [main-SendThread(10.11.9.70:2181)] INFO >>> org.apache.zookeeper.ClientCnxn >>> - Checking server 10.11.157.4/10.11.157.4:2181 for being r/w. Timeout >>> 800 >>> 301082 [main-EventThread] INFO >>> org.apache.curator.framework.state.ConnectionStateManager >>> - State change: READ_ONLY >>> >>> >>> On Tue, Jul 12, 2016 at 4:11 PM, Benjamin Jaton < >>> [email protected]> wrote: >>> >>>> Cameron, you seem to be right, the simple zookeeper client also fails >>>> sometimes. >>>> I will bring it up to the ZK team, thank you! >>>> >>>> On Tue, Jul 12, 2016 at 3:09 PM, Cameron McKenzie < >>>> [email protected]> wrote: >>>> >>>>> I'm at a bit of a loss then. Looking at the stack trace, it's in the >>>>> ZK code (not the Curator code) that it appears to be blocking. Maybe try >>>>> and create a test that removes Curator and just uses the ZK client >>>>> directly >>>>> and see if you still have the same problem? >>>>> >>>>> On Wed, Jul 13, 2016 at 7:54 AM, Benjamin Jaton < >>>>> [email protected]> wrote: >>>>> >>>>>> Yes, I mean the ping goes through properly >>>>>> >>>>>> On Tue, Jul 12, 2016 at 2:50 PM, Cameron McKenzie < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> Are the 'dead' hosts reachable? I seem to recall having problems >>>>>>> with the ZK client blocking for a long time if you had a host configured >>>>>>> that was resolvable but not reachable. >>>>>>> >>>>>>> On Wed, Jul 13, 2016 at 4:30 AM, Benjamin Jaton < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> More to the story: >>>>>>>> >>>>>>>> While it consistently works on the same machine than the sole >>>>>>>> running ZK node when I specify only 2 of the 3 servers in the >>>>>>>> connection >>>>>>>> string, >>>>>>>> it doesn't work when I specify 3 hosts: >>>>>>>> >>>>>>>> >>>>>>>> 2016-07-12 11:23:46 INFO o.a.z.ZooKeeper:716 - Initiating client >>>>>>>> connection, connectString=QA-E8WIN11:2181, >>>>>>>> QA-E8WIN12:2181,QA-E8WIN13:2181 sessionTimeout=45000 >>>>>>>> watcher=org.apache.curator.ConnectionState@c86b9e3 >>>>>>>> 2016-07-12 11:23:46 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>>> connection to server QA-E8WIN13/10.11.8.230:2181. Will not attempt >>>>>>>> to authenticate using SASL (unknown error) >>>>>>>> 2016-07-12 11:24:01 INFO o.a.z.ClientCnxn:1251 - Client session >>>>>>>> timed out, have not heard from server in 15018ms for sessionid 0x0, >>>>>>>> closing >>>>>>>> socket connection and attempting reconnect >>>>>>>> Successfully established the connection with ZooKeeper >>>>>>>> 2016-07-12 11:24:01 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>>> connection to server QA-E8WIN12/10.11.8.232:2181. Will not attempt >>>>>>>> to authenticate using SASL (unknown error) >>>>>>>> 2016-07-12 11:24:16 INFO o.a.z.ClientCnxn:1251 - Client session >>>>>>>> timed out, have not heard from server in 15000ms for sessionid 0x0, >>>>>>>> closing >>>>>>>> socket connection and attempting reconnect >>>>>>>> 2016-07-12 11:24:16 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not attempt >>>>>>>> to authenticate using SASL (unknown error) >>>>>>>> 2016-07-12 11:24:16 INFO o.a.z.ClientCnxn:980 - Socket connection >>>>>>>> established, initiating session, client: /10.11.8.236:50423, >>>>>>>> server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>> 2016-07-12 11:24:16 INFO o.a.z.ClientCnxn:1400 - Session >>>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>>>>>>> sessionid = 0x10003c8af64001b, negotiated timeout = 45000 (READ-ONLY >>>>>>>> mode) >>>>>>>> 2016-07-12 11:24:16 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 200 >>>>>>>> 2016-07-12 11:24:16 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>>> State change: READ_ONLY >>>>>>>> 2016-07-12 11:24:37 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 400 >>>>>>>> 2016-07-12 11:24:58 INFO o.a.z.ClientCnxn:1251 - Client session >>>>>>>> timed out, have not heard from server in 42023ms for sessionid >>>>>>>> 0x10003c8af64001b, closing socket connection and attempting reconnect >>>>>>>> 2016-07-12 11:24:58 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>>> State change: SUSPENDED >>>>>>>> 2016-07-12 11:25:00 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not attempt >>>>>>>> to authenticate using SASL (unknown error) >>>>>>>> 2016-07-12 11:25:00 INFO o.a.z.ClientCnxn:980 - Socket connection >>>>>>>> established, initiating session, client: /10.11.8.236:50428, >>>>>>>> server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>> 2016-07-12 11:25:00 INFO o.a.z.ClientCnxn:1400 - Session >>>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>>>>>>> sessionid = 0x10003c8af64001c, negotiated timeout = 45000 (READ-ONLY >>>>>>>> mode) >>>>>>>> 2016-07-12 11:25:00 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 800 >>>>>>>> 2016-07-12 11:25:00 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>>> State change: READ_ONLY >>>>>>>> 2016-07-12 11:25:21 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 1600 >>>>>>>> 2016-07-12 11:25:42 INFO o.a.z.ClientCnxn:1251 - Client session >>>>>>>> timed out, have not heard from server in 42011ms for sessionid >>>>>>>> 0x10003c8af64001c, closing socket connection and attempting reconnect >>>>>>>> 2016-07-12 11:25:42 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>>> State change: SUSPENDED >>>>>>>> 2016-07-12 11:25:44 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not attempt >>>>>>>> to authenticate using SASL (unknown error) >>>>>>>> 2016-07-12 11:25:44 INFO o.a.z.ClientCnxn:980 - Socket connection >>>>>>>> established, initiating session, client: /10.11.8.236:50435, >>>>>>>> server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>> 2016-07-12 11:25:44 INFO o.a.z.ClientCnxn:1400 - Session >>>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>>>>>>> sessionid = 0x10003c8af64001d, negotiated timeout = 45000 (READ-ONLY >>>>>>>> mode) >>>>>>>> 2016-07-12 11:25:44 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 3200 >>>>>>>> 2016-07-12 11:25:44 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>>> State change: READ_ONLY >>>>>>>> 2016-07-12 11:26:05 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 6400 >>>>>>>> 2016-07-12 11:26:26 INFO o.a.z.ClientCnxn:1251 - Client session >>>>>>>> timed out, have not heard from server in 42003ms for sessionid >>>>>>>> 0x10003c8af64001d, closing socket connection and attempting reconnect >>>>>>>> 2016-07-12 11:26:26 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>>> State change: SUSPENDED >>>>>>>> Exception in thread "main" >>>>>>>> org.apache.zookeeper.KeeperException$ConnectionLossException: >>>>>>>> KeeperErrorCode = ConnectionLoss for / >>>>>>>> at org.apache.zookeeper.KeeperException.create( >>>>>>>> KeeperException.java:99) >>>>>>>> at org.apache.zookeeper.KeeperException.create( >>>>>>>> KeeperException.java:51) >>>>>>>> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1753) >>>>>>>> at org.apache.curator.framework.imps.GetDataBuilderImpl$4. >>>>>>>> call(GetDataBuilderImpl.java:310) >>>>>>>> at org.apache.curator.framework.imps.GetDataBuilderImpl$4. >>>>>>>> call(GetDataBuilderImpl.java:299) >>>>>>>> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java: >>>>>>>> 108) >>>>>>>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>>>>>>> pathInForeground(GetDataBuilderImpl.java:295) >>>>>>>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>>>>>>> forPath(GetDataBuilderImpl.java:287) >>>>>>>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>>>>>>> forPath(GetDataBuilderImpl.java:34) >>>>>>>> at test.Test.main(Test.java:25) >>>>>>>> >>>>>>>> >>>>>>>> That seems to suggest that adding 1 host to check somehow messes up >>>>>>>> something in the timeouts. >>>>>>>> I am suspicious of those: >>>>>>>> 2016-07-12 11:25:44 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>> QA-E8WIN13/10.11.8.230:2181 for being r/w. Timeout 3200 >>>>>>>> They seem to be "blocking" the existing connection we have to the >>>>>>>> healthy ZK server, which goes stale. Is that possible? >>>>>>>> >>>>>>>> >>>>>>>> On Tue, Jul 12, 2016 at 11:17 AM, Benjamin Jaton < >>>>>>>> [email protected]> wrote: >>>>>>>> >>>>>>>>> I did not use the zkCli tool but the ZK java API directly. And >>>>>>>>> other tools like ZooInspector (contrib). They were run from a remote >>>>>>>>> machine. >>>>>>>>> I ran it on the same machine and the one read only server is >>>>>>>>> running properly, I can browse it using those methods. >>>>>>>>> >>>>>>>>> Trying to make sense of those client-server interactions is >>>>>>>>> difficult but here is what I got: >>>>>>>>> >>>>>>>>> Client: >>>>>>>>> 2016-07-12 10:55:01,508 INFO org.apache.zookeeper.ClientCnxn:1138 >>>>>>>>> - *Opening socket connection to server QA-E8WIN11*/ >>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using SASL >>>>>>>>> (unknown error) >>>>>>>>> 2016-07-12 10:55:01,512 INFO org.apache.zookeeper.ClientCnxn:980 >>>>>>>>> - *Socket connection established*, initiating session, client: / >>>>>>>>> 10.11.12.4:60586, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>> 2016-07-12 10:55:01,521 INFO org.apache.zookeeper.ClientCnxn:1400 >>>>>>>>> - *Session establishment complete on server QA-E8WIN11*/ >>>>>>>>> 10.11.8.236:2181, *sessionid = 0x10003c8af640007*, negotiated >>>>>>>>> timeout = 45000 (READ-ONLY mode) >>>>>>>>> 2016-07-12 10:55:01,523 INFO org.apache.zookeeper.ClientCnxn:1297 >>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. >>>>>>>>> Timeout 200 >>>>>>>>> 2016-07-12 10:55:01,528 INFO org.apache.curator.framework. >>>>>>>>> state.ConnectionStateManager:228 - *State change: READ_ONLY* >>>>>>>>> 2016-07-12 10:57:08,800 INFO org.apache.zookeeper.ClientCnxn:1253 >>>>>>>>> - *Unable to read additional data from server sessionid >>>>>>>>> 0x10003c8af640007, likely server has closed socket, closing socket >>>>>>>>> connection and attempting reconnect* >>>>>>>>> 2016-07-12 10:57:08,903 INFO org.apache.curator.framework. >>>>>>>>> state.ConnectionStateManager:228 - *State change: SUSPENDED* >>>>>>>>> >>>>>>>>> >>>>>>>>> Server: >>>>>>>>> 2016-07-12T10:55:56,468 [myid:] - INFO [ProcessThread(sid:1 >>>>>>>>> cport:-1)::ZooKeeperServer@678] - *Established session >>>>>>>>> 0x10003c8af640007 with negotiated timeout 45000* for client / >>>>>>>>> 10.11.12.4:60586 >>>>>>>>> (...) >>>>>>>>> 2016-07-12T10:56:41,756 [myid:] - INFO [SessionTracker: >>>>>>>>> ZooKeeperServer@384] - *Expiring session 0x10003c8af640007, >>>>>>>>> timeout of 45000ms exceeded* >>>>>>>>> >>>>>>>>> >>>>>>>>> Note: Don't worry about the time being about 1 min off between the >>>>>>>>> machines. >>>>>>>>> >>>>>>>>> What I don't understand is why we are losing the read-only >>>>>>>>> connection we successfully established. ZK ro server says it timed >>>>>>>>> out? >>>>>>>>> >>>>>>>>> Now interestingly, when I run it on the same machine it works and >>>>>>>>> completes pretty quickly: >>>>>>>>> >>>>>>>>> >>>>>>>>> 2016-07-12 10:43:49 INFO o.a.z.ZooKeeper:716 - Initiating client >>>>>>>>> connection, connectString=QA-E8WIN11:2181,QA-E8WIN12:2181 >>>>>>>>> sessionTimeout=45000 watcher=org.apache.curator. >>>>>>>>> ConnectionState@e4487af >>>>>>>>> 2016-07-12 10:43:49 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>>>> connection to server QA-E8WIN12/10.11.8.232:2181. Will not >>>>>>>>> attempt to authenticate using SASL (unknown error) >>>>>>>>> Successfully established the connection with ZooKeeper >>>>>>>>> 2016-07-12 10:44:10 WARN o.a.z.ClientCnxn:1257 - Session 0x0 for >>>>>>>>> server null, unexpected error, closing socket connection and >>>>>>>>> attempting >>>>>>>>> reconnect >>>>>>>>> java.net.ConnectException: Connection timed out: no further >>>>>>>>> information >>>>>>>>> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) >>>>>>>>> ~[?:1.8.0_91] >>>>>>>>> at >>>>>>>>> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717) >>>>>>>>> ~[?:1.8.0_91] >>>>>>>>> at >>>>>>>>> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355) >>>>>>>>> ~[zookeeper-3.5.0-alpha.jar:3.5.2-alpha-SNAPSHOT-1702518] >>>>>>>>> at >>>>>>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1236) >>>>>>>>> [zookeeper-3.5.0-alpha.jar:3.5.2-alpha-SNAPSHOT-1702518] >>>>>>>>> 2016-07-12 10:44:11 INFO o.a.z.ClientCnxn:1138 - Opening socket >>>>>>>>> connection to server QA-E8WIN11/10.11.8.236:2181. Will not >>>>>>>>> attempt to authenticate using SASL (unknown error) >>>>>>>>> 2016-07-12 10:44:11 INFO o.a.z.ClientCnxn:980 - Socket connection >>>>>>>>> established, initiating session, client: /10.11.8.236:50147, >>>>>>>>> server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>> 2016-07-12 10:44:11 INFO o.a.z.ClientCnxn:1400 - Session >>>>>>>>> establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>>>>>>>> sessionid = 0x10003b315b30012, negotiated timeout = 45000 (READ-ONLY >>>>>>>>> mode) >>>>>>>>> 2016-07-12 10:44:11 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>>> QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 200 >>>>>>>>> 2016-07-12 10:44:11 INFO o.a.c.f.s.ConnectionStateManager:228 - >>>>>>>>> State change: READ_ONLY >>>>>>>>> 2016-07-12 10:44:32 INFO o.a.z.ClientCnxn:1297 - Checking server >>>>>>>>> QA-E8WIN11/10.11.8.236:2181 for being r/w. Timeout 400 >>>>>>>>> Done. >>>>>>>>> >>>>>>>>> >>>>>>>>> On Mon, Jul 11, 2016 at 4:42 PM, Cameron McKenzie < >>>>>>>>> [email protected]> wrote: >>>>>>>>> >>>>>>>>>> It looks like a connection issue based on the logs. When you >>>>>>>>>> confirmed that the connection was ok via the zkCli, was that running >>>>>>>>>> on the >>>>>>>>>> same host as the read only server? Can you run it from the same host >>>>>>>>>> that >>>>>>>>>> you're running the Java client from? >>>>>>>>>> >>>>>>>>>> On Tue, Jul 12, 2016 at 9:40 AM, Benjamin Jaton < >>>>>>>>>> [email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> Well I was trying to let it run and it eventually fails, with a >>>>>>>>>>> stack: >>>>>>>>>>> >>>>>>>>>>> Successfully established the connection with ZooKeeper >>>>>>>>>>> 2016-07-11 16:28:21,943 INFO org.apache.zookeeper.ClientCnxn:1251 >>>>>>>>>>> - Client session timed out, have not heard from server in 22519ms >>>>>>>>>>> for >>>>>>>>>>> sessionid 0x0, closing socket connection and attempting reconnect >>>>>>>>>>> 2016-07-11 16:28:22,046 INFO org.apache.zookeeper.ClientCnxn:1138 >>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/ >>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using SASL >>>>>>>>>>> (unknown error) >>>>>>>>>>> 2016-07-11 16:28:22,050 INFO org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>>> 10.11.12.4:54702, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>>> 2016-07-11 16:28:22,058 INFO org.apache.zookeeper.ClientCnxn:1400 >>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/ >>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f09360080, negotiated >>>>>>>>>>> timeout = 45000 (READ-ONLY mode) >>>>>>>>>>> 2016-07-11 16:28:22,061 INFO org.apache.zookeeper.ClientCnxn:1297 >>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. >>>>>>>>>>> Timeout 200 >>>>>>>>>>> 2016-07-11 16:28:22,066 INFO org.apache.curator.framework. >>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY >>>>>>>>>>> 2016-07-11 16:30:29,312 INFO org.apache.zookeeper.ClientCnxn:1253 >>>>>>>>>>> - Unable to read additional data from server sessionid >>>>>>>>>>> 0x100024f09360080, >>>>>>>>>>> likely server has closed socket, closing socket connection and >>>>>>>>>>> attempting >>>>>>>>>>> reconnect >>>>>>>>>>> 2016-07-11 16:30:29,416 INFO org.apache.curator.framework. >>>>>>>>>>> state.ConnectionStateManager:228 - State change: SUSPENDED >>>>>>>>>>> 2016-07-11 16:30:31,031 INFO org.apache.zookeeper.ClientCnxn:1138 >>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/ >>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using SASL >>>>>>>>>>> (unknown error) >>>>>>>>>>> 2016-07-11 16:30:31,032 INFO org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>>> 10.11.12.4:55156, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>>> 2016-07-11 16:30:31,034 INFO org.apache.zookeeper.ClientCnxn:1400 >>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/ >>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f09360083, negotiated >>>>>>>>>>> timeout = 45000 (READ-ONLY mode) >>>>>>>>>>> 2016-07-11 16:30:31,034 INFO org.apache.zookeeper.ClientCnxn:1297 >>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. >>>>>>>>>>> Timeout 400 >>>>>>>>>>> 2016-07-11 16:30:31,035 INFO org.apache.curator.framework. >>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY >>>>>>>>>>> 2016-07-11 16:32:38,335 INFO org.apache.zookeeper.ClientCnxn:1253 >>>>>>>>>>> - Unable to read additional data from server sessionid >>>>>>>>>>> 0x100024f09360083, >>>>>>>>>>> likely server has closed socket, closing socket connection and >>>>>>>>>>> attempting >>>>>>>>>>> reconnect >>>>>>>>>>> 2016-07-11 16:32:38,441 INFO org.apache.curator.framework. >>>>>>>>>>> state.ConnectionStateManager:228 - State change: SUSPENDED >>>>>>>>>>> 2016-07-11 16:32:40,425 INFO org.apache.zookeeper.ClientCnxn:1138 >>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/ >>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using SASL >>>>>>>>>>> (unknown error) >>>>>>>>>>> 2016-07-11 16:32:40,426 INFO org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>>> 10.11.12.4:55637, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>>> 2016-07-11 16:32:40,428 INFO org.apache.zookeeper.ClientCnxn:1400 >>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/ >>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f09360087, negotiated >>>>>>>>>>> timeout = 45000 (READ-ONLY mode) >>>>>>>>>>> 2016-07-11 16:32:40,428 INFO org.apache.zookeeper.ClientCnxn:1297 >>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. >>>>>>>>>>> Timeout 800 >>>>>>>>>>> 2016-07-11 16:32:40,428 INFO org.apache.curator.framework. >>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY >>>>>>>>>>> 2016-07-11 16:34:47,615 INFO org.apache.zookeeper.ClientCnxn:1253 >>>>>>>>>>> - Unable to read additional data from server sessionid >>>>>>>>>>> 0x100024f09360087, >>>>>>>>>>> likely server has closed socket, closing socket connection and >>>>>>>>>>> attempting >>>>>>>>>>> reconnect >>>>>>>>>>> 2016-07-11 16:34:47,716 INFO org.apache.curator.framework. >>>>>>>>>>> state.ConnectionStateManager:228 - State change: SUSPENDED >>>>>>>>>>> 2016-07-11 16:34:49,557 INFO org.apache.zookeeper.ClientCnxn:1138 >>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/ >>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using SASL >>>>>>>>>>> (unknown error) >>>>>>>>>>> 2016-07-11 16:34:49,558 INFO org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>>> 10.11.12.4:56094, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>>> 2016-07-11 16:34:49,560 INFO org.apache.zookeeper.ClientCnxn:1400 >>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/ >>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f09360089, negotiated >>>>>>>>>>> timeout = 45000 (READ-ONLY mode) >>>>>>>>>>> 2016-07-11 16:34:49,560 INFO org.apache.zookeeper.ClientCnxn:1297 >>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. >>>>>>>>>>> Timeout 1600 >>>>>>>>>>> 2016-07-11 16:34:49,560 INFO org.apache.curator.framework. >>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY >>>>>>>>>>> 2016-07-11 16:36:56,895 INFO org.apache.zookeeper.ClientCnxn:1253 >>>>>>>>>>> - Unable to read additional data from server sessionid >>>>>>>>>>> 0x100024f09360089, >>>>>>>>>>> likely server has closed socket, closing socket connection and >>>>>>>>>>> attempting >>>>>>>>>>> reconnect >>>>>>>>>>> Exception in thread "main" 2016-07-11 16:36:56,996 INFO >>>>>>>>>>> org.apache.curator.framework.state.ConnectionStateManager:228 - >>>>>>>>>>> State change: SUSPENDED >>>>>>>>>>> org.apache.zookeeper.KeeperException$ConnectionLossException: >>>>>>>>>>> KeeperErrorCode = ConnectionLoss for / >>>>>>>>>>> at org.apache.zookeeper.KeeperException.create( >>>>>>>>>>> KeeperException.java:99) >>>>>>>>>>> at org.apache.zookeeper.KeeperException.create( >>>>>>>>>>> KeeperException.java:51) >>>>>>>>>>> at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper. >>>>>>>>>>> java:1753) >>>>>>>>>>> at org.apache.curator.framework.imps.GetDataBuilderImpl$4. >>>>>>>>>>> call(GetDataBuilderImpl.java:310) >>>>>>>>>>> at org.apache.curator.framework.imps.GetDataBuilderImpl$4. >>>>>>>>>>> call(GetDataBuilderImpl.java:299) >>>>>>>>>>> at org.apache.curator.RetryLoop. >>>>>>>>>>> callWithRetry(RetryLoop.java:108) >>>>>>>>>>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>>>>>>>>>> pathInForeground(GetDataBuilderImpl.java:295) >>>>>>>>>>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>>>>>>>>>> forPath(GetDataBuilderImpl.java:287) >>>>>>>>>>> at org.apache.curator.framework.imps.GetDataBuilderImpl. >>>>>>>>>>> forPath(GetDataBuilderImpl.java:34) >>>>>>>>>>> at TestCuratorReadOnly.main(TestCuratorReadOnly.java:23) >>>>>>>>>>> >>>>>>>>>>> On Mon, Jul 11, 2016 at 4:39 PM, Cameron McKenzie < >>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>> >>>>>>>>>>>> So, it appears that the initial connection occurs ok, but then >>>>>>>>>>>> something goes wrong >>>>>>>>>>>> >>>>>>>>>>>> 2016-07-11 15:49:15,712 INFO org.apache.zookeeper.ClientCnxn:1253 >>>>>>>>>>>> - Unable to read additional data from server sessionid >>>>>>>>>>>> 0x100024f0936005b, >>>>>>>>>>>> likely server has closed socket, closing socket connection and >>>>>>>>>>>> attempting >>>>>>>>>>>> reconnect >>>>>>>>>>>> >>>>>>>>>>>> Then a reconnection occurs ok, and then the cycle starts again. >>>>>>>>>>>> >>>>>>>>>>>> Can you see where in the stack the getData() call is hanging? >>>>>>>>>>>> >>>>>>>>>>>> On Tue, Jul 12, 2016 at 9:35 AM, Benjamin Jaton < >>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> The blockUntilConnectedOrTimedOut() does return, we see the >>>>>>>>>>>>> line: >>>>>>>>>>>>> "Successfully established the connection with ZooKeeper" >>>>>>>>>>>>> which happens right after. >>>>>>>>>>>>> >>>>>>>>>>>>> The getData() is where it hangs. >>>>>>>>>>>>> >>>>>>>>>>>>> We see this first: >>>>>>>>>>>>> 2016-07-11 15:47:08,420 INFO org.apache.curator.framework. >>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY >>>>>>>>>>>>> >>>>>>>>>>>>> and then we see it again 2 minutes later: >>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO org.apache.curator.framework. >>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY >>>>>>>>>>>>> >>>>>>>>>>>>> and then again 2 minutes after that: >>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO org.apache.curator.framework. >>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY >>>>>>>>>>>>> >>>>>>>>>>>>> Then I stopped it. >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Jul 11, 2016 at 4:13 PM, Cameron McKenzie < >>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> hey Ben, >>>>>>>>>>>>>> From the logs it looks like the connection is established at >>>>>>>>>>>>>> the end? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Are you saying that the blockUntilConnectedOrTimedOut() does >>>>>>>>>>>>>> not return when this occurs? >>>>>>>>>>>>>> cheers >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Jul 12, 2016 at 9:10 AM, Benjamin Jaton < >>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> ZK is 3.5.1-alpha >>>>>>>>>>>>>>> Curator is 2.11.0 >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> On Mon, Jul 11, 2016 at 4:01 PM, Benjamin Jaton < >>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I have a 3 nodes ZK cluster , readonly mode is enabled. >>>>>>>>>>>>>>>> 2 nodes are down, so the third one is in read-only >>>>>>>>>>>>>>>> (verified by using the ZK API manually). >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> I'm using this piece of code: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Builder curatorClientBuilder = CuratorFrameworkFactory. >>>>>>>>>>>>>>>> builder() >>>>>>>>>>>>>>>> .connectString("QA-E8WIN11:2181,QA-E8WIN12:2181") >>>>>>>>>>>>>>>> .sessionTimeoutMs(45000).connectionTimeoutMs(15000) >>>>>>>>>>>>>>>> .retryPolicy(new RetryNTimes(3, 5000)).canBeReadOnly(true); >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> CuratorFramework client = curatorClientBuilder.build(); >>>>>>>>>>>>>>>> client.start(); >>>>>>>>>>>>>>>> client.getZookeeperClient().blockUntilConnectedOrTimedOut( >>>>>>>>>>>>>>>> ); >>>>>>>>>>>>>>>> System.out.println("Successfully established the >>>>>>>>>>>>>>>> connection with ZooKeeper"); >>>>>>>>>>>>>>>> client.getData().forPath("/"); >>>>>>>>>>>>>>>> System.out.println("Done."); >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> When curator pick the host that is UP first, it goes >>>>>>>>>>>>>>>> through very quickly. When it picks the host that is down first >>>>>>>>>>>>>>>> (QA-E8WIN12), it never goes through. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Am I missing something here? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Below is the log: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,733 INFO >>>>>>>>>>>>>>>> org.apache.curator.framework.imps.CuratorFrameworkImpl:235 >>>>>>>>>>>>>>>> - Starting >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,743 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:zookeeper.version=3.5.1-alpha-1748895, >>>>>>>>>>>>>>>> built on 06/17/2016 22:24 GMT >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,743 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:host.name=w-rli09-ben >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:java.version=1.8.0_11 >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:java.vendor=Oracle Corporation >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:java.home=/usr/ >>>>>>>>>>>>>>>> local/apps/jdk1.8.0_11/jre >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,744 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:java.class.path=(..) >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,746 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:java.io.tmpdir=/tmp >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,746 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:java.compiler=<NA> >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:os.name=Linux >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:os.arch=amd64 >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:os.version=3.8.0-44-generic >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:user.name=benji >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:user.home=(..) >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:user.dir=(..) >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,747 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:os.memory.free=385MB >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,748 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:os.memory.max=5358MB >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,748 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:109 >>>>>>>>>>>>>>>> - Client environment:os.memory.total=397MB >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,750 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ZooKeeper:716 >>>>>>>>>>>>>>>> - Initiating client connection, >>>>>>>>>>>>>>>> connectString=QA-E8WIN11:2181,QA-E8WIN12:2181 >>>>>>>>>>>>>>>> sessionTimeout=45000 watcher=org.apache.curator. >>>>>>>>>>>>>>>> ConnectionState@3e96bacf >>>>>>>>>>>>>>>> 2016-07-11 15:46:45,769 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 >>>>>>>>>>>>>>>> - Opening socket connection to server QA-E8WIN12/ >>>>>>>>>>>>>>>> 10.11.8.232:2181. Will not attempt to authenticate using >>>>>>>>>>>>>>>> SASL (unknown error) >>>>>>>>>>>>>>>> Successfully established the connection with ZooKeeper >>>>>>>>>>>>>>>> 2016-07-11 15:47:08,300 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1251 >>>>>>>>>>>>>>>> - Client session timed out, have not heard from server in >>>>>>>>>>>>>>>> 22534ms for >>>>>>>>>>>>>>>> sessionid 0x0, closing socket connection and attempting >>>>>>>>>>>>>>>> reconnect >>>>>>>>>>>>>>>> 2016-07-11 15:47:08,403 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 >>>>>>>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/ >>>>>>>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using >>>>>>>>>>>>>>>> SASL (unknown error) >>>>>>>>>>>>>>>> 2016-07-11 15:47:08,406 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>>>>>>>> 10.11.12.4:47004, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>>>>>>>> 2016-07-11 15:47:08,413 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1400 >>>>>>>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/ >>>>>>>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f0936005b, >>>>>>>>>>>>>>>> negotiated timeout = 45000 (READ-ONLY mode) >>>>>>>>>>>>>>>> 2016-07-11 15:47:08,415 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1297 >>>>>>>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being >>>>>>>>>>>>>>>> r/w. Timeout 200 >>>>>>>>>>>>>>>> 2016-07-11 15:47:08,420 INFO org.apache.curator.framework. >>>>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY >>>>>>>>>>>>>>>> 2016-07-11 15:49:15,712 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1253 >>>>>>>>>>>>>>>> - Unable to read additional data from server sessionid >>>>>>>>>>>>>>>> 0x100024f0936005b, >>>>>>>>>>>>>>>> likely server has closed socket, closing socket connection and >>>>>>>>>>>>>>>> attempting >>>>>>>>>>>>>>>> reconnect >>>>>>>>>>>>>>>> 2016-07-11 15:49:15,815 INFO org.apache.curator.framework. >>>>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: SUSPENDED >>>>>>>>>>>>>>>> 2016-07-11 15:49:17,313 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 >>>>>>>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/ >>>>>>>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using >>>>>>>>>>>>>>>> SASL (unknown error) >>>>>>>>>>>>>>>> 2016-07-11 15:49:17,314 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>>>>>>>> 10.11.12.4:47295, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1400 >>>>>>>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/ >>>>>>>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f0936005f, >>>>>>>>>>>>>>>> negotiated timeout = 45000 (READ-ONLY mode) >>>>>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1297 >>>>>>>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being >>>>>>>>>>>>>>>> r/w. Timeout 400 >>>>>>>>>>>>>>>> 2016-07-11 15:49:17,316 INFO org.apache.curator.framework. >>>>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY >>>>>>>>>>>>>>>> 2016-07-11 15:51:24,607 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1253 >>>>>>>>>>>>>>>> - Unable to read additional data from server sessionid >>>>>>>>>>>>>>>> 0x100024f0936005f, >>>>>>>>>>>>>>>> likely server has closed socket, closing socket connection and >>>>>>>>>>>>>>>> attempting >>>>>>>>>>>>>>>> reconnect >>>>>>>>>>>>>>>> 2016-07-11 15:51:24,713 INFO org.apache.curator.framework. >>>>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: SUSPENDED >>>>>>>>>>>>>>>> 2016-07-11 15:51:26,467 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1138 >>>>>>>>>>>>>>>> - Opening socket connection to server QA-E8WIN11/ >>>>>>>>>>>>>>>> 10.11.8.236:2181. Will not attempt to authenticate using >>>>>>>>>>>>>>>> SASL (unknown error) >>>>>>>>>>>>>>>> 2016-07-11 15:51:26,468 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:980 >>>>>>>>>>>>>>>> - Socket connection established, initiating session, client: / >>>>>>>>>>>>>>>> 10.11.12.4:47576, server: QA-E8WIN11/10.11.8.236:2181 >>>>>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1400 >>>>>>>>>>>>>>>> - Session establishment complete on server QA-E8WIN11/ >>>>>>>>>>>>>>>> 10.11.8.236:2181, sessionid = 0x100024f09360061, >>>>>>>>>>>>>>>> negotiated timeout = 45000 (READ-ONLY mode) >>>>>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO >>>>>>>>>>>>>>>> org.apache.zookeeper.ClientCnxn:1297 >>>>>>>>>>>>>>>> - Checking server QA-E8WIN12/10.11.8.232:2181 for being >>>>>>>>>>>>>>>> r/w. Timeout 800 >>>>>>>>>>>>>>>> 2016-07-11 15:51:26,470 INFO org.apache.curator.framework. >>>>>>>>>>>>>>>> state.ConnectionStateManager:228 - State change: READ_ONLY >>>>>>>>>>>>>>>> (...) >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks >>>>>>>>>>>>>>>> Ben >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >
