The blockUntilConnectedOrTimedOut() does return, we see the line: "Successfully established the connection with ZooKeeper" which happens right after.
The getData() is where it hangs. We see this first: 2016-07-11 15:47:08,420 INFO org.apache.curator.framework.state.ConnectionStateManager:228 - State change: READ_ONLY and then we see it again 2 minutes later: 2016-07-11 15:49:17,316 INFO org.apache.curator.framework.state.ConnectionStateManager:228 - State change: READ_ONLY and then again 2 minutes after that: 2016-07-11 15:51:26,470 INFO org.apache.curator.framework.state.ConnectionStateManager:228 - State change: READ_ONLY Then I stopped it. On Mon, Jul 11, 2016 at 4:13 PM, Cameron McKenzie <[email protected]> wrote: > hey Ben, > From the logs it looks like the connection is established at the end? > > Are you saying that the blockUntilConnectedOrTimedOut() does not return > when this occurs? > cheers > > > On Tue, Jul 12, 2016 at 9:10 AM, Benjamin Jaton <[email protected]> > wrote: > >> ZK is 3.5.1-alpha >> Curator is 2.11.0 >> >> On Mon, Jul 11, 2016 at 4:01 PM, Benjamin Jaton <[email protected]> >> wrote: >> >>> I have a 3 nodes ZK cluster , readonly mode is enabled. >>> 2 nodes are down, so the third one is in read-only (verified by using >>> the ZK API manually). >>> >>> I'm using this piece of code: >>> >>> Builder curatorClientBuilder = CuratorFrameworkFactory.builder() >>> .connectString("QA-E8WIN11:2181,QA-E8WIN12:2181") >>> .sessionTimeoutMs(45000).connectionTimeoutMs(15000) >>> .retryPolicy(new RetryNTimes(3, 5000)).canBeReadOnly(true); >>> >>> CuratorFramework client = curatorClientBuilder.build(); >>> client.start(); >>> client.getZookeeperClient().blockUntilConnectedOrTimedOut(); >>> System.out.println("Successfully established the connection with >>> ZooKeeper"); >>> client.getData().forPath("/"); >>> System.out.println("Done."); >>> >>> When curator pick the host that is UP first, it goes through very >>> quickly. When it picks the host that is down first (QA-E8WIN12), it never >>> goes through. >>> >>> Am I missing something here? >>> >>> Below is the log: >>> >>> 2016-07-11 15:46:45,733 INFO >>> org.apache.curator.framework.imps.CuratorFrameworkImpl:235 - Starting >>> 2016-07-11 15:46:45,743 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:zookeeper.version=3.5.1-alpha-1748895, built on >>> 06/17/2016 22:24 GMT >>> 2016-07-11 15:46:45,743 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:host.name=w-rli09-ben >>> 2016-07-11 15:46:45,744 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:java.version=1.8.0_11 >>> 2016-07-11 15:46:45,744 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:java.vendor=Oracle Corporation >>> 2016-07-11 15:46:45,744 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:java.home=/usr/local/apps/jdk1.8.0_11/jre >>> 2016-07-11 15:46:45,744 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:java.class.path=(..) >>> 2016-07-11 15:46:45,746 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:java.io.tmpdir=/tmp >>> 2016-07-11 15:46:45,746 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:java.compiler=<NA> >>> 2016-07-11 15:46:45,747 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:os.name=Linux >>> 2016-07-11 15:46:45,747 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:os.arch=amd64 >>> 2016-07-11 15:46:45,747 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:os.version=3.8.0-44-generic >>> 2016-07-11 15:46:45,747 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:user.name=benji >>> 2016-07-11 15:46:45,747 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:user.home=(..) >>> 2016-07-11 15:46:45,747 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:user.dir=(..) >>> 2016-07-11 15:46:45,747 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:os.memory.free=385MB >>> 2016-07-11 15:46:45,748 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:os.memory.max=5358MB >>> 2016-07-11 15:46:45,748 INFO org.apache.zookeeper.ZooKeeper:109 - >>> Client environment:os.memory.total=397MB >>> 2016-07-11 15:46:45,750 INFO org.apache.zookeeper.ZooKeeper:716 - >>> Initiating client connection, connectString=QA-E8WIN11:2181,QA-E8WIN12:2181 >>> sessionTimeout=45000 watcher=org.apache.curator.ConnectionState@3e96bacf >>> 2016-07-11 15:46:45,769 INFO org.apache.zookeeper.ClientCnxn:1138 - >>> Opening socket connection to server QA-E8WIN12/10.11.8.232:2181. Will >>> not attempt to authenticate using SASL (unknown error) >>> Successfully established the connection with ZooKeeper >>> 2016-07-11 15:47:08,300 INFO org.apache.zookeeper.ClientCnxn:1251 - >>> Client session timed out, have not heard from server in 22534ms for >>> sessionid 0x0, closing socket connection and attempting reconnect >>> 2016-07-11 15:47:08,403 INFO org.apache.zookeeper.ClientCnxn:1138 - >>> Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will >>> not attempt to authenticate using SASL (unknown error) >>> 2016-07-11 15:47:08,406 INFO org.apache.zookeeper.ClientCnxn:980 - >>> Socket connection established, initiating session, client: / >>> 10.11.12.4:47004, server: QA-E8WIN11/10.11.8.236:2181 >>> 2016-07-11 15:47:08,413 INFO org.apache.zookeeper.ClientCnxn:1400 - >>> Session establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>> sessionid = 0x100024f0936005b, negotiated timeout = 45000 (READ-ONLY mode) >>> 2016-07-11 15:47:08,415 INFO org.apache.zookeeper.ClientCnxn:1297 - >>> Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 200 >>> 2016-07-11 15:47:08,420 INFO >>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>> change: READ_ONLY >>> 2016-07-11 15:49:15,712 INFO org.apache.zookeeper.ClientCnxn:1253 - >>> Unable to read additional data from server sessionid 0x100024f0936005b, >>> likely server has closed socket, closing socket connection and attempting >>> reconnect >>> 2016-07-11 15:49:15,815 INFO >>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>> change: SUSPENDED >>> 2016-07-11 15:49:17,313 INFO org.apache.zookeeper.ClientCnxn:1138 - >>> Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will >>> not attempt to authenticate using SASL (unknown error) >>> 2016-07-11 15:49:17,314 INFO org.apache.zookeeper.ClientCnxn:980 - >>> Socket connection established, initiating session, client: / >>> 10.11.12.4:47295, server: QA-E8WIN11/10.11.8.236:2181 >>> 2016-07-11 15:49:17,316 INFO org.apache.zookeeper.ClientCnxn:1400 - >>> Session establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>> sessionid = 0x100024f0936005f, negotiated timeout = 45000 (READ-ONLY mode) >>> 2016-07-11 15:49:17,316 INFO org.apache.zookeeper.ClientCnxn:1297 - >>> Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 400 >>> 2016-07-11 15:49:17,316 INFO >>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>> change: READ_ONLY >>> 2016-07-11 15:51:24,607 INFO org.apache.zookeeper.ClientCnxn:1253 - >>> Unable to read additional data from server sessionid 0x100024f0936005f, >>> likely server has closed socket, closing socket connection and attempting >>> reconnect >>> 2016-07-11 15:51:24,713 INFO >>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>> change: SUSPENDED >>> 2016-07-11 15:51:26,467 INFO org.apache.zookeeper.ClientCnxn:1138 - >>> Opening socket connection to server QA-E8WIN11/10.11.8.236:2181. Will >>> not attempt to authenticate using SASL (unknown error) >>> 2016-07-11 15:51:26,468 INFO org.apache.zookeeper.ClientCnxn:980 - >>> Socket connection established, initiating session, client: / >>> 10.11.12.4:47576, server: QA-E8WIN11/10.11.8.236:2181 >>> 2016-07-11 15:51:26,470 INFO org.apache.zookeeper.ClientCnxn:1400 - >>> Session establishment complete on server QA-E8WIN11/10.11.8.236:2181, >>> sessionid = 0x100024f09360061, negotiated timeout = 45000 (READ-ONLY mode) >>> 2016-07-11 15:51:26,470 INFO org.apache.zookeeper.ClientCnxn:1297 - >>> Checking server QA-E8WIN12/10.11.8.232:2181 for being r/w. Timeout 800 >>> 2016-07-11 15:51:26,470 INFO >>> org.apache.curator.framework.state.ConnectionStateManager:228 - State >>> change: READ_ONLY >>> (...) >>> >>> Thanks >>> Ben >>> >>> >> >
