Hello Chris,

Thank you for the detailed repro steps showing which versions work and which don't. I tested your code sample against a 3.4.6 build, and it worked consistently for me. My only thought is that the MAX_ZK_CONNECT_ATTEMPTS and ZK_CONNECT_WAIT constants may be set such that the polling loop exits before the connection completes. Perhaps a subtle timing difference in the newer versions is just now exposing this.
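
To make the timing concern concrete, here is a small stand-alone sketch. It does not use ZooKeeper at all: a background thread stands in for the connection and flips a flag after a delay, and the loop mirrors the shape of the one in your sample. The class name, the method names, and every delay and wait value are made up for illustration, since the actual values of MAX_ZK_CONNECT_ATTEMPTS and ZK_CONNECT_WAIT weren't in your post.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical simulation only: no ZooKeeper classes are involved.
class PollingBudgetDemo {

    /** Longest time the polling loop will wait before giving up, in milliseconds. */
    static long pollBudgetMillis(int maxAttempts, long waitMillis) {
        return (long) maxAttempts * waitMillis;
    }

    /**
     * Runs a polling loop against a fake "connection" that becomes ready
     * after connectDelayMillis. Returns true if the loop observed the
     * connection before exhausting its attempts.
     */
    static boolean pollUntilConnected(long connectDelayMillis, int maxAttempts, long waitMillis) {
        AtomicBoolean connected = new AtomicBoolean(false);

        // Stand-in for the client's background connection handshake.
        Thread connector = new Thread(() -> {
            try {
                Thread.sleep(connectDelayMillis);
                connected.set(true);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        connector.setDaemon(true);
        connector.start();

        int attempts = 0;
        while (!connected.get() && attempts < maxAttempts) {
            try {
                Thread.sleep(waitMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                return false;
            }
            attempts++;
        }
        return connected.get();
    }

    public static void main(String[] args) {
        // Made-up numbers: the fake connection needs 500 ms.
        System.out.println("budget=" + pollBudgetMillis(5, 20) + "ms connected="
                + pollUntilConnected(500, 5, 20));
        System.out.println("budget=" + pollBudgetMillis(50, 20) + "ms connected="
                + pollUntilConnected(500, 50, 20));
    }
}
```

If the product of the two constants is smaller than the time the handshake needs, the loop gives up while the state is still CONNECTING, which would be consistent with what you're seeing.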

The typical pattern for connection establishment is to rely on the ZooKeeper client's asynchronous event notification instead of a polling loop. See below for a code sample that initiates the connection and then waits for the SyncConnected event. The ZooKeeper programmer's guide and example program docs have a more detailed discussion of this.

http://zookeeper.apache.org/doc/r3.4.6/zookeeperProgrammers.html
http://zookeeper.apache.org/doc/r3.4.6/javaExample.html

Could you please try running this against your 3.4.6 cluster? I'd be curious to see if the connection completes for you. This would also give you a sense for how long connection establishment is taking and whether or not that's in line with your definitions of MAX_ZK_CONNECT_ATTEMPTS and ZK_CONNECT_WAIT.

I hope this helps.

    import java.util.concurrent.CountDownLatch;

    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    class Main implements Watcher {

      private final CountDownLatch connectLatch;
      private final ZooKeeper zk;

      public static void main(final String[] args) throws Exception {
        String hostPort = args[0];
        Main main = new Main(hostPort);
        main.awaitConnection();
        System.out.println("Exiting.");
      }

      private Main(String hostPort) throws Exception {
        this.connectLatch = new CountDownLatch(1);
        // The constructor returns immediately; the connection completes
        // asynchronously. 3000 is the session timeout in milliseconds.
        this.zk = new ZooKeeper(hostPort, 3000, this);
      }

      private void awaitConnection() throws InterruptedException {
        // Blocks until process() sees the SyncConnected event.
        this.connectLatch.await();
        System.out.println("Connection has completed.");
      }

      @Override
      public void process(WatchedEvent event) {
        System.out.println("Received event: " + event);
        // Connection state changes arrive with event type None.
        if (event.getType() == Event.EventType.None) {
          switch (event.getState()) {
            case SyncConnected:
              this.connectLatch.countDown();
              break;
          }
        }
      }
    }

--Chris Nauroth


On 7/24/15, 9:16 AM, "Chris Barlock" <[email protected]> wrote:

>Ran some more tests.  My code works fine up through ZK 3.4.2, but then
>fails with 3.4.3.  I did have to add the following to the Import-Package
>list in the ZK MANIFEST.MF:
>
>org.slf4j,javax.security.auth.login,javax.security.sasl
>
>I could really use some help here, ZK folks!  Is my code incorrect with
>newer versions of ZK, or is ZK broken?
>
>Chris
>
>
>From:   Chris Barlock/Raleigh/IBM@IBMUS
>To:     [email protected]
>Date:   07/23/2015 09:34 PM
>Subject:        Re: ZooKeeper Class Will Not Connect
>
>
>I tried the 3.5.0 alpha build to see if it made any difference.  It did
>not.
>
>But, I had to hack the MANIFEST.MF file in the JAR because the
>"3.50-alpha" version fails tests in OSGi bundles that import the ZK
>classes which have something like:
>
>version="[3.2,4)"
>
>I suggest that if you want to name the JAR "3.50-alpha" that all the
>internals just use a version of 3.50.
>
>Chris
>
>IBM Tivoli Systems
>Research Triangle Park, NC
>(919) 224-2240
>Internet:  [email protected]
>
>
>From:   Chris Barlock/Raleigh/IBM@IBMUS
>To:     [email protected]
>Date:   07/23/2015 01:37 PM
>Subject:        ZooKeeper Class Will Not Connect
>
>
>We are attempting to upgrade from Kafka 0.8.0, which includes ZK 3.3.4 to
>Kafka 0.8.2.1 with ZK 3.4.6.  My code which attempts to connect to ZK is
>pretty straightforward:
>
>    try {
>        ZooKeeper zk = new ZooKeeper(connectString, sessionTimeout, this);
>        int connectAttempts = 0;
>
>        while (!zk.getState().isConnected() && connectAttempts < MAX_ZK_CONNECT_ATTEMPTS) {
>            try {
>                Thread.sleep(ZK_CONNECT_WAIT);
>            } catch (InterruptedException e) {
>                // Ignore
>            }
>            connectAttempts++;
>        }
>    } catch (IOException e) {
>        trace.exception(CLASS_NAME, methodName, e);
>    }
>
>With some additional tracing, States is always CONNECTING.  Has something
>changed with 3.4.6 about how I should connect to the server?  I can
>connect just fine with the zookeeper-shell.sh that Kafka ships.  This
>code always runs on the same system as ZK, so the connectString is always
>"localhost:2181"
>
>Chris
