I found the problem -- there were several missing package imports in the
MANIFEST.MF that I created for the OSGi bundle I made that wrapped all the
Kafka/ZooKeeper JAR files. It took enabling the log4j logging for

log4j.logger.org.apache.zookeeper=DEBUG

in order to see the ClassNotFoundExceptions that are typical with this
problem. It was odd to me because my application's logging has always
uncovered these problems, but...

However, I did uncover a ZooKeeper bug. After fixing this, I tried
"cleaning up" my OSGi bundle by pulling out all the JARs that are already
OSGi bundles on their own. zookeeper-3.4.6 was one of these. However, it
is missing a package import for org.ietf.jgss which is part of the Java
SDK. I added this to my ZK JAR and it resolved the problem.

Chris


From: Chris Barlock/Raleigh/IBM@IBMUS
To: [email protected]
Date: 07/27/2015 05:09 PM
Subject: Re: ZooKeeper Class Will Not Connect


OK, I'm convinced that ZK is not broken. I stripped my code down to a
simple stand-alone test case and it works just fine, even with the sleep
loop. But, when I run it normally ZK 3.4.6, I don't see the ZK server
logging anything at all. With 3.4.2, it is fine, as previously noted.

Full disclosure: we are running the client code in the WebSphere Liberty
Profile app server and I have packaged the Kafka and ZK code into an OSGi
bundle. I'd like to see what the client is doing, so I took the default
log4j.properties file that ships with ZK 3.4.6 and added this to the end:

log4j.logger.org.apache.zookeeper=DEBUG, CONSOLE

I see this in the console log that Liberty manages:

[err] log4j:WARN No appenders could be found for logger (org.apache.zookeeper.ZooKeeper).

I'm not that familiar with log4j configuration. Where have I gone wrong
here?

Thanks!

Chris


From: Chris Barlock/Raleigh/IBM@IBMUS
To: [email protected]
Date: 07/27/2015 03:13 PM
Subject: Re: ZooKeeper Class Will Not Connect


OK, ignore this...embarrassing! I only got State once and tested it over
and over again...
Chris


From: Chris Barlock/Raleigh/IBM@IBMUS
To: [email protected]
Date: 07/27/2015 02:25 PM
Subject: Re: ZooKeeper Class Will Not Connect


Chris:

Your sample code works for me and I should be able to adapt it, but I
still think there is a bug. I made the following change to your sample to
test my method:

    private Main(String hostPort) throws Exception {
        this.connectLatch = new CountDownLatch(1);
        waitTime = System.currentTimeMillis();
        this.zk = new ZooKeeper(hostPort, 3000, this);
        States state = zk.getState();
        while (state != States.CONNECTED) {
            System.out.println("State " + state);
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                System.out.println("Interrupted!");
            }
        }
    }

This outputs:

State CONNECTING
Received event: WatchedEvent state:SyncConnected type:None path:null
State CONNECTING
State CONNECTING
...

and zk.getState never returns anything but CONNECTING. It seems that this
started with ZK 3.4.3 as 3.4.2 works for me, but 3.4.3, 4, 5 and 6 all
have this behavior in which the state is always CONNECTING.

Chris


From: Chris Nauroth <[email protected]>
To: "[email protected]" <[email protected]>
Date: 07/24/2015 07:28 PM
Subject: Re: ZooKeeper Class Will Not Connect


ZooKeeper writes INFO-level logging about connection and session
establishment. On the client side, these messages would come from the
ZooKeeper and ClientCnxn classes. On the server side, these messages
would come from the ZooKeeperServer, NIOServerCnxn and NettyServerCnxn
classes.
It's possible that you could get more detail by turning up to DEBUG level
logging by adding these lines to log4j.properties for the client and
server respectively:

log4j.logger.org.apache.zookeeper=DEBUG
log4j.logger.org.apache.zookeeper.server=DEBUG

--Chris Nauroth


On 7/24/15, 3:42 PM, "Chris Barlock" <[email protected]> wrote:

>Chris:
>
>I have defined:
>
>    private static final int MAX_ZK_CONNECT_ATTEMPTS = 400;
>
>    private static final long ZK_CONNECT_WAIT = 5; // Milliseconds
>
>so, two seconds, but I have also tried with ZK_CONNECT_WAIT = 500 (two
>hundred seconds). getState always returned CONNECTING. I can play with
>the async notification, but it really doesn't fit my application very
>well. Is there any additional server or client tracing that can be
>enabled to get a better sense of what is going on?
>
>Chris
>
>IBM Tivoli Systems
>Research Triangle Park, NC
>(919) 224-2240
>Internet: [email protected]
>
>
>
>From: Chris Nauroth <[email protected]>
>To: "[email protected]" <[email protected]>
>Date: 07/24/2015 06:19 PM
>Subject: Re: ZooKeeper Class Will Not Connect
>
>
>
>Hello Chris,
>
>Thank you for the detailed repro steps describing which versions work and
>which versions don't work. I tested your code sample against a 3.4.6
>build, and it worked consistently for me. My only thought is that perhaps
>the MAX_ZK_CONNECT_ATTEMPTS and ZK_CONNECT_WAIT constants are set such
>that the polling loop exits before the connection completes. Perhaps a
>subtle timing difference in the newer versions is just now exposing this.
>
>The typical pattern for connection establishment is to rely on the
>ZooKeeper client's asynchronous event notification instead of a polling
>loop. See below for a code sample that initiates the connection and then
>waits for the SyncConnected event. The ZooKeeper programmer's guide and
>example program docs have a more detailed discussion of this.
>
>http://zookeeper.apache.org/doc/r3.4.6/zookeeperProgrammers.html
>
>http://zookeeper.apache.org/doc/r3.4.6/javaExample.html
>
>Could you please try running this against your 3.4.6 cluster? I'd be
>curious to see if the connection completes for you. This would also give
>you a sense for how long connection establishment is taking and whether or
>not that's in line with your definitions of MAX_ZK_CONNECT_ATTEMPTS and
>ZK_CONNECT_WAIT.
>
>I hope this helps.
>
>
>
>class Main implements Watcher {
>
>    private final CountDownLatch connectLatch;
>    private final ZooKeeper zk;
>
>    public static void main(final String[] args) throws Exception {
>        String hostPort = args[0];
>        Main main = new Main(hostPort);
>        main.awaitConnection();
>        System.out.println("Exiting.");
>    }
>
>    private Main(String hostPort) throws Exception {
>        this.connectLatch = new CountDownLatch(1);
>        this.zk = new ZooKeeper(hostPort, 3000, this);
>    }
>
>    private void awaitConnection() throws InterruptedException {
>        this.connectLatch.await();
>        System.out.println("Connection has completed.");
>    }
>
>    @Override
>    public void process(WatchedEvent event) {
>        System.out.println("Received event: " + event);
>        if (event.getType() == Event.EventType.None) {
>            switch (event.getState()) {
>                case SyncConnected:
>                    this.connectLatch.countDown();
>                    break;
>            }
>        }
>    }
>}
>
>
>--Chris Nauroth
>
>
>
>
>On 7/24/15, 9:16 AM, "Chris Barlock" <[email protected]> wrote:
>
>>Ran some more tests. My code works fine up through ZK 3.4.2, but then
>>fails with 3.4.3. I did have to add the following to the Import-Package
>>list in the ZK MANIFEST.MF:
>>
>>org.slf4j,javax.security.auth.login,javax.security.sasl
>>
>>I could really use some help here, ZK folks! Is my code incorrect with
>>newer versions of ZK, or is ZK broken?
>>
>>Chris
>>
>>
>>
>>
>>
>>From: Chris Barlock/Raleigh/IBM@IBMUS
>>To: [email protected]
>>Date: 07/23/2015 09:34 PM
>>Subject: Re: ZooKeeper Class Will Not Connect
>>
>>
>>
>>I tried the 3.5.0 alpha build to see if it made any difference. It did
>>not.
>>
>>But, I had to hack the MANIFEST.MF file in the JAR because the
>>"3.50-alpha" version fails tests in OSGi bundles that import the ZK
>>classes which have something like:
>>
>>version="[3.2,4)"
>>
>>I suggest that if you want to name the JAR "3.50-alpha" that all the
>>internals just use a version of 3.50.
>>
>>Chris
>>
>>IBM Tivoli Systems
>>Research Triangle Park, NC
>>(919) 224-2240
>>Internet: [email protected]
>>
>>
>>
>>From: Chris Barlock/Raleigh/IBM@IBMUS
>>To: [email protected]
>>Date: 07/23/2015 01:37 PM
>>Subject: ZooKeeper Class Will Not Connect
>>
>>
>>
>>We are attempting to upgrade from Kafka 0.8.0, which includes ZK 3.3.4 to
>>Kafka 0.8.2.1 with ZK 3.4.6. My code which attempts to connect to ZK is
>>pretty straightforward:
>>
>>        try {
>>            ZooKeeper zk = new ZooKeeper(connectString, sessionTimeout, this);
>>            int connectAttempts = 0;
>>
>>            while (!zk.getState().isConnected() && connectAttempts < MAX_ZK_CONNECT_ATTEMPTS) {
>>                try {
>>                    Thread.sleep(ZK_CONNECT_WAIT);
>>                } catch (InterruptedException e) {
>>                    // Ignore
>>                }
>>                connectAttempts++;
>>            }
>>        } catch (IOException e) {
>>            trace.exception(CLASS_NAME, methodName, e);
>>        }
>>
>>With some additional tracing, States is always CONNECTING. Has something
>>changed with 3.4.6 about how I should connect to the server? I can
>>connect just fine with the zookeeper-shell.sh that Kafka ships. This code
>>always runs on the same system as ZK, so the connectString is always
>>"localhost:2181"
>>
>>Chris
>>
>
>
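
Note on the polling loop discussed in this thread: the modified test case read zk.getState() once before the loop and then compared the same stale value on every pass, which is why it printed CONNECTING forever. Below is a minimal sketch of a polling helper that re-reads the state on each iteration and gives up at a deadline. The class name StatePoller and the method awaitState are hypothetical and not part of ZooKeeper. The helper takes a Supplier so it runs without a ZooKeeper server; with the real client one would presumably pass zk::getState and ZooKeeper.States.CONNECTED, but that combination has not been tested here.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class StatePoller {

    /**
     * Polls until current.get() equals wanted, or until timeoutMillis elapses.
     * Returns true if the wanted state was observed, false on timeout.
     */
    public static <S> boolean awaitState(Supplier<S> current, S wanted,
                                         long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            // Re-read the state on every pass; caching it once outside
            // the loop is the mistake described in the thread.
            if (wanted.equals(current.get())) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            Thread.sleep(pollMillis);
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a client whose state changes asynchronously.
        AtomicReference<String> state = new AtomicReference<>("CONNECTING");
        Thread connector = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                return;
            }
            state.set("CONNECTED");
        });
        connector.start();

        boolean connected = awaitState(state::get, "CONNECTED", 2000, 10);
        System.out.println("connected=" + connected);
        connector.join();
    }
}
```

The deadline is expressed as elapsed time rather than an attempt count, so the total wait does not depend on how long each sleep actually takes. The latch-based Watcher approach shown earlier in the thread remains the pattern the ZooKeeper documentation recommends.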
