Hi Chris, I'm glad to hear this worked out!
Regarding the remaining bug related to OSGi, would you please file an Apache JIRA to track that? Even better, if you're available to code a patch for it, the community would appreciate the contribution. It sounds like you're a heavy user of OSGi, so you likely have a good test environment for validating a patch.

The OSGi manifest headers are coded in the build.xml. You can find them by searching for "Import-Package".

Thanks again!

--Chris Nauroth


On 7/31/15, 5:18 PM, "Chris Barlock" <[email protected]> wrote:

>I found the problem -- there were several missing package imports in the
>MANIFEST.MF that I created for the OSGi bundle I made that wrapped all the
>Kafka/ZooKeeper JAR files. It took enabling the log4j logging for
>
>log4j.logger.org.apache.zookeeper=DEBUG
>
>in order to see the ClassNotFoundExceptions that are typical with this
>problem. It was odd to me because my application's logging has always
>uncovered these problems, but...
>
>However, I did uncover a ZooKeeper bug. After fixing this, I tried
>"cleaning up" my OSGi bundle by pulling out all the JARs that are already
>OSGi bundles on their own. zookeeper-3.4.6 was one of these. However, it
>is missing a package import for
>
>org.ietf.jgss
>
>which is part of the Java SDK. I added this to my ZK JAR and it resolved
>the problem.
>
>Chris
>
>
>From: Chris Barlock/Raleigh/IBM@IBMUS
>To: [email protected]
>Date: 07/27/2015 05:09 PM
>Subject: Re: ZooKeeper Class Will Not Connect
>
>OK, I'm convinced that ZK is not broken. I stripped my code down to a
>simple stand-alone test case and it works just fine, even with the sleep
>loop. But, when I run it normally with ZK 3.4.6, I don't see the ZK server
>logging anything at all. With 3.4.2, it is fine, as previously noted.
>Full disclosure: we are running the client code in the WebSphere Liberty
>Profile app server and I have packaged the Kafka and ZK code into an OSGi
>bundle.
>
>I'd like to see what the client is doing, so I took the default
>log4j.properties file that ships with ZK 3.4.6 and added this to the end:
>
>log4j.logger.org.apache.zookeeper=DEBUG, CONSOLE
>
>I see this in the console log that Liberty manages:
>
>[err] log4j:WARN No appenders could be found for logger
>(org.apache.zookeeper.ZooKeeper).
>
>I'm not that familiar with log4j configuration. Where have I gone wrong
>here?
>
>Thanks!
>
>Chris
>
>
>From: Chris Barlock/Raleigh/IBM@IBMUS
>To: [email protected]
>Date: 07/27/2015 03:13 PM
>Subject: Re: ZooKeeper Class Will Not Connect
>
>OK, ignore this...embarrassing! I only got State once and tested it over
>and over again...
>
>Chris
>
>
>From: Chris Barlock/Raleigh/IBM@IBMUS
>To: [email protected]
>Date: 07/27/2015 02:25 PM
>Subject: Re: ZooKeeper Class Will Not Connect
>
>Chris:
>
>Your sample code works for me and I should be able to adapt it, but I
>still think there is a bug. I made the following change to your sample to
>test my method:
>
>    private Main(String hostPort) throws Exception {
>        this.connectLatch = new CountDownLatch(1);
>        waitTime = System.currentTimeMillis();
>        this.zk = new ZooKeeper(hostPort, 3000, this);
>
>        States state = zk.getState();
>        while (state != States.CONNECTED) {
>            System.out.println("State " + state);
>            try {
>                Thread.sleep(100);
>            } catch (InterruptedException e) {
>                System.out.println("Interrupted!");
>            }
>        }
>    }
>
>This outputs:
>
>State CONNECTING
>Received event: WatchedEvent state:SyncConnected type:None path:null
>State CONNECTING
>State CONNECTING
>...
>
>and zk.getState never returns anything but CONNECTING. It seems that this
>started with ZK 3.4.3 as 3.4.2 works for me, but 3.4.3, 4, 5 and 6 all
>have this behavior in which the state is always CONNECTING.
>
>Chris
>
>
>From: Chris Nauroth <[email protected]>
>To: "[email protected]" <[email protected]>
>Date: 07/24/2015 07:28 PM
>Subject: Re: ZooKeeper Class Will Not Connect
>
>ZooKeeper writes INFO-level logging about connection and session
>establishment. On the client side, these messages would come from the
>ZooKeeper and ClientCnxn classes. On the server side, these messages
>would come from the ZooKeeperServer, NIOServerCnxn and NettyServerCnxn
>classes.
>
>It's possible that you could get more detail by turning up to DEBUG level
>logging by adding these lines to log4j.properties for the client and
>server respectively:
>
>log4j.logger.org.apache.zookeeper=DEBUG
>log4j.logger.org.apache.zookeeper.server=DEBUG
>
>--Chris Nauroth
>
>
>On 7/24/15, 3:42 PM, "Chris Barlock" <[email protected]> wrote:
>
>>Chris:
>>
>>I have defined:
>>
>>    private static final int MAX_ZK_CONNECT_ATTEMPTS = 400;
>>
>>    private static final long ZK_CONNECT_WAIT = 5; // Milliseconds
>>
>>so, two seconds, but I have also tried with ZK_CONNECT_WAIT = 500 (two
>>hundred seconds). getState always returned CONNECTING. I can play with
>>the async notification, but it really doesn't fit my application very
>>well. Is there any additional server or client tracing that can be
>>enabled to get a better sense of what is going on?
>>
>>Chris
>>
>>IBM Tivoli Systems
>>Research Triangle Park, NC
>>(919) 224-2240
>>Internet: [email protected]
>>
>>
>>From: Chris Nauroth <[email protected]>
>>To: "[email protected]" <[email protected]>
>>Date: 07/24/2015 06:19 PM
>>Subject: Re: ZooKeeper Class Will Not Connect
>>
>>Hello Chris,
>>
>>Thank you for the detailed repro steps describing which versions work and
>>which versions don't work. I tested your code sample against a 3.4.6
>>build, and it worked consistently for me.
>>My only thought is that perhaps the MAX_ZK_CONNECT_ATTEMPTS and
>>ZK_CONNECT_WAIT constants are set such that the polling loop exits
>>before the connection completes. Perhaps a subtle timing difference in
>>the newer versions is just now exposing this.
>>
>>The typical pattern for connection establishment is to rely on the
>>ZooKeeper client's asynchronous event notification instead of a polling
>>loop. See below for a code sample that initiates the connection and then
>>waits for the SyncConnected event. The ZooKeeper programmer's guide and
>>example program docs have a more detailed discussion of this.
>>
>>http://zookeeper.apache.org/doc/r3.4.6/zookeeperProgrammers.html
>>
>>http://zookeeper.apache.org/doc/r3.4.6/javaExample.html
>>
>>Could you please try running this against your 3.4.6 cluster? I'd be
>>curious to see if the connection completes for you. This would also give
>>you a sense for how long connection establishment is taking and whether or
>>not that's in line with your definitions of MAX_ZK_CONNECT_ATTEMPTS and
>>ZK_CONNECT_WAIT.
>>
>>I hope this helps.
>>
>>
>>import java.util.concurrent.CountDownLatch;
>>
>>import org.apache.zookeeper.WatchedEvent;
>>import org.apache.zookeeper.Watcher;
>>import org.apache.zookeeper.ZooKeeper;
>>
>>class Main implements Watcher {
>>
>>    private final CountDownLatch connectLatch;
>>    private final ZooKeeper zk;
>>
>>    public static void main(final String[] args) throws Exception {
>>        String hostPort = args[0];
>>        Main main = new Main(hostPort);
>>        main.awaitConnection();
>>        System.out.println("Exiting.");
>>    }
>>
>>    private Main(String hostPort) throws Exception {
>>        this.connectLatch = new CountDownLatch(1);
>>        // The constructor returns immediately; the connection is
>>        // established asynchronously and reported to process().
>>        this.zk = new ZooKeeper(hostPort, 3000, this);
>>    }
>>
>>    private void awaitConnection() throws InterruptedException {
>>        // Block until process() sees the SyncConnected event.
>>        this.connectLatch.await();
>>        System.out.println("Connection has completed.");
>>    }
>>
>>    @Override
>>    public void process(WatchedEvent event) {
>>        System.out.println("Received event: " + event);
>>        if (event.getType() == Event.EventType.None) {
>>            switch (event.getState()) {
>>            case SyncConnected:
>>                this.connectLatch.countDown();
>>                break;
>>            }
>>        }
>>    }
>>}
>>
>>
>>--Chris Nauroth
>>
>>
>>On 7/24/15, 9:16 AM, "Chris Barlock" <[email protected]> wrote:
>>
>>>Ran some more tests. My code works fine up through ZK 3.4.2, but then
>>>fails with 3.4.3. I did have to add the following to the Import-Package
>>>list in the ZK MANIFEST.MF:
>>>
>>>org.slf4j,javax.security.auth.login,javax.security.sasl
>>>
>>>I could really use some help here, ZK folks! Is my code incorrect with
>>>newer versions of ZK, or is ZK broken?
>>>
>>>Chris
>>>
>>>
>>>From: Chris Barlock/Raleigh/IBM@IBMUS
>>>To: [email protected]
>>>Date: 07/23/2015 09:34 PM
>>>Subject: Re: ZooKeeper Class Will Not Connect
>>>
>>>I tried the 3.5.0 alpha build to see if it made any difference. It did
>>>not.
>>>
>>>But, I had to hack the MANIFEST.MF file in the JAR because the
>>>"3.50-alpha" version fails tests in OSGi bundles that import the ZK
>>>classes which have something like:
>>>
>>>version="[3.2,4)"
>>>
>>>I suggest that if you want to name the JAR "3.50-alpha" that all the
>>>internals just use a version of 3.50.
>>>
>>>Chris
>>>
>>>IBM Tivoli Systems
>>>Research Triangle Park, NC
>>>(919) 224-2240
>>>Internet: [email protected]
>>>
>>>
>>>From: Chris Barlock/Raleigh/IBM@IBMUS
>>>To: [email protected]
>>>Date: 07/23/2015 01:37 PM
>>>Subject: ZooKeeper Class Will Not Connect
>>>
>>>We are attempting to upgrade from Kafka 0.8.0, which includes ZK 3.3.4 to
>>>Kafka 0.8.2.1 with ZK 3.4.6. My code which attempts to connect to ZK is
>>>pretty straightforward:
>>>
>>>    try {
>>>        ZooKeeper zk = new ZooKeeper(connectString, sessionTimeout, this);
>>>        int connectAttempts = 0;
>>>
>>>        while (!zk.getState().isConnected()
>>>                && connectAttempts < MAX_ZK_CONNECT_ATTEMPTS) {
>>>            try {
>>>                Thread.sleep(ZK_CONNECT_WAIT);
>>>            } catch (InterruptedException e) {
>>>                // Ignore
>>>            }
>>>            connectAttempts++;
>>>        }
>>>    } catch (IOException e) {
>>>        trace.exception(CLASS_NAME, methodName, e);
>>>    }
>>>
>>>With some additional tracing, States is always CONNECTING. Has something
>>>changed with 3.4.6 about how I should connect to the server? I can
>>>connect just fine with the zookeeper-shell.sh that Kafka ships. This code
>>>always runs on the same system as ZK, so the connectString is always
>>>"localhost:2181"
>>>
>>>Chris
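
For anyone who still prefers a polling loop over the watcher-based pattern, here is a minimal sketch of a loop that re-reads the state on every pass and gives up after a deadline. It is only an illustration under stated assumptions: StatePoller, awaitState and the State enum are hypothetical names, not part of ZooKeeper, and the enum stands in for ZooKeeper.States so the sketch runs without the ZooKeeper JAR. With a real client, the supplier would presumably be zk::getState and the wanted value ZooKeeper.States.CONNECTED.

```java
import java.util.function.Supplier;

/** Hypothetical helper: polls a state supplier until it matches or a deadline passes. */
public final class StatePoller {

    /** Stand-in for ZooKeeper.States so this sketch needs no ZooKeeper JAR. */
    public enum State { CONNECTING, CONNECTED }

    /**
     * Returns true once current.get() equals wanted, or false if
     * timeoutMillis elapses first.
     */
    public static <S> boolean awaitState(Supplier<S> current, S wanted,
                                         long timeoutMillis, long pollMillis)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        // Re-read the state on every pass. Reading it once before the loop
        // is the mistake described in the 07/27 03:13 PM message.
        while (!wanted.equals(current.get())) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out while still in another state
            }
            Thread.sleep(pollMillis);
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Simulated client that reports CONNECTED after roughly 50 ms.
        Supplier<State> simulated = () ->
                System.currentTimeMillis() - start >= 50
                        ? State.CONNECTED
                        : State.CONNECTING;
        boolean connected = awaitState(simulated, State.CONNECTED, 2000, 5);
        System.out.println("Connected: " + connected);
    }
}
```

Propagating InterruptedException to the caller, rather than swallowing it inside the loop, lets the calling thread decide how to respond to an interrupt. The latch-based approach quoted above is still the pattern recommended in the thread.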
