rdhabalia commented on issue #569: Revert back to default ZookeeperClientFactoryImpl
URL: https://github.com/apache/incubator-pulsar/pull/569#issuecomment-315687079

After enabling debug logging, I found that the build exits because `ZooKeeperSessionWatcher` couldn't get a heartbeat within the zk-session timeout.

```
[pulsar-zk-session-watcher-274-1:ZooKeeperSessionWatcher@164] - zoo keeper disconnected, waiting to reconnect, time remaining 0
[pulsar-zk-session-watcher-75235-1:ZooKeeperSessionWatcher@158] - timeout expired for reconnecting, invoking shutdown service
```

After digging into it, it seems the issue is not the BK-ZkClient library but the time spent processing the zk-response inside the aspectj advice. [ClientCnxnAspect](https://github.com/apache/incubator-pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/zookeeper/aspectj/ClientCnxnAspect.java#L72) intercepts the zk-response call, and if the advice takes more than a few msec, the zk-client loses the event somewhere (not sure what exactly happens in zk-client) and doesn't serve any subsequent zk-response, which ultimately causes the zk-timeout.

This can be easily verified:

**Build will not fail if:** we comment out the [event-notification at timedProcessEvent](https://github.com/apache/incubator-pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/zookeeper/aspectj/ClientCnxnAspect.java#L81)

```java
if (request != null) {
    long timeElapsed = (MathUtils.now() - startTimeMs);
    //notifyListeners(checkType(request), timeElapsed);
}
```

**Build immediately fails if:** we replace `notifyListeners(checkType(request), timeElapsed);` with a `Thread.sleep` call

```java
if (request != null) {
    long timeElapsed = (MathUtils.now() - startTimeMs);
    Thread.sleep(100); // if it takes more than few msec then zk-client lib misbehaves
}
```

I am testing the [fix](https://github.com/rdhabalia/pulsar/commit/af6734d2da66a0605f9cb0a96f116345502de74b), and will create a PR after testing it multiple times.
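
One way to keep the intercepted zk-response path fast is to hand listener notification off to a separate thread. The sketch below only illustrates that idea and is not necessarily what the linked fix does; `AsyncListenerNotifier`, its `LongConsumer` listeners, and the `zk-stats-notifier` thread name are hypothetical and not part of Pulsar or ZooKeeper.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.LongConsumer;

/**
 * Hypothetical helper (not Pulsar code): runs listener callbacks on a
 * dedicated thread so the calling thread is never blocked by a slow listener.
 */
class AsyncListenerNotifier implements AutoCloseable {
    private final List<LongConsumer> listeners = new CopyOnWriteArrayList<>();

    // Single daemon thread: callbacks run in submission order, off the caller's thread.
    private final ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "zk-stats-notifier");
        t.setDaemon(true);
        return t;
    });

    void addListener(LongConsumer listener) {
        listeners.add(listener);
    }

    /** Queues the notification and returns immediately. */
    void notifyListeners(long timeElapsedMs) {
        executor.execute(() -> {
            for (LongConsumer listener : listeners) {
                try {
                    listener.accept(timeElapsedMs);
                } catch (RuntimeException e) {
                    // A failing listener should not prevent the others from running.
                }
            }
        });
    }

    @Override
    public void close() {
        executor.shutdown();
    }
}
```

With something like this, the advice would call `notifier.notifyListeners(timeElapsed)` and return right away, so a slow listener delays only the notifier thread and not the thread that processes zk-responses.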