This looks like a bug in PathChildrenCache to me. I see an NPE there. Please open an issue in Jira for this.
-Jordan

From: Bae, Jae Hyeon [email protected]
Reply: [email protected] [email protected]
Date: April 9, 2014 at 12:55:51 PM
To: [email protected] [email protected]
Subject: curator-2.4.0 cannot recover connection loss

Last night, I rolling-restarted zookeeper 3.4.5 to update configuration and I saw curator-2.4.0 couldn't recover connection loss.

ERROR 2014-04-09 17:48:15,231 [DaemonThreadFactory-2-thread-2] org.apache.curator.framework.imps.CuratorFrameworkImpl: Background retry gave up
org.apache.curator.CuratorConnectionLossException: KeeperErrorCode = ConnectionLoss
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:766)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:749)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:56)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$3.call(CuratorFrameworkImpl.java:244)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
INFO 2014-04-09 17:48:15,276 [ServerInventoryView-0-EventThread] org.apache.curator.framework.state.ConnectionStateManager: State change: RECONNECTED
INFO 2014-04-09 17:48:15,382 [ServerInventoryView-0-EventThread] org.apache.curator.framework.state.ConnectionStateManager: State change: SUSPENDED
ERROR 2014-04-09 17:48:15,748 [DaemonThreadFactory-2-thread-2] org.apache.curator.framework.imps.CuratorFrameworkImpl: Background exception was not retry-able or retry gave up
java.lang.NullPointerException
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:191)
    at com.google.common.collect.Lists$TransformingSequentialList.<init>(Lists.java:527)
    at com.google.common.collect.Lists.transform(Lists.java:510)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache.processChildren(PathChildrenCache.java:635)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache.access$200(PathChildrenCache.java:68)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache$4.processResult(PathChildrenCache.java:476)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.sendToBackgroundCallback(CuratorFrameworkImpl.java:686)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.checkBackgroundRetry(CuratorFrameworkImpl.java:659)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.performBackgroundOperation(CuratorFrameworkImpl.java:783)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.backgroundOperationsLoop(CuratorFrameworkImpl.java:749)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.access$300(CuratorFrameworkImpl.java:56)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl$3.call(CuratorFrameworkImpl.java:244)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)

I am not sure this bug is on PathChildrenCache. I need to restart all instances using curator-2.4.0, which is really bad.

Thank you
Best, Jae
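
For anyone following the second trace: the NullPointerException is raised by a null check inside Lists.transform, called from processChildren, which in turn is reached from a background callback after the retry gave up. That pattern suggests (this is a reading of the trace, not a confirmed diagnosis) that the callback received a failed result carrying no children list and passed it straight to the transform. The following is a minimal, self-contained sketch of that failure mode and of one possible guard. It is not Curator's code: the class NullChildrenGuard, the methods toFullPaths and handleResult, the constants, and the "/servers" path are all hypothetical names chosen for illustration, and plain Java is used in place of Guava.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Objects;

// Hypothetical illustration only; not taken from the Curator source.
public class NullChildrenGuard {
    // Stand-ins for result codes a background callback might see (assumed values).
    public static final int OK = 0;
    public static final int CONNECTION_LOSS = -4;

    // Mimics an unguarded transform: it rejects a null list up front,
    // in the same spirit as the checkNotNull frame in the trace.
    public static List<String> toFullPaths(String parent, List<String> children) {
        Objects.requireNonNull(children, "children");
        List<String> paths = new ArrayList<>();
        for (String child : children) {
            paths.add(parent + "/" + child);
        }
        return paths;
    }

    // Guarded callback: skip processing when the operation failed
    // or no children list was delivered.
    public static List<String> handleResult(int resultCode, String parent, List<String> children) {
        if (resultCode != OK || children == null) {
            return Collections.emptyList();
        }
        return toFullPaths(parent, children);
    }

    public static void main(String[] args) {
        // Successful result: children are transformed normally.
        System.out.println(handleResult(OK, "/servers", Arrays.asList("a", "b")));
        // Failed result with no children: the guard returns an empty list.
        System.out.println(handleResult(CONNECTION_LOSS, "/servers", null));
        // Without the guard, the same input throws.
        try {
            toFullPaths("/servers", null);
        } catch (NullPointerException e) {
            System.out.println("unguarded call failed: " + e.getMessage());
        }
    }
}
```

Whether an empty list, a retry, or a cache rebuild is the right reaction to a failed result is a design decision for the actual fix; the sketch only shows where a check on the result code would keep the null from reaching the transform.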
