[ https://issues.apache.org/jira/browse/HELIX-608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15610469#comment-15610469 ]
Lei Xia commented on HELIX-608:
-------------------------------

We saw similar exceptions here too. I am investigating it.

> NPE and unable to reconnect to zookeeper after a network outage
> ---------------------------------------------------------------
>
>                 Key: HELIX-608
>                 URL: https://issues.apache.org/jira/browse/HELIX-608
>             Project: Apache Helix
>          Issue Type: Bug
>          Components: helix-core
>    Affects Versions: 0.7.1
>            Reporter: Changgeng Li
>
> I noticed one of the participants is not a live instance in zookeeper after a
> network outage, while the java process is live. I have to restart the java
> process to make it live again.
> Found the following logs:
> ERROR 2015-07-28 17:12:15,010 [main-EventThread] org.apache.zookeeper.ClientCnxn: Error while calling watcher
> java.lang.RuntimeException: Exception while restarting zk client
>         at org.I0Itec.zkclient.ZkClient.processStateChanged(ZkClient.java:462) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient.process(ZkClient.java:368) ~[zaaa.jar:?]
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531) [zaaa.jar:?]
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:507) [zaaa.jar:?]
> Caused by: org.I0Itec.zkclient.exception.ZkException: Unable to connect to zzookeeperhost:2181,zookeeperhost2.com:2181/a
>         at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:66) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient.reconnect(ZkClient.java:935) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient.processStateChanged(ZkClient.java:458) ~[zaaa.jar:?]
>         ... 3 more
> Caused by: java.net.UnknownHostException: zzookeeperhost: Temporary failure in name resolution
>         at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method) ~[?:1.7.0_72]
>         at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:901) ~[?:1.7.0_72]
>         at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1293) ~[?:1.7.0_72]
>         at java.net.InetAddress.getAllByName0(InetAddress.java:1246) ~[?:1.7.0_72]
>         at java.net.InetAddress.getAllByName(InetAddress.java:1162) ~[?:1.7.0_72]
>         at java.net.InetAddress.getAllByName(InetAddress.java:1098) ~[?:1.7.0_72]
>         at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:387) ~[zaaa.jar:?]
>         at org.apache.zookeeper.ClientCnxn.<init>(ClientCnxn.java:332) ~[zaaa.jar:?]
>         at org.apache.zookeeper.ZooKeeper.<init>(ZooKeeper.java:383) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkConnection.connect(ZkConnection.java:64) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient.reconnect(ZkClient.java:935) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient.processStateChanged(ZkClient.java:458) ~[zaaa.jar:?]
>         ... 3 more
> INFO 2015-07-28 17:12:15,010 [main-EventThread] org.apache.zookeeper.ClientCnxn: EventThread shut down
> ERROR 2015-07-28 17:12:15,014 [ZkClient-EventThread-184-zzookeeperhost:2181,zookeeperhost2.com:2181/a] org.I0Itec.zkclient.ZkEventThread: Error handling event ZkEvent[Children of /zaaa/INSTANCES/10.211.12.21_9000/MESSAGES changed sent to org.apache.helix.manager.zk.ZkCallbackHandler@71bd5cfa]
> java.lang.NullPointerException
>         at org.I0Itec.zkclient.ZkConnection.exists(ZkConnection.java:95) ~[zaaa.jar:?]
>         at org.apache.helix.manager.zk.ZkClient$2.call(ZkClient.java:195) ~[zaaa.jar:?]
>         at org.apache.helix.manager.zk.ZkClient$2.call(ZkClient.java:192) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675) ~[zaaa.jar:?]
>         at org.apache.helix.manager.zk.ZkClient.exists(ZkClient.java:192) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient.exists(ZkClient.java:445) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:566) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71) [zaaa.jar:?]
> ERROR 2015-07-28 17:12:15,015 [ZkClient-EventThread-184-zzookeeperhost:2181,zookeeperhost2.com:2181/a] org.I0Itec.zkclient.ZkEventThread: Error handling event ZkEvent[Children of /zaaa/EXTERNALVIEW changed sent to org.apache.helix.manager.zk.ZkCallbackHandler@35d1655]
> java.lang.NullPointerException
>         at org.I0Itec.zkclient.ZkConnection.exists(ZkConnection.java:95) ~[zaaa.jar:?]
>         at org.apache.helix.manager.zk.ZkClient$2.call(ZkClient.java:195) ~[zaaa.jar:?]
>         at org.apache.helix.manager.zk.ZkClient$2.call(ZkClient.java:192) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675) ~[zaaa.jar:?]
>         at org.apache.helix.manager.zk.ZkClient.exists(ZkClient.java:192) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient.exists(ZkClient.java:445) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:566) ~[zaaa.jar:?]
>         at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71) [zaaa.jar:?]

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)