Bijan Fahimi Shemrani created STORM-2706:
--------------------------------------------

             Summary: Nimbus stuck in exception and does not fail fast
                 Key: STORM-2706
                 URL: https://issues.apache.org/jira/browse/STORM-2706
             Project: Apache Storm
          Issue Type: Bug
    Affects Versions: 1.1.1
            Reporter: Bijan Fahimi Shemrani


We are seeing a problem in nimbus that leaves it stuck in a retry-and-fail 
loop. When I manually restart nimbus, it works again as expected. 
However, it would be better if nimbus shut down in this situation, so that 
our monitoring can restart it automatically. 
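
To illustrate the fail-fast behaviour requested here, below is a minimal 
sketch. It is not Storm code and not a proposed patch: the class 
FailFastGuard, its threshold parameter and the onFatal hook are all 
hypothetical names made up for this example. The idea is only that a call 
which keeps failing (such as the leader lookup in the log below) eventually 
triggers a shutdown hook instead of failing forever.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical helper (not part of Storm): runs an operation and, once it
 * has failed a configured number of times in a row, invokes a fatal hook.
 */
final class FailFastGuard {
    private final int maxConsecutiveFailures;
    private final Runnable onFatal;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    FailFastGuard(int maxConsecutiveFailures, Runnable onFatal) {
        if (maxConsecutiveFailures < 1) {
            throw new IllegalArgumentException("maxConsecutiveFailures must be >= 1");
        }
        this.maxConsecutiveFailures = maxConsecutiveFailures;
        this.onFatal = onFatal;
    }

    /** Runs the operation; a success resets the failure counter. */
    <T> T call(Callable<T> operation) throws Exception {
        try {
            T result = operation.call();
            consecutiveFailures.set(0);
            return result;
        } catch (Exception e) {
            // Escalate once the threshold is reached, then rethrow so the
            // caller still sees the original error.
            if (consecutiveFailures.incrementAndGet() >= maxConsecutiveFailures) {
                onFatal.run();
            }
            throw e;
        }
    }
}
```

In a daemon, the hook could be something like 
() -> Runtime.getRuntime().halt(1), so that external monitoring notices the 
exit and restarts the process. The threshold is an assumption; a suitable 
value would depend on how long transient ZooKeeper problems usually last.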

The nimbus log:

{noformat}
24.8.2017 15:39:1913:39:19.804 [pool-13-thread-51] ERROR 
org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer - 
Unexpected throwable while invoking!
24.8.2017 
15:39:19org.apache.storm.shade.org.apache.zookeeper.KeeperException$NoNodeException:
 KeeperErrorCode = NoNode for /storm/leader-lock
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1590)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:230)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:219)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:216)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:207)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:40)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.curator.framework.recipes.locks.LockInternals.getSortedChildren(LockInternals.java:151)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.curator.framework.recipes.locks.LockInternals.getParticipantNodes(LockInternals.java:133)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.shade.org.apache.curator.framework.recipes.leader.LeaderLatch.getLeader(LeaderLatch.java:453)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown 
Source) ~[?:?]
24.8.2017 15:39:19      at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_131]
24.8.2017 15:39:19      at java.lang.reflect.Method.invoke(Method.java:498) 
~[?:1.8.0_131]
24.8.2017 15:39:19      at 
clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) 
~[clojure-1.7.0.jar:?]
24.8.2017 15:39:19      at 
clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313) 
~[clojure-1.7.0.jar:?]
24.8.2017 15:39:19      at 
org.apache.storm.zookeeper$zk_leader_elector$reify__1043.getLeader(zookeeper.clj:296)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown 
Source) ~[?:?]
24.8.2017 15:39:19      at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_131]
24.8.2017 15:39:19      at java.lang.reflect.Method.invoke(Method.java:498) 
~[?:1.8.0_131]
24.8.2017 15:39:19      at 
clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) 
~[clojure-1.7.0.jar:?]
24.8.2017 15:39:19      at 
clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313) 
~[clojure-1.7.0.jar:?]
24.8.2017 15:39:19      at 
org.apache.storm.daemon.nimbus$mk_reified_nimbus$reify__10780.getLeader(nimbus.clj:2412)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.generated.Nimbus$Processor$getLeader.getResult(Nimbus.java:3944)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.generated.Nimbus$Processor$getLeader.getResult(Nimbus.java:3928)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:162)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) 
~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:19      at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0_131]
24.8.2017 15:39:19      at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0_131]
24.8.2017 15:39:19      at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
24.8.2017 15:39:2713:39:27.205 [pool-13-thread-52] ERROR 
org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer - 
Unexpected throwable while invoking!
24.8.2017 
15:39:27org.apache.storm.shade.org.apache.zookeeper.KeeperException$NoNodeException:
 KeeperErrorCode = NoNode for /storm/leader-lock
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1590)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:230)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:219)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:109)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:216)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:207)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:40)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.curator.framework.recipes.locks.LockInternals.getSortedChildren(LockInternals.java:151)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.curator.framework.recipes.locks.LockInternals.getParticipantNodes(LockInternals.java:133)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.shade.org.apache.curator.framework.recipes.leader.LeaderLatch.getLeader(LeaderLatch.java:453)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown 
Source) ~[?:?]
24.8.2017 15:39:27      at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_131]
24.8.2017 15:39:27      at java.lang.reflect.Method.invoke(Method.java:498) 
~[?:1.8.0_131]
24.8.2017 15:39:27      at 
clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) 
~[clojure-1.7.0.jar:?]
24.8.2017 15:39:27      at 
clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313) 
~[clojure-1.7.0.jar:?]
24.8.2017 15:39:27      at 
org.apache.storm.zookeeper$zk_leader_elector$reify__1043.getLeader(zookeeper.clj:296)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at sun.reflect.GeneratedMethodAccessor32.invoke(Unknown 
Source) ~[?:?]
24.8.2017 15:39:27      at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_131]
24.8.2017 15:39:27      at java.lang.reflect.Method.invoke(Method.java:498) 
~[?:1.8.0_131]
24.8.2017 15:39:27      at 
clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) 
~[clojure-1.7.0.jar:?]
24.8.2017 15:39:27      at 
clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313) 
~[clojure-1.7.0.jar:?]
24.8.2017 15:39:27      at 
org.apache.storm.daemon.nimbus$get_cluster_info.invoke(nimbus.clj:1544) 
~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.daemon.nimbus$mk_reified_nimbus$reify__10780.getClusterInfo(nimbus.clj:2006)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.generated.Nimbus$Processor$getClusterInfo.getResult(Nimbus.java:3920)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.generated.Nimbus$Processor$getClusterInfo.getResult(Nimbus.java:3904)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:39) 
~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.security.auth.SimpleTransportPlugin$SimpleWrapProcessor.process(SimpleTransportPlugin.java:162)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.thrift.server.AbstractNonblockingServer$FrameBuffer.invoke(AbstractNonblockingServer.java:518)
 ~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
org.apache.storm.thrift.server.Invocation.run(Invocation.java:18) 
~[storm-core-1.1.1.jar:1.1.1]
24.8.2017 15:39:27      at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0_131]
24.8.2017 15:39:27      at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0_131]
24.8.2017 15:39:27      at java.lang.Thread.run(Thread.java:748) [?:1.8.0_131]
24.8.2017 15:39:2913:39:29.270 [timer] INFO  org.apache.storm.daemon.nimbus - 
not a leader, skipping assignments
24.8.2017 15:39:2913:39:29.270 [timer] INFO  org.apache.storm.daemon.nimbus - 
not a leader, skipping cleanup
24.8.2017 15:39:3913:39:39.270 [timer] INFO  org.apache.storm.daemon.nimbus - 
not a leader, skipping assignments
24.8.2017 15:39:3913:39:39.270 [timer] INFO  org.apache.storm.daemon.nimbus - 
not a leader, skipping cleanup
24.8.2017 15:39:4913:39:49.271 [timer] INFO  org.apache.storm.daemon.nimbus - 
not a leader, skipping assignments
24.8.2017 15:39:4913:39:49.272 [timer] INFO  org.apache.storm.daemon.nimbus - 
not a leader, skipping cleanup
24.8.2017 15:39:5913:39:59.272 [timer] INFO  org.apache.storm.daemon.nimbus - 
not a leader, skipping assignments
24.8.2017 15:39:5913:39:59.272 [timer] INFO  org.apache.storm.daemon.nimbus - 
not a leader, skipping cleanup
24.8.2017 15:40:0913:40:09.272 [timer] INFO  org.apache.storm.daemon.nimbus - 
not a leader, skipping assignments
24.8.2017 15:40:0913:40:09.272 [timer] INFO  org.apache.storm.daemon.nimbus - 
not a leader, skipping cleanup
24.8.2017 15:40:1313:40:13.806 [timer] INFO  
org.apache.storm.shade.org.apache.curator.framework.imps.CuratorFrameworkImpl - 
Starting
24.8.2017 15:40:1313:40:13.807 [timer] INFO  
org.apache.storm.shade.org.apache.zookeeper.ZooKeeper - Initiating client 
connection, connectString=zookeeper:2181/storm sessionTimeout=20000 
watcher=org.apache.storm.shade.org.apache.curator.ConnectionState@f90354
24.8.2017 15:40:1313:40:13.808 [timer-SendThread(10.42.174.214:2181)] INFO  
org.apache.storm.shade.org.apache.zookeeper.ClientCnxn - Opening socket 
connection to server 10.42.174.214/10.42.174.214:2181. Will not attempt to 
authenticate using SASL (unknown error)
24.8.2017 15:40:1313:40:13.862 [timer-SendThread(10.42.174.214:2181)] INFO  
org.apache.storm.shade.org.apache.zookeeper.ClientCnxn - Socket connection 
established to 10.42.174.214/10.42.174.214:2181, initiating session
24.8.2017 15:40:1313:40:13.865 [timer-SendThread(10.42.174.214:2181)] INFO  
org.apache.storm.shade.org.apache.zookeeper.ClientCnxn - Session establishment 
complete on server 10.42.174.214/10.42.174.214:2181, sessionid = 
0x15e14456dc70045, negotiated timeout = 20000
24.8.2017 15:40:1313:40:13.910 [timer] INFO  
org.apache.storm.shade.org.apache.zookeeper.ZooKeeper - Session: 
0x15e14456dc70045 closed
24.8.2017 15:40:1313:40:13.910 [timer-EventThread] INFO  
org.apache.storm.shade.org.apache.zookeeper.ClientCnxn - EventThread shut down
{noformat}




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
