Jackson Chung created STORM-1022:
------------------------------------

             Summary: disconnection between workers
                 Key: STORM-1022
                 URL: https://issues.apache.org/jira/browse/STORM-1022
             Project: Apache Storm
          Issue Type: Bug
            Reporter: Jackson Chung


We upgraded to 0.9.5 and ran into the following exception. The supervisors did 
go down:

One caveat about our upgrade: we started a new nimbus without any supervisors 
attached, then deployed topologies (from CI/CD). Next we built new supervisors, 
and the supervisor process starts automatically on boot. However, in between, 
the network service was restarted (because chef changed the hostname during the 
build). Just mentioning this in case it makes a difference.

In other words, it could be that the supervisors started, picked up work, and 
then the network restarted.

{code}
SEVERE: RuntimeException while executing runnable org.apache.storm.guava.util.concurrent.Futures$4@445058b with executor org.apache.storm.guava.util.concurrent.MoreExecutors$SameThreadExecutorService@691bc565
java.lang.RuntimeException: Failed to connect to Netty-Client-usw2b-grunt-drone32-prod.amz.relateiq.com/10.30.103.202:6700
at backtype.storm.messaging.netty.Client.connect(Client.java:308)
at backtype.storm.messaging.netty.Client.access$1100(Client.java:78)
at backtype.storm.messaging.netty.Client$2.reconnectAgain(Client.java:297)
at backtype.storm.messaging.netty.Client$2.onSuccess(Client.java:283)
at backtype.storm.messaging.netty.Client$2.onSuccess(Client.java:275)
at org.apache.storm.guava.util.concurrent.Futures$4.run(Futures.java:1181)
at org.apache.storm.guava.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
at org.apache.storm.guava.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
at org.apache.storm.guava.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
at org.apache.storm.guava.util.concurrent.ListenableFutureTask.done(ListenableFutureTask.java:91)
at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:384)
at java.util.concurrent.FutureTask.set(FutureTask.java:233)
at java.util.concurrent.FutureTask.run(FutureTask.java:274)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Giving up to connect to Netty-Client-usw2b-grunt-drone32-prod.amz.relateiq.com/10.30.103.202:6700 after 102 failed attempts
at backtype.storm.messaging.netty.Client.connect(Client.java:303)
{code}
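
To help tell a stale hostname or network problem apart from a Storm problem, a check along these lines could be run from an affected worker host against the peer named in the trace. This is only a minimal sketch and not part of Storm: the class name PeerCheck, the method canConnect, and the 2-second timeout are assumptions made for illustration; port 6700 is the worker port shown in the trace above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.UnknownHostException;

// Hypothetical helper, not a Storm class.
public class PeerCheck {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            // Fails with an IOException on refusal, timeout, or an unresolvable host.
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) throws UnknownHostException {
        // Defaults are placeholders; pass the peer hostname and worker port as arguments.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 6700;

        // Show what the name resolves to right now, to compare with the IP in the trace.
        InetAddress addr = InetAddress.getByName(host);
        System.out.println(host + " resolves to " + addr.getHostAddress());

        System.out.println("connect " + (canConnect(host, port, 2000) ? "ok" : "failed"));
    }
}
```

If the resolved address differs from the one in the exception, or the connect fails while the remote worker is known to be running, that would point toward the hostname change or network restart rather than the workers themselves.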



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
