Jackson Chung created STORM-1022:
------------------------------------
Summary: disconnection between workers
Key: STORM-1022
URL: https://issues.apache.org/jira/browse/STORM-1022
Project: Apache Storm
Issue Type: Bug
Reporter: Jackson Chung
We upgraded to 0.9.5 and ran into the following exception. The supervisors did
go down.
One caveat about our upgrade: we started a new nimbus without any supervisors
attached, then deployed topologies (from CI/CD). Next we built new supervisors,
and the supervisors start automatically on startup. However, in between, the
network service was restarted (because chef changed the hostname during the
build). I am mentioning this in case it makes a difference.
In other words, it could be that the supervisors started, picked up work, and
then the network restarted.
{code}
SEVERE: RuntimeException while executing runnable org.apache.storm.guava.util.concurrent.Futures$4@445058b with executor org.apache.storm.guava.util.concurrent.MoreExecutors$SameThreadExecutorService@691bc565
java.lang.RuntimeException: Failed to connect to Netty-Client-usw2b-grunt-drone32-prod.amz.relateiq.com/10.30.103.202:6700
at backtype.storm.messaging.netty.Client.connect(Client.java:308)
at backtype.storm.messaging.netty.Client.access$1100(Client.java:78)
at backtype.storm.messaging.netty.Client$2.reconnectAgain(Client.java:297)
at backtype.storm.messaging.netty.Client$2.onSuccess(Client.java:283)
at backtype.storm.messaging.netty.Client$2.onSuccess(Client.java:275)
at org.apache.storm.guava.util.concurrent.Futures$4.run(Futures.java:1181)
at org.apache.storm.guava.util.concurrent.MoreExecutors$SameThreadExecutorService.execute(MoreExecutors.java:297)
at org.apache.storm.guava.util.concurrent.ExecutionList.executeListener(ExecutionList.java:156)
at org.apache.storm.guava.util.concurrent.ExecutionList.execute(ExecutionList.java:145)
at org.apache.storm.guava.util.concurrent.ListenableFutureTask.done(ListenableFutureTask.java:91)
at java.util.concurrent.FutureTask.finishCompletion(FutureTask.java:384)
at java.util.concurrent.FutureTask.set(FutureTask.java:233)
at java.util.concurrent.FutureTask.run(FutureTask.java:274)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Giving up to connect to Netty-Client-usw2b-grunt-drone32-prod.amz.relateiq.com/10.30.103.202:6700 after 102 failed attempts
at backtype.storm.messaging.netty.Client.connect(Client.java:303)
{code}