I checked my supervisor log and found entries like "Shutting down and clearing
state for id {worker.id}", but I still can't find the cause of the worker
shutting down.

Here are the supervisor logs. It looks like the worker's JVM exited before the
supervisor tried to kill it.

2015-05-06 17:21:47:847 b.s.d.supervisor [INFO] Launching worker with assignment #backtype.storm.daemon.supervisor.LocalAssignment{:storm-id "message_storm-101-17-1430903865", :executors ...} for this supervisor 95d77940-08b2-4f7a-9660-03d8f61a2529 on port 6701 with id 95722030-7785-40d6-a47a-cc072c4c9d83
2015-05-06 17:21:47:849 b.s.d.supervisor [INFO] Launching worker with command: 
'/usr/java/bin/java' '-server' '-Xmx8g' '-Xms8g' '-Xmn2g' '-DServer=mblog' 
'-DNODE_POOL_NAME=openapi_storm-yf' '-XX:PermSize=128m' '-XX:MaxPermSize=128m' 
'-XX:MaxTenuringThreshold=4' '-XX:+UseConcMarkSweepGC' '-XX:SurvivorRatio=8' 
'-XX:CMSInitiatingOccupancyFraction=70' '-XX:+ExplicitGCInvokesConcurrent' 
'-XX:+PrintCommandLineFlags' '-XX:+PrintGCDateStamps' 
'-XX:+PrintTenuringDistribution' '-XX:+PrintGCDetails' '-XX:+PrintGCTimeStamps' 
'-XX:+PrintGCApplicationStoppedTime' '-XX:+PrintGCApplicationConcurrentTime' 
'-Xloggc:/data0/logs/storm/gc.log' 
'-Djava.library.path=storm-local/supervisor/stormdist/message_storm-101-17-1430903865/resources/Linux-amd64:storm-local/supervisor/stormdist/message_storm-101-17-1430903865/resources:/usr/local/lib:/opt/local/lib:/usr/lib'
 '-Dlogfile.name=worker-6701.log' '-Dstorm.home=/data0/storm' 
'-Dstorm.conf.file=' '-Dstorm.options=' '-Dstorm.log.dir=/data0/storm/logs' 
'-Dlogback.configurationFile=/data0/storm/logback/cluster.xml' 
'-Dstorm.id=message_storm-101-17-1430903865' 
'-Dworker.id=95722030-7785-40d6-a47a-cc072c4c9d83' '-Dworker.port=6701' '-cp' 
'/data0/storm/lib/logback-core-1.0.13.jar:/data0/storm/lib/tools.logging-0.2.3.jar:/data0/storm/lib/commons-codec-1.6.jar:/data0/storm/lib/json-simple-1.1.jar:/data0/storm/lib/commons-logging-1.1.3.jar:/data0/storm/lib/tools.cli-0.2.4.jar:/data0/storm/lib/jetty-util-6.1.26.jar:/data0/storm/lib/kryo-2.21.jar:/data0/storm/lib/ring-devel-0.3.11.jar:/data0/storm/lib/commons-lang-2.5.jar:/data0/storm/lib/commons-exec-1.1.jar:/data0/storm/lib/clout-1.0.1.jar:/data0/storm/lib/servlet-api-2.5.jar:/data0/storm/lib/ring-jetty-adapter-0.3.11.jar:/data0/storm/lib/clj-time-0.4.1.jar:/data0/storm/lib/jgrapht-core-0.9.0.jar:/data0/storm/lib/chill-java-0.3.5.jar:/data0/storm/lib/jline-2.11.jar:/data0/storm/lib/tools.macro-0.1.0.jar:/data0/storm/lib/log4j-over-slf4j-1.6.6.jar:/data0/storm/lib/minlog-1.2.jar:/data0/storm/lib/core.incubator-0.1.0.jar:/data0/storm/lib/jetty-6.1.26.jar:/data0/storm/lib/compojure-1.1.3.jar:/data0/storm/lib/math.numeric-tower-0.0.1.jar:/data0/storm/lib/slf4j-api-1.7.5.jar:/data0/storm/lib/hiccup-0.3.6.jar:/data0/storm/lib/objenesis-1.2.jar:/data0/storm/lib/storm-core-0.9.3.jar:/data0/storm/lib/snakeyaml-1.11.jar:/data0/storm/lib/asm-4.0.jar:/data0/storm/lib/carbonite-1.4.0.jar:/data0/storm/lib/commons-fileupload-1.2.1.jar:/data0/storm/lib/ring-core-1.1.5.jar:/data0/storm/lib/logback-classic-1.0.13.jar:/data0/storm/lib/clojure-1.5.1.jar:/data0/storm/lib/reflectasm-1.07-shaded.jar:/data0/storm/lib/ring-servlet-0.3.11.jar:/data0/storm/lib/disruptor-2.10.1.jar:/data0/storm/lib/joda-time-2.0.jar:/data0/storm/lib/commons-io-2.4.jar:/data0/storm/lib/clj-stacktrace-0.2.2.jar:/data0/storm/conf:storm-local/supervisor/stormdist/message_storm-101-17-1430903865/stormjar.jar'
 'backtype.storm.daemon.worker' 'message_storm-101-17-1430903865' 
'95d77940-08b2-4f7a-9660-03d8f61a2529' '6701' 
'95722030-7785-40d6-a47a-cc072c4c9d83'
2015-05-06 17:21:47:850 b.s.d.supervisor [INFO] 95722030-7785-40d6-a47a-cc072c4c9d83 still hasn't started
2015-05-06 17:21:48:350 b.s.d.supervisor [INFO] 95722030-7785-40d6-a47a-cc072c4c9d83 still hasn't started
2015-05-06 17:21:48:850 b.s.d.supervisor [INFO] 95722030-7785-40d6-a47a-cc072c4c9d83 still hasn't started
2015-05-06 17:21:49:351 b.s.d.supervisor [INFO] 95722030-7785-40d6-a47a-cc072c4c9d83 still hasn't started
2015-05-06 17:21:49:851 b.s.d.supervisor [INFO] 95722030-7785-40d6-a47a-cc072c4c9d83 still hasn't started
2015-05-06 17:21:50:352 b.s.d.supervisor [INFO] 95722030-7785-40d6-a47a-cc072c4c9d83 still hasn't started
2015-05-06 17:21:50:852 b.s.d.supervisor [INFO] 95722030-7785-40d6-a47a-cc072c4c9d83 still hasn't started
2015-05-06 17:21:51:352 b.s.d.supervisor [INFO] 95722030-7785-40d6-a47a-cc072c4c9d83 still hasn't started
2015-05-06 17:21:51:853 b.s.d.supervisor [INFO] 95722030-7785-40d6-a47a-cc072c4c9d83 still hasn't started
2015-05-06 17:21:52:353 b.s.d.supervisor [INFO] 95722030-7785-40d6-a47a-cc072c4c9d83 still hasn't started
2015-05-06 17:25:37:845 b.s.d.supervisor [INFO] Shutting down and clearing state for id 95722030-7785-40d6-a47a-cc072c4c9d83. Current supervisor time: 1430904337. State: :disallowed, Heartbeat: #backtype.storm.daemon.common.WorkerHeartbeat{:time-secs 1430904305, :storm-id "message_storm-101-17-1430903865", :executors #{...}, :port 6701}
2015-05-06 17:25:37:845 b.s.d.supervisor [INFO] Shutting down 95d77940-08b2-4f7a-9660-03d8f61a2529:95722030-7785-40d6-a47a-cc072c4c9d83
2015-05-06 17:25:37:847 b.s.util [INFO] Error when trying to kill 27187. Process is probably already dead.
2015-05-06 17:25:38:849 b.s.util [INFO] Error when trying to kill 27187. Process is probably already dead.
2015-05-06 17:25:38:853 b.s.d.supervisor [INFO] Shut down 95d77940-08b2-4f7a-9660-03d8f61a2529:95722030-7785-40d6-a47a-cc072c4c9d83
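The "State: :disallowed" in that shutdown line seems to be the key detail. As a
rough Java paraphrase of how the 0.9.x supervisor classifies a worker heartbeat
before logging "Shutting down and clearing state" (a sketch based on my reading
of supervisor.clj, I believe in read-allocated-workers; the names below and the
30-second supervisor.worker.timeout.secs default are assumptions, not the real
Storm API):

import java.util.Set;

public class WorkerStateSketch {

    enum WorkerState { DISALLOWED, NOT_STARTED, TIMED_OUT, VALID }

    // Minimal stand-in for the heartbeat the worker writes to local state.
    record Heartbeat(long timeSecs) {}

    static WorkerState classify(String workerId,
                                Heartbeat hb,               // null if never written
                                Set<String> approvedIds,    // ids this supervisor launched
                                boolean matchesAssignment,  // executors/port still assigned here
                                long nowSecs,
                                long workerTimeoutSecs) {
        if (!approvedIds.contains(workerId) || !matchesAssignment) {
            // Assignment changed out from under the worker, e.g. after a
            // redeploy or rebalance -> ":disallowed" in the supervisor log.
            return WorkerState.DISALLOWED;
        }
        if (hb == null) {
            return WorkerState.NOT_STARTED;
        }
        if (nowSecs - hb.timeSecs() > workerTimeoutSecs) {
            return WorkerState.TIMED_OUT;
        }
        return WorkerState.VALID;
    }

    public static void main(String[] args) {
        // Numbers taken from the shutdown log line above: heartbeat at
        // 1430904305, supervisor time 1430904337, reported state :disallowed.
        Heartbeat hb = new Heartbeat(1430904305L);
        WorkerState state = classify("95722030-7785-40d6-a47a-cc072c4c9d83",
                                     hb,
                                     Set.of("95722030-7785-40d6-a47a-cc072c4c9d83"),
                                     false,  // assignment no longer matches
                                     1430904337L,
                                     30L);   // assumed supervisor.worker.timeout.secs default
        System.out.println(state);  // DISALLOWED
    }
}

If that paraphrase is right, :disallowed points at the assignment changing
(which would fit the redeploy) rather than a missed heartbeat, and the worker's
own log should show why its JVM exited first.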

From: [email protected]
To: [email protected]
Subject: Re: Storm worker died - Client is being closed, and does not take requests any more
Date: Thu, 7 May 2015 15:15:50 +0000

Look into your supervisor logs, with them set to INFO.

Look for something like:
“Shutting down and clearing state for id {worker.id}”

That id will match your worker’s logs.

The reason for the shutdown will be in the supervisor’s logs.
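If it helps, here is a minimal Java sketch of that cross-check: it just prints
every log line that mentions a given worker id, so the worker's exit and the
supervisor's "Shutting down and clearing state" entry can be lined up by
timestamp. The class name and usage are my own invention; point it at the
supervisor and worker log files under your storm.log.dir.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class GrepWorkerId {
    // Usage: java GrepWorkerId <worker-id> <supervisor-log> <worker-log> ...
    public static void main(String[] args) throws IOException {
        String workerId = args[0];
        for (int i = 1; i < args.length; i++) {
            Path log = Path.of(args[i]);
            System.out.println("=== " + log + " ===");
            try (Stream<String> lines = Files.lines(log)) {
                lines.filter(line -> line.contains(workerId))
                     .forEach(System.out::println);
            }
        }
    }
}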

From: SuXingyu <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, May 7, 2015 at 02:44
To: "[email protected]" <[email protected]>
Subject: Storm worker died - Client is being closed, and does not take requests any more

Hi all,

I am running a Kafka-Storm cluster in distributed mode. There are four
supervisors in my Storm cluster. Most of the time it works fine, but after I
redeployed a topology yesterday, I found that all four workers in the cluster
restarted after running for several minutes.

I checked the Storm logs and found this:

2015-05-06 17:23:08:934 b.s.util [ERROR] Async loop died!
java.lang.RuntimeException: java.lang.RuntimeException: Client is being closed, and does not take requests any more
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:128) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.disruptor$consume_loop_STAR_$fn__1460.invoke(disruptor.clj:94) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.util$async_loop$fn__464.invoke(util.clj:463) ~[storm-core-0.9.3.jar:0.9.3]
        at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
Caused by: java.lang.RuntimeException: Client is being closed, and does not take requests any more
        at backtype.storm.messaging.netty.Client.send(Client.java:185) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.utils.TransferDrainer.send(TransferDrainer.java:54) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__3730$fn__3731.invoke(worker.clj:330) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.daemon.worker$mk_transfer_tuples_handler$fn__3730.invoke(worker.clj:328) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.disruptor$clojure_handler$reify__1447.onEvent(disruptor.clj:58) ~[storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) ~[storm-core-0.9.3.jar:0.9.3]
        ... 6 common frames omitted
2015-05-06 17:23:08:938 b.s.util [ERROR] Halting process: ("Async loop died!")
java.lang.RuntimeException: ("Async loop died!")
        at backtype.storm.util$exit_process_BANG_.doInvoke(util.clj:325) [storm-core-0.9.3.jar:0.9.3]
        at clojure.lang.RestFn.invoke(RestFn.java:423) [clojure-1.5.1.jar:na]
        at backtype.storm.disruptor$consume_loop_STAR_$fn__1458.invoke(disruptor.clj:92) [storm-core-0.9.3.jar:0.9.3]
        at backtype.storm.util$async_loop$fn__464.invoke(util.clj:473) [storm-core-0.9.3.jar:0.9.3]
        at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
2015-05-06 17:23:08:957 b.s.d.worker [INFO] Shutting down worker message_storm-101-17-1430903865 95d77940-08b2-4f7a-9660-03d8f61a2529 6700
2015-05-06 17:23:45:487 o.a.s.z.ZooKeeper [INFO] Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
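In case it makes the trace easier to read, here is a rough stand-alone
paraphrase of that failure path (not the actual Storm sources; the class below
only mimics the "Client is being closed" guard and the "Halting process"
behaviour the trace shows):

public class HaltOnAsyncLoopDeath {

    // Stand-in for backtype.storm.messaging.netty.Client's closing guard.
    static class NettyishClient {
        private volatile boolean closing = false;

        void close() {
            closing = true;
        }

        void send(byte[] payload) {
            if (closing) {
                throw new RuntimeException(
                        "Client is being closed, and does not take requests any more");
            }
            // real network write elided
        }
    }

    public static void main(String[] args) throws InterruptedException {
        NettyishClient client = new NettyishClient();

        // Stand-in for the worker's transfer thread ("async loop").
        Thread transfer = new Thread(() -> {
            try {
                client.send(new byte[0]);  // fine while the connection is open
                client.close();            // connection torn down, e.g. peer reassigned
                client.send(new byte[0]);  // throws -> "Async loop died!"
            } catch (RuntimeException e) {
                e.printStackTrace();
                // The async loop treats this as fatal and halts the JVM,
                // which is the "Halting process" line in the worker log.
                Runtime.getRuntime().halt(1);
            }
        }, "transfer-thread");

        transfer.start();
        transfer.join();
    }
}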

These logs appeared several times over about twenty minutes, and the cluster
recovered after a rebalance operation in the Storm UI.

Any idea why this may be happening?

Best regards!

Xingyu