I found a strange error that can cause a worker to restart. It seems to be
thrown by Storm itself rather than by my program. Even though the worker
restarts automatically, the failure triggers a large-scale restart of the
other workers connected to the first broken worker.
The initial error message from the first broken worker is shown below:
2016-10-09 14:55:03.672 o.a.s.m.n.StormServerHandler [ERROR] server errors in handling the request
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_102]
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_102]
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_102]
    at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_102]
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:1.8.0_102]
    at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [storm-core-1.0.2.jar:1.0.2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_102]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]
2016-10-09 14:55:03.710 o.a.s.m.n.StormClientHandler [INFO] Connection to storm01/132.228.28.136:6732 failed:
java.io.IOException: Connection reset by peer
    at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_102]
    at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_102]
    at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_102]
    at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_102]
    at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:380) ~[?:1.8.0_102]
    at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:64) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) [storm-core-1.0.2.jar:1.0.2]
    at org.apache.storm.shade.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) [storm-core-1.0.2.jar:1.0.2]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_102]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_102]
    at java.lang.Thread.run(Thread.java:745) [?:1.8.0_102]
Does anyone have a good solution to prevent the related workers from
restarting?
Thanks,
Junfeng Chen