Hi Josh

We have had the same issue for a long time, and the only solution we
have found is to restart the whole Storm cluster.
(I actually asked the same question on 12 May but got no response.)

In the meantime, we are evaluating a switch to Apache Spark for
streaming; you might want to have a look as well.

On Wed, May 14, 2014 at 11:25 PM, Josh Walton <[email protected]> wrote:

> Recently, we have had a couple of power failures for the servers running
> our zookeeper cluster. When zookeeper dies, the nimbus and supervisor
> processes eventually die as well. After the zookeeper failure, the only way
> I have gotten the supervisor processes to start back up is to delete the
> supervisor and worker directories as specified in the storm.yaml file. Is
> there a better/cleaner way to restart them?
>
> I have also noticed that when I start nimbus and the UI process back up,
> and navigate to the storm status page, the topologies we had started are
> still shown as active (even though they are not).
>
> This is the exception in the supervisor logs when I try to start them up
> after the zookeeper failure:
>
> 2014-05-14 09:16:03 b.s.event [ERROR] Error when processing event
> java.lang.RuntimeException: java.io.EOFException
>     at backtype.storm.utils.Utils.deserialize(Utils.java:69) ~[storm-core-0.9.0-rc3.jar:na]
>     at backtype.storm.utils.LocalState.snapshot(LocalState.java:28) ~[storm-core-0.9.0-rc3.jar:na]
>     at backtype.storm.utils.LocalState.get(LocalState.java:39) ~[storm-core-0.9.0-rc3.jar:na]
>     at backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:187) ~[storm-core-0.9.0-rc3.jar:na]
>     at clojure.lang.AFn.applyToHelper(AFn.java:161) [clojure-1.4.0.jar:na]
>     at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
>     at clojure.core$apply.invoke(core.clj:603) ~[clojure-1.4.0.jar:na]
>     at clojure.core$partial$fn__4070.doInvoke(core.clj:2343) ~[clojure-1.4.0.jar:na]
>     at clojure.lang.RestFn.invoke(RestFn.java:397) ~[clojure-1.4.0.jar:na]
>     at backtype.storm.event$event_manager$fn__3070.invoke(event.clj:24) ~[storm-core-0.9.0-rc3.jar:na]
>     at clojure.lang.AFn.run(AFn.java:24) [clojure-1.4.0.jar:na]
>     at java.lang.Thread.run(Thread.java:722) [na:1.7.0_21]
> Caused by: java.io.EOFException: null
>     at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2323) ~[na:1.7.0_21]
>     at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2792) ~[na:1.7.0_21]
>     at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:799) ~[na:1.7.0_21]
>     at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299) ~[na:1.7.0_21]
>     at backtype.storm.utils.Utils.deserialize(Utils.java:64) ~[storm-core-0.9.0-rc3.jar:na]
>     ... 11 common frames omitted
> 2014-05-14 09:16:03 b.s.util [INFO] Halting process: ("Error when processing an event")
>
>

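A note on the quoted trace, for what it is worth: the EOFException is raised
in ObjectInputStream.readStreamHeader, and in plain Java that happens when the
input holds fewer than the four header bytes (for example, an empty file).
The log does not say which file Storm was reading, so a truncated local state
file left behind by the power failure is only a guess. The sketch below
reproduces just the Java behaviour with in-memory byte arrays; the class and
method names are made up for illustration and are not part of Storm.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Arrays;

// Hypothetical demo class; nothing here comes from the Storm code base.
class TruncatedStateDemo {

    // Serialize an object with standard Java serialization.
    static byte[] serialize(Object value) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        return bytes.toByteArray();
    }

    // True if opening an ObjectInputStream over the data fails with
    // EOFException, i.e. the 4-byte stream header could not be read.
    static boolean headerReadHitsEof(byte[] data) throws IOException {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            return false; // header was read successfully
        } catch (EOFException e) {
            return true;  // same exception type as in the quoted trace
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] good = serialize("state");
        System.out.println("complete stream: " + headerReadHitsEof(good));
        System.out.println("empty input:     " + headerReadHitsEof(new byte[0]));
        System.out.println("3-byte input:    "
                + headerReadHitsEof(Arrays.copyOf(good, 3)));
    }
}
```

If that guess is right, it would also explain why deleting the supervisor and
worker directories gets the supervisor running again, since the unreadable
file goes away with them. I have not verified this against the Storm source.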