Storm version: 0.9.2. This host crashed because of an unrelated issue while it was running a topology. After the host was restarted, the supervisor can't start because of a deserialization / NPE issue. It can be fixed by removing the files from the data dir, but what is the root cause of this?
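One observation on the trace below: the EOFException is thrown inside the ObjectInputStream constructor while it reads the stream header, which is what Java serialization does when the input is empty or shorter than the four header bytes. That would fit a state file left empty or truncated by the crash, but this is an assumption and has not been checked against the actual files. A minimal Java sketch that reproduces the exception and tests arbitrary files for the same condition (the class name LocalStateCheck and the method headerIsTruncated are hypothetical, not part of Storm):

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Storm.
public class LocalStateCheck {

    // Returns true when the bytes end before the serialization stream header
    // (2-byte magic + 2-byte version) is complete. This is the condition that makes
    // the ObjectInputStream constructor throw EOFException, as in the trace.
    // Bytes with a complete but invalid header raise StreamCorruptedException instead,
    // which is passed on to the caller.
    static boolean headerIsTruncated(byte[] bytes) throws IOException {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return false;
        } catch (EOFException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Each argument is the path of a file to check; which files to pass is up to
        // the caller (assumption: the files under the supervisor's data dir).
        for (String arg : args) {
            try {
                byte[] bytes = Files.readAllBytes(Paths.get(arg));
                System.out.println(arg + " (" + bytes.length + " bytes): "
                        + (headerIsTruncated(bytes) ? "truncated header" : "header ok"));
            } catch (IOException e) {
                // Unreadable file, or content that is not a Java serialization stream.
                System.out.println(arg + ": " + e);
            }
        }
    }
}
```

Running it against the suspect files before deleting them would show whether any of them is empty or cut short. Not every file in the data dir is necessarily a serialized object, so a "truncated header" result is only a hint, not proof of corruption.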
2015-01-22 11:32:22 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2015-01-22 11:32:22 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=10.8.74.106:2181,10.8.74.107:2181,10.8.74.222:2181 sessionTimeout=20000 watcher=org.apache.curator.ConnectionState@425fdbde
2015-01-22 11:32:22 o.a.z.ClientCnxn [INFO] Opening socket connection to server VMS06918/10.8.74.222:2181. Will not attempt to authenticate using SASL (unknown error)
2015-01-22 11:32:22 o.a.z.ClientCnxn [INFO] Socket connection established to VMS06918/10.8.74.222:2181, initiating session
2015-01-22 11:32:22 o.a.z.ClientCnxn [INFO] Session establishment complete on server VMS06918/10.8.74.222:2181, sessionid = 0x34917a5658f1541, negotiated timeout = 20000
2015-01-22 11:32:22 o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
2015-01-22 11:32:22 o.a.c.f.s.ConnectionStateManager [WARN] There are no ConnectionStateListeners registered.
2015-01-22 11:32:22 b.s.zookeeper [INFO] Zookeeper state update: :connected:none
2015-01-22 11:32:23 o.a.z.ZooKeeper [INFO] Session: 0x34917a5658f1541 closed
2015-01-22 11:32:23 o.a.z.ClientCnxn [INFO] EventThread shut down
2015-01-22 11:32:23 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
2015-01-22 11:32:23 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=10.8.74.106:2181,10.8.74.107:2181,10.8.74.222:2181/storm-antibot sessionTimeout=20000 watcher=org.apache.curator.ConnectionState@f5dcbbc
2015-01-22 11:32:23 o.a.z.ClientCnxn [INFO] Opening socket connection to server VMS06916/10.8.74.106:2181. Will not attempt to authenticate using SASL (unknown error)
2015-01-22 11:32:23 o.a.z.ClientCnxn [INFO] Socket connection established to VMS06916/10.8.74.106:2181, initiating session
2015-01-22 11:32:23 o.a.z.ClientCnxn [INFO] Session establishment complete on server VMS06916/10.8.74.106:2181, sessionid = 0x14917a568cb14ef, negotiated timeout = 20000
2015-01-22 11:32:23 o.a.c.f.s.ConnectionStateManager [INFO] State change: CONNECTED
2015-01-22 11:32:23 o.a.c.f.s.ConnectionStateManager [WARN] There are no ConnectionStateListeners registered.
2015-01-22 11:32:23 b.s.d.supervisor [INFO] Starting supervisor with id 214f9e32-b99b-4833-a808-bcb73db4d673 at host SH02SVR4271
2015-01-22 11:32:24 b.s.event [ERROR] Error when processing event
java.lang.RuntimeException: java.io.EOFException
    at backtype.storm.utils.Utils.deserialize(Utils.java:93) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.utils.LocalState.snapshot(LocalState.java:45) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.utils.LocalState.get(LocalState.java:56) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.daemon.supervisor$read_worker_heartbeat.invoke(supervisor.clj:77) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.daemon.supervisor$read_worker_heartbeats$iter__6167__6171$fn__6172.invoke(supervisor.clj:90) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at clojure.lang.LazySeq.sval(LazySeq.java:42) ~[clojure-1.5.1.jar:na]
    at clojure.lang.LazySeq.seq(LazySeq.java:60) ~[clojure-1.5.1.jar:na]
    at clojure.lang.Cons.next(Cons.java:39) ~[clojure-1.5.1.jar:na]
    at clojure.lang.LazySeq.next(LazySeq.java:92) ~[clojure-1.5.1.jar:na]
    at clojure.lang.RT.next(RT.java:598) ~[clojure-1.5.1.jar:na]
    at clojure.core$next.invoke(core.clj:64) ~[clojure-1.5.1.jar:na]
    at clojure.core$dorun.invoke(core.clj:2781) ~[clojure-1.5.1.jar:na]
    at clojure.core$doall.invoke(core.clj:2796) ~[clojure-1.5.1.jar:na]
    at backtype.storm.daemon.supervisor$read_worker_heartbeats.invoke(supervisor.clj:89) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.daemon.supervisor$read_allocated_workers.invoke(supervisor.clj:106) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:209) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at clojure.lang.AFn.applyToHelper(AFn.java:161) [clojure-1.5.1.jar:na]
    at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
    at clojure.core$apply.invoke(core.clj:619) ~[clojure-1.5.1.jar:na]
    at clojure.core$partial$fn__4190.doInvoke(core.clj:2396) ~[clojure-1.5.1.jar:na]
    at clojure.lang.RestFn.invoke(RestFn.java:397) ~[clojure-1.5.1.jar:na]
    at backtype.storm.event$event_manager$fn__2378.invoke(event.clj:39) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
    at java.lang.Thread.run(Thread.java:744) [na:1.7.0_51]
Caused by: java.io.EOFException: null
    at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2325) ~[na:1.7.0_51]
    at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794) ~[na:1.7.0_51]
    at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801) ~[na:1.7.0_51]
    at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299) ~[na:1.7.0_51]
    at backtype.storm.utils.Utils.deserialize(Utils.java:88) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    ... 23 common frames omitted

--
Best regards!
Mike Zang
