Will try,

thank you Harsha

2014-11-26 16:31 GMT+02:00 Harsha <[email protected]>:

>  This could be due to your storm.local.dir getting corrupted. You can
> delete the contents of this dir and restart the storm cluster (nimbus,
> supervisor).
>
>
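A minimal sketch of the cleanup suggested above, in Python, assuming the Storm daemons on the node are stopped first. It moves the contents of the directory into a timestamped backup folder instead of deleting them outright (a deliberately more cautious variation on the advice), so the old state can still be inspected or restored. The function name `clear_local_dir`, the backup folder naming, and any paths passed to it are illustrative assumptions; the real directory is whatever `storm.local.dir` is set to in storm.yaml.

```python
import shutil
import time
from pathlib import Path


def clear_local_dir(local_dir, backup_root):
    """Move every entry inside local_dir into a new backup folder.

    local_dir itself is kept, so the daemon finds an empty directory
    on its next start. Returns the path of the backup folder.
    """
    local_dir = Path(local_dir).resolve()
    backup_root = Path(backup_root).resolve()

    # Refuse to back up into the directory that is being emptied.
    if backup_root == local_dir or local_dir in backup_root.parents:
        raise ValueError("backup_root must be outside local_dir")

    # Hypothetical naming scheme: one folder per run, keyed by Unix time.
    backup = backup_root / f"storm-local-backup-{int(time.time())}"
    backup.mkdir(parents=True)

    for entry in local_dir.iterdir():
        shutil.move(str(entry), str(backup / entry.name))
    return backup
```

A call might look like `clear_local_dir("/path/to/storm-local", "/tmp")`, with both paths being placeholders. Once the directory is empty, the daemons can be started again, for example with "storm supervisor" on a supervisor node.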
> On Wed, Nov 26, 2014, at 01:51 AM, Dimitris Samaras wrote:
>
> Hi all,
>
> @Harsha, by:
>
> "Everything works fine with topologies etc., up to the point that the
> Storm cluster needs to be restarted.
> In that case, for storm.sh (nimbus, supervisor, ui) to run successfully on
> a node, Storm has to be redeployed on that node and reconfigured (storm.yaml)."
>
> I mean that I can deploy a fully functional cluster and run/test the
> topologies properly; everything is OK at runtime.
> If the node gets restarted (it runs on a VM) due to a host PC restart etc.,
> when I execute "storm supervisor", for example on a supervisor node to
> restart it, it does not start!
>
> @Samit, the supervisor.log is:
>
> 2014-11-26 11:26:16 b.s.d.supervisor [INFO] Starting supervisor with id
> ea561988-508d-4593-9873-00f15736a6bf at host Ubuntu14super1
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:host.name
> =Ubuntu14super1
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:java.version=1.7.0_72
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:java.vendor=Oracle Corporation
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:java.home=/usr/lib/jvm/java-7-oracle/jre
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:java.class.path=/usr/local/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/local/storm/lib/logback-classic-1.0.6.jar:/usr/local/storm/lib/chill-java-0.3.5.jar:/usr/local/storm/lib/compojure-1.1.3.jar:/usr/local/sto$
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:java.io.tmpdir=/tmp
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:java.compiler=<NA>
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:os.name
> =Linux
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:os.arch=amd64
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:os.version=3.13.0-40-generic
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client environment:user.name
> =dimsam
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:user.home=/home/dimsam
> 2014-11-26 11:35:33 o.a.z.ZooKeeper [INFO] Client
> environment:user.dir=/usr/local/storm/bin
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:zookeeper.version=3.4.5-1392090, built on 09/30/2012 17:52 GMT
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:
> host.name=Ubuntu14super1
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:java.version=1.7.0_72
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:java.vendor=Oracle Corporation
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:java.home=/usr/lib/jvm/java-7-oracle/jre
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:java.class.path=/usr/local/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/local/storm/lib/logback-classic-1.0.6.jar:/usr/local/storm/lib/chill-java-0.3.5.jar:/usr/local/storm/lib/compojure-1.1.3.jar:/usr/l$
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:java.library.path=/usr/local/lib:/opt/local/lib:/usr/lib
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:java.io.tmpdir=/tmp
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:java.compiler=<NA>
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:
> os.name=Linux
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:os.arch=amd64
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:os.version=3.13.0-40-generic
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server environment:
> user.name=dimsam
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:user.home=/home/dimsam
> 2014-11-26 11:35:33 o.a.z.s.ZooKeeperServer [INFO] Server
> environment:user.dir=/usr/local/storm/bin
> 2014-11-26 11:35:33 b.s.d.supervisor [INFO] Starting Supervisor with conf
> {"dev.zookeeper.path" "/tmp/dev-storm-zookeeper",
> "topology.tick.tuple.freq.secs" nil,
> "topology.builtin.metrics.bucket.size.secs" 60,
> "topology.fall.back.on.java.serialization" true, "topology.ma$
> 2014-11-26 11:35:34 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
> 2014-11-26 11:35:34 o.a.z.ZooKeeper [INFO] Initiating client connection,
> connectString=195.251.117.209:2181 sessionTimeout=20000
> watcher=org.apache.curator.ConnectionState@4dddb4e
> 2014-11-26 11:35:34 o.a.z.ClientCnxn [INFO] Opening socket connection to
> server themis.iti.gr/195.251.117.209:2181. Will not attempt to
> authenticate using SASL (unknown error)
> 2014-11-26 11:35:34 o.a.z.ClientCnxn [INFO] Socket connection established
> to themis.iti.gr/195.251.117.209:2181, initiating session
> 2014-11-26 11:35:34 o.a.z.ClientCnxn [INFO] Session establishment complete
> on server themis.iti.gr/195.251.117.209:2181, sessionid =
> 0x149eb6ae8d10006, negotiated timeout = 20000
> 2014-11-26 11:35:34 o.a.c.f.s.ConnectionStateManager [INFO] State change:
> CONNECTED
> 2014-11-26 11:35:34 o.a.c.f.s.ConnectionStateManager [WARN] There are no
> ConnectionStateListeners registered.
> 2014-11-26 11:35:34 b.s.zookeeper [INFO] Zookeeper state update:
> :connected:none
> 2014-11-26 11:35:35 o.a.z.ClientCnxn [INFO] EventThread shut down
> 2014-11-26 11:35:35 o.a.z.ZooKeeper [INFO] Session: 0x149eb6ae8d10006
> closed
> 2014-11-26 11:35:35 o.a.c.f.i.CuratorFrameworkImpl [INFO] Starting
> 2014-11-26 11:35:35 o.a.z.ZooKeeper [INFO] Initiating client connection,
> connectString=195.251.117.209:2181/storm sessionTimeout=20000
> watcher=org.apache.curator.ConnectionState@4e451d76
> 2014-11-26 11:35:35 o.a.z.ClientCnxn [INFO] Opening socket connection to
> server themis.iti.gr/195.251.117.209:2181. Will not attempt to
> authenticate using SASL (unknown error)
> 2014-11-26 11:35:35 o.a.z.ClientCnxn [INFO] Socket connection established
> to themis.iti.gr/195.251.117.209:2181, initiating session
> 2014-11-26 11:35:35 o.a.z.ClientCnxn [INFO] Session establishment complete
> on server themis.iti.gr/195.251.117.209:2181, sessionid =
> 0x149eb6ae8d10007, negotiated timeout = 20000
> 2014-11-26 11:35:35 o.a.c.f.s.ConnectionStateManager [INFO] State change:
> CONNECTED
> 2014-11-26 11:35:35 o.a.c.f.s.ConnectionStateManager [WARN] There are no
> ConnectionStateListeners registered.
> 2014-11-26 11:35:35 b.s.d.supervisor [INFO] Starting supervisor with id
> ea561988-508d-4593-9873-00f15736a6bf at host Ubuntu14super1
> 2014-11-26 11:35:36 b.s.event [ERROR] Error when processing event
> java.lang.RuntimeException: java.io.EOFException
>         at backtype.storm.utils.Utils.deserialize(Utils.java:93)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.utils.LocalState.snapshot(LocalState.java:45)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.utils.LocalState.get(LocalState.java:56)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at
> backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:207)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at clojure.lang.AFn.applyToHelper(AFn.java:161)
> [clojure-1.5.1.jar:na]
>         at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.5.1.jar:na]
>         at clojure.core$apply.invoke(core.clj:619) ~[clojure-1.5.1.jar:na]
>         at clojure.core$partial$fn__4190.doInvoke(core.clj:2396)
> ~[clojure-1.5.1.jar:na]
>         at clojure.lang.RestFn.invoke(RestFn.java:397)
> ~[clojure-1.5.1.jar:na]
>         at
> backtype.storm.event$event_manager$fn__2378.invoke(event.clj:39)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
> Caused by: java.io.EOFException: null
>         at
> java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2325)
> ~[na:1.7.0_72]
>         at
> java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794)
> ~[na:1.7.0_72]
>         at
> java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
> ~[na:1.7.0_72]
>         at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
> ~[na:1.7.0_72]
>         at backtype.storm.utils.Utils.deserialize(Utils.java:88)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         ... 11 common frames omitted
> 2014-11-26 11:35:36 b.s.event [ERROR] Error when processing event
> java.lang.RuntimeException: java.io.EOFException
>         at backtype.storm.utils.Utils.deserialize(Utils.java:93)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.utils.LocalState.snapshot(LocalState.java:45)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at backtype.storm.utils.LocalState.get(LocalState.java:56)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at
> backtype.storm.daemon.supervisor$mk_synchronize_supervisor$this__6330.invoke(supervisor.clj:307)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at
> backtype.storm.event$event_manager$fn__2378.invoke(event.clj:39)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         at clojure.lang.AFn.run(AFn.java:24) [clojure-1.5.1.jar:na]
>         at java.lang.Thread.run(Thread.java:745) [na:1.7.0_72]
> Caused by: java.io.EOFException: null
>         at
> java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2325)
> ~[na:1.7.0_72]
>         at
> java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2794)
> ~[na:1.7.0_72]
>         at
> java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:801)
> ~[na:1.7.0_72]
>         at java.io.ObjectInputStream.<init>(ObjectInputStream.java:299)
> ~[na:1.7.0_72]
>         at backtype.storm.utils.Utils.deserialize(Utils.java:88)
> ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
>         ... 6 common frames omitted
> 2014-11-26 11:35:36 b.s.util [INFO] Halting process: ("Error when
> processing an event")
>
>
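In the trace above, the EOFException is thrown inside the ObjectInputStream constructor while it reads the stream header. That happens when the input ends before the four header bytes are available, so the file that LocalState tried to read was most likely empty or cut short. The following is a rough Python sketch for spotting such files. It relies on the fact that a Java-serialized stream starts with the bytes AC ED 00 05. The function names and the idea of walking a whole directory are assumptions made for illustration, and not every file under a Storm local directory is necessarily Java-serialized, so any hit is only a hint.

```python
import os

# Java serialization header: STREAM_MAGIC (0xACED) + STREAM_VERSION (0x0005).
JAVA_STREAM_HEADER = b"\xac\xed\x00\x05"


def header_status(path):
    """Classify a file by its first four bytes."""
    with open(path, "rb") as f:
        head = f.read(4)
    if len(head) < 4:
        # ObjectInputStream would hit EOF while reading the header.
        return "truncated"
    if head != JAVA_STREAM_HEADER:
        return "not-java-serialized"
    return "ok"


def scan(root):
    """Return {file path: status} for every regular file below root."""
    results = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            results[path] = header_status(path)
    return results
```

Running `scan` on the directory configured as storm.local.dir and looking for "truncated" entries would show which state files could not be deserialized after the restart.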
> The first line is from when the Storm supervisor was running properly!
> After a node restart the supervisor will not start, and I get the rest of
> the log.
>
>
> By "to run successfully on a node, Storm has to be redeployed on that
> node and reconfigured (storm.yaml)",
> I mean that in order to run the supervisor/nimbus again I have to
> redeploy Storm on every node that fails to start! I do not change the
> config in storm.yaml; I simply have to rewrite it with the same values.
>
>
> Thanks again!
>
> 2014-11-25 17:53 GMT+02:00 Harsha <[email protected]>:
>
>
>
> Dimitris,
> can you give more details on this:
> "Everything works fine with topologies etc., up to the point that the Storm
> cluster needs to be restarted.
> In that case, for storm.sh (nimbus, supervisor, ui) to run successfully on
> a node, Storm has to be redeployed on that node and reconfigured (storm.yaml)."
>
>
>    Is the cluster going down when you deploy a topology?
> "to run successfully on a node Storm has to be redeployed on that  node
> and reconfigured(storm.yaml)."
>
>   What do you mean by reconfiguration? Do you change the storm.yaml values
> from the previous deployment?
>
> -Harsha
>
>
> On Tue, Nov 25, 2014, at 06:24 AM, Samit Sasan wrote:
>
> Can you share the logs?
>
> -Samit
>
> On Tue, Nov 25, 2014 at 6:12 PM, Dimitris Samaras <
> [email protected]> wrote:
>
> Hi all,
>
> We are currently testing the Storm framework with 4 VM nodes (1 nimbus, 3
> supervisors) and a single-node ZooKeeper cluster for the Storm cluster
> management.
> Everything works fine with topologies etc., up to the point that the Storm
> cluster needs to be restarted.
> In that case, for storm.sh (nimbus, supervisor, ui) to run successfully on
> a node, Storm has to be redeployed on that node and reconfigured (storm.yaml).
>
> Any thoughts?
> Thanks in advance,
> Dimitris
