I'm not sure if it is a bug in storm, or if your system is somehow messed up.
I tracked down the exception being thrown and it indicates that someone deleted
the current working directory that you are running the supervisor in. I'm not
sure how that happens, but I have never seen it happen on any of my clusters,
which leads me to believe that it is something that happened with your system,
and not storm itself.
- Bobby
On Wednesday, March 25, 2015 4:54 PM, Justin Workman
<[email protected]> wrote:
Sounds like your storm-local is pointed to tmp and you are going through the
same process as us.
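If storm-local really is under /tmp, pointing it at a persistent directory should keep the OS tmp-cleaner from deleting state out from under a running supervisor. A minimal storm.yaml sketch (the path is an assumption; any non-volatile directory works):

```yaml
# storm.yaml on every node: keep supervisor local state out of /tmp,
# which tmpwatch/tmpfiles cleaners may purge while daemons are running.
storm.local.dir: "/var/lib/storm"   # assumed path, not from the thread
```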
Is this a bug, or is there a better solution?
Sent from my iPhone
On Mar 25, 2015, at 2:41 PM, Andres Gomez Ferrer <[email protected]> wrote:
I stopped all my topologies, stopped all storm nodes, removed /tmp/storm, and started everything again!
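For reference, that reset procedure can be sketched as a shell script. The storm stop/start steps are left as comments because daemon management varies by install; the runnable part wipes a scratch directory standing in for /tmp/storm so the sketch is safe to execute anywhere:

```shell
#!/bin/sh
# Sketch of the reset: kill topologies, stop daemons, wipe local state, restart.

# storm kill <topology-name>          # for each running topology
# (stop nimbus and every supervisor daemon)

# Stand-in for /tmp/storm so this demo does not touch a real cluster:
STORM_LOCAL_DIR=$(mktemp -d)
touch "$STORM_LOCAL_DIR/supervisor-state"

rm -rf "$STORM_LOCAL_DIR"/*          # the actual cleanup step

ls -A "$STORM_LOCAL_DIR" | wc -l     # prints 0: local state is gone

# (start nimbus and all supervisors again, then resubmit the topologies)
```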
Andrés Gómez
Developer, redborder.net / [email protected] / +34 955 60 11 60
This email, including attachments, is intended exclusively for its addressee. It contains information that is CONFIDENTIAL whose disclosure is prohibited by law and may be covered by legal privilege. If you have received this email in error, please notify the sender and delete it from your system.
On 25 March 2015 at 21:31:50, Justin Workman ([email protected]) wrote:
During each topology kill/restart? Where do you put that hook?
Or was it a one-time removal of /tmp/storm, and future topology restarts were fine?
Sent from my iPhone
On Mar 25, 2015, at 1:59 PM, Andres Gomez Ferrer <[email protected]> wrote:
I solved it by removing all of /tmp/storm/*
Regards,
Andrés Gómez
On 25 March 2015 at 20:54:58, Justin Workman ([email protected]) wrote:
We are currently running a 7-node storm cluster, 1 nimbus and 6 supervisor nodes, all on storm 0.9.2, running 3 topologies. Any time we kill a running topology, the supervisors across all nodes start flapping and we end up in a mess. To clean this up we end up killing all running topologies, shutting down the supervisors, cleaning up the storm/storm-local directories on all supervisor nodes, restarting the supervisor processes, and then restarting the topologies.
Has anyone experienced this issue, or have any ideas on how to resolve it?
Log snippet we see in the supervisor logs when this happens...
2015-03-25 11:28:13 b.s.d.supervisor [INFO] Shutting down 4d971d4b-a208-4758-a55e-3e8b34d7531f:ce049d5c-fd4c-499c-ad8d-ef1d8f2b992b
2015-03-25 11:28:13 b.s.event [ERROR] Error when processing event
java.io.IOException: . doesn't exist.
    at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:157) ~[commons-exec-1.1.jar:1.1]
    at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:147) ~[commons-exec-1.1.jar:1.1]
    at backtype.storm.util$exec_command_BANG_.invoke(util.clj:378) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.util$ensure_process_killed_BANG_.invoke(util.clj:394) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.daemon.supervisor$shutdown_worker.invoke(supervisor.clj:175) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:240) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at clojure.lang.AFn.applyToHelper(AFn.java:161) ~[clojure-1.5.1.jar:na]
    at clojure.lang.AFn.applyTo(AFn.java:151) ~[clojure-1.5.1.jar:na]
    at clojure.core$apply.invoke(core.clj:619) ~[clojure-1.5.1.jar:na]
    at clojure.core$partial$fn__4190.doInvoke(core.clj:2396) ~[clojure-1.5.1.jar:na]
    at clojure.lang.RestFn.invoke(RestFn.java:397) ~[clojure-1.5.1.jar:na]
    at backtype.storm.event$event_manager$fn__2378.invoke(event.clj:39) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at clojure.lang.AFn.run(AFn.java:24) ~[clojure-1.5.1.jar:na]
    at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
2015-03-25 11:28:13 b.s.util [INFO] Halting process: ("Error when processing an event")
There does not appear to be anything corresponding to this in the worker logs.
Ideas??
Thanks,
Justin