I'm not sure if it is a bug in storm, or if your system is somehow messed up.
I tracked down the exception being thrown and it indicates that someone deleted
the current working directory that you are running the supervisor in. I'm not
sure how that happens, but I have never seen it happen on any of my clusters,
which leads me to believe that it is something that happened with your system,
and not storm itself.
- Bobby
On Wednesday, March 25, 2015 4:54 PM, Justin Workman
<[email protected]> wrote:
Sounds like your storm-local is pointed to tmp and you are going through the
same process as us.
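If storm-local really is under /tmp, pointing it at a persistent directory should keep the OS tmp-cleaner from deleting state out from under a running supervisor. A minimal storm.yaml sketch (the path is an assumption; any non-volatile directory works):

```yaml
# storm.yaml on every node: keep supervisor local state out of /tmp,
# which tmpwatch/tmpfiles cleaners may purge while daemons are running.
storm.local.dir: "/var/lib/storm"   # assumed path, not from the thread
```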
Is this a bug, or is there a better solution?
Sent from my iPhone
On Mar 25, 2015, at 2:41 PM, Andres Gomez Ferrer <[email protected]> wrote:
I stopped all my topologies, stopped all storm nodes, removed /tmp/storm, and started everything again!
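For reference, that reset procedure can be sketched as a shell script. The storm stop/start steps are left as comments because daemon management varies by install; the runnable part wipes a scratch directory standing in for /tmp/storm so the sketch is safe to execute anywhere:

```shell
#!/bin/sh
# Sketch of the reset: kill topologies, stop daemons, wipe local state, restart.

# storm kill <topology-name>          # for each running topology
# (stop nimbus and every supervisor daemon)

# Stand-in for /tmp/storm so this demo does not touch a real cluster:
STORM_LOCAL_DIR=$(mktemp -d)
touch "$STORM_LOCAL_DIR/supervisor-state"

rm -rf "$STORM_LOCAL_DIR"/*          # the actual cleanup step

ls -A "$STORM_LOCAL_DIR" | wc -l     # prints 0: local state is gone

# (start nimbus and all supervisors again, then resubmit the topologies)
```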
Andrés Gómez
Developer, redborder.net / [email protected] / +34 955 60 11 60
This email, including attachments, is intended exclusively for its addressee. It contains information that is CONFIDENTIAL whose disclosure is prohibited by law and may be covered by legal privilege. If you have received this email in error, please notify the sender and delete it from your system.
On 25 March 2015 at 21:31:50, Justin Workman ([email protected]) wrote:
During each topology kill/restart? Where do you put that hook?
Or was it a one-time removal of /tmp/storm, and future topology restarts were fine?
Sent from my iPhone
On Mar 25, 2015, at 1:59 PM, Andres Gomez Ferrer <[email protected]> wrote:
I solved it by removing all of /tmp/storm/*
Regards,
Andrés Gómez
On 25 March 2015 at 20:54:58, Justin Workman ([email protected]) wrote:
We are currently running a 7-node storm cluster, 1 nimbus and 6 supervisor nodes, all on storm 0.9.2, running 3 topologies. Any time we kill a running topology, the supervisors across all nodes start flapping and we end up in a mess. To clean this up we end up killing all running topologies, shutting down the supervisors, cleaning up the storm/storm-local directories on all supervisor nodes, restarting the supervisor processes, and then restarting the topologies.
Has anyone experienced this issue, or have any ideas on how to resolve it?
Log snippet we see in the supervisor logs when this happens...
2015-03-25 11:28:13 b.s.d.supervisor [INFO] Shutting down 4d971d4b-a208-4758-a55e-3e8b34d7531f:ce049d5c-fd4c-499c-ad8d-ef1d8f2b992b
2015-03-25 11:28:13 b.s.event [ERROR] Error when processing event
java.io.IOException: . doesn't exist.
    at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:157) ~[commons-exec-1.1.jar:1.1]
    at org.apache.commons.exec.DefaultExecutor.execute(DefaultExecutor.java:147) ~[commons-exec-1.1.jar:1.1]
    at backtype.storm.util$exec_command_BANG_.invoke(util.clj:378) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.util$ensure_process_killed_BANG_.invoke(util.clj:394) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.daemon.supervisor$shutdown_worker.invoke(supervisor.clj:175) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at backtype.storm.daemon.supervisor$sync_processes.invoke(supervisor.clj:240) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at clojure.lang.AFn.applyToHelper(AFn.java:161) ~[clojure-1.5.1.jar:na]
    at clojure.lang.AFn.applyTo(AFn.java:151) ~[clojure-1.5.1.jar:na]
    at clojure.core$apply.invoke(core.clj:619) ~[clojure-1.5.1.jar:na]
    at clojure.core$partial$fn__4190.doInvoke(core.clj:2396) ~[clojure-1.5.1.jar:na]
    at clojure.lang.RestFn.invoke(RestFn.java:397) ~[clojure-1.5.1.jar:na]
    at backtype.storm.event$event_manager$fn__2378.invoke(event.clj:39) ~[storm-core-0.9.2-incubating.jar:0.9.2-incubating]
    at clojure.lang.AFn.run(AFn.java:24) ~[clojure-1.5.1.jar:na]
    at java.lang.Thread.run(Thread.java:724) ~[na:1.7.0_25]
2015-03-25 11:28:13 b.s.util [INFO] Halting process: ("Error when processing an event")
There does not appear to be anything corresponding to this in the worker logs.
Ideas??
Thanks,
Justin