[ 
https://issues.apache.org/jira/browse/STORM-183?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14033747#comment-14033747
 ] 

ASF GitHub Bot commented on STORM-183:
--------------------------------------

Github user revans2 commented on the pull request:

    https://github.com/apache/incubator-storm/pull/143#issuecomment-46302266
  
    I do like combining the two shutdown hooks together.
    
    I understand what is happening with the leaked processes now.  I was
    confused about exactly what the code was supposed to be doing.
    
    I thought that the supervisor would send a SIGTERM to the process, then
    wait a while and send a SIGKILL to it.  I didn't realize that instead it
    was just sending a SIGTERM and letting the worker send the SIGKILL to
    itself.
    
    The issue with not having the supervisor force-kill the child is that a
    bug in the worker, or in a child process the worker forks, could result in
    the process being leaked.
    
    I don't want to do the sleep right after the SIGTERM, because if there are
    multiple workers the sleeps will add up.  I think we want to modify
    sync-processes in the supervisor to do 2 passes over the workers that need
    to be killed.  The first pass would ask the worker to exit (SIGTERM).  The
    second pass would force-kill the worker and clean up the directories
    associated with it.  There could be a 1 second sleep in between if any
    workers were being killed (I don't want to sleep if no workers are shutting
    down).
    
    Does that sound reasonable?
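
The two-pass idea above might look roughly like the following. This is only a sketch, written in Java rather than the supervisor's Clojure. `TwoPassShutdown`, `WorkerHandle`, and its methods are hypothetical names, not Storm APIs. The grace period is a parameter; 1 second is the value suggested above.

```java
import java.util.List;

final class TwoPassShutdown {
    /** Hypothetical handle for one worker process; not part of Storm. */
    interface WorkerHandle {
        void requestExit(); // e.g. send SIGTERM
        void forceKill();   // e.g. send SIGKILL
        void cleanup();     // e.g. remove the worker's directories
    }

    /**
     * Pass 1 asks every worker to exit, then sleeps once (not once per
     * worker), then pass 2 force-kills and cleans up each worker.
     * Returns immediately, without sleeping, when there is nothing to kill.
     */
    static void shutdownWorkers(List<? extends WorkerHandle> workers, long graceMillis) {
        if (workers.isEmpty()) {
            return; // no workers shutting down, so no sleep
        }
        for (WorkerHandle w : workers) {
            w.requestExit();
        }
        try {
            Thread.sleep(graceMillis);
        } catch (InterruptedException e) {
            // Keep the interrupt status, but still run the force-kill pass
            // so that no worker is leaked.
            Thread.currentThread().interrupt();
        }
        for (WorkerHandle w : workers) {
            w.forceKill();
            w.cleanup();
        }
    }
}
```

With this shape the total wait is one grace period regardless of how many workers are being killed, and a worker that ignores the exit request is still removed in the second pass.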


> Supervisor/worker shutdown hook should be called in distributed mode.
> ---------------------------------------------------------------------
>
>                 Key: STORM-183
>                 URL: https://issues.apache.org/jira/browse/STORM-183
>             Project: Apache Storm (Incubating)
>          Issue Type: Bug
>            Reporter: caofangkun
>            Priority: Minor
>         Attachments: STORM-183-1.patch
>
>
> If the process is killed forcefully by the OS, or if it crashes due to
> resource issues (e.g., out of memory), shutdown hooks won't be invoked.
> -TERM (15) 
> The process is requested to stop running; it should try to exit cleanly 
> -KILL (9) 
> The process will be killed by the kernel; this signal cannot be ignored.
> So should we use 'kill -15' instead?
> See:
> https://github.com/apache/incubator-storm/blob/master/storm-core/src/clj/backtype/storm/util.clj#L392
> https://github.com/apache/incubator-storm/blob/master/storm-core/src/clj/backtype/storm/daemon/supervisor.clj#L175
> will never be called for supervisor:
> https://github.com/apache/incubator-storm/blob/master/storm-core/src/clj/backtype/storm/daemon/supervisor.clj#L396
> will never be called for worker:
> https://github.com/apache/incubator-storm/blob/master/storm-core/src/clj/backtype/storm/daemon/worker.clj#L421
> We'd better add something like:
> (.addShutdownHook (Runtime/getRuntime) (Thread. (fn [] (.shutdown mk-sv))))
>  ?
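
For illustration, registering a JVM shutdown hook along the lines suggested in the description could look like this in Java. `ShutdownHookSketch` and `registerShutdownHook` are hypothetical names, and the `Runnable` stands in for whatever shutdown function the supervisor or worker exposes. As the description notes, the JVM runs such hooks on a normal exit or on SIGTERM, but not when the process is killed with SIGKILL.

```java
final class ShutdownHookSketch {
    /**
     * Registers the given shutdown action as a JVM shutdown hook and returns
     * the hook thread, so a caller can later remove it with
     * Runtime.getRuntime().removeShutdownHook(hook) if needed.
     */
    static Thread registerShutdownHook(Runnable shutdown) {
        Thread hook = new Thread(shutdown, "shutdown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        // Placeholder action; a real daemon would call its own shutdown here.
        registerShutdownHook(() -> System.out.println("shutting down cleanly"));
        // The hook runs when main returns, or on 'kill -15 <pid>'.
        // It does not run on 'kill -9 <pid>'.
    }
}
```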



--
This message was sent by Atlassian JIRA
(v6.2#6252)
