[ https://issues.apache.org/jira/browse/STORM-532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303716#comment-14303716 ]
ASF GitHub Bot commented on STORM-532:
--------------------------------------

Github user revans2 commented on the pull request:

    https://github.com/apache/storm/pull/296#issuecomment-72706534

    The code change looks OK, but I am seeing test failures in supervisor_test.clj. I would also prefer that we cache the Process that we used to launch the external process and ask it whether the process has exited, to know if it is still up, instead of running ps/etc.

> Supervisor should restart worker immediately, if the worker process does not exist any more
> --------------------------------------------------------------------------------------------
>
>                 Key: STORM-532
>                 URL: https://issues.apache.org/jira/browse/STORM-532
>             Project: Apache Storm
>          Issue Type: Improvement
>    Affects Versions: 0.10.0
>            Reporter: caofangkun
>            Assignee: caofangkun
>            Priority: Minor
>
> For now, if the worker process does not exist any more, the Supervisor has to wait a few seconds for the worker heartbeat timeout before it restarts the worker.
> If the supervisor knows the worker process id and checks whether the process exists in the sync-processes thread, it may need less time to restart the worker.
> 1: record the worker process id in the worker local heartbeat
> 2: in supervisor sync-processes, get the process id from the worker local heartbeat and check whether the process exists
> 3: if not, restart it immediately

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
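
For illustration only, the alternative suggested in the review comment above (keep the Process handle from launch time and ask it whether the process has exited, rather than shelling out to ps) could be sketched roughly as follows. This is not code from the pull request or from Storm: the class name `WorkerProcessCache` and all of its methods are hypothetical, and the sketch assumes Java 8 or later for `Process.isAlive()`.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of Storm: remembers the Process handle
// of every worker it launched so liveness can be checked without ps.
class WorkerProcessCache {
    // Worker id -> handle returned by ProcessBuilder.start().
    private final Map<String, Process> workers = new ConcurrentHashMap<>();

    /** Launches the given command and caches its Process under workerId. */
    Process launch(String workerId, List<String> command) throws IOException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        track(workerId, process);
        return process;
    }

    /** Caches an already started Process under workerId. */
    void track(String workerId, Process process) {
        workers.put(workerId, process);
    }

    /** True only if a handle is cached for workerId and it has not exited. */
    boolean isAlive(String workerId) {
        Process process = workers.get(workerId);
        return process != null && process.isAlive();
    }

    /** Ids of cached workers whose process has exited; candidates for restart. */
    List<String> findExited() {
        List<String> exited = new ArrayList<>();
        for (Map.Entry<String, Process> entry : workers.entrySet()) {
            if (!entry.getValue().isAlive()) {
                exited.add(entry.getKey());
            }
        }
        return exited;
    }

    /** Drops the cached handle, for example just before relaunching the worker. */
    void forget(String workerId) {
        workers.remove(workerId);
    }
}
```

A periodic sync step could call `findExited()` and relaunch those workers right away instead of waiting for the heartbeat timeout. One limitation of this approach is that the handles live only in the launching JVM, so a restarted supervisor would not have them and would presumably still need another check, such as the heartbeat.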