Thomas Graves created STORM-1155:
------------------------------------

             Summary: Supervisor recurring health checks
                 Key: STORM-1155
                 URL: https://issues.apache.org/jira/browse/STORM-1155
             Project: Apache Storm
          Issue Type: Improvement
          Components: storm-core
            Reporter: Thomas Graves
            Assignee: Thomas Graves


Add the ability for the supervisor to call out to health check scripts to allow 
some validation of the health of the node the supervisor is running on.

The supervisor could periodically run the scripts in a directory provided by the 
cluster admin. If any script reports a problem, it should kill the workers and 
stop itself.

This could work much like the Hadoop health scripts: if a script returns ERROR 
on stdout, the node has some issue and the supervisor should shut down.

A non-zero exit code, on the other hand, indicates that the script itself failed 
to execute properly, so it should not cause the node to be marked unhealthy.
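
As a rough illustration of the rules above, here is a minimal Python sketch of how one script run might be classified. It is not the proposed Storm implementation. The names classify and node_is_healthy, the timeout, matching ERROR at the start of an output line, and checking the exit code before the output are all assumptions made for the example.

```python
import os
import subprocess


def classify(exit_code, stdout):
    """Classify one health check script run (hypothetical helper)."""
    if exit_code != 0:
        # The script failed to execute properly, so the node is not blamed.
        # Assumption: the exit code is checked before the output.
        return "script_failed"
    # Assumption: "ERROR on stdout" means some line starts with ERROR.
    if any(line.startswith("ERROR") for line in stdout.splitlines()):
        return "unhealthy"
    return "healthy"


def node_is_healthy(script_dir, timeout_secs=30):
    """Run every executable file in script_dir; False if any reports ERROR."""
    for name in sorted(os.listdir(script_dir)):
        path = os.path.join(script_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue
        try:
            result = subprocess.run(
                [path], capture_output=True, text=True, timeout=timeout_secs
            )
        except (OSError, subprocess.TimeoutExpired):
            # Assumption: a script that cannot start or times out is treated
            # like a failed script, not as an unhealthy node.
            continue
        if classify(result.returncode, result.stdout) == "unhealthy":
            return False
    return True
```

A supervisor loop could call node_is_healthy on a timer and, when it returns False, kill the workers and stop itself.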



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
