[
https://issues.apache.org/jira/browse/STORM-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14988086#comment-14988086
]
ASF GitHub Bot commented on STORM-1155:
---------------------------------------
GitHub user zhuoliu commented on the pull request:
https://github.com/apache/storm/pull/849#issuecomment-153483169
+1
> Supervisor recurring health checks
> ----------------------------------
>
> Key: STORM-1155
> URL: https://issues.apache.org/jira/browse/STORM-1155
> Project: Apache Storm
> Issue Type: Improvement
> Components: storm-core
> Reporter: Thomas Graves
> Assignee: Thomas Graves
>
> Add the ability for the supervisor to call out to health check scripts to
> validate the health of the node the supervisor is running on.
> It could regularly run scripts in a directory provided by the cluster admin.
> If any script reports a failure, it should kill the workers and stop itself.
> This could work very much like the Hadoop health check scripts: if ERROR is
> printed to stdout, the node has some issue and we should shut down.
> A non-zero exit code indicates that the script itself failed to execute
> properly, so the node should not be marked as unhealthy.
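
The decision rule described above can be sketched roughly as follows. This is only an illustration in Python, not the code from the pull request: the names `classify`, `run_health_checks`, and `node_is_healthy`, the 30-second timeout, and the choice to match lines that start with "ERROR" are all assumptions. Treating a timeout the same as a script that failed to run is also an assumption; the issue text does not say how timeouts should be handled.

```python
import os
import subprocess

HEALTHY, UNHEALTHY, SCRIPT_FAILED = "healthy", "unhealthy", "script_failed"


def classify(exit_code, stdout):
    """Map one script's exit code and stdout to a health result."""
    # A non-zero exit code means the script itself did not run properly,
    # so it says nothing about the node and must not mark it unhealthy.
    if exit_code != 0:
        return SCRIPT_FAILED
    # Assumption: a line starting with "ERROR" signals a node problem.
    if any(line.startswith("ERROR") for line in stdout.splitlines()):
        return UNHEALTHY
    return HEALTHY


def run_health_checks(script_dir, timeout_secs=30):
    """Run every executable file in script_dir and classify each result."""
    results = {}
    for name in sorted(os.listdir(script_dir)):
        path = os.path.join(script_dir, name)
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue  # skip subdirectories and non-executable files
        try:
            proc = subprocess.run(
                [path], capture_output=True, text=True, timeout=timeout_secs
            )
            results[name] = classify(proc.returncode, proc.stdout)
        except (subprocess.TimeoutExpired, OSError):
            # Assumption: a hung or unlaunchable script counts as a
            # script failure, not as evidence of an unhealthy node.
            results[name] = SCRIPT_FAILED
    return results


def node_is_healthy(results):
    """The node is unhealthy only if some script explicitly reported ERROR."""
    return UNHEALTHY not in results.values()
```

A supervisor loop would call `run_health_checks` on a timer and, when `node_is_healthy` returns False, kill its workers and stop itself. Keeping `classify` separate from the process handling makes the ERROR versus exit-code distinction easy to test without running real scripts.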