[
https://issues.apache.org/jira/browse/MESOS-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Benjamin Mahler updated MESOS-540:
----------------------------------
Labels: starter_project (was: )
> Executor health checking.
> -------------------------
>
> Key: MESOS-540
> URL: https://issues.apache.org/jira/browse/MESOS-540
> Project: Mesos
> Issue Type: Improvement
> Reporter: Benjamin Mahler
> Labels: starter_project
>
> We currently do not health check running executors.
> At Twitter, this has led to out-of-band health checking of executors for an
> internal framework.
> For the Storm framework, this has led to out-of-band health checking via
> ZooKeeper. Health checking would allow Storm to use finer-grained executors
> for better isolation.
> This would also help the Hadoop and Jenkins frameworks, should health
> checking be desired.
> As for implementation, I would propose adding a call on the Executor
> interface:
> /**
>  * Invoked by the ExecutorDriver to determine the health of the executor.
>  * When this function returns, the Executor is considered healthy.
>  */
> virtual void heartbeat(ExecutorDriver* driver) = 0;
> The driver can then send heartbeats periodically and kill the Executor when it
> stops responding. The driver should also detect the Executor deadlocking in
> any of the other callbacks.
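
To make the proposal concrete, here is a rough sketch of how a driver might decide whether an executor answered a heartbeat in time. It is not Mesos code: the simplified Executor interface (without the ExecutorDriver* argument), the isHealthy helper, the timeout parameter, and the two example executors are all assumptions made for illustration. The heartbeat runs on a separate thread so that a hung executor cannot block the caller.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical, simplified stand-in for the proposed Executor interface.
// The real proposal passes an ExecutorDriver*; it is omitted here.
class Executor {
public:
  virtual ~Executor() = default;

  // Returning from this call signals that the executor is healthy.
  virtual void heartbeat() = 0;
};

// Hypothetical helper: returns true if heartbeat() returns within `timeout`.
// The heartbeat runs on a detached thread, so a hung executor does not
// block the caller. The shared_ptr keeps the executor alive until the
// heartbeat call finishes, even if the caller has already given up.
bool isHealthy(std::shared_ptr<Executor> executor,
               std::chrono::milliseconds timeout) {
  assert(executor != nullptr);
  assert(timeout.count() > 0);

  std::packaged_task<void()> task([executor] { executor->heartbeat(); });
  std::future<void> done = task.get_future();
  std::thread(std::move(task)).detach();

  return done.wait_for(timeout) == std::future_status::ready;
}

// Example executor that answers immediately.
class ResponsiveExecutor : public Executor {
public:
  void heartbeat() override {}
};

// Example executor that simulates a hang by sleeping for half a second.
class StuckExecutor : public Executor {
public:
  void heartbeat() override {
    std::this_thread::sleep_for(std::chrono::milliseconds(500));
  }
};
```

With these definitions, isHealthy(std::make_shared<ResponsiveExecutor>(), std::chrono::milliseconds(200)) should report a healthy executor, while the same call on a StuckExecutor with a 20 ms timeout should report an unhealthy one. A real driver would repeat this check on a timer and kill the executor after a missed heartbeat; the same timeout pattern could in principle wrap the other callbacks to detect deadlocks there.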
--
This message was sent by Atlassian JIRA
(v6.1.4#6159)