David Robinson created MESOS-5376:
-------------------------------------
Summary: Add systemd watchdog support
Key: MESOS-5376
URL: https://issues.apache.org/jira/browse/MESOS-5376
Project: Mesos
Issue Type: Improvement
Reporter: David Robinson
It would be great if Mesos supported systemd's
[watchdog|http://0pointer.de/blog/projects/watchdog.html]. Users typically
use a supervisor like [monit|https://mmonit.com/monit/] to poll the agent's
or master's /health endpoint and restart the process after consecutive
failures. Systemd does not poll services; instead, it relies on a watchdog,
where the service itself periodically signals liveness. Supervisor solutions
like monit could be replaced with systemd if Mesos had watchdog support. Note
that simply restarting the service upon failure (i.e., when the process
exits) is not sufficient: a deadlock within Mesos would not cause the process
to exit, but a watchdog could detect it.
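
For illustration only, the service side of the watchdog protocol can be sketched roughly as follows. This is not Mesos code. The function names are hypothetical, and the sketch assumes the behaviour documented for sd_notify(3) and sd_watchdog_enabled(3): systemd exports NOTIFY_SOCKET and WATCHDOG_USEC (and optionally WATCHDOG_PID) to the service, and the service sends the datagram "WATCHDOG=1" to that socket, commonly at about half the configured timeout.

```python
import os
import socket


def watchdog_interval_seconds(environ=os.environ):
    """Return a suggested ping interval (half of WATCHDOG_USEC), or None.

    None means the watchdog is not enabled for this process.
    """
    usec = environ.get("WATCHDOG_USEC")
    if not usec:
        return None
    # WATCHDOG_PID, when set, names the process expected to send pings.
    pid = environ.get("WATCHDOG_PID")
    if pid and int(pid) != os.getpid():
        return None
    return int(usec) / 1_000_000 / 2


def sd_notify(message, environ=os.environ):
    """Send one notification datagram to systemd.

    Returns False when NOTIFY_SOCKET is unset (not running under systemd).
    """
    path = environ.get("NOTIFY_SOCKET")
    if not path:
        return False
    # A leading '@' denotes a Linux abstract-namespace socket.
    addr = b"\0" + path[1:].encode() if path.startswith("@") else path
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message.encode(), addr)
    return True


def ping_watchdog_if_healthy(is_healthy, environ=os.environ):
    """Send WATCHDOG=1 only when the (hypothetical) health check passes."""
    if not is_healthy():
        return False
    return sd_notify("WATCHDOG=1", environ)
```

For the ping to catch a deadlock, it would have to be driven from the same event loop that does the real work, so that a stuck loop stops sending pings. On the systemd side, the unit would need WatchdogSec= set, plus a Restart= policy such as on-watchdog, for a missed ping to trigger a restart.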