Re: Detecting slave crashes event

Benjamin Mahler Wed, 16 Sep 2015 10:31:45 -0700

You can detect when we remove an agent due to health check failures via the
metrics endpoint, but these are counters, so they are better suited to
alerting and dashboards than to identifying individual agents. If you need
to know which agents were removed, you can also consume the logs as a
stop-gap solution, until we offer a mechanism for subscribing to cluster
events.
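
As a rough sketch of the counter-based approach, something like the
following could poll the master and report when the removal counter goes
up. It assumes the master is reachable at localhost:5050, that the
snapshot is served at /metrics/snapshot, and that the counter is named
master/slave_removals; check all three against your own deployment. The
function names are made up for illustration.

```python
import json
import time
import urllib.request

# Assumptions: default master address and counter name. Verify both
# against the snapshot output of your own master before relying on them.
SNAPSHOT_URL = "http://localhost:5050/metrics/snapshot"
REMOVALS_KEY = "master/slave_removals"


def fetch_snapshot(url=SNAPSHOT_URL, timeout=5.0):
    """Return the master's metrics snapshot as a dict of name -> number."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def new_removals(previous, current, key=REMOVALS_KEY):
    """Return how many removals happened between two snapshots.

    A counter that is lower than before is treated as a master restart
    or failover, so the current value is reported as-is.
    """
    before = int(previous.get(key, 0))
    after = int(current.get(key, 0))
    return after - before if after >= before else after


def watch(interval=30.0):
    """Poll forever and print a line whenever the counter increases."""
    previous = fetch_snapshot()
    while True:
        time.sleep(interval)
        current = fetch_snapshot()
        delta = new_removals(previous, current)
        if delta > 0:
            print("%d agent(s) removed since last poll" % delta)
        previous = current


if __name__ == "__main__":
    watch()
```

This only tells you that removals happened, not which agents were
involved, which is why the logs are still needed for that part.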


On Wed, Sep 16, 2015 at 10:11 AM, Paul Bell <[email protected]> wrote:

> Hi All,
>
> I am led to believe that, unlike Marathon, Mesos doesn't (yet?) offer a
> subscribable event bus.
>
> So I am wondering if there's a best practices way of determining if a
> slave node has crashed. By "crashed" I mean something like the power plug
> got yanked, or anything that would cause Mesos to stop talking to the slave
> node.
>
> I suppose such information would be recorded in /var/log/mesos.
>
> Interested to learn how best to detect this.
>
> Thank you.
>
> -Paul
>
