I believe some of the contributors from Mesosphere have been thinking about it, but I'm not sure what the plans are. I'll let them reply here.
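
Until something like that exists, the polling approach from the quoted thread could look roughly like the sketch below. Treat it as a minimal illustration, not a recommended implementation: the master address is a made-up placeholder, and the `/metrics/snapshot` path and the `master/slave_removals` counter name are assumptions on my part that should be checked against the Mesos version you run. It only reports that removals happened, not which agents were removed; for that you would still need the logs.

```python
import json
import time
import urllib.request

# Assumption: hypothetical master address; replace with your own.
SNAPSHOT_URL = "http://mesos-master.example.com:5050/metrics/snapshot"
# Assumption: counter name for removed agents; verify for your Mesos version.
REMOVALS_KEY = "master/slave_removals"


def fetch_snapshot(url=SNAPSHOT_URL, timeout=5.0):
    """Fetch the metrics snapshot and return it as a dict of name -> value."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def new_removals(previous, current, key=REMOVALS_KEY):
    """Return how many removals were counted between two snapshots."""
    before = previous.get(key, 0)
    after = current.get(key, 0)
    # Counters start over when the master restarts or fails over, so a
    # lower value is treated as a fresh count rather than a negative delta.
    if after < before:
        return int(after)
    return int(after - before)


def watch(interval=30.0):
    """Poll forever and print a line whenever the removal counter grows."""
    previous = fetch_snapshot()
    while True:
        time.sleep(interval)
        current = fetch_snapshot()
        delta = new_removals(previous, current)
        if delta:
            print("%d agent(s) removed since the last poll" % delta)
        previous = current
```

Calling `watch()` starts the loop. The comparison is kept in `new_removals` so it can be checked without a running master.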

On Wed, Sep 16, 2015 at 11:11 AM, Paul Bell <[email protected]> wrote:

> Thank you, Benjamin.
>
> So, I could periodically request the metrics endpoint, or stream the logs
> (maybe via mesos.cli; or SSH)? What, roughly, does the "agent removed"
> message look like in the logs?
>
> Are there plans to offer a mechanism for event subscription?
>
> Cordially,
>
> Paul
>
> On Wed, Sep 16, 2015 at 1:30 PM, Benjamin Mahler <[email protected]> wrote:
>
>> You can detect when we remove an agent due to health check failures via
>> the metrics endpoint, but these are counters that are better used for
>> alerting / dashboards for visibility. If you need to know which agents, you
>> can also consume the logs as a stop-gap solution, until we offer a
>> mechanism for subscribing to cluster events.
>>
>> On Wed, Sep 16, 2015 at 10:11 AM, Paul Bell <[email protected]> wrote:
>>
>>> Hi All,
>>>
>>> I am led to believe that, unlike Marathon, Mesos doesn't (yet?) offer a
>>> subscribable event bus.
>>>
>>> So I am wondering if there's a best practices way of determining if a
>>> slave node has crashed. By "crashed" I mean something like the power plug
>>> got yanked, or anything that would cause Mesos to stop talking to the slave
>>> node.
>>>
>>> I suppose such information would be recorded in /var/log/mesos.
>>>
>>> Interested to learn how best to detect this.
>>>
>>> Thank you.
>>>
>>> -Paul

