Gastón Kleiman created MESOS-8987:
-------------------------------------
Summary: Master asks agent to shutdown upon auth errors
Key: MESOS-8987
URL: https://issues.apache.org/jira/browse/MESOS-8987
Project: Mesos
Issue Type: Bug
Components: master, security
Affects Versions: 1.6.0, 1.5.1, 1.4.1, 1.7.0
Reporter: Gastón Kleiman
The Mesos master sends a {{ShutdownMessage}} to an agent if there is an
[authentication|https://github.com/apache/mesos/blob/d733b1031350e03bce443aa287044eb4eee1053a/src/master/master.cpp#L6532-L6543]
or an
[authorization|https://github.com/apache/mesos/blob/d733b1031350e03bce443aa287044eb4eee1053a/src/master/master.cpp#L6622-L6633]
error during agent registration.
Upon receipt of this message, the agent kills all its tasks and shuts itself
down. This means that a transient auth error can take down a whole agent
along with its tasks.
I think the master should stop sending the {{ShutdownMessage}} in these cases,
or at least let the agent retry the registration a few times before asking it
to shut down.
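
To make the second option concrete, here is a minimal sketch of a master-side retry budget. It is not taken from the Mesos code base: {{RegistrationRetryPolicy}}, {{AuthFailureAction}}, {{onAuthFailure}}, {{onRegistered}} and the {{maxFailures}} limit are all hypothetical names chosen for illustration, and the agent is identified by a plain string rather than a real Mesos type.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical decision the master could take after a failed
// authentication or authorization during agent registration.
enum class AuthFailureAction {
  AllowRetry,   // Reject this attempt, but let the agent try again.
  SendShutdown  // Retry budget exhausted: send the ShutdownMessage.
};

// Hypothetical helper (not part of Mesos): counts consecutive auth
// failures per agent and escalates only once `maxFailures` is reached.
class RegistrationRetryPolicy {
public:
  explicit RegistrationRetryPolicy(std::size_t maxFailures)
    : maxFailures_(maxFailures) {}

  // Called when an agent's registration fails authentication or
  // authorization. Returns what the master should do next.
  AuthFailureAction onAuthFailure(const std::string& agentId) {
    const std::size_t failures = ++failures_[agentId];
    if (failures >= maxFailures_) {
      failures_.erase(agentId);  // Forget the agent once we escalate.
      return AuthFailureAction::SendShutdown;
    }
    return AuthFailureAction::AllowRetry;
  }

  // Called when an agent registers successfully, so that earlier
  // transient failures no longer count against it.
  void onRegistered(const std::string& agentId) {
    failures_.erase(agentId);
  }

private:
  std::size_t maxFailures_;
  std::unordered_map<std::string, std::size_t> failures_;
};
```

With a limit of 3, the first two failed attempts from an agent would yield {{AllowRetry}} and only the third would yield {{SendShutdown}}. A successful registration in between resets the count. Whether the counter should be per agent, time-bounded, or handled on the agent side instead is left open here.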