Xudong Ni created MESOS-10032:
---------------------------------
Summary: Mesos agent should proactively sever master connection
when failing to detect the leading master
Key: MESOS-10032
URL: https://issues.apache.org/jira/browse/MESOS-10032
Project: Mesos
Issue Type: Improvement
Reporter: Xudong Ni
We have observed that this often happens when an agent loses its ZK connection,
resets its master to None, and begins dropping messages from the master because
it can't verify that the messages are valid.
This has caused Jarvis to be unable to kill tasks (and they aren't counted as
unreachable because the master can still reach the agent).
A reasonable solution is for the agent to disconnect from the master upon
resetting the master it tracks since it's just going to drop control messages.
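
To illustrate the proposal, here is a minimal toy model in Python. It is only a sketch of the idea, not the Mesos agent code: the class `ToyAgent`, its fields, and its method names are all hypothetical, and the real agent is written in C++. The sketch assumes the agent validates each message by comparing the sender against the master it currently tracks.

```python
from typing import List, Optional


class ToyAgent:
    """Hypothetical toy model of an agent; not the real Mesos agent."""

    def __init__(self) -> None:
        # Master the agent currently tracks (None when detection has failed).
        self.master: Optional[str] = None
        # Master the agent holds an open connection to, if any.
        self.connected_to: Optional[str] = None
        self.handled: List[str] = []
        self.dropped: List[str] = []

    def detected(self, leader: Optional[str]) -> None:
        """Called with the new leader, or None when no leader can be detected."""
        self.master = leader
        # Proposed behavior: when the tracked master is reset to None, also
        # sever the connection instead of keeping it open and silently
        # dropping every control message that arrives on it.
        self.connected_to = leader

    def receive(self, sender: str, message: str) -> bool:
        """Handle a message only if it comes from the tracked master."""
        if self.master != sender:
            self.dropped.append(message)
            return False
        self.handled.append(message)
        return True


agent = ToyAgent()
agent.detected("master-1")
agent.receive("master-1", "KillTask")  # handled: sender matches tracked master
agent.detected(None)                   # detection failed: connection is severed
agent.receive("master-1", "KillTask")  # dropped: no tracked master to verify against
```

In this model, once `detected(None)` runs, `connected_to` is also cleared. The point of the proposal is that the master would then see the disconnection instead of continuing to send control messages that the agent drops.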