[ https://issues.apache.org/jira/browse/MESOS-10032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16970584#comment-16970584 ]
Xudong Ni commented on MESOS-10032:
-----------------------------------

https://reviews.apache.org/r/71742/

> Mesos agent should sever proactively master connection when failing to detect the leading master
> ------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-10032
>                 URL: https://issues.apache.org/jira/browse/MESOS-10032
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Xudong Ni
>            Assignee: Xudong Ni
>            Priority: Major
>
> We have observed that this often happens when agents lose their ZK
> connections, reset their tracked master to None, and begin dropping messages
> from the master because they can't verify that the messages are valid.
> This has caused Jarvis to be unable to kill tasks (and they aren't counted as
> unreachable because the master can still reach the agent).
> A reasonable solution is for the agent to disconnect from the master when it
> resets the master it tracks, since it is just going to drop control messages.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)