Neil Conway created MESOS-6869:
----------------------------------

             Summary: Master should not shutdown agents that re-register 
without authenticating
                 Key: MESOS-6869
                 URL: https://issues.apache.org/jira/browse/MESOS-6869
             Project: Mesos
          Issue Type: Bug
          Components: master
            Reporter: Neil Conway
            Priority: Minor


Currently, the master sends a {{ShutdownMessage}} to the agent if the agent 
attempts to register or re-register without first authenticating:

https://github.com/apache/mesos/blob/be127e6eca1312bf8c2b039646f6909fa42cd342/src/master/master.cpp#L5083
https://github.com/apache/mesos/blob/be127e6eca1312bf8c2b039646f6909fa42cd342/src/master/master.cpp#L5268

This causes running tasks to be killed unnecessarily in the following scenario:

* Agent authenticates successfully
* Agent's registration / re-registration message is dropped. Agent will then 
attempt to register / re-register again in a loop with backoff.
* Master eventually marks the agent unreachable because it fails health checks. 
This clears the {{authenticated}} state for the agent.
* Simultaneously, a re-registration attempt from the agent reaches the master. 
The master responds with {{ShutdownMessage}}, which is bad because it causes 
the agent to terminate all of its tasks.

Instead, the master should ignore the registration attempt. However, ignoring 
it is still problematic on its own, because (a) the agent will never attempt to 
re-authenticate, since it believes it has already authenticated, and (b) the 
agent will continue to respond to pings from the master and will not clear its 
local "I have authenticated" state.
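
The race above, and the remaining problem with simply ignoring the message, 
can be sketched with a toy model. This is not Mesos code: {{ToyMaster}}, 
{{Policy}}, {{Reply}}, {{Outcome}} and {{replayScenario}} are hypothetical 
names invented for illustration, and the model assumes only what this report 
states (marking an agent unreachable clears its authenticated state, and the 
agent never clears its own local flag).

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical toy model; none of these names come from the Mesos source.
enum class Reply { ACCEPT, SHUTDOWN, IGNORE };

// How the master treats a (re-)registration from an unauthenticated agent.
enum class Policy { SHUTDOWN_UNAUTHENTICATED, IGNORE_UNAUTHENTICATED };

class ToyMaster {
public:
  explicit ToyMaster(Policy policy) : policy_(policy) {}

  void authenticate(const std::string& agent) { authenticated_.insert(agent); }

  // Failing health checks marks the agent unreachable, which also
  // clears its authenticated state (as described in this report).
  void markUnreachable(const std::string& agent) { authenticated_.erase(agent); }

  Reply reregister(const std::string& agent) const {
    if (authenticated_.count(agent) > 0) {
      return Reply::ACCEPT;
    }
    return policy_ == Policy::SHUTDOWN_UNAUTHENTICATED ? Reply::SHUTDOWN
                                                       : Reply::IGNORE;
  }

private:
  Policy policy_;
  std::set<std::string> authenticated_;
};

struct Outcome {
  Reply reply;                   // What the master answers to the retry.
  bool agentWillReauthenticate;  // Whether the agent would authenticate again.
};

// Replays the scenario: the agent authenticates, its first re-registration
// is dropped, the master marks it unreachable, then a retry arrives.
Outcome replayScenario(Policy policy) {
  const std::string agent = "agent-1";
  ToyMaster master(policy);

  master.authenticate(agent);
  bool agentBelievesAuthenticated = true;  // Local agent flag, never cleared.

  // The first re-registration message is dropped, so the master never sees it.
  master.markUnreachable(agent);

  // A retried re-registration reaches the master after the state was cleared.
  Reply reply = master.reregister(agent);

  // Assumption: the agent re-authenticates only if it thinks it has not yet.
  return Outcome{reply, !agentBelievesAuthenticated};
}
```

Under these assumptions, {{replayScenario(Policy::SHUTDOWN_UNAUTHENTICATED)}} 
yields {{Reply::SHUTDOWN}} (the current behavior), while 
{{replayScenario(Policy::IGNORE_UNAUTHENTICATED)}} yields {{Reply::IGNORE}} 
with {{agentWillReauthenticate}} still false, which corresponds to problem (a).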



