Benjamin Mahler created MESOS-7187:
--------------------------------------

             Summary: Master can neglect to update agent metadata in a 
re-registration corner case.
                 Key: MESOS-7187
                 URL: https://issues.apache.org/jira/browse/MESOS-7187
             Project: Mesos
          Issue Type: Bug
          Components: master, technical debt
            Reporter: Benjamin Mahler


If the agent is re-registering with the master for the first time, the master 
will drop any re-registration messages that arrive while the registry operation 
is in progress.

A dropped message can carry different metadata (e.g. version, capabilities) 
than the message being processed, and that metadata is lost along with it. 
Since the master doesn't distinguish between different instances of the agent 
(both share the same UPID and there is no instance-identifying information), 
it can't tell whether a message is a retry from the original instance of the 
agent or a re-registration from a new instance of the agent.

The following is an example:

(1) Master restarts.
(2) Agent re-registers with OLD_VERSION / OLD_CAPABILITIES.
(3) While the registry operation is in progress, the agent is upgraded and 
re-registers with NEW_VERSION / NEW_CAPABILITIES. The master drops this message.
(4) The registry operation completes; the new agent instance receives the 
re-registration acknowledgement and therefore does not retry.
(5) The master's in-memory state now reflects OLD_VERSION / OLD_CAPABILITIES 
for the agent, and it remains inconsistent with the running agent until a 
later re-registration occurs.
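
The sequence above can be reproduced with a small toy model. This is only an 
illustrative sketch in Python, not the Mesos master code: the names ToyMaster, 
ReregisterMessage, reregister and registry_operation_completed, the agent ID 
and the UPID string are all made up for the example, and only the first 
re-registration path after a master restart is modeled.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReregisterMessage:
    # Hypothetical stand-in for an agent re-registration message.
    agent_id: str
    upid: str                 # identical for the old and new agent instance
    version: str
    capabilities: frozenset


@dataclass
class ToyMaster:
    # agent_id -> message whose metadata the master has recorded in memory
    agents: dict = field(default_factory=dict)
    # agent_id -> message whose registry operation is still in progress
    reregistering: dict = field(default_factory=dict)
    # messages dropped while a registry operation was in progress
    dropped: list = field(default_factory=list)

    def reregister(self, msg):
        if msg.agent_id in self.reregistering:
            # A registry operation is already in flight for this agent.
            # Nothing identifies the sending instance, so the message is
            # treated like a retry and dropped, metadata included.
            self.dropped.append(msg)
            return
        self.reregistering[msg.agent_id] = msg

    def registry_operation_completed(self, agent_id):
        # The master records the metadata of the message that started the
        # operation and acknowledges whoever now sits behind the UPID.
        msg = self.reregistering.pop(agent_id)
        self.agents[agent_id] = msg
        return "ack"


master = ToyMaster()  # (1) master restarted, in-memory state is empty

old = ReregisterMessage("agent-1", "slave(1)@10.0.0.5:5051",
                        "OLD_VERSION", frozenset({"OLD_CAPABILITIES"}))
new = ReregisterMessage("agent-1", "slave(1)@10.0.0.5:5051",
                        "NEW_VERSION", frozenset({"NEW_CAPABILITIES"}))

master.reregister(old)                          # (2)
master.reregister(new)                          # (3) dropped
master.registry_operation_completed("agent-1")  # (4) new instance gets the ack

print(master.agents["agent-1"].version)         # (5) prints OLD_VERSION
```

In this sketch the only thing that differs between the two messages is the 
metadata itself, which is why the toy master has nothing to key on when 
deciding whether the second message is a retry or comes from a new instance.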



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
