Benjamin Mahler created MESOS-7187:
--------------------------------------
Summary: Master can neglect to update agent metadata in a
re-registration corner case.
Key: MESOS-7187
URL: https://issues.apache.org/jira/browse/MESOS-7187
Project: Mesos
Issue Type: Bug
Components: master, technical debt
Reporter: Benjamin Mahler
If the agent is re-registering with the master for the first time, the master
will drop any re-registration messages that arrive while the registry operation
is in progress.
The dropped messages can carry different metadata (e.g. version, capabilities)
from the message being processed, and that metadata is lost. Since the master
doesn't distinguish between different instances of the agent (both share the
same UPID and there is no instance-identifying information), the master can't
tell whether a message is a retry from the original instance of the agent or a
re-registration from a new instance of the agent.
The following is an example:
(1) Master restarts.
(2) Agent re-registers with OLD_VERSION / OLD_CAPABILITIES.
(3) While registry operation is in progress, agent is upgraded and re-registers
with NEW_VERSION / NEW_CAPABILITIES.
(4) Registry operation completes; the new agent receives the re-registration
acknowledgement message and therefore does not retry.
(5) The master's in-memory state now reflects OLD_VERSION / OLD_CAPABILITIES
for the agent, and remains inconsistent until a later re-registration occurs.
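
The sequence above can be reproduced with a small toy model. This is only an
illustrative sketch in Python, not the actual master code (which is C++):
ToyMaster, ReregisterMessage, and their methods are hypothetical names, and the
model assumes the master tracks in-flight re-registrations in a set keyed by
agent ID and drops any message for an agent already in that set.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ReregisterMessage:
    # Hypothetical stand-in for the agent's re-registration message.
    agent_id: str
    version: str
    capabilities: frozenset


@dataclass
class ToyMaster:
    # Agents whose registry operation is still in progress (assumed structure).
    reregistering: set = field(default_factory=set)
    # Message accepted for each in-flight registry operation.
    pending: dict = field(default_factory=dict)
    # Metadata the master holds in memory once re-registration completes.
    agents: dict = field(default_factory=dict)

    def reregister(self, msg: ReregisterMessage) -> bool:
        """Return True if the message is accepted, False if it is dropped."""
        if msg.agent_id in self.reregistering:
            # The master cannot tell a retry from a new agent instance,
            # so the message and its metadata are dropped.
            return False
        self.reregistering.add(msg.agent_id)
        self.pending[msg.agent_id] = msg
        return True

    def registry_operation_done(self, agent_id: str) -> None:
        """Complete the registry operation and record the accepted metadata."""
        self.reregistering.discard(agent_id)
        self.agents[agent_id] = self.pending.pop(agent_id)


# (1) Master restarts with empty in-memory state.
master = ToyMaster()

# (2) Agent re-registers with the old metadata.
old = ReregisterMessage("agent-1", "OLD_VERSION", frozenset({"OLD_CAPABILITIES"}))
accepted_old = master.reregister(old)

# (3) Upgraded agent re-registers while the registry operation is in progress.
new = ReregisterMessage("agent-1", "NEW_VERSION", frozenset({"NEW_CAPABILITIES"}))
accepted_new = master.reregister(new)

# (4) Registry operation completes; the agent is acknowledged and stops retrying.
master.registry_operation_done("agent-1")

# (5) The master's in-memory metadata is that of the old instance.
recorded = master.agents["agent-1"]
```

In this model, one conceivable mitigation would be to keep the most recent
message for an in-flight agent rather than dropping it; whether that is
appropriate for the real master is not established by this report.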