David Robinson created MESOS-5200:
-------------------------------------
Summary: agent->master messages use temporary TCP connections
Key: MESOS-5200
URL: https://issues.apache.org/jira/browse/MESOS-5200
Project: Mesos
Issue Type: Bug
Reporter: David Robinson
Background info: When an agent is started it starts a background task
(libprocess process?) to detect the leading master. When the leading master is
detected (or changes) the [SocketManager's link()
method|https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/process.cpp#L1415]
[is
called|https://github.com/apache/mesos/blob/master/src/slave/slave.cpp#L942]
and a TCP connection to the master is established. The connection is used by
the agent to send messages to the master, and the master, upon receiving a
RegisterSlaveMessage/ReregisterSlaveMessage, establishes another TCP connection
back to the agent. Each TCP connection is uni-directional, the agent writes
messages on one connection and reads messages from the other, and the master
reads/writes from the opposite ends of the connections.
If the initial TCP connection to the master fails to be established then
temporary connections are used for all agent->master messages; each send()
causes a new TCP connection to be setup, the message sent, then the connection
torn down. If link() succeeds a persistent TCP connection is used instead.
If agents do not use ZK to detect the master then the master detector "detects"
the master immediately and attempts to connect immediately. The master may not
be listening for connections at the time, or it could be overwhelmed w/ TCP
connection attempts, therefore the initial TCP connection attempt fails. The
agent does not attempt to establish a new persistent connection as link() is
only called when a new master is detected, which only occurs once unless ZK is
used.
It's possible for agents to overwhelm a master w/ TCP connections such that
agents cannot establish connections. When this occurs pong messages may not be
received by the master so the master shuts down agents thus killing any tasks
they were running. We have witnessed this scenario during scale/load tests at
Twitter.
The problem is trivial to reproduce: configure an agent to use a certain master
(--master=10.20.30.40:5050), start the agent, wait several minutes then start
the master. All the agent->master messages will occur over temporary
connections.
The problem occurs less frequently in production because ZK is typically used
for master detection and a master only registers in ZK after it has started
listening on its socket. However, the scenario described above can also occur
when ZK is used – a thundering herd of 10,000+ slaves establishing TCP
connections to the master can result in some connection attempts failing and
agents using temporary connections.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)