Benjamin Mahler created MESOS-808:
-------------------------------------
Summary: The scheduler driver should queue messages when
disconnected.
Key: MESOS-808
URL: https://issues.apache.org/jira/browse/MESOS-808
Project: Mesos
Issue Type: Improvement
Reporter: Benjamin Mahler
Currently, when a scheduler tries to take an action while the driver is
disconnected (i.e. the Scheduler::disconnected callback has been invoked), the
driver drops the request.
In the case of launching a task, we'll reply with TASK_LOST directly in the
driver. However, with things like killTask, we simply drop the kill task
request.
This behavior seems a little unfriendly for schedulers, as they need to
queue any operations themselves until the driver reconnects (i.e. until
Scheduler::reregistered is called). We should consider queueing in the driver
instead.
The implementation here can consist of a queue<Message> holding the messages
that were constructed while !connected. Once we re-connect, we simply run
through this queue sending all messages.
However, without persistent state in the driver, schedulers will still have to
live with the possibility of dropped messages (e.g. if the scheduler fails
while disconnected, any queued messages will be lost).
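
To make the queueing proposal concrete, here is a minimal sketch of the idea,
not the actual driver code. Message, QueueingSender, and the transport
callback are hypothetical names invented for illustration; the real driver
would hook this into its own connect/disconnect handling.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for a driver message (e.g. a kill task request).
struct Message {
  std::string body;
};

// Hypothetical helper, not part of Mesos: sends messages immediately while
// connected and buffers them in FIFO order while disconnected.
class QueueingSender {
public:
  explicit QueueingSender(std::function<void(const Message&)> transport)
    : transport_(std::move(transport)) {}

  // Send now if connected, otherwise hold the message for later.
  void send(Message message) {
    if (connected_) {
      transport_(message);
    } else {
      pending_.push(std::move(message));
    }
  }

  // Called on (re-)connection: drain everything queued while disconnected.
  void connected() {
    connected_ = true;
    flush();
  }

  void disconnected() { connected_ = false; }

  std::size_t pending() const { return pending_.size(); }

private:
  void flush() {
    assert(connected_);
    while (!pending_.empty()) {
      transport_(pending_.front());
      pending_.pop();
    }
  }

  std::function<void(const Message&)> transport_;
  std::queue<Message> pending_;  // Lives in memory only; lost on failure.
  bool connected_ = false;
};
```

Because the queue is held only in memory, this sketch has exactly the
limitation noted above: anything still queued is lost if the scheduler process
fails before reconnecting.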