Benjamin Mahler created MESOS-808:
-------------------------------------

             Summary: The scheduler driver should queue messages when disconnected.
                 Key: MESOS-808
                 URL: https://issues.apache.org/jira/browse/MESOS-808
             Project: Mesos
          Issue Type: Improvement
            Reporter: Benjamin Mahler


Currently, when a scheduler tries to take an action while the driver is 
disconnected (i.e. Scheduler::disconnected has been called), the driver 
drops the request.

In the case of launching a task, we'll reply with TASK_LOST directly in the 
driver. However, with things like killTask, we simply drop the kill task 
request.

This behavior seems a little unfriendly for schedulers, as they need to 
queue any operations themselves until the driver reconnects (i.e. until 
Scheduler::reregistered is called). We should consider queuing in the 
driver instead.

The implementation here can consist of a queue<Message> holding the messages 
that were constructed while !connected. Once we re-connect, we simply run 
through this queue sending all messages.
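
As a rough illustration of the queueing idea above, here is a minimal sketch. 
It is not the driver's actual code: Message, QueueingDriver, Transport, 
send, reconnected and pending are hypothetical names invented for this 
example, and the transport callback stands in for whatever actually sends a 
message to the master. The sketch assumes the driver starts out disconnected 
and that queued messages should be delivered in the order they were created.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>

// Hypothetical stand-in for a driver-to-master message; not a Mesos type.
struct Message {
  std::string type;     // e.g. "killTask"
  std::string payload;  // e.g. a task ID
};

// Hypothetical driver sketch: buffers outgoing messages while disconnected.
class QueueingDriver {
public:
  // Assumed hook that performs the real send to the master.
  using Transport = std::function<void(const Message&)>;

  explicit QueueingDriver(Transport transport)
    : transport_(std::move(transport)) {}

  // Sends immediately when connected, otherwise queues the message.
  void send(Message message) {
    if (connected_) {
      transport_(message);
    } else {
      pending_.push(std::move(message));
    }
  }

  // Called when the connection to the master is lost.
  void disconnected() { connected_ = false; }

  // Called on re-connection: flushes the queued messages in FIFO order.
  void reconnected() {
    connected_ = true;
    while (!pending_.empty()) {
      transport_(pending_.front());
      pending_.pop();
    }
  }

  // Number of messages still waiting for a connection.
  std::size_t pending() const { return pending_.size(); }

private:
  Transport transport_;
  bool connected_ = false;  // Assumption: disconnected until first connect.
  std::queue<Message> pending_;
};
```

With this shape, a scheduler could call send({"killTask", "task-1"}) at any 
time, and the message would go out either immediately or on the next call to 
reconnected(). The queue lives only in memory, so anything still pending is 
lost if the scheduler process exits before reconnecting.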

However, without persistent state in the driver, schedulers will have to 
live with the possibility of dropped messages anyway (i.e. if a scheduler 
fails while disconnected, any queued messages will be lost).



--
This message was sent by Atlassian JIRA
(v6.1#6144)
