Chun-Hung Hsiao created MESOS-9541:
--------------------------------------

             Summary: Transition agent operations to some "lost" state when the agent is removed.
                 Key: MESOS-9541
                 URL: https://issues.apache.org/jira/browse/MESOS-9541
             Project: Mesos
          Issue Type: Improvement
    Affects Versions: 1.7.1, 1.7.0, 1.6.1, 1.6.0, 1.5.2, 1.5.1, 1.5.0
            Reporter: Chun-Hung Hsiao


MESOS-8782 and MESOS-8783 transition operations to 
{{OPERATION_GONE_BY_OPERATOR}} or {{OPERATION_UNREACHABLE}} when their agents 
are marked as gone or unreachable respectively. However, there are other cases 
where agents can be "removed" and forgotten by the master:
1) When an agent tries to register with a new ID from the same IP:
https://github.com/apache/mesos/blob/f130544bdb8a9849096ee2cb35ebcbc7d8a326d8/src/master/master.cpp#L6836-L6849
2) When an agent requests to unregister:
https://github.com/apache/mesos/blob/f130544bdb8a9849096ee2cb35ebcbc7d8a326d8/src/master/master.cpp#L7817-L7840

In these cases, the master explicitly sends {{TASK_LOST}} for task status 
updates (this also means that [this 
documentation|https://github.com/apache/mesos/blob/f130544bdb8a9849096ee2cb35ebcbc7d8a326d8/include/mesos/mesos.proto#L2287-L2288]
 is wrong), but does nothing for operations. We should design proper operation 
status transitions for these cases.
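
The gap described above can be sketched as a small lookup. This is only an illustration, not Mesos code: the {{RemovalReason}} enum and the {{operationStatusOnRemoval}} function are hypothetical names, and the status strings are the ones named in this ticket. The two cases listed above produce no operation status update at all.

```cpp
#include <optional>
#include <string>

// Hypothetical reasons for the master to drop an agent (not a Mesos type).
enum class RemovalReason {
  MarkedGone,             // covered by MESOS-8782
  MarkedUnreachable,      // covered by MESOS-8783
  ReregisteredWithNewId,  // case 1 in this ticket
  Unregistered            // case 2 in this ticket
};

// Returns the operation status the master reports for a removal reason,
// as described in this ticket, or std::nullopt when no operation status
// update is sent.
std::optional<std::string> operationStatusOnRemoval(RemovalReason reason) {
  switch (reason) {
    case RemovalReason::MarkedGone:
      return "OPERATION_GONE_BY_OPERATOR";
    case RemovalReason::MarkedUnreachable:
      return "OPERATION_UNREACHABLE";
    case RemovalReason::ReregisteredWithNewId:
    case RemovalReason::Unregistered:
      // The gap: tasks get TASK_LOST here, operations get nothing.
      return std::nullopt;
  }
  return std::nullopt;
}
```

A proper design would replace the two {{std::nullopt}} branches with whatever "lost" status is eventually chosen for operations.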



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
