[
https://issues.apache.org/jira/browse/MESOS-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363424#comment-15363424
]
Adam B edited comment on MESOS-5693 at 7/5/16 10:59 PM:
Only the earliest unacknowledged update (i.e. the TASK_RUNNING, not the
TASK_KILLED) will be sent (and resent with periodic retries) for each task from
the agent's StatusUpdateManager to the master. However, the agent attaches the
latest task state (not a full StatusUpdate) to each of these updates, so the
master knows to release the resources and update the state in the webui. The
data and messages of the final terminal status update cannot be sent until all
of its preceding updates have been acknowledged.
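
To illustrate the ordering described above, here is a toy model in Python. It is
not Mesos code: the class name TaskUpdateStream, its methods, and the message
fields are hypothetical, and the real StatusUpdateManager is C++ with
checkpointing and retry timers that are left out here. It only sketches that the
oldest unacknowledged update is the one forwarded, with the latest state
attached.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional, Tuple


@dataclass
class TaskUpdateStream:
    """Hypothetical per-task stream of unacknowledged status updates."""

    # (uuid, state) pairs, oldest first.
    pending: Deque[Tuple[str, str]] = field(default_factory=deque)
    latest_state: Optional[str] = None

    def receive(self, uuid: str, state: str) -> None:
        """Queue an update from the executor and remember the newest state."""
        self.pending.append((uuid, state))
        self.latest_state = state

    def next_to_forward(self) -> Optional[dict]:
        """Return the message to send now (and to resend on each retry)."""
        if not self.pending:
            return None
        uuid, state = self.pending[0]  # always the earliest unacknowledged update
        return {"uuid": uuid, "state": state, "latest_state": self.latest_state}

    def acknowledge(self, uuid: str) -> bool:
        """Drop the front update only if the acknowledgement matches it."""
        if self.pending and self.pending[0][0] == uuid:
            self.pending.popleft()
            return True
        return False


stream = TaskUpdateStream()
stream.receive("uuid-1", "TASK_RUNNING")
stream.receive("uuid-2", "TASK_KILLED")

first = stream.next_to_forward()   # TASK_RUNNING, tagged with latest_state TASK_KILLED
retry = stream.next_to_forward()   # unchanged until uuid-1 is acknowledged
stream.acknowledge("uuid-1")
second = stream.next_to_forward()  # only now is the full TASK_KILLED update sent
```

In this model the master can already see TASK_KILLED in latest_state of the
first message, while the TASK_KILLED update itself stays queued until the
TASK_RUNNING update is acknowledged.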
was (Author: adam-mesos):
Only the earliest unacknowledged update (i.e. the TASK_RUNNING, not the
TASK_KILLED) will be sent for each task from the agent's StatusUpdateManager to
the master. However, with these updates, the agent will add the latest task
state (not a full StatusUpdate), so the master can know to release the
resources and update the state in the webui. The data and messages from the
final terminal status update must wait for all its preceding updates to be
acknowledged so that it can be sent.
> slave delay to forward status update
>
>
> Key: MESOS-5693
> URL: https://issues.apache.org/jira/browse/MESOS-5693
> Project: Mesos
> Issue Type: Improvement
> Components: slave
> Affects Versions: 0.22.1
> Environment: debian7
> Reporter: zhangfuxing
>
> We observe that the Mesos slave delays forwarding task status updates to the
> master:
> I0615 14:59:10.997902 3890 slave.cpp:2531] Handling status update
> TASK_KILLED (UUID: 17e9c12f-5241-4aca-81fa-67d6830990b0) for task
> xxx.64554b80 of framework 20150629-151659-3355508746-5060-6173-0001 from
> executor(1)@10.0.40.189:54304
> I0615 14:59:11.001126 3895 status_update_manager.cpp:317] Received status
> update TASK_KILLED (UUID: 17e9c12f-5241-4aca-81fa-67d6830990b0) for task
> xxx.64554b80 of framework 20150629-151659-3355508746-5060-6173-0001
> I0615 14:59:11.001174 3895 status_update_manager.hpp:346] Checkpointing
> UPDATE for status update TASK_KILLED (UUID:
> 17e9c12f-5241-4aca-81fa-67d6830990b0) for task xxx.64554b80 of framework
> 20150629-151659-3355508746-5060-6173-0001
> I0615 14:59:11.037376 3894 slave.cpp:2709] Sending acknowledgement for
> status update TASK_KILLED (UUID: 17e9c12f-5241-4aca-81fa-67d6830990b0) for
> task xxx.64554b80 of framework 20150629-151659-3355508746-5060-6173-0001 to
> executor(1)@10.0.40.189:54304
> I0615 15:54:21.352087 3888 slave.cpp:2776] Forwarding the update TASK_KILLED
> (UUID: 17e9c12f-5241-4aca-81fa-67d6830990b0) for task xxx.64554b80 of
> framework 20150629-151659-3355508746-5060-6173-0001 to master@10.0.1.200:5060
> In this example, the task xxx.64554b80 was killed at 14:59, but the status
> update was not forwarded to the master until 15:54.
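
The exact delay in the quoted log can be worked out from the two timestamps. The
sketch below is illustrative only: parse_glog_time is a hypothetical helper, the
log lines are shortened to their prefixes, and it assumes both lines were logged
in the same year, since these log lines do not record one.

```python
from datetime import datetime


def parse_glog_time(line: str) -> datetime:
    """Parse the 'Lmmdd hh:mm:ss.uuuuuu' prefix of a log line (no year is logged)."""
    level_and_date, clock = line.split()[:2]
    # Strip the severity letter (e.g. 'I') before parsing month and day.
    return datetime.strptime(level_and_date[1:] + " " + clock, "%m%d %H:%M:%S.%f")


handled = parse_glog_time("I0615 14:59:10.997902 3890 slave.cpp:2531] Handling status update")
forwarded = parse_glog_time("I0615 15:54:21.352087 3888 slave.cpp:2776] Forwarding the update")

delay = forwarded - handled
print(delay)  # 0:55:10.354185
```

That is a gap of roughly 55 minutes between the slave handling the TASK_KILLED
update and forwarding it to the master.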