[jira] [Comment Edited] (MESOS-5693) slave delay to forword status update

Adam B (JIRA) Tue, 05 Jul 2016 16:00:27 -0700

    [ 
https://issues.apache.org/jira/browse/MESOS-5693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363424#comment-15363424
 ]


Adam B edited comment on MESOS-5693 at 7/5/16 10:59 PM:
--------------------------------------------------------

Only the earliest unacknowledged update for each task (i.e. the TASK_RUNNING, 
not the TASK_KILLED) will be sent, and resent with periodic retries, from the 
agent's StatusUpdateManager to the master. However, with each of these updates 
the agent also includes the latest task state (not a full StatusUpdate), so the 
master knows to release the resources and update the state in the webui. The 
data and messages from the final terminal status update cannot be sent until 
all of its preceding updates have been acknowledged.
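
To illustrate the ordering described above, here is a minimal sketch in Python. 
It is not the Mesos implementation (which is C++ and acknowledges updates by 
UUID); the class TaskStatusStream, its methods, and the dictionary keys are all 
hypothetical names chosen for this example, and updates are identified by state 
name only to keep it short.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class TaskStatusStream:
    """Hypothetical per-task queue of unacknowledged status updates."""

    pending: deque = field(default_factory=deque)

    def receive(self, state: str) -> None:
        # Updates from the executor are queued in arrival order.
        self.pending.append(state)

    def next_message(self):
        # Only the earliest unacknowledged update is forwarded (and would be
        # retried), but the latest known task state travels along with it.
        if not self.pending:
            return None
        return {"update": self.pending[0], "latest_state": self.pending[-1]}

    def acknowledge(self, state: str) -> bool:
        # Acknowledging the front update unblocks the next one in the queue.
        # (Assumption: matching by state name; real acks carry a UUID.)
        if self.pending and self.pending[0] == state:
            self.pending.popleft()
            return True
        return False


stream = TaskStatusStream()
stream.receive("TASK_RUNNING")
stream.receive("TASK_KILLED")

first = stream.next_message()       # TASK_RUNNING is sent, tagged with TASK_KILLED
stream.acknowledge("TASK_RUNNING")  # the ack for the earlier update arrives
second = stream.next_message()      # only now is the full TASK_KILLED update sent
```

In this sketch, if the acknowledgement for TASK_RUNNING never arrives, 
next_message() keeps returning the TASK_RUNNING update, which is how a terminal 
update can appear to be delayed.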


was (Author: adam-mesos):
Only the earliest unacknowledged update (i.e. the TASK_RUNNING, not the 
TASK_KILLED) will be sent for each task from the agent's StatusUpdateManager to 
the master. However, with these updates, the agent will add the latest task 
state (not a full StatusUpdate), so the master can know to release the 
resources and update the state in the webui. The data and messages from the 
final terminal status update must wait for all its preceding updates to be 
acknowledged so that it can be sent.

> slave delay to forword status update
> ------------------------------------
>
>                 Key: MESOS-5693
>                 URL: https://issues.apache.org/jira/browse/MESOS-5693
>             Project: Mesos
>          Issue Type: Improvement
>          Components: slave
>    Affects Versions: 0.22.1
>         Environment: debian7 
>            Reporter: zhangfuxing
>
> We observe that the Mesos slave delays forwarding task status updates to the 
> master:
> I0615 14:59:10.997902  3890 slave.cpp:2531] Handling status update 
> TASK_KILLED (UUID: 17e9c12f-5241-4aca-81fa-67d6830990b0) for task 
> xxx.64554b80 of framework 20150629-151659-3355508746-5060-6173-0001 from 
> executor(1)@10.0.40.189:54304
> I0615 14:59:11.001126  3895 status_update_manager.cpp:317] Received status 
> update TASK_KILLED (UUID: 17e9c12f-5241-4aca-81fa-67d6830990b0) for task 
> xxx.64554b80 of framework 20150629-151659-3355508746-5060-6173-0001
> I0615 14:59:11.001174  3895 status_update_manager.hpp:346] Checkpointing 
> UPDATE for status update TASK_KILLED (UUID: 
> 17e9c12f-5241-4aca-81fa-67d6830990b0) for task xxx.64554b80 of framework 
> 20150629-151659-3355508746-5060-6173-0001
> I0615 14:59:11.037376  3894 slave.cpp:2709] Sending acknowledgement for 
> status update TASK_KILLED (UUID: 17e9c12f-5241-4aca-81fa-67d6830990b0) for 
> task xxx.64554b80 of framework 20150629-151659-3355508746-5060-6173-0001 to 
> executor(1)@10.0.40.189:54304
> I0615 15:54:21.352087  3888 slave.cpp:2776] Forwarding the update TASK_KILLED 
> (UUID: 17e9c12f-5241-4aca-81fa-67d6830990b0) for task xxx.64554b80 of 
> framework 20150629-151659-3355508746-5060-6173-0001 to master@10.0.1.200:5060
> In this example, the task xxx.64554b80 was killed at 14:59, but the status 
> update was not forwarded to the master until 15:54.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
