[
https://issues.apache.org/jira/browse/MESOS-7744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16069355#comment-16069355
]
Sargun Dhillon commented on MESOS-7744:
---------------------------------------
Full log:
{code}
Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:26.951799 5171 slave.cpp:1495] Got assigned
task Titus-7590548-worker-0-4476 for framework TitusFramework
Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:26.952251 5171 slave.cpp:1614] Launching task
Titus-7590548-worker-0-4476 for framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.484611 5171 slave.cpp:1853] Queuing task
‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework
TitusFramework at executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.487876 5171 slave.cpp:2035] Asked to kill
task Titus-7590548-worker-0-4476 of framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.488994 5171 slave.cpp:3211] Handling status
update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task
Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.490603 5171 slave.cpp:2005] Sending queued
task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework
TitusFramework at executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.494860 5171 slave.cpp:3211] Handling status
update TASK_STARTING (UUID: d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of framework TitusFramework from
executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.496829 5191 status_update_manager.cpp:320]
Received status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4)
for task Titus-7590548-worker-0-4476 of framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.497530 5191 status_update_manager.cpp:825]
Checkpointing UPDATE for status update TASK_KILLED (UUID:
898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.498082 5171 slave.cpp:3211] Handling status
update TASK_STARTING (UUID: d6aafd3f-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of framework TitusFramework from
executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.500267 5191 status_update_manager.cpp:320]
Received status update TASK_STARTING (UUID:
d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.500377 5191 status_update_manager.cpp:825]
Checkpointing UPDATE for status update TASK_STARTING (UUID:
d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.500562 5189 slave.cpp:3604] Forwarding the
update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task
Titus-7590548-worker-0-4476 of framework TitusFramework to
[email protected]:7103
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.502029 5191 status_update_manager.cpp:320]
Received status update TASK_STARTING (UUID:
d6aafd3f-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.502092 5191 status_update_manager.cpp:825]
Checkpointing UPDATE for status update TASK_STARTING (UUID:
d6aafd3f-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.502393 5189 slave.cpp:3514] Sending
acknowledgement for status update TASK_STARTING (UUID:
d6aaed02-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework to executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.504465 5189 slave.cpp:3514] Sending
acknowledgement for status update TASK_STARTING (UUID:
d6aafd3f-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework to executor(1)@100.66.11.10:17707
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.518888 5191 status_update_manager.cpp:392]
Received status update acknowledgement (UUID:
898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.519039 5191 status_update_manager.cpp:825]
Checkpointing ACK for status update TASK_KILLED (UUID:
898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: W0629 23:22:37.520956 5191 status_update_manager.cpp:446]
Acknowledged a terminal status update TASK_KILLED (UUID:
898215d6-a244-4dbe-bc9c-878a22d36ea4) for task Titus-7590548-worker-0-4476 of
framework TitusFramework but updates are still pending
Jun 29 23:22:39 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:39.681637 5183 slave.cpp:3211] Handling status
update TASK_STARTING (UUID: d7f8e8bc-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of framework TitusFramework from
executor(1)@100.66.11.10:17707
Jun 29 23:22:39 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: W0629 23:22:39.681761 5183 slave.cpp:3291] Could not find
the executor for status update TASK_STARTING (UUID:
d7f8e8bc-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:39 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:39.682006 5180 status_update_manager.cpp:320]
Received status update TASK_STARTING (UUID:
d7f8e8bc-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:39 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:39.682586 5181 slave.cpp:3604] Forwarding the
update TASK_STARTING (UUID: d7f8e8bc-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of framework TitusFramework to
[email protected]:7103
Jun 29 23:22:39 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:39.682958 5181 slave.cpp:3514] Sending
acknowledgement for status update TASK_STARTING (UUID:
d7f8e8bc-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework to executor(1)@100.66.11.10:17707
Jun 29 23:22:39 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:39.686782 5172 status_update_manager.cpp:392]
Received status update acknowledgement (UUID:
d7f8e8bc-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:39 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: E0629 23:22:39.687196 5195 slave.cpp:2621] Status update
acknowledgement (UUID: d7f8e8bc-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of unknown executor
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:51.530827 5182 slave.cpp:3211] Handling status
update TASK_STARTING (UUID: df08f5a4-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of framework TitusFramework from
executor(1)@100.66.11.10:17707
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: W0629 23:22:51.530951 5182 slave.cpp:3291] Could not find
the executor for status update TASK_STARTING (UUID:
df08f5a4-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:51.531138 5172 status_update_manager.cpp:320]
Received status update TASK_STARTING (UUID:
df08f5a4-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:51.531445 5181 slave.cpp:3604] Forwarding the
update TASK_STARTING (UUID: df08f5a4-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of framework TitusFramework to
[email protected]:7103
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:51.531718 5181 slave.cpp:3514] Sending
acknowledgement for status update TASK_STARTING (UUID:
df08f5a4-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework to executor(1)@100.66.11.10:17707
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:51.536438 5196 status_update_manager.cpp:392]
Received status update acknowledgement (UUID:
df08f5a4-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: E0629 23:22:51.536902 5197 slave.cpp:2621] Status update
acknowledgement (UUID: df08f5a4-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of unknown executor
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:51.693526 5189 slave.cpp:3211] Handling status
update TASK_RUNNING (UUID: df21c703-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of framework TitusFramework from
executor(1)@100.66.11.10:17707
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: W0629 23:22:51.693653 5189 slave.cpp:3291] Could not find
the executor for status update TASK_RUNNING (UUID:
df21c703-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:51.693857 5199 status_update_manager.cpp:320]
Received status update TASK_RUNNING (UUID:
df21c703-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:51.694207 5170 slave.cpp:3604] Forwarding the
update TASK_RUNNING (UUID: df21c703-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of framework TitusFramework to
[email protected]:7103
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:51.694473 5170 slave.cpp:3514] Sending
acknowledgement for status update TASK_RUNNING (UUID:
df21c703-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework to executor(1)@100.66.11.10:17707
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:51.698933 5201 status_update_manager.cpp:392]
Received status update acknowledgement (UUID:
df21c703-5d21-11e7-846c-0a0c90a7033c) for task Titus-7590548-worker-0-4476 of
framework TitusFramework
Jun 29 23:22:51 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: E0629 23:22:51.699404 5172 slave.cpp:2621] Status update
acknowledgement (UUID: df21c703-5d21-11e7-846c-0a0c90a7033c) for task
Titus-7590548-worker-0-4476 of unknown executor
{code}
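To make the ordering in the log above easier to follow, here is a minimal sketch that pulls the timestamp, source location, task state and update UUID out of agent log lines shaped like these. It is not part of Mesos or Titus; the regular expressions, the SAMPLE excerpt and the function name summarize are my own assumptions, and they only cover the line shapes that appear in this log.

```python
import re

# Syslog prefix that starts every record in the log above, e.g.
# "Jun 29 23:22:37 <host> mesos-slave[4290]:" (host and tag may be
# separated by a line break because the mail client wrapped the lines).
SYSLOG_PREFIX = re.compile(
    r"[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} \S+\s+mesos-slave\[\d+\]:"
)

# glog header: severity letter + MMDD, time, thread id, file:line, message.
GLOG = re.compile(
    r"^([IWEF])\d{4} (\d{2}:\d{2}:\d{2}\.\d{6}) \d+ (\S+:\d+)\] (.*)$"
)

STATE = re.compile(r"\bTASK_[A-Z_]+\b")
UUID = re.compile(r"UUID: ([0-9a-f-]{36})")

# Two records copied from the log above, used only as example input.
SAMPLE = """Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.487876 5171 slave.cpp:2035] Asked to kill
task Titus-7590548-worker-0-4476 of framework TitusFramework
Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
mesos-slave[4290]: I0629 23:22:37.488994 5171 slave.cpp:3211] Handling status
update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for task
Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0
"""


def summarize(log_text):
    """Return one dict per log record, in the order the records appear."""
    events = []
    for chunk in SYSLOG_PREFIX.split(log_text):
        # Undo the hard line wrapping so each record is a single line.
        record = " ".join(chunk.split())
        header = GLOG.match(record)
        if header is None:
            continue  # empty chunk or a line that is not a glog record
        severity, time, location, message = header.groups()
        state = STATE.search(message)
        uuid = UUID.search(message)
        events.append({
            "severity": severity,
            "time": time,
            "location": location,
            "state": state.group(0) if state else None,
            "uuid": uuid.group(1) if uuid else None,
            "message": message,
        })
    return events


if __name__ == "__main__":
    for event in summarize(SAMPLE):
        print(event["time"], event["severity"], event["location"],
              event["state"], event["uuid"])
```

Run over the full log, grouping the resulting records by UUID should show each update's handle, checkpoint, forward and acknowledge steps next to each other, which makes it easier to see which updates arrive after the TASK_KILLED update was acknowledged.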
> Mesos Agent Sends TASK_KILL status update to Master, and still launches task
> ----------------------------------------------------------------------------
>
> Key: MESOS-7744
> URL: https://issues.apache.org/jira/browse/MESOS-7744
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 1.0.1
> Reporter: Sargun Dhillon
> Priority: Minor
>
> We sometimes launch jobs and cancel them after ~7 seconds if we don't get a
> TASK_STARTING back from the agent. Under certain conditions, this can result
> in Mesos losing track of the task. The interesting chunk of the logs is
> here:
> {code}
> Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
> mesos-slave[4290]: I0629 23:22:26.951799 5171 slave.cpp:1495] Got assigned
> task Titus-7590548-worker-0-4476 for framework TitusFramework
> Jun 29 23:22:26 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
> mesos-slave[4290]: I0629 23:22:26.952251 5171 slave.cpp:1614] Launching task
> Titus-7590548-worker-0-4476 for framework TitusFramework
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
> mesos-slave[4290]: I0629 23:22:37.484611 5171 slave.cpp:1853] Queuing task
> ‘Titus-7590548-worker-0-4476’ for executor ‘docker-executor’ of framework
> TitusFramework at executor(1)@100.66.11.10:17707
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
> mesos-slave[4290]: I0629 23:22:37.487876 5171 slave.cpp:2035] Asked to kill
> task Titus-7590548-worker-0-4476 of framework TitusFramework
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
> mesos-slave[4290]: I0629 23:22:37.488994 5171 slave.cpp:3211] Handling
> status update TASK_KILLED (UUID: 898215d6-a244-4dbe-bc9c-878a22d36ea4) for
> task Titus-7590548-worker-0-4476 of framework TitusFramework from @0.0.0.0:0
> Jun 29 23:22:37 titusagent-mainvpc-r3.8xlarge.2-i-04907efc9f1f8535c
> mesos-slave[4290]: I0629 23:22:37.490603 5171 slave.cpp:2005] Sending queued
> task ‘Titus-7590548-worker-0-4476’ to executor ‘docker-executor’ of framework
> TitusFramework at executor(1)@100.66.11.10:17707
> {code}
> In our executor, we see that the launch message arrives after the master has
> already received the kill update. We then send non-terminal state updates to
> the agent, and yet it doesn't forward these to our framework.