Sathish Kumar created MESOS-6952:
------------------------------------
Summary: Mesos task state was stuck in staging in spite
Key: MESOS-6952
URL: https://issues.apache.org/jira/browse/MESOS-6952
Project: Mesos
Issue Type: Bug
Components: executor
Affects Versions: 0.28.2
Environment: ubuntu 14.04
Reporter: Sathish Kumar
The task is stuck in the staging state even after the slave's executor has
terminated. The Mesos master keeps the task in the staging state. Because the
task is stuck in staging, the framework has not received a status update from
mesos-master.
The issue went away after the slave was restarted.
In the slave logs I can see the warning "Asked to run task ... which is
terminating/terminated":
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.089251 107759 status_update_manager.cpp:824] Checkpointing ACK for
status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.097193 107774 slave.cpp:1361] Got assigned task
ct:1484816820000:0:foocare_zendesk_round_robin: for framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.097453 107774 slave.cpp:1480] Launching task
ct:1484816820000:0:foocare_zendesk_round_robin: for framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:W0119
14:42:17.097527 107774 slave.cpp:1673] Asked to run task
'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor
'ct:1484816820000:0:foocare_zendesk_round_robin:' which is
terminating/terminated
Full slave log:
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.066277 107763 slave.cpp:3012] Handling status update TASK_FAILED
(UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 from executor(1)@10.14.38.239:43937
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.134692 107766 status_update_manager.cpp:320] Received status update
TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.134753 107766 status_update_manager.cpp:824] Checkpointing UPDATE for
status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.142010 107767 slave.cpp:3410] Forwarding the update TASK_FAILED
(UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 to [email protected]:5050
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.142119 107767 slave.cpp:3320] Sending acknowledgement for status
update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 to executor(1)@10.14.38.239:43937
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.226682 107761 status_update_manager.cpp:392] Received status update
acknowledgement (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.226759 107761 status_update_manager.cpp:824] Checkpointing ACK for
status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.858510 107759 slave.cpp:1361] Got assigned task
ct:1484816820000:0:foocare_zendesk_round_robin: for framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.858762 107759 slave.cpp:1480] Launching task
ct:1484816820000:0:foocare_zendesk_round_robin: for framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.859004 107759 slave.cpp:1711] Queuing task
'ct:1484816820000:0:foocare_zendesk_round_robin:' for executor
'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:15.939483 107759 slave.cpp:1863] Sending queued task
'ct:1484816820000:0:foocare_zendesk_round_robin:' to executor
'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:16.141394 107762 slave.cpp:3871] Executor
'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 exited with status 0
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:16.141451 107762 slave.cpp:3012] Handling status update TASK_FAILED
(UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 from @0.0.0.0:0
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:16.141849 107762 status_update_manager.cpp:320] Received status update
TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:16.141989 107762 status_update_manager.cpp:824] Checkpointing UPDATE for
status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:16.147343 107766 slave.cpp:3410] Forwarding the update TASK_FAILED
(UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 to [email protected]:5050
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.089175 107759 status_update_manager.cpp:392] Received status update
acknowledgement (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.089251 107759 status_update_manager.cpp:824] Checkpointing ACK for
status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.097193 107774 slave.cpp:1361] Got assigned task
ct:1484816820000:0:foocare_zendesk_round_robin: for framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.097453 107774 slave.cpp:1480] Launching task
ct:1484816820000:0:foocare_zendesk_round_robin: for framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:W0119
14:42:17.097527 107774 slave.cpp:1673] Asked to run task
'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor
'ct:1484816820000:0:foocare_zendesk_round_robin:' which is
terminating/terminated
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.097568 107774 slave.cpp:3012] Handling status update TASK_LOST (UUID:
b999fb64-34f0-496d-be19-f5a7f998230e) for task
ct:1484816820000:0:foocare_zendesk_round_robin: of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 from @0.0.0.0:0
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.097633 107774 slave.cpp:3975] Cleaning up executor
'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.097790 107772 gc.cpp:55] Scheduling
'/data/mesos/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:/runs/6b8922ff-3f57-42a0-97d1-d79c1de3d93b'
for gc 6.99999886874074days in the future
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.097836 107772 gc.cpp:55] Scheduling
'/data/mesos/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:'
for gc 6.99999886832296days in the future
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.097869 107772 gc.cpp:55] Scheduling
'/data/mesos/meta/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:/runs/6b8922ff-3f57-42a0-97d1-d79c1de3d93b'
for gc 6.99999886819259days in the future
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
14:42:17.097888 107772 gc.cpp:55] Scheduling
'/data/mesos/meta/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:'
for gc 6.99999886809185days in the future
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.WARNING.20161004-154318.107733:W0119
14:42:17.097527 107774 slave.cpp:1673] Asked to run task
'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework
19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor
'ct:1484816820000:0:foocare_zendesk_round_robin:' which is
terminating/terminated
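
For anyone trying to spot the same pattern in their own slave logs, here is a
minimal sketch, not part of Mesos, that joins wrapped grep output like the
above into one record per log entry and picks out the "terminating/terminated"
warnings. The function names (parse_records, stale_launches) are made up for
illustration, and the regular expression assumes the glog prefix layout seen
in this report (severity letter plus MMDD, time, thread id, file:line).

```python
import re

# glog line prefix as it appears in this report:
# severity letter + MMDD, time, thread id, source file:line.
GLOG = re.compile(
    r"(?P<sev>[IWEF])(?P<date>\d{4})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+)\s+(?P<src>[\w.]+:\d+)\]\s+(?P<msg>.*)"
)


def parse_records(text):
    """Join wrapped lines into one record per log entry and parse each one."""
    joined = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("mesos-slave."):
            # Drop the "<log file name>:" prefix that grep adds.
            joined.append(line.partition(":")[2])
        elif joined:
            # Continuation of the previous (wrapped) entry.
            joined[-1] += " " + line
    records = []
    for entry in joined:
        match = GLOG.match(entry)
        if match:
            records.append(match.groupdict())
    return records


def stale_launches(records):
    """Return warnings about launches aimed at a terminating/terminated executor."""
    return [
        r for r in records
        if r["sev"] == "W" and "which is terminating/terminated" in r["msg"]
    ]
```

Feeding the output of a grep for the task ID into parse_records and then
stale_launches should list each time the slave was asked to run the task on an
executor that had already exited, together with the timestamp and source line.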