[
https://issues.apache.org/jira/browse/MESOS-7942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16156079#comment-16156079
]
Baskar Sikkayan commented on MESOS-7942:
----------------------------------------
Verbose logs, FYI ...
I0906 14:05:02.915587 11 slave.cpp:4256] Forwarding the update TASK_RUNNING
(UUID: 72a301e2-f3d3-4e24-8e43-fc5ee44b3730) for task
ct:1504706700007:0:Job_Task_Test: of framework
5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 to [email protected]:5050
I0906 14:05:02.915688 11 slave.cpp:4150] Status update manager successfully
handled status update TASK_RUNNING (UUID: 72a301e2-f3d3-4e24-8e43-fc5ee44b3730)
for task ct:1504706700007:0:Job_Task_Test: of framework
5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
I0906 14:05:02.915700 11 slave.cpp:4166] Sending acknowledgement for status
update TASK_RUNNING (UUID: 72a301e2-f3d3-4e24-8e43-fc5ee44b3730) for task
ct:1504706700007:0:Job_Task_Test: of framework
5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 to executor(1)@20.426.45.305:48379
I0906 14:05:03.628334 10 status_update_manager.cpp:395] Received status
update acknowledgement (UUID: 72a301e2-f3d3-4e24-8e43-fc5ee44b3730) for task
ct:1504706700007:0:Job_Task_Test: of framework
5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
I0906 14:05:03.628398 10 status_update_manager.cpp:832] Checkpointing ACK
for status update TASK_RUNNING (UUID: 72a301e2-f3d3-4e24-8e43-fc5ee44b3730) for
task ct:1504706700007:0:Job_Task_Test: of framework
5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
I0906 14:05:03.628463 10 slave.cpp:3105] Status update manager successfully
handled status update acknowledgement (UUID:
72a301e2-f3d3-4e24-8e43-fc5ee44b3730) for task
ct:1504706700007:0:Job_Task_Test: of framework
5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
I0906 14:05:04.256636 14 slave.cpp:5731] Querying resource estimator for
oversubscribable resources
I0906 14:05:04.256691 14 slave.cpp:5745] Received oversubscribable resources
{} from the resource estimator
I0906 14:05:04.944047 10 slave.cpp:4346] Received ping from
slave-observer(1)@20.426.45.305:5050
I0906 14:05:11.049429 12 slave.cpp:3816] Handling status update TASK_FAILED
(UUID: 83eceb05-6a94-4660-bdb5-3cbc2b166b1b) for task
ct:1504706700007:0:Job_Task_Test: of framework
5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 from executor(1)@20.426.45.305:48379
I0906 14:05:11.049636 12 docker.cpp:987] Running docker -H
unix:///var/run/docker.sock inspect
mesos-81cb9c2a-d18b-4127-872b-2a5676dfb314-S0.97dc2c67-5d69-4a8c-b4e1-ba15807697cf
> Mesos slave - docker job exits normally but reporting as TASK_FAILED
> --------------------------------------------------------------------
>
> Key: MESOS-7942
> URL: https://issues.apache.org/jira/browse/MESOS-7942
> Project: Mesos
> Issue Type: Bug
> Components: agent, docker
> Affects Versions: 1.1.0, 1.2.1, 1.3.1
> Environment: Kernel | OS | Snapshot:
> 3.8.13-98.7.1.el7uek | OL 7.3 | 7-2017.6.4
> Reporter: Baskar Sikkayan
>
> Mesos version: 1.2.1.
> Jobs are scheduled using Chronos. The Docker job is invoked properly, but the
> task is still reported as TASK_FAILED even though it completes with exit status 0.
> Mesos slave logs:
> {code}
> I0906 04:15:03.311928 10 slave.cpp:1785] Launching task
> 'ct:1504671300002:0:Job_Task_Test:' for framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:03.314584 10 paths.cpp:547] Trying to chown '
> /mesos-data/slave-2/slaves/f20ab78e-acd3-407a-b1b6-47d67a947eff-S1/frameworks/5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000/executors/ct:1504671300002:0:Job_Task_Test:/runs/7cd8bd78-b20d-4db5-8435-4d1420cb1b93'
> to user 'root'
> I0906 04:15:03.315140 10 slave.cpp:6479] Launching executor
> 'ct:1504671300002:0:Job_Task_Test:' of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 with resources cpus(*)(allocated:
> *):0.1; mem(*)(allocated: *):32 in work directory '
> /mesos-data/slave-2/slaves/f20ab78e-acd3-407a-b1b6-47d67a947eff-S1/frameworks/5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000/executors/ct:1504671300002:0:Job_Task_Test:/runs/7cd8bd78-b20d-4db5-8435-4d1420cb1b93'
> I0906 04:15:03.315809 10 slave.cpp:2118] Queued task
> 'ct:1504671300002:0:Job_Task_Test:' for executor
> 'ct:1504671300002:0:Job_Task_Test:' of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:03.316238 12 docker.cpp:1165] Starting container
> '7cd8bd78-b20d-4db5-8435-4d1420cb1b93' for task
> 'ct:1504671300002:0:Job_Task_Test:' (and executor
> 'ct:1504671300002:0:Job_Task_Test:') of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:03.612807 10 docker.cpp:803] Checkpointing pid 248 to '
> /mesos-data/slave-2/meta/slaves/f20ab78e-acd3-407a-b1b6-47d67a947eff-S1/frameworks/5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000/executors/ct:1504671300002:0:Job_Task_Test:/runs/7cd8bd78-b20d-4db5-8435-4d1420cb1b93/pids/forked.pid'
> I0906 04:15:03.649960 10 slave.cpp:3385] Got registration for executor
> 'ct:1504671300002:0:Job_Task_Test:' of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 from executor(1)@20.403.68.700:38740
> I0906 04:15:03.650584 11 docker.cpp:1608] Ignoring updating container
> 7cd8bd78-b20d-4db5-8435-4d1420cb1b93 because resources passed to update are
> identical to existing resources
> I0906 04:15:03.650701 11 slave.cpp:2331] Sending queued task
> 'ct:1504671300002:0:Job_Task_Test:' to executor
> 'ct:1504671300002:0:Job_Task_Test:' of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 at executor(1)@20.403.68.700:38740
> I0906 04:15:05.255101 10 slave.cpp:3816] Handling status update
> TASK_RUNNING (UUID: 35a9a010-d623-45a3-9d1e-bdbc6942129f) for task
> ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 from executor(1)@20.403.68.700:38740
> I0906 04:15:05.255280 10 status_update_manager.cpp:323] Received status
> update TASK_RUNNING (UUID: 35a9a010-d623-45a3-9d1e-bdbc6942129f) for task
> ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:05.255551 10 status_update_manager.cpp:832] Checkpointing
> UPDATE for status update TASK_RUNNING (UUID:
> 35a9a010-d623-45a3-9d1e-bdbc6942129f) for task
> ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:05.255697 9 slave.cpp:4256] Forwarding the update
> TASK_RUNNING (UUID: 35a9a010-d623-45a3-9d1e-bdbc6942129f) for task
> ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 to [email protected]:5050
> I0906 04:15:05.255803 9 slave.cpp:4166] Sending acknowledgement for
> status update TASK_RUNNING (UUID: 35a9a010-d623-45a3-9d1e-bdbc6942129f) for
> task ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 to executor(1)@20.403.68.700:38740
> I0906 04:15:05.260083 10 status_update_manager.cpp:395] Received status
> update acknowledgement (UUID: 35a9a010-d623-45a3-9d1e-bdbc6942129f) for task
> ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:05.260114 10 status_update_manager.cpp:832] Checkpointing ACK
> for status update TASK_RUNNING (UUID: 35a9a010-d623-45a3-9d1e-bdbc6942129f)
> for task ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:13.090368 13 slave.cpp:3816] *Handling status
> update TASK_FAILED* (UUID: 62ecb989-f260-42b0-ba99-d22fa7210a4c) for
> task ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 from executor(1)@20.403.68.700:38740
> I0906 04:15:13.164096 13 status_update_manager.cpp:323] Received status
> update TASK_FAILED (UUID: 62ecb989-f260-42b0-ba99-d22fa7210a4c) for task
> ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:13.164135 13 status_update_manager.cpp:832] Checkpointing
> UPDATE for status update TASK_FAILED (UUID:
> 62ecb989-f260-42b0-ba99-d22fa7210a4c) for task
> ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:13.164289 10 slave.cpp:4256] Forwarding the update TASK_FAILED
> (UUID: 62ecb989-f260-42b0-ba99-d22fa7210a4c) for task
> ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 to [email protected]:5050
> I0906 04:15:13.164397 10 slave.cpp:4166] Sending acknowledgement for
> status update TASK_FAILED (UUID: 62ecb989-f260-42b0-ba99-d22fa7210a4c) for
> task ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 to executor(1)@20.403.68.700:38740
> I0906 04:15:13.172888 12 status_update_manager.cpp:395] Received status
> update acknowledgement (UUID: 62ecb989-f260-42b0-ba99-d22fa7210a4c) for task
> ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:13.172940 12 status_update_manager.cpp:832] Checkpointing ACK
> for status update TASK_FAILED (UUID: 62ecb989-f260-42b0-ba99-d22fa7210a4c)
> for task ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000
> I0906 04:15:14.092870 11 slave.cpp:4388] Got exited event for
> executor(1)@20.403.68.700:38740
> I0906 04:15:14.168128 11 docker.cpp:2397] Executor for container
> 7cd8bd78-b20d-4db5-8435-4d1420cb1b93 has exited
> I0906 04:15:14.168166 11 docker.cpp:2091] Destroying container
> 7cd8bd78-b20d-4db5-8435-4d1420cb1b93
> I0906 04:15:14.168196 11 docker.cpp:2218] Running docker stop on container
> 7cd8bd78-b20d-4db5-8435-4d1420cb1b93
> I0906 04:15:14.170940 15 slave.cpp:4768] *Executor
> 'ct:1504671300002:0:Job_Task_Test:' of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 exited with status 0*
> I0906 04:15:14.170967 15 slave.cpp:4868] Cleaning up executor
> 'ct:1504671300002:0:Job_Task_Test:' of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 at executor(1)@20.403.68.700:38740
> {code}
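For anyone triaging from the log above, the sequence that matters is TASK_RUNNING, then TASK_FAILED, then the executor exiting with status 0. The following is only a minimal sketch for pulling that sequence out of a pasted agent log. It assumes the glog-style line prefix seen above and that entries were wrapped by the mailer; the names `unwrap` and `timeline` and the `SAMPLE` excerpt are illustrative and not part of Mesos.

```python
import re

# Start of a log entry as it appears above: severity + MMDD, time, thread id, file:line]
ENTRY_START = re.compile(r"^[IWEF]\d{4} \d{2}:\d{2}:\d{2}\.\d{6}\s+\d+ \S+:\d+\]")
# JIRA colour macro that may be left unrendered inside a pasted log
COLOR_TAG = re.compile(r"\{color(?::[^}]*)?\}")
STATUS = re.compile(
    r"Handling status update (TASK_[A-Z_]+)\*? \(UUID: ([0-9a-f-]+)\) "
    r"for task (\S+) of framework"
)
EXIT = re.compile(r"Executor '([^']+)' of framework \S+ exited with status (\d+)")


def unwrap(text):
    """Rejoin log entries that were wrapped across several (possibly quoted) lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.lstrip("> ").strip()
        if ENTRY_START.match(line):
            entries.append(line)           # a new entry begins here
        elif entries and line:
            entries[-1] += " " + line      # continuation of the previous entry
    return entries


def timeline(text):
    """Return (timestamp, event) pairs for status updates and executor exits."""
    events = []
    for entry in unwrap(text):
        entry = COLOR_TAG.sub("", entry)
        stamp = entry.split()[1]
        status = STATUS.search(entry)
        if status:
            events.append((stamp, status.group(1)))
            continue
        exited = EXIT.search(entry)
        if exited:
            events.append((stamp, "executor exited with status " + exited.group(2)))
    return events


# Excerpt of the log above, wrapped by the mailer and possibly carrying JIRA markup.
SAMPLE = """
> I0906 04:15:05.255101 10 slave.cpp:3816] Handling status update
> TASK_RUNNING (UUID: 35a9a010-d623-45a3-9d1e-bdbc6942129f) for task
> ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 from executor(1)@20.403.68.700:38740
> I0906 04:15:13.090368 13 slave.cpp:3816] *{color:#f6c342}Handling status
> update TASK_FAILED{color}* (UUID: 62ecb989-f260-42b0-ba99-d22fa7210a4c) for
> task ct:1504671300002:0:Job_Task_Test: of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 from executor(1)@20.403.68.700:38740
> I0906 04:15:14.170940 15 slave.cpp:4768] *Executor
> 'ct:1504671300002:0:Job_Task_Test:' of framework
> 5175f6c9-0617-4145-ab46-3b7e64dc67ea-0000 exited with status 0*
"""

for stamp, event in timeline(SAMPLE):
    print(stamp, event)
```

The sketch only matches the "Handling status update" and "exited with status" entries, so it says nothing about why the executor sent TASK_FAILED; it just makes the ordering easy to see when comparing runs.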