[ https://issues.apache.org/jira/browse/MESOS-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274091#comment-14274091 ]

Bhuvan Arumugam commented on MESOS-1837:
----------------------------------------

We are facing the same issue when running Docker-based jobs using Mesos and 
Aurora. The issue is not always reproducible. It happens with a simple bash 
script that runs indefinitely, and it happens in different clusters.

In our case, the Mesos slave either does not seem to start thermos-executor, or 
it looks for the executor pid file before the executor has been started.

We are using Docker 1.2 and Mesos 0.21, built from commit 
d9cd0e318d0261e39ff4b91f494117ab3a555a4e.
    
https://github.com/apache/mesos/commit/d9cd0e318d0261e39ff4b91f494117ab3a555a4e

{code}
I1220 00:30:19.447170 31606 slave.cpp:1076] Got assigned task 
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4 
for framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:19.448401 31606 slave.cpp:1186] Launching task 
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4 
for framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:19.450564 31606 slave.cpp:3852] Launching executor 
thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
 of framework 20140522-213145-1749004561-5050-29512-0000 in work directory 
'/tmp/mesos/slaves/20141213-083425-666939665-5051-50967-45/frameworks/20140522-213145-1749004561-5050-29512-0000/executors/thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4/runs/ecc5078b-c7ac-43e0-8977-7278ae9d1784'
I1220 00:30:19.450933 31606 slave.cpp:1312] Queuing task 
'1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
 for executor 
thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
 of framework '20140522-213145-1749004561-5050-29512-0000
I1220 00:30:19.467684 31600 docker.cpp:984] Starting container 
'ecc5078b-c7ac-43e0-8977-7278ae9d1784' for executor 
'thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
 and framework '20140522-213145-1749004561-5050-29512-0000'
I1220 00:30:20.865212 31617 docker.cpp:1138] Checkpointing executor's forked 
pid 1280 to 
'/tmp/mesos/meta/slaves/20141213-083425-666939665-5051-50967-45/frameworks/20140522-213145-1749004561-5050-29512-0000/executors/thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4/runs/ecc5078b-c7ac-43e0-8977-7278ae9d1784/pids/forked.pid'
I1220 00:30:20.868604 31598 slave.cpp:2783] Monitoring executor 
'thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
 of framework '20140522-213145-1749004561-5050-29512-0000' in container 
'ecc5078b-c7ac-43e0-8977-7278ae9d1784'
I1220 00:30:21.664190 31598 slave.cpp:1368] Asked to kill task 
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4 
of framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:21.665225 31598 slave.cpp:2197] Handling status update TASK_KILLED 
(UUID: 3d84baf6-95c6-4e48-ad4d-11d02ec6424e) for task 
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4 
of framework 20140522-213145-1749004561-5050-29512-0000 from @0.0.0.0:0
W1220 00:30:21.665529 31598 slave.cpp:1465] Killing the unregistered executor 
'thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
 of framework 20140522-213145-1749004561-5050-29512-0000 because it has no tasks
I1220 00:30:21.668202 31614 docker.cpp:1473] Destroying container 
'ecc5078b-c7ac-43e0-8977-7278ae9d1784'
I1220 00:30:21.668301 31614 docker.cpp:1568] Running docker kill on container 
'ecc5078b-c7ac-43e0-8977-7278ae9d1784'
I1220 00:30:21.774128 31615 docker.cpp:1646] Executor for container 
'ecc5078b-c7ac-43e0-8977-7278ae9d1784' has exited
E1220 00:30:21.777487 31600 slave.cpp:2323] Failed to update resources for 
container ecc5078b-c7ac-43e0-8977-7278ae9d1784 of executor 
thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
 running task 
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4 
on status update for terminal task, destroying container: Failed to determine 
cgroup for the 'cpu' subsystem: Failed to read /proc/1280/cgroup: Failed to 
open file '/proc/1280/cgroup': No such file or directory
I1220 00:30:21.778852 31600 status_update_manager.cpp:317] Received status 
update TASK_KILLED (UUID: 3d84baf6-95c6-4e48-ad4d-11d02ec6424e) for task 
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4 
of framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:22.709486 31604 slave.cpp:2437] Forwarding the update TASK_KILLED 
(UUID: 3d84baf6-95c6-4e48-ad4d-11d02ec6424e) for task 
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4 
of framework 20140522-213145-1749004561-5050-29512-0000 to 
[email protected]:5051
I1220 00:30:22.723120 31600 status_update_manager.cpp:389] Received status 
update acknowledgement (UUID: 3d84baf6-95c6-4e48-ad4d-11d02ec6424e) for task 
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4 
of framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:22.723160 31600 status_update_manager.hpp:346] Checkpointing ACK 
for status update TASK_KILLED (UUID: 3d84baf6-95c6-4e48-ad4d-11d02ec6424e) for 
task 
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4 
of framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:30.138429 31610 slave.cpp:2834] Executor 
'thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
 of framework 20140522-213145-1749004561-5050-29512-0000 has terminated with 
unknown status
I1220 00:30:30.138703 31610 slave.cpp:2978] Cleaning up executor 
'thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
 of framework 20140522-213145-1749004561-5050-29512-0000
{code}
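
To make the final error in the log easier to follow, here is a minimal Python sketch of the kind of lookup that fails. It is not Mesos code; the function names and the `proc_root` parameter are hypothetical and exist only for illustration. It assumes the usual `hierarchy-id:subsystems:path` line format of `/proc/<pid>/cgroup`. If the process behind the checkpointed pid (1280 above) has already exited, the file no longer exists and the lookup cannot succeed, which matches the "No such file or directory" message.

```python
import os
from typing import Optional


def parse_cgroup(text: str, subsystem: str) -> Optional[str]:
    """Return the cgroup path for `subsystem` from /proc/<pid>/cgroup content."""
    for line in text.splitlines():
        # Each line is expected to look like "hierarchy-id:subsys1,subsys2:/path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        if subsystem in parts[1].split(","):
            return parts[2]
    return None


def cgroup_for_pid(pid: int, subsystem: str = "cpu",
                   proc_root: str = "/proc") -> Optional[str]:
    """Look up the cgroup of `pid`; return None if the process is already gone."""
    path = os.path.join(proc_root, str(pid), "cgroup")
    try:
        with open(path) as f:
            return parse_cgroup(f.read(), subsystem)
    except FileNotFoundError:
        # The process exited before we looked (hypothetical handling:
        # treat it as "no cgroup" instead of raising an error).
        return None
```

In this sketch a missing `/proc/<pid>/cgroup` is reported as `None` rather than as an error; whether the slave should treat an already-exited executor the same way on a terminal status update is a separate question.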

> failed to determine cgroup for the 'cpu' subsystem
> --------------------------------------------------
>
>                 Key: MESOS-1837
>                 URL: https://issues.apache.org/jira/browse/MESOS-1837
>             Project: Mesos
>          Issue Type: Bug
>          Components: general
>    Affects Versions: 0.20.1
>         Environment: Ubuntu 14.04
>            Reporter: Chris Fortier
>
> Attempting to launch Docker container with Marathon. Container is launched 
> then fails. 
> A search of /var/log/syslog reveals:
> Sep 27 03:01:43 vagrant-ubuntu-trusty-64 mesos-slave[1409]: E0927 
> 03:01:43.546957  1463 slave.cpp:2205] Failed to update resources for 
> container 8c2429d9-f090-4443-8108-0206ca37f3fd of executor 
> hello-world.970dbe74-45f2-11e4-8b1d-56847afe9799 running task 
> hello-world.970dbe74-45f2-11e4-8b1d-56847afe9799 on status update for 
> terminal task, destroying container: Failed to determine cgroup for the 'cpu' 
> subsystem: Failed to read /proc/9792/cgroup: Failed to open file 
> '/proc/9792/cgroup': No such file or directory



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
