[
https://issues.apache.org/jira/browse/MESOS-1837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14274091#comment-14274091
]
Bhuvan Arumugam commented on MESOS-1837:
----------------------------------------
We are facing the same issue when running Docker-based jobs using Mesos and
Aurora. The issue is not always reproducible. It happens with a simple bash
script that runs indefinitely, and it happens in different clusters.
In our case, the Mesos slave either doesn't seem to start thermos-executor, or
it looks for the executor pid file before the executor has been started.
We are using Docker 1.2 and Mesos 0.21 as of
d9cd0e318d0261e39ff4b91f494117ab3a555a4e.
https://github.com/apache/mesos/commit/d9cd0e318d0261e39ff4b91f494117ab3a555a4e
{code}
I1220 00:30:19.447170 31606 slave.cpp:1076] Got assigned task
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
for framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:19.448401 31606 slave.cpp:1186] Launching task
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
for framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:19.450564 31606 slave.cpp:3852] Launching executor
thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
of framework 20140522-213145-1749004561-5050-29512-0000 in work directory
'/tmp/mesos/slaves/20141213-083425-666939665-5051-50967-45/frameworks/20140522-213145-1749004561-5050-29512-0000/executors/thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4/runs/ecc5078b-c7ac-43e0-8977-7278ae9d1784'
I1220 00:30:19.450933 31606 slave.cpp:1312] Queuing task
'1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
for executor
thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
of framework '20140522-213145-1749004561-5050-29512-0000
I1220 00:30:19.467684 31600 docker.cpp:984] Starting container
'ecc5078b-c7ac-43e0-8977-7278ae9d1784' for executor
'thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
and framework '20140522-213145-1749004561-5050-29512-0000'
I1220 00:30:20.865212 31617 docker.cpp:1138] Checkpointing executor's forked
pid 1280 to
'/tmp/mesos/meta/slaves/20141213-083425-666939665-5051-50967-45/frameworks/20140522-213145-1749004561-5050-29512-0000/executors/thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4/runs/ecc5078b-c7ac-43e0-8977-7278ae9d1784/pids/forked.pid'
I1220 00:30:20.868604 31598 slave.cpp:2783] Monitoring executor
'thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
of framework '20140522-213145-1749004561-5050-29512-0000' in container
'ecc5078b-c7ac-43e0-8977-7278ae9d1784'
I1220 00:30:21.664190 31598 slave.cpp:1368] Asked to kill task
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
of framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:21.665225 31598 slave.cpp:2197] Handling status update TASK_KILLED
(UUID: 3d84baf6-95c6-4e48-ad4d-11d02ec6424e) for task
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
of framework 20140522-213145-1749004561-5050-29512-0000 from @0.0.0.0:0
W1220 00:30:21.665529 31598 slave.cpp:1465] Killing the unregistered executor
'thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
of framework 20140522-213145-1749004561-5050-29512-0000 because it has no tasks
I1220 00:30:21.668202 31614 docker.cpp:1473] Destroying container
'ecc5078b-c7ac-43e0-8977-7278ae9d1784'
I1220 00:30:21.668301 31614 docker.cpp:1568] Running docker kill on container
'ecc5078b-c7ac-43e0-8977-7278ae9d1784'
I1220 00:30:21.774128 31615 docker.cpp:1646] Executor for container
'ecc5078b-c7ac-43e0-8977-7278ae9d1784' has exited
E1220 00:30:21.777487 31600 slave.cpp:2323] Failed to update resources for
container ecc5078b-c7ac-43e0-8977-7278ae9d1784 of executor
thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
running task
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
on status update for terminal task, destroying container: Failed to determine
cgroup for the 'cpu' subsystem: Failed to read /proc/1280/cgroup: Failed to
open file '/proc/1280/cgroup': No such file or directory
I1220 00:30:21.778852 31600 status_update_manager.cpp:317] Received status
update TASK_KILLED (UUID: 3d84baf6-95c6-4e48-ad4d-11d02ec6424e) for task
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
of framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:22.709486 31604 slave.cpp:2437] Forwarding the update TASK_KILLED
(UUID: 3d84baf6-95c6-4e48-ad4d-11d02ec6424e) for task
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
of framework 20140522-213145-1749004561-5050-29512-0000 to
[email protected]:5051
I1220 00:30:22.723120 31600 status_update_manager.cpp:389] Received status
update acknowledgement (UUID: 3d84baf6-95c6-4e48-ad4d-11d02ec6424e) for task
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
of framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:22.723160 31600 status_update_manager.hpp:346] Checkpointing ACK
for status update TASK_KILLED (UUID: 3d84baf6-95c6-4e48-ad4d-11d02ec6424e) for
task
1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4
of framework 20140522-213145-1749004561-5050-29512-0000
I1220 00:30:30.138429 31610 slave.cpp:2834] Executor
'thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
of framework 20140522-213145-1749004561-5050-29512-0000 has terminated with
unknown status
I1220 00:30:30.138703 31610 slave.cpp:2978] Cleaning up executor
'thermos-1419035419430-tilter-prod-tilter-docker-0-a0fc4f22-5397-46df-b9b5-25133eb6cca4'
of framework 20140522-213145-1749004561-5050-29512-0000
{code}
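For reference, the last error in the log is what you get when /proc/<pid>/cgroup is read after the process has already exited (here pid 1280, shortly after docker kill). Below is a minimal Python sketch of that lookup. It is not the Mesos implementation, which is C++. The function names, the proc_root parameter and the sample cgroup path are made up for illustration, and the line format assumed is the cgroup v1 "hierarchy-ID:controller-list:path" layout.

```python
import os
from typing import Optional


def cgroup_for_subsystem(cgroup_text: str, subsystem: str) -> Optional[str]:
    """Return the cgroup path for `subsystem` from /proc/<pid>/cgroup text.

    Assumes the cgroup v1 line format "hierarchy-ID:controller-list:path".
    """
    for line in cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue  # skip malformed or empty lines
        controllers = parts[1].split(",")
        if subsystem in controllers:
            return parts[2]
    return None


def read_cgroup(pid: int, subsystem: str, proc_root: str = "/proc") -> Optional[str]:
    """Return the cgroup of `pid` for `subsystem`, or None if the pid is gone.

    `proc_root` is a hypothetical parameter added so the lookup can be
    pointed at a fake directory tree when testing.
    """
    path = os.path.join(proc_root, str(pid), "cgroup")
    try:
        with open(path) as f:
            return cgroup_for_subsystem(f.read(), subsystem)
    except (FileNotFoundError, ProcessLookupError):
        # The process has exited, so /proc/<pid> no longer exists.
        return None


if __name__ == "__main__":
    sample = "4:cpu,cpuacct:/docker/example\n3:memory:/docker/example\n"
    print(cgroup_for_subsystem(sample, "cpu"))
    print(read_cgroup(1280, "cpu"))
```

A caller that treats a None result as "process already gone" can skip the resource update instead of logging it as an error. Whether that is the right behaviour for the slave is a separate question.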
> failed to determine cgroup for the 'cpu' subsystem
> --------------------------------------------------
>
> Key: MESOS-1837
> URL: https://issues.apache.org/jira/browse/MESOS-1837
> Project: Mesos
> Issue Type: Bug
> Components: general
> Affects Versions: 0.20.1
> Environment: Ubuntu 14.04
> Reporter: Chris Fortier
>
> Attempting to launch a Docker container with Marathon. The container is
> launched, then fails.
> A search of /var/log/syslog reveals:
> Sep 27 03:01:43 vagrant-ubuntu-trusty-64 mesos-slave[1409]: E0927
> 03:01:43.546957 1463 slave.cpp:2205] Failed to update resources for
> container 8c2429d9-f090-4443-8108-0206ca37f3fd of executor
> hello-world.970dbe74-45f2-11e4-8b1d-56847afe9799 running task
> hello-world.970dbe74-45f2-11e4-8b1d-56847afe9799 on status update for
> terminal task, destroying container: Failed to determine cgroup for the 'cpu'
> subsystem: Failed to read /proc/9792/cgroup: Failed to open file
> '/proc/9792/cgroup': No such file or directory