[
https://issues.apache.org/jira/browse/MESOS-9076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16544871#comment-16544871
]
Qian Zhang commented on MESOS-9076:
-----------------------------------
The root cause is that a leading slash in {{--cgroups_root}} makes [this
check|https://github.com/apache/mesos/blob/1.6.0/src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp#L241]
evaluate to false, so the agent cgroup is treated as an unknown orphaned
container.
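
To illustrate the failure mode, here is a minimal Python sketch, not the
actual Mesos code. It assumes the linked check is a plain string comparison
between a cgroup name found under the hierarchy (assumed here to be reported
without a leading slash) and {{--cgroups_root}} joined with {{slave}}. The
function names are hypothetical, and the normalized variant is only one
possible way to make the comparison robust, not necessarily the fix that
will be applied.

```python
import posixpath


def agent_cgroup(cgroups_root: str) -> str:
    """Expected agent cgroup path: <cgroups_root>/slave."""
    return posixpath.join(cgroups_root, "slave")


def is_agent_cgroup(found: str, cgroups_root: str) -> bool:
    # Plain string equality: a leading slash on only one side breaks the match.
    return found == agent_cgroup(cgroups_root)


def is_agent_cgroup_normalized(found: str, cgroups_root: str) -> bool:
    # Hypothetical variant: strip leading slashes from both sides first.
    return found.lstrip("/") == agent_cgroup(cgroups_root).lstrip("/")


# Assumption: cgroups discovered under the hierarchy carry no leading slash.
found = "mesos/slave"

print(is_agent_cgroup(found, "mesos"))              # match: agent cgroup is skipped
print(is_agent_cgroup(found, "/mesos"))             # no match: looks like an orphan
print(is_agent_cgroup_normalized(found, "/mesos"))  # match again after normalizing
```

Under these assumptions, {{/mesos}} yields {{/mesos/slave}}, which never
equals {{mesos/slave}}, so the agent cgroup falls through to the orphan
cleanup path.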
> Mesos agent will be wrongly treated as unknown orphaned container if
> `--cgroups_root` has a leading slash
> ---------------------------------------------------------------------------------------------------------
>
> Key: MESOS-9076
> URL: https://issues.apache.org/jira/browse/MESOS-9076
> Project: Mesos
> Issue Type: Bug
> Components: agent, containerization
> Reporter: Qian Zhang
> Priority: Major
>
> When the agent is started with the following flags:
> * --cgroups_root=<a value with a leading slash>, e.g., {{/mesos}}.
> * --agent_subsystems=<some cgroups subsystems>, e.g., {{memory}}.
> * --isolation=<some cgroups subsystems which contain the ones specified in
> --agent_subsystems>, e.g., {{cgroups/cpu,cgroups/mem}}.
> the agent is treated as an unknown orphaned container:
> {code:java}
> I0716 14:55:59.969400 3892 containerizer.cpp:718] Recovering Mesos containers
> I0716 14:55:59.970304 3888 linux_launcher.cpp:299] Recovering Linux launcher
> I0716 14:55:59.973687 3894 containerizer.cpp:1025] Recovering isolators
> W0716 14:55:59.978680 3891 cgroups.cpp:352] Couldn't find the cgroup
> '/mesos/slave' in hierarchy '/sys/fs/cgroup/cpu,cpuacct' for container slave
> I0716 14:55:59.981603 3892 memory.cpp:478] Started listening for OOM events
> for container slave
> I0716 14:55:59.983178 3892 memory.cpp:590] Started listening on 'low' memory
> pressure events for container slave
> I0716 14:55:59.985761 3892 memory.cpp:590] Started listening on 'medium'
> memory pressure events for container slave
> I0716 14:55:59.987036 3892 memory.cpp:590] Started listening on 'critical'
> memory pressure events for container slave
> I0716 14:55:59.988668 3893 cgroups.cpp:320] Cleaning up unknown orphaned
> container slave{code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)