[ https://issues.apache.org/jira/browse/MESOS-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
haosdent updated MESOS-6414:
----------------------------
Description:
Currently, if we launch a Docker container inside the Mesos containerizer, a
race may occur between the Docker daemon and the Mesos containerizer during
cgroups operations.
For example, when a Docker container running inside the Mesos containerizer
exits because of an OOM, the Mesos containerizer destroys the following
hierarchies:
{code}
/sys/fs/cgroup/freezer/mesos/<mesos-cgroup>/<docker-cgroup>
/sys/fs/cgroup/freezer/mesos/<mesos-cgroup>
{code}
But the Docker daemon may destroy
{code}
/sys/fs/cgroup/freezer/mesos/<mesos-cgroup>/<docker-cgroup>
{code}
at the same time.
If the Docker daemon destroys the hierarchy first, the Mesos containerizer
fails during {{CgroupsIsolatorProcess::cleanup}} because it cannot find that
hierarchy when it tries to destroy it.
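
To make the race concrete, here is a minimal Python sketch of one possible way
a cleanup routine could tolerate it. This is not the Mesos code: the helper
names list_cgroups and destroy_cgroups are hypothetical, and ordinary empty
directories stand in for cgroup directories. The idea is only that a hierarchy
which disappears between listing and removal is treated as already destroyed
rather than as a failure.

```python
import os


def list_cgroups(root):
    """Return `root` and every nested directory, children before parents."""
    # topdown=False yields the deepest directories first, which is the order
    # rmdir needs: a parent can only be removed after its children.
    return [dirpath for dirpath, _dirs, _files in os.walk(root, topdown=False)]


def destroy_cgroups(paths):
    """Remove each path in order; return the ones that were already gone."""
    already_gone = []
    for path in paths:
        try:
            os.rmdir(path)
        except FileNotFoundError:
            # Another party (here: a stand-in for the Docker daemon) removed
            # this hierarchy first. The goal was its removal, so carry on.
            already_gone.append(path)
    return already_gone
```

Any other error (for example, a directory that is still busy or not empty)
still propagates, so only the "someone else removed it first" case is
swallowed.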
was:
If a Mesos task is launched in a cgroup outside of the context of Mesos, Mesos
is unaware of that cgroup, which was created in the task context.
When the Mesos task terminates, Mesos tries to clean up all cgroups within the
top-level cgroup it knows about. If the cgroup created in the task context
exists when LinuxLauncherProcess::destroy() is called but is cleaned up by the
container before we do a freeze(), thaw(), or remove(), the cleanup fails at
that stage, leaving the container only partially cleaned up.
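
The staged failure described above can be illustrated with a small Python
sketch. Everything here is an assumption for illustration: cleanup_cgroup and
freeze_then_vanish are hypothetical names, and os.stat is only a stand-in for
the real freeze and thaw operations. The sketch shows a cleanup that stops
quietly at whichever stage first finds the cgroup missing, instead of
reporting an incomplete cleanup.

```python
import os


def cleanup_cgroup(cgroup, stages):
    """Run each (name, operation) stage on `cgroup` in order.

    Returns the name of the stage that found the cgroup missing, or None if
    every stage ran to completion.
    """
    for name, operation in stages:
        try:
            operation(cgroup)
        except FileNotFoundError:
            # The cgroup vanished underneath us; nothing is left to clean up.
            return name
    return None


def freeze_then_vanish(cgroup):
    """Stand-in for freeze() after which the container removes its cgroup."""
    os.stat(cgroup)   # the cgroup still exists when this stage starts
    os.rmdir(cgroup)  # simulates the container cleaning up on its own
```

With stages such as [("freeze", freeze_then_vanish), ("thaw", os.stat),
("remove", os.rmdir)], the call returns the name of the stage that lost the
race, which a caller could log instead of treating as an error.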
> Task cleanup fails when the containers includes cgroups not owned by Mesos
> --------------------------------------------------------------------------
>
> Key: MESOS-6414
> URL: https://issues.apache.org/jira/browse/MESOS-6414
> Project: Mesos
> Issue Type: Bug
> Components: cgroups
> Reporter: Anindya Sinha
> Assignee: Anindya Sinha
> Priority: Minor