[ 
https://issues.apache.org/jira/browse/MESOS-6414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-6414:
----------------------------
    Description: 
Currently, if we launch a docker container inside the Mesos containerizer, a
race can occur between the docker daemon and the Mesos containerizer during
cgroups operations. For example, when a docker container running in the Mesos
containerizer exits due to OOM, the Mesos containerizer destroys the following
hierarchies:

{code}
/sys/fs/cgroup/freezer/mesos/<mesos-cgroup>/<docker-cgroup>
/sys/fs/cgroup/freezer/mesos/<mesos-cgroup>
{code}

But the docker daemon may destroy 

{code}
/sys/fs/cgroup/freezer/mesos/<mesos-cgroup>/<docker-cgroup>
{code}

at the same time.

If the docker daemon destroys the hierarchy first, the Mesos containerizer
fails during {{CgroupsIsolatorProcess::cleanup}} because it cannot find that
hierarchy when it tries to destroy it.
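
A minimal sketch of one way to tolerate this race, not the actual Mesos code or necessarily the fix adopted for this issue: treat "hierarchy already gone" as a successful removal instead of an error. The function names ({{remove_cgroup}}, {{destroy_hierarchy}}, {{simulate_race}}) are hypothetical, and a temporary directory stands in for {{/sys/fs/cgroup/freezer/mesos}}. Freezing and thawing are not modeled.

```python
import os
import tempfile


def remove_cgroup(path):
    """Remove one cgroup directory, treating 'already gone' as success.

    Returns True if this call removed the directory, and False if
    another party (e.g. the docker daemon) had already removed it.
    """
    try:
        os.rmdir(path)
        return True
    except FileNotFoundError:
        # Lost the race: the directory vanished before we got to it.
        return False


def destroy_hierarchy(root):
    """Remove `root` and all nested cgroups, children before parents."""
    removed = []
    # topdown=False yields the deepest directories first, which is the
    # order in which nested cgroups have to be removed.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if remove_cgroup(dirpath):
            removed.append(dirpath)
    return removed


def simulate_race():
    """Replay the scenario above with plain directories (assumption:
    a temp dir stands in for the real freezer hierarchy)."""
    base = tempfile.mkdtemp()
    mesos_cgroup = os.path.join(base, "mesos-cgroup")
    docker_cgroup = os.path.join(mesos_cgroup, "docker-cgroup")
    os.makedirs(docker_cgroup)

    # The "docker daemon" destroys its nested cgroup first.
    os.rmdir(docker_cgroup)

    # The "containerizer" still believes the nested cgroup exists.
    nested_removed = remove_cgroup(docker_cgroup)
    removed = destroy_hierarchy(mesos_cgroup)

    still_exists = os.path.exists(mesos_cgroup)
    os.rmdir(base)
    return nested_removed, [os.path.basename(p) for p in removed], still_exists
```

With this approach the cleanup of the parent hierarchy still completes even when the nested hierarchy disappears in between, at the cost of no longer distinguishing "removed by someone else" from "never existed".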

  was:
If a Mesos task is launched in a cgroup outside of the context of Mesos, Mesos
is unaware of that cgroup created in the task context.

When the Mesos task terminates, Mesos tries to clean up all cgroups within
the top-level cgroup it knows about. If the cgroup created in the task context
exists when LinuxLauncherProcess::destroy() is called but is cleaned up by the
container before we do a freeze(), thaw(), or remove(), the operation fails at
that stage, leading to an incomplete cleanup of the container.


> Task cleanup fails when the containers includes cgroups not owned by Mesos
> --------------------------------------------------------------------------
>
>                 Key: MESOS-6414
>                 URL: https://issues.apache.org/jira/browse/MESOS-6414
>             Project: Mesos
>          Issue Type: Bug
>          Components: cgroups
>            Reporter: Anindya Sinha
>            Assignee: Anindya Sinha
>            Priority: Minor
>
> Currently, if we launch a docker container inside the Mesos containerizer, a
> race can occur between the docker daemon and the Mesos containerizer during
> cgroups operations. For example, when a docker container running in the Mesos
> containerizer exits due to OOM, the Mesos containerizer destroys the following
> hierarchies:
> {code}
> /sys/fs/cgroup/freezer/mesos/<mesos-cgroup>/<docker-cgroup>
> /sys/fs/cgroup/freezer/mesos/<mesos-cgroup>
> {code}
> But the docker daemon may destroy 
> {code}
> /sys/fs/cgroup/freezer/mesos/<mesos-cgroup>/<docker-cgroup>
> {code}
> at the same time.
> If the docker daemon destroys the hierarchy first, the Mesos containerizer
> fails during {{CgroupsIsolatorProcess::cleanup}} because it cannot find that
> hierarchy when it tries to destroy it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
