[ 
https://issues.apache.org/jira/browse/MESOS-6489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15611011#comment-15611011
 ] 

Anindya Sinha commented on MESOS-6489:
--------------------------------------

Jotting down some thoughts based on our previous conversation:

In {{Future<Nothing> destroy(const string& hierarchy, const string& cgroup)}}

1. Extract all cgroups (including sub cgroups) in bottom up fashion via 
{{cgroups::get(hierarchy, cgroup)}}

2. If freezer is available:
2a. We use {{TasksKiller}} to freeze the cgroups, {{SIGKILL}} all tasks, and 
thaw the cgroups (possibly in top down fashion). However, we add a new 
attribute to this class, {{bool ignoreMissingCgroup}}. If it is set, 
{{TasksKiller::finished()}} ignores any error for cgroups that no longer exist.
2b. At this point, we remove the cgroups in bottom up fashion, provided 
{{TasksKiller}} reported no error. We bail out with an error if the removal of 
any cgroup fails. As in step 2a, we ignore errors for cgroups that do not 
exist.

3. If freezer is unavailable, we remove the cgroups bottom up using 
{{cgroups::remove(hierarchy, cgroup)}}. If the removal fails because the 
cgroup does not exist, we ignore that failure.

We will have the "ignore error due to missing cgroup" logic in 2 places, viz. 
{{TasksKiller::finished()}} and {{cgroups::destroy}}.
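
To make the removal part (steps 2b and 3) concrete, here is a minimal sketch. It is not the Mesos implementation: it models cgroups as plain directories, and {{getBottomUp}}, {{removeCgroup}} and {{removeAll}} are hypothetical names standing in for {{cgroups::get}}, {{cgroups::remove}} and the removal loop in {{cgroups::destroy}}. The freeze/kill/thaw step is left out entirely. The only real calls are {{::rmdir}} and {{std::filesystem}}.

```cpp
#include <algorithm>
#include <cerrno>
#include <cstring>
#include <filesystem>
#include <iterator>
#include <optional>
#include <string>
#include <system_error>
#include <vector>

#include <unistd.h> // ::rmdir

namespace fs = std::filesystem;

// Hypothetical stand-in for cgroups::get(hierarchy, cgroup): returns the
// cgroup and all nested cgroups, deepest first (bottom up).
std::vector<fs::path> getBottomUp(const fs::path& root)
{
  std::vector<fs::path> result;
  std::error_code ec;

  if (!fs::is_directory(root, ec)) {
    return result;
  }

  result.push_back(root);

  for (fs::recursive_directory_iterator it(root, ec), end;
       !ec && it != end;
       it.increment(ec)) {
    if (it->is_directory(ec)) {
      result.push_back(it->path());
    }
  }

  // Sort by path depth, deepest first, so children precede their parents.
  std::stable_sort(
      result.begin(),
      result.end(),
      [](const fs::path& a, const fs::path& b) {
        return std::distance(a.begin(), a.end()) >
               std::distance(b.begin(), b.end());
      });

  return result;
}

// Hypothetical stand-in for cgroups::remove. Returns an error message on
// failure and std::nullopt on success. With 'ignoreMissingCgroup' set, a
// cgroup that is already gone (ENOENT) counts as success.
std::optional<std::string> removeCgroup(
    const fs::path& cgroup,
    bool ignoreMissingCgroup)
{
  if (::rmdir(cgroup.c_str()) == 0) {
    return std::nullopt;
  }

  const int error = errno; // Save errno before any other call can change it.

  if (error == ENOENT && ignoreMissingCgroup) {
    return std::nullopt;
  }

  return "Failed to remove '" + cgroup.string() + "': " + std::strerror(error);
}

// Removes the given cgroups in order and bails out on the first real error.
std::optional<std::string> removeAll(
    const std::vector<fs::path>& cgroups,
    bool ignoreMissingCgroup)
{
  for (const fs::path& cgroup : cgroups) {
    std::optional<std::string> error =
      removeCgroup(cgroup, ignoreMissingCgroup);

    if (error.has_value()) {
      return error;
    }
  }

  return std::nullopt;
}
```

The list is computed once up front, so anything the container deletes between {{getBottomUp}} and the removal loop shows up as {{ENOENT}}. That is exactly the race the flag is meant to tolerate; any other error (for example a cgroup that still has tasks) is still reported.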

> Better support for containers that want to manage their own cgroup.
> -------------------------------------------------------------------
>
>                 Key: MESOS-6489
>                 URL: https://issues.apache.org/jira/browse/MESOS-6489
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Jie Yu
>
> Some containers want to manage their cgroup by sub-dividing the cgroup that 
> Mesos allocates to them into multiple sub-cgroups and putting subprocesses 
> into the corresponding sub-cgroups.
> For instance, someone wants to run Docker daemon in a Mesos container. Docker 
> daemon will manage the cgroup assigned to it by Mesos (with the help of, for 
> example, cgroups namespace).
> Problems arise during the teardown of the container because two entities 
> might be manipulating the same cgroup simultaneously. For example, the Mesos 
> cgroups::destroy might fail if the task running inside is trying to delete 
> the same nested cgroup at the same time.
> To support that case, we should consider killing all the processes in the Mesos 
> cgroup first, making sure that no one will be creating sub-cgroups and moving 
> new processes into sub-cgroups. And then, destroy the cgroups recursively.
> And we need freezer because we want to make sure all processes are stopped 
> while we are sending kill signals to avoid TOCTTOU race problem. I think it 
> makes more sense to freeze the cgroups (and sub-cgroups) from top down 
> (rather than bottom up because typically, processes in the parent cgroup 
> manipulate sub-cgroups).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
