[jira] [Commented] (MESOS-1765) Use PID namespace to avoid freezing cgroup
[ https://issues.apache.org/jira/browse/MESOS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14125756#comment-14125756 ]

Cong Wang commented on MESOS-1765:
----------------------------------

[~yasumoto] Sure, here is the patch I sent to the Linux kernel list, which contains the description of the bug: https://lkml.org/lkml/2014/9/4/646

Use PID namespace to avoid freezing cgroup
------------------------------------------

Key: MESOS-1765
URL: https://issues.apache.org/jira/browse/MESOS-1765
Project: Mesos
Issue Type: Story
Components: containerization
Reporter: Cong Wang

There is a known kernel issue when we freeze the whole cgroup upon OOM. Mesos can probably just use a PID namespace, so that we only need to kill the init of the PID namespace instead of freezing all the processes and killing them one by one. But I am not quite sure if this would break the existing code.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
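The two approaches contrasted in the issue description can be sketched roughly as below. This is an illustration only, not Mesos code: the function names and the injectable `kill` parameter are hypothetical, the file names `freezer.state` and `cgroup.procs` assume the cgroup v1 freezer layout, and a real implementation would also have to wait for the freeze to complete before killing.

```python
import os
import signal


def kill_via_freezer(cgroup_dir, kill=os.kill):
    """Freeze the cgroup, SIGKILL every listed pid, then thaw it again."""
    state = os.path.join(cgroup_dir, "freezer.state")
    with open(state, "w") as f:
        f.write("FROZEN")
    # Assumption: freezing is treated as immediate here; in practice the
    # state can stay at FREEZING for a while and must be polled.
    with open(os.path.join(cgroup_dir, "cgroup.procs")) as f:
        pids = [int(line) for line in f if line.strip()]
    for pid in pids:
        kill(pid, signal.SIGKILL)
    with open(state, "w") as f:
        f.write("THAWED")
    return pids


def kill_via_pid_namespace(init_pid, kill=os.kill):
    """SIGKILL only the namespace init; the kernel then kills the other
    processes in that PID namespace."""
    kill(init_pid, signal.SIGKILL)
    return [init_pid]
```

The second function needs a single signal and no freezer state transitions, which is the simplification the issue proposes.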
[jira] [Commented] (MESOS-1765) Use PID namespace to avoid freezing cgroup
[ https://issues.apache.org/jira/browse/MESOS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14124979#comment-14124979 ]

Joe Smith commented on MESOS-1765:
----------------------------------

[~wangcong] can you share a link to the kernel bug? (Or a pointer to more discussion?) Sounds like we should also keep tabs on fixing that as well.
[jira] [Commented] (MESOS-1765) Use PID namespace to avoid freezing cgroup
[ https://issues.apache.org/jira/browse/MESOS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123176#comment-14123176 ]

Vinod Kone commented on MESOS-1765:
-----------------------------------

What is the minimum kernel version required for pid namespaces? If that version is newer than the oldest kernel we need to support, we need to figure out a way to use either the freezer (w/ kill) or a pid namespace depending on the kernel version.
[jira] [Commented] (MESOS-1765) Use PID namespace to avoid freezing cgroup
[ https://issues.apache.org/jira/browse/MESOS-1765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123270#comment-14123270 ]

Cong Wang commented on MESOS-1765:
----------------------------------

According to the man page, clone(CLONE_NEWPID) requires kernel >= 2.6.24, and unshare(CLONE_NEWPID) requires at least 3.8. Mesos probably only needs clone(CLONE_NEWPID), so it should be safe, since the current network isolation code already requires kernel 3.4.
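The kernel-version fallback raised in this thread could be sketched as follows. This is a hedged illustration, not Mesos code: `parse_kernel_release`, `choose_kill_strategy`, and the strategy labels are hypothetical names, and the 2.6.24 threshold is the one quoted from the man page above.

```python
import platform
import re

# Minimum kernel for clone(CLONE_NEWPID), as quoted from the man page in this thread.
MIN_CLONE_NEWPID = (2, 6, 24)


def parse_kernel_release(release):
    """Turn a release string such as '3.4.0-generic' into a tuple like (3, 4, 0)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def choose_kill_strategy(release=None):
    """Return 'pid_namespace' if the kernel is new enough, else 'freezer'."""
    if release is None:
        # Falls back to the running kernel's release string.
        release = platform.release()
    if parse_kernel_release(release) >= MIN_CLONE_NEWPID:
        return "pid_namespace"
    return "freezer"
```

For example, `choose_kill_strategy("2.6.18-el5")` selects the freezer path, while any kernel that already satisfies the 3.4 requirement mentioned above selects the PID namespace path.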