[ 
https://issues.apache.org/jira/browse/MESOS-2554?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14517529#comment-14517529
 ] 

haosdent commented on MESOS-2554:
---------------------------------

[~jieyu] I am not sure whether my ideas for fixing this bug are correct, so let 
me describe them.

1. Check --isolation and --slave_subsystems when the slave starts. If they are 
inconsistent, print an error message and exit the slave. However, I am not sure 
how to do this check when the isolation parameter is "external".
2. Pass slave_subsystems to the different containerizer.cpp implementations and 
check consistency in each of them. If a container does not use a subsystem that 
is listed in --slave_subsystems, move the container to the root of that 
subsystem's hierarchy. For example, if container A uses 'mem' for isolation and 
the slave is started with 'cpu,mem' as slave_subsystems, then after container A 
starts, move it to the root of the cpu cgroup hierarchy.
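
To make the two ideas concrete, here is a minimal sketch in Python rather than 
Mesos C++. It is not the actual Mesos implementation. The isolator-to-subsystem 
table, the function names, and the use of kernel-style subsystem names such as 
"memory" are all assumptions made for illustration. The "external" case returns 
None because, as noted in point 1, the slave cannot tell which subsystems an 
external containerizer manages.

```python
# Hypothetical mapping from isolator name to the cgroup subsystems it manages.
# This table is an assumption for illustration, not taken from the Mesos source.
ISOLATOR_SUBSYSTEMS = {
    "cgroups/cpu": {"cpu", "cpuacct"},
    "cgroups/mem": {"memory"},
    "cgroups/perf_event": {"perf_event"},
    "posix/cpu": set(),
    "posix/mem": set(),
}


def parse_csv(value):
    """Split a comma-separated flag value into a set of non-empty names."""
    return {item.strip() for item in value.split(",") if item.strip()}


def unisolated_subsystems(isolation, slave_subsystems):
    """Return the slave subsystems that no configured isolator manages.

    Returns None when isolation is "external", because the subsystems used
    by an external containerizer are not known to the slave.
    """
    if isolation.strip() == "external":
        return None
    managed = set()
    for isolator in parse_csv(isolation):
        # Unknown isolators are treated as managing no cgroup subsystem.
        managed |= ISOLATOR_SUBSYSTEMS.get(isolator, set())
    return parse_csv(slave_subsystems) - managed


def check_flags(isolation, slave_subsystems):
    """Idea 1: refuse to start when the two flags are inconsistent."""
    unused = unisolated_subsystems(isolation, slave_subsystems)
    if unused:
        raise ValueError(
            "--slave_subsystems lists subsystems not used for isolation: "
            + ", ".join(sorted(unused))
        )
```

For idea 2, the set returned by unisolated_subsystems() would be the list of 
hierarchies in which a newly started container has to be moved to the root 
cgroup. With isolation "cgroups/mem" and slave_subsystems "cpu,memory", that 
set contains only "cpu", which matches the container A example above.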

I am not sure whether I understand this issue correctly, so please point out 
any mistakes. Thank you very much.

> Slave flaps when using --slave_subsystems that are not used for isolation.
> --------------------------------------------------------------------------
>
>                 Key: MESOS-2554
>                 URL: https://issues.apache.org/jira/browse/MESOS-2554
>             Project: Mesos
>          Issue Type: Bug
>    Affects Versions: 0.21.0, 0.21.1, 0.22.0
>            Reporter: Jie Yu
>            Priority: Critical
>
> Say one uses --slave_subsystems=cpuacct.
> However, if he/she does not use the cpuacct cgroup for isolation, all processes 
> forked by the slave (e.g., tasks) will be part of the slave cgroup. This is 
> not expected. Also, more importantly, this will cause the slave to flap when 
> restarted because there are task processes in the slave's cgroup.
> We should add a check during slave startup at least!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
