> On July 28, 2016, 12:01 a.m., Jie Yu wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp, line 486
> > <https://reviews.apache.org/r/49827/diff/9/?file=1453251#file1453251line486>
> >
> > Would love to understand why test will cause this issue.
>
> Qian Zhang wrote:
> It seems that comment was introduced by this ticket:
> https://issues.apache.org/jira/browse/MESOS-1438 , and in the test code here:
> https://github.com/apache/mesos/blob/1.0.0/src/tests/cluster.cpp#L552:L559 ,
> I see each container created in the test will be cleaned up, and we have
> already initiated cleanup for those containers in the test code, that might
> be where the race occurs.
>
> haosdent huang wrote:
> Thx @qian!
>
> Jie Yu wrote:
> Calling `destroy` on Containerizer multiple times should not cause
> `isolator->destroy` being called multiple times. I still don't get it
Hi, @jieyu. I check the code again. We have
```
if (container->state == DESTROYING) {
VLOG(1) << "Destroy has already been initiated for '" << containerId << "'";
return;
}
```
so multiple `MesosContainerizer::destroy` should be safe.
But we still need this for these 2 cases:
1. `MesosContainerizer::prepare` are running serially. Suppose we failed in
other isolator preparation and have not performed `CgroupsIsolator::prepare`.
`MesosContainerizer::destroy` would be run in Mesos Agent. And then it would
run `CgroupsIsolator::destroy` as well.
2. Suppose we start Mesos Agent with `--isolation=cgroups/cpu,cgroups/mem` and
start some containers. And restart Mesos Agent with a new isolation flag
`--isolation=cgroups/cpu,cgroups/mem,cgroups/perf_event`. Then when recover
those exists containers, only `cpu`, `cpuacct`, `memory` subsystems could
recovered successfully. But when call `CgroupsIsolator::destroy`, we would try
to cleanup all known subsystems `cpu`, `cpuacct`, `memory`, `perf_event`. This
is what the test case
`MesosContainerizerSlaveRecoveryTest.CGROUPS_ROOT_PERF_RollForward` do exactly.
- haosdent
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/49827/#review143861
-----------------------------------------------------------
On July 31, 2016, 5:47 p.m., haosdent huang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/49827/
> -----------------------------------------------------------
>
> (Updated July 31, 2016, 5:47 p.m.)
>
>
> Review request for mesos, Gilbert Song, Jie Yu, Qian Zhang, and Zhengju Sha.
>
>
> Bugs: MESOS-5041
> https://issues.apache.org/jira/browse/MESOS-5041
>
>
> Repository: mesos
>
>
> Description
> -------
>
> Implemented `CgroupsIsolatorProcess::cleanup`.
>
>
> Diffs
> -----
>
> src/slave/containerizer/mesos/isolators/cgroups/cgroups.hpp
> 422b4653fc3cb3d14c94b93ff456625fc59fbb27
> src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp
> bd20631a9650cf84e99c6489b2e92bc40ed764ca
>
> Diff: https://reviews.apache.org/r/49827/diff/
>
>
> Testing
> -------
>
>
> Thanks,
>
> haosdent huang
>
>