> On June 12, 2016, 1:09 p.m., Qian Zhang wrote:
> > src/slave/containerizer/mesos/isolators/cgroups/subsystem.cpp, lines 674-677
> > <https://reviews.apache.org/r/45363/diff/8/?file=1347662#file1347662line674>
> >
> >     I see here we call `lambda::bind()`, but in the original `cgroups/mem` 
> > isolator, we call defer():
> >     
> > https://github.com/apache/mesos/blob/0.28.2/src/slave/containerizer/mesos/isolators/cgroups/mem.cpp#L605:L609
> >     
> >     That means `MemorySubsystem::oomWaited()` will be executed in a 
> > separate thread. So my concern is that this approach may have a 
> > multi-threaded race condition, e.g., what if `MemorySubsystem::oomWaited()` 
> > is triggered after the container exits (i.e., we have already cleaned it 
> > up)? I think the original `cgroups/mem` isolator has already considered 
> > this issue; please refer to this comment for details:
> >     
> > https://github.com/apache/mesos/blob/0.28.2/src/slave/containerizer/mesos/isolators/cgroups/mem.cpp#L634:L637
> 
> haosdent huang wrote:
>     For
>     ```
>         // It is likely that process exited is executed before this
>         // function (e.g.  The kill and OOM events happen at the same
>         // time, and the process exit event arrives first.) Therefore, we
>         // should not report a fatal error here.
>     ```
>     It is because the eventfd would be triggered during hierarchy 
> destruction. It should be fixed in https://reviews.apache.org/r/46299/
> 
> haosdent huang wrote:
>     I tried to avoid making `Subsystem` extend `Process` before, but I 
> think it is necessary; let me update it here.
> 
> Qian Zhang wrote:
>     I think we need to handle this multi-thread issue anyway (even after 
> https://reviews.apache.org/r/46299/ is fixed). When a container uses 
> more memory than its limit, it will be killed (so 
> `CgroupsMemIsolatorProcess::cleanup()` will be called) and at the same time 
> an OOM event will happen (so `CgroupsMemIsolatorProcess::oom()` will be called), 
> but the order cannot be guaranteed. If `CgroupsMemIsolatorProcess::oom()` is 
> called after `CgroupsMemIsolatorProcess::cleanup()`, which deletes `info`, 
> then we will not find `info` by the containerId. I think that is why we have 
> the following code in the `cgroups/mem` isolator.
>     ```
>       if (!infos.contains(containerId)) {
>         // It is likely that process exited is executed before this
>         // function (e.g.  The kill and OOM events happen at the same
>         // time, and the process exit event arrives first.) Therefore, we
>         // should not report a fatal error here.
>         LOG(INFO) << "OOM detected for an already terminated executor";
>         return;
>       }
>     ```

Yes, we need to extend `Subsystem` from `Process` and use `defer` to avoid race conditions between threads.


- haosdent


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/45363/#review137168
-----------------------------------------------------------


On June 19, 2016, 10:31 a.m., haosdent huang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/45363/
> -----------------------------------------------------------
> 
> (Updated June 19, 2016, 10:31 a.m.)
> 
> 
> Review request for mesos, Gilbert Song, Guangya Liu, Ian Downes, Jie Yu, 
> Kevin Klues, and Qian Zhang.
> 
> 
> Bugs: MESOS-5045
>     https://issues.apache.org/jira/browse/MESOS-5045
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Add `MemorySubsystem` for cgroups unified isolator.
> 
> 
> Diffs
> -----
> 
>   src/slave/containerizer/mesos/isolators/cgroups/cgroups.cpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystem.hpp PRE-CREATION 
>   src/slave/containerizer/mesos/isolators/cgroups/subsystem.cpp PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/45363/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> haosdent huang
> 
>
