Re: Review Request 20818: Refactored cgroups::destroy to use a single pass and to reap processes.

Ian Downes Fri, 16 May 2014 15:26:29 -0700


> On May 14, 2014, 10:16 a.m., Jie Yu wrote:
> > src/linux/cgroups.cpp, lines 1634-1645
> > <https://reviews.apache.org/r/20818/diff/4/?file=572874#file572874line1634>
> >
> >     This sequence of operations was introduced in the commit below. You
> >     should probably contact the author to see if your current logic is OK.
> >     If yes, please add some comments to explain why it is OK.
> >     
> >     commit 77db3cb32f9d25656b86607f6f241b2303dbd982
> >     Author: Brenden Matthews <brenden.matth...@airbnb.com>
> >     Date:   Fri May 10 16:00:15 2013 -0700
> >     
> >         Changed cgroups killTasks() sequence.
> >         
> >         The sequence is as follows:
> >           stop -> kill -> empty -> freeze -> kill -> thaw -> empty
> >         
> >         We also now ignore ESRCH errors from kill() in cgroups::kill().
> >         
> >         Review: https://reviews.apache.org/r/11131


Brenden commented that this sequence was introduced after it was observed that
OOM'ed processes weren't being killed successfully.

At that time the cgroups_isolator was disabling the kernel OOM killer and
trying to handle the OOM in Mesos. Handling OOM in userspace is a bad idea and
can leave processes stuck in weird states. Mesos no longer does this and relies
on the kernel OOM killer instead.


- Ian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/20818/#review42977
-----------------------------------------------------------


On May 5, 2014, 1:16 p.m., Ian Downes wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/20818/
> -----------------------------------------------------------
> 
> (Updated May 5, 2014, 1:16 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, Chi Zhang, Jie Yu, 
> and Vinod Kone.
> 
> 
> Bugs: MESOS-759 and MESOS-976
>     https://issues.apache.org/jira/browse/MESOS-759
>     https://issues.apache.org/jira/browse/MESOS-976
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
>     internal::Destroyer now uses a single pass of freeze, kill, thaw, reap
>     to kill all processes in a freezer cgroup. cgroups::destroy will not
>     complete until all processes have been reaped. A timeout is used for
>     each step so destroy will eventually fail rather than constantly
>     retrying.
> 
>     Added a test for destroying a freezer cgroup containing a stopped
>     process.
> 
> 
> Diffs
> -----
> 
>   src/linux/cgroups.hpp 5a5735721fb9f051eee661edb08d1cdaa163d0f3 
>   src/linux/cgroups.cpp 8202c282f580d027a60ded2081962e96e4860f60 
>   src/slave/containerizer/isolators/cgroups/cpushare.cpp b494a9236210245383e20fa9ab3dbac01e42f8dd 
>   src/slave/containerizer/linux_launcher.cpp 530e0bd64d71bad761a2eab3d6e2f2179a167b4b 
>   src/tests/cgroups_tests.cpp 6ba9de622953e158feadaa9950618b0b13c9e832 
> 
> Diff: https://reviews.apache.org/r/20818/diff/
> 
> 
> Testing
> -------
> 
> make check # Linux
> 
> 
> Thanks,
> 
> Ian Downes
> 
>

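For readers skimming the thread, the single pass of freeze, kill, thaw, reap
from the Description above can be sketched roughly as follows. This is only an
illustration of the control flow (each step runs once with a timeout, and the
pass fails instead of retrying). The names Step, Result and singlePass are
hypothetical and are not the internal::Destroyer code from the patch; stopping
at the first failed step without running later steps is also an assumption.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <string>
#include <vector>

// Hypothetical types for illustration only; not the Mesos API.
struct Step {
  std::string name;
  // Returns true if the step completed within `timeout`, false otherwise.
  std::function<bool(std::chrono::milliseconds timeout)> run;
};

struct Result {
  bool ok;
  std::string failedStep;              // Empty when every step succeeded.
  std::vector<std::string> completed;  // Steps that finished, in order.
};

// Runs every step exactly once, in order. A step that reports failure
// (for example because it timed out) ends the pass immediately; nothing
// is retried.
inline Result singlePass(const std::vector<Step>& steps,
                         std::chrono::milliseconds timeout) {
  Result result{true, "", {}};
  for (const Step& step : steps) {
    assert(step.run && "every step needs a callable");
    if (!step.run(timeout)) {
      result.ok = false;
      result.failedStep = step.name;
      return result;
    }
    result.completed.push_back(step.name);
  }
  return result;
}
```

A caller would pass four steps named freeze, kill, thaw and reap, each wrapping
the real cgroup operation, and treat a Result with ok == false as a failed
destroy.
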