> On Sept. 9, 2013, 6:03 p.m., Ben Mahler wrote:
> > We should definitely enable the OOM killer!
> > 
> > I would like to rebase off of your changes here into some changes I've been 
> > working on that use memory threshold notifications as a way for us to 
> > induce our own "oom".
> > 
> > I'll describe what a few of us had discussed here:
> >  -> Enable the OOM killer; I'll pull in your change here!
> >  -> Use memory threshold notifications set to the memory limit.
> >  -> When the notification triggers, consider it an OOM and destroy the 
> > container. This can still send memory.stat information.
> >  -> If a process is allocating quickly enough to trigger the OOM killer, 
> > we'll still receive an OOM notification and process it. The downside is 
> > that the memory information will not represent the OOM state, because a 
> > process has already been killed by the time we're notified of the OOM (as 
> > you described).
> > 
> > Do you see any issues with using memory threshold notifications as well?

No issues here, sounds good!
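
For reference, here is a minimal sketch of how the two pieces discussed above (re-enabling the kernel OOM killer and arming a usage threshold at the memory limit) might look against the cgroup v1 memory controller. The function names (formatThresholdRegistration, writeFile, enableOomKiller, registerThreshold) are hypothetical and are not taken from the patch or from src/linux/cgroups.cpp. It assumes the cgroup v1 interface files memory.oom_control, memory.usage_in_bytes and cgroup.event_control.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

#include <fcntl.h>
#include <sys/eventfd.h>
#include <unistd.h>

// Builds the line written to cgroup.event_control to arm a usage threshold:
// "<eventfd> <fd of memory.usage_in_bytes> <threshold in bytes>".
std::string formatThresholdRegistration(
    int eventFd, int usageFd, uint64_t thresholdBytes)
{
  return std::to_string(eventFd) + " " + std::to_string(usageFd) + " " +
         std::to_string(thresholdBytes);
}

// Hypothetical helper: writes `data` to an existing control file.
void writeFile(const std::string& path, const std::string& data)
{
  int fd = ::open(path.c_str(), O_WRONLY | O_CLOEXEC);
  if (fd < 0) {
    throw std::runtime_error("failed to open " + path);
  }
  ssize_t written = ::write(fd, data.data(), data.size());
  ::close(fd);
  if (written != static_cast<ssize_t>(data.size())) {
    throw std::runtime_error("failed to write " + path);
  }
}

// Writing "0" clears oom_kill_disable, so the kernel OOM killer handles
// the cgroup instead of a userspace handler.
void enableOomKiller(const std::string& cgroup)
{
  writeFile(cgroup + "/memory.oom_control", "0");
}

// Returns an eventfd that becomes readable once memory usage crosses
// `thresholdBytes`. The caller owns the descriptor and must close it.
int registerThreshold(const std::string& cgroup, uint64_t thresholdBytes)
{
  int eventFd = ::eventfd(0, EFD_CLOEXEC);
  if (eventFd < 0) {
    throw std::runtime_error("eventfd failed");
  }

  const std::string usagePath = cgroup + "/memory.usage_in_bytes";
  int usageFd = ::open(usagePath.c_str(), O_RDONLY | O_CLOEXEC);
  if (usageFd < 0) {
    ::close(eventFd);
    throw std::runtime_error("failed to open " + usagePath);
  }

  try {
    writeFile(
        cgroup + "/cgroup.event_control",
        formatThresholdRegistration(eventFd, usageFd, thresholdBytes));
  } catch (...) {
    ::close(usageFd);
    ::close(eventFd);
    throw;
  }

  // Assumption: the kernel keeps its own reference once the event is
  // registered, so the usage descriptor is closed here.
  ::close(usageFd);
  return eventFd;
}
```

A caller would pass the container's memory limit as thresholdBytes, then block on an 8-byte read of the returned descriptor (or poll it) and treat a successful read as the induced "oom".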


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/14024/#review25994
-----------------------------------------------------------


On Sept. 6, 2013, 11:05 p.m., David Mackey wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/14024/
> -----------------------------------------------------------
> 
> (Updated Sept. 6, 2013, 11:05 p.m.)
> 
> 
> Review request for mesos, Benjamin Hindman, Ben Mahler, Eric Biederman, and 
> Vinod Kone.
> 
> 
> Bugs: MESOS-662
>     https://issues.apache.org/jira/browse/MESOS-662
> 
> 
> Repository: mesos-git
> 
> 
> Description
> -------
> 
> I post this partially as an RFC. I'm in favor of this approach but happy to 
> have the discussion here.
> 
> The Mesos userspace OOM handler does not conform to the practical
> restrictions imposed upon it given the potential states the kernel can
> be in when it gets the OOM notification. The result of this has been
> numerous deadlocks because the Mesos OOM handler blocks on a lock that
> is being held by the task it is trying to kill.
> 
> This patch does not try to fix the issues with the OOM handler. Instead,
> it hands over the job of OOM-killing to the kernel. The end result is
> very similar. The downside compared to the current approach is that
> when the Mesos OOM handler reads memory.stat, the values will reflect
> the state after the OOM condition occurred. The "maximum usage" is
> still captured but the breakdown is lost. This exposes another
> weakness in the memcg implementation regarding page cache awareness.
> However, the reliability improvements outweigh the weakness in stats.
> 
> 
> Diffs
> -----
> 
>   src/linux/cgroups.hpp 5ee64d6 
>   src/linux/cgroups.cpp 813dcb3 
>   src/slave/cgroups_isolator.cpp a1f5b32 
> 
> Diff: https://reviews.apache.org/r/14024/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> David Mackey
> 
>
