> On May 3, 2013, 6:45 p.m., Ben Mahler wrote:
> > Were you running into this issue when using process isolation, or cgroups
> > isolation?
>
> Brenden Matthews wrote:
>     Using cgroups isolation.
>
>     I'm still having a major issue where the JVM occasionally 'runs away' and
>     the load averages go through the roof. Without a simple check like this, the
>     slave will keep accepting tasks which hang forever.
>
>     I still haven't figured out the root cause of the JVM getting stuck.
>     Between strace and jstack (which usually hangs forever) there aren't any good
>     indicators of what's going on.
>
> Ben Mahler wrote:
>     We've seen several issues when there's heavy disk I/O on a machine as
>     well, since there's currently no disk isolation in place.

Yeah, I figured I wasn't the only one having this problem. I think CFQ
(https://lwn.net/Articles/427961/) might be the way to go. For this, I need to
go after the low-hanging fruit first.


- Brenden


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/10928/#review20128
-----------------------------------------------------------


On May 3, 2013, 6:39 p.m., Brenden Matthews wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/10928/
> -----------------------------------------------------------
>
> (Updated May 3, 2013, 6:39 p.m.)
>
>
> Review request for mesos.
>
>
> Description
> -------
>
> From 69b4dc2e1fc778b2d8377eb4ec03f793c33e8061 Mon Sep 17 00:00:00 2001
> From: Brenden Matthews <[email protected]>
> Date: Mon, 29 Apr 2013 11:35:53 -0700
> Subject: [PATCH 5/9] Slave feature: maximum system load.
>
> When the load exceeds a specified value, don't accept tasks. Some nodes
> may become unstable under excessive load (i.e., heavy disk I/O), and
> this helps prevent the assigning of further tasks to busy slaves.
> ---
>  src/slave/flags.hpp | 11 ++++++++++-
>  src/slave/slave.cpp | 43 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 53 insertions(+), 1 deletion(-)
>
>
> Diffs
> -----
>
>   src/slave/flags.hpp f3cbe3d
>   src/slave/slave.cpp 86a15fc
>
> Diff: https://reviews.apache.org/r/10928/diff/
>
>
> Testing
> -------
>
> Used in production at airbnb.
>
>
> Thanks,
>
> Brenden Matthews
>
>
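
For readers without the diff at hand, the check described in the patch summary ("when the load exceeds a specified value, don't accept tasks") could look roughly like the sketch below. This is not the code from the patch, which is not reproduced in this message. The names readOneMinuteLoad and shouldAcceptTasks are hypothetical, and using the 1-minute load average from getloadavg() (a glibc/BSD extension) is an assumption about how the load might be sampled.

```cpp
#include <cstdlib>  // getloadavg(): glibc/BSD extension, not standard C++

// Hypothetical helper (not from the patch): reads the 1-minute load
// average into *load. Returns false if it could not be obtained.
bool readOneMinuteLoad(double* load)
{
  double samples[1];
  if (getloadavg(samples, 1) != 1) {
    return false;
  }
  *load = samples[0];
  return true;
}

// Hypothetical policy (not from the patch): refuse new tasks only when
// the current load strictly exceeds the configured maximum.
bool shouldAcceptTasks(double load, double maxLoad)
{
  return load <= maxLoad;
}
```

A caller would sample the load with readOneMinuteLoad() before launching a task and decline the task when shouldAcceptTasks() returns false. What to do when the load cannot be read (accept or refuse) is a policy choice this sketch leaves to the caller.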
