On 06/23/2014 08:41 PM, Benjamin Mahler wrote:
Since cgroups_enable_cfs could potentially leave CPU resources unused, I
presume it has a corresponding advantage, but I'm not sure what it is. When
would one be preferred over the other?


Setting the upper limit provides predictability. This is desired when
running online workloads, like web servers.

For example, let's say that you're happily running along with all your web
services using 16 cores instead of the 4 you were allocated. When the system
becomes further constrained and you're forced back down to 4 cores, you will
have a bad time. :)

It is currently configured at the slave level, but of course it might be
nice to provide different cpu isolation characteristics at the executor
level in the future.
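
For concreteness, here's a rough sketch of how a "cpus" allocation could map
onto the two cgroup knobs being compared. This is not the Mesos isolator code;
the 1024-shares-per-cpu scaling and the 100ms CFS period are the usual cgroup
defaults, assumed here purely for illustration:

```python
# Rough sketch (not the Mesos isolator): translating a "cpus" resource into
# the two cgroup knobs discussed above. The 1024 shares per cpu and the
# 100ms CFS period are assumed defaults, not values read from Mesos.

CPU_SHARES_PER_CPU = 1024   # relative weight per cpu
CFS_PERIOD_US = 100000      # 100ms scheduling period

def cpu_shares(cpus):
    # cpu.shares: a relative weight. Only enforced under contention, so a
    # task can "burst" above its allocation when the host is otherwise idle.
    return int(cpus * CPU_SHARES_PER_CPU)

def cfs_quota_us(cpus):
    # cpu.cfs_quota_us: a hard cap per period. The task never exceeds its
    # allocation, which is the predictability --cgroups_enable_cfs buys you.
    return int(cpus * CFS_PERIOD_US)

if __name__ == "__main__":
    for cpus in (0.5, 4.0):
        print("cpus=%-4s shares=%-5d quota_us=%d"
              % (cpus, cpu_shares(cpus), cfs_quota_us(cpus)))
```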


Thanks, that makes a lot of sense. I agree it might be useful to allow this to be a finer-grained setting than the slave level, though. That way batch jobs could 'burst' while service jobs keep humming along within their tight bounds.


## mem
Is there a best practice for frameworks to choose memory limits that
include RSS + page cache?  In particular, I'm unsure how to reason about
page cache use since a page is counted against whichever cgroup happened to
access it first.


The page cache accounting was a surprise to us as well. Have you seen Ian's
reply here:

http://mail-archives.apache.org/mod_mbox/mesos-user/201406.mbox/%3CCAAJX7shQ_FB6qmvDBYJv5%2Bdh5VgG3WyaedBY_cQo1bY4cPvsDA%40mail.gmail.com%3E

I believe we didn't see this on newer kernels.
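
In case it helps with reasoning about where the cache ends up, here's a small
sketch that reads a container's cgroup v1 memory.stat to split usage into RSS
versus page cache. The mount point and the container path are assumptions;
adjust them for your slave's cgroup hierarchy:

```python
# Sketch: split a task's charged memory into RSS vs. page cache by reading
# its cgroup's memory.stat. Assumes a cgroup v1 memory hierarchy mounted at
# /sys/fs/cgroup/memory; the container path below is hypothetical.

def read_memory_stat(cgroup_path):
    """Return the memory.stat counters for a cgroup as a dict (bytes)."""
    stats = {}
    with open(cgroup_path + "/memory.stat") as f:
        for line in f:
            key, value = line.split()
            stats[key] = int(value)
    return stats

if __name__ == "__main__":
    # Hypothetical path to an executor's cgroup; substitute your own.
    cgroup = "/sys/fs/cgroup/memory/mesos/<container-id>"
    stats = read_memory_stat(cgroup)
    # "rss" is anonymous memory; "cache" is page cache charged to this
    # cgroup, i.e. pages that this cgroup happened to touch first.
    print("rss:   %d bytes" % stats["rss"])
    print("cache: %d bytes" % stats["cache"])
```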


That thread is very helpful, although I admit that "it depends on the kernel version" does not fill me with hope.

The use case I am studying Mesos for might be (slightly) atypical in that I don't have a separate DFS or database that tasks fetch from. Instead, both batch and latency-sensitive tasks read and write the local file system, and in many cases the same files are read by multiple tasks. In that environment I really don't see a sensible way to slice up upper bounds on page cache use without the static partitioning resulting in significant inefficiencies.
