On 06/23/2014 08:41 PM, Benjamin Mahler wrote:
Since cgroups_enable_cfs could potentially leave CPU resources unused, I
presume it has a corresponding advantage, but I'm not sure what it is. When
would one be preferred over the other?


Setting the upper limit provides predictability. This is desired when
running online workloads, like web servers.

For example, let's say that you're happily running along with all your web
services using 16 cores instead of the 4 you were allocated. When the system
becomes further constrained and you're forced back down to 4 cores, you will
have a bad time. :)

It is currently configured at the slave level, but of course it might be
nice to provide different cpu isolation characteristics at the executor
level in the future.
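
For concreteness, here's a rough sketch of how a "cpus" allocation could map
onto the two cgroup knobs being compared. This is not the Mesos isolator code;
the 1024-shares-per-cpu scaling and the 100ms CFS period are the usual cgroup
defaults, assumed here purely for illustration:

```python
# Rough sketch (not the Mesos isolator): translating a "cpus" resource into
# the two cgroup knobs discussed above. The 1024 shares per cpu and the
# 100ms CFS period are assumed defaults, not values read from Mesos.

CPU_SHARES_PER_CPU = 1024   # relative weight per cpu
CFS_PERIOD_US = 100000      # 100ms scheduling period

def cpu_shares(cpus):
    # cpu.shares: a relative weight. Only enforced under contention, so a
    # task can "burst" above its allocation when the host is otherwise idle.
    return int(cpus * CPU_SHARES_PER_CPU)

def cfs_quota_us(cpus):
    # cpu.cfs_quota_us: a hard cap per period. The task never exceeds its
    # allocation, which is the predictability --cgroups_enable_cfs buys you.
    return int(cpus * CFS_PERIOD_US)

if __name__ == "__main__":
    for cpus in (0.5, 4.0):
        print("cpus=%-4s shares=%-5d quota_us=%d"
              % (cpus, cpu_shares(cpus), cfs_quota_us(cpus)))
```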


Thanks, that makes a lot of sense. I agree it might be useful to allow this to be a finer-grained setting than the slave level, though. That way batch jobs could 'burst' while service jobs keep humming along within their tight bounds.


## mem
Is there a best practice for frameworks to choose memory limits that
include RSS + page cache?  In particular, I'm unsure how to reason about
page cache use since a page is counted against whichever cgroup happened to
access it first.


The page cache accounting was a surprise to us as well. Have you seen Ian's
reply here:

http://mail-archives.apache.org/mod_mbox/mesos-user/201406.mbox/%3CCAAJX7shQ_FB6qmvDBYJv5%2Bdh5VgG3WyaedBY_cQo1bY4cPvsDA%40mail.gmail.com%3E

I believe we didn't see this on newer kernels.
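
In case it helps with reasoning about where the cache ends up, here's a small
sketch that reads a container's cgroup v1 memory.stat to split usage into RSS
versus page cache. The mount point and the container path are assumptions;
adjust them for your slave's cgroup hierarchy:

```python
# Sketch: split a task's charged memory into RSS vs. page cache by reading
# its cgroup's memory.stat. Assumes a cgroup v1 memory hierarchy mounted at
# /sys/fs/cgroup/memory; the container path below is hypothetical.

def read_memory_stat(cgroup_path):
    """Return the memory.stat counters for a cgroup as a dict (bytes)."""
    stats = {}
    with open(cgroup_path + "/memory.stat") as f:
        for line in f:
            key, value = line.split()
            stats[key] = int(value)
    return stats

if __name__ == "__main__":
    # Hypothetical path to an executor's cgroup; substitute your own.
    cgroup = "/sys/fs/cgroup/memory/mesos/<container-id>"
    stats = read_memory_stat(cgroup)
    # "rss" is anonymous memory; "cache" is page cache charged to this
    # cgroup, i.e. pages that this cgroup happened to touch first.
    print("rss:   %d bytes" % stats["rss"])
    print("cache: %d bytes" % stats["cache"])
```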


That thread is very helpful, although I admit that "it depends on the kernel version" does not fill me with hope.

The use case I am studying Mesos for might be (slightly) atypical in that I don't have a separate DFS or database that tasks fetch from. Instead, both batch and latency-sensitive tasks read and write the local file system, and in many cases the same files are read by multiple tasks. In that environment I really don't see a sensible way to slice up upper bounds on page cache use without the static partitioning resulting in significant inefficiencies.
