On Wed, 12 Sep 2012, Dave Love wrote:
...
Thanks! I just upgraded to 8.1.2. Will these patches work with 8.1.2
or were they intended only for 8.1.1?
I started this work but noticed that it collides with, or duplicates,
functionality in the newer code (e.g. see below).
If the patchset goes anywhere but is not merged into mainline Son of
Gridengine, I'll post updates to one of the soge lists.
They may or may not apply. However, at least the configuration is
inconsistent with the cpuset stuff in 8.1.2 (and subsequent work).
We'll work it out, and I'm happy to hear opinions on whether it's best
to define the cgroup location via execd_params or by finding an
externally-created one (made at execd startup with knowledge about the
host concerned, which I think is an admin win).
There are advantages to allowing the admin to specify the location of the
cgroup or similar entity. For example, I sometimes run two copies of
gridengine on the same machines (production and development/testing) and
want to keep the cgroups separate to avoid name clashes.
Using my patchset, I can (and do) set
CGROUP_MEMORY=/cgroup/memory/sge_prod on one installation and
CGROUP_MEMORY=/cgroup/memory/sge_dev on the other.
However, the mess that results from continually extending execd_params is
clearly unsustainable.
We already have entries like execd_spool_dir and xterm in "qconf -mconf
<host|global>" - how about adding new entries like "cgroup_memory",
"cgroup_cpuset", etc.? e.g. setting them to something like "none" or
"false" could disable the relevant feature, "auto" or "true" to trigger
your automatic code, or a specific path to override it.
(apologies if I've missed the point - I've not had much time to look at
8.1.2)
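
To make the suggested semantics concrete, here is a rough sketch of how one such value might be interpreted. It is only an illustration: "cgroup_memory" and "cgroup_cpuset" are proposed entries that do not exist in any release, and the function and class names below are hypothetical, not taken from the patchset or from 8.1.2.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CgroupSetting:
    """Interpreted value of a hypothetical cgroup_* configuration entry."""
    mode: str                   # "disabled", "auto" or "path"
    path: Optional[str] = None  # only set when mode == "path"


def parse_cgroup_setting(value: str) -> CgroupSetting:
    """Map a raw configuration value onto one of the three proposed behaviours."""
    text = value.strip()
    lowered = text.lower()
    if lowered in ("none", "false"):
        # Feature switched off for this host (or globally).
        return CgroupSetting("disabled")
    if lowered in ("auto", "true"):
        # Leave it to the automatic detection code to find the cgroup.
        return CgroupSetting("auto")
    if text.startswith("/"):
        # Admin override: use exactly this location.
        return CgroupSetting("path", text)
    raise ValueError(f"unrecognised cgroup setting: {value!r}")


# The two installations mentioned above, plus the automatic and disabled cases.
for raw in ("/cgroup/memory/sge_prod", "/cgroup/memory/sge_dev", "auto", "none"):
    print(raw, "->", parse_cgroup_setting(raw))
```

Rejecting anything that is neither a keyword nor an absolute path is an assumption on my part; it just seemed safer than silently treating a typo as a relative path.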
More importantly, there seem to be problems with what the memory
controller reports, but it's not clear in what versions, and how you
check. I need to chase that up, but I'd be glad of information from
anyone who knows gory (historical) details of the memory controller.
"This can may contain worms."
...
Well said! This is obviously a new area and, aside from the obvious
problems, we may still get interesting interactions with some of the more
exotic things regularly found in our environments (e.g. Lustre,
InfiniBand, etc.). It's going to be "fun" working it out...
Mark
--
-----------------------------------------------------------------
Mark Dixon Email : [email protected]
HPC/Grid Systems Support Tel (int): 35429
Information Systems Services Tel (ext): +44(0)113 343 5429
University of Leeds, LS2 9JT, UK
-----------------------------------------------------------------