Mark,

Thanks! I just upgraded to 8.1.2. Will these patches work with 8.1.2 or were they intended only for 8.1.1?

Joseph

On 09/10/2012 07:45 AM, Mark Dixon wrote:
Hi,

Way back in May I promised this list a simple integration of gridengine with the cgroup functionality found in recent Linux distributions. I'm not quite sure what happened to all the time between then and now, but I'm fulfilling that promise now.

Please find attached a number of patches that add that functionality. They happen to be prepared against SoGE 8.1.1, but should be readily portable to other gridengine versions e.g. 6.2u5.

Notes:

* This is intentionally a naive, but hopefully extendable, piece of work - integrating all of the functionality the cgroup feature has to offer is beyond the scope of the problem I was attempting to solve. If you missed it, the Open Grid Scheduler people have previously announced that they have written and will open source a far more comprehensive implementation: please consider this a stop-gap measure until then.

* When enabled in your gridengine configuration, this patchset alters the behaviour of h_vmem, h_rss and the accounting value vmem. It also introduces two new queue parameters - s_as and h_as.

* h_vmem and vmem will use the actual RAM+swap usage as reported by the cgroup memory controller, instead of simply adding up all the address space usage by all processes in the job. This should result in a much more accurate measurement of host resource usage (as previously discussed on this list).

* h_vmem will no longer set RLIMIT_AS (i.e. the "virtual memory" line in bash's ulimit command) for job processes, as it will now be redundant in the majority of cases and is the second source of gridengine's over-estimate of memory usage by jobs.

* If you really do wish to set RLIMIT_AS for a job, then s_as will set the soft limit and h_as will set the hard limit.

* Setting h_rss will limit the maximum amount of RAM a job can use so, if the admin wanted to allow swapping, h_rss would limit RAM usage and h_vmem would limit RAM+swap usage. The job will only be killed if it hits h_vmem, not h_rss.
* Modifications to existing files are licensed under SISSL version 1.2. New files are under the LGPL version 3.

* To enable:

  1) Add a "CGROUP_MEMORY=<dir>" parameter to your execd_params configuration. "<dir>" should be the path to a directory that exists, under where you have the memory cgroup controller mounted e.g. on RHEL6 with the libcgroup package, /cgroup/memory would work.

  2) If you have queues from an earlier version of gridengine, you will need to edit them and set "s_as" and "h_as" to "INFINITY" (instead of the default, "NONE").

* Housekeeping: deletion of the cgroup after a job ends can fail due to there still being memory in use. Possible examples of this: cached I/O and shared libraries that were originally loaded by the job but still in use elsewhere on the system.

If you really don't want the new attributes s_as and h_as, don't bother with the 2nd and 8th patches; however, this will also re-enable h_vmem's setting of RLIMIT_AS, and I would strongly recommend that you disable it (search for RLIMIT_AS in source/daemons/shepherd/setrlimits.c), as it would defeat the whole point of using the memory cgroup (from my personal perspective).

I had intended to try to write a version of it for non-cgroup enabled Linux systems, which would make a "best effort" but couldn't strictly enforce the limit. I didn't get round to it. Sorry.

While I'm saying sorry, in no particular order, I apologise for: the posting of a SoGE patch to the main user list (but I did promise a patch here - and it's reasonably generally applicable); the use of MIME (but list servers and mail clients often munge patches); skipping the typical one-patch-per-mail convention (wanted to minimise email to the uninterested) and anything else anyone finds irritating :)

If you find it useful, or would find it useful but don't like the license, please let me know.
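
To illustrate the RAM versus RAM+swap distinction in the notes above, here is a minimal Python sketch of reading the two usage counters that the cgroup v1 memory controller exposes. It is not taken from the patches. The file names memory.usage_in_bytes and memory.memsw.usage_in_bytes are the kernel's own (the second exists only when swap accounting is enabled), but the idea that each job gets its own directory under the CGROUP_MEMORY path, and the helper names read_counter, job_memory and make_fake_cgroup, are assumptions made for the example.

```python
import os
import tempfile

# File names used by the Linux cgroup v1 memory controller.
RAM_USAGE = "memory.usage_in_bytes"             # RAM only
RAM_SWAP_USAGE = "memory.memsw.usage_in_bytes"  # RAM + swap


def read_counter(cgroup_dir, name):
    """Return the integer byte count stored in one controller file."""
    with open(os.path.join(cgroup_dir, name)) as f:
        return int(f.read().strip())


def job_memory(cgroup_dir):
    """Return (ram_bytes, ram_plus_swap_bytes) for one cgroup directory.

    Assumption: cgroup_dir is a per-job directory somewhere under the
    path given by CGROUP_MEMORY; the actual layout used by the patches
    is not described in this post.
    """
    return (read_counter(cgroup_dir, RAM_USAGE),
            read_counter(cgroup_dir, RAM_SWAP_USAGE))


def make_fake_cgroup(ram, ram_swap):
    """Build a stand-in directory so the sketch runs without a real cgroup."""
    d = tempfile.mkdtemp()
    for name, value in ((RAM_USAGE, ram), (RAM_SWAP_USAGE, ram_swap)):
        with open(os.path.join(d, name), "w") as f:
            f.write("%d\n" % value)
    return d


if __name__ == "__main__":
    mib = 1024 ** 2
    demo = make_fake_cgroup(512 * mib, 768 * mib)
    ram, ram_swap = job_memory(demo)
    print("RAM: %d MiB, RAM+swap: %d MiB" % (ram // mib, ram_swap // mib))
```

On a real host you would pass an actual cgroup directory to job_memory instead of the stand-in; the second number is the kind of figure h_vmem and vmem are described as using, and the first is the kind h_rss is described as limiting.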
All the best,

Mark

_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users
