First off, I am running UGE (Univa Grid Engine) rather than SGE.
We have two systems, one running 8.5.4 and the other 8.6.5.
Users request memory resources in their job scripts by passing:

"-l h_vmem=1G" (for example).

We use a JSV, and when h_vmem is requested, what actually gets passed
to the scheduler is:

"h_vmem=1G,m_mem_free=1G"
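
To make the rewrite concrete, here is a minimal Python sketch of the
string transformation described above. It is not our JSV and does not
use the real JSV interface; the function name mirror_h_vmem is
hypothetical, and I am assuming the rule is simply "copy the h_vmem
value into m_mem_free unless m_mem_free was already requested".

```python
def mirror_h_vmem(hard_resources: str) -> str:
    """Return the -l resource list with m_mem_free added to match h_vmem.

    Hypothetical helper: it only models the string rewrite, not the
    real JSV interface.
    """
    # Split "name=value,name=value" into ordered [name, value] pairs.
    pairs = [item.split("=", 1) for item in hard_resources.split(",") if item]
    names = {pair[0] for pair in pairs}
    for pair in list(pairs):
        # Add m_mem_free only when h_vmem carries a value and
        # m_mem_free has not been requested explicitly.
        if pair[0] == "h_vmem" and len(pair) == 2 and "m_mem_free" not in names:
            pairs.append(["m_mem_free", pair[1]])
            names.add("m_mem_free")
    return ",".join("=".join(pair) for pair in pairs)
```

With this sketch, mirror_h_vmem("h_vmem=1G") yields the string shown
above, and a request that does not mention h_vmem passes through
unchanged.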

We set "h_vmem_limit=true" in cgroups_params so it is enforced by cgroups.
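
One way to see what the kernel is enforcing for a running job is to
read the limit files in the job's memory cgroup. This is a minimal
sketch, assuming cgroup v1 (the file names differ on v2) and assuming
the job's cgroup directory is already known. How UGE names that
directory is not covered here, so the path is just an argument, and
the function name read_memory_limits is hypothetical.

```python
from pathlib import Path

# cgroup v1 memory controller limit files; each holds a byte count as text.
LIMIT_FILES = {
    "memory.limit_in_bytes": "RAM in use",
    "memory.memsw.limit_in_bytes": "RAM + swap in use",
}


def read_memory_limits(cgroup_dir):
    """Return {file name: limit in bytes} for the limit files found in cgroup_dir."""
    limits = {}
    for name in LIMIT_FILES:
        path = Path(cgroup_dir) / name
        # The memsw file is absent when swap accounting is disabled.
        if path.is_file():
            limits[name] = int(path.read_text().strip())
    return limits
```

Pointing this at the cgroup directory of a job submitted with a 1G
request should show which of the two limits was set to roughly 1G.
Neither file limits virtual address space; the memory controller
accounts for pages that are actually in use.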

The thing that I am not entirely sure about is what we are actually 
limiting here!

If I write a program that calls malloc in a loop, cgroups only kills
it once it has allocated over 400G of memory (on a machine with about
24G of RAM).

According to qacct, the job had used ~1G by that point. So my
assumption is that cgroups kills on memory actually used rather than
on virtual memory allocated. Which of the two settings (h_vmem /
m_mem_free) is responsible for this, and what is the other one for?
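
The gap between "allocated" and "used" can be reproduced without a
scheduler. The sketch below is a Python stand-in for the malloc loop,
not the original test program: it maps anonymous memory with mmap and
never writes to it, then compares VmSize (virtual size) with VmRSS
(resident size) from /proc/self/status. It assumes Linux; all function
names are hypothetical.

```python
import mmap


def parse_vm_counters(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    counters = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            counters[key] = int(rest.split()[0])  # the number is followed by "kB"
    return counters


def reserve_untouched(chunks, chunk_bytes):
    """Map anonymous memory without writing to it; return the mappings so they stay alive."""
    flags = mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS
    return [mmap.mmap(-1, chunk_bytes, flags=flags) for _ in range(chunks)]


def demo(chunks=16, chunk_bytes=64 * 1024 * 1024):
    """Linux only: print how much each counter grows after reserving about 1 GiB."""
    def snapshot():
        with open("/proc/self/status") as status:
            return parse_vm_counters(status.read())

    before = snapshot()
    held = reserve_untouched(chunks, chunk_bytes)
    after = snapshot()
    print("VmSize grew by", after["VmSize"] - before["VmSize"], "kB")
    print("VmRSS grew by", after["VmRSS"] - before["VmRSS"], "kB")
    for mapping in held:
        mapping.close()
```

Calling demo() on a Linux host should show VmSize growing by about the
reserved amount while VmRSS barely moves, because untouched pages are
never backed by physical memory. That would be consistent with a limit
on memory in use letting the allocation loop run far past 1G.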

I'm sure this isn't the first time this has been asked, and for that I
apologise, but I can't seem to find a clear explanation of it.

Thanks!

-- 
Dan Whitehouse
Research Systems Administrator, IT Services
Queen Mary University of London
Mile End
E1 4NS

_______________________________________________
users mailing list
users@gridengine.org
https://gridengine.org/mailman/listinfo/users