First off, I am running UGE as opposed to SGE. We've got a couple of systems, one running 8.5.4 and the other 8.6.5.

Users request memory resources in their job scripts by passing "-l h_vmem=1G" (for example). We use a JSV, and when this is set, what actually gets passed to the scheduler is "h_vmem=1G,m_mem_free=1G". We set "h_vmem_limit=true" in cgroups_params so that the limit is enforced by cgroups.

What I am not entirely sure about is what we are actually limiting here. If I write a program that calls malloc in a loop, cgroups only kills it once it has allocated over 400G of RAM (on a machine with about 24G). According to the output of qacct, it used ~1G to do so. So my assumption is that cgroups kills the job based on memory actually used, as opposed to virtual memory allocated.

Which of the two settings (h_vmem / m_mem_free) is responsible for this, and what is the other one for?

I'm sure this isn't the first time this has been asked, and for that I apologise, but I can't seem to find a clear explanation of this.

Thanks!

--
Dan Whitehouse
Research Systems Administrator, IT Services
Queen Mary University of London
Mile End E1 4NS