Hi Renato,
It's not that the jobs' memory use increases. Even if I request mem=6gb or pmem=6gb, the job still goes to a node whose total memory is less than 6 GB. That is why I thought that by setting NODEAVAILABILITYPOLICY I would be able to define availability on the basis of memory. Just as we define np= in the nodes file, do we have to define memory resources somewhere too?
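For reference, this is roughly how I am submitting the jobs (the script name is just an example):

```shell
# Request one node and 6 GB of memory for the whole job;
# pmem=6gb would instead request 6 GB per process.
qsub -l nodes=1,mem=6gb myjob.sh
```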
Thanks,
Abhi.


Renato Borges wrote:
Hi Abhi!

On Wed, Dec 15, 2010 at 7:21 PM, Abhishek Gupta <[email protected]> wrote:

    Hi,

    I am trying to figure out a way to ensure that memory usage does not
    exceed the available memory on a node. I was thinking that this parameter
    ( NODEAVAILABILITYPOLICY COMBINED:MEM ) should check the availability of
    a node on the basis of available memory, but it does not.
    Is there anything else I need to add to make it work?
    NODEAVAILABILITYPOLICY COMBINED:MEM

    Thanks,
    Abhi.


I've never used NODEAVAILABILITYPOLICY, but I have a similar problem: the jobs we run at my site start out with a small memory footprint and end with large amounts of data in memory (in virtualization lingo, they "balloon"). Maybe this is also your case, and that is why setting this parameter doesn't work?

To avoid swapping, I have set a MAXJOBPERUSER limit for each compute node, because all of our jobs with an increasing memory footprint come from a single user (actually, a grid account).

Tweaking the MAXJOBPERUSER value, I have found a limit for each node (we have a heterogeneous cluster) that runs the jobs without swapping.

However, this is not ideal, because the setting applies to all jobs that run on a given node, and some local users have jobs that are small in memory but large in number of cores, so the limits I set for the grid jobs are too restrictive for them. Whereas the grid account can only run 4 jobs on an 8-core, 8 GB RAM node, local users' jobs could merrily run on all 8 cores simultaneously.
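Concretely, the per-node setting looks roughly like this in my maui.cfg (node names and limits here are only illustrative):

```text
# maui.cfg -- cap jobs per node from any single user (illustrative values)
NODECFG[node01] MAXJOBPERUSER=4   # 8 cores, 8 GB RAM: grid jobs swap above 4
NODECFG[node02] MAXJOBPERUSER=6   # a larger-memory node tolerates more
```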

Trying to find a better solution, I found that one can set this in Torque (supposing you use Torque):

qmgr -c "set queue XXX resources_min.mem=2gb"

And this would (theoretically) assign only nodes that have at least 2 GB of free memory to jobs waiting in the XXX queue. I say "theoretically" because I have not had luck with this setting. As I said, our grid jobs balloon, so our nodes get one job per slot: initially (for the first few hours) the jobs are only downloading data, so there is always 2 GB free. But when the memory balloons, we start swapping heavily.
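One related thing that might help (an untested sketch on my side) is giving jobs a default memory request, so the scheduler at least has a number to count against each node for jobs that request nothing:

```shell
# Untested sketch: jobs submitted to queue XXX without an explicit
# -l mem request inherit this default.
qmgr -c "set queue XXX resources_default.mem=2gb"
```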

I guess you might have more luck with this if your jobs' memory footprint is more constant. Or, if some guru could teach us how to "reserve" a certain amount of memory per job, I know that would suit me perfectly.

Cheers,
Renato.
--
Renato Callado Borges
Lab Specialist - DFN/IF/USP
Email: [email protected] <mailto:[email protected]>
Phone: +55 11 3091 7105
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
