On Wed, 11 Mar 2009, Lech Nieroda wrote:

Dear list,

I'm trying to set up a limit on the number of processors used, so that
a job which uses more cores than it requested at submit time is
cancelled, preferably after some grace period has passed.
According to the manual, the right config would be:

 RESOURCELIMITPOLICY PROC:EXTENDEDVIOLATION:CANCEL:00:05:00

which monitors the actual load and should cancel a job if a violation
lasts longer than 5 minutes.
The problem: it kills any job that exceeds a load of 1, even if the job
declared several cores at submit time (and it doesn't wait 5 minutes to
do so, but that's another issue).
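
A minimal sketch of the behaviour described above, written in Python purely
for illustration. It is not Maui's implementation; the names JobState,
should_cancel and GRACE_SECONDS are hypothetical. It assumes the check
compares measured usage against the number of cores the job requested
(rather than against 1) and cancels only once the violation has lasted
longer than the grace period.

```python
from dataclasses import dataclass
from typing import Optional

# Grace period taken from the 00:05:00 field of the policy line (assumption:
# the field is hours:minutes:seconds).
GRACE_SECONDS = 5 * 60


@dataclass
class JobState:
    """Hypothetical per-job bookkeeping for the violation check."""
    requested_procs: int
    # Time (in seconds) at which usage first exceeded the request, or None
    # if the job is currently within its limit.
    violation_started: Optional[float] = None


def should_cancel(job: JobState, used_procs: float, now: float,
                  grace: float = GRACE_SECONDS) -> bool:
    """Return True once usage has exceeded the request for longer than grace."""
    if used_procs <= job.requested_procs:
        # Back within the limit: forget any earlier violation.
        job.violation_started = None
        return False
    if job.violation_started is None:
        # First sample above the limit: start the grace timer.
        job.violation_started = now
    return now - job.violation_started > grace
```

Under these assumptions, a job that requested 4 cores and runs at load 3.8
is never cancelled, while the same job at load 6 is cancelled only after
more than 300 seconds of continuous violation.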

I have had exactly the same issue. I backed the setting out because it
was causing more problems than it was worth. I've also implemented
cpusets, which you might be interested in as a way to mitigate the
damage caused by these kinds of jobs.

--
James A. Peltier
Systems Analyst (FASNet), VIVARIUM Technical Director
Simon Fraser University - Burnaby Campus
Phone   : 778-782-6573
Fax     : 778-782-3045
E-Mail  : [email protected]
Website : http://www.fas.sfu.ca | http://vivarium.cs.sfu.ca
           http://blogs.sfu.ca/people/jpeltier
MSN     : [email protected]

Your mouse has moved.  Windows has detected hardware
changes that require a reboot. Click OK to reboot.
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
