This isn't quite the problem. The problem is that even though a user requests 1 node, 1 PPN, and torque shows it as such, maui (through showq) shows the job as needing 2 processors per node, and so treats 100% of the cluster's resources as allocated. Even torque output shows that more resources have been assigned than the job requested (i.e., "the scheduler messed up").
This only happens with this one user's jobs. Restarting maui causes it to realize these jobs only needed one processor, and it then schedules the remaining jobs appropriately.

--Jim

On Thu, Sep 8, 2011 at 7:32 AM, Gus Correa <[email protected]> wrote:
> Jim Kusznir wrote:
>>
>> Hi all:
>>
>> I've got a user who's creating a bunch of single-threaded jobs via
>> script (about 250 at a shot). All are specified (in torque) as -l
>> nodes=1:ppn=1. However, half of his jobs end up queued rather than
>> running (he sizes his job to take the entire cluster). When I look
>> into why, checkjob shows that the resources allocated (2) exceeds
>> requested (1), and showq shows that it assigned 2 cores per job, yet
>> torque can't show that anywhere. To fix, I restart maui, and it
>> correctly sees that each job should only be 1 core and starts the rest
>> of the jobs that were queued. When jobs are in queue, showq shows
>> them as requiring only one processor.
>>
>> How can I fix this permanently?
>>
>> maui 3.2.6p19 (as installed on a rocks cluster from the torque+maui
>> roll, rocks 5.1)
>> torque-2.3.0
>>
>> Thanks!
>> --Jim
>> _______________________________________________
>> mauiusers mailing list
>> [email protected]
>> http://www.supercluster.org/mailman/listinfo/mauiusers
>
> Hi Jim
>
> Some guesses:
>
> Look at your JOBNODEMATCHPOLICY in ${MAUI}/maui.cfg.
> To pack multiple jobs on a node you could choose it to be EXACTPROC.
> http://www.adaptivecomputing.com/resources/docs/maui/a.fparameters.php
>
> Another thing to look at, is DEFERTIME.
> The default is 1 hour.
> You could set it to less.
> For instance, if you want it to be one minute, add this line:
> DEFERTIME 00:01:00
> to your ${MAUI}/maui.cfg file and restart maui.
> http://www.adaptivecomputing.com/resources/docs/maui/a.fparameters.php
>
> I hope this helps,
> Gus Correa
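
For reference, the two settings Gus suggests would look roughly like this in ${MAUI}/maui.cfg. This is only a sketch using the values from his message; whether either one addresses the over-allocation described above is untested, and maui has to be restarted for the change to take effect.

```
# Sketch of the suggested additions to ${MAUI}/maui.cfg (values taken from the message above).
# Assumption: neither parameter is already set elsewhere in the file.

# Node matching policy suggested for packing multiple jobs on a node
JOBNODEMATCHPOLICY  EXACTPROC

# Shorten the time a job stays deferred (format HH:MM:SS; one minute here)
DEFERTIME           00:01:00
```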
