A little more information: to see whether this was a general maui problem or just an issue with the torque/maui handoff, I set "JOBNODEMATCHPOLICY EXACTPROC" in maui.cfg. In this case "checkjob -v" reports the ppn=16 nodes as "rejected : CPU", as expected. So the problem appears to be in how the information is communicated between torque and maui, not in the JOBNODEMATCHPOLICY parameter itself.

According to the documentation here:
http://www.adaptivecomputing.com/resources/docs/mwm/13.3rmextensions.php

I would expect to be able to use the torque 2.0+ "-l" syntax, but I have to revert to the torque 1.0 "-W x=" syntax.

% qsub -l nodes=18:ppn=8,nmatchpolicy=exactproc test.pbs
qsub: Job rejected by all possible destinations
% qsub -l nodes=18:ppn=8 -W x=nmatchpolicy:exactproc test.pbs
36896.praesepe.jsc.nasa.gov
%

Any ideas why I can't use the "-l" syntax? Is the "-l" syntax required with torque 2.0, or is the "-W x=" syntax still supposed to work?

On Feb 9, 2011, at 1:00 PM, Vicker, Darby (JSC-EG311) wrote:

> Hello,
>
> We have a cluster with ppn=8 and ppn=16 nodes. In general we want jobs
> requesting ppn=8 nodes to run on the ppn=16 nodes if they are free (i.e.
> JOBNODEMATCHPOLICY = EXACTNODE). But we'd like the users to be able to
> specify JOBNODEMATCHPOLICY = EXACTPROC to constrain their jobs to the ppn=8
> nodes. This should be possible but I can't get it working. If I submit the
> following script:
>
> #! /bin/csh -f
> #PBS -S /bin/csh
> #PBS -N TEST
> #PBS -r n -j oe
> #PBS -l nodes=1:ppn=8
> #PBS -l walltime=2:00:00
> #PBS -W x=nmatchpolicy:exactproc
>
> cd $PBS_O_WORKDIR
>
> env > env.txt
> qstat -f $PBS_JOBID > qstat.txt
>
> it will run on a ppn=16 node, even though nmatchpolicy is showing up in the
> torque job attributes and the maui log.
>
> % tail -1 qstat.txt
> x = nmatchpolicy:exactproc
> % grep -i match /usr/local/maui/log/maui.log
> 02/08 16:39:41 MUGetIndex(nmatchpolicy:exactproc,ValList,0)
> %
>
> We are running maui 3.2.6p21 and torque 2.3.6. Any ideas on how to debug
> this further and correct the problem?
>
> Thanks,
> Darby
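
The check in the quoted message (reading qstat.txt by hand) could be scripted roughly as follows. This is only a sketch: the function name job_summary and the host name n001 in the usage note are made up for illustration. It assumes the saved "qstat -f" output has single-line "exec_host = host/cpu+host/cpu..." and "x = ..." entries, and it does not handle values that qstat wraps onto continuation lines.

```shell
#!/bin/sh
# Sketch: summarize where a job ran, from saved `qstat -f` output.
# job_summary is a hypothetical helper name, not part of torque or maui.

job_summary() {
    # $1: file holding `qstat -f $PBS_JOBID` output (e.g. qstat.txt)
    awk '
        # exec_host lists one "host/cpu" entry per allocated slot, joined by "+"
        $1 == "exec_host" && $2 == "=" {
            n = split($3, slots, "+")
            for (i = 1; i <= n; i++) {
                sub(/\/.*/, "", slots[i])   # drop the "/cpu" suffix
                count[slots[i]]++
            }
        }
        # the resource manager extension string passed with -W x=
        $1 == "x" && $2 == "=" { ext = $3 }
        END {
            for (h in count) printf "host=%s slots=%d\n", h, count[h]
            printf "x=%s\n", ext
        }
    ' "$1"
}
```

Running "job_summary qstat.txt" prints one "host=... slots=..." line per allocated node, followed by the "x=" value. The reported host can then be compared against its "np" value from "pbsnodes" (for example "pbsnodes n001") to see whether a ppn=8 request landed on a ppn=16 node.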
