Lennart Karlsson wrote:
Paul,
My advice is that you try some of the informational commands of
Maui, like
showq
mdiag -q
checkjob 5846
showstart 5846
and perhaps also
checkjob -v 5846
These may tell you quite a lot about what resources
there are and why your job is not able to run on them.
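A minimal sketch of running the commands above in one pass, assuming a POSIX shell. The helper name maui_diag and its dry/run modes are hypothetical and not part of Maui; only the five commands themselves come from the list above.

```shell
#!/bin/sh
# Hypothetical helper: list or run the Maui diagnostics for one job.
# Usage: maui_diag JOBID [run]
#   without "run" the commands are only printed (dry run)
maui_diag() {
    jobid="$1"
    mode="${2:-dry}"
    for cmd in "showq" "mdiag -q" "checkjob $jobid" \
               "showstart $jobid" "checkjob -v $jobid"; do
        if [ "$mode" = "run" ]; then
            printf '== %s ==\n' "$cmd"
            # Intentionally unquoted so "mdiag -q" splits into command + flag.
            $cmd
        else
            printf '%s\n' "$cmd"
        fi
    done
}
```

Calling `maui_diag 5846` only prints the five commands; `maui_diag 5846 run` executes them and needs the Maui client tools on the PATH.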
Best regards,
-- Lennart Karlsson <[EMAIL PROTECTED]>
National Supercomputer Centre in Linkoping, Sweden
http://www.nsc.liu.se
Paul Van Allsburg wrote:
I have what seems to be a simple policy, but my job is stuck in the queue
and I don't know why. The cluster is 16 nodes/32 processors. I have 4
queues; 'normal' is the default. This is the cluster's current status:
Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
5805.curie       ...o1-7imp-md123 hinkle           13:37:24 R normal
5836.curie       ...o1-2imp-md125 hinkle                  0 Q normal
5837.curie       ...o1-9imp-md136 hinkle                  0 Q normal
5846.curie       cpuburn          vanallp                 0 Q normal
I have hinkle limited to 4 processors, and job 5805 is using all 4. I
submitted cpuburn to a single node, but it's not running.
My maui.cfg is:
# maui.cfg 3.2p8
RMCFG[base] TYPE=PBS
RMPOLLINTERVAL 00:02:00
SERVERPORT 42559
SERVERMODE NORMAL
LOGFILE maui.log
LOGFILEMAXSIZE 10000000
LOGLEVEL 3
QUEUETIMEWEIGHT 1
BACKFILLPOLICY FIRSTFIT
RESERVATIONPOLICY CURRENTHIGHEST
NODEALLOCATIONPOLICY CPULOAD
CREDWEIGHT 1
USERWEIGHT 1
GROUPWEIGHT 1
CLASSWEIGHT 1
USERCFG[vanallp] MAXNODE=2
USERCFG[hinkle] MAXPROC=4
USERCFG[webmo] MAXNODE=4 PRIORITY=100000
USERCFG[DEFAULT] MAXNODE=9
GROUPCFG[DEFAULT] MAXNODE=11
# these are the 4 queues
CLASSCFG[webmoq] PRIORITY=1000000
CLASSCFG[normal] MAXNODE=14
CLASSCFG[debug] MAXNODE=15
CLASSCFG[admin] MAXNODE=16
XFACTOR 1
# this parm gives short wall clock jobs priority
# limited to 1 day... see 5.1.2.5 in Maui admin guide:)
# one day!
XFMINWCLIMIT 1440
#<eof>
Am I missing the obvious?
Thanks!
Paul Van Allsburg
I think I'm a little more confused... I did a
checkjob 5846
and it immediately returned with a State: of "Running" on node 8.
I qsub'ed another, and it immediately started. I qsub'ed 14 more and
they all ran.
It seems that MAXNODE= has no effect on the scheduler in my
configuration. When I set MAXPROC=, the scheduler correctly holds
jobs based on that setting. Where did I go wrong?
Thanks
Paul
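
To compare which limits in a maui.cfg are node-based (MAXNODE) and which are processor-based (MAXPROC), a small filter can help. This is only a sketch: the function name list_limits is hypothetical, it assumes whitespace-separated settings as in the file quoted above, and it only lists the settings without explaining how the scheduler enforces them.

```shell
#!/bin/sh
# Hypothetical helper: print "<credential> <limit>" for each
# MAXNODE= or MAXPROC= setting in a Maui config file.
# Usage: list_limits /path/to/maui.cfg
list_limits() {
    awk '
        /^[ \t]*#/ { next }              # skip comment lines
        {
            # field 1 is the credential, e.g. USERCFG[hinkle]
            for (i = 2; i <= NF; i++)
                if ($i ~ /^MAX(NODE|PROC)=/)
                    print $1, $i
        }
    ' "$1"
}
```

Run against the configuration quoted earlier in this thread, it would show hinkle as the only credential with a MAXPROC limit; every other limit is a MAXNODE limit.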
_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers