Lennart Karlsson wrote:

Paul,

My advice is that you try some of the informational commands of
Maui, like

        showq
        mdiag -q
        checkjob 5846
        showstart 5846

and perhaps also

        checkjob -v 5846

These will probably tell you quite a lot about which resources
are available and why your job is not able to run on them.
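
The commands above can be collected in one pass with a small shell
helper. This is only a sketch: the function names maui_diag_cmds and
maui_diag_run are made up for illustration, and it assumes the Maui
client commands are on the PATH of the machine where it is run.

```shell
#!/bin/sh
# Hypothetical helper names; only the commands themselves come from the
# advice above.

# Print the suggested Maui diagnostic commands for one job id, one per line.
maui_diag_cmds() {
    jobid=$1
    printf '%s\n' \
        "showq" \
        "mdiag -q" \
        "checkjob $jobid" \
        "showstart $jobid" \
        "checkjob -v $jobid"
}

# Run each command in turn under a header line.
# A command that is not installed is reported instead of being run.
maui_diag_run() {
    maui_diag_cmds "$1" | while IFS= read -r cmd; do
        tool=${cmd%% *}
        echo "=== $cmd ==="
        if command -v "$tool" >/dev/null 2>&1; then
            $cmd
        else
            echo "($tool not found in PATH)"
        fi
    done
}
```

For example, "maui_diag_run 5846 > job5846.txt" would save everything
in one file that can be attached to a follow-up message.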

Best regards,
-- Lennart Karlsson <[EMAIL PROTECTED]>
  National Supercomputer Centre in Linkoping, Sweden
  http://www.nsc.liu.se


Paul Van Allsburg wrote:
I have what seems to be a simple policy, but my job is stuck in the queue and I don't know why. The cluster is 16 nodes/32 processors; I have 4 queues, and 'normal' is the default. This is the cluster's current status:

Job id           Name             User             Time Use S Queue
---------------- ---------------- ---------------- -------- - -----
5805.curie       ...o1-7imp-md123 hinkle           13:37:24 R normal
5836.curie       ...o1-2imp-md125 hinkle           0        Q normal
5837.curie       ...o1-9imp-md136 hinkle           0        Q normal
5846.curie       cpuburn          vanallp          0        Q normal

I have Hinkle limited to 4 processors, job 5805 is using all 4. I submitted cpuburn to a single node but it's not running. My maui.cfg is:

# maui.cfg 3.2p8
RMCFG[base] TYPE=PBS
RMPOLLINTERVAL        00:02:00
SERVERPORT            42559
SERVERMODE            NORMAL
LOGFILE               maui.log
LOGFILEMAXSIZE        10000000
LOGLEVEL              3
QUEUETIMEWEIGHT       1
BACKFILLPOLICY        FIRSTFIT
RESERVATIONPOLICY     CURRENTHIGHEST
NODEALLOCATIONPOLICY  CPULOAD

CREDWEIGHT            1
USERWEIGHT            1
GROUPWEIGHT           1
CLASSWEIGHT           1

USERCFG[vanallp]      MAXNODE=2
USERCFG[hinkle]       MAXPROC=4
USERCFG[webmo]        MAXNODE=4 PRIORITY=100000
USERCFG[DEFAULT]      MAXNODE=9
GROUPCFG[DEFAULT]     MAXNODE=11

# these are the 4 queues
CLASSCFG[webmoq]      PRIORITY=1000000
CLASSCFG[normal]      MAXNODE=14
CLASSCFG[debug]       MAXNODE=15
CLASSCFG[admin]       MAXNODE=16

XFACTOR               1
# this parm gives short wall clock jobs priority
# limited to 1 day... see 5.1.2.5 in Maui admin guide:)
# one day! XFMINWCLIMIT 1440

#<eof>

Am I missing the obvious? Thanks!
Paul Van Allsburg

I think I'm a little more confused... I did a

        checkjob 5846

and it immediately returned with a State: of "Running" on node 8. I qsub'ed another, and it immediately started. I qsub'ed 14 more and they all ran. It seems that MAXNODE= has no effect on the scheduler in my configuration. When I set MAXPROC=, the scheduler correctly holds jobs based on that setting. Where did I go wrong? Thanks
Paul
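
Since MAXPROC= is reported above to be enforced while MAXNODE= is not,
one possible workaround is to restate the node limits as processor
limits. The fragment below is an untested sketch, not a confirmed fix.
It assumes 2 processors per node (from 16 nodes/32 processors) and
jobs that occupy whole nodes; with single-processor jobs the two kinds
of limit are not equivalent.

```
# Sketch only: MAXNODE limits from the maui.cfg above, converted to
# MAXPROC at an assumed 2 processors per node.
USERCFG[vanallp]      MAXPROC=4
USERCFG[hinkle]       MAXPROC=4
USERCFG[webmo]        MAXPROC=8 PRIORITY=100000
USERCFG[DEFAULT]      MAXPROC=18
GROUPCFG[DEFAULT]     MAXPROC=22

CLASSCFG[normal]      MAXPROC=28
CLASSCFG[debug]       MAXPROC=30
CLASSCFG[admin]       MAXPROC=32
```

Running checkjob -v on a queued job after a change like this should
show whether the new limit is the one holding the job.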

_______________________________________________
mauiusers mailing list
[email protected]
http://www.supercluster.org/mailman/listinfo/mauiusers
