We recently tried to implement accounting and fair queuing. For
completeness, the system is a Cray XE6m.

In slurm.conf, we have:
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=sdb
AccountingStorageEnforce=limits
PriorityType=priority/multifactor

PriorityWeightAge=1000
PriorityWeightFairshare=10000
PriorityWeightJobSize=1000
PriorityWeightPartition=1000
PriorityWeightQOS=0 # don't use the qos factor

MessageTimeout=45 # problems with race condition!

# PARTITIONS
PartitionName=workq Default=YES Priority=1 DefaultTime=60 MaxTime=06:00:00 AllowGroups=ALL Nodes=nid00[002-007,024-029,040-043,046-049,052-055,064-071,088-091,094-099,100-103,120-127,136-151,160-167,184-199,216-223,232-247,256-263,280-287] MaxNodes=135
PartitionName=debugq Default=YES Priority=5 DefaultTime=60 MaxTime=4:00:00 AllowGroups=ALL Nodes=nid00[002-007,024-029] MaxNodes=4
PartitionName=wofq Default=YES Priority=1 DefaultTime=60 MaxTime=06:00:00 AllowGroups=ALL Nodes=nid00[002-007,024-029,040-043,046-049,052-055,064-071,088-091,094-099,100-103,120-127,136-151,160-167,184-199,216-223,232-247,256-263,280-287] MaxNodes=135
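
One thing that may be worth checking in the partition block: all three
definitions carry Default=YES. As an assumption only (not confirmed here
against the Slurm release in use), a single partition ends up as the default
and a later Default=YES would override an earlier one, which would make wofq
the partition used when a job names none. The minimal Python sketch below
simply lists every partition marked Default=YES in a slurm.conf fragment. The
helper name default_partitions is hypothetical, and the sample text is
abbreviated from the config above.

```python
def default_partitions(conf_text):
    """Return the names of partitions declared with Default=YES, in file order."""
    names = []
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if not line.lower().startswith("partitionname="):
            continue
        # A partition definition is a run of whitespace-separated Key=Value pairs.
        opts = {}
        for token in line.split():
            if "=" in token:
                key, value = token.split("=", 1)
                opts[key.lower()] = value
        if opts.get("default", "NO").upper() == "YES":
            names.append(opts["partitionname"])
    return names


# Abbreviated copy of the three partition lines (node lists left out).
sample = """\
PartitionName=workq Default=YES Priority=1 MaxNodes=135
PartitionName=debugq Default=YES Priority=5 MaxNodes=4
PartitionName=wofq Default=YES Priority=1 MaxNodes=135
"""

print(default_partitions(sample))
```

If more than one name is printed, it is probably worth confirming which
partition the controller actually reports as the default, for example with
scontrol show partition.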

In sacctmgr, I have the following associations:
   Cluster    Account       User  Partition  Share MaxNodes MaxCPUs MaxSubmit  MaxWall    QOS
---------- ---------- ---------- ---------- ------ -------- ------- --------- -------- ------
      loki       root                            1                                     normal
      loki       root       root                 1                                     normal
      loki      debug                          200        4                 2 00:15:00 normal
      loki      debug   chenghao     debugq    200        4                 2 00:15:00 normal
      loki      debug chris.kar+     debugq    200        4                 2 00:15:00 normal
      loki      debug    cpotvin     debugq    200        4                 2 00:15:00 normal
      loki      debug      gerry     debugq    200        4                 2 00:15:00 normal
      loki      debug james.cor+     debugq    200        4                 2 00:15:00 normal
      loki      debug      jdgao     debugq    200        4                 2 00:15:00 normal
      loki      debug   kknopf83     debugq    200        4                 2 00:15:00 normal
      loki      debug    mansell     debugq    200        4                 2 00:15:00 normal
      loki      debug     mflora     debugq    200        4                 2 00:15:00 normal
      loki      debug   nyussouf     debugq    200        4                 2 00:15:00 normal
      loki      debug   skinnerp     debugq    200        4                 2 00:15:00 normal
      loki      debug    tajones     debugq    200        4                 2 00:15:00 normal
      loki      debug     wicker     debugq    200        4                 2 00:15:00 normal
      loki      debug        wof     debugq    200        4                 2 00:15:00 normal
      loki largequeue                          100       96    1024           06:00:00 normal
      loki largequeue   chenghao      workq    100       96    1024           06:00:00 normal
      loki largequeue chris.kar+      workq    100       96    1024           06:00:00 normal
      loki largequeue    cpotvin      workq    100       96    1024           06:00:00 normal
      loki largequeue      gerry      workq    100       96    1024           06:00:00 normal
      loki largequeue james.cor+      workq    100       96    1024           06:00:00 normal
      loki largequeue      jdgao      workq    100       96    1024           06:00:00 normal
      loki largequeue   kknopf83      workq    100       96    1024           06:00:00 normal
      loki largequeue    mansell      workq    100       96    1024           06:00:00 normal
      loki largequeue     mflora      workq    100       96    1024           06:00:00 normal
      loki largequeue   nyussouf      workq    100       96    1024           06:00:00 normal
      loki largequeue   skinnerp      workq    100       96    1024           06:00:00 normal
      loki largequeue    tajones               100       96    1024           06:00:00 normal
      loki largequeue     wicker      workq    100       96    1024           06:00:00 normal
      loki largequeue        wof      workq    100       96    1024           06:00:00 normal
      loki   realtime                         1000      128    4096           01:00:00 normal
      loki   realtime        wof       wofq   1000      128    4096           01:00:00 normal
      loki smallqueue                          100       36     288           06:00:00 normal
      loki smallqueue   chenghao      workq    100       36     288           06:00:00 normal
      loki smallqueue chris.kar+      workq    100       36     288           06:00:00 normal
      loki smallqueue    cpotvin      workq    100       36     288           06:00:00 normal
      loki smallqueue      gerry      workq    100       36     288           06:00:00 normal
      loki smallqueue james.cor+      workq    100       36     288           06:00:00 normal
      loki smallqueue      jdgao      workq    100       36     288           06:00:00 normal
      loki smallqueue   kknopf83      workq    100       36     288           06:00:00 normal
      loki smallqueue    mansell      workq    100       36     288           06:00:00 normal
      loki smallqueue     mflora      workq    100       36     288           06:00:00 normal
      loki smallqueue   nyussouf      workq    100       36     288           06:00:00 normal
      loki smallqueue   skinnerp      workq    100       36     288           06:00:00 normal
      loki smallqueue    tajones      workq    100       36     288           06:00:00 normal
      loki smallqueue     wicker      workq    100       36     288           06:00:00 normal
      loki smallqueue        wof      workq    100       36     288           06:00:00 normal

The remaining sacctmgr columns (GrpJobs, GrpNodes, GrpCPUs, GrpMem, GrpSubmit,
GrpWall, GrpCPUMins, MaxJobs, MaxCPUMins, Def QOS, GrpCPURunMins) are empty
for every association and are not shown.
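
To test a specific user/account/partition combination against the associations
above, a small lookup may help. The Python sketch below assumes the input is
pipe-delimited text such as that produced by
sacctmgr -n -P show assoc format=cluster,account,user,partition. The helper
names parse_assocs and is_allowed are hypothetical, the sample rows are copied
from the listing above, and the treatment of an empty Partition field as
"valid for any partition" is an assumption rather than confirmed behaviour.

```python
def parse_assocs(parsable_text):
    """Parse 'cluster|account|user|partition' lines into (account, user, partition) tuples."""
    assocs = set()
    for line in parsable_text.splitlines():
        fields = line.strip().split("|")
        if len(fields) < 4 or not fields[2]:
            continue  # skip blank lines and account-level rows (no user)
        _cluster, account, user, partition = fields[:4]
        assocs.add((account, user, partition))
    return assocs


def is_allowed(assocs, user, account, partition):
    """True if the user has a matching association for this account and partition."""
    # Assumption: an association with an empty partition applies to every partition.
    return (account, user, partition) in assocs or (account, user, "") in assocs


# A few rows taken from the association listing above.
sample = """\
loki|realtime|wof|wofq
loki|largequeue|wof|workq
loki|largequeue|tajones|
loki|smallqueue|tajones|workq
loki|debug|tajones|debugq
"""

assocs = parse_assocs(sample)
print(is_allowed(assocs, "wof", "realtime", "wofq"))        # exact match exists
print(is_allowed(assocs, "tajones", "smallqueue", "wofq"))  # only workq is listed
```

In the listing above, the only association that names wofq is realtime/wof, so
a job from any other user that ends up in wofq would have no partition-specific
association to match.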


I have a user whose jobs keep being rejected with an error claiming an
account/partition mismatch. However, the partition named in the error (wofq)
does not appear anywhere in her Slurm submission script.

I'm baffled. Any suggestions?
-- 
Gerry Creager
NSSL/CIMMS
405.325.6371
++++++++++++++++++++++
“Big whorls have little whorls,
That feed on their velocity;
And little whorls have lesser whorls,
And so on to viscosity.”
Lewis Fry Richardson (1881-1953)
