From "man sacctmgr":

SPECIFICATIONS FOR QOS
       Flags  Used by the slurmctld to override or enforce certain
              characteristics. Valid options are:

              EnforceUsageThreshold
                     If set, and the QOS also has a UsageThreshold, any jobs
                     submitted with this QOS that fall below the
                     UsageThreshold will be held until their Fairshare Usage
                     goes above the Threshold.

              NoReserve
                     If this flag is set and backfill scheduling is used,
                     jobs using this QOS will not reserve resources in the
                     backfill schedule's map of resources allocated through
                     time. This flag is intended for use with a QOS that may
                     be preempted by jobs associated with all other QOS
                     (e.g. use with a "standby" QOS). If this flag is used
                     with a QOS which can not be preempted by all other QOS,
                     it could result in starvation of larger jobs.

              PartitionMaxNodes
                     If set, jobs using this QOS will be able to override
                     the requested partition's MaxNodes limit.

              PartitionMinNodes
                     If set, jobs using this QOS will be able to override
                     the requested partition's MinNodes limit.

              PartitionTimeLimit
                     If set, jobs using this QOS will be able to override
                     the requested partition's TimeLimit.
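
A flag like the ones above is attached to a QOS through sacctmgr. As a
minimal sketch (the QOS name "standby" here is only illustrative, not
from this thread):

    sacctmgr add qos standby Flags=NoReserve
    sacctmgr modify qos standby set Flags=NoReserve,PartitionTimeLimit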


Quoting Lyn Gerner <[email protected]>:

Certainly; how about the Flags?

Thanks again,
Lyn

On Mon, Oct 31, 2011 at 1:08 PM, Moe Jette <[email protected]> wrote:

SLURM's QOS and resource limits web pages describe most of this:
http://www.schedmd.com/slurmdocs/qos.html
http://www.schedmd.com/slurmdocs/resource_limits.html


Quoting Lyn Gerner <[email protected]>:

 PS: Moe, is there a related document?  Couldn't find anything obvious.

Thanks,
Lyn

On Mon, Oct 31, 2011 at 12:59 PM, Lyn Gerner <[email protected]>
wrote:

 Great, thanks Moe.


On Mon, Oct 31, 2011 at 10:39 AM, Moe Jette <[email protected]> wrote:

 This works for me.
What version of SLURM are you running?
You might want to look at your SlurmctldLogFile.

Lyn,
You can use the QOS mechanism, as Matt is, with flags (e.g.
"Flags=PartitionTimeLimit") to override partition time and/or size
limits.
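
As a sketch of that approach (the names "longrun" and "jdoe" are
illustrative, not from a specific site):

    # QOS whose jobs may exceed the partition's MaxTime
    sacctmgr add qos longrun Flags=PartitionTimeLimit MaxWall=7-0
    # grant it to a user, then request it at submission
    sacctmgr modify user name=jdoe set qos+=longrun
    sbatch --qos=longrun -t 7-0 job.sh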


Quoting Matteo Guglielmi <[email protected]>:

 Dear All,

I'm trying to create a simple qos called 1week which
I would like to associate to those users who do need
to run for one week instead of 2 days at maximum:

### slurm.conf ###
EnforcePartLimits=YES
TaskPlugin=task/affinity
TaskPluginParam=Sched
TopologyPlugin=topology/none
TrackWCKey=no
SchedulerType=sched/backfill
SelectType=select/cons_res
SelectTypeParameters=CR_Core_Memory
PriorityType=priority/multifactor

PriorityDecayHalfLife=7-0
PriorityCalcPeriod=5
PriorityFavorSmall=YES
PriorityMaxAge=7-0
PriorityUsageResetPeriod=NONE
PriorityWeightAge=1000
PriorityWeightFairshare=1000
PriorityWeightJobSize=10000
PriorityWeightPartition=10000
PriorityWeightQOS=10000
AccountingStorageEnforce=limits,qos
AccountingStorageType=accounting_storage/slurmdbd
JobCompType=jobcomp/none
JobAcctGatherType=jobacct_gather/linux
PreemptMode=suspend,gang
PreemptType=preempt/partition_prio


NodeName=DEFAULT TmpDisk=16384 State=UNKNOWN

NodeName=foff[01-08] Procs=8  CoresPerSocket=4  Sockets=2
ThreadsPerCore=1 RealMemory=7000   Weight=1 Feature=X5482,foff,fofflm
NodeName=foff[09-13] Procs=48 CoresPerSocket=12 Sockets=4
ThreadsPerCore=1 RealMemory=127000 Weight=1 Feature=6176,foff,foffhm

PartitionName=DEFAULT DefaultTime=60 MinNodes=1 MaxNodes=UNLIMITED
MaxTime=2-0 PreemptMode=SUSPEND Shared=FORCE:1 State=UP Default=NO
PartitionName=batch   Nodes=foff[01-13] Default=YES
PartitionName=foff1   Nodes=foff[01-08] Priority=1000
PartitionName=foff2   Nodes=foff[09-13] Priority=1000
#################

sacctmgr list associations format=Account,Cluster,User,Fairshare,Partition,defaultqos,qos tree withd


             Account    Cluster       User     Share  Partition    Def QOS                  QOS
-------------------- ---------- ---------- ---------- ---------- ---------- --------------------
root                     superb                     1                                     normal
 root                    superb       root          1                                     normal
 sb                      superb                     1                                     normal
 sb                      superb   belushki          1      batch                          normal
 sb                      superb     fiocco          1      batch                          normal
 gr-fo                   superb                     1                                     normal
 gr-fo                   superb   belushki          1      foff1                          normal
 gr-fo                   superb   belushki          1      foff2                          normal
 gr-fo                   superb     fiocco          1      foff1                          normal
 gr-fo                   superb     fiocco          1      foff2                          normal


sacctmgr add qos Name=1week MaxWall=7-0 Priority=100
PreemptMode=Cluster
Flags=PartitionTimeLimit

sacctmgr modify user name=belushki Account=gr-fo set qos+=1week

sacctmgr list associations format=Account,Cluster,User,Fairshare,Partition,defaultqos,qos tree withd


             Account    Cluster       User     Share  Partition    Def QOS                  QOS
-------------------- ---------- ---------- ---------- ---------- ---------- --------------------
root                     superb                     1                                     normal
 root                    superb       root          1                                     normal
 sb                      superb                     1                                     normal
 sb                      superb   belushki          1      batch                          normal
 sb                      superb     fiocco          1      batch                          normal
 gr-fo                   superb                     1                                     normal
 gr-fo                   superb   belushki          1      foff1                    1week,normal
 gr-fo                   superb   belushki          1      foff2                    1week,normal
 gr-fo                   superb     fiocco          1      foff1                          normal
 gr-fo                   superb     fiocco          1      foff2                          normal

/etc/init.d/slurmd restart (same command was issued on all nodes too)

su - belushki

srun -p foff2 -A gr-fo --qos=1week -t 7-0 hostname
srun: error: Unable to allocate resources: Requested time limit is
invalid (exceeds some limit)
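
(For anyone comparing against their own setup, the pieces involved here
can be inspected along these lines; a sketch using standard SLURM
commands:

    scontrol show partition foff2                      # partition MaxTime
    sacctmgr show qos 1week format=Name,Flags,MaxWall  # QOS flag and wall limit
    scontrol show config | grep EnforcePartLimits      # submit-time limit enforcement
)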


Could you tell me what I'm still missing in order to make this work for
user "belushki"?

Thanks,

--matt
