Look at the sacctmgr man page on the subject: http://www.schedmd.com/slurmdocs/sacctmgr.html
Look for SPECIFICATIONS FOR QOS.

On Monday, October 31, 2011 1:17:17 PM Lyn Gerner wrote:
> Certainly; how about the Flags?
>
> Thanks again,
> Lyn
>
> On Mon, Oct 31, 2011 at 1:08 PM, Moe Jette <[email protected]> wrote:
>
> > SLURM's QOS and resource limits web pages describe most of this:
> > http://www.schedmd.com/slurmdocs/qos.html
> > http://www.schedmd.com/slurmdocs/resource_limits.html
> >
> > Quoting Lyn Gerner <[email protected]>:
> >
> > > PS: Moe, is there a related document? Couldn't find anything obvious.
> > >
> > > Thanks,
> > > Lyn
> > >
> > > On Mon, Oct 31, 2011 at 12:59 PM, Lyn Gerner <[email protected]> wrote:
> > >
> > > > Great, thanks Moe.
> > > >
> > > > On Mon, Oct 31, 2011 at 10:39 AM, Moe Jette <[email protected]> wrote:
> > > >
> > > > > This works for me.
> > > > > What version of SLURM are you running?
> > > > > You might want to look at your SlurmctldLogFile.
> > > > >
> > > > > Lyn,
> > > > > You can use the QOS mechanism, as Matt is, with flags (e.g.
> > > > > "Flags=PartitionTimeLimit") to override partition time and/or size
> > > > > limits.
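
As a rough illustration of what the PartitionTimeLimit flag changes, here is a toy model of the time-limit check. This is a simplification for intuition only, not Slurm's actual source or enforcement order; the function and parameter names are hypothetical.

```python
def job_time_ok(request_min: int, part_max_min: int,
                qos_max_min: int, qos_flags: set) -> bool:
    """Toy model (not Slurm source code): with PartitionTimeLimit set
    on the job's QOS, the QOS MaxWall takes the place of the partition
    MaxTime; without it, both limits must be satisfied."""
    if "PartitionTimeLimit" in qos_flags:
        return request_min <= qos_max_min
    return request_min <= min(part_max_min, qos_max_min)

# Partition MaxTime=2-0 (2880 min), QOS MaxWall=7-0 (10080 min),
# job requests -t 7-0 (10080 min):
print(job_time_ok(10080, 2880, 10080, {"PartitionTimeLimit"}))  # True
print(job_time_ok(10080, 2880, 10080, set()))                   # False
```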
> > > > > Quoting Matteo Guglielmi <[email protected]>:
> > > > >
> > > > > > Dear All,
> > > > > >
> > > > > > I'm trying to create a simple QOS called 1week, which I would
> > > > > > like to associate with those users who need to run for one week
> > > > > > instead of the two-day maximum:
> > > > > >
> > > > > > ### slurm.conf ###
> > > > > > EnforcePartLimits=YES
> > > > > > TaskPlugin=task/affinity
> > > > > > TaskPluginParam=Sched
> > > > > > TopologyPlugin=topology/none
> > > > > > TrackWCKey=no
> > > > > > SchedulerType=sched/backfill
> > > > > > SelectType=select/cons_res
> > > > > > SelectTypeParameters=CR_Core_Memory
> > > > > > PriorityType=priority/multifactor
> > > > > >
> > > > > > PriorityDecayHalfLife=7-0
> > > > > > PriorityCalcPeriod=5
> > > > > > PriorityFavorSmall=YES
> > > > > > PriorityMaxAge=7-0
> > > > > > PriorityUsageResetPeriod=NONE
> > > > > > PriorityWeightAge=1000
> > > > > > PriorityWeightFairshare=1000
> > > > > > PriorityWeightJobSize=10000
> > > > > > PriorityWeightPartition=10000
> > > > > > PriorityWeightQOS=10000
> > > > > > AccountingStorageEnforce=limits,qos
> > > > > > AccountingStorageType=accounting_storage/slurmdbd
> > > > > > JobCompType=jobcomp/none
> > > > > > JobAcctGatherType=jobacct_gather/linux
> > > > > > PreemptMode=suspend,gang
> > > > > > PreemptType=preempt/partition_prio
> > > > > >
> > > > > > NodeName=DEFAULT TmpDisk=16384 State=UNKNOWN
> > > > > >
> > > > > > NodeName=foff[01-08] Procs=8 CoresPerSocket=4 Sockets=2
> > > > > > ThreadsPerCore=1 RealMemory=7000 Weight=1 Feature=X5482,foff,fofflm
> > > > > > NodeName=foff[09-13] Procs=48 CoresPerSocket=12 Sockets=4
> > > > > > ThreadsPerCore=1 RealMemory=127000 Weight=1 Feature=6176,foff,foffhm
> > > > > >
> > > > > > PartitionName=DEFAULT DefaultTime=60 MinNodes=1 MaxNodes=UNLIMITED
> > > > > > MaxTime=2-0 PreemptMode=SUSPEND Shared=FORCE:1 State=UP Default=NO
> > > > > > PartitionName=batch Nodes=foff[01-13] Default=YES
> > > > > > PartitionName=foff1 Nodes=foff[01-08] Priority=1000
> > > > > > PartitionName=foff2 Nodes=foff[09-13] Priority=1000
> > > > > > #################
> > > > > >
> > > > > > sacctmgr list associations format=Account,Cluster,User,Fairshare,Partition,defaultqos,qos tree withd
> > > > > >
> > > > > > Account  Cluster  User      Share  Partition  Def QOS  QOS
> > > > > > -------  -------  --------  -----  ---------  -------  ------------
> > > > > > root     superb                1                        normal
> > > > > > root     superb   root         1                        normal
> > > > > > sb       superb                1                        normal
> > > > > > sb       superb   belushki     1   batch                normal
> > > > > > sb       superb   fiocco       1   batch                normal
> > > > > > gr-fo    superb                1                        normal
> > > > > > gr-fo    superb   belushki     1   foff1                normal
> > > > > > gr-fo    superb   belushki     1   foff2                normal
> > > > > > gr-fo    superb   fiocco       1   foff1                normal
> > > > > > gr-fo    superb   fiocco       1   foff2                normal
> > > > > >
> > > > > > sacctmgr add qos Name=1week MaxWall=7-0 Priority=100 PreemptMode=Cluster Flags=PartitionTimeLimit
> > > > > >
> > > > > > sacctmgr modify user name=belushki Account=gr-fo set qos+=1week
> > > > > >
> > > > > > sacctmgr list associations format=Account,Cluster,User,Fairshare,Partition,defaultqos,qos tree withd
> > > > > >
> > > > > > Account  Cluster  User      Share  Partition  Def QOS  QOS
> > > > > > -------  -------  --------  -----  ---------  -------  ------------
> > > > > > root     superb                1                        normal
> > > > > > root     superb   root         1                        normal
> > > > > > sb       superb                1                        normal
> > > > > > sb       superb   belushki     1   batch                normal
> > > > > > sb       superb   fiocco       1   batch                normal
> > > > > > gr-fo    superb                1                        normal
> > > > > > gr-fo    superb   belushki     1   foff1                1week,normal
> > > > > > gr-fo    superb   belushki     1   foff2                1week,normal
> > > > > > gr-fo    superb   fiocco       1   foff1                normal
> > > > > > gr-fo    superb   fiocco       1   foff2                normal
> > > > > >
> > > > > > /etc/init.d/slurmd restart  (the same command was issued on all
> > > > > > nodes too)
> > > > > >
> > > > > > su - belushki
> > > > > >
> > > > > > srun -p foff2 -A gr-fo --qos=1week -t 7-0 hostname
> > > > > > srun: error: Unable to allocate resources: Requested time limit
> > > > > > is invalid (exceeds some limit)
> > > > > >
> > > > > > Could you tell me what I am still missing to make this work for
> > > > > > user "belushki"?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > --matt
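
For reference, the time strings in this thread use Slurm's [days-]hours[:minutes[:seconds]] syntax (a bare number is minutes), so `-t 7-0` requests 10080 minutes against the partition's MaxTime=2-0 (2880 minutes) — which is why the QOS MaxWall has to take effect for the job to be accepted. A small converter sketch (a hypothetical helper, not part of Slurm; seconds are ignored for simplicity):

```python
def slurm_minutes(limit: str) -> int:
    """Convert a Slurm time string to whole minutes.

    Handles the forms used in this thread: "minutes", "days-hours",
    "days-hours:min[:sec]", "hours:min:sec", "min:sec".
    Seconds are dropped; this is an illustration, not Slurm code.
    """
    days = 0
    if "-" in limit:
        day_part, limit = limit.split("-", 1)
        days = int(day_part)
        hms = [int(p) for p in limit.split(":")] + [0, 0]
        return days * 1440 + hms[0] * 60 + hms[1]
    hms = [int(p) for p in limit.split(":")]
    if len(hms) == 1:                # bare number: minutes
        return hms[0]
    if len(hms) == 2:                # minutes:seconds
        return hms[0]
    return hms[0] * 60 + hms[1]      # hours:minutes:seconds

print(slurm_minutes("2-0"))   # 2880
print(slurm_minutes("7-0"))   # 10080
```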
