Perfect; thanks, Danny and Moe.

On Mon, Oct 31, 2011 at 1:19 PM, Danny Auble <[email protected]> wrote:
> Look at the sacctmgr man page on the subject:
>
>   http://www.schedmd.com/slurmdocs/sacctmgr.html
>
> Look for SPECIFICATIONS FOR QOS.
>
> On Monday, October 31, 2011 at 1:17:17 PM, Lyn Gerner wrote:
> > Certainly; how about the Flags?
> >
> > Thanks again,
> > Lyn
> >
> > On Mon, Oct 31, 2011 at 1:08 PM, Moe Jette <[email protected]> wrote:
> > > SLURM's QOS and resource limits web pages describe most of this:
> > >
> > >   http://www.schedmd.com/slurmdocs/qos.html
> > >   http://www.schedmd.com/slurmdocs/resource_limits.html
> > >
> > > Quoting Lyn Gerner <[email protected]>:
> > > > PS: Moe, is there a related document? I couldn't find anything obvious.
> > > >
> > > > Thanks,
> > > > Lyn
> > > >
> > > > On Mon, Oct 31, 2011 at 12:59 PM, Lyn Gerner <[email protected]> wrote:
> > > > > Great, thanks Moe.
> > > > >
> > > > > On Mon, Oct 31, 2011 at 10:39 AM, Moe Jette <[email protected]> wrote:
> > > > > > This works for me. What version of SLURM are you running?
> > > > > > You might want to look at your SlurmctldLogFile.
> > > > > >
> > > > > > Lyn,
> > > > > > You can use the QOS mechanism as Matt does, with flags (e.g.
> > > > > > "Flags=PartitionTimeLimit") to override partition time and/or
> > > > > > size limits.
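The flag-based override Moe describes can be consolidated into a short sketch. The QOS name `1week`, the seven-day `MaxWall`, and the user/account names are taken from Matt's message in this thread; the commands assume a working slurmdbd-backed accounting setup and admin privileges.

```shell
# Sketch: a QOS whose PartitionTimeLimit flag lets its own MaxWall
# (7 days here) override the partition's MaxTime for jobs run under it.
# Names and limits taken from this thread; assumes slurmdbd accounting.
sacctmgr add qos Name=1week MaxWall=7-0 Flags=PartitionTimeLimit

# Grant the QOS to a user's association so the user may request it:
sacctmgr modify user name=belushki account=gr-fo set qos+=1week
```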
> > > > > > Quoting Matteo Guglielmi <[email protected]>:
> > > > > >
> > > > > > > Dear All,
> > > > > > >
> > > > > > > I'm trying to create a simple QOS called 1week, which I would
> > > > > > > like to associate with those users who need to run for one
> > > > > > > week instead of the two-day maximum:
> > > > > > >
> > > > > > > ### slurm.conf ###
> > > > > > > EnforcePartLimits=YES
> > > > > > > TaskPlugin=task/affinity
> > > > > > > TaskPluginParam=Sched
> > > > > > > TopologyPlugin=topology/none
> > > > > > > TrackWCKey=no
> > > > > > > SchedulerType=sched/backfill
> > > > > > > SelectType=select/cons_res
> > > > > > > SelectTypeParameters=CR_Core_Memory
> > > > > > > PriorityType=priority/multifactor
> > > > > > > PriorityDecayHalfLife=7-0
> > > > > > > PriorityCalcPeriod=5
> > > > > > > PriorityFavorSmall=YES
> > > > > > > PriorityMaxAge=7-0
> > > > > > > PriorityUsageResetPeriod=NONE
> > > > > > > PriorityWeightAge=1000
> > > > > > > PriorityWeightFairshare=1000
> > > > > > > PriorityWeightJobSize=10000
> > > > > > > PriorityWeightPartition=10000
> > > > > > > PriorityWeightQOS=10000
> > > > > > > AccountingStorageEnforce=limits,qos
> > > > > > > AccountingStorageType=accounting_storage/slurmdbd
> > > > > > > JobCompType=jobcomp/none
> > > > > > > JobAcctGatherType=jobacct_gather/linux
> > > > > > > PreemptMode=suspend,gang
> > > > > > > PreemptType=preempt/partition_prio
> > > > > > >
> > > > > > > NodeName=DEFAULT TmpDisk=16384 State=UNKNOWN
> > > > > > > NodeName=foff[01-08] Procs=8 CoresPerSocket=4 Sockets=2 ThreadsPerCore=1 RealMemory=7000 Weight=1 Feature=X5482,foff,fofflm
> > > > > > > NodeName=foff[09-13] Procs=48 CoresPerSocket=12 Sockets=4 ThreadsPerCore=1 RealMemory=127000 Weight=1 Feature=6176,foff,foffhm
> > > > > > >
> > > > > > > PartitionName=DEFAULT DefaultTime=60 MinNodes=1 MaxNodes=UNLIMITED MaxTime=2-0 PreemptMode=SUSPEND Shared=FORCE:1 State=UP Default=NO
> > > > > > > PartitionName=batch Nodes=foff[01-13] Default=YES
> > > > > > > PartitionName=foff1 Nodes=foff[01-08] Priority=1000
> > > > > > > PartitionName=foff2 Nodes=foff[09-13] Priority=1000
> > > > > > > #################
> > > > > > >
> > > > > > > sacctmgr list associations format=Account,Cluster,User,Fairshare,Partition,defaultqos,qos tree withd
> > > > > > >
> > > > > > >              Account    Cluster       User     Share  Partition   Def QOS                  QOS
> > > > > > > -------------------- ---------- ---------- --------- ---------- --------- --------------------
> > > > > > >                 root     superb                    1                                    normal
> > > > > > >                 root     superb       root         1                                    normal
> > > > > > >                   sb     superb                    1                                    normal
> > > > > > >                   sb     superb   belushki         1      batch                         normal
> > > > > > >                   sb     superb     fiocco         1      batch                         normal
> > > > > > >                gr-fo     superb                    1                                    normal
> > > > > > >                gr-fo     superb   belushki         1      foff1                         normal
> > > > > > >                gr-fo     superb   belushki         1      foff2                         normal
> > > > > > >                gr-fo     superb     fiocco         1      foff1                         normal
> > > > > > >                gr-fo     superb     fiocco         1      foff2                         normal
> > > > > > >
> > > > > > > sacctmgr add qos Name=1week MaxWall=7-0 Priority=100 PreemptMode=Cluster Flags=PartitionTimeLimit
> > > > > > >
> > > > > > > sacctmgr modify user name=belushki Account=gr-fo set qos+=1week
> > > > > > >
> > > > > > > sacctmgr list associations format=Account,Cluster,User,Fairshare,Partition,defaultqos,qos tree withd
> > > > > > >
> > > > > > >              Account    Cluster       User     Share  Partition   Def QOS                  QOS
> > > > > > > -------------------- ---------- ---------- --------- ---------- --------- --------------------
> > > > > > >                 root     superb                    1                                    normal
> > > > > > >                 root     superb       root         1                                    normal
> > > > > > >                   sb     superb                    1                                    normal
> > > > > > >                   sb     superb   belushki         1      batch                         normal
> > > > > > >                   sb     superb     fiocco         1      batch                         normal
> > > > > > >                gr-fo     superb                    1                                    normal
> > > > > > >                gr-fo     superb   belushki         1      foff1                   1week,normal
> > > > > > >                gr-fo     superb   belushki         1      foff2                   1week,normal
> > > > > > >                gr-fo     superb     fiocco         1      foff1                         normal
> > > > > > >                gr-fo     superb     fiocco         1      foff2                         normal
> > > > > > >
> > > > > > > /etc/init.d/slurmd restart  (the same command was issued on all nodes too)
> > > > > > >
> > > > > > > su - belushki
> > > > > > > srun -p foff2 -A gr-fo --qos=1week -t 7-0 hostname
> > > > > > > srun: error: Unable to allocate resources: Requested time limit is invalid (exceeds some limit)
> > > > > > >
> > > > > > > Could you tell me what I'm still missing to make this work for
> > > > > > > user "belushki"?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > --matt
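For anyone hitting the same "Requested time limit is invalid" error, the settings from this thread can be double-checked before retrying. This is a sketch, not the thread's confirmed fix: the `sacctmgr show` subcommands are standard, but the `format=` column choices here are illustrative.

```shell
# Confirm the QOS exists with the expected flag and wall-clock limit:
sacctmgr show qos 1week format=Name,MaxWall,Flags

# Confirm the user's association actually carries the QOS:
sacctmgr show assoc user=belushki format=User,Account,Partition,QOS

# Then retry the week-long job under that QOS:
srun -p foff2 -A gr-fo --qos=1week -t 7-0 hostname
```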
