On 07/11/2013 10:51 PM, Neil Van Lysel wrote:
> Hi Lennart,
>
> Thanks for the info!
>
> Do you set any of the PriorityWeight flags in slurm.conf? If so, what values
> do you use?
>
> Thanks,
> Neil
>
>
> On 07/10/2013 03:17 AM, Lennart Karlsson wrote:
>> On 07/09/2013 08:38 PM, Neil Van Lysel wrote:
>>> Is it possible to grant a user priority on X cores? For example, we have
>>> a small 768 core SLURM cluster, and we would like to give user A
>>> priority on only 512 cores. I am currently using QOS to give specific
>>> users priority on all cores, but I do not know how to specify priority on X
>>> cores.
>>>
>>> Here's my slurm.conf file:
>>>
>>> ClusterName="aci"
>>> ControlMachine=aci-service-1
>>> BackupController=aci-service-2
>>> SlurmUser=slurm
>>> SlurmctldPort=6817
>>> SlurmdPort=6818
>>> AuthType=auth/munge
>>> StateSaveLocation=/tmp/slurmstate
>>> SlurmdSpoolDir=/tmp/slurmd
>>> SwitchType=switch/none
>>> MpiDefault=none
>>> MpiParams=ports=12000-13999
>>> SlurmctldPidFile=/var/run/slurmctld.pid
>>> SlurmdPidFile=/var/run/slurmd.pid
>>> ProctrackType=proctrack/pgid
>>> CacheGroups=0
>>> ReturnToService=0
>>> PropagateResourceLimitsExcept=MEMLOCK,NOFILE
>>> UsePAM=1
>>> SlurmctldTimeout=120
>>> SlurmdTimeout=300
>>> InactiveLimit=0
>>> MinJobAge=300
>>> KillWait=30
>>> Waittime=0
>>> SchedulerType=sched/backfill
>>> SelectType=select/cons_res
>>> SelectTypeParameters=CR_Core
>>> FastSchedule=1
>>> PriorityType=priority/multifactor
>>> PriorityWeightQOS=1
>>> PreemptType=preempt/qos
>>> PreemptMode=cancel
>>> SlurmctldDebug=4
>>> SlurmctldLogFile=/var/log/slurm/slurmctld.log
>>> SlurmdDebug=4
>>> SlurmdLogFile=/var/log/slurm/slurmd.log
>>> JobCompType=jobcomp/none
>>> AccountingStorageType=accounting_storage/slurmdbd
>>> AccountingStorageHost=aci-service-1
>>> AccountingStorageLoc=slurm_acct_db
>>> JobAcctGatherType=jobacct_gather/linux
>>> JobAcctGatherFrequency=30
>>> NodeName=aci-[001-048] Sockets=2 CoresPerSocket=8 ThreadsPerCore=1 CPUs=16 State=UNKNOWN
>>> PartitionName=aci Nodes=aci-[001-048] Default=YES MaxTime=INFINITE State=UP
>>>
>>> [root ~]# sacctmgr -p list qos
>>> Name|Priority|GraceTime|Preempt|PreemptMode|Flags|UsageThres|UsageFactor|...
>>> normal|0|00:00:00|low|cluster|||1.000000|||||||||||||||||
>>> low|0|00:00:00||cancel|||1.000000|||||||||||||||||
>>>
>>> If it matters, all of the machines in this cluster are running
>>> Scientific Linux 6.3 and running SLURM version 2.5.1.
>>>
>>> Any help is greatly appreciated.
>>>
>>> Thanks,
>>>
>>> Neil Van Lysel
>>> Center for High Throughput Computing
>>> University of Wisconsin - Madison
>>> [email protected]
>> Hi,
>>
>> We have implemented that function via the QOS system, by allowing
>> only this user (or groups of users) to use a special QOS, set like this:
>>
>> Name    Priority  GraceTime  PreemptMode  UsageFactor  GrpCPUs
>> ------  --------  ---------  -----------  -----------  -------
>> seqver  100000    00:00:00   cluster      1.000000     480
>> (all other columns are empty)
>>
>> i.e. with a priority boost and with a GrpCPUs limit.
>>
>> Cheers
>> -- Lennart Karlsson, UPPMAX, Uppsala University, Sweden
Hi Neil,

My weights are

    PriorityWeightAge=20160
    PriorityWeightQOS=400000

so I do not actually use "100000" as the value in the QOS table, but
rather 40.
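
To illustrate the arithmetic, here is a minimal Python sketch. It assumes
that the multifactor plugin sums weight times factor for each component,
that each factor lies between 0.0 and 1.0, and that the QOS factor is the
QOS priority divided by the largest priority of any QOS. The function names
and the sample age factors are hypothetical, and the fairshare and job size
terms are left out.

```python
def qos_factor(qos_priority, max_qos_priority):
    """Normalize a QOS priority to the 0.0-1.0 range (assumed normalization)."""
    if max_qos_priority <= 0:
        return 0.0
    return qos_priority / max_qos_priority


def job_priority(age_factor, qos_priority, max_qos_priority,
                 weight_age=20160, weight_qos=400000):
    """Weighted sum of the age and QOS terms only; other factors are omitted."""
    return int(weight_age * age_factor
               + weight_qos * qos_factor(qos_priority, max_qos_priority))


# With 40 as the largest QOS priority, a job in that QOS gets the full QOS weight.
boosted = job_priority(age_factor=0.5, qos_priority=40, max_qos_priority=40)
# A job in a QOS with priority 0 only accumulates the age term.
plain = job_priority(age_factor=1.0, qos_priority=0, max_qos_priority=40)
print(boosted, plain)
```

Under these assumptions only the ratio between QOS priorities matters, so a
small number such as 40 can serve as well as 100000 when the weight itself
is large.
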

I have not just one group but several, each with a high priority on
some nodes, like this:

Name        Priority  GraceTime  PreemptMode  UsageFactor  GrpCPUs
----------  --------  ---------  -----------  -----------  -------
uppmax_st+  40        00:00:00   cluster      1.000000     32
b2010028_+  40        00:00:00   cluster      1.000000     32
a2009001_+  40        00:00:00   cluster      1.000000     96
a2009001_+  30        00:00:00   cluster      1.000000     160
swegrid_n+  30        00:00:00   cluster      1.000000     256
(all other columns are empty)

The reason can be, for example, that a group has paid extra to get nodes of
its own, which others may use when the group is not running any jobs.
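
The effect of such a QOS can be sketched in Python as follows. This is a
simplified model, not SLURM's scheduler: it assumes that a job may start
under the QOS only while the CPUs already running under that QOS plus the
job's request stay within GrpCPUs. The class, its methods, and the name
"group_a" are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Qos:
    name: str
    priority: int
    grp_cpus: int          # cap on CPUs used by all running jobs of this QOS
    running_cpus: int = 0  # CPUs currently allocated under this QOS

    def can_start(self, requested_cpus: int) -> bool:
        """True if the request fits under the GrpCPUs cap."""
        return self.running_cpus + requested_cpus <= self.grp_cpus

    def start(self, requested_cpus: int) -> bool:
        """Account for a job if it fits; otherwise leave it pending."""
        if not self.can_start(requested_cpus):
            return False
        self.running_cpus += requested_cpus
        return True


qos = Qos(name="group_a", priority=40, grp_cpus=32)
first = qos.start(24)   # fits: 24 <= 32
second = qos.start(16)  # would reach 40 > 32, so it stays pending
third = qos.start(8)    # fits exactly: 32 <= 32
print(first, second, third, qos.running_cpus)
```

In this model the priority boost applies only to the capped number of
cores; anything beyond the cap has to wait for a running job in the same
QOS to finish.
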

(I also set PriorityWeightFairshare and PriorityWeightJobSize, but probably
not in a way that is meaningful for your system.)

Cheers
-- Lennart Karlsson, UPPMAX, Uppsala University, Sweden