[slurm-users] Interfaces of topology/tree and Topology Awareness

2024-03-20 Thread nico.derl--- via slurm-users
Hello everyone,

I'm trying to improve topology awareness in a local Slurm-managed HPC system. 
It uses the default hierarchical three-level topology with the tree plugin. 
However, it does not always confine jobs to the most tightly packed group of 
nodes, it seems to over-provision switches for smaller jobs, and it gets slow 
or overwhelmed by jobs with a high node count. 
I'd like to implement something closer to a strict best-fit, but I'm having 
trouble understanding the relevant interfaces for hooking into Slurm's 
topology model. I would appreciate a high-level explanation of how the tree 
and common topology components work, how they integrate into the higher 
scheduling logic, and what the internal topology model looks like, or some 
pointers to relevant docs discussing this.

I have read the topology guide and its developer documentation, which does 
note some of the caveats I mentioned. However, it only talks about providing 
a set of weights to the upper logic levels in the form of a node ranking. I 
can't see how this ranking reflects the topology or how it is used. From the 
signatures and the C code I can tell this much:

topology/tree consumes topology.conf and generates a ranking of some kind 
that is passed to topology-common.
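For concreteness, a minimal three-level topology.conf of the kind topology/tree 
parses might look like the following (all switch names here are made up for 
illustration; the node names reuse this list's clus[01-12] convention):

```
# Hypothetical 3-level tree: core -> aggregation -> leaf switches
SwitchName=core Switches=agg[1-2]
SwitchName=agg1 Switches=leaf[1-2]
SwitchName=agg2 Switches=leaf[3-4]
SwitchName=leaf1 Nodes=clus[01-03]
SwitchName=leaf2 Nodes=clus[04-06]
SwitchName=leaf3 Nodes=clus[07-09]
SwitchName=leaf4 Nodes=clus[10-12]
```

As far as I understand, the plugin flattens this into per-switch node bitmaps 
and derives the node ranking so that nodes under the same leaf switch sort 
adjacently.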

topology-common consumes that ranking and uses its own gres-sched logic to 
figure out which nodes can fit a job (possibly pulling info from the 
gres/select plugin to determine node capabilities).

It is then supposed to apply a best-fit algorithm to fill vacant cluster 
capacity efficiently, but I can't manage to follow this part in the code, as 
everything splinters into separate files that I can't link together in my head.
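For what it's worth, the best-fit idea over leaf switches can be sketched 
conceptually like this (this is my own illustration, not Slurm's actual code): 
prefer the single leaf that fits the job most tightly, and only otherwise span 
leaves, taking the fullest ones first so the job touches as few switches as 
possible.

```python
# Illustrative best-fit over leaf switches (NOT Slurm's implementation).
# free_per_leaf maps each leaf switch to its number of currently free nodes.

def best_fit_leaves(free_per_leaf, nodes_needed):
    """Return the leaves to allocate from: the tightest single leaf that
    fits, else a greedy span over the fullest leaves."""
    # 1. Best fit on a single leaf: the smallest leaf that still fits.
    single = [(free, leaf) for leaf, free in free_per_leaf.items()
              if free >= nodes_needed]
    if single:
        return [min(single)[1]]
    # 2. Otherwise span leaves: take the leaves with the most free nodes
    #    first, minimizing the number of switches the job touches.
    chosen, remaining = [], nodes_needed
    for leaf, free in sorted(free_per_leaf.items(), key=lambda kv: -kv[1]):
        if remaining <= 0:
            break
        chosen.append(leaf)
        remaining -= free
    return chosen if remaining <= 0 else []  # [] = job cannot fit right now

free = {"leaf1": 2, "leaf2": 4, "leaf3": 3}
print(best_fit_leaves(free, 3))  # leaf3 fits exactly -> ["leaf3"]
print(best_fit_leaves(free, 6))  # spans the two fullest leaves
```

The real code additionally has to weigh CPU/GRES availability per node, which 
is presumably where the gres-sched calls come in.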

Thanks in advance.

referenced docs: 
 
 


-- 
slurm-users mailing list -- slurm-users@lists.schedmd.com
To unsubscribe send an email to slurm-users-le...@lists.schedmd.com


[slurm-users] Re: Lua script

2024-03-20 Thread Ole Holm Nielsen via slurm-users

What are the contents of your /etc/slurm/job_submit.lua file?
Did you reconfigure slurmctld?
Check the log file with: grep job_submit /var/log/slurm/slurmctld.log
What is your Slurm version?

You can read about job_submit plugins in this Wiki page:
https://wiki.fysik.dtu.dk/Niflheim_system/Slurm_configuration/#job-submit-plugins
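
For reference, a minimal job_submit.lua in the spirit of that page could 
reject over-long jobs at submission time. This is an untested sketch; the 
240-minute cap is just an example matching your MaxTime=04:00:00 partition, 
and the field and function names are taken from the job_submit plugin API:

```lua
-- Minimal job_submit.lua sketch (untested). job_desc.time_limit is in
-- minutes; slurm.NO_VAL means the job did not request a time limit.
local MAX_MINUTES = 240  -- example cap

function slurm_job_submit(job_desc, part_list, submit_uid)
    if job_desc.time_limit ~= nil and
       job_desc.time_limit ~= slurm.NO_VAL and
       job_desc.time_limit > MAX_MINUTES then
        slurm.log_user("Requested time limit exceeds the %d-minute cap",
                       MAX_MINUTES)
        return slurm.ERROR
    end
    return slurm.SUCCESS
end

function slurm_job_modify(job_desc, job_rec, part_list, modify_uid)
    return slurm.SUCCESS
end
```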

I hope this helps,
Ole


On 3/20/24 09:49, Gestió Servidors via slurm-users wrote:
after adding “EnforcePartLimits=ALL” in slurm.conf and restarting the 
slurmctld daemon, the job continues being accepted… so I don’t understand 
what I’m doing wrong.




[slurm-users] Re: Lua script

2024-03-20 Thread Gestió Servidors via slurm-users
Hello,

after adding "EnforcePartLimits=ALL" in slurm.conf and restarting the 
slurmctld daemon, the job continues being accepted... so I don't understand 
what I'm doing wrong.

My slurm.conf is this:
ControlMachine=my_server
MailProg=/bin/mail
MpiDefault=none
ProctrackType=proctrack/linuxproc
ReturnToService=2
SlurmctldPidFile=/var/run/slurmctld.pid
SlurmctldPort=6817
SlurmdPidFile=/var/run/slurmd.pid
SlurmdPort=6818
SlurmdSpoolDir=/var/spool/slurmd
SlurmUser=slurm
SlurmdUser=root
AuthType=auth/munge
StateSaveLocation=/var/log/slurm
SwitchType=switch/none
TaskPlugin=task/none,task/affinity,task/cgroup
TaskPluginParam=none
DebugFlags=NO_CONF_HASH,Backfill,BackfillMap,SelectType,Steps,TraceJobs
JobSubmitPlugins=lua
SchedulerType=sched/backfill
SelectType=select/cons_tres
SelectTypeParameters=CR_Core
SchedulerParameters=max_script_size=20971520
EnforcePartLimits=ALL
CoreSpecPlugin=core_spec/none
AccountingStorageType=accounting_storage/slurmdbd
AccountingStoreFlags=job_comment
JobCompType=jobcomp/filetxt
JobCompLoc=/var/log/slurm/job_completions
ClusterName=my_cluster
JobAcctGatherType=jobacct_gather/linux
SlurmctldDebug=5
SlurmctldLogFile=/var/log/slurmctld.log
SlurmdDebug=5
SlurmdLogFile=/var/log/slurmd.log
AccountingStorageEnforce=limits
AccountingStorageHost=my_server
NodeName=clus[01-06] CPUs=12 SocketsPerBoard=2 CoresPerSocket=6 
ThreadsPerCore=1 RealMemory=128387 TmpDisk=81880 Feature=big-mem
NodeName=clus[07-12] CPUs=12 SocketsPerBoard=2 CoresPerSocket=6 
ThreadsPerCore=1 RealMemory=15491 TmpDisk=81880 Feature=small-mem
NodeName=clus-login CPUs=4 SocketsPerBoard=2 CoresPerSocket=2 ThreadsPerCore=1 
RealMemory=15886 TmpDisk=30705
PartitionName=nodo.q Nodes=clus[01-12] Default=YES MaxTime=04:00:00 State=UP 
AllocNodes=clus-login,clus05 MaxCPUsPerNode=12
KillOnBadExit=1
OverTimeLimit=30 # if the job runs more than 30 minutes past the maximum 
time (2 hours), it is cancelled
TCPTimeout=5
PriorityType=priority/multifactor
PriorityDecayHalfLife=7-0
PriorityCalcPeriod=5
PriorityUsageResetPeriod=QUARTERLY
PriorityFavorSmall=NO
PriorityMaxAge=7-0
PriorityWeightAge=1
PriorityWeightFairshare=100
PriorityWeightJobSize=1000
PriorityWeightPartition=1000
PriorityWeightQOS=0
PropagateResourceLimitsExcept=MEMLOCK

And testing script is this:
#!/bin/bash
#SBATCH --time=5-00:00:00
srun /bin/hostname
date
sleep 50
date

Why is my job being submitted to the queue rather than being refused BEFORE being queued?

Thanks.
