Hi Slurm community,

I have what is hopefully an easy question regarding CPU/partition configuration 
in slurm.conf.

BACKGROUND:

We are running Slurm 16.05.6 built on Ubuntu 14.04 LTS (because 14.04 works 
with our current bcfg2 XML configuration management servers).
Each node has two 12-core Intel(R) Xeon(R) E5-2680 v3 CPUs @ 2.50GHz.
Running 'cat /proc/cpuinfo' reports 48 processors, because each core consists 
of two threads (2 sockets x 12 cores x 2 threads = 48).
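
For reference, 'lscpu' on one of the nodes should summarize that same layout; 
on this hardware I'd expect something like:

  $ lscpu | egrep 'Socket|Core|Thread|^CPU\(s\)'
  CPU(s):                48
  Thread(s) per core:    2
  Core(s) per socket:    12
  Socket(s):             2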

I want to make sure that we are defining our CPUs and available cores to Slurm 
appropriately. What Slurm considers a CPU and what a process considers a thread 
can easily get mixed up semantically.
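
For comparison, running 'slurmd -C' on a node prints the configuration line 
that slurmd itself detects, so it can be checked against the NodeName 
definition further down. On these nodes I'd expect it to report roughly:

  $ slurmd -C
  NodeName=marzano01 CPUs=48 Boards=1 SocketsPerBoard=2 CoresPerSocket=12 ThreadsPerCore=2 RealMemory=128827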


PROBLEM: 

Most users run R. R is single threaded, so when someone submits a job it will 
take 1 thread and leave the other thread on the core empty. So although a user 
thinks there are 48 cores available, in actuality they only have the 24 
physical cores available to them. If, however, they are running an app that can 
use multiple threads (Julia?), then things are different. We had been getting 
by up to this point, until a user tried to run numpy arrays in his Python 3.5 
app, which has resulted in all kinds of CPU overload and memory swapping. He's 
using job arrays of size 32, running one array in each job, and on one node, 
for example, 12 of his Python apps are running but all 48 CPUs are utilized. 
Load average is 300.0+. Sometimes memory is swapping and sometimes not.
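
To give a concrete picture, his submissions look roughly like the sketch below 
(the script name is a placeholder, not his actual file). Each array task only 
asks Slurm for a single task, but if his numpy is linked against a threaded 
BLAS (OpenBLAS/MKL), every process will by default try to use every core it 
can see on the machine, regardless of the Slurm allocation:

  #!/bin/bash
  #SBATCH --partition=short
  #SBATCH --array=1-32
  #SBATCH --ntasks=1
  # 'analyze.py' is a placeholder for his actual script; a threaded BLAS
  # inside numpy will try to use all 48 CPUs on the node, not just the
  # CPUs Slurm allocated to this task
  python3.5 analyze.py ${SLURM_ARRAY_TASK_ID}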

Before getting into his submit script, I wanted to make sure we are configuring 
slurm.conf appropriately for our nodes; then I can make sure he's making the 
right allocations in his submit scripts.
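
If the slurm.conf side turns out to be fine, what I'd probably suggest for his 
scripts is something along these lines (a sketch only - the --cpus-per-task 
value is an assumption and would need to match what his app really uses): 
request as many CPUs per task as the code will thread across, and pin the 
OpenMP/BLAS thread counts to that allocation:

  #!/bin/bash
  #SBATCH --array=1-32
  #SBATCH --ntasks=1
  #SBATCH --cpus-per-task=4    # assumed value; set to the real thread count
  # limit numpy/BLAS threading to the CPUs Slurm actually allocated
  export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
  export OPENBLAS_NUM_THREADS=$SLURM_CPUS_PER_TASK
  python3.5 analyze.py ${SLURM_ARRAY_TASK_ID}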


SLURM.CONF:

Below is our slurm.conf - I assume the definition of our nodes and partitions 
at the bottom is the most suspect part. Can anyone advise as to the best way to 
configure these nodes for CPU utilization? We are using consumable resources 
for CPU but not for memory at this time. I'll also include the SLURM_ 
environment variables at the bottom (obtained by simply running 'srun env'), in 
case that helps too. It's interesting to me that SLURM_CPUS_ON_NODE=2. Is that 
correct? It doesn't seem right.

ClusterName=marzano
ControlMachine=lunchbox
ControlAddr=xxxxxxxxx
#BackupController=
#BackupAddr=
#
SlurmUser=slurm
#SlurmdUser=root
SlurmctldPort=6817
SlurmdPort=6818
AuthType=auth/munge
#JobCredentialPrivateKey=
#JobCredentialPublicCertificate=
StateSaveLocation=/slurm.state
SlurmdSpoolDir=/tmp/slurmd
SwitchType=switch/none
MpiDefault=none
SlurmctldPidFile=/var/run/slurm/slurmctld.pid
SlurmdPidFile=/var/run/slurm/slurmd.pid
ProctrackType=proctrack/pgid
#PluginDir=
#FirstJobId=
ReturnToService=2
#MaxJobCount=
#PlugStackConfig=/etc/slurm/plugstack.conf
#PropagatePrioProcess=
#PropagateResourceLimits=
#PropagateResourceLimitsExcept=
#Prolog=
#Epilog=
#SrunProlog=
#SrunEpilog=
#TaskProlog=
#TaskEpilog=
#TaskPlugin=
#TrackWCKey=no
#TreeWidth=50
#TmpFS=
#UsePAM=
#MailProg=/s/slurm/bin/smail
MailProg=/workspace/statlab/bin/smailwrap
#
# TIMERS
SlurmctldTimeout=300
SlurmdTimeout=300
InactiveLimit=0
MinJobAge=300
KillWait=30
Waittime=0
#
# SCHEDULING
SchedulerType=sched/backfill
#SchedulerAuth=
#SchedulerPort=
#SchedulerRootFilter=
SelectType=select/cons_res
SelectTypeParameters=CR_Core
FastSchedule=1
#DefMemPerNode           = UNLIMITED
#MaxMemPerNode           = UNLIMITED
#DefMemPerCPU           = UNLIMITED
MaxMemPerCPU           = 2600

PriorityType=priority/multifactor
PriorityDecayHalfLife=14-0
#PriorityUsageResetPeriod=14-0
PriorityWeightFairshare=100000
PriorityWeightAge=1000
PriorityWeightPartition=10000
PriorityWeightJobSize=1000
PriorityMaxAge=7-0
PriorityFavorSmall=NO
#
# LOGGING
SlurmctldDebug=6
SlurmctldLogFile=/var/log/slurmctld/slurmctld.log
SlurmdDebug=6
SlurmdLogFile=/var/log/slurmd/slurmd.log
JobCompType=jobcomp/none
#JobCompLoc=
#
# ACCOUNTING
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherFrequency=30
#
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageEnforce=limits,qos
AccountingStorageHost=lunchbox
AccountingStorageLoc=slurm_acct_db
AccountingStoragePass=auth/munge
AccountingStorageUser=slurm
#
# COMPUTE NODES
NodeName=marzano0[1-8] CPUs=48 Sockets=2 CoresPerSocket=12 ThreadsPerCore=2 RealMemory=128827 State=UNKNOWN

PartitionName=long Nodes=marzano0[1-4,7-8] Default=NO MaxTime=14-0 State=UP
PartitionName=short Nodes=marzano0[5-6] Default=YES MaxTime=4-0 State=UP


SLURM_PRIO_PROCESS=0
SRUN_DEBUG=3
SLURM_UMASK=0002
SLURM_CLUSTER_NAME=marzano
SLURM_SUBMIT_DIR=/workspace/software/cyana-3.97
SLURM_SUBMIT_HOST=lunchbox
SLURM_JOB_NAME=env
SLURM_JOB_CPUS_PER_NODE=2
SLURM_NTASKS=1
SLURM_NPROCS=1
SLURM_DISTRIBUTION=cyclic
SLURM_JOB_ID=21223
SLURM_JOBID=21223
SLURM_STEP_ID=0
SLURM_STEPID=0
SLURM_NNODES=1
SLURM_JOB_NUM_NODES=1
SLURM_NODELIST=marzano05
SLURM_JOB_PARTITION=short
SLURM_TASKS_PER_NODE=1
SLURM_SRUN_COMM_PORT=49261
SLURM_JOB_ACCOUNT=mikec
SLURM_JOB_QOS=normal
SLURM_STEP_NODELIST=marzano05
SLURM_JOB_NODELIST=marzano05
SLURM_STEP_NUM_NODES=1
SLURM_STEP_NUM_TASKS=1
SLURM_STEP_TASKS_PER_NODE=1
SLURM_STEP_LAUNCHER_PORT=49261
SLURM_SRUN_COMM_HOST=xxxxxxxxx
SLURM_TOPOLOGY_ADDR=marzano05
SLURM_TOPOLOGY_ADDR_PATTERN=node
TMPDIR=/tmp
SLURM_CPUS_ON_NODE=2
SLURM_TASK_PID=23727
SLURM_NODEID=0
SLURM_PROCID=0
SLURM_LOCALID=0
SLURM_LAUNCH_NODE_IPADDR=xxxxxxxx
SLURM_GTIDS=0
SLURM_CHECKPOINT_IMAGE_DIR=/var/slurm/checkpoint
SLURM_JOB_UID=3691
SLURM_JOB_USER=mikec
SLURMD_NODENAME=marzano05
