Are the jobs allocated different nodes, or even different cores on the
same nodes? If so, they don't need to preempt each other. Also see:
http://slurm.schedmd.com/preempt.html
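
One quick way to check (a rough sketch; the job IDs below are
placeholders) is to compare the allocations of the high- and
low-priority jobs while both are running:

   squeue -o "%.8i %.9P %.2t %.6C %R"
   scontrol -d show job <high_priority_jobid>
   scontrol -d show job <low_priority_jobid>

If the node lists (and, in the detailed output, the CPU_IDs) of the
two jobs do not overlap, the scheduler found free resources for the
high-priority job and no preemption was needed.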
Quoting "Ryan M. Bergmann" <[email protected]>:
Hi dev list,
I'm having trouble getting preemption to work in suspend mode. I have
three partitions: one with a priority of 100, one with a priority of
3, and one with a priority of 1. When users submit jobs to the
priority-100 partition, a job on a lower-priority partition is not
suspended. I have SHARED=FORCE:1 for all partitions. Any ideas what
could be happening? Also, the default memory per CPU is 2GB; if users
do not explicitly specify their memory usage, will preemption fail
when the high-priority job needs more memory than what is free on the
node?
Thanks!
Ryan
My configuration is:
AccountingStorageBackupHost = (null)
AccountingStorageEnforce = associations,limits
AccountingStorageHost = berkelium.berkelium
AccountingStorageLoc = N/A
AccountingStoragePort = 6819
AccountingStorageType = accounting_storage/slurmdbd
AccountingStorageUser = N/A
AccountingStoreJobComment = YES
AuthType = auth/munge
BackupAddr = (null)
BackupController = (null)
BatchStartTimeout = 10 sec
BOOT_TIME = 2014-03-10T18:06:50
CacheGroups = 0
CheckpointType = checkpoint/none
ClusterName = cluster
CompleteWait = 0 sec
ControlAddr = berkelium
ControlMachine = berkelium
CryptoType = crypto/munge
DebugFlags = (null)
DefMemPerCPU = 2
DisableRootJobs = NO
EnforcePartLimits = NO
Epilog = (null)
EpilogMsgTime = 2000 usec
EpilogSlurmctld = (null)
FastSchedule = 1
FirstJobId = 1
GetEnvTimeout = 2 sec
GresTypes = (null)
GroupUpdateForce = 0
GroupUpdateTime = 600 sec
HASH_VAL = Match
HealthCheckInterval = 0 sec
HealthCheckProgram = (null)
InactiveLimit = 0 sec
JobAcctGatherFrequency = 30 sec
JobAcctGatherType = jobacct_gather/none
JobCheckpointDir = /var/slurm/checkpoint
JobCompHost = localhost
JobCompLoc = /var/log/slurm_jobcomp.log
JobCompPort = 0
JobCompType = jobcomp/filetxt
JobCompUser = root
JobCredentialPrivateKey = (null)
JobCredentialPublicCertificate = (null)
JobFileAppend = 0
JobRequeue = 1
JobSubmitPlugins = (null)
KillOnBadExit = 0
KillWait = 30 sec
Licenses = (null)
MailProg = /bin/mail
MaxJobCount = 10000
MaxJobId = 4294901760
MaxMemPerNode = UNLIMITED
MaxStepCount = 40000
MaxTasksPerNode = 128
MessageTimeout = 10 sec
MinJobAge = 300 sec
MpiDefault = none
MpiParams = (null)
NEXT_JOB_ID = 11897
OverTimeLimit = 0 min
PluginDir = /usr/lib64/slurm
PlugStackConfig = /etc/slurm/plugstack.conf
PreemptMode = GANG,SUSPEND
PreemptType = preempt/partition_prio
PriorityDecayHalfLife = 00:07:00
PriorityCalcPeriod = 00:05:00
PriorityFavorSmall = 0
PriorityFlags = 0
PriorityMaxAge = 7-00:00:00
PriorityUsageResetPeriod = NONE
PriorityType = priority/multifactor
PriorityWeightAge = 1
PriorityWeightFairShare = 5
PriorityWeightJobSize = 0
PriorityWeightPartition = 100
PriorityWeightQOS = 5
PrivateData = none
ProctrackType = proctrack/pgid
Prolog = (null)
PrologSlurmctld = (null)
PropagatePrioProcess = 0
PropagateResourceLimits = ALL
PropagateResourceLimitsExcept = (null)
RebootProgram = (null)
ReconfigFlags = (null)
ResumeProgram = (null)
ResumeRate = 300 nodes/min
ResumeTimeout = 60 sec
ResvOverRun = 0 min
ReturnToService = 1
SallocDefaultCommand = (null)
SchedulerParameters = (null)
SchedulerPort = 7321
SchedulerRootFilter = 1
SchedulerTimeSlice = 30 sec
SchedulerType = sched/backfill
SelectType = select/cons_res
SelectTypeParameters = CR_CORE_MEMORY
SlurmUser = slurm(202)
SlurmctldDebug = info
SlurmctldLogFile = (null)
SlurmSchedLogFile = (null)
SlurmctldPort = 6817
SlurmctldTimeout = 120 sec
SlurmdDebug = info
SlurmdLogFile = (null)
SlurmdPidFile = /var/run/slurmd.pid
SlurmdPort = 6818
SlurmdSpoolDir = /var/spool/slurmd
SlurmdTimeout = 300 sec
SlurmdUser = root(0)
SlurmSchedLogLevel = 0
SlurmctldPidFile = /var/run/slurmctld.pid
SLURM_CONF = /etc/slurm/slurm.conf
SLURM_VERSION = 2.4.3
SrunEpilog = (null)
SrunProlog = (null)
StateSaveLocation = /var/spool/slurmsave
SuspendExcNodes = (null)
SuspendExcParts = (null)
SuspendProgram = (null)
SuspendRate = 60 nodes/min
SuspendTime = NONE
SuspendTimeout = 30 sec
SwitchType = switch/none
TaskEpilog = (null)
TaskPlugin = task/affinity
TaskPluginParam = (null type)
TaskProlog = (null)
TmpFS = /tmp
TopologyPlugin = topology/none
TrackWCKey = 0
TreeWidth = 50
UsePam = 0
UnkillableStepProgram = (null)
UnkillableStepTimeout = 60 sec
VSizeFactor = 0 percent
WaitTime = 0 sec
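
For comparison, a minimal sketch of what the partition side of this
setup might look like in slurm.conf; the partition names and node
list are placeholders, and only the priorities, SHARED=FORCE:1,
PreemptType and PreemptMode come from the message above:

   PreemptType=preempt/partition_prio
   PreemptMode=GANG,SUSPEND
   SelectType=select/cons_res
   SelectTypeParameters=CR_Core_Memory
   PartitionName=high Nodes=node[01-04] Priority=100 Shared=FORCE:1
   PartitionName=mid  Nodes=node[01-04] Priority=3   Shared=FORCE:1
   PartitionName=low  Nodes=node[01-04] Priority=1   Shared=FORCE:1

Note that with suspend-mode preemption a suspended job releases its
CPUs but, as far as I know, keeps its memory allocation, so when
memory is tracked (CR_Core_Memory) a high-priority job that needs
more memory than is currently free on the node may still not be able
to start by suspending a lower-priority job.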