"Jette, Moe" <[email protected]> writes:

> My tests of this show try_sched() completing in a few milliseconds
> and I don't see how the existence of a constraint would measurably
> impact performance.
The only thing I can see (from the code; I don't know whether it makes
any difference in practice) is that if the job has constraints with
counts, _try_sched() runs job_req_node_filter(job_ptr, *avail_bitmap)
to reduce the set of nodes to test, but not if the job only has
constraints without counts.

I'm sorry for the lack of details.

> What version of SLURM are you using?

2.2.1

> What is your configuration?

Rough overview: ~650 nodes, ~4400 CPUs, priority/multifactor,
sched/backfill, select/cons_res with CR_CPU_Memory, PreemptMode
requeue, preempt/qos. Details: see the attached slurm.conf.

> Do you have job preemption configured and if so, how?

Yes, preemption by requeueing, with PreemptType=preempt/qos.
Basically, all QoS'es can preempt jobs in the lowpri QoS.

> How many active and queued jobs are there?

At the time, about 1000 running jobs and about 1000 queued jobs.

The problem is most likely related to the load of the cluster, so it
is hard to investigate on our test cluster. Is there some
debug/logging output that would help us figure out what happens?
## slurm.conf: main configuration file for SLURM
## $Id: slurm.conf,v 1.25 2011/02/12 16:59:04 root Exp root $

###
### Cluster
###
ClusterName=titan
#default: AuthType=auth/munge
#default: CryptoType=crypto/munge
SlurmctldPort=6817
SlurmdPort=6818
TmpFs=/work
#default: TreeWidth=50
# Use ceil(sqrt(#nodes))
TreeWidth=5

## Timers:
#default: MessageTimeout=10
SlurmdTimeout=36000
WaitTime=0

###
### Slurmctld
###
ControlMachine=teflon
#default: MinJobAge=300
SlurmUser=slurm
StateSaveLocation=/state/partition1/slurm/slurmstate

###
### Nodes
###
FastSchedule=2
HealthCheckInterval=60
HealthCheckProgram=/sbin/healthcheck
ReturnToService=1
Nodename=DEFAULT CoresPerSocket=2 Sockets=2 RealMemory=3949 State=unknown TmpDisk=10000 Weight=2027
PartitionName=DEFAULT MaxTime=Infinite State=up Shared=NO
Include /etc/slurm/slurmnodes.conf

###
### Jobs
###
PropagateResourceLimits=NONE
DefMemPerCPU=500
EnforcePartLimits=yes
#default: InactiveLimit=0
JobFileAppend=1
#default: JobRequeue=1
JobSubmitPlugins=lua
#default: MaxJobCount=10000
#default: MpiDefault=none #FIXME: openmpi?
#default: OverTimeLimit=0
VSizeFactor=150

## Prologs/Epilogs
# run by slurmctld as SlurmUser on ControlMachine before granting a job allocation:
#PrologSlurmctld=
# run by slurmd on each node prior to the first job step on the node:
Prolog=/site/sbin/slurmprolog
# run by srun on the node running srun, prior to the launch of a job step:
#SrunProlog=
# run as the user for each task prior to initiating the task:
TaskProlog=/site/sbin/taskprolog
# run as the user for each task after the task finishes:
#TaskEpilog=
# run by srun on the node running srun, after a job step finishes:
#SrunEpilog=
# run as root on each node when the job has completed:
Epilog=/site/sbin/slurmepilog
# run as SlurmUser on ControlMachine after the allocation is released:
#EpilogSlurmctld=

###
### Job Priority
###
PriorityType=priority/multifactor
#default: PriorityCalcPeriod=5
#default: PriorityDecayHalfLife=7-0 #(7 days)
#default: PriorityUsageResetPeriod=NONE
#default: PriorityMaxAge=7-0 #(7 days)
#default: PriorityFavorSmall=no
PriorityWeightAge=10000
#default: PriorityWeightFairshare=0
PriorityWeightJobSize=1000
#default: PriorityWeightPartition=0
PriorityWeightQOS=10000

###
### Scheduling
###
SchedulerType=sched/backfill
#default: SchedulerParameters=default_queue_depth=100,defer=?,bf_interval=30,bf_window=1440,max_job_bf=50
SelectType=select/cons_res
SelectTypeParameters=CR_CPU_Memory
PreemptMode=requeue
#PreemptMode=checkpoint # FIXME: cancels if checkpoint is not possible!
PreemptType=preempt/qos
CompleteWait=32 # KillWait + 2
#default: KillWait=30

###
### Checkpointing
###
# ************** WARNING ***********************
# *** ENABLING/DISABLING THIS KILLS ALL JOBS ***
# **********************************************
CheckpointType=checkpoint/blcr
JobCheckpointDir=/state/partition1/slurm/checkpoint

###
### Logging
###
SlurmctldDebug=6
SlurmctldLogFile=/var/log/slurm/slurmctld.log
SlurmSchedLogLevel=1
SlurmSchedLogFile=/var/log/slurm/sched.log
SlurmdDebug=5
SlurmdLogFile=/var/log/slurm/slurmd.log
#default: DebugFlags=

###
### Accounting (Slurmdbd)
###
AccountingStorageType=accounting_storage/slurmdbd
AccountingStorageHost=blaster
JobAcctGatherType=jobacct_gather/linux
#default: JobAcctGatherFrequency=30
ProctrackType=proctrack/linuxproc # FIXME: check out cgroup
AccountingStorageEnforce=limits,qos # combination of associations < limits < wckeys, qos
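The "all QoS'es can preempt lowpri" policy itself lives in the accounting database rather than in slurm.conf; for completeness, a hedged sketch of how such a QoS preemption list is typically set with sacctmgr. The "normal" QoS name is illustrative, not necessarily one of ours:

```shell
# Allow the (hypothetical) "normal" QoS to preempt jobs running under "lowpri"
sacctmgr modify qos normal set preempt=lowpri
# Verify the preemption relationships
sacctmgr show qos format=name,preempt
```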
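Since the problem only shows up under load, one thing that might help is confirming what the running slurmctld actually uses and temporarily raising its verbosity; a sketch using standard scontrol subcommands (the grep pattern is just illustrative):

```shell
# Show scheduler/preemption settings as seen by the running slurmctld
scontrol show config | grep -Ei 'sched|preempt'
# Temporarily raise slurmctld logging (very chatty on a busy cluster) ...
scontrol setdebug debug3
# ... and drop it back once enough has been captured
scontrol setdebug info
```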
-- 
Cheers,
B/H
