We are running a Slurm controller (2.6.5) with the built-in scheduler. No matter which options I give to sbatch and srun, multiple tasks always end up running on a single core.
I have thousands of independent tasks I want to run. I should be able to run each of them on its own core, right? I don't care about memory bandwidth, but I do care about each task getting a dedicated core. All the compute nodes have 8 cores, and I want to run 8 tasks per node, each on a dedicated core: task 1 should run on core 1, and so on, up to task 8 on core 8. What actually happens is that all 8 tasks run on core 1, which is not what I want.

I also experimented with --exclusive and --shared. The partition I use is set to exclusive mode.

Here is an example batch script I use:

    #!/bin/bash
    #SBATCH --partition=m610 -N9 --output=~/experiments/scripts/slurm-out.log --open-mode=append --cpus-per-task=1 --ntasks-per-core=1 --ntasks-per-node=8

    #steps 1 - 500
    srun -n1 -N1 --exclusive --time=35 ~/experiments/scripts/steps/step_718f5c96-18da-421d-840a-ee94d4ddee18.sh &
    ... thousands more similar tasks ...

The full list of scheduling options is:

    # SCHEDULING
    #DefMemPerCPU=0
    FastSchedule=1
    #MaxMemPerCPU=0
    #SchedulerRootFilter=1
    #SchedulerTimeSlice=30
    SchedulerType=sched/builtin
    SchedulerPort=7321
    SelectType=select/cons_res
    SelectTypeParameters=CR_Core_Memory
    SchedulerParameters=defer

Any ideas what I am doing wrong?
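
In case it helps with diagnosis, here is a minimal sketch of a check that could be run as a job step (or added to the top of a step script) to see which cores each step is actually allowed to use. It assumes Linux compute nodes with /proc mounted. The helper name `allowed_cpus` is made up for illustration; `SLURM_STEP_ID` and `SLURM_PROCID` are environment variables that srun sets for each step, and they fall back to "n/a" when the script is run outside Slurm.

```shell
#!/bin/bash
# Print the CPU cores a process is allowed to run on (Linux only).
# Reads the Cpus_allowed_list field from /proc/<pid>/status.
# With no argument it reports on the calling process tree ("self").
allowed_cpus() {
    local pid="${1:-self}"
    awk '/^Cpus_allowed_list:/ {print $2}' "/proc/${pid}/status"
}

# One line per step, so the output of many steps can be compared in the log.
echo "host=$(uname -n) step=${SLURM_STEP_ID:-n/a} task=${SLURM_PROCID:-n/a} cpus=$(allowed_cpus)"
```

If every step on a node reports the same single core, the steps really are bound to one core. If they all report the full range (for example 0-7), no binding is in effect and the placement is left to the kernel scheduler.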
