The -B option is a constraint on node selection. You specified --exclusive, so Slurm allocated the entire node to your job. It then applied the default distribution method of cyclic to select the threads to bind to your tasks.

To select threads in the same socket for binding, try specifying block second distribution, -m block:block or -m *:block.

See the CPU Management Guide for more information and examples:
http://slurm.schedmd.com/cpu_management.html

Martin Perry

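
To make the cyclic pattern in the quoted log easier to see, here is a small sketch in plain POSIX shell that decodes each single-bit cpu_bind mask into a CPU id and a socket. The function names mask_to_cpu and cpu_to_socket are hypothetical helpers, not part of Slurm, and the socket mapping simply assumes the CPU numbering shown in the table of the original post (CPUs 0-11 and 24-35 on socket 1, CPUs 12-23 and 36-47 on socket 2).

```shell
#!/bin/sh
# Hypothetical helpers (not Slurm tools) for reading cpu_bind masks.

# Print the CPU id encoded by a single-bit mask such as 0x1000.
mask_to_cpu() {
    m=$(( $1 ))
    cpu=0
    while [ "$m" -gt 1 ]; do
        m=$(( m >> 1 ))
        cpu=$(( cpu + 1 ))
    done
    echo "$cpu"
}

# Print the socket (1 or 2) for a CPU id, assuming the layout from the
# original post: thread 1 ids are thread 0 ids plus 24, 12 cores per socket.
cpu_to_socket() {
    echo $(( ($1 % 24) / 12 + 1 ))
}

# Masks copied from the log in the original post, in task order.
for mask in 0x1 0x1000 0x1000000 0x1000000000 0x2 0x2000; do
    cpu=$(mask_to_cpu "$mask")
    echo "mask $mask -> cpu $cpu -> socket $(cpu_to_socket "$cpu")"
done
```

With the masks from the log, the sockets come out as 1, 2, 1, 2, 1, 2, which is the cyclic assignment described above. Combining the original command with the suggested option would presumably look like "srun -N1 -n6 -B 1:3:2 -m block:block --exclusive -p operation -o log hostname"; this is untested here, and the resulting binding on Slurm 14.11.9 should be checked against the log output.
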
From: Mamerto Bacallado [mailto:[email protected]]
Sent: Monday, March 14, 2016 2:37 AM
To: slurm-dev
Subject: [slurm-dev] Confine tasks to one socket with -B option

Hi all,

I am working on a machine with dual-socket nodes, 12 cores per socket, with hyperthreading enabled. The Slurm version is 14.11.9 and the relevant configuration is:

    TaskPlugin=task/affinity
    TaskPluginParam=Cpusets

Cpuinfo shows the CPUs labelled as:

                        SOCKET 1                              SOCKET 2
--------------------------------------------------------------------------------------
Core id  | 00 01 02 03 04 05 06 07 08 09 10 11 | 00 01 02 03 04 05 06 07 08 09 10 11 |
---------|-------------------------------------|-------------------------------------|
Thread 0 | 00 01 02 03 04 05 06 07 08 09 10 11 | 12 13 14 15 16 17 18 19 20 21 22 23 |
Thread 1 | 24 25 26 27 28 29 30 31 32 33 34 35 | 36 37 38 39 40 41 42 43 44 45 46 47 |
--------------------------------------------------------------------------------------

In order to launch 6 tasks in one socket only, I run:

    $ srun -N1 -n6 -B 1:3:2 --exclusive -p operation -o log hostname

assuming the -B option will set 1 sockets-per-node, 3 cores-per-socket and 2 threads-per-core.

Nevertheless, the log file says that the 6 tasks have run in both sockets, cyclically assigned:

    cpu_bind=MASK - node001, task 0 0 [100602]: mask 0x1 set --> cpuid=00
    cpu_bind=MASK - node001, task 1 1 [100603]: mask 0x1000 set --> cpuid=12
    cpu_bind=MASK - node001, task 2 2 [100604]: mask 0x1000000 set --> cpuid=24
    cpu_bind=MASK - node001, task 3 3 [100605]: mask 0x1000000000 set --> cpuid=36
    cpu_bind=MASK - node001, task 4 4 [100606]: mask 0x2 set --> cpuid=01
    cpu_bind=MASK - node001, task 5 5 [100607]: mask 0x2000 set --> cpuid=13

                        SOCKET 1                              SOCKET 2
--------------------------------------------------------------------------------------
Core id  | 00 01 02 03 04 05 06 07 08 09 10 11 | 00 01 02 03 04 05 06 07 08 09 10 11 |
---------|-------------------------------------|-------------------------------------|
Thread 0 | x  x                                | x  x                                |
Thread 1 | x                                   | x                                   |
--------------------------------------------------------------------------------------

Am I misinterpreting how the -B option works?

Regards,
Mam
