The -B option is a constraint on node selection, not on task binding. Because 
you specified --exclusive, Slurm allocated the entire node to your job and then 
applied the default cyclic distribution to select the threads to bind to your 
tasks. To select threads in the same socket for binding, try specifying block 
as the second distribution method: -m block:block or -m *:block. See the CPU 
Management Guide for more information and examples: 
http://slurm.schedmd.com/cpu_management.html
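
As a rough, untested sketch of that suggestion, the original command with the 
block second distribution added might look like the following. The partition 
name and output file are copied from the original post; the cmd variable and 
the srun availability check are only illustrative additions. Check the 
cpu_bind lines in the log to confirm where the tasks actually land.

```shell
# Sketch only: the original command plus -m block:block, as suggested above.
# Partition "operation" and output file "log" come from the original post.
cmd='srun -N1 -n6 -B 1:3:2 --exclusive -m block:block -p operation -o log hostname'

# Guard so the snippet is harmless on a machine without Slurm installed.
if command -v srun >/dev/null 2>&1; then
    $cmd
else
    echo "srun not found; would run: $cmd"
fi
```
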
Martin Perry

From: Mamerto Bacallado [mailto:[email protected]]
Sent: Monday, March 14, 2016 2:37 AM
To: slurm-dev
Subject: [slurm-dev] Confine tasks to one socket with -B option


Hi all,

I am working on a computer with dual-socket nodes, 12 cores per socket, with 
hyperthreading enabled.
Slurm version is 14.11.9 and the relevant configuration is:

TaskPlugin=task/affinity
TaskPluginParam=Cpusets

Cpuinfo shows the CPUs labelled as:

                        SOCKET 1                              SOCKET 2
--------------------------------------------------------------------------------------
Core id  | 00 01 02 03 04 05 06 07 08 09 10 11 | 00 01 02 03 04 05 06 07 08 09 10 11 |
---------|-------------------------------------|-------------------------------------|
Thread 0 | 00 01 02 03 04 05 06 07 08 09 10 11 | 12 13 14 15 16 17 18 19 20 21 22 23 |
Thread 1 | 24 25 26 27 28 29 30 31 32 33 34 35 | 36 37 38 39 40 41 42 43 44 45 46 47 |
--------------------------------------------------------------------------------------

In order to launch 6 tasks in one socket only I run:

$ srun -N1 -n6  -B 1:3:2 --exclusive -p operation -o log hostname

assuming the -B option will set 1 socket per node, 3 cores per socket and 2 
threads per core.
Nevertheless, the log file shows that the 6 tasks ran on both sockets, 
cyclically assigned:

cpu_bind=MASK - node001, task  0  0 [100602]: mask 0x1 set           --> cpuid=00
cpu_bind=MASK - node001, task  1  1 [100603]: mask 0x1000 set        --> cpuid=12
cpu_bind=MASK - node001, task  2  2 [100604]: mask 0x1000000 set     --> cpuid=24
cpu_bind=MASK - node001, task  3  3 [100605]: mask 0x1000000000 set  --> cpuid=36
cpu_bind=MASK - node001, task  4  4 [100606]: mask 0x2 set           --> cpuid=01
cpu_bind=MASK - node001, task  5  5 [100607]: mask 0x2000 set        --> cpuid=13

                        SOCKET 1                              SOCKET 2
--------------------------------------------------------------------------------------
Core id  | 00 01 02 03 04 05 06 07 08 09 10 11 | 00 01 02 03 04 05 06 07 08 09 10 11 |
---------|-------------------------------------|-------------------------------------|
Thread 0 | x  x                                | x  x                                |
Thread 1 | x                                   | x                                   |
--------------------------------------------------------------------------------------
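
The cpuid annotations next to the masks can be reproduced mechanically: each 
mask has a single bit set, and the position of that bit is the logical CPU id. 
The small sketch below decodes the six masks from the log and maps each id to 
a socket. The helper names mask_to_cpuid and cpuid_to_socket are made up for 
this illustration, and the socket formula assumes the numbering shown in the 
Cpuinfo table (ids 0-11 and 24-35 on socket 1, ids 12-23 and 36-47 on socket 2).

```shell
#!/bin/sh
# Position of the lowest set bit in a cpu_bind mask = logical CPU id.
mask_to_cpuid() {
    m=$(($1))
    # An empty mask has no CPU to report.
    [ "$m" -ne 0 ] || return 1
    id=0
    while [ $((m & 1)) -eq 0 ]; do
        m=$((m >> 1))
        id=$((id + 1))
    done
    echo "$id"
}

# Socket number for a logical CPU id.
# Assumption: 12 cores per socket, numbered as in the Cpuinfo table.
cpuid_to_socket() {
    echo $(( ($1 / 12) % 2 + 1 ))
}

# Masks copied from the log, in task order 0..5.
for mask in 0x1 0x1000 0x1000000 0x1000000000 0x2 0x2000; do
    id=$(mask_to_cpuid "$mask")
    printf '%-14s cpuid=%02d socket=%d\n' "$mask" "$id" "$(cpuid_to_socket "$id")"
done
```

Run against the masks above, the socket column alternates between 1 and 2 
from one task to the next, which is the cyclic pattern marked in the table.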

Am I misinterpreting how -B option works?

Regards,
Mam
