Hi everyone,

My objective: I want to assign a few tasks to the logical CPUs belonging to
one particular socket (say, socket 0), and at another time assign another
set of tasks to the logical CPUs belonging to the other socket (say,
socket 1). In short, I want to bind tasks to specific logical CPUs.

Slurm version used: 16.05.10-2

slurm.conf settings used for task affinity:

SelectType=select/cons_res
SelectTypeParameters=CR_Core
TaskPlugin=task/affinity
TaskPluginParam=sched

Node used: Xeon processor; two sockets each having 8 cores with 2
threads/core

Processor layout (/proc/cpuinfo):
processor   physical id   core id
0,16        0             0
1,17        0             1
2,18        0             2
3,19        0             3
4,20        0             4
5,21        0             5
6,22        0             6
7,23        0             7
8,24        1             0
9,25        1             1
10,26       1             2
11,27       1             3
12,28       1             4
13,29       1             5
14,30       1             6
15,31       1             7
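
For reference, the socket-to-CPU grouping in the table can be derived mechanically. The following is a minimal Python sketch, not part of my Slurm setup: `cpus_by_socket` and `sample_cpuinfo` are hypothetical helper names, and the sample text is synthetic, generated only to reproduce the layout table above. On a real node the function could be fed the contents of /proc/cpuinfo instead.

```python
from collections import defaultdict


def cpus_by_socket(cpuinfo_text):
    """Group logical CPU numbers by socket ("physical id") from cpuinfo-style text."""
    sockets = defaultdict(list)
    cpu = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processor blocks
        key, value = key.strip(), value.strip()
        if key == "processor":
            cpu = int(value)
        elif key == "physical id" and cpu is not None:
            sockets[int(value)].append(cpu)
    return {socket: sorted(cpus) for socket, cpus in sockets.items()}


def sample_cpuinfo():
    """Synthetic text matching the layout table in this post (illustration only)."""
    blocks = []
    for cpu in range(32):
        socket = (cpu % 16) // 8  # CPUs 0-7 and 16-23 on socket 0, the rest on socket 1
        core = cpu % 8
        blocks.append(
            f"processor\t: {cpu}\nphysical id\t: {socket}\ncore id\t\t: {core}\n"
        )
    return "\n".join(blocks)


if __name__ == "__main__":
    for socket, cpus in sorted(cpus_by_socket(sample_cpuinfo()).items()):
        # Candidate list for --cpu_bind=map_cpu; whether Slurm honours it is my question.
        print(f"socket {socket}: map_cpu:{','.join(map(str, cpus))}")
```

For this node the grouping is CPUs 0-7 and 16-23 on socket 0, and CPUs 8-15 and 24-31 on socket 1.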

Question: *I am unable to assign all the tasks to specific logical CPUs
belonging to socket 0 or socket 1.*

The tasks are always assigned to socket 0 first, irrespective of the
specified map_cpu list, before going to socket 1.

*My observation:*

*$ srun -n 8 --cpu_bind=verbose,map_cpu:0,1,2,3,16,17,18,19
--distribution=block:block --mem=1024 sleep 100 &*
[1] 14665
cpu_bind=MASK - clusterhost1, task  0  0 [14697]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  1  1 [14698]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  4  4 [14701]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  2  2 [14699]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  3  3 [14700]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  5  5 [14702]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  6  6 [14703]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  7  7 [14704]: mask 0xf000f set
*$ srun bash -c "cat /proc/self/status | grep Cpus_allowed_list"*
Cpus_allowed_list:    4,20


*$ srun -n 8 --cpu_bind=verbose,map_cpu:0,1,2,3,4,5,6,7
--distribution=block:block --mem=1024 sleep 100 &*
[1] 14814
cpu_bind=MASK - clusterhost1, task  1  1 [14847]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  2  2 [14848]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  3  3 [14849]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  0  0 [14846]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  5  5 [14851]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  6  6 [14852]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  4  4 [14850]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  7  7 [14853]: mask 0xf000f set
*$ srun bash -c "cat /proc/self/status | grep Cpus_allowed_list"*
Cpus_allowed_list:    4,20

*$ srun -n 20
--cpu_bind=verbose,map_cpu:0,1,2,3,4,5,6,7,9,10,11,12,13,14,15,16,17,18,19
--distribution=block:block --mem=1024 sleep 100 &*
[1] 15688
cpu_bind=MASK - clusterhost1, task  1  1 [15721]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task  2  2 [15722]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task  4  4 [15724]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task  5  5 [15725]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task  7  7 [15727]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task  0  0 [15720]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task  6  6 [15726]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task  3  3 [15723]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task 10 10 [15730]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task  8  8 [15728]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task  9  9 [15729]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task 11 11 [15731]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task 12 12 [15732]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task 14 14 [15734]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task 13 13 [15733]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task 15 15 [15735]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task 16 16 [15736]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task 17 17 [15737]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task 18 18 [15738]: mask 0x3ff03ff set
cpu_bind=MASK - clusterhost1, task 19 19 [15739]: mask 0x3ff03ff set
*$ srun bash -c "cat /proc/self/status | grep Cpus_allowed_list"*
Cpus_allowed_list:    10,26

*$ srun -n 8 --cpu_bind=verbose,map_cpu:8,9,10,11,24,25,26,27
--distribution=block:block --mem=1024 sleep 100 &*
[1] 16816
cpu_bind=MASK - clusterhost1, task  1  1 [16850]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  4  4 [16853]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  3  3 [16852]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  2  2 [16851]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  0  0 [16849]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  6  6 [16855]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  5  5 [16854]: mask 0xf000f set
cpu_bind=MASK - clusterhost1, task  7  7 [16856]: mask 0xf000f set

*$ srun bash -c "cat /proc/self/status | grep Cpus_allowed_list"*
Cpus_allowed_list:    4,20
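
To make the verbose output above easier to read, the hexadecimal masks can be decoded into logical CPU numbers. This is a small Python sketch; `mask_to_cpus` and `format_cpu_list` are hypothetical helper names of my own, not Slurm tools. It assumes the usual convention that bit N of the mask stands for logical CPU N.

```python
def mask_to_cpus(mask):
    """Return the logical CPU numbers whose bits are set in a hex mask string."""
    value = int(mask, 16)  # accepts strings with or without the "0x" prefix
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def format_cpu_list(cpus):
    """Compress a sorted CPU list into ranges, e.g. [0, 1, 2, 16] -> '0-2,16'."""
    ranges = []
    for cpu in cpus:
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu  # extend the current run
        else:
            ranges.append([cpu, cpu])  # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


if __name__ == "__main__":
    for mask in ("0xf000f", "0x3ff03ff"):
        print(mask, "->", format_cpu_list(mask_to_cpus(mask)))
```

Decoded this way, 0xf000f covers CPUs 0-3 and 16-19, and 0x3ff03ff covers CPUs 0-9 and 16-25. So even the run with map_cpu:8,9,10,11,24,25,26,27 reports a mask that lies entirely on socket 0.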

*$ srun  --nodes=1 --ntasks=32 --cpu_bind=cores,verbose --label cat
/proc/self/status | grep Cpus_allowed_list*
00: cpu_bind=MASK - clusterhost1, task  0  0 [13955]: mask 0x10001 set
01: cpu_bind=MASK - clusterhost1, task  1  1 [13956]: mask 0x20002 set
04: cpu_bind=MASK - clusterhost1, task  4  4 [13959]: mask 0x100010 set
05: cpu_bind=MASK - clusterhost1, task  5  5 [13960]: mask 0x200020 set
06: cpu_bind=MASK - clusterhost1, task  6  6 [13961]: mask 0x400040 set
03: cpu_bind=MASK - clusterhost1, task  3  3 [13958]: mask 0x80008 set
02: cpu_bind=MASK - clusterhost1, task  2  2 [13957]: mask 0x40004 set
09: cpu_bind=MASK - clusterhost1, task  9  9 [13964]: mask 0x2000200 set
07: cpu_bind=MASK - clusterhost1, task  7  7 [13962]: mask 0x800080 set
10: cpu_bind=MASK - clusterhost1, task 10 10 [13965]: mask 0x4000400 set
11: cpu_bind=MASK - clusterhost1, task 11 11 [13966]: mask 0x8000800 set
14: cpu_bind=MASK - clusterhost1, task 14 14 [13969]: mask 0x40004000 set
15: cpu_bind=MASK - clusterhost1, task 15 15 [13970]: mask 0x80008000 set
12: cpu_bind=MASK - clusterhost1, task 12 12 [13967]: mask 0x10001000 set
13: cpu_bind=MASK - clusterhost1, task 13 13 [13968]: mask 0x20002000 set
08: cpu_bind=MASK - clusterhost1, task  8  8 [13963]: mask 0x1000100 set
17: cpu_bind=MASK - clusterhost1, task 17 17 [13972]: mask 0x20002 set
16: cpu_bind=MASK - clusterhost1, task 16 16 [13971]: mask 0x10001 set
20: cpu_bind=MASK - clusterhost1, task 20 20 [13975]: mask 0x100010 set
19: cpu_bind=MASK - clusterhost1, task 19 19 [13974]: mask 0x80008 set
18: cpu_bind=MASK - clusterhost1, task 18 18 [13973]: mask 0x40004 set
22: cpu_bind=MASK - clusterhost1, task 22 22 [13977]: mask 0x400040 set
21: cpu_bind=MASK - clusterhost1, task 21 21 [13976]: mask 0x200020 set
24: cpu_bind=MASK - clusterhost1, task 24 24 [13979]: mask 0x1000100 set
25: cpu_bind=MASK - clusterhost1, task 25 25 [13980]: mask 0x2000200 set
23: cpu_bind=MASK - clusterhost1, task 23 23 [13978]: mask 0x800080 set
26: cpu_bind=MASK - clusterhost1, task 26 26 [13981]: mask 0x4000400 set
30: cpu_bind=MASK - clusterhost1, task 30 30 [13985]: mask 0x40004000 set
31: cpu_bind=MASK - clusterhost1, task 31 31 [13986]: mask 0x80008000 set
28: cpu_bind=MASK - clusterhost1, task 28 28 [13983]: mask 0x10001000 set
29: cpu_bind=MASK - clusterhost1, task 29 29 [13984]: mask 0x20002000 set
27: cpu_bind=MASK - clusterhost1, task 27 27 [13982]: mask 0x8000800 set
03: Cpus_allowed_list:    3,19
04: Cpus_allowed_list:    4,20
01: Cpus_allowed_list:    1,17
06: Cpus_allowed_list:    6,22
00: Cpus_allowed_list:    0,16
02: Cpus_allowed_list:    2,18
05: Cpus_allowed_list:    5,21
09: Cpus_allowed_list:    9,25
10: Cpus_allowed_list:    10,26
14: Cpus_allowed_list:    14,30
11: Cpus_allowed_list:    11,27
15: Cpus_allowed_list:    15,31
12: Cpus_allowed_list:    12,28
13: Cpus_allowed_list:    13,29
17: Cpus_allowed_list:    1,17
07: Cpus_allowed_list:    7,23
16: Cpus_allowed_list:    0,16
08: Cpus_allowed_list:    8,24
20: Cpus_allowed_list:    4,20
19: Cpus_allowed_list:    3,19
18: Cpus_allowed_list:    2,18
21: Cpus_allowed_list:    5,21
22: Cpus_allowed_list:    6,22
24: Cpus_allowed_list:    8,24
23: Cpus_allowed_list:    7,23
26: Cpus_allowed_list:    10,26
30: Cpus_allowed_list:    14,30
31: Cpus_allowed_list:    15,31
25: Cpus_allowed_list:    9,25
28: Cpus_allowed_list:    12,28
29: Cpus_allowed_list:    13,29
27: Cpus_allowed_list:    11,27


*Kindly help me assign all the tasks to one socket or the other.*

Any kind of help will be appreciated.

Thanks in advance.



-- 
Thanks & Regards,
Animesh Kuity,
Research Scholar,
Computer Science department,
IIT Roorkee
