Hi everyone,

My objective: I want to assign a few tasks to the logical CPUs belonging to a particular socket (e.g., socket 0) and, at another time, assign another set of tasks to the logical CPUs belonging to the other socket (e.g., socket 1). In summary, I want to achieve task affinity to particular logical CPUs.

Slurm version used: 16.05.10-2

slurm.conf settings used to achieve task affinity:

    SelectType=select/cons_res
    SelectTypeParameters=CR_Core
    TaskPlugin=task/affinity
    TaskPluginParam=sched

Node used: Xeon processor; two sockets, each having 8 cores with 2 threads/core.

Processor layout (/proc/cpuinfo):

    processor   physical id   core id
    0,16        0             0
    1,17        0             1
    2,18        0             2
    3,19        0             3
    4,20        0             4
    5,21        0             5
    6,22        0             6
    7,23        0             7
    8,24        1             0
    9,25        1             1
    10,26       1             2
    11,27       1             3
    12,28       1             4
    13,29       1             5
    14,30       1             6
    15,31       1             7

Question:

* I am unable to assign all the tasks to particular logical CPUs belonging to socket 0 or socket 1.
* The tasks are always assigned to socket 0 first, irrespective of the specified map_cpu, before going to socket 1.

*My observation:*

    $ srun -n 8 --cpu_bind=verbose,map_cpu:0,1,2,3,16,17,18,19 --distribution=block:block --mem=1024 sleep 100 &
    [1] 14665
    cpu_bind=MASK - clusterhost1, task 0 0 [14697]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 1 1 [14698]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 4 4 [14701]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 2 2 [14699]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 3 3 [14700]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 5 5 [14702]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 6 6 [14703]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 7 7 [14704]: mask 0xf000f set

    $ srun bash -c "cat /proc/self/status | grep Cpus_allowed_list"
    Cpus_allowed_list: 4,20

    $ srun -n 8 --cpu_bind=verbose,map_cpu:0,1,2,3,4,5,6,7 --distribution=block:block --mem=1024 sleep 100 &
    [1] 14814
    cpu_bind=MASK - clusterhost1, task 1 1 [14847]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 2 2 [14848]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 3 3 [14849]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 0 0 [14846]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 5 5 [14851]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 6 6 [14852]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 4 4 [14850]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 7 7 [14853]: mask 0xf000f set

    $ srun bash -c "cat /proc/self/status | grep Cpus_allowed_list"
    Cpus_allowed_list: 4,20

    $ srun -n 20 --cpu_bind=verbose,map_cpu:0,1,2,3,4,5,6,7,9,10,11,12,13,14,15,16,17,18,19 --distribution=block:block --mem=1024 sleep 100 &
    [1] 15688
    cpu_bind=MASK - clusterhost1, task 1 1 [15721]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 2 2 [15722]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 4 4 [15724]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 5 5 [15725]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 7 7 [15727]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 0 0 [15720]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 6 6 [15726]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 3 3 [15723]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 10 10 [15730]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 8 8 [15728]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 9 9 [15729]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 11 11 [15731]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 12 12 [15732]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 14 14 [15734]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 13 13 [15733]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 15 15 [15735]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 16 16 [15736]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 17 17 [15737]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 18 18 [15738]: mask 0x3ff03ff set
    cpu_bind=MASK - clusterhost1, task 19 19 [15739]: mask 0x3ff03ff set

    $ srun bash -c "cat /proc/self/status | grep Cpus_allowed_list"
    Cpus_allowed_list: 10,26

    $ srun -n 8 --cpu_bind=verbose,map_cpu:8,9,10,11,24,25,26,27 --distribution=block:block --mem=1024 sleep 100 &
    [1] 16816
    cpu_bind=MASK - clusterhost1, task 1 1 [16850]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 4 4 [16853]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 3 3 [16852]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 2 2 [16851]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 0 0 [16849]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 6 6 [16855]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 5 5 [16854]: mask 0xf000f set
    cpu_bind=MASK - clusterhost1, task 7 7 [16856]: mask 0xf000f set

    $ srun bash -c "cat /proc/self/status | grep Cpus_allowed_list"
    Cpus_allowed_list: 4,20

    $ srun --nodes=1 --ntasks=32 --cpu_bind=cores,verbose --label cat /proc/self/status | grep Cpus_allowed_list
    00: cpu_bind=MASK - clusterhost1, task 0 0 [13955]: mask 0x10001 set
    01: cpu_bind=MASK - clusterhost1, task 1 1 [13956]: mask 0x20002 set
    04: cpu_bind=MASK - clusterhost1, task 4 4 [13959]: mask 0x100010 set
    05: cpu_bind=MASK - clusterhost1, task 5 5 [13960]: mask 0x200020 set
    06: cpu_bind=MASK - clusterhost1, task 6 6 [13961]: mask 0x400040 set
    03: cpu_bind=MASK - clusterhost1, task 3 3 [13958]: mask 0x80008 set
    02: cpu_bind=MASK - clusterhost1, task 2 2 [13957]: mask 0x40004 set
    09: cpu_bind=MASK - clusterhost1, task 9 9 [13964]: mask 0x2000200 set
    07: cpu_bind=MASK - clusterhost1, task 7 7 [13962]: mask 0x800080 set
    10: cpu_bind=MASK - clusterhost1, task 10 10 [13965]: mask 0x4000400 set
    11: cpu_bind=MASK - clusterhost1, task 11 11 [13966]: mask 0x8000800 set
    14: cpu_bind=MASK - clusterhost1, task 14 14 [13969]: mask 0x40004000 set
    15: cpu_bind=MASK - clusterhost1, task 15 15 [13970]: mask 0x80008000 set
    12: cpu_bind=MASK - clusterhost1, task 12 12 [13967]: mask 0x10001000 set
    13: cpu_bind=MASK - clusterhost1, task 13 13 [13968]: mask 0x20002000 set
    08: cpu_bind=MASK - clusterhost1, task 8 8 [13963]: mask 0x1000100 set
    17: cpu_bind=MASK - clusterhost1, task 17 17 [13972]: mask 0x20002 set
    16: cpu_bind=MASK - clusterhost1, task 16 16 [13971]: mask 0x10001 set
    20: cpu_bind=MASK - clusterhost1, task 20 20 [13975]: mask 0x100010 set
    19: cpu_bind=MASK - clusterhost1, task 19 19 [13974]: mask 0x80008 set
    18: cpu_bind=MASK - clusterhost1, task 18 18 [13973]: mask 0x40004 set
    22: cpu_bind=MASK - clusterhost1, task 22 22 [13977]: mask 0x400040 set
    21: cpu_bind=MASK - clusterhost1, task 21 21 [13976]: mask 0x200020 set
    24: cpu_bind=MASK - clusterhost1, task 24 24 [13979]: mask 0x1000100 set
    25: cpu_bind=MASK - clusterhost1, task 25 25 [13980]: mask 0x2000200 set
    23: cpu_bind=MASK - clusterhost1, task 23 23 [13978]: mask 0x800080 set
    26: cpu_bind=MASK - clusterhost1, task 26 26 [13981]: mask 0x4000400 set
    30: cpu_bind=MASK - clusterhost1, task 30 30 [13985]: mask 0x40004000 set
    31: cpu_bind=MASK - clusterhost1, task 31 31 [13986]: mask 0x80008000 set
    28: cpu_bind=MASK - clusterhost1, task 28 28 [13983]: mask 0x10001000 set
    29: cpu_bind=MASK - clusterhost1, task 29 29 [13984]: mask 0x20002000 set
    27: cpu_bind=MASK - clusterhost1, task 27 27 [13982]: mask 0x8000800 set
    03: Cpus_allowed_list: 3,19
    04: Cpus_allowed_list: 4,20
    01: Cpus_allowed_list: 1,17
    06: Cpus_allowed_list: 6,22
    00: Cpus_allowed_list: 0,16
    02: Cpus_allowed_list: 2,18
    05: Cpus_allowed_list: 5,21
    09: Cpus_allowed_list: 9,25
    10: Cpus_allowed_list: 10,26
    14: Cpus_allowed_list: 14,30
    11: Cpus_allowed_list: 11,27
    15: Cpus_allowed_list: 15,31
    12: Cpus_allowed_list: 12,28
    13: Cpus_allowed_list: 13,29
    17: Cpus_allowed_list: 1,17
    07: Cpus_allowed_list: 7,23
    16: Cpus_allowed_list: 0,16
    08: Cpus_allowed_list: 8,24
    20: Cpus_allowed_list: 4,20
    19: Cpus_allowed_list: 3,19
    18: Cpus_allowed_list: 2,18
    21: Cpus_allowed_list: 5,21
    22: Cpus_allowed_list: 6,22
    24: Cpus_allowed_list: 8,24
    23: Cpus_allowed_list: 7,23
    26: Cpus_allowed_list: 10,26
    30: Cpus_allowed_list: 14,30
    31: Cpus_allowed_list: 15,31
    25: Cpus_allowed_list: 9,25
    28: Cpus_allowed_list: 12,28
    29: Cpus_allowed_list: 13,29
    27: Cpus_allowed_list: 11,27

*Kindly help me to assign all the tasks to either socket.*

Any kind of help will be appreciated. Thanks in advance.

--
Thanks & Regards,
Animesh Kuity,
Research Scholar,
Computer Science department,
IIT Roorkee
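
P.S. To make the hex masks in the verbose output easier to read, here is a minimal Python sketch that expands a mask into logical CPU ids and looks up which socket each one belongs to. It is not part of Slurm; the names `mask_to_cpus`, `SOCKET_OF`, and `sockets_in_mask` are my own, and the socket table is simply copied from the /proc/cpuinfo layout in this post, so it is an assumption that holds only for this node.

```python
def mask_to_cpus(mask_hex):
    """Return the sorted logical CPU ids whose bits are set in a hex mask."""
    mask = int(mask_hex, 16)  # int() accepts the leading "0x" when base is 16
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


# Assumption: socket membership as listed in the processor layout of this post
# (CPUs 0-7 and 16-23 on socket 0, CPUs 8-15 and 24-31 on socket 1).
SOCKET_OF = {cpu: (0 if cpu % 16 < 8 else 1) for cpu in range(32)}


def sockets_in_mask(mask_hex):
    """Return the sorted socket ids touched by the CPUs in a hex mask."""
    return sorted({SOCKET_OF[cpu] for cpu in mask_to_cpus(mask_hex)})


if __name__ == "__main__":
    # Masks taken from the srun verbose output in this post.
    for mask in ("0xf000f", "0x3ff03ff", "0x10001"):
        print(mask, mask_to_cpus(mask), sockets_in_mask(mask))
```

Decoded this way, 0xf000f covers CPUs 0-3 and 16-19, which are all on socket 0, and that same mask is reported for the run that requested map_cpu:8,9,10,11,24,25,26,27.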
