Dear HPX experts,

I am trying to spawn 8 HPX processes on a cluster node that has 8 NUMA nodes with 6 physical CPU cores each. Everything seems to work, but the output of `--hpx:print-bind` confuses me.

I am using Slurm (the sbatch command) and OpenMPI (the mpirun command inside the sbatch script). The output of mpirun's `--display-map` makes complete sense: all 8 process ranks are assigned, in order, to the 6 cores of the 8 NUMA nodes. Process rank 0 is on the first NUMA node, and so on.

The output of `--hpx:print-bind` does not seem to be in sync with this. There is a correspondence between MPI ranks and HPX locality ids, but the mapping of HPX localities to CPU cores is now different. For example, locality 1 appears to be not on the second NUMA node (as reported by mpirun's `--display-map`) but on the seventh (as reported by HPX's `--hpx:print-bind`). Moreover, the output of `--hpx:print-bind` differs between invocations.
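To make sure I am comparing the right numbers, the cross-check I have in mind looks roughly like this (a minimal sketch, not my actual application; it assumes HPX is built with the MPI parcelport, so MPI is already initialized by the time hpx_main runs):

// Print each process' MPI rank next to its HPX locality id, so the two
// numberings can be compared directly. Sketch only; assumes the MPI
// parcelport has already initialized MPI_COMM_WORLD.
#include <hpx/hpx.hpp>
#include <hpx/hpx_init.hpp>

#include <mpi.h>

#include <cstdio>

int hpx_main(int argc, char* argv[])
{
    int mpi_rank = -1;
    MPI_Comm_rank(MPI_COMM_WORLD, &mpi_rank);

    std::printf("MPI rank %d <-> HPX locality %u\n", mpi_rank,
        hpx::get_locality_id());

    return hpx::finalize();
}

int main(int argc, char* argv[])
{
    return hpx::init(argc, argv);
}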

It is important for me that HPX localities are assigned to the NUMA nodes in order: localities with adjacent ids communicate more with each other than with other localities.
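Schematically, the communication pattern looks like the following sketch (simplified; exchange_with_neighbours is an illustrative name, not a function of my real application):

// Each locality mostly exchanges data with the localities whose ids are
// adjacent to its own, which is why ids should map to adjacent NUMA nodes.
#include <hpx/hpx.hpp>

#include <cstdint>

void exchange_with_neighbours()
{
    std::uint32_t const here = hpx::get_locality_id();
    std::uint32_t const nloc = hpx::get_num_localities(hpx::launch::sync);

    // Main communication partners: the previous and the next locality id.
    std::uint32_t const prev = (here + nloc - 1) % nloc;
    std::uint32_t const next = (here + 1) % nloc;

    // ... send boundary data to localities `prev` and `next` ...
}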

I have attached the Slurm script and the outputs mentioned above. Does anyone have an idea what is going on and how to fix this? Does HPX perhaps reassign the ranks during initialization? If so, can I influence this so that the locality ordering matches the ordering of the NUMA nodes?
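If it helps the diagnosis, I could also replace my_hpx_command with a small plain-MPI program that just prints each rank's affinity mask, to see what mpirun binds before HPX gets involved. A rough, Linux-specific sketch (g++ on Linux defines _GNU_SOURCE, which sched_getaffinity needs):

// Print the set of CPUs each MPI rank is allowed to run on, without HPX.
#include <mpi.h>

#include <sched.h>

#include <cstdio>

int main(int argc, char* argv[])
{
    MPI_Init(&argc, &argv);

    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    cpu_set_t mask;
    CPU_ZERO(&mask);
    sched_getaffinity(0, sizeof(mask), &mask);  // 0 = the calling process

    std::printf("rank %d allowed CPUs:", rank);
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &mask))
            std::printf(" %d", cpu);
    std::printf("\n");

    MPI_Finalize();
    return 0;
}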

By the way, I am fairly sure all of this worked fine some time ago, when I was still using an earlier version of HPX and a different version of MPI, and started the HPX processes with srun instead of mpirun.

Thanks for any info!

Kor

======================   ALLOCATED NODES   ======================
        node032: flags=0x11 slots=8 max_slots=0 slots_inuse=0 state=UP
=================================================================
 Data for JOB [38370,1] offset 0 Total slots allocated 8

 ========================   JOB MAP   ========================

 Data for node: node032 Num slots: 8    Max slots: 0    Num procs: 8
        Process OMPI jobid: [38370,1] App: 0 Process rank: 0 Bound: socket 0[core 0[hwt 0-1]], socket 0[core 1[hwt 0-1]], socket 0[core 2[hwt 0-1]], socket 0[core 3[hwt 0-1]], socket 0[core 4[hwt 0-1]], socket 0[core 5[hwt 0-1]]:[BB/BB/BB/BB/BB/BB/../../../../../../../../../../../../../../../../../..][../../../../../../../../../../../../../../../../../../../../../../../..]
        Process OMPI jobid: [38370,1] App: 0 Process rank: 1 Bound: socket 0[core 6[hwt 0-1]], socket 0[core 7[hwt 0-1]], socket 0[core 8[hwt 0-1]], socket 0[core 9[hwt 0-1]], socket 0[core 10[hwt 0-1]], socket 0[core 11[hwt 0-1]]:[../../../../../../BB/BB/BB/BB/BB/BB/../../../../../../../../../../../..][../../../../../../../../../../../../../../../../../../../../../../../..]
        Process OMPI jobid: [38370,1] App: 0 Process rank: 2 Bound: socket 0[core 12[hwt 0-1]], socket 0[core 13[hwt 0-1]], socket 0[core 14[hwt 0-1]], socket 0[core 15[hwt 0-1]], socket 0[core 16[hwt 0-1]], socket 0[core 17[hwt 0-1]]:[../../../../../../../../../../../../BB/BB/BB/BB/BB/BB/../../../../../..][../../../../../../../../../../../../../../../../../../../../../../../..]
        Process OMPI jobid: [38370,1] App: 0 Process rank: 3 Bound: socket 0[core 18[hwt 0-1]], socket 0[core 19[hwt 0-1]], socket 0[core 20[hwt 0-1]], socket 0[core 21[hwt 0-1]], socket 0[core 22[hwt 0-1]], socket 0[core 23[hwt 0-1]]:[../../../../../../../../../../../../../../../../../../BB/BB/BB/BB/BB/BB][../../../../../../../../../../../../../../../../../../../../../../../..]
        Process OMPI jobid: [38370,1] App: 0 Process rank: 4 Bound: socket 1[core 24[hwt 0-1]], socket 1[core 25[hwt 0-1]], socket 1[core 26[hwt 0-1]], socket 1[core 27[hwt 0-1]], socket 1[core 28[hwt 0-1]], socket 1[core 29[hwt 0-1]]:[../../../../../../../../../../../../../../../../../../../../../../../..][BB/BB/BB/BB/BB/BB/../../../../../../../../../../../../../../../../../..]
        Process OMPI jobid: [38370,1] App: 0 Process rank: 5 Bound: socket 1[core 30[hwt 0-1]], socket 1[core 31[hwt 0-1]], socket 1[core 32[hwt 0-1]], socket 1[core 33[hwt 0-1]], socket 1[core 34[hwt 0-1]], socket 1[core 35[hwt 0-1]]:[../../../../../../../../../../../../../../../../../../../../../../../..][../../../../../../BB/BB/BB/BB/BB/BB/../../../../../../../../../../../..]
        Process OMPI jobid: [38370,1] App: 0 Process rank: 6 Bound: socket 1[core 36[hwt 0-1]], socket 1[core 37[hwt 0-1]], socket 1[core 38[hwt 0-1]], socket 1[core 39[hwt 0-1]], socket 1[core 40[hwt 0-1]], socket 1[core 41[hwt 0-1]]:[../../../../../../../../../../../../../../../../../../../../../../../..][../../../../../../../../../../../../BB/BB/BB/BB/BB/BB/../../../../../..]
        Process OMPI jobid: [38370,1] App: 0 Process rank: 7 Bound: socket 1[core 42[hwt 0-1]], socket 1[core 43[hwt 0-1]], socket 1[core 44[hwt 0-1]], socket 1[core 45[hwt 0-1]], socket 1[core 46[hwt 0-1]], socket 1[core 47[hwt 0-1]]:[../../../../../../../../../../../../../../../../../../../../../../../..][../../../../../../../../../../../../../../../../../../BB/BB/BB/BB/BB/BB]

 =============================================================


*******************************************************************************
locality: 0
   0: PU L#0(P#0), Core L#0(P#0), Socket L#0(P#0), on pool "default"
   1: PU L#2(P#1), Core L#1(P#1), Socket L#0(P#0), on pool "default"
   2: PU L#4(P#2), Core L#2(P#2), Socket L#0(P#0), on pool "default"
   3: PU L#6(P#3), Core L#3(P#4), Socket L#0(P#0), on pool "default"
   4: PU L#8(P#4), Core L#4(P#5), Socket L#0(P#0), on pool "default"
   5: PU L#10(P#5), Core L#5(P#6), Socket L#0(P#0), on pool "default"
*******************************************************************************
locality: 5
   0: PU L#36(P#18), Core L#18(P#24), Socket L#0(P#0), on pool "default"
   1: PU L#38(P#19), Core L#19(P#25), Socket L#0(P#0), on pool "default"
   2: PU L#40(P#20), Core L#20(P#26), Socket L#0(P#0), on pool "default"
   3: PU L#42(P#21), Core L#21(P#28), Socket L#0(P#0), on pool "default"
   4: PU L#44(P#22), Core L#22(P#29), Socket L#0(P#0), on pool "default"
   5: PU L#46(P#23), Core L#23(P#30), Socket L#0(P#0), on pool "default"
*******************************************************************************
locality: 4
   0: PU L#12(P#6), Core L#6(P#8), Socket L#0(P#0), on pool "default"
   1: PU L#14(P#7), Core L#7(P#9), Socket L#0(P#0), on pool "default"
   2: PU L#16(P#8), Core L#8(P#10), Socket L#0(P#0), on pool "default"
   3: PU L#18(P#9), Core L#9(P#12), Socket L#0(P#0), on pool "default"
   4: PU L#20(P#10), Core L#10(P#13), Socket L#0(P#0), on pool "default"
   5: PU L#22(P#11), Core L#11(P#14), Socket L#0(P#0), on pool "default"
*******************************************************************************
locality: 2
   0: PU L#24(P#12), Core L#12(P#16), Socket L#0(P#0), on pool "default"
   1: PU L#26(P#13), Core L#13(P#17), Socket L#0(P#0), on pool "default"
   2: PU L#28(P#14), Core L#14(P#18), Socket L#0(P#0), on pool "default"
   3: PU L#30(P#15), Core L#15(P#20), Socket L#0(P#0), on pool "default"
   4: PU L#32(P#16), Core L#16(P#21), Socket L#0(P#0), on pool "default"
   5: PU L#34(P#17), Core L#17(P#22), Socket L#0(P#0), on pool "default"
*******************************************************************************
locality: 6
   0: PU L#48(P#24), Core L#24(P#0), Socket L#1(P#1), on pool "default"
   1: PU L#50(P#25), Core L#25(P#1), Socket L#1(P#1), on pool "default"
   2: PU L#52(P#26), Core L#26(P#2), Socket L#1(P#1), on pool "default"
   3: PU L#54(P#27), Core L#27(P#4), Socket L#1(P#1), on pool "default"
   4: PU L#56(P#28), Core L#28(P#5), Socket L#1(P#1), on pool "default"
   5: PU L#58(P#29), Core L#29(P#6), Socket L#1(P#1), on pool "default"
*******************************************************************************
locality: 7
   0: PU L#84(P#42), Core L#42(P#24), Socket L#1(P#1), on pool "default"
   1: PU L#86(P#43), Core L#43(P#25), Socket L#1(P#1), on pool "default"
   2: PU L#88(P#44), Core L#44(P#26), Socket L#1(P#1), on pool "default"
   3: PU L#90(P#45), Core L#45(P#28), Socket L#1(P#1), on pool "default"
   4: PU L#92(P#46), Core L#46(P#29), Socket L#1(P#1), on pool "default"
   5: PU L#94(P#47), Core L#47(P#30), Socket L#1(P#1), on pool "default"
*******************************************************************************
locality: 1
   0: PU L#72(P#36), Core L#36(P#16), Socket L#1(P#1), on pool "default"
   1: PU L#74(P#37), Core L#37(P#17), Socket L#1(P#1), on pool "default"
   2: PU L#76(P#38), Core L#38(P#18), Socket L#1(P#1), on pool "default"
   3: PU L#78(P#39), Core L#39(P#20), Socket L#1(P#1), on pool "default"
   4: PU L#80(P#40), Core L#40(P#21), Socket L#1(P#1), on pool "default"
   5: PU L#82(P#41), Core L#41(P#22), Socket L#1(P#1), on pool "default"
*******************************************************************************
locality: 3
   0: PU L#60(P#30), Core L#30(P#8), Socket L#1(P#1), on pool "default"
   1: PU L#62(P#31), Core L#31(P#9), Socket L#1(P#1), on pool "default"
   2: PU L#64(P#32), Core L#32(P#10), Socket L#1(P#1), on pool "default"
   3: PU L#66(P#33), Core L#33(P#12), Socket L#1(P#1), on pool "default"
   4: PU L#68(P#34), Core L#34(P#13), Socket L#1(P#1), on pool "default"
   5: PU L#70(P#35), Core L#35(P#14), Socket L#1(P#1), on pool "default"

#!/usr/bin/env bash
set -e


sbatch << END_OF_SLURM_SCRIPT
#!/usr/bin/env bash
#SBATCH --nodes=1
#SBATCH --ntasks=8
#SBATCH --cpus-per-task=12
#SBATCH --cores-per-socket=6
#SBATCH --partition=allq
#SBATCH --qos=swdev


module purge
module load opt/all
module load userspace/all
module load libraries/zstd/1.3.7
module load gcc/10.2.0
module load openmpi/gcc-10.2.0/4.0.4
module load libraries/papi/6.0.0.1
module load perftools/2.9.1


mpirun --n 8 --mca btl_openib_allow_ib true --display-map --display-allocation \
    my_hpx_command --hpx:print-bind

END_OF_SLURM_SCRIPT