Also, could you send the output of 'lstopo --of txt', please?

Regards Hartmut
---------------
https://stellar.cct.lsu.edu
https://github.com/STEllAR-GROUP/hpx
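In the meantime, here is a minimal sketch (not taken from this thread) of
the per-process cross-check described further down: each locality prints
its HPX locality id, the MPI rank HPX sees, and OMPI_COMM_WORLD_RANK, so
the three can be compared with the `--hpx:print-bind` output of the same
run. It assumes HPX was built with the MPI parcelport; the include path
used for hpx::util::mpi_environment is a guess and may differ between HPX
versions.

    // Minimal sketch: print locality id, MPI rank, and OMPI_COMM_WORLD_RANK
    // per process. Assumes the MPI parcelport is enabled; the mpi_base
    // header path below is an assumption and may differ per HPX version.
    #include <hpx/hpx_init.hpp>
    #include <hpx/hpx.hpp>
    #include <hpx/modules/mpi_base.hpp>  // hpx::util::mpi_environment (assumed path)

    #include <cstdlib>
    #include <iostream>
    #include <sstream>

    int hpx_main(int, char**)
    {
        char const* ompi_rank = std::getenv("OMPI_COMM_WORLD_RANK");

        // Assemble the whole line first so output from different
        // localities does not interleave mid-line.
        std::ostringstream os;
        os << "locality " << hpx::get_locality_id()
           << ": mpi rank " << hpx::util::mpi_environment::rank()
           << ", OMPI_COMM_WORLD_RANK "
           << (ompi_rank ? ompi_rank : "<unset>") << '\n';
        std::cout << os.str() << std::flush;

        return hpx::finalize();
    }

    int main(int argc, char* argv[])
    {
        return hpx::init(argc, argv);
    }

Launched with the same sbatch/mpirun settings as the real job, and with
`--hpx:print-bind` added to the command line, the two views should be
directly comparable.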
> -----Original Message-----
> From: Hartmut Kaiser <[email protected]>
> Sent: Tuesday, May 25, 2021 7:25 AM
> To: '[email protected]' <[email protected]>
> Subject: RE: [hpx-users] Assign HPX localities to NUMA nodes, in order
>
> Kor,
>
> From what I can see, `--hpx:print-bind` does not report NUMA domains,
> only sockets. Why do you think the localities are not mapped correctly
> to NUMA domains (assuming the sequencing of printing the locality
> information is random and does not reflect the sequencing of the NUMA
> domains)?
>
> We could look into printing the NUMA domain information as well, however.
>
> HTH
> Regards Hartmut
> ---------------
> https://stellar.cct.lsu.edu
> https://github.com/STEllAR-GROUP/hpx
>
> > -----Original Message-----
> > From: [email protected] <[email protected]>
> > On Behalf Of Kor de Jong
> > Sent: Tuesday, May 25, 2021 2:40 AM
> > To: [email protected]
> > Subject: Re: [hpx-users] Assign HPX localities to NUMA nodes, in order
> >
> > Hi Mikael and other HPX experts,
> >
> > Thanks for your suggestions! Unfortunately they did not improve things
> > for me. To be clear, the only thing I don't understand is the binding
> > reported by `--hpx:print-bind`. What I do understand is:
> >
> > - The binding of MPI process ranks to numa nodes, reported by mpirun's
> >   `--display-map`. Process rank 0 is bound to numa node 0, process
> >   rank 1 is bound to numa node 1, etc. This is exactly how I want
> >   things to be.
> >
> > - Relation between HPX localities and MPI ranks, printed from my own
> >   code: hpx::get_locality_id() == hpx::util::mpi_environment::rank()
> >   == std::getenv("OMPI_COMM_WORLD_RANK"). This implies that HPX
> >   localities are ordered the same way as the MPI processes. Locality 0
> >   should be bound to numa node 0, locality 1 should be bound to numa
> >   node 1, etc. This is exactly how I want things to be.
> >
> > The weird thing is that, according to `--hpx:print-bind`, localities
> > are scattered over the numa nodes. Locality 0 always ends up at the
> > first numa node, but the other ones are bound to numa nodes in a
> > seemingly random order. When performing scaling tests over numa nodes,
> > the resulting graphs show artifacts which could be the result of HPX
> > localities not being ordered according to increasing memory latencies.
> >
> > At the moment I can only think of `--hpx:print-bind` being wrong,
> > which is unlikely I guess. But why does it suggest that the localities
> > are scattered over the numa nodes, when all other information suggests
> > that they are ordered according to the numa nodes?
> >
> > Maybe I am just misunderstanding things. To be able to interpret the
> > results of my scaling tests, I would really like to understand what is
> > going on.
> >
> > Thanks in advance for any insights any of you might have for me!
> >
> > Kor
> >
> > On 5/21/21 5:02 PM, Simberg Mikael wrote:
> > > Hi Kor,
> > >
> > > The nondeterministic nature of your problem is a bit worrying, and I
> > > don't have any insight into that. However, there's an alternative
> > > way to set the bindings as well. Would you mind trying the
> > > --hpx:use-process-mask option to see if you get the expected
> > > bindings? By default HPX tries to reconstruct the bindings based on
> > > various environment variables, but if you pass --hpx:use-process-mask
> > > it will use the process mask that srun/mpi/others typically set, and
> > > only spawn worker threads on cores in the process mask. Note that
> > > the default, even with --hpx:use-process-mask, is still to only
> > > spawn one worker thread per core (not per hyperthread), so if you
> > > want exactly the binding you ask for with mpirun you should also add
> > > --hpx:threads=all.
> > >
> > > Mikael
> > >
> > > ------------------------------------------------------------------------
> > > *From:* [email protected] <[email protected]>
> > > on behalf of Kor de Jong <[email protected]>
> > > *Sent:* Friday, May 21, 2021 4:25:29 PM
> > > *To:* [email protected]
> > > *Subject:* {Spam?} [hpx-users] Assign HPX localities to NUMA nodes,
> > > in order
> > >
> > > Dear HPX-experts,
> > >
> > > I am trying to spawn 8 hpx processes on a cluster node with 8 numa
> > > nodes, containing 6 real cpu cores each. All seems well, but the
> > > output of `--hpx:print-bind` confuses me.
> > >
> > > I am using slurm (sbatch command) and openmpi (mpirun command in
> > > sbatch script). The output of mpirun's `--display-map` makes
> > > complete sense. All 8 process ranks get assigned to the 6 cores in
> > > the 8 numa nodes, in order. Process rank 0 is on the first numa
> > > node, etc.
> > >
> > > The output of `--hpx:print-bind` seems not in sync with this. There
> > > is a correspondence between mpi ranks and hpx locality ids, but the
> > > mapping of hpx localities to cpu cores is different now. For
> > > example, it seems that locality 1 is not on the second numa node (as
> > > per mpirun's `--display-map`), but on the 7-th (as per hpx's
> > > `--print-bind`). Also, the output of `--print-bind` differs per
> > > invocation.
> > >
> > > It is important for me that hpx localities are assigned to numa
> > > nodes in order. Localities with similar IDs communicate more with
> > > each other than with other localities.
> > >
> > > I have attached the slurm script and outputs mentioned above. Does
> > > somebody maybe have an idea what is going on and how to fix things?
> > > Does hpx maybe re-assign the ranks upon initialization? If so, can I
> > > influence this to make this ordering similar to the ordering of the
> > > numa nodes?
> > >
> > > BTW, I am pretty sure all this worked fine some time ago, when I was
> > > still using an earlier version of HPX, another version of MPI, and
> > > started HPX processes using srun instead of mpirun.
> > >
> > > Thanks for any info!
> > >
> > > Kor

_______________________________________________
hpx-users mailing list
[email protected]
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
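As an addendum to Mikael's `--hpx:use-process-mask` suggestion quoted
above: the affinity mask that mpirun/srun actually set can be inspected
independently of HPX. Below is a minimal, Linux-only sketch (not part of
the original thread) that prints the mask per rank; run it under the same
mpirun/sbatch settings and compare the listed cpus with lstopo's numbering
and with what `--hpx:print-bind` reports.

    // Minimal sketch (Linux/glibc): print the process affinity mask set by
    // the launcher for this rank, before HPX gets involved.
    #include <sched.h>     // sched_getaffinity, cpu_set_t, CPU_* macros
    #include <unistd.h>    // getpid
    #include <cstdio>
    #include <cstdlib>

    int main()
    {
        cpu_set_t mask;
        CPU_ZERO(&mask);
        if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        {
            std::perror("sched_getaffinity");
            return EXIT_FAILURE;
        }

        char const* rank = std::getenv("OMPI_COMM_WORLD_RANK");
        std::printf("rank %s (pid %ld) allowed cpus:",
            rank ? rank : "<unset>", static_cast<long>(getpid()));
        for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        {
            if (CPU_ISSET(cpu, &mask))
                std::printf(" %d", cpu);
        }
        std::printf("\n");
        return EXIT_SUCCESS;
    }

If the printed cpu sets already differ between invocations, the reordering
happens in the launcher; if they are stable and ordered by rank, the
difference lies in how HPX interprets or overrides the mask.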
