Martin

Sorry for the late reply. I forget to check the users list, as most of the
discussions take place on IRC (which I also forget to check these days...).
Anyway:


When using HPX in distributed mode, you have two options: compile HPX with
HPX_WITH_NETWORKING=ON and use either the MPI parcelport (which assumes you
have MPI compiled and installed on your system as usual), or the new, still
slightly experimental libfabric parcelport. There is a libfabric provider for
PSM2, which is the one used on Omni-Path (OPA) networks.

see here https://github.com/ofiwg/libfabric/wiki/Provider-Feature-Matrix-master 
for capabilities.

HPX currently runs on the sockets and GNI providers, uses endpoint type
FI_EP_RDM, and makes use of FI_SEND, FI_RECV, FI_RMA and the secondary
capability FI_SOURCE, plus a few others I can't remember off the top of my head.

Looking at the chart, the PSM2 provider supports everything we need, so it
ought to be possible to run the libfabric network layer on an Omni-Path machine.
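
As a quick check on the target machine, the fi_info utility that ships with
libfabric can list what a provider offers. This is only a sketch: it assumes
libfabric is installed with the psm2 provider built in and that fi_info is on
your PATH. The capability list is just the subset named above (FI_MSG is the
primary capability that FI_SEND and FI_RECV qualify), not everything HPX
requests.

```sh
# List the providers this libfabric build knows about.
fi_info -l

# Ask the psm2 provider for reliable-datagram endpoints that offer
# messaging, RMA and source-address reporting. An error instead of a
# listing means nothing matched the request.
fi_info -p psm2 -t FI_EP_RDM -c "FI_MSG|FI_RMA|FI_SOURCE"
```

If the second command prints one or more entries, the provider at least
advertises those features on that node; it does not prove that the HPX
parcelport will run on it.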


However, the master branch of HPX doesn't currently support this; the pieces
you'd need are in another branch that needs a bit of work to merge in. I have
it on my todo list to get the network layer running on Summit (InfiniBand
verbs, which has no FI_SOURCE, and that is a problem), but I'm not sure when
I'll be able to start work on it.


You should probably just use the MPI parcelport in HPX for now. If you are
interested in getting the libfabric parcelport running for improved distributed
performance, it ought to be straightforward to get working, since everything
we need appears to be supported. It would probably need a bit of tweaking and
experimenting, though. Is there any way I can get access to your machine to
log in and try a build/test?


If you're more interested in simply using MPI in your existing code and not
using HPX as a distributed tasking layer, then just set
HPX_WITH_NETWORKING=OFF and use HPX for tasks on a node and your existing
MPI between nodes.
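
The two MPI-based setups described above differ only in how HPX is configured.
A rough sketch of the configure lines follows. The source and build paths are
placeholders, and HPX_WITH_PARCELPORT_MPI is the option name as I know it from
the HPX CMake options, so check it against the version you build.

```sh
# Option A: HPX handles the distributed part, talking over MPI.
cmake -S /path/to/hpx -B build-mpi \
      -DHPX_WITH_NETWORKING=ON \
      -DHPX_WITH_PARCELPORT_MPI=ON

# Option B: HPX only schedules tasks within a node; your own MPI
# code keeps doing the communication between nodes.
cmake -S /path/to/hpx -B build-local \
      -DHPX_WITH_NETWORKING=OFF
```

With option B, each MPI rank starts its own node-local HPX runtime, so it is
up to you to decide how many ranks to place on each node.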


HTH


JB



________________________________
From: [email protected] 
<[email protected]> on behalf of Ohlerich, Martin 
<[email protected]>
Sent: 10 December 2019 10:58:27
To: [email protected]
Subject: [hpx-users] Request for Experience


Dear Colleagues,


my name is Martin Ohlerich. I'm working at the Leibniz Super-Computing Center
near Munich (LRZ), and I am currently testing the capabilities of HPX. On our
Linux cluster with an InfiniBand network, the tests have so far been successful.
On SuperMUC-NG, we have an Intel OPA network, which seems to have some
peculiarities (that we also observed when trying to employ GPI (GASPI)). Is
there any experience with such a network type for HPX in the community?

I welcome any hints on where to find out about the startup mechanism and
debugging possibilities. So far I have tried the easy approach of installing
HPX via Spack. On SNG that might not be the correct way to go.


Many thanks in advance! Also in the name of our users!

Best regards,

Martin Ohlerich
_______________________________________________
hpx-users mailing list
[email protected]
https://mail.cct.lsu.edu/mailman/listinfo/hpx-users
