[OMPI devel] GPUDirect RDMA/Async for DL Acceleration (MPI)

Jaco Joubert Fri, 12 Oct 2018 03:07:13 -0700

Good day All,

My hypothesis is that, with a SmartNIC offloading the CPU, some benefits of
InfiniBand can also be achieved with Ethernet, and I am looking for
information on fully supporting GPUDirect on the NIC's side.


I was able to DMA between a SmartNIC and a V100 GPU through PCIe. However,
to make this useful and more general, it should work transparently with
things like MPI (and NCCL). Most resources I've found explain CUDA-Aware
MPI from a user's point of view, but I have not yet found information
about what needs to be implemented on the NIC's side.

I've seen that there are MCA BTL parameters to select tcp, sm, self, openib,
etc. I believe some development needs to be done in order to enable MPI to
make use of the SmartNIC, perhaps by adding another BTL option? AFAIU, the
data that needs to be sent (and the destination rank?) should be copied to
the TX Queue of the NIC. The NIC can then encapsulate the raw data with the
relevant headers and forward it over the network without any CPU involvement.
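
To make the TX Queue idea above concrete, here is a minimal sketch in C of
a host-side descriptor ring. Everything in it is an assumption made for
illustration: the names tx_desc, tx_ring, tx_ring_post and tx_ring_consume,
the field layout, and the ring size are hypothetical and do not come from
Open MPI or from any NIC driver. It also assumes the host posts a small
descriptor (GPU buffer address, length, destination rank) rather than
copying the payload itself, so that the NIC could fetch the data by DMA.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical ring size; must be a power of two so that masking works. */
#define TX_RING_SIZE 8u

/* Hypothetical descriptor: one send request handed to the NIC. */
struct tx_desc {
    uint64_t gpu_addr;   /* bus address of the GPU buffer to DMA from */
    uint32_t len;        /* payload length in bytes */
    uint32_t dest_rank;  /* destination MPI rank; the NIC would map this
                            to a network address when adding headers */
};

/* Single-producer (host) / single-consumer (NIC) ring. */
struct tx_ring {
    struct tx_desc desc[TX_RING_SIZE];
    uint32_t head;  /* count of descriptors posted by the host */
    uint32_t tail;  /* count of descriptors consumed by the NIC */
};

static void tx_ring_init(struct tx_ring *r)
{
    assert(r != NULL);
    memset(r, 0, sizeof *r);
}

/* Post one descriptor. Returns 0 on success, -1 if the ring is full. */
static int tx_ring_post(struct tx_ring *r, uint64_t gpu_addr,
                        uint32_t len, uint32_t dest_rank)
{
    if (r->head - r->tail == TX_RING_SIZE)
        return -1;

    struct tx_desc *d = &r->desc[r->head & (TX_RING_SIZE - 1u)];
    d->gpu_addr = gpu_addr;
    d->len = len;
    d->dest_rank = dest_rank;

    /* Real hardware would need a memory barrier and a doorbell write
       here; both are omitted in this sketch. */
    r->head++;
    return 0;
}

/* Stand-in for the NIC side: take the oldest descriptor.
   Returns 0 on success, -1 if the ring is empty. */
static int tx_ring_consume(struct tx_ring *r, struct tx_desc *out)
{
    if (r->head == r->tail)
        return -1;

    *out = r->desc[r->tail & (TX_RING_SIZE - 1u)];
    r->tail++;
    return 0;
}
```

In this sketch a new BTL-like component would call tx_ring_post() from its
send path, and the CPU would only touch the descriptor, never the payload.
Completion reporting, registration of GPU memory, and flow control are all
left out.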

Can anyone please point me to documentation, code, or give advice on how to
approach the integration between MPI and NIC?

Regards
Jaco
-- 
*Jaco Joubert*
*Software Engineer*

*Netronome* | 1st Floor, Southdowns Ridge Office Park, Cnr John Vorster &
                      Nellmapius Street, Irene, Centurion 0157, South Africa
Phone: +27 (012) 665-4427 | Skype: jaco.joubert12 |
www.netronome.com

_______________________________________________
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel
