Hi,
I'm trying to troubleshoot a problem: we don't seem to be getting the
bandwidth we'd expect from our distributed CUDA program, which uses
Open MPI to pass data between GPUs in an HPC cluster.
I thought I found a possible root cause, but now I'm unsure of how to
fix this, since the documentation provides conflicting information.
Running
ompi_info --all | grep "MCA btl"
gives me the following output:
MCA btl: tcp (MCA v2.1.0, API v3.1.0, Component v4.0.2)
MCA btl: vader (MCA v2.1.0, API v3.1.0, Component v4.0.2)
MCA btl: smcuda (MCA v2.1.0, API v3.1.0, Component v4.0.2)
MCA btl: self (MCA v2.1.0, API v3.1.0, Component v4.0.2)
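For completeness, I also plan to double-check that the library itself
was built with CUDA support. If I'm reading the FAQ correctly, something
like the following should report it (assuming the parameter name is the
same in my version):
ompi_info --parsable --all | grep mpi_built_with_cuda_support:value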
According to this FAQ entry, https://www.open-mpi.org/faq/?category=runcuda,
the openib btl is a prerequisite for GPUDirect RDMA, and as the output
above shows, openib isn't in my list of btl components.
However, I'm also reading that UCX is now the preferred way to do RDMA
in Open MPI and that it has CUDA support.
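In case it clarifies what I'm after, this is roughly the invocation I
was going to try based on my reading of the UCX documentation; the
transport list and the application name are just my guesses, not
something I've verified works:
mpirun -np 2 --mca pml ucx -x UCX_TLS=rc,cuda_copy,cuda_ipc,gdr_copy ./my_cuda_app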
Can anyone tell me what a proper configuration for GPUDirect RDMA over
InfiniBand looks like?
Best regards,
Oskar Lappi