Hello Jose,

I suspect the issue here is that the openib BTL isn't finding a connection module when you request MPI_THREAD_MULTIPLE: the rdmacm CPC (connection pseudo-component) is deselected if the MPI_THREAD_MULTIPLE thread support level is requested.
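If you want to see which CPCs your build knows about, or experiment with forcing a particular one, something along these lines should work (btl_openib_cpc_include is an openib BTL parameter in the 4.x series; please double-check the exact names with ompi_info on your system):

# list the openib BTL's parameters, including the CPC selection knobs
ompi_info --param btl openib --level 9

# restrict the BTL to one CPC at a time to see which one disqualifies itself
mpirun -np 1 --mca btl_openib_cpc_include udcm ./hellow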
If you run the test in a shell with

export OMPI_MCA_btl_base_verbose=100

there may be some more info to help diagnose what's going on.

Another option would be to build Open MPI with UCX support. That's the better way to use Open MPI over IB/RoCE.

Howard

On 2/2/22, 10:52 AM, "users on behalf of Jose E. Roman via users" <users-boun...@lists.open-mpi.org on behalf of users@lists.open-mpi.org> wrote:

Hi. I am using Open MPI 4.1.1 with the openib BTL on a 4-node cluster with 10/25Gb Ethernet (RoCE). It is using libibverbs from Ubuntu 18.04 (kernel 4.15.0-166-generic).

With this hello world example:

#include <stdio.h>
#include <mpi.h>

int main (int argc, char *argv[])
{
    int rank, size, provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello world from process %d of %d, provided=%d\n", rank, size, provided);
    MPI_Finalize();
    return 0;
}

I get the following output when run on one node:

$ ./hellow
--------------------------------------------------------------------------
No OpenFabrics connection schemes reported that they were able to be
used on a specific port.  As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

  Local host:      kahan01
  Local device:    qedr0
  Local port:      1
  CPCs attempted:  rdmacm, udcm
--------------------------------------------------------------------------
Hello world from process 0 of 1, provided=1

The message does not appear if I run on the front-end (which does not have the RoCE network), or if I run on the node using either MPI_Init() instead of MPI_Init_thread(), or MPI_THREAD_SINGLE instead of MPI_THREAD_FUNNELED.

Is there any reason why MPI_Init_thread() behaves differently from MPI_Init()? Note that I am not using threads, and this is just one MPI process.

The question has a second part: is there a way to determine, without running an MPI program, that MPI_Init_thread() won't work but MPI_Init() will? I am asking because PETSc programs default to using MPI_Init_thread() when PETSc's configure script finds the MPI_Init_thread() symbol in the MPI library. But in situations like the one reported here it would be better to fall back to MPI_Init(), since MPI_Init_thread() will not work as expected. [The configure script cannot run an MPI program due to batch systems.]

Thanks for your help.
Jose
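P.S. Regarding the PETSc configure question: a compile/link test can only tell you that the MPI_Init_thread() symbol exists, not how it will behave at run time, so I don't know of a reliable static check. The portable run-time check is to compare the provided level against the requested one; a minimal sketch (standard MPI, nothing Open MPI specific):

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    if (provided < MPI_THREAD_FUNNELED) {
        /* The library downgraded the thread level: fall back to a
           single-threaded code path, or warn and continue. */
        fprintf(stderr, "warning: provided thread level %d < requested\n",
                provided);
    }
    /* ... application code ... */
    MPI_Finalize();
    return 0;
}

Note that in your report provided=1, which in Open MPI corresponds to MPI_THREAD_FUNNELED, so the requested level was actually granted; the problem is the openib BTL being disabled, and a check like this would not catch that.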