Hi. I am using Open MPI 4.1.1 with the openib BTL on a 4-node cluster with 10/25 Gb Ethernet (RoCE). It uses libibverbs from Ubuntu 18.04 (kernel 4.15.0-166-generic).
With this hello world example:

    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char *argv[])
    {
      int rank, size, provided;
      MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);
      MPI_Comm_size(MPI_COMM_WORLD, &size);
      printf("Hello world from process %d of %d, provided=%d\n", rank, size, provided);
      MPI_Finalize();
      return 0;
    }

I get the following output when run on one node:

    $ ./hellow
    --------------------------------------------------------------------------
    No OpenFabrics connection schemes reported that they were able to be
    used on a specific port.  As such, the openib BTL (OpenFabrics
    support) will be disabled for this port.

      Local host:      kahan01
      Local device:    qedr0
      Local port:      1
      CPCs attempted:  rdmacm, udcm
    --------------------------------------------------------------------------
    Hello world from process 0 of 1, provided=1

The message does not appear if I run on the front-end (which does not have the RoCE network), nor if I run on the node using MPI_Init() instead of MPI_Init_thread(), or using MPI_THREAD_SINGLE instead of MPI_THREAD_FUNNELED.

Is there any reason why MPI_Init_thread() behaves differently from MPI_Init()? Note that I am not using threads, and there is just one MPI process.

The question has a second part: is there a way to determine, without running an MPI program, that MPI_Init_thread() won't work but MPI_Init() will? I am asking because PETSc programs default to MPI_Init_thread() when PETSc's configure script finds the MPI_Init_thread() symbol in the MPI library. But in situations like the one reported here, it would be better to revert to MPI_Init(), since MPI_Init_thread() will not work as expected. (The configure script cannot run an MPI program because of batch systems.)

Thanks for your help.
Jose
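
P.S. To illustrate the second part of the question: the kind of check a configure script can do is (roughly) a compile-and-link test for the symbol, along the lines of the sketch below. This is a generic autoconf-style link test, not PETSc's actual configure code, and the hand-written declaration deliberately avoids including mpi.h:

    /* conftest.c -- hypothetical link-only probe for MPI_Init_thread.
       It is only compiled and linked (e.g. "mpicc conftest.c -o conftest"),
       never executed, so it can prove that the symbol exists in the MPI
       library, but not that requesting MPI_THREAD_FUNNELED will behave
       well on a given network. */
    char MPI_Init_thread();   /* dummy declaration, autoconf style */

    int main(void)
    {
      /* Referencing the function forces the linker to resolve the symbol. */
      return (int) MPI_Init_thread();
    }

Such a test succeeds whenever the symbol is present, which is exactly why configure, without running an MPI program, cannot distinguish the situation reported above from a healthy installation.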