Hi Jose,

A number of things.  

First, for recent versions of Open MPI, including the 4.1.x release stream, 
MPI_THREAD_MULTIPLE is supported by default.  However, some transport options 
that are available when using MPI_Init may not be available when requesting 
MPI_THREAD_MULTIPLE.
You may want to let Open MPI trundle along with the tcp BTL for inter-node 
messaging and see whether your application performs well enough.  For a small 
system tcp may well suffice.
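
If you want to pin that down explicitly rather than rely on the selection 
logic, something along these lines should work (the btl list is the one you'd 
normally use with 4.1.x; the executable name and process count are just 
placeholders):

mpirun --mca pml ob1 --mca btl self,vader,tcp -np 4 ./your_app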

Second, if you want to pursue this further, you'll want to rebuild Open MPI 
with --enable-debug.  The debug output will be considerably more verbose and 
provide more info.  I think you will get a message saying the rdmacm CPC is 
excluded owing to the requested thread support level.  There may be info about 
why udcm is not selected as well.
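
A minimal sketch of the rebuild, assuming you are building from a source 
tarball (the install prefix is just a placeholder):

./configure --prefix=$HOME/ompi-debug --enable-debug
make -j 8 install

Then rerun the test with OMPI_MCA_btl_base_verbose=100 set as before.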

Third, what sort of RoCE devices are available on your system?  The output from 
ibv_devinfo may be useful. 
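
For example, something like this pulls out the lines of interest (the 
link_layer field should say Ethernet for a RoCE port):

ibv_devinfo | grep -E 'hca_id|transport|link_layer'

Plain ibv_devinfo output is fine too.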

As for UCX, if it's the version that came with your Ubuntu 18.04 release, it 
may be pretty old.  It's likely that version of UCX has not been tested on the 
RoCE devices in your system.

You can run 

ucx_info -v

to check the version number of UCX that you are picking up.

You can download the latest release of UCX at

https://github.com/openucx/ucx/releases/tag/v1.12.0

Instructions for how to build are in the README.md at 
https://github.com/openucx/ucx.
You will want to configure with 

contrib/configure-release-mt --enable-gtest

You want to add --enable-gtest to the configure options so that you can run 
the UCX sanity checks.  Note that these take quite a while to run but are 
pretty thorough at validating your UCX build.
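
Roughly, the build looks like this (the install prefix is just a placeholder; 
the README.md has the authoritative steps):

./contrib/configure-release-mt --prefix=$HOME/ucx-1.12.0 --enable-gtest
make -j 8
make install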

You'll want to run those gtest sanity checks on one of the nodes with a RoCE 
device.

Also, running

ucx_info -d

will show which UCX transports/devices are available.

See the "Running internal unit tests" section of the README.md.
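
Once UCX checks out, rebuild Open MPI against it, something like this (again, 
the prefixes are placeholders):

./configure --prefix=$HOME/ompi-ucx --with-ucx=$HOME/ucx-1.12.0
make -j 8 install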

Hope this helps,

Howard


On 2/3/22, 8:46 AM, "Jose E. Roman" <jro...@dsic.upv.es> wrote:

    Thanks. The verbose output is:

    [kahan01.upvnet.upv.es:29732] mca: base: components_register: registering 
framework btl components
    [kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded 
component self
    [kahan01.upvnet.upv.es:29732] mca: base: components_register: component 
self register function successful
    [kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded 
component sm
    [kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded 
component openib
    [kahan01.upvnet.upv.es:29732] mca: base: components_register: component 
openib register function successful
    [kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded 
component vader
    [kahan01.upvnet.upv.es:29732] mca: base: components_register: component 
vader register function successful
    [kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded 
component tcp
    [kahan01.upvnet.upv.es:29732] mca: base: components_register: component tcp 
register function successful
    [kahan01.upvnet.upv.es:29732] mca: base: components_open: opening btl 
components
    [kahan01.upvnet.upv.es:29732] mca: base: components_open: found loaded 
component self
    [kahan01.upvnet.upv.es:29732] mca: base: components_open: component self 
open function successful
    [kahan01.upvnet.upv.es:29732] mca: base: components_open: found loaded 
component openib
    [kahan01.upvnet.upv.es:29732] mca: base: components_open: component openib 
open function successful
    [kahan01.upvnet.upv.es:29732] mca: base: components_open: found loaded 
component vader
    [kahan01.upvnet.upv.es:29732] mca: base: components_open: component vader 
open function successful
    [kahan01.upvnet.upv.es:29732] mca: base: components_open: found loaded 
component tcp
    [kahan01.upvnet.upv.es:29732] mca: base: components_open: component tcp 
open function successful
    [kahan01.upvnet.upv.es:29732] select: initializing btl component self
    [kahan01.upvnet.upv.es:29732] select: init of component self returned 
success
    [kahan01.upvnet.upv.es:29732] select: initializing btl component openib
    [kahan01.upvnet.upv.es:29732] Checking distance from this process to 
device=qedr0
    [kahan01.upvnet.upv.es:29732] hwloc_distances->nbobjs=4
    [kahan01.upvnet.upv.es:29732] hwloc_distances->values[0]=10
    [kahan01.upvnet.upv.es:29732] hwloc_distances->values[1]=16
    [kahan01.upvnet.upv.es:29732] hwloc_distances->values[2]=16
    [kahan01.upvnet.upv.es:29732] hwloc_distances->values[3]=16
    [kahan01.upvnet.upv.es:29732] ibv_obj->type set to NULL
    [kahan01.upvnet.upv.es:29732] Process is bound: distance to device is 
0.000000
    [kahan01.upvnet.upv.es:29732] Checking distance from this process to 
device=qedr1
    [kahan01.upvnet.upv.es:29732] hwloc_distances->nbobjs=4
    [kahan01.upvnet.upv.es:29732] hwloc_distances->values[0]=10
    [kahan01.upvnet.upv.es:29732] hwloc_distances->values[1]=16
    [kahan01.upvnet.upv.es:29732] hwloc_distances->values[2]=16
    [kahan01.upvnet.upv.es:29732] hwloc_distances->values[3]=16
    [kahan01.upvnet.upv.es:29732] ibv_obj->type set to NULL
    [kahan01.upvnet.upv.es:29732] Process is bound: distance to device is 
0.000000
    [kahan01.upvnet.upv.es:29732] openib BTL: rdmacm CPC unavailable for use on 
qedr0:1; skipped
    --------------------------------------------------------------------------
    No OpenFabrics connection schemes reported that they were able to be
    used on a specific port.  As such, the openib BTL (OpenFabrics
    support) will be disabled for this port.

      Local host:           kahan01
      Local device:         qedr0
      Local port:           1
      CPCs attempted:       rdmacm, udcm
    --------------------------------------------------------------------------
    [kahan01.upvnet.upv.es:29732] select: init of component openib returned 
failure
    [kahan01.upvnet.upv.es:29732] mca: base: close: component openib closed
    [kahan01.upvnet.upv.es:29732] mca: base: close: unloading component openib
    [kahan01.upvnet.upv.es:29732] select: initializing btl component vader
    [kahan01.upvnet.upv.es:29732] select: init of component vader returned 
failure
    [kahan01.upvnet.upv.es:29732] mca: base: close: component vader closed
    [kahan01.upvnet.upv.es:29732] mca: base: close: unloading component vader
    [kahan01.upvnet.upv.es:29732] select: initializing btl component tcp
    [kahan01.upvnet.upv.es:29732] btl: tcp: Searching for exclude 
address+prefix: 127.0.0.1 / 8
    [kahan01.upvnet.upv.es:29732] btl: tcp: Found match: 127.0.0.1 (lo)
    [kahan01.upvnet.upv.es:29732] btl:tcp: Attempting to bind to AF_INET port 
1024
    [kahan01.upvnet.upv.es:29732] btl:tcp: Successfully bound to AF_INET port 
1024
    [kahan01.upvnet.upv.es:29732] btl:tcp: my listening v4 socket is 
0.0.0.0:1024
    [kahan01.upvnet.upv.es:29732] btl:tcp: examining interface eno1
    [kahan01.upvnet.upv.es:29732] btl:tcp: using ipv6 interface eno1
    [kahan01.upvnet.upv.es:29732] btl:tcp: examining interface eno5
    [kahan01.upvnet.upv.es:29732] btl:tcp: using ipv6 interface eno5
    [kahan01.upvnet.upv.es:29732] select: init of component tcp returned success
    [kahan01.upvnet.upv.es:29732] mca: bml: Using self btl for send to 
[[45435,1],0] on node kahan01
    Hello world from process 0 of 1, provided=1
    [kahan01.upvnet.upv.es:29732] mca: base: close: component self closed
    [kahan01.upvnet.upv.es:29732] mca: base: close: unloading component self
    [kahan01.upvnet.upv.es:29732] mca: base: close: component tcp closed
    [kahan01.upvnet.upv.es:29732] mca: base: close: unloading component tcp


    Regarding UCX, at some point I tried it, but IIRC the installation of UCX 
on this machine did not work for some reason. Is there an easy way to check 
that UCX works well before installing Open MPI?

    Jose



    > On 3 Feb 2022, at 16:38, Pritchard Jr., Howard <howa...@lanl.gov> wrote:
    > 
    > Hello Jose,
    > 
    > I suspect the issue here is that the openib BTL isn't finding a 
connection module when you are requesting MPI_THREAD_MULTIPLE.
    > The rdmacm connection module is deselected if the MPI_THREAD_MULTIPLE 
thread support level is requested.
    > 
    > If you run the test in a shell with
    > 
    > export OMPI_MCA_btl_base_verbose=100
    > 
    > there may be some more info to help diagnose what's going on.
    > 
    > Another option would be to build Open MPI with UCX support.  That's the 
better way to use Open MPI over IB/RoCE.
    > 
    > Howard
    > 
    > On 2/2/22, 10:52 AM, "users on behalf of Jose E. Roman via users" 
<users-boun...@lists.open-mpi.org on behalf of users@lists.open-mpi.org> wrote:
    > 
    >    Hi.
    > 
    >    I am using Open MPI 4.1.1 with the openib BTL on a 4-node cluster with 
Ethernet 10/25Gb (RoCE). It is using libibverbs from Ubuntu 18.04 (kernel 
4.15.0-166-generic).
    > 
    >    With this hello world example:
    > 
    >    #include <stdio.h>
    >    #include <mpi.h>
    >    int main (int argc, char *argv[])
    >    {
    >     int rank, size, provided;
    >     MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    >     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    >     MPI_Comm_size(MPI_COMM_WORLD, &size);
    >     printf("Hello world from process %d of %d, provided=%d\n", rank, 
size, provided);
    >     MPI_Finalize();
    >     return 0;
    >    }
    > 
    >    I get the following output when run on one node:
    > 
    >    $ ./hellow
    >    
--------------------------------------------------------------------------
    >    No OpenFabrics connection schemes reported that they were able to be
    >    used on a specific port.  As such, the openib BTL (OpenFabrics
    >    support) will be disabled for this port.
    > 
    >     Local host:           kahan01
    >     Local device:         qedr0
    >     Local port:           1
    >     CPCs attempted:       rdmacm, udcm
    >    
--------------------------------------------------------------------------
    >    Hello world from process 0 of 1, provided=1
    > 
    > 
    >    The message does not appear if I run on the front-end (does not have 
RoCE network) or if I run it on the node either using MPI_Init() instead of 
MPI_Init_thread() or using MPI_THREAD_SINGLE instead of MPI_THREAD_FUNNELED.
    > 
    >    Is there any reason why MPI_Init_thread() is behaving differently to 
MPI_Init()? Note that I am not using threads, and just one MPI process.
    > 
    > 
    >    The question has a second part: is there a way to determine (without 
running an MPI program) that MPI_Init_thread() won't work but MPI_Init() will 
work? I am asking this because PETSc programs default to use MPI_Init_thread() 
when PETSc's configure script finds the MPI_Init_thread() symbol in the MPI 
library. But in situations like the one reported here, it would be better to 
revert to MPI_Init() since MPI_Init_thread() will not work as expected. [The 
configure script cannot run an MPI program due to batch systems.]
    > 
    >    Thanks for your help.
    >    Jose
    > 

