Howard, I don't have time right now to try rebuilding with --enable-debug.
The RoCE device we have is a FastLinQ QL41000 Series 10/25/40/50GbE Controller. The output of ibv_devinfo is:

hca_id: qedr0
        transport:              InfiniBand (0)
        fw_ver:                 8.20.0.0
        node_guid:              2267:7cff:fe11:4a50
        sys_image_guid:         2267:7cff:fe11:4a50
        vendor_id:              0x1077
        vendor_part_id:         32880
        hw_ver:                 0x0
        phys_port_cnt:          1
                port:   1
                        state:          PORT_ACTIVE (4)
                        max_mtu:        4096 (5)
                        active_mtu:     1024 (3)
                        sm_lid:         0
                        port_lid:       0
                        port_lmc:       0x00
                        link_layer:     Ethernet

hca_id: qedr1
        transport:              InfiniBand (0)
        fw_ver:                 8.20.0.0
        node_guid:              2267:7cff:fe11:4a51
        sys_image_guid:         2267:7cff:fe11:4a51
        vendor_id:              0x1077
        vendor_part_id:         32880
        hw_ver:                 0x0
        phys_port_cnt:          1
                port:   1
                        state:          PORT_DOWN (1)
                        max_mtu:        4096 (5)
                        active_mtu:     1024 (3)
                        sm_lid:         0
                        port_lid:       0
                        port_lmc:       0x00
                        link_layer:     Ethernet

Regarding UCX, we have tried the latest version. Compilation goes through, but the ucx_info command reports errors:

#   Memory domain: qedr0
#           Component: ib
#             register: unlimited, cost: 180 nsec
#           remote key: 8 bytes
#           local memory handle is required for zcopy
#
#      Transport: rc_verbs
#         Device: qedr0:1
#           Type: network
#  System device: qedr0 (0)
[1643982133.674556] [kahan01:8217 :0]       rc_iface.c:505  UCX  ERROR ibv_create_srq() failed: Function not implemented
#   < failed to open interface >
#
#      Transport: ud_verbs
#         Device: qedr0:1
#           Type: network
#  System device: qedr0 (0)
[qelr_create_qp:545]create qp: failed on ibv_cmd_create_qp with 22
[1643982133.681169] [kahan01:8217 :0]       ib_iface.c:994  UCX  ERROR iface=0x56074944bf10: failed to create UD QP TX wr:256 sge:6 inl:64 resp:0 RX wr:4096 sge:1 resp:0: Invalid argument
#   < failed to open interface >
#
#   Memory domain: qedr1
#           Component: ib
#             register: unlimited, cost: 180 nsec
#           remote key: 8 bytes
#           local memory handle is required for zcopy
#   < no supported devices found >

Any idea what the error in ibv_create_srq() means? (A small standalone probe of ibv_create_srq() is appended at the very bottom of this mail, below the quoted text, in case it is useful.)

Thanks for your help.
Jose

> On 3 Feb 2022, at 17:52, Pritchard Jr., Howard <howa...@lanl.gov> wrote:
>
> Hi Jose,
>
> A number of things.
>
> First, for recent versions of Open MPI, including the 4.1.x release stream, MPI_THREAD_MULTIPLE is supported by default. However, some transport options available when using MPI_Init may not be available when requesting MPI_THREAD_MULTIPLE.
> You may want to let Open MPI trundle along with tcp used for inter-node messaging and see if your application performs well enough. For a small system tcp may well suffice.
>
> Second, if you want to pursue this further, you will want to rebuild Open MPI with --enable-debug. The debug output will be considerably more verbose and provides more info. I think you will get a message saying the rdmacm CPC is excluded owing to the requested thread support level. There may be info about why udcm is not selected as well.
>
> Third, what sort of RoCE devices are available on your system? The output from ibv_devinfo may be useful.
>
> As for UCX, if it's the version that came with your Ubuntu 18.04 release it may be pretty old. It's likely that UCX has not been tested on the RoCE devices on your system.
>
> You can run
>
> ucx_info -v
>
> to check the version number of UCX that you are picking up.
>
> You can download the latest release of UCX at
>
> https://github.com/openucx/ucx/releases/tag/v1.12.0
>
> Instructions for how to build are in the README.md at https://github.com/openucx/ucx.
> You will want to configure with
>
> contrib/configure-release-mt --enable-gtest
>
> You want to add --enable-gtest to the configure options so that you can run the UCX sanity checks.
> Note this takes quite a while to run but is pretty thorough at validating your UCX build.
> You'll want to run this test on one of the nodes with a RoCE device.
>
> ucx_info -d
>
> This will show which UCX transports/devices are available.
>
> See the "Running internal unit tests" section of the README.md.
>
> Hope this helps,
>
> Howard
>
>
> On 2/3/22, 8:46 AM, "Jose E. Roman" <jro...@dsic.upv.es> wrote:
>
> Thanks. The verbose output is:
>
> [kahan01.upvnet.upv.es:29732] mca: base: components_register: registering framework btl components
> [kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded component self
> [kahan01.upvnet.upv.es:29732] mca: base: components_register: component self register function successful
> [kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded component sm
> [kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded component openib
> [kahan01.upvnet.upv.es:29732] mca: base: components_register: component openib register function successful
> [kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded component vader
> [kahan01.upvnet.upv.es:29732] mca: base: components_register: component vader register function successful
> [kahan01.upvnet.upv.es:29732] mca: base: components_register: found loaded component tcp
> [kahan01.upvnet.upv.es:29732] mca: base: components_register: component tcp register function successful
> [kahan01.upvnet.upv.es:29732] mca: base: components_open: opening btl components
> [kahan01.upvnet.upv.es:29732] mca: base: components_open: found loaded component self
> [kahan01.upvnet.upv.es:29732] mca: base: components_open: component self open function successful
> [kahan01.upvnet.upv.es:29732] mca: base: components_open: found loaded component openib
> [kahan01.upvnet.upv.es:29732] mca: base: components_open: component openib open function successful
> [kahan01.upvnet.upv.es:29732] mca: base: components_open: found loaded component vader
> [kahan01.upvnet.upv.es:29732] mca: base: components_open: component vader open function successful
> [kahan01.upvnet.upv.es:29732] mca: base: components_open: found loaded component tcp
> [kahan01.upvnet.upv.es:29732] mca: base: components_open: component tcp open function successful
> [kahan01.upvnet.upv.es:29732] select: initializing btl component self
> [kahan01.upvnet.upv.es:29732] select: init of component self returned success
> [kahan01.upvnet.upv.es:29732] select: initializing btl component openib
> [kahan01.upvnet.upv.es:29732] Checking distance from this process to device=qedr0
> [kahan01.upvnet.upv.es:29732] hwloc_distances->nbobjs=4
> [kahan01.upvnet.upv.es:29732] hwloc_distances->values[0]=10
> [kahan01.upvnet.upv.es:29732] hwloc_distances->values[1]=16
> [kahan01.upvnet.upv.es:29732] hwloc_distances->values[2]=16
> [kahan01.upvnet.upv.es:29732] hwloc_distances->values[3]=16
> [kahan01.upvnet.upv.es:29732] ibv_obj->type set to NULL
> [kahan01.upvnet.upv.es:29732] Process is bound: distance to device is 0.000000
> [kahan01.upvnet.upv.es:29732] Checking distance from this process to device=qedr1
> [kahan01.upvnet.upv.es:29732] hwloc_distances->nbobjs=4
> [kahan01.upvnet.upv.es:29732] hwloc_distances->values[0]=10
> [kahan01.upvnet.upv.es:29732] hwloc_distances->values[1]=16
> [kahan01.upvnet.upv.es:29732] hwloc_distances->values[2]=16
> [kahan01.upvnet.upv.es:29732] hwloc_distances->values[3]=16
> [kahan01.upvnet.upv.es:29732] ibv_obj->type set to NULL
> [kahan01.upvnet.upv.es:29732] Process is bound: distance to device is 0.000000
> [kahan01.upvnet.upv.es:29732] openib BTL: rdmacm CPC unavailable for use on qedr0:1; skipped
> --------------------------------------------------------------------------
> No OpenFabrics connection schemes reported that they were able to be
> used on a specific port.  As such, the openib BTL (OpenFabrics
> support) will be disabled for this port.
>
>   Local host:           kahan01
>   Local device:         qedr0
>   Local port:           1
>   CPCs attempted:       rdmacm, udcm
> --------------------------------------------------------------------------
> [kahan01.upvnet.upv.es:29732] select: init of component openib returned failure
> [kahan01.upvnet.upv.es:29732] mca: base: close: component openib closed
> [kahan01.upvnet.upv.es:29732] mca: base: close: unloading component openib
> [kahan01.upvnet.upv.es:29732] select: initializing btl component vader
> [kahan01.upvnet.upv.es:29732] select: init of component vader returned failure
> [kahan01.upvnet.upv.es:29732] mca: base: close: component vader closed
> [kahan01.upvnet.upv.es:29732] mca: base: close: unloading component vader
> [kahan01.upvnet.upv.es:29732] select: initializing btl component tcp
> [kahan01.upvnet.upv.es:29732] btl: tcp: Searching for exclude address+prefix: 127.0.0.1 / 8
> [kahan01.upvnet.upv.es:29732] btl: tcp: Found match: 127.0.0.1 (lo)
> [kahan01.upvnet.upv.es:29732] btl:tcp: Attempting to bind to AF_INET port 1024
> [kahan01.upvnet.upv.es:29732] btl:tcp: Successfully bound to AF_INET port 1024
> [kahan01.upvnet.upv.es:29732] btl:tcp: my listening v4 socket is 0.0.0.0:1024
> [kahan01.upvnet.upv.es:29732] btl:tcp: examining interface eno1
> [kahan01.upvnet.upv.es:29732] btl:tcp: using ipv6 interface eno1
> [kahan01.upvnet.upv.es:29732] btl:tcp: examining interface eno5
> [kahan01.upvnet.upv.es:29732] btl:tcp: using ipv6 interface eno5
> [kahan01.upvnet.upv.es:29732] select: init of component tcp returned success
> [kahan01.upvnet.upv.es:29732] mca: bml: Using self btl for send to [[45435,1],0] on node kahan01
> Hello world from process 0 of 1, provided=1
> [kahan01.upvnet.upv.es:29732] mca: base: close: component self closed
> [kahan01.upvnet.upv.es:29732] mca: base: close: unloading component self
> [kahan01.upvnet.upv.es:29732] mca: base: close: component tcp closed
> [kahan01.upvnet.upv.es:29732] mca: base: close: unloading component tcp
>
> Regarding UCX, at some point I tried it, but IIRC the installation of UCX on this machine does not work for some reason. Is there an easy way to check that UCX works well before installing Open MPI?
>
> Jose
>
>
>> On 3 Feb 2022, at 16:38, Pritchard Jr., Howard <howa...@lanl.gov> wrote:
>>
>> Hello Jose,
>>
>> I suspect the issue here is that the OpenIB BTL isn't finding a connection module when you are requesting MPI_THREAD_MULTIPLE.
>> The rdmacm connection is deselected if the MPI_THREAD_MULTIPLE thread support level is being requested.
>>
>> If you run the test in a shell with
>>
>> export OMPI_MCA_btl_base_verbose=100
>>
>> there may be some more info to help diagnose what's going on.
>>
>> Another option would be to build Open MPI with UCX support. That's the better way to use Open MPI over IB/RoCE.
>>
>> Howard
>>
>> On 2/2/22, 10:52 AM, "users on behalf of Jose E. Roman via users" <users-boun...@lists.open-mpi.org on behalf of users@lists.open-mpi.org> wrote:
>>
>> Hi.
>>
>> I am using Open MPI 4.1.1 with the openib BTL on a 4-node cluster with Ethernet 10/25Gb (RoCE).
>> It is using libibverbs from Ubuntu 18.04 (kernel 4.15.0-166-generic).
>>
>> With this hello world example:
>>
>> #include <stdio.h>
>> #include <mpi.h>
>> int main (int argc, char *argv[])
>> {
>>   int rank, size, provided;
>>   MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>   MPI_Comm_size(MPI_COMM_WORLD, &size);
>>   printf("Hello world from process %d of %d, provided=%d\n", rank, size, provided);
>>   MPI_Finalize();
>>   return 0;
>> }
>>
>> I get the following output when run on one node:
>>
>> $ ./hellow
>> --------------------------------------------------------------------------
>> No OpenFabrics connection schemes reported that they were able to be
>> used on a specific port.  As such, the openib BTL (OpenFabrics
>> support) will be disabled for this port.
>>
>>   Local host:           kahan01
>>   Local device:         qedr0
>>   Local port:           1
>>   CPCs attempted:       rdmacm, udcm
>> --------------------------------------------------------------------------
>> Hello world from process 0 of 1, provided=1
>>
>> The message does not appear if I run on the front-end (which does not have the RoCE network), or if I run on the node using either MPI_Init() instead of MPI_Init_thread(), or MPI_THREAD_SINGLE instead of MPI_THREAD_FUNNELED.
>>
>> Is there any reason why MPI_Init_thread() behaves differently from MPI_Init()? Note that I am not using threads, and only one MPI process.
>>
>> The question has a second part: is there a way to determine (without running an MPI program) that MPI_Init_thread() won't work but MPI_Init() will? I am asking because PETSc programs default to MPI_Init_thread() when PETSc's configure script finds the MPI_Init_thread() symbol in the MPI library. But in situations like the one reported here, it would be better to revert to MPI_Init(), since MPI_Init_thread() will not work as expected. [The configure script cannot run an MPI program due to batch systems.]
>>
>> Thanks for your help.
>> Jose
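
PS: the standalone probe mentioned above, for reproducing the failing call outside of UCX. This is only a sketch under a few assumptions: the file name srqtest.c and the chosen SRQ sizes (16 work requests, 1 SGE) are arbitrary, and it assumes the libibverbs development headers are installed (build with: gcc srqtest.c -o srqtest -libverbs). It prints each device's advertised SRQ limits and then attempts ibv_create_srq() directly.

/* srqtest.c - probe SRQ support on all verbs devices (sketch, see note above). */
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num, i;
    struct ibv_device **list = ibv_get_device_list(&num);
    if (!list) { perror("ibv_get_device_list"); return 1; }

    for (i = 0; i < num; i++) {
        const char *name = ibv_get_device_name(list[i]);
        struct ibv_context *ctx = ibv_open_device(list[i]);
        if (!ctx) { fprintf(stderr, "%s: cannot open device\n", name); continue; }

        /* Report the SRQ limits the device advertises. */
        struct ibv_device_attr attr;
        if (ibv_query_device(ctx, &attr) == 0)
            printf("%s: max_srq=%d max_srq_wr=%d max_srq_sge=%d\n",
                   name, attr.max_srq, attr.max_srq_wr, attr.max_srq_sge);

        /* Try to actually create a small SRQ, as UCX's rc_verbs transport does at startup. */
        struct ibv_pd *pd = ibv_alloc_pd(ctx);
        if (pd) {
            struct ibv_srq_init_attr init = {
                .attr = { .max_wr = 16, .max_sge = 1, .srq_limit = 0 }
            };
            struct ibv_srq *srq = ibv_create_srq(pd, &init);
            if (srq) {
                printf("%s: ibv_create_srq() succeeded\n", name);
                ibv_destroy_srq(srq);
            } else {
                fprintf(stderr, "%s: ibv_create_srq() failed: %s\n", name, strerror(errno));
            }
            ibv_dealloc_pd(pd);
        }
        ibv_close_device(ctx);
    }
    ibv_free_device_list(list);
    return 0;
}

If qedr0 reports max_srq=0 or the create call fails here with the same "Function not implemented", then the SRQ path that rc_verbs relies on is not available from this provider, independently of UCX.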