Hi Åke,

On 12/3/21 08:27, Åke Sandgren wrote:
On 02-12-2021 14:18, Åke Sandgren wrote:
On 12/2/21 2:06 PM, Ole Holm Nielsen wrote:
These are updated observations of running OpenMPI codes with an
Omni-Path network fabric on AlmaLinux 8.5:

Using the foss-2021b toolchain and OpenMPI/4.1.1-GCC-11.2.0 my trivial
MPI test code works correctly:

$ ml OpenMPI
$ ml

Currently Loaded Modules:
    1) GCCcore/11.2.0                   9) hwloc/2.5.0-GCCcore-11.2.0
    2) zlib/1.2.11-GCCcore-11.2.0      10) OpenSSL/1.1
    3) binutils/2.37-GCCcore-11.2.0    11) libevent/2.1.12-GCCcore-11.2.0
    4) GCC/11.2.0                      12) UCX/1.11.2-GCCcore-11.2.0
    5) numactl/2.0.14-GCCcore-11.2.0   13) libfabric/1.13.2-GCCcore-11.2.0
    6) XZ/5.2.5-GCCcore-11.2.0         14) PMIx/4.1.0-GCCcore-11.2.0
    7) libxml2/2.9.10-GCCcore-11.2.0   15) OpenMPI/4.1.1-GCC-11.2.0
    8) libpciaccess/0.16-GCCcore-11.2.0

$ mpicc mpi_test.c
$ mpirun -n 2 a.out

(null): There are 2 processes
(null): Rank  1:  d008
(null): Rank  0:  d008

I also tried the OpenMPI/4.1.0-GCC-10.2.0 module, but this still gives
the error messages:

$ ml OpenMPI/4.1.0-GCC-10.2.0
$ ml

Currently Loaded Modules:
    1) GCCcore/10.2.0                   8) libpciaccess/0.16-GCCcore-10.2.0
    2) zlib/1.2.11-GCCcore-10.2.0       9) hwloc/2.2.0-GCCcore-10.2.0
    3) binutils/2.35-GCCcore-10.2.0    10) libevent/2.1.12-GCCcore-10.2.0
    4) GCC/10.2.0                      11) UCX/1.9.0-GCCcore-10.2.0
    5) numactl/2.0.13-GCCcore-10.2.0   12) libfabric/1.11.0-GCCcore-10.2.0
    6) XZ/5.2.5-GCCcore-10.2.0         13) PMIx/3.1.5-GCCcore-10.2.0
    7) libxml2/2.9.10-GCCcore-10.2.0   14) OpenMPI/4.1.0-GCC-10.2.0

$ mpicc mpi_test.c
$ mpirun -n 2 a.out
[1638449983.577933] [d008:910356:0]       ib_iface.c:966  UCX  ERROR ibv_create_cq(cqe=4096) failed: Operation not supported
[1638449983.577827] [d008:910355:0]       ib_iface.c:966  UCX  ERROR ibv_create_cq(cqe=4096) failed: Operation not supported
[d008.nifl.fysik.dtu.dk:910355] pml_ucx.c:273  Error: Failed to create UCP worker
[d008.nifl.fysik.dtu.dk:910356] pml_ucx.c:273  Error: Failed to create UCP worker

(null): There are 2 processes
(null): Rank  0:  d008
(null): Rank  1:  d008

Conclusion: The foss-2021b toolchain with OpenMPI/4.1.1-GCC-11.2.0 seems
to be required on systems with an Omni-Path network fabric on AlmaLinux
8.5.  Perhaps the newer UCX/1.11.2-GCCcore-11.2.0 is really what's
needed, compared to UCX/1.9.0-GCCcore-10.2.0 from foss-2020b.

Does anyone have comments on this?

I think UCX, in combination with libfabric, is the problem here. Write a
hook that upgrades UCX to a 1.11.x version whenever the version is below
roughly 1.11, or that replaces only that specific version if you have
older versions that still work and should be left alone.
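
For reference, a minimal sketch of what such a parse hook might look like
(only a sketch: the 1.11 threshold, the 1.11.2 replacement version, the
hooks.py file name, and the assumption that dependencies still appear as
raw (name, version, ...) tuples at parse time should all be checked
against your EasyBuild setup):

# hooks.py -- pass to EasyBuild with:  eb --hooks=/path/to/hooks.py ...
from distutils.version import LooseVersion

UCX_MIN = '1.11'      # assumed threshold: anything older gets replaced
UCX_NEW = '1.11.2'    # version that appears to work (foss-2021b)

def parse_hook(ec, *args, **kwargs):
    """Replace UCX dependencies older than UCX_MIN with UCX_NEW."""
    deps = ec['dependencies']
    for idx, dep in enumerate(deps):
        # raw dependency specs are assumed to be (name, version, ...) tuples here
        if dep[0] == 'UCX' and LooseVersion(dep[1]) < LooseVersion(UCX_MIN):
            ec.log.info("[parse hook] %s: replacing UCX %s with %s",
                        ec.name, dep[1], UCX_NEW)
            deps[idx] = ('UCX', UCX_NEW) + dep[2:]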

You are right that the nodes with Omni-Path have different libfabric
packages, which come from both the EL8.5 BaseOS and the latest
Cornelis/Intel Omni-Path drivers:

$ rpm -qa | grep libfabric
libfabric-verbs-1.10.0-2.x86_64
libfabric-1.12.1-1.el8.x86_64
libfabric-devel-1.12.1-1.el8.x86_64
libfabric-psm2-1.10.0-2.x86_64

The 1.12 packages are from EL8.5, and the 1.10 packages are from Cornelis.

Regarding UCX, I was first using the trusted foss-2020b toolchain which
includes UCX/1.9.0-GCCcore-10.2.0. I guess that we shouldn't mess with
the toolchains?

The foss-2021b toolchain includes the newer UCX 1.11, which seems to
solve this particular problem.

Can we make any best practices recommendations from these observations?

I didn't check properly, but UCX does not depend on libfabric; OpenMPI
does. So I'd write a hook that replaces libfabric < 1.12 with at least
1.12.1.
Sometimes you just have to mess with the toolchains, and this looks like
one of those situations.

Or, as a test, build your own OpenMPI 4.1.0 or 4.0.5 (the version that
2020b uses) with an updated libfabric and check whether that fixes the
problem. If it does, write a hook that replaces libfabric. See
framework/contrib for examples; I did that for UCX, so there is code
there to show you how.
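
The same pattern applied to libfabric might look roughly like this
(again only a sketch; the 1.12 threshold and the 1.12.1 replacement come
from the package versions mentioned above, and the raw tuple layout of
dependency specs is an assumption to verify):

# Parse hook sketch: bump libfabric dependencies older than 1.12 to 1.12.1
# before a test rebuild of OpenMPI 4.0.5 / 4.1.0.
from distutils.version import LooseVersion

def parse_hook(ec, *args, **kwargs):
    """Replace libfabric dependencies older than 1.12 with 1.12.1."""
    deps = ec['dependencies']
    for idx, dep in enumerate(deps):
        # assumed raw (name, version, ...) tuple format at parse time
        if dep[0] == 'libfabric' and LooseVersion(dep[1]) < LooseVersion('1.12'):
            ec.log.info("[parse hook] %s: replacing libfabric %s with 1.12.1",
                        ec.name, dep[1])
            deps[idx] = ('libfabric', '1.12.1') + dep[2:]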

I don't feel qualified to mess around with modifying EB toolchains...

The foss-2021b toolchain, including OpenMPI/4.1.1-GCC-11.2.0, seems to solve the present problem. Do you think there are any disadvantages to asking users to go for foss-2021b? Of course, we may need several modules to be upgraded from foss-2020b to foss-2021b.

Another possibility may be the upcoming driver release from Cornelis Networks supporting the Omni-Path fabric on EL 8.4 and EL 8.5. I'm definitely going to check this when it becomes available.

Thanks,
Ole
