On 12/2/21 8:20 PM, Ole Holm Nielsen wrote:
> Hi Åke,
>
> On 02-12-2021 14:18, Åke Sandgren wrote:
>> On 12/2/21 2:06 PM, Ole Holm Nielsen wrote:
>>> These are updated observations of running OpenMPI codes with an
>>> Omni-Path network fabric on AlmaLinux 8.5:
>>>
>>> Using the foss-2021b toolchain and OpenMPI/4.1.1-GCC-11.2.0 my trivial
>>> MPI test code works correctly:
>>>
>>> $ ml OpenMPI
>>> $ ml
>>>
>>> Currently Loaded Modules:
>>>   1) GCCcore/11.2.0                     9) hwloc/2.5.0-GCCcore-11.2.0
>>>   2) zlib/1.2.11-GCCcore-11.2.0        10) OpenSSL/1.1
>>>   3) binutils/2.37-GCCcore-11.2.0      11) libevent/2.1.12-GCCcore-11.2.0
>>>   4) GCC/11.2.0                        12) UCX/1.11.2-GCCcore-11.2.0
>>>   5) numactl/2.0.14-GCCcore-11.2.0     13) libfabric/1.13.2-GCCcore-11.2.0
>>>   6) XZ/5.2.5-GCCcore-11.2.0           14) PMIx/4.1.0-GCCcore-11.2.0
>>>   7) libxml2/2.9.10-GCCcore-11.2.0     15) OpenMPI/4.1.1-GCC-11.2.0
>>>   8) libpciaccess/0.16-GCCcore-11.2.0
>>>
>>> $ mpicc mpi_test.c
>>> $ mpirun -n 2 a.out
>>>
>>> (null): There are 2 processes
>>>
>>> (null): Rank 1: d008
>>>
>>> (null): Rank 0: d008
>>>
>>>
>>> I also tried the OpenMPI/4.1.0-GCC-10.2.0 module, but this still gives
>>> the error messages:
>>>
>>> $ ml OpenMPI/4.1.0-GCC-10.2.0
>>> $ ml
>>>
>>> Currently Loaded Modules:
>>>   1) GCCcore/10.2.0                     8) libpciaccess/0.16-GCCcore-10.2.0
>>>   2) zlib/1.2.11-GCCcore-10.2.0         9) hwloc/2.2.0-GCCcore-10.2.0
>>>   3) binutils/2.35-GCCcore-10.2.0      10) libevent/2.1.12-GCCcore-10.2.0
>>>   4) GCC/10.2.0                        11) UCX/1.9.0-GCCcore-10.2.0
>>>   5) numactl/2.0.13-GCCcore-10.2.0     12) libfabric/1.11.0-GCCcore-10.2.0
>>>   6) XZ/5.2.5-GCCcore-10.2.0           13) PMIx/3.1.5-GCCcore-10.2.0
>>>   7) libxml2/2.9.10-GCCcore-10.2.0     14) OpenMPI/4.1.0-GCC-10.2.0
>>>
>>> $ mpicc mpi_test.c
>>> $ mpirun -n 2 a.out
>>> [1638449983.577933] [d008:910356:0] ib_iface.c:966 UCX ERROR ibv_create_cq(cqe=4096) failed: Operation not supported
>>> [1638449983.577827] [d008:910355:0] ib_iface.c:966 UCX ERROR ibv_create_cq(cqe=4096) failed: Operation not supported
>>> [d008.nifl.fysik.dtu.dk:910355] pml_ucx.c:273 Error: Failed to create UCP worker
>>> [d008.nifl.fysik.dtu.dk:910356] pml_ucx.c:273 Error: Failed to create UCP worker
>>>
>>> (null): There are 2 processes
>>>
>>> (null): Rank 0: d008
>>>
>>> (null): Rank 1: d008
>>>
>>> Conclusion: The foss-2021b toolchain with OpenMPI/4.1.1-GCC-11.2.0 seems
>>> to be required on systems with an Omni-Path network fabric on AlmaLinux
>>> 8.5. Perhaps the newer UCX/1.11.2-GCCcore-11.2.0 is really what's
>>> needed, compared to UCX/1.9.0-GCCcore-10.2.0 from foss-2020b.
>>>
>>> Does anyone have comments on this?
>>
>> UCX is the problem here in combination with libfabric I think. Write a
>> hook that upgrades the version of UCX to 1.11-something if it's <
>> 1.11-ish, or just that specific version if you have older-and-working
>> versions.
>
> You are right that the nodes with Omni-Path have different libfabric
> packages which come from the EL8.5 BaseOS as well as the latest
> Cornelis/Intel Omni-Path drivers:
>
> $ rpm -qa | grep libfabric
> libfabric-verbs-1.10.0-2.x86_64
> libfabric-1.12.1-1.el8.x86_64
> libfabric-devel-1.12.1-1.el8.x86_64
> libfabric-psm2-1.10.0-2.x86_64
>
> The 1.12 packages are from EL8.5, and 1.10 packages are from Cornelis.
>
> Regarding UCX, I was first using the trusted foss-2020b toolchain which
> includes UCX/1.9.0-GCCcore-10.2.0. I guess that we shouldn't mess with
> the toolchains?
>
> The foss-2021b toolchain includes the newer UCX 1.11, which seems to
> solve this particular problem.
>
> Can we make any best practices recommendations from these observations?

I didn't check properly, but UCX does not depend on libfabric; OpenMPI
does. So I'd write a hook that replaces libfabric < 1.12 with at least
1.12.1.

Sometimes you just have to mess with the toolchains, and this looks like
one of those situations.

Or, as a test, build your own OpenMPI 4.1.0 or 4.0.5 (the one 2020b
uses) with an updated libfabric and check whether that fixes the
problem. If it does, write a hook that replaces libfabric. See
framework/contrib for examples; I did that for UCX, so there is code
there to show you how.
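
A rough sketch of what such a hook could look like, in Python since EasyBuild hooks are Python modules. This is not the code from framework/contrib: the helper names (`MIN_LIBFABRIC`, `version_key`, `bump_dependency`) are made up for illustration, and the dict-style access on `ec` with `(name, version)` tuples is an assumption about how the dependencies are exposed at parse time, so check it against the real examples before using it.

```python
import re

# Assumption: 1.12.1 is the minimum suggested in this thread; adjust as needed.
MIN_LIBFABRIC = "1.12.1"


def version_key(version):
    """Turn '1.11.0' into (1, 11, 0) so versions compare numerically."""
    key = []
    for part in version.split("."):
        match = re.match(r"\d+", part)
        key.append(int(match.group()) if match else 0)
    return tuple(key)


def bump_dependency(dependencies, name, minimum):
    """Return a new dependency list where `name` is raised to `minimum`
    if its current version is older; all other entries are left unchanged."""
    updated = []
    for dep in dependencies:
        dep_name, dep_version = dep[0], dep[1]
        if dep_name == name and version_key(dep_version) < version_key(minimum):
            # Keep any extra tuple fields (e.g. a version suffix) as they were.
            dep = (dep_name, minimum) + tuple(dep[2:])
        updated.append(dep)
    return updated


def parse_hook(ec, *args, **kwargs):
    """Hypothetical parse hook: only touch OpenMPI, since UCX itself
    does not depend on libfabric."""
    if ec.get("name") == "OpenMPI":
        ec["dependencies"] = bump_dependency(
            ec["dependencies"], "libfabric", MIN_LIBFABRIC
        )
```

With the dependency list from OpenMPI/4.1.0-GCC-10.2.0, `("libfabric", "1.11.0")` would become `("libfabric", "1.12.1")`, while the 1.13.2 used with GCC 11.2.0 is left alone. The file would be passed to EasyBuild with the `--hooks` option.
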
--
Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden
Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90-580 14
Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se