I want to build a version of OpenMPI v4 to run on a cluster where some
ranks are IBM POWER (little endian) and other ranks are x86_64. The
OpenMPI wiki says that I need to define OMPI_ENABLE_HETEROGENEOUS_SUPPORT
so that structure padding will be inserted as appropriate to make this
work; but a
One of my fellow developers at IBM is having problems building OpenMPI 4.
The 'configure' command is failing trying to set up the mca hcoll '.so'.
The error message is to the effect that libsharp.so is missing. Looking by
hand, the sharp library is in the expected place in the MOFED install.
Is
'configure' ends with
--- MCA component coll:hcoll (m4 configuration macro)
checking for MCA component coll:hcoll compile mode... dso
checking hcoll/api/hcoll_api.h usability... yes
checking hcoll/api/hcoll_api.h presence... yes
checking for hcoll/api/hcoll_api.h... yes
looking for library in lib
c
The 'configure' command is hidden a few levels of Makefile down; it may
take me a while to isolate it. I will send another note when I have all
the information requested.
T J (Chris) Ward, IBM Research.
Scalable Data-Centric Computing - IBM Spectrum MPI
IBM United Kingdom Ltd., Hursley Park,
I have put the tarball requested on my web site here
http://tjcw.freeshell.org/ompi-output.tar.bz2 ; it is too large to be
posted to the mailing list.
It has a 'typescript' from running configure, and the config.log file. My
'configure' command was
../configure --prefix=/install/u/tjcw/workspace
The sharp libs are in
/install/u/tjcw/workspace/ibm_smpi_toucan_ucx/ompibase/dependencies/mofed_400/opt/mellanox/sharp/lib
. I will add an appropriate LDFLAGS to the configure command. Thanks !
Yes, I expected to be using
/install/u/tjcw/workspace/ibm_smpi_toucan_ucx/ompibase/dependencies/bin/ld
Adding an appropriate LDFLAGS= didn't help; the revised tarball is here
http://tjcw.freeshell.org/ompi-output-2.tar.bz2 . Do I need to specify
'-lsharp' to the link command ? If so, how do I do that ?
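On the question of how to pass '-lsharp': autoconf-generated 'configure' scripts accept LDFLAGS (for -L search directories) and LIBS (for -l libraries) as command-line variables. The sketch below only assembles and prints such a command line so that it can be inspected before running it. The helper name build_configure_cmd is made up for illustration, and whether adding LIBS actually gets the hcoll check to pass is an assumption, not something established here.

```shell
#!/bin/sh
# Hypothetical helper: print a configure command line that adds a library
# search directory (LDFLAGS=-L...) and an explicit library (LIBS=-l...).
# $1 = install prefix, $2 = directory holding the library, $3 = name for -l
build_configure_cmd() {
    printf '../configure --prefix=%s LDFLAGS="-L%s" LIBS="-l%s"\n' "$1" "$2" "$3"
}

# Directory quoted earlier in this message thread; adjust for the local install.
SHARP_LIB=/install/u/tjcw/workspace/ibm_smpi_toucan_ucx/ompibase/dependencies/mofed_400/opt/mellanox/sharp/lib

build_configure_cmd /install/u/tjcw/workspace "$SHARP_LIB" sharp
```

The printed line can be pasted into the build directory once it looks right. Note that LIBS applies to every link test that configure runs, so a wrong library name there will break unrelated checks.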
-bash-4.2$ ls -l
/install/u/tjcw/workspace/ibm_smpi_toucan_ucx/ompibase/dependencies/mofed_400/opt/mellanox/sharp/lib/libsharp_coll.so.2
lrwxrwxrwx 1 tjcw tjcw 22 Oct 14 08:58
/install/u/tjcw/workspace/ibm_smpi_toucan_ucx/ompibase/dependencies/mofed_400/opt/mellanox/sharp/lib/libsharp_coll.so.2
Setting LD_LIBRARY_PATH didn't help; I got the same error.
Is the problem caused by my MOFED level ? It may be that libsharp.so is
in a different directory, or that libhcoll.so depends on libsharp.so in a
different way, than with other levels of MOFED.
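One way to see how libhcoll.so depends on the sharp library is to run 'ldd' on it and list the entries the dynamic loader cannot resolve. A minimal sketch follows; the function name unresolved_deps is made up for illustration, and the libhcoll.so location in the usage comment is an assumption about the MOFED layout.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on standard input and print the
# names of the shared libraries that are reported as "not found".
unresolved_deps() {
    awk '/=> not found/ { print $1 }'
}

# Usage (HCOLL_LIB is an assumed path; point it at the real libhcoll.so):
#   HCOLL_LIB=/opt/mellanox/hcoll/lib/libhcoll.so
#   ldd "$HCOLL_LIB" | unresolved_deps
```

If a sharp library shows up in that list, its exact file name (including the version suffix) indicates what the linker and loader are looking for, which can then be compared with the files present under the MOFED sharp/lib directory.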
I just tried building ompi v4.0.x , and
I'm using a MOFED from the file MLNX_OFED_LINUX-4.0-0.0.8.2-rhel7.3-x86_64.tgz,
on a machine running RHEL 7.6. Should I be using a newer MOFED ?
In my last posting, I had a typo in my LD_LIBRARY_PATH setting. With this
fixed, now I get
configure:279563: gcc -std=gnu99 -std=gnu99 -o conftest -O3 -DNDEBUG
-finline-functions -fno-strict-aliasing -mcx16 -pthread
-I/install/u/tjcw/workspace/ompi/build/opal/mca/event/libevent2022/libevent/inc
I set up MOFED 4.7.1, and now 'configure' completes successfully
(without needing to set LD_LIBRARY_PATH or add LDFLAGS=-L...). But the
'make' fails; the last lines of the output are
CCLD mca_coll_hcoll.la
/bin/ld: cannot find -ludev
collect2: error: ld returned 1 exit status
make[2]: ***
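A "cannot find -ludev" error means the linker found no libudev.so (or libudev.a) in its search directories; a runtime-only file such as libudev.so.1 does not satisfy '-ludev'. The sketch below checks a list of directories for the unversioned file. The function name find_link_lib and the directory list in the usage comment are assumptions. On RHEL 7 the unversioned libudev.so is normally provided by the systemd-devel package, but the thread does not state that this was the fix.

```shell
#!/bin/sh
# Hypothetical helper: print the first directory (from the arguments after
# the name) that contains lib<name>.so or lib<name>.a, the files that
# `ld -l<name>` looks for. Returns non-zero if none of them has it.
find_link_lib() {
    name=$1
    shift
    for dir in "$@"; do
        if [ -e "$dir/lib$name.so" ] || [ -e "$dir/lib$name.a" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Usage (the directories are assumptions about a typical RHEL layout):
#   find_link_lib udev /usr/lib64 /lib64 /usr/lib ||
#       echo "no libudev.so found; a development package may be missing" >&2
```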
Thanks ! OpenMPI builds successfully for me now.
T J (Chris) Ward, IBM Research.
Scalable Data-Centric Computing - IBM Spectrum MPI
IBM United Kingdom Ltd., Hursley Park, Winchester, Hants, SO21 2JN
011-44-1962-818679
LinkedIn https://www.linkedin.com/in/tjcward/
ResearchGate https://www.res