Uwe Sauter <[email protected]> writes:

> But if libpmi.so is provided by Slurm, why do I get the error messages? Does
> OpenMPI statically link libpmi.a, which then depends on an older version of
> libslurm.so?
> If OpenMPI dynamically links against libpmi.so, which itself links against
> either libslurm.so or libslurm.so.28, shouldn't this result in no direct
> dependency from OpenMPI to libslurm.so?

A colleague of mine (on Cc) has dug into this, and found the following:

It looks like the presence of lib64/libpmi2.la and lib64/libpmi.la is
the "culprit".  They are installed by the slurm-devel RPM.  OpenMPI
uses GNU libtool for linking; libtool finds these files and follows
their "dependency_libs" specification, thus linking directly against
libslurm.so.  For instance:

$ readelf -d libmca_common_pmi.so

Dynamic section at offset 0x1590 contains 36 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libpmi.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libslurm.so.27]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x0000000000000001 (NEEDED)             Shared library: [libhwloc.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libpmi2.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libutil.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libimf.so]
 0x0000000000000001 (NEEDED)             Shared library: [libsvml.so]
 0x0000000000000001 (NEEDED)             Shared library: [libirng.so]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libintlc.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000e (SONAME)             Library soname: [libmca_common_pmi.so.1]
 0x000000000000000f (RPATH)              Library rpath: [/opt/slurm/lib64]
[...]

(ldd is of no help here, since it follows all dependencies recursively
and so does not distinguish direct dependencies from indirect ones.)
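
To look only at the direct dependencies, the NEEDED entries can be
filtered out of the readelf output.  A minimal sketch: the helper name
needed_libs is made up for illustration, and it assumes the readelf
output format shown above.

```shell
# Hypothetical helper: read `readelf -d` output on stdin and print one
# directly needed library (DT_NEEDED entry) per line.
needed_libs() {
    sed -n 's/.*(NEEDED).*Shared library: \[\(.*\)]/\1/p'
}

# Usage (not run here):
#   readelf -d libmca_common_pmi.so | needed_libs
#   readelf -d libmca_common_pmi.so | needed_libs | grep libslurm
```

If the last command prints nothing, the library has no direct
dependency on libslurm.so.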

We have tested removing all .la files in slurm's lib64 folder and
rebuilding openmpi.  After that, the only Slurm libraries openmpi links
against directly are libpmi.so and libpmi2.so; libslurm.so no longer
appears among the NEEDED entries:

$ readelf -d libmca_common_pmi.so

Dynamic section at offset 0x1570 contains 34 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [libpmi.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libpmi2.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [librt.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libutil.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libimf.so]
 0x0000000000000001 (NEEDED)             Shared library: [libsvml.so]
 0x0000000000000001 (NEEDED)             Shared library: [libirng.so]
 0x0000000000000001 (NEEDED)             Shared library: [libm.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libgcc_s.so.1]
 0x0000000000000001 (NEEDED)             Shared library: [libintlc.so.5]
 0x0000000000000001 (NEEDED)             Shared library: [libpthread.so.0]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x0000000000000001 (NEEDED)             Shared library: [libdl.so.2]
 0x000000000000000e (SONAME)             Library soname: [libmca_common_pmi.so.1]
 0x000000000000000f (RPATH)              Library rpath: [/opt/slurm/lib64]
[...]
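
The libraries that libtool pulls in are recorded in the
"dependency_libs" line of each .la file, so they can be inspected
before the files are moved away.  A small sketch, assuming the
/opt/slurm/lib64 directory seen in the RPATH above; the helper name
la_deps is hypothetical.

```shell
# Hypothetical helper: print the dependency_libs value of a libtool
# .la file given as the first argument.
la_deps() {
    sed -n "s/^dependency_libs='\(.*\)'\$/\1/p" "$1"
}

# Usage (not run here):
#   for f in /opt/slurm/lib64/*.la; do
#       printf '%s:%s\n' "$f" "$(la_deps "$f")"
#   done
```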

We have verified that we can compile openmpi (1.8.6) against slurm
14.03.7 (with the .la files removed), and then upgrade slurm to 15.08.0
without having to recompile openmpi.

My understanding of linking and libraries is not very thorough,
unfortunately, but according to

https://lists.fedoraproject.org/pipermail/mingw/2012-January/004421.html

the .la files are only needed in order to link against static libraries,
and since Slurm doesn't provide any static libraries, I guess it would
be safe for the slurm-devel rpm not to include these files.
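
Whether a given installation actually ships static archives can be
checked directly.  A minimal sketch, with a hypothetical helper name
and the library directory passed as an argument:

```shell
# Hypothetical helper: list static archives (*.a) and libtool archives
# (*.la) found directly in the given library directory.
list_archives() {
    find "$1" -maxdepth 1 \( -name '*.a' -o -name '*.la' \) | sort
}

# Usage (not run here):
#   list_archives /opt/slurm/lib64
```

If the output contains .la files but no .a files, that directory holds
no static libraries.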

-- 
Regards,
Bjørn-Helge Mevik, dr. scient,
Department for Research Computing, University of Oslo
