Re: [OMPI devel] Open-MPI backwards compatibility and library version changes

2018-11-24 Thread Bert Wesarg via devel
FYI,

Debian's libtool packages take care of this with this patch:

https://git.launchpad.net/ubuntu/+source/libtool/tree/debian/patches/link_all_deplibs.patch
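
For anyone wanting to check what their own libtool does: as far as I know, the generated libtool script records this behaviour in a shell variable called link_all_deplibs, which is what the patch is named after. A minimal sketch, assuming the variable shows up as a plain "link_all_deplibs=value" line in that script (the helper name is made up for illustration):

```shell
# Print the first link_all_deplibs setting found in a libtool script
# supplied on stdin. Assumes the script contains a line of the form
# "link_all_deplibs=<value>"; only the first match is reported.
link_all_deplibs_setting() {
    sed -n 's/^link_all_deplibs=//p' | head -n 1
}
```

Something like "link_all_deplibs_setting < ./libtool" in a build tree should then print the value. A value of "no" would suggest libtool is not adding the dependencies of dependencies to the link line.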

Best,
Bert
On Mon, Nov 19, 2018 at 12:01 AM Christopher Samuel wrote:
>
> Hi Brian,
>
> On 17/11/18 5:13 am, Barrett, Brian via devel wrote:
>
> > Unfortunately, I don’t have a good idea of what to do now. We already
> > did the damage on the 3.x series. Our backwards compatibility testing
> > (as lame as it is) just links libmpi, so it’s all good. But if anyone
> > uses libtool, we’ll have a problem, because we install the .la files
> > that allow libtool to see the dependency of libmpi on libopen-pal, and
> > it gets too excited.
> >
> > We’ll need to talk about how we think about this change in the future.
>
> Thanks for that - personally I think it's a misfeature in libtool to add
> these extra dependencies, it would be handy if there was a way to turn
> it off - but that's not your problem.
>
> For us it just means that when we bring in a new Open-MPI we just need
> to build new versions of our installed libraries and codes against it,
> fortunately that's something that Easybuild makes (relatively) easy.
>
> Thanks for your time everyone - this is my last week at Swinburne before
> I leave Australia to start at NERSC in December!
>
> All the best,
> Chris
> --
>   Christopher Samuel OzGrav Senior Data Science Support
>   ARC Centre of Excellence for Gravitational Wave Discovery
>   http://www.ozgrav.org/  http://twitter.com/ozgrav
___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel

Re: [OMPI devel] Open-MPI backwards compatibility and library version changes

2018-11-18 Thread Christopher Samuel
Hi Brian,

On 17/11/18 5:13 am, Barrett, Brian via devel wrote:

> Unfortunately, I don’t have a good idea of what to do now. We already 
> did the damage on the 3.x series. Our backwards compatibility testing 
> (as lame as it is) just links libmpi, so it’s all good. But if anyone 
> uses libtool, we’ll have a problem, because we install the .la files 
> that allow libtool to see the dependency of libmpi on libopen-pal, and 
> it gets too excited.
> 
> We’ll need to talk about how we think about this change in the future.

Thanks for that - personally I think it's a misfeature in libtool to add 
these extra dependencies, it would be handy if there was a way to turn 
it off - but that's not your problem.

For us it just means that when we bring in a new Open-MPI we just need 
to build new versions of our installed libraries and codes against it, 
fortunately that's something that Easybuild makes (relatively) easy.

Thanks for your time everyone - this is my last week at Swinburne before 
I leave Australia to start at NERSC in December!

All the best,
Chris
-- 
  Christopher Samuel OzGrav Senior Data Science Support
  ARC Centre of Excellence for Gravitational Wave Discovery
  http://www.ozgrav.org/  http://twitter.com/ozgrav

Re: [OMPI devel] Open-MPI backwards compatibility and library version changes

2018-11-16 Thread Barrett, Brian via devel
Gilles -

Look at the output of Chris’s libtool link line; you can see it’s explicitly 
adding a dependency on libopen-pal.so to the test binary.  Once it does that, 
it’s game over: the OS’s dynamic linker will, rightly, complain about us changing 
the c:r:a (current:revision:age) in the libtool version system in a way that 
isn’t backwards compatible.
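
To make the c:r:a arithmetic concrete, here is a rough sketch of how libtool turns a -version-info triple into names on Linux: the soname carries current minus age, and the file name is that number followed by age and revision. The library name and the triples below are hypothetical and are not the values Open MPI actually used.

```shell
# Map a libtool -version-info triple (current:revision:age) to the soname
# and file name libtool generates on Linux.
#   $1 = library name without the "lib" prefix (hypothetical example name)
#   $2 = current:revision:age
libtool_linux_names() {
    name=$1
    version_info=$2
    current=${version_info%%:*}   # text before the first colon
    rest=${version_info#*:}       # "revision:age"
    revision=${rest%%:*}
    age=${rest#*:}
    major=$((current - age))      # the number that ends up in the soname
    echo "lib${name}.so.${major} lib${name}.so.${major}.${age}.${revision}"
}

# libtool_linux_names example 40:0:0  ->  libexample.so.40 libexample.so.40.0.0
# libtool_linux_names example 42:0:2  ->  libexample.so.40 libexample.so.40.2.0
# libtool_linux_names example 42:0:0  ->  libexample.so.42 libexample.so.42.0.0
```

Raising current and age together keeps the soname, so existing binaries still load the new library. Raising current while resetting age to 0 changes the soname, and anything with the old soname recorded as a direct dependency stops loading.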

Unfortunately, I don’t have a good idea of what to do now.  We already did the 
damage on the 3.x series.  Our backwards compatibility testing (as lame as it 
is) just links libmpi, so it’s all good.  But if anyone uses libtool, we’ll 
have a problem, because we install the .la files that allow libtool to see the 
dependency of libmpi on libopen-pal, and it gets too excited.

We’ll need to talk about how we think about this change in the future.

Brian

> On Nov 14, 2018, at 6:07 PM, Gilles Gouaillardet wrote:
> 
> Chris,
> 
> I am a bit puzzled at your logs.
> 
> As far as I understand,
> 
> ldd libhhgttg.so.1
> 
> reports that libopen-rte.so.40 and libopen-pal.so.40 are both
> dependencies, but that does not say anything about
> who depends on them. They could be directly needed by
> libhhgttg.so.1 (I hope / do not think that is the case),
> or indirectly by libmpi.so.40 (I'd rather bet on that).
> 
> In the latter case, having libhhgttg.so.1 point to another
> libmpi.so.40 that depends on newer opal/orte libraries should just
> work.
> 
> You might want to run strings libhhgttg.so.1 and look for libmpi.so.40
> (I found it) and libopen-pal.so.40 (I did not find it) or
> libopen-rte.so.40 (I did not find it either).
> 
> 
> Note that if you run
> gcc -shared -o libhhgttg.so.1 libhhgttg.c -lmpi -lopen-rte -lopen-pal
> then your lib will explicitly depend on the "internal" MPI libraries
> and you will face the same issue as your end user.
> You should not need to do that (I assume you do not explicitly call
> internal opal/orte subroutines), so avoid doing it.
> That being said, keep in mind that some build systems might do that
> for you under the hood (I have seen that, but I cannot remember which
> one), and that would be a bad thing, at least from an Open MPI point
> of view.
> 
> 
> Cheers,
> 
> Gilles
> On Wed, Nov 14, 2018 at 6:46 PM Christopher Samuel wrote:
>> 
>> On 15/11/18 2:16 am, Barrett, Brian via devel wrote:
>> 
>>> In practice, this should not be a problem. The wrapper compilers (and
>>> our instructions for linking when not using the wrapper compilers)
>>> only link against libmpi.so (or a set of libraries if using Fortran),
>>> as libmpi.so contains the public interface. libmpi.so has a
>>> dependency on libopen-pal.so so the loader will load the version of
>>> libopen-pal.so that matches the version of Open MPI used to build
>>> libmpi.so However, if someone explicitly links against libopen-pal.so
>>> you end up where we are today.
>> 
>> Unfortunately that's not the case, just creating a shared library
>> that only links in libmpi.so will create dependencies on the private
>> libraries too in the final shared library. :-(
>> 
>> Here's a toy example to illustrate that.
>> 
>> [csamuel@farnarkle2 libtool]$ cat hhgttg.c
>> int answer(void)
>> {
>>     return(42);
>> }
>> 
>> [csamuel@farnarkle2 libtool]$ gcc hhgttg.c -c -o hhgttg.o
>> 
>> [csamuel@farnarkle2 libtool]$ gcc -shared -Wl,-soname,libhhgttg.so.1 -o
>> libhhgttg.so.1 hhgttg.o -lmpi
>> 
>> [csamuel@farnarkle2 libtool]$ ldd libhhgttg.so.1
>>     linux-vdso.so.1 =>  (0x7ffc625b3000)
>>     libmpi.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libmpi.so.40 (0x7f018a582000)
>>     libc.so.6 => /lib64/libc.so.6 (0x7f018a09e000)
>>     libopen-rte.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libopen-rte.so.40 (0x7f018a4b5000)
>>     libopen-pal.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libopen-pal.so.40 (0x7f0189fde000)
>>     libdl.so.2 => /lib64/libdl.so.2 (0x7f0189dda000)
>>     librt.so.1 => /lib64/librt.so.1 (0x7f0189bd2000)
>>     libutil.so.1 => /lib64/libutil.so.1 (0x7f01899cf000)
>>     libm.so.6 => /lib64/libm.so.6 (0x7f01896cd000)
>>     libpthread.so.0 => /lib64/libpthread.so.0 (0x7f01894b1000)
>>     libz.so.1 => /lib64/libz.so.1 (0x7f018929b000)
>>     libhwloc.so.5 => /lib64/libhwloc.so.5 (0x7f018905e000)
>>     /lib64/ld-linux-x86-64.so.2 (0x7f018a46b000)
>>     libnuma.so.1 => /lib64/libnuma.so.1 (0x7f0188e52000)
>>     libltdl.so.7 => /lib64/libltdl.so.7 (0x7f0188c48000)
>>     libgcc_s.so.1 => /apps/skylake/software/core/gcccore/6.4.0/lib64/libgcc_s.so.1 (0x7f018a499000)
>> 
>> 
>> All the best,
>> Chris
>> --
>>  Christopher Samuel OzGrav Senior Data Science Support
>>  ARC Centre of Excellence for Gravitational Wave Discovery
>>  http://www.ozgrav.org/  http://twitter.com/ozgrav

Re: [OMPI devel] Open-MPI backwards compatibility and library version changes

2018-11-14 Thread Gilles Gouaillardet
Chris,

I am a bit puzzled at your logs.

As far as I understand,

ldd libhhgttg.so.1

reports that libopen-rte.so.40 and libopen-pal.so.40 are both
dependencies, but that does not say anything about
who depends on them. They could be directly needed by
libhhgttg.so.1 (I hope / do not think that is the case),
or indirectly by libmpi.so.40 (I'd rather bet on that).

In the latter case, having libhhgttg.so.1 point to another
libmpi.so.40 that depends on newer opal/orte libraries should just
work.

You might want to run strings libhhgttg.so.1 and look for libmpi.so.40
(I found it) and libopen-pal.so.40 (I did not find it) or
libopen-rte.so.40 (I did not find it either).


Note that if you run
gcc -shared -o libhhgttg.so.1 libhhgttg.c -lmpi -lopen-rte -lopen-pal
then your lib will explicitly depend on the "internal" MPI libraries
and you will face the same issue as your end user.
You should not need to do that (I assume you do not explicitly call
internal opal/orte subroutines), so avoid doing it.
That being said, keep in mind that some build systems might do that
for you under the hood (I have seen that, but I cannot remember which
one), and that would be a bad thing, at least from an Open MPI point
of view.
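
If you want to check whether a build system did that behind your back, something along these lines might help. It is only a sketch: the function name is made up, it expects "readelf -d" output on stdin with one entry per line, and it only knows about the two internal library names mentioned in this thread.

```shell
# Print any internal Open MPI libraries (libopen-pal, libopen-rte) that
# appear as direct NEEDED entries in "readelf -d" output read from stdin.
# Prints nothing (and returns non-zero) when none are found.
internal_ompi_needed() {
    grep -F '(NEEDED)' | grep -E -o 'libopen-(pal|rte)\.so[.0-9]*'
}
```

For example, "readelf -d libhhgttg.so.1 | internal_ompi_needed" printing nothing would mean the library only refers to the internal libraries indirectly, through libmpi.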


Cheers,

Gilles
On Wed, Nov 14, 2018 at 6:46 PM Christopher Samuel wrote:
>
> On 15/11/18 2:16 am, Barrett, Brian via devel wrote:
>
> > In practice, this should not be a problem. The wrapper compilers (and
> >  our instructions for linking when not using the wrapper compilers)
> > only link against libmpi.so (or a set of libraries if using Fortran),
> > as libmpi.so contains the public interface. libmpi.so has a
> > dependency on libopen-pal.so so the loader will load the version of
> > libopen-pal.so that matches the version of Open MPI used to build
> > libmpi.so However, if someone explicitly links against libopen-pal.so
> > you end up where we are today.
>
> Unfortunately that's not the case, just creating a shared library
> that only links in libmpi.so will create dependencies on the private
> libraries too in the final shared library. :-(
>
> Here's a toy example to illustrate that.
>
> [csamuel@farnarkle2 libtool]$ cat hhgttg.c
> int answer(void)
> {
>     return(42);
> }
>
> [csamuel@farnarkle2 libtool]$ gcc hhgttg.c -c -o hhgttg.o
>
> [csamuel@farnarkle2 libtool]$ gcc -shared -Wl,-soname,libhhgttg.so.1 -o
> libhhgttg.so.1 hhgttg.o -lmpi
>
> [csamuel@farnarkle2 libtool]$ ldd libhhgttg.so.1
>     linux-vdso.so.1 =>  (0x7ffc625b3000)
>     libmpi.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libmpi.so.40 (0x7f018a582000)
>     libc.so.6 => /lib64/libc.so.6 (0x7f018a09e000)
>     libopen-rte.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libopen-rte.so.40 (0x7f018a4b5000)
>     libopen-pal.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libopen-pal.so.40 (0x7f0189fde000)
>     libdl.so.2 => /lib64/libdl.so.2 (0x7f0189dda000)
>     librt.so.1 => /lib64/librt.so.1 (0x7f0189bd2000)
>     libutil.so.1 => /lib64/libutil.so.1 (0x7f01899cf000)
>     libm.so.6 => /lib64/libm.so.6 (0x7f01896cd000)
>     libpthread.so.0 => /lib64/libpthread.so.0 (0x7f01894b1000)
>     libz.so.1 => /lib64/libz.so.1 (0x7f018929b000)
>     libhwloc.so.5 => /lib64/libhwloc.so.5 (0x7f018905e000)
>     /lib64/ld-linux-x86-64.so.2 (0x7f018a46b000)
>     libnuma.so.1 => /lib64/libnuma.so.1 (0x7f0188e52000)
>     libltdl.so.7 => /lib64/libltdl.so.7 (0x7f0188c48000)
>     libgcc_s.so.1 => /apps/skylake/software/core/gcccore/6.4.0/lib64/libgcc_s.so.1 (0x7f018a499000)
>
>
> All the best,
> Chris
> --
>   Christopher Samuel OzGrav Senior Data Science Support
>   ARC Centre of Excellence for Gravitational Wave Discovery
>   http://www.ozgrav.org/  http://twitter.com/ozgrav


Re: [OMPI devel] Open-MPI backwards compatibility and library version changes

2018-11-14 Thread Christopher Samuel
On 15/11/18 12:10 pm, Christopher Samuel wrote:

> I wonder if it's because they use libtool instead?

Yup, it's libtool - using it to compile my toy example shows the same
behaviour, with "readelf -d" showing the private libraries pulled in directly. :-(

[csamuel@farnarkle2 libtool]$ cat hhgttg.c
int answer(void)
{
    return(42);
}


[csamuel@farnarkle2 libtool]$ libtool compile gcc hhgttg.c -c -o hhgttg.o
libtool: compile:  gcc hhgttg.c -c  -fPIC -DPIC -o .libs/hhgttg.o
libtool: compile:  gcc hhgttg.c -c -o hhgttg.o >/dev/null 2>&1


[csamuel@farnarkle2 libtool]$ libtool link gcc -o libhhgttg.la hhgttg.lo 
-lmpi -rpath /usr/local/lib
libtool: link: gcc -shared  -fPIC -DPIC  .libs/hhgttg.o   -Wl,-rpath 
-Wl,/apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib 
-Wl,-rpath 
-Wl,/apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib 
/apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libmpi.so 
-L/apps/skylake/software/core/gcccore/6.4.0/lib64 
-L/apps/skylake/software/core/gcccore/6.4.0/lib 
-L/apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib 
/apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libopen-rte.so 
/apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libopen-pal.so 
-ldl -lrt -lutil -lm -lpthread -lz -lhwloc -Wl,-soname 
-Wl,libhhgttg.so.0 -o .libs/libhhgttg.so.0.0.0
libtool: link: (cd ".libs" && rm -f "libhhgttg.so.0" && ln -s 
"libhhgttg.so.0.0.0" "libhhgttg.so.0")
libtool: link: (cd ".libs" && rm -f "libhhgttg.so" && ln -s 
"libhhgttg.so.0.0.0" "libhhgttg.so")
libtool: link: ar cru .libs/libhhgttg.a  hhgttg.o
libtool: link: ranlib .libs/libhhgttg.a
libtool: link: ( cd ".libs" && rm -f "libhhgttg.la" && ln -s 
"../libhhgttg.la" "libhhgttg.la" )


[csamuel@farnarkle2 libtool]$ readelf -d .libs/libhhgttg.so.0 | fgrep -i lib
  0x0001 (NEEDED) Shared library: [libmpi.so.40]
  0x0001 (NEEDED) Shared library: [libopen-rte.so.40]
  0x0001 (NEEDED) Shared library: [libopen-pal.so.40]
  0x0001 (NEEDED) Shared library: [libdl.so.2]
  0x0001 (NEEDED) Shared library: [librt.so.1]
  0x0001 (NEEDED) Shared library: [libutil.so.1]
  0x0001 (NEEDED) Shared library: [libm.so.6]
  0x0001 (NEEDED) Shared library: [libpthread.so.0]
  0x0001 (NEEDED) Shared library: [libz.so.1]
  0x0001 (NEEDED) Shared library: [libhwloc.so.5]
  0x0001 (NEEDED) Shared library: [libc.so.6]
  0x000e (SONAME) Library soname: [libhhgttg.so.0]
  0x001d (RUNPATH) Library runpath: [/apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib]


All the best,
Chris
-- 
  Christopher Samuel OzGrav Senior Data Science Support
  ARC Centre of Excellence for Gravitational Wave Discovery
  http://www.ozgrav.org/  http://twitter.com/ozgrav



Re: [OMPI devel] Open-MPI backwards compatibility and library version changes

2018-11-14 Thread Christopher Samuel
On 15/11/18 11:45 am, Christopher Samuel wrote:

> Unfortunately that's not the case, just creating a shared library
> that only links in libmpi.so will create dependencies on the private
> libraries too in the final shared library. :-(

Hmm, I might be misinterpreting the output of "ldd": it looks like it
reports the dependencies of dependencies too, not just the direct
dependencies.  "readelf -d" seems more reliable for this, as it only lists
what the library itself asks for.

[csamuel@farnarkle2 libtool]$ readelf -d libhhgttg.so.1 | fgrep -i lib
  0x0001 (NEEDED) Shared library: [libmpi.so.40]
  0x0001 (NEEDED) Shared library: [libc.so.6]
  0x000e (SONAME) Library soname: [libhhgttg.so.1]
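
To pull just the direct dependencies out of that output, a small filter along these lines should do. It is a sketch only: the function name is made up, and it assumes each entry sits on one line, which is how readelf prints them before a mail client rewraps it.

```shell
# Print the library names from the NEEDED entries of "readelf -d" output
# read from stdin, one per line. These are the direct dependencies only;
# "ldd" would also list everything those libraries depend on in turn.
direct_needed() {
    sed -n 's/.*(NEEDED).*\[\(.*\)].*/\1/p'
}
```

Running "readelf -d libhhgttg.so.1 | direct_needed" on the toy library above should then list only libmpi.so.40 and libc.so.6.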

Whereas the HDF5 libraries really do have them listed as a dependency.

[csamuel@farnarkle2 1.10.1]$ readelf -d ./lib/libhdf5_fortran.so.100 | fgrep -i lib
  0x0001 (NEEDED) Shared library: [libhdf5.so.101]
  0x0001 (NEEDED) Shared library: [libsz.so.2]
  0x0001 (NEEDED) Shared library: [libmpi_usempif08.so.40]
  0x0001 (NEEDED) Shared library: [libmpi_usempi_ignore_tkr.so.40]
  0x0001 (NEEDED) Shared library: [libmpi_mpifh.so.40]
  0x0001 (NEEDED) Shared library: [libmpi.so.40]
  0x0001 (NEEDED) Shared library: [libopen-rte.so.40]
  0x0001 (NEEDED) Shared library: [libopen-pal.so.40]
  0x0001 (NEEDED) Shared library: [libdl.so.2]
  0x0001 (NEEDED) Shared library: [librt.so.1]
  0x0001 (NEEDED) Shared library: [libutil.so.1]
  0x0001 (NEEDED) Shared library: [libpthread.so.0]
  0x0001 (NEEDED) Shared library: [libz.so.1]
  0x0001 (NEEDED) Shared library: [libhwloc.so.5]
  0x0001 (NEEDED) Shared library: [libgfortran.so.3]
  0x0001 (NEEDED) Shared library: [libm.so.6]
  0x0001 (NEEDED) Shared library: [libquadmath.so.0]
  0x0001 (NEEDED) Shared library: [libc.so.6]
  0x0001 (NEEDED) Shared library: [libgcc_s.so.1]
  0x000e (SONAME) Library soname: [libhdf5_fortran.so.100]
  0x001d (RUNPATH) Library runpath: [/apps/skylake/software/mpi/gcc/6.4.0/openmpi/3.0.0/hdf5/1.10.1/lib:/apps/skylake/software/core/szip/2.1.1/lib:/apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib:/apps/skylake/software/core/gcccore/6.4.0/lib/../lib64]

I wonder if it's because they use libtool instead?

All the best,
Chris
-- 
  Christopher Samuel OzGrav Senior Data Science Support
  ARC Centre of Excellence for Gravitational Wave Discovery
  http://www.ozgrav.org/  http://twitter.com/ozgrav



Re: [OMPI devel] Open-MPI backwards compatibility and library version changes

2018-11-14 Thread Christopher Samuel
On 15/11/18 2:16 am, Barrett, Brian via devel wrote:

> In practice, this should not be a problem. The wrapper compilers (and
>  our instructions for linking when not using the wrapper compilers)
> only link against libmpi.so (or a set of libraries if using Fortran),
> as libmpi.so contains the public interface. libmpi.so has a
> dependency on libopen-pal.so so the loader will load the version of
> libopen-pal.so that matches the version of Open MPI used to build
> libmpi.so However, if someone explicitly links against libopen-pal.so
> you end up where we are today.

Unfortunately that's not the case, just creating a shared library
that only links in libmpi.so will create dependencies on the private
libraries too in the final shared library. :-(

Here's a toy example to illustrate that.

[csamuel@farnarkle2 libtool]$ cat hhgttg.c
int answer(void)
{
    return(42);
}

[csamuel@farnarkle2 libtool]$ gcc hhgttg.c -c -o hhgttg.o

[csamuel@farnarkle2 libtool]$ gcc -shared -Wl,-soname,libhhgttg.so.1 -o 
libhhgttg.so.1 hhgttg.o -lmpi

[csamuel@farnarkle2 libtool]$ ldd libhhgttg.so.1
    linux-vdso.so.1 =>  (0x7ffc625b3000)
    libmpi.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libmpi.so.40 (0x7f018a582000)
    libc.so.6 => /lib64/libc.so.6 (0x7f018a09e000)
    libopen-rte.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libopen-rte.so.40 (0x7f018a4b5000)
    libopen-pal.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libopen-pal.so.40 (0x7f0189fde000)
    libdl.so.2 => /lib64/libdl.so.2 (0x7f0189dda000)
    librt.so.1 => /lib64/librt.so.1 (0x7f0189bd2000)
    libutil.so.1 => /lib64/libutil.so.1 (0x7f01899cf000)
    libm.so.6 => /lib64/libm.so.6 (0x7f01896cd000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x7f01894b1000)
    libz.so.1 => /lib64/libz.so.1 (0x7f018929b000)
    libhwloc.so.5 => /lib64/libhwloc.so.5 (0x7f018905e000)
    /lib64/ld-linux-x86-64.so.2 (0x7f018a46b000)
    libnuma.so.1 => /lib64/libnuma.so.1 (0x7f0188e52000)
    libltdl.so.7 => /lib64/libltdl.so.7 (0x7f0188c48000)
    libgcc_s.so.1 => /apps/skylake/software/core/gcccore/6.4.0/lib64/libgcc_s.so.1 (0x7f018a499000)


All the best,
Chris
-- 
  Christopher Samuel OzGrav Senior Data Science Support
  ARC Centre of Excellence for Gravitational Wave Discovery
  http://www.ozgrav.org/  http://twitter.com/ozgrav


Re: [OMPI devel] Open-MPI backwards compatibility and library version changes

2018-11-14 Thread Barrett, Brian via devel
Chris -

When we look at ABI stability for Open MPI releases, we look only at the MPI 
and SHMEM interfaces, not the interfaces used internally by Open MPI.  
libopen-pal.so is an internal library, and we do not guarantee its ABI stability 
across minor releases.  In 3.0.3, there was a backwards incompatible change in 
libopen-pal.so, which is why the shared library version numbers were increased 
in a way that prevented loading a new version of libopen-pal.so when the 
application was linked against an earlier version of the library.

In practice, this should not be a problem.  The wrapper compilers (and our 
instructions for linking when not using the wrapper compilers) only link 
against libmpi.so (or a set of libraries if using Fortran), as libmpi.so 
contains the public interface.  libmpi.so has a dependency on libopen-pal.so, 
so the loader will load the version of libopen-pal.so that matches the version 
of Open MPI used to build libmpi.so.  However, if someone explicitly links 
against libopen-pal.so, you end up where we are today.
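
As a rough illustration of that failure mode, the sketch below checks which recorded sonames have no matching file in a given library directory. The function name is made up, and it is a simplification: the real loader also consults RUNPATH, LD_LIBRARY_PATH and the ld.so cache, while this looks in a single directory only.

```shell
# Read sonames (one per line) on stdin and print those for which no file
# of that name exists in the directory given as $1.
missing_sonames() {
    dir=$1
    while IFS= read -r soname; do
        [ -e "$dir/$soname" ] || echo "$soname"
    done
}
```

Fed the direct NEEDED names of a library built against 3.0.0 and pointed at a 3.0.3 lib directory, this would print libopen-pal.so.40 if that name was recorded directly, and nothing if only libmpi.so.40 was.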

There’s probably a bug in HDF5’s mechanism for linking against Open MPI, since 
it pulled in a dependency on libopen-pal.so.  However, there may be some things 
we can do in the future to better handle this scenario.  Unfortunately, most of 
the Open MPI developers (myself included) are at the SC’18 conference this 
week, so it will take us some time to investigate further.

Brian

> On Nov 14, 2018, at 5:20 AM, Christopher Samuel wrote:
> 
> Hi folks,
> 
> Just resub'd after a long time to ask a question about binary/backwards 
> compatibility.
> 
> We got bitten when upgrading from 3.0.0 to 3.0.3 which we assumed would be 
> binary compatible and so (after some testing to confirm it was) replaced our 
> existing 3.0.0 install with the 3.0.3 one (because we're using hierarchical 
> namespaces in Lmod it meant we avoided needing to recompile everything we'd 
> already built over the last 12 months with 3.0.0).
> 
> However, once we'd done that we heard from a user that their code would no 
> longer run because it couldn't find libopen-pal.so.40 and saw that instead 
> 3.0.3 had libopen-pal.so.42.
> 
> Initially we thought this was some odd build system problem, but then on 
> digging further we realised that they were linking against libraries that in 
> turn were built against OpenMPI (HDF5) and that those had embedded the 
> libopen-pal.so.40 names.
> 
> Of course our testing hadn't found that because we weren't linking against 
> anything like those for our MPI tests. :-(
> 
> But I was really surprised to see that these version numbers were changing, I 
> thought the idea was to keep things backwardly compatible within these series?
> 
> Now fortunately the reason for the forced upgrade (our 3.0.0 apparently not 
> working with our upgrade to Slurm 18.08.3) turned out to be us missing one 
> combination in our testing whilst fault-finding, and having gotten it going 
> we've been able to drop back to the original 3.0.0 and fix it for them.
> 
> But is this something that you folks have come across before?
> 
> All the best,
> Chris
> -- 
>  Christopher Samuel OzGrav Senior Data Science Support
>  ARC Centre of Excellence for Gravitational Wave Discovery
>  http://www.ozgrav.org/  http://twitter.com/ozgrav
> 
> 
> 