[OMPI devel] Open MPI SC'18 State of the Union BOF slides

2018-11-16 Thread Jeff Squyres (jsquyres) via devel
Thanks to all who came to the Open MPI SotU BOF at SC'18 in Dallas, TX, USA 
this week!  It was great talking with you all.

Here are the slides that we presented:

https://www.open-mpi.org/papers/sc-2018/

Please feel free to ask any followup questions on the users or devel lists.

-- 
Jeff Squyres
jsquy...@cisco.com

___
devel mailing list
devel@lists.open-mpi.org
https://lists.open-mpi.org/mailman/listinfo/devel


Re: [OMPI devel] Open-MPI backwards compatibility and library version changes

2018-11-16 Thread Barrett, Brian via devel
Gilles -

Look at the output of Chris’s libtool link line; you can see it’s explicitly 
adding a dependency on libopen-pal.so to the test binary.  Once it does that, 
it’s game over: the OS linking system will, rightly, complain about us changing 
the c:r:a in the libtool version system in a way that isn’t backwards 
compatible.
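
To make the c:r:a point concrete, here is a minimal sketch of how libtool's
-version-info current:revision:age triple maps to a versioned file name and
soname major number on Linux. The helper names and the 60:2:20 values are
made up for illustration; they are not Open MPI's actual version numbers.

```c
#include <stdio.h>
#include <string.h>

/* On Linux, libtool turns -version-info c:r:a into lib<name>.so.(c-a).(a).(r).
 * Only the major number (c - a) appears in the soname that binaries record. */
int soname_major(int current, int age)
{
    return current - age;
}

/* Hypothetical helper: format the fully versioned file name into buf.
 * Returns the value of snprintf(), i.e. the length that was needed. */
int versioned_name(char *buf, size_t n, const char *name,
                   int current, int revision, int age)
{
    return snprintf(buf, n, "lib%s.so.%d.%d.%d",
                    name, soname_major(current, age), age, revision);
}
```

With the made-up values above, versioned_name(buf, sizeof buf, "open-pal",
60, 2, 20) produces libopen-pal.so.40.20.2 and the soname major is 40. Raising
current without raising age by the same amount changes that major number, so a
binary that recorded the old soname as a direct dependency can no longer
resolve it.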

Unfortunately, I don’t have a good idea of what to do now.  We already did the 
damage on the 3.x series.  Our backwards compatibility testing (as lame as it 
is) just links libmpi, so it’s all good.  But if anyone uses libtool, we’ll 
have a problem, because we install the .la files that allow libtool to see the 
dependency of libmpi on libopen-pal, and it gets too excited.
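
One way to check whether a given binary or library picked up such a direct
dependency is to read its DT_NEEDED entries, which list only what the object
itself asks for (ldd also shows everything pulled in indirectly). Below is a
minimal sketch, assuming a 64-bit ELF file in the host's byte order on Linux;
list_needed() is a hypothetical helper name, and readelf -d or objdump -p
report the same information.

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: print the DT_NEEDED entries of a 64-bit ELF file to
 * `out` and return how many were found, or -1 if the file cannot be read
 * or is not a 64-bit ELF object. */
int list_needed(const char *path, FILE *out)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    /* Read the whole file into memory; fine for a small diagnostic. */
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
    long len = ftell(f);
    if (len < (long)sizeof(Elf64_Ehdr)) { fclose(f); return -1; }
    rewind(f);
    unsigned char *buf = malloc((size_t)len);
    if (buf == NULL || fread(buf, 1, (size_t)len, f) != (size_t)len) {
        free(buf);
        fclose(f);
        return -1;
    }
    fclose(f);

    size_t size = (size_t)len;
    int count = -1;
    const Elf64_Ehdr *eh = (const Elf64_Ehdr *)buf;
    if (memcmp(eh->e_ident, ELFMAG, SELFMAG) != 0 ||
        eh->e_ident[EI_CLASS] != ELFCLASS64)
        goto done;
    /* Make sure the section header table lies inside the file. */
    if (eh->e_shoff > size ||
        (size - eh->e_shoff) / sizeof(Elf64_Shdr) < eh->e_shnum)
        goto done;

    const Elf64_Shdr *sh = (const Elf64_Shdr *)(buf + eh->e_shoff);
    count = 0;
    for (size_t i = 0; i < eh->e_shnum; i++) {
        if (sh[i].sh_type != SHT_DYNAMIC || sh[i].sh_link >= eh->e_shnum)
            continue;
        /* sh_link of the dynamic section names its string table section. */
        const Elf64_Shdr *str = &sh[sh[i].sh_link];
        if (sh[i].sh_offset > size || sh[i].sh_size > size - sh[i].sh_offset ||
            str->sh_offset > size || str->sh_size > size - str->sh_offset)
            continue;
        const Elf64_Dyn *dyn = (const Elf64_Dyn *)(buf + sh[i].sh_offset);
        const char *names = (const char *)(buf + str->sh_offset);
        size_t n = sh[i].sh_size / sizeof(Elf64_Dyn);
        for (size_t j = 0; j < n && dyn[j].d_tag != DT_NULL; j++) {
            if (dyn[j].d_tag == DT_NEEDED && dyn[j].d_un.d_val < str->sh_size) {
                fprintf(out, "NEEDED %s\n", names + dyn[j].d_un.d_val);
                count++;
            }
        }
    }
done:
    free(buf);
    return count;
}
```

Calling list_needed("libhhgttg.so.1", stdout) on a library built like the toy
example in this thread would show whether libopen-pal.so.40 is recorded by
the library itself or only reached through libmpi.so.40.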

We’ll need to talk about how we think about this change in the future.

Brian

> On Nov 14, 2018, at 6:07 PM, Gilles Gouaillardet wrote:
> 
> Chris,
> 
> I am a bit puzzled by your logs.
> 
> As far as I understand,
> 
> ldd libhhgttg.so.1
> 
> reports that libopen-rte.so.40 and libopen-pal.so.40 are both
> dependencies, but that does not say anything about
> which library depends on them. They could be directly needed by
> libhhgttg.so.1 (I hope / do not think that is the case),
> or indirectly via libmpi.so.40 (I'd rather bet on that).
> 
> In the latter case, having libhhgttg.so.1 point to another
> libmpi.so.40 that depends on newer opal/orte libraries should just
> work.
> 
> You might want to run strings libhhgttg.so.1 and look for libmpi.so.40
> (I found it) and libopen-pal.so.40 (I did not find it) or
> libopen-rte.so.40 (I did not find it either).
> 
> 
> Note that if you build with
> gcc -shared -o libhhgttg.so.1 libhhgttg.c -lmpi -lopen-rte -lopen-pal
> then your lib will explicitly depend on the "internal" MPI libraries
> and you will face the same issue as your end user.
> You should not need to do that (I assume you do not explicitly call
> internal opal/orte subroutines), so you should avoid doing it.
> That being said, keep in mind that some build systems might do that
> for you under the hood (I have seen that, but I cannot remember which
> one), and that would be a bad thing, at least from an Open MPI point
> of view.
> 
> 
> Cheers,
> 
> Gilles
> On Wed, Nov 14, 2018 at 6:46 PM Christopher Samuel wrote:
>> 
>> On 15/11/18 2:16 am, Barrett, Brian via devel wrote:
>> 
>>> In practice, this should not be a problem. The wrapper compilers (and
>>> our instructions for linking when not using the wrapper compilers)
>>> only link against libmpi.so (or a set of libraries if using Fortran),
>>> as libmpi.so contains the public interface. libmpi.so has a
>>> dependency on libopen-pal.so so the loader will load the version of
>>> libopen-pal.so that matches the version of Open MPI used to build
>>> libmpi.so. However, if someone explicitly links against libopen-pal.so
>>> you end up where we are today.
>> 
>> Unfortunately that's not the case: just creating a shared library
>> that only links in libmpi.so will create dependencies on the private
>> libraries too in the final shared library. :-(
>> 
>> Here's a toy example to illustrate that.
>> 
>> [csamuel@farnarkle2 libtool]$ cat hhgttg.c
>> int answer(void)
>> {
>>     return(42);
>> }
>> 
>> [csamuel@farnarkle2 libtool]$ gcc hhgttg.c -c -o hhgttg.o
>> 
>> [csamuel@farnarkle2 libtool]$ gcc -shared -Wl,-soname,libhhgttg.so.1 -o libhhgttg.so.1 hhgttg.o -lmpi
>> 
>> [csamuel@farnarkle2 libtool]$ ldd libhhgttg.so.1
>>     linux-vdso.so.1 =>  (0x7ffc625b3000)
>>     libmpi.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libmpi.so.40 (0x7f018a582000)
>>     libc.so.6 => /lib64/libc.so.6 (0x7f018a09e000)
>>     libopen-rte.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libopen-rte.so.40 (0x7f018a4b5000)
>>     libopen-pal.so.40 => /apps/skylake/software/compiler/gcc/6.4.0/openmpi/3.0.0/lib/libopen-pal.so.40 (0x7f0189fde000)
>>     libdl.so.2 => /lib64/libdl.so.2 (0x7f0189dda000)
>>     librt.so.1 => /lib64/librt.so.1 (0x7f0189bd2000)
>>     libutil.so.1 => /lib64/libutil.so.1 (0x7f01899cf000)
>>     libm.so.6 => /lib64/libm.so.6 (0x7f01896cd000)
>>     libpthread.so.0 => /lib64/libpthread.so.0 (0x7f01894b1000)
>>     libz.so.1 => /lib64/libz.so.1 (0x7f018929b000)
>>     libhwloc.so.5 => /lib64/libhwloc.so.5 (0x7f018905e000)
>>     /lib64/ld-linux-x86-64.so.2 (0x7f018a46b000)
>>     libnuma.so.1 => /lib64/libnuma.so.1 (0x7f0188e52000)
>>     libltdl.so.7 => /lib64/libltdl.so.7 (0x7f0188c48000)
>>     libgcc_s.so.1 => /apps/skylake/software/core/gcccore/6.4.0/lib64/libgcc_s.so.1 (0x7f018a499000)
>> 
>> 
>> All the best,
>> Chris
>> --
>>  Christopher Samuel OzGrav Senior Data Science Support
>>  ARC Centre of Excellence for Gravitational Wave Discovery
>>  http://www.ozgrav.org/  http://twitter.com/ozgrav