WHAT: Break ABI between 1.4 and 1.5 series.

WHY: To settle the ABI and .so versioning issues once and for all.

WHERE: Open MPI's .so versions and the opal_wrapper compiler.

WHEN: For 1.5[.0].  This is only meaningful if we do it for the *entire* v1.5 
series.

TIMEOUT: Next Tuesday teleconf, 23 Feb 2010

=======================================================

BACKGROUND / REQUIRED READING:
------------------------------

 * Ticket 2092: https://svn.open-mpi.org/trac/ompi/ticket/2092
 * Libtool .so versioning rules: 
https://svn.open-mpi.org/trac/ompi/wiki/ReleaseProcedures

Libtool .so version numbers are expressed as c:r:a.  libmpi is currently 
versioned "correctly", meaning that we advance the c:r:a triple as necessary 
for each release.  libopen-pal and libopen-rte, however, are currently fixed at 
0:0:0, which is Wrong.  The reasons why they are fixed at 0:0:0 are expressed 
in #2092.

SHORT VERSION OF THIS PROPOSAL:
-------------------------------

 * For v1.5.0, set c:r:a of libmpi to 1:0:0.
 * Starting with v1.5.0, set c:r:a for libopen-rte and libopen-pal properly.
 * This means a break in ABI between v1.4.x and v1.5.x, but the ABI will remain 
constant for all of 1.5.x/1.6.x.
 * The wrapper compilers will need to be updated to recognize the difference 
between static and dynamic linking.

LONGER VERSION / MORE DETAILS AND RATIONALE:
--------------------------------------------

The fix for these issues involves several dominoes falling in order.  You need 
to read this whole proposal to understand the full scope, sorry.  :-\

1. We need to fix the wrapper compilers to recognize the difference between 
shared library linking and static linking.  Right now, the MPI wrappers always 
do this:

    -lmpi -lopen-rte -lopen-pal

2. Listing all three libraries is only necessary when linking statically.  When 
linking dynamically, only the top-level library should be listed (e.g., -lmpi 
for MPI applications).  The implicit linker dependencies of libmpi.so will 
automatically pull in libopen-rte.so.  Likewise, the implicit dependencies of 
libopen-rte.so will automatically pull in libopen-pal.so.  More specifically, 
when linking dynamically, MPI a.out applications will only explicitly depend on 
libmpi.so (not libopen-rte.so and not libopen-pal.so).

3. Hence, the wrappers need to learn the difference between static and dynamic 
linking: when linking dynamically, only list "-lmpi".  When linking statically, 
list all 3 libraries.  This allows minimization of explicit library 
dependencies in dynamic linking, and is arguably the Right way to do it.

--> More below about how to make the wrappers understand the difference between 
static/shared linking.

4. When MPI applications only depend on libmpi, we can properly version 
libopen-rte.so and libopen-pal.so.  Hence, for v1.5.0, we will have non-0:0:0 
.so versions for these two libraries.

5. Since MPI application a.out's created by the v1.4 series will have explicit 
dependencies on all 3 libraries, they will be ABI incompatible with Open MPI 
v1.5's ORTE and OPAL libraries (as opposed to MPI applications created with 
updated wrappers in v1.5, which will only depend on libmpi when linking 
dynamically).

6. The question then remains: what should libmpi.so's c:r:a values be in 
v1.5.0?  I say 1:0:0.  Here's why:
  * Recall that we have added some new MPI-2.2 functions in v1.5.  Hence, 
libmpi.so's "c" needs to increase to 1 and "r" needs to be set to 0.  The 
question is what to do with the "a" value.
  * By extension of #5, we should also make libmpi.so be ABI incompatible 
between v1.4.x and v1.5.x (to prevent some needless confusion -- rather than 
have libmpi be ABI compatible and libopen-rte and libopen-pal *not* be ABI 
compatible, I think it would be better to make *all 3* be ABI incompatible).  
This means setting the libmpi.so "a" value to 0 (as opposed to setting it to 1).

Crystal clear?  I thought so.  :-)
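
To make the "a" choice in item 6 concrete, here is a minimal sketch of the 
Libtool update rules.  The starting triple 0:1:0 for v1.4's libmpi is a 
made-up placeholder (the text above only implies that "c" is currently 0), and 
the function names are hypothetical.  The sketch also relies on how libtool 
names libraries on Linux-style ELF platforms, where the major number in the 
soname is c minus a.

```python
def bump(current, revision, age, *, added=False, removed_or_changed=False):
    """Apply Libtool's c:r:a update rules for one release with source changes."""
    revision += 1                    # any source change bumps "r"
    if added or removed_or_changed:
        current += 1                 # the interface set changed: bump "c"...
        revision = 0                 # ...and reset "r"
    if added:
        age += 1                     # additions alone keep old binaries working
    if removed_or_changed:
        age = 0                      # compatibility with older interfaces is gone
    return current, revision, age


def linux_soname_major(current, age):
    """Major number in the soname on Linux-style platforms: c - a."""
    return current - age


# ASSUMPTION: 0:1:0 stands in for whatever libmpi's real v1.4 triple is.
v14 = (0, 1, 0)

# Only adding the MPI-2.2 functions would keep the same soname major number.
compatible = bump(*v14, added=True)

# Declaring an ABI break on top of the additions forces "a" back to 0.
incompatible = bump(*v14, added=True, removed_or_changed=True)

print(compatible, linux_soname_major(compatible[0], compatible[2]))
print(incompatible, linux_soname_major(incompatible[0], incompatible[2]))
```

Under these assumptions, 1:0:1 keeps the same major number as the v1.4 
placeholder, so old binaries would still resolve libmpi.so, while 1:0:0 moves 
to a new major number and makes the break explicit.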

------

Here's my proposal on how to change the wrapper compilers to understand the 
difference between static and dynamic linking:

*** FIRST: give the wrapper the ability to link one library or all libraries
- wrapper data text files grow a new option: libs_private (a la pkg-config(1) 
files)
- wrapper data text files list -l<top_lib> in libs, and everything else in 
libs_private.  For example, for mpicc:
  libs=-lmpi
  libs_private=-lopen-rte -lopen-pal
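
As a rough illustration of the split, the sketch below reads the two fields 
from wrapper data text shaped like the mpicc example above.  It is not the 
real opal_wrapper parser: the function name is hypothetical, and treating the 
file as plain key=value lines with "#" comments is an assumption made for the 
example.

```python
import shlex


def parse_wrapper_data(text):
    """Parse key=value lines into {key: [tokens]} (assumed file format)."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # ASSUMPTION: blank lines and "#" comments carry no data.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key.strip()] = shlex.split(value)
    return fields


example = """\
libs=-lmpi
libs_private=-lopen-rte -lopen-pal
"""

data = parse_wrapper_data(example)
print(data["libs"])          # what a dynamic link needs
print(data["libs_private"])  # extra libraries only a static link needs
```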

*** NEXT: give the wrappers the ability to switch between just ${libs} or 
${libs}+${libs_private}.  Pseudocode:
- wrapper always adds ${libs} to the argv
- wrapper examines each argv[x]:
  --ompi:shared) found_in_argv=1 ;;
  --ompi:static) add ${libs_private} ; found_in_argv=1 ;;
- if (!found_in_argv) 
  - if default set via configure, add ${libs_private} (SEE BELOW)

*** LAST: give sysadmin ability to set wrapper behavior defaults
- if --disable-shared is set in OMPI's configure, wrappers default to adding 
both ${libs} and ${libs_private}
- new configure option: --enable-wrapper-static-link-by-default (or some better 
name) which makes the wrappers default to adding both ${libs} and 
${libs_private} (--disable... does the opposite)

Note that per above, wrapper command line options always override configure 
defaults.
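
The selection logic described above can be sketched as follows.  This is an 
illustration of the pseudocode, not the opal_wrapper implementation: the 
function name and the static_by_default parameter (standing in for whatever 
configure decides) are hypothetical, and stripping the --ompi:* flags before 
handing the rest to the underlying compiler is an assumption.

```python
def wrapper_link_args(argv, libs, libs_private, static_by_default=False):
    """Return (args_for_compiler, link_libs) per the proposed wrapper rules."""
    link_libs = list(libs)           # ${libs} is always added
    passthrough = []
    found_in_argv = False
    want_private = False

    for arg in argv:
        if arg == "--ompi:shared":
            found_in_argv = True     # explicit request: top-level library only
        elif arg == "--ompi:static":
            found_in_argv = True
            want_private = True      # explicit request: add ${libs_private}
        else:
            passthrough.append(arg)  # ASSUMPTION: wrapper flags are stripped

    # The command line always overrides the configure-time default.
    if not found_in_argv:
        want_private = static_by_default

    if want_private:
        link_libs += libs_private
    return passthrough, link_libs


libs = ["-lmpi"]
libs_private = ["-lopen-rte", "-lopen-pal"]

print(wrapper_link_args(["hello.c", "-o", "hello"], libs, libs_private))
print(wrapper_link_args(["--ompi:static", "hello.c"], libs, libs_private))
print(wrapper_link_args(["--ompi:shared", "hello.c"], libs, libs_private,
                        static_by_default=True))
```

The third call shows the override: even with a static default from configure, 
--ompi:shared on the command line yields only -lmpi.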

This is not entirely perfect, for the following reasons:

1. sysadmins may have to specify a new option to configure (only if they build 
both static and shared and want users to default to static)
2. two new options to the wrappers
3. you can still get in a situation where the wrapper will fail (e.g., wrapper 
only uses ${libs}, but only the .a's exist, and therefore the link fails)

I think #1 and #2 are tolerable.  

I can't think of a reasonable case where #3 can occur without someone mucking 
with an already-installed OMPI (e.g., "rm $prefix/lib/libmpi.so").  The only 
case I can think of where this *might* happen is with RPMs -- ompi (which has 
libmpi.so) and ompi-devel (which has libmpi.a).  ompi-devel depends on ompi, so 
you couldn't remove the ompi RPM (libmpi.so) and only leave the ompi-devel RPM 
(libmpi.a).  Hence, I even think #3 is tolerable.

Thoughts?  Opinions?  Need caffeine?  WAKE UP!  The proposal's over.  ;-)

-- 
Jeff Squyres
jsquy...@cisco.com

For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/

