Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#512616: [openmpi] missing symbols?

2009-01-23 Thread Dirk Eddelbuettel

On 23 January 2009 at 09:09, Christophe Prud'homme wrote:
| It means that, indeed, we _must_ recompile/relink all libs and
| programs in Debian depending on openmpi
| 
| Dirk, what do we do? That's quite a job. Perhaps put 1.2.8 back
| in unstable with an epoch, upload 1.3 to experimental, and send an
| email to all parties concerned with openmpi saying that 1.3 requires
| recompiling/relinking.
| 
| Is this diagnosis OK with you, or am I missing something?

Manuel already asked me this. I would prefer a lower-key solution of 

-- a heads-up email to all maintainers
-- followed by bugreports two weeks later
-- followed by binary non-maintainer uploads two weeks later

Dirk

-- 
Three out of two people have difficulties with fractions.


Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#512616: [openmpi] missing symbols?

2009-01-23 Thread Jeff Squyres

On Jan 23, 2009, at 3:09 AM, Christophe Prud'homme wrote:

> > FWIW, "drop in replacement" in this context means recompile and
> > relink.  We did not provide binary compatibility between the 1.2
> > series and the 1.3 series.
>
> that would mean that all libs and programs in Debian depending on
> openmpi must be recompiled and relinked
> yes ?


Correct.  Completely coincidentally and unrelated to this e-mail
thread, Sun, Cisco, Sandia, and U. Tennessee have had a bunch of
conversations over this past week about how to avoid this for future
versions (i.e., to have forward binary compatibility: if you compile
against OMPI va.b, you can change your LD_LIBRARY_PATH and run with
OMPI vc.d).  If all goes well, this will work for v1.4 and
forward -- it will likely *not* be true for the v1.3.x series.



> I have another package in Debian using openmpi called paraview for
> parallel scientific visualisation
> paraview
> paraview: symbol lookup error:
> /usr/lib/paraview/libvtkParallel.so.pv3.4: undefined symbol:
> _ZN3MPI3Win14Set_errhandlerERKNS_10ErrhandlerE
>
> the same link problem as with hypre
>
> now after some investigation when I look in
> openmpi/ompi/mpi/cxx/{win,comm}_inlin.h
>
> I find the _inline_ implementation of MPI::Win::Set_errhandler and
> MPI::Comm::Set_errhandler
> it seems that before openmpi 1.3 these functions were provided with
> the library, i.e. they were not inlined
> but shipped with the mpi_cxx lib.


Correct.  IIRC, we made this change because we eliminated the use of
the STL from our C++ bindings and it was therefore unnecessary to have
these functions in the library anymore (the vast majority of OMPI's
C++ MPI API bindings are inlined).
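
For readers following the thread, here is a minimal C++ sketch of the distinction being discussed: a member function that is only declared in the header and defined in the library (so callers depend on the library exporting the symbol), versus one defined inline in the header (so every caller compiles its own copy). The namespace `sketch` and the classes `WinOutOfLine` and `WinInline` are hypothetical stand-ins, not Open MPI code, and the returned strings exist only to make the difference visible.

```cpp
#include <string>

namespace sketch {  // hypothetical names, for illustration only

class Errhandler {};

// Out-of-line style: the header only declares the member function.
// Code compiled against this header contains an undefined reference
// that must be resolved from the library at link/load time.
class WinOutOfLine {
public:
    std::string Set_errhandler(const Errhandler&);
};

// This definition stands in for what would live in the library's own
// source file and be exported from the shared object.
std::string WinOutOfLine::Set_errhandler(const Errhandler&) {
    return "resolved from the library";
}

// Inline style: the body is in the header, so each translation unit
// that calls it emits its own (weak) copy and needs nothing from the
// library for this function.
class WinInline {
public:
    std::string Set_errhandler(const Errhandler&) const {
        return "compiled into the caller";
    }
};

}  // namespace sketch
```

A binary built against the out-of-line header keeps asking the dynamic linker for that symbol; if a newer library no longer exports it, the binary fails with a symbol lookup error until it is rebuilt against the newer headers.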



> It means that, indeed, we _must_ recompile/relink all libs and
> programs in Debian depending on openmpi


Sorry about that.  :-(

--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#512616: [openmpi] missing symbols?

2009-01-22 Thread Manuel Prinz
Sorry about that; the message was supposed to be a private message to
Dirk. I did not notice (or expect) that you override Reply-To.

Best regards
Manuel





Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#512616: [openmpi] missing symbols?

2009-01-22 Thread Manuel Prinz
On Thursday, 22.01.2009, at 15:12 -0600, Dirk Eddelbuettel wrote:
> On 22 January 2009 at 15:23, Jeff Squyres wrote:
> | FWIW, "drop in replacement" in this context means recompile and  
> | relink.  We did not provide binary compatibility between the 1.2  
> | series and the 1.3 series.

That does not sound good at all.

> Ack. So we need to push that through all Open MPI-using apps in Debian.

How can this be solved? BinNMUs, or should we offer packages for 1.2.x
and 1.3.x in parallel?

This may sound silly, but I have no experience with this and did not go
through such a case during my "training" (aka NM) either.

Best regards
Manuel




Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#512616: [openmpi] missing symbols?

2009-01-22 Thread Dirk Eddelbuettel

Dear Open MPI developers,

This bug report just came in against the new Open MPI 1.3 release which we
built the same way as 1.2.*.  


Christophe,

Sorry about that.  And yes, it should be a drop-in replacement. You can
revert to 1.2.8 from testing for now.  If you have a small
self-contained C++ example, it would help with debugging.


Regards, Dirk


On 22 January 2009 at 11:10, Christophe Prud'homme wrote:
| Package: openmpi
| Version: 1.3-1
| Severity: serious
| 
| --- Please enter the report below this line. ---
| 
| 
| Hello
| 
| I just upgraded to openmpi 1.3-1. The compilation of my codes went fine.
| The linking stage sometimes failed with
| 
| undefined reference to `MPI::Win::Set_errhandler(MPI::Errhandler const&)'
| undefined reference to `MPI::Comm::Set_errhandler(MPI::Errhandler const&)'
| 
| in some external libs (trilinos) using openmpi 
| 
| When linking succeeds, the program fails at runtime with, for example,
| 
| symbol lookup error: /usr/lib/libHYPRE_FEI.so.2.0.0: undefined symbol: 
| _ZN3MPI3Win14Set_errhandlerERKNS_10ErrhandlerE   
| 
| After playing with nm I got
| 
| nm /usr/lib/openmpi/lib/libmpi_cxx.a
| 
|  W _ZN3MPI6Status9Set_errorEi   
|  W _ZNK3MPI3Win14Set_errhandlerERKNS_10ErrhandlerE  
|  W _ZNK3MPI4Comm14Set_errhandlerERKNS_10ErrhandlerE 
|  W _ZN3MPI6Status9Set_errorEi   
|  W _ZNK3MPI3Win14Set_errhandlerERKNS_10ErrhandlerE  
|  W _ZNK3MPI4Comm14Set_errhandlerERKNS_10ErrhandlerE 
|  W _ZN3MPI6Status9Set_errorEi   
|  W _ZNK3MPI3Win14Set_errhandlerERKNS_10ErrhandlerE  
|  W _ZNK3MPI4Comm14Set_errhandlerERKNS_10ErrhandlerE 
|  W _ZN3MPI6Status9Set_errorEi   
|  W _ZNK3MPI3Win14Set_errhandlerERKNS_10ErrhandlerE
|  W _ZNK3MPI4Comm14Set_errhandlerERKNS_10ErrhandlerE
|  W _ZN3MPI6Status9Set_errorEi
|  W _ZNK3MPI3Win14Set_errhandlerERKNS_10ErrhandlerE
|  W _ZNK3MPI4Comm14Set_errhandlerERKNS_10ErrhandlerE
|  W _ZN3MPI6Status9Set_errorEi
|  W _ZNK3MPI3Win14Set_errhandlerERKNS_10ErrhandlerE
|  W _ZNK3MPI4Comm14Set_errhandlerERKNS_10ErrhandlerE
| 
| which means that the missing symbol is registered as a weak symbol.
| 
| PS: I have -lmpi++ -lmpi at the linking stage, and libmpi_cxx.* are present in 
| /usr/lib/openmpi/lib
| 
| Am I missing something with 1.3? Shouldn't it be a drop-in replacement?
| 
| --- System information. ---
| Architecture: i386
| Kernel:   Linux 2.6.26-1-686-bigmem
| 
| Debian Release: 5.0
|   500 unstable    ftp.fr.debian.org 
|   500 testing     security.debian.org 
|   500 hardy       ppa.launchpad.net 
|   500 UNRELEASED  kde42.debian.net 
| 
| --- Package information. ---
| Depends   (Version) | Installed
| ===-+-===
| | 
| 
| 
| 
| -- 
| Debian Developer
| Annecy - Grenoble
| Scientific computing related software
| -- 
| Pkg-openmpi-maintainers mailing list
| pkg-openmpi-maintain...@lists.alioth.debian.org
| http://lists.alioth.debian.org/mailman/listinfo/pkg-openmpi-maintainers

-- 
Three out of two people have difficulties with fractions.
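
As a side note on reading the symbols in the report above: the mangled names can be turned back into C++ signatures programmatically. The sketch below assumes a GCC- or Clang-style toolchain that provides `abi::__cxa_demangle` in `<cxxabi.h>` (Itanium C++ ABI); the helper names `demangle` and `is_const_member` are made up for illustration.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <cstdlib>   // std::free
#include <string>

// Returns the demangled form of an Itanium-ABI symbol name, or the
// input unchanged if the name cannot be demangled.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With a null output buffer, the result is malloc()-allocated and
    // must be released with free().
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        std::free(out);
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}

// True if the name starts with "_ZNK", the usual encoding of a const
// (non-volatile) member function; other qualifier combinations are
// deliberately not handled in this sketch.
bool is_const_member(const std::string& mangled) {
    return mangled.compare(0, 4, "_ZNK") == 0;
}
```

Applied to the names in the report, this shows that the missing symbol `_ZN3MPI3Win14Set_errhandlerERKNS_10ErrhandlerE` is the non-const `MPI::Win::Set_errhandler(MPI::Errhandler const&)`, whereas the `_ZNK...` entries in the nm listing are const member functions. That the two differ in constness is a reading of the listing, not something stated in the thread.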