[OMPI devel] Fwd: 1.5rc5 has been posted

2010-08-30 Thread Larry Baker
The same problem (the LIBS = definition is missing -lpthread) occurs in
orte/tools/{orte-clean,orte-iof,orte-ps,orted,orterun,orte-top}/Makefile.


Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov

Begin forwarded message:


From: Larry Baker 
Date: August 30, 2010 4:48:01 PM PDT
To: Open MPI Developers 
Subject: Re: [OMPI devel] 1.5rc5 has been posted

To follow up on http://www.open-mpi.org/community/lists/devel/2010/08/8417.php:
OpenMPI 1.5rc5 fails in opal/tools/wrappers for PGI 10.3.


The problem appears to be a missing -lpthread in the definition of
most of the *LIBS variables in the OpenMPI 1.5rc5
opal/tools/wrappers/Makefile:


[root@hydra src]# diff openmpi-{1.4.2,1.5rc5}/opal/tools/wrappers/Makefile | grep lutil

< LIBS = -lnsl -lutil  -lpthread
> LIBS = -lnsl  -lutil
< OMPI_WRAPPER_EXTRA_LIBS =   -ldl   -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
> OMPI_WRAPPER_EXTRA_LIBS =   -ldl   -Wl,--export-dynamic -lnsl -lutil -ldl
< OPAL_WRAPPER_EXTRA_LIBS = -ldl   -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
> OPAL_WRAPPER_EXTRA_LIBS = -ldl   -Wl,--export-dynamic -lnsl -lutil -ldl
< ORTE_WRAPPER_EXTRA_LIBS =  -ldl   -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
> ORTE_WRAPPER_EXTRA_LIBS =  -ldl   -Wl,--export-dynamic -lnsl -lutil -ldl
< WRAPPER_EXTRA_LIBS =   -ldl   -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
> WRAPPER_EXTRA_LIBS = -ldl   -Wl,--export-dynamic -lnsl -lutil -ldl
< crs_blcr_LIBS = -lnsl -lutil  -lpthread
> crs_blcr_LIBS = -lnsl  -lutil -lpthread


[root@hydra src]# diff openmpi-{1.4.2,1.5rc5}/opal/tools/wrappers/Makefile | grep LINK

< LINK = $(LIBTOOL) --tag=CC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
> LINK = $(LIBTOOL) $(AM_V_lt) --tag=CC $(AM_LIBTOOLFLAGS) \
< 	$(LINK) $(opal_wrapper_OBJECTS) $(opal_wrapper_LDADD) $(LIBS)
> 	$(AM_V_CCLD)$(LINK) $(opal_wrapper_OBJECTS) $(opal_wrapper_LDADD) $(LIBS)


I don't know anything about automake, so I don't know what code to  
look at that changed between 1.4.2 and 1.5rc5 that defines the *LIBS  
Makefile variables.


Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov

On Aug 17, 2010, at 2:18 PM, Jeff Squyres wrote:


We still have one known possible regression:

   https://svn.open-mpi.org/trac/ompi/ticket/2530

But we posted rc5 anyway (there's a bunch of stuff that has been  
pending for a while that is now in).  Please test!


   http://www.open-mpi.org/software/ompi/v1.5/

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






Re: [OMPI devel] 1.5rc5 has been posted

2010-08-30 Thread Larry Baker
The fix I posted in http://www.open-mpi.org/community/lists/devel/2010/08/8311.php
for the "Redefinition of symbol assert" warning causes a link failure of
opal_wrapper.  This is because there are assert() calls in
opal/mca/memory/ptmalloc2/arena.c, which is included in
opal/mca/memory/ptmalloc2/malloc.c before the conditional on
MALLOC_DEBUG, which is where I moved #include <assert.h>.  arena.c does
not contain its own #include <assert.h>.  I changed the patch to
opal/mca/memory/ptmalloc2/malloc.c to define assert where it was before,
but in such a way that it always uses the system <assert.h> header file
to define the assert macro.


In opal/mca/memory/ptmalloc2/malloc.c, change lines 364-369 from:


#if MALLOC_DEBUG
#include <assert.h>
#else
#undef assert
#define assert(x) ((void)0)
#endif


to:


#if MALLOC_DEBUG && defined( NDEBUG )
#error -DMALLOC_DEBUG is inconsistent with -DNDEBUG
#endif

#include <assert.h>


The reason the conditional uses the value of MALLOC_DEBUG, but
defined( NDEBUG ), is that the code that depends on MALLOC_DEBUG uses
#if MALLOC_DEBUG conditionals, while <assert.h> uses #ifdef NDEBUG to
define the assert macro.  I used the same semantics to detect the
inconsistency between MALLOC_DEBUG and NDEBUG.
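
As a minimal sketch of that check outside malloc.c: the default of 0 for an unset MALLOC_DEBUG and the helper check_request_size are assumptions made for illustration, not code from Open MPI.

```c
/* Assumption: treat an unset MALLOC_DEBUG as 0, which matches how
 * "#if MALLOC_DEBUG" evaluates an undefined macro. */
#ifndef MALLOC_DEBUG
#define MALLOC_DEBUG 0
#endif

/* MALLOC_DEBUG is tested by value; NDEBUG is tested by whether it is
 * defined, because <assert.h> itself uses #ifdef NDEBUG. */
#if MALLOC_DEBUG && defined(NDEBUG)
#error -DMALLOC_DEBUG is inconsistent with -DNDEBUG
#endif

#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from malloc.c: returns 1 if the request size
 * is nonzero.  In a MALLOC_DEBUG build it also asserts that condition. */
int check_request_size(size_t bytes)
{
#if MALLOC_DEBUG
    assert(bytes != 0);
#endif
    return bytes != 0;
}
```

Compiling this with -DMALLOC_DEBUG=1 -DNDEBUG should stop at the #error line; any other combination of the two flags compiles.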


Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov

On Aug 23, 2010, at 5:29 PM, Larry Baker wrote:

The PGI C compiler complains (issues a warning) for the redefinition  
of the assert macro in opal/mca/memory/ptmalloc2/malloc.c:


Making all in mca/memory/ptmalloc2
make[2]: Entering directory `/home/baker/openmpi-1.5rc5/opal/mca/memory/ptmalloc2'

 CC opal_ptmalloc2_component.lo
 CC opal_ptmalloc2_munmap.lo
 CC malloc.lo
PGC-W-0221-Redefinition of symbol assert (/usr/include/assert.h: 51)
PGC-W-0258-Argument 1 in macro assert is not identical to previous definition (/usr/include/assert.h: 51)


FYI.  assert.h is an unusual include file -- it does not use an
#ifdef guard macro in the usual way.  Instead, it #undefs assert if the
guard macro is defined (NOT if assert is defined, which is the root cause
of this warning), #defines the guard macro, then (re)defines
assert() based on whether NDEBUG is currently defined.
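
A small sketch of that re-inclusion behavior, assuming a standard-conforming <assert.h>.  The function names are made up for illustration, and putting a side effect inside assert() is done here only to make the difference observable.

```c
/* First inclusion with NDEBUG defined: assert(expr) expands to a no-op,
 * so its argument is never evaluated. */
#define NDEBUG
#include <assert.h>

/* Hypothetical name: reports how often the assert argument ran. */
int evaluations_with_ndebug(void)
{
    int evaluated = 0;
    assert(++evaluated);   /* disabled: evaluated stays 0 */
    return evaluated;
}

/* Second inclusion with NDEBUG undefined: <assert.h> redefines assert
 * as the checking version instead of being skipped by a guard. */
#undef NDEBUG
#include <assert.h>

/* Hypothetical name: same body, compiled with assert enabled. */
int evaluations_without_ndebug(void)
{
    int evaluated = 0;
    assert(++evaluated);   /* enabled: evaluated becomes 1, which is true */
    return evaluated;
}
```

Called from a main(), the first function returns 0 and the second returns 1, because each one sees whichever definition of assert was in effect at the point where it was compiled.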


opal/mca/memory/ptmalloc2/malloc.c did not change from OpenMPI
1.4.2.  malloc.c includes opal/mca/memory/ptmalloc2/hooks.c, which
did change in OpenMPI 1.5rc5.  hooks.c indirectly includes
<assert.h> through opal/mca/base/mca_base_param.h.  This is where
the warning occurs.


malloc.c defines its own assert macro in lines 364-369:

364 #if MALLOC_DEBUG
365 #include <assert.h>
366 #else
367 #undef assert
368 #define assert(x) ((void)0)
369 #endif

The warning occurs because the definition of assert in line 368 is
not the same as the definition in <assert.h>:


# define assert(expr)   (__ASSERT_VOID_CAST (0))

However, there is no reason to define assert here -- the only code
in malloc.c that needs assert is already inside an #if ! MALLOC_DEBUG
conditional at line 2450.


The fix is to delete lines 364-369 in
opal/mca/memory/ptmalloc2/malloc.c and move the #include <assert.h>
to be inside the conditional between lines 2459 and 2461:


2459 #else

#include <assert.h>

2461 #define check_chunk(A,P)  do_check_chunk(A,P)


Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov





Re: [OMPI devel] 1.5rc5 has been posted

2010-08-30 Thread Larry Baker
To follow up on http://www.open-mpi.org/community/lists/devel/2010/08/8417.php:
OpenMPI 1.5rc5 fails in opal/tools/wrappers for PGI 10.3.


The problem appears to be a missing -lpthread in the definition of
most of the *LIBS variables in the OpenMPI 1.5rc5
opal/tools/wrappers/Makefile:


[root@hydra src]# diff openmpi-{1.4.2,1.5rc5}/opal/tools/wrappers/Makefile | grep lutil

< LIBS = -lnsl -lutil  -lpthread
> LIBS = -lnsl  -lutil
< OMPI_WRAPPER_EXTRA_LIBS =   -ldl   -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
> OMPI_WRAPPER_EXTRA_LIBS =   -ldl   -Wl,--export-dynamic -lnsl -lutil -ldl
< OPAL_WRAPPER_EXTRA_LIBS = -ldl   -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
> OPAL_WRAPPER_EXTRA_LIBS = -ldl   -Wl,--export-dynamic -lnsl -lutil -ldl
< ORTE_WRAPPER_EXTRA_LIBS =  -ldl   -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
> ORTE_WRAPPER_EXTRA_LIBS =  -ldl   -Wl,--export-dynamic -lnsl -lutil -ldl
< WRAPPER_EXTRA_LIBS =   -ldl   -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
> WRAPPER_EXTRA_LIBS = -ldl   -Wl,--export-dynamic -lnsl -lutil -ldl
< crs_blcr_LIBS = -lnsl -lutil  -lpthread
> crs_blcr_LIBS = -lnsl  -lutil -lpthread


[root@hydra src]# diff openmpi-{1.4.2,1.5rc5}/opal/tools/wrappers/Makefile | grep LINK

< LINK = $(LIBTOOL) --tag=CC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
> LINK = $(LIBTOOL) $(AM_V_lt) --tag=CC $(AM_LIBTOOLFLAGS) \
< 	$(LINK) $(opal_wrapper_OBJECTS) $(opal_wrapper_LDADD) $(LIBS)
> 	$(AM_V_CCLD)$(LINK) $(opal_wrapper_OBJECTS) $(opal_wrapper_LDADD) $(LIBS)


I don't know anything about automake, so I don't know what code to  
look at that changed between 1.4.2 and 1.5rc5 that defines the *LIBS  
Makefile variables.


Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov





Re: [OMPI devel] 1.5rc5 has been posted

2010-08-30 Thread Larry Baker
OpenMPI 1.5rc5 fails in opal/tools/wrappers for PGI 10.3
(see http://www.open-mpi.org/community/lists/devel/2010/08/8312.php):



Making all in tools/wrappers
make[2]: Entering directory `/usr/local/src/openmpi-1.5rc5/opal/tools/wrappers'

  CC opal_wrapper.o
  CCLD   opal_wrapper
../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_create'
../../../opal/.libs/libopen-pal.so: undefined reference to `assert'
../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_mutex_trylock'
../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_atfork'
../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_join'

make[2]: *** [opal_wrapper] Error 2
make[2]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal/tools/wrappers'

make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal'
make: *** [all-recursive] Error 1
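
For context, a minimal sketch of code that produces the pthread_* references above.  The names worker and run_worker are hypothetical and this is not Open MPI code.  On toolchains where these functions live in a separate libpthread, a program like this links only if -lpthread (or the compiler's own thread option) is on the link line.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical worker: stores a marker value through its argument. */
static void *worker(void *arg)
{
    *(int *)arg = 42;
    return NULL;
}

/* Starts one thread and waits for it.  Returns the value the worker
 * wrote, or -1 if the thread could not be created or joined. */
int run_worker(void)
{
    pthread_t tid;
    int result = 0;

    if (pthread_create(&tid, NULL, worker, &result) != 0)
        return -1;
    if (pthread_join(tid, NULL) != 0)
        return -1;
    return result;
}
```

Both pthread_create and pthread_join become unresolved symbols at link time if the thread library is left off the link line, which is the same kind of failure as in the log above.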


OpenMPI 1.4.2 does not have this problem.  After the make for OpenMPI  
1.4.2, I rm'd opal_wrapper and compared the make commands that are  
issued for 1.4.2:



[root@hydra wrappers]# cd opal/tools/wrappers
[root@hydra wrappers]# ls
CMakeLists.txt     generic_wrapper.1in    Makefile     Makefile.in              opalcc-wrapper-data.txt.in  opalc++-wrapper-data.txt.in  opal_wrapper.1    opal_wrapper.c
generic_wrapper.1  help-opal-wrapper.txt  Makefile.am  opalcc-wrapper-data.txt  opalc++-wrapper-data.txt    opal_wrapper                 opal_wrapper.1in  opal_wrapper.o

[root@hydra wrappers]# rm opal_wrapper
rm: remove regular file `opal_wrapper'? y
[root@hydra wrappers]# make -n
rm -f opal_wrapper
/bin/sh ../../../libtool --tag=CC   --mode=link pgcc -m64  -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET   -export-dynamic   -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil  -lpthread


I see that -lpthread is missing in the 1.5rc5 build:


[root@hydra wrappers]# cd opal/tools/wrappers
[root@hydra wrappers]# ls
CMakeLists.txt       help-opal-wrapper.txt  Makefile.am  opalcc-wrapper-data.txt     opalc++-wrapper-data.txt     opal.pc     opal_wrapper.1in  opal_wrapper.o
generic_wrapper.1in  Makefile               Makefile.in  opalcc-wrapper-data.txt.in  opalc++-wrapper-data.txt.in  opal.pc.in  opal_wrapper.c

[root@hydra wrappers]# make -n
rm -f opal_wrapper
echo "  CCLD  " opal_wrapper;/bin/sh ../../../libtool --silent --tag=CC   --mode=link pgcc -m64  -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET   -export-dynamic  -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl  -lutil

echo Creating opal_wrapper.1 man page...
sed -e 's/#PACKAGE_NAME#/Open MPI/g' \
  -e 's/#PACKAGE_VERSION#/1.5rc5/g' \
  -e 's/#OMPI_DATE#/Aug 17, 2010/g' \
  > opal_wrapper.1 < opal_wrapper.1in


That accounts for all the missing pthread_* references.  However, when
I manually issue the link command and supply -lpthread, assert is
still undefined:


[root@hydra wrappers]# /bin/sh ../../../libtool --silent --tag=CC --mode=link pgcc -m64  -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET   -export-dynamic  -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl  -lutil -lpthread

../../../opal/.libs/libopen-pal.so: undefined reference to `assert'


I get the same result when I cut-and-paste the 1.4.2 link command:

[root@hydra wrappers]# /bin/sh ../../../libtool --tag=CC   --mode=link pgcc -m64  -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET   -export-dynamic   -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil  -lpthread
libtool: link: pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -o .libs/opal_wrapper opal_wrapper.o -Wl,--export-dynamic  ../../../opal/.libs/libopen-pal.so -ldl -lnsl -lutil -lpthread -Wl,-rpath -Wl,/opt/pgi/linux86-64/10.3/openmpi/lib

../../../opal/.libs/libopen-pal.so: undefined reference to `assert'


I re-ran the make without my patches, and the assert() reference  
disappeared:



[root@hydra openmpi-1.5rc5]# tail make.log
  CCLD   opal_wrapper
../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_create'
../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_mutex_trylock'
../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_atfork'
../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_join'

make[2]: *** [opal_wrapper] Error 2
make[2]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal/tools/wrappers'

make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal'
make: *** [all-recursive] Error 1


I don't know why -- -DNDEBUG should have eliminated any declarations
from <assert.h>.


Manually adding -lpthread makes the link error go away:


[root@hydra openmpi-1.5rc5]# cd opal/tools/wrappers
[root@hydra wrappers]# /bin/sh ../../../libtool --silent --tag=CC
--mode=link pgcc -m64  -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET   - 
export-dynamic  -o opal_wrapper opal_wrapper.o ../../../opal/libopen-