[OMPI devel] Fwd: 1.5rc5 has been posted
The same problem (the LIBS = definition is missing -lpthread) occurs in orte/tools/{orte-clean,orte-iof,orte-ps,orted,orterun,orte-top}/Makefile.

Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov

Begin forwarded message:

From: Larry Baker
Date: August 30, 2010 4:48:01 PM PDT
To: Open MPI Developers
Subject: Re: [OMPI devel] 1.5rc5 has been posted

To follow up on http://www.open-mpi.org/community/lists/devel/2010/08/8417.php :

OpenMPI 1.5rc5 fails in opal/tools/wrappers for PGI 10.3. The problem appears to be a missing -lpthread in the definition of most of the *LIBS variables in the OpenMPI 1.5rc5 opal/tools/wrappers/Makefile:

    [root@hydra src]# diff openmpi-{1.4.2,1.5rc5}/opal/tools/wrappers/Makefile | grep lutil
    < LIBS = -lnsl -lutil -lpthread
    > LIBS = -lnsl -lutil
    < OMPI_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
    > OMPI_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -ldl
    < OPAL_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
    > OPAL_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -ldl
    < ORTE_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
    > ORTE_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -ldl
    < WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
    > WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -ldl
    < crs_blcr_LIBS = -lnsl -lutil -lpthread
    > crs_blcr_LIBS = -lnsl -lutil -lpthread

    [root@hydra src]# diff openmpi-{1.4.2,1.5rc5}/opal/tools/wrappers/Makefile | grep LINK
    < LINK = $(LIBTOOL) --tag=CC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
    > LINK = $(LIBTOOL) $(AM_V_lt) --tag=CC $(AM_LIBTOOLFLAGS) \
    <     $(LINK) $(opal_wrapper_OBJECTS) $(opal_wrapper_LDADD) $(LIBS)
    >     $(AM_V_CCLD)$(LINK) $(opal_wrapper_OBJECTS) $(opal_wrapper_LDADD) $(LIBS)

I don't know anything about automake, so I don't know which code that changed between 1.4.2 and 1.5rc5 defines the *LIBS Makefile variables.

Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov

On Aug 17, 2010, at 2:18 PM, Jeff Squyres wrote:

> We still have one known possible regression:
>
>     https://svn.open-mpi.org/trac/ompi/ticket/2530
>
> But we posted rc5 anyway (there's a bunch of stuff that has been pending for a while that is now in). Please test!
>
>     http://www.open-mpi.org/software/ompi/v1.5/
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/

___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel

Re: [OMPI devel] 1.5rc5 has been posted
The fix I posted in http://www.open-mpi.org/community/lists/devel/2010/08/8311.php for the "Redefinition of symbol assert" warning causes a link failure of opal_wrapper. This is because there are assert() calls in opal/mca/memory/ptmalloc2/arena.c, which is included in opal/mca/memory/ptmalloc2/malloc.c before the conditional on MALLOC_DEBUG, which is where I moved #include <assert.h>. arena.c does not contain its own #include <assert.h>.

I changed the patch to opal/mca/memory/ptmalloc2/malloc.c to define assert where it was before, but in such a way that it always uses the system header file to define the assert macro. In opal/mca/memory/ptmalloc2/malloc.c, change lines 364-369 from:

    #if MALLOC_DEBUG
    #include <assert.h>
    #else
    #undef assert
    #define assert(x) ((void)0)
    #endif

to:

    #if MALLOC_DEBUG && defined( NDEBUG )
    #error -DMALLOC_DEBUG is inconsistent with -DNDEBUG
    #endif
    #include <assert.h>

The conditional tests the value of MALLOC_DEBUG but uses defined( NDEBUG ) because the code that depends on MALLOC_DEBUG uses #if MALLOC_DEBUG conditionals, while <assert.h> uses #ifdef NDEBUG to define the assert macro. I used the same semantics to detect the inconsistency between MALLOC_DEBUG and NDEBUG.

Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov

On Aug 23, 2010, at 5:29 PM, Larry Baker wrote:

> The PGI C compiler complains (issues a warning) for the redefinition of the assert macro in opal/mca/memory/ptmalloc2/malloc.c:
>
>     Making all in mca/memory/ptmalloc2
>     make[2]: Entering directory `/home/baker/openmpi-1.5rc5/opal/mca/memory/ptmalloc2'
>       CC     opal_ptmalloc2_component.lo
>       CC     opal_ptmalloc2_munmap.lo
>       CC     malloc.lo
>     PGC-W-0221-Redefinition of symbol assert (/usr/include/assert.h: 51)
>     PGC-W-0258-Argument 1 in macro assert is not identical to previous definition (/usr/include/assert.h: 51)
>
> FYI.
> assert.h is an unusual include file -- it does not use an ifdef guard macro in the usual way. Instead, it undefines assert if the guard macro is defined (NOT if assert is defined, which is the root cause of this warning), defines the guard macro, then (re)defines assert() based on the current value of NDEBUG.
>
> opal/mca/memory/ptmalloc2/malloc.c did not change from OpenMPI 1.4.2. malloc.c includes opal/mca/memory/ptmalloc2/hooks.c, which did change in OpenMPI 1.5rc5. hooks.c indirectly includes <assert.h> through opal/mca/base/mca_base_param.h. This is where the warning occurs.
>
> malloc.c defines its own assert macro in lines 364-369:
>
>     364 #if MALLOC_DEBUG
>     365 #include <assert.h>
>     366 #else
>     367 #undef assert
>     368 #define assert(x) ((void)0)
>     369 #endif
>
> The warning occurs because the definition of assert in line 368 is not the same as the definition in <assert.h>:
>
>     # define assert(expr) (__ASSERT_VOID_CAST (0))
>
> However, there is no reason to define assert here -- the only code in malloc.c that needs assert is already inside an #if ! MALLOC_DEBUG conditional at line 2450.
>
> The fix is to delete lines 364-369 in opal/mca/memory/ptmalloc2/malloc.c and move the #include <assert.h> to be inside the conditional between lines 2459 and 2461:
>
>     2459 #else
>          #include <assert.h>
>     2461 #define check_chunk(A,P) do_check_chunk(A,P)
>
> Larry Baker
> US Geological Survey
> 650-329-5608
> ba...@usgs.gov
>
> On Aug 17, 2010, at 2:18 PM, Jeff Squyres wrote:
>
>> We still have one known possible regression:
>>
>>     https://svn.open-mpi.org/trac/ompi/ticket/2530
>>
>> But we posted rc5 anyway (there's a bunch of stuff that has been pending for a while that is now in). Please test!
>>
>>     http://www.open-mpi.org/software/ompi/v1.5/
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] 1.5rc5 has been posted
To follow up on http://www.open-mpi.org/community/lists/devel/2010/08/8417.php :

OpenMPI 1.5rc5 fails in opal/tools/wrappers for PGI 10.3. The problem appears to be a missing -lpthread in the definition of most of the *LIBS variables in the OpenMPI 1.5rc5 opal/tools/wrappers/Makefile:

    [root@hydra src]# diff openmpi-{1.4.2,1.5rc5}/opal/tools/wrappers/Makefile | grep lutil
    < LIBS = -lnsl -lutil -lpthread
    > LIBS = -lnsl -lutil
    < OMPI_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
    > OMPI_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -ldl
    < OPAL_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
    > OPAL_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -ldl
    < ORTE_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
    > ORTE_WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -ldl
    < WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -lpthread -ldl
    > WRAPPER_EXTRA_LIBS = -ldl -Wl,--export-dynamic -lnsl -lutil -ldl
    < crs_blcr_LIBS = -lnsl -lutil -lpthread
    > crs_blcr_LIBS = -lnsl -lutil -lpthread

    [root@hydra src]# diff openmpi-{1.4.2,1.5rc5}/opal/tools/wrappers/Makefile | grep LINK
    < LINK = $(LIBTOOL) --tag=CC $(AM_LIBTOOLFLAGS) $(LIBTOOLFLAGS) \
    > LINK = $(LIBTOOL) $(AM_V_lt) --tag=CC $(AM_LIBTOOLFLAGS) \
    <     $(LINK) $(opal_wrapper_OBJECTS) $(opal_wrapper_LDADD) $(LIBS)
    >     $(AM_V_CCLD)$(LINK) $(opal_wrapper_OBJECTS) $(opal_wrapper_LDADD) $(LIBS)

I don't know anything about automake, so I don't know which code that changed between 1.4.2 and 1.5rc5 defines the *LIBS Makefile variables.

Larry Baker
US Geological Survey
650-329-5608
ba...@usgs.gov

On Aug 17, 2010, at 2:18 PM, Jeff Squyres wrote:

> We still have one known possible regression:
>
>     https://svn.open-mpi.org/trac/ompi/ticket/2530
>
> But we posted rc5 anyway (there's a bunch of stuff that has been pending for a while that is now in). Please test!
>
>     http://www.open-mpi.org/software/ompi/v1.5/
>
> --
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/

Re: [OMPI devel] 1.5rc5 has been posted
OpenMPI 1.5rc5 fails in opal/tools/wrappers for PGI 10.3 (see http://www.open-mpi.org/community/lists/devel/2010/08/8312.php):

    Making all in tools/wrappers
    make[2]: Entering directory `/usr/local/src/openmpi-1.5rc5/opal/tools/wrappers'
      CC     opal_wrapper.o
      CCLD   opal_wrapper
    ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_create'
    ../../../opal/.libs/libopen-pal.so: undefined reference to `assert'
    ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_mutex_trylock'
    ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_atfork'
    ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_join'
    make[2]: *** [opal_wrapper] Error 2
    make[2]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal/tools/wrappers'
    make[1]: *** [all-recursive] Error 1
    make[1]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal'
    make: *** [all-recursive] Error 1

OpenMPI 1.4.2 does not have this problem. After the make for OpenMPI 1.4.2, I rm'd opal_wrapper and looked at the make commands that are issued for 1.4.2:

    [root@hydra wrappers]# cd opal/tools/wrappers
    [root@hydra wrappers]# ls
    CMakeLists.txt     generic_wrapper.1in    Makefile     Makefile.in              opalcc-wrapper-data.txt.in  opalc++-wrapper-data.txt.in  opal_wrapper.1    opal_wrapper.c
    generic_wrapper.1  help-opal-wrapper.txt  Makefile.am  opalcc-wrapper-data.txt  opalc++-wrapper-data.txt    opal_wrapper                 opal_wrapper.1in  opal_wrapper.o
    [root@hydra wrappers]# rm opal_wrapper
    rm: remove regular file `opal_wrapper'? y
    [root@hydra wrappers]# make -n
    rm -f opal_wrapper
    /bin/sh ../../../libtool --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil -lpthread

I see that -lpthread is missing in the 1.5rc5 build:

    [root@hydra wrappers]# cd opal/tools/wrappers
    [root@hydra wrappers]# ls
    CMakeLists.txt       help-opal-wrapper.txt  Makefile.am  opalcc-wrapper-data.txt     opalc++-wrapper-data.txt     opal.pc     opal_wrapper.1in  opal_wrapper.o
    generic_wrapper.1in  Makefile               Makefile.in  opalcc-wrapper-data.txt.in  opalc++-wrapper-data.txt.in  opal.pc.in  opal_wrapper.c
    [root@hydra wrappers]# make -n
    rm -f opal_wrapper
    echo "  CCLD  " opal_wrapper;/bin/sh ../../../libtool --silent --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil
    echo Creating opal_wrapper.1 man page...
    sed -e 's/#PACKAGE_NAME#/Open MPI/g' \
        -e 's/#PACKAGE_VERSION#/1.5rc5/g' \
        -e 's/#OMPI_DATE#/Aug 17, 2010/g' \
        > opal_wrapper.1 < opal_wrapper.1in

That accounts for all the missing pthread_* references.

However, when I manually issue the link command and supply -lpthread, assert is still undefined:

    [root@hydra wrappers]# /bin/sh ../../../libtool --silent --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil -lpthread
    ../../../opal/.libs/libopen-pal.so: undefined reference to `assert'

I get the same result when I cut-and-paste the 1.4.2 link command:

    [root@hydra wrappers]# /bin/sh ../../../libtool --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-pal.la -lnsl -lutil -lpthread
    libtool: link: pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -o .libs/opal_wrapper opal_wrapper.o -Wl,--export-dynamic ../../../opal/.libs/libopen-pal.so -ldl -lnsl -lutil -lpthread -Wl,-rpath -Wl,/opt/pgi/linux86-64/10.3/openmpi/lib
    ../../../opal/.libs/libopen-pal.so: undefined reference to `assert'

I re-ran the make without my patches, and the assert() reference disappeared:

    [root@hydra openmpi-1.5rc5]# tail make.log
      CCLD   opal_wrapper
    ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_create'
    ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_mutex_trylock'
    ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_atfork'
    ../../../opal/.libs/libopen-pal.so: undefined reference to `pthread_join'
    make[2]: *** [opal_wrapper] Error 2
    make[2]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal/tools/wrappers'
    make[1]: *** [all-recursive] Error 1
    make[1]: Leaving directory `/usr/local/src/openmpi-1.5rc5/opal'
    make: *** [all-recursive] Error 1

I don't know why -- -DNDEBUG should have eliminated any declarations from <assert.h>.

Manually adding -lpthread makes the link error go away:

    [root@hydra openmpi-1.5rc5]# cd opal/tools/wrappers
    [root@hydra wrappers]# /bin/sh ../../../libtool --silent --tag=CC --mode=link pgcc -m64 -DNDEBUG -g -O3 -tp amd64 -DNO_PGI_OFFSET -export-dynamic -o opal_wrapper opal_wrapper.o ../../../opal/libopen-