I've reproduced Siegmar's issue when I have the threads options on, but it does not show up when they are off. It is actually segv'ing in mca_btl_sm_component_close on an access at address 0 (obviously not a good thing). I am going to compile things with debug on and see if I can track this further, but I think I am smelling the smoke of a bug...

Siegmar, I was able to get stuff working with 32 bits when I removed --with-threads=posix and replaced --enable-mpi-threads with --disable-mpi-threads in your configure line. I think your previous issue with things not building must have been leftover cruft.
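
A sketch of that change, applied to the 32-bit configure line from Siegmar's original mail (quoted further down). Every option is copied from that mail except the thread flags. The DRY_RUN switch and the configure_cmd wrapper are made-up names for illustration, so the command can be printed and checked without a source tree; this is not necessarily the exact line that was run here.

```shell
#!/bin/sh
# Sketch only: Siegmar's 32-bit Sun Studio configure line with
# --with-threads=posix removed and --enable-mpi-threads replaced by
# --disable-mpi-threads. DRY_RUN and configure_cmd are illustrative
# names, not part of Open MPI.
DRY_RUN=${DRY_RUN:-1}    # 1 = only print the command, anything else = run it

configure_cmd() {
    set -- ../openmpi-1.5/configure \
        --prefix=/usr/local/openmpi-1.5_32_cc \
        CFLAGS=-m32 CXXFLAGS=-m32 FFLAGS=-m32 FCFLAGS=-m32 \
        CXXLDFLAGS=-m32 CPPFLAGS= LDFLAGS=-m32 \
        C_INCL_PATH= C_INCLUDE_PATH= CPLUS_INCLUDE_PATH= \
        OBJC_INCLUDE_PATH= MPICHHOME= \
        CC=cc CXX=CC F77=f95 FC=f95 \
        --without-udapl --disable-mpi-threads \
        --enable-shared --enable-heterogeneous --enable-cxx-exceptions
    if [ "$DRY_RUN" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}

configure_cmd
```

Running it in a fresh, empty build directory (with DRY_RUN=0) should also sidestep anything left over from an earlier configure run.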

Note, my compiler hang disappeared on me. So maybe there was an environmental issue on my side.

--td


On 10/21/2010 06:47 AM, Terry Dontje wrote:
On 10/21/2010 06:43 AM, Jeff Squyres (jsquyres) wrote:
Also, I'm not entirely sure what all the commands are that you are showing. Some of those warnings (e.g., in config.log) are normal.

The 32 bit test failure is not, though. Terry - any idea there?

The test program is failing in MPI_Finalize which seems odd and the code itself looks pretty dead simple. I am rebuilding a v1.5 workspace without the different thread options. Once that is done I'll try the test program.

BTW, when I tried to build with the original options Siegmar used the compiles looked like they hung, doh.

--td


Sent from my PDA. No type good.

On Oct 21, 2010, at 6:25 AM, "Terry Dontje" <terry.don...@oracle.com> wrote:

I wonder if the error below is due to crap being left over in the source tree. Can you do a "make clean"? Note that on a new checkout from the v1.5 svn branch I was able to build 64 bit with the following configure line:

../configure FC=f95 F77=f77 CC=cc CXX=CC --without-openib --without-udapl -enable-heterogeneous --enable-cxx-exceptions --enable-shared --enable-orterun-prefix-by-default --with-sge --disable-mpi-threads --enable-mpi-f90 --with-mpi-f90-size=small --disable-progress-threads --prefix=/workspace/tdd/ctnext/v15 CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64

--td
On 10/21/2010 05:38 AM, Siegmar Gross wrote:
Hi,

thank you very much for your reply.

   Can you remove the -with-threads and -enable-mpi-threads options from
the configure line and see if that helps your 32 bit problem any?

I cannot build the package when I remove these options.

linpc4 openmpi-1.5-Linux.x86_64.32_cc 189 head -8 config.log
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.

It was created by Open MPI configure 1.5, which was
generated by GNU Autoconf 2.65.  Invocation command line was

   $ ../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_cc
   CFLAGS=-m32 CXXFLAGS=-m32 FFLAGS=-m32 FCFLAGS=-m32 CXXLDFLAGS=-m32
   CPPFLAGS= LDFLAGS=-m32 C_INCL_PATH= C_INCLUDE_PATH= CPLUS_INCLUDE_PATH=
   OBJC_INCLUDE_PATH= MPICHHOME= CC=cc CXX=CC F77=f95 FC=f95
   --without-udapl --enable-shared --enable-heterogeneous
   --enable-cxx-exceptions


linpc4 openmpi-1.5-Linux.x86_64.32_cc 190 head -8 ../*.old/config.log
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.

It was created by Open MPI configure 1.5, which was
generated by GNU Autoconf 2.65.  Invocation command line was

   $ ../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_cc
   CFLAGS=-m32 CXXFLAGS=-m32 FFLAGS=-m32 FCFLAGS=-m32 CXXLDFLAGS=-m32
   CPPFLAGS= LDFLAGS=-m32 C_INCL_PATH= C_INCLUDE_PATH= CPLUS_INCLUDE_PATH=
   OBJC_INCLUDE_PATH= MPICHHOME= CC=cc CXX=CC F77=f95 FC=f95
   --without-udapl --with-threads=posix --enable-mpi-threads
   --enable-shared --enable-heterogeneous --enable-cxx-exceptions
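
The two invocation lines above differ only in the thread options, which is easy to miss by eye. A small sketch for comparing them mechanically, assuming the config.log header keeps the layout shown in the transcript (a "$ .../configure ..." line, possibly wrapped, followed by a blank line) and that no argument contains spaces. The helper names config_args and diff_config_args are made up for illustration.

```shell
#!/bin/sh
# Sketch only: config_args and diff_config_args are illustrative helper
# names, not tools shipped with Open MPI or Autoconf.

# Print the arguments of the "$ .../configure ..." invocation recorded
# near the top of a config.log, one per line, sorted.
config_args() {
    awk '
        /^ *\$ / { grab = 1; sub(/^ *\$ /, "") }  # invocation starts with "$ "
        grab && NF == 0 { exit }                    # a blank line ends it
        grab { for (i = 1; i <= NF; i++) print $i }
    ' "$1" | sort
}

# Show which arguments differ between two config.log files.
# Returns the exit status of diff (1 when the argument lists differ).
diff_config_args() {
    old_args=$(mktemp)
    new_args=$(mktemp)
    config_args "$1" > "$old_args"
    config_args "$2" > "$new_args"
    diff "$old_args" "$new_args"
    status=$?
    rm -f "$old_args" "$new_args"
    return "$status"
}

# Example (paths as in the transcript):
#   diff_config_args ../*.old/config.log config.log
```

Lines prefixed with "<" appear only in the first file, lines prefixed with ">" only in the second.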


linpc4 openmpi-1.5-Linux.x86_64.32_cc 194 dir log.* ../*.old/log.*
... 132406 Oct 19 13:01
   ../openmpi-1.5-Linux.x86_64.32_cc.old/log.configure.Linux.x86_64.32_cc
... 195587 Oct 19 16:09
   ../openmpi-1.5-Linux.x86_64.32_cc.old/log.make-check.Linux.x86_64.32_cc
... 356672 Oct 19 16:07
   ../openmpi-1.5-Linux.x86_64.32_cc.old/log.make-install.Linux.x86_64.32_cc
... 280596 Oct 19 13:42
   ../openmpi-1.5-Linux.x86_64.32_cc.old/log.make.Linux.x86_64.32_cc
... 132265 Oct 21 10:51 log.configure.Linux.x86_64.32_cc
...  10890 Oct 21 10:51 log.make.Linux.x86_64.32_cc


linpc4 openmpi-1.5-Linux.x86_64.32_cc 195 grep -i warning:
   log.configure.Linux.x86_64.32_cc
configure: WARNING: *** Did not find corresponding C type
configure: WARNING: MPI_REAL16 and MPI_COMPLEX32 support have been disabled
configure: WARNING: *** Corresponding Fortran 77 type (REAL*16) not supported
configure: WARNING: *** Skipping Fortran 90 type (REAL*16)
configure: WARNING: valgrind.h not found
configure: WARNING: Unknown architecture ... proceeding anyway
configure: WARNING: File locks may not work with NFS.  See the Installation and
configure: WARNING:  -xldscope=hidden has been added to CFLAGS

linpc4 openmpi-1.5-Linux.x86_64.32_cc 196 grep -i warning:
   ../*.old/log.configure.Linux.x86_64.32_cc
configure: WARNING: *** Did not find corresponding C type
configure: WARNING: MPI_REAL16 and MPI_COMPLEX32 support have been disabled
configure: WARNING: *** Corresponding Fortran 77 type (REAL*16) not supported
configure: WARNING: *** Skipping Fortran 90 type (REAL*16)
configure: WARNING: valgrind.h not found
configure: WARNING: Unknown architecture ... proceeding anyway
configure: WARNING: File locks may not work with NFS.  See the Installation and
configure: WARNING:  -xldscope=hidden has been added to CFLAGS

linpc4 openmpi-1.5-Linux.x86_64.32_cc 197 grep -i error:
   log.configure.Linux.x86_64.32_cc
configure: error: no libz found; check path for ZLIB package first...
configure: error: no vtf3.h found; check path for VTF3 package first...
configure: error: no BPatch.h found; check path for Dyninst package first...
configure: error: no f2c.h found; check path for CLAPACK package first...
configure: error: MPI Correctness Checking support cannot be built inside Open
MPI
configure: error: no papi.h found; check path for PAPI package first...
configure: error: no libcpc.h found; check path for CPC package first...
configure: error: no ctool/ctool.h found; check path for CTool package first...

linpc4 openmpi-1.5-Linux.x86_64.32_cc 198 grep -i error:
   ../*.old/log.configure.Linux.x86_64.32_cc
configure: error: no libz found; check path for ZLIB package first...
configure: error: no vtf3.h found; check path for VTF3 package first...
configure: error: no BPatch.h found; check path for Dyninst package first...
configure: error: no f2c.h found; check path for CLAPACK package first...
configure: error: MPI Correctness Checking support cannot be built inside Open
MPI
configure: error: no papi.h found; check path for PAPI package first...
configure: error: no libcpc.h found; check path for CPC package first...
configure: error: no ctool/ctool.h found; check path for CTool package first...
linpc4 openmpi-1.5-Linux.x86_64.32_cc 199


linpc4 openmpi-1.5-Linux.x86_64.32_cc 199 grep -i warning:
   log.make.Linux.x86_64.32_cc
linpc4 openmpi-1.5-Linux.x86_64.32_cc 200 grep -i warning:
   ../*.old/log.make.Linux.x86_64.32_cc
".../opal/mca/crs/none/crs_none_module.c", line 136: warning:
   statement not reached
".../orte/mca/errmgr/errmgr.h", line 135: warning: attribute
   "noreturn" may not be applied to variable, ignored
... (many more warnings for errmgr.h, line 135)
".../orte/tools/orte-ps/orte-ps.c", line 288: warning: initializer
   does not fit or is out of range: 0xfffffffe
".../orte/tools/orte-ps/orte-ps.c", line 289: warning: initializer
   does not fit or is out of range: 0xfffffffe
...
f95: Warning: Option -rpath passed to ld, if ld is invoked,
   ignored otherwise
...
".../ompi/mca/io/romio/romio/adio/common/ad_fstype.c", line 397:
   warning: statement not reached
".../ompi/mca/osc/rdma/osc_rdma_data_move.c", line 296:
   warning: statement not reached
".../ompi/mca/osc/rdma/osc_rdma_data_move.c", line 678:
   warning: statement not reached
".../ompi/mca/pml/cm/pml_cm_cancel.c", line 65: warning:
   statement not reached
...
CC: Warning: Specify a supported level of optimization when using
   -xopenmp, -xopenmp will not set an optimization level in a
   future release. Optimization level changed to 3 to support -xopenmp.
... (a lot more of these warnings)
cc: Warning: Optimizer level changed from 0 to 3 to support
   parallelized code.
...

The above are all the distinct warnings, unless I missed one.


linpc4 openmpi-1.5-Linux.x86_64.32_cc 201 grep -i error:
   log.make.Linux.x86_64.32_cc
cc1: error: unrecognized command line option "-fno-directives-only"
linpc4 openmpi-1.5-Linux.x86_64.32_cc 202 grep -i error:
   ../*.old/log.make.Linux.x86_64.32_cc
linpc4 openmpi-1.5-Linux.x86_64.32_cc 203


linpc4 openmpi-1.5-Linux.x86_64.32_cc 205 tail -15
   log.make.Linux.x86_64.32_cc
make[3]: Leaving directory `/.../opal/libltdl'
make[2]: Leaving directory `/.../opal/libltdl'
Making all in asm
make[2]: Entering directory `/.../opal/asm'
   CC     asm.lo
rm -f atomic-asm.S
ln -s ".../opal/asm/generated/atomic-ia32-linux-nongas.s" atomic-asm.S
   CPPAS  atomic-asm.lo
cc1: error: unrecognized command line option "-fno-directives-only"
cc: cpp failed for atomic-asm.S
make[2]: *** [atomic-asm.lo] Error 1
make[2]: Leaving directory `/.../opal/asm'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/.../opal'
make: *** [all-recursive] Error 1


It is the same error that I reported yesterday for "Linux x86_64,
Oracle/Sun C, 64-bit".

Please let me know if you need anything else.


Kind regards

Siegmar


--td
On 10/20/2010 09:38 AM, Siegmar Gross wrote:
Hi,

I have built Open MPI 1.5 on Linux x86_64 with the Oracle/Sun Studio C
compiler. Unfortunately "mpiexec" breaks when I run a small program.

linpc4 small_prog 106 cc -V
cc: Sun C 5.10 Linux_i386 2009/06/03
usage: cc [ options] files.  Use 'cc -flags' for details

linpc4 small_prog 107 uname -a
Linux linpc4 2.6.27.45-0.1-default #1 SMP 2010-02-22 16:49:47 +0100 x86_64
x86_64 x86_64 GNU/Linux

linpc4 small_prog 108 mpicc -show
cc -I/usr/local/openmpi-1.5_32_cc/include -mt
    -L/usr/local/openmpi-1.5_32_cc/lib -lmpi -ldl -Wl,--export-dynamic -lnsl
    -lutil -lm -ldl

linpc4 small_prog 109 mpicc -m32 rank_size.c
linpc4 small_prog 110 mpiexec -np 2 a.out
I'm process 0 of 2 available processes running on linpc4.
MPI standard 2.1 is supported.
I'm process 1 of 2 available processes running on linpc4.
MPI standard 2.1 is supported.
[linpc4:11564] *** Process received signal ***
[linpc4:11564] Signal: Segmentation fault (11)
[linpc4:11564] Signal code:  (128)
[linpc4:11564] Failing at address: (nil)
[linpc4:11565] *** Process received signal ***
[linpc4:11565] Signal: Segmentation fault (11)
[linpc4:11565] Signal code:  (128)
[linpc4:11565] Failing at address: (nil)
[linpc4:11564] [ 0] [0xffffe410]
[linpc4:11564] [ 1] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (mca_base_components_close+0x8c) [0xf774ccd0]
[linpc4:11564] [ 2] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (mca_btl_base_close+0xc5) [0xf76bd255]
[linpc4:11564] [ 3] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (mca_bml_base_close+0x32) [0xf76bd112]
[linpc4:11564] [ 4] /usr/local/openmpi-1.5_32_cc/lib/openmpi/
    mca_pml_ob1.so [0xf73d971f]
[linpc4:11564] [ 5] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (mca_base_components_close+0x8c) [0xf774ccd0]
[linpc4:11564] [ 6] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (mca_pml_base_close+0xc1) [0xf76e4385]
[linpc4:11564] [ 7] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    [0xf76889e6]
[linpc4:11564] [ 8] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (PMPI_Finalize+0x3c) [0xf769dd4c]
[linpc4:11564] [ 9] a.out(main+0x98) [0x8048a18]
[linpc4:11564] [10] /lib/libc.so.6(__libc_start_main+0xe5) [0xf749c705]
[linpc4:11564] [11] a.out(_start+0x41) [0x8048861]
[linpc4:11564] *** End of error message ***
[linpc4:11565] [ 0] [0xffffe410]
[linpc4:11565] [ 1] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (mca_base_components_close+0x8c) [0xf76bccd0]
[linpc4:11565] [ 2] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (mca_btl_base_close+0xc5) [0xf762d255]
[linpc4:11565] [ 3] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (mca_bml_base_close+0x32) [0xf762d112]
[linpc4:11565] [ 4] /usr/local/openmpi-1.5_32_cc/lib/openmpi/
    mca_pml_ob1.so [0xf734971f]
[linpc4:11565] [ 5] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (mca_base_components_close+0x8c) [0xf76bccd0]
[linpc4:11565] [ 6] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (mca_pml_base_close+0xc1) [0xf7654385]
[linpc4:11565] [ 7] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    [0xf75f89e6]
[linpc4:11565] [ 8] /usr/local/openmpi-1.5_32_cc/lib/libmpi.so.1
    (PMPI_Finalize+0x3c) [0xf760dd4c]
[linpc4:11565] [ 9] a.out(main+0x98) [0x8048a18]
[linpc4:11565] [10] /lib/libc.so.6(__libc_start_main+0xe5) [0xf740c705]
[linpc4:11565] [11] a.out(_start+0x41) [0x8048861]
[linpc4:11565] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 0 with PID 11564 on node linpc4 exited
    on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
2 total processes killed (some possibly by mpiexec during cleanup)
linpc4 small_prog 111


"make check" shows that one test failed.

linpc4 openmpi-1.5-Linux.x86_64.32_cc 114 grep FAIL
    log.make-check.Linux.x86_64.32_cc
FAIL: opal_path_nfs
linpc4 openmpi-1.5-Linux.x86_64.32_cc 115 grep PASS
    log.make-check.Linux.x86_64.32_cc
PASS: predefined_gap_test
PASS: dlopen_test
PASS: atomic_barrier
PASS: atomic_barrier_noinline
PASS: atomic_spinlock
PASS: atomic_spinlock_noinline
PASS: atomic_math
PASS: atomic_math_noinline
PASS: atomic_cmpset
PASS: atomic_cmpset_noinline
decode [PASSED]
PASS: opal_datatype_test
PASS: checksum
PASS: position
decode [PASSED]
PASS: ddt_test
decode [PASSED]
PASS: ddt_raw
linpc4 openmpi-1.5-Linux.x86_64.32_cc 116

I used the following command to build the package.

../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_cc \
    CFLAGS="-m32" CXXFLAGS="-m32" FFLAGS="-m32" FCFLAGS="-m32" \
    CXXLDFLAGS="-m32" CPPFLAGS="" \
    LDFLAGS="-m32" \
    C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
    OBJC_INCLUDE_PATH="" MPICHHOME="" \
    CC="cc" CXX="CC" F77="f95" FC="f95" \
    --without-udapl --with-threads=posix --enable-mpi-threads \
    --enable-shared --enable-heterogeneous --enable-cxx-exceptions \
    |&   tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.32_cc

I have also built the package with gcc-4.2.0 and it seems to work
although the nfs-test failed as well. Therefore I'm not sure if
the failing test is responsible for the failure with the cc-version.

../openmpi-1.5/configure --prefix=/usr/local/openmpi-1.5_32_gcc \
    CFLAGS="-m32" CXXFLAGS="-m32" FFLAGS="-m32" FCFLAGS="-m32" \
    CXXLDFLAGS="-m32" CPPFLAGS="" \
    LDFLAGS="-m32" \
    C_INCL_PATH="" C_INCLUDE_PATH="" CPLUS_INCLUDE_PATH="" \
    OBJC_INCLUDE_PATH="" MPIHOME="" \
    CC="gcc" CPP="cpp" CXX="g++" CXXCPP="cpp" F77="gfortran" \
    --without-udapl --with-threads=posix --enable-mpi-threads \
    --enable-shared --enable-heterogeneous --enable-cxx-exceptions \
    |&   tee log.configure.$SYSTEM_ENV.$MACHINE_ENV.32_gcc

linpc4 small_prog 107 gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.2.0/configure --prefix=/usr/local/gcc-4.2.0
    --enable-languages=c,c++,java,fortran,objc --enable-java-gc=boehm
    --enable-nls --enable-libgcj --enable-threads=posix
Thread model: posix
gcc version 4.2.0

linpc4 small_prog 109 mpicc -show
gcc -I/usr/local/openmpi-1.5_32_gcc/include -fexceptions -pthread
    -L/usr/local/openmpi-1.5_32_gcc/lib -lmpi -ldl -Wl,--export-dynamic
    -lnsl -lutil -lm -ldl

linpc4 small_prog 110 mpicc -m32 rank_size.c
linpc4 small_prog 111 mpiexec -np 2 a.out
I'm process 0 of 2 available processes running on linpc4.
MPI standard 2.1 is supported.
I'm process 1 of 2 available processes running on linpc4.
MPI standard 2.1 is supported.

linpc4 small_prog 112 grep FAIL /.../log.make-check.Linux.x86_64.32_gcc
FAIL: opal_path_nfs
linpc4 small_prog 113 grep PASS /.../log.make-check.Linux.x86_64.32_gcc
PASS: predefined_gap_test
PASS: dlopen_test
PASS: atomic_barrier
PASS: atomic_barrier_noinline
PASS: atomic_spinlock
PASS: atomic_spinlock_noinline
PASS: atomic_math
PASS: atomic_math_noinline
PASS: atomic_cmpset
PASS: atomic_cmpset_noinline
decode [PASSED]
PASS: opal_datatype_test
PASS: checksum
PASS: position
decode [PASSED]
PASS: ddt_test
decode [NOT PASSED]
PASS: ddt_raw
linpc4 small_prog 114
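
Note that a plain "grep PASS" also matches lines such as "decode [NOT PASSED]", which appears once in the gcc log above but not in the cc log. A small sketch for summarizing such a log, assuming it keeps the "PASS: name" / "FAIL: name" line format shown in the transcripts. The helper name summarize_check is made up for illustration.

```shell
#!/bin/sh
# Sketch only: summarize_check is an illustrative helper name.
# Count "PASS:" / "FAIL:" result lines in a make-check log and list,
# with line numbers, any "NOT PASSED" lines printed by the test
# programs themselves.
summarize_check() {
    log=$1
    printf 'passed: %s\n' "$(grep -c '^PASS:' "$log")"
    printf 'failed: %s\n' "$(grep -c '^FAIL:' "$log")"
    grep -n 'NOT PASSED' "$log" || true   # no match is not an error here
}

# Example:
#   summarize_check log.make-check.Linux.x86_64.32_gcc
```

Anchoring the patterns at the start of the line keeps the "decode [...]" lines out of the pass count.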


I used the following small test program.

#include<stdio.h>
#include<stdlib.h>
#include "mpi.h"

int main (int argc, char *argv[])
{
    int  ntasks,                        /* number of parallel tasks     */
         mytid,                         /* my task id                   */
         version, subversion,           /* version of MPI standard      */
         namelen;                       /* length of processor name     */
    char processor_name[MPI_MAX_PROCESSOR_NAME];

    MPI_Init (&argc,&argv);
    MPI_Comm_rank (MPI_COMM_WORLD,&mytid);
    MPI_Comm_size (MPI_COMM_WORLD,&ntasks);
    MPI_Get_processor_name (processor_name,&namelen);
    printf ("I'm process %d of %d available processes running on %s.\n",
          mytid, ntasks, processor_name);
    MPI_Get_version (&version,&subversion);
    printf ("MPI standard %d.%d is supported.\n", version, subversion);
    MPI_Finalize ();
    return EXIT_SUCCESS;
}


Thank you very much in advance for any help in solving the problem
with the Oracle/Sun compiler.


Best regards

Siegmar

_______________________________________________
users mailing list
us...@open-mpi.org  <mailto:us...@open-mpi.org>
http://www.open-mpi.org/mailman/listinfo.cgi/users
--
Oracle
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle * - Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com <mailto:terry.don...@oracle.com>