On Mar 22, 2006, at 1:47 PM, Michael Kluskens wrote:
I'm trying to find the cause of one or more errors, which might involve
libopal.so.
Built openmpi-1.1a1r9351 on Debian Linux on Opteron with PGI 6.1-3
using "./configure --with-gnu-ld F77=pgf77 FFLAGS=-fastsse FC=pgf90
FCFLAGS=-fastsse"
My program generates the following error which I do not understand:
Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0x4
[0] func:/usr/local/lib/libopal.so.0 [0x2a959927dd]
*** End of error message ***
Is it possible I'm overrunning the Open MPI buffers? My test program
works fine other than the "GPR data corruption" errors (it uses
MPI_SPAWN and was posted previously); the basic MPI difference between
my test program and the real program is the massive amount of data
being distributed via BCAST and SEND/RECV.
It worries me that the call stack only goes that deep - there should
be more functions listed there (if nothing else, the main()
function). Can you run your application in a debugger and try to get
a full stack trace? Typically, segmentation faults point to
overwriting user buffers, but without more detail, it's hard to
pinpoint the issue.
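
If a debugger is hard to attach, one rough way to test the "overwritten
user buffer" theory is to pad each receive buffer with a known byte
pattern and check it after every communication call. The sketch below
is only an illustration in plain C (your code is Fortran, so it would
need adapting); guarded_alloc, guard_intact, GUARD_BYTES and GUARD_FILL
are hypothetical names, not part of Open MPI.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helpers for illustration; none of this is Open MPI API. */
#define GUARD_BYTES 16    /* size of the padding placed after the payload */
#define GUARD_FILL  0xA5  /* byte pattern written into the padding */

/* Allocate nbytes of payload followed by GUARD_BYTES of known fill.
 * Returns NULL if the allocation fails. Release with free(). */
void *guarded_alloc(size_t nbytes)
{
    unsigned char *p = malloc(nbytes + GUARD_BYTES);
    if (p == NULL)
        return NULL;
    memset(p + nbytes, GUARD_FILL, GUARD_BYTES);
    return p;
}

/* Return 1 if the padding after the payload is untouched, 0 if some
 * write ran past the nbytes the caller asked for. */
int guard_intact(const void *buf, size_t nbytes)
{
    const unsigned char *g = (const unsigned char *)buf + nbytes;
    for (size_t i = 0; i < GUARD_BYTES; i++) {
        if (g[i] != GUARD_FILL)
            return 0;
    }
    return 1;
}
```

Calling guard_intact on a buffer right after each broadcast or receive
into it narrows things down: a result of 0 means that call wrote more
data than the buffer was sized for, for example because the count or
datatype on the receiving side does not match what was sent. It only
catches overruns of up to GUARD_BYTES past the end, so it complements a
full stack trace rather than replacing it.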
Thanks,
Brian
--
Brian Barrett
Open MPI developer
http://www.open-mpi.org/