On Mar 22, 2006, at 1:47 PM, Michael Kluskens wrote:

I'm trying to find the cause of one or more errors that might involve libopal.so.

Built openmpi-1.1a1r9351 on Debian Linux on Opteron with PGI 6.1-3
using "./configure --with-gnu-ld F77=pgf77 FFLAGS=-fastsse FC=pgf90
FCFLAGS=-fastsse"

My program generates the following error which I do not understand:

Signal:11 info.si_errno:0(Success) si_code:1(SEGV_MAPERR)
Failing at addr:0x4
[0] func:/usr/local/lib/libopal.so.0 [0x2a959927dd]
*** End of error message ***

Is it possible I'm overrunning the OpenMPI buffers? My test program
works fine other than the "GPR data corruption" errors (it uses
MPI_SPAWN and was posted previously); the basic MPI difference between
my test program and the real program is the massive amount of data
being distributed via BCAST and SEND/RECV.

It worries me that the call stack only goes that deep - there should be more functions listed there (if nothing else, the main() function). Can you run your application in a debugger and try to get a full stack trace? Typically, segmentation faults point to overwriting user buffers, but without more detail, it's hard to pinpoint the issue.


Thanks,

Brian

--
  Brian Barrett
  Open MPI developer
  http://www.open-mpi.org/

