I hit something similar 2 days ago with a large software package that makes 
heavy use of MPI_Waitany/MPI_Waitsome (and that was working seamlessly a month 
ago). I had to find a quick fix. Once I figured out that turning leave_pinned 
off fixes the problem, I did not investigate any further.
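
For anyone who wants to try the same workaround, here is a minimal sketch. It 
assumes the setting in question is the MCA parameter mpi_leave_pinned; the 
application name ./my_app is only a placeholder, not the package mentioned 
above.

```shell
# Disable leave_pinned for every mpirun started from this shell
# (assumes the relevant MCA parameter is mpi_leave_pinned).
export OMPI_MCA_mpi_leave_pinned=0

# Equivalent per-run form; ./my_app is a hypothetical application name:
#   mpirun --mca mpi_leave_pinned 0 -np 6 ./my_app
```

The environment variable form is handy when the mpirun command line is buried 
inside a test harness or Makefile that is awkward to edit.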

Do you see a similar behavior?

  George.

On Jan 30, 2014, at 17:26 , Rolf vandeVaart <rvandeva...@nvidia.com> wrote:

> I am seeing this happening to me very intermittently.  Looks like mpirun is 
> getting a SEGV.  Is anyone else seeing this?
> This is 1.7.4 built yesterday.  (Note that I added some stuff to what is 
> being printed out so the message is slightly different than 1.7.4 output)
> 
> mpirun - -np 6 -host 
> drossetti-ivy0,drossetti-ivy1,drossetti-ivy2,drossetti-ivy3 --mca 
> btl_openib_warn_default_gid_prefix 0  --  `pwd`/src/MPI_Waitsome_p_c
> MPITEST info  (0): Starting:  MPI_Waitsome_p:  Persistent Waitsome using two nodes
> MPITEST_results: MPI_Waitsome_p:  Persistent Waitsome using two nodes all tests PASSED (742)
> [drossetti-ivy0:10353] *** Process (mpirun)received signal ***
> [drossetti-ivy0:10353] Signal: Segmentation fault (11)
> [drossetti-ivy0:10353] Signal code: Address not mapped (1)
> [drossetti-ivy0:10353] Failing at address: 0x7fd31e5f208d
> [drossetti-ivy0:10353] End of signal information - not sleeping
> gmake[1]: *** [MPI_Waitsome_p_c] Segmentation fault (core dumped)
> gmake[1]: Leaving directory 
> `/geppetto/home/rvandevaart/public/ompi-tests/trunk/intel_tests'
> 
> (gdb) where
> #0  0x00007fd31f620807 in ?? () from /lib64/libgcc_s.so.1
> #1  0x00007fd31f6210b9 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
> #2  0x00007fd31fb2893e in backtrace () from /lib64/libc.so.6
> #3  0x00007fd320b0d622 in opal_backtrace_buffer (message_out=0x7fd31e5e33a0, 
> len_out=0x7fd31e5e33ac)
>    at ../../../../../opal/mca/backtrace/execinfo/backtrace_execinfo.c:57
> #4  0x00007fd320b0a794 in show_stackframe (signo=11, info=0x7fd31e5e3930, 
> p=0x7fd31e5e3800) at ../../../opal/util/stacktrace.c:354
> #5  <signal handler called>
> #6  0x00007fd31e5f208d in ?? ()
> #7  0x00007fd31e5e46d8 in ?? ()
> #8  0x000000000000c2a8 in ?? ()
> #9  0x0000000000000000 in ?? ()
> 
> 