That option might explain why your test process is failing (it segfaulted as 
well), but it obviously wouldn't have anything to do with mpirun.

On Jan 30, 2014, at 9:29 AM, Rolf vandeVaart <rvandeva...@nvidia.com> wrote:

> I just retested with --mca mpi_leave_pinned 0 and that made no difference.  I 
> still see the mpirun crash.
> 
>> -----Original Message-----
>> From: devel [mailto:devel-boun...@open-mpi.org] On Behalf Of George
>> Bosilca
>> Sent: Thursday, January 30, 2014 11:59 AM
>> To: Open MPI Developers
>> Subject: Re: [OMPI devel] Intermittent mpirun crash?
>> 
>> I ran into something similar 2 days ago, with a large software package that
>> abuses MPI_Waitany/MPI_Waitsome (and that was working seamlessly a month
>> ago). I had to find a quick fix. Once I figured out that turning
>> leave_pinned off fixed the problem, I did not investigate any further.
>> 
>> Do you see a similar behavior?
>> 
>> George.
>> 
>> On Jan 30, 2014, at 17:26 , Rolf vandeVaart <rvandeva...@nvidia.com> wrote:
>> 
>>> I am seeing this happening to me very intermittently.  Looks like mpirun is
>>> getting a SEGV.  Is anyone else seeing this?
>>> This is 1.7.4 built yesterday.  (Note that I added some stuff to what
>>> is being printed out so the message is slightly different than 1.7.4
>>> output)
>>> 
>>> mpirun -np 6 -host drossetti-ivy0,drossetti-ivy1,drossetti-ivy2,drossetti-ivy3 --mca btl_openib_warn_default_gid_prefix 0 -- `pwd`/src/MPI_Waitsome_p_c
>>> MPITEST info  (0): Starting:  MPI_Waitsome_p:  Persistent Waitsome using two nodes
>>> MPITEST_results: MPI_Waitsome_p:  Persistent Waitsome using two nodes all tests PASSED (742)
>>> [drossetti-ivy0:10353] *** Process (mpirun)received signal ***
>>> [drossetti-ivy0:10353] Signal: Segmentation fault (11)
>>> [drossetti-ivy0:10353] Signal code: Address not mapped (1)
>>> [drossetti-ivy0:10353] Failing at address: 0x7fd31e5f208d
>>> [drossetti-ivy0:10353] End of signal information - not sleeping
>>> gmake[1]: *** [MPI_Waitsome_p_c] Segmentation fault (core dumped)
>>> gmake[1]: Leaving directory `/geppetto/home/rvandevaart/public/ompi-tests/trunk/intel_tests'
>>> 
>>> (gdb) where
>>> #0  0x00007fd31f620807 in ?? () from /lib64/libgcc_s.so.1
>>> #1  0x00007fd31f6210b9 in _Unwind_Backtrace () from /lib64/libgcc_s.so.1
>>> #2  0x00007fd31fb2893e in backtrace () from /lib64/libc.so.6
>>> #3  0x00007fd320b0d622 in opal_backtrace_buffer (message_out=0x7fd31e5e33a0, len_out=0x7fd31e5e33ac)
>>>     at ../../../../../opal/mca/backtrace/execinfo/backtrace_execinfo.c:57
>>> #4  0x00007fd320b0a794 in show_stackframe (signo=11, info=0x7fd31e5e3930, p=0x7fd31e5e3800)
>>>     at ../../../opal/util/stacktrace.c:354
>>> #5  <signal handler called>
>>> #6  0x00007fd31e5f208d in ?? ()
>>> #7  0x00007fd31e5e46d8 in ?? ()
>>> #8  0x000000000000c2a8 in ?? ()
>>> #9  0x0000000000000000 in ?? ()
>>> 
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> 
