Both 2.1.0rc2 and 2.0.2 appear to crash in about 1 run out of every 5.
Because the failure is intermittent, I did not notice it in 2.0.x.
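
Since the crash is intermittent, a small loop helps put a number on the failure rate. What follows is only a minimal sketch: the helper name count_failures is hypothetical, and it assumes mpirun exits with a non-zero status when a rank dies on a signal.

```shell
#!/bin/sh
# count_failures N cmd [args...]
# Run cmd N times and print "failures/N". A run counts as a failure
# when the command exits with a non-zero status.
# (Hypothetical helper, not part of Open MPI.)
count_failures() {
    runs=$1
    shift
    fails=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        "$@" >/dev/null 2>&1 || fails=$((fails + 1))
        i=$((i + 1))
    done
    echo "$fails/$runs"
}

# Example, using the command from the original report quoted below:
# count_failures 20 mpirun -mca btl sm,self -np 2 examples/ring_c
```

With a failure rate near 1 in 5, a few dozen runs should be enough to tell whether a given build is affected.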

-Paul

On Mon, Mar 6, 2017 at 7:58 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> I am traveling all this week and so don't know when I can take a look, but
> will try.
> -Paul
>
> On Mon, Mar 6, 2017 at 7:40 PM, r...@open-mpi.org <r...@open-mpi.org> wrote:
>
>> I’m not sure what could be going on here. I take it you were able to run
>> this example for the 2.0 series under this environment, yes? This code
>> hasn’t changed since that release, so I’m not sure why it would be failing
>> to resolve symbols now.
>>
>>
>> On Mar 6, 2017, at 2:22 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>> RC2 tarball for 2.1.0 configured with only --prefix=...
>> and --enable-mca-no-build=patcher
>> I don't have time to dig right now:
>>
>> $ mpirun -mca btl sm,self -np 2 examples/ring_c
>> [openbsd-i386:95593] *** Process received signal ***
>> --------------------------------------------------------------------------
>> mpirun noticed that process rank 1 with PID 0 on node openbsd-i386 exited on signal 11 (Segmentation fault).
>> --------------------------------------------------------------------------
>>
>> $ gdb examples/ring_c ring_c.core
>> [...]
>> (gdb) where
>> #0  0x0ff27cf3 in _dl_find_symbol_obj (object=0x7d49a000, name=0xc7d96ab "strsignal", hash=Variable "hash" is not available.
>> )
>>     at /usr/src/libexec/ld.so/resolve.c:540
>> #1  0x0ff27f8d in _dl_find_symbol (name=0xc7d96ab "strsignal", this=0x830f1584, flags=Variable "flags" is not available.
>> )
>>     at /usr/src/libexec/ld.so/resolve.c:669
>> #2  0x0ff2a75f in _dl_bind (object=0x7d49a600, index=3704) at /usr/src/libexec/ld.so/i386/rtld_machine.c:387
>> #3  0x0ff26637 in _dl_bind_start () at /usr/src/libexec/ld.so/i386/ldasm.S:155
>> #4  0x7d49a600 in ?? ()
>> #5  0x00000e78 in ?? ()
>> #6  0x0d560033 in __fgetwc_unlock (fp=0x1) at /usr/src/lib/libc/stdio/fgetwc.c:65
>> #7  <signal handler called>
>> #8  0x0ff27cf3 in _dl_find_symbol_obj (object=0x7dd41c00, name=0xd48042f "recv", hash=Variable "hash" is not available.
>> )
>>     at /usr/src/libexec/ld.so/resolve.c:540
>> #9  0x0ff27f8d in _dl_find_symbol (name=0xd48042f "recv", this=0x830f1c34, flags=Variable "flags" is not available.
>> )
>>     at /usr/src/libexec/ld.so/resolve.c:669
>> #10 0x0ff2a75f in _dl_bind (object=0x82980e00, index=32) at /usr/src/libexec/ld.so/i386/rtld_machine.c:387
>> #11 0x0ff26637 in _dl_bind_start () at /usr/src/libexec/ld.so/i386/ldasm.S:155
>> #12 0x82980e00 in ?? ()
>> #13 0x00000020 in ?? ()
>> #14 0x0c820033 in opal_getcwd ()
>>    from /home/phargrov/OMPI/openmpi-2.1.0rc2-openbsd6-i386/INST/lib/libopen-pal.so.30.0
>> #15 0x0d4856e2 in mca_oob_usock_peer_recv_connect_ack ()
>>    from /home/phargrov/OMPI/openmpi-2.1.0rc2-openbsd6-i386/INST/lib/openmpi/mca_oob_usock.so
>> #16 0x0d48789e in mca_oob_usock_recv_handler ()
>>    from /home/phargrov/OMPI/openmpi-2.1.0rc2-openbsd6-i386/INST/lib/openmpi/mca_oob_usock.so
>> #17 0x0c82f11a in opal_libevent2022_event_base_loop (base=0x805b9000, flags=1)
>>     at /home/phargrov/OMPI/openmpi-2.1.0rc2-openbsd6-i386/openmpi-2.1.0rc2/opal/mca/event/libevent2022/libevent/event.c:1321
>> #18 0x0c7f16b4 in progress_engine ()
>>    from /home/phargrov/OMPI/openmpi-2.1.0rc2-openbsd6-i386/INST/lib/libopen-pal.so.30.0
>> #19 0x0b3cc852 in _rthread_start (v=0x7dd42428) at /usr/src/lib/librthread/rthread.c:115
>> #20 0x0d5c4f82 in __tfork_thread () at /usr/src/lib/libc/arch/i386/sys/tfork_thread.S:95
>>
>> -Paul
>>
>> --
>> Paul H. Hargrove                          phhargr...@lbl.gov
>> Computer Languages & Systems Software (CLaSS) Group
>> Computer Science Department               Tel: +1-510-495-2352
>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>> _______________________________________________
>> devel mailing list
>> devel@lists.open-mpi.org
>> https://rfd.newmexicoconsortium.org/mailman/listinfo/devel
>>
>
>
>
> --
> Paul H. Hargrove                          phhargr...@lbl.gov
> Computer Languages & Systems Software (CLaSS) Group
> Computer Science Department               Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>



-- 
Paul H. Hargrove                          phhargr...@lbl.gov
Computer Languages & Systems Software (CLaSS) Group
Computer Science Department               Tel: +1-510-495-2352
Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900