Ah, so the point is that this issue is not related to the libltdl work.


> On Feb 2, 2015, at 2:51 AM, Adrian Reber <adr...@lisas.de> wrote:
> 
> I have reported the same error a few days ago and submitted it now as a
> github issue: https://github.com/open-mpi/ompi/issues/371
> 
> On Mon, Feb 02, 2015 at 12:36:54PM +1100, Christopher Samuel wrote:
>> On 31/01/15 10:51, Jeff Squyres (jsquyres) wrote:
>> 
>>> New tarball posted (same location).  Now featuring 100% fewer "make check" 
>>> failures.
>> 
>> On our BG/Q front-end node (PPC64, RHEL 6.4) I see:
>> 
>> ../../config/test-driver: line 95: 30173 Segmentation fault      (core dumped) "$@" > $log_file 2>&1
>> FAIL: opal_lifo
>> 
>> Stack trace implies the culprit is in:
>> 
>> #0  0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>>    at /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>> 51              old = *addr;
>> 
>> I've attached a script of gdb doing "thread apply all bt full" in
>> case that's helpful.
>> 
>> All the best,
>> Chris
>> -- 
>> Christopher Samuel        Senior Systems Administrator
>> VLSCI - Victorian Life Sciences Computation Initiative
>> Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
>> http://www.vlsci.org.au/      http://twitter.com/vlsci
>> 
> 
>> Script started on Mon 02 Feb 2015 12:32:56 EST
>> 
>> [samuel@avoca class]$ gdb 
>> /vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/test/class/.libs/lt-opal_lifo 
>> core.32444
>> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
>> Copyright (C) 2010 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "ppc64-redhat-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from 
>> /vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/test/class/.libs/lt-opal_lifo...done.
>> [New Thread 32465]
>> [New Thread 32464]
>> [New Thread 32466]
>> [New Thread 32444]
>> [New Thread 32469]
>> [New Thread 32467]
>> [New Thread 32470]
>> [New Thread 32463]
>> [New Thread 32468]
>> Missing separate debuginfo for 
>> /vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/opal/.libs/libopen-pal.so.0
>> Try: yum --disablerepo='*' --enablerepo='*-debug*' install 
>> /usr/lib/debug/.build-id/de/a09192aa84bbc15579ae5190dc8acd16eb94fe
>> Missing separate debuginfo for /usr/local/slurm/14.03.10/lib/libpmi.so.0
>> Try: yum --disablerepo='*' --enablerepo='*-debug*' install 
>> /usr/lib/debug/.build-id/28/09dfc4706ed44259cc31a5898c8d1a9b76b949
>> Missing separate debuginfo for /usr/local/slurm/14.03.10/lib/libslurm.so.27
>> Try: yum --disablerepo='*' --enablerepo='*-debug*' install 
>> /usr/lib/debug/.build-id/e2/39d8a2994ae061ab7ada0ebb7719b8efa5de96
>> Missing separate debuginfo for 
>> Try: yum --disablerepo='*' --enablerepo='*-debug*' install 
>> /usr/lib/debug/.build-id/1a/063e3d64bb5560021ec2ba5329fb1e420b470f
>> Reading symbols from 
>> /vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/opal/.libs/libopen-pal.so.0...done.
>> Loaded symbols for 
>> /vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/opal/.libs/libopen-pal.so.0
>> Reading symbols from /usr/local/slurm/14.03.10/lib/libpmi.so.0...done.
>> Loaded symbols for /usr/local/slurm/14.03.10/lib/libpmi.so.0
>> Reading symbols from /usr/local/slurm/14.03.10/lib/libslurm.so.27...done.
>> Loaded symbols for /usr/local/slurm/14.03.10/lib/libslurm.so.27
>> Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/libdl.so.2
>> Reading symbols from /lib64/libpthread.so.0...(no debugging symbols 
>> found)...done.
>> [Thread debugging using libthread_db enabled]
>> Loaded symbols for /lib64/libpthread.so.0
>> Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/librt.so.1
>> Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/libm.so.6
>> Reading symbols from /lib64/libutil.so.1...(no debugging symbols 
>> found)...done.
>> Loaded symbols for /lib64/libutil.so.1
>> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/libc.so.6
>> Reading symbols from /lib64/ld64.so.1...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/ld64.so.1
>> Core was generated by 
>> `/vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/test/class/.libs/lt-opal_lifo '.
>> Program terminated with signal 11, Segmentation fault.
>> #0  0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>>    at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>> 51           old = *addr;
>> Missing separate debuginfos, use: debuginfo-install 
>> glibc-2.12-1.107.el6_4.5.ppc64
>> (gdb) thread apply all bt full
>> 
>> Thread 9 (Thread 0xfff7a0ef200 (LWP 32468)):
>> #0  0x00000080adb6629c in .__libc_write () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #1  0x00000fff7d6905b4 in show_stackframe (signo=11, info=0xfff7a0ee3d8, 
>> p=0xfff7a0edd00)
>>    at /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/util/stacktrace.c:81
>>        print_buffer = "[avoca:32444] *** Process received signal ***\n", 
>> '\000' <repeats 977 times>
>>        tmp = 0xfff7a0ed858 "[avoca:32444] *** Process received signal ***\n"
>>        size = 1024
>>        ret = 46
>>        si_code_str = 0xfff7d75bab8 ""
>> #2  <signal handler called>
>> No symbol table info available.
>> #3  0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>>    at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>>        old = 1
>> #4  0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>>        item = 0x0
>> #5  0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>>        i = 4002
>>        lifo = 0xffff9e4a6a0
>>        item = 0x1000511c840
>>        start = {tv_sec = 1422840607, tv_usec = 750972}
>>        stop = {tv_sec = 0, tv_usec = 0}
>>        total = {tv_sec = 0, tv_usec = 0}
>>        timing = 0
>> #6  0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #7  0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>> 
>> Thread 8 (Thread 0xfff7d2ef200 (LWP 32463)):
>> #0  0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>>    at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>>        old = 1
>> #1  0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>>        item = 0x0
>> #2  0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>>        i = 2049
>>        lifo = 0xffff9e4a6a0
>>        item = 0x1000511c7e0
>>        start = {tv_sec = 1422840607, tv_usec = 750871}
>>        stop = {tv_sec = 17589991303296, tv_usec = 24}
>>        total = {tv_sec = 17589991305936, tv_usec = 17589991336208}
>>        timing = 2.8183218451323255e-315
>> #3  0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4  0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>> 
>> Thread 7 (Thread 0xfff78cef200 (LWP 32470)):
>> #0  0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>>    at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>>        old = 1
>> #1  0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>>        item = 0x0
>> #2  0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>>        i = 1883
>>        lifo = 0xffff9e4a6a0
>>        item = 0x1000511c7e0
>>        start = {tv_sec = 1422840607, tv_usec = 751036}
>>        stop = {tv_sec = 0, tv_usec = 0}
>>        total = {tv_sec = 0, tv_usec = 0}
>>        timing = 0
>> #3  0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4  0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>> 
>> Thread 6 (Thread 0xfff7aaef200 (LWP 32467)):
>> #0  0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>>    at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>>        old = 1
>> #1  0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>>        item = 0x0
>> #2  0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>>        i = 3250
>>        lifo = 0xffff9e4a6a0
>>        item = 0x1000511c7e0
>>        start = {tv_sec = 1422840607, tv_usec = 750953}
>>        stop = {tv_sec = 0, tv_usec = 0}
>>        total = {tv_sec = 0, tv_usec = 0}
>>        timing = 0
>> #3  0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4  0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>> 
>> Thread 5 (Thread 0xfff796ef200 (LWP 32469)):
>> #0  0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>>    at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>>        old = 1
>> #1  0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>>        item = 0x0
>> #2  0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>>        i = 1922
>>        lifo = 0xffff9e4a6a0
>>        item = 0x1000511c7e0
>>        start = {tv_sec = 1422840607, tv_usec = 751004}
>>        stop = {tv_sec = 0, tv_usec = 0}
>>        total = {tv_sec = 0, tv_usec = 0}
>>        timing = 0
>> #3  0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4  0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>> 
>> Thread 4 (Thread 0x80ad907ef0 (LWP 32444)):
>> #0  0x00000080adb5c754 in .pthread_join () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #1  0x0000000010001ccc in main (argc=1, argv=0xffff9e4ab68) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:163
>>        ret = 0x1
>>        i = 0
>>        threads = {17589991305728, 17589980819968, 17589970334208, 
>> 17589959848448, 17589949362688, 17589938876928, 17589928391168, 
>> 17589917905408}
>>        item = 0x1000511c8d0
>>        prev = 0xffff9e4a6c0
>>        item2 = 0x1000511b640
>>        start = {tv_sec = 1422840607, tv_usec = 750782}
>>        stop = {tv_sec = 1422840607, tv_usec = 515534}
>>        total = {tv_sec = 0, tv_usec = 42314}
>>        lifo = {super = {obj_class = 0xfff7d7733e8, obj_reference_count = 1}, 
>> opal_lifo_head = {data = {counter = 0, item = 0x1000511c7e0}}, 
>>          opal_lifo_ghost = {super = {obj_class = 0xfff7d773228, 
>> obj_reference_count = 1}, opal_list_next = 0xffff9e4a6c0, opal_list_prev = 
>> 0x0, 
>>            item_free = 1}}
>>        success = false
>>        timing = 4.2313999999999998e-08
>>        rc = 0
>> 
>> Thread 3 (Thread 0xfff7b4ef200 (LWP 32466)):
>> #0  opal_atomic_swap_32 (addr=0x1000511c860, newval=1) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:52
>>        old = 0
>> #1  0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>>        item = 0x1000511c840
>> #2  0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>>        i = 1876
>>        lifo = 0xffff9e4a6a0
>>        item = 0x1000511c840
>>        start = {tv_sec = 1422840607, tv_usec = 750939}
>>        stop = {tv_sec = 0, tv_usec = 0}
>>        total = {tv_sec = 0, tv_usec = 0}
>>        timing = 0
>> #3  0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4  0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>> 
>> Thread 2 (Thread 0xfff7c8ef200 (LWP 32464)):
>> #0  0x0000000010000f88 in opal_atomic_cmpset_64 (addr=0xffff9e4a6b8, 
>> oldval=1099596679232, newval=1099596679136)
>>    at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/powerpc/atomic.h:194
>>        ret = 1099596679232
>> #1  0x00000000100010e4 in opal_atomic_cmpset_ptr (addr=0xffff9e4a6b8, 
>> oldval=0x1000511c840, newval=0x1000511c7e0)
>>    at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:227
>> No locals.
>> #2  0x0000000010001438 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:198
>>        item = 0x1000511c840
>> #3  0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>>        i = 3968
>>        lifo = 0xffff9e4a6a0
>>        item = 0x1000511c840
>>        start = {tv_sec = 1422840607, tv_usec = 750893}
>>        stop = {tv_sec = 0, tv_usec = 0}
>>        total = {tv_sec = 0, tv_usec = 0}
>>        timing = 0
>> #4  0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #5  0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>> 
>> Thread 1 (Thread 0xfff7beef200 (LWP 32465)):
>> #0  0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>>    at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>>        old = 1
>> #1  0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>>        item = 0x0
>> #2  0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at 
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>>        i = 3734
>>        lifo = 0xffff9e4a6a0
>>        item = 0x1000511c7e0
>>        start = {tv_sec = 1422840607, tv_usec = 750907}
>>        stop = {tv_sec = 0, tv_usec = 0}
>>        total = {tv_sec = 0, tv_usec = 0}
>>        timing = 0
>> #3  0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4  0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>> (gdb) quit
>> [samuel@avoca class]$ exit
>> 
>> Script done on Mon 02 Feb 2015 12:33:16 EST
> 
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Searchable archives: 
>> http://www.open-mpi.org/community/lists/devel/2015/02/index.php
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post: 
> http://www.open-mpi.org/community/lists/devel/2015/02/16873.php


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
