Ah -- the point being that this is not an issue related to the libltdl work.

> On Feb 2, 2015, at 2:51 AM, Adrian Reber <adr...@lisas.de> wrote:
>
> I have reported the same error a few days ago and submitted it now as a
> github issue: https://github.com/open-mpi/ompi/issues/371
>
> On Mon, Feb 02, 2015 at 12:36:54PM +1100, Christopher Samuel wrote:
>> On 31/01/15 10:51, Jeff Squyres (jsquyres) wrote:
>>
>>> New tarball posted (same location). Now featuring 100% fewer "make check"
>>> failures.
>>
>> On our BG/Q front-end node (PPC64, RHEL 6.4) I see:
>>
>> ../../config/test-driver: line 95: 30173 Segmentation fault (core
>> dumped) "$@" > $log_file 2>&1
>> FAIL: opal_lifo
>>
>> Stack trace implies the culprit is in:
>>
>> #0 0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>> at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>> 51 old = *addr;
>>
>> I've attached a script of gdb doing "thread apply all bt full" in
>> case that's helpful.
>>
>> All the best,
>> Chris
>> --
>> Christopher Samuel Senior Systems Administrator
>> VLSCI - Victorian Life Sciences Computation Initiative
>> Email: sam...@unimelb.edu.au Phone: +61 (0)3 903 55545
>> http://www.vlsci.org.au/ http://twitter.com/vlsci
>>
>
>> Script started on Mon 02 Feb 2015 12:32:56 EST
>>
>> [samuel@avoca class]$ gdb
>> /vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/test/class/.libs/lt-opal_lifo
>> core.32444
>> GNU gdb (GDB) Red Hat Enterprise Linux (7.2-60.el6_4.1)
>> Copyright (C) 2010 Free Software Foundation, Inc.
>> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
>> This is free software: you are free to change and redistribute it.
>> There is NO WARRANTY, to the extent permitted by law. Type "show copying"
>> and "show warranty" for details.
>> This GDB was configured as "ppc64-redhat-linux-gnu".
>> For bug reporting instructions, please see:
>> <http://www.gnu.org/software/gdb/bugs/>...
>> Reading symbols from
>> /vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/test/class/.libs/lt-opal_lifo...done.
>> [New Thread 32465]
>> [New Thread 32464]
>> [New Thread 32466]
>> [New Thread 32444]
>> [New Thread 32469]
>> [New Thread 32467]
>> [New Thread 32470]
>> [New Thread 32463]
>> [New Thread 32468]
>> Missing separate debuginfo for
>> /vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/opal/.libs/libopen-pal.so.0
>> Try: yum --disablerepo='*' --enablerepo='*-debug*' install
>> /usr/lib/debug/.build-id/de/a09192aa84bbc15579ae5190dc8acd16eb94fe
>> Missing separate debuginfo for /usr/local/slurm/14.03.10/lib/libpmi.so.0
>> Try: yum --disablerepo='*' --enablerepo='*-debug*' install
>> /usr/lib/debug/.build-id/28/09dfc4706ed44259cc31a5898c8d1a9b76b949
>> Missing separate debuginfo for /usr/local/slurm/14.03.10/lib/libslurm.so.27
>> Try: yum --disablerepo='*' --enablerepo='*-debug*' install
>> /usr/lib/debug/.build-id/e2/39d8a2994ae061ab7ada0ebb7719b8efa5de96
>> Missing separate debuginfo for
>> Try: yum --disablerepo='*' --enablerepo='*-debug*' install
>> /usr/lib/debug/.build-id/1a/063e3d64bb5560021ec2ba5329fb1e420b470f
>> Reading symbols from
>> /vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/opal/.libs/libopen-pal.so.0...done.
>> Loaded symbols for
>> /vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/opal/.libs/libopen-pal.so.0
>> Reading symbols from /usr/local/slurm/14.03.10/lib/libpmi.so.0...done.
>> Loaded symbols for /usr/local/slurm/14.03.10/lib/libpmi.so.0
>> Reading symbols from /usr/local/slurm/14.03.10/lib/libslurm.so.27...done.
>> Loaded symbols for /usr/local/slurm/14.03.10/lib/libslurm.so.27
>> Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/libdl.so.2
>> Reading symbols from /lib64/libpthread.so.0...(no debugging symbols
>> found)...done.
>> [Thread debugging using libthread_db enabled]
>> Loaded symbols for /lib64/libpthread.so.0
>> Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/librt.so.1
>> Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/libm.so.6
>> Reading symbols from /lib64/libutil.so.1...(no debugging symbols
>> found)...done.
>> Loaded symbols for /lib64/libutil.so.1
>> Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/libc.so.6
>> Reading symbols from /lib64/ld64.so.1...(no debugging symbols found)...done.
>> Loaded symbols for /lib64/ld64.so.1
>> Core was generated by
>> `/vlsci/VLSCI/samuel/tmp/OMPI/build-gcc/test/class/.libs/lt-opal_lifo '.
>> Program terminated with signal 11, Segmentation fault.
>> #0 0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>> at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>> 51 old = *addr;
>> Missing separate debuginfos, use: debuginfo-install
>> glibc-2.12-1.107.el6_4.5.ppc64
>> (gdb) thread apply all bt full
>>
>> Thread 9 (Thread 0xfff7a0ef200 (LWP 32468)):
>> #0 0x00000080adb6629c in .__libc_write () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #1 0x00000fff7d6905b4 in show_stackframe (signo=11, info=0xfff7a0ee3d8,
>> p=0xfff7a0edd00)
>> at /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/util/stacktrace.c:81
>> print_buffer = "[avoca:32444] *** Process received signal ***\n",
>> '\000' <repeats 977 times>
>> tmp = 0xfff7a0ed858 "[avoca:32444] *** Process received signal ***\n"
>> size = 1024
>> ret = 46
>> si_code_str = 0xfff7d75bab8 ""
>> #2 <signal handler called>
>> No symbol table info available.
>> #3 0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>> at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>> old = 1
>> #4 0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>> item = 0x0
>> #5 0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>> i = 4002
>> lifo = 0xffff9e4a6a0
>> item = 0x1000511c840
>> start = {tv_sec = 1422840607, tv_usec = 750972}
>> stop = {tv_sec = 0, tv_usec = 0}
>> total = {tv_sec = 0, tv_usec = 0}
>> timing = 0
>> #6 0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #7 0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>>
>> Thread 8 (Thread 0xfff7d2ef200 (LWP 32463)):
>> #0 0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>> at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>> old = 1
>> #1 0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>> item = 0x0
>> #2 0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>> i = 2049
>> lifo = 0xffff9e4a6a0
>> item = 0x1000511c7e0
>> start = {tv_sec = 1422840607, tv_usec = 750871}
>> stop = {tv_sec = 17589991303296, tv_usec = 24}
>> total = {tv_sec = 17589991305936, tv_usec = 17589991336208}
>> timing = 2.8183218451323255e-315
>> #3 0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4 0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>>
>> Thread 7 (Thread 0xfff78cef200 (LWP 32470)):
>> #0 0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>> at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>> old = 1
>> #1 0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>> item = 0x0
>> #2 0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>> i = 1883
>> lifo = 0xffff9e4a6a0
>> item = 0x1000511c7e0
>> start = {tv_sec = 1422840607, tv_usec = 751036}
>> stop = {tv_sec = 0, tv_usec = 0}
>> total = {tv_sec = 0, tv_usec = 0}
>> timing = 0
>> #3 0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4 0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>>
>> Thread 6 (Thread 0xfff7aaef200 (LWP 32467)):
>> #0 0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>> at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>> old = 1
>> #1 0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>> item = 0x0
>> #2 0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>> i = 3250
>> lifo = 0xffff9e4a6a0
>> item = 0x1000511c7e0
>> start = {tv_sec = 1422840607, tv_usec = 750953}
>> stop = {tv_sec = 0, tv_usec = 0}
>> total = {tv_sec = 0, tv_usec = 0}
>> timing = 0
>> #3 0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4 0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>>
>> Thread 5 (Thread 0xfff796ef200 (LWP 32469)):
>> #0 0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>> at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>> old = 1
>> #1 0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>> item = 0x0
>> #2 0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>> i = 1922
>> lifo = 0xffff9e4a6a0
>> item = 0x1000511c7e0
>> start = {tv_sec = 1422840607, tv_usec = 751004}
>> stop = {tv_sec = 0, tv_usec = 0}
>> total = {tv_sec = 0, tv_usec = 0}
>> timing = 0
>> #3 0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4 0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>>
>> Thread 4 (Thread 0x80ad907ef0 (LWP 32444)):
>> #0 0x00000080adb5c754 in .pthread_join () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #1 0x0000000010001ccc in main (argc=1, argv=0xffff9e4ab68) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:163
>> ret = 0x1
>> i = 0
>> threads = {17589991305728, 17589980819968, 17589970334208,
>> 17589959848448, 17589949362688, 17589938876928, 17589928391168,
>> 17589917905408}
>> item = 0x1000511c8d0
>> prev = 0xffff9e4a6c0
>> item2 = 0x1000511b640
>> start = {tv_sec = 1422840607, tv_usec = 750782}
>> stop = {tv_sec = 1422840607, tv_usec = 515534}
>> total = {tv_sec = 0, tv_usec = 42314}
>> lifo = {super = {obj_class = 0xfff7d7733e8, obj_reference_count = 1},
>> opal_lifo_head = {data = {counter = 0, item = 0x1000511c7e0}},
>> opal_lifo_ghost = {super = {obj_class = 0xfff7d773228,
>> obj_reference_count = 1}, opal_list_next = 0xffff9e4a6c0, opal_list_prev =
>> 0x0,
>> item_free = 1}}
>> success = false
>> timing = 4.2313999999999998e-08
>> rc = 0
>>
>> Thread 3 (Thread 0xfff7b4ef200 (LWP 32466)):
>> #0 opal_atomic_swap_32 (addr=0x1000511c860, newval=1) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:52
>> old = 0
>> #1 0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>> item = 0x1000511c840
>> #2 0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>> i = 1876
>> lifo = 0xffff9e4a6a0
>> item = 0x1000511c840
>> start = {tv_sec = 1422840607, tv_usec = 750939}
>> stop = {tv_sec = 0, tv_usec = 0}
>> total = {tv_sec = 0, tv_usec = 0}
>> timing = 0
>> #3 0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4 0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>>
>> Thread 2 (Thread 0xfff7c8ef200 (LWP 32464)):
>> #0 0x0000000010000f88 in opal_atomic_cmpset_64 (addr=0xffff9e4a6b8,
>> oldval=1099596679232, newval=1099596679136)
>> at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/powerpc/atomic.h:194
>> ret = 1099596679232
>> #1 0x00000000100010e4 in opal_atomic_cmpset_ptr (addr=0xffff9e4a6b8,
>> oldval=0x1000511c840, newval=0x1000511c7e0)
>> at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:227
>> No locals.
>> #2 0x0000000010001438 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:198
>> item = 0x1000511c840
>> #3 0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>> i = 3968
>> lifo = 0xffff9e4a6a0
>> item = 0x1000511c840
>> start = {tv_sec = 1422840607, tv_usec = 750893}
>> stop = {tv_sec = 0, tv_usec = 0}
>> total = {tv_sec = 0, tv_usec = 0}
>> timing = 0
>> #4 0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #5 0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>>
>> Thread 1 (Thread 0xfff7beef200 (LWP 32465)):
>> #0 0x0000000010001048 in opal_atomic_swap_32 (addr=0x20, newval=1)
>> at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/include/opal/sys/atomic_impl.h:51
>> old = 1
>> #1 0x0000000010001408 in opal_lifo_pop_atomic (lifo=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/opal/class/opal_lifo.h:193
>> item = 0x0
>> #2 0x0000000010001630 in thread_test (arg=0xffff9e4a6a0) at
>> /vlsci/VLSCI/samuel/tmp/OMPI/openmpi-gitclone/test/class/opal_lifo.c:50
>> i = 3734
>> lifo = 0xffff9e4a6a0
>> item = 0x1000511c7e0
>> start = {tv_sec = 1422840607, tv_usec = 750907}
>> stop = {tv_sec = 0, tv_usec = 0}
>> total = {tv_sec = 0, tv_usec = 0}
>> timing = 0
>> #3 0x00000080adb5c21c in .start_thread () from /lib64/libpthread.so.0
>> No symbol table info available.
>> #4 0x00000080ada5a53c in .__clone () from /lib64/libc.so.6
>> No symbol table info available.
>> (gdb) quit
>> [samuel@avoca class]$ exit
>>
>> Script done on Mon 02 Feb 2015 12:33:16 EST
>
>> _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Searchable archives:
>> http://www.open-mpi.org/community/lists/devel/2015/02/index.php
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
> Link to this post:
> http://www.open-mpi.org/community/lists/devel/2015/02/16873.php

--
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/