On 20/03/2015 23:24, Adrian M Negreanu wrote:
> Given that it's a power8 CPU (SMT), can this be triggered by an instruction
> reordering somewhere in qwaitcondition_unix.cpp ? Maybe -O0 can help ?

I did a trial with -O0 for all files (not only qwaitcondition_unix.cpp) and it still failed.

> Also valgrind, besides the fact that it slows the execution, also modifies
> the process instructions.
>
>
> On Thu, Mar 19, 2015 at 11:04 PM, Dimitri van Heesch <doxy...@gmail.com> wrote:
>
> Hi Normand,
>
> The issue seems to be in this piece of code:
>
> DotRunner *DotRunnerQueue::dequeue()
> {
>   QMutexLocker locker(&m_mutex);
>   while (m_queue.isEmpty())
>   {
>     // wait until something is added to the queue
>     m_bufferNotEmpty.wait(&m_mutex);
>   }
>   DotRunner *result = m_queue.dequeue();
>   return result;
> }
>
> It is one of the few areas that are executed by multiple threads,
> but it is protected by a mutex (under the hood the QMutex and
> QWaitCondition map to pthread calls).
> Since m_bufferNotEmpty has its own mutex internally, it should only allow
> one thread to be awakened. What you are seeing, it seems, is two threads
> doing a dequeue() simultaneously.
> Would be nice if you could help me with debugging this issue.

I do not know how to debug this; adding printf calls does not help me to isolate the problem.
Any suggestions ?
===
     // wait until something is added to the queue
     m_bufferNotEmpty.wait(&m_mutex);
   }
+  pthread_t id = pthread_self();
+  printf("%08x: %p: DotRunnerQueue::dequeue, m_queue %p\n",id, this, m_queue);
   DotRunner *result = m_queue.dequeue();
   return result;
 }
===
...
ad0df190: 0x1002f8d0460: DotRunnerQueue::dequeue, m_queue 0x1002f8d0468
ac0df190: 0x1002f8d0460: DotRunnerQueue::dequeue, m_queue 0x1002f8d0468
Running dot for graph 309/1586
Running dot for graph 310/1586
ac8df190: 0x1002f8d0460: DotRunnerQueue::dequeue, m_queue 0x1002f8d0468
Running dot for graph 311/1586
ab0df190: 0x1002f8d0460: DotRunnerQueue::dequeue, m_queue 0x1002f8d0468
ab8df190: 0x1002f8d0460: DotRunnerQueue::dequeue, m_queue 0x1002f8d0468
ad0df190: 0x1002f8d0460: DotRunnerQueue::dequeue, m_queue 0x1002f8d0468
Running dot for graph 312/1586
Running dot for graph 313/1586
Running dot for graph 314/1586
ac0df190: 0x1002f8d0460: DotRunnerQueue::dequeue, m_queue 0x1002f8d0468
Running dot for graph 315/1586
ab0df190: 0x1002f8d0460: DotRunnerQueue::dequeue, m_queue 0x1002f8d0468
ac0df190: 0x1002f8d0460: DotRunnerQueue::dequeue, m_queue 0x1002f8d0468
ac8df190: 0x1002f8d0460: DotRunner
===

> A workaround is to set DOT_NUM_THREADS to 1.

That workaround is what is implemented today in openSUSE Tumbleweed for ppc64le.

> Regards,
> Dimitri
>
>
> On 18 Mar 2015, at 12:45, Normand <norm...@linux.vnet.ibm.com> wrote:
> >
> >
> > On 11/03/2015 14:16, Normand wrote:
> >> Hi there
> >>
> >> while building doxygen for openSUSE on a Power8 guest I hit a failure as detailed in (2)
> >> The related backtrace extracted from the core file is appended below in (1)
> >>
> >>
> >> === (1)
> >> Core was generated by `./bin/doxygen '.
> >> Program terminated with signal SIGABRT, Aborted.
> >> #0  0x00003fffa5acd194 in __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:55
> >> 55      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
> >> Missing separate debuginfos, use: zypper install libgcc_s1-debuginfo-4.8.3+r218481-2.1.ppc64le libstdc++6-debuginfo-4.8.3+r218481-2.1.ppc64le
> >> (gdb) bt
> >> #0  0x00003fffa5acd194 in __GI_raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:55
> >> #1  0x00003fffa5acf184 in __GI_abort () at abort.c:78
> >> #2  0x00003fffa5b136c4 in __libc_message (do_abort=<optimized out>, fmt=<optimized out>) at ../sysdeps/posix/libc_fatal.c:175
> >> #3  0x00003fffa5b1ba84 in malloc_printerr (action=<optimized out>, str=0x3fffa5c06b50 "double free or corruption (fasttop)", ptr=<optimized out>) at malloc.c:4960
> >> #4  0x00003fffa5b1cadc in _int_free (av=<optimized out>, p=<optimized out>, have_lock=<optimized out>) at malloc.c:3831
> >> #5  0x00003fffa5dece10 in operator delete(void*) () from /usr/lib64/libstdc++.so.6
> >> #6  0x00000000106620e4 in QGList::takeFirst (this=<optimized out>) at qglist.cpp:628
> >> #7  0x000000001053ba84 in dequeue (this=<optimized out>) at ../qtools/qqueue.h:59
> >> #8  DotRunnerQueue::dequeue (this=0x1001910fcc0) at dot.cpp:1170
> >> #9  0x000000001053bb18 in DotWorkerThread::run (this=0x10019112a50) at dot.cpp:1191
> >> #10 0x00000000106a0a44 in QThreadPrivate::start (arg=0x10019112a50) at qthread_unix.cpp:87
> >> #11 0x00003fffa5ee9454 in start_thread (arg=0x3fffa38bf180) at pthread_create.c:335
> >> #12 0x00003fffa5b9e0c4 in clone () at ../sysdeps/unix/sysv/linux/powerpc/powerpc64/clone.S:96
> >> ===
> >>
> >> (2) https://bugzilla.suse.com/show_bug.cgi?id=921577
> >>
> >
> >
> > I was able to recreate the problem with doxygen last git commit 1c8bbb6
> > * 1c8bbb6 (HEAD, origin/master, origin/HEAD, master) Merge pull request #314
> >
> > The associated backtrace (from the core file) only differs from the one above by some line numbers,
> > but it still points to the same call sequence:
> > from DotWorkerThread::run
> > to delete in QCollection::Item QGList::takeFirst
> > ===
> > #0  0x00003fff8433d194 in raise () from /lib64/libc.so.6
> > Missing separate debuginfos, use: zypper install glibc-debuginfo-2.21-3.3.ppc64le libgcc_s1-debuginfo-4.8.3+r218481-4.3.ppc64le libstdc++6-debuginfo-4.8.3+r218481-4.3.ppc64le
> > (gdb) bt
> > #0  0x00003fff8433d194 in raise () from /lib64/libc.so.6
> > #1  0x00003fff8433f184 in abort () from /lib64/libc.so.6
> > #2  0x00003fff843836c4 in __libc_message () from /lib64/libc.so.6
> > #3  0x00003fff8438ba84 in malloc_printerr () from /lib64/libc.so.6
> > #4  0x00003fff8438cadc in _int_free () from /lib64/libc.so.6
> > #5  0x00003fff8465ce10 in operator delete(void*) () from /usr/lib64/libstdc++.so.6
> > #6  0x000000001066c5e4 in QGList::takeFirst (this=<optimized out>) at qglist.cpp:628
> > #7  0x0000000010544e04 in dequeue (this=<optimized out>) at ../qtools/qqueue.h:59
> > #8  DotRunnerQueue::dequeue (this=0x1001707f7b0) at dot.cpp:1181
> > #9  0x0000000010544e98 in DotWorkerThread::run (this=0x1001707efd0) at dot.cpp:1202
> > #10 0x00000000106aaf44 in QThreadPrivate::start (arg=0x1001707efd0) at qthread_unix.cpp:87
> > #11 0x00003fff84759454 in start_thread () from /lib64/libpthread.so.0
> > #12 0x00003fff8440e0c4 in clone () from /lib64/libc.so.6
> > ===
> >
> > The occurrence is timing dependent, and there is no failure when starting doxygen under gdb or valgrind,
> > so I do not know how to continue the investigation.
> >
> > Any suggestions are welcome.
> >
> > ---
> > Michel Normand
> >

--
Michel Normand

------------------------------------------------------------------------------
_______________________________________________
Doxygen-users mailing list
Doxygen-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/doxygen-users
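
One possible way to narrow this down might be to exercise the same wait-loop pattern as the quoted dequeue() outside of doxygen. The following is only a minimal sketch, assuming std::mutex and std::condition_variable as stand-ins for QMutex and QWaitCondition, so it does not test the qtools implementation itself. SketchQueue and run_stress are hypothetical names and are not part of doxygen.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <vector>

// Hypothetical stand-in for DotRunnerQueue; it holds ints instead of DotRunner*.
class SketchQueue {
public:
    void enqueue(int value) {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_queue.push(value);
        }
        // Wake every waiter on purpose, so the while loop below is stressed.
        m_bufferNotEmpty.notify_all();
    }

    int dequeue() {
        std::unique_lock<std::mutex> lock(m_mutex);
        while (m_queue.empty()) {
            // wait() releases m_mutex while blocked and re-acquires it before returning
            m_bufferNotEmpty.wait(lock);
        }
        int result = m_queue.front();
        m_queue.pop();
        return result;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_bufferNotEmpty;
    std::queue<int> m_queue;
};

// Starts `workers` consumer threads, feeds them `items` non-negative values,
// then one negative stop marker per worker. Returns how many items were taken.
std::size_t run_stress(int workers, int items) {
    SketchQueue queue;
    std::vector<std::size_t> taken(static_cast<std::size_t>(workers));
    std::vector<std::thread> threads;
    for (int w = 0; w < workers; ++w) {
        threads.emplace_back([&queue, &taken, w] {
            while (queue.dequeue() >= 0) {
                ++taken[w];  // each worker only touches its own slot
            }
        });
    }
    for (int i = 0; i < items; ++i) queue.enqueue(i);
    for (int w = 0; w < workers; ++w) queue.enqueue(-1);
    for (auto &t : threads) t.join();
    std::size_t total = 0;
    for (std::size_t n : taken) total += n;
    return total;
}
```

Calling, for example, run_stress(8, 100000) in a loop and checking that the result equals the number of items (build with -pthread) would show whether the pattern itself holds on the affected machine. If it never fails there, suspicion would shift towards the qtools wrappers rather than the wait loop.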