On Wed, 2017-05-31 at 18:26 +0000, William Good wrote:
> So it is actually two different locks that just happen to occupy the
> same address at different times? Usually, helgrind indicates when
> each lock was first observed, but there is no mention of a second lock.

To verify this hypothesis, you might run with -v -v -v.
Each time a lock is pthread_mutex_init-ed, you should see a line such as:

   client request: code 48470103, addr 0x5400040, len 0

The request code corresponds to the client request enum defined in
helgrind.h: 0x103 = 256 + 3, which is _VG_USERREQ__HG_PTHREAD_MUTEX_INIT_POST.

If you see such a line twice with the same addr, that indicates there were
two initialisations of a mutex at the same address. And the comment quoted
below makes me believe helgrind does not handle that very cleanly.

> No my reproducer is fairly large
That is not a surprise :).
If the problem is indeed linked to the re-creation of another mutex at the
same address, then I think a small reproducer should be easy to write.
But let's first confirm that you see two initialisations.
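
For what it is worth, below is the kind of minimal pattern I have in mind.
It is only an untested sketch with invented names (struct job, run_one_job);
all it tries to do is provoke two pthread_mutex_init calls at what will very
often be the same heap address, because the allocator tends to hand the
just-freed block back to the next calloc of the same size.

/* Untested sketch, hypothetical names: force two mutex initialisations
 * at (very likely) the same heap address.  Error checking omitted. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct job {
    pthread_mutex_t lock;
    int value;
};

static void *worker(void *arg)
{
    struct job *j = arg;
    pthread_mutex_lock(&j->lock);
    j->value++;                           /* access protected by j->lock */
    pthread_mutex_unlock(&j->lock);
    return NULL;
}

static struct job *run_one_job(void)
{
    struct job *j = calloc(1, sizeof *j);
    pthread_mutex_init(&j->lock, NULL);   /* one init per job */
    pthread_t t;
    pthread_create(&t, NULL, worker, j);
    pthread_join(t, NULL);
    return j;
}

int main(void)
{
    struct job *a = run_one_job();
    printf("first  job at %p\n", (void *)a);
    pthread_mutex_destroy(&a->lock);
    free(a);                              /* block goes back to the allocator */

    struct job *b = run_one_job();        /* often reuses the same block */
    printf("second job at %p\n", (void *)b);
    pthread_mutex_destroy(&b->lock);
    free(b);
    return 0;
}

Compiled with -pthread and run under valgrind --tool=helgrind -v -v -v, that
should show two "client request: code 48470103" lines with the same addr
whenever the printf output shows the two jobs landed at the same address.
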
You might also try with --tool=drd, to see whether drd confirms the race
condition.

Philippe

> ______________________________________________________________________
> From: Philippe Waroquiers <philippe.waroqui...@skynet.be>
> Sent: Monday, May 29, 2017 5:20 PM
> To: William Good
> Cc: valgrind-users@lists.sourceforge.net
> Subject: Re: [Valgrind-users] Helgrind detects race with same lock
>
> You might have been unlucky and have a lock that was freed and then
> re-used.
>
> See this extract of the mk_LockP_from_LockN comments:
>    So we check that each LockN is a member of the admin_locks double
>    linked list of all Lock structures. That stops us prodding around
>    in potentially freed-up Lock structures. However, it's not quite a
>    proper check: if a new Lock has been reallocated at the same
>    address as one which was previously freed, we'll wind up copying
>    the new one as the basis for the LockP, which is completely bogus
>    because it is unrelated to the previous Lock that lived there.
>    Let's hope that doesn't happen too often.
>
> Do you have a small reproducer for the below?
> Philippe
>
> On Mon, 2017-05-29 at 17:33 +0000, William Good wrote:
> > Hello,
> >
> > I am trying to understand this helgrind output. It says there is a
> > data race on a read. However, both threads hold the same lock. How
> > can this be a race when both threads hold the lock during the
> > access?
> >
> > ==31341== ----------------------------------------------------------------
> > ==31341==
> > ==31341== Lock at 0x5990828 was first observed
> > ==31341==    at 0x4C31A76: pthread_mutex_init (hg_intercepts.c:779)
> > ==31341==    by 0x4026AF: thread_pool_submit (threadpool.c:85)
> > ==31341==    by 0x402012: qsort_internal_parallel (quicksort.c:142)
> > ==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
> > ==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
> > ==31341==    by 0x402450: thread_work (threadpool.c:233)
> > ==31341==    by 0x4C3083E: mythread_wrapper (hg_intercepts.c:389)
> > ==31341==    by 0x4E42DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
> > ==31341==    by 0x5355CEC: clone (in /usr/lib64/libc-2.17.so)
> > ==31341== Address 0x5990828 is 40 bytes inside a block of size 152 alloc'd
> > ==31341==    at 0x4C2CD95: calloc (vg_replace_malloc.c:711)
> > ==31341==    by 0x4026A1: thread_pool_submit (threadpool.c:84)
> > ==31341==    by 0x402012: qsort_internal_parallel (quicksort.c:142)
> > ==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
> > ==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
> > ==31341==    by 0x40279F: future_get (threadpool.c:112)
> > ==31341==    by 0x402048: qsort_internal_parallel (quicksort.c:152)
> > ==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
> > ==31341==    by 0x402450: thread_work (threadpool.c:233)
> > ==31341==    by 0x4C3083E: mythread_wrapper (hg_intercepts.c:389)
> > ==31341==    by 0x4E42DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
> > ==31341==    by 0x5355CEC: clone (in /usr/lib64/libc-2.17.so)
> > ==31341== Block was alloc'd by thread #3
> > ==31341==
> > ==31341== Possible data race during read of size 4 at 0x5990880 by thread #2
> > ==31341== Locks held: 1, at address 0x5990828
> > ==31341==    at 0x4023A9: thread_work (threadpool.c:229)
> > ==31341==    by 0x4C3083E: mythread_wrapper (hg_intercepts.c:389)
> > ==31341==    by 0x4E42DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
> > ==31341==    by 0x5355CEC: clone (in /usr/lib64/libc-2.17.so)
> > ==31341==
> > ==31341== This conflicts with a previous write of size 4 by thread #3
> > ==31341== Locks held: 1, at address 0x5990828
> > ==31341==    at 0x4027B3: future_get (threadpool.c:114)
> > ==31341==    by 0x402048: qsort_internal_parallel (quicksort.c:152)
> > ==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
> > ==31341==    by 0x40279F: future_get (threadpool.c:112)
> > ==31341==    by 0x402048: qsort_internal_parallel (quicksort.c:152)
> > ==31341==    by 0x40279F: future_get (threadpool.c:112)
> > ==31341==    by 0x402048: qsort_internal_parallel (quicksort.c:152)
> > ==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
> > ==31341== Address 0x5990880 is 128 bytes inside a block of size 152 alloc'd
> > ==31341==    at 0x4C2CD95: calloc (vg_replace_malloc.c:711)
> > ==31341==    by 0x4026A1: thread_pool_submit (threadpool.c:84)
> > ==31341==    by 0x402012: qsort_internal_parallel (quicksort.c:142)
> > ==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
> > ==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
> > ==31341==    by 0x40279F: future_get (threadpool.c:112)
> > ==31341==    by 0x402048: qsort_internal_parallel (quicksort.c:152)
> > ==31341==    by 0x402040: qsort_internal_parallel (quicksort.c:151)
> > ==31341==    by 0x402450: thread_work (threadpool.c:233)
> > ==31341==    by 0x4C3083E: mythread_wrapper (hg_intercepts.c:389)
> > ==31341==    by 0x4E42DC4: start_thread (in /usr/lib64/libpthread-2.17.so)
> > ==31341==    by 0x5355CEC: clone (in /usr/lib64/libc-2.17.so)
> > ==31341== Block was alloc'd by thread #3
> > ==31341==
> > ==31341== ----------------------------------------------------------------