https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83853

--- Comment #2 from rene.r...@fu-berlin.de ---
Hi, 
sorry I submitted accidentally before writing the text.

I am investigating some use cases for condition_variables using c++11 thread
support library.
In my use case I have the following setup.
2 Threads are synchronised via a concurrent queue and a condition variable.
Thread 2 publishes work in the concurrent queue. 
Thread 2 is also owner of the condition variable and publishes this to Thread 1
as well.
Thread 1, will wait until new work is available using the queue as
synchronisation mechanism. If so, it takes the work and if done signals Thread
2 via the condition variable that the work is finished.

Thread 2 accesses the wait within a unique lock and uses a shared bool to
protect against spurious wake ups. 
Thread 1 will acquire a lock_guard and update the bool within the lock.
After releasing the lock it will call notify_all() (note, it could also be
notify_one) to signal that Thread 2 can continue working. 

The condition variable is in local scope of a while loop in Thread 2.
And before leaving the scope it has to wait for the condition variable.

So that means these 2 Threads (under the assumption that every thing else does
work, which is in this case not trivial) should play ping pong in the sense
that:
Thread 2 (T2) publishes work in the queue, T1 waits until a new job is
available. 
T1 processes the job and signals T2.
T2 is waiting for T1 to finish.
T2 goes into another round in the while loop destroying the local condition
variable and initialising a new one (possibly in the same memory location)
before submitting it to the queue. 

This goes round and round until all work was processed.
But, I experienced on a mac that this setup will cause a dead lock (tried gcc
and clang)
I did not see this happen on a unix machine.
But compiling with -fsantize=thread will also issue a data race warning on
unix.

Here is the error:

WARNING: ThreadSanitizer: data race (pid=18098)
  Write of size 8 at 0x7ffa298bda68 by thread T2:
    #0 pthread_cond_destroy
../../.././libsanitizer/tsan/tsan_interceptors.cc:1105
(libtsan.so.0+0x000000028fee)
    #1 Job::~Job()
/home/mi/rmaerker/tmp/d0aee45e0ac53d1de5fc5e1836989914/condy.cpp:110
(a.out+0x00000040225e)
    #2 publisher(unsigned int, std::vector<std::vector<unsigned long,
std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long,
std::allocator<unsigned long> > > >&)
/home/mi/rmaerker/tmp/d0aee45e0ac53d1de5fc5e1836989914/condy.cpp:213
(a.out+0x00000040225e)
    #3 void std::__invoke_impl<void, void (*)(unsigned int,
std::vector<std::vector<unsigned long, std::allocator<unsigned long> >,
std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > >
>&), unsigned int, std::vector<std::vector<unsigned long,
std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long,
std::allocator<unsigned long> > > >&>(std::__invoke_other, void (*&&)(unsigned
int, std::vector<std::vector<unsigned long, std::allocator<unsigned long> >,
std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > >
>&), unsigned int&&, std::vector<std::vector<unsigned long,
std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long,
std::allocator<unsigned long> > > >&)
/import/GCC/7.1.0/include/c++/7.1.0/bits/invoke.h:60 (a.out+0x0000004035a3)
    #4 std::__invoke_result<void (*)(unsigned int,
std::vector<std::vector<unsigned long, std::allocator<unsigned long> >,
std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > >
>&), unsigned int, std::vector<std::vector<unsigned long,
std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long,
std::allocator<unsigned long> > > >&>::type std::__invoke<void (*)(unsigned
int, std::vector<std::vector<unsigned long, std::allocator<unsigned long> >,
std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > >
>&), unsigned int, std::vector<std::vector<unsigned long,
std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long,
std::allocator<unsigned long> > > >&>(void (*&&)(unsigned int,
std::vector<std::vector<unsigned long, std::allocator<unsigned long> >,
std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > >
>&), unsigned int&&, std::vector<std::vector<unsigned long,
std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long,
std::allocator<unsigned long> > > >&)
/import/GCC/7.1.0/include/c++/7.1.0/bits/invoke.h:95 (a.out+0x0000004035a3)
    #5 decltype (__invoke((_S_declval<0ul>)(), (_S_declval<1ul>)(),
(_S_declval<2ul>)())) std::thread::_Invoker<std::tuple<void (*)(unsigned int,
std::vector<std::vector<unsigned long, std::allocator<unsigned long> >,
std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > >
>&), unsigned int, std::vector<std::vector<unsigned long,
std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long,
std::allocator<unsigned long> > > >&> >::_M_invoke<0ul, 1ul,
2ul>(std::_Index_tuple<0ul, 1ul, 2ul>)
/import/GCC/7.1.0/include/c++/7.1.0/thread:234 (a.out+0x0000004035a3)
    #6 std::thread::_Invoker<std::tuple<void (*)(unsigned int,
std::vector<std::vector<unsigned long, std::allocator<unsigned long> >,
std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > >
>&), unsigned int, std::vector<std::vector<unsigned long,
std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long,
std::allocator<unsigned long> > > >&> >::operator()()
/import/GCC/7.1.0/include/c++/7.1.0/thread:243 (a.out+0x0000004035a3)
    #7 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void
(*)(unsigned int, std::vector<std::vector<unsigned long,
std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long,
std::allocator<unsigned long> > > >&), unsigned int,
std::vector<std::vector<unsigned long, std::allocator<unsigned long> >,
std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > > >&>
> >::_M_run() /import/GCC/7.1.0/include/c++/7.1.0/thread:186
(a.out+0x0000004035a3)
    #8 execute_native_thread_routine
../../../.././libstdc++-v3/src/c++11/thread.cc:83
(libstdc++.so.6+0x0000000b9bce)

  Previous read of size 8 at 0x7ffa298bda68 by thread T1:
    #0 pthread_cond_signal
../../.././libsanitizer/tsan/tsan_interceptors.cc:1091
(libtsan.so.0+0x000000028def)
    #1 __gthread_cond_signal
/srv/public/gcc-7.1.0/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:858
(libstdc++.so.6+0x0000000b3f28)
    #2 std::condition_variable::notify_one()
../../../.././libstdc++-v3/src/c++11/condition_variable.cc:62
(libstdc++.so.6+0x0000000b3f28)
    #3 void std::__invoke_impl<void, void (*)()>(std::__invoke_other, void
(*&&)()) /import/GCC/7.1.0/include/c++/7.1.0/bits/invoke.h:60
(a.out+0x0000004035d7)
    #4 std::__invoke_result<void (*)()>::type std::__invoke<void (*)()>(void
(*&&)()) /import/GCC/7.1.0/include/c++/7.1.0/bits/invoke.h:95
(a.out+0x0000004035d7)
    #5 decltype (__invoke((_S_declval<0ul>)()))
std::thread::_Invoker<std::tuple<void (*)()>
>::_M_invoke<0ul>(std::_Index_tuple<0ul>)
/import/GCC/7.1.0/include/c++/7.1.0/thread:234 (a.out+0x0000004035d7)
    #6 std::thread::_Invoker<std::tuple<void (*)()> >::operator()()
/import/GCC/7.1.0/include/c++/7.1.0/thread:243 (a.out+0x0000004035d7)
    #7 std::thread::_State_impl<std::thread::_Invoker<std::tuple<void (*)()> >
>::_M_run() /import/GCC/7.1.0/include/c++/7.1.0/thread:186
(a.out+0x0000004035d7)
    #8 execute_native_thread_routine
../../../.././libstdc++-v3/src/c++11/thread.cc:83
(libstdc++.so.6+0x0000000b9bce)

  Location is stack of thread T2.

  Thread T2 (tid=18101, running) created by main thread at:
    #0 pthread_create ../../.././libsanitizer/tsan/tsan_interceptors.cc:900
(libtsan.so.0+0x000000028900)
    #1 __gthread_create
/srv/public/gcc-7.1.0/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:662
(libstdc++.so.6+0x0000000b9e84)
    #2 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State,
std::default_delete<std::thread::_State> >, void (*)())
../../../.././libstdc++-v3/src/c++11/thread.cc:163
(libstdc++.so.6+0x0000000b9e84)
    #3 __libc_start_main <null> (libc.so.6+0x0000000202b0)

  Thread T1 (tid=18100, running) created by main thread at:
    #0 pthread_create ../../.././libsanitizer/tsan/tsan_interceptors.cc:900
(libtsan.so.0+0x000000028900)
    #1 __gthread_create
/srv/public/gcc-7.1.0/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:662
(libstdc++.so.6+0x0000000b9e84)
    #2 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State,
std::default_delete<std::thread::_State> >, void (*)())
../../../.././libstdc++-v3/src/c++11/thread.cc:163
(libstdc++.so.6+0x0000000b9e84)
    #3 main
/home/mi/rmaerker/tmp/d0aee45e0ac53d1de5fc5e1836989914/condy.cpp:278
(a.out+0x0000004029b1)

SUMMARY: ThreadSanitizer: data race
/home/mi/rmaerker/tmp/d0aee45e0ac53d1de5fc5e1836989914/condy.cpp:110 in
Job::~Job()

It basically says, that while T2 is currently destroying the condition
variable, T1 is still accessing it by calling notify on it???

Note that if the condition variable is put before the while loop and reset in
every iteration, it performs without warnings on unix, but still dead locks on
mac.
If I put the notify of T1 inside of the lock_guard it will also run without
data race warnings and does not dead lock on mac.
If I put the condition variable as a global instance it does work on all
platforms. 

So there is a little error that might happen in the following situation.
After T1 updates the shared bool, it leaves the lock. 
So here it could potentially happen, that T1 is preempted and T2 has a spurious
wakeup but now acquiring the lock, as the predicate returns true now. In this
case T2 can continue in the loop and destroy the lock, and T1 will call notify
on it while T2 already destroys it. 
Having said that, this seems unlikely to happen so often, does it. I mean I am
not very confident about how often something like a spurious wake really
happens, especially as between wait and notify there are only a few cycles. 

I have prepared a gist with the sample code.
Here are the instructions to reproduce and the platform/compiler versions used:

$> git clone https://gist.github.com/d0aee45e0ac53d1de5fc5e1836989914.git
$> cd d0aee45e0ac53d1de5fc5e1836989914
$> g++ -Wall -O1 -g -pedantic -fsanitize=thread condy.cpp -lpthread
$> .a/out

expected outcome:

```
Start listening
Job created [0x7ffa298bda40] <false>
< notified 0 [false]
> notify 0
< notified 0 [true]
= finished 0
Job destroyed [0x7ffa298bda40] <true>
Job created [0x7ffa298bda40] <false>
< notified 0 [false]
> notify 0
< notified 0 [true]
= finished 0
Job destroyed [0x7ffa298bda40] <true>
Job created [0x7ffa298bda40] <false>
< notified 0 [false]
> notify 0
< notified 0 [true]
= finished 0
Job destroyed [0x7ffa298bda40] <true>
....
Job destroyed [0x7ffa298bda40] <true>
Job created [0x7ffa298bda40] <false>
< notified 0 [false]
> notify 0
< notified 0 [true]
= finished 0
Job destroyed [0x7ffa298bda40] <true>
Job created [0x7ffa298bda40] <false>
Job destroyed [0x7ffa298bda40] <false>
stopped 0 queue.empty() true
Stop listening
Par sum = 27644338
Seq sum = 27644338
```

You can disable the debug output by commenting out the define DEBUG_OUTPUT
macro at the beginning of the file.
There are three more macros at the beginning for the different cases:

A) defining a local job (local condition variable)
B) defining a local job that does not issue the ThreadSanitizer warning (thread
local condition variable but global to the while loop)
C) defining a global job.

gcc-version:
g++ (GCC) 7.1.0

linked libraries:
linux-vdso.so.1 (0x00007ffe515da000)
libtsan.so.0 => /import/GCC//7.1.0/lib64/libtsan.so.0 (0x00007f552e94c000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f552e72f000)
libstdc++.so.6 => /import/GCC//7.1.0/lib64/libstdc++.so.6 (0x00007f552e3ad000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f552e0a9000)
libgcc_s.so.1 => /import/GCC//7.1.0/lib64/libgcc_s.so.1 (0x00007f552de93000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f552daf4000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f552d8f0000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f552d6e8000)
/lib64/ld-linux-x86-64.so.2 (0x00007f552fa03000)

linux:
Debian 4.9.65-3+deb9u2 (2018-01-04) x86_64 GNU/Linux

On mac:

macOS Sierra 10.12.6 (16G1114)
Darwin Kernel Version 16.7.0

Same results with all clang compilers clang++-mp-3.5 until clang++-mp-5.0 and
all compilers g++-mp-4.9 until g++-mp-8. The program will dead lock if the
condition is created in local scope of the thread.

I would really like to know, it is in general forbidden, that the waiting
thread owns the condition variable as his stack memory might get inaccessible
when it is put into wait. I couldn't find any information regarding this.
Or is there a general user error on my site, which I might not see (because it
is not a trivial case). I am truly sorry if that would be the case.
Or is it possible that there is kind of a bug in the condition variable
implementation, such that the signalling is not broadcasted correctly?

Thank you very much for your help.

Reply via email to