On Wed, Jan 21, 2015 at 12:23:34PM +0400, Dmitry Vyukov wrote:
> Hi Mike,
>
> Yes, I can quantify the cost. It is very high.
>
> Here is the patch that I used:
>
> --- rtl/tsan_rtl.cc (revision 226644)
> +++ rtl/tsan_rtl.cc (working copy)
> @@ -709,7 +709,11 @@
>  ALWAYS_INLINE USED
>  void MemoryAccess(ThreadState *thr, uptr pc, uptr addr,
>      int kAccessSizeLog, bool kAccessIsWrite, bool kIsAtomic) {
>    u64 *shadow_mem = (u64*)MemToShadow(addr);
> +
> +  atomic_fetch_add((atomic_uint64_t*)shadow_mem, 0, memory_order_acq_rel);

And the cost of adding that atomic_fetch_add guarded by
if (__builtin_expect (someCondition, 0)) ?
If that doesn't slow down the non-deterministic default case too much,
that would allow users to choose what they prefer - much faster unreliable
and slower deterministic.  Then for the gcc testsuite we could opt for the
latter.

> +
>
> On the standard tsan benchmark that does 8-byte writes:
> before:
> [ OK ] DISABLED_BENCH.Mop8Write (1161 ms)
> after:
> [ OK ] DISABLED_BENCH.Mop8Write (5085 ms)
>
> So that's 338% slowdown.

	Jakub
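
For concreteness, here is a rough sketch of the guarded variant asked about above. It is only an illustration, not the tsan runtime code: deterministic_mode, fence_count and shadow_access_sketch are hypothetical names, the shadow cell is modelled as a std::atomic rather than a cast u64 pointer, and how someCondition would actually be set (a runtime flag, for instance) is left open.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical runtime switch standing in for "someCondition";
// this is not an existing tsan flag.
static bool deterministic_mode = false;

// Counts how often the slow path ran, only so the sketch can be checked.
static unsigned long fence_count = 0;

// Stand-in for the start of MemoryAccess(); the real function does far more.
inline void shadow_access_sketch(std::atomic<std::uint64_t> *shadow_mem) {
  // __builtin_expect (a GCC/Clang builtin) marks the guarded branch as
  // unlikely, so the default path pays for a flag load and a branch only.
  if (__builtin_expect(deterministic_mode, 0)) {
    // Same no-op read-modify-write as in the quoted patch: it adds 0, so the
    // value is unchanged, but it still has acquire/release ordering.
    shadow_mem->fetch_add(0, std::memory_order_acq_rel);
    ++fence_count;
  }
}
```

Whether the extra load and branch are cheap enough in the default case is exactly the open question; it would have to be measured with the same Mop8Write benchmark.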