On Jan 23, 2013, at 9:55 AM CST, Thiago Macieira wrote:

> On Wednesday, January 23, 2013 09.22.49, Dave Goodell wrote:
>> 
>> If you don't want to write inline assembly, this might be your best bet. 
>> But on TSO systems like x86, you only need a "compiler barrier".  In GCC
>> inline assembly syntax, this looks like:
>> 
>> __asm__ __volatile__  ( "" ::: "memory" )
>> 
>> This prevents GCC (and compilers that correctly support this syntax) from
>> reordering accesses across this statement or making assumptions about the
>> state of memory across this point.
> 
> Indeed, but it also prevents proper merging of stores and loads. For example, 
> if x is an atomic variable, the compiler can and should merge these two 
> stores:
> 
>       x = 1;
>       x = 2;
> 
> It's a trade-off until we have the proper implementation with std::atomic.

Agreed.  My comments refer to the current common state of the world, not a 
C11/C++11 world.  That's all too new for me to be able to rely on its 
availability yet.  I am eagerly awaiting the day that inline assembly and 
atomic libraries can be ditched altogether.

>> The only case you need to worry about on x86 (assuming you are not fiddling
>> with "special" memory) is that earlier stores could be reordered after
>> subsequent loads.  That should be the only time you need the real "mfence"
>> instructions that you have below.  "load-acquire" and "store-release" don't
>> fall into that category.
> 
> Which is why the qatomic_gcc.h implementation is a last resort for Qt. Since 
> it does a full memory barrier including mfence, it's way too expensive.
> 
> Note that x86 does have sfence and lfence instructions too. I should go ask 
> some colleagues at Intel about when they should be used, because I haven't 
> yet found a case where they are needed.

It does, but AIUI they are only needed for write-combining (WC) memory regions, 
such as memory mapped video card frame buffers.
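
To make the store-then-load case concrete, here is a minimal sketch of the one 
pattern on ordinary x86 memory where a real fence is needed.  The variable and 
function names are hypothetical, and it assumes a GCC-compatible compiler that 
provides the __sync_synchronize() builtin (which is expected to become mfence 
or an equivalent locked instruction on x86):

```cpp
// Hypothetical names, for illustration only.
// Assumes a GCC-compatible compiler (for __sync_synchronize).
static volatile int flag_a = 0;
static volatile int flag_b = 0;

// One side of a Dekker-style handshake: publish our flag, then look at
// the other side's flag.
static inline int announce_a_then_check_b()
{
    flag_a = 1;            // earlier store
    __sync_synchronize();  // full barrier: orders the store before the load
    return flag_b;         // later load
}
```

Without the full barrier, x86 is allowed to satisfy the load of flag_b before 
the store to flag_a becomes visible to other processors, so both sides of such 
a handshake could read 0.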

> Here's the listing of the current implementations of atomics in Qt and their 
> drawbacks, in the order that they are preferred:
> 
> 0) MSVC: applies only to MSVC, using intrinsics
> 
> 1) Arch-specific implementation: uses assembly for the fetchAndAdd, 
> testAndSet and fetchAndStore operations, but uses direct volatile 
> loads/stores for loadAcquire and storeRelease, and non-volatile loads/stores 
> for load/store, which are subject to reordering by the compiler. This is the 
> default on x86 with GCC, Clang and ICC.

Plain volatile loads/stores for loadAcquire and storeRelease will probably 
only work reliably on Itanium machines, where compilers seem to emit 
acquire/release suffixes on volatile load/store operations.  You should 
probably update this implementation to include compiler barriers.
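
As a rough sketch of what I mean (hypothetical helper names, not Qt's actual 
code; assumes GCC-compatible inline assembly and an x86/TSO target, where no 
hardware fence is needed for these two operations):

```cpp
// Hypothetical helpers, for illustration only.
// Assumes GCC-compatible inline assembly and an x86 (TSO) target.
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

// Load-acquire: do the load, then stop the compiler from hoisting later
// memory accesses above it.
static inline int load_acquire(const volatile int *p)
{
    int v = *p;
    COMPILER_BARRIER();
    return v;
}

// Store-release: stop the compiler from sinking earlier memory accesses
// below the store, then do the store.
static inline void store_release(volatile int *p, int v)
{
    COMPILER_BARRIER();
    *p = v;
}
```

On weaker-ordered architectures the compiler barrier alone is not enough and a 
hardware barrier would also be required.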

> 2) std::atomic implementation, if std::atomic is supported. This appeared on 
> GCC in 4.6, but was implemented using the old intrinsics with full barrier. 
> In GCC 4.7, it uses the new intrinsics that support more relaxed barriers. 
> This works fine for x86, but the implementation is incredibly sub-optimal on 
> other architectures (uses locks on ARM).
> 
> 3) GCC old intrinsics implementation, the one with full barriers.
> 
> From GCC's plans, the implementation in 4.8 will solve the implementation 
> issues on other architectures, so we may start using that. That means Qt 5.2.

Makes sense.
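
For reference, a minimal sketch of how the operations named above might map 
onto std::atomic with explicit memory orders once a C++11 compiler can be 
relied upon.  The struct and its method names are hypothetical and only 
loosely follow the operation names in your list; this is not Qt's actual 
class:

```cpp
#include <atomic>

// Hypothetical wrapper, for illustration only; not Qt's implementation.
struct AtomicIntSketch {
    std::atomic<int> value{0};

    int loadAcquire() const { return value.load(std::memory_order_acquire); }
    void storeRelease(int v) { value.store(v, std::memory_order_release); }

    // Adds delta and returns the value held before the addition.
    int fetchAndAddRelaxed(int delta)
    {
        return value.fetch_add(delta, std::memory_order_relaxed);
    }

    // Compare-and-swap: succeeds only if the current value equals expected.
    bool testAndSetOrdered(int expected, int desired)
    {
        return value.compare_exchange_strong(expected, desired,
                                             std::memory_order_seq_cst);
    }
};
```

The point of spelling out the memory order per operation is that the compiler 
can pick the cheapest instruction sequence for each architecture instead of 
always paying for a full barrier.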

> In any of the implementations, all the code is inlined, so there's nothing 
> for valgrind to react to. The best we could do is insert some of valgrind's 
> no-op hints, on debug or special builds. Can you recommend what hints to add?

Julian/Bart/etc. may have more to add here.  I remember having trouble with 
annotating load-acquire/store-release in the past.  Here's the (only partially 
helpful) thread on the topic: 

> May I also suggest an out-of-line marker strategy, similar to what the Linux 
> kernel does for catching processor faults, and the ABI does for exceptions? 
> If we had this, we'd leave the markers enabled even in release builds.

I'm not sure I'm familiar with this technique.  Do you have a link to any 
reading (even code is fine) that I could do on the subject?

-Dave


_______________________________________________
Valgrind-users mailing list
Valgrind-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/valgrind-users
