On Thu, 2012-04-12 at 13:36 +0200, Andi Kleen wrote: > On Thu, Apr 12, 2012 at 12:38:44AM +0200, Torvald Riegel wrote: > > On Wed, 2012-04-11 at 15:06 +0200, Andi Kleen wrote: > > > > Tests passing, bootstrap in progress. > > > > > > > > Comments? > > > > > > Do you really imply ACQUIRE/RELEASE with HLE_ACQUIRE/RELEASE now? I don't > > > see that in the code. I think that's really required, otherwise the > > > optimizer > > > will do the wrong thing and move memory references outside the region. > > > > Perhaps HLE_ACQUIRE / HLE_RELEASE should be something like HLE_START / > > HLE_END instead? Not particularly great names, but at least it avoids > > overloading ACQUIRE/RELEASE and thus should make it clearer that you > > still need to specify a memory order. > > It still seems wrong to me. HLE is an x86 construct, so weaker > memory orders on the compiler level than what the instruction implements > does not really make sense to me. And the instruction just has LOCK > semantics.
What if another vendor shows up, perhaps on another architecture? > > Currently it's highly error prone -- on the Russel hard to misuse scale not > higher > than 1 as is [1] It would be a three, if the patch would contain documentation of the additional bit. I guess that's a hint :) It could even be a four, depending on the point of view. Not from the POV of the Intel HLE feature. But from a conceptual POV, HLE is pretty independent of memory orders. > > At the minimum it would need a warning with RELAXED as suggested > by Jakub earlier. That could be helpful. However, I'm not 100% sure that HLE is only useful paired with acquire/release memory orders in general (ie, possibly on other archs). For example, if you only care about having acq_rel atomics being protected by your possibly-elided lock, then they won't get moved out of the critical section (unless I read the C++11 memory model too conservatively). Therefore, given that I don't see the atomic builtins being used by lots of programmers, I'd rather make them more general. > > I agree with Jakub that users really should specify memory order bits, > > if they want ordering. Andi, I also see your point regarding catching > > bugs, but this is really expert stuff, and my hope is that we can make > > HLE really transparent or at least provide better abstractions around it > > At least this form of HLE cannot be transparent, it has to be annotated by the > programmer. Let me elaborate. The point I was trying to make is that it should be transparent for the non-concurrency-expert programmer. Or at least make this specific detail we're discussing here transparent (ie, whether HLE_ACQUIRE should imply a certain memory order). That is, if non-experts only see a default lock implementation, or see a lock-implementation with a use-HLE-here-if-possible flag, then they don't have to deal with memory orders anyway.