Hi,

On 2023-11-10 21:49:06 +0000, John Morris wrote:
> >I wonder if it's worth providing a set of "locked read" functions.
>
> Most out-of-order machines include “read acquire” and “write release” which
> are pretty close to what you’re suggesting.

Is that really true? It's IA64 lingo. X86 doesn't have them, and while ARM has
more granular barriers, they don't neatly map onto acquire/release either.

I don't think an acquire here would actually be equivalent to making this a
full barrier - an acquire barrier allows reads or stores from *before* the
barrier to be moved after the barrier. It just prevents the opposite.

And for proper use of acquire/release semantics we'd need to pair operations
much more closely. Right now we often rely on another preceding memory barrier
to ensure correct ordering; having to use paired operations everywhere would
lead to slower code.

I thoroughly dislike how strongly C++11/C11 prefer paired atomics *on the same
address* over "global" fences. It often leads to substantially slower code. And
they don't at all map neatly onto hardware, where barrier semantics are largely
*not* tied to individual addresses. And the fence specification is just about
unreadable (although I think they did fix some of the worst issues).

Greetings,

Andres Freund
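
For reference, a minimal sketch of the two styles contrasted above, written
against plain C11 <stdatomic.h> and pthreads rather than PostgreSQL's own
barrier primitives. All names here (payload, ready, writer_paired,
reader_paired, writer_fenced, reader_fenced, run) are hypothetical and exist
only for illustration. The first pair attaches the ordering to the accesses of
one address; the second pair uses relaxed accesses plus standalone fences.

```c
#include <pthread.h>
#include <stdatomic.h>

static int payload;        /* plain data published by the writer (hypothetical) */
static atomic_int ready;   /* flag the reader polls (hypothetical) */

/* Style 1: ordering is attached to the operations on "ready" itself. */
static void *writer_paired(void *arg)
{
    (void) arg;
    payload = 42;
    /* Release store: earlier accesses may not be moved after it. */
    atomic_store_explicit(&ready, 1, memory_order_release);
    return NULL;
}

static int reader_paired(void)
{
    /* Acquire load: later accesses may not be moved before it.
     * Accesses from before the load may still be moved after it. */
    while (atomic_load_explicit(&ready, memory_order_acquire) == 0)
        ;
    return payload;
}

/* Style 2: relaxed accesses to "ready", ordering from standalone fences. */
static void *writer_fenced(void *arg)
{
    (void) arg;
    payload = 43;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&ready, 1, memory_order_relaxed);
    return NULL;
}

static int reader_fenced(void)
{
    while (atomic_load_explicit(&ready, memory_order_relaxed) == 0)
        ;
    atomic_thread_fence(memory_order_acquire);
    return payload;
}

/* Runs one writer thread against one reader and returns what the reader saw. */
static int run(void *(*writer)(void *), int (*reader)(void))
{
    pthread_t t;
    int seen;

    atomic_store(&ready, 0);
    pthread_create(&t, NULL, writer, NULL);
    seen = reader();
    pthread_join(t, NULL);
    return seen;
}
```

Calling run(writer_paired, reader_paired) and run(writer_fenced, reader_fenced)
exercises both variants; each reader should observe the value its writer
published. Which variant is cheaper depends on the target architecture and the
compiler, so this is only meant to show the shape of the two approaches.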