On Wed, Oct 12, 2011 at 6:06 PM, John McCall <[email protected]> wrote: > On Oct 12, 2011, at 2:54 PM, Andrew MacLeod wrote: >> On 10/12/2011 05:04 PM, John McCall wrote: >>>> The compiler built-ins can always be called directly with arbitrary sized >>>> objects, so it would be good for the generic routine to handle it rather >>>> than artificially restrict it. >>> Are you saying that the generic routine promises co-operation with >>> lock-free atomics when the size parameter is sufficiently small? That >>> seems unfortunate. The generic routine is certainly going to be slower >>> than lock-free atomics, but that doesn't mean its performance is >>> unimportant; a good implementation using striped spin-locks would probably >>> end up on the order of only 2-4 times slower than the lock-free code, so >>> adding a bunch of pre-checks may be quite significant. >>> >> The rule is that on a given target, a given object size is either A) >> lockfree or B) unknown. >> >> so if the compiler always generates lock free instructions for say 8 byte >> values, the library will have to as well. ie, __atomic_load_8() will also >> be implemented as lock free. it has to be or things break. The >> interface from the compiler point of view is that you simply call the >> routine __atomic_load(obj), the presence of __atomic_load_8 is transparent >> to the caller. Normally the compiler will turn __atomic_load() into the >> direct call, but I don't see any reason to force the compiler do that. It >> should be free to leave the original call to __atomic_load () and that code >> should co-exist with stuff that did call __atomic_load_8. its free to do >> whatever it wants with sizes that don't exactly match the 5 'optimal' sizes. >> >> It does mean that the library entry point for atomic_load() may start with a >> switch on the size of the object to jump to the "efficient" routines if >> size matches, but thats about it. Is that what you are concerned about? > > Yes. The library entry point for atomic_load is an implementation detail of > the ABI that we have total control of, so why make stronger promises than > necessary? Why not force the compiler to directly call (or inline, if it > knows it's lock-free) __atomic_load_8? > >>>>> Honestly, I don't think future-proofing against arbitrary new atomic >>>>> instructions really makes any sense. Even going up to 16 bytes (on >>>>> architectures where that can't be done lock-free now) worries me a bit. >>>>> >>>>> Rounding up also worries me, since the user has no control over the >>>>> padding bytes, but they can still cause spurious failures on, say, >>>>> compare-and-swap. >>>> If the padding is under control of the 'atomic' keyword for the type, then >>>> we have complete control over those padding bytes. Regardless of what junk >>>> might be in them, they are part of the atomic data structure and the only >>>> way to access them is through a full atomic access. Its like adding >>>> another user field to the structure and not setting it. It shouldn't cause >>>> spurious failures. >>> Well, we at least have to make sure that our atomic operations always >>> zero-pad their operands. For example, if we do an atomic store into a >>> 5-byte struct that we've padded to 8 bytes, we have to make sure we store a >>> zero pattern into the pad. That's feasible, but it's complexity that we >>> should at least acknowledge before committing to it. >>> >>> I can also see this exposing lots of what are, admittedly, source bugs, >>> like only zero'ing the first sizeof(T) bytes of an _Atomic(T). >> >> I'm still not sure I see how it matters. Nothing should ever access or >> change that padding field in an unfriendly way since it is always part of >> the atomic word that is load/stored, exchanged or compare_exchanged. Those >> are the only 4 ways to access the memory. All the fetch_op's only operate >> on full integral values, so thats not a concern. Compare_exchange is the >> only one which it could be affected, and it requires that the 'expected' >> value be from an atomic load or exchange… > > I wasn't aware of that. That's really a very strange constraint — why is it > illegal to independently assemble a value that I expect might be in an atomic > variable?
The standard doesn't make any guarantees about the layout of std::atomic<T>; how exactly could you manipulate the bits of an std::atomic<T> in a cross-platform manner? -Eli _______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
