On Tue, Sep 13, 2011 at 1:38 AM, John McCall <[email protected]> wrote:
> On Sep 13, 2011, at 1:20 AM, Eli Friedman wrote:
>> On Mon, Sep 12, 2011 at 11:16 PM, John McCall <[email protected]> wrote:
>>> On Sep 12, 2011, at 10:07 PM, Eli Friedman wrote:
>>>> On Mon, Sep 12, 2011 at 8:34 PM, John McCall <[email protected]> wrote:
>>>>> +/// Return the maximum size that permits atomic accesses for the given
>>>>> +/// architecture.
>>>>> +static CharUnits getMaxAtomicAccessSize(CodeGenModule &CGM,
>>>>> +                                        llvm::Triple::ArchType arch) {
>>>>> +  // ARM has 8-byte atomic accesses, but it's not clear whether we
>>>>> +  // want to rely on them here.
>>>>
>>>> I don't see any problem with relying on them... but the compiler
>>>> probably wouldn't end up using them very often. Nothing is normally
>>>> aligned to 8 bytes on ARM (well, at least not on iOS).
>>>
>>> Er. Clang, at least, seems to evaluate __alignof(double) as 8, and
>>> vector types are probably the same.
>>
>> IIRC, that isn't the ABI alignment.
>
> Ah yes, you're right. So unless we started supporting
> __attribute__((aligned)) on ivars, it wouldn't matter.
>
>>>> I think you're missing a check on the size here: we do not currently
>>>> support i24 atomic stores, and I do not intend to implement such
>>>> support.
>>>
>>> Good catch! Anything subtle besides "power of two"? Are there
>>> architectures that don't guarantee atomicity of stores *smaller* than
>>> pointer size?
>>
>> I don't know of any architecture that supports atomic stores of
>> pointer size, and has smaller stores which are not atomic. There are
>> architectures which do not have any stores smaller than a pointer...
>> LLVM does not support any such architecture, though, and I doubt
>> anyone would try to use Objective-C on such an architecture.
>
> Okay, I wasn't sure whether any of LLVM's less-personally-relevant-to-me
> architectures lacked small loads and stores.
>
>>>> I want to double-check that Unordered is really what you want... if a
>>>> value starts off at 0, one thread sets it to 1, and another thread
>>>> calls the getter twice, is it legal if the first getter call returns
>>>> 1, and the second 0?
>>>
>>> Probably not; I wasn't considering that level of causality breakage.
>>> I'll talk it over with the runtime folks; we probably want this to be
>>> release/acquire or sequentially consistent or something.
>>
>> SequentiallyConsistent will get you the same behavior as an
>> implementation with locks, and is the only way to completely avoid
>> causality violations with atomic properties. That would be the most
>> conservative choice, but I can't really say whether that's the right
>> choice without knowing how people write multi-threaded Objective-C
>> code.
>
> The current behavior is actually to emit naked loads and stores when
> the data is small enough, so I suspect there's a lot of accidental
> reliance on external synchronization going on here. Well, that and
> relatively strong consistency guarantees like x86's.
>
> Is this worth thinking about? Specifically, will full SC compile any
> differently from acquiring loads and releasing stores on any of the
> Apple architectures?

On x86, SC stores use xchg instructions rather than movs, which makes
them slower. On ARM, I believe SC requires an extra DMB after stores,
compared to acq/rel operations:
http://www.decadent.org.uk/pipermail/cpp-threads/2008-December/001953.html

Jeffrey
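
For illustration, the two orderings being compared can be sketched with C++11
atomics. This is only a sketch under stated assumptions, not the code from the
patch: AtomicIntProperty and reader_saw_value_go_backwards are hypothetical
names, and the property is modelled as a plain int. Note that C++ has no
equivalent of LLVM's Unordered; the weakest C++ ordering,
memory_order_relaxed, corresponds to LLVM's Monotonic and already forbids the
"1, then 0" outcome for a single variable, so the sketch can only contrast
acquire/release with seq_cst.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for an object with one atomic int property.
struct AtomicIntProperty {
    std::atomic<int> value{0};

    // Sequentially consistent accessors. Per the reply above, the store is
    // the expensive half: an xchg on x86, and an extra DMB after it on ARM.
    void set_seq_cst(int v) { value.store(v, std::memory_order_seq_cst); }
    int get_seq_cst() const { return value.load(std::memory_order_seq_cst); }

    // Release/acquire accessors. On x86 these are expected to be plain movs.
    void set_release(int v) { value.store(v, std::memory_order_release); }
    int get_acquire() const { return value.load(std::memory_order_acquire); }
};

// The scenario from the thread: the value starts at 0, one thread sets it
// to 1, and another thread calls the getter twice. Returns true only if the
// reader observed 1 first and 0 second.
bool reader_saw_value_go_backwards() {
    AtomicIntProperty p;
    int first = 0;
    int second = 0;

    std::thread writer([&] { p.set_release(1); });
    std::thread reader([&] {
        first = p.get_acquire();
        second = p.get_acquire();
    });
    writer.join();
    reader.join();

    return first == 1 && second == 0;
}

// Single-threaded sanity check: a value stored with either setter is
// returned by either getter.
void sanity_check() {
    AtomicIntProperty p;
    p.set_release(7);
    assert(p.get_seq_cst() == 7);
    p.set_seq_cst(9);
    assert(p.get_acquire() == 9);
}
```

Because C++ atomics guarantee a single modification order per variable,
reader_saw_value_go_backwards() should never return true with either pair of
accessors. The orderings differ in cost and in what they guarantee across
several variables, which is where SequentiallyConsistent matches the behavior
of a lock-based implementation.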
