On 6/6/12 8:19 PM, Alex Rønne Petersen wrote:
> On 07-06-2012 03:11, Andrei Alexandrescu wrote:
>> On 6/6/12 6:01 PM, Alex Rønne Petersen wrote:
>>> (At this point, I probably don't need to point out how x86-biased and
>>> unportable shared is.....)

>> I confess I'll need that spelled out. How is shared biased towards x86
>> and nonportable?
>>
>> Thanks,
>>
>> Andrei

> The issue lies in its assumption that the architecture being targeted
> supports atomic operations and/or memory barriers at all. Some
> architectures simply don't support them; others do, but not for certain
> data sizes (64-bit integers, for example). x86 is probably the
> architecture with the best support for low-level memory control as far
> as atomicity and memory barriers go.

Actually x86 is one of the more forgiving architectures (most code works even when written without barriers). Indeed we assume the target architecture supports double-word atomic load.
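
To make that concrete, here's a minimal sketch in D using druntime's
core.atomic, spelling out explicitly what a shared 64-bit read/write is
expected to guarantee implicitly. On 32-bit x86 the load typically compiles
down to something like LOCK CMPXCHG8B or an 8-byte SSE/x87 load; that's the
double-word assumption I mean:

import core.atomic;

shared long counter;   // 64-bit shared data, possibly on a 32-bit target

long readCounter()
{
    // A shared read is expected to be atomic; on a 32-bit machine that
    // requires a double-word (64-bit) atomic load from the hardware.
    return atomicLoad(counter);
}

void writeCounter(long v)
{
    atomicStore(counter, v);   // same requirement on the store side
}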

> The problem is that shared is supposed to guarantee that operations on
> shared data *always* obey whatever atomicity/memory barrier rules we
> end up defining for it (obviously we don't want generated code to have
> different semantics across architectures due to subtle issues like the
> lack of certain operations in the ISA). Right now, based on what I've
> read in the NG and on mailing lists, people seem to assume that shared
> will provide full-blown x86-level atomicity and/or memory barriers.
> Providing these features on e.g. ARM is a pipe dream at best (for
> instance, ARM has no atomic load for 64-bit values).

http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html mentions that there is a way to implement atomic load for 64-bit values.
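
Illustrative sketch only (atomicLoad64 is a name I just made up, not a
druntime API): if a target has no plain 64-bit atomic load but does have a
64-bit CAS or an LL/SC pair such as LDREXD/STREXD, the load can be emulated
by swapping the value with itself:

import core.atomic;

long atomicLoad64(ref shared long value)
{
    for (;;)
    {
        // Racy read just to obtain a guess; it may tear, but the cas below
        // only succeeds if the guess matches the actual contents atomically.
        long guess = *cast(long*) &value;
        if (cas(&value, guess, guess))   // swap the value with itself
            return guess;
    }
}

It's not free (every load becomes a potential store), but it's not a pipe
dream either.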

> All this being said, shared could probably be implemented with plain old
> locks on these architectures if correctness is the only goal. But, from
> a more pragmatic point of view, this would completely butcher
> performance and add the potential for deadlocks and all the other issues
> associated with thread synchronization in general. We really shouldn't
> have such a core feature of the language fall back to a dirty hack like
> this on low-end/embedded architectures (where performance of this kind
> of stuff is absolutely critical), IMO.

That's how C++'s atomic<T> does things, by the way. But I sympathize with your viewpoint that there should be no hidden locks. We could define shared to refuse compilation on odd machines, and THEN provide an atomic template with the expected performance of a lock.
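
For concreteness, here is a rough sketch of that split in D. Every name in
it is made up for illustration (maxLockFreeSize in particular is just a
stand-in for whatever limit the compiler would actually expose); it's not
proposed druntime code. The point is that the lock, when there is one,
lives in an explicit template rather than behind ordinary-looking shared
accesses:

import core.atomic;
import core.sync.mutex : Mutex;

enum size_t maxLockFreeSize = size_t.sizeof * 2;   // assumed double-word limit

struct Atomic(T)
{
    static if (T.sizeof <= maxLockFreeSize)
    {
        // The target can do this lock-free; use real atomic operations.
        private shared T payload;

        T load() { return atomicLoad(payload); }
        void store(T v) { atomicStore(payload, v); }
    }
    else
    {
        // Visible, documented fallback: you know up front that you are
        // paying for a lock.
        private T payload;
        private Mutex mtx;

        this(T initial) { mtx = new Mutex; payload = initial; }

        T load() { synchronized (mtx) { return payload; } }
        void store(T v) { synchronized (mtx) { payload = v; } }
    }
}

With the made-up limit above, Atomic!long stays lock-free on both 32- and
64-bit x86, while a larger type visibly pays for a Mutex, which is roughly
the "expected performance of a lock" trade-off.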


Andrei
