On 07-06-2012 03:55, Andrei Alexandrescu wrote:
On 6/6/12 8:19 PM, Alex Rønne Petersen wrote:
On 07-06-2012 03:11, Andrei Alexandrescu wrote:
On 6/6/12 6:01 PM, Alex Rønne Petersen wrote:
(At this point, I probably don't need to point out how x86-biased and
unportable shared is...)

I confess I'll need that spelled out. How is shared biased towards x86
and nonportable?

Thanks,

Andrei

The issue lies in its assumption that the target architecture
supports atomic operations and/or memory barriers at all. Some
architectures plainly don't support these; others do, but not for
certain data sizes, such as 64-bit integers. x86 is probably the
architecture with the best support for low-level memory control as
far as atomicity and memory barriers go.

Actually x86 is one of the more forgiving architectures (most code works
even when written without barriers). Indeed we assume the target
architecture supports double-word atomic load.

And if cent/ucent ever get implemented (which does seem likely, although they're low-prio), we'll have to assume 128-bit atomic loads too. Here Be Dragons. ;)
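
To make that assumption concrete, this is roughly the kind of code I have in mind (just a sketch using core.atomic; the instruction choices in the comments are examples, not a spec):

import core.atomic;

shared long counter;   // 64-bit, i.e. a double-word on 32-bit targets

void bump()
{
    atomicOp!"+="(counter, 1);    // 8-byte read-modify-write
}

long snapshot()
{
    // Has to observe all 64 bits at once, never a torn value. On 32-bit
    // x86 this is doable (e.g. via LOCK CMPXCHG8B); if cent/ucent ever
    // arrive, the same question comes up for 16 bytes (CMPXCHG16B on
    // x86-64, and who knows what elsewhere).
    return atomicLoad(counter);
}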


The problem is that shared is supposed to guarantee that operations on
shared data *always* obey whatever atomicity/memory barrier rules we
end up defining for it (obviously we don't want generated code to have
different semantics across architectures due to subtle issues like the
lack of certain operations in the ISA). Right now, based on what I've
read in the NG and on mailing lists, people seem to assume that shared
will provide full-blown x86-level atomicity and/or memory barriers.
Providing these features on e.g. ARM is a pipe dream at best (for
instance, ARM has no atomic load for 64-bit values).

http://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html mentions that
there is a way to implement atomic load for 64-bit values.

You learn something new every day! When we did research for MCI's atomic intrinsics, we didn't notice these instructions on ARM. Thanks for the link.

This covers most of the significant architectures in use today, but I'm still worried about e.g. Super-H, Alpha, SPARC, MIPS, and the other targets listed on http://dlang.org/version.html (I think that at least SPARC lacks a double-word atomic load/store).
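
Just to spell out the guarantee I'm worried about (a sketch; whether the plain load below actually tears depends entirely on the target):

import core.atomic;

shared long ticks;

long read()
{
    // A plain load. On a 32-bit target without a double-word load this
    // can end up as two 32-bit loads and return a torn value.
    long naive = *cast(long*) &ticks;

    // What shared has to guarantee on every architecture we claim to
    // support. Per the mapping you linked, ARMv7 can get this from an
    // ldrexd-based sequence; I'm not sure what the story is for the
    // others listed above.
    long safe = atomicLoad(ticks);

    return safe;
}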


All this being said, shared could probably be implemented with plain old
locks on these architectures if correctness is the only goal. But, from
a more pragmatic point of view, this would completely butcher
performance and add the potential for deadlocks, along with all the
other issues associated with thread synchronization in general. We
really shouldn't
have such a core feature of the language fall back to a dirty hack like
this on low-end/embedded architectures (where performance of this kind
of stuff is absolutely critical), IMO.

That's how C++'s atomic<T> does things, by the way. But I sympathize
with your viewpoint that there should be no hidden locks. We could
define shared to refuse compilation on odd machines, and THEN provide
an atomic template whose expected performance is that of a lock.

That may be a reasonable approach. But if we do this, I think we need to revisit the core.atomic API, since it unnecessarily requires the shared qualifier for some things (just because shared overall isn't useful on a target architecture doesn't mean that e.g. a 32-bit atomic load can't be done on it).
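
Concretely, something along these lines (a hypothetical sketch; the names are made up, but the casting dance is the part I'd like to avoid):

import core.atomic;

shared int sharedFlag;
__gshared int rawFlag;   // deliberately not typed as shared

void example()
{
    // Fine today: the operand carries the shared qualifier.
    atomicStore(sharedFlag, 1);
    int a = atomicLoad(sharedFlag);

    // What I'd still want on a target where full shared support is
    // rejected: a plain 32-bit atomic load on data that isn't typed as
    // shared. With the signatures as they are, that seems to mean casting:
    int b = atomicLoad(*cast(shared(int)*) &rawFlag);
}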



Andrei


--
Alex Rønne Petersen
[email protected]
http://lycus.org
