On Saturday, 9 January 2016 at 15:16:43 UTC, Ola Fosheim Grøstad wrote:
On Saturday, 9 January 2016 at 14:20:18 UTC, Andy Smith wrote:
I'm a little worried you have no volatile writes or fences around your code when you 'publish' an event using head/tail etc. It looks like it's working, but how are you ensuring no compiler/CPU reordering is occurring? Does x86_64 actually allow you to get away with this? I know its memory model is stricter than others...

But not on ARM, so he should use atomic acquire-release semantics on indices for push/pull.

Not only that. It's a problem on x86 as well because advanced optimizers like those in GDC or LDC will happily assume that the members are not written to by another thread (because there would be a race otherwise) and cache the loads or even eliminate some stores. Any guarantees the processor would offer are irrelevant if the compiler has already reordered your operations. It should be quite easy to see such effects. Just compile a simple test case on GDC or LDC with optimizations on.
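To make the compiler side of this concrete, here is a minimal sketch in C++ (chosen only because the C++ memory model is the one being leaned on in this thread; D's `core.atomic` provides `atomicLoad`/`atomicStore` with a `MemoryOrder` argument for the same purpose). The names `payload`, `ready` and `publish_and_read` are made up for illustration. If `ready` were a plain `bool`, an optimizer would be entitled to load it once and spin forever; making it atomic with release/acquire ordering rules that out and also orders the write to `payload`.

```cpp
#include <atomic>
#include <thread>

// Hypothetical names for illustration only.
int payload = 0;                  // plain data, published through `ready`
std::atomic<bool> ready{false};   // atomic flag: loads cannot be cached or hoisted

int publish_and_read() {
    payload = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;

    std::thread consumer([&] {
        // With a plain bool the optimizer could hoist this load out of the loop.
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;           // ordered after the producer's write to payload
    });

    payload = 42;                                   // write the data first
    ready.store(true, std::memory_order_release);   // then publish it
    consumer.join();
    return seen;
}
```

The release store pairs with the acquire load, so once the consumer sees `ready == true` it is also guaranteed to see `payload == 42`, regardless of what the hardware would have done on its own.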

I suggest porting over spsc_queue from Boost.

That would certainly be a starting point, although a high-performance single-producer single-consumer queue is trivial to implement once you understand atomics.
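For reference, a rough sketch of what I mean, again in C++ rather than D and not a port of Boost's `spsc_queue`: a bounded ring buffer with acquire/release on the two indices. `SpscQueue` and `transfer_sum` are hypothetical names, one slot is deliberately left unused to tell "full" from "empty", and things like cache-line padding are left out.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

// Minimal bounded SPSC ring buffer (illustrative sketch, holds at most N - 1 items).
template <typename T, std::size_t N>
class SpscQueue {
    T buf_[N];
    std::atomic<std::size_t> head_{0};  // next slot to read; written only by the consumer
    std::atomic<std::size_t> tail_{0};  // next slot to write; written only by the producer
public:
    bool push(const T& v) {             // call from the producer thread only
        std::size_t t = tail_.load(std::memory_order_relaxed);
        std::size_t next = (t + 1) % N;
        if (next == head_.load(std::memory_order_acquire)) return false;  // full
        buf_[t] = v;
        tail_.store(next, std::memory_order_release);  // publish the filled slot
        return true;
    }
    bool pop(T& out) {                  // call from the consumer thread only
        std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return false;     // empty
        out = buf_[h];
        head_.store((h + 1) % N, std::memory_order_release);  // hand the slot back
        return true;
    }
};

// Pushes 1..count from one thread and sums them up on another.
long transfer_sum(int count) {
    SpscQueue<int, 8> q;
    long sum = 0;
    std::thread consumer([&] {
        int v;
        for (int i = 0; i < count; ++i) {
            while (!q.pop(v)) std::this_thread::yield();
            sum += v;
        }
    });
    for (int i = 1; i <= count; ++i) {
        while (!q.push(i)) std::this_thread::yield();
    }
    consumer.join();
    return sum;
}
```

Each index is written by exactly one side, so a relaxed load of your own index is enough; the acquire load of the other side's index is what makes the slot contents visible before you touch them.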

Then again, I couldn't convince Andrei that a well-defined memory model (in which it even makes sense to talk about atomics in the first place) is important for D yet. Right now, you basically have to hope that everything works as in C++ – which is not a bad bet on GDC and LDC, and the weaker optimizer in DMD hides a lot of potential issues there – and stay away from things like `consume` loads that would depend on language semantics.

 — David
