On Saturday, 9 January 2016 at 17:42:41 UTC, David Nadlinger wrote:
Not only that. It's a problem on x86 as well because advanced optimizers like those in GDC or LDC will happily assume that the members are not written to by another thread (because there would be a race otherwise) and cache the loads or even eliminate some stores. Any guarantees the processor would offer are irrelevant if the compiler has already reordered your operations. It should be quite easy to see such effects. Just compile a simple test case on GDC or LDC with optimizations on.

Yes, well, he had a sleep() call in there that might prevent the compiler from reordering the instructions, but even if it does... that is a bad thing to rely on.


Explicit atomics document what is going on (even when not technically needed), and should always be used where you want atomic operations, I think.

That would certainly be a starting point, although a high-performance single-producer single-consumer queue is trivial to implement once you understand atomics.

Yes. But my experience from writing custom multi-/single-producer queues is that getting one both working and efficient can be harder than it looks. Don't be surprised if it takes 5-10x more time than anticipated to get it 100% right...

So why not start with something that is known to be correct? You still have to spend quite a bit of time verifying that the code is identical... but that is much easier than "formally" proving that your own implementation is correct.

(Intuition is often wrong in this area...)
