On Wednesday, 11 May 2022 at 09:34:20 UTC, ichneumwn wrote:
Hi Forum,
I have a snippet of code as follows:
```
extern(C) extern __gshared uint g_count;
// inside a class member function:
while(g_count <= count) {}
```
This is from a first draft of the code without proper thread
synchronisation. The global variable g_count is updated from a
bit of C++ code. As soon as I turn the optimiser on, the code
never gets past this point, leading me to suspect it gets
turned into
```
while(true) {}
```
If I modify the code in the following way:
```
import core.volatile : volatileLoad;
while(volatileLoad(&g_count) <= count) {}
```
it works again.
My question is, have I hit a compiler bug (ldc 1.28.1, aarch64
[Raspberry Pi]) or is this part of the language design? I would
have thought that since D uses thread-local storage by default,
it would be understood that a __gshared variable can be
modified by another thread.
This is part of the language spec. The language assumes that
there is a single thread running, and any thread synchronization
must be done by the user. This is well known from C and C++, from
which D (implicitly afaik) borrows the memory model.
Example: imagine loading a struct with 2 ulongs from shared
memory: `auto s = global_struct_variable;`. Loading the data into
local storage `s` - e.g. CPU registers - would happen in two
steps, first member1, then member2 (simplified, let's assume it
spans a cache-line boundary, etc.). During that load sequence,
another thread might write to the struct. If the language had to
guarantee defined behavior in that situation (a write from the
other thread), then a global mutex lock/unlock would have to be
done before/after _every_
read and write of shared data. That'd be a big performance impact
on multithreading. Instead, single-thread execution is assumed,
and thus the optimization is valid.
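To make that cost concrete, here is a minimal sketch of what the guarantee would require: every reader (and writer) of the struct would have to go through a lock. This reuses `global_struct_variable` from the example above; `Pair` and `g_pairLock` are hypothetical names, just for illustration:
```
import core.sync.mutex : Mutex;

struct Pair { ulong member1, member2; }

__gshared Pair global_struct_variable; // written by another thread
__gshared Mutex g_pairLock;

shared static this() { g_pairLock = new Mutex; }

Pair readPair()
{
    // Both member loads happen while holding the lock, so the other
    // thread cannot write in between them.
    g_pairLock.lock();
    scope (exit) g_pairLock.unlock();
    return global_struct_variable;
}
```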
Your solution with `volatileLoad` is correct.
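For reference, an atomic load expresses the same intent and is likewise guaranteed not to be hoisted out of the loop. A minimal sketch, assuming a druntime recent enough that `core.atomic.atomicLoad` accepts a non-`shared` lvalue (the frontend LDC 1.28.1 ships should be); `waitLoop` is just a hypothetical wrapper:
```
import core.atomic : atomicLoad;

extern(C) extern __gshared uint g_count;

void waitLoop(uint count)
{
    // atomicLoad forces a fresh read of g_count on every iteration
    // and gives it a well-defined memory order (seq_cst by default).
    while (atomicLoad(g_count) <= count) {}
}
```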
Access through atomic functions would prevent the compiler from
optimising this away as well, but if I were to use a Mutex
inside the loop, there is no way for the compiler to tell
*what* that Mutex is protecting and it might still decide to
optimise the test away (assuming that is what is happening; I
did not attempt to look at the assembler code).
Any function call (inside the loop) for which it cannot be proven
that it never modifies your memory variable will work. That's why
I'm pretty sure that mutex lock/unlock will work.
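Roughly what I mean, as a sketch; the lock object and its initialisation here are hypothetical, and in practice it would have to be the very lock the C++ writer takes. The point is only that `lock()`/`unlock()` are opaque calls the compiler cannot see through:
```
import core.sync.mutex : Mutex;

extern(C) extern __gshared uint g_count;
__gshared Mutex g_countLock;

shared static this() { g_countLock = new Mutex; }

void waitLoop(uint count)
{
    for (;;)
    {
        // The compiler cannot prove that lock()/unlock() leave g_count
        // untouched, so it must reload g_count on every iteration.
        g_countLock.lock();
        scope (exit) g_countLock.unlock();
        if (g_count > count)
            break;
    }
}
```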
On Wednesday, 11 May 2022 at 09:37:26 UTC, rikki cattermole wrote:
Compiler optimizations should not be defined by a programming
language specification.
This is not true. Compiler optimizations are valid if and only if
their correctness can be proven from the programming language
specification. A compiler optimization must never change the
behavior of a valid program. If
an optimization does change behavior, then either the program is
invalid per the language spec, or the optimization is bugged (or
the observed behavior change is outside the language spec, such
as how long a program takes to execute).
-Johan