On 1/18/21 11:33 PM, Liu Hao wrote:
> 在 2021/1/19 上午11:54, Zebediah Figura (she/her) 写道:
>>
>> Do you mean that the writer should change the above code to e.g.
>> "__atomic_load_n(&var, ___ATOMIC_ACQUIRE)", or that the read already
>> implies acquire semantics? I didn't think that the latter was true, but
>> I don't trust that judgement; locking is way too hard...
>>
> 
> Yes. There isn't necessarily any locking. A notable example is x86, which is 
> very close to total
> store ordering: Loading from a memory location implies acquire semantics.
> 
> 
>> But even if the above code isn't correct, I don't think it's worth
>> potentially breaking it.
>>
> 
> If `var` is not declared `volatile` then it is definitely undesirable. GCC 
> can optimize it to an
> infinite loop.

But not with a compiler barrier inserted into the loop, i.e. as part of
YieldProcessor(). (I even checked locally, it will not optimize access
to a non-volatile variable in that case.)

I know the C11 spec doesn't actually make guarantees about this, but the
GCC documentation does, and if the programmer is expecting
YieldProcessor() to act as a compiler barrier (and even a partial
processor barrier, judging by its ARM implementation), then it seems
like playing with fire to violate that.

Anyway, thanks to that godbolt.org site—thanks for pointing that out to
me—I can test that YieldProcessor() does seem to provide a compiler
barrier on MSVC:

https://gcc.godbolt.org/z/e6GMYz

> 
> If `var` is declared `volatile`, this is still incorrect, but it was what 
> people wrote code back in
> VC6 days. The `volatile` qualifier doesn't guarantee correct ordering if the 
> loop isn't executed *at
> all*, so clobbering "memory" doesn't yield any benefit; but if the loop is 
> executed, `volatile`
> reads cannot be optimized out, so it doesn't yield any harm, either.
> 
> 
>>
>> So here's a fun fact: if I change the statement
>>
>> __asm__ ("nop");
>>
>> to
>>
>> __asm__ ("nop":);
>>
>> then gcc will *not* reload ival. This is apparently the difference
>> between "basic" and "extended" asm, which I was not previously aware of.
>> I can't find it explicitly stated in gcc's documentation that basic asm
>> always clobbers memory, but apparently it's the case.
>>
>> Of course, in that case, I'd encourage sticking with the explicitly
>> documented behavior and using extended asm.
>>
> 
> That's interesting. As you have pointed out, it prevents the compiler from 
> reusing a value loaded
> earlier, but this probably isn't needed by `YieldProcessor()`; it's only 
> necessary for
> `MemoryBarrier()`.
> 
> 
> 
> 
> 
> _______________________________________________
> Mingw-w64-public mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
> 



_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public

Reply via email to