Ah, so I guess we don't need a barrier at all on x86 for the release
semantics.
Presumably we still need something for Dekker-style algorithms, although I
don't think we use those anywhere in the stdlib, at least.
I guess it's just a question of which is faster, xchg or mov+mfence?
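
For reference, here is a minimal sketch of the release/acquire pattern
discussed in the quoted messages below, written against sync/atomic. The
mailbox type and its publish/consume methods are made-up names for
illustration only, not anything in the stdlib. It relies on the guarantee
described in the thread: if the load sees the store, the reader also sees
the plain write made before it.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// mailbox is a hypothetical type used only for this illustration.
type mailbox struct {
	payload int   // plain (non-atomic) field, written before the flag
	ready   int32 // flag accessed only through sync/atomic
}

// publish does the "other writes" and then the atomic store.
func (m *mailbox) publish(v int) {
	m.payload = v                  // regular write
	atomic.StoreInt32(&m.ready, 1) // store with release semantics
}

// consume does the atomic load and, only if it sees the store,
// the "other reads".
func (m *mailbox) consume() (int, bool) {
	if atomic.LoadInt32(&m.ready) == 1 {
		return m.payload, true // must observe the write made in publish
	}
	return 0, false
}

func main() {
	var m mailbox
	go m.publish(42)
	for {
		if v, ok := m.consume(); ok {
			fmt.Println(v)
			return
		}
		runtime.Gosched() // let the publishing goroutine run
	}
}
```

Note that this is only the message-passing case. A Dekker-style algorithm
(store to one location, then load from another) is the case where x86 can
still reorder, so it is not covered by this sketch.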

On Tue, Apr 28, 2020 at 8:24 PM Cholerae Hu <cholerae...@gmail.com> wrote:

> Under the x86-TSO model, it seems that we don't need any mfence to achieve
> acquire-release semantics. Acquire-release semantics only need a compiler
> barrier to prevent compiler reordering, see https://godbolt.org/z/7JcX-d .
>
> On Wednesday, April 29, 2020 at 7:42:26 AM UTC+8, keith....@gmail.com wrote:
>>
>> It looks like the mechanism used by C++'s std::atomic would not be useful
>> for us.
>>
>> We require release semantics on atomic stores.  That is, if one thread
>> does:
>>
>> .. some other writes ...
>> atomic.StoreInt32(p, 1)
>>
>> and another thread does
>>
>> if atomic.LoadInt32(p) == 1 {
>>    .. some other reads ...
>> }
>>
>> If the load sees the store, then the "other reads" must see all of the
>> "other writes". For the C atomic you cited, it does:
>>
>> regular write
>> mfence
>>
>> That doesn't provide the guarantee we need. A write before the atomic
>> could be reordered with the regular write, causing the reader to not see
>> one of the writes it was required to.
>>
>> For our use case, it would have to be
>>
>> mfence
>> regular write
>>
>> and the semantics of mfence would need to prevent write-write reorderings
>> (does it do that? Not sure.)
>>
>> We'd need some indication that changing it would be faster, as well.
>>
>> On Tuesday, April 28, 2020 at 4:03:00 AM UTC-7, Cholerae Hu wrote:
>>>
>>> But with gcc 9.3, an atomic store with seq_cst ordering is compiled to
>>> mov+mfence rather than xchg, see https://gcc.godbolt.org/z/ucbQt6 . Why
>>> do we use xchg rather than mov+mfence in Go?
>>>
>>> On Tuesday, April 28, 2020 at 7:26:15 AM UTC+8, Ian Lance Taylor wrote:
>>>>
>>>> On Sun, Apr 26, 2020 at 1:31 AM Cholerae Hu <chole...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Atomic.StoreX doesn't return the old value of the given pointer, so
>>>> > lock mov would work. Why do we use an xchg instead? Is there any
>>>> > performance issue?
>>>>
>>>> I assume that you are talking about Intel processors.  Intel
>>>> processors do not have a lock mov instruction.
>>>>
>>>> From the Intel architecture manual:
>>>>
>>>>     The LOCK prefix can be prepended only to the following instructions
>>>>     and only to those forms of the instructions where the destination
>>>>     operand is a memory operand: ADD, ADC, AND, BTC, BTR, BTS, CMPXCHG,
>>>>     CMPXCH8B, DEC, INC, NEG, NOT, OR, SBB, SUB, XOR, XADD, and XCHG.
>>>>
>>>> Ian
>>>>

-- 
You received this message because you are subscribed to the Google Groups 
"golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/golang-nuts/CA%2BZMcON9ErhZU7_NEovAsjbrBc2ffaPaTYBzjD2nqwWBywN_%2BA%40mail.gmail.com.