Ah, so I guess we don't need a barrier at all on x86 for the release semantics. Presumably we still need something for Dekker-style algorithms, although I don't think we use those anywhere in the stdlib, at least. I guess it's just a question of which is faster?
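
To make the two cases concrete, here is a rough sketch in Go. The function and variable names (messagePassing, storeBuffering, data, flag, x, y) are made up for illustration and are not anything from the stdlib. The first function is the release/acquire pattern discussed in this thread: a plain write, then an atomic store, with a reader that waits on the atomic load before doing a plain read. The second is the Dekker-style (store-buffering) pattern, where each goroutine stores to one variable and then loads the other. That store-then-load pair is the one x86-TSO can reorder without a full barrier or a locked instruction.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"sync/atomic"
)

// messagePassing is the release/acquire case: if the reader's atomic load
// sees the writer's atomic store, it must also see the earlier plain write.
func messagePassing() int32 {
	var data int32 // plain variable, written before the atomic store
	var flag int32 // only accessed through sync/atomic
	var got int32
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		data = 42                   // ".. some other writes ..."
		atomic.StoreInt32(&flag, 1) // publishing store
	}()
	go func() {
		defer wg.Done()
		for atomic.LoadInt32(&flag) != 1 {
			runtime.Gosched() // wait until the store becomes visible
		}
		got = data // ".. some other reads ..."
	}()
	wg.Wait()
	return got
}

// storeBuffering is the Dekker-style case: each goroutine stores to one
// variable and then loads the other. With sequentially consistent atomics,
// at least one of the two loads must observe the other goroutine's store.
func storeBuffering() (r1, r2 int32) {
	var x, y int32
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		atomic.StoreInt32(&x, 1)
		r1 = atomic.LoadInt32(&y)
	}()
	go func() {
		defer wg.Done()
		atomic.StoreInt32(&y, 1)
		r2 = atomic.LoadInt32(&x)
	}()
	wg.Wait()
	return r1, r2
}

func main() {
	fmt.Println("message passing read:", messagePassing())
	bad := 0
	for i := 0; i < 10000; i++ {
		if r1, r2 := storeBuffering(); r1 == 0 && r2 == 0 {
			bad++
		}
	}
	fmt.Println("store-buffering runs where both loads saw 0:", bad)
}
```

The first pattern only needs stores to stay in order with earlier stores, which x86-TSO already provides in hardware. The second relies on sync/atomic operations being sequentially consistent, so a run where both loads return 0 should never be counted. A plain mov for the store with no fence after it would not rule that outcome out on x86.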

On Tue, Apr 28, 2020 at 8:24 PM Cholerae Hu <cholerae...@gmail.com> wrote:

> On the x86-TSO model, it seems that we don't need any mfence to achieve
> acquire-release semantics. Acquire-release semantics only need a compiler
> barrier to prevent compiler reordering, see https://godbolt.org/z/7JcX-d .
>
> On Wednesday, April 29, 2020 at 7:42:26 AM UTC+8, keith....@gmail.com wrote:
>>
>> It looks like the mechanism used by C++'s std::atomic would not be useful
>> for us.
>>
>> We require release semantics on atomic stores. That is, if one thread
>> does:
>>
>>     .. some other writes ...
>>     atomic.StoreInt32(p, 1)
>>
>> and another thread does
>>
>>     if atomic.LoadInt32(p) == 1 {
>>         .. some other reads ...
>>     }
>>
>> If the load sees the store, then the "other reads" must see all of the
>> "other writes". For the C++ atomic you cited, it does:
>>
>>     regular write
>>     mfence
>>
>> That doesn't provide the guarantee we need. A write before the atomic
>> could be reordered with the regular write, causing the reader to not see
>> one of the writes it was required to.
>>
>> For our use case, it would have to be
>>
>>     mfence
>>     regular write
>>
>> and the semantics of mfence would need to prevent write-write reorderings
>> (does it do that? Not sure.)
>>
>> We'd need some indication that changing it would be faster, as well.
>>
>> On Tuesday, April 28, 2020 at 4:03:00 AM UTC-7, Cholerae Hu wrote:
>>>
>>> But on gcc 9.3, an atomic store with seq_cst order will be compiled to
>>> mov+fence rather than xchg, see https://gcc.godbolt.org/z/ucbQt6 . Why
>>> do we use xchg rather than mov+fence in Go?
>>>
>>> On Tuesday, April 28, 2020 at 7:26:15 AM UTC+8, Ian Lance Taylor wrote:
>>>>
>>>> On Sun, Apr 26, 2020 at 1:31 AM Cholerae Hu <chole...@gmail.com>
>>>> wrote:
>>>> >
>>>> > Atomic.StoreX doesn't return the old value of the given pointer, so
>>>> > lock mov would work. Why do we use a xchg instead? Is there any
>>>> > performance issue?
>>>>
>>>> I assume that you are talking about Intel processors. Intel
>>>> processors do not have a lock mov instruction.
>>>>
>>>> From the Intel architecture manual:
>>>>
>>>>     The LOCK prefix can be prepended only to the following
>>>>     instructions and only to those forms of the instructions where
>>>>     the destination operand is a memory operand: ADD, ADC, AND,
>>>>     BTC, BTR, BTS, CMPXCHG, CMPXCHG8B, DEC, INC, NEG, NOT, OR, SBB,
>>>>     SUB, XOR, XADD, and XCHG.
>>>>
>>>> Ian
>>>>
> --
> You received this message because you are subscribed to a topic in the
> Google Groups "golang-nuts" group.
> To unsubscribe from this topic, visit
> https://groups.google.com/d/topic/golang-nuts/EbBrCk2LOaU/unsubscribe.
> To unsubscribe from this group and all its topics, send an email to
> golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/80d2c494-809b-47d0-bb9b-549b32068c1c%40googlegroups.com.

--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/CA%2BZMcON9ErhZU7_NEovAsjbrBc2ffaPaTYBzjD2nqwWBywN_%2BA%40mail.gmail.com.