On Friday, April 20, 2018 at 2:06:22 AM UTC-7, Dmitry Vyukov wrote:
>
> On Mon, Apr 16, 2018 at 12:32 AM, Chris M. Thomasson
> <cri...@charter.net> wrote:
> >
> > On Friday, April 13, 2018 at 11:45:51 PM UTC-7, Dmitry Vyukov wrote:
> >>
> >> On Mon, Apr 9, 2018 at 3:38 AM, Chris M. Thomasson <cri...@charter.net> wrote:
> >> > On Saturday, April 7, 2018 at 1:46:20 AM UTC-7, Dmitry Vyukov wrote:
> >> >>
> >> >> On Thu, Apr 5, 2018 at 10:03 PM, Chris M. Thomasson <cri...@charter.net> wrote:
> >> >> > On Tuesday, April 3, 2018 at 5:44:38 AM UTC-7, Dmitry Vyukov wrote:
> >> >> >>
> >> >> >> On Sat, Mar 31, 2018 at 10:41 PM, Chris M. Thomasson <cri...@charter.net> wrote:
> [...]
> > D should be a pure relaxed store, and C should not be covered by the
> > RELEASE. Iirc, it works this way on SPARC RMO mode. However, on x86, C
> > will be covered because each store has implied release characteristics,
> > wb memory aside for a moment.
>
> C and D are completely symmetric wrt the RELEASE. Later you can 
> discover that there is also a thread that does: 
>
> // consumer 2 
> while (C != 3) backoff; 
> ACQUIRE 
> assert(A == 1 && B == 2); 
>
> And now suddenly C is a release operation, in exactly the same way D is. 
>

Argh! D is the "control" signal; C should not be used as a signal to
acquire. Damn. C should not have to be involved, because the RELEASE comes
_before_ C is assigned.

Argh.
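
Let me spell your point out, to make sure I actually have it. A minimal
compilable sketch (the names A, B, C, D follow the discussion above; the
thread scaffolding and yield-based backoff are just mine for illustration):
______________________
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> A{0}, B{0}, C{0}, D{0};

void producer()
{
    A.store(1, std::memory_order_relaxed);
    B.store(2, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release); // RELEASE
    C.store(3, std::memory_order_relaxed);
    D.store(4, std::memory_order_relaxed); // the intended "control" signal
}

// consumer 1: waits on D, the intended signal
void consumer_on_D()
{
    while (D.load(std::memory_order_relaxed) != 4) std::this_thread::yield();
    std::atomic_thread_fence(std::memory_order_acquire); // ACQUIRE
    assert(A.load(std::memory_order_relaxed) == 1 &&
           B.load(std::memory_order_relaxed) == 2);
}

// consumer 2: waits on C instead -- equally valid, because the release
// fence orders A and B before *any* later store the other side observes
void consumer_on_C()
{
    while (C.load(std::memory_order_relaxed) != 3) std::this_thread::yield();
    std::atomic_thread_fence(std::memory_order_acquire);
    assert(A.load(std::memory_order_relaxed) == 1 &&
           B.load(std::memory_order_relaxed) == 2);
}

int main()
{
    std::thread t1(producer), t2(consumer_on_D), t3(consumer_on_C);
    t1.join(); t2.join(); t3.join();
}
______________________
So any later store that the other side happens to observe "materializes" the
release fence; there is no way to carve C out of it.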


> >> At least this is how this is defined in the C/C++ standards.
> >> ACQUIRE/RELEASE fences do not establish any happens-before relations
> >> themselves. You still need a load in one thread to observe a value
> >> stored in another thread. And only that "materializes" standalone
> >> fence synchronization. So a store that materializes a RELEASE fence
> >> will always be a subsequent store.
> > 
> > 
> > Humm... That is too strict, and has to be there whether we use
> > standalone fences or not.
>
> No, in C/C++ memory ordering constraints tied to memory operations act
> only on that memory operation.


> Consider: 
>
> DATA = 1; 
> C.store(1, memory_order_release); 
> D.store(1, memory_order_relaxed); 
>
> vs: 
>
> DATA = 1; 
> atomic_thread_fence(memory_order_release); 
> C.store(1, memory_order_relaxed); 
> D.store(1, memory_order_relaxed); 
>
>
> And 2 consumers: 
>
> // consumer 1 
> while (C.load(memory_order_acquire) == 0) backoff(); 
> assert(DATA == 1); 
>
> // consumer 2 
> while (D.load(memory_order_acquire) == 0) backoff(); 
> assert(DATA == 1); 
>
> Both consumers are correct wrt the atomic_thread_fence version of 
> producer. But only the first one is correct wrt the store(1, 
> memory_order_release) version of producer. 
>
> And this can actually break on x86 because: 
>

Iirc, x86 has implied release semantics on stores, and acquire on loads. If
we load C and observe 1, we will see DATA as 1. However, that tells us
nothing about D. If we load D and see it as 1, then C = 1 and DATA = 1, at
least at the hardware level.

Consumer 1 will see C = 1 and DATA = 1; D may not be visible yet.
Consumer 2 will see D = 1, and therefore C = 1 and DATA = 1: all three in sync.
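
For my own notes, here is roughly how I would spell your two variants out in
standard C++. Only the fence-based producer is wired up to run, since it is
the one that makes both consumers correct; the thread scaffolding is mine:
______________________
#include <atomic>
#include <cassert>
#include <thread>

int DATA = 0;                 // plain, non-atomic payload
std::atomic<int> C{0}, D{0};

void producer_fence()
{
    DATA = 1;
    std::atomic_thread_fence(std::memory_order_release);
    C.store(1, std::memory_order_relaxed);
    D.store(1, std::memory_order_relaxed);
}

// The other variant would be:
//
//     DATA = 1;
//     C.store(1, std::memory_order_release);
//     D.store(1, std::memory_order_relaxed);
//
// With that one, only consumer_1 below is correct: the release attaches to
// the store of C alone, so nothing stops the compiler from hoisting the
// relaxed store of D above DATA = 1 -- even on x86, where the hardware
// would not reorder the stores, the compiler still can.

void consumer_1()
{
    while (C.load(std::memory_order_acquire) == 0) std::this_thread::yield();
    assert(DATA == 1);
}

void consumer_2()
{
    while (D.load(std::memory_order_acquire) == 0) std::this_thread::yield();
    assert(DATA == 1); // guaranteed only by the fence-based producer
}

int main()
{
    std::thread p(producer_fence), c1(consumer_1), c2(consumer_2);
    p.join(); c1.join(); c2.join();
}
______________________
If I am reading you right, that compiler reordering is how it "actually
breaks on x86": the hardware ordering I described above only helps if the
stores are still in that order in the generated code.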


 [...]

> > Yes! The standalone fence could say: we want to perform an acquire
> > barrier wrt m_head. Something like that should be able to create more
> > fine-grained setups. Perhaps even something like the following pseudo-code:
> > ______________________ 
> > // setup 
> > int a = 0; 
> > int b = 0; 
> > int c = 0; 
> > signal = false; 
> > 
> > // producer 
> > a = 1; 
> > b = 2; 
> > RELEASE(&signal, &a, &b); 
> > c = 3; 
> > STORE_RELAXED(&signal, true); 
> > 
> > // consumers 
> > while (LOAD_RELAXED(&signal) != true) backoff; 
> > ACQUIRE(&signal, &a, &b); 
> > assert(a == 1 && b == 2); 
> > ______________________ 
> > 
> > The consumers would always see a and b as 1 and 2; however, c is not
> > covered, so its state is incoherent wrt said consumers.
> > 
> > The acquire would only target a and b, as would the release. 
> > 
> > Hummm... Just thinking out loud here. :^) 
>
> This would help verification tools tremendously... but unfortunately 
> it's not the reality we are living in :) 
>

Shi% happens! However, it would allow one to combine the flexibility of
standalone fences with targeted memory addresses, all in one construct.
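
The closest I can get to that pseudo-code in today's standard C++ is
probably something like the following sketch. The "targeting" of a and b is
purely by comment/convention, since the standard fence takes no addresses;
signal_flag stands in for the pseudo-code's signal, and the thread
scaffolding is mine:
______________________
#include <atomic>
#include <cassert>
#include <thread>

int a = 0, b = 0, c = 0;              // plain payload
std::atomic<bool> signal_flag{false}; // the pseudo-code's "signal"

void producer()
{
    a = 1;
    b = 2;
    // plays the role of RELEASE(&signal, &a, &b); the standard fence
    // cannot actually name a and b
    std::atomic_thread_fence(std::memory_order_release);
    c = 3; // deliberately after the fence: not covered for the consumers
    signal_flag.store(true, std::memory_order_relaxed);
}

void consumer()
{
    while (!signal_flag.load(std::memory_order_relaxed))
        std::this_thread::yield();
    // plays the role of ACQUIRE(&signal, &a, &b)
    std::atomic_thread_fence(std::memory_order_acquire);
    assert(a == 1 && b == 2);
    // c is "not covered": there is no happens-before edge for it, so
    // reading it here would be a data race rather than a guaranteed 3
}

int main()
{
    std::thread t1(producer), t2(consumer);
    t1.join(); t2.join();
}
______________________
The difference from the hypothetical RELEASE(&signal, &a, &b) is that the
standard can only leave c unspecified (reading it above would be a data
race); it cannot positively exclude c the way the targeted form could.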

Need to think some more on this.

Thank you! :^D
