Hi Alan,

On Thu, Jun 21, 2018 at 01:27:12PM -0400, Alan Stern wrote:
> More than one kernel developer has expressed the opinion that the LKMM
> should enforce ordering of writes by release-acquire chains and by
> locking.  In other words, given the following code:
>
> 	WRITE_ONCE(x, 1);
> 	spin_unlock(&s);
> 	spin_lock(&s);
> 	WRITE_ONCE(y, 1);
>
> or the following:
>
> 	smp_store_release(&x, 1);
> 	r1 = smp_load_acquire(&x);	// r1 = 1
> 	WRITE_ONCE(y, 1);
>
> the stores to x and y should be propagated in order to all other CPUs,
> even though those other CPUs might not access the lock s or be part of
> the release-acquire chain.  In terms of the memory model, this means
> that rel-rf-acq-po should be part of the cumul-fence relation.
>
> All the architectures supported by the Linux kernel (including RISC-V)
> do behave this way, albeit for varying reasons.  Therefore this patch
> changes the model in accordance with the developers' wishes.

Interesting...

I think the second example would preclude us using LDAPR for load-acquire,
so I'm surprised that RISC-V is ok with this. For example, the first test
below is allowed on arm64.

I also think this would break if we used DMB LD to implement load-acquire
(second test below).

So I'm not a big fan of this change, and I'm surprised this works on all
architectures. What's the justification?

Will

--->8

AArch64 MP+poslq-poqp+poap
"PosWRLQ PodRWQP RfePA PodRRAP FrePL"
Generator=diyone7 (version 7.46+3)
Prefetch=0:x=F,0:y=W,1:y=F,1:x=T
Com=Rf Fr
Orig=PosWRLQ PodRWQP RfePA PodRRAP FrePL
{
0:X1=x; 0:X4=y;
1:X1=y; 1:X3=x;
}
 P0            | P1           ;
 MOV W0,#1     | LDAR W0,[X1] ;
 STLR W0,[X1]  | LDR W2,[X3]  ;
 LDAPR W2,[X1] |              ;
 MOV W3,#1     |              ;
 STR W3,[X4]   |              ;
exists (1:X0=1 /\ 1:X2=0)


AArch64 MP+pos-dmb.ld+poap
"PosWR DMB.LDdRW RfePA PodRRAP Fre"
Generator=diyone7 (version 7.46+3)
Prefetch=0:x=F,0:y=W,1:y=F,1:x=T
Com=Rf Fr
Orig=PosWR DMB.LDdRW RfePA PodRRAP Fre
{
0:X1=x; 0:X4=y;
1:X1=y; 1:X3=x;
}
 P0           | P1           ;
 MOV W0,#1    | LDAR W0,[X1] ;
 STR W0,[X1]  | LDR W2,[X3]  ;
 LDR W2,[X1]  |              ;
 DMB LD       |              ;
 MOV W3,#1    |              ;
 STR W3,[X4]  |              ;
exists (1:X0=1 /\ 1:X2=0)
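

For reference, the second quoted example can be written at the LKMM level in
the same message-passing shape as the AArch64 tests: P0 does the release
store, same-CPU acquire load and plain store, and P1 is an acquire reader.
This is only a sketch. The test name and the register names r1, r2 and r3 are
made up here, and the test has not been run through herd7. Going by the patch
description, adding rel-rf-acq-po to cumul-fence is meant to forbid the
"exists" clause.

```litmus
C MP+rel-acq-same-cpu+acq-po

(*
 * Hypothetical test name. LKMM-level counterpart of the AArch64 tests:
 * does the release/acquire pair on x order the stores to x and y as
 * seen by P1, which takes no part in that chain?
 *)

{}

P0(int *x, int *y)
{
	int r1;

	smp_store_release(x, 1);
	r1 = smp_load_acquire(x);
	WRITE_ONCE(*y, 1);
}

P1(int *x, int *y)
{
	int r2;
	int r3;

	r2 = smp_load_acquire(y);
	r3 = READ_ONCE(*x);
}

exists (1:r2=1 /\ 1:r3=0)
```

The "exists" clause has the same shape as the one in the AArch64 tests: P1
sees the store to y but not the earlier store to x.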