On Thu, 18 Dec 2025 08:51:02 -0500
Mathieu Desnoyers <[email protected]> wrote:
> On 2025-12-18 04:03, David Laight wrote:
> [...]
> >> + *
> >> + * The compiler barrier() is ineffective at fixing this issue. It does
> >> + * not prevent the compiler CSE from losing the address dependency:
> >> + *
> >> + * int fct_2_volatile_barriers(void)
> >> + * {
> >> + * int *a, *b;
> >> + *
> >> + * do {
> >> + * a = READ_ONCE(p);
> >> + * asm volatile ("" : : : "memory");
> >> + * b = READ_ONCE(p);
> >> + * } while (a != b);
> >> + * asm volatile ("" : : : "memory"); <-- barrier()
> >> + * return *b;
> >> + * }
> >> + *
> >> + * With gcc 14.2 (arm64):
> >> + *
> >> + * fct_2_volatile_barriers:
> >> + * adrp x0, .LANCHOR0
> >> + * add x0, x0, :lo12:.LANCHOR0
> >> + * .L2:
> >> + * ldr x1, [x0] <-- x1 populated by first load.
> >> + * ldr x2, [x0]
> >> + * cmp x1, x2
> >> + * bne .L2
> >> + * ldr w0, [x1] <-- x1 is used for access which should
> >> + * depend on b.
> >> + * ret
> >> + *
> >> + * On weakly-ordered architectures, this lets CPU speculation use the
> >> + * result from the first load to speculate "ldr w0, [x1]" before
> >> + * "ldr x2, [x0]".
> >> + * Based on the RCU documentation, the control dependency does not
> >> + * prevent the CPU from speculating loads.
> >
> > I'm not sure that example (of something that doesn't work) is really
> > necessary.
> > A simpler example would do. Given:
> >	return a == b ? *a : 0;
> > the generated code might speculatively dereference 'b' (not 'a') before
> > returning zero when the pointers are different.
>
> In the past discussion that led to this new API, AFAIU, Linus made it
> clear that this counter-example needs to be in a comment:

I might remember that...
But if you read the proposed comment, it starts to look like an example.
It is also very long for the file it is in, even if it is clearly marked
as explaining why the same effect can't be achieved with barrier().
Maybe the long gory comment belongs in the rst file?

I do wonder if some places need this:
	#define OPTIMIZER_HIDE_VAL(x) ({ auto _x = (x); OPTIMIZER_HIDE_VAR(_x); _x; })
Then you could do:
	#define ptr_eq(x, y) (OPTIMIZER_HIDE_VAL(x) == OPTIMIZER_HIDE_VAL(y))
which includes the check that the pointers are the same type.
But it would be more generally useful for hiding constants from the optimiser.
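
For illustration, a minimal userspace sketch of the value-returning macro
suggested above, spelled with the kernel's OPTIMIZER_ prefix. It assumes GCC
or Clang (statement expressions, __auto_type, extended asm). The
OPTIMIZER_HIDE_VAR() and READ_ONCE() definitions are simplified stand-ins
for the kernel ones, and read_stable(), check_hide_val(), value and p are
hypothetical names used only here. It shows the shape of the macros, not
proof that the address dependency is preserved on any given compiler.

```c
#include <assert.h>

/* Stand-in for the kernel macro: the empty asm claims to rewrite var,
 * so the compiler can no longer reason about its value. */
#define OPTIMIZER_HIDE_VAR(var) __asm__ ("" : "=r" (var) : "0" (var))

/* Simplified stand-in for READ_ONCE(): a single volatile load. */
#define READ_ONCE(x) (*(const volatile __typeof__(x) *)&(x))

/* Value-returning variant: hide a copy and yield it, so it also works
 * on rvalues such as constants. */
#define OPTIMIZER_HIDE_VAL(x) \
	({ __auto_type _x = (x); OPTIMIZER_HIDE_VAR(_x); _x; })

/* Compare hidden copies: the compiler learns nothing about x versus y,
 * and mismatched pointer types still draw the usual diagnostic. */
#define ptr_eq(x, y) (OPTIMIZER_HIDE_VAL(x) == OPTIMIZER_HIDE_VAL(y))

int value = 42;		/* hypothetical data */
int *p = &value;	/* hypothetical shared pointer */

/* Re-read p until two consecutive loads agree, then dereference b. */
int read_stable(void)
{
	int *a, *b;

	do {
		a = READ_ONCE(p);
		b = READ_ONCE(p);
	} while (!ptr_eq(a, b));
	return *b;
}

/* Hiding a value must not change it. */
void check_hide_val(void)
{
	assert(OPTIMIZER_HIDE_VAL(3) + 4 == 7);
	assert(read_stable() == 42);
}
```

Because the asm output is a private copy, nothing the compiler learns from
the comparison can be propagated back to 'a' or 'b'.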

David

>
> https://lore.kernel.org/lkml/CAHk-=wgBgh5U+dyNaN=+xcdcm2omgsrbch4vbtk8i5zdgws...@mail.gmail.com/
>
> This counter-example is what convinced him that this addresses a real
> issue.
>
> Thanks,
>
> Mathieu
>
>