On 2024-09-28 16:49, Alan Stern wrote:
On Sat, Sep 28, 2024 at 09:51:27AM -0400, Mathieu Desnoyers wrote:
Compiler CSE and SSA GVN optimizations can cause the address dependency
of addresses returned by rcu_dereference to be lost when comparing those
pointers with either constants or previously loaded pointers.
Introduce ptr_eq() to compare two addresses while preserving the address
dependencies for later use of the address. It should be used when
comparing an address returned by rcu_dereference().
This is needed to prevent the compiler CSE and SSA GVN optimizations
from replacing the registers holding @a or @b based on their
"Replacing" isn't the right word. What the compiler does is use one
rather than the other. Furthermore, the compiler can play these games
even with values that aren't in registers.
You should just say: "... from using @a (or @b) in places where the
source refers to @b (or @a) (based on the fact that after the
comparison, the two are known to be equal), which does not ..."
OK.
equality, which does not preserve address dependencies and allows the
following misordering speculations:
- If @b is a constant, the compiler can issue the loads which depend
on @a before loading @a.
- If @b is a register populated by a prior load, weakly-ordered
CPUs can speculate loads which depend on @a before loading @a.
It shouldn't matter whether @a and @b are constants, registers, or
anything else. All that matters is that the compiler uses the wrong
one, which allows weakly ordered CPUs to speculate loads you wouldn't
expect it to, based on the source code alone.
I only partially agree here.
On weakly-ordered architectures, indeed we don't care whether the
issue is caused by the compiler reordering the code (constant)
or the CPU speculating the load (registers).
However, on strongly-ordered architectures, AFAIU, only the constant
case is problematic (compiler reordering the dependent load), because
CPU speculating the loads across the control dependency is not an
issue.
So am I tempted to keep examples that clearly state whether
the issue is caused by compiler reordering instructions, or by
CPU speculation.
The same logic applies with @a and @b swapped.
The compiler barrier() is ineffective at fixing this issue.
It does not prevent the compiler CSE from losing the address dependency:
int fct_2_volatile_barriers(void)
{
int *a, *b;
do {
a = READ_ONCE(p);
asm volatile ("" : : : "memory");
b = READ_ONCE(p);
} while (a != b);
asm volatile ("" : : : "memory"); <----- barrier()
return *b;
}
With gcc 14.2 (arm64):
fct_2_volatile_barriers:
adrp x0, .LANCHOR0
add x0, x0, :lo12:.LANCHOR0
.L2:
ldr x1, [x0] <------ x1 populated by first load.
ldr x2, [x0]
cmp x1, x2
bne .L2
ldr w0, [x1] <------ x1 is used for access which should depend
on b.
ret
On weakly-ordered architectures, this lets CPU speculation use the
result from the first load to speculate "ldr w0, [x1]" before
"ldr x2, [x0]".
Based on the RCU documentation, the control dependency does not prevent
the CPU from speculating loads.
[...]
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 2df665fa2964..f26705c267e8 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -186,6 +186,68 @@ void ftrace_likely_update(struct ftrace_likely_data *f,
int val,
__asm__ ("" : "=r" (var) : "0" (var))
#endif
+/*
+ * Compare two addresses while preserving the address dependencies for
+ * later use of the address. It should be used when comparing an address
+ * returned by rcu_dereference().
+ *
+ * This is needed to prevent the compiler CSE and SSA GVN optimizations
+ * from replacing the registers holding @a or @b based on their
+ * equality, which does not preserve address dependencies and allows the
Replacing with:
* This is needed to prevent the compiler CSE and SSA GVN optimizations
* from using @a (or @b) in places where the source refers to @b (or @a)
* based on the fact that after the comparison, the two are known to be
* equal, which does not preserve address dependencies and allows the
* following misordering speculations:
+ * following misordering speculations:
+ *
+ * - If @b is a constant, the compiler can issue the loads which depend
+ * on @a before loading @a.
+ * - If @b is a register populated by a prior load, weakly-ordered
+ * CPUs can speculate loads which depend on @a before loading @a.
+ *
+ * The same logic applies with @a and @b swapped.
This could be more concise, and it should be more general (along the
same lines as the description above).
As per my earlier comment, I would prefer to keep the examples specific
rather than general so it is clear which scenarios are problematic on
weakly vs strongly ordered architectures.
[...]
Thanks,
Mathieu
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com