Mark notes that gcc optimization passes have the potential to elide
necessary invocations of this instruction sequence, so mark the asm
volaltile.
---
>From Mark:
The volatile will inhibit *some* cases where the compiler could lift the
array_index_nospec() call out of a branch, e.g. where there are multiple
invocations of array_index_nospec() with the same arguments:
if (idx < foo) {
idx1 = array_idx_nospec(idx, foo)
do_something(idx1);
}
< some other code >
if (idx < foo) {
idx2 = array_idx_nospec(idx, foo);
do_something_else(idx2);
}
... since the compiler can determine that the two invocations yield the same
result, and reuse the first result (likely the same register as idx was in
originally) for the second branch, effectively re-writing the above as:
if (idx < foo) {
idx = array_idx_nospec(idx, foo);
do_something(idx);
}
< some other code >
if (idx < foo) {
do_something_else(idx);
}
... if we don't take the first branch, then speculatively take the second, we
lose the nospec protection.
There's more info on volatile asm in the GCC docs:
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#Volatile
Cc: <[email protected]>
Fixes: babdde2698d4 ("x86: Implement array_index_mask_nospec")
Cc: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Ingo Molnar <[email protected]>
Reported-by: Mark Rutland <[email protected]>
Acked-by: Mark Rutland <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
Changes in v2:
* drop the barrier() call (Mark)
* update the example (Mark)
arch/x86/include/asm/barrier.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/include/asm/barrier.h b/arch/x86/include/asm/barrier.h
index 042b5e892ed1..14de0432d288 100644
--- a/arch/x86/include/asm/barrier.h
+++ b/arch/x86/include/asm/barrier.h
@@ -38,7 +38,7 @@ static inline unsigned long array_index_mask_nospec(unsigned
long index,
{
unsigned long mask;
- asm ("cmp %1,%2; sbb %0,%0;"
+ asm volatile ("cmp %1,%2; sbb %0,%0;"
:"=r" (mask)
:"g"(size),"r" (index)
:"cc");