On 2025-10-17 06:08, Thomas Gleixner wrote:
When CONFIG_CPU_SPECTRE=n, get_user() is missing the 8-byte ASM variant
for no good reason. This prevents using get_user(u64) in generic code.

Implement it as a sequence of two 4-byte reads with LE/BE awareness, and
make the type of the intermediate variable that is read into (unsigned
long or unsigned long long) depend on the target type.

The __long_type() macro and the idea were lifted from PowerPC. Thanks to
Christophe for pointing it out.
Reported-by: kernel test robot <[email protected]>
Closes: https://lore.kernel.org/oe-kbuild-all/[email protected]/
Signed-off-by: Thomas Gleixner <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
---
V2a: Solve the *ptr issue vs. unsigned long long - Russell/Christophe
V2: New patch to fix the 0-day fallout
---
arch/arm/include/asm/uaccess.h | 26 +++++++++++++++++++++++++-
1 file changed, 25 insertions(+), 1 deletion(-)
--- a/arch/arm/include/asm/uaccess.h
+++ b/arch/arm/include/asm/uaccess.h
@@ -283,10 +283,17 @@ extern int __put_user_8(void *, unsigned
__gu_err; \
})
+/*
+ * This is a type: either unsigned long, if the argument fits into
+ * that type, or otherwise unsigned long long.
+ */
+#define __long_type(x) \
+ __typeof__(__builtin_choose_expr(sizeof(x) > sizeof(0UL), 0ULL, 0UL))
+
#define __get_user_err(x, ptr, err, __t) \
do { \
unsigned long __gu_addr = (unsigned long)(ptr); \
- unsigned long __gu_val; \
+ __long_type(x) __gu_val; \
unsigned int __ua_flags; \
__chk_user_ptr(ptr); \
might_fault(); \
@@ -295,6 +302,7 @@ do { \
case 1: __get_user_asm_byte(__gu_val, __gu_addr, err, __t); break; \
case 2: __get_user_asm_half(__gu_val, __gu_addr, err, __t); break; \
case 4: __get_user_asm_word(__gu_val, __gu_addr, err, __t); break; \
+ case 8: __get_user_asm_dword(__gu_val, __gu_addr, err, __t); break; \
default: (__gu_val) = __get_user_bad(); \
} \
uaccess_restore(__ua_flags); \
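As a quick sanity check on the __long_type() selection (my own example,
not part of the patch; it assumes a 32-bit build where sizeof(unsigned
long) == 4):

	/* Illustration only: __long_type() stays with unsigned long for
	 * accesses that fit in it, and widens to unsigned long long for
	 * an 8-byte access.
	 */
	u32 v32; u64 v64;
	_Static_assert(sizeof(__long_type(v32)) == 4, "4-byte access");
	_Static_assert(sizeof(__long_type(v64)) == 8, "8-byte access");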
@@ -353,6 +361,22 @@ do { \
#define __get_user_asm_word(x, addr, err, __t) \
__get_user_asm(x, addr, err, "ldr" __t)
+#ifdef __ARMEB__
+#define __WORD0_OFFS 4
+#define __WORD1_OFFS 0
+#else
+#define __WORD0_OFFS 0
+#define __WORD1_OFFS 4
+#endif
+
+#define __get_user_asm_dword(x, addr, err, __t) \
+ ({ \
+ unsigned long __w0, __w1; \
+ __get_user_asm(__w0, addr + __WORD0_OFFS, err, "ldr" __t); \
+ __get_user_asm(__w1, addr + __WORD1_OFFS, err, "ldr" __t); \
+ (x) = ((u64)__w1 << 32) | (u64) __w0; \
+})
If we look at __get_user_asm_half, it always loads the lower address
first (__gu_addr) and then the following address (__gu_addr + 1). This
new dword code flips the order of the word accesses between BE and LE,
which means that on BE we read the second word first and then move back
one word.

I'm not sure whether it matters, but I'm pointing it out in case the
hardware memory access pattern is a concern.
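Concretely, expanding the macro by hand (my own expansion, for
illustration), the load order on the two endiannesses is:

	/* LE: ascending addresses */
	__get_user_asm(__w0, addr + 0, err, "ldr" __t);
	__get_user_asm(__w1, addr + 4, err, "ldr" __t);

	/* BE: descending addresses */
	__get_user_asm(__w0, addr + 4, err, "ldr" __t);
	__get_user_asm(__w1, addr + 0, err, "ldr" __t);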
Also, we end up with __get_user_asm_{half,dword} effectively doing the
same trick (a split load recombined with endianness awareness) in very
different ways, so it would be good to come up with a unified pattern.
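Something along these lines (completely untested, just to sketch the
idea) would keep both loads in ascending address order and confine the
endianness handling to the final combine. Note that
IS_ENABLED(CONFIG_CPU_BIG_ENDIAN) is my assumption for the compile-time
test; the patch uses __ARMEB__:

	/* Untested sketch: always load addr first, then addr + 4, and
	 * pick which word becomes the high half based on endianness.
	 */
	#define __get_user_asm_dword(x, addr, err, __t)		\
	do {								\
		unsigned long __w0, __w1;				\
		__get_user_asm(__w0, (addr), err, "ldr" __t);		\
		__get_user_asm(__w1, (addr) + 4, err, "ldr" __t);	\
		if (IS_ENABLED(CONFIG_CPU_BIG_ENDIAN))			\
			(x) = ((u64)__w0 << 32) | (u64)__w1;		\
		else							\
			(x) = ((u64)__w1 << 32) | (u64)__w0;		\
	} while (0)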
Thanks,
Mathieu
+
#define __put_user_switch(x, ptr, __err, __fn) \
do { \
const __typeof__(*(ptr)) __user *__pu_ptr = (ptr); \
--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com