Re: [RHEL-8] arm64: add missing early clobber in atomic64_dec_if_positive()
Hi Mark,

Thanks for reporting this.

On Sat, May 19, 2018 at 08:17:26PM -0400, Mark Salter wrote:
> When running a kernel compiled with gcc8 on a machine using LSE, I
> get:
>
>   Unable to handle kernel paging request at virtual address 112221
>   [...]
>
> The fault happens at the casal insn of inlined atomic64_dec_if_positive().
> The inline asm code in that function has:
>
>   "1:	ldr	x30, %[v]\n"
>   "	subs	%[ret], x30, #1\n"
>   "	b.lt	2f\n"
>   "	casal	x30, %[ret], %[v]\n"
>   "	sub	x30, x30, #1\n"
>   "	sub	x30, x30, %[ret]\n"
>   "	cbnz	x30, 1b\n"
>   "2:")
>   : [ret] "+r" (x0), [v] "+Q" (v->counter)
>
> gcc8 used register x0 for both [ret] and [v] and the subs was
> clobbering [v] before it was used for casal. Gcc is free to do
> this because [ret] lacks an early clobber modifier. So add one
> to tell gcc a separate register is needed for [v].

Oh blimey, it looks like GCC is realising that counter is at offset 0
of atomic_t and therefore assigns the same register for [ret] and [v],
which is actually forced to be x0 by the 'register' local variable in
C code. The "+Q" constraint only says that the memory is read/write,
so the pointer is fair game.

I agree with your fix, but we also need to fix up the other places
relying on this. Patch below -- please yell if you think I missed any.

Cheers,

Will

--->8

From 3d9417b28ed2588c33b7e54e6681c88f0224201a Mon Sep 17 00:00:00 2001
From: Will Deacon
Date: Mon, 21 May 2018 17:44:57 +0100
Subject: [PATCH] arm64: lse: Add early clobbers to some input/output asm operands

For LSE atomics that read and write a register operand, we need to
ensure that these operands are annotated as "early clobber" if the
register is written before all of the input operands have been
consumed. Failure to do so can result in the compiler allocating the
same register to both operands, leading to splats such as:

  Unable to handle kernel paging request at virtual address 112221
  [...]
  x1 :                  x0 : 2221

  Process swapper/0 (pid: 1, stack limit = 0x8209f908)
  Call trace:
   test_atomic64+0x1360/0x155c

where x0 has been allocated as both the value to be stored and also the
atomic_t pointer. This patch adds the missing clobbers.

Cc:
Cc: Dave Martin
Cc: Robin Murphy
Reported-by: Mark Salter
Signed-off-by: Will Deacon
---
 arch/arm64/include/asm/atomic_lse.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 9ef0797380cb..f9b0b09153e0 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -117,7 +117,7 @@ static inline void atomic_and(int i, atomic_t *v)
 	/* LSE atomics */
 	"	mvn	%w[i], %w[i]\n"
 	"	stclr	%w[i], %[v]")
-	: [i] "+r" (w0), [v] "+Q" (v->counter)
+	: [i] "+&r" (w0), [v] "+Q" (v->counter)
 	: "r" (x1)
 	: __LL_SC_CLOBBERS);
 }
@@ -135,7 +135,7 @@ static inline int atomic_fetch_and##name(int i, atomic_t *v)	\
 	/* LSE atomics */						\
 	"	mvn	%w[i], %w[i]\n"					\
 	"	ldclr" #mb "	%w[i], %w[i], %[v]")			\
-	: [i] "+r" (w0), [v] "+Q" (v->counter)				\
+	: [i] "+&r" (w0), [v] "+Q" (v->counter)				\
 	: "r" (x1)							\
 	: __LL_SC_CLOBBERS, ##cl);					\
									\
@@ -161,7 +161,7 @@ static inline void atomic_sub(int i, atomic_t *v)
 	/* LSE atomics */
 	"	neg	%w[i], %w[i]\n"
 	"	stadd	%w[i], %[v]")
-	: [i] "+r" (w0), [v] "+Q" (v->counter)
+	: [i] "+&r" (w0), [v] "+Q" (v->counter)
 	: "r" (x1)
 	: __LL_SC_CLOBBERS);
 }
@@ -180,7 +180,7 @@ static inline int atomic_sub_return##name(int i, atomic_t *v)	\
 	"	neg	%w[i], %w[i]\n"					\
 	"	ldadd" #mb "	%w[i], w30, %[v]\n"			\
 	"	add	%w[i], %w[i], w30")				\
-	: [i] "+r" (w0), [v] "+Q" (v->counter)				\
+	: [i] "+&r" (w0), [v] "+Q" (v->counter)				\
 	: "r" (x1)							\
 	: __LL_SC_CLOBBERS , ##cl);					\
									\
@@ -207,7 +207,7 @@ static inline int atomic_fetch_sub##name(int i, atomic_t
[RHEL-8] arm64: add missing early clobber in atomic64_dec_if_positive()
When running a kernel compiled with gcc8 on a machine using LSE, I get:

 Unable to handle kernel paging request at virtual address 112221
 Mem abort info:
   ESR = 0x9621
   Exception class = DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
 Data abort info:
   ISV = 0, ISS = 0x0021
   CM = 0, WnR = 0
 [00112221] address between user and kernel address ranges
 Internal error: Oops: 9621 [#1] SMP
 ...
 pstate: 2049 (nzCv daif +PAN -UAO)
 pc : test_atomic64+0x1360/0x155c
 lr : 0x
 sp : 0bc6fd60
 x29: 0bc6fd60         x28:
 x27:                  x26: 08f04460
 x25: 08de0584         x24: 08e91060
 x23: aaa31337c001d00e x22: 999202269ddfadeb
 x21: aaa31337c001d00c x20: bbb42448e223f22f
 x19: aaa31337c001d00d x18: 0010
 x17: 0222             x16: 10e0
 x15:                  x14: 09233c08
 x13: 89925a8f         x12: 09925a97
 x11: 0927f000         x10: 0bc6fac0
 x9 : ffd0             x8 : 0853fdf8
 x7 : deadbeef         x6 : 0bc6fda0
 x5 : aaa31337c001d00d x4 : deadbeefdeafcafe
 x3 : aaa31337c001d00d x2 : aaa31337c001d00e
 x1 :                  x0 : 2221
 Process swapper/0 (pid: 1, stack limit = 0x8209f908)
 Call trace:
  test_atomic64+0x1360/0x155c
  test_atomics_init+0x10/0x28
  do_one_initcall+0x134/0x160
  kernel_init_freeable+0x18c/0x21c
  kernel_init+0x18/0x108
  ret_from_fork+0x10/0x1c
 Code: f90023e1 f940001e f10007c0 54ab (c8fefc00)
 ---[ end trace 29569e7320c6e926 ]---

The fault happens at the casal insn of inlined atomic64_dec_if_positive().
The inline asm code in that function has:

	"1:	ldr	x30, %[v]\n"
	"	subs	%[ret], x30, #1\n"
	"	b.lt	2f\n"
	"	casal	x30, %[ret], %[v]\n"
	"	sub	x30, x30, #1\n"
	"	sub	x30, x30, %[ret]\n"
	"	cbnz	x30, 1b\n"
	"2:")
	: [ret] "+r" (x0), [v] "+Q" (v->counter)

gcc8 used register x0 for both [ret] and [v] and the subs was
clobbering [v] before it was used for casal. Gcc is free to do
this because [ret] lacks an early clobber modifier. So add one
to tell gcc a separate register is needed for [v].
Signed-off-by: Mark Salter
---
 arch/arm64/include/asm/atomic_lse.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 9ef0797380cb..99fa69c9c3cf 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -435,7 +435,7 @@ static inline long atomic64_dec_if_positive(atomic64_t *v)
 	"	sub	x30, x30, %[ret]\n"
 	"	cbnz	x30, 1b\n"
 	"2:")
-	: [ret] "+r" (x0), [v] "+Q" (v->counter)
+	: [ret] "+&r" (x0), [v] "+Q" (v->counter)
 	:
 	: __LL_SC_CLOBBERS, "cc", "memory");
--
2.17.0
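[Editorial note: for readers unfamiliar with the casal sequence being patched, it is a compare-and-swap retry loop. A hedged C model using GCC's __atomic builtins behaves the same way; this is an illustration, not the kernel implementation, and model_dec_if_positive is a made-up name.]

```c
#include <assert.h>

/* C model of the casal-based loop in atomic64_dec_if_positive():
 * load the counter, bail out if decrementing would go negative,
 * otherwise try to CAS in the decremented value, retrying on races.
 * Returns the decremented value either way; it is only stored when
 * it stays non-negative. */
static long model_dec_if_positive(long *v)
{
    long old, ret;

    do {
        old = __atomic_load_n(v, __ATOMIC_RELAXED);   /* ldr  x30, %[v]  */
        ret = old - 1;                                /* subs %[ret], .. */
        if (ret < 0)                                  /* b.lt 2f         */
            break;
    } while (!__atomic_compare_exchange_n(v, &old, ret,   /* casal       */
                                          /* weak = */ 1,
                                          __ATOMIC_ACQ_REL,
                                          __ATOMIC_RELAXED));
    return ret;
}
```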