Re: [RHEL-8] arm64: add missing early clobber in atomic64_dec_if_positive()

2018-05-21 Thread Will Deacon
Hi Mark,

Thanks for reporting this.

On Sat, May 19, 2018 at 08:17:26PM -0400, Mark Salter wrote:
> When running a kernel compiled with gcc8 on a machine using LSE, I
> get:
> 
>  Unable to handle kernel paging request at virtual address 112221

[...]

> The fault happens at the casal insn of inlined atomic64_dec_if_positive().
> The inline asm code in that function has:
> 
>   "1: ldr x30, %[v]\n"
>   "   subs%[ret], x30, #1\n"
>   "   b.lt2f\n"
>   "   casal   x30, %[ret], %[v]\n"
>   "   sub x30, x30, #1\n"
>   "   sub x30, x30, %[ret]\n"
>   "   cbnzx30, 1b\n"
>   "2:")
>   : [ret] "+r" (x0), [v] "+Q" (v->counter)
> 
> gcc8 used register x0 both for [ret] and as the address of [v], so the
> subs clobbered the address before the casal used it. GCC is free to do
> this because [ret] lacks an early-clobber modifier. So add one to tell
> gcc that a separate register is needed for [v].

Oh blimey, it looks like GCC is realising that counter is at offset 0
of atomic64_t and is therefore assigning the same register to [ret] and
to the base address of [v]; that register is in any case forced to be
x0 by the 'register' local variable in the C code. The "+Q" constraint
only says that the memory is read/write, so the pointer itself is fair
game.
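
For anyone who wants to poke at this outside the kernel, here is a
minimal userspace sketch of the hazard (hypothetical test code, not
from this thread; it assumes GCC targeting an ARMv8.1 machine with LSE,
built with gcc -march=armv8.1-a):

/*
 * Hypothetical standalone reproducer, not kernel code. Without the '&'
 * in "+&r" below, GCC may allocate the same register to [ret] and to
 * the base address of the [v] memory operand; the subs then destroys
 * the address before the casal dereferences it.
 */
#include <stdio.h>

typedef struct { long counter; } atomic64_t;

static long dec_if_positive(atomic64_t *v)
{
	register long x0 asm ("x0") = (long)v;	/* pinned to x0, as in the kernel */

	asm volatile(
	"1:	ldr	x30, %[v]\n"
	"	subs	%[ret], x30, #1\n"	/* writes [ret] early...      */
	"	b.lt	2f\n"
	"	casal	x30, %[ret], %[v]\n"	/* ...but [v] is still needed */
	"	sub	x30, x30, #1\n"
	"	sub	x30, x30, %[ret]\n"
	"	cbnz	x30, 1b\n"
	"2:"
	: [ret] "+&r" (x0), [v] "+Q" (v->counter)	/* '&' = early clobber */
	:
	: "x30", "cc", "memory");

	return x0;
}

int main(void)
{
	atomic64_t a = { .counter = 5 };
	long ret = dec_if_positive(&a);

	printf("ret=%ld counter=%ld\n", ret, a.counter);	/* ret=4 counter=4 */
	return 0;
}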

I agree with your fix, but we also need to fix up the other places relying
on this. Patch below -- please yell if you think I missed any.

Cheers,

Will

--->8

From 3d9417b28ed2588c33b7e54e6681c88f0224201a Mon Sep 17 00:00:00 2001
From: Will Deacon 
Date: Mon, 21 May 2018 17:44:57 +0100
Subject: [PATCH] arm64: lse: Add early clobbers to some input/output asm
 operands

For LSE atomics that read and write a register operand, we need to
ensure that these operands are annotated as "early clobber" if the
register is written before all of the input operands have been consumed.
Failure to do so can result in the compiler allocating the same register
to both operands, leading to splats such as:

 Unable to handle kernel paging request at virtual address 112221
 [...]
 x1 :  x0 : 2221
 Process swapper/0 (pid: 1, stack limit = 0x8209f908)
 Call trace:
  test_atomic64+0x1360/0x155c

where x0 has been allocated as both the value to be stored and also the
atomic_t pointer.

This patch adds the missing clobbers.

Cc: 
Cc: Dave Martin 
Cc: Robin Murphy 
Reported-by: Mark Salter 
Signed-off-by: Will Deacon 
---
 arch/arm64/include/asm/atomic_lse.h | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 9ef0797380cb..f9b0b09153e0 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -117,7 +117,7 @@ static inline void atomic_and(int i, atomic_t *v)
/* LSE atomics */
"   mvn %w[i], %w[i]\n"
"   stclr   %w[i], %[v]")
-   : [i] "+r" (w0), [v] "+Q" (v->counter)
+   : [i] "+" (w0), [v] "+Q" (v->counter)
: "r" (x1)
: __LL_SC_CLOBBERS);
 }
@@ -135,7 +135,7 @@ static inline int atomic_fetch_and##name(int i, atomic_t *v)\
/* LSE atomics */   \
"   mvn %w[i], %w[i]\n" \
"   ldclr" #mb "%w[i], %w[i], %[v]")\
-   : [i] "+r" (w0), [v] "+Q" (v->counter)  \
+   : [i] "+" (w0), [v] "+Q" (v->counter) \
: "r" (x1)  \
: __LL_SC_CLOBBERS, ##cl);  \
\
@@ -161,7 +161,7 @@ static inline void atomic_sub(int i, atomic_t *v)
/* LSE atomics */
"   neg %w[i], %w[i]\n"
"   stadd   %w[i], %[v]")
-   : [i] "+r" (w0), [v] "+Q" (v->counter)
+   : [i] "+" (w0), [v] "+Q" (v->counter)
: "r" (x1)
: __LL_SC_CLOBBERS);
 }
@@ -180,7 +180,7 @@ static inline int atomic_sub_return##name(int i, atomic_t *v)   \
"   neg %w[i], %w[i]\n" \
"   ldadd" #mb "%w[i], w30, %[v]\n" \
"   add %w[i], %w[i], w30") \
-   : [i] "+r" (w0), [v] "+Q" (v->counter)  \
+   : [i] "+" (w0), [v] "+Q" (v->counter) \
: "r" (x1)  \
: __LL_SC_CLOBBERS , ##cl); \
\
@@ -207,7 +207,7 @@ static inline int atomic_fetch_sub##name(int i, atomic_t *v)\
/* LSE atomics */   \
"   neg %w[i], %w[i]\n" \

[...]

[RHEL-8] arm64: add missing early clobber in atomic64_dec_if_positive()

2018-05-19 Thread Mark Salter
When running a kernel compiled with gcc8 on a machine using LSE, I
get:

 Unable to handle kernel paging request at virtual address 112221
 Mem abort info:
   ESR = 0x9621
   Exception class = DABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
 Data abort info:
   ISV = 0, ISS = 0x0021
   CM = 0, WnR = 0
 [00112221] address between user and kernel address ranges
 Internal error: Oops: 9621 [#1] SMP
 ...
 pstate: 2049 (nzCv daif +PAN -UAO)
 pc : test_atomic64+0x1360/0x155c
 lr : 0x
 sp : 0bc6fd60
 x29: 0bc6fd60 x28: 
 x27:  x26: 08f04460
 x25: 08de0584 x24: 08e91060
 x23: aaa31337c001d00e x22: 999202269ddfadeb
 x21: aaa31337c001d00c x20: bbb42448e223f22f
 x19: aaa31337c001d00d x18: 0010
 x17: 0222 x16: 10e0
 x15:  x14: 09233c08
 x13: 89925a8f x12: 09925a97
 x11: 0927f000 x10: 0bc6fac0
 x9 : ffd0 x8 : 0853fdf8
 x7 : deadbeef x6 : 0bc6fda0
 x5 : aaa31337c001d00d x4 : deadbeefdeafcafe
 x3 : aaa31337c001d00d x2 : aaa31337c001d00e
 x1 :  x0 : 2221
 Process swapper/0 (pid: 1, stack limit = 0x8209f908)
 Call trace:
  test_atomic64+0x1360/0x155c
  test_atomics_init+0x10/0x28
  do_one_initcall+0x134/0x160
  kernel_init_freeable+0x18c/0x21c
  kernel_init+0x18/0x108
  ret_from_fork+0x10/0x1c
 Code: f90023e1 f940001e f10007c0 54ab (c8fefc00)
 ---[ end trace 29569e7320c6e926 ]---

The fault happens at the casal insn of inlined atomic64_dec_if_positive().
The inline asm code in that function has:

"1: ldr x30, %[v]\n"
"   subs%[ret], x30, #1\n"
"   b.lt2f\n"
"   casal   x30, %[ret], %[v]\n"
"   sub x30, x30, #1\n"
"   sub x30, x30, %[ret]\n"
"   cbnzx30, 1b\n"
"2:")
: [ret] "+r" (x0), [v] "+Q" (v->counter)

gcc8 used register x0 both for [ret] and as the address of [v], so the
subs clobbered the address before the casal used it. GCC is free to do
this because [ret] lacks an early-clobber modifier. So add one to tell
gcc that a separate register is needed for [v].
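
For reference, a stripped-down illustration of what the early-clobber
modifier means (a hypothetical userspace sketch, not kernel code):

/*
 * Hypothetical example: %0 is written by the mov before the input %1
 * is read by the str. With plain "=r", GCC may give %0 and %1 the same
 * register, destroying the pointer before it is used; "=&r" (early
 * clobber) forbids the overlap.
 */
#include <stdio.h>

static void store_zero(long *ptr)
{
	long tmp;

	asm volatile("mov	%0, #0\n\t"	/* writes %0 first...      */
		     "str	%0, [%1]"	/* ...then reads input %1  */
		     : "=&r" (tmp)		/* early clobber: distinct */
		     : "r" (ptr)		/* register from %1        */
		     : "memory");
}

int main(void)
{
	long x = 42;

	store_zero(&x);
	printf("%ld\n", x);	/* prints 0 */
	return 0;
}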

Signed-off-by: Mark Salter 
---
 arch/arm64/include/asm/atomic_lse.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/atomic_lse.h b/arch/arm64/include/asm/atomic_lse.h
index 9ef0797380cb..99fa69c9c3cf 100644
--- a/arch/arm64/include/asm/atomic_lse.h
+++ b/arch/arm64/include/asm/atomic_lse.h
@@ -435,7 +435,7 @@ static inline long atomic64_dec_if_positive(atomic64_t *v)
"   sub x30, x30, %[ret]\n"
"   cbnzx30, 1b\n"
"2:")
-   : [ret] "+r" (x0), [v] "+Q" (v->counter)
+   : [ret] "+" (x0), [v] "+Q" (v->counter)
:
: __LL_SC_CLOBBERS, "cc", "memory");
 
-- 
2.17.0


