Re: [PATCH, AArch64 v2 01/11] aarch64: Simplify LSE cas generation

2018-10-31  Richard Henderson
On 10/31/18 10:02 AM, Richard Henderson wrote:
> On 10/30/18 7:48 PM, James Greenhalgh wrote:
>> On Tue, Oct 02, 2018 at 11:19:05AM -0500, Richard Henderson wrote:
>>> The cas insn is a single insn, and if expanded properly need not
>>> be split after reload.  Use the proper inputs for the insn.
>>
>> OK.
> 
> Thanks.  Committed 1-4 & 9.

Now only 7, 8, 10, 11 are outstanding.

I could split out the pure-ISA TImode compare-and-swap, putting the out-of-line
portion into a separate patch if you like, James.


r~


Re: [PATCH, AArch64 v2 01/11] aarch64: Simplify LSE cas generation

2018-10-31  Richard Henderson
On 10/30/18 7:48 PM, James Greenhalgh wrote:
> On Tue, Oct 02, 2018 at 11:19:05AM -0500, Richard Henderson wrote:
>> The cas insn is a single insn, and if expanded properly need not
>> be split after reload.  Use the proper inputs for the insn.
> 
> OK.

Thanks.  Committed 1-4 & 9.


r~


Re: [PATCH, AArch64 v2 01/11] aarch64: Simplify LSE cas generation

2018-10-30  James Greenhalgh
On Tue, Oct 02, 2018 at 11:19:05AM -0500, Richard Henderson wrote:
> The cas insn is a single insn, and if expanded properly need not
> be split after reload.  Use the proper inputs for the insn.

OK.

Thanks,
James

> 
>   * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
>   Force oldval into the rval register for TARGET_LSE; emit the compare
>   during initial expansion so that it may be deleted if unused.
>   (aarch64_gen_atomic_cas): Remove.
>   * config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse):
>   Change = to +r for operand 0; use match_dup for operand 2;
>   remove is_weak and mod_f operands as unused.  Drop the split
>   and merge with...
>   (@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove.
>   (@aarch64_compare_and_swap<GPI>_lse): Similarly.
>   (@aarch64_atomic_cas<GPI>): Similarly.


[PATCH, AArch64 v2 01/11] aarch64: Simplify LSE cas generation

2018-10-02  Richard Henderson
The cas insn is a single insn, and if expanded properly need not
be split after reload.  Use the proper inputs for the insn.
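
For context, here is a minimal, illustrative C example (not part of the
patch; the function name and the -march flag are mine) of the kind of
operation whose LSE expansion is simplified here.  Built with LSE enabled
(e.g. -march=armv8.1-a), the builtin below can be expanded to a single
CAS-family instruction instead of an LDXR/STXR retry loop:

/* Illustrative sketch only.  */
#include <stdatomic.h>
#include <stdbool.h>

bool
try_update (atomic_int *p, int expected, int desired)
{
  /* Strong compare-and-swap: returns true iff *p equalled EXPECTED and
     was replaced by DESIRED; on failure, EXPECTED receives the value
     actually observed in memory.  */
  return atomic_compare_exchange_strong (p, &expected, desired);
}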

	* config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
	Force oldval into the rval register for TARGET_LSE; emit the compare
	during initial expansion so that it may be deleted if unused.
	(aarch64_gen_atomic_cas): Remove.
	* config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse):
	Change = to +r for operand 0; use match_dup for operand 2;
	remove is_weak and mod_f operands as unused.  Drop the split
	and merge with...
	(@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove.
	(@aarch64_compare_and_swap<GPI>_lse): Similarly.
	(@aarch64_atomic_cas<GPI>): Similarly.
---
 gcc/config/aarch64/aarch64-protos.h |   1 -
 gcc/config/aarch64/aarch64.c        |  46 ---
 gcc/config/aarch64/atomics.md       | 121
 3 files changed, 49 insertions(+), 119 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index caf1d2041f0..3d045cf43be 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -562,7 +562,6 @@ rtx aarch64_load_tp (rtx);
 
 void aarch64_expand_compare_and_swap (rtx op[]);
 void aarch64_split_compare_and_swap (rtx op[]);
-void aarch64_gen_atomic_cas (rtx, rtx, rtx, rtx, rtx);
 
 bool aarch64_atomic_ldop_supported_p (enum rtx_code);
 void aarch64_gen_atomic_ldop (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 12f7dfe9a75..fbec54fe5da 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -14183,16 +14183,27 @@ aarch64_expand_compare_and_swap (rtx operands[])
     }
 
   if (TARGET_LSE)
-    emit_insn (gen_aarch64_compare_and_swap_lse (mode, rval, mem, oldval,
-                                                 newval, is_weak, mod_s,
-                                                 mod_f));
+    {
+      /* The CAS insn requires oldval and rval overlap, but we need to
+         have a copy of oldval saved across the operation to tell if
+         the operation is successful.  */
+      if (mode == QImode || mode == HImode)
+        rval = copy_to_mode_reg (SImode, gen_lowpart (SImode, oldval));
+      else if (reg_overlap_mentioned_p (rval, oldval))
+        rval = copy_to_mode_reg (mode, oldval);
+      else
+        emit_move_insn (rval, oldval);
+      emit_insn (gen_aarch64_compare_and_swap_lse (mode, rval, mem,
+                                                   newval, mod_s));
+      aarch64_gen_compare_reg (EQ, rval, oldval);
+    }
   else
     emit_insn (gen_aarch64_compare_and_swap (mode, rval, mem, oldval, newval,
                                              is_weak, mod_s, mod_f));
 
-
   if (mode == QImode || mode == HImode)
-    emit_move_insn (operands[1], gen_lowpart (mode, rval));
+    rval = gen_lowpart (mode, rval);
+  emit_move_insn (operands[1], rval);
 
   x = gen_rtx_REG (CCmode, CC_REGNUM);
   x = gen_rtx_EQ (SImode, x, const0_rtx);
@@ -14242,31 +14253,6 @@ aarch64_emit_post_barrier (enum memmodel model)
     }
 }
 
-/* Emit an atomic compare-and-swap operation.  RVAL is the destination register
-   for the data in memory.  EXPECTED is the value expected to be in memory.
-   DESIRED is the value to store to memory.  MEM is the memory location.  MODEL
-   is the memory ordering to use.  */
-
-void
-aarch64_gen_atomic_cas (rtx rval, rtx mem,
-                        rtx expected, rtx desired,
-                        rtx model)
-{
-  machine_mode mode;
-
-  mode = GET_MODE (mem);
-
-  /* Move the expected value into the CAS destination register.  */
-  emit_insn (gen_rtx_SET (rval, expected));
-
-  /* Emit the CAS.  */
-  emit_insn (gen_aarch64_atomic_cas (mode, rval, mem, desired, model));
-
-  /* Compare the expected value with the value loaded by the CAS, to establish
-     whether the swap was made.  */
-  aarch64_gen_compare_reg (EQ, rval, expected);
-}
-
 /* Split a compare and swap pattern.  */
 
 void
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index bba8e9e9c8e..22660850af1 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -85,56 +85,50 @@
   }
 )
 
-(define_insn_and_split "@aarch64_compare_and_swap<mode>_lse"
-  [(set (reg:CC CC_REGNUM) ;; bool out
-    (unspec_volatile:CC [(const_int 0)] UNSPECV_ATOMIC_CMPSW))
-   (set (match_operand:SI 0 "register_operand" "=")  ;; val out
+(define_insn "@aarch64_compare_and_swap<mode>_lse"
+  [(set (match_operand:SI 0 "register_operand" "+r")   ;; val out
 (zero_extend:SI
-  (match_operand:SHORT 1 "aarch64_sync_memory_operand" "+Q"))) ;; memory
+ (match_operand:SHORT 1 "aarch64_sync_memory_operand" "+Q"))) ;; memory
(set (match_dup 1)
 (unspec_volatile:SHORT
-  [(match_operand:SI 2 "aarch64_plus_operand"