Re: [GCC][PATCH][Aarch64] Stop redundant zero-extension after UMOV when in DI mode

Sam Tebbs Wed, 25 Jul 2018 06:09:12 -0700

On 07/23/2018 05:01 PM, Sudakshina Das wrote:

Hi Sam



On Monday 23 July 2018 11:39 AM, Sam Tebbs wrote:

Hi all,

This patch extends the aarch64_get_lane_zero_extendsi instruction definition to
also cover DI mode. This prevents a redundant AND instruction from being
generated due to the pattern failing to be matched.

Example:

typedef char v16qi __attribute__ ((vector_size (16)));

unsigned long long
foo (v16qi a)
{
  return a[0];
}

Previously generated:

foo:
        umov    w0, v0.b[0]
        and     x0, x0, 255
        ret

And now generates:

foo:
        umov    w0, v0.b[0]
        ret
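
For context on why the AND is redundant: on AArch64, a write to a W register
clears the upper 32 bits of the corresponding X register, so after the UMOV the
value in x0 is already the zero-extended lane. Below is a minimal C sketch of
that reasoning. It is not code from the patch; model_umov_b0 and
and_is_redundant are hypothetical helpers that only model the register
behaviour.

```c
#include <stdint.h>

typedef unsigned char v16qi __attribute__ ((vector_size (16)));

/* Hypothetical model of "umov w0, v0.b[0]": the byte lane is
   zero-extended into the 32-bit W register, and the write to W
   leaves the upper 32 bits of the X register as zero.  */
uint64_t
model_umov_b0 (v16qi a)
{
  uint32_t w0 = a[0];       /* Lane 0 zero-extended to 32 bits.  */
  return (uint64_t) w0;     /* Upper 32 bits of x0 are zero.  */
}

/* Returns 1 when "and x0, x0, 255" would leave the modelled x0
   unchanged, i.e. when the AND is redundant.  */
int
and_is_redundant (v16qi a)
{
  uint64_t x0 = model_umov_b0 (a);
  return (x0 & 255) == x0;
}
```

Since the lane is at most 8 bits wide, the check holds for every input, which
is the property the extended pattern lets the compiler exploit.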

Bootstrapped on aarch64-none-linux-gnu and tested on aarch64-none-elf with no
regressions.

gcc/
2018-07-23  Sam Tebbs <sam.te...@arm.com>

        * config/aarch64/aarch64-simd.md
        (*aarch64_get_lane_zero_extendsi<mode>): Rename to...
        (*aarch64_get_lane_zero_extend<mode><VDQQH:mode>): ... This.
        Use GPI iterator instead of SI mode.

gcc/testsuite
2018-07-23  Sam Tebbs <sam.te...@arm.com>

        * gcc.target/aarch64/extract_zero_extend.c: New file.

You will need an approval from a maintainer, but I would only add one request to this:

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 89e38e6..15fb661 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3032,15 +3032,16 @@
   [(set_attr "type" "neon_to_gp<q>")]
 )

-(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-    (zero_extend:SI
+(define_insn "*aarch64_get_lane_zero_extend<mode><VDQQH:mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+    (zero_extend:GPI

Since you are adding 4 new patterns with this change, could you add
more cases in your test as well to make sure you have coverage for each of them?


Thanks
Sudi


Hi Sudi,

Thanks for the feedback. Here is an updated patch that adds more test cases to
cover the patterns generated by the different mode combinations. The changelog
and description from my original email still apply.


       (vec_select:<VEL>
         (match_operand:VDQQH 1 "register_operand" "w")
         (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_SIMD"
   {

-    operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
+                       INTVAL (operands[2]));
     return "umov\\t%w0, %1.<Vetype>[%2]";
   }
   [(set_attr "type" "neon_to_gp<q>")]

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index f1784d72e55c412d076de43f2f7aad4632d55ecb..e92a3b49c65e84d2a16a2a480c359a0b4d8fa3e3 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3033,15 +3033,16 @@
   [(set_attr "type" "neon_to_gp<q>")]
 )
 
-(define_insn "*aarch64_get_lane_zero_extendsi<mode>"
-  [(set (match_operand:SI 0 "register_operand" "=r")
-	(zero_extend:SI
+(define_insn "*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(zero_extend:GPI
 	  (vec_select:<VEL>
 	    (match_operand:VDQQH 1 "register_operand" "w")
 	    (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
   "TARGET_SIMD"
   {
-    operands[2] = aarch64_endian_lane_rtx (<MODE>mode, INTVAL (operands[2]));
+    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
+					   INTVAL (operands[2]));
     return "umov\\t%w0, %1.<Vetype>[%2]";
   }
   [(set_attr "type" "neon_to_gp<q>")]
diff --git a/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c b/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
new file mode 100644
index 0000000000000000000000000000000000000000..deb613cd23150a83dfd36ae84504415993b97be3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/extract_zero_extend.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+/* Tests *aarch64_get_lane_zero_extenddiv16qi.  */
+typedef unsigned char v16qi __attribute__ ((vector_size (16)));
+/* Tests *aarch64_get_lane_zero_extenddiv8qi.  */
+typedef unsigned char v8qi __attribute__ ((vector_size (8)));
+
+/* Tests *aarch64_get_lane_zero_extendsiv8hi.  */
+typedef unsigned short v16hi __attribute__ ((vector_size (16)));
+/* Tests *aarch64_get_lane_zero_extendsiv4hi.  */
+typedef unsigned short v8hi __attribute__ ((vector_size (8)));
+
+unsigned long long
+foo_16qi (v16qi a)
+{
+  return a[0];
+}
+
+unsigned long long
+foo_8qi (v8qi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_16hi (v16hi a)
+{
+  return a[0];
+}
+
+unsigned int
+foo_8hi (v8hi a)
+{
+  return a[0];
+}
+
+/* { dg-final { scan-assembler-times "umov\\t" 4 } } */
+/* { dg-final { scan-assembler-not "and\\t" } } */
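
The test above extracts byte lanes into a 64-bit result and halfword lanes into
a 32-bit result. As a sketch only, and not part of the posted patch, the
remaining combination (halfword lanes zero-extended to 64 bits) could be
exercised in the same style. The function names foo_di_8hi and foo_di_4hi are
hypothetical, and the typedefs here are named after the element count (8 or 4
halfwords), which differs from the naming used in the test file.

```c
/* 8 x 16-bit lanes in a 128-bit vector.  */
typedef unsigned short v8hi __attribute__ ((vector_size (16)));
/* 4 x 16-bit lanes in a 64-bit vector.  */
typedef unsigned short v4hi __attribute__ ((vector_size (8)));

/* Hypothetical extra case: halfword lane widened to 64 bits.  */
unsigned long long
foo_di_8hi (v8hi a)
{
  return a[0];
}

/* Hypothetical extra case: halfword lane of a 64-bit vector
   widened to 64 bits.  */
unsigned long long
foo_di_4hi (v4hi a)
{
  return a[0];
}
```

If such cases were added to the test file, the expected count in the
scan-assembler-times directive would need adjusting to match.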
