Re: [patch][aarch64]: add usra and ssra combine patterns
On Mon, Jun 17, 2019 at 05:42:45PM +0100, Sylvia Taylor wrote:
> Updating patch with missing scan-assembler checks.

This is OK. I committed it on your behalf as r273703.

Thanks,
James

> Cheers,
> Syl
RE: [patch][aarch64]: add usra and ssra combine patterns
Updating patch with missing scan-assembler checks.

Cheers,
Syl

-----Original Message-----
From: Sylvia Taylor
Sent: 04 June 2019 12:24
To: James Greenhalgh
Cc: Richard Earnshaw; Marcus Shawcroft; gcc-patches@gcc.gnu.org; nd
Subject: RE: [patch][aarch64]: add usra and ssra combine patterns

Hi James,

I've managed to remove the odd redundant git diff change.

Regarding aarch64_sra_n, this patch shouldn't affect it.

I am also not aware of any way of enabling this combine inside the pattern
used for those intrinsics, so I kept them separate.

Cheers,
Syl
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index eeed08e71ca0b96726cb28743ef38487a8287600..aba6af24eee1c29fe4524eb352747c94617b30c7 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -986,6 +986,18 @@
   [(set_attr "type" "neon_shift_imm")]
 )
 
+(define_insn "*aarch64_simd_sra<mode>"
+  [(set (match_operand:VDQ_I 0 "register_operand" "=w")
+	(plus:VDQ_I
+	   (SHIFTRT:VDQ_I
+		(match_operand:VDQ_I 1 "register_operand" "w")
+		(match_operand:VDQ_I 2 "aarch64_simd_rshift_imm" "Dr"))
+	   (match_operand:VDQ_I 3 "register_operand" "0")))]
+  "TARGET_SIMD"
+  "<sra_op>sra\t%0.<Vtype>, %1.<Vtype>, %2"
+  [(set_attr "type" "neon_shift_acc")]
+)
+
RE: [patch][aarch64]: add usra and ssra combine patterns
Hi James,

I've managed to remove the odd redundant git diff change.

Regarding aarch64_sra_n, this patch shouldn't affect it.

I am also not aware of any way of enabling this combine inside the pattern
used for those intrinsics, so I kept them separate.

Cheers,
Syl
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index eeed08e71ca0b96726cb28743ef38487a8287600..aba6af24eee1c29fe4524eb352747c94617b30c7 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -986,6 +986,18 @@
   [(set_attr "type" "neon_shift_imm")]
 )
 
+(define_insn "*aarch64_simd_sra<mode>"
+  [(set (match_operand:VDQ_I 0 "register_operand" "=w")
+	(plus:VDQ_I
+	   (SHIFTRT:VDQ_I
+		(match_operand:VDQ_I 1 "register_operand" "w")
+		(match_operand:VDQ_I 2 "aarch64_simd_rshift_imm" "Dr"))
+	   (match_operand:VDQ_I 3 "register_operand" "0")))]
+  "TARGET_SIMD"
+  "<sra_op>sra\t%0.<Vtype>, %1.<Vtype>, %2"
+  [(set_attr "type" "neon_shift_acc")]
+)
+
 (define_insn "aarch64_simd_imm_shl<mode>"
   [(set (match_operand:VDQ_I 0 "register_operand" "=w")
 	(ashift:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w")
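James asks above what should happen to the vsra_n intrinsics in arm_neon.h; those intrinsics expose the same shift-right-and-accumulate operation that this pattern matches. The sketch below is a portable scalar model of the per-lane semantics of vsraq_n_u8 (the model function and its name are illustrative, not the actual arm_neon.h implementation): each lane computes a + (b >> n), which is exactly what one usra instruction does.

```c
#include <stdint.h>

/* Scalar model of vsraq_n_u8's per-lane behaviour: shift each lane of b
   right by n (logical shift, since the lanes are unsigned), then
   accumulate into the corresponding lane of a.  Illustrative sketch only.  */
static void
vsraq_n_u8_model (uint8_t a[16], const uint8_t b[16], int n)
{
  for (int lane = 0; lane < 16; lane++)
    a[lane] = (uint8_t) (a[lane] + (b[lane] >> n));
}
```

On aarch64, a call to the real vsraq_n_u8 with a constant shift compiles to a single `usra v0.16b, v1.16b, #n`.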
Re: [patch][aarch64]: add usra and ssra combine patterns
On Thu, May 30, 2019 at 03:25:19PM +0100, Sylvia Taylor wrote:
> Greetings,
>
> This patch adds support to combine:
>
> 1) ushr and add into usra, example:
>
> ushr	v0.16b, v0.16b, 2
> add	v0.16b, v0.16b, v2.16b
> ---
> usra	v2.16b, v0.16b, 2
>
> 2) sshr and add into ssra, example:
>
> sshr	v1.16b, v1.16b, 2
> add	v1.16b, v1.16b, v3.16b
> ---
> ssra	v3.16b, v1.16b, 2
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> Ok for trunk? If yes, I don't have any commit rights,
> so can someone please commit it on my behalf.

This patch has an unrelated change to
*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>. Please revert that
and resend.

What changes (if any) should we make to aarch64_sra_n based on
this patch, and to the vsra_n intrinsics in arm_neon.h ?

Thanks,
James

> Cheers,
> Syl
> diff --git a/gcc/config/aarch64/aarch64-simd.md
> b/gcc/config/aarch64/aarch64-simd.md
> index e3852c5d182b70978d7603225fce55c0b8ee2894..502ac5f3b45a1da059bb07701150a531091378ed 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -3110,22 +3122,22 @@
>      operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode, INTVAL (operands[2]));
>      return "smov\\t%<GPI:w>0, %1.<VDQQH:Vetype>[%2]";
>    }
> -  [(set_attr "type" "neon_to_gp<q>")]
> -)
> -
> -(define_insn "*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>"
> -  [(set (match_operand:GPI 0 "register_operand" "=r")
> -	(zero_extend:GPI
> -	  (vec_select:<VDQQH:VEL>
> -	    (match_operand:VDQQH 1 "register_operand" "w")
> -	    (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
> -  "TARGET_SIMD"
> -  {
> -    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
> -					   INTVAL (operands[2]));
> -    return "umov\\t%w0, %1.<VDQQH:Vetype>[%2]";
> -  }
> -  [(set_attr "type" "neon_to_gp<q>")]
> +  [(set_attr "type" "neon_to_gp<q>")]
> +)
> +
> +(define_insn "*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>"
> +  [(set (match_operand:GPI 0 "register_operand" "=r")
> +	(zero_extend:GPI
> +	  (vec_select:<VDQQH:VEL>
> +	    (match_operand:VDQQH 1 "register_operand" "w")
> +	    (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
> +  "TARGET_SIMD"
> +  {
> +    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
> +					   INTVAL (operands[2]));
> +    return "umov\\t%w0, %1.<VDQQH:Vetype>[%2]";
> +  }
> +  [(set_attr "type" "neon_to_gp<q>")]
>  )
>  
>  ;; Lane extraction of a value, neither sign nor zero extension

These changes should be dropped.
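The ushr+add sequence the patch combines comes from ordinary vectorized C. A minimal sketch of source that triggers it (the function name and file layout are illustrative, not from the patch): on unsigned elements `>>` is a logical shift, so with -O3 on aarch64 the loop below vectorizes to ushr + add, which the new pattern fuses into a single usra.

```c
#include <stdint.h>

/* Per element: acc[i] += src[i] >> 2.  The unsigned right shift maps to
   ushr and the accumulate to add; the combined form is one usra.  */
void
shift_accumulate (uint8_t *restrict acc, const uint8_t *restrict src, int n)
{
  for (int i = 0; i < n; i++)
    acc[i] += src[i] >> 2;
}
```

Using int8_t instead of uint8_t gives the arithmetic-shift variant, which combines to ssra the same way.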
[patch][aarch64]: add usra and ssra combine patterns
Greetings,

This patch adds support to combine:

1) ushr and add into usra, example:

ushr	v0.16b, v0.16b, 2
add	v0.16b, v0.16b, v2.16b
---
usra	v2.16b, v0.16b, 2

2) sshr and add into ssra, example:

sshr	v1.16b, v1.16b, 2
add	v1.16b, v1.16b, v3.16b
---
ssra	v3.16b, v1.16b, 2

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk? If yes, I don't have any commit rights, so can someone
please commit it on my behalf.

Cheers,
Syl

gcc/ChangeLog:

2019-05-30  Sylvia Taylor

	* config/aarch64/aarch64-simd.md
	(*aarch64_simd_sra<mode>): New.
	* config/aarch64/iterators.md
	(SHIFTRT): New iterator.
	(sra_op): New attribute.

gcc/testsuite/ChangeLog:

2019-05-30  Sylvia Taylor

	* gcc.target/aarch64/simd/ssra.c: New test.
	* gcc.target/aarch64/simd/usra.c: New test.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index e3852c5d182b70978d7603225fce55c0b8ee2894..502ac5f3b45a1da059bb07701150a531091378ed 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -953,6 +953,18 @@
   [(set_attr "type" "neon_shift_imm")]
 )
 
+(define_insn "*aarch64_simd_sra<mode>"
+  [(set (match_operand:VDQ_I 0 "register_operand" "=w")
+	(plus:VDQ_I
+	   (SHIFTRT:VDQ_I
+		(match_operand:VDQ_I 1 "register_operand" "w")
+		(match_operand:VDQ_I 2 "aarch64_simd_rshift_imm" "Dr"))
+	   (match_operand:VDQ_I 3 "register_operand" "0")))]
+  "TARGET_SIMD"
+  "<sra_op>sra\t%0.<Vtype>, %1.<Vtype>, %2"
+  [(set_attr "type" "neon_shift_acc")]
+)
+
 (define_insn "aarch64_simd_imm_shl<mode>"
   [(set (match_operand:VDQ_I 0 "register_operand" "=w")
 	(ashift:VDQ_I (match_operand:VDQ_I 1 "register_operand" "w")
@@ -3110,22 +3122,22 @@
     operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode, INTVAL (operands[2]));
     return "smov\\t%<GPI:w>0, %1.<VDQQH:Vetype>[%2]";
   }
-  [(set_attr "type" "neon_to_gp<q>")]
-)
-
-(define_insn "*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>"
-  [(set (match_operand:GPI 0 "register_operand" "=r")
-	(zero_extend:GPI
-	  (vec_select:<VDQQH:VEL>
-	    (match_operand:VDQQH 1 "register_operand" "w")
-	    (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
-  "TARGET_SIMD"
-  {
-    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
-					   INTVAL (operands[2]));
-    return "umov\\t%w0, %1.<VDQQH:Vetype>[%2]";
-  }
-  [(set_attr "type" "neon_to_gp<q>")]
+  [(set_attr "type" "neon_to_gp<q>")]
+)
+
+(define_insn "*aarch64_get_lane_zero_extend<GPI:mode><VDQQH:mode>"
+  [(set (match_operand:GPI 0 "register_operand" "=r")
+	(zero_extend:GPI
+	  (vec_select:<VDQQH:VEL>
+	    (match_operand:VDQQH 1 "register_operand" "w")
+	    (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
+  "TARGET_SIMD"
+  {
+    operands[2] = aarch64_endian_lane_rtx (<VDQQH:MODE>mode,
+					   INTVAL (operands[2]));
+    return "umov\\t%w0, %1.<VDQQH:Vetype>[%2]";
+  }
+  [(set_attr "type" "neon_to_gp<q>")]
 )
 
 ;; Lane extraction of a value, neither sign nor zero extension

diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 6caeeac80867edda29b5438efdcee475ed609ff6..6273b7be5932aef695d12e9f723a43cb6c50abe8 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -1160,6 +1160,8 @@
 ;; This code iterator allows the shifts supported in arithmetic instructions
 (define_code_iterator ASHIFT [ashift ashiftrt lshiftrt])
 
+(define_code_iterator SHIFTRT [ashiftrt lshiftrt])
+
 ;; Code iterator for logical operations
 (define_code_iterator LOGICAL [and ior xor])
 
@@ -1342,6 +1344,9 @@
 (define_code_attr shift [(ashift "lsl") (ashiftrt "asr")
			 (lshiftrt "lsr") (rotatert "ror")])
 
+;; Op prefix for shift right and accumulate.
+(define_code_attr sra_op [(ashiftrt "s") (lshiftrt "u")])
+
 ;; Map shift operators onto underlying bit-field instructions
 (define_code_attr bfshift [(ashift "ubfiz") (ashiftrt "sbfx")
			   (lshiftrt "ubfx") (rotatert "extr")])

diff --git a/gcc/testsuite/gcc.target/aarch64/simd/ssra.c b/gcc/testsuite/gcc.target/aarch64/simd/ssra.c
new file mode 100644
index 0000000000000000000000000000000000000000..e9c2e04c0b88ac18be81f4ee8a872e6829af9db2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/simd/ssra.c
@@ -0,0 +1,36 @@
+/* { dg-do compile { target aarch64*-*-* } } */
+/* { dg-options "-O3" } */
+/* { dg-skip-if "" { *-*-* } {"*sve*"} {""} } */
+
+#include <stdint.h>
+
+#define SSRA(func, vtype, n)			\
+  void func ()					\
+  {						\
+    int i;					\
+    for (i = 0; i < n; i++)			\
+      {						\
+	s1##vtype[i] += s2##vtype[i] >> 2;	\
+      }						\
+  }
+
+#define TEST_VDQ_I_MODES(FUNC)
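The ssra.c test above is cut off by the archive at the TEST_VDQ_I_MODES macro. Hand-expanding the SSRA macro for a single mode gives roughly the following (the array names and element count here are illustrative; in the testsuite the macro builds them from the s1/s2 prefixes and the vtype argument). On signed elements `>>` is an arithmetic shift, so at -O3 this vectorizes to sshr + add, which the new pattern combines into ssra.

```c
#include <stdint.h>

#define N 16  /* illustrative element count for one 16-lane mode */

int8_t s1_int8x16[N];  /* accumulator array, zero-initialized */
int8_t s2_int8x16[N];  /* source array */

/* One hand-expanded instantiation of the SSRA macro:
   s1[i] += s2[i] >> 2 over every element.  */
void
ssra_int8x16 (void)
{
  for (int i = 0; i < N; i++)
    s1_int8x16[i] += s2_int8x16[i] >> 2;
}
```

The companion usra.c test is presumably identical except for unsigned element types, which select the lshiftrt arm of the SHIFTRT iterator and hence the "u" value of sra_op.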