Re: [PATCH 7/8]AArch64: Consolidate zero and sign extension patterns and add missing ones.
Tamar Christina via Gcc-patches writes: > Hi All, > > The target has various zero and sign extension patterns. These however live > in > various locations around the MD file and almost all of them are split > differently. Due to the various patterns we also ended up missing valid > extensions. For instance smov is almost never generated. > > This change tries to make this more manageable by consolidating the patterns > as > much as possible and in doing so fix the missing alternatives. > > There were also some duplicate patterns. Note that the > zero_extend<*_ONLY:mode>2 patterns are nearly identical however > QImode lacks an alternative that the others don't have, so I have left them as > 3 different patterns next to each other. > > In a lot of cases the wrong iterator was used leaving out cases that should > exist. > > I've also changed the masks used for zero extensions to hex instead of decimal > as it's more clear what they do that way, and aligns better with output of > other compilers. > > This leave the bulk of the extensions in just 3 patterns. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd.md > (*aarch64_get_lane_zero_extend): Changed to ... > (*aarch64_get_lane_zero_extend): ... This. > (*aarch64_get_lane_extenddi): New. > * config/aarch64/aarch64.md (sidi2, *extendsidi2_aarch64, > qihi2, *extendqihi2_aarch64, *zero_extendsidi2_aarch64): Remove > duplicate patterns. > (2, > *extend2_aarch64): Remove, consolidate > into ... > (extend2): ... This. > (*zero_extendqihi2_aarch64, > *zero_extend2_aarch64): Remove, consolidate into > ... > (zero_extend2, > zero_extend2, > zero_extend2): > (*ands_compare0): Renamed to ... > (*ands_compare0): ... This. > * config/aarch64/iterators.md (HI_ONLY, QI_ONLY): New. > (short_mask): Use hex rather than dec and add SI. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/ands_3.c: Update codegen. > * gcc.target/aarch64/sve/slp_1.c: Likewise. > * gcc.target/aarch64/tst_5.c: Likewise. > * gcc.target/aarch64/tst_6.c: Likewise. > > --- inline copy of patch -- > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > 8a84a8560e982b8155b18541f5504801b3330124..d0b37c4dd48aeafd3d87c90dc3270e71af5a72b9 > 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -4237,19 +4237,34 @@ (define_insn > "*aarch64_get_lane_extend" >[(set_attr "type" "neon_to_gp")] > ) > > -(define_insn "*aarch64_get_lane_zero_extend" > +(define_insn "*aarch64_get_lane_extenddi" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (sign_extend:DI > + (vec_select: > + (match_operand:VS 1 "register_operand" "w") > + (parallel [(match_operand:SI 2 "immediate_operand" "i")]] > + "TARGET_SIMD" > + { > +operands[2] = aarch64_endian_lane_rtx (mode, > +INTVAL (operands[2])); > +return "smov\\t%x0, %1.[%2]"; > + } > + [(set_attr "type" "neon_to_gp")] > +) > + > +(define_insn "*aarch64_get_lane_zero_extend" >[(set (match_operand:GPI 0 "register_operand" "=r") > (zero_extend:GPI > - (vec_select: > - (match_operand:VDQQH 1 "register_operand" "w") > + (vec_select: > + (match_operand:VDQV_L 1 "register_operand" "w") > (parallel [(match_operand:SI 2 "immediate_operand" "i")]] >"TARGET_SIMD" >{ > -operands[2] = aarch64_endian_lane_rtx (mode, > +operands[2] = aarch64_endian_lane_rtx (mode, > INTVAL (operands[2])); > -return "umov\\t%w0, %1.[%2]"; > +return "umov\\t%w0, %1.[%2]"; >} > - [(set_attr "type" "neon_to_gp")] > + [(set_attr "type" "neon_to_gp")] > ) Do you have any tests for the extra SI sign-extends? I think it'd better to use a consistent style here: either have a single pattern for all source modes (like you do with the zero_extends) or have a separate extend-to-DI-only pattern for SI inputs (like you do with the sign_extends). If we go with the single-pattern approach, then as per the reviews in other patches that came after this patch was posted, it'd be good to compile out the invalid extend-SI-to-SI cases, e.g. using a condition based on or whatever (extended to Advanced SIMD modes). Same comments for the other patterns: would be good to compile-out invalid cases. E.g. in particular: > -(define_insn "*zero_extend2_aarch64" > - [(set (match_operand:GPI 0 "register_operand" "=r,r,w,r") > -(zero_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" > "r,m,m,w")))] > +(define_insn "zero_extend2" > + [(set (match_operand:SD_HSDI 0 "register_operand" "=r,r,w,w,r,w") > +(zero_extend:SD_HSDI > +
RE: [PATCH 7/8]AArch64: Consolidate zero and sign extension patterns and add missing ones.
Ping. > -Original Message- > From: Tamar Christina > Sent: Monday, October 31, 2022 12:00 PM > To: gcc-patches@gcc.gnu.org > Cc: nd ; Richard Earnshaw ; > Marcus Shawcroft ; Kyrylo Tkachov > ; Richard Sandiford > > Subject: [PATCH 7/8]AArch64: Consolidate zero and sign extension patterns > and add missing ones. > > Hi All, > > The target has various zero and sign extension patterns. These however live > in various locations around the MD file and almost all of them are split > differently. Due to the various patterns we also ended up missing valid > extensions. For instance smov is almost never generated. > > This change tries to make this more manageable by consolidating the > patterns as much as possible and in doing so fix the missing alternatives. > > There were also some duplicate patterns. Note that the > zero_extend<*_ONLY:mode>2 patterns are nearly > identical however QImode lacks an alternative that the others don't have, so > I have left them as > 3 different patterns next to each other. > > In a lot of cases the wrong iterator was used leaving out cases that should > exist. > > I've also changed the masks used for zero extensions to hex instead of > decimal as it's more clear what they do that way, and aligns better with > output of other compilers. > > This leave the bulk of the extensions in just 3 patterns. > > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > > Ok for master? > > Thanks, > Tamar > > gcc/ChangeLog: > > * config/aarch64/aarch64-simd.md > (*aarch64_get_lane_zero_extend): > Changed to ... > (*aarch64_get_lane_zero_extend): ... > This. > (*aarch64_get_lane_extenddi): New. > * config/aarch64/aarch64.md (sidi2, *extendsidi2_aarch64, > qihi2, *extendqihi2_aarch64, *zero_extendsidi2_aarch64): > Remove > duplicate patterns. > (2, > *extend2_aarch64): Remove, > consolidate > into ... > (extend2): ... This. > (*zero_extendqihi2_aarch64, > *zero_extend2_aarch64): Remove, > consolidate into > ... > (zero_extend2, > zero_extend2, > zero_extend2): > (*ands_compare0): Renamed to ... > (*ands_compare0): ... This. > * config/aarch64/iterators.md (HI_ONLY, QI_ONLY): New. > (short_mask): Use hex rather than dec and add SI. > > gcc/testsuite/ChangeLog: > > * gcc.target/aarch64/ands_3.c: Update codegen. > * gcc.target/aarch64/sve/slp_1.c: Likewise. > * gcc.target/aarch64/tst_5.c: Likewise. > * gcc.target/aarch64/tst_6.c: Likewise. > > --- inline copy of patch -- > diff --git a/gcc/config/aarch64/aarch64-simd.md > b/gcc/config/aarch64/aarch64-simd.md > index > 8a84a8560e982b8155b18541f5504801b3330124..d0b37c4dd48aeafd3d87c90dc > 3270e71af5a72b9 100644 > --- a/gcc/config/aarch64/aarch64-simd.md > +++ b/gcc/config/aarch64/aarch64-simd.md > @@ -4237,19 +4237,34 @@ (define_insn > "*aarch64_get_lane_extend" >[(set_attr "type" "neon_to_gp")] > ) > > -(define_insn > "*aarch64_get_lane_zero_extend" > +(define_insn "*aarch64_get_lane_extenddi" > + [(set (match_operand:DI 0 "register_operand" "=r") > + (sign_extend:DI > + (vec_select: > + (match_operand:VS 1 "register_operand" "w") > + (parallel [(match_operand:SI 2 "immediate_operand" "i")]] > + "TARGET_SIMD" > + { > +operands[2] = aarch64_endian_lane_rtx (mode, > +INTVAL (operands[2])); > +return "smov\\t%x0, %1.[%2]"; > + } > + [(set_attr "type" "neon_to_gp")] > +) > + > +(define_insn > "*aarch64_get_lane_zero_extend" >[(set (match_operand:GPI 0 "register_operand" "=r") > (zero_extend:GPI > - (vec_select: > - (match_operand:VDQQH 1 "register_operand" "w") > + (vec_select: > + (match_operand:VDQV_L 1 "register_operand" "w") > (parallel [(match_operand:SI 2 "immediate_operand" "i")]] >"TARGET_SIMD" >{ > -operands[2] = aarch64_endian_lane_rtx (mode, > +operands[2] = aarch64_endian_lane_rtx (mode, > INTVAL (operands[2])); > -return "umov\\t%w0, %1.[%2]"; > +return "umov\\t%w0, %1.[%2]"; >} > - [(set_attr "type" "neon_to_gp")] > + [(set_attr "type" "neon_to_gp")]
[PATCH 7/8]AArch64: Consolidate zero and sign extension patterns and add missing ones.
Hi All, The target has various zero and sign extension patterns. These however live in various locations around the MD file and almost all of them are split differently. Due to the various patterns we also ended up missing valid extensions. For instance smov is almost never generated. This change tries to make this more manageable by consolidating the patterns as much as possible and in doing so fix the missing alternatives. There were also some duplicate patterns. Note that the zero_extend<*_ONLY:mode>2 patterns are nearly identical however QImode lacks an alternative that the others don't have, so I have left them as 3 different patterns next to each other. In a lot of cases the wrong iterator was used leaving out cases that should exist. I've also changed the masks used for zero extensions to hex instead of decimal as it's more clear what they do that way, and aligns better with output of other compilers. This leave the bulk of the extensions in just 3 patterns. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * config/aarch64/aarch64-simd.md (*aarch64_get_lane_zero_extend): Changed to ... (*aarch64_get_lane_zero_extend): ... This. (*aarch64_get_lane_extenddi): New. * config/aarch64/aarch64.md (sidi2, *extendsidi2_aarch64, qihi2, *extendqihi2_aarch64, *zero_extendsidi2_aarch64): Remove duplicate patterns. (2, *extend2_aarch64): Remove, consolidate into ... (extend2): ... This. (*zero_extendqihi2_aarch64, *zero_extend2_aarch64): Remove, consolidate into ... (zero_extend2, zero_extend2, zero_extend2): (*ands_compare0): Renamed to ... (*ands_compare0): ... This. * config/aarch64/iterators.md (HI_ONLY, QI_ONLY): New. (short_mask): Use hex rather than dec and add SI. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ands_3.c: Update codegen. * gcc.target/aarch64/sve/slp_1.c: Likewise. * gcc.target/aarch64/tst_5.c: Likewise. * gcc.target/aarch64/tst_6.c: Likewise. --- inline copy of patch -- diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md index 8a84a8560e982b8155b18541f5504801b3330124..d0b37c4dd48aeafd3d87c90dc3270e71af5a72b9 100644 --- a/gcc/config/aarch64/aarch64-simd.md +++ b/gcc/config/aarch64/aarch64-simd.md @@ -4237,19 +4237,34 @@ (define_insn "*aarch64_get_lane_extend" [(set_attr "type" "neon_to_gp")] ) -(define_insn "*aarch64_get_lane_zero_extend" +(define_insn "*aarch64_get_lane_extenddi" + [(set (match_operand:DI 0 "register_operand" "=r") + (sign_extend:DI + (vec_select: + (match_operand:VS 1 "register_operand" "w") + (parallel [(match_operand:SI 2 "immediate_operand" "i")]] + "TARGET_SIMD" + { +operands[2] = aarch64_endian_lane_rtx (mode, + INTVAL (operands[2])); +return "smov\\t%x0, %1.[%2]"; + } + [(set_attr "type" "neon_to_gp")] +) + +(define_insn "*aarch64_get_lane_zero_extend" [(set (match_operand:GPI 0 "register_operand" "=r") (zero_extend:GPI - (vec_select: - (match_operand:VDQQH 1 "register_operand" "w") + (vec_select: + (match_operand:VDQV_L 1 "register_operand" "w") (parallel [(match_operand:SI 2 "immediate_operand" "i")]] "TARGET_SIMD" { -operands[2] = aarch64_endian_lane_rtx (mode, +operands[2] = aarch64_endian_lane_rtx (mode, INTVAL (operands[2])); -return "umov\\t%w0, %1.[%2]"; +return "umov\\t%w0, %1.[%2]"; } - [(set_attr "type" "neon_to_gp")] + [(set_attr "type" "neon_to_gp")] ) ;; Lane extraction of a value, neither sign nor zero extension diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 3ea16dbc2557c6a4f37104d44a49f77f768eb53d..09ae1118371f82ca63146fceb953eb9e820d05a4 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1911,22 +1911,6 @@ (define_insn "storewb_pair_" ;; Sign/Zero extension ;; --- -(define_expand "sidi2" - [(set (match_operand:DI 0 "register_operand") - (ANY_EXTEND:DI (match_operand:SI 1 "nonimmediate_operand")))] - "" -) - -(define_insn "*extendsidi2_aarch64" - [(set (match_operand:DI 0 "register_operand" "=r,r") -(sign_extend:DI (match_operand:SI 1 "nonimmediate_operand" "r,m")))] - "" - "@ - sxtw\t%0, %w1 - ldrsw\t%0, %1" - [(set_attr "type" "extend,load_4")] -) - (define_insn "*load_pair_extendsidi2_aarch64" [(set (match_operand:DI 0 "register_operand" "=r") (sign_extend:DI (match_operand:SI 1 "aarch64_mem_pair_operand" "Ump"))) @@ -1940,21 +1924,6 @@ (define_insn "*load_pair_extendsidi2_aarch64" [(set_attr "type" "load_8")] ) -(define_insn