Re: [PATCH 1/2] aarch64: Match unpredicated shift patterns for ADR, SRA, and ADDHNB instructions

Richard Sandiford Tue, 06 May 2025 07:06:25 -0700

Hi,

Thanks for the update.  The patch mostly looks good, but one minor and
one more substantial comment below.


BTW, the patch seems to have been corrupted en route, in that unchanged
lines have too much space.  Attaching is fine if that's easier.

Dhruv Chawla <dhr...@nvidia.com> writes:
> diff --git a/gcc/config/aarch64/aarch64-sve.md 
> b/gcc/config/aarch64/aarch64-sve.md
> index a72ca2a500d..42802bac653 100644
> --- a/gcc/config/aarch64/aarch64-sve.md
> +++ b/gcc/config/aarch64/aarch64-sve.md
> @@ -4149,80 +4149,58 @@
>   (define_expand "@aarch64_adr<mode>_shift"
>     [(set (match_operand:SVE_FULL_SDI 0 "register_operand")
>       (plus:SVE_FULL_SDI
> -       (unspec:SVE_FULL_SDI
> -         [(match_dup 4)
> -          (ashift:SVE_FULL_SDI
> -            (match_operand:SVE_FULL_SDI 2 "register_operand")
> -            (match_operand:SVE_FULL_SDI 3 "const_1_to_3_operand"))]
> -         UNSPEC_PRED_X)
> +       (ashift:SVE_FULL_SDI
> +         (match_operand:SVE_FULL_SDI 2 "register_operand")
> +         (match_operand:SVE_FULL_SDI 3 "const_1_to_3_operand"))
>         (match_operand:SVE_FULL_SDI 1 "register_operand")))]
>     "TARGET_SVE && TARGET_NON_STREAMING"
> -  {
> -    operands[4] = CONSTM1_RTX (<VPRED>mode);
> -  }
> +  {}
>   )

The {} can be removed.

> [...]
> @@ -4803,6 +4781,9 @@
>   
>   ;; Unpredicated shift by a scalar, which expands into one of the vector
>   ;; shifts below.
> +;;
> +;; The unpredicated form is emitted only when the shift amount is a constant
> +;; value that is valid for the shift being carried out.
>   (define_expand "<ASHIFT:optab><mode>3"
>     [(set (match_operand:SVE_I 0 "register_operand")
>       (ASHIFT:SVE_I
> @@ -4810,20 +4791,29 @@
>         (match_operand:<VEL> 2 "general_operand")))]
>     "TARGET_SVE"
>     {
> -    rtx amount;
> +    rtx amount = NULL_RTX;
>       if (CONST_INT_P (operands[2]))
>         {
> -     amount = gen_const_vec_duplicate (<MODE>mode, operands[2]);
> -     if (!aarch64_sve_<lr>shift_operand (operands[2], <MODE>mode))
> -       amount = force_reg (<MODE>mode, amount);
> +     if (aarch64_simd_shift_imm_p (operands[2], <MODE>mode, <optab>_optab == 
> ashl_optab))
> +       operands[2] = aarch64_simd_gen_const_vector_dup (<MODE>mode, INTVAL 
> (operands[2]));
> +     else
> +       {
> +         amount = gen_const_vec_duplicate (<MODE>mode, operands[2]);
> +         if (!aarch64_sve_<lr>shift_operand (operands[2], <MODE>mode))
> +           amount = force_reg (<MODE>mode, amount);
> +       }
>         }
>       else
>         {
>       amount = convert_to_mode (<VEL>mode, operands[2], 0);
>       amount = expand_vector_broadcast (<MODE>mode, amount);
>         }
> -    emit_insn (gen_v<optab><mode>3 (operands[0], operands[1], amount));
> -    DONE;
> +
> +    if (amount)
> +      {
> +     emit_insn (gen_v<optab><mode>3 (operands[0], operands[1], amount));
> +     DONE;
> +      }
>     }
>   )
>   

Instead of the two hunks above, I think we should leave <ASHIFT:optab><mode>3
alone and change v<optab><mode>3.

This would involve changing:

> @@ -4867,27 +4857,27 @@
>     ""
>   )
  
> -;; Unpredicated shift operations by a constant (post-RA only).
> +;; Unpredicated shift operations by a constant.
>   ;; These are generated by splitting a predicated instruction whose
>   ;; predicate is unused.
> -(define_insn "*post_ra_v_ashl<mode>3"
> +(define_insn "*v_ashl<mode>3"
>     [(set (match_operand:SVE_I 0 "register_operand")
>       (ashift:SVE_I
>         (match_operand:SVE_I 1 "register_operand")
>         (match_operand:SVE_I 2 "aarch64_simd_lshift_imm")))]
> -  "TARGET_SVE && reload_completed"
> +  "TARGET_SVE"
>     {@ [ cons: =0 , 1 , 2   ]
>        [ w    , w , vs1 ] add\t%0.<Vetype>, %1.<Vetype>, %1.<Vetype>
>        [ w    , w , Dl  ] lsl\t%0.<Vetype>, %1.<Vetype>, #%2
>     }
>   )
  
> -(define_insn "*post_ra_v_<optab><mode>3"
> +(define_insn "*v_<optab><mode>3"
>     [(set (match_operand:SVE_I 0 "register_operand" "=w")
>       (SHIFTRT:SVE_I
>         (match_operand:SVE_I 1 "register_operand" "w")
>         (match_operand:SVE_I 2 "aarch64_simd_rshift_imm")))]
> -  "TARGET_SVE && reload_completed"
> +  "TARGET_SVE"
>     "<shift>\t%0.<Vetype>, %1.<Vetype>, #%2"
>   )
  
...these instructions to named patterns, e.g. aarch64_vashl<mode>3_const
and aarch64_v<optab><mode>3_const respectively (with no _ after v, for
consistency with the optab name).  Then v<optab><mode>3 could
generate aarch64_v<optab><mode>3_const for the constant case.

Thanks,
Richard

Re: [PATCH 1/2] aarch64: Match unpredicated shift patterns for ADR, SRA, and ADDHNB instructions

Reply via email to