Re: [PATCH][PR121599] aarch64: Fix ICE when op2 is zero for SVE2 saturating add intrinsics.

Jennifer Schmitz Sat, 18 Oct 2025 14:32:56 -0700
> On 15 Sep 2025, at 09:29, Jennifer Schmitz <[email protected]> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
>> On 28 Aug 2025, at 09:19, Jennifer Schmitz <[email protected]> wrote:
>> 
>> External email: Use caution opening links or attachments
>> 
>> 
>>> On 25 Aug 2025, at 15:40, Kyrylo Tkachov <[email protected]> wrote:
>>> 
>>> Hi Jennifer,
>>> 
>>>> On 25 Aug 2025, at 12:56, Jennifer Schmitz <[email protected]> wrote:
>>>> 
>>>> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
>>>> vector and predication is _z, an ICE in vregs occurs, e.g. for
>>>> 
>>>> svuint8_t foo (svbool_t pg, svuint8_t op1)
>>>> {
>>>> return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>>>> }
>>>> 
>>>> The insn failed to match the pattern (aarch64-sve2.md):
>>>> 
>>>> ;; Predicated binary operations with no reverse form, merging with zero.
>>>> ;; At present we don't generate these patterns via a cond_* optab,
>>>> ;; so there's no correctness requirement to handle merging with an
>>>> ;; independent value.
>>>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
>>>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>>> (unspec:SVE_FULL_I
>>>> [(match_operand:<VPRED> 1 "register_operand")
>>>> (unspec:SVE_FULL_I
>>>>  [(match_operand 5)
>>>>   (unspec:SVE_FULL_I
>>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>>> (match_operand:SVE_FULL_I 3 "register_operand")]
>>>> SVE2_COND_INT_BINARY_NOREV)]
>>>>  UNSPEC_PRED_X)
>>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>>>> UNSPEC_SEL))]
>>>> "TARGET_SVE2"
>>>> {@ [ cons: =0 , 1   , 2 , 3  ]
>>>>  [ &w       , Upl , 0 , w  ] movprfx\t%0.<Vetype>, %1/z, 
>>>> %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>>>>  [ &w       , Upl , w , w  ] movprfx\t%0.<Vetype>, %1/z, 
>>>> %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>>>> }
>>>> "&& !CONSTANT_P (operands[5])"
>>>> {
>>>> operands[5] = CONSTM1_RTX (<VPRED>mode);
>>>> }
>>>> [(set_attr "movprfx" "yes")]
>>>> )
>>>> 
>>>> because operands[3] and operands[4] were both expanded into the same 
>>>> register
>>>> operand containing a zero vector by
>>>> 
>>>> ;; Predicated binary arithmetic with merging.
>>>> (define_expand "@cond_<sve_int_op><mode>"
>>>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>>> (unspec:SVE_FULL_I
>>>> [(match_operand:<VPRED> 1 "register_operand")
>>>> (unspec:SVE_FULL_I
>>>>  [(match_dup 5)
>>>>   (unspec:SVE_FULL_I
>>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>>> (match_operand:SVE_FULL_I 3 "register_operand")]
>>>> SVE2_COND_INT_BINARY)]
>>>>  UNSPEC_PRED_X)
>>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
>>>> UNSPEC_SEL))]
>>>> "TARGET_SVE2"
>>>> {
>>>> operands[5] = CONSTM1_RTX (<MODE>mode);
>>>> }
>>>> )
>>>> 
>>>> This patch fixes the ICE by changing the predicate of operands[3]
>>>> in both patterns to aarch64_simd_reg_or_zero. It also adds tests
>>>> adjusted from the report in PR121599.
>>>> 
>>>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>>>> OK for trunk?
>>>> 
>>>> Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
>>>> to GCC 10. Shall we backport?
>>>> 
>>>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>>> 
>>>> gcc/
>>>> PR target/121599
>>>> * config/aarch64/aarch64-sve2.md (@cond_<sve_int_op><mode>):
>>>> Set predicate for operand 3 to aarch64_simd_reg_or_zero.
>>>> (*cond_<sve_int_op><mode>_z): Likewise.
>>>> 
>>>> gcc/testsuite/
>>>> PR target/121599
>>>> * gcc.target/aarch64/sve2/pr121599.c: New test.
>>>> ---
>>>> gcc/config/aarch64/aarch64-sve2.md            |  4 +--
>>>> .../gcc.target/aarch64/sve2/pr121599.c        | 31 +++++++++++++++++++
>>>> 2 files changed, 33 insertions(+), 2 deletions(-)
>>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>>> 
>>>> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
>>>> b/gcc/config/aarch64/aarch64-sve2.md
>>>> index a3cbbce8b31..c6a84fa7f3d 100644
>>>> --- a/gcc/config/aarch64/aarch64-sve2.md
>>>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>>>> @@ -1021,7 +1021,7 @@
>>>>  [(match_dup 5)
>>>>   (unspec:SVE_FULL_I
>>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>>> - (match_operand:SVE_FULL_I 3 "register_operand")]
>>>> + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>>>> SVE2_COND_INT_BINARY)]
>>>>  UNSPEC_PRED_X)
>>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_reg_or_zero")]
>>>> @@ -1136,7 +1136,7 @@
>>>>  [(match_operand 5)
>>>>   (unspec:SVE_FULL_I
>>>> [(match_operand:SVE_FULL_I 2 "register_operand")
>>>> - (match_operand:SVE_FULL_I 3 "register_operand")]
>>>> + (match_operand:SVE_FULL_I 3 "aarch64_simd_reg_or_zero")]
>>>> SVE2_COND_INT_BINARY_NOREV)]
>>>>  UNSPEC_PRED_X)
>>>> (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>>> 
>>> I think the ICE happens because operands[4] ends up being a register which 
>>> the aarch64_simd_imm_zero can’t handle.
>>> The pattern *cond_<sve_int_op><mode>_z is also not supposed to be handling 
>>> register operands, it only expresses a merge with zero.
>>> Allowing a zero in operands[3] here is problematic as the output code 
>>> wouldn’t know how to print it if the midend used a zero vector for it.
>>> 
>>> I think the right solution is to adjust the @cond_<sve_int_op><mode> 
>>> expander to only allow aarch64_simd_imm_zero for operands[4] for the 
>>> SVE2_COND_INT_BINARY_NOREV subset of SVE2_COND_INT_BINARY.
>> Hi Kyrill,
>> Thanks for reviewing the patch.
>> I tried allowing only aarch64_simd_imm_zero for operands[4] for 
>> SVE2_COND_INT_BINARY_NOREV, but while that solved the ICE it caused all 
>> tests predication other than _z to fail.
>> I chose instead to add a pattern similar to the existing 
>> *cond_<sve_int_op><mode>_3 for SVE2_COND_INT_BINARY_NOREV that handles the 
>> case when operands[3] and operands[4] are the same register operand.
>> Thanks,
>> Jennifer
> Gentle ping.
> Thanks,
> Jennifer
Gentle ping.
Thanks,
Jennifer
>> 
>> When op2 in SVE2 saturating add intrinsics (svuqadd, svsqadd) is a zero
>> vector and predication is _z, an ICE in vregs occurs, e.g. for
>> 
>> svuint8_t foo (svbool_t pg, svuint8_t op1)
>> {
>>   return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>> }
>> 
>> The insn failed to match the pattern (aarch64-sve2.md):
>> 
>> ;; Predicated binary operations with no reverse form, merging with zero.
>> ;; At present we don't generate these patterns via a cond_* optab,
>> ;; so there's no correctness requirement to handle merging with an
>> ;; independent value.
>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_z"
>> [(set (match_operand:SVE_FULL_I 0 "register_operand")
>>       (unspec:SVE_FULL_I
>>         [(match_operand:<VPRED> 1 "register_operand")
>>          (unspec:SVE_FULL_I
>>            [(match_operand 5)
>>             (unspec:SVE_FULL_I
>>               [(match_operand:SVE_FULL_I 2 "register_operand")
>>                (match_operand:SVE_FULL_I 3 "register_operand")]
>>               SVE2_COND_INT_BINARY_NOREV)]
>>            UNSPEC_PRED_X)
>>          (match_operand:SVE_FULL_I 4 "aarch64_simd_imm_zero")]
>>         UNSPEC_SEL))]
>> "TARGET_SVE2"
>> {@ [ cons: =0 , 1   , 2 , 3  ]
>>    [ &w       , Upl , 0 , w  ] movprfx\t%0.<Vetype>, %1/z, 
>> %0.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>>    [ &w       , Upl , w , w  ] movprfx\t%0.<Vetype>, %1/z, 
>> %2.<Vetype>\;<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>
>> }
>> "&& !CONSTANT_P (operands[5])"
>> {
>>   operands[5] = CONSTM1_RTX (<VPRED>mode);
>> }
>> [(set_attr "movprfx" "yes")]
>> )
>> 
>> because operands[3] and operands[4] were both expanded into the same register
>> operand containing a zero vector by define_expand "@cond_<sve_int_op><mode>".
>> 
>> This patch fixes the ICE by adding the pattern
>> ;; Predicated binary operations with no reverse form, merging with
>> ;; the second input.
>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
>> for SVE2_COND_INT_BINARY_NOREV that allows operands[4] to be a register 
>> operand
>> (equal to operands[3]). It adds a vcond_mask expression such that we end
>> up with an insn matched by
>> ;; Predicated binary arithmetic, merging with the first input.
>> (define_insn_and_rewrite "*cond_<sve_int_op><mode>_2"
>> 
>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>> OK for trunk?
>> 
>> Alex Coplan pointed out in the bugzilla ticket that this ICE goes back
>> to GCC 10. Shall we backport?
>> 
>> Signed-off-by: Jennifer Schmitz <[email protected]>
>> 
>> gcc/
>>       PR target/121599
>>       * config/aarch64/aarch64-sve2.md (*cond_<sve_int_op><mode>_3):
>>       New pattern for SVE2_COND_INT_BINARY_NOREV.
>> 
>> gcc/testsuite/
>>       PR target/121599
>>       * gcc.target/aarch64/sve2/pr121599.c: New test.
>> ---
>> gcc/config/aarch64/aarch64-sve2.md            | 36 +++++++++++++++++++
>> .../gcc.target/aarch64/sve2/pr121599.c        | 31 ++++++++++++++++
>> 2 files changed, 67 insertions(+)
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>> 
>> diff --git a/gcc/config/aarch64/aarch64-sve2.md 
>> b/gcc/config/aarch64/aarch64-sve2.md
>> index a3cbbce8b31..d82dc41df3e 100644
>> --- a/gcc/config/aarch64/aarch64-sve2.md
>> +++ b/gcc/config/aarch64/aarch64-sve2.md
>> @@ -1124,6 +1124,42 @@
>>  [(set_attr "movprfx" "yes")]
>> )
>> 
>> +;; Predicated binary operations with no reverse form, merging with
>> +;; the second input.
>> +(define_insn_and_rewrite "*cond_<sve_int_op><mode>_3"
>> +  [(set (match_operand:SVE_FULL_I 0 "register_operand")
>> +       (unspec:SVE_FULL_I
>> +         [(match_operand:<VPRED> 1 "register_operand")
>> +          (unspec:SVE_FULL_I
>> +            [(match_operand 5)
>> +             (unspec:SVE_FULL_I
>> +               [(match_operand:SVE_FULL_I 2 "register_operand")
>> +                (match_operand:SVE_FULL_I 3 "register_operand")]
>> +               SVE2_COND_INT_BINARY_NOREV)]
>> +            UNSPEC_PRED_X)
>> +          (match_operand:SVE_FULL_I 4 "register_operand")]
>> +         UNSPEC_SEL))]
>> +  "TARGET_SVE2
>> +   && rtx_equal_p (operands[3], operands[4])"
>> +  "#"
>> +  "&& 1"
>> +  {
>> +    if (reload_completed
>> +        && register_operand (operands[4], <MODE>mode)
>> +        && !rtx_equal_p (operands[0], operands[4]))
>> +      {
>> +       emit_insn (gen_vcond_mask_<mode><vpred> (operands[0], operands[2],
>> +                                                operands[4], operands[1]));
>> +       operands[4] = operands[2] = operands[0];
>> +      }
>> +    else if (!CONSTANT_P (operands[5]))
>> +      operands[5] = CONSTM1_RTX (<VPRED>mode);
>> +    else
>> +      FAIL;
>> +  }
>> +  [(set_attr "movprfx" "yes")]
>> +)
>> +
>> ;; Predicated binary operations with no reverse form, merging with zero.
>> ;; At present we don't generate these patterns via a cond_* optab,
>> ;; so there's no correctness requirement to handle merging with an
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c 
>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>> new file mode 100644
>> index 00000000000..dbfa613649d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>> @@ -0,0 +1,31 @@
>> +/* PR target/121599.  */
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>> +
>> +#include <arm_sve.h>
>> +
>> +/*
>> +** foo:
>> +**     movi    d([0-9]+), #0
>> +**     sel     z0\.b, p0, z0\.b, z\1\.b
>> +**     usqadd  z0\.b, p0/m, z0\.b, z\1\.b
>> +**     ret
>> +*/
>> +svuint8_t foo (svbool_t pg, svuint8_t op1)
>> +{
>> +    return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>> +}
>> +
>> +/*
>> +** bar:
>> +**     movi    d([0-9]+), #0
>> +**     sel     z0\.b, p0, z0\.b, z\1\.b
>> +**     suqadd  z0\.b, p0/m, z0\.b, z\1\.b
>> +**     ret
>> +*/
>> +svint8_t bar (svbool_t pg, svint8_t op1)
>> +{
>> +    return svuqadd_n_s8_z (pg, op1, 0);
>> +}
>> +
>> --
>> 2.34.1
>>> 
>>> Thanks,
>>> Kyrill
>>> 
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c 
>>>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>>> new file mode 100644
>>>> index 00000000000..cd80ef9115c
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121599.c
>>>> @@ -0,0 +1,31 @@
>>>> +/* PR target/121599.  */
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-O2" } */
>>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>>> +
>>>> +#include <arm_sve.h>
>>>> +
>>>> +/*
>>>> +** foo:
>>>> +** movi d([0-9]+), #0
>>>> +** movprfx z0\.b, p0/z, z0\.b
>>>> +** usqadd z0\.b, p0/m, z0\.b, z\1\.b
>>>> +** ret
>>>> +*/
>>>> +svuint8_t foo (svbool_t pg, svuint8_t op1)
>>>> +{
>>>> +    return svsqadd_u8_z (pg, op1, svdup_s8 (0));
>>>> +}
>>>> +
>>>> +/*
>>>> +** bar:
>>>> +** movi d([0-9]+), #0
>>>> +** movprfx z0\.b, p0/z, z0\.b
>>>> +** suqadd z0\.b, p0/m, z0\.b, z\1\.b
>>>> +** ret
>>>> +*/
>>>> +svint8_t bar (svbool_t pg, svint8_t op1)
>>>> +{
>>>> +    return svuqadd_n_s8_z (pg, op1, 0);
>>>> +}
>>>> +
>>>> --
>>>> 2.34.1
Re: [PATCH][PR121599] aarch64: Fix ICE when op2 is zero for SVE2 saturating add intrinsics.

Reply via email to