> On 15 Sep 2025, at 09:30, Jennifer Schmitz <[email protected]> wrote:
> 
>> On 27 Aug 2025, at 09:14, Jennifer Schmitz <[email protected]> wrote:
>> 
>>> On 26 Aug 2025, at 16:02, Alex Coplan <[email protected]> wrote:
>>> 
>>> On 26/08/2025 13:39, Jennifer Schmitz wrote:
>>>> The function gimple_folder::fold_pfalse that was introduced in
>>>> g5289540ed58e42ae66255e31f22afe4ca0a6e15e incorrectly folds
>>>> 1) svbrka/b with _m predication if the inactive operand is pfalse, and
>>>> 2) svpmov_lane with _m predication if the non-governing predicate
>>>> operand is pfalse.
>>>> To fix this, we added extra checks to the folding logic that exclude
>>>> the affected function shapes.
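>>>> 
>>>> For illustration (a minimal example, not part of the patch), case 1)
>>>> at the source level: before this fix, the call below was folded to
>>>> pg, dropping the BRKB computation on the active lanes.
>>>> 
>>>> #include <arm_sve.h>
>>>> 
>>>> /* Illustration only: with the inactive operand pfalse, the buggy
>>>>    fold replaced this call with pg.  */
>>>> svbool_t case1 (svbool_t pg, svbool_t op)
>>>> {
>>>>   return svbrkb_b_m (svpfalse (), pg, op);
>>>> }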
>>>> 
>>>> The patch was bootstrapped and tested on aarch64-linux-gnu with no
>>>> regressions.
>>>> OK for trunk?
>>>> 
>>>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>>> 
>>>> gcc/
>>>>    PR target/121604
>>>>    * config/aarch64/aarch64-sve-builtins.cc
>>>>    (gimple_folder::fold_pfalse): Add checks for function shape.
>>>> 
>>>> gcc/testsuite/
>>>>    PR target/121604
>>>>    * gcc.target/aarch64/sve/pr121604_brk.c: New test.
>>>>    * gcc.target/aarch64/sve2/pr121604_pmov.c: Likewise.
>>>> ---
>>>> gcc/config/aarch64/aarch64-sve-builtins.cc    |  6 +++--
>>>> .../gcc.target/aarch64/sve/pr121604_brk.c     | 25 +++++++++++++++++++
>>>> .../gcc.target/aarch64/sve2/pr121604_pmov.c   | 16 ++++++++++++
>>>> 3 files changed, 45 insertions(+), 2 deletions(-)
>>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>>> 
>>>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
>>>> index 1764cf8f7e8..41b328d96fc 100644
>>>> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
>>>> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
>>>> @@ -3641,9 +3641,11 @@ gimple_folder::fold_pfalse ()
>>>>          inactive vector (arg0), while other function shapes are folded
>>>>          to op1 (arg1).  */
>>>>        tree arg1 = gimple_call_arg (call, 1);
>>>> -      if (is_pfalse (arg1))
>>>> +      if (is_pfalse (arg1)
>>>> +          && shape != aarch64_sve::shapes::pmov_to_vector_lane)
>>>>          return fold_call_to (arg0);
>>>> -      if (is_pfalse (arg0))
>>>> +      if (is_pfalse (arg0)
>>>> +          && shape != aarch64_sve::shapes::unary)
>>>>          return fold_call_to (arg1);
>>>>        return nullptr;
>>>>      }
>>> 
>>> This fix feels very fragile: we could easily have the same bug
>>> introduced again for other intrinsics that have non-governing
>>> predicate operands. Have you verified there are no other such SVE
>>> intrinsics?  Either way, it's possible that more such intrinsics will
>>> be introduced in the future, and we'll have the exact same bug for
>>> those.
>>> 
>>> Relying on is_pfalse () to infer the position of the GP is just too
>>> fragile IMO.
>>> 
>>> As I mentioned in the PR, I think it would be better if we explicitly
>>> determined:
>>> (1) the position of the governing predicate, if any, and
>>> (2) the position of the inactive argument (to fold to on pfalse), if any,
>>> and used that information to do the folding.
>>> 
>>> Either we can do that by dispatching on the shapes (i.e. positively
>>> classifying the shapes into groups for the two cases that we know we
>>> can safely optimize), or perhaps that information can be encoded in
>>> some other way.
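>>> 
>>> To sketch the first option (a standalone model with made-up names,
>>> not existing GCC API): each shape would positively declare where its
>>> governing predicate (GP) and its fold target live, and any shape that
>>> is not classified would never be folded, so a new intrinsic with a
>>> non-governing predicate operand is safe by default.
>>> 
>>> #include <stdbool.h>
>>> 
>>> /* Illustrative model, not GCC code.  */
>>> enum shape_kind { SHAPE_UNARY_M, SHAPE_BINARY_M, SHAPE_PMOV_LANE_M };
>>> 
>>> struct pred_layout
>>> {
>>>   int gp_index;    /* argument position of the governing predicate */
>>>   int fold_index;  /* argument the call folds to if the GP is pfalse */
>>> };
>>> 
>>> /* Return true only for shapes we positively know how to fold.  */
>>> static bool
>>> classify (enum shape_kind shape, struct pred_layout *out)
>>> {
>>>   switch (shape)
>>>     {
>>>     case SHAPE_UNARY_M:
>>>       /* e.g. svbrka_b_m (inactive, pg, op): GP is arg1, fold to arg0.  */
>>>       out->gp_index = 1;
>>>       out->fold_index = 0;
>>>       return true;
>>>     case SHAPE_BINARY_M:
>>>       /* e.g. svadd_s32_m (pg, op1, op2): GP is arg0, fold to arg1.  */
>>>       out->gp_index = 0;
>>>       out->fold_index = 1;
>>>       return true;
>>>     default:
>>>       /* e.g. svpmov_lane_m: the predicate is a data operand, not a
>>>          GP, so no fold applies.  */
>>>       return false;
>>>     }
>>> }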
>> Hi Alex,
>> I agree that it would be more robust if we positively checked for the
>> intrinsics to which the fold can safely be applied.
>> We could probably do this using the function shape. But I think the
>> list of shapes where the fold can be applied is much longer than the
>> list of cases that need to be excluded. In fact, svbrka/b and
>> svpmov_lane with _m predication seem to be exceptional cases (which is
>> not to say that there will not be more of them in the future).
>> I looked for ways to access the position of the governing predicate.
>> It seems to me that there is no direct access to the position; rather,
>> it is inferred using functions such as
>> function_shape::has_merge_argument_p or in
>> function_resolver::check_gp_argument. But I think for svpmov_lane_m
>> they would still assume that the predicate is a governing predicate,
>> wouldn't they?
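>> 
>> For concreteness (an illustrative snippet, compiled with +sve2p1 as in
>> the testcase below): the predicate operand here is source data that is
>> written into part of the vector, so even a pfalse operand must zero
>> those bits rather than leave the vector unchanged.
>> 
>> #include <arm_sve.h>
>> 
>> /* Illustration only: folding this call to v would be wrong.  */
>> svuint64_t pmov_case (svuint64_t v)
>> {
>>   return svpmov_lane_u64_m (v, svpfalse (), 1);
>> }
>> 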
>> What do you think is the best way forward here?
>> Thanks for any advice.
>> Jennifer
> Gentle ping.
> Thanks,
> Jennifer
Gentle ping.
Thanks,
Jennifer
>>> 
>>> Thanks,
>>> Alex
>>> 
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>>> new file mode 100644
>>>> index 00000000000..a474a20554d
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>>> @@ -0,0 +1,25 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-O2" } */
>>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>>> +
>>>> +#include <arm_sve.h>
>>>> +
>>>> +/*
>>>> +** foo:
>>>> +**   ptrue   p0\.b, all
>>>> +**   brkb    p0\.b, p0/z, p0\.b
>>>> +**   ret
>>>> +*/
>>>> +svbool_t foo () {
>>>> +  return svbrkb_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
>>>> +}
>>>> +
>>>> +/*
>>>> +** bar:
>>>> +**   ptrue   p0\.b, all
>>>> +**   brka    p0\.b, p0/z, p0\.b
>>>> +**   ret
>>>> +*/
>>>> +svbool_t bar () {
>>>> +  return svbrka_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
>>>> +}
>>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>>> new file mode 100644
>>>> index 00000000000..16844ee4add
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>>> @@ -0,0 +1,16 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-O2 -march=armv8.2-a+sve2p1" } */
>>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>>> +
>>>> +#include <arm_sve.h>
>>>> +
>>>> +/*
>>>> +** f:
>>>> +**   pfalse  p([0-7]+)\.b
>>>> +**   mov     z0\.b, #-1
>>>> +**   pmov    z0\[1\], p\1\.d
>>>> +**   ret
>>>> +*/
>>>> +svuint64_t f () {
>>>> +    return svpmov_lane_u64_m (svdup_u64 (~0UL), svpfalse (), 1);
>>>> +}
>>>> --
>>>> 2.34.1

