> On 27 Aug 2025, at 09:14, Jennifer Schmitz <[email protected]> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
>> On 26 Aug 2025, at 16:02, Alex Coplan <[email protected]> wrote:
>> 
>> External email: Use caution opening links or attachments
>> 
>> 
>> On 26/08/2025 13:39, Jennifer Schmitz wrote:
>>> The function gimple_folder::fold_pfalse that was introduced in
>>> g5289540ed58e42ae66255e31f22afe4ca0a6e15e
>>> inappropriately folds
>>> 1) svbrka/b with _m if the inactive operand is pfalse
>>> 2) svpmov_lane with _m if the non-governing predicate operand is pfalse.
>>> To fix this, we added extra checks in the folding logic excluding certain
>>> function shapes.
>>> 
>>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>>> OK for trunk?
>>> 
>>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>> 
>>> gcc/
>>>     PR target/121604
>>>     * config/aarch64/aarch64-sve-builtins.cc
>>>     (gimple_folder::fold_pfalse): Add checks for function shape.
>>> 
>>> gcc/testsuite/
>>>     PR target/121604
>>>     * gcc.target/aarch64/sve/pr121604_brk.c: New test.
>>>     * gcc.target/aarch64/sve2/pr121604_pmov.c: Likewise.
>>> ---
>>> gcc/config/aarch64/aarch64-sve-builtins.cc    |  6 +++--
>>> .../gcc.target/aarch64/sve/pr121604_brk.c     | 25 +++++++++++++++++++
>>> .../gcc.target/aarch64/sve2/pr121604_pmov.c   | 16 ++++++++++++
>>> 3 files changed, 45 insertions(+), 2 deletions(-)
>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>> 
>>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
>>> b/gcc/config/aarch64/aarch64-sve-builtins.cc
>>> index 1764cf8f7e8..41b328d96fc 100644
>>> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
>>> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
>>> @@ -3641,9 +3641,11 @@ gimple_folder::fold_pfalse ()
>>>      inactive vector (arg0), while other function shapes are folded
>>>      to op1 (arg1).  */
>>>      tree arg1 = gimple_call_arg (call, 1);
>>> -      if (is_pfalse (arg1))
>>> +      if (is_pfalse (arg1)
>>> +       && shape != aarch64_sve::shapes::pmov_to_vector_lane)
>>>     return fold_call_to (arg0);
>>> -      if (is_pfalse (arg0))
>>> +      if (is_pfalse (arg0)
>>> +       && shape != aarch64_sve::shapes::unary)
>>>     return fold_call_to (arg1);
>>>      return nullptr;
>>>    }
>> 
>> This fix feels very fragile, like we could easily have the same bug 
>> introduced
>> again for other intrinsics that have non-governing predicate operands; have 
>> you
>> verified there are no other such SVE intrinsics?  Either way, it's
>> possible that more such intrinsics will get introduced in the future,
>> and we'll have the exact same bug for those.
>> 
>> Relying on is_pfalse () to infer the position of the GP is just too
>> fragile IMO.
>> 
>> As I mentioned in the PR, I think it would be better if we explicitly
>> determined:
>> (1) the position of the governing predicate, if any, and
>> (2) the position of the inactive argument (to fold to on pfalse), if any,
>> and used that information to do the folding.
>> 
>> Either we can do that by dispatching on the shapes (i.e. positively
>> classify the shapes into groups for the two cases that we know we can
>> safely optimize), or perhaps that information can be encoded in some
>> other way.
> Hi Alex,
> I agree that it would be more robust if we positively checked for the 
> intrinsics to which the fold can be safely applied.
> We could probably do this using the function shape. But I think that the list 
> of shapes where the fold can be applied is much longer than the cases that 
> need to be excluded. In fact, svbrka/b and svpmov_lane with _m predication 
> seem to be exceptional cases (which does not say that there will not be more 
> of those in future).
> I looked around for ways to access the position of the governing predicate. 
> It seems to me that there is no direct access to the position but rather it 
> is inferred using functions like function_shape::has_merge_argument_p or like 
> in function_resolver::check_gp_argument. But I think for svpmov_lane_m it 
> would still assume that the predicate is a governing predicate, wouldn’t it?
> What do you think is the best way forward here?
> Thanks for any advice.
> Jennifer
Gentle ping.
Thanks,
Jennifer
>> 
>> Thanks,
>> Alex
>> 
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c 
>>> b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>> new file mode 100644
>>> index 00000000000..a474a20554d
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>> @@ -0,0 +1,25 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2" } */
>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>> +
>>> +#include <arm_sve.h>
>>> +
>>> +/*
>>> +** foo:
>>> +**   ptrue   p0\.b, all
>>> +**   brkb    p0\.b, p0/z, p0\.b
>>> +**   ret
>>> +*/
>>> +svbool_t foo () {
>>> +  return svbrkb_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
>>> +}
>>> +
>>> +/*
>>> +** bar:
>>> +**   ptrue   p0\.b, all
>>> +**   brka    p0\.b, p0/z, p0\.b
>>> +**   ret
>>> +*/
>>> +svbool_t bar () {
>>> +  return svbrka_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
>>> +}
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c 
>>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>> new file mode 100644
>>> index 00000000000..16844ee4add
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>> @@ -0,0 +1,16 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -march=armv8.2-a+sve2p1" } */
>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>> +
>>> +#include <arm_sve.h>
>>> +
>>> +/*
>>> +** f:
>>> +**   pfalse  p([0-7]+)\.b
>>> +**   mov     z0\.b, #-1
>>> +**   pmov    z0\[1\], p\1\.d
>>> +**   ret
>>> +*/
>>> +svuint64_t f () {
>>> +    return svpmov_lane_u64_m (svdup_u64 (~0UL), svpfalse (), 1);
>>> +}
>>> --
>>> 2.34.1
> 
> 

Reply via email to