> On 27 Aug 2025, at 09:14, Jennifer Schmitz <[email protected]> wrote:
>
> External email: Use caution opening links or attachments
>
>
>> On 26 Aug 2025, at 16:02, Alex Coplan <[email protected]> wrote:
>>
>> External email: Use caution opening links or attachments
>>
>>
>> On 26/08/2025 13:39, Jennifer Schmitz wrote:
>>> The function gimple_folder::fold_pfalse that was introduced in
>>> g5289540ed58e42ae66255e31f22afe4ca0a6e15e
>>> inappropriately folds
>>> 1) svbrka/b with _m if the inactive operand is pfalse
>>> 2) svpmov_lane with _m if the non-governing predicate operand is pfalse.
>>> To fix this, we added extra checks in the folding logic excluding certain
>>> function shapes.
>>>
>>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>>> OK for trunk?
>>>
>>> Signed-off-by: Jennifer Schmitz <[email protected]>
>>>
>>> gcc/
>>> PR target/121604
>>> * config/aarch64/aarch64-sve-builtins.cc
>>> (gimple_folder::fold_pfalse): Add checks for function shape.
>>>
>>> gcc/testsuite/
>>> PR target/121604
>>> * gcc.target/aarch64/sve/pr121604_brk.c: New test.
>>> * gcc.target/aarch64/sve2/pr121604_pmov.c: Likewise.
>>> ---
>>> gcc/config/aarch64/aarch64-sve-builtins.cc | 6 +++--
>>> .../gcc.target/aarch64/sve/pr121604_brk.c | 25 +++++++++++++++++++
>>> .../gcc.target/aarch64/sve2/pr121604_pmov.c | 16 ++++++++++++
>>> 3 files changed, 45 insertions(+), 2 deletions(-)
>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>>
>>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc
>>> b/gcc/config/aarch64/aarch64-sve-builtins.cc
>>> index 1764cf8f7e8..41b328d96fc 100644
>>> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
>>> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
>>> @@ -3641,9 +3641,11 @@ gimple_folder::fold_pfalse ()
>>> inactive vector (arg0), while other function shapes are folded
>>> to op1 (arg1). */
>>> tree arg1 = gimple_call_arg (call, 1);
>>> - if (is_pfalse (arg1))
>>> + if (is_pfalse (arg1)
>>> + && shape != aarch64_sve::shapes::pmov_to_vector_lane)
>>> return fold_call_to (arg0);
>>> - if (is_pfalse (arg0))
>>> + if (is_pfalse (arg0)
>>> + && shape != aarch64_sve::shapes::unary)
>>> return fold_call_to (arg1);
>>> return nullptr;
>>> }
>>
>> This fix feels very fragile, like we could easily have the same bug
>> introduced
>> again for other intrinsics that have non-governing predicate operands; have
>> you
>> verified there are no other such SVE intrinsics? Either way, it's
>> possible that more such intrinsics will get introduced in the future,
>> and we'll have the exact same bug for those.
>>
>> Relying on is_pfalse () to infer the position of the GP is just too
>> fragile IMO.
>>
>> As I mentioned in the PR, I think it would be better if we explicitly
>> determined:
>> (1) the position of the governing predicate, if any, and
>> (2) the position of the inactive argument (to fold to on pfalse), if any,
>> and used that information to do the folding.
>>
>> Either we can do that by dispatching on the shapes (i.e. positively
>> classify the shapes into groups for the two cases that we know we can
>> safely optimize), or perhaps that information can be encoded in some
>> other way.
> Hi Alex,
> I agree that it would be more robust if we positively checked for the
> intrinsics to which the fold can be safely applied.
> We could probably do this using the function shape. But I think that the list
> of shapes where the fold can be applied is much longer than the cases that
> need to be excluded. In fact, svbrka/b and svpmov_lane with _m predication
> seem to be exceptional cases (which does not say that there will not be more
> of those in future).
> I looked around for ways to access the position of the governing predicate.
> It seems to me that there is no direct access to the position but rather it
> is inferred using functions like function_shape::has_merge_argument_p or like
> in function_resolver::check_gp_argument. But I think for svpmov_lane_m it
> would still assume that the predicate is a governing predicate, wouldn’t it?
> What do you think is the best way forward here?
> Thanks for any advice.
> Jennifer
Gentle ping.
Thanks,
Jennifer
>>
>> Thanks,
>> Alex
>>
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>> b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>> new file mode 100644
>>> index 00000000000..a474a20554d
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>>> @@ -0,0 +1,25 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2" } */
>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>> +
>>> +#include <arm_sve.h>
>>> +
>>> +/*
>>> +** foo:
>>> +** ptrue p0\.b, all
>>> +** brkb p0\.b, p0/z, p0\.b
>>> +** ret
>>> +*/
>>> +svbool_t foo () {
>>> + return svbrkb_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
>>> +}
>>> +
>>> +/*
>>> +** bar:
>>> +** ptrue p0\.b, all
>>> +** brka p0\.b, p0/z, p0\.b
>>> +** ret
>>> +*/
>>> +svbool_t bar () {
>>> + return svbrka_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
>>> +}
>>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>> new file mode 100644
>>> index 00000000000..16844ee4add
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>> @@ -0,0 +1,16 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -march=armv8.2-a+sve2p1" } */
>>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>>> +
>>> +#include <arm_sve.h>
>>> +
>>> +/*
>>> +** f:
>>> +** pfalse p([0-7]+)\.b
>>> +** mov z0\.b, #-1
>>> +** pmov z0\[1\], p\1\.d
>>> +** ret
>>> +*/
>>> +svuint64_t f () {
>>> + return svpmov_lane_u64_m (svdup_u64 (~0UL), svpfalse (), 1);
>>> +}
>>> --
>>> 2.34.1
>
>