> On 26 Aug 2025, at 16:02, Alex Coplan <alex.cop...@arm.com> wrote:
>
> External email: Use caution opening links or attachments
>
>
> On 26/08/2025 13:39, Jennifer Schmitz wrote:
>> The function gimple_folder::fold_pfalse that was introduced in
>> g5289540ed58e42ae66255e31f22afe4ca0a6e15e
>> inappropriately folds
>> 1) svbrka/b with _m if the inactive operand is pfalse
>> 2) svpmov_lane with _m if the non-governing predicate operand is pfalse.
>> To fix this, we added extra checks in the folding logic excluding certain
>> function shapes.
>>
>> The patch was bootstrapped and tested on aarch64-linux-gnu, no regression.
>> OK for trunk?
>>
>> Signed-off-by: Jennifer Schmitz <jschm...@nvidia.com>
>>
>> gcc/
>> PR target/121604
>> * config/aarch64/aarch64-sve-builtins.cc
>> (gimple_folder::fold_pfalse): Add checks for function shape.
>>
>> gcc/testsuite/
>> PR target/121604
>> * gcc.target/aarch64/sve/pr121604_brk.c: New test.
>> * gcc.target/aarch64/sve2/pr121604_pmov.c: Likewise.
>> ---
>> gcc/config/aarch64/aarch64-sve-builtins.cc | 6 +++--
>> .../gcc.target/aarch64/sve/pr121604_brk.c | 25 +++++++++++++++++++
>> .../gcc.target/aarch64/sve2/pr121604_pmov.c | 16 ++++++++++++
>> 3 files changed, 45 insertions(+), 2 deletions(-)
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>>
>> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc
>> b/gcc/config/aarch64/aarch64-sve-builtins.cc
>> index 1764cf8f7e8..41b328d96fc 100644
>> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
>> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
>> @@ -3641,9 +3641,11 @@ gimple_folder::fold_pfalse ()
>> inactive vector (arg0), while other function shapes are folded
>> to op1 (arg1). */
>> tree arg1 = gimple_call_arg (call, 1);
>> - if (is_pfalse (arg1))
>> + if (is_pfalse (arg1)
>> + && shape != aarch64_sve::shapes::pmov_to_vector_lane)
>> return fold_call_to (arg0);
>> - if (is_pfalse (arg0))
>> + if (is_pfalse (arg0)
>> + && shape != aarch64_sve::shapes::unary)
>> return fold_call_to (arg1);
>> return nullptr;
>> }
>
> This fix feels very fragile, like we could easily have the same bug introduced
> again for other intrinsics that have non-governing predicate operands; have
> you
> verified there are no other such SVE intrinsics? Either way, it's
> possible that more such intrinsics will get introduced in the future,
> and we'll have the exact same bug for those.
>
> Relying on is_pfalse () to infer the position of the GP is just too
> fragile IMO.
>
> As I mentioned in the PR, I think it would be better if we explicitly
> determined:
> (1) the position of the governing predicate, if any, and
> (2) the position of the inactive argument (to fold to on pfalse), if any,
> and used that information to do the folding.
>
> Either we can do that by dispatching on the shapes (i.e. positively
> classify the shapes into groups for the two cases that we know we can
> safely optimize), or perhaps that information can be encoded in some
> other way.
Hi Alex,
I agree that it would be more robust if we positively checked for the
intrinsics to which the fold can be safely applied.
We could probably do this using the function shape. But I think that the list
of shapes where the fold can be applied is much longer than the cases that need
to be excluded. In fact, svbrka/b and svpmov_lane with _m predication seem to
be exceptional cases (which does not say that there will not be more of those
in future).
I looked around for ways to access the position of the governing predicate. It
seems to me that there is no direct access to the position but rather it is
inferred using functions like function_shape::has_merge_argument_p or like in
function_resolver::check_gp_argument. But I think for svpmov_lane_m it would
still assume that the predicate is a governing predicate, wouldn’t it?
What do you think is the best way forward here?
Thanks for any advice.
Jennifer
>
> Thanks,
> Alex
>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>> b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>> new file mode 100644
>> index 00000000000..a474a20554d
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr121604_brk.c
>> @@ -0,0 +1,25 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>> +
>> +#include <arm_sve.h>
>> +
>> +/*
>> +** foo:
>> +** ptrue p0\.b, all
>> +** brkb p0\.b, p0/z, p0\.b
>> +** ret
>> +*/
>> +svbool_t foo () {
>> + return svbrkb_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
>> +}
>> +
>> +/*
>> +** bar:
>> +** ptrue p0\.b, all
>> +** brka p0\.b, p0/z, p0\.b
>> +** ret
>> +*/
>> +svbool_t bar () {
>> + return svbrka_b_m (svpfalse (), svptrue_b8 (), svptrue_b8 ());
>> +}
>> diff --git a/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>> b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>> new file mode 100644
>> index 00000000000..16844ee4add
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve2/pr121604_pmov.c
>> @@ -0,0 +1,16 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -march=armv8.2-a+sve2p1" } */
>> +/* { dg-final { check-function-bodies "**" "" "" } } */
>> +
>> +#include <arm_sve.h>
>> +
>> +/*
>> +** f:
>> +** pfalse p([0-7]+)\.b
>> +** mov z0\.b, #-1
>> +** pmov z0\[1\], p\1\.d
>> +** ret
>> +*/
>> +svuint64_t f () {
>> + return svpmov_lane_u64_m (svdup_u64 (~0UL), svpfalse (), 1);
>> +}
>> --
>> 2.34.1