Hi Richard, > -----Original Message----- > From: Richard Sandiford <richard.sandif...@arm.com> > Sent: Wednesday, October 9, 2024 12:58 PM > To: gcc-patches@gcc.gnu.org > Cc: ktkac...@nvidia.com; Richard Earnshaw <richard.earns...@arm.com>; > Tamar Christina <tamar.christ...@arm.com> > Subject: [PATCH] aarch64: Fix folding of degenerate svwhilele case [PR117045] > > The svwhilele folder mishandled the degenerate case in which > the second argument is the maximum integer. In that case, > the result is all-true regardless of the first parameter: > > If the second scalar operand is equal to the maximum signed integer > value then a condition which includes an equality test can never fail > and the result will be an all-true predicate. > > This is because the conceptual "increment the first operand > by 1 after each element" is done modulo the range of the operand. > The GCC code was instead treating it as infinite precision. > whilele_5.c even had a test for the incorrect behaviour. > > The easiest fix seemed to be to handle that case specially before > doing constant folding. This also copes with variable first operands. > > Tested on aarch64-linux-gnu. I'll push on Friday if there are no > comments before then. Since it's a wrong-code bug, I'd also like > to backport to release branches. > > Thanks, > Richard > > > gcc/ > PR target/116999 > PR target/117045 > * config/aarch64/aarch64-sve-builtins-base.cc > (svwhilelx_impl::fold): Check for WHILELTs of the minimum value > and WHILELEs of the maximum value. Fold them to all-false and > all-true respectively. > > gcc/testsuite/ > PR target/116999 > PR target/117045 > * gcc.target/aarch64/sve/acle/general/whilele_5.c: Fix bogus > expected result. > * gcc.target/aarch64/sve/acle/general/whilele_11.c: New test. > * gcc.target/aarch64/sve/acle/general/whilele_12.c: Likewise. > --- > .../aarch64/aarch64-sve-builtins-base.cc | 11 +++++- > .../aarch64/sve/acle/general/whilele_11.c | 31 +++++++++++++++++ > .../aarch64/sve/acle/general/whilele_12.c | 34 +++++++++++++++++++ > .../aarch64/sve/acle/general/whilele_5.c | 2 +- > 4 files changed, 76 insertions(+), 2 deletions(-) > create mode 100644 > gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c > create mode 100644 > gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c > > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc > b/gcc/config/aarch64/aarch64-sve-builtins-base.cc > index 4b33585d981..3d0975e4294 100644 > --- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc > +++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc > @@ -2945,7 +2945,9 @@ public: > : while_comparison (unspec_for_sint, unspec_for_uint), m_eq_p (eq_p) > {} > > - /* Try to fold a call by treating its arguments as constants of type T. */ > + /* Try to fold a call by treating its arguments as constants of type T. > + We have already filtered out the degenerate cases of X .LT. MIN > + and X .LE. MAX. */ > template<typename T> > gimple * > fold_type (gimple_folder &f) const > @@ -3001,6 +3003,13 @@ public: > if (f.vectors_per_tuple () > 1) > return nullptr; > > + /* Filter out cases where the condition is always true or always false. > */ > + tree arg1 = gimple_call_arg (f.call, 1); > + if (!m_eq_p && operand_equal_p (arg1, TYPE_MIN_VALUE (TREE_TYPE > (arg1)))) > + return f.fold_to_pfalse ();
Just a quick question for my own understanding, I assume the reason MIN is handled here is because fold_type will decrement the value at some point? Otherwise wouldn't MIN + 1 still fit inside the type's precision? FWIW patch looks good to me, just wondering why the MIN case is needed :) Cheers, Tamar > + if (m_eq_p && operand_equal_p (arg1, TYPE_MAX_VALUE (TREE_TYPE > (arg1)))) > + return f.fold_to_ptrue (); > + > if (f.type_suffix (1).unsigned_p) > return fold_type<poly_uint64> (f); > else > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c > b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c > new file mode 100644 > index 00000000000..2be9dc5c534 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_11.c > @@ -0,0 +1,31 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +#include <arm_sve.h> > +#include <limits.h> > + > +svbool_t > +f1 (volatile int32_t *ptr) > +{ > + return svwhilelt_b8_s32 (*ptr, INT32_MIN); > +} > + > +svbool_t > +f2 (volatile uint32_t *ptr) > +{ > + return svwhilelt_b16_u32 (*ptr, 0); > +} > + > +svbool_t > +f3 (volatile int64_t *ptr) > +{ > + return svwhilelt_b32_s64 (*ptr, INT64_MIN); > +} > + > +svbool_t > +f4 (volatile uint64_t *ptr) > +{ > + return svwhilelt_b64_u64 (*ptr, 0); > +} > + > +/* { dg-final { scan-assembler-times {\tpfalse\tp[0-9]+\.b\n} 4 } } */ > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c > b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c > new file mode 100644 > index 00000000000..713065c3145 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_12.c > @@ -0,0 +1,34 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2" } */ > + > +#include <arm_sve.h> > +#include <limits.h> > + > +svbool_t > +f1 (volatile int32_t *ptr) > +{ > + return svwhilele_b8_s32 (*ptr, INT32_MAX); > +} > + > +svbool_t > +f2 (volatile uint32_t *ptr) > +{ > + return svwhilele_b16_u32 (*ptr, UINT32_MAX); > +} > + > +svbool_t > +f3 (volatile int64_t *ptr) > +{ > + return svwhilele_b32_s64 (*ptr, INT64_MAX); > +} > + > +svbool_t > +f4 (volatile uint64_t *ptr) > +{ > + return svwhilele_b64_u64 (*ptr, UINT64_MAX); > +} > + > +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.b(?:, all)\n} } } */ > +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h(?:, all)\n} } } */ > +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.s(?:, all)\n} } } */ > +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.d(?:, all)\n} } } */ > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c > b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c > index 7c73aa5926b..79f0e64b2ae 100644 > --- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c > +++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/whilele_5.c > @@ -28,7 +28,7 @@ test3 (svbool_t *ptr) > *ptr = svwhilele_b16_s32 (0x7ffffffb, 0x7fffffff); > } > > -/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h, vl5\n} } } */ > +/* { dg-final { scan-assembler {\tptrue\tp[0-9]+\.h(?:, all)\n} } } */ > > void > test4 (svbool_t *ptr) > -- > 2.25.1