Richard Biener <[email protected]> writes:
> On Mon, Mar 9, 2026 at 10:22 AM Eikansh Gupta <[email protected]> wrote:
>>
>> Add two match.pd patterns that simplify COND_ADD when the mask is
>> a strict greater-than comparison against zero and the else value equals
>> one of the addition operands:
>>
>>   COND_ADD (a > 0, b, a, b) -> b + MAX_EXPR <a, 0>
>>   COND_ADD (a > 0, a, b, b) -> b + MAX_EXPR <a, 0>
>
> I'll note this has to wait for stage1.
>
> Also I had expected a predicated add to be faster than a separate max
> + add?  It's definitely more canonical on GIMPLE given it's a single
> stmt vs. two.  Both suggest that match.pd isn't a good place to do
> such transform.

If the gt is single-use then I suppose it's gt + cond_add vs max + add.
And add is arguably simpler than cond_add.  Would adding :s to the gt
make it more acceptable?

That said, having rules in this direction (condition op to multiple
unconditional ops) sounds like it might risk cycles...

Richard

> Richard.
>
>>         PR 124097
>>
>> gcc/ChangeLog:
>>
>>         * match.pd (COND_ADD (a > 0, a, b, b) -> b + MAX_EXPR <a, 0>): New
>>         pattern.
>>
>> gcc/testsuite/ChangeLog:
>>
>>         * gcc.dg/tree-ssa/pr124097.c: New test.
>>
>> Signed-off-by: Eikansh Gupta <[email protected]>
>> ---
>>  gcc/match.pd                             | 15 +++++++++++++++
>>  gcc/testsuite/gcc.dg/tree-ssa/pr124097.c | 17 +++++++++++++++++
>>  2 files changed, 32 insertions(+)
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr124097.c
>>
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 7f16fd4e081..2bb4f425895 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -11323,6 +11323,21 @@ and,
>>         && fold_real_zero_addition_p (type, NULL_TREE, @4, 0)))
>>     (IFN_COND_ADD (bit_and @0 @2) @1 @3 @1)))
>>
>> +/* COND_ADD (a > 0, b, a, b) -> b + max (a, 0)
>> +   COND_ADD (a > 0, a, b, b) -> b + max (a, 0) */
>> +(simplify
>> + (IFN_COND_ADD (gt @0 zerop@1) @2 @0 @2)
>> + (if (FLOAT_TYPE_P (type)
>> +      && !HONOR_NANS (type)
>> +      && !HONOR_SIGNED_ZEROS (type))
>> +  (plus @2 (max @0 @1))))
>> +(simplify
>> + (IFN_COND_ADD (gt @0 zerop@1) @0 @2 @2)
>> + (if (FLOAT_TYPE_P (type)
>> +      && !HONOR_NANS (type)
>> +      && !HONOR_SIGNED_ZEROS (type))
>> +  (plus @2 (max @0 @1))))
>> +
>>  /* Detect simplication for a conditional length reduction where
>>
>>     a = mask ? b : 0
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr124097.c b/gcc/testsuite/gcc.dg/tree-ssa/pr124097.c
>> new file mode 100644
>> index 00000000000..5769a2ca5d0
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr124097.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O3 -ffast-math -march=armv9-a -fdump-tree-optimized" } */
>> +
>> +float __attribute__ ((noipa))
>> +foo (float *a)
>> +{
>> +  float sum = 0.;
>> +  for (int i = 0; i < 32000; i++)
>> +    if (a[i] > (float) 0.)
>> +      sum += a[i];
>> +  return sum;
>> +}
>> +
>> +/* The main vectorized loop should have COND_ADD simplified to MAX_EXPR + plus.
>> +   The SVE epilogue retains COND_ADD */
>> +/* { dg-final { scan-tree-dump-times { MAX_EXPR } 2 "optimized" } } */
>> +/* { dg-final { scan-tree-dump-times { \.COND_ADD } 2 "optimized" } } */
>> --
>> 2.34.1
>>
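
For reference, the two forms being compared in this thread can be modelled
per element in plain C.  This is only a scalar sketch of the semantics, not
GCC code: the names cond_add_form, max_add_form and check_equivalence are
hypothetical, and the inputs are restricted to finite, nonzero values, in
line with the !HONOR_NANS and !HONOR_SIGNED_ZEROS guards in the quoted
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of COND_ADD (a > 0, b, a, b):
   add a to b where the mask is true, otherwise keep b (the else value).  */
static float
cond_add_form (float a, float b)
{
  return a > 0.0f ? b + a : b;
}

/* Hypothetical scalar model of b + MAX_EXPR <a, 0>.  The max is written
   out by hand; NaN and signed-zero inputs are deliberately not covered.  */
static float
max_add_form (float a, float b)
{
  float m = a > 0.0f ? a : 0.0f;
  return b + m;
}

/* Compare both forms on a few finite, nonzero inputs whose sums are
   exactly representable in float.  */
static void
check_equivalence (void)
{
  static const float as[] = { -3.5f, -1.0f, 0.25f, 2.0f, 100.0f };
  static const float bs[] = { -7.0f, 1.5f, 42.0f };

  for (size_t i = 0; i < sizeof as / sizeof as[0]; i++)
    for (size_t j = 0; j < sizeof bs / sizeof bs[0]; j++)
      assert (cond_add_form (as[i], bs[j]) == max_add_form (as[i], bs[j]));
}
```

Calling check_equivalence () passes for these inputs.  Zeros are left out
on purpose: with b = -0.0 and a non-positive a, the first form returns -0.0
while the second computes -0.0 + 0.0, which is +0.0 under round-to-nearest.
That difference is presumably one reason the patterns are guarded by
!HONOR_SIGNED_ZEROS.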
