On Mon, Mar 9, 2026 at 10:22 AM Eikansh Gupta <[email protected]> wrote:
>
> Add two match.pd patterns that simplify COND_ADD when the mask is
> a strict greater-than comparison against zero and the else value equals
> one of the addition operands:
>
>   COND_ADD (a > 0, b, a, b)  ->  b + MAX_EXPR <a, 0>
>   COND_ADD (a > 0, a, b, b)  ->  b + MAX_EXPR <a, 0>

I'll note this has to wait for stage1.

Also I had expected a predicated add to be faster than a separate max
+ add?  It's
definitely more canonical on GIMPLE given it's a single stmt vs. two.
Both suggest
that match.pd isn't a good place to do such transform.

Richard.

>         PR 124097
>
> gcc/ChangeLog:
>
>         * match.pd (COND_ADD (a > 0, a, b, b)  ->  b + MAX_EXPR <a, 0>): New 
> pattern.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/pr124097.c: New test.
>
> Signed-off-by: Eikansh Gupta <[email protected]>
> ---
>  gcc/match.pd                             | 15 +++++++++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/pr124097.c | 17 +++++++++++++++++
>  2 files changed, 32 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr124097.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 7f16fd4e081..2bb4f425895 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -11323,6 +11323,21 @@ and,
>             && fold_real_zero_addition_p (type, NULL_TREE, @4, 0)))
>     (IFN_COND_ADD (bit_and @0 @2) @1 @3 @1)))
>
> +/* COND_ADD (a > 0, b, a, b) -> b + max (a, 0)
> +   COND_ADD (a > 0, a, b, b) -> b + max (a, 0) */
> +(simplify
> + (IFN_COND_ADD (gt @0 zerop@1) @2 @0 @2)
> + (if (FLOAT_TYPE_P (type)
> +      && !HONOR_NANS (type)
> +      && !HONOR_SIGNED_ZEROS (type))
> +  (plus @2 (max @0 @1))))
> +(simplify
> + (IFN_COND_ADD (gt @0 zerop@1) @0 @2 @2)
> + (if (FLOAT_TYPE_P (type)
> +      && !HONOR_NANS (type)
> +      && !HONOR_SIGNED_ZEROS (type))
> +  (plus @2 (max @0 @1))))
> +
>  /* Detect simplication for a conditional length reduction where
>
>     a = mask ? b : 0
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr124097.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr124097.c
> new file mode 100644
> index 00000000000..5769a2ca5d0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr124097.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -ffast-math -march=armv9-a -fdump-tree-optimized" } */
> +
> +float __attribute__ ((noipa))
> +foo (float *a)
> +{
> +  float sum = 0.;
> +  for (int i = 0; i < 32000; i++)
> +    if (a[i] > (float) 0.)
> +      sum += a[i];
> +  return sum;
> +}
> +
> +/* The main vectorized loop should have COND_ADD simplified to MAX_EXPR + 
> plus.
> +   The SVE epilogue retains COND_ADD */
> +/* { dg-final { scan-tree-dump-times { MAX_EXPR } 2 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times { \.COND_ADD } 2 "optimized" } } */
> --
> 2.34.1
>

Reply via email to