On Thu, Mar 12, 2026 at 10:33 AM Philipp Tomsich
<[email protected]> wrote:
>
> In forward_propagate_into_comparison_1, the invariant_only_p
> restriction prevents folding comparisons when the defining SSA value
> has multiple uses and the folded result is not a constant.  This
> blocks the simplification of patterns like (++*a == 1) into (*a == 0),
> where comparing the pre-increment value against zero is cheaper on
> most targets (e.g., beqz on RISC-V, cbz on AArch64).
>
> Relax invariant_only_p when the defining statement is a PLUS_EXPR
> with a constant operand, the comparison is an equality test against a
> non-zero constant, and the folded constant would be zero.  GIMPLE
> canonicalizes (X - C) to (X + -C), so only PLUS_EXPR needs handling.
> This ensures we only fold toward zero comparisons, never away from
> them (e.g., --*a == 0 must not fold to *a == 1).
>
> For example, given:
>   _1 = *a;
>   _2 = _1 + 1;
>   *a = _2;
>   if (_2 == 1)
>
> forwprop now produces:
>   if (_1 == 0)
>
> which generates beqz/cbz instead of li+beq/cmp+b.eq.
>
> gcc/ChangeLog:
>
>         * tree-ssa-forwprop.cc (forward_propagate_into_comparison_1):
>         Relax invariant_only_p for PLUS_EXPR with constant operand
>         when the fold produces an equality comparison against zero.

We are trying to remove forward_propagate_into_comparison_1, so it
would be better not to add new code there.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120206 .

The specific issue you are looking into is already recorded as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120283 (and also as
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93006 and
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96702).
It is mentioned there that the single_use check is "overly restrictive";
see comment #4.

If you want to allow your specific case here (though this does not
fully fix PR 120283), change:
(if (single_use (@3))
into:
(if (single_use (@3) || wi::to_wide (res) == 0)
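
As a rough illustration of why that extra condition is safe and when it
fires, here is a minimal standalone C sketch. It is not GCC code: the
helper names are hypothetical, and unsigned arithmetic stands in for the
wrapping constant arithmetic done on the folded constant. It models
(x + c == k) being rewritten to (x == k - c), and the "new constant is
zero" test.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: the original form, comparing the value
   after the addition against k.  */
static int cmp_after_add (unsigned int x, unsigned int c, unsigned int k)
{
  return x + c == k;
}

/* Hypothetical helper: the folded form, comparing the value before
   the addition against k - c (wrapping arithmetic).  */
static int cmp_before_add (unsigned int x, unsigned int c, unsigned int k)
{
  return x == k - c;
}

/* Hypothetical predicate modelling the suggested extra condition:
   true when the folded constant k - c is zero.  */
static int folded_constant_is_zero (unsigned int c, unsigned int k)
{
  return k - c == 0;
}

/* Count sample inputs on which the two forms disagree.  They are
   equivalent modulo 2^N, so this is expected to find none.  */
static unsigned int count_mismatches (unsigned int c, unsigned int k)
{
  static const unsigned int samples[] = { 0, 1, 2, UINT_MAX - 1, UINT_MAX };
  unsigned int mismatches = 0;
  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (cmp_after_add (samples[i], c, k) != cmp_before_add (samples[i], c, k))
      mismatches++;
  return mismatches;
}
```

With these helpers, folded_constant_is_zero (1, 1) holds for the
++*a == 1 case, while it does not hold for (*a += 3) == 10 (the folded
constant is 7) or for --*a == 0 (c is UINT_MAX, so the folded constant
is 1).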

Thanks,
Andrew


>
> gcc/testsuite/ChangeLog:
>
>         * gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c: New test.
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c
> new file mode 100644
> index 000000000000..77e74700b9ef
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c
> @@ -0,0 +1,93 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-forwprop2" } */
> +
> +/* Verify that forwprop folds (++*a == 1) into (*a == 0), comparing the
> +   pre-increment value against zero instead of comparing the incremented
> +   value against 1.  Only fold when the result is a comparison against
> +   zero (which is cheaper on most architectures).  */
> +
> +void g ();
> +
> +/* Unsigned EQ: ++*a == 1 -> *a == 0.  */
> +void f1 (unsigned int *a)
> +{
> +  if (++*a == 1)
> +    g ();
> +}
> +
> +/* Unsigned NE: ++*a != 1 -> *a != 0.  */
> +void f2 (unsigned int *a)
> +{
> +  if (++*a != 1)
> +    g ();
> +}
> +
> +/* Unsigned EQ with addend > 1: (*a += 3) == 3 -> *a == 0.  */
> +void f3 (unsigned int *a)
> +{
> +  if ((*a += 3) == 3)
> +    g ();
> +}
> +
> +/* Unsigned EQ with non-zero result: (*a += 3) == 10 does NOT fold
> +   (result would be 7, not zero).  */
> +void f4 (unsigned int *a)
> +{
> +  if ((*a += 3) == 10)
> +    g ();
> +}
> +
> +/* Unsigned EQ already comparing against zero: --*a == 0 must NOT
> +   fold to *a == 1 (regression away from zero).  */
> +void f5 (unsigned int *a)
> +{
> +  if (--*a == 0)
> +    g ();
> +}
> +
> +/* Signed EQ: ++*a == 1 -> *a == 0.  */
> +void f6 (int *a)
> +{
> +  if (++*a == 1)
> +    g ();
> +}
> +
> +/* Signed NE: ++*a != 1 -> *a != 0.  */
> +void f7 (int *a)
> +{
> +  if (++*a != 1)
> +    g ();
> +}
> +
> +/* Signed EQ already comparing against zero: --*a == 0 must NOT
> +   fold to *a == 1 (regression away from zero).  */
> +void f8 (int *a)
> +{
> +  if (--*a == 0)
> +    g ();
> +}
> +
> +/* Ordering comparison: (++*a > 1) must NOT fold, even though the
> +   folded constant would be zero -- the relaxation is restricted
> +   to EQ_EXPR and NE_EXPR.  */
> +void f9 (int *a)
> +{
> +  if (++*a > 1)
> +    g ();
> +}
> +
> +/* Positive: unsigned and signed EQ/NE fold to zero.
> +   Use scan-tree-dump-times to independently verify that both unsigned
> +   (f1/f2) and signed (f6/f7) variants fold.  */
> +/* { dg-final { scan-tree-dump-times "Replaced '_\[0-9\]+ == 1' with '_\[0-9\]+ == 0'" 2 "forwprop2" } } */
> +/* { dg-final { scan-tree-dump-times "Replaced '_\[0-9\]+ != 1' with '_\[0-9\]+ != 0'" 2 "forwprop2" } } */
> +/* { dg-final { scan-tree-dump-times "Replaced '_\[0-9\]+ == 3' with '_\[0-9\]+ == 0'" 1 "forwprop2" } } */
> +
> +/* Negative: non-zero result must not fold.  */
> +/* { dg-final { scan-tree-dump-not "Replaced '_\[0-9\]+ == 10' with '_\[0-9\]+ == 7'" "forwprop2" } } */
> +
> +/* Negative: already-zero comparison must not fold away from zero.  */
> +/* { dg-final { scan-tree-dump-not "Replaced '_\[0-9\]+ == 0' with '_\[0-9\]+ ==" "forwprop2" } } */
> +
> +/* Negative: ordering comparison must not fold via this path.  */
> +/* { dg-final { scan-tree-dump-not "Replaced '_\[0-9\]+ > 1' with '_\[0-9\]+ > 0'" "forwprop2" } } */
> diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> index b5544414ca6e..0e1637d4782a 100644
> --- a/gcc/tree-ssa-forwprop.cc
> +++ b/gcc/tree-ssa-forwprop.cc
> @@ -467,6 +467,32 @@ forward_propagate_into_comparison_1 (gimple *stmt,
>                   || TREE_CODE_CLASS (def_code) == tcc_comparison))
>             invariant_only_p = false;
>
> +         /* Allow combining when the defining statement is an addition
> +            with a constant, and the fold will produce a comparison
> +            against zero.  On most architectures, comparing against
> +            zero is cheaper than comparing against a non-zero constant.
> +            Only relax invariant_only_p when the original comparison
> +            is non-zero and the folded result would be zero -- otherwise
> +            we could regress by moving a comparison away from zero.
> +            Note: GIMPLE canonicalizes (X - C) to (X + -C), so only
> +            PLUS_EXPR needs to be handled here.  */
> +         if (invariant_only_p
> +             && (code == EQ_EXPR || code == NE_EXPR)
> +             && TREE_CODE (op1) == INTEGER_CST
> +             && !integer_zerop (op1)
> +             && def_code == PLUS_EXPR
> +             && TREE_CODE (gimple_assign_rhs2 (def_stmt)) == INTEGER_CST)
> +           {
> +             tree rhs2 = gimple_assign_rhs2 (def_stmt);
> +             /* op1 and rhs2 may have different types due to implicit
> +                promotions; int_const_binop handles this by converting
> +                rhs2 to op1's precision.  We only check integer_zerop
> +                on the result, which is type-insensitive.  */
> +             tree folded_cst = int_const_binop (MINUS_EXPR, op1, rhs2);
> +             if (folded_cst && integer_zerop (folded_cst))
> +               invariant_only_p = false;
> +           }
> +
>           tmp = combine_cond_expr_cond (stmt, code, type,
>                                         rhs0, op1, invariant_only_p);
>           if (tmp)
> --
> 2.34.1
>
