Thanks for the heads-up.  This one popped up in our "spring cleaning" of
old GCC tickets (some RISC-V related).
We'll take a look.

On Thu, 12 Mar 2026 at 20:57, Andrew Pinski <[email protected]>
wrote:

> On Thu, Mar 12, 2026 at 10:33 AM Philipp Tomsich
> <[email protected]> wrote:
> >
> > In forward_propagate_into_comparison_1, the invariant_only_p
> > restriction prevents folding comparisons when the defining SSA value
> > has multiple uses and the folded result is not a constant.  This
> > blocks the simplification of patterns like (++*a == 1) into (*a == 0),
> > where comparing the pre-increment value against zero is cheaper on
> > most targets (e.g., beqz on RISC-V, cbz on AArch64).
> >
> > Relax invariant_only_p when the defining statement is a PLUS_EXPR
> > with a constant operand, the comparison is an equality test against a
> > non-zero constant, and the folded constant would be zero.  GIMPLE
> > canonicalizes (X - C) to (X + -C), so only PLUS_EXPR needs handling.
> > This ensures we only fold toward zero comparisons, never away from
> > them (e.g., --*a == 0 must not fold to *a == 1).
> >
> > For example, given:
> >   _1 = *a;
> >   _2 = _1 + 1;
> >   *a = _2;
> >   if (_2 == 1)
> >
> > forwprop now produces:
> >   if (_1 == 0)
> >
> > which generates beqz/cbz instead of li+beq/cmp+b.eq.
> >
> > gcc/ChangeLog:
> >
> >         * tree-ssa-forwprop.cc (forward_propagate_into_comparison_1):
> >         Relax invariant_only_p for PLUS_EXPR with constant operand
> >         when the fold produces an equality comparison against zero.
>
> We are trying to remove forward_propagate_into_comparison_1, so it
> would be better not to add new code there.
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120206 .
>
> The specific issue you are looking into is already recorded as
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120283 (and also as
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93006 and
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96702).
> It is mentioned there that the single_use check is "overly restrictive";
> see comment #4.
>
> If you want to allow your specific case here (though it does not
> fully fix PR 120283), change:
> (if (single_use (@3))
> into:
> (if (single_use (@3) || wi::to_wide (res) == 0)
>
> Thanks,
> Andrew
>
>
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c: New test.
> >
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c
> > new file mode 100644
> > index 000000000000..77e74700b9ef
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c
> > @@ -0,0 +1,93 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-forwprop2" } */
> > +
> > +/* Verify that forwprop folds (++*a == 1) into (*a == 0), comparing the
> > +   pre-increment value against zero instead of comparing the incremented
> > +   value against 1.  Only fold when the result is a comparison against
> > +   zero (which is cheaper on most architectures).  */
> > +
> > +void g ();
> > +
> > +/* Unsigned EQ: ++*a == 1 -> *a == 0.  */
> > +void f1 (unsigned int *a)
> > +{
> > +  if (++*a == 1)
> > +    g ();
> > +}
> > +
> > +/* Unsigned NE: ++*a != 1 -> *a != 0.  */
> > +void f2 (unsigned int *a)
> > +{
> > +  if (++*a != 1)
> > +    g ();
> > +}
> > +
> > +/* Unsigned EQ with addend > 1: (*a += 3) == 3 -> *a == 0.  */
> > +void f3 (unsigned int *a)
> > +{
> > +  if ((*a += 3) == 3)
> > +    g ();
> > +}
> > +
> > +/* Unsigned EQ with non-zero result: (*a += 3) == 10 does NOT fold
> > +   (result would be 7, not zero).  */
> > +void f4 (unsigned int *a)
> > +{
> > +  if ((*a += 3) == 10)
> > +    g ();
> > +}
> > +
> > +/* Unsigned EQ already comparing against zero: --*a == 0 must NOT
> > +   fold to *a == 1 (regression away from zero).  */
> > +void f5 (unsigned int *a)
> > +{
> > +  if (--*a == 0)
> > +    g ();
> > +}
> > +
> > +/* Signed EQ: ++*a == 1 -> *a == 0.  */
> > +void f6 (int *a)
> > +{
> > +  if (++*a == 1)
> > +    g ();
> > +}
> > +
> > +/* Signed NE: ++*a != 1 -> *a != 0.  */
> > +void f7 (int *a)
> > +{
> > +  if (++*a != 1)
> > +    g ();
> > +}
> > +
> > +/* Signed EQ already comparing against zero: --*a == 0 must NOT
> > +   fold to *a == 1 (regression away from zero).  */
> > +void f8 (int *a)
> > +{
> > +  if (--*a == 0)
> > +    g ();
> > +}
> > +
> > +/* Ordering comparison: (++*a > 1) must NOT fold, even though the
> > +   folded constant would be zero -- the relaxation is restricted
> > +   to EQ_EXPR and NE_EXPR.  */
> > +void f9 (int *a)
> > +{
> > +  if (++*a > 1)
> > +    g ();
> > +}
> > +
> > +/* Positive: unsigned and signed EQ/NE fold to zero.
> > +   Use scan-tree-dump-times to independently verify that both unsigned
> > +   (f1/f2) and signed (f6/f7) variants fold.  */
> > +/* { dg-final { scan-tree-dump-times "Replaced '_\[0-9\]+ == 1' with '_\[0-9\]+ == 0'" 2 "forwprop2" } } */
> > +/* { dg-final { scan-tree-dump-times "Replaced '_\[0-9\]+ != 1' with '_\[0-9\]+ != 0'" 2 "forwprop2" } } */
> > +/* { dg-final { scan-tree-dump-times "Replaced '_\[0-9\]+ == 3' with '_\[0-9\]+ == 0'" 1 "forwprop2" } } */
> > +
> > +/* Negative: non-zero result must not fold.  */
> > +/* { dg-final { scan-tree-dump-not "Replaced '_\[0-9\]+ == 10' with '_\[0-9\]+ == 7'" "forwprop2" } } */
> > +
> > +/* Negative: already-zero comparison must not fold away from zero.  */
> > +/* { dg-final { scan-tree-dump-not "Replaced '_\[0-9\]+ == 0' with '_\[0-9\]+ ==" "forwprop2" } } */
> > +
> > +/* Negative: ordering comparison must not fold via this path.  */
> > +/* { dg-final { scan-tree-dump-not "Replaced '_\[0-9\]+ > 1' with '_\[0-9\]+ > 0'" "forwprop2" } } */
> > diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> > index b5544414ca6e..0e1637d4782a 100644
> > --- a/gcc/tree-ssa-forwprop.cc
> > +++ b/gcc/tree-ssa-forwprop.cc
> > @@ -467,6 +467,32 @@ forward_propagate_into_comparison_1 (gimple *stmt,
> >                   || TREE_CODE_CLASS (def_code) == tcc_comparison))
> >             invariant_only_p = false;
> >
> > +         /* Allow combining when the defining statement is an addition
> > +            with a constant, and the fold will produce a comparison
> > +            against zero.  On most architectures, comparing against
> > +            zero is cheaper than comparing against a non-zero constant.
> > +            Only relax invariant_only_p when the original comparison
> > +            is non-zero and the folded result would be zero -- otherwise
> > +            we could regress by moving a comparison away from zero.
> > +            Note: GIMPLE canonicalizes (X - C) to (X + -C), so only
> > +            PLUS_EXPR needs to be handled here.  */
> > +         if (invariant_only_p
> > +             && (code == EQ_EXPR || code == NE_EXPR)
> > +             && TREE_CODE (op1) == INTEGER_CST
> > +             && !integer_zerop (op1)
> > +             && def_code == PLUS_EXPR
> > +             && TREE_CODE (gimple_assign_rhs2 (def_stmt)) == INTEGER_CST)
> > +           {
> > +             tree rhs2 = gimple_assign_rhs2 (def_stmt);
> > +             /* op1 and rhs2 may have different types due to implicit
> > +                promotions; int_const_binop handles this by converting
> > +                rhs2 to op1's precision.  We only check integer_zerop
> > +                on the result, which is type-insensitive.  */
> > +             tree folded_cst = int_const_binop (MINUS_EXPR, op1, rhs2);
> > +             if (folded_cst && integer_zerop (folded_cst))
> > +               invariant_only_p = false;
> > +           }
> > +
> >           tmp = combine_cond_expr_cond (stmt, code, type,
> >                                         rhs0, op1, invariant_only_p);
> >           if (tmp)
> > --
> > 2.34.1
> >
>
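
For anyone following the thread, the identity the quoted patch relies on is that, for unsigned (wrapping) arithmetic, (x + C == C) holds exactly when (x == 0). The sketch below is only an illustration of that identity on a few boundary values, not part of the patch or of GCC; the function names cmp_after_add, cmp_before_add and forms_agree are made up for this example, and the signed case (where overflow is undefined in C) is deliberately left out.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Comparison as written in the source: the value after the addition
   is compared against the addend, as in (*a += c) == c.  Unsigned
   arithmetic wraps modulo 2^N, so this is well defined for all x.  */
bool cmp_after_add (unsigned int x, unsigned int c)
{
  return x + c == c;
}

/* Folded form: the value before the addition is compared against zero.  */
bool cmp_before_add (unsigned int x)
{
  return x == 0;
}

/* Return true if both forms agree for addend c on a handful of
   boundary values of x, including values where x + c wraps around.  */
bool forms_agree (unsigned int c)
{
  const unsigned int samples[] = {
    0u, 1u, 2u, 0u - c, UINT_MAX - c, UINT_MAX - 1u, UINT_MAX
  };

  for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (cmp_after_add (samples[i], c) != cmp_before_add (samples[i]))
      return false;
  return true;
}
```

Calling forms_agree with the addends used in the quoted test (1 and 3) exercises the same cases as f1 and f3; it checks only sampled values, so it is a sanity check rather than a proof.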
