Thanks for the heads-up. This one popped up in our "spring cleaning" of old GCC tickets (some RISC-V related). We'll take a look.
On Thu, 12 Mar 2026 at 20:57, Andrew Pinski <[email protected]> wrote:
> On Thu, Mar 12, 2026 at 10:33 AM Philipp Tomsich
> <[email protected]> wrote:
> >
> > In forward_propagate_into_comparison_1, the invariant_only_p
> > restriction prevents folding comparisons when the defining SSA value
> > has multiple uses and the folded result is not a constant. This
> > blocks the simplification of patterns like (++*a == 1) into (*a == 0),
> > where comparing the pre-increment value against zero is cheaper on
> > most targets (e.g., beqz on RISC-V, cbz on AArch64).
> >
> > Relax invariant_only_p when the defining statement is a PLUS_EXPR
> > with a constant operand, the comparison is an equality test against a
> > non-zero constant, and the folded constant would be zero. GIMPLE
> > canonicalizes (X - C) to (X + -C), so only PLUS_EXPR needs handling.
> > This ensures we only fold toward zero comparisons, never away from
> > them (e.g., --*a == 0 must not fold to *a == 1).
> >
> > For example, given:
> >   _1 = *a;
> >   _2 = _1 + 1;
> >   *a = _2;
> >   if (_2 == 1)
> >
> > forwprop now produces:
> >   if (_1 == 0)
> >
> > which generates beqz/cbz instead of li+beq/cmp+b.eq.
> >
> > gcc/ChangeLog:
> >
> >         * tree-ssa-forwprop.cc (forward_propagate_into_comparison_1):
> >         Relax invariant_only_p for PLUS_EXPR with constant operand
> >         when the fold produces an equality comparison against zero.
>
> We are trying to remove forward_propagate_into_comparison_1 so it
> would be better if not adding there.
> See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120206 .
>
> The specific issue you are looking into is already recorded as
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120283 (and
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93006 and
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96702) even.
> It is mentioned the single_use there is "overly restrictive". See comment
> #4.
>
> If you want to allow for your specific item here (though it does not
> fully fix PR 120283).
> Change:
> (if (single_use (@3))
> into:
> (if (single_use (@3) || wi::to_wide (res) == 0)
>
> Thanks,
> Andrew
>
>
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c: New test.
> >
> > diff --git a/gcc/testsuite/gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c
> > new file mode 100644
> > index 000000000000..77e74700b9ef
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.dg/tree-ssa/forwprop-pre-incr-cmp.c
> > @@ -0,0 +1,93 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2 -fdump-tree-forwprop2" } */
> > +
> > +/* Verify that forwprop folds (++*a == 1) into (*a == 0), comparing the
> > +   pre-increment value against zero instead of comparing the incremented
> > +   value against 1. Only fold when the result is a comparison against
> > +   zero (which is cheaper on most architectures). */
> > +
> > +void g ();
> > +
> > +/* Unsigned EQ: ++*a == 1 -> *a == 0. */
> > +void f1 (unsigned int *a)
> > +{
> > +  if (++*a == 1)
> > +    g ();
> > +}
> > +
> > +/* Unsigned NE: ++*a != 1 -> *a != 0. */
> > +void f2 (unsigned int *a)
> > +{
> > +  if (++*a != 1)
> > +    g ();
> > +}
> > +
> > +/* Unsigned EQ with addend > 1: (*a += 3) == 3 -> *a == 0. */
> > +void f3 (unsigned int *a)
> > +{
> > +  if ((*a += 3) == 3)
> > +    g ();
> > +}
> > +
> > +/* Unsigned EQ with non-zero result: (*a += 3) == 10 does NOT fold
> > +   (result would be 7, not zero). */
> > +void f4 (unsigned int *a)
> > +{
> > +  if ((*a += 3) == 10)
> > +    g ();
> > +}
> > +
> > +/* Unsigned EQ already comparing against zero: --*a == 0 must NOT
> > +   fold to *a == 1 (regression away from zero). */
> > +void f5 (unsigned int *a)
> > +{
> > +  if (--*a == 0)
> > +    g ();
> > +}
> > +
> > +/* Signed EQ: ++*a == 1 -> *a == 0. */
> > +void f6 (int *a)
> > +{
> > +  if (++*a == 1)
> > +    g ();
> > +}
> > +
> > +/* Signed NE: ++*a != 1 -> *a != 0. */
> > +void f7 (int *a)
> > +{
> > +  if (++*a != 1)
> > +    g ();
> > +}
> > +
> > +/* Signed EQ already comparing against zero: --*a == 0 must NOT
> > +   fold to *a == 1 (regression away from zero). */
> > +void f8 (int *a)
> > +{
> > +  if (--*a == 0)
> > +    g ();
> > +}
> > +
> > +/* Ordering comparison: (++*a > 1) must NOT fold, even though the
> > +   folded constant would be zero -- the relaxation is restricted
> > +   to EQ_EXPR and NE_EXPR. */
> > +void f9 (int *a)
> > +{
> > +  if (++*a > 1)
> > +    g ();
> > +}
> > +
> > +/* Positive: unsigned and signed EQ/NE fold to zero.
> > +   Use scan-tree-dump-times to independently verify that both unsigned
> > +   (f1/f2) and signed (f6/f7) variants fold. */
> > +/* { dg-final { scan-tree-dump-times "Replaced '_\[0-9\]+ == 1' with '_\[0-9\]+ == 0'" 2 "forwprop2" } } */
> > +/* { dg-final { scan-tree-dump-times "Replaced '_\[0-9\]+ != 1' with '_\[0-9\]+ != 0'" 2 "forwprop2" } } */
> > +/* { dg-final { scan-tree-dump-times "Replaced '_\[0-9\]+ == 3' with '_\[0-9\]+ == 0'" 1 "forwprop2" } } */
> > +
> > +/* Negative: non-zero result must not fold. */
> > +/* { dg-final { scan-tree-dump-not "Replaced '_\[0-9\]+ == 10' with '_\[0-9\]+ == 7'" "forwprop2" } } */
> > +
> > +/* Negative: already-zero comparison must not fold away from zero. */
> > +/* { dg-final { scan-tree-dump-not "Replaced '_\[0-9\]+ == 0' with '_\[0-9\]+ ==" "forwprop2" } } */
> > +
> > +/* Negative: ordering comparison must not fold via this path. */
> > +/* { dg-final { scan-tree-dump-not "Replaced '_\[0-9\]+ > 1' with '_\[0-9\]+ > 0'" "forwprop2" } } */
> > diff --git a/gcc/tree-ssa-forwprop.cc b/gcc/tree-ssa-forwprop.cc
> > index b5544414ca6e..0e1637d4782a 100644
> > --- a/gcc/tree-ssa-forwprop.cc
> > +++ b/gcc/tree-ssa-forwprop.cc
> > @@ -467,6 +467,32 @@ forward_propagate_into_comparison_1 (gimple *stmt,
> >                   || TREE_CODE_CLASS (def_code) == tcc_comparison))
> >             invariant_only_p = false;
> >
> > +          /* Allow combining when the defining statement is an addition
> > +             with a constant, and the fold will produce a comparison
> > +             against zero. On most architectures, comparing against
> > +             zero is cheaper than comparing against a non-zero constant.
> > +             Only relax invariant_only_p when the original comparison
> > +             is non-zero and the folded result would be zero -- otherwise
> > +             we could regress by moving a comparison away from zero.
> > +             Note: GIMPLE canonicalizes (X - C) to (X + -C), so only
> > +             PLUS_EXPR needs to be handled here. */
> > +          if (invariant_only_p
> > +              && (code == EQ_EXPR || code == NE_EXPR)
> > +              && TREE_CODE (op1) == INTEGER_CST
> > +              && !integer_zerop (op1)
> > +              && def_code == PLUS_EXPR
> > +              && TREE_CODE (gimple_assign_rhs2 (def_stmt)) == INTEGER_CST)
> > +            {
> > +              tree rhs2 = gimple_assign_rhs2 (def_stmt);
> > +              /* op1 and rhs2 may have different types due to implicit
> > +                 promotions; int_const_binop handles this by converting
> > +                 rhs2 to op1's precision. We only check integer_zerop
> > +                 on the result, which is type-insensitive. */
> > +              tree folded_cst = int_const_binop (MINUS_EXPR, op1, rhs2);
> > +              if (folded_cst && integer_zerop (folded_cst))
> > +                invariant_only_p = false;
> > +            }
> > +
> >            tmp = combine_cond_expr_cond (stmt, code, type,
> >                                          rhs0, op1, invariant_only_p);
> >            if (tmp)
> > --
> > 2.34.1
> >
>
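
For reference, the identity behind the fold can be checked outside the compiler. The following is a minimal sketch in plain C and is not part of the patch; the function names (cmp_after_add, cmp_before_add, count_mismatches) are made up for illustration. It only covers unsigned operands, where x + c wraps modulo 2^N, so (x + c == c) holds exactly when x == 0:

```c
#include <assert.h>
#include <limits.h>

/* Original form: compare the incremented value against the addend c.
   Hypothetical helper, named for illustration only.  */
int cmp_after_add (unsigned int x, unsigned int c)
{
  unsigned int y = x + c;   /* unsigned addition wraps modulo 2^N */
  return y == c;
}

/* Folded form: compare the pre-increment value against zero.  */
int cmp_before_add (unsigned int x)
{
  return x == 0;
}

/* Return how many of the sampled inputs make the two forms disagree.
   The samples include values near the wrap-around boundary.  */
unsigned int count_mismatches (unsigned int c)
{
  const unsigned int samples[] = {
    0u, 1u, 2u, 7u, UINT_MAX - 2u, UINT_MAX - 1u, UINT_MAX
  };
  unsigned int bad = 0;

  for (unsigned int i = 0; i < sizeof samples / sizeof samples[0]; i++)
    if (cmp_after_add (samples[i], c) != cmp_before_add (samples[i]))
      bad++;

  return bad;
}
```

Calling count_mismatches (1) and count_mismatches (3) corresponds to the f1 and f3 cases in the quoted test; both should report no disagreement. The signed cases (f6/f7) are left out here on purpose, since signed overflow is undefined in C.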
