On 5/11/2026 10:10 AM, Richard Biener wrote:
On Wed, Apr 22, 2026 at 5:12 PM Daniel Henrique Barboza
<[email protected]> wrote:

Ping

On 3/20/2026 2:02 PM, Daniel Henrique Barboza wrote:
From: Daniel Barboza <[email protected]>

PR101179 requires a combination of patterns that transforms the code to
better utilize other existing patterns (mostly 122608).

First, if we have a pattern that uses a zero_one that results in two
constants, and the constants are then used in a subsequent OP, we want
the constants to be explicit.  The reason is that 122608 has code that
will then distribute the OP along the constants. Some frontends already
do that, but not always.  E.g: given this C code:

   int test(int y) { return y % (4 << ((y % 100 == 0) * 2)); }

The C frontend will convert it to:

   ;; enabled by -tree-original
   { return yD.4597 % (yD.4597 % 100 == 0 ? 16 : 4); }

Which is what we want.  But this next C code will not be converted by the
frontend.  I haven't dig into it but my educated guess is that the extra
cast before the zero_one val is the deal breaker:

   _Bool x = y % 100 == 0;
   return y % (4 << (x * 2));

   ;; enabled by -tree-original
   {
     _BoolD.4579 xD.4596 = yD.4593 % 100 == 0;
     return yD.4593 % (4 << (intD.7) xD.4596 * 2);
   }

I haven't dig into it but my educated guess is that the extra cast
before the zero_one val is not helping here.  But note that there's no
guarantee that all frontends will behave this way, and there's no
guarantee that we might introduce the pattern ourselves for other
unrelated optimizations that will affect these.  To be certain we're
introducing some of the most common patterns that generates constants,
ignoring cases where we want the pattern to be left untouched (e.g.
WIDEN_MULT_PLUS_EXPR).

Second: in a case where we have a comparison with the same RHS in two
legs of a conditional, for example:

     bb3:
     _14 = y_9(D) & 15;
     _3 = _14 == 0;
     goto <bb 5>;

     bb4:
     _13 = y_9(D) & 3;
     _6 = _13 == 0;

     bb5:
     # _5 = PHI <_6(4), _3(3)>
     _7 = (intD.7) _5;

We want to move the comparison == 0 to the merge block, allowing other
optimizations that operate in single statement blocks to take place.
This gimple can be produced by patterns such as:

    return x ? (y % 16) == 0 : (y % 4) == 0;

I don't think we want to handle this via match.pd patterns.  In PHI-OPT
we for example have code to hoist/sink conversions, in sinking there's
code to sink common stores.

So I'd rather view this as opportunity to enhance
factor_out_conditional_operation
from phiopt.

The other cases are an opportunity for the reverse transform, PRE handles
some of them, phiprop as well.  I think that

   <bb 2> :
   if (x_4(D) != 0)
     goto <bb 4>; [INV]
   else
     goto <bb 3>; [INV]

   <bb 3> :

   <bb 4> :
   # iftmp.0_3 = PHI <16(2), 4(3)>
   _1 = y_7(D) % iftmp.0_3;

would fit a new PHI-OPT kind similar to factor_out_conditional_operation,
but handling the case where either the resulting operations are a lot
cheaper or on one arm the constant is the neutral element (PRE handles
that case).

Understood.  I'll start doing the phiopt part since I'm more familiar with that
pass, then I'll see what I can do in PRE.  I'll be sure to cry for help when
needed.


Thanks,
Daniel



And finally, w.r.t mod operations, we want to always convert them to
something lighter (like a lshift or bit_and).  For zero comparisons with
a integer_pow2 operand we want to do a bit_and instead, and the frontend
will convert it for us regardless of SSA_NAME sign. But the C frontend
won't convert a mod operand wrapped into a if, e.g. this C code:

    return y % (x ? 16 : 4) == 0;

We're adding a pattern that handles this case, turning this into:

    return ((xD.4590 != 0 ? 15 : 3) & yD.4589) == 0;

With all these patterns we will optimize and standardize all code
examples in PR101179.

Bootstrapped on x86, aarch64 and rv64.
Regression tested on x86 and aarch64.

       PR tree-optimization/101179

gcc/ChangeLog:

       * match.pd(`A OP ((CST1 shift (zero_one * CST2)))`): New
       pattern.
       (`A OP ((CST1 shift (zero_one*CST2)) + CST3)`): New pattern.
       (`A OP ((zero_one shift CST1) + CST2)`): New pattern.
       (`A OP (CST2 - (zero_one lshift CST1))`): New pattern.
       (`if cond (cmp (OP A B) C) (cmp (OP A D) C)`): New pattern.
       (`A % (cond X pow2A pow2B) EQ|NE 0`)`: New pattern.

gcc/testsuite/ChangeLog:

       * gcc.dg/tree-ssa/pr101179-2.c: New test.
       * gcc.dg/tree-ssa/pr101179-3.c: New test.
       * gcc.dg/tree-ssa/pr101179.c: New test.
---
   gcc/match.pd                               | 133 ++++++++++++++++++++-
   gcc/testsuite/gcc.dg/tree-ssa/pr101179-2.c |  36 ++++++
   gcc/testsuite/gcc.dg/tree-ssa/pr101179-3.c |  11 ++
   gcc/testsuite/gcc.dg/tree-ssa/pr101179.c   |  95 +++++++++++++++
   4 files changed, 274 insertions(+), 1 deletion(-)
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr101179-2.c
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr101179-3.c
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr101179.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 7f16fd4e081..885e4413e08 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -8642,6 +8642,137 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
      replace if (x == 0) with tem = ~x; if (tem != 0) which is
      clearly less optimal and which we'll transform again in forwprop.  */

+/* PR101179: decompose the following zero_one patterns than creates
+   two CSTs into a cond based format.  E.g:
+
+   A OP (4 << (zero_one * 2)) ->  A OP (zero_one ? 16 : 4)
+
+   This will give optimizations like PR122608 the opportunity to
+   turn this further into:
+
+   "zero_one ? A OP CST1 : A OP CST2".  */
+(for op (plus minus mult bit_and bit_ior bit_xor lshift rshift
+      rdiv trunc_div ceil_div floor_div round_div exact_div
+      trunc_mod ceil_mod floor_mod round_mod)
+ (for shift (lshift rshift)
+
+  /* A OP ((CST1 shift (zero_one * CST2))) ->
+     A OP (zero_one ? CST1 shift CST2 : CST1) */
+  (simplify
+   (op @0 (shift INTEGER_CST@1
+             (mult:c (convert? zero_one_valued_p@2) INTEGER_CST@3)))
+   (op @0 (cond @2 (shift @1 @3) @1)))
+  (simplify
+   (op (shift INTEGER_CST@1
+             (mult:c (convert? zero_one_valued_p@2) INTEGER_CST@3))
+     @0)
+   (op (cond @2 (shift @1 @3) @1) @0))
+
+  /* A OP ((CST1 shift (zero_one*CST2)) + CST3) ->
+     A OP (zero_one ? (CST1 shift CST2) + CST3 : CST1 + CST3) */
+  (simplify
+   (op @0 (plus:c
+        (shift INTEGER_CST@1
+                (mult:c (convert? zero_one_valued_p@2) INTEGER_CST@3))
+        INTEGER_CST@4))
+   (op @0 (cond @2 (plus (shift @1 @3) @4) (plus @1 @4))))
+  (simplify
+   (op (plus:c
+      (shift INTEGER_CST@1
+             (mult:c (convert? zero_one_valued_p@2) INTEGER_CST@3))
+      INTEGER_CST@4)
+       @0)
+   (op (cond @2 (plus (shift @1 @3) @4) (plus @1 @4)) @0))
+
+  /* A OP ((zero_one shift CST1) + CST2) ->
+     A OP (zero_one ? (1 shift CST1) + CST2 : CST2) */
+  (simplify
+   (op @0 (plus:c (shift zero_one_valued_p@2 INTEGER_CST@1) INTEGER_CST@3))
+   (op @0 (cond @2 (plus (shift { build_one_cst (type); } @1) @3)
+                 @3)))
+  (simplify
+   (op (plus:c (shift zero_one_valued_p@2 INTEGER_CST@1) INTEGER_CST@3) @0)
+   (op (cond @2 (plus (shift { build_one_cst (type); } @1) @3) @3)
+       @0))
+
+  /* A OP (CST2 - (zero_one lshift CST1)) ->
+     A OP (zero_one ? CST2 - (1 lshift CST1) : CST2) */
+  (simplify
+   (op @0 (minus INTEGER_CST@3
+             (lshift zero_one_valued_p@2 INTEGER_CST@1)))
+   (op @0 (cond @2 (minus @3 (lshift { build_one_cst (type); } @1)) @3)))
+  (simplify
+   (op (minus INTEGER_CST@3
+             (lshift zero_one_valued_p@2 INTEGER_CST@1)) @0)
+   (op (cond @2 (minus @3 (lshift { build_one_cst (type); } @1)) @3) @0))
+))
+
+/* PR101179: if cond (cmp (OP A B) C) (cmp (OP A D) C)
+
+   The idea here is to detect patterns like this C code:
+
+   (x ? (y%16) == 0 : (y%4) == 0)
+
+   That will generate phiopt dumps such as:
+
+   bb3:
+   _14 = y_9 & 15;
+   _3 = _14 == 0;
+   goto <bb 5>;
+
+   bb4:
+   _13 = y_9 & 3;
+   _6 = _13 == 0;
+
+   bb5:
+   # _5 = PHI <_6 (4), _3 (3)>
+   _7 = (intD.7) _5;
+
+   We want to turn that into a pattern more friendly to other
+   optimizations (like PR122608) by moving the comparison to
+   the merge block:
+
+      bb3:
+   _14 = y_9 & 15;
+   goto <bb 5>;
+
+   bb4:
+   _13 = y_9 & 3;
+
+   bb5:
+   # _5 = PHI <_6 (4), _3 (3)>
+   _6 = _5 == 0;
+   _7 = (intD.7) _5;
+
+   Note that several opcodes (plus/minus, div, min/max) do
+   not generate this pattern due to their own specific
+   comparison optimizations.  */
+(for op (mult bit_and bit_ior bit_xor lshift rshift
+      trunc_mod ceil_mod floor_mod round_mod)
+ (for cmp (simple_comparison)
+  (simplify
+   (cond @0 (cmp (op @1 @2) @4) (cmp (op @1 @3) @4))
+   (cmp (op @1 (cond @0 @2 @3)) @4)))
+)
+
+/* PR101179: A % (cond X pow2A pow2B) EQ|NE 0 ->
+          A & (cond X pow2A-1 pow2B-1) EQ|NE 0
+
+   Turning a comparison between a mod with a pow2 operand and zero
+   into a bit_and is usually done by the frontend.  However the
+   frontend (from C, at least) is not able to do this conversion
+   if the mod is using the result of a cond with two pow2 operands.  */
+(for op (trunc_mod ceil_mod floor_mod round_mod)
+ (for cmp (eq ne)
+  (simplify
+   (cmp (op @0 (cond@4 @3 integer_pow2p@1 integer_pow2p@2)) integer_zerop@5)
+    (if (INTEGRAL_TYPE_P (type)
+      && INTEGRAL_TYPE_P (TREE_TYPE (@0))
+      && tree_int_cst_sgn (@1) > 0
+      && tree_int_cst_sgn (@2) > 0)
+     (cmp (bit_and @0 (minus @4 { build_one_cst (TREE_TYPE (@4)); }))
+                   { build_zero_cst (TREE_TYPE (@0)); })))))
+
   #if GIMPLE
   /* From fold_binary_op_with_conditional_arg handle the case of
      rewriting (a ? b : c) > d to a ? (b > d) : (c > d) when the
@@ -8658,7 +8789,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
          || !operation_could_trap_p (cmp, true, false, @3))
      (cond @0 (cmp!:type @1 @3) (cmp!:type @2 @3)))))

-/* Similar to above:
+/* PR 122608 - Similar to above:
      (c ? a : b) op d  ->  c ? (a op d) : (b op d)
      But with non-vector binary ops, most of them non-commutative.  */
   (for op (plus minus mult bit_and bit_ior bit_xor
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr101179-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr101179-2.c
new file mode 100644
index 00000000000..f8e9be2d404
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr101179-2.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-phiopt1" } */
+
+/* Tests for the pattern
+   if cond X (cmp (OP A B) C) (cmp (OP A D) C) ->
+   (cmp (op A (cond X B D) C))).  */
+
+#define F(OP,NAME) \
+ int NAME##_test (int y, _Bool x) { \
+   return x ? (y OP 16) == 0 : (y OP 4) == 0; \
+}
+
+F (%, mod)
+F (<<, lshift)
+F (>>, rshift)
+
+int bit_and_test (int y, _Bool x) {
+   return x ? (y & 63) != 1 : (y & 16) != 1;
+}
+
+int bit_ior_test (int y, _Bool x) {
+   return x ? (y | 3) < 0xF : (y | 8) < 0XF;
+}
+
+int bit_xor_test (int y, _Bool x) {
+  return x ? (y ^ 3) <= 13 : (y ^ 8) <= 13;
+}
+
+int mult_test (int y, _Bool x) {
+   return x ? (y * 3) > 0xF : (y * 2) > 0XF;
+}
+/* { dg-final { scan-tree-dump-times " == 0" 3 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times " != 1" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times " <= 14" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times " <= 13" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times " > 15" 1 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr101179-3.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr101179-3.c
new file mode 100644
index 00000000000..9ca0a6a03e4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr101179-3.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-phiopt1" } */
+
+int test (int y, _Bool x) {
+  return y % (x ? 16 : 4) == 0;
+}
+
+/* { dg-final { scan-tree-dump-times " PHI <15" 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times " PHI <16" 0 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times " & " 1 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times " % " 0 "phiopt1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr101179.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr101179.c
new file mode 100644
index 00000000000..f448e7a73fa
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr101179.c
@@ -0,0 +1,95 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-phiopt1" } */
+
+#define F(OP,NAME) \
+ int NAME##_test (int y, _Bool x) { \
+   return y OP (4 << (x * 3)); \
+}
+
+#define F2(OP,NAME) \
+ int NAME##_test2 (int y, _Bool x) { \
+   return y OP ((4 << (x * 3)) + 1); \
+}
+
+#define F3(OP,NAME) \
+ int NAME##_test3 (int y, _Bool x) { \
+   return y OP (32 >> (x * 3)); \
+}
+
+#define F4(OP,NAME) \
+ int NAME##_test4 (int y, _Bool x) { \
+   return y OP ((32 >> (x * 3)) + 1); \
+}
+
+#define F5(OP,NAME) \
+ int NAME##_test5 (int y, _Bool x) { \
+   return y OP ((x << 2) + 3); \
+}
+
+#define F6(OP,NAME) \
+ int NAME##_test6 (int y, _Bool x) { \
+   return y OP (7 - (x << 2)); \
+}
+
+/* Same patterns from above but switching operands */
+#define F_alt(OP,NAME) \
+ int NAME##_test_alt (int y, _Bool x) { \
+   return (4 << (x * 3)) OP y; \
+}
+
+#define F_alt2(OP,NAME) \
+ int NAME##_test_alt2 (int y, _Bool x) { \
+   return ((4 << (x * 3)) + 1) OP y; \
+}
+
+#define F_alt3(OP,NAME) \
+ int NAME##_test_alt3 (int y, _Bool x) { \
+   return (32 >> (x * 3)) OP y; \
+}
+
+#define F_alt4(OP,NAME) \
+ int NAME##_test_alt4 (int y, _Bool x) { \
+   return ((32 >> (x * 3)) + 1) OP y; \
+}
+
+#define F_alt5(OP,NAME) \
+ int NAME##_test_alt5 (int y, _Bool x) { \
+   return ((x << 2) + 3) OP y; \
+}
+
+#define F_alt6(OP,NAME) \
+ int NAME##_test_alt6 (int y, _Bool x) { \
+   return (7 - (x << 2)) OP y; \
+}
+
+#define T(OP,NAME) \
+  F (OP, NAME) \
+  F2 (OP, NAME) \
+  F3 (OP, NAME) \
+  F4 (OP, NAME) \
+  F5 (OP, NAME) \
+  F6 (OP, NAME) \
+  F_alt (OP, NAME) \
+  F_alt2 (OP, NAME) \
+  F_alt3 (OP, NAME) \
+  F_alt4 (OP, NAME) \
+  F_alt5 (OP, NAME) \
+  F_alt6 (OP, NAME)
+
+T (+, plus)
+T (-, minus)
+T (*, mult)
+T (|, bit_and)
+T (|, bit_ior)
+T (^, bit_xor)
+T (/, div)
+T (%, mod)
+T (<<, lshift)
+T (>>, rshift)
+
+/* { dg-final { scan-tree-dump-times "PHI <32" 20 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "PHI <33" 20 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "PHI <4" 20 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "PHI <5" 20 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "PHI <7" 20 "phiopt1" } } */
+/* { dg-final { scan-tree-dump-times "PHI <3\\(" 20 "phiopt1" } } */


Reply via email to