On 5/11/2026 8:05 AM, Richard Biener wrote:
On Sat, Mar 21, 2026 at 8:41 PM Daniel Henrique Barboza
<[email protected]> wrote:

Hi,

After doing more tests I'll ask to leave this patch aside for now.  It
is regressing some aarch64 and x86 cases, which is not ideal for a gimple
optimization that should make code better (or at least not worse) for
all possible targets.  As it is, only RISC-V gains from it.

I'll note the first half,

+(for op (lshift rshift bit_and mult)
+ (simplify
+  (cond (eq @0 integer_zerop) (op @0 @1) @0)
+  @0)
+ (simplify
+  (cond (ne @0 integer_zerop) @0 (op @0 @1))
+  @0))

looks profitable in general, no?

It is, but it is already being done.

This pattern was needed only because the existing transformation happens
after the pattern from 56110.  In hindsight, a more suitable solution would
be to delay the 56110 pattern to avoid tripping up this existing
transformation, I think.


Thanks,
Daniel



On 3/20/2026 1:44 PM, Daniel Henrique Barboza wrote:
From: Daniel Barboza <[email protected]>

Remove if mispredicts for bit_ior, lshift and rshift ops that follow
the pattern:

if (cmp) SSA_NAME OP CST1 else SSA_NAME

By executing the OP every time, using the zero_one pattern 'cmp' with
a 'mult' to re-create CST1:

IMM = cmp * CST1
SSA_NAME OP IMM

This works as long as 'OP' is an operation that results in SSA_NAME if
IMM == 0.
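
As a rough illustration of the rewrite described above, here is a minimal
C sketch (not part of the patch) that compares the branchy source form with
the "IMM = cmp * CST1" form for the three supported ops.  The function
names, the imm_from_cmp and check_branchless_equivalence helpers, the
constant 2 and the sampled inputs are all made up for this example; the real
transformation is done on GIMPLE by match.pd.

```c
#include <assert.h>

/* Branchy forms: if (cmp) x OP CST1 else x, with CST1 == 2.  */
static unsigned shr_branchy (unsigned c, unsigned x)
{
  if (c & 1)
    x >>= 2;
  return x;
}

static unsigned shl_branchy (unsigned c, unsigned x)
{
  if (c & 1)
    x <<= 2;
  return x;
}

static unsigned ior_branchy (unsigned c, unsigned x)
{
  if (c & 1)
    x |= 2;
  return x;
}

/* IMM = cmp * CST1.  The comparison result (0 or 1) stands in for the
   zero_one 'cmp' value, so IMM is either 0 or CST1.  */
static unsigned imm_from_cmp (unsigned c)
{
  return (unsigned) ((c & 1) != 0) * 2u;
}

/* Branchless forms: x OP IMM, executed unconditionally.  */
static unsigned shr_branchless (unsigned c, unsigned x)
{
  return x >> imm_from_cmp (c);
}

static unsigned shl_branchless (unsigned c, unsigned x)
{
  return x << imm_from_cmp (c);
}

static unsigned ior_branchless (unsigned c, unsigned x)
{
  return x | imm_from_cmp (c);
}

/* Hypothetical self-check: both forms agree on a sample of inputs.  */
static void check_branchless_equivalence (void)
{
  for (unsigned c = 0; c < 4; c++)
    for (unsigned x = 0; x < 4096; x += 7)
      {
        assert (shr_branchy (c, x) == shr_branchless (c, x));
        assert (shl_branchy (c, x) == shl_branchless (c, x));
        assert (ior_branchy (c, x) == ior_branchless (c, x));
      }
}
```

Calling check_branchless_equivalence () from a main of your own should pass
silently, since x >> 0, x << 0 and x | 0 all leave x unchanged.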

A helper pattern was added to simplify the following related case:

if (SSA_NAME == 0) SSA_NAME OP CST1 else SSA_NAME

If OP happens to be an operation that matches the same criteria from
above, this whole pattern can be reduced to 'SSA_NAME'.  Otherwise our main
pattern will overcomplicate it needlessly and we'll have VRP regressions.
This was detected by pr103281-1.c.
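
To make the helper case concrete, the following C sketch (again not part of
the patch) mirrors the shape used in pr56110-3.c for the four ops listed in
the helper's "for op" list.  It relies on the observation that 0 OP CST1 is
0 for lshift, rshift, bit_and and mult, so the conditional can only ever
produce the original value.  The check_helper_identity function and the
bit_ior counterexample are additions made for illustration only.

```c
#include <assert.h>

/* if (m == 0) m = m OP 2; return m;  with OP from the helper's list.  */
static int eqzero_lshift (int m)
{
  if (m == 0)
    m = m << 2;
  return m;
}

static int eqzero_rshift (int m)
{
  if (m == 0)
    m = m >> 2;
  return m;
}

static int eqzero_bit_and (int m)
{
  if (m == 0)
    m = m & 2;
  return m;
}

static int eqzero_mult (int m)
{
  if (m == 0)
    m = m * 2;
  return m;
}

/* Counterexample: 0 | 2 is 2, so this one does not reduce to 'm'.  */
static int eqzero_bit_ior (int m)
{
  if (m == 0)
    m = m | 2;
  return m;
}

/* Hypothetical self-check: the four helper ops behave as the identity.  */
static void check_helper_identity (void)
{
  for (int m = -8; m <= 8; m++)
    {
      assert (eqzero_lshift (m) == m);
      assert (eqzero_rshift (m) == m);
      assert (eqzero_bit_and (m) == m);
      assert (eqzero_mult (m) == m);
    }
  assert (eqzero_bit_ior (0) == 2);
}
```

The OP is only ever evaluated with m == 0 here, which is why the whole
conditional can fold to 'm' for these ops but not for bit_ior.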

As for OPs supported, we do not support XOR as a valid OP for this
transformation because an XOR in the format we're handling here happens
to match a CRC pattern (see gimple-crc-optimization.cc and crc-10.c test
file).  We do not support PLUS at this point because it will break a lot
of scanner tests - something to go after in a follow-up.
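
For reference, the exclusions above are not about the "OP IMM yields
SSA_NAME when IMM == 0" criterion itself.  The small C sketch below (an
illustration only, with made-up function names and a hypothetical
check_identity_criterion helper) shows that XOR and PLUS do satisfy that
criterion, while an op such as bit_and does not, since x & 0 is 0.

```c
#include <assert.h>

/* Branchy forms: if (cmp) x OP 2 else x.  */
static unsigned xor_branchy (unsigned c, unsigned x)
{
  if (c & 1)
    x ^= 2;
  return x;
}

static unsigned plus_branchy (unsigned c, unsigned x)
{
  if (c & 1)
    x += 2;
  return x;
}

static unsigned and_branchy (unsigned c, unsigned x)
{
  if (c & 1)
    x &= 2;
  return x;
}

/* Branchless forms: IMM = cmp * 2, then x OP IMM.  */
static unsigned xor_branchless (unsigned c, unsigned x)
{
  return x ^ ((c & 1) * 2u);
}

static unsigned plus_branchless (unsigned c, unsigned x)
{
  return x + ((c & 1) * 2u);
}

static unsigned and_branchless (unsigned c, unsigned x)
{
  return x & ((c & 1) * 2u);
}

/* Hypothetical self-check of the identity criterion.  */
static void check_identity_criterion (void)
{
  for (unsigned c = 0; c < 4; c++)
    for (unsigned x = 0; x < 256; x++)
      {
        /* x ^ 0 == x and x + 0 == x, so these forms agree.  */
        assert (xor_branchy (c, x) == xor_branchless (c, x));
        assert (plus_branchy (c, x) == plus_branchless (c, x));
      }

  /* x & 0 == 0, so the rewrite would be wrong when cmp is false.  */
  assert (and_branchy (0, 3) == 3);
  assert (and_branchless (0, 3) == 0);
}
```

So XOR and PLUS are left out for the CRC and test-scanning reasons given
above, not because the rewrite would compute a different value.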

Two existing tests were changed as a result of this optimization.

Bootstrapped on x86, aarch64 and rv64.
Regression tested on x86 and aarch64.

       PR tree-optimization/56110

gcc/ChangeLog:

       * match.pd (`if A == 0 A OP CST1 else A`): New pattern.
       (`if A != 0 A else A OP CST1`): New pattern.
       (`if (cmp) SSA_NAME OP CST1 else SSA_NAME`): New pattern.

gcc/testsuite/ChangeLog:

       * gcc.dg/tree-ssa/pr107195-3.c: The code in 'foo3' is now being
       optimized with -O2 after these changes.  Other functions in this
       test file weren't affected.
       * gcc.target/aarch64/sve/cond_shift_1.c: Add a PLUS operand in the
       template to avoid the 56110 pattern being applied, allowing the
       cond_shifts to occur as expected by the test.
       * gcc.dg/tree-ssa/pr56110-2.c: New test.
       * gcc.dg/tree-ssa/pr56110-3.c: New test.
       * gcc.dg/tree-ssa/pr56110.c: New test.
---
   gcc/match.pd                                  | 33 ++++++++++++
   gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c    |  2 +-
   gcc/testsuite/gcc.dg/tree-ssa/pr56110-2.c     | 51 +++++++++++++++++++
   gcc/testsuite/gcc.dg/tree-ssa/pr56110-3.c     | 34 +++++++++++++
   gcc/testsuite/gcc.dg/tree-ssa/pr56110.c       | 27 ++++++++++
   .../gcc.target/aarch64/sve/cond_shift_1.c     |  3 +-
   6 files changed, 147 insertions(+), 3 deletions(-)
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr56110-2.c
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr56110-3.c
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr56110.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 7f16fd4e081..a4aaf705780 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6685,6 +6685,39 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
         && INTEGRAL_TYPE_P (TREE_TYPE (@0)))
     (cond @1 (convert @2) (convert @3))))

+/* PR56110: helper pattern to simplify this trivial case
+   that the main pattern below can overcomplicate, resulting
+   in VRP having problems optimizing away unneeded function
+   calls (see pr103281-1.c).
+
+   In theory we only need to handle @0==0 and shifts
+   but let's also handle mult, bit_and and the @0!=0
+   case since we're at it.  */
+(for op (lshift rshift bit_and mult)
+ (simplify
+  (cond (eq @0 integer_zerop) (op @0 @1) @0)
+  @0)
+ (simplify
+  (cond (ne @0 integer_zerop) @0 (op @0 @1))
+  @0))
+
+/* PR56110: if (cond) "A OP CST1" else "A" -> make OP
+   unconditional by using the cond bool value to re-create
+   CST1 via cond*CST1.  This works as long as OP is an
+   operation that returns "A" when its second operand is zero.
+
+   We're deliberately not handling bit_xor because the XOR
+   pattern is used in CRC detection.  */
+(for cmp (simple_comparison)
+ (for op (bit_ior lshift rshift)
+  (simplify
+   (cond (cmp@2 @3 @4) (op @0 INTEGER_CST@1) @0)
+    (if (INTEGRAL_TYPE_P (type)
+      && INTEGRAL_TYPE_P (TREE_TYPE (@0))
+      && TYPE_PRECISION (type) <= BITS_PER_WORD
+      && (TYPE_UNSIGNED (TREE_TYPE (@1)) || tree_int_cst_sgn (@1) > 0))
+     (op @0 (mult (convert:type @2) (convert:type @1)))))))
+
   /* Simplification moved from fold_cond_expr_with_comparison.  It may also
      be extended.  */
   /* This pattern implements two kinds simplification:
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
index eba4218b3c9..c4b1b800b16 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr107195-3.c
@@ -1,6 +1,6 @@
   /* Inspired by 'libgomp.oacc-c-c++-common/nvptx-sese-1.c'.  */

-/* { dg-additional-options -O1 } */
+/* { dg-additional-options -O2 } */
   /* { dg-additional-options -fdump-tree-dom3-raw } */


diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr56110-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr56110-2.c
new file mode 100644
index 00000000000..d3603c18bd3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr56110-2.c
@@ -0,0 +1,51 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+/* Macro adapted from builtin-object-size-common.h  */
+#define FAIL() \
+  do { \
+    __builtin_printf ("Failure at line: %d\n", __LINE__);    \
+    abort();                                                 \
+  } while (0)
+
+void abort(void);
+
+unsigned f1 (unsigned x, unsigned m, unsigned n)
+{
+  if (x & 1)
+    m >>= 2;
+  return m + n;
+}
+
+unsigned f2 (unsigned x, unsigned m, unsigned n)
+{
+  if (x & 1)
+    m <<= 2;
+  return m + n;
+}
+
+unsigned f3 (unsigned x, unsigned m, unsigned n)
+{
+  if (x & 1)
+    m |= 2;
+  return m + n;
+}
+
+int main (void) {
+  if (f1 (0, 4, 1) != 5)
+    FAIL ();
+  if (f1 (1, 4, 1) != 2)
+    FAIL ();
+
+  if (f2 (0, 2, 1) != 3)
+    FAIL ();
+  if (f2 (1, 2, 1) != 9)
+    FAIL ();
+
+  if (f3 (0, 4, 1) != 5)
+    FAIL ();
+  if (f3 (1, 4, 1) != 7)
+    FAIL ();
+
+  return 0;
+}
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr56110-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr56110-3.c
new file mode 100644
index 00000000000..6530dc2f5a5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr56110-3.c
@@ -0,0 +1,34 @@
+/* { dg-additional-options -O2 } */
+/* { dg-additional-options -fdump-tree-phiopt3 } */
+
+#define EQ_ZERO(opname, OP)          \
+__attribute__((noinline,noclone))    \
+int eqzero_##opname(int m) {         \
+  if (m == 0)                                \
+    m = m OP 2;                              \
+  return m;                          \
+}
+
+#define NE_ZERO(opname, OP)          \
+__attribute__((noinline,noclone))    \
+int nezero_##opname(int m) {         \
+  if (m != 0)                                \
+    return m;                                \
+  else                                       \
+    m = m OP 2;                      \
+  return m;                          \
+}
+
+EQ_ZERO(lshift, <<)
+EQ_ZERO(rshift, >>)
+EQ_ZERO(bit_and, &)
+EQ_ZERO(mult, *)
+
+NE_ZERO(lshift, <<)
+NE_ZERO(rshift, >>)
+NE_ZERO(bit_and, &)
+NE_ZERO(mult, *)
+
+/* { dg-final { scan-tree-dump-times "PHI" 0 phiopt3 } } */
+/* { dg-final { scan-tree-dump-times " == " 0 phiopt3 } } */
+/* { dg-final { scan-tree-dump-times " != " 0 phiopt3 } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr56110.c b/gcc/testsuite/gcc.dg/tree-ssa/pr56110.c
new file mode 100644
index 00000000000..b8134f9116f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr56110.c
@@ -0,0 +1,27 @@
+/* { dg-additional-options -O2 } */
+/* { dg-additional-options -fdump-tree-phiopt3 } */
+
+unsigned f1 (unsigned x, unsigned m)
+{
+    if (m & 0x008080)
+        x >>= 8;
+
+    return x;
+}
+
+unsigned f2 (unsigned x, unsigned m)
+{
+    if (m & 0x008080)
+        x <<= 8;
+
+    return x;
+}
+
+unsigned f3 (unsigned x, unsigned m)
+{
+    if (m & 0x008080)
+        x |= 8;
+
+    return x;
+}
+/* { dg-final { scan-tree-dump-times "PHI" 0 phiopt3 } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cond_shift_1.c b/gcc/testsuite/gcc.target/aarch64/sve/cond_shift_1.c
index f2c51b291b2..15d3ef9b4af 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/cond_shift_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cond_shift_1.c
@@ -9,7 +9,7 @@
                       TYPE *__restrict b, int n)                      \
     {                                                                 \
       for (int i = 0; i < n; ++i)                                             \
-      r[i] = a[i] > 20 ? b[i] OP 3 : b[i];                           \
+      r[i] = a[i] > 20 ? b[i] OP 3 : b[i] + 1;                               \
     }

   #define TEST_TYPE(T, TYPE) \
@@ -44,5 +44,4 @@ TEST_ALL (DEF_LOOP)
   /* { dg-final { scan-assembler-times {\tlsr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */

   /* { dg-final { scan-assembler-not {\tmov\tz[^,]*z} } } */
-/* { dg-final { scan-assembler-not {\tmovprfx\t} } } */
   /* { dg-final { scan-assembler-not {\tsel\t} } } */

