Re: [PATCH v2] match.pd: zero_one != 0 ? CST1:CST2 -> (zero_one*diff)+CST2 [PR71336]

Jeffrey Law Wed, 18 Mar 2026 10:39:41 -0700


On 3/18/2026 7:27 AM, Daniel Henrique Barboza wrote:



On 3/17/2026 5:10 PM, Jeffrey Law wrote:


On 3/9/2026 12:39 PM, Daniel Henrique Barboza wrote:

From: Daniel Barboza <[email protected]>

Identify cases where a zero_one comparison is used for a conditional
constant assignment and turn that into an unconditional PLUS. For the
code in PR71336:

int test(int a) {
      return a & 1 ? 7 : 3;
}

We'll turn that into "(a&1) * (7 - 3) + 3", which yields the same
results but without the conditional, promoting more optimization
opportunities.  On an armv8-a target the original code generates:

tst     x0, 1   // 38   [c=8 l=4]  *anddi3nr_compare0_zextract
mov     w1, 3   // 41   [c=4 l=4]  *movsi_aarch64/3
mov     w0, 7   // 42   [c=4 l=4]  *movsi_aarch64/3
csel    w0, w1, w0, eq  // 17   [c=4 l=4]  *cmovsi_insn/0
ret             // 47   [c=0 l=4]  *do_return

With this transformation:

ubfiz   w0, w0, 2, 1    // 7    [c=4 l=4] *andim_ashiftsi_bfiz
add     w0, w0, 3       // 13   [c=4 l=4]  *addsi3_aarch64/0
ret             // 21   [c=0 l=4]  *do_return

Similar gains are noticeable in RISC-V and x86.

For completeness' sake we're also adding the variant "zero_one == 0".
Both transformations check for type <= word_size to avoid introducing a
wide integer multiplication that the target will have trouble dealing
with.
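As a quick sanity check of the two rewrites described above, here is a small standalone C sketch (not part of the patch). The function names are made up for illustration, and the constants are the 7/3 pair from the PR; the "== 0" variant assumes CST2 is the larger constant, as in the pattern's EQ branch.

```c
#include <assert.h>

/* "zero_one != 0 ? CST1 : CST2" with CST1 = 7, CST2 = 3.  */
static int select_ne(int a) { return (a & 1) != 0 ? 7 : 3; }

/* Rewritten form: (zero_one * diff) + CST2, with diff = 7 - 3.  */
static int arith_ne(int a) { return (a & 1) * (7 - 3) + 3; }

/* "zero_one == 0 ? CST1 : CST2" with CST1 = 3, CST2 = 7.  */
static int select_eq(int a) { return (a & 1) == 0 ? 3 : 7; }

/* Rewritten form: (zero_one * diff) + CST1, with diff = 7 - 3.  */
static int arith_eq(int a) { return (a & 1) * (7 - 3) + 3; }

/* Hypothetical helper: compare both forms over a small input range.  */
static void check_equivalence(void)
{
  for (int a = -8; a <= 8; a++) {
    assert(select_ne(a) == arith_ne(a));
    assert(select_eq(a) == arith_eq(a));
  }
}
```

Calling check_equivalence() from a test driver should pass silently if the arithmetic forms match the conditional ones.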

Bootstrapped and regression tested in x86 and aarch64.

    PR tree-optimization/71336

gcc/ChangeLog:

    * match.pd (`zero_one EQ|NE 0 ? CST1:CST2`): New pattern.

gcc/testsuite/ChangeLog:

    * gcc.dg/tree-ssa/pr71336-2.c: New test.
    * gcc.dg/tree-ssa/pr71336.c: New test.
---

Changes from v1:
- add type <= word_size check to avoid a wide int multiplication, as
    suggested by Richard

- v1 link: https://gcc.gnu.org/pipermail/gcc-patches/2026-March/710125.html


   gcc/match.pd                              | 38 +++++++++++++++
   gcc/testsuite/gcc.dg/tree-ssa/pr71336-2.c | 59 +++++++++++++++++++++++
   gcc/testsuite/gcc.dg/tree-ssa/pr71336.c   | 20 ++++++++
   3 files changed, 117 insertions(+)
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr71336-2.c
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr71336.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 7f16fd4e081..590575ea2e0 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5195,6 +5195,44 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
          && expr_no_side_effects_p (@2))
          (op (mult (convert:type @0) @2) @1))))

+/* PR71336:
+   zero_one != 0 ? CST1 : CST2 -> ((typeof (CST2))zero_one * diff) + CST2,
+   where CST1 > CST2 and diff = CST1 - CST2.
+
+   Includes the "zero_one == 0 ? (...)" variant too.  */
+(for cmp (ne eq)
+ (simplify
+  (cond (cmp zero_one_valued_p@0 integer_zerop) INTEGER_CST@1 INTEGER_CST@2)
+  (with {
+    unsigned HOST_WIDE_INT diff = 0;
+
+    if (tree_int_cst_sgn (@1) > 0 && tree_int_cst_sgn (@2) > 0
+    && tree_fits_uhwi_p (@1) && tree_fits_uhwi_p (@2))
+     {
+    if (cmp == NE_EXPR
+        && wi::gtu_p (wi::to_wide (@1), wi::to_wide (@2)))
+      diff = tree_to_uhwi (@1) - tree_to_uhwi (@2);
+
+    if (cmp == EQ_EXPR
+        && wi::gtu_p (wi::to_wide (@2), wi::to_wide (@1)))
+      diff = tree_to_uhwi (@2) - tree_to_uhwi (@1);
+     }
+   }

+   (if (cmp == NE_EXPR
+    && INTEGRAL_TYPE_P (type)
+    && TYPE_PRECISION (type) <= BITS_PER_WORD
+    && INTEGRAL_TYPE_P (TREE_TYPE (@0))
+    && diff > 0)
+     (plus (mult (convert:type @0) { build_int_cst (type, diff); })
+        @2)
+    (if (cmp == EQ_EXPR
+     && INTEGRAL_TYPE_P (type)
+     && TYPE_PRECISION (type) <= BITS_PER_WORD
+     && INTEGRAL_TYPE_P (TREE_TYPE (@0))
+     && diff > 0)
+      (plus (mult (convert:type @0) { build_int_cst (type, diff); })
+         @1))))))

Note the parallels in how EQ/NE get handled.  Could we meaningfully
simplify the code by creating two new locals within the WITH, holding @1
and @2 initially, then conditionally swapping them if necessary?  That
should (in theory) allow some code de-duplication.
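To make the swap idea concrete, here is a rough plain-C model of it. This is only a sketch of the control flow, not match.pd code: plan_rewrite, struct rewrite and the is_ne flag (standing in for cmp == NE_EXPR) are all hypothetical names, and the constants are modeled as plain unsigned longs rather than trees.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical result record: whether to transform, and with what.  */
struct rewrite {
  bool ok;             /* true if the rewrite applies */
  unsigned long diff;  /* multiplier for zero_one */
  unsigned long base;  /* constant added afterwards */
};

/* Model of "two locals, swapped when cmp is EQ".  hi is the value
   selected when zero_one is 1, lo the value selected when it is 0.  */
static struct rewrite plan_rewrite(bool is_ne, unsigned long cst1,
                                   unsigned long cst2)
{
  unsigned long hi = cst1, lo = cst2;
  if (!is_ne) {
    /* For "zero_one == 0 ? CST1 : CST2" the roles are reversed.  */
    unsigned long tmp = hi;
    hi = lo;
    lo = tmp;
  }

  struct rewrite r = { false, 0, 0 };
  /* Single shared check: both constants positive and hi > lo.  */
  if (lo > 0 && hi > lo) {
    r.ok = true;
    r.diff = hi - lo;
    r.base = lo;
  }
  return r;
}
```

With the swap done up front, the NE and EQ cases share one check and one result expression, which is the de-duplication being asked about.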

I don't know if it's been discussed, but do we want to limit this to cases
where the multiplication is by 2^n and thus implementable via a shift?
Of course that then raises the question of whether we should handle *3,
*5 and *9 specially too, but that's probably getting too close to
catering to specific targets.  There are also BZs around revamping
expansion to do something more sensible with these MULT sequences.
Raphael's patch didn't work the way we wanted, but I think Andrew and I
both think it shows a path forward to steering those MULT operations into
conditional move expanders.  So, yeah, maybe just leave this as-is in the
expectation that we'll adjust the gimple->rtl interface to change how we
generate code for 0/1 * C to conditionally select between 0 and C.
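For the 2^n case mentioned above, the equivalence between the multiply and a shift can be sketched in plain C. This is an illustration only, under the assumption that diff is a power of two; the helper names (shift_amount, via_mult, via_shift) are invented here and do not correspond to anything in the patch.

```c
#include <assert.h>

/* Return log2(diff) if diff is a power of two, otherwise -1.  */
static int shift_amount(unsigned diff)
{
  if (diff == 0 || (diff & (diff - 1)) != 0)
    return -1;
  int n = 0;
  while ((diff >> n) != 1u)
    n++;
  return n;
}

/* The form the v2 pattern builds: (zero_one * diff) + base.  */
static unsigned via_mult(unsigned zero_one, unsigned diff, unsigned base)
{
  return zero_one * diff + base;
}

/* Shift-based equivalent, valid only when diff == 1u << n.  */
static unsigned via_shift(unsigned zero_one, int n, unsigned base)
{
  return (zero_one << n) + base;
}
```

For the PR's constants (diff = 4, base = 3) the shift amount is 2, which lines up with the "ubfiz w0, w0, 2, 1" in the aarch64 output quoted earlier.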


So, about that .... I'm afraid I'll have to change the patterns being
generated here from 'mult' based to 'lshift' based.

Understood.

I think you'll also want to look at the types in the new generic/gimple.
With your patch installed rv64 won't bootstrap due to signed/unsigned
comparison problems during stage2.  Presumably your pattern is matching
during generic and changing types in ways that trigger the signed/unsigned
warnings, but I haven't really dug in.

In file included from ./tm.h:49,
                  from ../../../gcc/gcc/backend.h:28,
                  from ../../../gcc/gcc/adjust-alignment.cc:24:
../../../gcc/gcc/adjust-alignment.cc: In member function 'virtual unsigned int {anonymous}::pass_adjust_alignment::execute(function*)':
../../../gcc/gcc/config/riscv/riscv.h:250:24: error: comparison of integer expressions of different signedness: 'unsigned int' and 'int' [-Werror=sign-compare]
   250 |   (((COND) && ((ALIGN) < BITS_PER_WORD)                                \
       |                        ^
../../../gcc/gcc/config/riscv/riscv.h:276:3: note: in expansion of macro 'RISCV_EXPAND_ALIGNMENT'
   276 |   RISCV_EXPAND_ALIGNMENT (true, TYPE, ALIGN)
       |   ^~~~~~~~~~~~~~~~~~~~~~
../../../gcc/gcc/defaults.h:1142:3: note: in expansion of macro 'LOCAL_ALIGNMENT'
  1142 |   LOCAL_ALIGNMENT (TREE_TYPE (DECL), DECL_ALIGN (DECL))
       |   ^~~~~~~~~~~~~~~
../../../gcc/gcc/adjust-alignment.cc:71:24: note: in expansion of macro 'LOCAL_DECL_ALIGNMENT'
    71 |       unsigned align = LOCAL_DECL_ALIGNMENT (var);
       |                        ^~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors
make[3]: *** [Makefile:1218: adjust-alignment.o] Error 1


Jeff
