On Mon, Mar 9, 2026 at 3:36 PM Richard Biener
<[email protected]> wrote:
>
> On Mon, Mar 9, 2026 at 3:15 PM Daniel Henrique Barboza
> <[email protected]> wrote:
> >
> >
> >
> > On 3/9/2026 10:31 AM, Richard Biener wrote:
> > > On Mon, Mar 9, 2026 at 2:09 PM Daniel Henrique Barboza
> > > <[email protected]> wrote:
> > >>
> > >> From: Daniel Barboza <[email protected]>
> > >>
> > > >> Identify cases where a zero_one comparison is used for a conditional
> > > >> constant assignment and turn that into an unconditional PLUS.  For the
> > >> code in PR71336:
> > >>
> > >> int test(int a) {
> > >>      return a & 1 ? 7 : 3;
> > >> }
> > >>
> > > >> We'll turn that into "(a&1) * (7 - 3) + 3", which yields the same
> > > >> results but without the conditional, promoting more optimization
> > > >> opportunities.  On an armv8-a target the original code generates:
> > >>
> > >> tst     x0, 1   // 38   [c=8 l=4]  *anddi3nr_compare0_zextract
> > >> mov     w1, 3   // 41   [c=4 l=4]  *movsi_aarch64/3
> > >> mov     w0, 7   // 42   [c=4 l=4]  *movsi_aarch64/3
> > >> csel    w0, w1, w0, eq  // 17   [c=4 l=4]  *cmovsi_insn/0
> > >> ret             // 47   [c=0 l=4]  *do_return
> > >>
> > >> With this transformation:
> > >>
> > >> ubfiz   w0, w0, 2, 1    // 7    [c=4 l=4]  *andim_ashiftsi_bfiz
> > >> add     w0, w0, 3       // 13   [c=4 l=4]  *addsi3_aarch64/0
> > >> ret             // 21   [c=0 l=4]  *do_return
> > >>
> > > >> Similar gains are noticeable on RISC-V and x86.
> > > >>
> > > >> For completeness' sake we're also adding the variant "zero_one == 0".
> > > >>
> > > >> Bootstrapped and regression tested on x86 and aarch64.
> > >
> > > > This is a very bad transform on targets that cannot do (wide) integer
> > > > multiplication, like AVR.  I would suggest gating this on types
> > > > <= word_mode?  Optab availability isn't going to work since, to work around
> > > > similar issues, AVR for example implements some of these but with
> > > > explicit libgcc dispatch in the insn patterns.
> >
> > The original idea behind this transform (and similar ones like 56110 and
> > 123967, which I'm planning to send in the next few days) was to handle only
> > pow2 values and recover the immediate via lshift.  I found that a bit
> > restrictive and decided to move to 'mult'.
> >
> > I guess we could use the lshift for all pow2 values and then, for non-pow2
> > vals that would require a mult, check if type <= word_mode.
>
> I'm not sure if a shift will be much better here.  There are targets where
> a conditional select is always better than using arithmetic and it is
> difficult to recover that at RTL expansion time.  So maybe this kind of
> instruction selection should happen later, in a more (cost) controlled manner?
>
> I'm aware there are other places / passes in GCC that fall into this
> trap for AVR.  It would be nice to develop a workable gating that's
> applicable to all of those places.  I understand we do not want to delay
> if-conversion for all targets.

Btw, you can always try using __int128 or even larger _BitInt to see the effects
also on other architectures.
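
A minimal sketch of such a probe, assuming a GCC target that provides
__int128 (typically 64-bit ones).  The names wide_t, test_cond and
test_arith are made up for illustration and are not part of the patch;
test_arith spells out by hand the arithmetic form the proposed pattern
would produce:

```c
/* Hypothetical probe, not part of the patch: the PR71336 selection
   widened to __int128 so the multiply is wider than word_mode.  */
__extension__ typedef __int128 wide_t;

/* Conditional form, as in the PR71336 testcase.  */
wide_t test_cond (wide_t a)
{
  return a & 1 ? 7 : 3;
}

/* Arithmetic form: zero_one * (CST1 - CST2) + CST2.  */
wide_t test_arith (wide_t a)
{
  return (a & 1) * (7 - 3) + 3;
}
```

Compiling this with -O2 -S and comparing the assembly of the two
functions should give a rough idea of what the transform costs when the
type is wider than a word.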

Richard.

> Richard.
>
> >
> >
> > Thanks,
> > Daniel
> >
> >
> >
> > >
> > > Richard.
> > >
> > >>          PR tree-optimization/71336
> > >>
> > >> gcc/ChangeLog:
> > >>
> > >>          * match.pd(`zero_one EQ|NE 0 ? CST1:CST2`): New pattern.
> > >>
> > >> gcc/testsuite/ChangeLog:
> > >>
> > >>          * gcc.dg/tree-ssa/pr71336-2.c: New test.
> > >>          * gcc.dg/tree-ssa/pr71336.c: New test.
> > >> ---
> > >>   gcc/match.pd                              | 36 ++++++++++++++
> > >>   gcc/testsuite/gcc.dg/tree-ssa/pr71336-2.c | 59 +++++++++++++++++++++++
> > >>   gcc/testsuite/gcc.dg/tree-ssa/pr71336.c   | 20 ++++++++
> > >>   3 files changed, 115 insertions(+)
> > >>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr71336-2.c
> > >>   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr71336.c
> > >>
> > >> diff --git a/gcc/match.pd b/gcc/match.pd
> > >> index 7f16fd4e081..d041276c595 100644
> > >> --- a/gcc/match.pd
> > >> +++ b/gcc/match.pd
> > >> @@ -5195,6 +5195,42 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> > >>          && expr_no_side_effects_p (@2))
> > >>          (op (mult (convert:type @0) @2) @1))))
> > >>
> > >> +/* PR71336:
> > > >> +   zero_one != 0 ? CST1 : CST2 -> ((typeof (CST2))zero_one * diff) + CST2,
> > >> +   where CST1 > CST2 and diff = CST1 - CST2.
> > >> +
> > >> +   Includes the "zero_one == 0 ? (...)" variant too.  */
> > >> +(for cmp (ne eq)
> > >> + (simplify
> > > >> +  (cond (cmp zero_one_valued_p@0 integer_zerop) INTEGER_CST@1 INTEGER_CST@2)
> > >> +  (with {
> > >> +    unsigned HOST_WIDE_INT diff = 0;
> > >> +
> > >> +    if (tree_int_cst_sgn (@1) > 0 && tree_int_cst_sgn (@2) > 0
> > >> +       && tree_fits_uhwi_p (@1) && tree_fits_uhwi_p (@2))
> > >> +     {
> > >> +       if (cmp == NE_EXPR
> > >> +           && wi::gtu_p (wi::to_wide (@1), wi::to_wide (@2)))
> > >> +         diff = tree_to_uhwi (@1) - tree_to_uhwi (@2);
> > >> +
> > >> +       if (cmp == EQ_EXPR
> > >> +           && wi::gtu_p (wi::to_wide (@2), wi::to_wide (@1)))
> > >> +         diff = tree_to_uhwi (@2) - tree_to_uhwi (@1);
> > >> +     }
> > >> +   }
> > >> +   (if (cmp == NE_EXPR
> > >> +       && INTEGRAL_TYPE_P (type)
> > >> +       && INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > >> +       && diff > 0)
> > >> +     (plus (mult (convert:type @0) { build_int_cst (type, diff); })
> > >> +           @2)
> > >> +    (if (cmp == EQ_EXPR
> > >> +        && INTEGRAL_TYPE_P (type)
> > >> +        && INTEGRAL_TYPE_P (TREE_TYPE (@0))
> > >> +        && diff > 0)
> > >> +      (plus (mult (convert:type @0) { build_int_cst (type, diff); })
> > >> +            @1))))))
> > >> +
> > >>   /* ?: Value replacement. */
> > >>   /* a == 0 ? b : b + a  -> b + a */
> > >>   (for op (plus bit_ior bit_xor)
> > > >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71336-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71336-2.c
> > >> new file mode 100644
> > >> index 00000000000..da44489d3e4
> > >> --- /dev/null
> > >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71336-2.c
> > >> @@ -0,0 +1,59 @@
> > >> +/* { dg-do run } */
> > >> +/* { dg-options "-O1" } */
> > >> +
> > >> +/* Macro adapted from builtin-object-size-common.h  */
> > >> +#define FAIL() \
> > >> +  do { \
> > >> +    __builtin_printf ("Failure at line: %d\n", __LINE__);     \
> > >> +    abort();                                                 \
> > >> +  } while (0)
> > >> +
> > >> +void abort(void);
> > >> +
> > >> +int test (int a) {
> > >> +    return a & 1 ? 7 : 3;
> > >> +}
> > >> +
> > >> +int test2 (int a) {
> > >> +    return a & 1 ? 3 : 7;
> > >> +}
> > >> +
> > >> +int test3 (int a) {
> > >> +    return (a & 1) == 0 ? 3 : 7;
> > >> +}
> > >> +
> > >> +int test4 (int a) {
> > >> +    return (a & 1) == 0 ? 7 : 3;
> > >> +}
> > >> +
> > >> +int main (void) {
> > >> +  if (test (0) != 3)
> > >> +    FAIL ();
> > >> +  if (test (1) != 7)
> > >> +    FAIL ();
> > >> +  if (test (3) != 7)
> > >> +    FAIL ();
> > >> +
> > >> +  if (test2 (0) != 7)
> > >> +    FAIL ();
> > >> +  if (test2 (1) != 3)
> > >> +    FAIL ();
> > >> +  if (test2 (3) != 3)
> > >> +    FAIL ();
> > >> +
> > >> +  if (test3 (0) != 3)
> > >> +    FAIL ();
> > >> +  if (test3 (1) != 7)
> > >> +    FAIL ();
> > >> +  if (test3 (2) != 3)
> > >> +    FAIL ();
> > >> +
> > >> +  if (test4 (0) != 7)
> > >> +    FAIL ();
> > >> +  if (test4 (1) != 3)
> > >> +    FAIL ();
> > >> +  if (test4 (2) != 7)
> > >> +    FAIL ();
> > >> +
> > >> +  return 0;
> > >> +}
> > > >> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71336.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71336.c
> > >> new file mode 100644
> > >> index 00000000000..fb643bf1eb3
> > >> --- /dev/null
> > >> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71336.c
> > >> @@ -0,0 +1,20 @@
> > >> +/* { dg-additional-options -O1 } */
> > >> +/* { dg-additional-options -fdump-tree-gimple } */
> > >> +
> > >> +int test (int a) {
> > >> +    return a & 1 ? 7 : 3;
> > >> +}
> > >> +
> > >> +int test2 (int a) {
> > >> +    return (a & 1) == 0 ? 3 : 7;
> > >> +}
> > >> +
> > >> +int test3 (int a) {
> > >> +    return a & 1 ? 17 : 3;
> > >> +}
> > >> +
> > >> +int test4 (int a) {
> > >> +    return (a & 1) == 0 ? 3 : 17;
> > >> +}
> > >> +
> > >> +/* { dg-final { scan-tree-dump-times " goto " 0 gimple } } */
> > >> --
> > >> 2.43.0
> > >>
> >
