On Wed, Nov 26, 2025 at 10:18 AM H.J. Lu <[email protected]> wrote:
>
> On Wed, Nov 26, 2025 at 3:52 PM Uros Bizjak <[email protected]> wrote:
> >
> > On Wed, Nov 26, 2025 at 8:26 AM H.J. Lu <[email protected]> wrote:
> >
> > > > > The reasons for try_combine_insn are that
> > > > >
> > > > > 1. When general_operand is called, instruction info isn't available.
> > > > > 2. Only recog_for_combine_1 calls init_recog_no_volatile and
> > > > > tries memory operands.
> > > > But if we make stores work sensibly, then we no longer need the
> > > > instruction, right?  It's the mere need to track the instruction that
> > > > seems rather hokey/hackish to me right now.
> > > >
> > > > jeff
> > >
> > > For
> > >
> > > volatile unsigned char u8;
> > >
> > > void test (void)
> > > {
> > >   u8 = u8 + u8;
> > >   u8 = u8 - u8;
> > > }
> > >
> > > When volatile store is allowed,  we generate
> > >
> > > (insn 8 7 9 2 (parallel [
> > >             (set (mem/v/c:QI (symbol_ref:DI ("u8") [flags 0x2]
> > > <var_decl 0x7fe9719d6e40 u8>) [0 u8+0 S1 A8])
> > >                 (ashift:QI (mem/v/c:QI (symbol_ref:DI ("u8") [flags
> > > 0x2]  <var_decl 0x7fe9719d6e40 u8>) [0 u8+0 S1 A8])
> > >                     (const_int 1 [0x1])))
> > >             (clobber (reg:CC 17 flags))
> > >         ]) 
> > > "/export/gnu/import/git/gitlab/x86-gcc-test/gcc/testsuite/gcc.dg/pr86617.c":7:6
> > > 1139 {*ashlqi3_1}
> > >      (expr_list:REG_UNUSED (reg:CC 17 flags)
> > >         (nil)))
> > >
> > > on x86 which leads to
> > >
> > > salb u8(%rip)
> > >
> > > instead of 2 loads.  Without the instruction, we don't know if a
> > > memory reference
> > > should be allowed for stores.
> >
> > combine pass doesn't handle volatiles correctly in all cases, this is
> > the reason I think this propagation should be done in late-combine. We
> > are interested only in propagations of memory loads/stores into
> > instructions, and late-combine does exactly that.
> >
> > Uros.
>
> late combine doesn't try to combine memory references at all.

Try PR122343 testcase with the attached patch (gcc -O2):

_.308r.late_combine1:

trying to combine definition of r98 in:
    6: r98:SI=[`bar']
into:
    8: {r102:SI=r103:SI+r98:SI;clobber flags:CC;}
      REG_DEAD r103:SI
      REG_DEAD r98:SI
      REG_UNUSED flags:CC
successfully matched this instruction to *addsi_1:
(parallel [
        (set (reg:SI 102 [ _5 ])
            (plus:SI (reg:SI 103 [ z_3 ])
                (mem/v/c:SI (symbol_ref:DI ("bar") [flags 0x40]
<var_decl 0x7f7810bd6e40 bar>) [1 bar+0 S4 A32])))
        (clobber (reg:CC 17 flags))
    ])
original cost = 5 + 4 (weighted: 9.000000), replacement cost = 9
(weighted: 9.000000); keeping replacement
rescanning insn with uid = 8.
updating insn 8 in-place
verify found no changes in insn with uid = 8.
deleting insn 6
deleting insn with uid = 6.

foo:
       imull   $123, %edi, %eax
       addl    bar(%rip), %eax
       ret

Uros.
diff --git a/gcc/late-combine.cc b/gcc/late-combine.cc
index 770780eb04d..b38ab78a032 100644
--- a/gcc/late-combine.cc
+++ b/gcc/late-combine.cc
@@ -490,7 +490,7 @@ late_combine::optimizable_set (set_info *def)
   if (!insn->can_be_optimized ()
       || insn->is_asm ()
       || insn->is_call ()
-      || insn->has_volatile_refs ()
+      //      || insn->has_volatile_refs ()
       || insn->has_pre_post_modify ()
       || !can_move_insn_p (insn))
     return NULL_RTX;
@@ -874,8 +874,8 @@ late_combine::execute (function *fn)
   df_analyze ();
   crtl->ssa = new rtl_ssa::function_info (fn);
   // Don't allow memory_operand to match volatile MEMs.
-  init_recog_no_volatile ();
-
+  //  init_recog_no_volatile ();
+  init_recog ();
   insn_info *insn = *crtl->ssa->nondebug_insns ().begin ();
   while (insn)
     {

Reply via email to