On Wed, Nov 26, 2025 at 12:43 PM H.J. Lu <[email protected]> wrote: > > On Wed, Nov 26, 2025 at 5:53 PM Uros Bizjak <[email protected]> wrote: > > > > On Wed, Nov 26, 2025 at 10:18 AM H.J. Lu <[email protected]> wrote: > > > > > > On Wed, Nov 26, 2025 at 3:52 PM Uros Bizjak <[email protected]> wrote: > > > > > > > > On Wed, Nov 26, 2025 at 8:26 AM H.J. Lu <[email protected]> wrote: > > > > > > > > > > > The reasons for try_combine_insn are that > > > > > > > > > > > > > > 1. When general_operand is called, instruction info isn't > > > > > > > available. > > > > > > > 2. Only recog_for_combine_1 calls init_recog_no_volatile and > > > > > > > tries memory operands. > > > > > > But if we make stores work sensibly, then we no longer need the > > > > > > instruction, right? It's the mere need to track the instruction > > > > > > that > > > > > > seems rather hokey/hackish to me right now. > > > > > > > > > > > > jeff > > > > > > > > > > For > > > > > > > > > > volatile unsigned char u8; > > > > > > > > > > void test (void) > > > > > { > > > > > u8 = u8 + u8; > > > > > u8 = u8 - u8; > > > > > } > > > > > > > > > > When volatile store is allowed, we generate > > > > > > > > > > (insn 8 7 9 2 (parallel [ > > > > > (set (mem/v/c:QI (symbol_ref:DI ("u8") [flags 0x2] > > > > > <var_decl 0x7fe9719d6e40 u8>) [0 u8+0 S1 A8]) > > > > > (ashift:QI (mem/v/c:QI (symbol_ref:DI ("u8") [flags > > > > > 0x2] <var_decl 0x7fe9719d6e40 u8>) [0 u8+0 S1 A8]) > > > > > (const_int 1 [0x1]))) > > > > > (clobber (reg:CC 17 flags)) > > > > > ]) > > > > > "/export/gnu/import/git/gitlab/x86-gcc-test/gcc/testsuite/gcc.dg/pr86617.c":7:6 > > > > > 1139 {*ashlqi3_1} > > > > > (expr_list:REG_UNUSED (reg:CC 17 flags) > > > > > (nil))) > > > > > > > > > > on x86 which leads to > > > > > > > > > > salb u8(%rip) > > > > > > > > > > instead of 2 loads. Without the instruction, we don't know if a > > > > > memory reference > > > > > should be allowed for stores. > > > > > > > > combine pass doesn't handle volatiles correctly in all cases, this is > > > > the reason I think this propagation should be done in late-combine. We > > > > are interested only in propagations of memory loads/stores into > > > > instructions, and late-combine does exactly that. > > > > > > > > Uros. > > > > > > late combine doesn't try to combine memory references at all. > > > > Try PR122343 testcase with the attached patch (gcc -O2): > > > > _.308r.late_combine1: > > > > trying to combine definition of r98 in: > > 6: r98:SI=[`bar'] > > into: > > 8: {r102:SI=r103:SI+r98:SI;clobber flags:CC;} > > REG_DEAD r103:SI > > REG_DEAD r98:SI > > REG_UNUSED flags:CC > > successfully matched this instruction to *addsi_1: > > (parallel [ > > (set (reg:SI 102 [ _5 ]) > > (plus:SI (reg:SI 103 [ z_3 ]) > > (mem/v/c:SI (symbol_ref:DI ("bar") [flags 0x40] > > <var_decl 0x7f7810bd6e40 bar>) [1 bar+0 S4 A32]))) > > (clobber (reg:CC 17 flags)) > > ]) > > original cost = 5 + 4 (weighted: 9.000000), replacement cost = 9 > > (weighted: 9.000000); keeping replacement > > rescanning insn with uid = 8. > > updating insn 8 in-place > > verify found no changes in insn with uid = 8. > > deleting insn 6 > > deleting insn with uid = 6. > > > > foo: > > imull $123, %edi, %eax > > addl bar(%rip), %eax > > ret > > > > Uros. > > It doesn't work on: > > extern volatile int bar; > > void > foo (void) > { > bar += 123; > } > > Your patch generates: > > movl bar(%rip), %eax > addl $123, %eax > movl %eax, bar(%rip)
This is expected, late-combine pass won't create RMW instructions. Uros.
