On Wed, Mar 22, 2023 at 10:03:42AM +0000, Richard Biener via Gcc-patches wrote:
> For the testcase bb_is_just_return is on top of the profile, changing
> it to walk BB insns backwards puts it off the profile.  That's because
> in the forward walk you have to process possibly many debug insns
> but in a backward walk you very likely run into control insns first.
> 
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
> 
> OK?
> 
> For the record, the profile was (after the delete_trivially_dead_insns
> fix)
> 
> Samples: 289K of event 'cycles:u', Event count (approx.): 384226334976
> Overhead       Samples  Command  Shared Object  Symbol
>    3.52%          9747  cc1      cc1            [.] bb_is_just_return
> 
> and after the fix bb_is_just_return has no recorded samples anymore.
> 
> Thanks,
> Richard.
> 
> 	PR rtl-optimization/109237
> 	* cfgcleanup.cc (bb_is_just_return): Walk insns backwards.
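
[For readers following along: a minimal, self-contained sketch of the cost argument above.  This is toy code, not GCC internals; the struct, predicates, and insn counts are all invented for illustration.  With -g a block is typically a long run of debug insns ending in the real control insn, so a forward scan for the first non-debug insn visits every debug insn, while a backward scan finds the block-ending insn almost immediately.]

/* Toy model (not GCC code): a basic block as an array of insns where most
   insns are debug insns and the block ends with the return.  */

#include <stdio.h>
#include <stdbool.h>

#define NUM_INSNS 10001

struct toy_insn
{
  bool is_debug;   /* stands in for DEBUG_INSN_P */
  bool is_return;  /* stands in for ANY_RETURN_P (PATTERN (insn)) */
};

static int
steps_forward (const struct toy_insn *bb, int n)
{
  int steps = 0;
  for (int i = 0; i < n; i++)
    {
      steps++;
      if (!bb[i].is_debug)
	break;		/* first non-debug insn reached */
    }
  return steps;
}

static int
steps_backward (const struct toy_insn *bb, int n)
{
  int steps = 0;
  for (int i = n - 1; i >= 0; i--)
    {
      steps++;
      if (!bb[i].is_debug)
	break;		/* the return at the end of the block is hit first */
    }
  return steps;
}

int
main (void)
{
  static struct toy_insn bb[NUM_INSNS];
  for (int i = 0; i < NUM_INSNS - 1; i++)
    bb[i] = (struct toy_insn) { true, false };		/* debug insns */
  bb[NUM_INSNS - 1] = (struct toy_insn) { false, true };  /* the return */

  printf ("forward walk:  %d insns visited\n", steps_forward (bb, NUM_INSNS));
  printf ("backward walk: %d insns visited\n", steps_backward (bb, NUM_INSNS));
  return 0;
}

[The forward walk visits all 10001 insns before stopping; the backward walk stops after 1.]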
This seems to regress
+FAIL: gcc.dg/guality/pr54200.c -Os -DPREVENT_OPTIMIZATION line 20 z == 3
on x86_64.  This was meant just as a performance improvement, right?

Just with -Os -g0 the assembly difference is:
--- pr54200.s1	2023-03-28 09:45:57.120647323 +0200
+++ pr54200.s2	2023-03-28 09:46:00.423599004 +0200
@@ -19,10 +19,11 @@ foo:
 	cmpl	$1, %esi
 	jne	.L3
 	movl	$2, o(%rip)
-	ret
+	jmp	.L4
 .L3:
 	addl	%esi, %eax
 	addl	%edx, %eax
+.L4:
 	ret
 	.cfi_endproc
 .LFE1:
so 1 byte longer (ret is 1 byte, jmp 2 bytes), but might also be slower.

The difference starts during the pro_and_epilogue pass.

> --- a/gcc/cfgcleanup.cc
> +++ b/gcc/cfgcleanup.cc
> @@ -2608,7 +2608,7 @@ bb_is_just_return (basic_block bb, rtx_insn **ret,
> rtx_insn **use)
>    if (bb == EXIT_BLOCK_PTR_FOR_FN (cfun))
>      return false;
>  
> -  FOR_BB_INSNS (bb, insn)
> +  FOR_BB_INSNS_REVERSE (bb, insn)
>      if (NONDEBUG_INSN_P (insn))
>        {
> 	rtx pat = PATTERN (insn);
> -- 
> 2.35.3

	Jakub
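
[For readers without the testcase at hand, a function of roughly this shape reproduces the assembly shown above.  This is a reconstruction by eye from that assembly, not the actual gcc.dg/guality/pr54200.c source (which is a guality test with debug checks); the variable names and exact expressions are guesses.  One arm stores to a global and returns the first argument, the other returns a sum, and the question is whether the first arm keeps its own ret or jumps to the shared one.]

int o;

int
foo (int z, int x, int b)
{
  if (x == 1)
    {
      o = 2;
      return z;		/* the arm that used to end in its own ret */
    }
  return z + x + b;	/* the arm that falls through to the shared ret */
}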