[Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p() since r14-5355
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |DUPLICATE

--- Comment #8 from Andrew Pinski ---
The REG_UNUSED vs single_set issue is being tracked in PR 40209.

*** This bug has been marked as a duplicate of bug 40209 ***
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760 --- Comment #7 from Jakub Jelinek --- This is now latent, we need to decide about the updating and usability of REG_UNUSED notes, but after moving the pass it shouldn't trigger on this testcase.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

--- Comment #6 from GCC Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:e44ed92dbbe9d4e5c23f486cd2f77a6f9ee513c5

commit r14-6210-ge44ed92dbbe9d4e5c23f486cd2f77a6f9ee513c5
Author: Jakub Jelinek
Date:   Wed Dec 6 09:59:12 2023 +0100

    i386: Move vzeroupper pass from after reload pass to after postreload_cse [PR112760]

    Regardless of the outcome of the REG_UNUSED discussions, I think
    it is a good idea to move the vzeroupper pass one pass later.
    As can be seen in the multiple PRs and as postreload.cc documents,
    reload/LRA is known to create dead statements quite often, which
    is the reason why we have postreload_cse pass at all.
    Doing vzeroupper pass before such cleanup means the pass including
    df_analyze for it needs to process more instructions than needed
    and because mode switching adds note problem, also higher chance of
    having stale REG_UNUSED notes.
    And, I really don't see why vzeroupper can't wait until those cleanups
    are done.

    2023-12-06  Jakub Jelinek

            PR rtl-optimization/112760
            * config/i386/i386-passes.def (pass_insert_vzeroupper): Insert
            after pass_postreload_cse rather than pass_reload.
            * config/i386/i386-features.cc (rest_of_handle_insert_vzeroupper):
            Adjust comment for it.

            * gcc.dg/pr112760.c: New test.
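
For readers unfamiliar with target pass files, the i386-passes.def part of the ChangeLog entry above would look roughly like the fragment below. This is a sketch reconstructed from the ChangeLog wording, not the verbatim diff; the instance number 1 and the commented-out "before" line are assumptions.

```c
/* config/i386/i386-passes.def (sketch; instance number assumed).  */

/* Before r14-6210: vzeroupper insertion ran right after reload.  */
/* INSERT_PASS_AFTER (pass_reload, 1, pass_insert_vzeroupper);  */

/* After r14-6210: run it once postreload_cse has removed the dead
   insns that reload/LRA tends to leave behind.  */
INSERT_PASS_AFTER (pass_postreload_cse, 1, pass_insert_vzeroupper);
```

The fragment is only meaningful inside GCC's pass manager machinery, so it is not compilable on its own.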
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek ---
Created attachment 56753
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56753&action=edit
gcc14-pr112760.patch

Untested fix.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|needs-bisection             |

--- Comment #4 from Jakub Jelinek ---
In reload dump I see no changes (except function_decl/var_decl addresses), in
vzeroupper, postreload, split2, ree and cmpelim dumps a bunch of extra REG_DEAD
notes here and there in r14-5355 compared to r14-5354, and finally
pro_and_epilogue deletes

(insn 20 19 62 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 0 ax [110])
            (reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
     (expr_list:REG_UNUSED (reg:CCZ 17 flags)
        (nil)))

insn.

In reload dump there is:

(insn 20 19 44 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 0 ax [110])
            (reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
     (nil))
(insn 44 20 62 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 0 ax [110])
            (reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
     (nil))
(insn 62 44 46 2 (set (reg:HI 0 ax [118])
        (const_int 1 [0x1])) "pr112760.c":6:22 86 {*movhi_internal}
     (expr_list:REG_EQUIV (const_int 1 [0x1])
        (nil)))
(insn 46 62 25 2 (set (reg:HI 3 bx [orig:103 _8+2 ] [103])
        (if_then_else:HI (eq (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (reg:HI 3 bx [orig:103 _8+2 ] [103])
            (reg:HI 0 ax [118]))) "pr112760.c":6:22 1381 {*movhicc_noc}
     (nil))

so the insn 20 is indeed useless and in vzeroupper pass that was correctly
marked in the notes:

(insn 20 19 44 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 0 ax [110])
            (reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
     (expr_list:REG_UNUSED (reg:CCZ 17 flags)
        (nil)))
(insn 44 20 62 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 0 ax [110])
            (reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
     (expr_list:REG_DEAD (reg:SI 1 dx [111])
        (expr_list:REG_DEAD (reg:SI 0 ax [110])
            (nil))))
(insn 62 44 46 2 (set (reg:HI 0 ax [118])
        (const_int 1 [0x1])) "pr112760.c":6:22 86 {*movhi_internal}
     (expr_list:REG_EQUIV (const_int 1 [0x1])
        (nil)))
(insn 46 62 25 2 (set (reg:HI 3 bx [orig:103 _8+2 ] [103])
        (if_then_else:HI (eq (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (reg:HI 3 bx [orig:103 _8+2 ] [103])
            (reg:HI 0 ax [118]))) "pr112760.c":6:22 1381 {*movhicc_noc}
     (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (expr_list:REG_DEAD (reg:HI 0 ax [118])
            (nil))))

But then postreload deletes insn 44 rather than 20 and keeps the notes around
unchanged.

Insn 20 is deleted in

#2  0x00cce9df in copyprop_hardreg_forward_1 (bb=, vd=0x3bd2be0) at ../../gcc/regcprop.cc:829
#3  0x00ccfe1c in copyprop_hardreg_forward_bb_without_debug_insn (bb=) at ../../gcc/regcprop.cc:1235
#4  0x00d5b371 in prepare_shrink_wrap (entry_block=) at ../../gcc/shrink-wrap.cc:451
#5  0x00d5bb70 in try_shrink_wrapping (entry_edge=0x7fffd900, prologue_seq=0x7fffe9f25240) at ../../gcc/shrink-wrap.cc:674
#6  0x008b4320 in thread_prologue_and_epilogue_insns () at ../../gcc/function.cc:6056

and regcprop.cc documents it relies on up to date REG_DEAD/REG_UNUSED notes;
after all the removal happens in

      /* Detect obviously dead sets (via REG_UNUSED notes) and remove them.  */
      if (set
          && !RTX_FRAME_RELATED_P (insn)
          && NONJUMP_INSN_P (insn)
          && !may_trap_p (set)
          && find_reg_note (insn, REG_UNUSED, SET_DEST (set))
          && !side_effects_p (SET_SRC (set))
          && !side_effects_p (SET_DEST (set)))
        {
          bool last = insn == BB_END (bb);
          delete_insn (insn);
          if (last)
            break;
          continue;
        }

and regcprop.cc calls df_note_add_problem (); before calling df_analyze ().
Except in the pro_and_epilogue case it is done elsewhere and it just calls into
the regcprop.cc functions.
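
The failure mode described in this comment can be reproduced in a few lines of standalone C. The sketch below is a toy model, not GCC code: `struct toy_insn`, `compute_unused_notes`, `delete_insn_keep_notes`, `remove_unused_sets` and `use_has_def` are all hypothetical names, each insn reads at most one register, and anything still set at the end of the block is assumed live. It only illustrates why a REG_UNUSED-style note computed before a deletion can become wrong afterwards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; none of these names are GCC APIs.  */
enum { NO_REG = -1, REG_AX = 0, REG_BX = 3, REG_FLAGS = 17 };

struct toy_insn {
  int uid;           /* insn number as printed in the dumps */
  int sets;          /* register written, or NO_REG */
  int uses;          /* single register read, or NO_REG (simplification) */
  bool unused_note;  /* stands in for REG_UNUSED on `sets` */
  bool deleted;
};

static struct toy_insn make_insn (int uid, int sets, int uses)
{
  struct toy_insn i = { uid, sets, uses, false, false };
  return i;
}

/* The four insns from the reload dump: 20, 44, 62, 46.  */
void build_block (struct toy_insn b[4])
{
  b[0] = make_insn (20, REG_FLAGS, NO_REG);
  b[1] = make_insn (44, REG_FLAGS, NO_REG);
  b[2] = make_insn (62, REG_AX, NO_REG);
  b[3] = make_insn (46, REG_BX, REG_FLAGS);
}

/* Mark a set as unused if the register is overwritten before it is read.
   A set that reaches the end of the block is assumed to be live.  */
void compute_unused_notes (struct toy_insn *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      b[i].unused_note = false;
      if (b[i].deleted || b[i].sets == NO_REG)
        continue;
      for (size_t j = i + 1; j < n; j++)
        {
          if (b[j].deleted)
            continue;
          if (b[j].uses == b[i].sets)
            break;                      /* read first: the set is used */
          if (b[j].sets == b[i].sets)
            {
              b[i].unused_note = true;  /* overwritten first: unused */
              break;
            }
        }
    }
}

/* Delete one insn but leave every note untouched.  */
void delete_insn_keep_notes (struct toy_insn *b, size_t n, int uid)
{
  for (size_t i = 0; i < n; i++)
    if (b[i].uid == uid)
      b[i].deleted = true;
}

/* Trust the notes and delete every set marked unused.  */
size_t remove_unused_sets (struct toy_insn *b, size_t n)
{
  size_t removed = 0;
  for (size_t i = 0; i < n; i++)
    if (!b[i].deleted && b[i].unused_note)
      {
        b[i].deleted = true;
        removed++;
      }
  return removed;
}

/* Does the register read by insn `uid` still have an earlier live set?  */
bool use_has_def (const struct toy_insn *b, size_t n, int uid)
{
  size_t user = 0;
  while (user < n && b[user].uid != uid)
    user++;
  assert (user < n && b[user].uses != NO_REG);
  for (size_t i = 0; i < user; i++)
    if (!b[i].deleted && b[i].sets == b[user].uses)
      return true;
  return false;
}
```

Calling `build_block`, `compute_unused_notes`, `delete_insn_keep_notes (b, 4, 44)` and then `remove_unused_sets` removes insn 20 as well, leaving insn 46 reading flags with no compare in front of it. Calling `compute_unused_notes` again after the deletion clears the note on insn 20 and nothing further is removed.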
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|[14 Regression] wrong code  |[14 Regression] wrong code
                   |with -O2 -fno-dce           |with -O2 -fno-dce
                   |-fno-guess-branch-probabili |-fno-guess-branch-probabili
                   |ty -m8bit-idiv -mavx        |ty -m8bit-idiv -mavx
                   |--param=max-cse-insns=0 and |--param=max-cse-insns=0 and
                   |__builtin_add_overflow_p()  |__builtin_add_overflow_p()
                   |                            |since r14-5355
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek ---
Started with r14-5355-g3cd3a09b3f91a1d023cb180763d40598d6bb274b
[Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p()
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

Uroš Bizjak changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|target                      |rtl-optimization
   Last reconfirmed|                            |2023-11-29
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Target Milestone|---                         |14.0

--- Comment #2 from Uroš Bizjak ---
With the original testcase, ce1 pass is if-converting:

   20: flags:CCZ=cmp(r110:SI,r111:SI)
      REG_DEAD r111:SI
      REG_DEAD r110:SI
   21: pc={(flags:CCZ==0)?L23:pc}
      REG_DEAD flags:CCZ
   39: NOTE_INSN_BASIC_BLOCK 5
   22: r103:HI=0x1
   23: L23:

with:

IF-THEN-JOIN block found, pass 2, test 2, then 5, join 6
scanning new insn with uid = 45.
scanning new insn with uid = 44.
scanning new insn with uid = 46.
if-conversion succeeded through noce_try_cmove
Removing jump 21.
deleting insn with uid = 21.
deleting insn with uid = 22.

to:

   20: flags:CCZ=cmp(r110:SI,r111:SI)
      REG_DEAD r111:SI
      REG_DEAD r110:SI
   45: r118:HI=0x1
   44: flags:CCZ=cmp(r110:SI,r111:SI)
   46: r103:HI={(flags:CCZ==0)?r103:HI:r118:HI}

And things go downhill from here. Before postreload we have:

   20: flags:CCZ=cmp(ax:SI,dx:SI)
      REG_UNUSED flags:CCZ
   44: flags:CCZ=cmp(ax:SI,dx:SI)
      REG_DEAD dx:SI
      REG_DEAD ax:SI
   62: ax:HI=0x1
      REG_EQUIV 0x1
   46: bx:HI={(flags:CCZ==0)?bx:HI:ax:HI}
      REG_DEAD flags:CCZ
      REG_DEAD ax:HI

and in the postreload pass (insn 44) is removed:

   20: flags:CCZ=cmp(ax:SI,dx:SI)
      REG_UNUSED flags:CCZ
   62: ax:HI=0x1
      REG_EQUIV 0x1
   46: bx:HI={(flags:CCZ==0)?bx:HI:ax:HI}
      REG_DEAD flags:CCZ
      REG_DEAD ax:HI

Then comes the pro_and_epilogue pass, which detects the "unused" (insn 20) and
removes it:

df_analyze called
deleting insn with uid = 20.

Confirmed as RTL optimization problem.