[Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p() since r14-5355

2023-12-06 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

Andrew Pinski changed:

          What|Removed  |Added
--------------+---------+----------
        Status|ASSIGNED |RESOLVED
    Resolution|---      |DUPLICATE

--- Comment #8 from Andrew Pinski ---
The REG_UNUSED vs. single_set issue is being tracked in PR 40209.

*** This bug has been marked as a duplicate of bug 40209 ***

2023-12-06 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

--- Comment #7 from Jakub Jelinek ---
This is now latent. We still need to decide how REG_UNUSED notes should be
kept up to date and when they can be relied on, but with the pass moved, the
bug should no longer trigger on this testcase.

2023-12-06 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

--- Comment #6 from GCC Commits ---
The master branch has been updated by Jakub Jelinek:

https://gcc.gnu.org/g:e44ed92dbbe9d4e5c23f486cd2f77a6f9ee513c5

commit r14-6210-ge44ed92dbbe9d4e5c23f486cd2f77a6f9ee513c5
Author: Jakub Jelinek 
Date:   Wed Dec 6 09:59:12 2023 +0100

i386: Move vzeroupper pass from after reload pass to after postreload_cse
[PR112760]

Regardless of the outcome of the REG_UNUSED discussions, I think
it is a good idea to move the vzeroupper pass one pass later.
As can be seen in the multiple PRs and as postreload.cc documents,
reload/LRA is known to create dead statements quite often, which
is the reason why we have postreload_cse pass at all.
Running the vzeroupper pass before that cleanup means the pass, including
its df_analyze call, has to process more instructions than necessary and,
because mode switching adds the note problem, there is also a higher chance
of having stale REG_UNUSED notes.
And, I really don't see why vzeroupper can't wait until those cleanups
are done.

2023-12-06  Jakub Jelinek  

PR rtl-optimization/112760
* config/i386/i386-passes.def (pass_insert_vzeroupper): Insert
after pass_postreload_cse rather than pass_reload.
* config/i386/i386-features.cc (rest_of_handle_insert_vzeroupper):
Adjust comment for it.

* gcc.dg/pr112760.c: New test.

2023-12-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

Jakub Jelinek changed:

          What|Removed                      |Added
--------------+-----------------------------+------------------------
        Status|NEW                          |ASSIGNED
      Assignee|unassigned at gcc dot gnu.org|jakub at gcc dot gnu.org

--- Comment #5 from Jakub Jelinek ---
Created attachment 56753
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56753&action=edit
gcc14-pr112760.patch

Untested fix.

2023-12-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

Jakub Jelinek changed:

          What|Removed         |Added
--------------+----------------+------
      Keywords|needs-bisection |

--- Comment #4 from Jakub Jelinek ---
The reload dump shows no changes (apart from function_decl/var_decl addresses).
The vzeroupper, postreload, split2, ree and cmpelim dumps have a number of
extra REG_DEAD notes in r14-5355 compared to r14-5354, and finally
pro_and_epilogue deletes this insn:
(insn 20 19 62 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 0 ax [110])
            (reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
     (expr_list:REG_UNUSED (reg:CCZ 17 flags)
        (nil)))
In the reload dump there is:
(insn 20 19 44 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 0 ax [110])
            (reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
     (nil))
(insn 44 20 62 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 0 ax [110])
            (reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
     (nil))
(insn 62 44 46 2 (set (reg:HI 0 ax [118])
        (const_int 1 [0x1])) "pr112760.c":6:22 86 {*movhi_internal}
     (expr_list:REG_EQUIV (const_int 1 [0x1])
        (nil)))
(insn 46 62 25 2 (set (reg:HI 3 bx [orig:103 _8+2 ] [103])
        (if_then_else:HI (eq (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (reg:HI 3 bx [orig:103 _8+2 ] [103])
            (reg:HI 0 ax [118]))) "pr112760.c":6:22 1381 {*movhicc_noc}
     (nil))
So insn 20 is indeed useless, and the vzeroupper pass correctly marked that in
the notes:
(insn 20 19 44 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 0 ax [110])
            (reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
     (expr_list:REG_UNUSED (reg:CCZ 17 flags)
        (nil)))
(insn 44 20 62 2 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 0 ax [110])
            (reg:SI 1 dx [111]))) "pr112760.c":6:22 11 {*cmpsi_1}
     (expr_list:REG_DEAD (reg:SI 1 dx [111])
        (expr_list:REG_DEAD (reg:SI 0 ax [110])
            (nil))))
(insn 62 44 46 2 (set (reg:HI 0 ax [118])
        (const_int 1 [0x1])) "pr112760.c":6:22 86 {*movhi_internal}
     (expr_list:REG_EQUIV (const_int 1 [0x1])
        (nil)))
(insn 46 62 25 2 (set (reg:HI 3 bx [orig:103 _8+2 ] [103])
        (if_then_else:HI (eq (reg:CCZ 17 flags)
                (const_int 0 [0]))
            (reg:HI 3 bx [orig:103 _8+2 ] [103])
            (reg:HI 0 ax [118]))) "pr112760.c":6:22 1381 {*movhicc_noc}
     (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (expr_list:REG_DEAD (reg:HI 0 ax [118])
            (nil))))
But then postreload deletes insn 44 rather than insn 20 and leaves the notes
unchanged, so the REG_UNUSED note on insn 20 is now stale.
Insn 20 is deleted in
#2  0x00cce9df in copyprop_hardreg_forward_1 (bb=, vd=0x3bd2be0) at ../../gcc/regcprop.cc:829
#3  0x00ccfe1c in copyprop_hardreg_forward_bb_without_debug_insn (bb=) at ../../gcc/regcprop.cc:1235
#4  0x00d5b371 in prepare_shrink_wrap (entry_block=) at ../../gcc/shrink-wrap.cc:451
#5  0x00d5bb70 in try_shrink_wrapping (entry_edge=0x7fffd900, prologue_seq=0x7fffe9f25240) at ../../gcc/shrink-wrap.cc:674
#6  0x008b4320 in thread_prologue_and_epilogue_insns () at ../../gcc/function.cc:6056
and regcprop.cc documents that it relies on up-to-date REG_DEAD/REG_UNUSED
notes; after all, the removal happens in
      /* Detect obviously dead sets (via REG_UNUSED notes) and remove them.  */
      if (set
          && !RTX_FRAME_RELATED_P (insn)
          && NONJUMP_INSN_P (insn)
          && !may_trap_p (set)
          && find_reg_note (insn, REG_UNUSED, SET_DEST (set))
          && !side_effects_p (SET_SRC (set))
          && !side_effects_p (SET_DEST (set)))
        {
          bool last = insn == BB_END (bb);
          delete_insn (insn);
          if (last)
            break;
          continue;
        }
and regcprop.cc calls df_note_add_problem () before calling df_analyze ().
In the pro_and_epilogue case, however, that setup is done elsewhere, and the
pass just calls into the regcprop.cc functions.

2023-12-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

Jakub Jelinek changed:

What: Summary
  Removed: [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p()
  Added:   [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p() since r14-5355
What: CC
  Added:   jakub at gcc dot gnu.org, rsandifo at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek ---
Started with r14-5355-g3cd3a09b3f91a1d023cb180763d40598d6bb274b

[Bug rtl-optimization/112760] [14 Regression] wrong code with -O2 -fno-dce -fno-guess-branch-probability -m8bit-idiv -mavx --param=max-cse-insns=0 and __builtin_add_overflow_p()

2023-11-29 Thread ubizjak at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112760

Uroš Bizjak changed:

            What|Removed     |Added
----------------+------------+----------------
       Component|target      |rtl-optimization
Last reconfirmed|            |2023-11-29
  Ever confirmed|0           |1
          Status|UNCONFIRMED |NEW
Target Milestone|---         |14.0

--- Comment #2 from Uroš Bizjak ---
With the original testcase, the ce1 pass if-converts:

   20: flags:CCZ=cmp(r110:SI,r111:SI)
  REG_DEAD r111:SI
  REG_DEAD r110:SI
   21: pc={(flags:CCZ==0)?L23:pc}
  REG_DEAD flags:CCZ
   39: NOTE_INSN_BASIC_BLOCK 5
   22: r103:HI=0x1
   23: L23:

with:

IF-THEN-JOIN block found, pass 2, test 2, then 5, join 6
scanning new insn with uid = 45.
scanning new insn with uid = 44.
scanning new insn with uid = 46.
if-conversion succeeded through noce_try_cmove
Removing jump 21.
deleting insn with uid = 21.
deleting insn with uid = 22.

to:

   20: flags:CCZ=cmp(r110:SI,r111:SI)
  REG_DEAD r111:SI
  REG_DEAD r110:SI
   45: r118:HI=0x1
   44: flags:CCZ=cmp(r110:SI,r111:SI)
   46: r103:HI={(flags:CCZ==0)?r103:HI:r118:HI}

And things go downhill from here. Before postreload we have:

   20: flags:CCZ=cmp(ax:SI,dx:SI)
  REG_UNUSED flags:CCZ
   44: flags:CCZ=cmp(ax:SI,dx:SI)
  REG_DEAD dx:SI
  REG_DEAD ax:SI
   62: ax:HI=0x1
  REG_EQUIV 0x1
   46: bx:HI={(flags:CCZ==0)?bx:HI:ax:HI}
  REG_DEAD flags:CCZ
  REG_DEAD ax:HI

and in the postreload pass, insn 44 is removed:

   20: flags:CCZ=cmp(ax:SI,dx:SI)
  REG_UNUSED flags:CCZ
   62: ax:HI=0x1
  REG_EQUIV 0x1
   46: bx:HI={(flags:CCZ==0)?bx:HI:ax:HI}
  REG_DEAD flags:CCZ
  REG_DEAD ax:HI

Then the pro_and_epilogue pass sees the now-stale REG_UNUSED note on insn 20,
treats the insn as unused, and removes it:

df_analyze called
deleting insn with uid = 20.

Confirmed as RTL optimization problem.