[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212 --- Comment #10 from Jiu Fu Guo --- I tried this for GCC 11: https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574421.html. The patches mitigate the BB-count mismatch for loops, and in theory the approach makes sense. In some cases, however, it increases the count mismatch for BBs outside the loop. In SPEC 2017 testing, the patch improved performance on some benchmarks but introduced regressions on others, so I did not pursue pushing it.
[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212 --- Comment #9 from pthaugen at gcc dot gnu.org --- The problem can be seen in the loop2_unroll dump:

pthaugen@pike:~/temp/pr68212$ grep "Invalid sum of" simple.c.272r.loop2_unroll
;; Invalid sum of incoming counts 285685646 (estimated locally), should be 212627725 (estimated locally)
;; Invalid sum of incoming counts 32061393 (estimated locally), should be 105119324 (estimated locally)
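The invariant that these dump messages report on can be pictured with a toy model: a block's count should equal the sum of the counts flowing in over its incoming edges. The sketch below is only an illustration under that assumption and is not GCC code; `Block`, `incoming_sum`, `find_mismatches`, `build_loop`, and all the numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    name: str
    count: int
    # Incoming edges as (source block, probability of taking the edge).
    preds: list = field(default_factory=list)


def incoming_sum(block):
    """Sum the counts arriving over all incoming edges of a block."""
    return sum(round(src.count * prob) for src, prob in block.preds)


def find_mismatches(blocks):
    """Report blocks whose count disagrees with their incoming edges."""
    messages = []
    for block in blocks:
        if not block.preds:
            continue  # Entry-like blocks have nothing to compare against.
        total = incoming_sum(block)
        if total != block.count:
            messages.append(
                f"{block.name}: invalid sum of incoming counts {total}, "
                f"should be {block.count}"
            )
    return messages


def build_loop(header_count):
    """Hypothetical loop: preheader and latch both feed the header."""
    preheader = Block("preheader", 100)
    latch = Block("latch", 900)
    header = Block("header", header_count, [(preheader, 1.0), (latch, 1.0)])
    return [preheader, latch, header]


# Consistent profile: 100 + 900 flows into a header with count 1000.
print(find_mismatches(build_loop(1000)))
# Header count rescaled without rescaling the edges feeding it.
print(find_mismatches(build_loop(250)))
```

In this model a mismatch appears as soon as a transformation rescales a block's count without applying a matching change to the edges that feed it.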
[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212

pthaugen at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |guojiufu at gcc dot gnu.org,
                   |                            |pthaugen at gcc dot gnu.org

--- Comment #8 from pthaugen at gcc dot gnu.org ---
(In reply to Peter Bergner from comment #7)
> (In reply to Pat Haugen from comment #4)
> > Author: pthaugen
> > Date: Fri Oct 14 17:10:18 2016
> > New Revision: 241170
> >
> > URL: https://gcc.gnu.org/viewcvs?rev=241170&root=gcc&view=rev
> > Log:
> > PR rtl-optimization/68212
> > * cfgloopmanip.c (duplicate_loop_to_header_edge): Use preheader edge
> > frequency when computing scale factor for peeled copies.
> > * loop-unroll.c (unroll_loop_runtime_iterations): Fix freq/count
> > values for switch/peel blocks/edges.
>
> Repeating Martin's question. Pat, is this PR fixed with your patch or is
> there more to do?

No, there are still problems. The patch noted fixed the count/probability for the peeled switch/case blocks created before entering the unrolled loop. But the counts for the loop header/exit blocks are still incorrect. The last activity I know of concerning that problem was the patch by Jiufu Guo here: https://gcc.gnu.org/pipermail/gcc-patches/2020-February/539594.html. Not sure if he has any more input here.
[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212 --- Comment #7 from Peter Bergner ---
(In reply to Pat Haugen from comment #4)
> Author: pthaugen
> Date: Fri Oct 14 17:10:18 2016
> New Revision: 241170
>
> URL: https://gcc.gnu.org/viewcvs?rev=241170&root=gcc&view=rev
> Log:
> PR rtl-optimization/68212
> * cfgloopmanip.c (duplicate_loop_to_header_edge): Use preheader edge
> frequency when computing scale factor for peeled copies.
> * loop-unroll.c (unroll_loop_runtime_iterations): Fix freq/count
> values for switch/peel blocks/edges.

Repeating Martin's question. Pat, is this PR fixed with your patch or is there more to do?
[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212

Bill Schmidt changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|kelvin at gcc dot gnu.org   |unassigned at gcc dot gnu.org
             Status|ASSIGNED                    |NEW

--- Comment #6 from Bill Schmidt ---
Kelvin has moved on...
[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212

Richard Biener changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212

Martin Liška changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #5 from Martin Liška ---
Can the bug be marked as resolved?
[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212 --- Comment #4 from Pat Haugen ---
Author: pthaugen
Date: Fri Oct 14 17:10:18 2016
New Revision: 241170

URL: https://gcc.gnu.org/viewcvs?rev=241170&root=gcc&view=rev
Log:
        PR rtl-optimization/68212
        * cfgloopmanip.c (duplicate_loop_to_header_edge): Use preheader edge
        frequency when computing scale factor for peeled copies.
        * loop-unroll.c (unroll_loop_runtime_iterations): Fix freq/count
        values for switch/peel blocks/edges.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgloopmanip.c
    trunk/gcc/loop-unroll.c
[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212

Pat Haugen changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wschmidt at gcc dot gnu.org

--- Comment #3 from Pat Haugen ---
A patch to fix the frequencies/counts of the switch block code and the peeled loop copies is posted here: https://gcc.gnu.org/ml/gcc-patches/2016-09/msg01363.html. I still see a couple of issues that need work:

1) The frequency of the unrolled loop block is still wrong; it doesn't match the sum of its incoming edges.
2) Probably a separate issue, but there is a general problem of unrolling reducing the frequency so much that the loop no longer looks hotter than the surrounding code. Since the guessed frequency is relative (capped at 1), I don't think getting the frequency of the unrolled loop block is a simple matter of orig_freq/unroll factor.
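The second issue can be illustrated with some toy arithmetic. The sketch below assumes relative frequencies capped at 1, as the comment describes, and applies the naive orig_freq/unroll factor rule; `naive_unrolled_freq` and the frequency values are hypothetical and do not come from GCC or from the test case in this PR.

```python
def naive_unrolled_freq(header_freq, unroll_factor):
    """Naive rule from the comment: divide by the unroll factor."""
    return header_freq / unroll_factor


# Hypothetical relative frequencies: the loop header is already at the
# cap of 1.0, while the code just before the loop sits at 0.2.
preheader_freq = 0.2
header_freq = 1.0

for factor in (2, 4, 8):
    unrolled = naive_unrolled_freq(header_freq, factor)
    still_hotter = unrolled > preheader_freq
    print(f"unroll x{factor}: loop freq {unrolled}, "
          f"hotter than surrounding code: {still_hotter}")
```

Because the header cannot exceed the cap, a large enough unroll factor pushes the naive result below the frequency of the code around the loop, which is the effect described in item 2.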
[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212 --- Comment #2 from Pat Haugen ---
Author: pthaugen
Date: Tue Sep 13 15:58:52 2016
New Revision: 240113

URL: https://gcc.gnu.org/viewcvs?rev=240113&root=gcc&view=rev
Log:
        PR tree-optimization/77536
        PR rtl-optimization/68212
        * config/rs6000/rs6000.md (div->recip splitter): Remove
        optimize_insn_for_speed_p condition.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/rs6000/rs6000.md
[Bug rtl-optimization/68212] Loop unroller breaks basic block frequencies
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68212

David Edelsohn changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2015-11-04
                 CC|                            |dje at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #1 from David Edelsohn ---
Confirmed.