[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.3                         |9.4

--- Comment #11 from Jakub Jelinek ---
GCC 9.3.0 has been released, adjusting target milestone.
[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.2                         |9.3

--- Comment #10 from Jakub Jelinek ---
GCC 9.2 has been released.
[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|9.0                         |9.2

--- Comment #9 from Jakub Jelinek ---
GCC 9.1 has been released.
[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #8 from Richard Earnshaw ---
(In reply to Richard Earnshaw from comment #7)
> (In reply to Segher Boessenkool from comment #4)
> > That is code *size*.  Code size is expected to grow a tiny bit, because of
> > *better* register allocation.
> >
> > But we could not do make_more_copies at -Os, if that helps?  (The hard
> > register changes themselves are required for correctness).
>
> In this case, however, we get *worse* register allocation, since it is using
> the expensive register more frequently than a cheaper register which is
> hardly used at all.
>
> In this particular case, all the uses of the "cheap" register (r7) could use
> the 'expensive' register at no additional cost, since the cheap register is
> being used only to hold a value that will be moved to another register (a
> cheap operation regardless of the register used).

FTR, I don't think the combine changes are directly implicated in this
regression.  They just expose a latent issue with register allocation and its
costing.
[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #7 from Richard Earnshaw ---
(In reply to Segher Boessenkool from comment #4)
> That is code *size*.  Code size is expected to grow a tiny bit, because of
> *better* register allocation.
>
> But we could not do make_more_copies at -Os, if that helps?  (The hard
> register changes themselves are required for correctness).

In this case, however, we get *worse* register allocation, since it is using
the expensive register more frequently than a cheaper register which is hardly
used at all.

In this particular case, all the uses of the "cheap" register (r7) could use
the 'expensive' register at no additional cost, since the cheap register is
being used only to hold a value that will be moved to another register (a
cheap operation regardless of the register used).
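To illustrate the costing argument above, here is a minimal sketch of a toy size model, not GCC's actual cost code. It assumes a deliberately simplified rule: a register-to-register `mov` is 2 bytes for any register, while other operations are 2 bytes only when every operand is in r0-r7 and 4 bytes otherwise. The instruction sequence, the choice of `r8` as the "expensive" register, and the names `insn_size`, `code_size` and `sequence` are all hypothetical.

```python
# Toy Thumb-2 size model (an assumption for illustration, not real encodings).
LOW_REGS = {f"r{i}" for i in range(8)}  # r0-r7, the "cheap" registers


def insn_size(op, regs):
    """Return the assumed size in bytes of one instruction."""
    if op == "mov":
        return 2  # reg-reg move: assumed 16-bit whichever registers are used
    # Any other op: 16-bit only if all operands are low registers.
    return 2 if all(r in LOW_REGS for r in regs) else 4


def code_size(insns):
    """Sum the assumed sizes of a list of (op, regs) instructions."""
    return sum(insn_size(op, regs) for op, regs in insns)


def sequence(a_reg, b_reg):
    """Hypothetical code: value A feeds three ALU ops, value B is only copied."""
    return [
        ("eor", (a_reg, a_reg, "r1")),
        ("and", (a_reg, a_reg, "r2")),
        ("orr", (a_reg, a_reg, "r3")),
        ("mov", ("r0", b_reg)),
    ]


# Allocation resembling the regression: the busy value sits in the expensive
# register, and the cheap register only holds a value that is moved elsewhere.
poor = code_size(sequence(a_reg="r8", b_reg="r7"))
# Swapped allocation: the busy value gets the cheap register.
good = code_size(sequence(a_reg="r7", b_reg="r8"))

print(poor, good)
```

Under these assumptions, swapping the two assignments shrinks the sequence, because the move costs the same either way while every other use of the expensive register pays for a wider encoding.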
[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #6 from Segher Boessenkool ---
(In reply to Wilco from comment #5)
> (In reply to Segher Boessenkool from comment #4)
> > That is code *size*.  Code size is expected to grow a tiny bit, because of
> > *better* register allocation.
> >
> > But we could not do make_more_copies at -Os, if that helps?  (The hard
> > register changes themselves are required for correctness).
>
> Better register allocation implies lower average codesize due to fewer
> spills, fewer callee-saves, fewer moves etc.

That depends on the case.  And we are dealing with a quite specialised case
here.

> I still don't understand what specific problem make_more_copies is trying to
> solve. Is it trying to do life-range splitting of argument registers?

Nope.  It is simply that before the hard-reg change we very often combined
the argument register moves with other insns into something of a different
form than those other insns, importantly when we can do this because we know
how those values are extended, etc.

make_more_copies simply inserts another reg-reg move so that that new move
can do this instead, since we no longer combine the hard register move.

Without this we get a lot of actual code quality regressions.
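As a rough illustration of the transformation described above, the following is a toy model only and not the GCC implementation. It assumes instructions are plain `(dest, op, srcs)` tuples, that `r0`-`r3` stand in for hard argument registers, and that fresh pseudos are named `tmpN`; the Python function below merely borrows the name `make_more_copies` for readability.

```python
from itertools import count

# Assumed hard (argument) registers for this toy model.
HARD_REGS = {"r0", "r1", "r2", "r3"}


def make_more_copies(insns):
    """Split each hard-reg -> pseudo move into two moves via a fresh pseudo.

    The first move still reads the hard register. The second is an ordinary
    pseudo-to-pseudo move, which a later combining step could merge with
    the instructions that use its result.
    """
    fresh = (f"tmp{i}" for i in count())
    out = []
    for dest, op, srcs in insns:
        if op == "mov" and srcs[0] in HARD_REGS and dest not in HARD_REGS:
            tmp = next(fresh)
            out.append((tmp, "mov", (srcs[0],)))  # hard-register move, kept as is
            out.append((dest, "mov", (tmp,)))     # extra reg-reg move
        else:
            out.append((dest, op, srcs))
    return out


before = [
    ("p1", "mov", ("r0",)),    # copy incoming argument into a pseudo
    ("p2", "zext", ("p1",)),   # hypothetical extension of that value
]
after = make_more_copies(before)
for insn in after:
    print(insn)
```

In this sketch the extra move is pure overhead unless a later pass folds it into its user, which is consistent with the expectation above that code size may grow slightly.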
[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

Wilco changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilco at gcc dot gnu.org

--- Comment #5 from Wilco ---
(In reply to Segher Boessenkool from comment #4)
> That is code *size*.  Code size is expected to grow a tiny bit, because of
> *better* register allocation.
>
> But we could not do make_more_copies at -Os, if that helps?  (The hard
> register changes themselves are required for correctness).

Better register allocation implies lower average codesize due to fewer spills,
fewer callee-saves, fewer moves etc.

I still don't understand what specific problem make_more_copies is trying to
solve. Is it trying to do life-range splitting of argument registers?
[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #4 from Segher Boessenkool ---
That is code *size*.  Code size is expected to grow a tiny bit, because of
*better* register allocation.

But we could not do make_more_copies at -Os, if that helps?  (The hard
register changes themselves are required for correctness).
[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #3 from Richard Earnshaw ---
(In reply to Segher Boessenkool from comment #2)
> What difference is there on some code of significant size?  Do you see
> regressions then?
>
> Of course there are some tiny examples where it now does worse, just like
> there are examples where it now does better.

Across the entirety of CSiBE thumb2 regresses by 0.05% (tested by effectively
disabling r265398 on tip of tree).

It seems to be specific to Thumb2 code, though.  Thumb1 and Arm code now get
worse when that specific patch is disabled.  Though all three are still worse
than gcc-8 overall.