[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2020-03-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|9.3 |9.4

--- Comment #11 from Jakub Jelinek  ---
GCC 9.3.0 has been released, adjusting target milestone.

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-08-12 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|9.2 |9.3

--- Comment #10 from Jakub Jelinek  ---
GCC 9.2 has been released.

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-05-03 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

Jakub Jelinek  changed:

   What|Removed |Added

   Target Milestone|9.0 |9.2

--- Comment #9 from Jakub Jelinek  ---
GCC 9.1 has been released.

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-04-29 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #8 from Richard Earnshaw  ---
(In reply to Richard Earnshaw from comment #7)
> (In reply to Segher Boessenkool from comment #4)
> > That is code *size*.  Code size is expected to grow a tiny bit, because of
> > *better* register allocation.
> > 
> > But we could not do make_more_copies at -Os, if that helps?  (The hard register
> > changes themselves are required for correctness).
> 
> In this case, however, we get *worse* register allocation, since it is using
> the expensive register more frequently than a cheaper register which is
> hardly used at all.
> 
> In this particular case, all the uses of the "cheap" register (r7) could use
> the 'expensive' register at no additional cost, since the cheap register is
> being used only to hold a value that will be moved to another register (a
> cheap operation regardless of the register used).

FTR, I don't think the combine changes are directly implicated in this
regression.  They just expose a latent issue with register allocation and its
costing.

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-04-29 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #7 from Richard Earnshaw  ---
(In reply to Segher Boessenkool from comment #4)
> That is code *size*.  Code size is expected to grow a tiny bit, because of
> *better* register allocation.
> 
> But we could not do make_more_copies at -Os, if that helps?  (The hard register
> changes themselves are required for correctness).

In this case, however, we get *worse* register allocation, since it is using
the expensive register more frequently than a cheaper register which is
hardly used at all.

In this particular case, all the uses of the "cheap" register (r7) could use
the 'expensive' register at no additional cost, since the cheap register is
being used only to hold a value that will be moved to another register (a cheap
operation regardless of the register used).
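
The costing argument above can be illustrated with a toy size model. This is
only a sketch under simplifying assumptions: it treats a register-to-register
move as 2 bytes whatever registers it uses, and any other operation as 2 bytes
only when all of its operands are low registers (r0-r7), else 4 bytes. The
instruction sequence, the value names and the two assignments are hypothetical;
they are not taken from the test case in this bug, and real Thumb2 encodings
have more exceptions than this.

```python
# Toy Thumb2-like size model (simplified assumption, not GCC's cost model).
LOW_REGS = {f"r{i}" for i in range(8)}  # r0-r7


def insn_size(op, regs):
    """Return an assumed encoding size in bytes for one instruction."""
    if op == "mov":
        # Assumed: reg-reg moves are cheap regardless of the registers used.
        return 2
    # Assumed: other ops need a 4-byte encoding if any operand is a high register.
    return 2 if all(r in LOW_REGS for r in regs) else 4


def code_size(insns, assignment):
    """Sum the sizes of insns after mapping each value to its register."""
    return sum(
        insn_size(op, [assignment[v] for v in operands])
        for op, operands in insns
    )


# Hypothetical sequence: "t" is only ever moved, "a" is used by three shifts.
insns = [
    ("mov", ["t", "x"]),
    ("lsl", ["a", "a", "y"]),
    ("lsl", ["a", "a", "y"]),
    ("lsl", ["a", "a", "y"]),
    ("mov", ["z", "t"]),
]
fixed = {"x": "r0", "y": "r1", "z": "r2"}

# Heavily used value in the high register, move-only value in r7.
worse = {**fixed, "a": "r8", "t": "r7"}
# Swapped: the move-only value takes the high register at no extra cost.
better = {**fixed, "a": "r7", "t": "r8"}

print(code_size(insns, worse), code_size(insns, better))
```

In this model, swapping the two assignments shrinks the sequence because the
moves cost the same either way while the shifts do not.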

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-04-29 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #6 from Segher Boessenkool  ---
(In reply to Wilco from comment #5)
> (In reply to Segher Boessenkool from comment #4)
> > That is code *size*.  Code size is expected to grow a tiny bit, because of
> > *better* register allocation.
> > 
> > But we could not do make_more_copies at -Os, if that helps?  (The hard register
> > changes themselves are required for correctness).
> 
> Better register allocation implies lower average codesize due to fewer
> spills, fewer callee-saves, fewer moves etc.

That depends on the case.  And we are dealing with a quite specialised case
here.

> I still don't understand what specific problem make_more_copies is trying to
> solve. Is it trying to do life-range splitting of argument registers?

Nope.  It is simply that before the hard-reg change we very often combined the
argument register moves with other insns into something of a different form than
those other insns, importantly when we can do this because we know how those
values are extended, etc.  make_more_copies simply inserts another reg-reg move
so that this new move can be combined instead, since we no longer combine the
hard register move.  Without this we get a lot of actual code quality regressions.
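
As a rough illustration of the transformation described above (a toy model,
not the actual combine.c code), the sketch below rewrites each move from a hard
register into a pseudo as two moves through a fresh pseudo. The tuple
representation, the register names, the `HARD_REGS` set and the
`first_new_pseudo` parameter are all assumptions made for this example.

```python
from itertools import count

# Assumed set of hard (argument) registers for this toy model.
HARD_REGS = {"r0", "r1", "r2", "r3"}


def make_more_copies(insns, first_new_pseudo=1000):
    """Split each hard-reg -> pseudo move into two moves via a new pseudo.

    insns is a list of (op, dest, src) tuples.  After the rewrite, the
    second move is a plain pseudo-to-pseudo copy, which a later combine
    step would be free to merge with its users; the first move keeps the
    hard register isolated.
    """
    fresh = count(first_new_pseudo)
    out = []
    for op, dest, src in insns:
        if op == "mov" and src in HARD_REGS and dest not in HARD_REGS:
            tmp = f"p{next(fresh)}"
            out.append(("mov", tmp, src))   # hard-register move, left alone
            out.append(("mov", dest, tmp))  # new reg-reg move, combinable
        else:
            out.append((op, dest, src))
    return out


before = [("mov", "p100", "r0"), ("zext", "p101", "p100")]
after = make_more_copies(before)
for insn in after:
    print(insn)
```

Here the `zext` stands in for any insn that previously would have been combined
with the argument register move.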

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-04-29 Thread wilco at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

Wilco  changed:

   What|Removed |Added

 CC||wilco at gcc dot gnu.org

--- Comment #5 from Wilco  ---
(In reply to Segher Boessenkool from comment #4)
> That is code *size*.  Code size is expected to grow a tiny bit, because of
> *better* register allocation.
> 
> But we could not do make_more_copies at -Os, if that helps?  (The hard register
> changes themselves are required for correctness).

Better register allocation implies lower average codesize due to fewer spills,
fewer callee-saves, fewer moves etc.

I still don't understand what specific problem make_more_copies is trying to
solve. Is it trying to do life-range splitting of argument registers?

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-04-29 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #4 from Segher Boessenkool  ---
That is code *size*.  Code size is expected to grow a tiny bit, because of
*better* register allocation.

But we could not do make_more_copies at -Os, if that helps?  (The hard register
changes themselves are required for correctness).

[Bug rtl-optimization/90249] [9/10 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398

2019-04-26 Thread rearnsha at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249

--- Comment #3 from Richard Earnshaw  ---
(In reply to Segher Boessenkool from comment #2)
> What difference is there on some code of significant size?  Do you see
> regressions then?
> 
> Of course there are some tiny examples where it now does worse, just like
> there are examples where it now does better.

Across the entirety of CSiBE, Thumb2 regresses by 0.05% (tested by effectively
disabling r265398 on tip of tree).

It seems to be specific to Thumb2 code, though.  Thumb1 and Arm code now get
worse when that specific patch is disabled.  Though all three are still worse
than gcc-8 overall.