[Bug tree-optimization/80155] [7/8 regression] Performance regression with code hoisting enabled

2017-10-11 Thread prathamesh3492 at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80155

--- Comment #33 from prathamesh3492 at gcc dot gnu.org ---
Created attachment 42341
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42341&action=edit
Test-case to reproduce regression with cortex-m7

I have attached an artificial test-case that is fairly representative of the
regression we are seeing in a benchmark. The test-case mimics a deterministic
finite automaton. With code-hoisting there's an additional spill of r5 near
the beginning of the function.

Looking at the loop from the attached test-case:
for (; *a && b != 'z'; a++)
  {
    next = *a;
    if (next == ',')
      {
        a++;
        break;
      }
    switch (b) { ... }
  }

The for loop has the same computation, a++, in two sibling basic blocks,
and that computation gets hoisted.
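For readers without the attachment, the shape of such a loop can be sketched
as below. This is a hypothetical reconstruction, not the attached test-case:
the function name `scan`, the start state, and the body of the switch are
assumptions. Only the loop header, the early exit with a++, and the final
`*s = a; return b;` are taken from this comment.

```c
/* Hypothetical reduction of the loop shape described in this comment.
   The name scan, the start state 'a' and the switch body are assumptions. */
char scan(const char **s, const char *tab)
{
  const char *a = *s;
  char b = 'a';                 /* assumed start state */
  int next;

  for (; *a && b != 'z'; a++)   /* loop increment: one copy of a + 1 */
    {
      next = *a;
      if (next == ',')
        {
          a++;                  /* early exit: the other copy of a + 1 */
          break;
        }
      switch (b)
        {
        case 'a':
          b = tab[next & 3];    /* assumed transition: tab has 4 entries */
          break;
        default:
          b = 'z';
          break;
        }
    }

  *s = a;                       /* 'a' is still needed after the loop */
  return b;
}
```

The two increments sit in blocks that are both successors of the block
testing `next == ','`, which is what makes `a + 1` a hoisting candidate.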

From PRE dump with code-hoisting:
  <bb 26> [23.80%] [count: INV]:
  # _25 = PHI <_151(25), _23(2)>
  # b_50 = PHI 
  # a_55 = PHI 
  next_29 = (int) _25;
  _44 = a_55 + 1;
  if (next_29 == 44)
    goto ; [5.00%] [count: INV]
  else
    goto ; [95.00%] [count: INV]

(a+1) seems to get hoisted in bb26:
_44 = a_55 + 1
just before
if (next_29 == 44) which corresponds to if (next == ',') condition.
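At the source level, the effect of the hoist is roughly as if the loop had
been written as in the sketch below. This is a hand-written illustration, not
compiler output: the function name `scan_hoisted`, the start state and the
switch body are assumptions, and `a1` merely stands in for `_44`.

```c
/* Hand-hoisted illustration; not generated by GCC.
   scan_hoisted, the start state and the switch body are assumptions. */
char scan_hoisted(const char **s, const char *tab)
{
  const char *a = *s;
  char b = 'a';                   /* assumed start state */

  while (*a && b != 'z')
    {
      int next = *a;
      const char *a1 = a + 1;     /* plays the role of _44 = a_55 + 1 */
      if (next == ',')
        {
          a = a1;                 /* early exit reuses the hoisted value */
          break;
        }
      switch (b)
        {
        case 'a':
          b = tab[next & 3];      /* assumed transition: tab has 4 entries */
          break;
        default:
          b = 'z';
          break;
        }
      a = a1;                     /* loop increment reuses it as well */
    }

  *s = a;
  return b;
}
```

Written this way, both `a` and `a1` have to be kept available from the test
of `next` until one of the two assignments, so the hoisted value occupies a
register of its own across the switch.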

The issue, I think, is that there is a use of 'a' near the end of the function:
*s = a;
which possibly results in register pressure forcing the compiler to spill r5.
Commenting out the assignment removes the spill.

Looking at register allocation with code-hoisting, it seems r2 is used
to hold the hoisted value (a + 1):

r0 = s
r1 = tab
r3 = a
r4 = b
r5 = *a
r2 = r3 + 1 (holding the hoisted value)

And without code-hoisting, it seems only r3 is assigned to 'a'.
r0 = s
r1 = tab
r2 = b
r3 = a
r4 = *a


This is evident from asm differences for the early-exit code-path:
if (next == ',')
  {
    a++;
    break;
  }

:
  *s = a;
  return b;


Without code-hoisting:
.L2:
        cmp     r4, #44
        beq     .L4

.L4:
        adds    r3, r3, #1
        ldr     r4, [sp], #4
        str     r3, [r0]
        mov     r0, r2
        bx      lr

With code-hoisting:
.L2:
        cmp     r5, #44
        add     r2, r3, #1
        beq     .L3

.L3:
        str     r2, [r0]
        mov     r0, r4
        pop     {r4, r5}
        bx      lr

Without code-hoisting it is reusing r3 to store a + 1, while due to code
hoisting it uses the extra register 'r2' to store the value of hoisted
expression a + 1.

Would it be a good idea to somehow "limit" the distance (in terms of number of
basic blocks maybe?) between the definition of a hoisted variable and its
furthest use during PRE? If that distance exceeds a certain threshold, PRE
should choose not to hoist that expression. The threshold could be a param
that can be set by backends.
Does this analysis look reasonable?
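As a rough illustration of the kind of check proposed here, a distance test
could look like the sketch below. Everything in it is hypothetical: the
constant `MAX_HOIST_DISTANCE` is not an existing GCC param, the function names
are made up, and measuring distance as a difference of block indices in some
linear order is a simplifying assumption (a real pass would need a CFG-aware
notion of distance).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical threshold; stands in for a backend-settable param. */
#define MAX_HOIST_DISTANCE 4

/* Distance, in block indices, from the hoisted definition to its furthest
   use.  Treating index difference as distance is an assumption. */
static int furthest_use_distance(int def_block, const int *use_blocks,
                                 size_t n_uses)
{
  int max = 0;
  for (size_t i = 0; i < n_uses; i++)
    {
      int d = use_blocks[i] - def_block;
      if (d < 0)
        d = -d;
      if (d > max)
        max = d;
    }
  return max;
}

/* Allow the hoist only if every use stays within the threshold. */
bool should_hoist(int def_block, const int *use_blocks, size_t n_uses)
{
  return furthest_use_distance(def_block, use_blocks, n_uses)
         <= MAX_HOIST_DISTANCE;
}
```

With a definition in block 26 and uses in blocks 27 and 28 the check accepts
the hoist; a use in block 40 would reject it.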

Thanks,
Prathamesh

2017-10-04 Thread thopre01 at gcc dot gnu.org

--- Comment #32 from Thomas Preud'homme ---
(In reply to rguent...@suse.de from comment #31)
> Neither am I planning to look at this soon nor do I have a good idea
> how to approach this bug.
> 
> My ideas were to compute register pressure & update it during elimination
> and thus avoid adding uses that increase pressure over some point.  While
> that might mitigate the issue it isn't in any way applying a cost model
> to individual inserts.  [nor is computing/updating register pressure easy]

Hi,

Looking at the testcase I attached to this ticket, I'm regrettably not so sure
it is representative of the issue we were facing, which resulted from too
much register pressure. With so few variables, this is probably hitting some
other bug. I'll try and come up with a better reduced testcase.

Best regards.

2017-10-04 Thread rguenther at suse dot de

--- Comment #31 from rguenther at suse dot de ---
On Wed, 4 Oct 2017, prathamesh3492 at gcc dot gnu.org wrote:

> --- Comment #30 from prathamesh3492 at gcc dot gnu.org ---
> Hi Richard,
> I tried your patch in comment #9 with the fix in comment #13 but since
> tree-ssa-pre.c appears to be refactored, the fix doesn't apply anymore and ICE
> resurfaces. Could you guide me what fix I should apply to reproduce the
> regression ? IIUC the issue here is that code-hoisting is increasing register
> pressure thus causing the extra spill ? And GIMPLE does not seem to have cost
> model for register allocation.
> 
> Are you planning to take a look at this PR soon ? If not I would like to give a
> try and would be grateful for suggestions on how to approach this bug.
> Thanks!

Neither am I planning to look at this soon nor do I have a good idea
how to approach this bug.

My ideas were to compute register pressure & update it during elimination
and thus avoid adding uses that increase pressure over some point.  While
that might mitigate the issue it isn't in any way applying a cost model
to individual inserts.  [nor is computing/updating register pressure easy]
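To make the idea above concrete, here is a toy sketch of a pressure check
before an insert. It is only an illustration under strong assumptions: live
ranges are modelled as inclusive intervals over linear program points, and
the names `live_range`, `pressure_at` and `insert_exceeds_limit` are invented
here and do not correspond to anything in tree-ssa-pre.c. It also leaves out
exactly the hard part mentioned above, namely keeping the pressure
information up to date during elimination.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model: a value is live on the inclusive interval [start, end]
   of linear program points.  This representation is an assumption. */
struct live_range
{
  int start;
  int end;
};

/* Number of values live at a given program point. */
static int pressure_at(const struct live_range *r, size_t n, int point)
{
  int live = 0;
  for (size_t i = 0; i < n; i++)
    if (r[i].start <= point && point <= r[i].end)
      live++;
  return live;
}

/* Would adding the candidate range push pressure above 'limit' at any
   point it covers?  */
bool insert_exceeds_limit(const struct live_range *r, size_t n,
                          struct live_range cand, int limit)
{
  for (int p = cand.start; p <= cand.end; p++)
    if (pressure_at(r, n, p) + 1 > limit)
      return true;
  return false;
}
```

A check like this only caps pressure at some limit; as noted above, it does
not weigh the cost or benefit of an individual insert.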

2017-10-04 Thread prathamesh3492 at gcc dot gnu.org

prathamesh3492 at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |prathamesh3492 at gcc dot gnu.org

--- Comment #30 from prathamesh3492 at gcc dot gnu.org ---
Hi Richard,
I tried your patch in comment #9 with the fix in comment #13, but since
tree-ssa-pre.c appears to have been refactored, the fix doesn't apply anymore
and the ICE resurfaces. Could you guide me on what fix I should apply to
reproduce the regression? IIUC the issue here is that code-hoisting is
increasing register pressure, thus causing the extra spill? And GIMPLE does
not seem to have a cost model for register allocation.

Are you planning to take a look at this PR soon? If not, I would like to give
it a try and would be grateful for suggestions on how to approach this bug.
Thanks!

2017-04-19 Thread law at redhat dot com

Jeffrey A. Law changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|7.0                         |8.0
            Summary|[7 regression] Performance  |[7/8 regression]
                   |regression with code        |Performance regression with
                   |hoisting enabled            |code hoisting enabled

--- Comment #29 from Jeffrey A. Law ---
Based on c#27, pushing out to gcc-8.