[Bug tree-optimization/50557] [4.7 Regression] Register pressure increase after reassociation (x86, 32 bits)

2011-10-10 Thread wschmidt at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50557

--- Comment #7 from William J. Schmidt wschmidt at gcc dot gnu.org 2011-10-10 
12:40:01 UTC ---
I don't have anything too helpful to add.  This code as it stands is balanced
on a knife's edge for register usage for the particular target, so it's always
going to be sensitive to compiler changes (not just this one).

One thing I notice is that the loop is hand-unrolled four times.  Why not let
the compiler intelligently choose the unroll factor?  I don't know what the
result would be, but presumably the unroller has some heuristics to take target
characteristics into account.  Seems to me the factor of 4 is a bit aggressive
for this target.
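
As a rough illustration of the difference (this is not the attached test
case; the function names and the loop body are made up for the example), the
rolled form below leaves the unroll factor to the compiler, while the second
form fixes it at four by hand and assumes n is a multiple of 4:

```c
#include <stddef.h>

/* Rolled form: the compiler may choose an unroll factor itself,
   for example when -funroll-loops is given. */
unsigned sum_rolled(const unsigned *a, const unsigned *b, size_t n)
{
    unsigned t = 0;
    for (size_t i = 0; i < n; i++)
        t += a[i] * b[i];
    return t;
}

/* Hand-unrolled by four: four groups of statements that differ only
   in the array index.  Assumes n is a multiple of 4 (hypothetical
   simplification for this sketch). */
unsigned sum_unrolled4(const unsigned *a, const unsigned *b, size_t n)
{
    unsigned t = 0;
    for (size_t i = 0; i < n; i += 4) {
        t += a[i]     * b[i];
        t += a[i + 1] * b[i + 1];
        t += a[i + 2] * b[i + 2];
        t += a[i + 3] * b[i + 3];
    }
    return t;
}
```

Both functions compute the same value; the hand-unrolled one simply keeps
more values in flight per iteration, which matters on a target with few
registers.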


2011-10-07 Thread izamyatin at gmail dot com

--- Comment #6 from Igor Zamyatin izamyatin at gmail dot com 2011-10-07 
10:33:33 UTC ---
Indeed, overall register pressure is not increased. Even the dumps from before
IRA show that register pressure stays at the same level.

It looks like we have hit a tricky case.

First, we can see that the loop consists of four identical groups of
instructions. The only difference is the index value used by the arrays in each
group. Before the reassociation improvement, the group located on lines 30-33
of the attached test for some reason (I haven't checked this yet) got a
different sequence of instructions than the others. After William's
reassociation changes all groups got a similar sequence. (Maybe there was some
good reason for that group to be different? :) )

Now the tricky part.
In the fast version (i.e. before William's commit), for the group on lines
30-33 IRA managed to hold c in the eax register. Moreover, because of the
shorter live range of c, IRA managed to reuse eax inside the operations of
line 30. For the other groups all the work was done through memory.
Since the reassociation improvement made all groups look the same, we
unsurprisingly got memory instead of registers, which led to the performance
drop.

That is roughly how I see the whole picture. Any comments or ideas?


2011-09-30 Thread wschmidt at gcc dot gnu.org

--- Comment #5 from William J. Schmidt wschmidt at gcc dot gnu.org 2011-09-30 
14:30:56 UTC ---
Reassociation isn't doing anything untoward here that raises register pressure.
 The problem must be occurring downstream.  Likely the scheduler is making a
different decision that leads to more pressure.

Block 9 contains the following prior to reassociation:

  D.3497_48 = D.3496_47 + D.3475_117;
  t_50 = D.3497_48 + D.3493_44;

Reassociation changes this to:

  D.3497_48 = D.3493_44 + D.3496_47;
  t_50 = D.3497_48 + D.3475_117;

This extends the lifetime of D.3475_117 but shortens the lifetime of D.3493_44,
both of which go dead here.  Register pressure is not raised at any point. 
There are no further changes to this code in the rest of the tree-SSA phases.
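
The same two orderings can be written out in C as a small sketch. The
parameter names are hypothetical stand-ins for the SSA names above
(x for D.3496_47, y for D.3475_117, z for D.3493_44), and unsigned is assumed
only so that the regrouping is well defined; the actual types in the test case
are not shown here.

```c
/* Order before reassociation.
   Live before the first add: x, y, z.  Live after it: s, z. */
unsigned before_reassoc(unsigned x, unsigned y, unsigned z)
{
    unsigned s = x + y;   /* D.3497_48 = D.3496_47 + D.3475_117 */
    return s + z;         /* t_50 = D.3497_48 + D.3493_44 */
}

/* Order after reassociation.
   Live before the first add: x, y, z.  Live after it: s, y. */
unsigned after_reassoc(unsigned x, unsigned y, unsigned z)
{
    unsigned s = z + x;   /* D.3497_48 = D.3493_44 + D.3496_47 */
    return s + y;         /* t_50 = D.3497_48 + D.3475_117 */
}
```

Assuming x has no later uses, each version has three values live before the
first addition and two after it, so only which operand stays live changes, not
how many.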

Based on this, I don't see any reason to adjust the reassociation algorithm. 
Someone with some expertise in RTL phases could look into what happens later on
to cause the additional pressure, but I don't see this as a tree-optimization
issue.


2011-09-29 Thread izamyatin at gmail dot com

--- Comment #3 from Igor Zamyatin izamyatin at gmail dot com 2011-09-29 
08:34:45 UTC ---
William, thanks for the quick response!

With -funroll-loops the regression is still present.
Do you want me to attach some dumps?


2011-09-29 Thread wschmidt at gcc dot gnu.org

--- Comment #4 from William J. Schmidt wschmidt at gcc dot gnu.org 2011-09-29 
12:16:46 UTC ---
No, that's OK.  I should be able to reproduce this on a pool machine.

It may be difficult to come up with a good heuristic here given that
reassociation doesn't have a good estimate of register pressure available.  The
fix solved a couple of other problem reports in addition to 49749, so we need
to be careful about constraining it too much.

All this is just to say I may not have something for you right away. :)


2011-09-28 Thread rguenth at gcc dot gnu.org

Richard Guenther rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wschmidt at gcc dot gnu.org
   Target Milestone|---                         |4.7.0


2011-09-28 Thread izamyatin at gmail dot com

--- Comment #1 from Igor Zamyatin izamyatin at gmail dot com 2011-09-28 
11:52:18 UTC ---
Created attachment 25373
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25373
testcase


2011-09-28 Thread wschmidt at gcc dot gnu.org

--- Comment #2 from William J. Schmidt wschmidt at gcc dot gnu.org 2011-09-28 
12:13:50 UTC ---
The fix for 49749 is intended to remove dependencies between loop iterations. 
One possibility would be to condition the changes on the presence of
-funroll-loops.  Another would be to limit the changes to loops that contain
fewer blocks or that otherwise have simpler control flow.
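
For readers unfamiliar with the term, a dependency between loop iterations can
be sketched as follows. This is a generic illustration with made-up function
names, not code taken from the fix for 49749 or from the test case.

```c
#include <stddef.h>

/* Accumulator added first: both additions in an iteration wait on the
   value of acc produced by the previous iteration. */
unsigned acc_first(const unsigned *a, const unsigned *b, size_t n)
{
    unsigned acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = (acc + a[i]) + b[i];
    return acc;
}

/* Accumulator added last: a[i] + b[i] does not depend on acc, so only
   the final addition sits on the chain carried from one iteration to
   the next. */
unsigned acc_last(const unsigned *a, const unsigned *b, size_t n)
{
    unsigned acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = (a[i] + b[i]) + acc;
    return acc;
}
```

The two functions return the same value; they differ only in how much of each
iteration's work has to wait for the previous iteration to finish.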

To help make a good decision here, can you please try your test case with
-funroll-loops before and after the fix for 49749?