[Bug tree-optimization/50557] [4.7 Regression] Register pressure increase after reassociation (x86, 32 bits)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50557 --- Comment #7 from William J. Schmidt wschmidt at gcc dot gnu.org 2011-10-10 12:40:01 UTC --- I don't have anything too helpful to add. This code as it stands is balanced on a knife's edge for register usage for the particular target, so it's always going to be sensitive to compiler changes (not just this one). One thing I notice is that the loop is hand-unrolled four times. Why not let the compiler intelligently choose the unroll factor? I don't know what the result would be, but presumably the unroller has some heuristics to take target characteristics into account. Seems to me the factor of 4 is a bit aggressive for this target.
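To illustrate the suggestion in comment #7, here is a minimal sketch contrasting a rolled loop with a version hand-unrolled four times. It is not the attached testcase: the function names, the array `x`, and the loop body are all hypothetical. The point is only that the hand-unrolled form keeps four sets of temporaries live at once, whereas the rolled form leaves the unroll factor to the compiler (for example via -funroll-loops).

```c
#include <assert.h>
#include <stddef.h>

/* Rolled form: the compiler is free to choose an unroll factor,
   e.g. when built with -funroll-loops. Hypothetical loop body. */
unsigned sum_rolled(const unsigned *x, size_t n)
{
    unsigned s = 0;
    for (size_t i = 0; i < n; i++)
        s += x[i] * 3u + 1u;
    return s;
}

/* Hand-unrolled by four: four copies of the same body, differing only
   in the array index, so four temporaries are live at the same time. */
unsigned sum_unrolled4(const unsigned *x, size_t n)
{
    unsigned s = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        unsigned t0 = x[i]     * 3u + 1u;
        unsigned t1 = x[i + 1] * 3u + 1u;
        unsigned t2 = x[i + 2] * 3u + 1u;
        unsigned t3 = x[i + 3] * 3u + 1u;
        s += t0 + t1 + t2 + t3;
    }
    /* Remainder loop for the last n % 4 elements. */
    for (; i < n; i++)
        s += x[i] * 3u + 1u;
    return s;
}

/* Self-check: both forms must compute the same result. */
void check_unroll_equivalence(const unsigned *x, size_t n)
{
    assert(sum_rolled(x, n) == sum_unrolled4(x, n));
}
```

Both functions compute the same value; which one allocates better on a register-starved target such as 32-bit x86 would have to be measured.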
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50557 --- Comment #6 from Igor Zamyatin izamyatin at gmail dot com 2011-10-07 10:33:33 UTC --- Indeed, overall register pressure is not increased. Even the dumps before IRA show that register pressure actually stays at the same level. This looks like a tricky case. First, the loop consists of four identical groups of instructions; the only difference is the index value used for the arrays in each group. Before the reassociation improvement, the group on lines 30-33 of the attached test for some reason (I haven't checked this yet) got a different sequence of instructions than the others. After William's reassociation changes, all groups got a similar sequence. (Maybe there was a good reason for that group to be different? :) ) Now the tricky part. In the fast version (i.e. before William's commit), IRA managed to keep c in the eax register for the group on lines 30-33. Moreover, because of the shorter live range of c, IRA managed to reuse eax within the operations on line 30. For the other groups, all the work went through memory. Since the reassociation improvement made all groups look the same, we unsurprisingly got memory accesses instead of registers, which led to the performance drop. That is my view of the whole picture. Any comments, ideas?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50557 --- Comment #5 from William J. Schmidt wschmidt at gcc dot gnu.org 2011-09-30 14:30:56 UTC --- Reassociation isn't doing anything untoward here that raises register pressure. The problem must be occurring downstream. Likely the scheduler is making a different decision that leads to more pressure. Block 9 contains the following prior to reassociation:

  D.3497_48 = D.3496_47 + D.3475_117;
  t_50 = D.3497_48 + D.3493_44;

Reassociation changes this to:

  D.3497_48 = D.3493_44 + D.3496_47;
  t_50 = D.3497_48 + D.3475_117;

This extends the lifetime of D.3475_117 but shortens the lifetime of D.3493_44, both of which go dead here. Register pressure is not raised at any point. There are no further changes to this code in the rest of the tree-SSA phases. Based on this, I don't see any reason to adjust the reassociation algorithm. Someone with some expertise in RTL phases could look into what happens later on to cause the additional pressure, but I don't see this as a tree-optimization issue.
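The two association orders quoted in comment #5 can be written out as a small C sketch. The parameter names are hypothetical stand-ins for the SSA names in the dump (`a` for D.3496_47, `b` for D.3475_117, `c` for D.3493_44), and unsigned arithmetic is assumed so that reordering is well defined. In both versions exactly two of the three inputs are consumed by the first add and the third by the second add, which is why the count of simultaneously live values does not change.

```c
#include <assert.h>

/* Before reassociation: tmp = a + b; result = tmp + c.
   b is last used at the first add, c at the second. */
unsigned sum_before(unsigned a, unsigned b, unsigned c)
{
    unsigned tmp = a + b;
    return tmp + c;
}

/* After reassociation: tmp = c + a; result = tmp + b.
   c is now last used at the first add, b at the second. */
unsigned sum_after(unsigned a, unsigned b, unsigned c)
{
    unsigned tmp = c + a;
    return tmp + b;
}

/* Self-check: unsigned addition is associative and commutative
   (modulo 2^N), so both orders must agree. */
void check_orders_agree(unsigned a, unsigned b, unsigned c)
{
    assert(sum_before(a, b, c) == sum_after(a, b, c));
}
```

The sketch only shows that the two orders are equivalent and swap which operand dies first; it says nothing about what the later RTL passes do with either form.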
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50557 --- Comment #3 from Igor Zamyatin izamyatin at gmail dot com 2011-09-29 08:34:45 UTC --- William, thanks for the quick response! With -funroll-loops the regression is still present. Do you want me to attach some dumps?
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50557 --- Comment #4 from William J. Schmidt wschmidt at gcc dot gnu.org 2011-09-29 12:16:46 UTC --- No, that's OK. I should be able to reproduce this on a pool machine. It may be difficult to come up with a good heuristic here given that reassociation doesn't have a good estimate of register pressure available. The fix solved a couple of other problem reports in addition to 49749, so we need to be careful about constraining it too much. All this is just to say I may not have something for you right away. :)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50557 Richard Guenther rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wschmidt at gcc dot gnu.org
   Target Milestone|---                         |4.7.0
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50557 --- Comment #1 from Igor Zamyatin izamyatin at gmail dot com 2011-09-28 11:52:18 UTC --- Created attachment 25373 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=25373 testcase
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50557 --- Comment #2 from William J. Schmidt wschmidt at gcc dot gnu.org 2011-09-28 12:13:50 UTC --- The fix for 49749 is intended to remove dependencies between loop iterations. One possibility would be to condition the changes on the presence of -funroll-loops. Another would be to limit the changes to loops containing fewer blocks or otherwise measuring simpler control flow. To help make a good decision here, can you please try your test case with -funroll-loops before and after the fix for 49749?