[Bug rtl-optimization/87507] IRA unnecessarily uses non-volatile registers during register assignment

2018-11-13 Thread bergner at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87507

Peter Bergner  changed:

           What    |Removed     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED    |RESOLVED
         Resolution|---         |FIXED

--- Comment #12 from Peter Bergner  ---
Fixed.

2018-11-13 Thread bergner at gcc dot gnu.org

--- Comment #11 from Peter Bergner  ---
Author: bergner
Date: Wed Nov 14 02:17:35 2018
New Revision: 266097

URL: https://gcc.gnu.org/viewcvs?rev=266097&root=gcc&view=rev
Log:
gcc/
	PR rtl-optimization/87507
	* lower-subreg.c (operand_for_swap_move_operator): New function.
	(simple_move): Strip simple operators.
	(find_pseudo_copy): Likewise.
	(resolve_operand_for_swap_move_operator): New function.
	(resolve_simple_move): Strip simple operators and swap operands.

gcc/testsuite/
	PR rtl-optimization/87507
	* gcc.target/powerpc/pr87507.c: New test.
	* gcc.target/powerpc/pr68805.c: Update expected results.

Added:
trunk/gcc/testsuite/gcc.target/powerpc/pr87507.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/lower-subreg.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.target/powerpc/pr68805.c

2018-11-13 Thread bergner at gcc dot gnu.org

Peter Bergner  changed:

           What    |Removed / Added
----------------------------------------------------------------------------
                URL|Removed: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00887.html
                   |Added:   https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01120.html

--- Comment #10 from Peter Bergner  ---
Submitted a new patch that changes lower-subreg to decompose the problematic
register pairs into separate regs which are easier to allocate.

2018-10-30 Thread segher at gcc dot gnu.org

--- Comment #9 from Segher Boessenkool  ---
Why isn't this handled in subreg1 already?  Sorry if that is obvious :-)

2018-10-30 Thread bergner at gcc dot gnu.org

--- Comment #8 from Peter Bergner  ---
So Vlad is hesitant (probably rightly :) about accepting my patch.  Looking
closer, on BE, lower subreg2 is able to break the TImode accesses into 2 DImode
accesses, which helps tremendously.  On LE (power8), split1 runs just before
lower subreg2 and inserts swaps on the memory accesses, which confuses lower
subreg, so we keep the TImode accesses and get register pairs, which are hard
to allocate and lead to poor decisions in this particular case.

As a hack, I moved lower subreg2 before split1 and we get the code we want.  I
don't think we want to do that for real, so I will look at enhancing lower
subreg to recognize our TImode memory ops with swaps to see whether we can
still decompose them.
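
To illustrate why a swapped TImode access can still be decomposed, here is a
minimal sketch in plain C.  It only models the idea and is not RTL or code from
lower-subreg.c; the struct and function names (pair128, swap_halves,
move_with_swap, move_decomposed) are hypothetical.  A 128-bit move through a
swap of its two 64-bit halves gives the same result as two independent 64-bit
moves with the operands crossed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: a 128-bit value held as two 64-bit halves.  */
struct pair128
{
  uint64_t lo;
  uint64_t hi;
};

/* Models the swap operator: exchange the two doubleword halves.  */
struct pair128
swap_halves (struct pair128 v)
{
  struct pair128 r = { v.hi, v.lo };
  return r;
}

/* Whole-value form: one 128-bit move that goes through a swap.  The
   result has to live in a register pair.  */
struct pair128
move_with_swap (const struct pair128 *src)
{
  assert (src != 0);
  return swap_halves (*src);
}

/* Decomposed form: two independent 64-bit moves with the operands
   crossed.  Each destination can be allocated on its own.  */
void
move_decomposed (const struct pair128 *src, uint64_t *dst_lo,
                 uint64_t *dst_hi)
{
  assert (src != 0 && dst_lo != 0 && dst_hi != 0);
  *dst_lo = src->hi;
  *dst_hi = src->lo;
}
```

Both forms produce the same two halves, which is why stripping the swap and
crossing the operands is a plausible way to keep the decomposition.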

2018-10-03 Thread bergner at gcc dot gnu.org

--- Comment #7 from Peter Bergner  ---
Created attachment 44778
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44778&action=edit
Proposed patch

FYI, this is what I'm testing.

2018-10-03 Thread bergner at gcc dot gnu.org

--- Comment #6 from Peter Bergner  ---
(In reply to Peter Bergner from comment #2)
> Seems to work fine on BE, so probably a LE only issue.

So this is a generic problem, not related to LE.  I can get this to fail on BE
as well if I strategically remove some hard regs by using -ffixed-r6
-ffixed-r9 -ffixed-r11.

2018-10-03 Thread bergner at gcc dot gnu.org

Peter Bergner  changed:

           What    |Removed     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed|            |2018-10-04
                 CC|            |law at gcc dot gnu.org,
                   |            |vmakarov at gcc dot gnu.org
          Component|target      |rtl-optimization
     Ever confirmed|0           |1

--- Comment #5 from Peter Bergner  ---
The other problem is that we don't add in the cost of saving and restoring
non-volatile registers in the prologue and epilogue if HONOR_REG_ALLOC_ORDER is
set (it is for rs6000):

  if (!HONOR_REG_ALLOC_ORDER)
    {
      if ((saved_nregs = calculate_saved_nregs (hard_regno, mode)) != 0)
        /* We need to save/restore the hard register in
           epilogue/prologue.  Therefore we increase the cost.  */
        {
          rclass = REGNO_REG_CLASS (hard_regno);
          add_cost = ((ira_memory_move_cost[mode][rclass][0]
                       + ira_memory_move_cost[mode][rclass][1])
                      * saved_nregs / hard_regno_nregs (hard_regno,
                                                        mode) - 1);
          cost += add_cost;
          full_cost += add_cost;
        }
    }

The code in Comment 4 is what gives reg pair r7,r8 a cost of 1000, and since we
don't charge non-volatiles for the prologue/epilogue save/restore, reg pair
r30,r31 gets a cost of zero, so assign_hard_reg picks r30,r31 and we get the
unnecessary non-volatile reg usage we're seeing.
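
The effect of those two costs can be shown with a minimal sketch.  This is not
the logic of assign_hard_reg, only a lowest-cost selection over candidates; the
struct candidate type and pick_cheapest function are hypothetical names
introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical candidate: first hard register of the pair and the
   cost the allocator has computed for it.  */
struct candidate
{
  int first_regno;
  int cost;
};

/* Return the index of the cheapest candidate.  Earlier entries win
   ties, standing in for a preferred allocation order.  */
size_t
pick_cheapest (const struct candidate *c, size_t n)
{
  assert (c != NULL && n > 0);
  size_t best = 0;
  for (size_t i = 1; i < n; i++)
    if (c[i].cost < c[best].cost)
      best = i;
  return best;
}
```

With the costs quoted above (1000 for the pair starting at r7, zero for the
pair starting at r30), the non-volatile pair wins even though it is later in
the order, because nothing has been charged for saving and restoring it.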

It seems to me that we should always incorporate the save/restore cost into
non-volatiles.  However, even if we enable the code above for targets that set
HONOR_REG_ALLOC_ORDER, there is still a bug in the additional cost it computes,
because it doesn't take into consideration the basic block frequency assigned
to the prologue and epilogue blocks.  The additional cost above really should
be multiplied by the prologue/epilogue frequency.
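
A minimal sketch of that scaling, assuming a simplified stand-in for the cost
inputs.  None of the names below (save_cost_inputs, unscaled_save_cost,
scaled_save_cost) are GCC's, and the patch under test may compute this
differently; the unscaled formula simply mirrors the snippet quoted above.

```c
#include <assert.h>

/* Hypothetical inputs to the save/restore cost of one hard register
   (or register group).  */
struct save_cost_inputs
{
  int store_cost;     /* cost of one register-to-memory move */
  int load_cost;      /* cost of one memory-to-register move */
  int saved_nregs;    /* registers that would newly need saving */
  int total_nregs;    /* hard registers the value occupies */
  int prologue_freq;  /* frequency of the prologue/epilogue blocks */
};

/* Same shape as the quoted code: memory move costs in both directions,
   prorated by the share of registers that need saving.  */
int
unscaled_save_cost (const struct save_cost_inputs *in)
{
  assert (in != 0 && in->total_nregs > 0);
  return ((in->store_cost + in->load_cost)
          * in->saved_nregs / in->total_nregs - 1);
}

/* The suggested correction: weight the save/restore cost by how often
   the prologue and epilogue execute.  */
int
scaled_save_cost (const struct save_cost_inputs *in)
{
  return unscaled_save_cost (in) * in->prologue_freq;
}
```

For example, with made-up move costs of 4 each, two registers to save out of
two, and an assumed frequency of 1000, the unscaled cost is 7 and the scaled
cost is 7000.  Only the scaled value is on the same footing as costs that were
themselves weighted by block frequency.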

The above analysis means this is no longer a target bug, but a bug in the cost
computation in IRA; therefore I am resetting the Component and adding Vlad and
Jeff to the CC for their awareness.

I have a patch to fix the above that I am testing.