The patch at https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01872.html was
meant to scale down the register limit used by -fsched-pressure so that, when
the block in question executes as frequently as the entry block, the limit is
just the call_clobbered (i.e. call_used) regs. But the code subtracts the
call_used count, so it actually scales the limit toward the call_saved
registers. The following patch corrects that by counting the call_saved regs in
each pressure class and subtracting a portion of that count scaled by
entry_freq / bb_freq.
Bootstrapped and regtested on powerpc64le with no new failures. Ok for trunk?
-Pat
2016-10-07 Pat Haugen <[email protected]>
* haifa-sched.c (call_used_regs_num): Rename to...
(call_saved_regs_num): ...this.
(sched_pressure_start_bb): Scale call_saved regs not call_used.
(alloc_global_sched_pressure_data): Compute call_saved regs.
Index: gcc/haifa-sched.c
===================================================================
--- gcc/haifa-sched.c (revision 240812)
+++ gcc/haifa-sched.c (working copy)
@@ -932,9 +932,9 @@ static bitmap region_ref_regs;
/* Effective number of available registers of a given class (see comment
in sched_pressure_start_bb). */
static int sched_class_regs_num[N_REG_CLASSES];
-/* Number of call_used_regs. This is a helper for calculating of
+/* Number of call_saved_regs. This is a helper for calculating of
sched_class_regs_num. */
-static int call_used_regs_num[N_REG_CLASSES];
+static int call_saved_regs_num[N_REG_CLASSES];
/* Initiate register pressure relative info for scheduling the current
region. Currently it is only clearing register mentioned in the
@@ -3900,13 +3900,13 @@ sched_pressure_start_bb (basic_block bb)
* If the basic block executes as often as the prologue/epilogue,
then spill in the block is as costly as in the prologue, so the effective
number of available registers is
- (ira_class_hard_regs_num[cl] - call_used_regs_num[cl]).
+ (ira_class_hard_regs_num[cl] - call_saved_regs_num[cl]).
Note that all-else-equal, we prefer to spill in the prologue, since that
allows "extra" registers for other basic blocks of the function.
* If the basic block is on the cold path of the function and executes
rarely, then we should always prefer to spill in the block, rather than
in the prologue/epilogue. The effective number of available register is
- (ira_class_hard_regs_num[cl] - call_used_regs_num[cl]). */
+ (ira_class_hard_regs_num[cl] - call_saved_regs_num[cl]). */
{
int i;
int entry_freq = ENTRY_BLOCK_PTR_FOR_FN (cfun)->frequency;
@@ -3925,7 +3925,7 @@ sched_pressure_start_bb (basic_block bb)
enum reg_class cl = ira_pressure_classes[i];
sched_class_regs_num[cl] = ira_class_hard_regs_num[cl];
sched_class_regs_num[cl]
- -= (call_used_regs_num[cl] * entry_freq) / bb_freq;
+ -= (call_saved_regs_num[cl] * entry_freq) / bb_freq;
}
}
@@ -7237,17 +7237,17 @@ alloc_global_sched_pressure_data (void)
region_ref_regs = BITMAP_ALLOC (NULL);
}
- /* Calculate number of CALL_USED_REGS in register classes that
+ /* Calculate number of CALL_SAVED_REGS in register classes that
we calculate register pressure for. */
for (int c = 0; c < ira_pressure_classes_num; ++c)
{
enum reg_class cl = ira_pressure_classes[c];
- call_used_regs_num[cl] = 0;
+ call_saved_regs_num[cl] = 0;
for (int i = 0; i < ira_class_hard_regs_num[cl]; ++i)
- if (call_used_regs[ira_class_hard_regs[cl][i]])
- ++call_used_regs_num[cl];
+ if (!call_used_regs[ira_class_hard_regs[cl][i]])
+ ++call_saved_regs_num[cl];
}
}
}
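
For reference, the arithmetic the patch changes can be sketched outside of GCC.
This is only an illustration under stated assumptions: count_call_saved and
effective_regs are hypothetical helper names, the plain arrays stand in for
ira_class_hard_regs[cl] and call_used_regs, and the clamping of bb_freq (which
sits in context the diff does not show) is assumed here so that the cold-path
case matches the comment in sched_pressure_start_bb.

```c
#include <assert.h>
#include <stdbool.h>

/* Count the registers of one class that are call-saved, i.e. not marked
   call-used.  CLASS_REGS and CALL_USED are stand-ins for GCC's
   ira_class_hard_regs[cl] and call_used_regs (hypothetical here).  */
static int
count_call_saved (const int *class_regs, int n, const bool *call_used)
{
  int saved = 0;
  for (int i = 0; i < n; ++i)
    if (!call_used[class_regs[i]])
      ++saved;
  return saved;
}

/* Effective register limit for a block: start from the full class and
   subtract the call-saved count scaled by entry_freq / bb_freq.  */
static int
effective_regs (int class_regs_num, int call_saved_num,
		int entry_freq, int bb_freq)
{
  assert (call_saved_num >= 0 && call_saved_num <= class_regs_num);

  /* Assumption of this sketch: a block colder than the entry block is
     treated like the entry block, which also avoids dividing by zero.  */
  if (bb_freq < entry_freq)
    bb_freq = entry_freq;
  if (bb_freq == 0)
    return class_regs_num - call_saved_num;

  return class_regs_num - (call_saved_num * entry_freq) / bb_freq;
}
```

With a made-up class of 32 registers of which 18 are call-saved, a block that
runs as often as the entry block gets a limit of 14 (the call_used count),
while a block that runs ten times as often gets 31, since the integer division
leaves only one register subtracted.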