Hi,

This patch improves register pressure scheduling (both SCHED_PRESSURE_WEIGHTED
and SCHED_PRESSURE_MODEL) to better estimate the number of available registers.

At the moment the scheduler does not account for the spills in the prologue and
the restores in the epilogue that result from using call-saved registers.  The
current state is, essentially, optimized for the case where there is a hot loop
inside the function, and the loop executes significantly more often than the
prologue/epilogue.  At the opposite extreme, however, the function is just a
single non-cyclic basic block, which executes just as often as the
prologue/epilogue, so spills in the prologue hurt performance as much as
spills in the basic block itself.  In such a case the scheduler should
throttle down the number of available registers and try not to go beyond the
call-clobbered registers.

The patch uses basic block frequencies to balance the cost of using call-saved
registers for intermediate cases between the two extremes above.
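
To illustrate the idea, here is a minimal sketch of one way such a balance
could be computed.  It is not the code from the attached patch: the helper
name effective_regs_num, its parameters, and the linear scaling by
entry_freq / bb_freq are all assumptions made for this example.

```c
/* Hypothetical helper for illustration only; not taken from the patch.
   TOTAL_REGS is the number of hard registers in a pressure class,
   CALL_SAVED_REGS is how many of them need a save/restore in the
   prologue/epilogue, ENTRY_FREQ is the frequency of the function entry
   block and BB_FREQ is the frequency of the block being scheduled.  */
static int
effective_regs_num (int total_regs, int call_saved_regs,
                    int entry_freq, int bb_freq)
{
  /* Avoid dividing by zero when no profile information is available.  */
  if (entry_freq <= 0)
    entry_freq = 1;

  /* Assumption: a block colder than the entry block is treated like the
     entry block, so it never sees more than the call-clobbered set.  */
  if (bb_freq < entry_freq)
    bb_freq = entry_freq;

  /* The penalty for call-saved registers shrinks as the block gets
     hotter relative to the prologue/epilogue.  */
  return total_regs - (call_saved_regs * entry_freq) / bb_freq;
}
```

With 32 registers of which 16 are call-saved, this sketch yields 16 available
registers for a block that runs as often as the prologue, 24 for a block that
runs twice as often, and 32 for a block that runs 100 times as often.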

The motivation for this patch was a floating-point testcase on 
arm-linux-gnueabihf (ARM is one of the few targets that use register pressure 
scheduling by default).

Thanks go to Richard for the good discussion of the problem and suggestions on
the approach to fixing it.

The patch was bootstrapped on x86_64-linux-gnu (which doesn't really exercise
the patch), and cross-tested on arm-linux-gnueabihf and aarch64-linux-gnu.

OK to apply?

--
Maxim Kuvyrkov
www.linaro.org


Attachment: 0001-sched_class_reg_num.ChangeLog
Description: Binary data

Attachment: 0001-sched_class_reg_num.patch
Description: Binary data
