On Thu, May 18, 2017 at 11:41 AM, Richard Sandiford <richard.sandif...@linaro.org> wrote: > "Bin.Cheng" <amker.ch...@gmail.com> writes: >> On Wed, May 17, 2017 at 1:37 PM, Richard Sandiford >> <richard.sandif...@linaro.org> wrote: >>> "Bin.Cheng" <amker.ch...@gmail.com> writes: >>>> -/* Calculates cost for having N_REGS registers. This number includes >>>> - induction variables, invariant variables and invariant expressions. */ >>>> +/* Estimate register pressure for loop having N_INVS invariants and >>>> N_CANDS >>>> + induction variables. Note N_INVS includes both invariant variables and >>>> + invariant expressions. */ >>>> >>>> static unsigned >>>> -ivopts_global_cost_for_size (struct ivopts_data *data, unsigned n_regs) >>>> +ivopts_estimate_reg_pressure (struct ivopts_data *data, unsigned n_invs, >>>> + unsigned n_cands) >>>> { >>>> - unsigned cost = estimate_reg_pressure_cost (n_regs, >>>> - data->regs_used, data->speed, >>>> - data->body_includes_call); >>>> - /* Add n_regs to the cost, so that we prefer eliminating ivs if >>>> possible. */ >>>> - return n_regs + cost; >>>> + unsigned cost; >>>> + unsigned n_old = data->regs_used, n_new = n_invs + n_cands; >>>> + unsigned regs_needed = n_new + n_old, available_regs = >>>> target_avail_regs; >>>> + bool speed = data->speed; >>>> + >>>> + /* If there is a call in the loop body, the call-clobbered registers >>>> + are not available for loop invariants. */ >>>> + if (data->body_includes_call) >>>> + available_regs = available_regs - target_clobbered_regs; >>>> + >>>> + /* If we have enough registers. */ >>>> + if (regs_needed + target_res_regs < available_regs) >>>> + cost = n_new; >>>> + /* If close to running out of registers, try to preserve them. */ >>>> + else if (regs_needed <= available_regs) >>>> + cost = target_reg_cost [speed] * regs_needed; >>>> + /* If we run out of available registers but the number of candidates >>>> + does not, we penalize extra registers using target_spill_cost. */ >>>> + else if (n_cands <= available_regs) >>>> + cost = target_reg_cost [speed] * available_regs >>>> + + target_spill_cost [speed] * (regs_needed - available_regs); >>>> + /* If the number of candidates runs out available registers, we penalize >>>> + extra candidate registers using target_spill_cost * 2. Because it is >>>> + more expensive to spill induction variable than invariant. */ >>>> + else >>>> + cost = target_reg_cost [speed] * available_regs >>>> + + target_spill_cost [speed] * (n_cands - available_regs) * 2 >>>> + + target_spill_cost [speed] * (regs_needed - n_cands); >>>> + >>>> + /* Finally, add the number of candidates, so that we prefer eliminating >>>> + induction variables if possible. */ >>>> + return cost + n_cands; >>> >>> It looks like the return is mixing units. Would it work to return >>> a <cost, n_cands> pair instead, and use lexicographical ordering? >> Hi Richard, >> It just penalizes the cost by the number of candidates, rather than >> returns n_cands to caller. Actually that information is available all >> the time in ivopts_data structure. > > Yeah, but what I meant was: "cost" seems to be measured in abstract > instruction units, while "n_cands" counts a number of variables. > It doesn't seem to make sense to add them together. Yes, unfortunately GCC adds up different costs and compares it together. > > If the idea is to use n_cands as a tie-breaker when the instruction We don't have such cost model which different costs could be applied/compared separately, also we don't have obvious break conditions. Sometimes the "instruction cost" is lower but results in worse code because register cost is higher; but sometimes the opposite holds. LLVM has such separated cost models I guess?
Thanks, bin > costs are the same, then lexicographical ordering of pairs would give > that without mixing the units. > > Thanks, > Richard