On Tue, 2016-05-10 at 22:57 +0200, Rafael J. Wysocki wrote: > > > >
[...] > --- > From: Rafael J. Wysocki <rafael.j.wyso...@intel.com> > Subject: [PATCH] intel_pstate: Clarify average performance > computation > > The core_pct_busy field of struct sample actually contains the > average performace during the last sampling period (in percent) > and not the utilization of the core as suggested by its name > which is confusing. > > For this reason, change the name of that field to core_avg_perf > and rename the function that computes its value accordingly. > > Also notice that storing this value as percentage requires a costly > integer multiplication to be carried out in a hot path, so instead > store it as an "extended fixed point" value with more fraction bits > and update the code using it accordingly (it is better to change the > name of the field along with its meaning in one go than to make those > two changes separately, as that would likely lead to more > confusion). > > Signed-off-by: Rafael J. Wysocki <rafael.j.wyso...@intel.com> > --- > drivers/cpufreq/intel_pstate.c | 31 +++++++++++++++--------------- > - > 1 file changed, 15 insertions(+), 16 deletions(-) > > Index: linux-pm/drivers/cpufreq/intel_pstate.c > =================================================================== > --- linux-pm.orig/drivers/cpufreq/intel_pstate.c > +++ linux-pm/drivers/cpufreq/intel_pstate.c > @@ -49,6 +49,9 @@ > #define int_tofp(X) ((int64_t)(X) << FRAC_BITS) > #define fp_toint(X) ((X) >> FRAC_BITS) > > +#define EXT_BITS 6 > +#define EXT_FRAC_BITS (EXT_BITS + FRAC_BITS) > + > static inline int32_t mul_fp(int32_t x, int32_t y) > { > return ((int64_t)x * (int64_t)y) >> FRAC_BITS; > @@ -72,10 +75,10 @@ static inline int ceiling_fp(int32_t x) > > /** > * struct sample - Store performance sample > - * @core_pct_busy: Ratio of APERF/MPERF in percent, which is > actual > + * @core_avg_perf: Ratio of APERF/MPERF which is the actual > average > * performance during last sample period > * @busy_scaled: Scaled busy value which is used to calculate > next > - * P state. This can be different than > core_pct_busy > + * P state. This can be different than > core_avg_perf > * to account for cpu idle period > * @aperf: Difference of actual performance frequency > clock count > * read from APERF MSR between last and > current sample > @@ -90,8 +93,8 @@ static inline int ceiling_fp(int32_t x) > * data for choosing next P State. > */ > struct sample { > - int32_t core_pct_busy; > int32_t busy_scaled; > + u64 core_avg_perf; > u64 aperf; > u64 mperf; > u64 tsc; > @@ -1147,15 +1150,12 @@ static void intel_pstate_get_cpu_pstates > intel_pstate_set_min_pstate(cpu); > } > > -static inline void intel_pstate_calc_busy(struct cpudata *cpu) > +static inline void intel_pstate_calc_avg_perf(struct cpudata *cpu) > { > struct sample *sample = &cpu->sample; > - int64_t core_pct; > - > - core_pct = sample->aperf * int_tofp(100); > - core_pct = div64_u64(core_pct, sample->mperf); > > - sample->core_pct_busy = (int32_t)core_pct; > + sample->core_avg_perf = div64_u64(sample->aperf << > EXT_FRAC_BITS, > + sample->mperf); > } > > static inline bool intel_pstate_sample(struct cpudata *cpu, u64 > time) > @@ -1198,9 +1198,8 @@ static inline bool intel_pstate_sample(s > > static inline int32_t get_avg_frequency(struct cpudata *cpu) > { > - return fp_toint(mul_fp(cpu->sample.core_pct_busy, > - int_tofp(cpu- > >pstate.max_pstate_physical * > - cpu->pstate.scaling > / 100))); > + return (cpu->sample.core_avg_perf * cpu- > >pstate.max_pstate_physical * > + cpu->pstate.scaling) >> EXT_FRAC_BITS; This breaks frequency display. Needs cast return ((u64)cpu->sample.core_avg_perf * cpu-> pstate.max_pstate_physical * cpu->pstate.scaling) >> EXT_FRAC_BITS; Otherwise results are very close with the version without this change. Thanks, Srinivas