On 16-07-17, 01:04, Joel Fernandes wrote:
> Currently the iowait_boost feature in schedutil makes the frequency go to max
> on iowait wakeups.  This feature was added to handle a case that Peter
> described where the throughput of operations involving continuous I/O requests
> [1] is reduced due to running at a lower frequency, however the lower
> throughput itself causes utilization to be low and hence causing frequency to
> be low hence its "stuck".
> 
> Instead of going to max, its also possible to achieve the same effect by
> ramping up to max if there are repeated in_iowait wakeups happening. This 
> patch
> is an attempt to do that. We start from a lower frequency (policy->mind)

s/mind/min/

> and double the boost for every consecutive iowait update until we reach the
> maximum iowait boost frequency (iowait_boost_max).
> 
> I ran a synthetic test (continuous O_DIRECT writes in a loop) on an x86 
> machine
> with intel_pstate in passive mode using schedutil. In this test the 
> iowait_boost
> value ramped from 800MHz to 4GHz in 60ms. The patch achieves the desired 
> improved
> throughput as the existing behavior.
> 
> Also while at it, make iowait_boost and iowait_boost_max as unsigned int since
> its unit is kHz and this is consistent with struct cpufreq_policy.
> 
> [1] https://patchwork.kernel.org/patch/9735885/
> 
> Cc: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
> Cc: Len Brown <l...@kernel.org>
> Cc: Rafael J. Wysocki <r...@rjwysocki.net>
> Cc: Viresh Kumar <viresh.ku...@linaro.org>
> Cc: Ingo Molnar <mi...@redhat.com>
> Cc: Peter Zijlstra <pet...@infradead.org>
> Suggested-by: Peter Zijlstra <pet...@infradead.org>
> Signed-off-by: Joel Fernandes <joe...@google.com>
> ---
> This version is based on some ideas from Viresh and Juri in v4.  Viresh, one
> difference between the idea we just discussed is, I am scaling up/down the
> boost only after consuming it. This has the effect of slightly delaying the
> "deboost" but achieves the same boost ramp time. Its more cleaner in the code
> IMO to avoid the scaling up and then down on the initial boost. Note that I
> also dropped iowait_boost_min and now I'm just starting the initial boost from
> policy->min since as I mentioned in the commit above, the ramp of the
> iowait_boost value is very quick and for the usecase its intended for, it 
> works
> fine. Hope this is acceptable. Thanks.
> 
>  kernel/sched/cpufreq_schedutil.c | 31 +++++++++++++++++++++++--------
>  1 file changed, 23 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index 622eed1b7658..4225bbada88d 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -53,8 +53,9 @@ struct sugov_cpu {
>       struct update_util_data update_util;
>       struct sugov_policy *sg_policy;
>  
> -     unsigned long iowait_boost;
> -     unsigned long iowait_boost_max;
> +     bool iowait_boost_pending;
> +     unsigned int iowait_boost;
> +     unsigned int iowait_boost_max;
>       u64 last_update;
>  
>       /* The fields below are only needed when sharing a policy. */
> @@ -172,30 +173,43 @@ static void sugov_set_iowait_boost(struct sugov_cpu 
> *sg_cpu, u64 time,
>                                  unsigned int flags)
>  {
>       if (flags & SCHED_CPUFREQ_IOWAIT) {
> -             sg_cpu->iowait_boost = sg_cpu->iowait_boost_max;
> +             sg_cpu->iowait_boost_pending = true;
> +             sg_cpu->iowait_boost = max(sg_cpu->iowait_boost,
> +                                        sg_cpu->sg_policy->policy->min);
>       } else if (sg_cpu->iowait_boost) {
>               s64 delta_ns = time - sg_cpu->last_update;
>  
>               /* Clear iowait_boost if the CPU apprears to have been idle. */
> -             if (delta_ns > TICK_NSEC)
> +             if (delta_ns > TICK_NSEC) {
>                       sg_cpu->iowait_boost = 0;
> +                     sg_cpu->iowait_boost_pending = false;
> +             }

We don't really need to clear this flag here as we are already making
iowait_boost as 0 and that's what we check while using boost.

>       }
>  }
>  
>  static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util,
>                              unsigned long *max)
>  {
> -     unsigned long boost_util = sg_cpu->iowait_boost;
> -     unsigned long boost_max = sg_cpu->iowait_boost_max;
> +     unsigned long boost_util, boost_max;
>  
> -     if (!boost_util)
> +     if (!sg_cpu->iowait_boost)
>               return;
>  
> +     boost_util = sg_cpu->iowait_boost;
> +     boost_max = sg_cpu->iowait_boost_max;
> +

The above changes are not required anymore (and were required only
with my patch).

>       if (*util * boost_max < *max * boost_util) {
>               *util = boost_util;
>               *max = boost_max;
>       }
> -     sg_cpu->iowait_boost >>= 1;
> +
> +     if (sg_cpu->iowait_boost_pending) {
> +             sg_cpu->iowait_boost_pending = false;
> +             sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1,
> +                                        sg_cpu->iowait_boost_max);

Now this has a problem. We will also boost after waiting for
rate_limit_us. And that's why I had proposed the tricky solution in
the first place. I thought we wanted to avoid instant boost only for
the first iteration, but after that we wanted to do it ASAP. Isn't it?

Now that you are using policy->min instead of policy->cur, we can
simplify the solution I proposed and always do 2 * iowait_boost before
getting current util/max in above if loop. i.e. we will start iowait
boost with min * 2 instead of min and that should be fine.

> +     } else {
> +             sg_cpu->iowait_boost >>= 1;
> +     }
>  }
>  
>  #ifdef CONFIG_NO_HZ_COMMON
> @@ -267,6 +281,7 @@ static unsigned int sugov_next_freq_shared(struct 
> sugov_cpu *sg_cpu, u64 time)
>               delta_ns = time - j_sg_cpu->last_update;
>               if (delta_ns > TICK_NSEC) {
>                       j_sg_cpu->iowait_boost = 0;
> +                     j_sg_cpu->iowait_boost_pending = false;

Not required here as well.

>                       continue;
>               }
>               if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
> -- 
> 2.13.2.932.g7449e964c-goog

-- 
viresh

Reply via email to