Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Wed, Jul 12, 2017 at 9:14 PM, Viresh Kumarwrote: > On 12-07-17, 20:52, Joel Fernandes wrote: >> Yes, that makes sense, its a bit subtle but I get what you're doing >> now and I agree with it. Its also cleaner than my original patch :-) >> and yeah definitely needs a comment ;-) > > And I have full trust on you to include that comment in you V5 :) You bet :) thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Wed, Jul 12, 2017 at 9:14 PM, Viresh Kumar wrote: > On 12-07-17, 20:52, Joel Fernandes wrote: >> Yes, that makes sense, its a bit subtle but I get what you're doing >> now and I agree with it. Its also cleaner than my original patch :-) >> and yeah definitely needs a comment ;-) > > And I have full trust on you to include that comment in you V5 :) You bet :) thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 12-07-17, 20:52, Joel Fernandes wrote: > Yes, that makes sense, its a bit subtle but I get what you're doing > now and I agree with it. Its also cleaner than my original patch :-) > and yeah definitely needs a comment ;-) And I have full trust on you to include that comment in you V5 :) -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 12-07-17, 20:52, Joel Fernandes wrote: > Yes, that makes sense, its a bit subtle but I get what you're doing > now and I agree with it. Its also cleaner than my original patch :-) > and yeah definitely needs a comment ;-) And I have full trust on you to include that comment in you V5 :) -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Wed, Jul 12, 2017 at 8:35 PM, Viresh Kumarwrote: [..] > >> > diff --git a/kernel/sched/cpufreq_schedutil.c >> > b/kernel/sched/cpufreq_schedutil.c >> > index 076a2e31951c..3459f327c94e 100644 >> > --- a/kernel/sched/cpufreq_schedutil.c >> > +++ b/kernel/sched/cpufreq_schedutil.c >> > @@ -53,6 +53,7 @@ struct sugov_cpu { >> > struct update_util_data update_util; >> > struct sugov_policy *sg_policy; >> > >> > + bool iowait_boost_pending; >> > unsigned long iowait_boost; >> > unsigned long iowait_boost_max; >> > u64 last_update; >> > @@ -169,7 +170,12 @@ static void sugov_set_iowait_boost(struct sugov_cpu >> > *sg_cpu, u64 time, >> >unsigned int flags) >> > { >> > if (flags & SCHED_CPUFREQ_IOWAIT) { >> > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; >> > + sg_cpu->iowait_boost_pending = true; >> > + >> > + if (!sg_cpu->iowait_boost) { >> > + sg_cpu->iowait_boost = >> > sg_cpu->sg_policy->policy->cur; >> > + sg_cpu->iowait_boost >>= 1; >> > + } >> >> Hmm, this doesn't look right to me.. why are we decaying in this path? > > I am convinced that we need a comment here as what I did wasn't > straight forward :) > > The idea: We wouldn't increase the frequency for the first event with > IOWAIT flag set, but on every subsequent event (captured over > rate-limit-us window). You may have noticed that I am updating the > boost values in sugov_iowait_boost() a bit earlier now and they will > affect the current frequency update as well. > > Because I wanted to do a 2X there unconditionally if > iowait_boost_pending is set, I had to make it half for the very first > event with IOWAIT flag set. > > End result: > - First event, we stay at current freq. > - Second and all consecutive events every rate_limit_us time, 2X > - If there is no IOWAIT event in last rate_limit_us, X/2 > > Makes sense ? > Yes, that makes sense, its a bit subtle but I get what you're doing now and I agree with it. Its also cleaner than my original patch :-) and yeah definitely needs a comment ;-) thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Wed, Jul 12, 2017 at 8:35 PM, Viresh Kumar wrote: [..] > >> > diff --git a/kernel/sched/cpufreq_schedutil.c >> > b/kernel/sched/cpufreq_schedutil.c >> > index 076a2e31951c..3459f327c94e 100644 >> > --- a/kernel/sched/cpufreq_schedutil.c >> > +++ b/kernel/sched/cpufreq_schedutil.c >> > @@ -53,6 +53,7 @@ struct sugov_cpu { >> > struct update_util_data update_util; >> > struct sugov_policy *sg_policy; >> > >> > + bool iowait_boost_pending; >> > unsigned long iowait_boost; >> > unsigned long iowait_boost_max; >> > u64 last_update; >> > @@ -169,7 +170,12 @@ static void sugov_set_iowait_boost(struct sugov_cpu >> > *sg_cpu, u64 time, >> >unsigned int flags) >> > { >> > if (flags & SCHED_CPUFREQ_IOWAIT) { >> > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; >> > + sg_cpu->iowait_boost_pending = true; >> > + >> > + if (!sg_cpu->iowait_boost) { >> > + sg_cpu->iowait_boost = >> > sg_cpu->sg_policy->policy->cur; >> > + sg_cpu->iowait_boost >>= 1; >> > + } >> >> Hmm, this doesn't look right to me.. why are we decaying in this path? > > I am convinced that we need a comment here as what I did wasn't > straight forward :) > > The idea: We wouldn't increase the frequency for the first event with > IOWAIT flag set, but on every subsequent event (captured over > rate-limit-us window). You may have noticed that I am updating the > boost values in sugov_iowait_boost() a bit earlier now and they will > affect the current frequency update as well. > > Because I wanted to do a 2X there unconditionally if > iowait_boost_pending is set, I had to make it half for the very first > event with IOWAIT flag set. > > End result: > - First event, we stay at current freq. > - Second and all consecutive events every rate_limit_us time, 2X > - If there is no IOWAIT event in last rate_limit_us, X/2 > > Makes sense ? > Yes, that makes sense, its a bit subtle but I get what you're doing now and I agree with it. Its also cleaner than my original patch :-) and yeah definitely needs a comment ;-) thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 12-07-17, 15:28, Peter Zijlstra wrote: > Hmm, so you're worried about that ratelimit stuff? Yeah, somewhat. > Shouldn't we fix that > independently -- IIRC Rafael proposed a max-filter over the window. I had some doubts about that idea in general and shared them earlier with you: https://marc.info/?l=linux-kernel=149683722822537=2 But anyway, I think Joel is somewhat convinced to adapt to the solution I gave and that wouldn't have any of the problems I shared. All good now :) -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 12-07-17, 15:28, Peter Zijlstra wrote: > Hmm, so you're worried about that ratelimit stuff? Yeah, somewhat. > Shouldn't we fix that > independently -- IIRC Rafael proposed a max-filter over the window. I had some doubts about that idea in general and shared them earlier with you: https://marc.info/?l=linux-kernel=149683722822537=2 But anyway, I think Joel is somewhat convinced to adapt to the solution I gave and that wouldn't have any of the problems I shared. All good now :) -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 12-07-17, 19:02, Joel Fernandes wrote: > On Tue, Jul 11, 2017 at 10:00 PM, Viresh Kumar> wrote: > Hmm, Ok. I can try to do some measurements about consecutive calls > soon and let you know how often it happens. I also noticed its > possible to call twice in the enqueue path itself as well. Yeah, I think I told you that in previous replies. > It probably > affect my patch more because of starting from min than max. Yeah, it will. > > diff --git a/kernel/sched/cpufreq_schedutil.c > > b/kernel/sched/cpufreq_schedutil.c > > index 076a2e31951c..3459f327c94e 100644 > > --- a/kernel/sched/cpufreq_schedutil.c > > +++ b/kernel/sched/cpufreq_schedutil.c > > @@ -53,6 +53,7 @@ struct sugov_cpu { > > struct update_util_data update_util; > > struct sugov_policy *sg_policy; > > > > + bool iowait_boost_pending; > > unsigned long iowait_boost; > > unsigned long iowait_boost_max; > > u64 last_update; > > @@ -169,7 +170,12 @@ static void sugov_set_iowait_boost(struct sugov_cpu > > *sg_cpu, u64 time, > >unsigned int flags) > > { > > if (flags & SCHED_CPUFREQ_IOWAIT) { > > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; > > + sg_cpu->iowait_boost_pending = true; > > + > > + if (!sg_cpu->iowait_boost) { > > + sg_cpu->iowait_boost = > > sg_cpu->sg_policy->policy->cur; > > + sg_cpu->iowait_boost >>= 1; > > + } > > Hmm, this doesn't look right to me.. why are we decaying in this path? I am convinced that we need a comment here as what I did wasn't straight forward :) The idea: We wouldn't increase the frequency for the first event with IOWAIT flag set, but on every subsequent event (captured over rate-limit-us window). You may have noticed that I am updating the boost values in sugov_iowait_boost() a bit earlier now and they will affect the current frequency update as well. Because I wanted to do a 2X there unconditionally if iowait_boost_pending is set, I had to make it half for the very first event with IOWAIT flag set. End result: - First event, we stay at current freq. - Second and all consecutive events every rate_limit_us time, 2X - If there is no IOWAIT event in last rate_limit_us, X/2 Makes sense ? > I think nothing else other than setting the pending flag and the > initial iowait_boost value is needed here. > > Also I feel this is more "spike" prone than setting it initially to > min. As Peter was saying, since we apply the boost only if it > increases the frequency and not decreases, starting from min should at > worst just result in ignoring of the boost the first time. Yeah, we can discuss on what should be the default value to start with and I would be fine with "min" if Rafael is, as he proposed the iowait thing to begin with after seeing some real issues on hardware. > > } else if (sg_cpu->iowait_boost) { > > s64 delta_ns = time - sg_cpu->last_update; > > > > @@ -182,17 +188,26 @@ static void sugov_set_iowait_boost(struct sugov_cpu > > *sg_cpu, u64 time, > > static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long > > *util, > >unsigned long *max) > > { > > - unsigned long boost_util = sg_cpu->iowait_boost; > > - unsigned long boost_max = sg_cpu->iowait_boost_max; > > + unsigned long boost_util, boost_max; > > > > - if (!boost_util) > > + if (!sg_cpu->iowait_boost) > > return; > > > > + if (sg_cpu->iowait_boost_pending) { > > + sg_cpu->iowait_boost_pending = false; > > + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1, I was talking about this unconditional 2X earlier. > > + sg_cpu->iowait_boost_max); > > + } else { > > + sg_cpu->iowait_boost >>= 1; > > + } > > And then this path will do the decay correctly when the boost is applied. Yeah, the else part should do good. -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 12-07-17, 19:02, Joel Fernandes wrote: > On Tue, Jul 11, 2017 at 10:00 PM, Viresh Kumar > wrote: > Hmm, Ok. I can try to do some measurements about consecutive calls > soon and let you know how often it happens. I also noticed its > possible to call twice in the enqueue path itself as well. Yeah, I think I told you that in previous replies. > It probably > affect my patch more because of starting from min than max. Yeah, it will. > > diff --git a/kernel/sched/cpufreq_schedutil.c > > b/kernel/sched/cpufreq_schedutil.c > > index 076a2e31951c..3459f327c94e 100644 > > --- a/kernel/sched/cpufreq_schedutil.c > > +++ b/kernel/sched/cpufreq_schedutil.c > > @@ -53,6 +53,7 @@ struct sugov_cpu { > > struct update_util_data update_util; > > struct sugov_policy *sg_policy; > > > > + bool iowait_boost_pending; > > unsigned long iowait_boost; > > unsigned long iowait_boost_max; > > u64 last_update; > > @@ -169,7 +170,12 @@ static void sugov_set_iowait_boost(struct sugov_cpu > > *sg_cpu, u64 time, > >unsigned int flags) > > { > > if (flags & SCHED_CPUFREQ_IOWAIT) { > > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; > > + sg_cpu->iowait_boost_pending = true; > > + > > + if (!sg_cpu->iowait_boost) { > > + sg_cpu->iowait_boost = > > sg_cpu->sg_policy->policy->cur; > > + sg_cpu->iowait_boost >>= 1; > > + } > > Hmm, this doesn't look right to me.. why are we decaying in this path? I am convinced that we need a comment here as what I did wasn't straight forward :) The idea: We wouldn't increase the frequency for the first event with IOWAIT flag set, but on every subsequent event (captured over rate-limit-us window). You may have noticed that I am updating the boost values in sugov_iowait_boost() a bit earlier now and they will affect the current frequency update as well. Because I wanted to do a 2X there unconditionally if iowait_boost_pending is set, I had to make it half for the very first event with IOWAIT flag set. End result: - First event, we stay at current freq. - Second and all consecutive events every rate_limit_us time, 2X - If there is no IOWAIT event in last rate_limit_us, X/2 Makes sense ? > I think nothing else other than setting the pending flag and the > initial iowait_boost value is needed here. > > Also I feel this is more "spike" prone than setting it initially to > min. As Peter was saying, since we apply the boost only if it > increases the frequency and not decreases, starting from min should at > worst just result in ignoring of the boost the first time. Yeah, we can discuss on what should be the default value to start with and I would be fine with "min" if Rafael is, as he proposed the iowait thing to begin with after seeing some real issues on hardware. > > } else if (sg_cpu->iowait_boost) { > > s64 delta_ns = time - sg_cpu->last_update; > > > > @@ -182,17 +188,26 @@ static void sugov_set_iowait_boost(struct sugov_cpu > > *sg_cpu, u64 time, > > static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long > > *util, > >unsigned long *max) > > { > > - unsigned long boost_util = sg_cpu->iowait_boost; > > - unsigned long boost_max = sg_cpu->iowait_boost_max; > > + unsigned long boost_util, boost_max; > > > > - if (!boost_util) > > + if (!sg_cpu->iowait_boost) > > return; > > > > + if (sg_cpu->iowait_boost_pending) { > > + sg_cpu->iowait_boost_pending = false; > > + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1, I was talking about this unconditional 2X earlier. > > + sg_cpu->iowait_boost_max); > > + } else { > > + sg_cpu->iowait_boost >>= 1; > > + } > > And then this path will do the decay correctly when the boost is applied. Yeah, the else part should do good. -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Hi Viresh, On Tue, Jul 11, 2017 at 10:00 PM, Viresh Kumarwrote: [..] >> Another approach than setting min in sugov_set_iowait_boost, is, since >> we have already retrieved the current util, we can check if flags == >> SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that >> (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or >> iowait_boost_min, which ever is lower. > > So my concerns weren't only about the initial min value, but also that you > reduce the freq from sugov_set_iowait_boost(). We can discuss what the ideal > value to start with can be. > >> This still will not increase >> frequency on the first request, but will ensure the next one will >> benefit. > > If there is no non-enqueue path request lands. > >> Yes, I've seen that happen in my testing (consecutive iowait). > > The CFS scheduler can send a util update request every 1 ms for util updates > and > I am not sure why isn't that happening in your case. > > How much is the time between two consecutive IOWAIT requests in your case ? > Maybe it is too less (Ofcourse it isn't in your control :). But if we take > Peter's example, then it will surely have a non-enqueue path request between > two > IOWAIT requests. Hmm, Ok. I can try to do some measurements about consecutive calls soon and let you know how often it happens. I also noticed its possible to call twice in the enqueue path itself as well. It probably affect my patch more because of starting from min than max. >> I >> haven't seen the other case where you have IOWAIT followed by >> non-IOWAIT for a repeated set of IOWAIT requests. Would you more >> comfortable if we moved sugov_set_iowait_boost() after the >> sugov_should_update_freq() ? > > That may make us ignore all IOWAIT requests that happen between rate_limit_us > time. And that would be bad IMHO. True.. >> That way if there are consecutive >> requests in the same path, then it most likely rate-limiting will >> prevent such updates. I will also try to collect some stats as you >> suggested to see if how often if at all this can happen. > > Completely untested, but what about something like this ? This should get rid > of > the spikes you were getting. I actually like your approach better of consuming the flag first, but computing the boost later when its used. Its also more immune to the problem you described. Some comments below: > > diff --git a/kernel/sched/cpufreq_schedutil.c > b/kernel/sched/cpufreq_schedutil.c > index 076a2e31951c..3459f327c94e 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -53,6 +53,7 @@ struct sugov_cpu { > struct update_util_data update_util; > struct sugov_policy *sg_policy; > > + bool iowait_boost_pending; > unsigned long iowait_boost; > unsigned long iowait_boost_max; > u64 last_update; > @@ -169,7 +170,12 @@ static void sugov_set_iowait_boost(struct sugov_cpu > *sg_cpu, u64 time, >unsigned int flags) > { > if (flags & SCHED_CPUFREQ_IOWAIT) { > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; > + sg_cpu->iowait_boost_pending = true; > + > + if (!sg_cpu->iowait_boost) { > + sg_cpu->iowait_boost = sg_cpu->sg_policy->policy->cur; > + sg_cpu->iowait_boost >>= 1; > + } Hmm, this doesn't look right to me.. why are we decaying in this path? I think nothing else other than setting the pending flag and the initial iowait_boost value is needed here. Also I feel this is more "spike" prone than setting it initially to min. As Peter was saying, since we apply the boost only if it increases the frequency and not decreases, starting from min should at worst just result in ignoring of the boost the first time. > } else if (sg_cpu->iowait_boost) { > s64 delta_ns = time - sg_cpu->last_update; > > @@ -182,17 +188,26 @@ static void sugov_set_iowait_boost(struct sugov_cpu > *sg_cpu, u64 time, > static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util, >unsigned long *max) > { > - unsigned long boost_util = sg_cpu->iowait_boost; > - unsigned long boost_max = sg_cpu->iowait_boost_max; > + unsigned long boost_util, boost_max; > > - if (!boost_util) > + if (!sg_cpu->iowait_boost) > return; > > + if (sg_cpu->iowait_boost_pending) { > + sg_cpu->iowait_boost_pending = false; > + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1, > + sg_cpu->iowait_boost_max); > + } else { > + sg_cpu->iowait_boost >>= 1; > + } And then this path will do the decay correctly when the boost is applied. thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Hi Viresh, On Tue, Jul 11, 2017 at 10:00 PM, Viresh Kumar wrote: [..] >> Another approach than setting min in sugov_set_iowait_boost, is, since >> we have already retrieved the current util, we can check if flags == >> SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that >> (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or >> iowait_boost_min, which ever is lower. > > So my concerns weren't only about the initial min value, but also that you > reduce the freq from sugov_set_iowait_boost(). We can discuss what the ideal > value to start with can be. > >> This still will not increase >> frequency on the first request, but will ensure the next one will >> benefit. > > If there is no non-enqueue path request lands. > >> Yes, I've seen that happen in my testing (consecutive iowait). > > The CFS scheduler can send a util update request every 1 ms for util updates > and > I am not sure why isn't that happening in your case. > > How much is the time between two consecutive IOWAIT requests in your case ? > Maybe it is too less (Ofcourse it isn't in your control :). But if we take > Peter's example, then it will surely have a non-enqueue path request between > two > IOWAIT requests. Hmm, Ok. I can try to do some measurements about consecutive calls soon and let you know how often it happens. I also noticed its possible to call twice in the enqueue path itself as well. It probably affect my patch more because of starting from min than max. >> I >> haven't seen the other case where you have IOWAIT followed by >> non-IOWAIT for a repeated set of IOWAIT requests. Would you more >> comfortable if we moved sugov_set_iowait_boost() after the >> sugov_should_update_freq() ? > > That may make us ignore all IOWAIT requests that happen between rate_limit_us > time. And that would be bad IMHO. True.. >> That way if there are consecutive >> requests in the same path, then it most likely rate-limiting will >> prevent such updates. I will also try to collect some stats as you >> suggested to see if how often if at all this can happen. > > Completely untested, but what about something like this ? This should get rid > of > the spikes you were getting. I actually like your approach better of consuming the flag first, but computing the boost later when its used. Its also more immune to the problem you described. Some comments below: > > diff --git a/kernel/sched/cpufreq_schedutil.c > b/kernel/sched/cpufreq_schedutil.c > index 076a2e31951c..3459f327c94e 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -53,6 +53,7 @@ struct sugov_cpu { > struct update_util_data update_util; > struct sugov_policy *sg_policy; > > + bool iowait_boost_pending; > unsigned long iowait_boost; > unsigned long iowait_boost_max; > u64 last_update; > @@ -169,7 +170,12 @@ static void sugov_set_iowait_boost(struct sugov_cpu > *sg_cpu, u64 time, >unsigned int flags) > { > if (flags & SCHED_CPUFREQ_IOWAIT) { > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; > + sg_cpu->iowait_boost_pending = true; > + > + if (!sg_cpu->iowait_boost) { > + sg_cpu->iowait_boost = sg_cpu->sg_policy->policy->cur; > + sg_cpu->iowait_boost >>= 1; > + } Hmm, this doesn't look right to me.. why are we decaying in this path? I think nothing else other than setting the pending flag and the initial iowait_boost value is needed here. Also I feel this is more "spike" prone than setting it initially to min. As Peter was saying, since we apply the boost only if it increases the frequency and not decreases, starting from min should at worst just result in ignoring of the boost the first time. > } else if (sg_cpu->iowait_boost) { > s64 delta_ns = time - sg_cpu->last_update; > > @@ -182,17 +188,26 @@ static void sugov_set_iowait_boost(struct sugov_cpu > *sg_cpu, u64 time, > static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util, >unsigned long *max) > { > - unsigned long boost_util = sg_cpu->iowait_boost; > - unsigned long boost_max = sg_cpu->iowait_boost_max; > + unsigned long boost_util, boost_max; > > - if (!boost_util) > + if (!sg_cpu->iowait_boost) > return; > > + if (sg_cpu->iowait_boost_pending) { > + sg_cpu->iowait_boost_pending = false; > + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1, > + sg_cpu->iowait_boost_max); > + } else { > + sg_cpu->iowait_boost >>= 1; > + } And then this path will do the decay correctly when the boost is applied. thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Wed, Jul 12, 2017 at 03:16:23PM +0530, Viresh Kumar wrote: > No, I wasn't clear enough. Sorry about that. Lemme try again: > > Suppose min freq is 500 MHz and Max is 2 GHz. The iowait-boost is > set to 1 GHz right now (because of previous events with IOWAIT flag > set), and sugov_set_iowait_boost() gets called again with IOWAIT flag, > we boost the iowait-boost value to 2 GHz. We are in the rate_limit_us > window right now, we return without changing the frequency. > > If the next call into the schedutil governor happens due to normal > util-update, flags will be passed as 0. With the current patch, we > will bring iowait-boost back to 1 GHz (before updating the real > frequency to 2 GHz) as the prev-iowait-boost boolean would be set. > > And even if the task is periodically getting queued after IOWAIT, > actual boosting may not happen at all in some cases. Hmm, so you're worried about that ratelimit stuff? Shouldn't we fix that independently -- IIRC Rafael proposed a max-filter over the window.
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Wed, Jul 12, 2017 at 03:16:23PM +0530, Viresh Kumar wrote: > No, I wasn't clear enough. Sorry about that. Lemme try again: > > Suppose min freq is 500 MHz and Max is 2 GHz. The iowait-boost is > set to 1 GHz right now (because of previous events with IOWAIT flag > set), and sugov_set_iowait_boost() gets called again with IOWAIT flag, > we boost the iowait-boost value to 2 GHz. We are in the rate_limit_us > window right now, we return without changing the frequency. > > If the next call into the schedutil governor happens due to normal > util-update, flags will be passed as 0. With the current patch, we > will bring iowait-boost back to 1 GHz (before updating the real > frequency to 2 GHz) as the prev-iowait-boost boolean would be set. > > And even if the task is periodically getting queued after IOWAIT, > actual boosting may not happen at all in some cases. Hmm, so you're worried about that ratelimit stuff? Shouldn't we fix that independently -- IIRC Rafael proposed a max-filter over the window.
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 12-07-17, 11:36, Peter Zijlstra wrote: > On Wed, Jul 12, 2017 at 10:30:35AM +0530, Viresh Kumar wrote: > > On 11-07-17, 07:14, Joel Fernandes wrote: > > > > Another approach than setting min in sugov_set_iowait_boost, is, since > > > we have already retrieved the current util, we can check if flags == > > > SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that > > > (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or > > > iowait_boost_min, which ever is lower. > > > > So my concerns weren't only about the initial min value, but also that you > > reduce the freq from sugov_set_iowait_boost(). We can discuss what the ideal > > value to start with can be. > > I'm not sure I see that. He only mucks with iowait_boost, not the actual > frequency afaict. > > And sugov_iowait_boost() picks the highest of util vs iowait_boost, > which wasn't changed. > > Or am I completely missing something? (that code is a bit hard to > follow) No, I wasn't clear enough. Sorry about that. Lemme try again: Suppose min freq is 500 MHz and Max is 2 GHz. The iowait-boost is set to 1 GHz right now (because of previous events with IOWAIT flag set), and sugov_set_iowait_boost() gets called again with IOWAIT flag, we boost the iowait-boost value to 2 GHz. We are in the rate_limit_us window right now, we return without changing the frequency. If the next call into the schedutil governor happens due to normal util-update, flags will be passed as 0. With the current patch, we will bring iowait-boost back to 1 GHz (before updating the real frequency to 2 GHz) as the prev-iowait-boost boolean would be set. And even if the task is periodically getting queued after IOWAIT, actual boosting may not happen at all in some cases. -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 12-07-17, 11:36, Peter Zijlstra wrote: > On Wed, Jul 12, 2017 at 10:30:35AM +0530, Viresh Kumar wrote: > > On 11-07-17, 07:14, Joel Fernandes wrote: > > > > Another approach than setting min in sugov_set_iowait_boost, is, since > > > we have already retrieved the current util, we can check if flags == > > > SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that > > > (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or > > > iowait_boost_min, which ever is lower. > > > > So my concerns weren't only about the initial min value, but also that you > > reduce the freq from sugov_set_iowait_boost(). We can discuss what the ideal > > value to start with can be. > > I'm not sure I see that. He only mucks with iowait_boost, not the actual > frequency afaict. > > And sugov_iowait_boost() picks the highest of util vs iowait_boost, > which wasn't changed. > > Or am I completely missing something? (that code is a bit hard to > follow) No, I wasn't clear enough. Sorry about that. Lemme try again: Suppose min freq is 500 MHz and Max is 2 GHz. The iowait-boost is set to 1 GHz right now (because of previous events with IOWAIT flag set), and sugov_set_iowait_boost() gets called again with IOWAIT flag, we boost the iowait-boost value to 2 GHz. We are in the rate_limit_us window right now, we return without changing the frequency. If the next call into the schedutil governor happens due to normal util-update, flags will be passed as 0. With the current patch, we will bring iowait-boost back to 1 GHz (before updating the real frequency to 2 GHz) as the prev-iowait-boost boolean would be set. And even if the task is periodically getting queued after IOWAIT, actual boosting may not happen at all in some cases. -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Wed, Jul 12, 2017 at 10:30:35AM +0530, Viresh Kumar wrote: > On 11-07-17, 07:14, Joel Fernandes wrote: > > Another approach than setting min in sugov_set_iowait_boost, is, since > > we have already retrieved the current util, we can check if flags == > > SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that > > (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or > > iowait_boost_min, which ever is lower. > > So my concerns weren't only about the initial min value, but also that you > reduce the freq from sugov_set_iowait_boost(). We can discuss what the ideal > value to start with can be. I'm not sure I see that. He only mucks with iowait_boost, not the actual frequency afaict. And sugov_iowait_boost() picks the highest of util vs iowait_boost, which wasn't changed. Or am I completely missing something? (that code is a bit hard to follow)
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Wed, Jul 12, 2017 at 10:30:35AM +0530, Viresh Kumar wrote: > On 11-07-17, 07:14, Joel Fernandes wrote: > > Another approach than setting min in sugov_set_iowait_boost, is, since > > we have already retrieved the current util, we can check if flags == > > SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that > > (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or > > iowait_boost_min, which ever is lower. > > So my concerns weren't only about the initial min value, but also that you > reduce the freq from sugov_set_iowait_boost(). We can discuss what the ideal > value to start with can be. I'm not sure I see that. He only mucks with iowait_boost, not the actual frequency afaict. And sugov_iowait_boost() picks the highest of util vs iowait_boost, which wasn't changed. Or am I completely missing something? (that code is a bit hard to follow)
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 11-07-17, 07:14, Joel Fernandes wrote: > I think the whole point of IOWAIT boost was to solve the issue with a > long sequence of repeated I/O requests as described in the commit > message. So IIUC there isn't a usecase for that (increase freq. on > first request). Right. So we can take example that Peter gave earlier. Task runs .1 ms and waits for IO for 1 ms (at max speed). But there is high possibility that the util update handler gets called within that 1 ms (from non-enqueue paths) and because you chose to reduce iowait boost from sugov_set_iowait_boost() in your commit, we can easily end up ignoring iowait boosting. > Also its just for the first couple of requests in my > testing and doesn't hurt the performance at all for the intended > usecase while still not causing transient spikes. We can have bad enough timing where the util handler gets called right in that 1 ms of IOWAIT period and we will never boost. > Another approach than setting min in sugov_set_iowait_boost, is, since > we have already retrieved the current util, we can check if flags == > SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that > (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or > iowait_boost_min, which ever is lower. So my concerns weren't only about the initial min value, but also that you reduce the freq from sugov_set_iowait_boost(). We can discuss what the ideal value to start with can be. > This still will not increase > frequency on the first request, but will ensure the next one will > benefit. If there is no non-enqueue path request lands. > Yes, I've seen that happen in my testing (consecutive iowait). The CFS scheduler can send a util update request every 1 ms for util updates and I am not sure why isn't that happening in your case. How much is the time between two consecutive IOWAIT requests in your case ? Maybe it is too less (Ofcourse it isn't in your control :). But if we take Peter's example, then it will surely have a non-enqueue path request between two IOWAIT requests. > I > haven't seen the other case where you have IOWAIT followed by > non-IOWAIT for a repeated set of IOWAIT requests. Would you more > comfortable if we moved sugov_set_iowait_boost() after the > sugov_should_update_freq() ? That may make us ignore all IOWAIT requests that happen between rate_limit_us time. And that would be bad IMHO. > That way if there are consecutive > requests in the same path, then it most likely rate-limiting will > prevent such updates. I will also try to collect some stats as you > suggested to see if how often if at all this can happen. Completely untested, but what about something like this ? This should get rid of the spikes you were getting. diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 076a2e31951c..3459f327c94e 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -53,6 +53,7 @@ struct sugov_cpu { struct update_util_data update_util; struct sugov_policy *sg_policy; + bool iowait_boost_pending; unsigned long iowait_boost; unsigned long iowait_boost_max; u64 last_update; @@ -169,7 +170,12 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, unsigned int flags) { if (flags & SCHED_CPUFREQ_IOWAIT) { - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; + sg_cpu->iowait_boost_pending = true; + + if (!sg_cpu->iowait_boost) { + sg_cpu->iowait_boost = sg_cpu->sg_policy->policy->cur; + sg_cpu->iowait_boost >>= 1; + } } else if (sg_cpu->iowait_boost) { s64 delta_ns = time - sg_cpu->last_update; @@ -182,17 +188,26 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util, unsigned long *max) { - unsigned long boost_util = sg_cpu->iowait_boost; - unsigned long boost_max = sg_cpu->iowait_boost_max; + unsigned long boost_util, boost_max; - if (!boost_util) + if (!sg_cpu->iowait_boost) return; + if (sg_cpu->iowait_boost_pending) { + sg_cpu->iowait_boost_pending = false; + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1, + sg_cpu->iowait_boost_max); + } else { + sg_cpu->iowait_boost >>= 1; + } + + boost_util = sg_cpu->iowait_boost; + boost_max = sg_cpu->iowait_boost_max; + if (*util * boost_max < *max * boost_util) { *util = boost_util; *max = boost_max; } - sg_cpu->iowait_boost >>= 1; } #ifdef CONFIG_NO_HZ_COMMON -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 11-07-17, 07:14, Joel Fernandes wrote: > I think the whole point of IOWAIT boost was to solve the issue with a > long sequence of repeated I/O requests as described in the commit > message. So IIUC there isn't a usecase for that (increase freq. on > first request). Right. So we can take example that Peter gave earlier. Task runs .1 ms and waits for IO for 1 ms (at max speed). But there is high possibility that the util update handler gets called within that 1 ms (from non-enqueue paths) and because you chose to reduce iowait boost from sugov_set_iowait_boost() in your commit, we can easily end up ignoring iowait boosting. > Also its just for the first couple of requests in my > testing and doesn't hurt the performance at all for the intended > usecase while still not causing transient spikes. We can have bad enough timing where the util handler gets called right in that 1 ms of IOWAIT period and we will never boost. > Another approach than setting min in sugov_set_iowait_boost, is, since > we have already retrieved the current util, we can check if flags == > SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that > (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or > iowait_boost_min, which ever is lower. So my concerns weren't only about the initial min value, but also that you reduce the freq from sugov_set_iowait_boost(). We can discuss what the ideal value to start with can be. > This still will not increase > frequency on the first request, but will ensure the next one will > benefit. If there is no non-enqueue path request lands. > Yes, I've seen that happen in my testing (consecutive iowait). The CFS scheduler can send a util update request every 1 ms for util updates and I am not sure why isn't that happening in your case. How much is the time between two consecutive IOWAIT requests in your case ? Maybe it is too less (Ofcourse it isn't in your control :). But if we take Peter's example, then it will surely have a non-enqueue path request between two IOWAIT requests. > I > haven't seen the other case where you have IOWAIT followed by > non-IOWAIT for a repeated set of IOWAIT requests. Would you more > comfortable if we moved sugov_set_iowait_boost() after the > sugov_should_update_freq() ? That may make us ignore all IOWAIT requests that happen between rate_limit_us time. And that would be bad IMHO. > That way if there are consecutive > requests in the same path, then it most likely rate-limiting will > prevent such updates. I will also try to collect some stats as you > suggested to see if how often if at all this can happen. Completely untested, but what about something like this ? This should get rid of the spikes you were getting. diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 076a2e31951c..3459f327c94e 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -53,6 +53,7 @@ struct sugov_cpu { struct update_util_data update_util; struct sugov_policy *sg_policy; + bool iowait_boost_pending; unsigned long iowait_boost; unsigned long iowait_boost_max; u64 last_update; @@ -169,7 +170,12 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, unsigned int flags) { if (flags & SCHED_CPUFREQ_IOWAIT) { - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; + sg_cpu->iowait_boost_pending = true; + + if (!sg_cpu->iowait_boost) { + sg_cpu->iowait_boost = sg_cpu->sg_policy->policy->cur; + sg_cpu->iowait_boost >>= 1; + } } else if (sg_cpu->iowait_boost) { s64 delta_ns = time - sg_cpu->last_update; @@ -182,17 +188,26 @@ static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util, unsigned long *max) { - unsigned long boost_util = sg_cpu->iowait_boost; - unsigned long boost_max = sg_cpu->iowait_boost_max; + unsigned long boost_util, boost_max; - if (!boost_util) + if (!sg_cpu->iowait_boost) return; + if (sg_cpu->iowait_boost_pending) { + sg_cpu->iowait_boost_pending = false; + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost << 1, + sg_cpu->iowait_boost_max); + } else { + sg_cpu->iowait_boost >>= 1; + } + + boost_util = sg_cpu->iowait_boost; + boost_max = sg_cpu->iowait_boost_max; + if (*util * boost_max < *max * boost_util) { *util = boost_util; *max = boost_max; } - sg_cpu->iowait_boost >>= 1; } #ifdef CONFIG_NO_HZ_COMMON -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 11/07/17 07:33, Joel Fernandes wrote: > Hi Juri, > > On Tue, Jul 11, 2017 at 12:14 AM, Juri Lelliwrote: > [..] > >> > Considering it a per-cpu thing, isn't enough that it gets bumped up or > >> > decayed only when a CPU does an update (by using the above from > >> > sugov_update_shared)? > >> > > >> > If we go this way I think we will only need to reset prev_iowait_boost > >> > if delta_ns > TICK_NSEC during the for_each_cpu() loop of sugov_next_ > >> > freq_shared(). > >> > > >> > >> Actually the "decay" was already being done before (without this > >> patch), I am just preserving the same old behavior where we do decay. > >> Perhaps your proposal can be a separate match? Or did I miss something > >> else subtle here? > >> > > > > True, we are currently decaying anyway. > > > > Looking again at this path made me however think if we really need to. I > > guess we need currently, as we bump frenquency to max and then decay it. > > But, with your changes, I was wondering if we can simplify the thing and > > decay only on the per-CPU update path. > > Yes that makes sense to me, but even if we're at max, we should still > benefit from your idea right? (I didn't follow why being at min makes > the idea better, I think its a good idea in both cases (whether we are > boosting to max and ramping down or starting from the min) but let me > know if I missed something) > Not better, but I thought that if we start from max and decay we probably need to have more chances to decay faster (so decaying also when aggregating request for a domain is probably needed)? > If I understand correctly what you're proposing: > 1. Remove the decay from sugov_iowait_boost > 2. Add the iowait_boost decay unconditionally to > sugov_set_iowait_boost for the !SCHED_CPUFREQ_IOWAIT case. > > That would also get rid of the prev_iowait_boost flag and simplify > things, and also address Peter's concern of adding 'bool' in composite > types :) > OK, just a suggestion however, didn't play with the idea myself, it might not work. :) Thanks, - Juri
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 11/07/17 07:33, Joel Fernandes wrote: > Hi Juri, > > On Tue, Jul 11, 2017 at 12:14 AM, Juri Lelli wrote: > [..] > >> > Considering it a per-cpu thing, isn't enough that it gets bumped up or > >> > decayed only when a CPU does an update (by using the above from > >> > sugov_update_shared)? > >> > > >> > If we go this way I think we will only need to reset prev_iowait_boost > >> > if delta_ns > TICK_NSEC during the for_each_cpu() loop of sugov_next_ > >> > freq_shared(). > >> > > >> > >> Actually the "decay" was already being done before (without this > >> patch), I am just preserving the same old behavior where we do decay. > >> Perhaps your proposal can be a separate match? Or did I miss something > >> else subtle here? > >> > > > > True, we are currently decaying anyway. > > > > Looking again at this path made me however think if we really need to. I > > guess we need currently, as we bump frenquency to max and then decay it. > > But, with your changes, I was wondering if we can simplify the thing and > > decay only on the per-CPU update path. > > Yes that makes sense to me, but even if we're at max, we should still > benefit from your idea right? (I didn't follow why being at min makes > the idea better, I think its a good idea in both cases (whether we are > boosting to max and ramping down or starting from the min) but let me > know if I missed something) > Not better, but I thought that if we start from max and decay we probably need to have more chances to decay faster (so decaying also when aggregating request for a domain is probably needed)? > If I understand correctly what you're proposing: > 1. Remove the decay from sugov_iowait_boost > 2. Add the iowait_boost decay unconditionally to > sugov_set_iowait_boost for the !SCHED_CPUFREQ_IOWAIT case. > > That would also get rid of the prev_iowait_boost flag and simplify > things, and also address Peter's concern of adding 'bool' in composite > types :) > OK, just a suggestion however, didn't play with the idea myself, it might not work. :) Thanks, - Juri
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Hi Juri, On Tue, Jul 11, 2017 at 12:14 AM, Juri Lelliwrote: [..] >> > Considering it a per-cpu thing, isn't enough that it gets bumped up or >> > decayed only when a CPU does an update (by using the above from >> > sugov_update_shared)? >> > >> > If we go this way I think we will only need to reset prev_iowait_boost >> > if delta_ns > TICK_NSEC during the for_each_cpu() loop of sugov_next_ >> > freq_shared(). >> > >> >> Actually the "decay" was already being done before (without this >> patch), I am just preserving the same old behavior where we do decay. >> Perhaps your proposal can be a separate match? Or did I miss something >> else subtle here? >> > > True, we are currently decaying anyway. > > Looking again at this path made me however think if we really need to. I > guess we need currently, as we bump frenquency to max and then decay it. > But, with your changes, I was wondering if we can simplify the thing and > decay only on the per-CPU update path. Yes that makes sense to me, but even if we're at max, we should still benefit from your idea right? (I didn't follow why being at min makes the idea better, I think its a good idea in both cases (whether we are boosting to max and ramping down or starting from the min) but let me know if I missed something) If I understand correctly what you're proposing: 1. Remove the decay from sugov_iowait_boost 2. Add the iowait_boost decay unconditionally to sugov_set_iowait_boost for the !SCHED_CPUFREQ_IOWAIT case. That would also get rid of the prev_iowait_boost flag and simplify things, and also address Peter's concern of adding 'bool' in composite types :) thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Hi Juri, On Tue, Jul 11, 2017 at 12:14 AM, Juri Lelli wrote: [..] >> > Considering it a per-cpu thing, isn't enough that it gets bumped up or >> > decayed only when a CPU does an update (by using the above from >> > sugov_update_shared)? >> > >> > If we go this way I think we will only need to reset prev_iowait_boost >> > if delta_ns > TICK_NSEC during the for_each_cpu() loop of sugov_next_ >> > freq_shared(). >> > >> >> Actually the "decay" was already being done before (without this >> patch), I am just preserving the same old behavior where we do decay. >> Perhaps your proposal can be a separate match? Or did I miss something >> else subtle here? >> > > True, we are currently decaying anyway. > > Looking again at this path made me however think if we really need to. I > guess we need currently, as we bump frenquency to max and then decay it. > But, with your changes, I was wondering if we can simplify the thing and > decay only on the per-CPU update path. Yes that makes sense to me, but even if we're at max, we should still benefit from your idea right? (I didn't follow why being at min makes the idea better, I think its a good idea in both cases (whether we are boosting to max and ramping down or starting from the min) but let me know if I missed something) If I understand correctly what you're proposing: 1. Remove the decay from sugov_iowait_boost 2. Add the iowait_boost decay unconditionally to sugov_set_iowait_boost for the !SCHED_CPUFREQ_IOWAIT case. That would also get rid of the prev_iowait_boost flag and simplify things, and also address Peter's concern of adding 'bool' in composite types :) thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Tue, Jul 11, 2017 at 7:14 AM, Joel Fernandeswrote: [..] >>> + } >>> } else if (sg_cpu->iowait_boost) { >>> s64 delta_ns = time - sg_cpu->last_update; >>> >>> /* Clear iowait_boost if the CPU apprears to have been idle. >>> */ >>> if (delta_ns > TICK_NSEC) >>> sg_cpu->iowait_boost = 0; >>> + >>> + /* >>> + * Since we don't decay iowait_boost when its consumed during >>> + * the previous SCHED_CPUFREQ_IOWAIT update, decay it now. >>> + */ >>> + if (sg_cpu->prev_iowait_boost) { >> >> SCHED_CPUFREQ_IOWAIT flag is set only by CFS from the enqueue_task() and in >> many >> cases we call the util hook twice from the same enqueue_task() instance >> before >> returning (2nd one after updating util). And in such cases we will set >> iowait_boost as 0 on the second call. >> >> Have you ever seen two consecutive calls to sugov_set_iowait_boost() with >> IOWAIT >> flag set ? Can we get the ratio of that against the other case where we have >> IOWAIT flag set in first call, followed by one or more non-IOWAIT calls and >> then >> IOWAIT again ? >> >> I am asking because if the calls with IOWAIT flag aren't consecutive then we >> will make iowait_boost as 0 in the next non-IOWAIT call. > > Yes, I've seen that happen in my testing (consecutive iowait). I > haven't seen the other case where you have IOWAIT followed by > non-IOWAIT for a repeated set of IOWAIT requests. Would you more > comfortable if we moved sugov_set_iowait_boost() after the > sugov_should_update_freq() ? That way if there are consecutive > requests in the same path, then it most likely rate-limiting will > prevent such updates. I will also try to collect some stats as you > suggested to see if how often if at all this can happen. Just to be more clear, I was saying that I've only seen repeated IOWAIT requests in the update path, not IOWAIT followed by non-IOWAIT cpufreq updates for a repeated sequence of IOWAIT wakeups. thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Tue, Jul 11, 2017 at 7:14 AM, Joel Fernandes wrote: [..] >>> + } >>> } else if (sg_cpu->iowait_boost) { >>> s64 delta_ns = time - sg_cpu->last_update; >>> >>> /* Clear iowait_boost if the CPU apprears to have been idle. >>> */ >>> if (delta_ns > TICK_NSEC) >>> sg_cpu->iowait_boost = 0; >>> + >>> + /* >>> + * Since we don't decay iowait_boost when its consumed during >>> + * the previous SCHED_CPUFREQ_IOWAIT update, decay it now. >>> + */ >>> + if (sg_cpu->prev_iowait_boost) { >> >> SCHED_CPUFREQ_IOWAIT flag is set only by CFS from the enqueue_task() and in >> many >> cases we call the util hook twice from the same enqueue_task() instance >> before >> returning (2nd one after updating util). And in such cases we will set >> iowait_boost as 0 on the second call. >> >> Have you ever seen two consecutive calls to sugov_set_iowait_boost() with >> IOWAIT >> flag set ? Can we get the ratio of that against the other case where we have >> IOWAIT flag set in first call, followed by one or more non-IOWAIT calls and >> then >> IOWAIT again ? >> >> I am asking because if the calls with IOWAIT flag aren't consecutive then we >> will make iowait_boost as 0 in the next non-IOWAIT call. > > Yes, I've seen that happen in my testing (consecutive iowait). I > haven't seen the other case where you have IOWAIT followed by > non-IOWAIT for a repeated set of IOWAIT requests. Would you more > comfortable if we moved sugov_set_iowait_boost() after the > sugov_should_update_freq() ? That way if there are consecutive > requests in the same path, then it most likely rate-limiting will > prevent such updates. I will also try to collect some stats as you > suggested to see if how often if at all this can happen. Just to be more clear, I was saying that I've only seen repeated IOWAIT requests in the update path, not IOWAIT followed by non-IOWAIT cpufreq updates for a repeated sequence of IOWAIT wakeups. thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Hi Viresh, On Tue, Jul 11, 2017 at 3:14 AM, Viresh Kumarwrote: > On 09-07-17, 10:08, Joel Fernandes wrote: >> diff --git a/kernel/sched/cpufreq_schedutil.c >> b/kernel/sched/cpufreq_schedutil.c >> index 622eed1b7658..4d9e8b96bed1 100644 >> --- a/kernel/sched/cpufreq_schedutil.c >> +++ b/kernel/sched/cpufreq_schedutil.c >> @@ -53,7 +53,9 @@ struct sugov_cpu { >> struct update_util_data update_util; >> struct sugov_policy *sg_policy; >> >> + bool prev_iowait_boost; >> unsigned long iowait_boost; >> + unsigned long iowait_boost_min; >> unsigned long iowait_boost_max; >> u64 last_update; >> >> @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, >> unsigned long *max) >> *max = cfs_max; >> } >> >> +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) >> +{ >> + sg_cpu->iowait_boost >>= 1; >> + >> + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) >> + sg_cpu->iowait_boost = 0; >> +} >> + >> static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, >> unsigned int flags) >> { >> if (flags & SCHED_CPUFREQ_IOWAIT) { >> - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; >> + /* Remember for next time that we did an iowait boost */ >> + sg_cpu->prev_iowait_boost = true; >> + if (sg_cpu->iowait_boost) { >> + sg_cpu->iowait_boost <<= 1; >> + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, >> +sg_cpu->iowait_boost_max); >> + } else { >> + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; > > I am not sure if boost should start from the min frequency, as the current > frequency will at least be equal to that. Which means that with no boost > initially, we will never increase the frequency for the first IOWAIT flag > event. I think the whole point of IOWAIT boost was to solve the issue with a long sequence of repeated I/O requests as described in the commit message. So IIUC there isn't a usecase for that (increase freq. on first request). Also its just for the first couple of requests in my testing and doesn't hurt the performance at all for the intended usecase while still not causing transient spikes. Another approach than setting min in sugov_set_iowait_boost, is, since we have already retrieved the current util, we can check if flags == SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or iowait_boost_min, which ever is lower. This still will not increase frequency on the first request, but will ensure the next one will benefit. Then again I fear running into slightly longer-term transients where 2 iowait requests are enough to boost the frequency to high values. I can try to measure how often this can hurt power for common usecases if you agree its worth exploring. > >> + } >> } else if (sg_cpu->iowait_boost) { >> s64 delta_ns = time - sg_cpu->last_update; >> >> /* Clear iowait_boost if the CPU apprears to have been idle. */ >> if (delta_ns > TICK_NSEC) >> sg_cpu->iowait_boost = 0; >> + >> + /* >> + * Since we don't decay iowait_boost when its consumed during >> + * the previous SCHED_CPUFREQ_IOWAIT update, decay it now. >> + */ >> + if (sg_cpu->prev_iowait_boost) { > > SCHED_CPUFREQ_IOWAIT flag is set only by CFS from the enqueue_task() and in > many > cases we call the util hook twice from the same enqueue_task() instance before > returning (2nd one after updating util). And in such cases we will set > iowait_boost as 0 on the second call. > > Have you ever seen two consecutive calls to sugov_set_iowait_boost() with > IOWAIT > flag set ? Can we get the ratio of that against the other case where we have > IOWAIT flag set in first call, followed by one or more non-IOWAIT calls and > then > IOWAIT again ? > > I am asking because if the calls with IOWAIT flag aren't consecutive then we > will make iowait_boost as 0 in the next non-IOWAIT call. Yes, I've seen that happen in my testing (consecutive iowait). I haven't seen the other case where you have IOWAIT followed by non-IOWAIT for a repeated set of IOWAIT requests. Would you more comfortable if we moved sugov_set_iowait_boost() after the sugov_should_update_freq() ? That way if there are consecutive requests in the same path, then it most likely rate-limiting will prevent such updates. I will also try to collect some stats as you suggested to see if how often if at all this can happen. thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Hi Viresh, On Tue, Jul 11, 2017 at 3:14 AM, Viresh Kumar wrote: > On 09-07-17, 10:08, Joel Fernandes wrote: >> diff --git a/kernel/sched/cpufreq_schedutil.c >> b/kernel/sched/cpufreq_schedutil.c >> index 622eed1b7658..4d9e8b96bed1 100644 >> --- a/kernel/sched/cpufreq_schedutil.c >> +++ b/kernel/sched/cpufreq_schedutil.c >> @@ -53,7 +53,9 @@ struct sugov_cpu { >> struct update_util_data update_util; >> struct sugov_policy *sg_policy; >> >> + bool prev_iowait_boost; >> unsigned long iowait_boost; >> + unsigned long iowait_boost_min; >> unsigned long iowait_boost_max; >> u64 last_update; >> >> @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, >> unsigned long *max) >> *max = cfs_max; >> } >> >> +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) >> +{ >> + sg_cpu->iowait_boost >>= 1; >> + >> + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) >> + sg_cpu->iowait_boost = 0; >> +} >> + >> static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, >> unsigned int flags) >> { >> if (flags & SCHED_CPUFREQ_IOWAIT) { >> - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; >> + /* Remember for next time that we did an iowait boost */ >> + sg_cpu->prev_iowait_boost = true; >> + if (sg_cpu->iowait_boost) { >> + sg_cpu->iowait_boost <<= 1; >> + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, >> +sg_cpu->iowait_boost_max); >> + } else { >> + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; > > I am not sure if boost should start from the min frequency, as the current > frequency will at least be equal to that. Which means that with no boost > initially, we will never increase the frequency for the first IOWAIT flag > event. I think the whole point of IOWAIT boost was to solve the issue with a long sequence of repeated I/O requests as described in the commit message. So IIUC there isn't a usecase for that (increase freq. on first request). Also its just for the first couple of requests in my testing and doesn't hurt the performance at all for the intended usecase while still not causing transient spikes. Another approach than setting min in sugov_set_iowait_boost, is, since we have already retrieved the current util, we can check if flags == SCHED_CPUFREQ_IOWAIT, then set initial the iowait_boost such that (iowait_boost / iowait_boost_max) is aleast equal to (util / max) or iowait_boost_min, which ever is lower. This still will not increase frequency on the first request, but will ensure the next one will benefit. Then again I fear running into slightly longer-term transients where 2 iowait requests are enough to boost the frequency to high values. I can try to measure how often this can hurt power for common usecases if you agree its worth exploring. > >> + } >> } else if (sg_cpu->iowait_boost) { >> s64 delta_ns = time - sg_cpu->last_update; >> >> /* Clear iowait_boost if the CPU apprears to have been idle. */ >> if (delta_ns > TICK_NSEC) >> sg_cpu->iowait_boost = 0; >> + >> + /* >> + * Since we don't decay iowait_boost when its consumed during >> + * the previous SCHED_CPUFREQ_IOWAIT update, decay it now. >> + */ >> + if (sg_cpu->prev_iowait_boost) { > > SCHED_CPUFREQ_IOWAIT flag is set only by CFS from the enqueue_task() and in > many > cases we call the util hook twice from the same enqueue_task() instance before > returning (2nd one after updating util). And in such cases we will set > iowait_boost as 0 on the second call. > > Have you ever seen two consecutive calls to sugov_set_iowait_boost() with > IOWAIT > flag set ? Can we get the ratio of that against the other case where we have > IOWAIT flag set in first call, followed by one or more non-IOWAIT calls and > then > IOWAIT again ? > > I am asking because if the calls with IOWAIT flag aren't consecutive then we > will make iowait_boost as 0 in the next non-IOWAIT call. Yes, I've seen that happen in my testing (consecutive iowait). I haven't seen the other case where you have IOWAIT followed by non-IOWAIT for a repeated set of IOWAIT requests. Would you more comfortable if we moved sugov_set_iowait_boost() after the sugov_should_update_freq() ? That way if there are consecutive requests in the same path, then it most likely rate-limiting will prevent such updates. I will also try to collect some stats as you suggested to see if how often if at all this can happen. thanks, -Joel
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Sun, Jul 09, 2017 at 10:08:26AM -0700, Joel Fernandes wrote: > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -53,7 +53,9 @@ struct sugov_cpu { > struct update_util_data update_util; > struct sugov_policy *sg_policy; > > + bool prev_iowait_boost; > unsigned long iowait_boost; So I absolutely detest 'bool' in composite types, but I see there's already some of that in this file :/
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Sun, Jul 09, 2017 at 10:08:26AM -0700, Joel Fernandes wrote: > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -53,7 +53,9 @@ struct sugov_cpu { > struct update_util_data update_util; > struct sugov_policy *sg_policy; > > + bool prev_iowait_boost; > unsigned long iowait_boost; So I absolutely detest 'bool' in composite types, but I see there's already some of that in this file :/
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Tue, Jul 11, 2017 at 03:44:32PM +0530, Viresh Kumar wrote: > On 09-07-17, 10:08, Joel Fernandes wrote: > > diff --git a/kernel/sched/cpufreq_schedutil.c > > b/kernel/sched/cpufreq_schedutil.c > > index 622eed1b7658..4d9e8b96bed1 100644 > > --- a/kernel/sched/cpufreq_schedutil.c > > +++ b/kernel/sched/cpufreq_schedutil.c > > @@ -53,7 +53,9 @@ struct sugov_cpu { > > struct update_util_data update_util; > > struct sugov_policy *sg_policy; > > > > + bool prev_iowait_boost; > > unsigned long iowait_boost; > > + unsigned long iowait_boost_min; > > unsigned long iowait_boost_max; > > u64 last_update; > > > > @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, > > unsigned long *max) > > *max = cfs_max; > > } > > > > +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) > > +{ > > + sg_cpu->iowait_boost >>= 1; > > + > > + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) > > + sg_cpu->iowait_boost = 0; > > +} > > + > > static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, > >unsigned int flags) > > { > > if (flags & SCHED_CPUFREQ_IOWAIT) { > > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; > > + /* Remember for next time that we did an iowait boost */ > > + sg_cpu->prev_iowait_boost = true; > > + if (sg_cpu->iowait_boost) { > > + sg_cpu->iowait_boost <<= 1; > > + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, > > + sg_cpu->iowait_boost_max); > > + } else { > > + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; > > I am not sure if boost should start from the min frequency, as the current > frequency will at least be equal to that. Which means that with no boost > initially, we will never increase the frequency for the first IOWAIT flag > event. I suspect this actually works for Joel to get rid of the transient spikes he was seeing. Starting at the current freq, as you suggest, appears to make sense, but would add immediate transients back.
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On Tue, Jul 11, 2017 at 03:44:32PM +0530, Viresh Kumar wrote: > On 09-07-17, 10:08, Joel Fernandes wrote: > > diff --git a/kernel/sched/cpufreq_schedutil.c > > b/kernel/sched/cpufreq_schedutil.c > > index 622eed1b7658..4d9e8b96bed1 100644 > > --- a/kernel/sched/cpufreq_schedutil.c > > +++ b/kernel/sched/cpufreq_schedutil.c > > @@ -53,7 +53,9 @@ struct sugov_cpu { > > struct update_util_data update_util; > > struct sugov_policy *sg_policy; > > > > + bool prev_iowait_boost; > > unsigned long iowait_boost; > > + unsigned long iowait_boost_min; > > unsigned long iowait_boost_max; > > u64 last_update; > > > > @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, > > unsigned long *max) > > *max = cfs_max; > > } > > > > +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) > > +{ > > + sg_cpu->iowait_boost >>= 1; > > + > > + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) > > + sg_cpu->iowait_boost = 0; > > +} > > + > > static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, > >unsigned int flags) > > { > > if (flags & SCHED_CPUFREQ_IOWAIT) { > > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; > > + /* Remember for next time that we did an iowait boost */ > > + sg_cpu->prev_iowait_boost = true; > > + if (sg_cpu->iowait_boost) { > > + sg_cpu->iowait_boost <<= 1; > > + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, > > + sg_cpu->iowait_boost_max); > > + } else { > > + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; > > I am not sure if boost should start from the min frequency, as the current > frequency will at least be equal to that. Which means that with no boost > initially, we will never increase the frequency for the first IOWAIT flag > event. I suspect this actually works for Joel to get rid of the transient spikes he was seeing. Starting at the current freq, as you suggest, appears to make sense, but would add immediate transients back.
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 09-07-17, 10:08, Joel Fernandes wrote: > diff --git a/kernel/sched/cpufreq_schedutil.c > b/kernel/sched/cpufreq_schedutil.c > index 622eed1b7658..4d9e8b96bed1 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -53,7 +53,9 @@ struct sugov_cpu { > struct update_util_data update_util; > struct sugov_policy *sg_policy; > > + bool prev_iowait_boost; > unsigned long iowait_boost; > + unsigned long iowait_boost_min; > unsigned long iowait_boost_max; > u64 last_update; > > @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, > unsigned long *max) > *max = cfs_max; > } > > +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) > +{ > + sg_cpu->iowait_boost >>= 1; > + > + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) > + sg_cpu->iowait_boost = 0; > +} > + > static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, > unsigned int flags) > { > if (flags & SCHED_CPUFREQ_IOWAIT) { > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; > + /* Remember for next time that we did an iowait boost */ > + sg_cpu->prev_iowait_boost = true; > + if (sg_cpu->iowait_boost) { > + sg_cpu->iowait_boost <<= 1; > + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, > +sg_cpu->iowait_boost_max); > + } else { > + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; I am not sure if boost should start from the min frequency, as the current frequency will at least be equal to that. Which means that with no boost initially, we will never increase the frequency for the first IOWAIT flag event. > + } > } else if (sg_cpu->iowait_boost) { > s64 delta_ns = time - sg_cpu->last_update; > > /* Clear iowait_boost if the CPU apprears to have been idle. */ > if (delta_ns > TICK_NSEC) > sg_cpu->iowait_boost = 0; > + > + /* > + * Since we don't decay iowait_boost when its consumed during > + * the previous SCHED_CPUFREQ_IOWAIT update, decay it now. > + */ > + if (sg_cpu->prev_iowait_boost) { SCHED_CPUFREQ_IOWAIT flag is set only by CFS from the enqueue_task() and in many cases we call the util hook twice from the same enqueue_task() instance before returning (2nd one after updating util). And in such cases we will set iowait_boost as 0 on the second call. Have you ever seen two consecutive calls to sugov_set_iowait_boost() with IOWAIT flag set ? Can we get the ratio of that against the other case where we have IOWAIT flag set in first call, followed by one or more non-IOWAIT calls and then IOWAIT again ? I am asking because if the calls with IOWAIT flag aren't consecutive then we will make iowait_boost as 0 in the next non-IOWAIT call. -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 09-07-17, 10:08, Joel Fernandes wrote: > diff --git a/kernel/sched/cpufreq_schedutil.c > b/kernel/sched/cpufreq_schedutil.c > index 622eed1b7658..4d9e8b96bed1 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -53,7 +53,9 @@ struct sugov_cpu { > struct update_util_data update_util; > struct sugov_policy *sg_policy; > > + bool prev_iowait_boost; > unsigned long iowait_boost; > + unsigned long iowait_boost_min; > unsigned long iowait_boost_max; > u64 last_update; > > @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, > unsigned long *max) > *max = cfs_max; > } > > +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) > +{ > + sg_cpu->iowait_boost >>= 1; > + > + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) > + sg_cpu->iowait_boost = 0; > +} > + > static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, > unsigned int flags) > { > if (flags & SCHED_CPUFREQ_IOWAIT) { > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; > + /* Remember for next time that we did an iowait boost */ > + sg_cpu->prev_iowait_boost = true; > + if (sg_cpu->iowait_boost) { > + sg_cpu->iowait_boost <<= 1; > + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, > +sg_cpu->iowait_boost_max); > + } else { > + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; I am not sure if boost should start from the min frequency, as the current frequency will at least be equal to that. Which means that with no boost initially, we will never increase the frequency for the first IOWAIT flag event. > + } > } else if (sg_cpu->iowait_boost) { > s64 delta_ns = time - sg_cpu->last_update; > > /* Clear iowait_boost if the CPU apprears to have been idle. */ > if (delta_ns > TICK_NSEC) > sg_cpu->iowait_boost = 0; > + > + /* > + * Since we don't decay iowait_boost when its consumed during > + * the previous SCHED_CPUFREQ_IOWAIT update, decay it now. > + */ > + if (sg_cpu->prev_iowait_boost) { SCHED_CPUFREQ_IOWAIT flag is set only by CFS from the enqueue_task() and in many cases we call the util hook twice from the same enqueue_task() instance before returning (2nd one after updating util). And in such cases we will set iowait_boost as 0 on the second call. Have you ever seen two consecutive calls to sugov_set_iowait_boost() with IOWAIT flag set ? Can we get the ratio of that against the other case where we have IOWAIT flag set in first call, followed by one or more non-IOWAIT calls and then IOWAIT again ? I am asking because if the calls with IOWAIT flag aren't consecutive then we will make iowait_boost as 0 in the next non-IOWAIT call. -- viresh
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 10/07/17 22:12, Joel Fernandes wrote: > Hi Juri, > > On Mon, Jul 10, 2017 at 3:55 AM, Juri Lelliwrote: > > Hi Joel, > > > > On 09/07/17 10:08, Joel Fernandes wrote: [...] > >> static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long > >> *util, > >> -unsigned long *max) > >> +unsigned long *max, unsigned int flags) > >> { > >> unsigned long boost_util = sg_cpu->iowait_boost; > >> unsigned long boost_max = sg_cpu->iowait_boost_max; > >> @@ -195,7 +222,16 @@ static void sugov_iowait_boost(struct sugov_cpu > >> *sg_cpu, unsigned long *util, > >> *util = boost_util; > >> *max = boost_max; > >> } > >> - sg_cpu->iowait_boost >>= 1; > >> + > >> + /* > >> + * Incase iowait boost just happened on this CPU, don't reduce it > >> right > >> + * away since then the iowait boost will never increase on subsequent > >> + * in_iowait wakeups. > >> + */ > >> + if (flags & SCHED_CPUFREQ_IOWAIT && this_cpu_ptr(_cpu) == > >> sg_cpu) > >> + return; > >> + > >> + sugov_decay_iowait_boost(sg_cpu); > > > > Mmm, do we need to decay even when we are computing frequency requests > > for a domain? > > > > Considering it a per-cpu thing, isn't enough that it gets bumped up or > > decayed only when a CPU does an update (by using the above from > > sugov_update_shared)? > > > > If we go this way I think we will only need to reset prev_iowait_boost > > if delta_ns > TICK_NSEC during the for_each_cpu() loop of sugov_next_ > > freq_shared(). > > > > Actually the "decay" was already being done before (without this > patch), I am just preserving the same old behavior where we do decay. > Perhaps your proposal can be a separate match? Or did I miss something > else subtle here? > True, we are currently decaying anyway. Looking again at this path made me however think if we really need to. I guess we need currently, as we bump frenquency to max and then decay it. But, with your changes, I was wondering if we can simplify the thing and decay only on the per-CPU update path. The other reason for trying to simplify this is that I don't particularly like adding and consuming flags argument at this point, but I guess we could refactor the code if this is really a problem. Thanks, - Juri
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
On 10/07/17 22:12, Joel Fernandes wrote: > Hi Juri, > > On Mon, Jul 10, 2017 at 3:55 AM, Juri Lelli wrote: > > Hi Joel, > > > > On 09/07/17 10:08, Joel Fernandes wrote: [...] > >> static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long > >> *util, > >> -unsigned long *max) > >> +unsigned long *max, unsigned int flags) > >> { > >> unsigned long boost_util = sg_cpu->iowait_boost; > >> unsigned long boost_max = sg_cpu->iowait_boost_max; > >> @@ -195,7 +222,16 @@ static void sugov_iowait_boost(struct sugov_cpu > >> *sg_cpu, unsigned long *util, > >> *util = boost_util; > >> *max = boost_max; > >> } > >> - sg_cpu->iowait_boost >>= 1; > >> + > >> + /* > >> + * Incase iowait boost just happened on this CPU, don't reduce it > >> right > >> + * away since then the iowait boost will never increase on subsequent > >> + * in_iowait wakeups. > >> + */ > >> + if (flags & SCHED_CPUFREQ_IOWAIT && this_cpu_ptr(_cpu) == > >> sg_cpu) > >> + return; > >> + > >> + sugov_decay_iowait_boost(sg_cpu); > > > > Mmm, do we need to decay even when we are computing frequency requests > > for a domain? > > > > Considering it a per-cpu thing, isn't enough that it gets bumped up or > > decayed only when a CPU does an update (by using the above from > > sugov_update_shared)? > > > > If we go this way I think we will only need to reset prev_iowait_boost > > if delta_ns > TICK_NSEC during the for_each_cpu() loop of sugov_next_ > > freq_shared(). > > > > Actually the "decay" was already being done before (without this > patch), I am just preserving the same old behavior where we do decay. > Perhaps your proposal can be a separate match? Or did I miss something > else subtle here? > True, we are currently decaying anyway. Looking again at this path made me however think if we really need to. I guess we need currently, as we bump frenquency to max and then decay it. But, with your changes, I was wondering if we can simplify the thing and decay only on the per-CPU update path. The other reason for trying to simplify this is that I don't particularly like adding and consuming flags argument at this point, but I guess we could refactor the code if this is really a problem. Thanks, - Juri
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Hi Juri, On Mon, Jul 10, 2017 at 3:55 AM, Juri Lelliwrote: > Hi Joel, > > On 09/07/17 10:08, Joel Fernandes wrote: >> Currently the iowait_boost feature in schedutil makes the frequency go to >> max. >> This feature was added to handle a case that Peter described where the >> throughput of operations involving continuous I/O requests [1] is reduced due >> to running at a lower frequency, however the lower throughput itself causes >> utilization to be low and hence causing frequency to be low hence its >> "stuck". >> >> Instead of going to max, its also possible to achieve the same effect by >> ramping up to max if there are repeated in_iowait wakeups happening. This >> patch >> is an attempt to do that. We start from a lower frequency (iowait_boost_min) >> and double the boost for every consecutive iowait update until we reach the >> maximum iowait boost frequency (iowait_boost_max). >> >> I ran a synthetic test on an x86 machine with intel_pstate in passive mode >> using schedutil. The patch achieves the desired effect as the existing >> behavior. Also tested on ARM64 platform and see that there the transient >> iowait >> requests aren't causing frequency spikes. >> >> [1] https://patchwork.kernel.org/patch/9735885/ >> >> Cc: Srinivas Pandruvada >> Cc: Len Brown >> Cc: Rafael J. Wysocki >> Cc: Viresh Kumar >> Cc: Ingo Molnar >> Cc: Peter Zijlstra >> Suggested-by: Peter Zijlstra >> Signed-off-by: Joel Fernandes [..] >> >> diff --git a/kernel/sched/cpufreq_schedutil.c >> b/kernel/sched/cpufreq_schedutil.c >> index 622eed1b7658..4d9e8b96bed1 100644 >> --- a/kernel/sched/cpufreq_schedutil.c >> +++ b/kernel/sched/cpufreq_schedutil.c >> @@ -53,7 +53,9 @@ struct sugov_cpu { >> struct update_util_data update_util; >> struct sugov_policy *sg_policy; >> >> + bool prev_iowait_boost; >> unsigned long iowait_boost; >> + unsigned long iowait_boost_min; >> unsigned long iowait_boost_max; >> u64 last_update; >> >> @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, >> unsigned long *max) >> *max = cfs_max; >> } >> >> +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) >> +{ >> + sg_cpu->iowait_boost >>= 1; >> + >> + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) >> + sg_cpu->iowait_boost = 0; >> +} >> + >> static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, >> unsigned int flags) >> { >> if (flags & SCHED_CPUFREQ_IOWAIT) { >> - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; >> + /* Remember for next time that we did an iowait boost */ >> + sg_cpu->prev_iowait_boost = true; >> + if (sg_cpu->iowait_boost) { >> + sg_cpu->iowait_boost <<= 1; >> + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, >> +sg_cpu->iowait_boost_max); >> + } else { >> + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; >> + } >> } else if (sg_cpu->iowait_boost) { >> s64 delta_ns = time - sg_cpu->last_update; >> >> /* Clear iowait_boost if the CPU apprears to have been idle. */ >> if (delta_ns > TICK_NSEC) >> sg_cpu->iowait_boost = 0; > > In this case we might just also want to reset prev_iowait_boost > unconditionally and return. Ok, will do. >> + >> + /* >> + * Since we don't decay iowait_boost when its consumed during >> + * the previous SCHED_CPUFREQ_IOWAIT update, decay it now. >> + */ > > Code below seems pretty self-explaning to me. :) > Ok :) >> + if (sg_cpu->prev_iowait_boost) { >> + sugov_decay_iowait_boost(sg_cpu); >> + sg_cpu->prev_iowait_boost = false; >> + } >> } >> } >> >> static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long >> *util, >> -unsigned long *max) >> +unsigned long *max, unsigned int flags) >> { >> unsigned long boost_util = sg_cpu->iowait_boost; >> unsigned long boost_max = sg_cpu->iowait_boost_max; >> @@ -195,7 +222,16 @@ static void sugov_iowait_boost(struct sugov_cpu >> *sg_cpu, unsigned long *util, >> *util = boost_util; >> *max = boost_max; >> } >> - sg_cpu->iowait_boost >>= 1; >> + >> + /* >> + * Incase iowait boost just happened on this CPU, don't reduce it right >> + * away since then the iowait boost will never increase on subsequent >> + * in_iowait wakeups. >> + */ >> + if (flags & SCHED_CPUFREQ_IOWAIT &&
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Hi Juri, On Mon, Jul 10, 2017 at 3:55 AM, Juri Lelli wrote: > Hi Joel, > > On 09/07/17 10:08, Joel Fernandes wrote: >> Currently the iowait_boost feature in schedutil makes the frequency go to >> max. >> This feature was added to handle a case that Peter described where the >> throughput of operations involving continuous I/O requests [1] is reduced due >> to running at a lower frequency, however the lower throughput itself causes >> utilization to be low and hence causing frequency to be low hence its >> "stuck". >> >> Instead of going to max, its also possible to achieve the same effect by >> ramping up to max if there are repeated in_iowait wakeups happening. This >> patch >> is an attempt to do that. We start from a lower frequency (iowait_boost_min) >> and double the boost for every consecutive iowait update until we reach the >> maximum iowait boost frequency (iowait_boost_max). >> >> I ran a synthetic test on an x86 machine with intel_pstate in passive mode >> using schedutil. The patch achieves the desired effect as the existing >> behavior. Also tested on ARM64 platform and see that there the transient >> iowait >> requests aren't causing frequency spikes. >> >> [1] https://patchwork.kernel.org/patch/9735885/ >> >> Cc: Srinivas Pandruvada >> Cc: Len Brown >> Cc: Rafael J. Wysocki >> Cc: Viresh Kumar >> Cc: Ingo Molnar >> Cc: Peter Zijlstra >> Suggested-by: Peter Zijlstra >> Signed-off-by: Joel Fernandes [..] >> >> diff --git a/kernel/sched/cpufreq_schedutil.c >> b/kernel/sched/cpufreq_schedutil.c >> index 622eed1b7658..4d9e8b96bed1 100644 >> --- a/kernel/sched/cpufreq_schedutil.c >> +++ b/kernel/sched/cpufreq_schedutil.c >> @@ -53,7 +53,9 @@ struct sugov_cpu { >> struct update_util_data update_util; >> struct sugov_policy *sg_policy; >> >> + bool prev_iowait_boost; >> unsigned long iowait_boost; >> + unsigned long iowait_boost_min; >> unsigned long iowait_boost_max; >> u64 last_update; >> >> @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, >> unsigned long *max) >> *max = cfs_max; >> } >> >> +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) >> +{ >> + sg_cpu->iowait_boost >>= 1; >> + >> + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) >> + sg_cpu->iowait_boost = 0; >> +} >> + >> static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, >> unsigned int flags) >> { >> if (flags & SCHED_CPUFREQ_IOWAIT) { >> - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; >> + /* Remember for next time that we did an iowait boost */ >> + sg_cpu->prev_iowait_boost = true; >> + if (sg_cpu->iowait_boost) { >> + sg_cpu->iowait_boost <<= 1; >> + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, >> +sg_cpu->iowait_boost_max); >> + } else { >> + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; >> + } >> } else if (sg_cpu->iowait_boost) { >> s64 delta_ns = time - sg_cpu->last_update; >> >> /* Clear iowait_boost if the CPU apprears to have been idle. */ >> if (delta_ns > TICK_NSEC) >> sg_cpu->iowait_boost = 0; > > In this case we might just also want to reset prev_iowait_boost > unconditionally and return. Ok, will do. >> + >> + /* >> + * Since we don't decay iowait_boost when its consumed during >> + * the previous SCHED_CPUFREQ_IOWAIT update, decay it now. >> + */ > > Code below seems pretty self-explaning to me. :) > Ok :) >> + if (sg_cpu->prev_iowait_boost) { >> + sugov_decay_iowait_boost(sg_cpu); >> + sg_cpu->prev_iowait_boost = false; >> + } >> } >> } >> >> static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long >> *util, >> -unsigned long *max) >> +unsigned long *max, unsigned int flags) >> { >> unsigned long boost_util = sg_cpu->iowait_boost; >> unsigned long boost_max = sg_cpu->iowait_boost_max; >> @@ -195,7 +222,16 @@ static void sugov_iowait_boost(struct sugov_cpu >> *sg_cpu, unsigned long *util, >> *util = boost_util; >> *max = boost_max; >> } >> - sg_cpu->iowait_boost >>= 1; >> + >> + /* >> + * Incase iowait boost just happened on this CPU, don't reduce it right >> + * away since then the iowait boost will never increase on subsequent >> + * in_iowait wakeups. >> + */ >> + if (flags & SCHED_CPUFREQ_IOWAIT && this_cpu_ptr(_cpu) == sg_cpu) >> + return; >> + >> + sugov_decay_iowait_boost(sg_cpu); > > Mmm, do we need to decay even when we are computing frequency requests > for a domain? > > Considering it a
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Hi Joel, On 09/07/17 10:08, Joel Fernandes wrote: > Currently the iowait_boost feature in schedutil makes the frequency go to max. > This feature was added to handle a case that Peter described where the > throughput of operations involving continuous I/O requests [1] is reduced due > to running at a lower frequency, however the lower throughput itself causes > utilization to be low and hence causing frequency to be low hence its "stuck". > > Instead of going to max, its also possible to achieve the same effect by > ramping up to max if there are repeated in_iowait wakeups happening. This > patch > is an attempt to do that. We start from a lower frequency (iowait_boost_min) > and double the boost for every consecutive iowait update until we reach the > maximum iowait boost frequency (iowait_boost_max). > > I ran a synthetic test on an x86 machine with intel_pstate in passive mode > using schedutil. The patch achieves the desired effect as the existing > behavior. Also tested on ARM64 platform and see that there the transient > iowait > requests aren't causing frequency spikes. > > [1] https://patchwork.kernel.org/patch/9735885/ > > Cc: Srinivas Pandruvada> Cc: Len Brown > Cc: Rafael J. Wysocki > Cc: Viresh Kumar > Cc: Ingo Molnar > Cc: Peter Zijlstra > Suggested-by: Peter Zijlstra > Signed-off-by: Joel Fernandes > --- > Changes since v1: > - not using tunables to plainly turn off iowait boost anymore > > kernel/sched/cpufreq_schedutil.c | 52 > ++-- > 1 file changed, 45 insertions(+), 7 deletions(-) > > diff --git a/kernel/sched/cpufreq_schedutil.c > b/kernel/sched/cpufreq_schedutil.c > index 622eed1b7658..4d9e8b96bed1 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -53,7 +53,9 @@ struct sugov_cpu { > struct update_util_data update_util; > struct sugov_policy *sg_policy; > > + bool prev_iowait_boost; > unsigned long iowait_boost; > + unsigned long iowait_boost_min; > unsigned long iowait_boost_max; > u64 last_update; > > @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, > unsigned long *max) > *max = cfs_max; > } > > +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) > +{ > + sg_cpu->iowait_boost >>= 1; > + > + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) > + sg_cpu->iowait_boost = 0; > +} > + > static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, > unsigned int flags) > { > if (flags & SCHED_CPUFREQ_IOWAIT) { > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; > + /* Remember for next time that we did an iowait boost */ > + sg_cpu->prev_iowait_boost = true; > + if (sg_cpu->iowait_boost) { > + sg_cpu->iowait_boost <<= 1; > + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, > +sg_cpu->iowait_boost_max); > + } else { > + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; > + } > } else if (sg_cpu->iowait_boost) { > s64 delta_ns = time - sg_cpu->last_update; > > /* Clear iowait_boost if the CPU apprears to have been idle. */ > if (delta_ns > TICK_NSEC) > sg_cpu->iowait_boost = 0; In this case we might just also want to reset prev_iowait_boost unconditionally and return. > + > + /* > + * Since we don't decay iowait_boost when its consumed during > + * the previous SCHED_CPUFREQ_IOWAIT update, decay it now. > + */ Code below seems pretty self-explaning to me. :) > + if (sg_cpu->prev_iowait_boost) { > + sugov_decay_iowait_boost(sg_cpu); > + sg_cpu->prev_iowait_boost = false; > + } > } > } > > static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util, > -unsigned long *max) > +unsigned long *max, unsigned int flags) > { > unsigned long boost_util = sg_cpu->iowait_boost; > unsigned long boost_max = sg_cpu->iowait_boost_max; > @@ -195,7 +222,16 @@ static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, > unsigned long *util, > *util = boost_util; > *max = boost_max; > } > - sg_cpu->iowait_boost >>= 1; > + > + /* > + * Incase iowait boost just happened on this CPU, don't reduce it right > + * away since then the iowait boost will never increase on subsequent > + * in_iowait wakeups. > + */ > + if (flags & SCHED_CPUFREQ_IOWAIT &&
Re: [PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Hi Joel, On 09/07/17 10:08, Joel Fernandes wrote: > Currently the iowait_boost feature in schedutil makes the frequency go to max. > This feature was added to handle a case that Peter described where the > throughput of operations involving continuous I/O requests [1] is reduced due > to running at a lower frequency, however the lower throughput itself causes > utilization to be low and hence causing frequency to be low hence its "stuck". > > Instead of going to max, its also possible to achieve the same effect by > ramping up to max if there are repeated in_iowait wakeups happening. This > patch > is an attempt to do that. We start from a lower frequency (iowait_boost_min) > and double the boost for every consecutive iowait update until we reach the > maximum iowait boost frequency (iowait_boost_max). > > I ran a synthetic test on an x86 machine with intel_pstate in passive mode > using schedutil. The patch achieves the desired effect as the existing > behavior. Also tested on ARM64 platform and see that there the transient > iowait > requests aren't causing frequency spikes. > > [1] https://patchwork.kernel.org/patch/9735885/ > > Cc: Srinivas Pandruvada > Cc: Len Brown > Cc: Rafael J. Wysocki > Cc: Viresh Kumar > Cc: Ingo Molnar > Cc: Peter Zijlstra > Suggested-by: Peter Zijlstra > Signed-off-by: Joel Fernandes > --- > Changes since v1: > - not using tunables to plainly turn off iowait boost anymore > > kernel/sched/cpufreq_schedutil.c | 52 > ++-- > 1 file changed, 45 insertions(+), 7 deletions(-) > > diff --git a/kernel/sched/cpufreq_schedutil.c > b/kernel/sched/cpufreq_schedutil.c > index 622eed1b7658..4d9e8b96bed1 100644 > --- a/kernel/sched/cpufreq_schedutil.c > +++ b/kernel/sched/cpufreq_schedutil.c > @@ -53,7 +53,9 @@ struct sugov_cpu { > struct update_util_data update_util; > struct sugov_policy *sg_policy; > > + bool prev_iowait_boost; > unsigned long iowait_boost; > + unsigned long iowait_boost_min; > unsigned long iowait_boost_max; > u64 last_update; > > @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, > unsigned long *max) > *max = cfs_max; > } > > +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) > +{ > + sg_cpu->iowait_boost >>= 1; > + > + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) > + sg_cpu->iowait_boost = 0; > +} > + > static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, > unsigned int flags) > { > if (flags & SCHED_CPUFREQ_IOWAIT) { > - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; > + /* Remember for next time that we did an iowait boost */ > + sg_cpu->prev_iowait_boost = true; > + if (sg_cpu->iowait_boost) { > + sg_cpu->iowait_boost <<= 1; > + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, > +sg_cpu->iowait_boost_max); > + } else { > + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; > + } > } else if (sg_cpu->iowait_boost) { > s64 delta_ns = time - sg_cpu->last_update; > > /* Clear iowait_boost if the CPU apprears to have been idle. */ > if (delta_ns > TICK_NSEC) > sg_cpu->iowait_boost = 0; In this case we might just also want to reset prev_iowait_boost unconditionally and return. > + > + /* > + * Since we don't decay iowait_boost when its consumed during > + * the previous SCHED_CPUFREQ_IOWAIT update, decay it now. > + */ Code below seems pretty self-explaning to me. :) > + if (sg_cpu->prev_iowait_boost) { > + sugov_decay_iowait_boost(sg_cpu); > + sg_cpu->prev_iowait_boost = false; > + } > } > } > > static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util, > -unsigned long *max) > +unsigned long *max, unsigned int flags) > { > unsigned long boost_util = sg_cpu->iowait_boost; > unsigned long boost_max = sg_cpu->iowait_boost_max; > @@ -195,7 +222,16 @@ static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, > unsigned long *util, > *util = boost_util; > *max = boost_max; > } > - sg_cpu->iowait_boost >>= 1; > + > + /* > + * Incase iowait boost just happened on this CPU, don't reduce it right > + * away since then the iowait boost will never increase on subsequent > + * in_iowait wakeups. > + */ > + if (flags & SCHED_CPUFREQ_IOWAIT && this_cpu_ptr(_cpu) == sg_cpu) > + return; > + > + sugov_decay_iowait_boost(sg_cpu); Mmm, do we need to decay even when we are computing frequency requests for a domain?
[PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Currently the iowait_boost feature in schedutil makes the frequency go to max. This feature was added to handle a case that Peter described where the throughput of operations involving continuous I/O requests [1] is reduced due to running at a lower frequency, however the lower throughput itself causes utilization to be low and hence causing frequency to be low hence its "stuck". Instead of going to max, its also possible to achieve the same effect by ramping up to max if there are repeated in_iowait wakeups happening. This patch is an attempt to do that. We start from a lower frequency (iowait_boost_min) and double the boost for every consecutive iowait update until we reach the maximum iowait boost frequency (iowait_boost_max). I ran a synthetic test on an x86 machine with intel_pstate in passive mode using schedutil. The patch achieves the desired effect as the existing behavior. Also tested on ARM64 platform and see that there the transient iowait requests aren't causing frequency spikes. [1] https://patchwork.kernel.org/patch/9735885/ Cc: Srinivas PandruvadaCc: Len Brown Cc: Rafael J. Wysocki Cc: Viresh Kumar Cc: Ingo Molnar Cc: Peter Zijlstra Suggested-by: Peter Zijlstra Signed-off-by: Joel Fernandes --- Changes since v1: - not using tunables to plainly turn off iowait boost anymore kernel/sched/cpufreq_schedutil.c | 52 ++-- 1 file changed, 45 insertions(+), 7 deletions(-) diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 622eed1b7658..4d9e8b96bed1 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -53,7 +53,9 @@ struct sugov_cpu { struct update_util_data update_util; struct sugov_policy *sg_policy; + bool prev_iowait_boost; unsigned long iowait_boost; + unsigned long iowait_boost_min; unsigned long iowait_boost_max; u64 last_update; @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, unsigned long *max) *max = cfs_max; } +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) +{ + sg_cpu->iowait_boost >>= 1; + + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) + sg_cpu->iowait_boost = 0; +} + static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, unsigned int flags) { if (flags & SCHED_CPUFREQ_IOWAIT) { - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; + /* Remember for next time that we did an iowait boost */ + sg_cpu->prev_iowait_boost = true; + if (sg_cpu->iowait_boost) { + sg_cpu->iowait_boost <<= 1; + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, + sg_cpu->iowait_boost_max); + } else { + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; + } } else if (sg_cpu->iowait_boost) { s64 delta_ns = time - sg_cpu->last_update; /* Clear iowait_boost if the CPU apprears to have been idle. */ if (delta_ns > TICK_NSEC) sg_cpu->iowait_boost = 0; + + /* +* Since we don't decay iowait_boost when its consumed during +* the previous SCHED_CPUFREQ_IOWAIT update, decay it now. +*/ + if (sg_cpu->prev_iowait_boost) { + sugov_decay_iowait_boost(sg_cpu); + sg_cpu->prev_iowait_boost = false; + } } } static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util, - unsigned long *max) + unsigned long *max, unsigned int flags) { unsigned long boost_util = sg_cpu->iowait_boost; unsigned long boost_max = sg_cpu->iowait_boost_max; @@ -195,7 +222,16 @@ static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util, *util = boost_util; *max = boost_max; } - sg_cpu->iowait_boost >>= 1; + + /* +* Incase iowait boost just happened on this CPU, don't reduce it right +* away since then the iowait boost will never increase on subsequent +* in_iowait wakeups. +*/ + if (flags & SCHED_CPUFREQ_IOWAIT && this_cpu_ptr(_cpu) == sg_cpu) + return; + + sugov_decay_iowait_boost(sg_cpu); } #ifdef CONFIG_NO_HZ_COMMON @@ -233,7 +269,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time, next_f = policy->cpuinfo.max_freq; } else { sugov_get_util(, ); -
[PATCH RFC v4] cpufreq: schedutil: Make iowait boost more energy efficient
Currently the iowait_boost feature in schedutil makes the frequency go to max. This feature was added to handle a case that Peter described where the throughput of operations involving continuous I/O requests [1] is reduced due to running at a lower frequency, however the lower throughput itself causes utilization to be low and hence causing frequency to be low hence its "stuck". Instead of going to max, its also possible to achieve the same effect by ramping up to max if there are repeated in_iowait wakeups happening. This patch is an attempt to do that. We start from a lower frequency (iowait_boost_min) and double the boost for every consecutive iowait update until we reach the maximum iowait boost frequency (iowait_boost_max). I ran a synthetic test on an x86 machine with intel_pstate in passive mode using schedutil. The patch achieves the desired effect as the existing behavior. Also tested on ARM64 platform and see that there the transient iowait requests aren't causing frequency spikes. [1] https://patchwork.kernel.org/patch/9735885/ Cc: Srinivas Pandruvada Cc: Len Brown Cc: Rafael J. Wysocki Cc: Viresh Kumar Cc: Ingo Molnar Cc: Peter Zijlstra Suggested-by: Peter Zijlstra Signed-off-by: Joel Fernandes --- Changes since v1: - not using tunables to plainly turn off iowait boost anymore kernel/sched/cpufreq_schedutil.c | 52 ++-- 1 file changed, 45 insertions(+), 7 deletions(-) diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c index 622eed1b7658..4d9e8b96bed1 100644 --- a/kernel/sched/cpufreq_schedutil.c +++ b/kernel/sched/cpufreq_schedutil.c @@ -53,7 +53,9 @@ struct sugov_cpu { struct update_util_data update_util; struct sugov_policy *sg_policy; + bool prev_iowait_boost; unsigned long iowait_boost; + unsigned long iowait_boost_min; unsigned long iowait_boost_max; u64 last_update; @@ -168,22 +170,47 @@ static void sugov_get_util(unsigned long *util, unsigned long *max) *max = cfs_max; } +static void sugov_decay_iowait_boost(struct sugov_cpu *sg_cpu) +{ + sg_cpu->iowait_boost >>= 1; + + if (sg_cpu->iowait_boost < sg_cpu->iowait_boost_min) + sg_cpu->iowait_boost = 0; +} + static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, unsigned int flags) { if (flags & SCHED_CPUFREQ_IOWAIT) { - sg_cpu->iowait_boost = sg_cpu->iowait_boost_max; + /* Remember for next time that we did an iowait boost */ + sg_cpu->prev_iowait_boost = true; + if (sg_cpu->iowait_boost) { + sg_cpu->iowait_boost <<= 1; + sg_cpu->iowait_boost = min(sg_cpu->iowait_boost, + sg_cpu->iowait_boost_max); + } else { + sg_cpu->iowait_boost = sg_cpu->iowait_boost_min; + } } else if (sg_cpu->iowait_boost) { s64 delta_ns = time - sg_cpu->last_update; /* Clear iowait_boost if the CPU apprears to have been idle. */ if (delta_ns > TICK_NSEC) sg_cpu->iowait_boost = 0; + + /* +* Since we don't decay iowait_boost when its consumed during +* the previous SCHED_CPUFREQ_IOWAIT update, decay it now. +*/ + if (sg_cpu->prev_iowait_boost) { + sugov_decay_iowait_boost(sg_cpu); + sg_cpu->prev_iowait_boost = false; + } } } static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util, - unsigned long *max) + unsigned long *max, unsigned int flags) { unsigned long boost_util = sg_cpu->iowait_boost; unsigned long boost_max = sg_cpu->iowait_boost_max; @@ -195,7 +222,16 @@ static void sugov_iowait_boost(struct sugov_cpu *sg_cpu, unsigned long *util, *util = boost_util; *max = boost_max; } - sg_cpu->iowait_boost >>= 1; + + /* +* Incase iowait boost just happened on this CPU, don't reduce it right +* away since then the iowait boost will never increase on subsequent +* in_iowait wakeups. +*/ + if (flags & SCHED_CPUFREQ_IOWAIT && this_cpu_ptr(_cpu) == sg_cpu) + return; + + sugov_decay_iowait_boost(sg_cpu); } #ifdef CONFIG_NO_HZ_COMMON @@ -233,7 +269,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time, next_f = policy->cpuinfo.max_freq; } else { sugov_get_util(, ); - sugov_iowait_boost(sg_cpu, , ); + sugov_iowait_boost(sg_cpu, , , flags); next_f = get_next_freq(sg_policy, util, max); /*