Re: [PATCH 3/6] cpufreq: schedutil: ensure max frequency while running RT/DL tasks

2017-04-07 Thread Patrick Bellasi
On 07-Apr 17:30, Peter Zijlstra wrote:
> On Thu, Mar 02, 2017 at 03:45:04PM +, Patrick Bellasi wrote:
> > +   struct task_struct *curr = cpu_curr(smp_processor_id());
> 
> Isn't that a weird way of writing 'current' ?

Right... (cough)... it's a newfangled way. :-/

Will clean up before reposting the series.

-- 
#include 

Patrick Bellasi


Re: [PATCH 3/6] cpufreq: schedutil: ensure max frequency while running RT/DL tasks

2017-04-07 Thread Peter Zijlstra
On Thu, Mar 02, 2017 at 03:45:04PM +, Patrick Bellasi wrote:
> + struct task_struct *curr = cpu_curr(smp_processor_id());

Isn't that a weird way of writing 'current' ?


Re: [PATCH 3/6] cpufreq: schedutil: ensure max frequency while running RT/DL tasks

2017-03-17 Thread Patrick Bellasi
On 16-Mar 00:32, Rafael J. Wysocki wrote:
> On Wed, Mar 15, 2017 at 3:40 PM, Patrick Bellasi wrote:
> > On 15-Mar 12:52, Rafael J. Wysocki wrote:
> >> On Friday, March 03, 2017 12:38:30 PM Patrick Bellasi wrote:
> >> > On 03-Mar 14:01, Viresh Kumar wrote:
> >> > > On 02-03-17, 15:45, Patrick Bellasi wrote:
> >> > > > diff --git a/kernel/sched/cpufreq_schedutil.c 
> >> > > > b/kernel/sched/cpufreq_schedutil.c
> >> > > > @@ -293,15 +305,29 @@ static void sugov_update_shared(struct 
> >> > > > update_util_data *hook, u64 time,
> >> > > > if (curr == sg_policy->thread)
> >> > > > goto done;
> >> > > >
> >> > > > +   /*
> >> > > > +* While RT/DL tasks are running we do not want FAIR tasks to
> >> > > > +* overwrite this CPU's flags, still we can update 
> >> > > > utilization and
> >> > > > +* frequency (if required/possible) to be fair with these 
> >> > > > tasks.
> >> > > > +*/
> >> > > > +   rt_mode = task_has_dl_policy(curr) ||
> >> > > > + task_has_rt_policy(curr) ||
> >> > > > + (flags & SCHED_CPUFREQ_RT_DL);
> >> > > > +   if (rt_mode)
> >> > > > +   sg_cpu->flags |= flags;
> >> > > > +   else
> >> > > > +   sg_cpu->flags = flags;
> >> > >
> >> > > This looks so hacked up :)
> >> >
> >> > It is... a bit... :)
> >> >
> >> > > Wouldn't it be better to let the scheduler tell us what all kind of 
> >> > > tasks it has
> >> > > in the rq of a CPU and pass a mask of flags?
> >> >
> >> > That would definitively report a more consistent view of what's going
> >> > on on each CPU.
> >> >
> >> > > I think it wouldn't be difficult (or time consuming) for the
> >> > > scheduler to know that, but I am not 100% sure.
> >> >
> >> > Main issue perhaps is that cpufreq_update_{util,this_cpu} are
> >> > currently called by the scheduling classes codes and not from the core
> >> > scheduler. However I agree that it should be possible to build up such
> >> > information and make it available to the scheduling classes code.
> >> >
> >> > I'll have a look at that.
> >> >
> >> > > IOW, the flags field in cpufreq_update_util() will represent all tasks 
> >> > > in the
> >> > > rq, instead of just the task that is getting enqueued/dequeued..
> >> > >
> >> > > And obviously we need to get some utilization numbers for the RT and 
> >> > > DL tasks
> >> > > going forward, switching to max isn't going to work for ever :)
> >> >
> >> > Regarding this last point, there are WIP patches Juri is working on to
> >> > feed DL demands to schedutil, his presentation at last ELC partially
> >> > covers these developments:
> >> >   
> >> > https://www.youtube.com/watch?v=wzrcWNIneWY=37=PLbzoR-pLrL6pSlkQDW7RpnNLuxPq6WVUR
> >> >
> >> > Instead, RT tasks are currently covered by an rt_avg metric which we
> >> > already know is not fitting for most purposes.
> >> > It seems that the main goal is twofold: move people to DL whenever
> >> > possible otherwise live with the go-to-max policy which is the only
> >> > sensible solution to satisfy the RT's class main goal, i.e. latency
> >> > reduction.
> >> >
> >> > Of course such a go-to-max policy for all RT tasks we already know
> >> > that is going to destroy energy on many different mobile scenarios.
> >> >
> >> > As a possible mitigation for that, while still being compliant with
> >> > the main RT's class goal, we recently posted the SchedTune v3
> >> > proposal:
> >> >   https://lkml.org/lkml/2017/2/28/355
> >> >
> >> > In that proposal, the simple usage of CGroups and the new capacity_max
> >> > attribute of the (existing) CPU controller should allow to define what
> >> > is the "max" value which is just enough to match the latency
> >> > constraints of a mobile application without sacrificing too much
> >> > energy.
> >
> > Given the following interesting question, let's add Thomas Gleixner to
> > the discussion, which can be interested in sharing his viewpoint.
> >
> >> And who's going to figure out what "max" value is most suitable?  And how?
> >
> > That's usually up to the system profiling which is part of the
> > platform optimizations and tunings.
> > For example it's possible to run  experiments to measure the bandwidth
> > and (completion) latency requirements from the RT workloads.
> >
> > It's something which people developing embedded/mobile systems are
> > quite aware of.
> 
> Well, I was expecting an answer like this to be honest and let me say
> that it is not too convincing from my perspective.
> 
> At least throwing embedded and mobile into one bag seems to be a
> stretch, because the usage patterns of those two groups are quite
> different, even though they may be similar from the hardware POV.
> 
> Mobile are mostly used as general-purpose computers nowadays (and I
> guess we're essentially talking about anything running Android, not
> just phones, aren't we?) with applications installed by users rather
> than by system integrators, so I'm 

Re: [PATCH 3/6] cpufreq: schedutil: ensure max frequency while running RT/DL tasks

2017-03-15 Thread Rafael J. Wysocki
On Wed, Mar 15, 2017 at 3:40 PM, Patrick Bellasi wrote:
> On 15-Mar 12:52, Rafael J. Wysocki wrote:
>> On Friday, March 03, 2017 12:38:30 PM Patrick Bellasi wrote:
>> > On 03-Mar 14:01, Viresh Kumar wrote:
>> > > On 02-03-17, 15:45, Patrick Bellasi wrote:
>> > > > diff --git a/kernel/sched/cpufreq_schedutil.c 
>> > > > b/kernel/sched/cpufreq_schedutil.c
>> > > > @@ -293,15 +305,29 @@ static void sugov_update_shared(struct 
>> > > > update_util_data *hook, u64 time,
>> > > > if (curr == sg_policy->thread)
>> > > > goto done;
>> > > >
>> > > > +   /*
>> > > > +* While RT/DL tasks are running we do not want FAIR tasks to
>> > > > +* overwrite this CPU's flags, still we can update utilization 
>> > > > and
>> > > > +* frequency (if required/possible) to be fair with these 
>> > > > tasks.
>> > > > +*/
>> > > > +   rt_mode = task_has_dl_policy(curr) ||
>> > > > + task_has_rt_policy(curr) ||
>> > > > + (flags & SCHED_CPUFREQ_RT_DL);
>> > > > +   if (rt_mode)
>> > > > +   sg_cpu->flags |= flags;
>> > > > +   else
>> > > > +   sg_cpu->flags = flags;
>> > >
>> > > This looks so hacked up :)
>> >
>> > It is... a bit... :)
>> >
>> > > Wouldn't it be better to let the scheduler tell us what all kind of 
>> > > tasks it has
>> > > in the rq of a CPU and pass a mask of flags?
>> >
>> > That would definitively report a more consistent view of what's going
>> > on on each CPU.
>> >
>> > > I think it wouldn't be difficult (or time consuming) for the
>> > > scheduler to know that, but I am not 100% sure.
>> >
>> > Main issue perhaps is that cpufreq_update_{util,this_cpu} are
>> > currently called by the scheduling classes codes and not from the core
>> > scheduler. However I agree that it should be possible to build up such
>> > information and make it available to the scheduling classes code.
>> >
>> > I'll have a look at that.
>> >
>> > > IOW, the flags field in cpufreq_update_util() will represent all tasks 
>> > > in the
>> > > rq, instead of just the task that is getting enqueued/dequeued..
>> > >
>> > > And obviously we need to get some utilization numbers for the RT and DL 
>> > > tasks
>> > > going forward, switching to max isn't going to work for ever :)
>> >
>> > Regarding this last point, there are WIP patches Juri is working on to
>> > feed DL demands to schedutil, his presentation at last ELC partially
>> > covers these developments:
>> >   
>> > https://www.youtube.com/watch?v=wzrcWNIneWY=37=PLbzoR-pLrL6pSlkQDW7RpnNLuxPq6WVUR
>> >
>> > Instead, RT tasks are currently covered by an rt_avg metric which we
>> > already know is not fitting for most purposes.
>> > It seems that the main goal is twofold: move people to DL whenever
>> > possible otherwise live with the go-to-max policy which is the only
>> > sensible solution to satisfy the RT's class main goal, i.e. latency
>> > reduction.
>> >
>> > Of course such a go-to-max policy for all RT tasks we already know
>> > that is going to destroy energy on many different mobile scenarios.
>> >
>> > As a possible mitigation for that, while still being compliant with
>> > the main RT's class goal, we recently posted the SchedTune v3
>> > proposal:
>> >   https://lkml.org/lkml/2017/2/28/355
>> >
>> > In that proposal, the simple usage of CGroups and the new capacity_max
>> > attribute of the (existing) CPU controller should allow to define what
>> > is the "max" value which is just enough to match the latency
>> > constraints of a mobile application without sacrificing too much
>> > energy.
>
> Given the following interesting question, let's add Thomas Gleixner to
> the discussion, which can be interested in sharing his viewpoint.
>
>> And who's going to figure out what "max" value is most suitable?  And how?
>
> That's usually up to the system profiling which is part of the
> platform optimizations and tunings.
> For example it's possible to run  experiments to measure the bandwidth
> and (completion) latency requirements from the RT workloads.
>
> It's something which people developing embedded/mobile systems are
> quite aware of.

Well, I was expecting an answer like this to be honest and let me say
that it is not too convincing from my perspective.

At least throwing embedded and mobile into one bag seems to be a
stretch, because the usage patterns of those two groups are quite
different, even though they may be similar from the hardware POV.

Mobile devices are mostly used as general-purpose computers nowadays (and I
guess we're essentially talking about anything running Android, not
just phones, aren't we?) with applications installed by users rather
than by system integrators, so I'm doubtful about the viability of the
"system integrators should take care of it" assumption in this
particular case.

> I'm also quite confident on saying that most of
> them can agree that just going to the max OPP, each and every time a
> RT task becomes RUNNABLE, 

Re: [PATCH 3/6] cpufreq: schedutil: ensure max frequency while running RT/DL tasks

2017-03-15 Thread Patrick Bellasi
On 15-Mar 12:52, Rafael J. Wysocki wrote:
> On Friday, March 03, 2017 12:38:30 PM Patrick Bellasi wrote:
> > On 03-Mar 14:01, Viresh Kumar wrote:
> > > On 02-03-17, 15:45, Patrick Bellasi wrote:
> > > > diff --git a/kernel/sched/cpufreq_schedutil.c 
> > > > b/kernel/sched/cpufreq_schedutil.c
> > > > @@ -293,15 +305,29 @@ static void sugov_update_shared(struct 
> > > > update_util_data *hook, u64 time,
> > > > if (curr == sg_policy->thread)
> > > > goto done;
> > > >  
> > > > +   /*
> > > > +* While RT/DL tasks are running we do not want FAIR tasks to
> > > > +* overwrite this CPU's flags, still we can update utilization 
> > > > and
> > > > +* frequency (if required/possible) to be fair with these tasks.
> > > > +*/
> > > > +   rt_mode = task_has_dl_policy(curr) ||
> > > > + task_has_rt_policy(curr) ||
> > > > + (flags & SCHED_CPUFREQ_RT_DL);
> > > > +   if (rt_mode)
> > > > +   sg_cpu->flags |= flags;
> > > > +   else
> > > > +   sg_cpu->flags = flags;
> > > 
> > > This looks so hacked up :)
> > 
> > It is... a bit... :)
> > 
> > > Wouldn't it be better to let the scheduler tell us what all kind of tasks 
> > > it has
> > > in the rq of a CPU and pass a mask of flags?
> > 
> > That would definitively report a more consistent view of what's going
> > on on each CPU.
> > 
> > > I think it wouldn't be difficult (or time consuming) for the
> > > scheduler to know that, but I am not 100% sure.
> > 
> > Main issue perhaps is that cpufreq_update_{util,this_cpu} are
> > currently called by the scheduling classes codes and not from the core
> > scheduler. However I agree that it should be possible to build up such
> > information and make it available to the scheduling classes code.
> > 
> > I'll have a look at that.
> > 
> > > IOW, the flags field in cpufreq_update_util() will represent all tasks in 
> > > the
> > > rq, instead of just the task that is getting enqueued/dequeued..
> > > 
> > > And obviously we need to get some utilization numbers for the RT and DL 
> > > tasks
> > > going forward, switching to max isn't going to work for ever :)
> > 
> > Regarding this last point, there are WIP patches Juri is working on to
> > feed DL demands to schedutil, his presentation at last ELC partially
> > covers these developments:
> >   
> > https://www.youtube.com/watch?v=wzrcWNIneWY=37=PLbzoR-pLrL6pSlkQDW7RpnNLuxPq6WVUR
> > 
> > Instead, RT tasks are currently covered by an rt_avg metric which we
> > already know is not fitting for most purposes.
> > It seems that the main goal is twofold: move people to DL whenever
> > possible otherwise live with the go-to-max policy which is the only
> > sensible solution to satisfy the RT's class main goal, i.e. latency
> > reduction.
> > 
> > Of course such a go-to-max policy for all RT tasks we already know
> > that is going to destroy energy on many different mobile scenarios.
> > 
> > As a possible mitigation for that, while still being compliant with
> > the main RT's class goal, we recently posted the SchedTune v3
> > proposal:
> >   https://lkml.org/lkml/2017/2/28/355
> > 
> > In that proposal, the simple usage of CGroups and the new capacity_max
> > attribute of the (existing) CPU controller should allow to define what
> > is the "max" value which is just enough to match the latency
> > constraints of a mobile application without sacrificing too much
> > energy.
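
For reference, the flag handling in the quoted hunk can be modeled in
userspace as below. This is only a sketch: struct sg_cpu_model and
update_flags() are hypothetical stand-ins, the curr_is_rt_or_dl argument
replaces the task_has_dl_policy()/task_has_rt_policy() checks, and the
flag values are assumed to mirror include/linux/sched/cpufreq.h at the
time of this thread.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's SCHED_CPUFREQ_* definitions. */
#define SCHED_CPUFREQ_RT	(1U << 0)
#define SCHED_CPUFREQ_DL	(1U << 1)
#define SCHED_CPUFREQ_IOWAIT	(1U << 2)
#define SCHED_CPUFREQ_RT_DL	(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)

/* Hypothetical stand-in for the per-CPU governor state; only flags is modeled. */
struct sg_cpu_model {
	unsigned int flags;
};

/*
 * While an RT/DL task is running, or the update itself carries an RT/DL
 * flag, new flags are OR-ed in so that a FAIR update cannot clear them.
 * Otherwise the stored flags are simply overwritten.
 */
static void update_flags(struct sg_cpu_model *sg_cpu, bool curr_is_rt_or_dl,
			 unsigned int flags)
{
	bool rt_mode = curr_is_rt_or_dl || (flags & SCHED_CPUFREQ_RT_DL);

	if (rt_mode)
		sg_cpu->flags |= flags;
	else
		sg_cpu->flags = flags;
}
```

With an RT task running, a FAIR update passing flags == 0 leaves a
previously stored SCHED_CPUFREQ_RT bit in place; once a FAIR task is
current, the same update resets the stored flags to 0.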

Given the following interesting question, let's add Thomas Gleixner to
the discussion, who may be interested in sharing his viewpoint.
 
> And who's going to figure out what "max" value is most suitable?  And how?

That's usually up to system profiling, which is part of platform
optimization and tuning.
For example, it's possible to run experiments to measure the bandwidth
and (completion) latency requirements of the RT workloads.

It's something which people developing embedded/mobile systems are
quite aware of. I'm also quite confident in saying that most of
them would agree that just going to the max OPP, each and every time an
RT task becomes RUNNABLE, is more likely to hurt
than to give benefits.
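
To make the trade-off concrete, here is a minimal userspace sketch of
how a cap on the "max" value could bound the frequency requested for RT
tasks. get_next_freq() follows the schedutil mapping
next_freq = 1.25 * max_freq * util / max. rt_freq_capped() and the way
capacity_max is turned into a frequency are assumptions made for
illustration only, not what the SchedTune patches implement.

```c
#define SCHED_CAPACITY_SCALE	1024UL

/* schedutil-style mapping: next_freq = 1.25 * max_freq * util / max */
static unsigned long get_next_freq(unsigned long max_freq,
				   unsigned long util, unsigned long max)
{
	return (max_freq + (max_freq >> 2)) * util / max;
}

/*
 * Hypothetical: frequency requested while an RT task runs, treating
 * capacity_max as the utilization to provision for. The result never
 * exceeds max_freq, so capacity_max == SCHED_CAPACITY_SCALE reproduces
 * the go-to-max behaviour.
 */
static unsigned long rt_freq_capped(unsigned long max_freq,
				    unsigned long capacity_max)
{
	unsigned long freq = get_next_freq(max_freq, capacity_max,
					   SCHED_CAPACITY_SCALE);

	return freq < max_freq ? freq : max_freq;
}
```

For a 2000000 kHz policy, a capacity_max of 1024 still selects
2000000 kHz, while a capacity_max of 512 selects 1250000 kHz.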

AFAIK the current policy (i.e. "go to max") has been adopted for the
following main reasons, which I'm reporting with some observations.


.:: Lack of a suitable utilization metric for RT tasks

 There is actually a utilization signal (rq->rt_avg) but it has been
 verified to be "too slow" for the practical usage of driving OPP
 selection.
 Other possibilities are perhaps under exploration but they are not
 yet there.


.:: Promote the migration from RT to DEADLINE

 Which makes a lot of sense for many existing use-cases, starting from
 Android as well. However, it's also true that we cannot (at least yet)
 split the world into DEADLINE vs FAIR.
 There is still, and there will be, a fair number of RT tasks which
 it just makes sense to serve at best both 

Re: [PATCH 3/6] cpufreq: schedutil: ensure max frequency while running RT/DL tasks

2017-03-15 Thread Rafael J. Wysocki
On Friday, March 03, 2017 12:38:30 PM Patrick Bellasi wrote:
> On 03-Mar 14:01, Viresh Kumar wrote:
> > On 02-03-17, 15:45, Patrick Bellasi wrote:
> > > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > > @@ -293,15 +305,29 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
> > >  	if (curr == sg_policy->thread)
> > >  		goto done;
> > >  
> > > +	/*
> > > +	 * While RT/DL tasks are running we do not want FAIR tasks to
> > > +	 * overwrite this CPU's flags, still we can update utilization and
> > > +	 * frequency (if required/possible) to be fair with these tasks.
> > > +	 */
> > > +	rt_mode = task_has_dl_policy(curr) ||
> > > +		  task_has_rt_policy(curr) ||
> > > +		  (flags & SCHED_CPUFREQ_RT_DL);
> > > +	if (rt_mode)
> > > +		sg_cpu->flags |= flags;
> > > +	else
> > > +		sg_cpu->flags = flags;
> > 
> > This looks so hacked up :)
> 
> It is... a bit... :)
> 
> > Wouldn't it be better to let the scheduler tell us what all kind of tasks it has
> > in the rq of a CPU and pass a mask of flags?
> 
> That would definitively report a more consistent view of what's going
> on on each CPU.
> 
> > I think it wouldn't be difficult (or time consuming) for the
> > scheduler to know that, but I am not 100% sure.
> 
> Main issue perhaps is that cpufreq_update_{util,this_cpu} are
> currently called by the scheduling classes codes and not from the core
> scheduler. However I agree that it should be possible to build up such
> information and make it available to the scheduling classes code.
> 
> I'll have a look at that.
> 
> > IOW, the flags field in cpufreq_update_util() will represent all tasks in the
> > rq, instead of just the task that is getting enqueued/dequeued..
> > 
> > And obviously we need to get some utilization numbers for the RT and DL tasks
> > going forward, switching to max isn't going to work for ever :)
> 
> Regarding this last point, there are WIP patches Juri is working on to
> feed DL demands to schedutil, his presentation at last ELC partially
> covers these developments:
>   
> https://www.youtube.com/watch?v=wzrcWNIneWY=37=PLbzoR-pLrL6pSlkQDW7RpnNLuxPq6WVUR
> 
> Instead, RT tasks are currently covered by an rt_avg metric which we
> already know is not fitting for most purposes.
> It seems that the main goal is twofold: move people to DL whenever
> possible otherwise live with the go-to-max policy which is the only
> sensible solution to satisfy the RT's class main goal, i.e. latency
> reduction.
> 
> Of course such a go-to-max policy for all RT tasks we already know
> that is going to destroy energy on many different mobile scenarios.
> 
> As a possible mitigation for that, while still being compliant with
> the main RT's class goal, we recently posted the SchedTune v3
> proposal:
>   https://lkml.org/lkml/2017/2/28/355
> 
> In that proposal, the simple usage of CGroups and the new capacity_max
> attribute of the (existing) CPU controller should allow to define what
> is the "max" value which is just enough to match the latency
> constraints of a mobile application without sacrificing too much
> energy.

And who's going to figure out what "max" value is most suitable?  And how?

Thanks,
Rafael



Re: [PATCH 3/6] cpufreq: schedutil: ensure max frequency while running RT/DL tasks

2017-03-03 Thread Patrick Bellasi
On 03-Mar 14:01, Viresh Kumar wrote:
> On 02-03-17, 15:45, Patrick Bellasi wrote:
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > @@ -293,15 +305,29 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
> >  	if (curr == sg_policy->thread)
> >  		goto done;
> >  
> > +	/*
> > +	 * While RT/DL tasks are running we do not want FAIR tasks to
> > +	 * overwrite this CPU's flags, still we can update utilization and
> > +	 * frequency (if required/possible) to be fair with these tasks.
> > +	 */
> > +	rt_mode = task_has_dl_policy(curr) ||
> > +		  task_has_rt_policy(curr) ||
> > +		  (flags & SCHED_CPUFREQ_RT_DL);
> > +	if (rt_mode)
> > +		sg_cpu->flags |= flags;
> > +	else
> > +		sg_cpu->flags = flags;
> 
> This looks so hacked up :)

It is... a bit... :)

> Wouldn't it be better to let the scheduler tell us what all kind of tasks it has
> in the rq of a CPU and pass a mask of flags?

That would definitely report a more consistent view of what's going
on on each CPU.

> I think it wouldn't be difficult (or time consuming) for the
> scheduler to know that, but I am not 100% sure.

The main issue is perhaps that cpufreq_update_{util,this_cpu} are
currently called by the scheduling class code and not from the core
scheduler. However, I agree that it should be possible to build up such
information and make it available to the scheduling class code.

I'll have a look at that.

> IOW, the flags field in cpufreq_update_util() will represent all tasks in the
> rq, instead of just the task that is getting enqueued/dequeued..
> 
> And obviously we need to get some utilization numbers for the RT and DL tasks
> going forward, switching to max isn't going to work for ever :)

Regarding this last point, there are WIP patches Juri is working on to
feed DL demands to schedutil; his presentation at the last ELC partially
covers these developments:
  
https://www.youtube.com/watch?v=wzrcWNIneWY=37=PLbzoR-pLrL6pSlkQDW7RpnNLuxPq6WVUR

RT tasks, instead, are currently covered by an rt_avg metric which we
already know is not a good fit for most purposes.
It seems that the main goal is twofold: move people to DL whenever
possible, and otherwise live with the go-to-max policy, which is the only
sensible solution to satisfy the RT class's main goal, i.e. latency
reduction.

Of course, we already know that such a go-to-max policy for all RT tasks
is going to waste energy in many different mobile scenarios.

As a possible mitigation for that, while still being compliant with
the RT class's main goal, we recently posted the SchedTune v3
proposal:
  https://lkml.org/lkml/2017/2/28/355

In that proposal, the simple usage of CGroups and the new capacity_max
attribute of the (existing) CPU controller should make it possible to
define the "max" value which is just enough to match the latency
constraints of a mobile application without sacrificing too much
energy.
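
To make the capping idea concrete, here is a minimal user-space sketch. It
is not code from the SchedTune series: the helper name rt_target_freq(), the
CAPACITY_SCALE constant and the linear capacity-to-frequency mapping are all
assumptions made for illustration only. It just shows how a per-group cap
could turn "go to max" into "go to a tuned max".

```c
/* Assumed full-capacity value; capacity_max is expressed on this scale. */
#define CAPACITY_SCALE 1024UL

/*
 * Hypothetical helper: frequency to request while an RT task runs, given
 * the policy's maximum frequency (kHz) and a capacity_max cap.
 * A cap at or above CAPACITY_SCALE degenerates to plain go-to-max.
 */
static unsigned long rt_target_freq(unsigned long max_freq,
				    unsigned long capacity_max)
{
	if (capacity_max > CAPACITY_SCALE)
		capacity_max = CAPACITY_SCALE;

	/* Assumed linear mapping between capacity and frequency. */
	return max_freq * capacity_max / CAPACITY_SCALE;
}
```

With a 2 GHz policy, rt_target_freq(2000000, 512) would request half of the
maximum frequency, while a cap of 1024 keeps today's go-to-max behaviour.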

> -- 
> viresh

Cheers Patrick

-- 
#include 

Patrick Bellasi


Re: [PATCH 3/6] cpufreq: schedutil: ensure max frequency while running RT/DL tasks

2017-03-03 Thread Viresh Kumar
On 02-03-17, 15:45, Patrick Bellasi wrote:
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> @@ -293,15 +305,29 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
>  	if (curr == sg_policy->thread)
>  		goto done;
>  
> +	/*
> +	 * While RT/DL tasks are running we do not want FAIR tasks to
> +	 * overwrite this CPU's flags, still we can update utilization and
> +	 * frequency (if required/possible) to be fair with these tasks.
> +	 */
> +	rt_mode = task_has_dl_policy(curr) ||
> +		  task_has_rt_policy(curr) ||
> +		  (flags & SCHED_CPUFREQ_RT_DL);
> +	if (rt_mode)
> +		sg_cpu->flags |= flags;
> +	else
> +		sg_cpu->flags = flags;

This looks so hacked up :)

Wouldn't it be better to let the scheduler tell us what kinds of tasks it has
in the rq of a CPU and pass a mask of flags? I think it wouldn't be difficult
(or time consuming) for the scheduler to know that, but I am not 100% sure.

IOW, the flags field in cpufreq_update_util() will represent all tasks in the
rq, instead of just the task that is getting enqueued/dequeued.
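
A rough user-space sketch of that idea follows. Nothing here is an existing
kernel interface: struct rq_summary, the RQ_HAS_* bits and both helpers are
hypothetical names, and the per-class counters are only assumed to be
available to the scheduler.

```c
#include <stdbool.h>

/* Hypothetical flag bits: one per scheduling class with runnable tasks. */
#define RQ_HAS_FAIR	(1U << 0)
#define RQ_HAS_RT	(1U << 1)
#define RQ_HAS_DL	(1U << 2)

/* Hypothetical per-CPU summary; the real struct rq is far richer. */
struct rq_summary {
	unsigned int nr_fair;
	unsigned int nr_rt;
	unsigned int nr_dl;
};

/* Build a mask describing every class that has runnable tasks on the rq. */
static unsigned int rq_class_mask(const struct rq_summary *rq)
{
	unsigned int mask = 0;

	if (rq->nr_fair)
		mask |= RQ_HAS_FAIR;
	if (rq->nr_rt)
		mask |= RQ_HAS_RT;
	if (rq->nr_dl)
		mask |= RQ_HAS_DL;

	return mask;
}

/* The governor would go to max as long as any RT or DL task is queued. */
static bool wants_max_freq(unsigned int mask)
{
	return (mask & (RQ_HAS_RT | RQ_HAS_DL)) != 0;
}
```

Because the mask is rebuilt from the whole rq, a FAIR enqueue cannot hide a
queued RT task, which is the property the flag juggling in the patch tries
to emulate.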

And obviously we need to get some utilization numbers for the RT and DL tasks
going forward, switching to max isn't going to work forever :)

-- 
viresh


[PATCH 3/6] cpufreq: schedutil: ensure max frequency while running RT/DL tasks

2017-03-02 Thread Patrick Bellasi
The policy in use for RT/DL tasks sets the maximum frequency when a task
in these classes calls cpufreq_update_this_cpu().  However, the
current implementation might cause a frequency drop while an RT/DL task
is still running, just because, for example, a FAIR task wakes up and is
enqueued on the same CPU.

This issue is due to the sg_cpu's flags being overwritten at each call
of sugov_update_*. The wakeup of a FAIR task resets the flags and can
trigger a frequency update, thus affecting the currently running RT/DL
task.

This can be fixed, in shared frequency domains, by adding (instead of
overwriting) the new flags before triggering a frequency update.  This
guarantees that we stay at least at the frequency requested by the RT/DL
class, which is the maximum one for the time being, but can also be lower
when, for example, DL is extended to provide a precise bandwidth
requirement.

Signed-off-by: Patrick Bellasi 
Cc: Ingo Molnar 
Cc: Peter Zijlstra 
Cc: Rafael J. Wysocki 
Cc: Viresh Kumar 
Cc: linux-kernel@vger.kernel.org
Cc: linux...@vger.kernel.org
---
 kernel/sched/cpufreq_schedutil.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index a3fe5e4..b98a167 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -196,10 +196,21 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
 				unsigned int flags)
 {
 	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
+	struct task_struct *curr = cpu_curr(smp_processor_id());
 	struct sugov_policy *sg_policy = sg_cpu->sg_policy;
 	struct cpufreq_policy *policy = sg_policy->policy;
 	unsigned long util, max;
 	unsigned int next_f;
+	bool rt_mode;
+
+	/*
+	 * While RT/DL tasks are running we do not want FAIR tasks to
+	 * overwrite this CPU's flags, still we can update utilization and
+	 * frequency (if required/possible) to be fair with these tasks.
+	 */
+	rt_mode = task_has_dl_policy(curr) ||
+		  task_has_rt_policy(curr) ||
+		  (flags & SCHED_CPUFREQ_RT_DL);
 
 	sugov_set_iowait_boost(sg_cpu, time, flags);
 	sg_cpu->last_update = time;
@@ -207,7 +218,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
 	if (!sugov_should_update_freq(sg_policy, time))
 		return;
 
-	if (flags & SCHED_CPUFREQ_RT_DL) {
+	if (rt_mode) {
 		next_f = policy->cpuinfo.max_freq;
 	} else {
 		sugov_get_util(&util, &max);
@@ -278,6 +289,7 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 	struct task_struct *curr = cpu_curr(cpu);
 	unsigned long util, max;
 	unsigned int next_f;
+	bool rt_mode;
 
 	sugov_get_util(&util, &max);
 
@@ -293,15 +305,29 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 	if (curr == sg_policy->thread)
 		goto done;
 
+	/*
+	 * While RT/DL tasks are running we do not want FAIR tasks to
+	 * overwrite this CPU's flags, still we can update utilization and
+	 * frequency (if required/possible) to be fair with these tasks.
+	 */
+	rt_mode = task_has_dl_policy(curr) ||
+		  task_has_rt_policy(curr) ||
+		  (flags & SCHED_CPUFREQ_RT_DL);
+	if (rt_mode)
+		sg_cpu->flags |= flags;
+	else
+		sg_cpu->flags = flags;
+
 	sg_cpu->util = util;
 	sg_cpu->max = max;
-	sg_cpu->flags = flags;
 
 	sugov_set_iowait_boost(sg_cpu, time, flags);
 	sg_cpu->last_update = time;
 
 	if (sugov_should_update_freq(sg_policy, time)) {
-		next_f = sugov_next_freq_shared(sg_cpu, util, max, flags);
+		next_f = sg_policy->policy->cpuinfo.max_freq;
+		if (!rt_mode)
+			next_f = sugov_next_freq_shared(sg_cpu, util, max, flags);
 		sugov_update_commit(sg_policy, time, next_f);
 	}
 
-- 
2.7.4
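
For readers who want to see the flag handling in isolation, the difference
between overwriting and accumulating can be modelled in a few lines of
user-space C. This is only a sketch of the logic described in the changelog
above: the FLAG_* bits are stand-ins for the kernel's SCHED_CPUFREQ_* flags,
and update_overwrite()/update_accumulate() are hypothetical helpers, not
functions from cpufreq_schedutil.c.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's SCHED_CPUFREQ_* flag bits (assumed values). */
#define FLAG_RT		(1U << 0)
#define FLAG_DL		(1U << 1)
#define FLAG_RT_DL	(FLAG_RT | FLAG_DL)

/* Behaviour before the patch: every update replaces the stored flags. */
static unsigned int update_overwrite(unsigned int stored, unsigned int flags)
{
	(void)stored;	/* the previous request is simply lost */
	return flags;
}

/*
 * Behaviour after the patch: while an RT/DL task is running, or when the
 * update itself carries an RT/DL flag, OR the new flags into the stored
 * ones; otherwise overwrite as before.
 */
static unsigned int update_accumulate(unsigned int stored, unsigned int flags,
				      bool curr_is_rt_dl)
{
	bool rt_mode = curr_is_rt_dl || (flags & FLAG_RT_DL);

	return rt_mode ? (stored | flags) : flags;
}
```

Starting from stored == FLAG_RT, a FAIR wakeup (flags == 0) clears the
request with update_overwrite(), but leaves it in place with
update_accumulate() as long as curr_is_rt_dl is true.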


