Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-02-08 Thread Claudio Scordino

Hi Patrick,

Il 06/02/2018 19:36, Patrick Bellasi ha scritto:

On 06-Feb 19:14, Claudio Scordino wrote:

Hi Patrick,



At first glance, your proposal below makes to make sense.

However, I'm wondering if we cannot get it working using
rq->dl's provided information instead of flags?


Yes, we can use the value of rq->dl to check if there has been an increase of 
the deadline utilization.
Even if schedutil might have been triggered by a different scheduling class, 
the effect should be almost the same.

Below a potential patch. I've kept all frequency update decisions in a single 
point (i.e. sugov_should_update_freq).
Not yet tested (waiting for further comments).


I have a todo list entry to backport/test Peter's series on Android...
will add this patch too. Thanks.


Please discard my latest patch, as the tests on Odroid have shown performance 
regressions
(likely due to sugov_next_freq_shared being called more often)

Never mind. I have already tested another (even simpler) patch.
Sending soon as a new thread on LKML.

Many thanks and best regards,

Claudio


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-02-08 Thread Claudio Scordino

Hi Patrick,

Il 06/02/2018 19:36, Patrick Bellasi ha scritto:

On 06-Feb 19:14, Claudio Scordino wrote:

Hi Patrick,



At first glance, your proposal below makes to make sense.

However, I'm wondering if we cannot get it working using
rq->dl's provided information instead of flags?


Yes, we can use the value of rq->dl to check if there has been an increase of 
the deadline utilization.
Even if schedutil might have been triggered by a different scheduling class, 
the effect should be almost the same.

Below a potential patch. I've kept all frequency update decisions in a single 
point (i.e. sugov_should_update_freq).
Not yet tested (waiting for further comments).


I have a todo list entry to backport/test Peter's series on Android...
will add this patch too. Thanks.


Please discard my latest patch, as the tests on Odroid have shown performance 
regressions
(likely due to sugov_next_freq_shared being called more often)

Never mind. I have already tested another (even simpler) patch.
Sending soon as a new thread on LKML.

Many thanks and best regards,

Claudio


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-02-06 Thread Patrick Bellasi
On 06-Feb 19:14, Claudio Scordino wrote:
> Hi Patrick,

> >At first glance, your proposal below makes to make sense.
> >
> >However, I'm wondering if we cannot get it working using
> >rq->dl's provided information instead of flags?
> 
> Yes, we can use the value of rq->dl to check if there has been an increase of 
> the deadline utilization.
> Even if schedutil might have been triggered by a different scheduling class, 
> the effect should be almost the same.
> 
> Below a potential patch. I've kept all frequency update decisions in a single 
> point (i.e. sugov_should_update_freq).
> Not yet tested (waiting for further comments).

I have a todo list entry to backport/test Peter's series on Android...
will add this patch too. Thanks.

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-02-06 Thread Patrick Bellasi
On 06-Feb 19:14, Claudio Scordino wrote:
> Hi Patrick,

> >At first glance, your proposal below makes to make sense.
> >
> >However, I'm wondering if we cannot get it working using
> >rq->dl's provided information instead of flags?
> 
> Yes, we can use the value of rq->dl to check if there has been an increase of 
> the deadline utilization.
> Even if schedutil might have been triggered by a different scheduling class, 
> the effect should be almost the same.
> 
> Below a potential patch. I've kept all frequency update decisions in a single 
> point (i.e. sugov_should_update_freq).
> Not yet tested (waiting for further comments).

I have a todo list entry to backport/test Peter's series on Android...
will add this patch too. Thanks.

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-02-06 Thread Claudio Scordino

Hi Patrick,

Il 06/02/2018 16:43, Patrick Bellasi ha scritto:

Hi Claudio,

On 06-Feb 11:55, Claudio Scordino wrote:

Hi Peter,

Il 20/12/2017 16:30, Peter Zijlstra ha scritto:


So I ended up with the below (on top of Juri's cpufreq-dl patches).

It compiles, but that's about all the testing it had.

--- a/include/linux/sched/cpufreq.h
+++ b/include/linux/sched/cpufreq.h


[..]


@@ -188,17 +187,23 @@ static void sugov_get_util(struct sugov_
  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
  {
+   unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
+   struct rq *rq = cpu_rq(sg_cpu->cpu);
+
+   if (rq->rt.rt_nr_running)
+   util = sg_cpu->max;
+
/*
 * Ideally we would like to set util_dl as min/guaranteed freq and
 * util_cfs + util_dl as requested freq. However, cpufreq is not yet
 * ready for such an interface. So, we only do the latter for now.
 */
-   return min(sg_cpu->util_cfs + sg_cpu->util_dl, sg_cpu->max);
+   return min(util, sg_cpu->max);
  }


[...]



What is the status of this patch ? I couldn't find it on the
tip/queue repositories.

BTW, I wonder if we actually want to remove also the information
about the scheduling class who triggered the frequency change.


Removing flags was the main goal of the patch, since they represents
mainly duplicated information which scheduling classes already know.

This was making flags update error prone and difficult to keep
aligned with existing scheduling classes info.


This prevents us from adopting class-specific behaviors.


In Peter's proposal he replaces flags with checks like:

if (rq->rt.rt_nr_running)


For example, we might want to skip the rate limits when deadline
asks for an increase of frequency, as shown in the patch below.
In this case, we could just remove the flags from sugov_cpu, but
leave the defines and the argument for sugov_update_*()


At first glance, your proposal below makes to make sense.

However, I'm wondering if we cannot get it working using
rq->dl's provided information instead of flags?


Yes, we can use the value of rq->dl to check if there has been an increase of 
the deadline utilization.
Even if schedutil might have been triggered by a different scheduling class, 
the effect should be almost the same.

Below a potential patch. I've kept all frequency update decisions in a single 
point (i.e. sugov_should_update_freq).
Not yet tested (waiting for further comments).

Thanks,

   Claudio




From 49a6eec60574ae93297406d40155e6ce4113e442 Mon Sep 17 00:00:00 2001
From: Claudio Scordino 
Date: Tue, 6 Feb 2018 18:42:23 +0100
Subject: [PATCH] cpufreq: schedutil: rate limits for SCHED_DEADLINE

When the SCHED_DEADLINE scheduling class asks to increase the CPU
frequency, we should not wait the rate limit, otherwise we may miss some
deadline.

This patch moves all frequency update decisions to a single point:
sugov_should_update_freq(). In addition, it ignores the rate limit
whenever there is an increase of the CPU frequency given by an increase
of the deadline utilization.

Signed-off-by: Claudio Scordino 
---
 kernel/sched/cpufreq_schedutil.c | 27 +++
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index b0bd77d..e8504f5 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -74,7 +74,11 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
 
 / Governor internals ***/
 
-static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)

+static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
+u64 time,
+struct sugov_cpu *sg_cpu_old,
+struct sugov_cpu *sg_cpu_new,
+unsigned int next_freq)
 {
s64 delta_ns;
 
@@ -111,6 +115,14 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)

return true;
}
 
+	/*

+* Ignore rate limit when DL asked to increase the CPU frequency,
+* otherwise we may miss some deadline.
+*/
+   if ((next_freq > sg_policy->next_freq) &&
+   (sg_cpu_new->util_dl > sg_cpu_old->util_dl))
+   return true;
+
delta_ns = time - sg_policy->last_freq_update_time;
return delta_ns >= sg_policy->freq_update_delay_ns;
 }
@@ -271,6 +283,7 @@ static void sugov_update_single(struct update_util_data 
*hook, u64 time,
unsigned int flags)
 {
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, 
update_util);
+   struct sugov_cpu sg_cpu_old = *sg_cpu;
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
unsigned 

Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-02-06 Thread Claudio Scordino

Hi Patrick,

Il 06/02/2018 16:43, Patrick Bellasi ha scritto:

Hi Claudio,

On 06-Feb 11:55, Claudio Scordino wrote:

Hi Peter,

Il 20/12/2017 16:30, Peter Zijlstra ha scritto:


So I ended up with the below (on top of Juri's cpufreq-dl patches).

It compiles, but that's about all the testing it had.

--- a/include/linux/sched/cpufreq.h
+++ b/include/linux/sched/cpufreq.h


[..]


@@ -188,17 +187,23 @@ static void sugov_get_util(struct sugov_
  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
  {
+   unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
+   struct rq *rq = cpu_rq(sg_cpu->cpu);
+
+   if (rq->rt.rt_nr_running)
+   util = sg_cpu->max;
+
/*
 * Ideally we would like to set util_dl as min/guaranteed freq and
 * util_cfs + util_dl as requested freq. However, cpufreq is not yet
 * ready for such an interface. So, we only do the latter for now.
 */
-   return min(sg_cpu->util_cfs + sg_cpu->util_dl, sg_cpu->max);
+   return min(util, sg_cpu->max);
  }


[...]



What is the status of this patch ? I couldn't find it on the
tip/queue repositories.

BTW, I wonder if we actually want to remove also the information
about the scheduling class who triggered the frequency change.


Removing flags was the main goal of the patch, since they represents
mainly duplicated information which scheduling classes already know.

This was making flags update error prone and difficult to keep
aligned with existing scheduling classes info.


This prevents us from adopting class-specific behaviors.


In Peter's proposal he replaces flags with checks like:

if (rq->rt.rt_nr_running)


For example, we might want to skip the rate limits when deadline
asks for an increase of frequency, as shown in the patch below.
In this case, we could just remove the flags from sugov_cpu, but
leave the defines and the argument for sugov_update_*()


At first glance, your proposal below makes to make sense.

However, I'm wondering if we cannot get it working using
rq->dl's provided information instead of flags?


Yes, we can use the value of rq->dl to check if there has been an increase of 
the deadline utilization.
Even if schedutil might have been triggered by a different scheduling class, 
the effect should be almost the same.

Below a potential patch. I've kept all frequency update decisions in a single 
point (i.e. sugov_should_update_freq).
Not yet tested (waiting for further comments).

Thanks,

   Claudio




From 49a6eec60574ae93297406d40155e6ce4113e442 Mon Sep 17 00:00:00 2001
From: Claudio Scordino 
Date: Tue, 6 Feb 2018 18:42:23 +0100
Subject: [PATCH] cpufreq: schedutil: rate limits for SCHED_DEADLINE

When the SCHED_DEADLINE scheduling class asks to increase the CPU
frequency, we should not wait the rate limit, otherwise we may miss some
deadline.

This patch moves all frequency update decisions to a single point:
sugov_should_update_freq(). In addition, it ignores the rate limit
whenever there is an increase of the CPU frequency given by an increase
of the deadline utilization.

Signed-off-by: Claudio Scordino 
---
 kernel/sched/cpufreq_schedutil.c | 27 +++
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index b0bd77d..e8504f5 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -74,7 +74,11 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
 
 / Governor internals ***/
 
-static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)

+static bool sugov_should_update_freq(struct sugov_policy *sg_policy,
+u64 time,
+struct sugov_cpu *sg_cpu_old,
+struct sugov_cpu *sg_cpu_new,
+unsigned int next_freq)
 {
s64 delta_ns;
 
@@ -111,6 +115,14 @@ static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 time)

return true;
}
 
+	/*

+* Ignore rate limit when DL asked to increase the CPU frequency,
+* otherwise we may miss some deadline.
+*/
+   if ((next_freq > sg_policy->next_freq) &&
+   (sg_cpu_new->util_dl > sg_cpu_old->util_dl))
+   return true;
+
delta_ns = time - sg_policy->last_freq_update_time;
return delta_ns >= sg_policy->freq_update_delay_ns;
 }
@@ -271,6 +283,7 @@ static void sugov_update_single(struct update_util_data 
*hook, u64 time,
unsigned int flags)
 {
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, 
update_util);
+   struct sugov_cpu sg_cpu_old = *sg_cpu;
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
unsigned long util, max;
unsigned int next_f;
@@ 

Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-02-06 Thread Patrick Bellasi
Hi Claudio,

On 06-Feb 11:55, Claudio Scordino wrote:
> Hi Peter,
> 
> Il 20/12/2017 16:30, Peter Zijlstra ha scritto:
> >
> >So I ended up with the below (on top of Juri's cpufreq-dl patches).
> >
> >It compiles, but that's about all the testing it had.
> >
> >--- a/include/linux/sched/cpufreq.h
> >+++ b/include/linux/sched/cpufreq.h

[..]

> >@@ -188,17 +187,23 @@ static void sugov_get_util(struct sugov_
> >  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
> >  {
> >+unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
> >+struct rq *rq = cpu_rq(sg_cpu->cpu);
> >+
> >+if (rq->rt.rt_nr_running)
> >+util = sg_cpu->max;
> >+
> > /*
> >  * Ideally we would like to set util_dl as min/guaranteed freq and
> >  * util_cfs + util_dl as requested freq. However, cpufreq is not yet
> >  * ready for such an interface. So, we only do the latter for now.
> >  */
> >-return min(sg_cpu->util_cfs + sg_cpu->util_dl, sg_cpu->max);
> >+return min(util, sg_cpu->max);
> >  }

[...]

> 
> What is the status of this patch ? I couldn't find it on the
> tip/queue repositories.
> 
> BTW, I wonder if we actually want to remove also the information
> about the scheduling class who triggered the frequency change.

Removing flags was the main goal of the patch, since they represents
mainly duplicated information which scheduling classes already know.

This was making flags update error prone and difficult to keep
aligned with existing scheduling classes info.

> This prevents us from adopting class-specific behaviors.

In Peter's proposal he replaces flags with checks like:

   if (rq->rt.rt_nr_running)

> For example, we might want to skip the rate limits when deadline
> asks for an increase of frequency, as shown in the patch below.
> In this case, we could just remove the flags from sugov_cpu, but
> leave the defines and the argument for sugov_update_*()

At first glance, your proposal below makes to make sense.

However, I'm wondering if we cannot get it working using
rq->dl's provided information instead of flags?

> Best regards,
> 
> Claudio
>
> 
> From ed13fa5a8f93a43f8ff8f7d354b18c0031df482c Mon Sep 17 00:00:00 2001
> From: Claudio Scordino 
> Date: Wed, 27 Sep 2017 17:16:36 +0200
> Subject: [PATCH RFC] cpufreq: schedutil: rate limits for SCHED_DEADLINE
> 
> When the SCHED_DEADLINE scheduling class asks to increase CPU frequency,
> we should not wait the rate limit, otherwise we may miss some deadline.
> The patch just ignores the limit whenever SCHED_DEADLINE asks for a
> higher CPU frequency.
> 
> Signed-off-by: Claudio Scordino 
> ---
>  kernel/sched/cpufreq_schedutil.c | 24 +---
>  1 file changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index dd062a1..5027ab1 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -75,7 +75,8 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
>  / Governor internals ***/
> -static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 
> time)
> +static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 
> time,
> +  unsigned int next_freq, unsigned int flags)
>  {
>   s64 delta_ns;
> @@ -112,6 +113,10 @@ static bool sugov_should_update_freq(struct sugov_policy 
> *sg_policy, u64 time)
>   return true;
>   }
> + /* Ignore rate limit if DL asked to increase CPU frequency */
> + if ((flags & SCHED_CPUFREQ_DL) && (next_freq > sg_policy->next_freq))
> + return true;


static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
{
unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
struct rq *rq = cpu_rq(sg_cpu->cpu);

if (rq->dl.dl_nr_running)


> +
>   delta_ns = time - sg_policy->last_freq_update_time;
>   return delta_ns >= sg_policy->freq_update_delay_ns;
>  }
> @@ -275,9 +280,6 @@ static void sugov_update_single(struct update_util_data 
> *hook, u64 time,
>   sugov_set_iowait_boost(sg_cpu, time);
>   sg_cpu->last_update = time;
> - if (!sugov_should_update_freq(sg_policy, time))
> - return;
> -
>   busy = sugov_cpu_is_busy(sg_cpu);
>   if (flags & SCHED_CPUFREQ_RT) {
> @@ -299,7 +301,8 @@ static void sugov_update_single(struct update_util_data 
> *hook, u64 time,
>   sg_policy->cached_raw_freq = 0;
>   }
>   }
> - sugov_update_commit(sg_policy, time, next_f);
> + if (sugov_should_update_freq(sg_policy, time, next_f, flags))
> + sugov_update_commit(sg_policy, time, next_f);
>  }
>  static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 
> time)
> @@ -362,14 +365,13 @@ static void sugov_update_shared(struct 

Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-02-06 Thread Patrick Bellasi
Hi Claudio,

On 06-Feb 11:55, Claudio Scordino wrote:
> Hi Peter,
> 
> Il 20/12/2017 16:30, Peter Zijlstra ha scritto:
> >
> >So I ended up with the below (on top of Juri's cpufreq-dl patches).
> >
> >It compiles, but that's about all the testing it had.
> >
> >--- a/include/linux/sched/cpufreq.h
> >+++ b/include/linux/sched/cpufreq.h

[..]

> >@@ -188,17 +187,23 @@ static void sugov_get_util(struct sugov_
> >  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
> >  {
> >+unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
> >+struct rq *rq = cpu_rq(sg_cpu->cpu);
> >+
> >+if (rq->rt.rt_nr_running)
> >+util = sg_cpu->max;
> >+
> > /*
> >  * Ideally we would like to set util_dl as min/guaranteed freq and
> >  * util_cfs + util_dl as requested freq. However, cpufreq is not yet
> >  * ready for such an interface. So, we only do the latter for now.
> >  */
> >-return min(sg_cpu->util_cfs + sg_cpu->util_dl, sg_cpu->max);
> >+return min(util, sg_cpu->max);
> >  }

[...]

> 
> What is the status of this patch ? I couldn't find it on the
> tip/queue repositories.
> 
> BTW, I wonder if we actually want to remove also the information
> about the scheduling class who triggered the frequency change.

Removing flags was the main goal of the patch, since they represents
mainly duplicated information which scheduling classes already know.

This was making flags update error prone and difficult to keep
aligned with existing scheduling classes info.

> This prevents us from adopting class-specific behaviors.

In Peter's proposal he replaces flags with checks like:

   if (rq->rt.rt_nr_running)

> For example, we might want to skip the rate limits when deadline
> asks for an increase of frequency, as shown in the patch below.
> In this case, we could just remove the flags from sugov_cpu, but
> leave the defines and the argument for sugov_update_*()

At first glance, your proposal below makes to make sense.

However, I'm wondering if we cannot get it working using
rq->dl's provided information instead of flags?

> Best regards,
> 
> Claudio
>
> 
> From ed13fa5a8f93a43f8ff8f7d354b18c0031df482c Mon Sep 17 00:00:00 2001
> From: Claudio Scordino 
> Date: Wed, 27 Sep 2017 17:16:36 +0200
> Subject: [PATCH RFC] cpufreq: schedutil: rate limits for SCHED_DEADLINE
> 
> When the SCHED_DEADLINE scheduling class asks to increase CPU frequency,
> we should not wait the rate limit, otherwise we may miss some deadline.
> The patch just ignores the limit whenever SCHED_DEADLINE asks for a
> higher CPU frequency.
> 
> Signed-off-by: Claudio Scordino 
> ---
>  kernel/sched/cpufreq_schedutil.c | 24 +---
>  1 file changed, 13 insertions(+), 11 deletions(-)
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index dd062a1..5027ab1 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -75,7 +75,8 @@ static DEFINE_PER_CPU(struct sugov_cpu, sugov_cpu);
>  / Governor internals ***/
> -static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 
> time)
> +static bool sugov_should_update_freq(struct sugov_policy *sg_policy, u64 
> time,
> +  unsigned int next_freq, unsigned int flags)
>  {
>   s64 delta_ns;
> @@ -112,6 +113,10 @@ static bool sugov_should_update_freq(struct sugov_policy 
> *sg_policy, u64 time)
>   return true;
>   }
> + /* Ignore rate limit if DL asked to increase CPU frequency */
> + if ((flags & SCHED_CPUFREQ_DL) && (next_freq > sg_policy->next_freq))
> + return true;


static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
{
unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
struct rq *rq = cpu_rq(sg_cpu->cpu);

if (rq->dl.dl_nr_running)


> +
>   delta_ns = time - sg_policy->last_freq_update_time;
>   return delta_ns >= sg_policy->freq_update_delay_ns;
>  }
> @@ -275,9 +280,6 @@ static void sugov_update_single(struct update_util_data 
> *hook, u64 time,
>   sugov_set_iowait_boost(sg_cpu, time);
>   sg_cpu->last_update = time;
> - if (!sugov_should_update_freq(sg_policy, time))
> - return;
> -
>   busy = sugov_cpu_is_busy(sg_cpu);
>   if (flags & SCHED_CPUFREQ_RT) {
> @@ -299,7 +301,8 @@ static void sugov_update_single(struct update_util_data 
> *hook, u64 time,
>   sg_policy->cached_raw_freq = 0;
>   }
>   }
> - sugov_update_commit(sg_policy, time, next_f);
> + if (sugov_should_update_freq(sg_policy, time, next_f, flags))
> + sugov_update_commit(sg_policy, time, next_f);
>  }
>  static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 
> time)
> @@ -362,14 +365,13 @@ static void sugov_update_shared(struct update_util_data 
> *hook, u64 time,
>   

Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-02-06 Thread Claudio Scordino

Hi Peter,

Il 20/12/2017 16:30, Peter Zijlstra ha scritto:


So I ended up with the below (on top of Juri's cpufreq-dl patches).

It compiles, but that's about all the testing it had.

--- a/include/linux/sched/cpufreq.h
+++ b/include/linux/sched/cpufreq.h
@@ -8,9 +8,7 @@
   * Interface between cpufreq drivers and the scheduler:
   */
  
-#define SCHED_CPUFREQ_RT	(1U << 0)

-#define SCHED_CPUFREQ_DL   (1U << 1)
-#define SCHED_CPUFREQ_IOWAIT   (1U << 2)
+#define SCHED_CPUFREQ_IOWAIT   (1U << 0)
  
  #ifdef CONFIG_CPU_FREQ

  struct update_util_data {
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -63,7 +63,6 @@ struct sugov_cpu {
unsigned long util_cfs;
unsigned long util_dl;
unsigned long max;
-   unsigned int flags;
  
  	/* The field below is for single-CPU policies only. */

  #ifdef CONFIG_NO_HZ_COMMON
@@ -188,17 +187,23 @@ static void sugov_get_util(struct sugov_
  
  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)

  {
+   unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
+   struct rq *rq = cpu_rq(sg_cpu->cpu);
+
+   if (rq->rt.rt_nr_running)
+   util = sg_cpu->max;
+
/*
 * Ideally we would like to set util_dl as min/guaranteed freq and
 * util_cfs + util_dl as requested freq. However, cpufreq is not yet
 * ready for such an interface. So, we only do the latter for now.
 */
-   return min(sg_cpu->util_cfs + sg_cpu->util_dl, sg_cpu->max);
+   return min(util, sg_cpu->max);
  }
  
-static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time)

+static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, 
unsigned int flags)
  {
-   if (sg_cpu->flags & SCHED_CPUFREQ_IOWAIT) {
+   if (flags & SCHED_CPUFREQ_IOWAIT) {
if (sg_cpu->iowait_boost_pending)
return;
  
@@ -267,12 +272,11 @@ static void sugov_update_single(struct u

  {
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, 
update_util);
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
-   struct cpufreq_policy *policy = sg_policy->policy;
unsigned long util, max;
unsigned int next_f;
bool busy;
  
-	sugov_set_iowait_boost(sg_cpu, time);

+   sugov_set_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
  
  	if (!sugov_should_update_freq(sg_policy, time))

@@ -280,25 +284,22 @@ static void sugov_update_single(struct u
  
  	busy = sugov_cpu_is_busy(sg_cpu);
  
-	if (flags & SCHED_CPUFREQ_RT) {

-   next_f = policy->cpuinfo.max_freq;
-   } else {
-   sugov_get_util(sg_cpu);
-   max = sg_cpu->max;
-   util = sugov_aggregate_util(sg_cpu);
-   sugov_iowait_boost(sg_cpu, , );
-   next_f = get_next_freq(sg_policy, util, max);
-   /*
-* Do not reduce the frequency if the CPU has not been idle
-* recently, as the reduction is likely to be premature then.
-*/
-   if (busy && next_f < sg_policy->next_freq) {
-   next_f = sg_policy->next_freq;
+   sugov_get_util(sg_cpu);
+   max = sg_cpu->max;
+   util = sugov_aggregate_util(sg_cpu);
+   sugov_iowait_boost(sg_cpu, , );
+   next_f = get_next_freq(sg_policy, util, max);
+   /*
+* Do not reduce the frequency if the CPU has not been idle
+* recently, as the reduction is likely to be premature then.
+*/
+   if (busy && next_f < sg_policy->next_freq) {
+   next_f = sg_policy->next_freq;
  
-			/* Reset cached freq as next_freq has changed */

-   sg_policy->cached_raw_freq = 0;
-   }
+   /* Reset cached freq as next_freq has changed */
+   sg_policy->cached_raw_freq = 0;
}
+
sugov_update_commit(sg_policy, time, next_f);
  }
  
@@ -314,6 +315,9 @@ static unsigned int sugov_next_freq_shar

unsigned long j_util, j_max;
s64 delta_ns;
  
+		if (j_sg_cpu != sg_cpu)

+   sugov_get_util(j_sg_cpu);
+
/*
 * If the CFS CPU utilization was last updated before the
 * previous frequency update and the time elapsed between the
@@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
if (delta_ns > TICK_NSEC) {
j_sg_cpu->iowait_boost = 0;
j_sg_cpu->iowait_boost_pending = false;
-   j_sg_cpu->util_cfs = 0;
-   if (j_sg_cpu->util_dl == 0)
-   continue;
}
-   if (j_sg_cpu->flags & SCHED_CPUFREQ_RT)
-   return policy->cpuinfo.max_freq;
  
  		j_max = j_sg_cpu->max;

j_util = sugov_aggregate_util(j_sg_cpu);
@@ -357,17 +356,11 @@ 

Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-02-06 Thread Claudio Scordino

Hi Peter,

Il 20/12/2017 16:30, Peter Zijlstra ha scritto:


So I ended up with the below (on top of Juri's cpufreq-dl patches).

It compiles, but that's about all the testing it had.

--- a/include/linux/sched/cpufreq.h
+++ b/include/linux/sched/cpufreq.h
@@ -8,9 +8,7 @@
   * Interface between cpufreq drivers and the scheduler:
   */
  
-#define SCHED_CPUFREQ_RT	(1U << 0)

-#define SCHED_CPUFREQ_DL   (1U << 1)
-#define SCHED_CPUFREQ_IOWAIT   (1U << 2)
+#define SCHED_CPUFREQ_IOWAIT   (1U << 0)
  
  #ifdef CONFIG_CPU_FREQ

  struct update_util_data {
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -63,7 +63,6 @@ struct sugov_cpu {
unsigned long util_cfs;
unsigned long util_dl;
unsigned long max;
-   unsigned int flags;
  
  	/* The field below is for single-CPU policies only. */

  #ifdef CONFIG_NO_HZ_COMMON
@@ -188,17 +187,23 @@ static void sugov_get_util(struct sugov_
  
  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)

  {
+   unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
+   struct rq *rq = cpu_rq(sg_cpu->cpu);
+
+   if (rq->rt.rt_nr_running)
+   util = sg_cpu->max;
+
/*
 * Ideally we would like to set util_dl as min/guaranteed freq and
 * util_cfs + util_dl as requested freq. However, cpufreq is not yet
 * ready for such an interface. So, we only do the latter for now.
 */
-   return min(sg_cpu->util_cfs + sg_cpu->util_dl, sg_cpu->max);
+   return min(util, sg_cpu->max);
  }
  
-static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time)

+static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, 
unsigned int flags)
  {
-   if (sg_cpu->flags & SCHED_CPUFREQ_IOWAIT) {
+   if (flags & SCHED_CPUFREQ_IOWAIT) {
if (sg_cpu->iowait_boost_pending)
return;
  
@@ -267,12 +272,11 @@ static void sugov_update_single(struct u

  {
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, 
update_util);
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
-   struct cpufreq_policy *policy = sg_policy->policy;
unsigned long util, max;
unsigned int next_f;
bool busy;
  
-	sugov_set_iowait_boost(sg_cpu, time);

+   sugov_set_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
  
  	if (!sugov_should_update_freq(sg_policy, time))

@@ -280,25 +284,22 @@ static void sugov_update_single(struct u
  
  	busy = sugov_cpu_is_busy(sg_cpu);
  
-	if (flags & SCHED_CPUFREQ_RT) {

-   next_f = policy->cpuinfo.max_freq;
-   } else {
-   sugov_get_util(sg_cpu);
-   max = sg_cpu->max;
-   util = sugov_aggregate_util(sg_cpu);
-   sugov_iowait_boost(sg_cpu, , );
-   next_f = get_next_freq(sg_policy, util, max);
-   /*
-* Do not reduce the frequency if the CPU has not been idle
-* recently, as the reduction is likely to be premature then.
-*/
-   if (busy && next_f < sg_policy->next_freq) {
-   next_f = sg_policy->next_freq;
+   sugov_get_util(sg_cpu);
+   max = sg_cpu->max;
+   util = sugov_aggregate_util(sg_cpu);
+   sugov_iowait_boost(sg_cpu, , );
+   next_f = get_next_freq(sg_policy, util, max);
+   /*
+* Do not reduce the frequency if the CPU has not been idle
+* recently, as the reduction is likely to be premature then.
+*/
+   if (busy && next_f < sg_policy->next_freq) {
+   next_f = sg_policy->next_freq;
  
-			/* Reset cached freq as next_freq has changed */

-   sg_policy->cached_raw_freq = 0;
-   }
+   /* Reset cached freq as next_freq has changed */
+   sg_policy->cached_raw_freq = 0;
}
+
sugov_update_commit(sg_policy, time, next_f);
  }
  
@@ -314,6 +315,9 @@ static unsigned int sugov_next_freq_shar

unsigned long j_util, j_max;
s64 delta_ns;
  
+		if (j_sg_cpu != sg_cpu)

+   sugov_get_util(j_sg_cpu);
+
/*
 * If the CFS CPU utilization was last updated before the
 * previous frequency update and the time elapsed between the
@@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
if (delta_ns > TICK_NSEC) {
j_sg_cpu->iowait_boost = 0;
j_sg_cpu->iowait_boost_pending = false;
-   j_sg_cpu->util_cfs = 0;
-   if (j_sg_cpu->util_dl == 0)
-   continue;
}
-   if (j_sg_cpu->flags & SCHED_CPUFREQ_RT)
-   return policy->cpuinfo.max_freq;
  
  		j_max = j_sg_cpu->max;

j_util = sugov_aggregate_util(j_sg_cpu);
@@ -357,17 +356,11 @@ 

Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-01-02 Thread Claudio Scordino

Hi Peter,

Il 31/12/2017 10:43, Claudio Scordino ha scritto:

Hi all,

Il 20/12/2017 16:56, Peter Zijlstra ha scritto:

On Wed, Dec 20, 2017 at 04:30:29PM +0100, Peter Zijlstra wrote:


So I ended up with the below (on top of Juri's cpufreq-dl patches).

It compiles, but that's about all the testing it had.


Should all be available at:

   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/core

compile tested only, I suppose the 0day bot will let me know soon enough
:-)


I'm going to do some testing from Tuesday.


The rebase looks good to me.

Tested on Odroid XU4 (ARM big.LITTLE), no regressions found.

Many thanks and best regards,

  Claudio


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2018-01-02 Thread Claudio Scordino

Hi Peter,

Il 31/12/2017 10:43, Claudio Scordino ha scritto:

Hi all,

Il 20/12/2017 16:56, Peter Zijlstra ha scritto:

On Wed, Dec 20, 2017 at 04:30:29PM +0100, Peter Zijlstra wrote:


So I ended up with the below (on top of Juri's cpufreq-dl patches).

It compiles, but that's about all the testing it had.


Should all be available at:

   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/core

compile tested only, I suppose the 0day bot will let me know soon enough
:-)


I'm going to do some testing from Tuesday.


The rebase looks good to me.

Tested on Odroid XU4 (ARM big.LITTLE), no regressions found.

Many thanks and best regards,

  Claudio


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-31 Thread Claudio Scordino

Hi all,

Il 20/12/2017 16:56, Peter Zijlstra ha scritto:

On Wed, Dec 20, 2017 at 04:30:29PM +0100, Peter Zijlstra wrote:


So I ended up with the below (on top of Juri's cpufreq-dl patches).

It compiles, but that's about all the testing it had.


Should all be available at:

   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/core

compile tested only, I suppose the 0day bot will let me know soon enough
:-)


I'm going to do some testing from Tuesday.

Happy new year,

   Claudio


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-31 Thread Claudio Scordino

Hi all,

Il 20/12/2017 16:56, Peter Zijlstra ha scritto:

On Wed, Dec 20, 2017 at 04:30:29PM +0100, Peter Zijlstra wrote:


So I ended up with the below (on top of Juri's cpufreq-dl patches).

It compiles, but that's about all the testing it had.


Should all be available at:

   git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/core

compile tested only, I suppose the 0day bot will let me know soon enough
:-)


I'm going to do some testing from Tuesday.

Happy new year,

   Claudio


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Juri Lelli
On 22/12/17 12:50, Patrick Bellasi wrote:
> On 22-Dec 13:43, Juri Lelli wrote:
> > On 22/12/17 12:38, Patrick Bellasi wrote:
> > > On 22-Dec 13:19, Peter Zijlstra wrote:
> > > > On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > > > > > I was thinking that since dl is a 'global' scheduler the reservation
> > > > > > would be too and thus the freq just needs a single CPU to be 
> > > > > > observed;
> > > > > 
> > > > > AFAIU global is only the admission control (which is something worth a
> > > > > thread by itself...) while the dl_se->dl_bw are aggregated into the
> > > > > dl_rq->running_bw, which ultimately represents the DL bandwidth
> > > > > required for just a CPU.
> > > > 
> > > > Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
> > > > I think. But yes, that's another thread.
> > > 
> > > Mmm... maybe I don't get your point... I was referring to the global
> > > admission control of DL. If you have for example 3 60% DL tasks on a
> > > 2CPU system, AFAIU the CBS will allow the tasks in the system (since
> > > the overall utilization is 180 < 200 * 0.95) although that workload is
> > > not necessarily schedule (for example if the tasks wakeups at the
> > > same time one of them will miss its deadline).
> > > 
> > > But, yeah... maybe I'm completely wrong or, in any case, it's for a
> > > different thread...
> > > 
> > > > > > but I suppose there's nothing stopping anybody from splitting a 
> > > > > > clock
> > > > > > domain down the middle scheduling wise. So yes, good point.
> > > > > 
> > > > > That makes sense... moreover, using the global utilization, we would
> > > > > end up asking for capacities which cannot be provided by a single CPU.
> > > > 
> > > > Yes, but that _should_ not be a problem if you clock them all high
> > > > enough. But this gets to be complicated real fast I think.
> > > 
> > > IMO the current solution with Juri's patches is working as expected:
> > > we know how many DL tasks are runnable on a CPU and we properly
> > > account for their utilization.
> > > 
> > > The only "issue/limitation" is (eventually) the case described above.
> > > Dunno if we can enqueue 2 60% DL tasks on the same CPU... in that case
> > > we will ask for 120% Utilization?
> > 
> > In general it depends on the other parameters, deadline and period.
> 
> Right, but what about the case dealdine==period, with 60% utilization?
> AFAIU, 3 DL tasks with same parameters like above will be accepted on
> a 2 CPU system, isn't it?
> 
> And thus, in that case, we can end up with a 120% utlization request
> from DL for a single CPU... but, considering it's lunch o'clock,
> I'm likely missing something...

Nope. CBS on SMP only gives you bounded tardiness (at least with the AC
kernel does). Some deadlines might be missed.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Juri Lelli
On 22/12/17 12:50, Patrick Bellasi wrote:
> On 22-Dec 13:43, Juri Lelli wrote:
> > On 22/12/17 12:38, Patrick Bellasi wrote:
> > > On 22-Dec 13:19, Peter Zijlstra wrote:
> > > > On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > > > > > I was thinking that since dl is a 'global' scheduler the reservation
> > > > > > would be too and thus the freq just needs a single CPU to be 
> > > > > > observed;
> > > > > 
> > > > > AFAIU global is only the admission control (which is something worth a
> > > > > thread by itself...) while the dl_se->dl_bw are aggregated into the
> > > > > dl_rq->running_bw, which ultimately represents the DL bandwidth
> > > > > required for just a CPU.
> > > > 
> > > > Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
> > > > I think. But yes, that's another thread.
> > > 
> > > Mmm... maybe I don't get your point... I was referring to the global
> > > admission control of DL. If you have for example 3 60% DL tasks on a
> > > 2CPU system, AFAIU the CBS will allow the tasks in the system (since
> > > the overall utilization is 180 < 200 * 0.95) although that workload is
> > > not necessarily schedule (for example if the tasks wakeups at the
> > > same time one of them will miss its deadline).
> > > 
> > > But, yeah... maybe I'm completely wrong or, in any case, it's for a
> > > different thread...
> > > 
> > > > > > but I suppose there's nothing stopping anybody from splitting a 
> > > > > > clock
> > > > > > domain down the middle scheduling wise. So yes, good point.
> > > > > 
> > > > > That makes sense... moreover, using the global utilization, we would
> > > > > end up asking for capacities which cannot be provided by a single CPU.
> > > > 
> > > > Yes, but that _should_ not be a problem if you clock them all high
> > > > enough. But this gets to be complicated real fast I think.
> > > 
> > > IMO the current solution with Juri's patches is working as expected:
> > > we know how many DL tasks are runnable on a CPU and we properly
> > > account for their utilization.
> > > 
> > > The only "issue/limitation" is (eventually) the case described above.
> > > Dunno if we can enqueue 2 60% DL tasks on the same CPU... in that case
> > > we will ask for 120% Utilization?
> > 
> > In general it depends on the other parameters, deadline and period.
> 
> Right, but what about the case dealdine==period, with 60% utilization?
> AFAIU, 3 DL tasks with same parameters like above will be accepted on
> a 2 CPU system, isn't it?
> 
> And thus, in that case, we can end up with a 120% utlization request
> from DL for a single CPU... but, considering it's lunch o'clock,
> I'm likely missing something...

Nope. CBS on SMP only gives you bounded tardiness (at least with the AC
kernel does). Some deadlines might be missed.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 13:43, Juri Lelli wrote:
> On 22/12/17 12:38, Patrick Bellasi wrote:
> > On 22-Dec 13:19, Peter Zijlstra wrote:
> > > On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > > > > I was thinking that since dl is a 'global' scheduler the reservation
> > > > > would be too and thus the freq just needs a single CPU to be observed;
> > > > 
> > > > AFAIU global is only the admission control (which is something worth a
> > > > thread by itself...) while the dl_se->dl_bw are aggregated into the
> > > > dl_rq->running_bw, which ultimately represents the DL bandwidth
> > > > required for just a CPU.
> > > 
> > > Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
> > > I think. But yes, that's another thread.
> > 
> > Mmm... maybe I don't get your point... I was referring to the global
> > admission control of DL. If you have for example 3 60% DL tasks on a
> > 2CPU system, AFAIU the CBS will allow the tasks in the system (since
> > the overall utilization is 180 < 200 * 0.95) although that workload is
> > not necessarily schedule (for example if the tasks wakeups at the
> > same time one of them will miss its deadline).
> > 
> > But, yeah... maybe I'm completely wrong or, in any case, it's for a
> > different thread...
> > 
> > > > > but I suppose there's nothing stopping anybody from splitting a clock
> > > > > domain down the middle scheduling wise. So yes, good point.
> > > > 
> > > > That makes sense... moreover, using the global utilization, we would
> > > > end up asking for capacities which cannot be provided by a single CPU.
> > > 
> > > Yes, but that _should_ not be a problem if you clock them all high
> > > enough. But this gets to be complicated real fast I think.
> > 
> > IMO the current solution with Juri's patches is working as expected:
> > we know how many DL tasks are runnable on a CPU and we properly
> > account for their utilization.
> > 
> > The only "issue/limitation" is (eventually) the case described above.
> > Dunno if we can enqueue 2 60% DL tasks on the same CPU... in that case
> > we will ask for 120% Utilization?
> 
> In general it depends on the other parameters, deadline and period.

Right, but what about the case dealdine==period, with 60% utilization?
AFAIU, 3 DL tasks with same parameters like above will be accepted on
a 2 CPU system, isn't it?

And thus, in that case, we can end up with a 120% utlization request
from DL for a single CPU... but, considering it's lunch o'clock,
I'm likely missing something...

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 13:43, Juri Lelli wrote:
> On 22/12/17 12:38, Patrick Bellasi wrote:
> > On 22-Dec 13:19, Peter Zijlstra wrote:
> > > On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > > > > I was thinking that since dl is a 'global' scheduler the reservation
> > > > > would be too and thus the freq just needs a single CPU to be observed;
> > > > 
> > > > AFAIU global is only the admission control (which is something worth a
> > > > thread by itself...) while the dl_se->dl_bw are aggregated into the
> > > > dl_rq->running_bw, which ultimately represents the DL bandwidth
> > > > required for just a CPU.
> > > 
> > > Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
> > > I think. But yes, that's another thread.
> > 
> > Mmm... maybe I don't get your point... I was referring to the global
> > admission control of DL. If you have for example 3 60% DL tasks on a
> > 2CPU system, AFAIU the CBS will allow the tasks in the system (since
> > the overall utilization is 180 < 200 * 0.95) although that workload is
> > not necessarily schedule (for example if the tasks wakeups at the
> > same time one of them will miss its deadline).
> > 
> > But, yeah... maybe I'm completely wrong or, in any case, it's for a
> > different thread...
> > 
> > > > > but I suppose there's nothing stopping anybody from splitting a clock
> > > > > domain down the middle scheduling wise. So yes, good point.
> > > > 
> > > > That makes sense... moreover, using the global utilization, we would
> > > > end up asking for capacities which cannot be provided by a single CPU.
> > > 
> > > Yes, but that _should_ not be a problem if you clock them all high
> > > enough. But this gets to be complicated real fast I think.
> > 
> > IMO the current solution with Juri's patches is working as expected:
> > we know how many DL tasks are runnable on a CPU and we properly
> > account for their utilization.
> > 
> > The only "issue/limitation" is (eventually) the case described above.
> > Dunno if we can enqueue 2 60% DL tasks on the same CPU... in that case
> > we will ask for 120% Utilization?
> 
> In general it depends on the other parameters, deadline and period.

Right, but what about the case dealdine==period, with 60% utilization?
AFAIU, 3 DL tasks with same parameters like above will be accepted on
a 2 CPU system, isn't it?

And thus, in that case, we can end up with a 120% utlization request
from DL for a single CPU... but, considering it's lunch o'clock,
I'm likely missing something...

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Juri Lelli
On 22/12/17 12:38, Patrick Bellasi wrote:
> On 22-Dec 13:19, Peter Zijlstra wrote:
> > On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > > > I was thinking that since dl is a 'global' scheduler the reservation
> > > > would be too and thus the freq just needs a single CPU to be observed;
> > > 
> > > AFAIU global is only the admission control (which is something worth a
> > > thread by itself...) while the dl_se->dl_bw are aggregated into the
> > > dl_rq->running_bw, which ultimately represents the DL bandwidth
> > > required for just a CPU.
> > 
> > Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
> > I think. But yes, that's another thread.
> 
> Mmm... maybe I don't get your point... I was referring to the global
> admission control of DL. If you have for example 3 60% DL tasks on a
> 2CPU system, AFAIU the CBS will allow the tasks in the system (since
> the overall utilization is 180 < 200 * 0.95) although that workload is
> not necessarily schedule (for example if the tasks wakeups at the
> same time one of them will miss its deadline).
> 
> But, yeah... maybe I'm completely wrong or, in any case, it's for a
> different thread...
> 
> > > > but I suppose there's nothing stopping anybody from splitting a clock
> > > > domain down the middle scheduling wise. So yes, good point.
> > > 
> > > That makes sense... moreover, using the global utilization, we would
> > > end up asking for capacities which cannot be provided by a single CPU.
> > 
> > Yes, but that _should_ not be a problem if you clock them all high
> > enough. But this gets to be complicated real fast I think.
> 
> IMO the current solution with Juri's patches is working as expected:
> we know how many DL tasks are runnable on a CPU and we properly
> account for their utilization.
> 
> The only "issue/limitation" is (eventually) the case described above.
> Dunno if we can enqueue 2 60% DL tasks on the same CPU... in that case
> we will ask for 120% Utilization?

In general it depends on the other parameters, deadline and period.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Juri Lelli
On 22/12/17 12:38, Patrick Bellasi wrote:
> On 22-Dec 13:19, Peter Zijlstra wrote:
> > On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > > > I was thinking that since dl is a 'global' scheduler the reservation
> > > > would be too and thus the freq just needs a single CPU to be observed;
> > > 
> > > AFAIU global is only the admission control (which is something worth a
> > > thread by itself...) while the dl_se->dl_bw are aggregated into the
> > > dl_rq->running_bw, which ultimately represents the DL bandwidth
> > > required for just a CPU.
> > 
> > Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
> > I think. But yes, that's another thread.
> 
> Mmm... maybe I don't get your point... I was referring to the global
> admission control of DL. If you have for example 3 60% DL tasks on a
> 2CPU system, AFAIU the CBS will allow the tasks in the system (since
> the overall utilization is 180 < 200 * 0.95) although that workload is
> not necessarily schedule (for example if the tasks wakeups at the
> same time one of them will miss its deadline).
> 
> But, yeah... maybe I'm completely wrong or, in any case, it's for a
> different thread...
> 
> > > > but I suppose there's nothing stopping anybody from splitting a clock
> > > > domain down the middle scheduling wise. So yes, good point.
> > > 
> > > That makes sense... moreover, using the global utilization, we would
> > > end up asking for capacities which cannot be provided by a single CPU.
> > 
> > Yes, but that _should_ not be a problem if you clock them all high
> > enough. But this gets to be complicated real fast I think.
> 
> IMO the current solution with Juri's patches is working as expected:
> we know how many DL tasks are runnable on a CPU and we properly
> account for their utilization.
> 
> The only "issue/limitation" is (eventually) the case described above.
> Dunno if we can enqueue 2 60% DL tasks on the same CPU... in that case
> we will ask for 120% Utilization?

In general it depends on the other parameters, deadline and period.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 13:19, Peter Zijlstra wrote:
> On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > > I was thinking that since dl is a 'global' scheduler the reservation
> > > would be too and thus the freq just needs a single CPU to be observed;
> > 
> > AFAIU global is only the admission control (which is something worth a
> > thread by itself...) while the dl_se->dl_bw are aggregated into the
> > dl_rq->running_bw, which ultimately represents the DL bandwidth
> > required for just a CPU.
> 
> Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
> I think. But yes, that's another thread.

Mmm... maybe I don't get your point... I was referring to the global
admission control of DL. If you have for example 3 60% DL tasks on a
2CPU system, AFAIU the CBS will allow the tasks in the system (since
the overall utilization is 180 < 200 * 0.95) although that workload is
not necessarily schedule (for example if the tasks wakeups at the
same time one of them will miss its deadline).

But, yeah... maybe I'm completely wrong or, in any case, it's for a
different thread...

> > > but I suppose there's nothing stopping anybody from splitting a clock
> > > domain down the middle scheduling wise. So yes, good point.
> > 
> > That makes sense... moreover, using the global utilization, we would
> > end up asking for capacities which cannot be provided by a single CPU.
> 
> Yes, but that _should_ not be a problem if you clock them all high
> enough. But this gets to be complicated real fast I think.

IMO the current solution with Juri's patches is working as expected:
we know how many DL tasks are runnable on a CPU and we properly
account for their utilization.

The only "issue/limitation" is (eventually) the case described above.
Dunno if we can enqueue 2 60% DL tasks on the same CPU... in that case
we will ask for 120% Utilization?

> > > Blergh that'd make a mess of things again.
> > 
> > Actually, looking better at your patch: are we not just ok with that?
> > 
> > I mean, we don't need this check on idle_cpu since in
> > sugov_aggregate_util we already skip the util=sg_cpu->max in case of
> > !rq->rt.rt_nr_running, while we aggregate just CFS and DL requests.
> 
> Right, well, I don't actually have an environment to test this sanely,
> so someone will have to go play with the various variations and see what
> works.

Definitively, we have some synthetics for mainline... as well as we
can easily backport this series to v4.9 and test for power/perf using
a full Android stack. But, give today is the 22th, I guess we can do
that after holidays (in ~2 weeks).

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 13:19, Peter Zijlstra wrote:
> On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > > I was thinking that since dl is a 'global' scheduler the reservation
> > > would be too and thus the freq just needs a single CPU to be observed;
> > 
> > AFAIU global is only the admission control (which is something worth a
> > thread by itself...) while the dl_se->dl_bw are aggregated into the
> > dl_rq->running_bw, which ultimately represents the DL bandwidth
> > required for just a CPU.
> 
> Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
> I think. But yes, that's another thread.

Mmm... maybe I don't get your point... I was referring to the global
admission control of DL. If you have for example 3 60% DL tasks on a
2CPU system, AFAIU the CBS will allow the tasks in the system (since
the overall utilization is 180 < 200 * 0.95) although that workload is
not necessarily schedule (for example if the tasks wakeups at the
same time one of them will miss its deadline).

But, yeah... maybe I'm completely wrong or, in any case, it's for a
different thread...

> > > but I suppose there's nothing stopping anybody from splitting a clock
> > > domain down the middle scheduling wise. So yes, good point.
> > 
> > That makes sense... moreover, using the global utilization, we would
> > end up asking for capacities which cannot be provided by a single CPU.
> 
> Yes, but that _should_ not be a problem if you clock them all high
> enough. But this gets to be complicated real fast I think.

IMO the current solution with Juri's patches is working as expected:
we know how many DL tasks are runnable on a CPU and we properly
account for their utilization.

The only "issue/limitation" is (eventually) the case described above.
Dunno if we can enqueue 2 60% DL tasks on the same CPU... in that case
we will ask for 120% Utilization?

> > > Blergh that'd make a mess of things again.
> > 
> > Actually, looking better at your patch: are we not just ok with that?
> > 
> > I mean, we don't need this check on idle_cpu since in
> > sugov_aggregate_util we already skip the util=sg_cpu->max in case of
> > !rq->rt.rt_nr_running, while we aggregate just CFS and DL requests.
> 
> Right, well, I don't actually have an environment to test this sanely,
> so someone will have to go play with the various variations and see what
> works.

Definitively, we have some synthetics for mainline... as well as we
can easily backport this series to v4.9 and test for power/perf using
a full Android stack. But, give today is the 22th, I guess we can do
that after holidays (in ~2 weeks).

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Juri Lelli
On 22/12/17 13:19, Peter Zijlstra wrote:
> On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > > I was thinking that since dl is a 'global' scheduler the reservation
> > > would be too and thus the freq just needs a single CPU to be observed;
> > 
> > AFAIU global is only the admission control (which is something worth a
> > thread by itself...) while the dl_se->dl_bw are aggregated into the
> > dl_rq->running_bw, which ultimately represents the DL bandwidth
> > required for just a CPU.
> 
> Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
> I think. But yes, that's another thread.
> 
> > > but I suppose there's nothing stopping anybody from splitting a clock
> > > domain down the middle scheduling wise. So yes, good point.
> > 
> > That makes sense... moreover, using the global utilization, we would
> > end up asking for capacities which cannot be provided by a single CPU.
> 
> Yes, but that _should_ not be a problem if you clock them all high
> enough. But this gets to be complicated real fast I think.
> 
> > > Blergh that'd make a mess of things again.
> > 
> > Actually, looking better at your patch: are we not just ok with that?
> > 
> > I mean, we don't need this check on idle_cpu since in
> > sugov_aggregate_util we already skip the util=sg_cpu->max in case of
> > !rq->rt.rt_nr_running, while we aggregate just CFS and DL requests.
> 
> Right, well, I don't actually have an environment to test this sanely,
> so someone will have to go play with the various variations and see what
> works.

Adding Claudio and Luca to the thread (as I don't have a testing
platform myself ATM). ;)


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Juri Lelli
On 22/12/17 13:19, Peter Zijlstra wrote:
> On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > > I was thinking that since dl is a 'global' scheduler the reservation
> > > would be too and thus the freq just needs a single CPU to be observed;
> > 
> > AFAIU global is only the admission control (which is something worth a
> > thread by itself...) while the dl_se->dl_bw are aggregated into the
> > dl_rq->running_bw, which ultimately represents the DL bandwidth
> > required for just a CPU.
> 
> Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
> I think. But yes, that's another thread.
> 
> > > but I suppose there's nothing stopping anybody from splitting a clock
> > > domain down the middle scheduling wise. So yes, good point.
> > 
> > That makes sense... moreover, using the global utilization, we would
> > end up asking for capacities which cannot be provided by a single CPU.
> 
> Yes, but that _should_ not be a problem if you clock them all high
> enough. But this gets to be complicated real fast I think.
> 
> > > Blergh that'd make a mess of things again.
> > 
> > Actually, looking better at your patch: are we not just ok with that?
> > 
> > I mean, we don't need this check on idle_cpu since in
> > sugov_aggregate_util we already skip the util=sg_cpu->max in case of
> > !rq->rt.rt_nr_running, while we aggregate just CFS and DL requests.
> 
> Right, well, I don't actually have an environment to test this sanely,
> so someone will have to go play with the various variations and see what
> works.

Adding Claudio and Luca to the thread (as I don't have a testing
platform myself ATM). ;)


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 13:10, Peter Zijlstra wrote:
> On Fri, Dec 22, 2017 at 12:46:18PM +0100, Peter Zijlstra wrote:
> > Blergh that'd make a mess of things again.
> 
> Something like so then..
> 
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -187,11 +187,16 @@ static void sugov_get_util(struct sugov_
>  
>  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
>  {
> - unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
>   struct rq *rq = cpu_rq(sg_cpu->cpu);
> + unsigned long util;
>  
> - if (rq->rt.rt_nr_running)
> + if (rq->rt.rt_nr_running) {
>   util = sg_cpu->max;
> + } else {
> + util = sg_cpu->util_dl;
> + if (rq->cfs.h_nr_running)
> + util += sg_cpu->util_cfs;

Since sugov_aggregate_util always follow sugov_get_util, maybe we
can move these checks into the latter and remove the first one?

That way, sg_cpu->util_{dl,rt,cfs} will always report exactly the
requests of each class considering also which tasks are RUNNABLE at
sugov_get_util time.

Still the observation of Juri is valid: do we wanna really disregard
all the CFS blocked load as soon as there are not CFS tasks?

> + }
>  
>   /*
>* Ideally we would like to set util_dl as min/guaranteed freq and

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 13:10, Peter Zijlstra wrote:
> On Fri, Dec 22, 2017 at 12:46:18PM +0100, Peter Zijlstra wrote:
> > Blergh that'd make a mess of things again.
> 
> Something like so then..
> 
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -187,11 +187,16 @@ static void sugov_get_util(struct sugov_
>  
>  static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
>  {
> - unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
>   struct rq *rq = cpu_rq(sg_cpu->cpu);
> + unsigned long util;
>  
> - if (rq->rt.rt_nr_running)
> + if (rq->rt.rt_nr_running) {
>   util = sg_cpu->max;
> + } else {
> + util = sg_cpu->util_dl;
> + if (rq->cfs.h_nr_running)
> + util += sg_cpu->util_cfs;

Since sugov_aggregate_util always follow sugov_get_util, maybe we
can move these checks into the latter and remove the first one?

That way, sg_cpu->util_{dl,rt,cfs} will always report exactly the
requests of each class considering also which tasks are RUNNABLE at
sugov_get_util time.

Still the observation of Juri is valid: do we wanna really disregard
all the CFS blocked load as soon as there are not CFS tasks?

> + }
>  
>   /*
>* Ideally we would like to set util_dl as min/guaranteed freq and

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Fri, Dec 22, 2017 at 12:14:45PM +, Patrick Bellasi wrote:
> I think that check is already gone for CFS in the current PeterZ tree.
> It seems we use TICK_NS just for the reset of iowait_boost, isn't it?

Easy enough to bring back though.

> However, if the remote updates of CFS works as expected,
> the removal of the TICK_NS for CFS is not intentional?

So I killed that because I now do get_util for all CPUs and figured
get_util provides up-to-date numbers. So we don't need no artificial
aging.

Esp. once we get that whole blocked load stuff sorted.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Fri, Dec 22, 2017 at 12:14:45PM +, Patrick Bellasi wrote:
> I think that check is already gone for CFS in the current PeterZ tree.
> It seems we use TICK_NS just for the reset of iowait_boost, isn't it?

Easy enough to bring back though.

> However, if the remote updates of CFS works as expected,
> the removal of the TICK_NS for CFS is not intentional?

So I killed that because I now do get_util for all CPUs and figured
get_util provides up-to-date numbers. So we don't need no artificial
aging.

Esp. once we get that whole blocked load stuff sorted.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > I was thinking that since dl is a 'global' scheduler the reservation
> > would be too and thus the freq just needs a single CPU to be observed;
> 
> AFAIU global is only the admission control (which is something worth a
> thread by itself...) while the dl_se->dl_bw are aggregated into the
> dl_rq->running_bw, which ultimately represents the DL bandwidth
> required for just a CPU.

Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
I think. But yes, that's another thread.

> > but I suppose there's nothing stopping anybody from splitting a clock
> > domain down the middle scheduling wise. So yes, good point.
> 
> That makes sense... moreover, using the global utilization, we would
> end up asking for capacities which cannot be provided by a single CPU.

Yes, but that _should_ not be a problem if you clock them all high
enough. But this gets to be complicated real fast I think.

> > Blergh that'd make a mess of things again.
> 
> Actually, looking better at your patch: are we not just ok with that?
> 
> I mean, we don't need this check on idle_cpu since in
> sugov_aggregate_util we already skip the util=sg_cpu->max in case of
> !rq->rt.rt_nr_running, while we aggregate just CFS and DL requests.

Right, well, I don't actually have an environment to test this sanely,
so someone will have to go play with the various variations and see what
works.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Fri, Dec 22, 2017 at 12:07:37PM +, Patrick Bellasi wrote:
> > I was thinking that since dl is a 'global' scheduler the reservation
> > would be too and thus the freq just needs a single CPU to be observed;
> 
> AFAIU global is only the admission control (which is something worth a
> thread by itself...) while the dl_se->dl_bw are aggregated into the
> dl_rq->running_bw, which ultimately represents the DL bandwidth
> required for just a CPU.

Oh urgh yes, forgot that.. then the dl freq stuff isn't strictly correct
I think. But yes, that's another thread.

> > but I suppose there's nothing stopping anybody from splitting a clock
> > domain down the middle scheduling wise. So yes, good point.
> 
> That makes sense... moreover, using the global utilization, we would
> end up asking for capacities which cannot be provided by a single CPU.

Yes, but that _should_ not be a problem if you clock them all high
enough. But this gets to be complicated real fast I think.

> > Blergh that'd make a mess of things again.
> 
> Actually, looking better at your patch: are we not just ok with that?
> 
> I mean, we don't need this check on idle_cpu since in
> sugov_aggregate_util we already skip the util=sg_cpu->max in case of
> !rq->rt.rt_nr_running, while we aggregate just CFS and DL requests.

Right, well, I don't actually have an environment to test this sanely,
so someone will have to go play with the various variations and see what
works.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 13:07, Juri Lelli wrote:
> Hi Peter,
> 
> On 22/12/17 12:46, Peter Zijlstra wrote:
> > On Fri, Dec 22, 2017 at 11:02:06AM +, Patrick Bellasi wrote:
> > > > @@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct 
> > > > sugov_cpu *sg_cpu, u64 time)
> > > > unsigned long j_util, j_max;
> > > > s64 delta_ns;
> > > >  
> > > > -   if (j_sg_cpu != sg_cpu)
> > > > -   sugov_get_util(j_sg_cpu);
> > > > +   if (idle_cpu(j))
> > > > +   continue;
> > > 
> > > That should work to skip IDLE CPUs... however I'm missing where now we
> > > get the sugov_get_util(j_sg_cpu) for active CPUs. It has been moved
> > > somewhere else I guess...
> > 
> > No, I'm just an idiot... lemme fix that.
> > 
> > > Moreover, that way don't we completely disregard CFS blocked load for
> > > IDLE CPUs... as well as DL reserved utilization, which should be
> > > released only at the 0-lag time?
> > 
> > I was thinking that since dl is a 'global' scheduler the reservation
> > would be too and thus the freq just needs a single CPU to be observed;
> > but I suppose there's nothing stopping anybody from splitting a clock
> > domain down the middle scheduling wise. So yes, good point.
> 
> Also, for CFS current behaviour is to start ignoring contributions after
> TICK_NS. It seems that your change might introduce regressions?

Good point, an energy regression I guess you mean...

I think that check is already gone for CFS in the current PeterZ tree.
It seems we use TICK_NS just for the reset of iowait_boost, isn't it?

However, if the remote updates of CFS works as expected,
the removal of the TICK_NS for CFS is not intentional?

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 13:07, Juri Lelli wrote:
> Hi Peter,
> 
> On 22/12/17 12:46, Peter Zijlstra wrote:
> > On Fri, Dec 22, 2017 at 11:02:06AM +, Patrick Bellasi wrote:
> > > > @@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct 
> > > > sugov_cpu *sg_cpu, u64 time)
> > > > unsigned long j_util, j_max;
> > > > s64 delta_ns;
> > > >  
> > > > -   if (j_sg_cpu != sg_cpu)
> > > > -   sugov_get_util(j_sg_cpu);
> > > > +   if (idle_cpu(j))
> > > > +   continue;
> > > 
> > > That should work to skip IDLE CPUs... however I'm missing where now we
> > > get the sugov_get_util(j_sg_cpu) for active CPUs. It has been moved
> > > somewhere else I guess...
> > 
> > No, I'm just an idiot... lemme fix that.
> > 
> > > Moreover, that way don't we completely disregard CFS blocked load for
> > > IDLE CPUs... as well as DL reserved utilization, which should be
> > > released only at the 0-lag time?
> > 
> > I was thinking that since dl is a 'global' scheduler the reservation
> > would be too and thus the freq just needs a single CPU to be observed;
> > but I suppose there's nothing stopping anybody from splitting a clock
> > domain down the middle scheduling wise. So yes, good point.
> 
> Also, for CFS current behaviour is to start ignoring contributions after
> TICK_NS. It seems that your change might introduce regressions?

Good point, an energy regression I guess you mean...

I think that check is already gone for CFS in the current PeterZ tree.
It seems we use TICK_NS just for the reset of iowait_boost, isn't it?

However, if the remote updates of CFS works as expected,
the removal of the TICK_NS for CFS is not intentional?

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Fri, Dec 22, 2017 at 12:46:18PM +0100, Peter Zijlstra wrote:
> Blergh that'd make a mess of things again.

Something like so then..

--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -187,11 +187,16 @@ static void sugov_get_util(struct sugov_
 
 static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
 {
-   unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
struct rq *rq = cpu_rq(sg_cpu->cpu);
+   unsigned long util;
 
-   if (rq->rt.rt_nr_running)
+   if (rq->rt.rt_nr_running) {
util = sg_cpu->max;
+   } else {
+   util = sg_cpu->util_dl;
+   if (rq->cfs.h_nr_running)
+   util += sg_cpu->util_cfs;
+   }
 
/*
 * Ideally we would like to set util_dl as min/guaranteed freq and


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Fri, Dec 22, 2017 at 12:46:18PM +0100, Peter Zijlstra wrote:
> Blergh that'd make a mess of things again.

Something like so then..

--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -187,11 +187,16 @@ static void sugov_get_util(struct sugov_
 
 static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
 {
-   unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
struct rq *rq = cpu_rq(sg_cpu->cpu);
+   unsigned long util;
 
-   if (rq->rt.rt_nr_running)
+   if (rq->rt.rt_nr_running) {
util = sg_cpu->max;
+   } else {
+   util = sg_cpu->util_dl;
+   if (rq->cfs.h_nr_running)
+   util += sg_cpu->util_cfs;
+   }
 
/*
 * Ideally we would like to set util_dl as min/guaranteed freq and


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 12:46, Peter Zijlstra wrote:
> On Fri, Dec 22, 2017 at 11:02:06AM +, Patrick Bellasi wrote:
> > > @@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct 
> > > sugov_cpu *sg_cpu, u64 time)
> > >   unsigned long j_util, j_max;
> > >   s64 delta_ns;
> > >  
> > > - if (j_sg_cpu != sg_cpu)
> > > - sugov_get_util(j_sg_cpu);
> > > + if (idle_cpu(j))
> > > + continue;
> > 
> > That should work to skip IDLE CPUs... however I'm missing where now we
> > get the sugov_get_util(j_sg_cpu) for active CPUs. It has been moved
> > somewhere else I guess...
> 
> No, I'm just an idiot... lemme fix that.

Then you just missed a call to sugov_get_util(j_sg_cpu) after the
above if... right, actually that was Viresh proposal...

> > Moreover, that way don't we completely disregard CFS blocked load for
> > IDLE CPUs... as well as DL reserved utilization, which should be
> > released only at the 0-lag time?
> 
> I was thinking that since dl is a 'global' scheduler the reservation
> would be too and thus the freq just needs a single CPU to be observed;

AFAIU global is only the admission control (which is something worth a
thread by itself...) while the dl_se->dl_bw are aggregated into the
dl_rq->running_bw, which ultimately represents the DL bandwidth
required for just a CPU.

> but I suppose there's nothing stopping anybody from splitting a clock
> domain down the middle scheduling wise. So yes, good point.

That makes sense... moreover, using the global utilization, we would
end up asking for capacities which cannot be provided by a single CPU.

> Blergh that'd make a mess of things again.

Actually, looking better at your patch: are we not just ok with that?

I mean, we don't need this check on idle_cpu since in
sugov_aggregate_util we already skip the util=sg_cpu->max in case of
!rq->rt.rt_nr_running, while we aggregate just CFS and DL requests.

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 12:46, Peter Zijlstra wrote:
> On Fri, Dec 22, 2017 at 11:02:06AM +, Patrick Bellasi wrote:
> > > @@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct 
> > > sugov_cpu *sg_cpu, u64 time)
> > >   unsigned long j_util, j_max;
> > >   s64 delta_ns;
> > >  
> > > - if (j_sg_cpu != sg_cpu)
> > > - sugov_get_util(j_sg_cpu);
> > > + if (idle_cpu(j))
> > > + continue;
> > 
> > That should work to skip IDLE CPUs... however I'm missing where now we
> > get the sugov_get_util(j_sg_cpu) for active CPUs. It has been moved
> > somewhere else I guess...
> 
> No, I'm just an idiot... lemme fix that.

Then you just missed a call to sugov_get_util(j_sg_cpu) after the
above if... right, actually that was Viresh proposal...

> > Moreover, that way don't we completely disregard CFS blocked load for
> > IDLE CPUs... as well as DL reserved utilization, which should be
> > released only at the 0-lag time?
> 
> I was thinking that since dl is a 'global' scheduler the reservation
> would be too and thus the freq just needs a single CPU to be observed;

AFAIU global is only the admission control (which is something worth a
thread by itself...) while the dl_se->dl_bw are aggregated into the
dl_rq->running_bw, which ultimately represents the DL bandwidth
required for just a CPU.

> but I suppose there's nothing stopping anybody from splitting a clock
> domain down the middle scheduling wise. So yes, good point.

That makes sense... moreover, using the global utilization, we would
end up asking for capacities which cannot be provided by a single CPU.

> Blergh that'd make a mess of things again.

Actually, looking better at your patch: are we not just ok with that?

I mean, we don't need this check on idle_cpu since in
sugov_aggregate_util we already skip the util=sg_cpu->max in case of
!rq->rt.rt_nr_running, while we aggregate just CFS and DL requests.

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Juri Lelli
Hi Peter,

On 22/12/17 12:46, Peter Zijlstra wrote:
> On Fri, Dec 22, 2017 at 11:02:06AM +, Patrick Bellasi wrote:
> > > @@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct 
> > > sugov_cpu *sg_cpu, u64 time)
> > >   unsigned long j_util, j_max;
> > >   s64 delta_ns;
> > >  
> > > - if (j_sg_cpu != sg_cpu)
> > > - sugov_get_util(j_sg_cpu);
> > > + if (idle_cpu(j))
> > > + continue;
> > 
> > That should work to skip IDLE CPUs... however I'm missing where now we
> > get the sugov_get_util(j_sg_cpu) for active CPUs. It has been moved
> > somewhere else I guess...
> 
> No, I'm just an idiot... lemme fix that.
> 
> > Moreover, that way don't we completely disregard CFS blocked load for
> > IDLE CPUs... as well as DL reserved utilization, which should be
> > released only at the 0-lag time?
> 
> I was thinking that since dl is a 'global' scheduler the reservation
> would be too and thus the freq just needs a single CPU to be observed;
> but I suppose there's nothing stopping anybody from splitting a clock
> domain down the middle scheduling wise. So yes, good point.

Also, for CFS current behaviour is to start ignoring contributions after
TICK_NS. It seems that your change might introduce regressions?

Thanks,

- Juri


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Juri Lelli
Hi Peter,

On 22/12/17 12:46, Peter Zijlstra wrote:
> On Fri, Dec 22, 2017 at 11:02:06AM +, Patrick Bellasi wrote:
> > > @@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct 
> > > sugov_cpu *sg_cpu, u64 time)
> > >   unsigned long j_util, j_max;
> > >   s64 delta_ns;
> > >  
> > > - if (j_sg_cpu != sg_cpu)
> > > - sugov_get_util(j_sg_cpu);
> > > + if (idle_cpu(j))
> > > + continue;
> > 
> > That should work to skip IDLE CPUs... however I'm missing where now we
> > get the sugov_get_util(j_sg_cpu) for active CPUs. It has been moved
> > somewhere else I guess...
> 
> No, I'm just an idiot... lemme fix that.
> 
> > Moreover, that way don't we completely disregard CFS blocked load for
> > IDLE CPUs... as well as DL reserved utilization, which should be
> > released only at the 0-lag time?
> 
> I was thinking that since dl is a 'global' scheduler the reservation
> would be too and thus the freq just needs a single CPU to be observed;
> but I suppose there's nothing stopping anybody from splitting a clock
> domain down the middle scheduling wise. So yes, good point.

Also, for CFS current behaviour is to start ignoring contributions after
TICK_NS. It seems that your change might introduce regressions?

Thanks,

- Juri


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Fri, Dec 22, 2017 at 11:02:06AM +, Patrick Bellasi wrote:
> > @@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct 
> > sugov_cpu *sg_cpu, u64 time)
> > unsigned long j_util, j_max;
> > s64 delta_ns;
> >  
> > -   if (j_sg_cpu != sg_cpu)
> > -   sugov_get_util(j_sg_cpu);
> > +   if (idle_cpu(j))
> > +   continue;
> 
> That should work to skip IDLE CPUs... however I'm missing where now we
> get the sugov_get_util(j_sg_cpu) for active CPUs. It has been moved
> somewhere else I guess...

No, I'm just an idiot... lemme fix that.

> Moreover, that way don't we completely disregard CFS blocked load for
> IDLE CPUs... as well as DL reserved utilization, which should be
> released only at the 0-lag time?

I was thinking that since dl is a 'global' scheduler the reservation
would be too and thus the freq just needs a single CPU to be observed;
but I suppose there's nothing stopping anybody from splitting a clock
domain down the middle scheduling wise. So yes, good point.

Blergh that'd make a mess of things again.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Fri, Dec 22, 2017 at 11:02:06AM +, Patrick Bellasi wrote:
> > @@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct 
> > sugov_cpu *sg_cpu, u64 time)
> > unsigned long j_util, j_max;
> > s64 delta_ns;
> >  
> > -   if (j_sg_cpu != sg_cpu)
> > -   sugov_get_util(j_sg_cpu);
> > +   if (idle_cpu(j))
> > +   continue;
> 
> That should work to skip IDLE CPUs... however I'm missing where now we
> get the sugov_get_util(j_sg_cpu) for active CPUs. It has been moved
> somewhere else I guess...

No, I'm just an idiot... lemme fix that.

> Moreover, that way don't we completely disregard CFS blocked load for
> IDLE CPUs... as well as DL reserved utilization, which should be
> released only at the 0-lag time?

I was thinking that since dl is a 'global' scheduler the reservation
would be too and thus the freq just needs a single CPU to be observed;
but I suppose there's nothing stopping anybody from splitting a clock
domain down the middle scheduling wise. So yes, good point.

Blergh that'd make a mess of things again.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 11:06, Peter Zijlstra wrote:
> On Wed, Dec 20, 2017 at 06:38:14PM +0100, Juri Lelli wrote:
> > On 20/12/17 16:30, Peter Zijlstra wrote:
> > 
> > [...]
> > 
> > > @@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
> > >   if (delta_ns > TICK_NSEC) {
> > >   j_sg_cpu->iowait_boost = 0;
> > >   j_sg_cpu->iowait_boost_pending = false;
> > > - j_sg_cpu->util_cfs = 0;
> > > - if (j_sg_cpu->util_dl == 0)
> > > - continue;
> > >   }
> > 
> > This goes away because with Brendan/Vincent fix we don't need the
> > workaround for stale CFS util contribution for idle CPUs anymore?
> 
> An easy fix would be something like the below I suppose (also folded a
> change from Viresh).
> 
> This way it completely ignores the demand from idle CPUs. Which I
> suppose is exactly what you want, no?
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index ab84d2261554..9736b537386a 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct 
> sugov_cpu *sg_cpu, u64 time)
>   unsigned long j_util, j_max;
>   s64 delta_ns;
>  
> - if (j_sg_cpu != sg_cpu)
> - sugov_get_util(j_sg_cpu);
> + if (idle_cpu(j))
> + continue;

That should work to skip IDLE CPUs... however I'm missing where now we
get the sugov_get_util(j_sg_cpu) for active CPUs. It has been moved
somewhere else I guess...

Moreover, that way don't we completely disregard CFS blocked load for
IDLE CPUs... as well as DL reserved utilization, which should be
released only at the 0-lag time?

>  
>   /*
>* If the CFS CPU utilization was last updated before the
> @@ -354,7 +354,6 @@ static void sugov_update_shared(struct update_util_data 
> *hook, u64 time,
>  
>   raw_spin_lock(_policy->update_lock);
>  
> - sugov_get_util(sg_cpu);
>   sugov_set_iowait_boost(sg_cpu, time, flags);
>   sg_cpu->last_update = time;
>  

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Patrick Bellasi
On 22-Dec 11:06, Peter Zijlstra wrote:
> On Wed, Dec 20, 2017 at 06:38:14PM +0100, Juri Lelli wrote:
> > On 20/12/17 16:30, Peter Zijlstra wrote:
> > 
> > [...]
> > 
> > > @@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
> > >   if (delta_ns > TICK_NSEC) {
> > >   j_sg_cpu->iowait_boost = 0;
> > >   j_sg_cpu->iowait_boost_pending = false;
> > > - j_sg_cpu->util_cfs = 0;
> > > - if (j_sg_cpu->util_dl == 0)
> > > - continue;
> > >   }
> > 
> > This goes away because with Brendan/Vincent fix we don't need the
> > workaround for stale CFS util contribution for idle CPUs anymore?
> 
> An easy fix would be something like the below I suppose (also folded a
> change from Viresh).
> 
> This way it completely ignores the demand from idle CPUs. Which I
> suppose is exactly what you want, no?
> 
> diff --git a/kernel/sched/cpufreq_schedutil.c 
> b/kernel/sched/cpufreq_schedutil.c
> index ab84d2261554..9736b537386a 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct 
> sugov_cpu *sg_cpu, u64 time)
>   unsigned long j_util, j_max;
>   s64 delta_ns;
>  
> - if (j_sg_cpu != sg_cpu)
> - sugov_get_util(j_sg_cpu);
> + if (idle_cpu(j))
> + continue;

That should work to skip IDLE CPUs... however I'm missing where now we
get the sugov_get_util(j_sg_cpu) for active CPUs. It has been moved
somewhere else I guess...

Moreover, that way don't we completely disregard CFS blocked load for
IDLE CPUs... as well as DL reserved utilization, which should be
released only at the 0-lag time?

>  
>   /*
>* If the CFS CPU utilization was last updated before the
> @@ -354,7 +354,6 @@ static void sugov_update_shared(struct update_util_data 
> *hook, u64 time,
>  
>   raw_spin_lock(_policy->update_lock);
>  
> - sugov_get_util(sg_cpu);
>   sugov_set_iowait_boost(sg_cpu, time, flags);
>   sg_cpu->last_update = time;
>  

-- 
#include 

Patrick Bellasi


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 06:38:14PM +0100, Juri Lelli wrote:
> On 20/12/17 16:30, Peter Zijlstra wrote:
> 
> [...]
> 
> > @@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
> > if (delta_ns > TICK_NSEC) {
> > j_sg_cpu->iowait_boost = 0;
> > j_sg_cpu->iowait_boost_pending = false;
> > -   j_sg_cpu->util_cfs = 0;
> > -   if (j_sg_cpu->util_dl == 0)
> > -   continue;
> > }
> 
> This goes away because with Brendan/Vincent fix we don't need the
> workaround for stale CFS util contribution for idle CPUs anymore?

An easy fix would be something like the below I suppose (also folded a
change from Viresh).

This way it completely ignores the demand from idle CPUs. Which I
suppose is exactly what you want, no?

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index ab84d2261554..9736b537386a 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu 
*sg_cpu, u64 time)
unsigned long j_util, j_max;
s64 delta_ns;
 
-   if (j_sg_cpu != sg_cpu)
-   sugov_get_util(j_sg_cpu);
+   if (idle_cpu(j))
+   continue;
 
/*
 * If the CFS CPU utilization was last updated before the
@@ -354,7 +354,6 @@ static void sugov_update_shared(struct update_util_data 
*hook, u64 time,
 
raw_spin_lock(_policy->update_lock);
 
-   sugov_get_util(sg_cpu);
sugov_set_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
 


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 06:38:14PM +0100, Juri Lelli wrote:
> On 20/12/17 16:30, Peter Zijlstra wrote:
> 
> [...]
> 
> > @@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
> > if (delta_ns > TICK_NSEC) {
> > j_sg_cpu->iowait_boost = 0;
> > j_sg_cpu->iowait_boost_pending = false;
> > -   j_sg_cpu->util_cfs = 0;
> > -   if (j_sg_cpu->util_dl == 0)
> > -   continue;
> > }
> 
> This goes away because with Brendan/Vincent fix we don't need the
> workaround for stale CFS util contribution for idle CPUs anymore?

An easy fix would be something like the below I suppose (also folded a
change from Viresh).

This way it completely ignores the demand from idle CPUs. Which I
suppose is exactly what you want, no?

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index ab84d2261554..9736b537386a 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -315,8 +315,8 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu 
*sg_cpu, u64 time)
unsigned long j_util, j_max;
s64 delta_ns;
 
-   if (j_sg_cpu != sg_cpu)
-   sugov_get_util(j_sg_cpu);
+   if (idle_cpu(j))
+   continue;
 
/*
 * If the CFS CPU utilization was last updated before the
@@ -354,7 +354,6 @@ static void sugov_update_shared(struct update_util_data 
*hook, u64 time,
 
raw_spin_lock(_policy->update_lock);
 
-   sugov_get_util(sg_cpu);
sugov_set_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
 


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Thu, Dec 21, 2017 at 04:13:17PM +0530, Viresh Kumar wrote:
> On 21-12-17, 11:39, Peter Zijlstra wrote:
> > The difference is that we apply the per-cpu boost on the per-cpu util
> > value and _then_ find the overall maximum.
> > 
> > Instead of finding the overall maximum and then apply the per-cpu boost
> > to that.
> 
> Okay, so it is just about the right sequencing of these comparisons but the
> outcome will still be same, i.e. max of the 3 util/max values. Thanks.

Aah, you're right. I was thinking we have relative boost, and in that
case the ordering matters much more.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-22 Thread Peter Zijlstra
On Thu, Dec 21, 2017 at 04:13:17PM +0530, Viresh Kumar wrote:
> On 21-12-17, 11:39, Peter Zijlstra wrote:
> > The difference is that we apply the per-cpu boost on the per-cpu util
> > value and _then_ find the overall maximum.
> > 
> > Instead of finding the overall maximum and then apply the per-cpu boost
> > to that.
> 
> Okay, so it is just about the right sequencing of these comparisons but the
> outcome will still be same, i.e. max of the 3 util/max values. Thanks.

Aah, you're right. I was thinking we have relative boost, and in that
case the ordering matters much more.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-21 Thread Viresh Kumar
On 21-12-17, 11:39, Peter Zijlstra wrote:
> The difference is that we apply the per-cpu boost on the per-cpu util
> value and _then_ find the overall maximum.
> 
> Instead of finding the overall maximum and then apply the per-cpu boost
> to that.

Okay, so it is just about the right sequencing of these comparisons but the
outcome will still be same, i.e. max of the 3 util/max values. Thanks.

-- 
viresh


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-21 Thread Viresh Kumar
On 21-12-17, 11:39, Peter Zijlstra wrote:
> The difference is that we apply the per-cpu boost on the per-cpu util
> value and _then_ find the overall maximum.
> 
> Instead of finding the overall maximum and then apply the per-cpu boost
> to that.

Okay, so it is just about the right sequencing of these comparisons but the
outcome will still be same, i.e. max of the 3 util/max values. Thanks.

-- 
viresh


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-21 Thread Peter Zijlstra
On Thu, Dec 21, 2017 at 04:00:22PM +0530, Viresh Kumar wrote:
> On 21-12-17, 11:25, Peter Zijlstra wrote:
> > On Thu, Dec 21, 2017 at 02:45:02PM +0530, Viresh Kumar wrote:
> > > On 20-12-17, 16:43, Peter Zijlstra wrote:
> > > > The below makes more sense to me too; hmm?
> > > > 
> > > > @@ -335,12 +335,11 @@ static unsigned int sugov_next_freq_shar
> > > >  
> > > > j_max = j_sg_cpu->max;
> > > > j_util = sugov_aggregate_util(j_sg_cpu);
> > > > +   sugov_iowait_boost(j_sg_cpu, , );
> > 
> > This should 'obviously' have been:
> > 
> > sugov_iowait_boost(j_sg_cpu, _util, *j_max);
> 
> Actually it should be:
> 
>   sugov_iowait_boost(j_sg_cpu, _util, _max);

Yes, clearly I cannot type much ;-)

> and this is how it was in the commit I reviewed from your tree. But my query
> still stands, what difference will it make ?
> 
> > > > if (j_util * max > j_max * util) {
> > > > util = j_util;
> > > > max = j_max;
> > > > }
> > > > -
> > > > -   sugov_iowait_boost(j_sg_cpu, , );
> 

The difference is that we apply the per-cpu boost on the per-cpu util
value and _then_ find the overall maximum.

Instead of finding the overall maximum and then apply the per-cpu boost
to that.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-21 Thread Peter Zijlstra
On Thu, Dec 21, 2017 at 04:00:22PM +0530, Viresh Kumar wrote:
> On 21-12-17, 11:25, Peter Zijlstra wrote:
> > On Thu, Dec 21, 2017 at 02:45:02PM +0530, Viresh Kumar wrote:
> > > On 20-12-17, 16:43, Peter Zijlstra wrote:
> > > > The below makes more sense to me too; hmm?
> > > > 
> > > > @@ -335,12 +335,11 @@ static unsigned int sugov_next_freq_shar
> > > >  
> > > > j_max = j_sg_cpu->max;
> > > > j_util = sugov_aggregate_util(j_sg_cpu);
> > > > +   sugov_iowait_boost(j_sg_cpu, , );
> > 
> > This should 'obviously' have been:
> > 
> > sugov_iowait_boost(j_sg_cpu, _util, *j_max);
> 
> Actually it should be:
> 
>   sugov_iowait_boost(j_sg_cpu, _util, _max);

Yes, clearly I cannot type much ;-)

> and this is how it was in the commit I reviewed from your tree. But my query
> still stands, what difference will it make ?
> 
> > > > if (j_util * max > j_max * util) {
> > > > util = j_util;
> > > > max = j_max;
> > > > }
> > > > -
> > > > -   sugov_iowait_boost(j_sg_cpu, , );
> 

The difference is that we apply the per-cpu boost on the per-cpu util
value and _then_ find the overall maximum.

Instead of finding the overall maximum and then apply the per-cpu boost
to that.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-21 Thread Viresh Kumar
On 21-12-17, 11:25, Peter Zijlstra wrote:
> On Thu, Dec 21, 2017 at 02:45:02PM +0530, Viresh Kumar wrote:
> > On 20-12-17, 16:43, Peter Zijlstra wrote:
> > > The below makes more sense to me too; hmm?
> > > 
> > > @@ -335,12 +335,11 @@ static unsigned int sugov_next_freq_shar
> > >  
> > >   j_max = j_sg_cpu->max;
> > >   j_util = sugov_aggregate_util(j_sg_cpu);
> > > + sugov_iowait_boost(j_sg_cpu, , );
> 
> This should 'obviously' have been:
> 
>   sugov_iowait_boost(j_sg_cpu, _util, *j_max);

Actually it should be:

sugov_iowait_boost(j_sg_cpu, _util, _max);

and this is how it was in the commit I reviewed from your tree. But my query
still stands, what difference will it make ?

> > >   if (j_util * max > j_max * util) {
> > >   util = j_util;
> > >   max = j_max;
> > >   }
> > > -
> > > - sugov_iowait_boost(j_sg_cpu, , );

-- 
viresh


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-21 Thread Viresh Kumar
On 21-12-17, 11:25, Peter Zijlstra wrote:
> On Thu, Dec 21, 2017 at 02:45:02PM +0530, Viresh Kumar wrote:
> > On 20-12-17, 16:43, Peter Zijlstra wrote:
> > > The below makes more sense to me too; hmm?
> > > 
> > > @@ -335,12 +335,11 @@ static unsigned int sugov_next_freq_shar
> > >  
> > >   j_max = j_sg_cpu->max;
> > >   j_util = sugov_aggregate_util(j_sg_cpu);
> > > + sugov_iowait_boost(j_sg_cpu, , );
> 
> This should 'obviously' have been:
> 
>   sugov_iowait_boost(j_sg_cpu, _util, *j_max);

Actually it should be:

sugov_iowait_boost(j_sg_cpu, _util, _max);

and this is how it was in the commit I reviewed from your tree. But my query
still stands, what difference will it make ?

> > >   if (j_util * max > j_max * util) {
> > >   util = j_util;
> > >   max = j_max;
> > >   }
> > > -
> > > - sugov_iowait_boost(j_sg_cpu, , );

-- 
viresh


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-21 Thread Peter Zijlstra
On Thu, Dec 21, 2017 at 02:45:02PM +0530, Viresh Kumar wrote:
> On 20-12-17, 16:43, Peter Zijlstra wrote:
> > The below makes more sense to me too; hmm?
> > 
> > @@ -335,12 +335,11 @@ static unsigned int sugov_next_freq_shar
> >  
> > j_max = j_sg_cpu->max;
> > j_util = sugov_aggregate_util(j_sg_cpu);
> > +   sugov_iowait_boost(j_sg_cpu, , );

This should 'obviously' have been:

sugov_iowait_boost(j_sg_cpu, _util, *j_max);

> > if (j_util * max > j_max * util) {
> > util = j_util;
> > max = j_max;
> > }
> > -
> > -   sugov_iowait_boost(j_sg_cpu, , );



Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-21 Thread Peter Zijlstra
On Thu, Dec 21, 2017 at 02:45:02PM +0530, Viresh Kumar wrote:
> On 20-12-17, 16:43, Peter Zijlstra wrote:
> > The below makes more sense to me too; hmm?
> > 
> > @@ -335,12 +335,11 @@ static unsigned int sugov_next_freq_shar
> >  
> > j_max = j_sg_cpu->max;
> > j_util = sugov_aggregate_util(j_sg_cpu);
> > +   sugov_iowait_boost(j_sg_cpu, , );

This should 'obviously' have been:

sugov_iowait_boost(j_sg_cpu, _util, *j_max);

> > if (j_util * max > j_max * util) {
> > util = j_util;
> > max = j_max;
> > }
> > -
> > -   sugov_iowait_boost(j_sg_cpu, , );



Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-21 Thread Viresh Kumar
On 20-12-17, 16:43, Peter Zijlstra wrote:
> The below makes more sense to me too; hmm?
> 
> @@ -335,12 +335,11 @@ static unsigned int sugov_next_freq_shar
>  
>   j_max = j_sg_cpu->max;
>   j_util = sugov_aggregate_util(j_sg_cpu);
> + sugov_iowait_boost(j_sg_cpu, , );
>   if (j_util * max > j_max * util) {
>   util = j_util;
>   max = j_max;
>   }
> -
> - sugov_iowait_boost(j_sg_cpu, , );

Sorry if I am being a fool here, I had 3 different interpretations of
the results after this change in the last 15 minutes. It was confusing
for somehow..

Why do you think above change matters ? I think nothing changed after
this diff at all.

We have three different values here:

util/max, j_util/j_max, and j_boost_util/j_boost_max.

And we are trying to find the max among them and changing the order of
comparisons doesn't change anything.

Am I reading the code correctly ?

-- 
viresh


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-21 Thread Viresh Kumar
On 20-12-17, 16:43, Peter Zijlstra wrote:
> The below makes more sense to me too; hmm?
> 
> @@ -335,12 +335,11 @@ static unsigned int sugov_next_freq_shar
>  
>   j_max = j_sg_cpu->max;
>   j_util = sugov_aggregate_util(j_sg_cpu);
> + sugov_iowait_boost(j_sg_cpu, , );
>   if (j_util * max > j_max * util) {
>   util = j_util;
>   max = j_max;
>   }
> -
> - sugov_iowait_boost(j_sg_cpu, , );

Sorry if I am being a fool here, I had 3 different interpretations of
the results after this change in the last 15 minutes. It was confusing
for somehow..

Why do you think above change matters ? I think nothing changed after
this diff at all.

We have three different values here:

util/max, j_util/j_max, and j_boost_util/j_boost_max.

And we are trying to find the max among them and changing the order of
comparisons doesn't change anything.

Am I reading the code correctly ?

-- 
viresh


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Viresh Kumar
On 20-12-17, 16:30, Peter Zijlstra wrote:
> 
> So I ended up with the below (on top of Juri's cpufreq-dl patches).

Nice :)

There are two things that I noticed in your tree.

Firstly, there is no need of the following patch as we shouldn't have
the problem mentioned in the commit anymore:

38e19dbe1286 cpufreq: schedutil: ignore sugov kthreads

And maybe we can apply the below patch after that (only compile
tested).

-8<-

From: Viresh Kumar 
Date: Thu, 21 Dec 2017 12:52:50 +0530
Subject: [PATCH] cpufreq/schedutil: Call sugov_get_util() only when required

sugov_update_shared() doesn't use the updated values of max, util_cfs
and util_dl, unless we try to find the next frequency by calling
sugov_next_freq_shared() and so there is no need to call it directly
from sugov_update_shared(). Rather postpone it until the time
sugov_next_freq_shared() is called for one of the CPUs that share the
policy.

Signed-off-by: Viresh Kumar 
---
 kernel/sched/cpufreq_schedutil.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index ab84d2261554..f2f4df26b954 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -315,8 +315,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu 
*sg_cpu, u64 time)
unsigned long j_util, j_max;
s64 delta_ns;
 
-   if (j_sg_cpu != sg_cpu)
-   sugov_get_util(j_sg_cpu);
+   sugov_get_util(j_sg_cpu);
 
/*
 * If the CFS CPU utilization was last updated before the
@@ -354,7 +353,6 @@ static void sugov_update_shared(struct update_util_data 
*hook, u64 time,
 
raw_spin_lock(_policy->update_lock);
 
-   sugov_get_util(sg_cpu);
sugov_set_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
 


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Viresh Kumar
On 20-12-17, 16:30, Peter Zijlstra wrote:
> 
> So I ended up with the below (on top of Juri's cpufreq-dl patches).

Nice :)

There are two things that I noticed in your tree.

Firstly, there is no need of the following patch as we shouldn't have
the problem mentioned in the commit anymore:

38e19dbe1286 cpufreq: schedutil: ignore sugov kthreads

And maybe we can apply the below patch after that (only compile
tested).

-8<-

From: Viresh Kumar 
Date: Thu, 21 Dec 2017 12:52:50 +0530
Subject: [PATCH] cpufreq/schedutil: Call sugov_get_util() only when required

sugov_update_shared() doesn't use the updated values of max, util_cfs
and util_dl, unless we try to find the next frequency by calling
sugov_next_freq_shared() and so there is no need to call it directly
from sugov_update_shared(). Rather postpone it until the time
sugov_next_freq_shared() is called for one of the CPUs that share the
policy.

Signed-off-by: Viresh Kumar 
---
 kernel/sched/cpufreq_schedutil.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index ab84d2261554..f2f4df26b954 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -315,8 +315,7 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu 
*sg_cpu, u64 time)
unsigned long j_util, j_max;
s64 delta_ns;
 
-   if (j_sg_cpu != sg_cpu)
-   sugov_get_util(j_sg_cpu);
+   sugov_get_util(j_sg_cpu);
 
/*
 * If the CFS CPU utilization was last updated before the
@@ -354,7 +353,6 @@ static void sugov_update_shared(struct update_util_data 
*hook, u64 time,
 
raw_spin_lock(_policy->update_lock);
 
-   sugov_get_util(sg_cpu);
sugov_set_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
 


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 06:38:14PM +0100, Juri Lelli wrote:
> On 20/12/17 16:30, Peter Zijlstra wrote:
> 
> [...]
> 
> > @@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
> > if (delta_ns > TICK_NSEC) {
> > j_sg_cpu->iowait_boost = 0;
> > j_sg_cpu->iowait_boost_pending = false;
> > -   j_sg_cpu->util_cfs = 0;
> > -   if (j_sg_cpu->util_dl == 0)
> > -   continue;
> > }
> 
> This goes away because with Brendan/Vincent fix we don't need the
> workaround for stale CFS util contribution for idle CPUs anymore?

Oh, good point, no I took that out because of:

@@ -314,6 +315,9 @@ static unsigned int sugov_next_freq_shar

unsigned long j_util, j_max;
s64 delta_ns;

+   if (j_sg_cpu != sg_cpu)
+   sugov_get_util(j_sg_cpu);
+
/*
 * If the CFS CPU utilization was last updated before the
 * previous frequency update and the time elapsed between the


which recomputes the util value all the time. But yes, that still needs
those other patches to stay relevant.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 06:38:14PM +0100, Juri Lelli wrote:
> On 20/12/17 16:30, Peter Zijlstra wrote:
> 
> [...]
> 
> > @@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
> > if (delta_ns > TICK_NSEC) {
> > j_sg_cpu->iowait_boost = 0;
> > j_sg_cpu->iowait_boost_pending = false;
> > -   j_sg_cpu->util_cfs = 0;
> > -   if (j_sg_cpu->util_dl == 0)
> > -   continue;
> > }
> 
> This goes away because with Brendan/Vincent fix we don't need the
> workaround for stale CFS util contribution for idle CPUs anymore?

Oh, good point, no I took that out because of:

@@ -314,6 +315,9 @@ static unsigned int sugov_next_freq_shar

unsigned long j_util, j_max;
s64 delta_ns;

+   if (j_sg_cpu != sg_cpu)
+   sugov_get_util(j_sg_cpu);
+
/*
 * If the CFS CPU utilization was last updated before the
 * previous frequency update and the time elapsed between the


which recomputes the util value all the time. But yes, that still needs
those other patches to stay relevant.


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Juri Lelli
On 20/12/17 16:30, Peter Zijlstra wrote:

[...]

> @@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
>   if (delta_ns > TICK_NSEC) {
>   j_sg_cpu->iowait_boost = 0;
>   j_sg_cpu->iowait_boost_pending = false;
> - j_sg_cpu->util_cfs = 0;
> - if (j_sg_cpu->util_dl == 0)
> - continue;
>   }

This goes away because with Brendan/Vincent fix we don't need the
workaround for stale CFS util contribution for idle CPUs anymore?


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Juri Lelli
On 20/12/17 16:30, Peter Zijlstra wrote:

[...]

> @@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
>   if (delta_ns > TICK_NSEC) {
>   j_sg_cpu->iowait_boost = 0;
>   j_sg_cpu->iowait_boost_pending = false;
> - j_sg_cpu->util_cfs = 0;
> - if (j_sg_cpu->util_dl == 0)
> - continue;
>   }

This goes away because with Brendan/Vincent fix we don't need the
workaround for stale CFS util contribution for idle CPUs anymore?


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 04:30:29PM +0100, Peter Zijlstra wrote:
> 
> So I ended up with the below (on top of Juri's cpufreq-dl patches).
> 
> It compiles, but that's about all the testing it had.

Should all be available at:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/core

compile tested only, I suppose the 0day bot will let me know soon enough
:-)


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 04:30:29PM +0100, Peter Zijlstra wrote:
> 
> So I ended up with the below (on top of Juri's cpufreq-dl patches).
> 
> It compiles, but that's about all the testing it had.

Should all be available at:

  git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git sched/core

compile tested only, I suppose the 0day bot will let me know soon enough
:-)


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 04:30:29PM +0100, Peter Zijlstra wrote:
> @@ -314,6 +315,9 @@ static unsigned int sugov_next_freq_shar
>   unsigned long j_util, j_max;
>   s64 delta_ns;
>  
> + if (j_sg_cpu != sg_cpu)
> + sugov_get_util(j_sg_cpu);
> +
>   /*
>* If the CFS CPU utilization was last updated before the
>* previous frequency update and the time elapsed between the
> @@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
>   if (delta_ns > TICK_NSEC) {
>   j_sg_cpu->iowait_boost = 0;
>   j_sg_cpu->iowait_boost_pending = false;
> - j_sg_cpu->util_cfs = 0;
> - if (j_sg_cpu->util_dl == 0)
> - continue;
>   }
> - if (j_sg_cpu->flags & SCHED_CPUFREQ_RT)
> - return policy->cpuinfo.max_freq;
>  
>   j_max = j_sg_cpu->max;
>   j_util = sugov_aggregate_util(j_sg_cpu);

The below makes more sense to me too; hmm?

@@ -335,12 +335,11 @@ static unsigned int sugov_next_freq_shar
 
j_max = j_sg_cpu->max;
j_util = sugov_aggregate_util(j_sg_cpu);
+   sugov_iowait_boost(j_sg_cpu, , );
if (j_util * max > j_max * util) {
util = j_util;
max = j_max;
}
-
-   sugov_iowait_boost(j_sg_cpu, , );
}
 
return get_next_freq(sg_policy, util, max);


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Peter Zijlstra
On Wed, Dec 20, 2017 at 04:30:29PM +0100, Peter Zijlstra wrote:
> @@ -314,6 +315,9 @@ static unsigned int sugov_next_freq_shar
>   unsigned long j_util, j_max;
>   s64 delta_ns;
>  
> + if (j_sg_cpu != sg_cpu)
> + sugov_get_util(j_sg_cpu);
> +
>   /*
>* If the CFS CPU utilization was last updated before the
>* previous frequency update and the time elapsed between the
> @@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
>   if (delta_ns > TICK_NSEC) {
>   j_sg_cpu->iowait_boost = 0;
>   j_sg_cpu->iowait_boost_pending = false;
> - j_sg_cpu->util_cfs = 0;
> - if (j_sg_cpu->util_dl == 0)
> - continue;
>   }
> - if (j_sg_cpu->flags & SCHED_CPUFREQ_RT)
> - return policy->cpuinfo.max_freq;
>  
>   j_max = j_sg_cpu->max;
>   j_util = sugov_aggregate_util(j_sg_cpu);

The below makes more sense to me too; hmm?

@@ -335,12 +335,11 @@ static unsigned int sugov_next_freq_shar
 
j_max = j_sg_cpu->max;
j_util = sugov_aggregate_util(j_sg_cpu);
+   sugov_iowait_boost(j_sg_cpu, , );
if (j_util * max > j_max * util) {
util = j_util;
max = j_max;
}
-
-   sugov_iowait_boost(j_sg_cpu, , );
}
 
return get_next_freq(sg_policy, util, max);


Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Peter Zijlstra

So I ended up with the below (on top of Juri's cpufreq-dl patches).

It compiles, but that's about all the testing it had.

--- a/include/linux/sched/cpufreq.h
+++ b/include/linux/sched/cpufreq.h
@@ -8,9 +8,7 @@
  * Interface between cpufreq drivers and the scheduler:
  */
 
-#define SCHED_CPUFREQ_RT   (1U << 0)
-#define SCHED_CPUFREQ_DL   (1U << 1)
-#define SCHED_CPUFREQ_IOWAIT   (1U << 2)
+#define SCHED_CPUFREQ_IOWAIT   (1U << 0)
 
 #ifdef CONFIG_CPU_FREQ
 struct update_util_data {
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -63,7 +63,6 @@ struct sugov_cpu {
unsigned long util_cfs;
unsigned long util_dl;
unsigned long max;
-   unsigned int flags;
 
/* The field below is for single-CPU policies only. */
 #ifdef CONFIG_NO_HZ_COMMON
@@ -188,17 +187,23 @@ static void sugov_get_util(struct sugov_
 
 static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
 {
+   unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
+   struct rq *rq = cpu_rq(sg_cpu->cpu);
+
+   if (rq->rt.rt_nr_running)
+   util = sg_cpu->max;
+
/*
 * Ideally we would like to set util_dl as min/guaranteed freq and
 * util_cfs + util_dl as requested freq. However, cpufreq is not yet
 * ready for such an interface. So, we only do the latter for now.
 */
-   return min(sg_cpu->util_cfs + sg_cpu->util_dl, sg_cpu->max);
+   return min(util, sg_cpu->max);
 }
 
-static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time)
+static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, 
unsigned int flags)
 {
-   if (sg_cpu->flags & SCHED_CPUFREQ_IOWAIT) {
+   if (flags & SCHED_CPUFREQ_IOWAIT) {
if (sg_cpu->iowait_boost_pending)
return;
 
@@ -267,12 +272,11 @@ static void sugov_update_single(struct u
 {
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, 
update_util);
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
-   struct cpufreq_policy *policy = sg_policy->policy;
unsigned long util, max;
unsigned int next_f;
bool busy;
 
-   sugov_set_iowait_boost(sg_cpu, time);
+   sugov_set_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
 
if (!sugov_should_update_freq(sg_policy, time))
@@ -280,25 +284,22 @@ static void sugov_update_single(struct u
 
busy = sugov_cpu_is_busy(sg_cpu);
 
-   if (flags & SCHED_CPUFREQ_RT) {
-   next_f = policy->cpuinfo.max_freq;
-   } else {
-   sugov_get_util(sg_cpu);
-   max = sg_cpu->max;
-   util = sugov_aggregate_util(sg_cpu);
-   sugov_iowait_boost(sg_cpu, , );
-   next_f = get_next_freq(sg_policy, util, max);
-   /*
-* Do not reduce the frequency if the CPU has not been idle
-* recently, as the reduction is likely to be premature then.
-*/
-   if (busy && next_f < sg_policy->next_freq) {
-   next_f = sg_policy->next_freq;
+   sugov_get_util(sg_cpu);
+   max = sg_cpu->max;
+   util = sugov_aggregate_util(sg_cpu);
+   sugov_iowait_boost(sg_cpu, , );
+   next_f = get_next_freq(sg_policy, util, max);
+   /*
+* Do not reduce the frequency if the CPU has not been idle
+* recently, as the reduction is likely to be premature then.
+*/
+   if (busy && next_f < sg_policy->next_freq) {
+   next_f = sg_policy->next_freq;
 
-   /* Reset cached freq as next_freq has changed */
-   sg_policy->cached_raw_freq = 0;
-   }
+   /* Reset cached freq as next_freq has changed */
+   sg_policy->cached_raw_freq = 0;
}
+
sugov_update_commit(sg_policy, time, next_f);
 }
 
@@ -314,6 +315,9 @@ static unsigned int sugov_next_freq_shar
unsigned long j_util, j_max;
s64 delta_ns;
 
+   if (j_sg_cpu != sg_cpu)
+   sugov_get_util(j_sg_cpu);
+
/*
 * If the CFS CPU utilization was last updated before the
 * previous frequency update and the time elapsed between the
@@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
if (delta_ns > TICK_NSEC) {
j_sg_cpu->iowait_boost = 0;
j_sg_cpu->iowait_boost_pending = false;
-   j_sg_cpu->util_cfs = 0;
-   if (j_sg_cpu->util_dl == 0)
-   continue;
}
-   if (j_sg_cpu->flags & SCHED_CPUFREQ_RT)
-   return policy->cpuinfo.max_freq;
 
j_max = j_sg_cpu->max;
j_util = sugov_aggregate_util(j_sg_cpu);
@@ -357,17 +356,11 @@ static void 

Re: [PATCH v3 0/6] cpufreq: schedutil: fixes for flags updates

2017-12-20 Thread Peter Zijlstra

So I ended up with the below (on top of Juri's cpufreq-dl patches).

It compiles, but that's about all the testing it had.

--- a/include/linux/sched/cpufreq.h
+++ b/include/linux/sched/cpufreq.h
@@ -8,9 +8,7 @@
  * Interface between cpufreq drivers and the scheduler:
  */
 
-#define SCHED_CPUFREQ_RT   (1U << 0)
-#define SCHED_CPUFREQ_DL   (1U << 1)
-#define SCHED_CPUFREQ_IOWAIT   (1U << 2)
+#define SCHED_CPUFREQ_IOWAIT   (1U << 0)
 
 #ifdef CONFIG_CPU_FREQ
 struct update_util_data {
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -63,7 +63,6 @@ struct sugov_cpu {
unsigned long util_cfs;
unsigned long util_dl;
unsigned long max;
-   unsigned int flags;
 
/* The field below is for single-CPU policies only. */
 #ifdef CONFIG_NO_HZ_COMMON
@@ -188,17 +187,23 @@ static void sugov_get_util(struct sugov_
 
 static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
 {
+   unsigned long util = sg_cpu->util_cfs + sg_cpu->util_dl;
+   struct rq *rq = cpu_rq(sg_cpu->cpu);
+
+   if (rq->rt.rt_nr_running)
+   util = sg_cpu->max;
+
/*
 * Ideally we would like to set util_dl as min/guaranteed freq and
 * util_cfs + util_dl as requested freq. However, cpufreq is not yet
 * ready for such an interface. So, we only do the latter for now.
 */
-   return min(sg_cpu->util_cfs + sg_cpu->util_dl, sg_cpu->max);
+   return min(util, sg_cpu->max);
 }
 
-static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time)
+static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time, 
unsigned int flags)
 {
-   if (sg_cpu->flags & SCHED_CPUFREQ_IOWAIT) {
+   if (flags & SCHED_CPUFREQ_IOWAIT) {
if (sg_cpu->iowait_boost_pending)
return;
 
@@ -267,12 +272,11 @@ static void sugov_update_single(struct u
 {
struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, 
update_util);
struct sugov_policy *sg_policy = sg_cpu->sg_policy;
-   struct cpufreq_policy *policy = sg_policy->policy;
unsigned long util, max;
unsigned int next_f;
bool busy;
 
-   sugov_set_iowait_boost(sg_cpu, time);
+   sugov_set_iowait_boost(sg_cpu, time, flags);
sg_cpu->last_update = time;
 
if (!sugov_should_update_freq(sg_policy, time))
@@ -280,25 +284,22 @@ static void sugov_update_single(struct u
 
busy = sugov_cpu_is_busy(sg_cpu);
 
-   if (flags & SCHED_CPUFREQ_RT) {
-   next_f = policy->cpuinfo.max_freq;
-   } else {
-   sugov_get_util(sg_cpu);
-   max = sg_cpu->max;
-   util = sugov_aggregate_util(sg_cpu);
-   sugov_iowait_boost(sg_cpu, , );
-   next_f = get_next_freq(sg_policy, util, max);
-   /*
-* Do not reduce the frequency if the CPU has not been idle
-* recently, as the reduction is likely to be premature then.
-*/
-   if (busy && next_f < sg_policy->next_freq) {
-   next_f = sg_policy->next_freq;
+   sugov_get_util(sg_cpu);
+   max = sg_cpu->max;
+   util = sugov_aggregate_util(sg_cpu);
+   sugov_iowait_boost(sg_cpu, , );
+   next_f = get_next_freq(sg_policy, util, max);
+   /*
+* Do not reduce the frequency if the CPU has not been idle
+* recently, as the reduction is likely to be premature then.
+*/
+   if (busy && next_f < sg_policy->next_freq) {
+   next_f = sg_policy->next_freq;
 
-   /* Reset cached freq as next_freq has changed */
-   sg_policy->cached_raw_freq = 0;
-   }
+   /* Reset cached freq as next_freq has changed */
+   sg_policy->cached_raw_freq = 0;
}
+
sugov_update_commit(sg_policy, time, next_f);
 }
 
@@ -314,6 +315,9 @@ static unsigned int sugov_next_freq_shar
unsigned long j_util, j_max;
s64 delta_ns;
 
+   if (j_sg_cpu != sg_cpu)
+   sugov_get_util(j_sg_cpu);
+
/*
 * If the CFS CPU utilization was last updated before the
 * previous frequency update and the time elapsed between the
@@ -327,12 +331,7 @@ static unsigned int sugov_next_freq_shar
if (delta_ns > TICK_NSEC) {
j_sg_cpu->iowait_boost = 0;
j_sg_cpu->iowait_boost_pending = false;
-   j_sg_cpu->util_cfs = 0;
-   if (j_sg_cpu->util_dl == 0)
-   continue;
}
-   if (j_sg_cpu->flags & SCHED_CPUFREQ_RT)
-   return policy->cpuinfo.max_freq;
 
j_max = j_sg_cpu->max;
j_util = sugov_aggregate_util(j_sg_cpu);
@@ -357,17 +356,11 @@ static void