Hi Vincent,

On 22/10/20 14:43, Vincent Guittot wrote:
> During fast wakeup path, scheduler always check whether local or prev cpus
> are good candidates for the task before looking for other cpus in the
> domain. With
>   commit b7a331615d25 ("sched/fair: Add asymmetric CPU capacity wakeup scan")
> the heterogenous system gains a dedicated path but doesn't try to keep
> reusing prev cpu whenever possible. If the previous cpu is idle and belong to 
> the
> asymmetric domain, we should check it 1st before looking for another cpu
> because it stays one of the best candidate and it stabilizes task placement
> on the system.
>
> This change aligns asymmetric path behavior with symmetric one and reduces
> cases where the task migrates across all cpus of the sd_asym_cpucapacity
> domains at wakeup.
>
> This change does not impact normal EAS mode but only the overloaded case or
> when EAS is not used.
>
> On hikey960 with performance governor (EAS disable)
>
> ./perf bench sched pipe -T -l 150000
>              mainline           w/ patch
> # migrations   299811                  3

Colour me impressed!

Now AFAICT the only thing that makes new_cpu != prev_cpu in
select_task_rq_fair() is the WAKE_AFFINE stuff, and the likelihood of that
happening increases when WF_SYNC (which the Android binder uses, at least
on a mainline tree). I had severely underestimated how often that thing
picks this_cpu.

> ops/sec        154535(+/-0.13%)   181754(+/- 0.29) +17%
>
> Fixes: b7a331615d25 ("sched/fair: Add asymmetric CPU capacity wakeup scan")
> Signed-off-by: Vincent Guittot <[email protected]>
> ---
>  kernel/sched/fair.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index aa4c6227cd6d..f39638fe6b94 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -6170,7 +6170,7 @@ static int select_idle_cpu(struct task_struct *p, 
> struct sched_domain *sd, int t
>   * maximize capacity.
>   */
>  static int
> -select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int 
> target)
> +select_idle_capacity(struct task_struct *p, struct sched_domain *sd, int 
> prev, int target)
>  {
>       unsigned long best_cap = 0;
>       int cpu, best_cpu = -1;
> @@ -6178,9 +6178,22 @@ select_idle_capacity(struct task_struct *p, struct 
> sched_domain *sd, int target)
>
>       sync_entity_load_avg(&p->se);
>
> +     if ((available_idle_cpu(target) || sched_idle_cpu(target)) &&
> +         task_fits_capacity(p, capacity_of(target)))
> +             return target;
> +

I think we still need to check for CPU affinity here.

>       cpus = this_cpu_cpumask_var_ptr(select_idle_mask);
>       cpumask_and(cpus, sched_domain_span(sd), p->cpus_ptr);
>
> +     /*
> +      * If the previous CPU belongs to this asymmetric domain and is idle,
> +      * check it 1st as it's the best candidate.
> +      */
> +     if (prev != target && cpumask_test_cpu(prev, cpus) &&
> +         (available_idle_cpu(prev) || sched_idle_cpu(prev)) &&
> +         task_fits_capacity(p, capacity_of(prev)))
> +             return prev;
> +
>       for_each_cpu_wrap(cpu, cpus, target) {

So we prioritize target over prev, like the rest of the
select_idle_sibling() family. Here however we apply the same acceptability
function to target, prev and the loop body, so perhaps we could simplify
this to:

  if (accept(target))
      return target;

  ...

  for_each_cpu_wrap(cpu, cpus, prev) {
      ...
  }

That way we evaluate target twice only if it isn't a direct candidate
(but might be a fallback one).

>               unsigned long cpu_cap = capacity_of(cpu);
>
> @@ -6223,7 +6236,7 @@ static int select_idle_sibling(struct task_struct *p, 
> int prev, int target)
>               if (!sd)
>                       goto symmetric;
>
> -             i = select_idle_capacity(p, sd, target);
> +             i = select_idle_capacity(p, sd, prev, target);
>               return ((unsigned)i < nr_cpumask_bits) ? i : target;
>       }

Reply via email to