Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance

2021-08-28 Thread Vincent Guittot
On Fri, 27 Aug 2021 at 21:45, Ricardo Neri wrote:
>
> On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> > On Tue, 10 Aug 2021 at 16:41, Ricardo Neri wrote:
> > > @@ -9540,6 +9629,12 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > > nr_running == 1)
> > > continue;
> > >
> > > +   /* Make sure we only pull tasks from a CPU of lower priority */
> > > +   if ((env->sd->flags & SD_ASYM_PACKING) &&
> > > +   sched_asym_prefer(i, env->dst_cpu) &&
> > > +   nr_running == 1)
> > > +   continue;
> >
> > This really looks similar to the test above for SD_ASYM_CPUCAPACITY.
> > More generally speaking SD_ASYM_PACKING and SD_ASYM_CPUCAPACITY share
> > a lot of common policy and I wonder if at some point we could not
> > merge their behavior in LB
>
> I would like to confirm with you that you are not expecting this merge
> as part of this series, right?

Merging them will probably need more tests on both x86 and Arm so I
suppose that we could keep them separate for now
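
For reference, the existing SD_ASYM_CPUCAPACITY test in find_busiest_queue()
that I am comparing it with looks roughly like this (quoted from memory of
mainline fair.c, so the exact form may differ):

    /* from mainline find_busiest_queue(), roughly: */
    if (env->sd->flags & SD_ASYM_CPUCAPACITY &&
        !capacity_greater(capacity_of(env->dst_cpu), capacity) &&
        nr_running == 1)
            continue;

Both tests skip a runqueue with a single running task when pulling it would
move the task somewhere less suitable, which is why a common helper for the
two flags seems plausible in the longer term.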

Regards,
Vincent

>
> Thanks and BR,
> Ricardo


Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance

2021-08-27 Thread Ricardo Neri
On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> On Tue, 10 Aug 2021 at 16:41, Ricardo Neri wrote:
> > @@ -9540,6 +9629,12 @@ static struct rq *find_busiest_queue(struct lb_env *env,
> > nr_running == 1)
> > continue;
> >
> > +   /* Make sure we only pull tasks from a CPU of lower priority */
> > +   if ((env->sd->flags & SD_ASYM_PACKING) &&
> > +   sched_asym_prefer(i, env->dst_cpu) &&
> > +   nr_running == 1)
> > +   continue;
> 
> This really looks similar to the test above for SD_ASYM_CPUCAPACITY.
> More generally speaking SD_ASYM_PACKING and SD_ASYM_CPUCAPACITY share
> a lot of common policy and I wonder if at some point we could not
> merge their behavior in LB

I would like to confirm with you that you are not expecting this merge
as part of this series, right?

Thanks and BR,
Ricardo


Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance

2021-08-27 Thread Ricardo Neri
On Fri, Aug 27, 2021 at 05:17:22PM +0200, Vincent Guittot wrote:
> On Fri, 27 Aug 2021 at 16:50, Peter Zijlstra  wrote:
> >
> > On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> > > > +/**
> > > > + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> > > > + * @dst_cpu:   Destination CPU of the load balancing
> > > > + * @sds:   Load-balancing data with statistics of the local group
> > > > + * @sgs:   Load-balancing statistics of the candidate busiest group
> > > > + * @sg:The candidate busiest group
> > > > + *
> > > > + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> > > > + * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
> > > > + * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
> > > > + * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
> > > > + *
> > > > + * If both @dst_cpu and @sg have SMT siblings, even the number of idle CPUs
> > > > + * between @sds::local and @sg. Thus, pull tasks from @sg if the difference
> > > > + * between the number of busy CPUs is 2 or more. If the difference is of 1,
> > > > + * only pull if @dst_cpu has higher priority. If @sg does not have SMT siblings
> > > > + * only pull tasks if all of the SMT siblings of @dst_cpu are idle and @sg
> > > > + * has lower priority.
> > > > + */
> > > > +static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> > > > +   struct sg_lb_stats *sgs,
> > > > +   struct sched_group *sg)
> > > > +{
> > > > +#ifdef CONFIG_SCHED_SMT
> > > > +   bool local_is_smt, sg_is_smt;
> > > > +   int sg_busy_cpus;
> > > > +
> > > > +   local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> > > > +   sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> > > > +
> > > > +   sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
> > > > +
> > > > +   if (!local_is_smt) {
> > > > +   /*
> > > > +* If we are here, @dst_cpu is idle and does not have SMT
> > > > +* siblings. Pull tasks if candidate group has two or more
> > > > +* busy CPUs.
> > > > +*/
> > > > +   if (sg_is_smt && sg_busy_cpus >= 2)
> > > > +   return true;
> > > > +
> > > > +   /*
> > > > +* @dst_cpu does not have SMT siblings. @sg may have SMT
> > > > +* siblings and only one is busy. In such case, @dst_cpu
> > > > +* can help if it has higher priority and is idle.
> > > > +*/
> > > > +   return !sds->local_stat.group_util &&
> > >
> > > sds->local_stat.group_util can't be used to decide whether a CPU or group
> > > of CPUs is idle. util_avg is usually not zero when a CPU becomes idle,
> > > and it can take more than 300ms before it decays to zero.
> > > Conversely, the utilization of a CPU can be zero while a task with
> > > zero utilization has just woken up on it.
> > > Utilization reflects the average work of the CPU or group of
> > > CPUs, not its current state.
> >
> > If you want immediate idle, sgs->nr_running == 0 or sgs->idle_cpus ==
> > sgs->group_weight come to mind.
> 
> yes, I have the same in mind

Thank you very much Vincent and Peter for the feedback! I'll look at
using these stats to determine immediate idle.
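
A minimal, untested sketch of the direction I have in mind, replacing the
group_util test with the idle-CPU count (just to confirm I understood the
suggestion correctly):

        /*
         * Untested sketch: the local group is idle right now only if every
         * one of its CPUs is idle; this would replace the
         * !sds->local_stat.group_util test.
         */
        return sds->local_stat.idle_cpus == sds->local->group_weight &&
               sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);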

> 
> >
> > > > +  sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> > > > +   }
> > > > +
> > > > +   /* @dst_cpu has SMT siblings. */
> > > > +
> > > > +   if (sg_is_smt) {
> > > > +   int local_busy_cpus = sds->local->group_weight -
> > > > + sds->local_stat.idle_cpus;
> > > > +   int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
> > > > +
> > > > +   /* Local can always help to even the number of busy CPUs. */
> > >
> > > default behavior of the load balance already tries to even the number
> > > of idle CPUs.
> >
> > Right, but I suppose this is because we're trapped here and have to deal
> > with the SMT-SMT case too. Ricardo, can you clarify?
> 
> IIUC, this function is used when computing sg_lb_stats to set
> sgs->group_asym_packing, which is then used to set the group state to
> group_asym_packing and force asym migration.
> But if we only want to even out the number of busy CPUs between the groups,
> we should not need to set the group state to group_asym_packing.

Yes, what Vincent describes is the intent. Then I think it is probably
true that it is not necessary to even out the number of idle CPUs here.
> 
> >
> > > > +   if (busy_cpus_delta >= 2)
> > > > +   return true;
> > > > +
> > > > +   if (busy_cpus_delta == 1)
> > > > + 

Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance

2021-08-27 Thread Vincent Guittot
On Fri, 27 Aug 2021 at 16:50, Peter Zijlstra  wrote:
>
> On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> > > +/**
> > > + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> > > + * @dst_cpu:   Destination CPU of the load balancing
> > > + * @sds:   Load-balancing data with statistics of the local group
> > > + * @sgs:   Load-balancing statistics of the candidate busiest group
> > > + * @sg:The candidate busiest group
> > > + *
> > > + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> > > + * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
> > > + * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
> > > + * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
> > > + *
> > > + * If both @dst_cpu and @sg have SMT siblings, even the number of idle CPUs
> > > + * between @sds::local and @sg. Thus, pull tasks from @sg if the difference
> > > + * between the number of busy CPUs is 2 or more. If the difference is of 1,
> > > + * only pull if @dst_cpu has higher priority. If @sg does not have SMT siblings
> > > + * only pull tasks if all of the SMT siblings of @dst_cpu are idle and @sg
> > > + * has lower priority.
> > > + */
> > > +static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> > > +   struct sg_lb_stats *sgs,
> > > +   struct sched_group *sg)
> > > +{
> > > +#ifdef CONFIG_SCHED_SMT
> > > +   bool local_is_smt, sg_is_smt;
> > > +   int sg_busy_cpus;
> > > +
> > > +   local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> > > +   sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> > > +
> > > +   sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
> > > +
> > > +   if (!local_is_smt) {
> > > +   /*
> > > +* If we are here, @dst_cpu is idle and does not have SMT
> > > +* siblings. Pull tasks if candidate group has two or more
> > > +* busy CPUs.
> > > +*/
> > > +   if (sg_is_smt && sg_busy_cpus >= 2)
> > > +   return true;
> > > +
> > > +   /*
> > > +* @dst_cpu does not have SMT siblings. @sg may have SMT
> > > +* siblings and only one is busy. In such case, @dst_cpu
> > > +* can help if it has higher priority and is idle.
> > > +*/
> > > +   return !sds->local_stat.group_util &&
> >
> > sds->local_stat.group_util can't be used to decide whether a CPU or group
> > of CPUs is idle. util_avg is usually not zero when a CPU becomes idle,
> > and it can take more than 300ms before it decays to zero.
> > Conversely, the utilization of a CPU can be zero while a task with
> > zero utilization has just woken up on it.
> > Utilization reflects the average work of the CPU or group of
> > CPUs, not its current state.
>
> If you want immediate idle, sgs->nr_running == 0 or sgs->idle_cpus ==
> sgs->group_weight come to mind.

yes, I have the same in mind

>
> > > +  sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> > > +   }
> > > +
> > > +   /* @dst_cpu has SMT siblings. */
> > > +
> > > +   if (sg_is_smt) {
> > > +   int local_busy_cpus = sds->local->group_weight -
> > > + sds->local_stat.idle_cpus;
> > > +   int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
> > > +
> > > +   /* Local can always help to even the number of busy CPUs. */
> >
> > default behavior of the load balance already tries to even the number
> > of idle CPUs.
>
> Right, but I suppose this is because we're trapped here and have to deal
> with the SMT-SMT case too. Ricardo, can you clarify?

IIUC, this function is used when computing sg_lb_stats to set
sgs->group_asym_packing, which is then used to set the group state to
group_asym_packing and force asym migration.
But if we only want to even out the number of busy CPUs between the groups,
we should not need to set the group state to group_asym_packing.
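
To make that concrete, the flow I am describing is roughly the following
(simplified from my reading of the series, not the exact code):

        /* in update_sg_lb_stats(), for a candidate (non-local) group (simplified): */
        if (env->sd->flags & SD_ASYM_PACKING && env->idle != CPU_NOT_IDLE &&
            sched_asym(env, sds, sgs, group))
                sgs->group_asym_packing = 1;

        /* in group_classify(): */
        if (sgs->group_asym_packing)
                return group_asym_packing;      /* forces the asym migration path */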

>
> > > +   if (busy_cpus_delta >= 2)
> > > +   return true;
> > > +
> > > +   if (busy_cpus_delta == 1)
> > > +   return sched_asym_prefer(dst_cpu,
> > > +sg->asym_prefer_cpu);
> > > +
> > > +   return false;
> > > +   }
> > > +
> > > +   /*
> > > +* @sg does not have SMT siblings. Ensure that @sds::local does not end
> > > +* up with more than one busy SMT sibling and only pull tasks if there
> > > +* are no busy CPUs. As CPUs move in and out of idle state frequently,
> > > +* also check the group utilization to smooth the decision.

Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance

2021-08-27 Thread Peter Zijlstra
On Fri, Aug 27, 2021 at 12:13:42PM +0200, Vincent Guittot wrote:
> > +/**
> > + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> > + * @dst_cpu:   Destination CPU of the load balancing
> > + * @sds:   Load-balancing data with statistics of the local group
> > + * @sgs:   Load-balancing statistics of the candidate busiest group
> > + * @sg:The candidate busiest group
> > + *
> > + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> > + * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
> > + * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
> > + * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
> > + *
> > + * If both @dst_cpu and @sg have SMT siblings, even the number of idle CPUs
> > + * between @sds::local and @sg. Thus, pull tasks from @sg if the difference
> > + * between the number of busy CPUs is 2 or more. If the difference is of 1,
> > + * only pull if @dst_cpu has higher priority. If @sg does not have SMT siblings
> > + * only pull tasks if all of the SMT siblings of @dst_cpu are idle and @sg
> > + * has lower priority.
> > + */
> > +static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> > +   struct sg_lb_stats *sgs,
> > +   struct sched_group *sg)
> > +{
> > +#ifdef CONFIG_SCHED_SMT
> > +   bool local_is_smt, sg_is_smt;
> > +   int sg_busy_cpus;
> > +
> > +   local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> > +   sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> > +
> > +   sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
> > +
> > +   if (!local_is_smt) {
> > +   /*
> > +* If we are here, @dst_cpu is idle and does not have SMT
> > +* siblings. Pull tasks if candidate group has two or more
> > +* busy CPUs.
> > +*/
> > +   if (sg_is_smt && sg_busy_cpus >= 2)
> > +   return true;
> > +
> > +   /*
> > +* @dst_cpu does not have SMT siblings. @sg may have SMT
> > +* siblings and only one is busy. In such case, @dst_cpu
> > +* can help if it has higher priority and is idle.
> > +*/
> > +   return !sds->local_stat.group_util &&
> 
> sds->local_stat.group_util can't be used to decide whether a CPU or group
> of CPUs is idle. util_avg is usually not zero when a CPU becomes idle,
> and it can take more than 300ms before it decays to zero.
> Conversely, the utilization of a CPU can be zero while a task with
> zero utilization has just woken up on it.
> Utilization reflects the average work of the CPU or group of
> CPUs, not its current state.

If you want immediate idle, sgs->nr_running == 0 or sgs->idle_cpus ==
sgs->group_weight come to mind.

> > +  sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> > +   }
> > +
> > +   /* @dst_cpu has SMT siblings. */
> > +
> > +   if (sg_is_smt) {
> > +   int local_busy_cpus = sds->local->group_weight -
> > + sds->local_stat.idle_cpus;
> > +   int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
> > +
> > +   /* Local can always help to even the number of busy CPUs. */
> 
> default behavior of the load balance already tries to even the number
> of idle CPUs.

Right, but I suppose this is because we're trapped here and have to deal
with the SMT-SMT case too. Ricardo, can you clarify?

> > +   if (busy_cpus_delta >= 2)
> > +   return true;
> > +
> > +   if (busy_cpus_delta == 1)
> > +   return sched_asym_prefer(dst_cpu,
> > +sg->asym_prefer_cpu);
> > +
> > +   return false;
> > +   }
> > +
> > +   /*
> > +* @sg does not have SMT siblings. Ensure that @sds::local does not end
> > +* up with more than one busy SMT sibling and only pull tasks if there
> > +* are no busy CPUs. As CPUs move in and out of idle state frequently,
> > +* also check the group utilization to smooth the decision.
> > +*/
> > +   if (!sds->local_stat.group_util)
> 
> same comment as above about the meaning of group_util == 0
> 
> > +   return sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
> > +
> > +   return false;
> > +#else
> > +   /* Always return false so that callers deal with non-SMT cases. */
> > +   return false;
> > +#endif
> > +}
> > +
> >  static inline bool
> >  sched_asym(struct lb_env *env, struct sd_lb_stats *sds, struct sg_lb_stats *sgs,
> >struct sched_group *group)
> >  {
> > +   /* Only do SMT checks if either local or candidate have 

Re: [PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance

2021-08-27 Thread Vincent Guittot
On Tue, 10 Aug 2021 at 16:41, Ricardo Neri wrote:
>
> When deciding to pull tasks in ASYM_PACKING, it is necessary not only to
> check for the idle state of the destination CPU, dst_cpu, but also of
> its SMT siblings.
>
> If dst_cpu is idle but its SMT siblings are busy, performance suffers
> if it pulls tasks from a medium priority CPU that does not have SMT
> siblings.
>
> Implement asym_smt_can_pull_tasks() to inspect the state of the SMT
> siblings of both dst_cpu and the CPUs in the candidate busiest group.
>
> Cc: Aubrey Li 
> Cc: Ben Segall 
> Cc: Daniel Bristot de Oliveira 
> Cc: Dietmar Eggemann 
> Cc: Mel Gorman 
> Cc: Quentin Perret 
> Cc: Rafael J. Wysocki 
> Cc: Srinivas Pandruvada 
> Cc: Steven Rostedt 
> Cc: Tim Chen 
> Reviewed-by: Joel Fernandes (Google) 
> Reviewed-by: Len Brown 
> Signed-off-by: Ricardo Neri 
> ---
> Changes since v3:
>   * Removed the arch_asym_check_smt_siblings() hook. Discussions with the
> powerpc folks showed that this patch should not impact them. Also, more
> recent powerpc processors no longer use asym_packing. (PeterZ)
>   * Removed unnecessary local variable in asym_can_pull_tasks(). (Dietmar)
>   * Removed unnecessary check for local CPUs when the local group has zero
> utilization. (Joel)
>   * Renamed asym_can_pull_tasks() as asym_smt_can_pull_tasks() to reflect
> the fact that it deals with SMT cases.
>   * Made asym_smt_can_pull_tasks() return false for !CONFIG_SCHED_SMT so
> that callers can deal with non-SMT cases.
>
> Changes since v2:
>   * Reworded the commit message to reflect updates in code.
>   * Corrected misrepresentation of dst_cpu as the CPU doing the load
> balancing. (PeterZ)
>   * Removed call to arch_asym_check_smt_siblings() as it is now called in
> sched_asym().
>
> Changes since v1:
>   * Don't bailout in update_sd_pick_busiest() if dst_cpu cannot pull
> tasks. Instead, reclassify the candidate busiest group, as it
> may still be selected. (PeterZ)
>   * Avoid an expensive and unnecessary call to cpumask_weight() when
> determining if a sched_group is comprised of SMT siblings.
> (PeterZ).
> ---
>  kernel/sched/fair.c | 95 +
>  1 file changed, 95 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index dd411cefb63f..8a1a2a43732c 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -8531,10 +8531,99 @@ group_type group_classify(unsigned int imbalance_pct,
> return group_has_spare;
>  }
>
> +/**
> + * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
> + * @dst_cpu:   Destination CPU of the load balancing
> + * @sds:   Load-balancing data with statistics of the local group
> + * @sgs:   Load-balancing statistics of the candidate busiest group
> + * @sg:The candidate busiest group
> + *
> + * Check the state of the SMT siblings of both @sds::local and @sg and decide
> + * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
> + * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
> + * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
> + *
> + * If both @dst_cpu and @sg have SMT siblings, even the number of idle CPUs
> + * between @sds::local and @sg. Thus, pull tasks from @sg if the difference
> + * between the number of busy CPUs is 2 or more. If the difference is of 1,
> + * only pull if @dst_cpu has higher priority. If @sg does not have SMT siblings
> + * only pull tasks if all of the SMT siblings of @dst_cpu are idle and @sg
> + * has lower priority.
> + */
> +static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
> +   struct sg_lb_stats *sgs,
> +   struct sched_group *sg)
> +{
> +#ifdef CONFIG_SCHED_SMT
> +   bool local_is_smt, sg_is_smt;
> +   int sg_busy_cpus;
> +
> +   local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
> +   sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
> +
> +   sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
> +
> +   if (!local_is_smt) {
> +   /*
> +* If we are here, @dst_cpu is idle and does not have SMT
> +* siblings. Pull tasks if candidate group has two or more
> +* busy CPUs.
> +*/
> +   if (sg_is_smt && sg_busy_cpus >= 2)
> +   return true;
> +
> +   /*
> +* @dst_cpu does not have SMT siblings. @sg may have SMT
> +* siblings and only one is busy. In such case, @dst_cpu
> +* can help if it has higher priority and is idle.
> +*/
> +   return !sds->local_stat.group_util &&

sds->local_stat.group_util can't be used to decide whether a CPU or group
of CPUs is idle. util_avg is usually not zero when a CPU becomes idle,
and it can take more than 300ms before it decays to zero.

[PATCH v4 6/6] sched/fair: Consider SMT in ASYM_PACKING load balance

2021-08-10 Thread Ricardo Neri
When deciding to pull tasks in ASYM_PACKING, it is necessary not only to
check for the idle state of the destination CPU, dst_cpu, but also of
its SMT siblings.

If dst_cpu is idle but its SMT siblings are busy, performance suffers
if it pulls tasks from a medium priority CPU that does not have SMT
siblings.

Implement asym_smt_can_pull_tasks() to inspect the state of the SMT
siblings of both dst_cpu and the CPUs in the candidate busiest group.
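
To make the intended policy concrete, below is a small userspace model of the
decision (illustrative only, not the kernel code: the priorities are made-up
numbers and the "local group is idle" test is written as an idle-CPU count
rather than the group_util test used in the patch):

/* smt_pull_model.c - toy model of the asym_smt_can_pull_tasks() policy. */
#include <stdbool.h>
#include <stdio.h>

struct group {
    int weight;   /* number of CPUs in the group */
    int idle;     /* CPUs that are idle right now */
    bool smt;     /* true if the group spans the SMT siblings of one core */
    int prio;     /* asym_packing priority of its preferred CPU (made up) */
};

/* Stand-in for sched_asym_prefer(): a higher number means higher priority. */
static bool prefer(int a, int b)
{
    return a > b;
}

static bool can_pull(const struct group *local, const struct group *busiest)
{
    int busy = busiest->weight - busiest->idle;

    if (!local->smt) {
        /* dst has no siblings: help an SMT core with two or more busy CPUs... */
        if (busiest->smt && busy >= 2)
            return true;
        /* ...otherwise only if dst is fully idle and has higher priority. */
        return local->idle == local->weight &&
               prefer(local->prio, busiest->prio);
    }

    if (busiest->smt) {
        /* Both sides have SMT siblings: try to even out the busy-CPU counts. */
        int delta = busy - (local->weight - local->idle);

        if (delta >= 2)
            return true;
        if (delta == 1)
            return prefer(local->prio, busiest->prio);
        return false;
    }

    /* Busiest has no siblings: pull only if the whole local core is idle. */
    return local->idle == local->weight && prefer(local->prio, busiest->prio);
}

int main(void)
{
    /* dst_cpu is the idle sibling of a busy, higher-priority SMT core. */
    struct group local = { .weight = 2, .idle = 1, .smt = true, .prio = 2 };
    /* The busiest group is a lower-priority core without SMT, running one task. */
    struct group atom = { .weight = 1, .idle = 0, .smt = false, .prio = 1 };

    printf("pull onto the busy core's idle sibling? %s\n",
           can_pull(&local, &atom) ? "yes" : "no");
    return 0;
}

With these inputs the model answers "no": the single task is better off
running alone on the lower-priority core than sharing the higher-priority
core with its already-busy sibling, which is exactly the situation described
above.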

Cc: Aubrey Li 
Cc: Ben Segall 
Cc: Daniel Bristot de Oliveira 
Cc: Dietmar Eggemann 
Cc: Mel Gorman 
Cc: Quentin Perret 
Cc: Rafael J. Wysocki 
Cc: Srinivas Pandruvada 
Cc: Steven Rostedt 
Cc: Tim Chen 
Reviewed-by: Joel Fernandes (Google) 
Reviewed-by: Len Brown 
Signed-off-by: Ricardo Neri 
---
Changes since v3:
  * Removed the arch_asym_check_smt_siblings() hook. Discussions with the
powerpc folks showed that this patch should not impact them. Also, more
recent powerpc processors no longer use asym_packing. (PeterZ)
  * Removed unnecessary local variable in asym_can_pull_tasks(). (Dietmar)
  * Removed unnecessary check for local CPUs when the local group has zero
utilization. (Joel)
  * Renamed asym_can_pull_tasks() as asym_smt_can_pull_tasks() to reflect
the fact that it deals with SMT cases.
  * Made asym_smt_can_pull_tasks() return false for !CONFIG_SCHED_SMT so
that callers can deal with non-SMT cases.

Changes since v2:
  * Reworded the commit message to reflect updates in code.
  * Corrected misrepresentation of dst_cpu as the CPU doing the load
balancing. (PeterZ)
  * Removed call to arch_asym_check_smt_siblings() as it is now called in
sched_asym().

Changes since v1:
  * Don't bailout in update_sd_pick_busiest() if dst_cpu cannot pull
tasks. Instead, reclassify the candidate busiest group, as it
may still be selected. (PeterZ)
  * Avoid an expensive and unnecessary call to cpumask_weight() when
determining if a sched_group is comprised of SMT siblings.
(PeterZ).
---
 kernel/sched/fair.c | 95 +
 1 file changed, 95 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index dd411cefb63f..8a1a2a43732c 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8531,10 +8531,99 @@ group_type group_classify(unsigned int imbalance_pct,
return group_has_spare;
 }
 
+/**
+ * asym_smt_can_pull_tasks - Check whether the load balancing CPU can pull tasks
+ * @dst_cpu:   Destination CPU of the load balancing
+ * @sds:   Load-balancing data with statistics of the local group
+ * @sgs:   Load-balancing statistics of the candidate busiest group
+ * @sg:The candidate busiest group
+ *
+ * Check the state of the SMT siblings of both @sds::local and @sg and decide
+ * if @dst_cpu can pull tasks. If @dst_cpu does not have SMT siblings, it can
+ * pull tasks if two or more of the SMT siblings of @sg are busy. If only one
+ * CPU in @sg is busy, pull tasks only if @dst_cpu has higher priority.
+ *
+ * If both @dst_cpu and @sg have SMT siblings, even the number of idle CPUs
+ * between @sds::local and @sg. Thus, pull tasks from @sg if the difference
+ * between the number of busy CPUs is 2 or more. If the difference is of 1,
+ * only pull if @dst_cpu has higher priority. If @sg does not have SMT siblings
+ * only pull tasks if all of the SMT siblings of @dst_cpu are idle and @sg
+ * has lower priority.
+ */
+static bool asym_smt_can_pull_tasks(int dst_cpu, struct sd_lb_stats *sds,
+   struct sg_lb_stats *sgs,
+   struct sched_group *sg)
+{
+#ifdef CONFIG_SCHED_SMT
+   bool local_is_smt, sg_is_smt;
+   int sg_busy_cpus;
+
+   local_is_smt = sds->local->flags & SD_SHARE_CPUCAPACITY;
+   sg_is_smt = sg->flags & SD_SHARE_CPUCAPACITY;
+
+   sg_busy_cpus = sgs->group_weight - sgs->idle_cpus;
+
+   if (!local_is_smt) {
+   /*
+* If we are here, @dst_cpu is idle and does not have SMT
+* siblings. Pull tasks if candidate group has two or more
+* busy CPUs.
+*/
+   if (sg_is_smt && sg_busy_cpus >= 2)
+   return true;
+
+   /*
+* @dst_cpu does not have SMT siblings. @sg may have SMT
+* siblings and only one is busy. In such case, @dst_cpu
+* can help if it has higher priority and is idle.
+*/
+   return !sds->local_stat.group_util &&
+  sched_asym_prefer(dst_cpu, sg->asym_prefer_cpu);
+   }
+
+   /* @dst_cpu has SMT siblings. */
+
+   if (sg_is_smt) {
+   int local_busy_cpus = sds->local->group_weight -
+ sds->local_stat.idle_cpus;
+   int busy_cpus_delta = sg_busy_cpus - local_busy_cpus;
+
+   /* Local can always help to even the number of busy CPUs. */
+