On 2021/2/24 16:15, Aubrey Li wrote:
> A long-tail load balance cost is observed on the newly idle path,
> caused by a race window between the first nr_running check of the
> busiest runqueue and its nr_running recheck in detach_tasks.
>
> Before the busiest runqueue is locked, the tasks on the busiest
> runqueue could be pulled by other CPUs, and nr_running of the busiest
> runqueue becomes 1, or even 0 if the running task becomes idle. This
> causes detach_tasks to break out with the LBF_ALL_PINNED flag set,
> and triggers a load_balance redo at the same sched_domain level.
>
> In order to find the new busiest sched_group and CPU, load balance
> will recompute and update the various load statistics, which
> eventually leads to the long-tail load balance cost.
>
> This patch clears the LBF_ALL_PINNED flag for this race condition,
> and hence reduces the long-tail cost of newly idle balance.
Ping...

> Cc: Vincent Guittot <vincent.guit...@linaro.org>
> Cc: Mel Gorman <mgor...@techsingularity.net>
> Cc: Andi Kleen <a...@linux.intel.com>
> Cc: Tim Chen <tim.c.c...@linux.intel.com>
> Cc: Srinivas Pandruvada <srinivas.pandruv...@linux.intel.com>
> Cc: Rafael J. Wysocki <rafael.j.wyso...@intel.com>
> Signed-off-by: Aubrey Li <aubrey...@linux.intel.com>
> ---
>  kernel/sched/fair.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 04a3ce2..5c67804 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7675,6 +7675,15 @@ static int detach_tasks(struct lb_env *env)
>
>  	lockdep_assert_held(&env->src_rq->lock);
>
> +	/*
> +	 * Source run queue has been emptied by another CPU, clear
> +	 * LBF_ALL_PINNED flag as we will not test any task.
> +	 */
> +	if (env->src_rq->nr_running <= 1) {
> +		env->flags &= ~LBF_ALL_PINNED;
> +		return 0;
> +	}
> +
>  	if (env->imbalance <= 0)
>  		return 0;
>