On 2020/7/23 12:23, benbjiang(蒋彪) wrote:
> Hi,
>> On Jul 23, 2020, at 11:35 AM, Li, Aubrey <[email protected]> wrote:
>>
>> On 2020/7/23 10:42, benbjiang(蒋彪) wrote:
>>> Hi,
>>>
>>>> On Jul 23, 2020, at 9:57 AM, Li, Aubrey <[email protected]> wrote:
>>>>
>>>> On 2020/7/22 22:32, benbjiang(蒋彪) wrote:
>>>>> Hi,
>>>>>
>>>>>> On Jul 22, 2020, at 8:13 PM, Li, Aubrey <[email protected]> wrote:
>>>>>>
>>>>>> On 2020/7/22 16:54, benbjiang(蒋彪) wrote:
>>>>>>> Hi, Aubrey,
>>>>>>>
>>>>>>>> On Jul 1, 2020, at 5:32 AM, Vineeth Remanan Pillai <[email protected]> wrote:
>>>>>>>>
>>>>>>>> From: Aubrey Li <[email protected]>
>>>>>>>>
>>>>>>>> - Don't migrate if there is a cookie mismatch
>>>>>>>>   Load balance tries to move task from busiest CPU to the
>>>>>>>>   destination CPU. When core scheduling is enabled, if the
>>>>>>>>   task's cookie does not match with the destination CPU's
>>>>>>>>   core cookie, this task will be skipped by this CPU. This
>>>>>>>>   mitigates the forced idle time on the destination CPU.
>>>>>>>>
>>>>>>>> - Select cookie matched idle CPU
>>>>>>>>   In the fast path of task wakeup, select the first cookie matched
>>>>>>>>   idle CPU instead of the first idle CPU.
>>>>>>>>
>>>>>>>> - Find cookie matched idlest CPU
>>>>>>>>   In the slow path of task wakeup, find the idlest CPU whose core
>>>>>>>>   cookie matches with task's cookie.
>>>>>>>>
>>>>>>>> - Don't migrate task if cookie not match
>>>>>>>>   For the NUMA load balance, don't migrate task to the CPU whose
>>>>>>>>   core cookie does not match with task's cookie.
>>>>>>>>
>>>>>>>> Signed-off-by: Aubrey Li <[email protected]>
>>>>>>>> Signed-off-by: Tim Chen <[email protected]>
>>>>>>>> Signed-off-by: Vineeth Remanan Pillai <[email protected]>
>>>>>>>> ---
>>>>>>>> kernel/sched/fair.c  | 64 ++++++++++++++++++++++++++++++++++++++++----
>>>>>>>> kernel/sched/sched.h | 29 ++++++++++++++++++++
>>>>>>>> 2 files changed, 88 insertions(+), 5 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>>>>> index d16939766361..33dc4bf01817 100644
>>>>>>>> --- a/kernel/sched/fair.c
>>>>>>>> +++ b/kernel/sched/fair.c
>>>>>>>> @@ -2051,6 +2051,15 @@ static void task_numa_find_cpu(struct task_numa_env *env,
>>>>>>>> 		if (!cpumask_test_cpu(cpu, env->p->cpus_ptr))
>>>>>>>> 			continue;
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_SCHED_CORE
>>>>>>>> +		/*
>>>>>>>> +		 * Skip this cpu if source task's cookie does not match
>>>>>>>> +		 * with CPU's core cookie.
>>>>>>>> +		 */
>>>>>>>> +		if (!sched_core_cookie_match(cpu_rq(cpu), env->p))
>>>>>>>> +			continue;
>>>>>>>> +#endif
>>>>>>>> +
>>>>>>>> 		env->dst_cpu = cpu;
>>>>>>>> 		if (task_numa_compare(env, taskimp, groupimp, maymove))
>>>>>>>> 			break;
>>>>>>>> @@ -5963,11 +5972,17 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>>>>>>>>
>>>>>>>> 	/* Traverse only the allowed CPUs */
>>>>>>>> 	for_each_cpu_and(i, sched_group_span(group), p->cpus_ptr) {
>>>>>>>> +		struct rq *rq = cpu_rq(i);
>>>>>>>> +
>>>>>>>> +#ifdef CONFIG_SCHED_CORE
>>>>>>>> +		if (!sched_core_cookie_match(rq, p))
>>>>>>>> +			continue;
>>>>>>>> +#endif
>>>>>>>> +
>>>>>>>> 		if (sched_idle_cpu(i))
>>>>>>>> 			return i;
>>>>>>>>
>>>>>>>> 		if (available_idle_cpu(i)) {
>>>>>>>> -			struct rq *rq = cpu_rq(i);
>>>>>>>> 			struct cpuidle_state *idle = idle_get_state(rq);
>>>>>>>> 			if (idle && idle->exit_latency < min_exit_latency) {
>>>>>>>> 				/*
>>>>>>>> @@ -6224,8 +6239,18 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>>>>>>>> 	for_each_cpu_wrap(cpu, cpus, target) {
>>>>>>>> 		if (!--nr)
>>>>>>>> 			return -1;
>>>>>>>> -		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
>>>>>>>> -			break;
>>>>>>>> +
>>>>>>>> +		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) {
>>>>>>>> +#ifdef CONFIG_SCHED_CORE
>>>>>>>> +			/*
>>>>>>>> +			 * If Core Scheduling is enabled, select this cpu
>>>>>>>> +			 * only if the process cookie matches core cookie.
>>>>>>>> +			 */
>>>>>>>> +			if (sched_core_enabled(cpu_rq(cpu)) &&
>>>>>>>> +			    p->core_cookie == cpu_rq(cpu)->core->core_cookie)
>>>>>>> Why not also add similar logic in select_idle_smt to reduce forced-idle? :)
>>>>>> We hit select_idle_smt after we scanned the entire LLC domain for idle cores
>>>>>> and idle cpus and failed, so IMHO, an idle smt is probably a good choice under
>>>>>> this scenario.
>>>>>
>>>>> AFAIC, selecting an idle sibling with an unmatched cookie will cause unnecessary
>>>>> forced idle, unfairness and latency, compared to choosing the *target* cpu.
>>>> Choosing the target cpu could increase the number of runnable tasks on the target
>>>> runqueue; this could trigger the busiest->nr_running > 1 logic and make the idle
>>>> sibling try to pull but not succeed (due to the cookie mismatch). Putting the task
>>>> on the idle sibling is relatively stable IMHO.
>>>
>>> I'm afraid that *unsuccessful* pulls between smts would not result in instability,
>>> because load balancing always runs periodically, and an unsuccessful pull means
>>> nothing happens.
>> Unsuccessful pulling means more unnecessary overhead in load balancing.
>>
>>> On the contrary, unmatched sibling tasks running concurrently could bring forced
>>> idle to each other repeatedly, which is more unstable, and more costly when
>>> pick_next_task runs for all siblings.
>> Not worse than two tasks ping-ponging on the same target run queue I guess, and better if
>> - task1 (cookie A) is running on the target, and task2 (cookie B) is in the runqueue,
>> - task3 (cookie B) is coming.
>>
>> If task3 chooses the target's sibling, it has a chance to run concurrently with task2.
>> But if task3 chooses the target, it has to wait for the next lucky pull of the load balancer.
> That's more interesting. :)
> Distributing different cookie tasks onto different cpus (or cpusets) could be
> the *ideal stable status* we want, as I understand it.
> Different cookie tasks running on sibling smts could hurt performance, and
> that should be avoided with best effort.
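
(Side note for anyone reading along: the quoted hunks rely on sched_core_cookie_match(),
which this patch adds in the kernel/sched/sched.h part that is not quoted above. From
memory it behaves roughly like the sketch below; the real helper may also treat a fully
idle core as a match, so take this as an approximation rather than the patch itself.)

#ifdef CONFIG_SCHED_CORE
/* Approximate behaviour of the helper added in sched.h (sketch only). */
static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
{
	/* Cookies are irrelevant if core scheduling is off on this rq. */
	if (!sched_core_enabled(rq))
		return true;

	/* Otherwise the task's cookie has to match the core's cookie. */
	return rq->core->core_cookie == p->core_cookie;
}
#endif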

We already try to avoid running unmatched cookie tasks on siblings when we scan
for idle cores and idle cpus in the LLC domain.
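
To make the select_idle_smt question concrete, the check being discussed would
presumably look something like the sketch below. This is hypothetical and not part
of this series; the function shape follows the current select_idle_smt() and the
cookie test reuses sched_core_cookie_match() from the sched.h hunk.

/*
 * Hypothetical cookie-aware select_idle_smt(), for discussion only --
 * not part of this series.
 */
static int select_idle_smt(struct task_struct *p, int target)
{
	int cpu;

	if (!static_branch_likely(&sched_smt_present))
		return -1;

	for_each_cpu(cpu, cpu_smt_mask(target)) {
		if (!cpumask_test_cpu(cpu, p->cpus_ptr))
			continue;
		if (!available_idle_cpu(cpu) && !sched_idle_cpu(cpu))
			continue;
#ifdef CONFIG_SCHED_CORE
		/* Skip a sibling whose core is running a different cookie. */
		if (!sched_core_cookie_match(cpu_rq(cpu), p))
			continue;
#endif
		return cpu;
	}

	return -1;
}

Whether skipping the sibling here is a win depends on the trade-off below: it avoids
forced idle on the sibling, but it also hides the imbalance from the load balancer.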

> For the above case, selecting the idle sibling cpu can indeed improve
> concurrency, but it decreases the imbalance seen by the load balancer.
> In that case, the load balancer would not notice the imbalance, and would do
> nothing to improve the unmatched situation.
> On the contrary, choosing the *target* cpu could increase the imbalance, and
> the load balancer could try to pull the unmatched task away,

Where the unmatched task would be pulled away to needs further elaboration.

> which could improve the unmatched situation and be helpful to reach the
> *ideal stable status*. Maybe that's what we expect. :)
>

If we limit ourselves to this one-core, two-sibling, three-task case, choosing the
idle sibling already is the ideal stable status, as it saves one lucky load-balancer
pull and a task migration.

Thanks,
-Aubrey

