On 2020/7/23 12:23, benbjiang(蒋彪) wrote:
> Hi,
>> On Jul 23, 2020, at 11:35 AM, Li, Aubrey <[email protected]> wrote:
>>
>> On 2020/7/23 10:42, benbjiang(蒋彪) wrote:
>>> Hi,
>>>
>>>> On Jul 23, 2020, at 9:57 AM, Li, Aubrey <[email protected]> wrote:
>>>>
>>>> On 2020/7/22 22:32, benbjiang(蒋彪) wrote:
>>>>> Hi,
>>>>>
>>>>>> On Jul 22, 2020, at 8:13 PM, Li, Aubrey <[email protected]> wrote:
>>>>>>
>>>>>> On 2020/7/22 16:54, benbjiang(蒋彪) wrote:
>>>>>>> Hi, Aubrey,
>>>>>>>
>>>>>>>> On Jul 1, 2020, at 5:32 AM, Vineeth Remanan Pillai <[email protected]> wrote:
>>>>>>>>
>>>>>>>> From: Aubrey Li <[email protected]>
>>>>>>>>
>>>>>>>> - Don't migrate if there is a cookie mismatch
>>>>>>>>   Load balance tries to move task from busiest CPU to the
>>>>>>>>   destination CPU. When core scheduling is enabled, if the
>>>>>>>>   task's cookie does not match with the destination CPU's
>>>>>>>>   core cookie, this task will be skipped by this CPU. This
>>>>>>>>   mitigates the forced idle time on the destination CPU.
>>>>>>>>
>>>>>>>> - Select cookie matched idle CPU
>>>>>>>>   In the fast path of task wakeup, select the first cookie matched
>>>>>>>>   idle CPU instead of the first idle CPU.
>>>>>>>>
>>>>>>>> - Find cookie matched idlest CPU
>>>>>>>>   In the slow path of task wakeup, find the idlest CPU whose core
>>>>>>>>   cookie matches with task's cookie.
>>>>>>>>
>>>>>>>> - Don't migrate task if cookie not match
>>>>>>>>   For the NUMA load balance, don't migrate task to the CPU whose
>>>>>>>>   core cookie does not match with task's cookie.
>>>>>>>>
>>>>>>>> Signed-off-by: Aubrey Li <[email protected]>
>>>>>>>> Signed-off-by: Tim Chen <[email protected]>
>>>>>>>> Signed-off-by: Vineeth Remanan Pillai <[email protected]>
>>>>>>>> ---
>>>>>>>> kernel/sched/fair.c  | 64 ++++++++++++++++++++++++++++++++++++++++----
>>>>>>>> kernel/sched/sched.h | 29 ++++++++++++++++++++
>>>>>>>> 2 files changed, 88 insertions(+), 5 deletions(-)
>>>>>>>>
>>>>>>>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>>>>>>>> index d16939766361..33dc4bf01817 100644
>>>>>>>> --- a/kernel/sched/fair.c
>>>>>>>> +++ b/kernel/sched/fair.c
>>>>>>>> @@ -2051,6 +2051,15 @@ static void task_numa_find_cpu(struct task_numa_env *env,
>>>>>>>> 		if (!cpumask_test_cpu(cpu, env->p->cpus_ptr))
>>>>>>>> 			continue;
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_SCHED_CORE
>>>>>>>> +		/*
>>>>>>>> +		 * Skip this cpu if source task's cookie does not match
>>>>>>>> +		 * with CPU's core cookie.
>>>>>>>> +		 */
>>>>>>>> +		if (!sched_core_cookie_match(cpu_rq(cpu), env->p))
>>>>>>>> +			continue;
>>>>>>>> +#endif
>>>>>>>> +
>>>>>>>> 		env->dst_cpu = cpu;
>>>>>>>> 		if (task_numa_compare(env, taskimp, groupimp, maymove))
>>>>>>>> 			break;
>>>>>>>> @@ -5963,11 +5972,17 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
>>>>>>>>
>>>>>>>> 	/* Traverse only the allowed CPUs */
>>>>>>>> 	for_each_cpu_and(i, sched_group_span(group), p->cpus_ptr) {
>>>>>>>> +		struct rq *rq = cpu_rq(i);
>>>>>>>> +
>>>>>>>> +#ifdef CONFIG_SCHED_CORE
>>>>>>>> +		if (!sched_core_cookie_match(rq, p))
>>>>>>>> +			continue;
>>>>>>>> +#endif
>>>>>>>> +
>>>>>>>> 		if (sched_idle_cpu(i))
>>>>>>>> 			return i;
>>>>>>>>
>>>>>>>> 		if (available_idle_cpu(i)) {
>>>>>>>> -			struct rq *rq = cpu_rq(i);
>>>>>>>> 			struct cpuidle_state *idle = idle_get_state(rq);
>>>>>>>> 			if (idle && idle->exit_latency < min_exit_latency) {
>>>>>>>> 				/*
>>>>>>>> @@ -6224,8 +6239,18 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>>>>>>>> 	for_each_cpu_wrap(cpu, cpus, target) {
>>>>>>>> 		if (!--nr)
>>>>>>>> 			return -1;
>>>>>>>> -		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu))
>>>>>>>> -			break;
>>>>>>>> +
>>>>>>>> +		if (available_idle_cpu(cpu) || sched_idle_cpu(cpu)) {
>>>>>>>> +#ifdef CONFIG_SCHED_CORE
>>>>>>>> +			/*
>>>>>>>> +			 * If Core Scheduling is enabled, select this cpu
>>>>>>>> +			 * only if the process cookie matches core cookie.
>>>>>>>> +			 */
>>>>>>>> +			if (sched_core_enabled(cpu_rq(cpu)) &&
>>>>>>>> +			    p->core_cookie == cpu_rq(cpu)->core->core_cookie)
>>>>>>> Why not also add similar logic in select_idle_smt to reduce forced-idle? :)
>>>>>> We hit select_idle_smt after we scanned the entire LLC domain for idle cores
>>>>>> and idle cpus and failed, so IMHO, an idle smt is probably a good choice under
>>>>>> this scenario.
>>>>>
>>>>> AFAIC, selecting an idle sibling with an unmatched cookie will cause unnecessary
>>>>> forced idle, unfairness and latency, compared to choosing the *target* cpu.
>>>> Choosing the target cpu could increase the number of runnable tasks on the target
>>>> runqueue; this could trigger the busiest->nr_running > 1 logic and make the idle
>>>> sibling try to pull but not succeed (due to the cookie mismatch). Putting the task
>>>> on the idle sibling is relatively stable IMHO.
>>>
>>> I'm afraid that *unsuccessful* pulls between smts would not result in instability,
>>> because load balancing always runs periodically, and an unsuccessful pull means
>>> nothing happens.
>> Unsuccessful pulling means more unnecessary overhead in load balancing.
>>
>>> On the contrary, unmatched sibling tasks running concurrently could bring forced
>>> idle to each other repeatedly, which is more unstable, and more costly when
>>> pick_next_task runs for all siblings.
>> Not worse than two tasks ping-ponging on the same target run queue I guess, and better if
>> - task1 (cookie A) is running on the target, and task2 (cookie B) is in the runqueue,
>> - task3 (cookie B) is coming.
>>
>> If task3 chooses the target's sibling, it has a chance to run concurrently with task2.
>> But if task3 chooses the target, it has to wait for the next lucky pull of the load balancer.
> That's more interesting. :)
> Distributing different cookie tasks onto different cpus (or cpusets) could be
> the *ideal stable status* we want, as I understand it.
> Different cookie tasks running on sibling smts could hurt performance, and
> that should be avoided with best effort.
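
(Side note for anyone reading along: the quoted hunks rely on sched_core_cookie_match(),
which this patch adds in the kernel/sched/sched.h part that is not quoted above. From
memory it behaves roughly like the sketch below; the real helper may also treat a fully
idle core as a match, so take this as an approximation rather than the patch itself.)

#ifdef CONFIG_SCHED_CORE
/* Approximate behaviour of the helper added in sched.h (sketch only). */
static inline bool sched_core_cookie_match(struct rq *rq, struct task_struct *p)
{
	/* Cookies are irrelevant if core scheduling is off on this rq. */
	if (!sched_core_enabled(rq))
		return true;

	/* Otherwise the task's cookie has to match the core's cookie. */
	return rq->core->core_cookie == p->core_cookie;
}
#endif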

We already try to avoid running unmatched cookie tasks on siblings when we scan
for idle cores and idle cpus in the LLC domain.
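
To make the select_idle_smt question concrete, the check being discussed would
presumably look something like the sketch below. This is hypothetical and not part
of this series; the function shape follows the current select_idle_smt() and the
cookie test reuses sched_core_cookie_match() from the sched.h hunk.

/*
 * Hypothetical cookie-aware select_idle_smt(), for discussion only --
 * not part of this series.
 */
static int select_idle_smt(struct task_struct *p, int target)
{
	int cpu;

	if (!static_branch_likely(&sched_smt_present))
		return -1;

	for_each_cpu(cpu, cpu_smt_mask(target)) {
		if (!cpumask_test_cpu(cpu, p->cpus_ptr))
			continue;
		if (!available_idle_cpu(cpu) && !sched_idle_cpu(cpu))
			continue;
#ifdef CONFIG_SCHED_CORE
		/* Skip a sibling whose core is running a different cookie. */
		if (!sched_core_cookie_match(cpu_rq(cpu), p))
			continue;
#endif
		return cpu;
	}

	return -1;
}

Whether skipping the sibling here is a win depends on the trade-off below: it avoids
forced idle on the sibling, but it also hides the imbalance from the load balancer.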

> For the above case, selecting the idle sibling cpu can indeed improve
> concurrency, but it decreases the imbalance seen by the load balancer.
> In that case, the load balancer would not notice the imbalance, and would do
> nothing to improve the unmatched situation.
> On the contrary, choosing the *target* cpu could increase the imbalance, and
> the load balancer could try to pull the unmatched task away,

Where the unmatched task would be pulled away to needs further elaboration.

> which could improve the unmatched situation and be helpful to reach the
> *ideal stable status*. Maybe that's what we expect. :)
>

If we limit ourselves to this one-core, two-sibling, three-task case, choosing the
idle sibling already is the ideal stable status, as it saves one lucky load-balancer
pull and a task migration.

Thanks,
-Aubrey

