On Mon, Jan 29, 2018 at 07:39:15PM -0800, Joel Fernandes wrote:

> > @@ -6081,7 +6086,7 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
> >
> >                 for_each_cpu(cpu, cpu_smt_mask(core)) {
> >                         cpumask_clear_cpu(cpu, cpus);
> > -                       if (!idle_cpu(cpu))
> > +                       if (!idle_cpu(cpu) || !full_capacity(cpu))
> >                                 idle = false;
> >                 }
> 
> There's some difference in logic between select_idle_core() and
> select_idle_cpu() as far as the full_capacity stuff you're adding
> goes. In select_idle_core(), if all CPUs are !full_capacity, you
> return -1. But in select_idle_cpu() you return the idle CPU with the
> most capacity among the !full_capacity ones. Why is there this
> difference in logic? Did I miss something?

select_idle_core() wants to find a whole core that's idle; the way he
changed it, we'll not consider a core idle if one (or more) of the
siblings has a heavy IRQ load.

select_idle_cpu() just wants an idle (logical) CPU, and here it looks
for the one with the most remaining capacity: an idle CPU that isn't
at full capacity can still take the task as a fallback, it just runs
it slower.
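
Something like this is the shape of it (sketch only; best_cap and
best_cpu are illustrative names, not necessarily what the patch uses,
and the surrounding setup is elided):

	/* select_idle_core(): every sibling must be idle _and_ at full
	 * capacity, otherwise the whole core is rejected. */
	for_each_cpu(cpu, cpu_smt_mask(core)) {
		if (!idle_cpu(cpu) || !full_capacity(cpu))
			idle = false;
	}

	/* select_idle_cpu(): any idle CPU will do; a capacity-pressed
	 * one is remembered as a fallback rather than rejected. */
	for_each_cpu_wrap(cpu, sched_domain_span(sd), target) {
		if (!idle_cpu(cpu))
			continue;
		if (full_capacity(cpu))
			return cpu;
		if (capacity_of(cpu) > best_cap) {
			best_cap = capacity_of(cpu);
			best_cpu = cpu;
		}
	}
	return best_cpu;	/* still -1 if nothing was idle */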

> >
> > @@ -6102,7 +6107,8 @@ static int select_idle_core(struct task_struct *p, struct sched_domain *sd, int
> >   */
> >  static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int target)
> >  {
> > -       int cpu;
> > +       int cpu, rcpu = -1;
> > +       unsigned long max_cap = 0;
> >
> >         if (!static_branch_likely(&sched_smt_present))
> >                 return -1;
> > @@ -6110,11 +6116,13 @@ static int select_idle_smt(struct task_struct *p, struct sched_domain *sd, int t
> >         for_each_cpu(cpu, cpu_smt_mask(target)) {
> >                 if (!cpumask_test_cpu(cpu, &p->cpus_allowed))
> >                         continue;
> > -               if (idle_cpu(cpu))
> > -                       return cpu;
> > +               if (idle_cpu(cpu) && (capacity_of(cpu) > max_cap)) {
> > +                       max_cap = capacity_of(cpu);
> > +                       rcpu = cpu;
> 
> At the SMT level, do you need to bother with choosing the best
> capacity among threads? If RT is eating into one SMT thread's
> underlying capacity, it would eat into the other's as well. I'm
> wondering what the benefit of doing this here is.

It's about latency mostly, I think; scheduling on the other sibling
gets you running sooner -- the core will interleave the SMT threads
and you don't suffer the interrupt load _as_bad_.

If people really cared about their RT workload, they would not allow
regular tasks on its siblings in any case.
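Userspace can already do that with cpusets, the isolcpus= boot
parameter, or plain sched_setaffinity(); a trivial sketch, with
made-up CPU numbers:

	#define _GNU_SOURCE
	#include <sched.h>
	#include <stdio.h>

	int main(void)
	{
		cpu_set_t set;

		CPU_ZERO(&set);
		CPU_SET(0, &set);	/* confine this task to CPUs 0-1, */
		CPU_SET(1, &set);	/* away from the RT core's siblings */

		if (sched_setaffinity(0, sizeof(set), &set))	/* pid 0 == self */
			perror("sched_setaffinity");

		/* ... regular (non-RT) work runs here ... */
		return 0;
	}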
