On Fri, Apr 29, 2016 at 08:32:40PM +0100, Dietmar Eggemann wrote:
> From: Morten Rasmussen <morten.rasmus...@arm.com>
> 
> In calculate_imbalance() load_above_capacity currently has the unit
> [load] while it is used as being [load/capacity]. Not only is it wrong it
> also makes it unlikely that load_above_capacity is ever used as the
> subsequent code picks the smaller of load_above_capacity and the avg_load
> difference between the local and the busiest sched_groups.
> 
> This patch ensures that load_above_capacity has the right unit
> [load/capacity].
> 
> Signed-off-by: Morten Rasmussen <morten.rasmus...@arm.com>
> ---
>  kernel/sched/fair.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index bc19fee94f39..c066574cff04 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -7016,9 +7016,11 @@ static inline void calculate_imbalance(struct lb_env 
> *env, struct sd_lb_stats *s
>           local->group_type   == group_overloaded) {
>               load_above_capacity = busiest->sum_nr_running *
>                                       SCHED_LOAD_SCALE;

Given smpnice and cgroups; this starting point seems somewhat random.
Then again, without cgroup and assuming all tasks are nice-0 it is still
true. But I think the cgroup muck is rather universal these days.

The above SCHED_LOAD_SCALE is the 1 -> load metric, right?

> -             if (load_above_capacity > busiest->group_capacity)
> +             if (load_above_capacity > busiest->group_capacity) {
>                       load_above_capacity -= busiest->group_capacity;
> -             else
> +                     load_above_capacity *= SCHED_LOAD_SCALE;
> +                     load_above_capacity /= busiest->group_capacity;

And this one does load->capacity

> +             } else
>                       load_above_capacity = ~0UL;
>       }


Is there anything better we can do? I know there's a fair bit of cruft
in; but these assumptions that SCHED_LOAD_SCALE is one task, just bug
the hell out of me.

Would it make sense to make load_balance()->detach_tasks() ensure
busiest->sum_nr_running >= busiest->group_weight or so?

Reply via email to