On 6/12/19 9:32 PM, Rik van Riel wrote:
> Flatten the hierarchical runqueues into just the per CPU rq.cfs runqueue.
> 
> Iteration of the sched_entity hierarchy is rate limited to once per jiffy
> per sched_entity, which is a smaller change than it seems, because load
> average adjustments were already rate limited to once per jiffy before this
> patch series.
> 
> This patch breaks CONFIG_CFS_BANDWIDTH. The plan for that is to park tasks
> from throttled cgroups onto their cgroup runqueues, and slowly (using the
> GENTLE_FAIR_SLEEPERS logic) wake them back up, in vruntime order, once the
> cgroup gets unthrottled, to prevent thundering herd issues.
> 
> Signed-off-by: Rik van Riel <[email protected]>
> ---
>  include/linux/sched.h |   2 +
>  kernel/sched/fair.c   | 478 +++++++++++++++++-------------------------
>  kernel/sched/pelt.c   |   6 +-
>  kernel/sched/pelt.h   |   2 +-
>  kernel/sched/sched.h  |   2 +-
>  5 files changed, 194 insertions(+), 296 deletions(-)
> 

[...]

> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c

[...]

> @@ -3491,7 +3544,7 @@ static inline bool update_load_avg(struct cfs_rq *cfs_rq, struct sched_entity *s
>        * track group sched_entity load average for task_h_load calc in migration
>        */
>       if (se->avg.last_update_time && !(flags & SKIP_AGE_LOAD))
> -             updated = __update_load_avg_se(now, cfs_rq, se);
> +             updated = __update_load_avg_se(now, cfs_rq, se, curr, curr);

I wonder whether task migration still works correctly here.

migrate_task_rq_fair(p, ...) -> remove_entity_load_avg(&p->se) would use
cfs_rq = se->cfs_rq (i.e. the root cfs_rq), so the removed load (and util)
will not propagate through the taskgroup hierarchy.

[...]
