On 26/08/2019 13:16, Liangyan wrote:
> do_sched_cfs_period_timer() will refill cfs_b runtime and call
> distribute_cfs_runtime to unthrottle cfs_rq, sometimes cfs_b->runtime
> will allocate all quota to one cfs_rq incorrectly, then other cfs_rqs
> attached to this cfs_b can't get runtime and will be throttled.
>
> We find that one throttled cfs_rq has non-negative
> cfs_rq->runtime_remaining and causes an unexpected cast from s64 to u64
> in snippet: distribute_cfs_runtime() {
> runtime = -cfs_rq->runtime_remaining + 1; }.
> The runtime here will change to a large number and consume all
> cfs_b->runtime in this cfs_b period.
>
> According to Ben Segall, the throttled cfs_rq can have
> account_cfs_rq_runtime called on it because it is throttled before
> idle_balance, and the idle_balance calls update_rq_clock to add time
> that is accounted to the task.
>
> This commit prevents cfs_rq from being assigned new runtime if it has been
> throttled until that distribute_cfs_runtime is called.
>
> Signed-off-by: Liangyan <liangyan.p...@linux.alibaba.com>
> Reviewed-by: Ben Segall <bseg...@google.com>
> Reviewed-by: Valentin Schneider <valentin.schnei...@arm.com>

@Peter/Ingo, if we care about it I believe it can't hurt to strap

  Cc: <sta...@vger.kernel.org>
  Fixes: d3d9dc330236 ("sched: Throttle entities exceeding their allowed bandwidth")

to the thing.
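
For anyone following along, the s64 to u64 cast described in the changelog can be reproduced in plain userspace C. This is only a rough sketch, not the kernel code: the typedefs stand in for the kernel's s64/u64, requested_runtime() mirrors the quoted expression, and grant_runtime() is a hypothetical helper that assumes the distribution loop clamps each request to what is left in the pool.

```c
#include <stdint.h>

/* Userspace stand-ins for the kernel's s64/u64 types (assumption). */
typedef int64_t s64;
typedef uint64_t u64;

/*
 * Mirrors the expression quoted in the changelog:
 *     runtime = -cfs_rq->runtime_remaining + 1;
 * The result is stored in an unsigned 64-bit variable, so a negative
 * intermediate value wraps around to a huge number.
 */
static u64 requested_runtime(s64 runtime_remaining)
{
	return (u64)(-runtime_remaining + 1);
}

/*
 * Hypothetical helper: grant min(request, pool), assuming the
 * distribution loop clamps each request to the remaining pool.
 */
static u64 grant_runtime(u64 pool, s64 runtime_remaining)
{
	u64 want = requested_runtime(runtime_remaining);

	return want > pool ? pool : want;
}
```

With a deficit of 5 (runtime_remaining == -5) the request is 6, as intended. With runtime_remaining == 3 the signed result is -2, which becomes UINT64_MAX - 1 once stored as u64, so grant_runtime() hands over the entire pool and nothing is left for the other cfs_rqs.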