On Tue, Mar 19, 2019 at 09:00:05AM -0400, Phil Auld wrote:
> sched/fair: Limit sched_cfs_period_timer loop to avoid hard lockup
>
> With extremely short cfs_period_us setting on a parent task group with a large
> number of children the for loop in sched_cfs_period_timer can run until the
> watchdog fires. There is no guarantee that the call to hrtimer_forward_now()
> will ever return 0. The large number of children can make
> do_sched_cfs_period_timer() take longer than the period.
>
> To prevent this we add protection to the loop that detects when the loop has
> run too many times and scales the period and quota up, proportionally, so that
> the timer can complete before the next period expires. This preserves the
> relative runtime quota while preventing the hard lockup.
>
> A warning is issued reporting this state and the new values.
>
> v2: Math reworked/simplified by Peter Zijlstra.
>
> Signed-off-by: Phil Auld <[email protected]>
> Cc: Ben Segall <[email protected]>
> Cc: Ingo Molnar <[email protected]>
> Cc: Peter Zijlstra (Intel) <[email protected]>
> Cc: Anton Blanchard <[email protected]>

Thanks!
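
For anyone following along, the proportional scaling described in the
changelog can be sketched roughly as below. This is a standalone userspace
illustration, not the patch itself: struct bandwidth, scale_bandwidth(), the
147/128 growth factor and the 1 s cap are all assumptions made for the
example, not values taken from the message above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants, chosen only for illustration. */
#define SCALE_NUM      147u            /* 147/128 is roughly a 15% increase */
#define SCALE_DEN      128u
#define MAX_PERIOD_NS  1000000000ull   /* assumed upper bound of 1 s */

/* Hypothetical stand-in for the period/quota pair of a task group. */
struct bandwidth {
	uint64_t period_ns;
	uint64_t quota_ns;
};

/*
 * Grow the period by a fixed factor (capped at MAX_PERIOD_NS) and scale
 * the quota by the same ratio, so quota/period stays approximately the
 * same. Assumes quota_ns and period_ns are each at most about 1e9, so
 * the intermediate product fits in 64 bits.
 */
static void scale_bandwidth(struct bandwidth *b)
{
	uint64_t old = b->period_ns;
	uint64_t scaled;

	assert(old > 0);	/* avoid dividing by zero below */

	scaled = old * SCALE_NUM / SCALE_DEN;
	if (scaled > MAX_PERIOD_NS)
		scaled = MAX_PERIOD_NS;

	b->period_ns = scaled;
	b->quota_ns = b->quota_ns * scaled / old;
}
```

A caller would invoke something like scale_bandwidth() only after the timer
loop has iterated more than some threshold number of times in a single
callback, and would log the new period and quota at that point. Integer
division means the ratio is preserved only approximately.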

