On 02/09/2018 11:41 AM, Vincent Guittot wrote:
> On 8 February 2018 at 20:21, Valentin Schneider
> <valentin.schnei...@arm.com> wrote:
>> On 02/08/2018 01:36 PM, Vincent Guittot wrote:
>>> On 8 February 2018 at 13:46, Valentin Schneider
>>> <valentin.schnei...@arm.com> wrote:
>>>> On 02/06/2018 07:23 PM, Vincent Guittot wrote:
>>>>> [...]
>
>>
>> In summary:
>>
>> 20 iterations per test case
>> All non-mentioned CPUs are idling
>>
>> ---------------------
>> kick_ilb() test case:
>> ---------------------
>>
>> a, b = 100% rt-app tasks
>> - = idling
>>
>>           Accumulating load before sleeping
>>               ^
>>               ^
>> CPU1| a a a - - - a
>> CPU2| - - b b b b b
>>               v
>>               v > Periodically kicking ILBs to update nohz blocked load
>>
>> Baseline:
>> _nohz_idle_balance() takes 39µs in average
>> nohz_balance_enter_idle() takes 233ns in average
>>
>> W/ cpumask:
>> _nohz_idle_balance() takes 33µs in average
>> nohz_balance_enter_idle() takes 283ns in average
>>
>> Diff:
>> _nohz_idle_balance() -6µs in average (-16%)
>> nohz_balance_enter_idle() +50ns in average (+21%)
>
> In your use case, there is no contention when accessing the cpumask.
> Have you tried a use case with tasks that wake ups and go back to idle
> simultaneously on several/all cpus so they will fight to update the
> atomic resources ?
> That would be interesting to see the impact on the runtime of the
> nohz_balance_enter_idle function

No, I haven't tried that yet. For now these tests only cover the "best
case" scenario, since all but one CPU is idle. I've been meaning to test
busier scenarios, so I'll give your idle/sleep storm a try. Thanks for
the suggestion.

I also need to work on a test case for the load_balance() call in
idle_balance(). As Peter mentioned, the clearing of has_blocked in
update_sd_lb_stats() can only be done with atomic ops, so that's another
thing to profile against the baseline.
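
To give a rough idea of what the idle/sleep storm suggested above would
stress, here is a minimal userspace sketch. It is not kernel code and it
does not measure nohz_balance_enter_idle() itself: it only times
contended atomic set/clear operations on a shared 64-bit mask. The names
idle_mask, storm() and run_storm() are made up for this illustration,
and the threads are not started in lockstep, so the amount of contention
is only approximate.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <time.h>

#define MAX_THREADS 64	/* one bit per thread in a 64-bit mask */

/* Hypothetical userspace stand-in for a shared "idle CPUs" mask. */
static _Atomic uint64_t idle_mask;

struct worker {
	int id;		/* bit index owned by this thread */
	long iters;	/* number of set/clear pairs to perform */
	double avg_ns;	/* out: mean cost of one atomic update */
};

static double now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec * 1e9 + (double)ts.tv_nsec;
}

/* Each thread repeatedly "enters idle" (set bit) and "exits idle" (clear bit). */
static void *storm(void *arg)
{
	struct worker *w = arg;
	uint64_t bit = UINT64_C(1) << w->id;
	double start = now_ns();

	for (long i = 0; i < w->iters; i++) {
		atomic_fetch_or(&idle_mask, bit);
		atomic_fetch_and(&idle_mask, ~bit);
	}
	/* Two atomic updates per iteration; loop overhead is included. */
	w->avg_ns = (now_ns() - start) / (2.0 * (double)w->iters);
	return NULL;
}

/* Returns the mean ns per atomic update across threads, or -1.0 on error. */
double run_storm(int nthreads, long iters)
{
	pthread_t tid[MAX_THREADS];
	struct worker w[MAX_THREADS];
	double sum = 0.0;
	int started = 0;

	if (nthreads < 1 || nthreads > MAX_THREADS || iters < 1)
		return -1.0;

	atomic_store(&idle_mask, 0);
	for (int i = 0; i < nthreads; i++) {
		w[i] = (struct worker){ .id = i, .iters = iters, .avg_ns = 0.0 };
		if (pthread_create(&tid[i], NULL, storm, &w[i]) != 0)
			break;
		started++;
	}
	for (int i = 0; i < started; i++) {
		pthread_join(tid[i], NULL);
		sum += w[i].avg_ns;
	}
	/* Every thread clears its own bit last, so the mask must be empty. */
	assert(atomic_load(&idle_mask) == 0);
	return started == nthreads ? sum / nthreads : -1.0;
}
```

Comparing run_storm(1, N) against run_storm(<number of CPUs>, N) should
hint at how much the per-update cost grows once several threads fight
over the same cache line. Build with -pthread. The absolute numbers will
of course not translate directly to the kernel paths discussed here.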