On 2/6/2024 12:51 PM, Andrea Righi wrote:
> - stress-ng --matrix seems quite unpredictable to be used a benchmarks
> in this scenario (the bogo-ops/s are very susceptible to any kind of
> interference, so even if in the long runs NO_HZ_FULL still seems to
> provide some benefits looking at the average, we also need to
> consider that there might be a significant error in the measurements,
> standard deviation was pretty high)
>
Ack on the bogo-ops disclaimers, which are also mentioned in the stress-ng docs.
Agreed that a better metric for perf would be helpful.
I am assuming you also have RCU_NOCB enabled for this test?
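
To put a number on the spread Andrea mentions, something like the sketch below could report the mean, standard deviation and coefficient of variation across repeated runs. The helper name and the sample values are placeholders of mine for illustration, not measured results.

```python
import statistics


def summarize_runs(bogo_ops_per_sec):
    """Return (mean, sample stdev, coefficient of variation) for repeated runs."""
    mean = statistics.mean(bogo_ops_per_sec)
    stdev = statistics.stdev(bogo_ops_per_sec)  # sample stdev, needs >= 2 runs
    return mean, stdev, stdev / mean


# Placeholder values for illustration only, not real measurements.
samples = [100.0, 110.0, 90.0, 105.0, 95.0]
mean, stdev, cv = summarize_runs(samples)
print(f"mean={mean:.1f} bogo-ops/s  stdev={stdev:.1f}  cv={cv:.1%}")
```

If the coefficient of variation is comparable to the difference between configurations, the averages alone probably do not tell us much.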
> - fio doing short writes (in page cache) seems to perform like 2x
> better in terms of iops with nohz_full, respect to the other cases
> and it performs 2x slower with large IO writes (not sure why... need
> to investigate more)
This is interesting. It could be worth counting how many kernel entries/exits
occur for large IO vs small IO. I'd imagine for large IO we have fewer syscalls
and hence lower entry/exit overhead. But if there are more interrupts for
whatever reason with large IO, then that also implies more kernel entries/exits.
As Frederic was saying, NOHZ_FULL has higher overhead on kernel entry/exit.
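
For the interrupt side of that comparison, a rough sketch could diff two snapshots of /proc/interrupts taken before and after each fio run. The function names are hypothetical, and the parser assumes the usual layout (a header row of CPU columns, then one "label: counts... description" row per interrupt). It only covers interrupts, not syscalls.

```python
def parse_interrupts(text):
    """Map each interrupt label to its count summed over all CPUs."""
    lines = text.splitlines()
    if not lines:
        return {}
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not a "label: ..." row
        counts = []
        for token in rest.split()[:ncpus]:
            if not token.isdigit():
                break  # reached the description columns
            counts.append(int(token))
        totals[label.strip()] = sum(counts)
    return totals


def interrupt_delta(before, after):
    """Per-label increase between two snapshots, dropping unchanged labels."""
    return {
        label: count - before.get(label, 0)
        for label, count in after.items()
        if count != before.get(label, 0)
    }
```

Usage would be to read /proc/interrupts into a string before and after the workload, pass both through parse_interrupts(), and compare the interrupt_delta() output for the large-IO and small-IO cases.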
>
> - with lazy RCU enabled hrtimer_interrupt() takes like 2x more to
> return, respect to the other cases (is this expected?)
It depends on which hrtimer_interrupt() instance you mean. There must be several
in the trace due to unrelated timers. Are you saying this is the worst case, or
is it always 2x more? We do queue a timer for lazy RCU to flush the RCU work, but
it is set to 10 seconds and should be canceled most of the time (it's just a
guard rail). It is possible there is lock contention on ->nocb_gp_lock which is
causing the timer handler execution to be slow. We have several trace_rcu_nocb*
trace points, including for the timer. Perhaps you could enable those and we can
dig deeper?
Further, it would be interesting to see, via say function tracing, whether it is
only the hrtimer_interrupt() instances that also result in a call to
do_nocb_deferred_wakeup_timer() that are slow. That would confirm that it is the
lazy timer that is slow for you.
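
As a minimal sketch of that function-tracing setup, assuming tracefs is mounted at /sys/kernel/tracing, the script runs as root, and both functions are visible to ftrace on your build (the helper name is mine, not an existing tool):

```python
from pathlib import Path


def configure_function_tracer(tracefs_root, functions):
    """Restrict the ftrace function tracer to the given functions and start it."""
    root = Path(tracefs_root)
    (root / "tracing_on").write_text("0\n")  # pause while reconfiguring
    (root / "set_ftrace_filter").write_text(" ".join(functions) + "\n")
    (root / "current_tracer").write_text("function\n")
    (root / "tracing_on").write_text("1\n")  # resume tracing


# Functions of interest from this thread.
TRACED = ["hrtimer_interrupt", "do_nocb_deferred_wakeup_timer"]
```

Calling configure_function_tracer("/sys/kernel/tracing", TRACED) and then reading the "trace" file in the same directory should show, per CPU, whether the slow hrtimer_interrupt() instances are accompanied by do_nocb_deferred_wakeup_timer().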
The actual number of callbacks should not be causing specifically the
hrtimer_interrupt() to take too long to run, AFAICS. But RCU's lazy feature does
increase the number of timer interrupts.
Further still, whether to call it a problem depends on how long
hrtimer_interrupt() actually takes with lazy RCU, IMO. Some numbers with units
would be nice.
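
To separate "worst case" from "always 2x", the per-call durations could be summarized along these lines. This is only a sketch: it assumes the durations have already been extracted from the trace in microseconds, and the function names are hypothetical.

```python
import math
import statistics


def summarize_durations_us(durations_us):
    """Median, nearest-rank 99th percentile and maximum, all in microseconds."""
    ordered = sorted(durations_us)
    rank = math.ceil(0.99 * len(ordered))  # nearest-rank p99 (1-based rank)
    return {
        "median_us": statistics.median(ordered),
        "p99_us": ordered[rank - 1],
        "max_us": ordered[-1],
    }


def slowdown(baseline_us, lazy_us):
    """Ratio of each lazy-RCU statistic to the matching baseline statistic."""
    base = summarize_durations_us(baseline_us)
    lazy = summarize_durations_us(lazy_us)
    return {key: lazy[key] / base[key] for key in base}
```

A median ratio near 2 would suggest every instance is slower, while a ratio that only shows up in p99 or max would point at a few slow outliers instead.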
thanks,
- Joel