> On Jan 19, 2026, at 7:07 PM, Paul E. McKenney <[email protected]> wrote:
> 
> On Tue, Jan 20, 2026 at 12:53:26AM +0100, Frederic Weisbecker wrote:
>> On Mon, Jan 19, 2026 at 06:12:22PM -0500, Joel Fernandes wrote:
>>> During callback overload (exceeding qhimark), the NOCB code attempts
>>> opportunistic advancement via rcu_advance_cbs_nowake(). Analysis shows
>>> this entire code path is dead:
>>> 
>>> - 30 overload conditions triggered with 300,000 callback flood
>>> - 0 advancements actually occurred
>>> - 100% of time blocked because current GP not done
>>> 
>>> The overload condition triggers when callbacks are coming in at a high
>>> rate with GPs not completing as fast. But the advancement requires the
>>> GP to be complete - a logical contradiction. Even if the GP did complete
>>> in time, nocb_gp_wait() has to wake up anyway to do the advancement, so
>>> it is pointless.
>>> 
>>> Since the advancement is dead code, the entire overload handling block
>>> serves no purpose. Remove it entirely.
>>> 
>>> Suggested-by: Frederic Weisbecker <[email protected]>
>>> Signed-off-by: Joel Fernandes <[email protected]>
>> 
>> Reviewed-by: Frederic Weisbecker <[email protected]>
>> 
>> Would be nice to have Paul's ack as well, in case we missed something subtle
>> here.
> 
> Given that you are good with it, I will take a look.  And test it.  ;-)

Sure, thanks!

>> Also probably for upcoming merge window + 1, note that similar code with
>> similar removal opportunity resides in rcu_nocb_try_bypass().
>> And ->nocb_gp_adv_time could then be removed.
> 
> Further simplification sounds like a good thing!  Just not too simple,
> you understand!  ;-)

Yes, I have some more queued in my local tree, planned for merge window + 1.
:-)

By the way, I had another idea recently: why don't we enable nocb poll mode
automatically under overload? Currently rcu_nocb_poll is set only via the boot
parameter and stays constant afterwards. My testing shows that poll mode can
make GPs complete faster during overload, so dynamically enabling it when we
exceed qhimark could be beneficial. The question then is how to turn it off
dynamically as well: perhaps when the callback count drops below qlowmark, with
some debounce logic to avoid toggling too frequently?
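
To make the toggling question concrete, here is a minimal user-space sketch of
one possible hysteresis scheme. It is not kernel code and not a proposed patch:
struct poll_cfg, struct poll_state, poll_update() and hold_ticks are all
hypothetical names invented for illustration, and only the qhimark/qlowmark
idea comes from the discussion above. The assumption is that polling switches
on immediately once the queue length exceeds the high mark, and switches off
only after the length has stayed below the low mark for a full hold interval.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tunables; not actual kernel symbols. */
struct poll_cfg {
	long qhimark;             /* enable polling above this length */
	long qlowmark;            /* candidate for disabling below this length */
	unsigned long hold_ticks; /* time the length must stay low before disabling */
};

/* Hypothetical per-group state. */
struct poll_state {
	bool polling;              /* is poll mode currently on? */
	bool below_valid;          /* is below_since meaningful? */
	unsigned long below_since; /* tick at which the length first dropped low */
};

/*
 * Feed one observation (queue length at time "now") into the state
 * machine and return whether poll mode should be on afterwards.
 */
static bool poll_update(struct poll_state *st, const struct poll_cfg *cfg,
			long len, unsigned long now)
{
	assert(cfg->qlowmark < cfg->qhimark);

	if (len > cfg->qhimark) {
		/* Overload: turn polling on right away, cancel any pending off. */
		st->polling = true;
		st->below_valid = false;
	} else if (st->polling && len < cfg->qlowmark) {
		if (!st->below_valid) {
			/* First low observation: start the debounce window. */
			st->below_since = now;
			st->below_valid = true;
		} else if (now - st->below_since >= cfg->hold_ticks) {
			/* Stayed low for the whole window: turn polling off. */
			st->polling = false;
			st->below_valid = false;
		}
	} else {
		/* Between the marks (or already off): restart the window. */
		st->below_valid = false;
	}
	return st->polling;
}
```

Because the off transition needs hold_ticks of uninterrupted low load, a queue
length that bounces around qlowmark keeps polling on rather than flipping the
mode on every observation. Whether a time-based window is the right debounce
for the real nocb GP kthreads is an open question.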

>                            Thanx, Paul

thanks,

 - Joel

