On Fri, Jul 03, 2026 at 02:11:42PM +0800, Jing Wu wrote:
> On Thu, Jul 02 2026 at 16:07, Paul E. McKenney wrote:
> > wouldn't it work better to just leave all CPUs in RCU-callbacks-offloaded
> > state?  Then you can adjust the nohz_full state of arbitrary CPUs without
> > messing with RCU.
> [...]
> > a continuous stream of race-condition bugs inspired the current state,
> > which is to allow this state to change only for offline CPUs.
> 
> Thanks Paul.  That is appealing, and we would much rather not wade into
> the online offload-switching races you describe.
> 
> Let me lay out the one tension it creates on our side and ask how you and
> Frederic would like it resolved.
> 
> DHM's aim is to enable kernel-noise isolation purely at runtime, on
> machines that did not pass nohz_full= / rcu_nocbs= at boot.  "Leave all
> CPUs offloaded" needs the candidate CPUs to be in rcu_nocb_mask, which is
> only populated at boot.  So the RCU part seems to come down to two options:
> 
>   (a) Accept a boot hint: require rcu_nocbs= (or nohz_full=) to cover the
>       set of CPUs that may later be isolated.  RCU is then never touched at
>       runtime, exactly as you suggest.  tick / timer / managed_irq /
>       watchdog stay fully runtime-adjustable, so the "no boot parameter"
>       property holds for everything except RCU offloading.
> 
>   (b) Change the offload state at runtime with no boot hint, which is
>       precisely the online-switching problem you and Frederic hit, and what
>       Thomas's lightweight-offloaded + CPUHP_AP_RCU_SYNC sketch would need
>       to make cheap and race-free.
> 
> We would lean towards (a) as the pragmatic first step: it keeps RCU out of
> the runtime path entirely, per your recommendation, and only asks the admin
> who wants runtime RCU-noise isolation to declare the candidate CPUs at boot.
> (b) / Thomas's mechanism could be a separate, later effort if a truly
> boot-parameter-free RCU story turns out to be wanted.
> 
> Does scoping the RCU part to (a) sound acceptable to you and Frederic?  If
> so, we will drop runtime nocb toggling from DHM entirely and just document
> the rcu_nocbs= expectation, leaving the other housekeeping types runtime
> adjustable.

For the time being, I will defer to Frederic on this one.

His point about interrupt handlers invoking call_rcu() is a caution.  ;-)

                                                        Thanx, Paul
