> On Jan 12, 2026, at 4:44 AM, Vishal Chourasia <[email protected]> wrote:
>
> Bulk CPU hotplug operations—such as switching SMT modes across all
> cores—require hotplugging multiple CPUs in rapid succession. On large
> systems, this process takes significant time, increasing as the number
> of CPUs grows, leading to substantial delays on high-core-count
> machines. Analysis [1] reveals that the majority of this time is spent
> waiting for synchronize_rcu().
>
> Expedite synchronize_rcu() during the hotplug path to accelerate the
> operation. Since CPU hotplug is a user-initiated administrative task,
> it should complete as quickly as possible.

When does the user initiate this on your system? Hotplug should not be
happening that often to begin with; it is a slow path that depends on the
disruptive stop-machine mechanism.

>
> Performance data on a PPC64 system with 400 CPUs:
>
> + ppc64_cpu --smt=1 (SMT8 to SMT1)
> Before: real 1m14.792s
> After: real 0m03.205s # ~23x improvement
>
> + ppc64_cpu --smt=8 (SMT1 to SMT8)
> Before: real 2m27.695s
> After: real 0m02.510s # ~58x improvement

This does look compelling, but could you provide more information about
how this was tested? What does the ppc64_cpu binary do (how many hotplug
operations does it perform, how does the performance change with cycle
count, etc.)?

Can you also run rcutorture testing? Some of the scenarios, such as
TREE03, stress hotplug.

thanks,

 - Joel

>
> Above numbers were collected on Linux 6.19.0-rc4-00310-g755bc1335e3b
>
> [1]
> https://lore.kernel.org/all/5f2ab8a44d685701fe36cdaa8042a1aef215d10d.ca...@linux.vnet.ibm.com
>
> Signed-off-by: Vishal Chourasia <[email protected]>
> ---
> include/linux/rcupdate.h | 3 +++
> kernel/cpu.c | 2 ++
> 2 files changed, 5 insertions(+)
>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index c5b30054cd01..03c06cfb2b6d 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -1192,6 +1192,9 @@ rcu_head_after_call_rcu(struct rcu_head *rhp,
> rcu_callback_t f)
> extern int rcu_expedited;
> extern int rcu_normal;
>
> +extern void rcu_expedite_gp(void);
> +extern void rcu_unexpedite_gp(void);
> +
> DEFINE_LOCK_GUARD_0(rcu,
> do {
> rcu_read_lock();
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 8df2d773fe3b..6b0d491d73f4 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -506,12 +506,14 @@ EXPORT_SYMBOL_GPL(cpus_read_unlock);
>
> void cpus_write_lock(void)
> {
> + rcu_expedite_gp();
> percpu_down_write(&cpu_hotplug_lock);
> }
>
> void cpus_write_unlock(void)
> {
> percpu_up_write(&cpu_hotplug_lock);
> + rcu_unexpedite_gp();
> }
>
> void lockdep_assert_cpus_held(void)
> --
> 2.52.0
>
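
For readers following the quoted patch, here is a rough userspace model of the pairing it depends on: rcu_expedite_gp() and rcu_unexpedite_gp() act as a nestable request, so expediting stays in effect until every caller has dropped its request. This is only a sketch. All model_* names and the expedited_nesting counter are hypothetical stand-ins, the percpu rwsem is omitted entirely, and the real implementation in the kernel may differ in detail.

```c
/* Userspace model only; this is not kernel code. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a nesting count of expedite requests. */
static atomic_int expedited_nesting;

/* Models rcu_expedite_gp(): add one request for expedited grace periods. */
static void model_expedite_gp(void)
{
	atomic_fetch_add(&expedited_nesting, 1);
}

/* Models rcu_unexpedite_gp(): drop one request; calls must be balanced. */
static void model_unexpedite_gp(void)
{
	int prev = atomic_fetch_sub(&expedited_nesting, 1);

	assert(prev > 0); /* an unbalanced unexpedite would be a bug */
}

/* True while at least one request is outstanding. */
static bool model_gp_is_expedited(void)
{
	return atomic_load(&expedited_nesting) > 0;
}

/* Models the patched cpus_write_lock(); taking the lock itself is omitted. */
static void model_cpus_write_lock(void)
{
	model_expedite_gp();
}

/* Models the patched cpus_write_unlock(); releasing the lock is omitted. */
static void model_cpus_write_unlock(void)
{
	model_unexpedite_gp();
}
```

In this model, a hotplug write-side section that overlaps with another outstanding expedite request leaves grace periods expedited after model_cpus_write_unlock() returns, until the other request is dropped as well.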

