On 19/01/26 5:13 pm, Peter Zijlstra wrote:
On Mon, Jan 19, 2026 at 04:17:40PM +0530, Vishal Chourasia wrote:
Expedite synchronize_rcu() during the cpuhp_smt_[enable|disable] path to
accelerate the operation.

Bulk CPU hotplug operations, such as switching SMT modes across all
cores, hotplug multiple CPUs in rapid succession. On large systems this
takes significant time, and the delay grows with the number of CPUs
involved in the SMT switch, leading to substantial delays on
high-core-count machines. Analysis [1] reveals that the majority of
this time is spent waiting for synchronize_rcu().
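
For context, the grace-period cost here comes from the hotplug write
lock: every cpu_up()/cpu_down() takes cpu_hotplug_lock, which is a
percpu_rw_semaphore, for writing, and the first writer must wait out an
RCU grace period before readers are excluded. A rough sketch of that
path follows (simplified; one_hotplug_op() is a hypothetical wrapper
for illustration, not a real kernel function):

/* Simplified write-lock path; not the literal kernel source. */
static void one_hotplug_op(unsigned int cpu)
{
        cpus_write_lock();      /* percpu_down_write(&cpu_hotplug_lock)      */
                                /*  -> rcu_sync_enter(&cpu_hotplug_lock.rss) */
                                /*  -> waits out an RCU grace period         */

        /* ... bring the CPU up or down ... */

        cpus_write_unlock();    /* rcu_sync_exit(): schedules a return to    */
                                /* the reader fast path, so the next writer  */
                                /* typically pays a fresh grace period       */
}

Toggling SMT across N secondary threads therefore pays for roughly N
back-to-back grace-period waits.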

You seem to have left out all the useful bits from your changelog again
:/

Anyway, ISTR Joel posted a patch hoisting a lock; it was icky, but not
something we can't live with either.

Also, my memory got jogged, and I think something like the below will
remove 2/3 of your rcu woes as well.

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 8df2d773fe3b..1365c19444b2 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2669,6 +2669,7 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
        int cpu, ret = 0;

        cpu_maps_update_begin();
+       rcu_sync_enter(&cpu_hotplug_lock.rss);
        for_each_online_cpu(cpu) {
                if (topology_is_primary_thread(cpu))
                        continue;
@@ -2698,6 +2699,7 @@ int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
        }
        if (!ret)
                cpu_smt_control = ctrlval;
+       rcu_sync_exit(&cpu_hotplug_lock.rss);
        cpu_maps_update_done();
        return ret;
}
@@ -2715,6 +2717,7 @@ int cpuhp_smt_enable(void)
        int cpu, ret = 0;

        cpu_maps_update_begin();
+       rcu_sync_enter(&cpu_hotplug_lock.rss);
        cpu_smt_control = CPU_SMT_ENABLED;
        for_each_present_cpu(cpu) {
                /* Skip online CPUs and CPUs on offline nodes */
@@ -2728,6 +2731,7 @@ int cpuhp_smt_enable(void)
                /* See comment in cpuhp_smt_disable() */
                cpuhp_online_cpu_device(cpu);
        }
+       rcu_sync_exit(&cpu_hotplug_lock.rss);
        cpu_maps_update_done();
        return ret;
}
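
Context for the diff above: rcu_sync_enter() and rcu_sync_exit() nest
by reference count, so holding an extra reference across the whole loop
pins cpu_hotplug_lock.rss in its grace-period-passed state. Each
per-CPU operation still does cpus_write_lock()/cpus_write_unlock()
internally, but those nested enter/exit calls no longer wait. A minimal
sketch of the batching pattern (simplified, not the literal kernel
code):

        rcu_sync_enter(&cpu_hotplug_lock.rss);  /* one grace-period wait, up front */

        for_each_online_cpu(cpu) {
                cpus_write_lock();      /* nested rcu_sync_enter(): count > 0, no wait */
                /* ... offline or online one CPU ... */
                cpus_write_unlock();    /* nested rcu_sync_exit(): count stays > 0 */
        }

        rcu_sync_exit(&cpu_hotplug_lock.rss);   /* fast path restored via call_rcu() */

This turns one grace-period wait per CPU into a single wait for the
whole batch; the remaining synchronize_rcu() time presumably comes from
other grace-period waits in the hotplug path, hence the "2/3" estimate
above.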


Hi,

I verified this patch using the configuration described below.
Configuration:
    • Kernel version: 6.19.0-rc6
    • Number of CPUs: 1536

Earlier verification of an older version of this patch was performed on a system with *2048 CPUs*. Due to system unavailability, the current verification was carried out on a *different system*.


Using this setup, I evaluated the patch with both SMT enabled and SMT disabled. The patch shows a significant improvement in the SMT=off case and a measurable improvement in the SMT=on case. The results also show that system time is noticeably higher when SMT is enabled; with SMT disabled, no significant increase in system time is observed.

SMT=ON  -> sys 50m42.805s
SMT=OFF -> sys 0m0.064s


SMT Mode    | Without Patch    | With Patch   | % Improvement   |
------------------------------------------------------------------
SMT=off     | 20m 32.210s      |  5m 30.898s  | +73.15%         |
SMT=on      | 62m 46.549s      | 55m 45.671s  | +11.18%         |


Please add the tag below:
Tested-by: Samir M <[email protected]>

Regards,
Samir

