On Wed, Jul 01, 2026 at 02:56:34PM -0400, Waiman Long wrote:
> On 7/1/26 10:22 AM, Frederic Weisbecker wrote:
> > On Thu, Jun 25, 2026 at 01:27:54AM -0400, Waiman Long wrote:
> > > On 6/24/26 2:34 AM, Jing Wu wrote:
> > > > 3. Are there specific patches in your series where you would welcome
> > > > our contribution directly?
> > > I have broken down the shutdown callback into separate portions as
> > > suggested by Thomas. The other major change that I am working on is
> > > to try to shut down only to the CPUHP_AP_OFFLINE state instead of
> > > all the way down to CPUHP_OFFLINE.
> > What was the reason for that already? Can we perhaps ask the user to offline
> > the target CPUs before toggling isolation on them?
> The major problem about fully offlining the CPU is the CPU hotplug stop
> machine mechanism which puts all the CPUs except the CPU to be offlined in a
> waiting loop within the IPI handler when the offline CPU is transitioning
> from CPUHP_TEARDOWN_CPU to CPUHP_AP_IDLE_DEAD. If there is another active
> isolated partition running DPDK, for instance, it will break the low latency
> guarantee for a short duration.

Looks like a long-standing problem that concerns not only nohz_full but
also RT in general. I made a proposal a while ago to solve this:

    https://lore.kernel.org/lkml/[email protected]/

To summarize, we could remove that stop machine thing and have this on the
outgoing CPU at CPUHP_TEARDOWN_CPU:

    set_cpu_online(cpu, 0)
    synchronize_rcu()
    migrate things
    // call CPUHP_TEARDOWN_CPU -> CPUHP_AP_IDLE_DEAD

And on other CPUs the usual should work:

    preempt_disable() // could now be replaced with rcu_read_lock()
    if (cpu_online(target))
        // do things
    preempt_enable()

There are a few dragons on the way on the update side but nothing
unsolvable as far as I checked. Of course we must check all those
callbacks one by one.

Also on the read side we must be careful because:

    rcu_read_lock()
    A = cpu_online(target)
    B = cpu_online(target)
    rcu_read_unlock()

We can now have A && !B but I doubt many callsites do that.

> > > That will require some adjustments to the nohz_full related hotplug
> > > functions. I have some ideas of what needs to be done. However, I haven't
> > > looked into RCU yet. I know RCU supports changing the nocb mask for fully
> > > offline CPUs, I will need to find out if it is possible to do that for
> > > partially offline CPUs.
> > No because callbacks can still be enqueued at this stage. But we could
> > manage to make it work with CPUHP_AP_IDLE_DEAD.
>
> If we can only go as high as CPUHP_AP_IDLE_DEAD, we may as well go down all
> the way to CPUHP_OFFLINE as stop machine should be done at
> CPUHP_AP_IDLE_DEAD. In that case, we may have to break RCU out from
> HK_TYPE_KERNEL_NOISE and add a cpuset control switch for the system
> administrators to decide if they are willing to suffer a brief latency spike
> for an existing isolated partition or keep the RCU housekeeping mask
> unchanged to avoid that when creating a new or destroying an old isolated
> partition.

Halfway nohz_full doesn't sound good...

Thanks.

-- 
Frederic Weisbecker
SUSE Labs

