On Tue, Feb 03, 2026 at 06:20:22PM -0800, Josh Poimboeuf wrote:
> On Mon, Feb 02, 2026 at 05:13:34PM +0800, Li Zhe wrote:
> > In the current KLP transition implementation, the strategy for running
> > tasks relies on waiting for a context switch to attempt to clear the
> > TIF_PATCH_PENDING flag. Alternatively, determine whether the
> > TIF_PATCH_PENDING flag can be cleared by inspecting the stack once the
> > process has yielded the CPU. However, this approach proves problematic
> > in certain environments.
> >
> > Consider a scenario where the majority of system CPUs are configured
> > with nohzfull and isolcpus, each dedicated to a VM with a vCPU pinned
> > to that physical core and configured with idle=poll within the guest.
> > Under such conditions, these vCPUs rarely leave the CPU. Combined with
> > the high core counts typical of modern server platforms, this results
> > in transition completion times that are not only excessively prolonged
> > but also highly unpredictable.
> >
> > This patch resolves this issue by registering a callback with
> > stop_machine. The callback attempts to transition the associated running
> > task. In a VM environment configured with 32 CPUs, the live patching
> > operation completes promptly after the SIGNALS_TIMEOUT period with this
> > patch applied; without it, the process nearly fails to complete under
> > the same scenario.
> >
> > Co-developed-by: Rui Qi <[email protected]>
> > Signed-off-by: Rui Qi <[email protected]>
> > Signed-off-by: Li Zhe <[email protected]>
>
> PeterZ, what's your take on this?
>
> I wonder if we could instead do resched_cpu() or something similar to
> trigger the call to klp_sched_try_switch() in __schedule()?

Yeah, this is broken. So the whole point of NOHZ_FULL is to not have the
CPU disturbed, *ever*. People are working really hard to remove any and
all disturbance from these CPUs with the eventual goal of making any
disturbance a fatal condition (userspace will get a fatal signal if
disturbed or so).

Explicitly adding disturbance to NOHZ_FULL is an absolute no-no.

NAK

There are two ways this can be solved:

 1) make it a user problem -- userspace wants to load kernel patch,
    userspace can force their QEMU or whatnot through a system call to
    make progress

 2) fix it properly and do it like the deferred IPI stuff; recognise
    that as long as the task is in userspace, it doesn't care about
    kernel text changes.

    https://lkml.kernel.org/r/[email protected]

While 2 sounds easy, the tricky part comes from the fact that you have
to deal with the task coming back to kernel space eventually, possibly
in the middle of your KLP patching. So you've got to do things like that
patch series above, and make sure the whole of KLP happens while the
other CPU is in USER/GUEST context, or that the CPU waits when it tries
to leave that context while things are in progress.
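
For illustration only, the basic idea behind option 2 can be modelled in
userspace C roughly as below. This is a toy sketch, not the linked series
and not kernel code: struct cpu_ctx, the CT_* values, try_defer_transition(),
kernel_entry() and kernel_exit() are all made-up names, and the state is
tracked per CPU purely for simplicity. It also leaves out the hard part
described above, namely making a CPU that enters the kernel wait while
patching is still in progress.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical context bits; illustrative only, not a kernel API. */
enum { CT_KERNEL = 0, CT_USER = 1, CT_WORK_PENDING = 2 };

struct cpu_ctx {
	atomic_int state;	/* CT_KERNEL, CT_USER or CT_USER | CT_WORK_PENDING */
	int patch_generation;	/* patch state this CPU has adopted so far */
};

/*
 * Patcher side: instead of interrupting the remote CPU, try to flag
 * deferred work. This only succeeds while the CPU is observed in
 * user/guest context; a CPU in the kernel must be handled another way.
 */
static bool try_defer_transition(struct cpu_ctx *c)
{
	int expected = CT_USER;

	if (atomic_compare_exchange_strong(&c->state, &expected,
					   CT_USER | CT_WORK_PENDING))
		return true;

	/* On failure, 'expected' holds the current state: already flagged? */
	return (expected & CT_WORK_PENDING) != 0;
}

/* Remote CPU side: pick up deferred work on the way into the kernel. */
static void kernel_entry(struct cpu_ctx *c, int target_generation)
{
	int old = atomic_exchange(&c->state, CT_KERNEL);

	if (old & CT_WORK_PENDING)
		c->patch_generation = target_generation;
}

/* Remote CPU side: return to user/guest context. */
static void kernel_exit(struct cpu_ctx *c)
{
	assert(atomic_load(&c->state) == CT_KERNEL);
	atomic_store(&c->state, CT_USER);
}
```

The point of the compare-and-exchange is that the patcher never touches
a CPU that is running in the kernel, and a CPU sitting in user or guest
context is not disturbed at all; it only pays the cost the next time it
enters the kernel on its own.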

