On Mon, Oct 27, 2025 at 06:06:32PM +0100, Thomas Gleixner wrote:
> On Wed, Oct 22 2025 at 20:13, Pingfan Liu wrote:
> > The previous patch lifted the deadline bandwidth check during the kexec
>
> Once this is applied 'The previous patch' is meaningless.
>

I will rephrase it.

> > process, which raises a potential issue: as the number of online CPUs
> > decreases, DL tasks may be crowded onto a few CPUs, which may starve the
> > CPU hotplug kthread. As a result, the hot-removal cannot proceed in
> > practice. On the other hand, as CPUs are offlined one by one, all tasks
> > will eventually be migrated to the kexec CPU.
> >
> > Therefore, this patch marks all other CPUs as inactive to signal the
>
> git grep "This patch" Documentation/process/
>

I will rephrase it.

> > scheduler to migrate tasks to the kexec CPU during hot-removal.
>
> I'm not seeing what this solves. It just changes the timing of moving
> tasks off to the boot CPU where they compete for the CPU for nothing.
>
> When kexec() is in progress, then running user space tasks at all is a
> completely pointless exercise.
>
> So the obvious solution to the problem is to freeze all user space tasks

I agree, but what about a less intrusive approach? Simply stopping the DL
tasks should suffice, as everything works correctly without them. I have
a draft patch ready. Let's discuss it and go from there.

> when kexec() is invoked. No horrible hacks in the deadline scheduler and
> elsewhere required to make that work. No?
>

To clarify, skipping the dl_bw_deactivate() validation is necessary
because that validation otherwise prevents CPU hot-removal.

Thanks,

Pingfan
