During discussion of the scheduler deadline bug [1], Pierre Gondois
pointed out a potential issue during kexec: as CPUs are unplugged, the
available DL bandwidth of the root domain gradually decreases. Once the
remaining bandwidth is insufficient, the overflow check fails, which
causes CPU hot-removal to fail and kexec to hang.

This can be reproduced by:

  chrt -d -T 1000000 -P 1000000 0 yes > /dev/null &
  kexec -e

This patch skips the DL bandwidth check if kexec is in progress.

[1]: https://lore.kernel.org/all/[email protected]/

Reported-by: Pierre Gondois <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Pingfan Liu <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Pierre Gondois <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Joel Granados <[email protected]>
To: [email protected]
To: [email protected]
---
 kernel/kexec_core.c     | 6 ++++++
 kernel/sched/deadline.c | 7 +++++++
 2 files changed, 13 insertions(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 31203f0bacafa..265de9d1ff5f5 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1183,7 +1183,13 @@ int kernel_kexec(void)
 	} else
 #endif
 	{
+		/*
+		 * CPU hot-removal path refers to kexec_in_progress, it
+		 * requires a sync to ensure no in-flight hot-removing.
+		 */
+		cpu_hotplug_disable();
 		kexec_in_progress = true;
+		cpu_hotplug_enable();
 		kernel_restart_prepare("kexec reboot");
 		migrate_to_reboot_cpu();
 		syscore_shutdown();
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index a3a43baf4314e..cc864cc348b2c 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -18,6 +18,7 @@
 
 #include <linux/cpuset.h>
 #include <linux/sched/clock.h>
+#include <linux/kexec.h>
 #include <uapi/linux/sched/types.h>
 #include "sched.h"
 #include "pelt.h"
@@ -3502,6 +3503,12 @@ static int dl_bw_manage(enum dl_bw_request req, int cpu, u64 dl_bw)
 
 int dl_bw_deactivate(int cpu)
 {
+	/*
+	 * The system is shutting down and cannot roll back. There is no point
+	 * in keeping track of bandwidth, which may fail hotplug.
+	 */
+	if (unlikely(kexec_in_progress))
+		return 0;
 	return dl_bw_manage(dl_bw_req_deactivate, cpu, 0);
 }
 
-- 
2.49.0
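
For illustration only, the failure mode described in the commit message can be modelled in user space. The sketch below is not kernel code: `root_domain_model`, `dl_bw_deactivate_model()`, `offline_all_but_one()`, `kexec_in_progress_model` and `BW_UNIT` are hypothetical names, and the capacity check is a deliberate simplification of what `dl_bw_manage()` does. It assumes each online CPU contributes one unit of bandwidth and that removing a CPU is refused when the remaining capacity would fall below the admitted total.

```c
#include <stdbool.h>

/* Hypothetical: bandwidth contributed by one online CPU in this model. */
#define BW_UNIT 1000000ULL

/* Hypothetical stand-in for the kernel's kexec_in_progress flag. */
static bool kexec_in_progress_model;

/* Hypothetical, heavily simplified view of a root domain. */
struct root_domain_model {
	unsigned int nr_cpus;        /* CPUs currently online */
	unsigned long long total_bw; /* bandwidth admitted to DL tasks */
};

/*
 * Simplified model of the check: return 0 if one more CPU may go
 * offline, -1 if the remaining capacity could not cover total_bw.
 * When the kexec flag is set, the check is skipped, as in the patch.
 */
static int dl_bw_deactivate_model(const struct root_domain_model *rd)
{
	unsigned long long cap_after;

	if (kexec_in_progress_model)
		return 0;

	cap_after = (unsigned long long)(rd->nr_cpus - 1) * BW_UNIT;
	return cap_after < rd->total_bw ? -1 : 0;
}

/*
 * Offline CPUs one by one until a single CPU remains or the check
 * refuses. Returns the number of CPUs still online.
 */
static unsigned int offline_all_but_one(struct root_domain_model *rd)
{
	while (rd->nr_cpus > 1) {
		if (dl_bw_deactivate_model(rd) != 0)
			break; /* hot-removal rejected; in the real case kexec hangs */
		rd->nr_cpus--;
	}
	return rd->nr_cpus;
}
```

In this model, a 4-CPU domain with two units of admitted bandwidth stops at two online CPUs when the flag is clear, and reaches a single CPU when the flag is set, which mirrors the behaviour the patch is after.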
