During discussion of the scheduler deadline bug [1], Pierre Gondois
pointed out a potential issue during kexec: as CPUs are unplugged, the
available DL bandwidth of the root domain gradually decreases. Once it
can no longer cover the bandwidth reserved by DL tasks, the overflow
check fails, CPU hot-removal is refused, and kexec hangs.
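
The failure mode can be illustrated with a simplified userspace model.
This is only a sketch, not the kernel's implementation: the model_*
names are hypothetical, the 95% per-CPU limit assumes the default
sched_rt_runtime_us/sched_rt_period_us settings, and CPU capacity
scaling is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BW_SHIFT 20                  /* fixed-point precision (assumed) */
#define BW_UNIT  (1ULL << BW_SHIFT)  /* 100% of one CPU */

/* Hypothetical helper: runtime/period as a fixed-point fraction. */
static uint64_t model_to_ratio(uint64_t period_ns, uint64_t runtime_ns)
{
	if (period_ns == 0)
		return 0;
	return (runtime_ns << BW_SHIFT) / period_ns;
}

/*
 * Hypothetical check: would removing one CPU leave less bandwidth in
 * the root domain than DL tasks have already reserved (total_bw)?
 */
static bool model_removal_overflows(unsigned int online_cpus,
				    uint64_t per_cpu_limit,
				    uint64_t total_bw)
{
	uint64_t remaining;

	assert(online_cpus > 0);
	remaining = (uint64_t)(online_cpus - 1) * per_cpu_limit;
	return remaining < total_bw;
}
```

With one task reserving 100% of a CPU (runtime == period) and a 95%
per-CPU limit, removals succeed in this model until only one CPU would
remain; at that point the check reports an overflow.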

This can be reproduced by:
  chrt -d -T 1000000 -P 1000000 0 yes > /dev/null &
  kexec -e

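Skip the DL bandwidth check in dl_bw_deactivate() when kexec is in
progress. Set kexec_in_progress between cpu_hotplug_disable() and
cpu_hotplug_enable() so that no in-flight hot-removal misses the update.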

[1]: https://lore.kernel.org/all/[email protected]/

Reported-by: Pierre Gondois <[email protected]>
Closes: https://lore.kernel.org/all/[email protected]/
Signed-off-by: Pingfan Liu <[email protected]>
Cc: Waiman Long <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Juri Lelli <[email protected]>
Cc: Pierre Gondois <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Vincent Guittot <[email protected]>
Cc: Dietmar Eggemann <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Valentin Schneider <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Joel Granados <[email protected]>
To: [email protected]
To: [email protected]
---
 kernel/kexec_core.c     | 6 ++++++
 kernel/sched/deadline.c | 7 +++++++
 2 files changed, 13 insertions(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 31203f0bacafa..265de9d1ff5f5 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1183,7 +1183,13 @@ int kernel_kexec(void)
        } else
 #endif
        {
+               /*
+                * The CPU hot-removal path reads kexec_in_progress, so
+                * synchronize here to ensure no hot-removal is in flight.
+                */
+               cpu_hotplug_disable();
                kexec_in_progress = true;
+               cpu_hotplug_enable();
                kernel_restart_prepare("kexec reboot");
                migrate_to_reboot_cpu();
                syscore_shutdown();
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index a3a43baf4314e..cc864cc348b2c 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -18,6 +18,7 @@
 
 #include <linux/cpuset.h>
 #include <linux/sched/clock.h>
+#include <linux/kexec.h>
 #include <uapi/linux/sched/types.h>
 #include "sched.h"
 #include "pelt.h"
@@ -3502,6 +3503,12 @@ static int dl_bw_manage(enum dl_bw_request req, int cpu, u64 dl_bw)
 
 int dl_bw_deactivate(int cpu)
 {
+       /*
+        * The system is shutting down and cannot roll back.  Bandwidth
+        * accounting is pointless here and could make hot-removal fail.
+        */
+       if (unlikely(kexec_in_progress))
+               return 0;
        return dl_bw_manage(dl_bw_req_deactivate, cpu, 0);
 }
 
-- 
2.49.0

