On Tue, Nov 10, 2015 at 10:23:56AM +0900, AKASHI Takahiro wrote:
> On 11/07/2015 04:14 AM, Geoff Levand wrote:
> >From: AKASHI Takahiro <[email protected]>
> >
> >kdump calls machine_crash_shutdown() to shut down non-boot cpus and
> >save registers' status in per-cpu ELF notes before starting the crash
> >dump kernel. See kernel_kexec().
> >
> >ipi_cpu_stop() is slightly modified and used to support this behavior.
> 
> I've got some concerns about using ipi_cpu_stop().
> 
> >Signed-off-by: AKASHI Takahiro <[email protected]>
> >---
> >  arch/arm64/include/asm/kexec.h    | 34 +++++++++++++++++++++++++++++++++-
> >  arch/arm64/kernel/machine_kexec.c | 31 +++++++++++++++++++++++++++++--
> >  arch/arm64/kernel/smp.c           | 16 ++++++++++++++--
> >  3 files changed, 76 insertions(+), 5 deletions(-)

[...]

> >diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> >index dbdaacd..88aec66 100644
> >--- a/arch/arm64/kernel/smp.c
> >+++ b/arch/arm64/kernel/smp.c
> >@@ -37,6 +37,7 @@
> >  #include <linux/completion.h>
> >  #include <linux/of.h>
> >  #include <linux/irq_work.h>
> >+#include <linux/kexec.h>
> >
> >  #include <asm/alternative.h>
> >  #include <asm/atomic.h>
> >@@ -54,6 +55,8 @@
> >  #include <asm/ptrace.h>
> >  #include <asm/virt.h>
> >
> >+#include "cpu-reset.h"
> >+
> >  #define CREATE_TRACE_POINTS
> >  #include <trace/events/ipi.h>
> >
> >@@ -679,8 +682,12 @@ static DEFINE_RAW_SPINLOCK(stop_lock);
> >  /*
> >   * ipi_cpu_stop - handle IPI from smp_send_stop()
> >   */
> >-static void ipi_cpu_stop(unsigned int cpu)
> >+static void ipi_cpu_stop(unsigned int cpu, struct pt_regs *regs)
> >  {
> >+#ifdef CONFIG_KEXEC
> >+    /* printing messages may slow down the shutdown. */
> >+    if (!in_crash_kexec)
> >+#endif
> >     if (system_state == SYSTEM_BOOTING ||
> >         system_state == SYSTEM_RUNNING) {
> >             raw_spin_lock(&stop_lock);
> >@@ -693,6 +700,11 @@ static void ipi_cpu_stop(unsigned int cpu)
> >
> >     local_irq_disable();
> >
> >+#ifdef CONFIG_KEXEC
> >+    if (in_crash_kexec)
> >+            crash_save_cpu(regs, cpu);
> >+#endif /* CONFIG_KEXEC */
> >+
> >     while (1)
> >             cpu_relax();
> >  }
> 
> cpu_relax() is defined as asm("yield"), so this puts every cpu except the
> boot cpu into an infinite loop of nops (whether yield actually behaves as
> a nop depends on the hw implementation).
> As a result, all the secondary cpus keep busy-looping even after the crash
> dump kernel has started up, and the chip can potentially overheat.
> I ran into this situation when I tested the code on Hikey: the system was
> forced to shut down by the thermal driver.
> 
> So I'd like to modify the code a bit like:
> if (in_crash_kexec) {
>     crash_save_cpu(regs, cpu);
>     while (1)
>         asm("wfi"); /* irq is disabled here. */
> }
> 
> Does this make sense?

It would be even better if we could hotplug them off.
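
To make the two options in this thread concrete, here is a rough userspace
model of the choice: power the cpu off when the platform provides a way to
do so, otherwise fall back to the wfi loop proposed above. This is only a
sketch under stated assumptions, not kernel code: struct cpu_ops_model,
choose_park_method(), park_secondary() and model_cpu_die() are hypothetical
names invented for illustration, and the real infinite loop is replaced by
a return value so the logic can be exercised outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-cpu operations table (not the kernel's). */
struct cpu_ops_model {
	/* NULL when the platform has no way to power a cpu off. */
	void (*cpu_die)(unsigned int cpu);
};

enum park_method { PARK_POWER_OFF, PARK_WFI_LOOP };

/* Records which cpu the model "powered off"; -1 means none yet. */
static int last_powered_off = -1;

/* Hypothetical power-off hook used only by this model. */
static void model_cpu_die(unsigned int cpu)
{
	last_powered_off = (int)cpu;
}

/* Prefer powering the cpu off; otherwise fall back to a wfi loop. */
static enum park_method choose_park_method(const struct cpu_ops_model *ops)
{
	assert(ops != NULL);
	return ops->cpu_die ? PARK_POWER_OFF : PARK_WFI_LOOP;
}

/*
 * Model of what a secondary cpu would do after crash_save_cpu().
 * In the kernel the wfi branch would never return; here it only
 * reports which method was selected.
 */
static enum park_method park_secondary(const struct cpu_ops_model *ops,
				       unsigned int cpu)
{
	enum park_method method = choose_park_method(ops);

	if (method == PARK_POWER_OFF)
		ops->cpu_die(cpu);

	return method;
}
```

Either way the register state has to be saved before the cpu is parked,
since a powered-off cpu cannot run crash_save_cpu() afterwards.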

Will
