Dave Young suggested that I explain the problem in more detail,
so here is the revised commit description. The patch is now in -mm,
so I copied the Cc list from the -mm version. I also added Corey
Minyard's Tested-by and Reviewed-by.
From: Hidehiro Kawai
Subject: mips/panic: replace smp_send_stop() with kdump friendly version in panic path
This patch fixes the problems reported by Daniel Walker
(https://lkml.org/lkml/2015/6/24/44).
When the kernel panics with the crash_kexec_post_notifiers kernel
parameter enabled, other CPUs are stopped by smp_send_stop() instead of
by machine_crash_shutdown() in the __crash_kexec() path.
  panic()
    if crash_kexec_post_notifiers == 1
      smp_send_stop()
      atomic_notifier_call_chain()
      kmsg_dump()
    __crash_kexec()
      machine_crash_shutdown()
        octeon_generic_shutdown() // shutdown watchdog for ONLINE CPUs
Unlike smp_send_stop(), machine_crash_shutdown() stops other CPUs
and does extra work for kdump. So, if smp_send_stop() stops the other
CPUs in advance, this extra work is not done. As a result, the
kdump routines fail to save the other CPUs' registers. Additionally,
on MIPS OCTEON, the watchdog timer is not stopped.
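The ordering problem can be illustrated with a small user-space model. This is only a sketch, not kernel code: the `struct cpu` array, `NR_CPUS`, and all `model_*` names are hypothetical stand-ins for the real routines, and "saving registers" is reduced to setting a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of this is real kernel code. */
#define NR_CPUS 4

struct cpu {
	bool online;      /* still running and able to respond */
	bool regs_saved;  /* registers captured for the crash dump */
};

static struct cpu cpus[NR_CPUS];

/* Puts every modeled CPU back online with nothing saved. */
static void model_reset(void)
{
	for (size_t i = 0; i < NR_CPUS; i++) {
		cpus[i].online = true;
		cpus[i].regs_saved = false;
	}
}

/* Models smp_send_stop(): stops the other CPUs, saves nothing. */
static void model_smp_send_stop(size_t self)
{
	assert(self < NR_CPUS);
	for (size_t i = 0; i < NR_CPUS; i++)
		if (i != self)
			cpus[i].online = false;
}

/*
 * Models machine_crash_shutdown(): only CPUs that are still online
 * get their registers saved before being stopped.
 */
static void model_machine_crash_shutdown(size_t self)
{
	assert(self < NR_CPUS);
	for (size_t i = 0; i < NR_CPUS; i++) {
		if (i == self || !cpus[i].online)
			continue;
		cpus[i].regs_saved = true;
		cpus[i].online = false;
	}
}

/* Counts the CPUs other than `self` whose registers were saved. */
static size_t model_saved_count(size_t self)
{
	size_t n = 0;

	for (size_t i = 0; i < NR_CPUS; i++)
		if (i != self && cpus[i].regs_saved)
			n++;
	return n;
}
```

In this model, calling model_smp_send_stop(0) before model_machine_crash_shutdown(0) leaves model_saved_count(0) at zero, while calling model_machine_crash_shutdown(0) alone records the state of all three other CPUs.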
To fix this problem, call a new kdump friendly function,
crash_smp_send_stop(), instead of smp_send_stop() when
crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a
weak function, and by default it just calls smp_send_stop().
Architecture code should override it so that kdump can work
appropriately. This patch provides the MIPS version.
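The weak-function arrangement can be sketched in user space as follows. This is a minimal illustration assuming a GCC- or Clang-compatible compiler, not the code from this patch: the `stop_calls` counter and the stub body of smp_send_stop() are hypothetical, and kernel code would normally use its own `__weak` macro rather than the raw attribute.

```c
/* Hypothetical counter so the effect of the stub is observable. */
static int stop_calls;

/* Stand-in for the generic smp_send_stop(); this stub only counts calls. */
void smp_send_stop(void)
{
	stop_calls++;
}

/*
 * Weak default: used only if no other object file provides a strong
 * definition of crash_smp_send_stop(). An architecture would supply
 * its own version in a separate source file, and the linker would
 * then pick that one instead of this fallback.
 */
__attribute__((weak)) void crash_smp_send_stop(void)
{
	smp_send_stop();
}
```

With no strong definition linked in, a call to crash_smp_send_stop() simply falls through to smp_send_stop(), which is the behaviour described for architectures that do not override it.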
Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
Link: http://lkml.kernel.org/r/20160810080950.11028.28000.st...@sysi4-13.yrl.intra.hitachi.co.jp
Signed-off-by: Hidehiro Kawai
Reported-by: Daniel Walker
Tested-by: Corey Minyard
Reviewed-by: Corey Minyard
Cc: Dave Young
Cc: Baoquan He
Cc: Vivek Goyal
Cc: Eric Biederman
Cc: Masami Hiramatsu
Cc: Daniel Walker
Cc: Xunlei Pang
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: "H. Peter Anvin"
Cc: Borislav Petkov
Cc: David Vrabel
Cc: Toshi Kani
Cc: Ralf Baechle
Cc: David Daney
Cc: Aaro Koskinen
Cc: "Steven J. Hill"
Signed-off-by: Andrew Morton
> From: Corey Minyard [mailto:cminy...@mvista.com]
> Sent: Friday, August 19, 2016 6:18 AM
> Sorry this took so long, but I have finally tested this, it seems to
> work fine:
>
> Tested-by: Corey Minyard
> Reviewed-by: Corey Minyard
>
> On 08/10/2016 03:09 AM, Hidehiro Kawai wrote:
> > Daniel Walker reported problems which happen when the
> > crash_kexec_post_notifiers kernel option is enabled
> > (https://lkml.org/lkml/2015/6/24/44).
> >
> > In that case, smp_send_stop() is called before entering the kdump
> > routines, which assume other CPUs are still online. As a result, the
> > kdump routines fail to save other CPUs' registers. Additionally, on
> > MIPS OCTEON, the watchdog timer is not stopped.
> >
> > To fix this problem, call a new kdump friendly function,
> > crash_smp_send_stop(), instead of smp_send_stop() when
> > crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a
> > weak function, and by default it just calls smp_send_stop().
> > Architecture code should override it so that kdump can work
> > appropriately. This patch provides the MIPS version.
> >
> > Reported-by: Daniel Walker
> > Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option)
> > Signed-off-by: Hidehiro Kawai
> > Cc: Ralf Baechle
> > Cc: David Daney
> > Cc: Aaro Koskinen
> > Cc: "Steven J. Hill"
> > Cc: Corey Minyard
> >
> > ---
> > I'm not familiar with MIPS and don't have a test environment, so I
> > only did build tests. Please don't apply this patch until someone
> > has tested it sufficiently; otherwise simply drop this patch.
> > ---
> >  arch/mips/cavium-octeon/setup.c  | 14 ++
> >  arch/mips/include/asm/kexec.h    |  1 +
> >  arch/mips/kernel/crash.c         | 18 +-
> >  arch/mips/kernel/machine_kexec.c |  1 +
> >  4 files changed, 33 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c
> > index cb16fcc..5537f95 100644
> > --- a/arch/mips/cavium-octeon/setup.c
> > +++ b/arch/mips/cavium-octeon/setup.c
> > @@ -267,6 +267,17 @@ static void