Re: [PATCH 21/30] panic: Introduce the panic pre-reboot notifier list
On Wed, Apr 27, 2022 at 07:49:15PM -0300, Guilherme G. Piccoli wrote: > This patch renames the panic_notifier_list to panic_pre_reboot_list; > the idea is that a subsequent patch will refactor the panic path > in order to better split the notifiers, running some of them very > early, some of them not so early [but still before kmsg_dump()] and > finally, the rest should execute late, after kdump. The latter ones > are now in the panic pre-reboot list - the name comes from the idea > that these notifiers execute before panic() attempts rebooting the > machine (if that option is set). > > We also took the opportunity to clean-up useless header inclusions, > improve some notifier block declarations (e.g. in ibmasm/heartbeat.c) > and more important, change some priorities - we hereby set 2 notifiers > to run late in the list [iss_panic_event() and the IPMI panic_event()] > due to the risks they offer (may not return, for example). > Proper documentation is going to be provided in a subsequent patch, > that effectively refactors the panic path. For the IPMI portion: Acked-by: Corey Minyard Note that the IPMI panic_event() should always return, but it may take some time, especially if the IPMI controller is no longer functional. So the risk of a long delay is there and it makes sense to move it very late. -corey > > Cc: Alex Elder > Cc: Alexander Gordeev > Cc: Anton Ivanov > Cc: Benjamin Herrenschmidt > Cc: Bjorn Andersson > Cc: Boris Ostrovsky > Cc: Chris Zankel > Cc: Christian Borntraeger > Cc: Corey Minyard > Cc: Dexuan Cui > Cc: "H. Peter Anvin" > Cc: Haiyang Zhang > Cc: Heiko Carstens > Cc: Helge Deller > Cc: Ivan Kokshaysky > Cc: "James E.J. Bottomley" > Cc: James Morse > Cc: Johannes Berg > Cc: Juergen Gross > Cc: "K. Y. Srinivasan" > Cc: Mathieu Poirier > Cc: Matt Turner > Cc: Mauro Carvalho Chehab > Cc: Max Filippov > Cc: Michael Ellerman > Cc: Paul Mackerras > Cc: Pavel Machek > Cc: Richard Henderson > Cc: Richard Weinberger > Cc: Robert Richter > Cc: Stefano Stabellini > Cc: Stephen Hemminger > Cc: Sven Schnelle > Cc: Tony Luck > Cc: Vasily Gorbik > Cc: Wei Liu > Signed-off-by: Guilherme G. Piccoli > --- > > Notice that, with this name change, out-of-tree code that relies in the global > exported "panic_notifier_list" will fail to build. We could easily keep the > retro-compatibility by making the old symbol to still exist and point to the > pre_reboot list (or even, keep the old naming). > > But our design choice was to allow the breakage, making users rethink their > notifiers, adding them in the list that fits best. If that wasn't a good > decision, we're open to change it, of course. > Thanks in advance for the review! > > arch/alpha/kernel/setup.c | 4 ++-- > arch/parisc/kernel/pdc_chassis.c | 3 +-- > arch/powerpc/kernel/setup-common.c| 2 +- > arch/s390/kernel/ipl.c| 4 ++-- > arch/um/drivers/mconsole_kern.c | 2 +- > arch/um/kernel/um_arch.c | 2 +- > arch/x86/xen/enlighten.c | 2 +- > arch/xtensa/platforms/iss/setup.c | 4 ++-- > drivers/char/ipmi/ipmi_msghandler.c | 12 +++- > drivers/edac/altera_edac.c| 3 +-- > drivers/hv/vmbus_drv.c| 4 ++-- > drivers/leds/trigger/ledtrig-panic.c | 3 +-- > drivers/misc/ibmasm/heartbeat.c | 16 +--- > drivers/net/ipa/ipa_smp2p.c | 5 ++--- > drivers/parisc/power.c| 4 ++-- > drivers/remoteproc/remoteproc_core.c | 6 -- > drivers/s390/char/con3215.c | 2 +- > drivers/s390/char/con3270.c | 2 +- > drivers/s390/char/sclp_con.c | 2 +- > drivers/s390/char/sclp_vt220.c| 2 +- > drivers/staging/olpc_dcon/olpc_dcon.c | 6 -- > drivers/video/fbdev/hyperv_fb.c | 4 ++-- > include/linux/panic_notifier.h| 2 +- > kernel/panic.c| 9 - > 24 files changed, 54 insertions(+), 51 deletions(-) > > diff --git a/arch/alpha/kernel/setup.c b/arch/alpha/kernel/setup.c > index d88bdf852753..8ace0d7113b6 100644 > --- a/arch/alpha/kernel/setup.c > +++ b/arch/alpha/kernel/setup.c > @@ -472,8 +472,8 @@ setup_arch(char **cmdline_p) > } > > /* Register a call for panic conditions. */ > - atomic_notifier_chain_register(_notifier_list, > - _panic_block); > + atomic_notifier_chain_register(_pre_reboot_list, > + _panic_block); > > #ifndef alpha_using_srm > /* Assume that we've booted from SRM if we haven't booted from
Re: [PATCH v1 1/1] kernel.h: Split out panic and oops helpers
On Tue, Apr 06, 2021 at 04:31:58PM +0300, Andy Shevchenko wrote: > kernel.h is being used as a dump for all kinds of stuff for a long time. > Here is the attempt to start cleaning it up by splitting out panic and > oops helpers. > > At the same time convert users in header and lib folder to use new header. > Though for time being include new header back to kernel.h to avoid twisted > indirected includes for existing users. For the IPMI portion: Acked-by: Corey Minyard > > Signed-off-by: Andy Shevchenko > --- > arch/powerpc/kernel/setup-common.c | 1 + > arch/x86/include/asm/desc.h | 1 + > arch/x86/kernel/cpu/mshyperv.c | 1 + > arch/x86/kernel/setup.c | 1 + > drivers/char/ipmi/ipmi_msghandler.c | 1 + > drivers/remoteproc/remoteproc_core.c | 1 + > include/asm-generic/bug.h| 3 +- > include/linux/kernel.h | 84 +--- > include/linux/panic.h| 98 > include/linux/panic_notifier.h | 12 > kernel/hung_task.c | 1 + > kernel/kexec_core.c | 1 + > kernel/panic.c | 1 + > kernel/rcu/tree.c| 2 + > kernel/sysctl.c | 1 + > kernel/trace/trace.c | 1 + > 16 files changed, 126 insertions(+), 84 deletions(-) > create mode 100644 include/linux/panic.h > create mode 100644 include/linux/panic_notifier.h > > diff --git a/arch/powerpc/kernel/setup-common.c > b/arch/powerpc/kernel/setup-common.c > index 74a98fff2c2f..046fe21b5c3b 100644 > --- a/arch/powerpc/kernel/setup-common.c > +++ b/arch/powerpc/kernel/setup-common.c > @@ -9,6 +9,7 @@ > #undef DEBUG > > #include > +#include > #include > #include > #include > diff --git a/arch/x86/include/asm/desc.h b/arch/x86/include/asm/desc.h > index 476082a83d1c..ceb12683b6d1 100644 > --- a/arch/x86/include/asm/desc.h > +++ b/arch/x86/include/asm/desc.h > @@ -9,6 +9,7 @@ > #include > #include > > +#include > #include > #include > > diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c > index 22f13343b5da..9e5c6f2b044d 100644 > --- a/arch/x86/kernel/cpu/mshyperv.c > +++ b/arch/x86/kernel/cpu/mshyperv.c > @@ -17,6 +17,7 @@ > #include > #include > #include > +#include > #include > #include > #include > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c > index 59e5e0903b0c..570699eecf90 100644 > --- a/arch/x86/kernel/setup.c > +++ b/arch/x86/kernel/setup.c > @@ -14,6 +14,7 @@ > #include > #include > #include > +#include > #include > #include > #include > diff --git a/drivers/char/ipmi/ipmi_msghandler.c > b/drivers/char/ipmi/ipmi_msghandler.c > index 8a0e97b33cae..e96cb5c4f97a 100644 > --- a/drivers/char/ipmi/ipmi_msghandler.c > +++ b/drivers/char/ipmi/ipmi_msghandler.c > @@ -16,6 +16,7 @@ > > #include > #include > +#include > #include > #include > #include > diff --git a/drivers/remoteproc/remoteproc_core.c > b/drivers/remoteproc/remoteproc_core.c > index 626a6b90fba2..76dd8e2b1e7e 100644 > --- a/drivers/remoteproc/remoteproc_core.c > +++ b/drivers/remoteproc/remoteproc_core.c > @@ -20,6 +20,7 @@ > #include > #include > #include > +#include > #include > #include > #include > diff --git a/include/asm-generic/bug.h b/include/asm-generic/bug.h > index 76a10e0dca9f..719410b93f99 100644 > --- a/include/asm-generic/bug.h > +++ b/include/asm-generic/bug.h > @@ -17,7 +17,8 @@ > #endif > > #ifndef __ASSEMBLY__ > -#include > +#include > +#include > > #ifdef CONFIG_BUG > > diff --git a/include/linux/kernel.h b/include/linux/kernel.h > index 09035ac67d4b..6c5a05ac1ecb 100644 > --- a/include/linux/kernel.h > +++ b/include/linux/kernel.h > @@ -14,6 +14,7 @@ > #include > #include > #include > +#include > #include > #include > #include > @@ -70,7 +71,6 @@ > #define lower_32_bits(n) ((u32)((n) & 0x)) > > struct completion; > -struct pt_regs; > struct user; > > #ifdef CONFIG_PREEMPT_VOLUNTARY > @@ -175,14 +175,6 @@ void __might_fault(const char *file, int line); > static inline void might_fault(void) { } > #endif > > -extern struct atomic_notifier_head panic_notifier_list; > -extern long (*panic_blink)(int state); > -__printf(1, 2) > -void panic(const char *fmt, ...) __noreturn __cold; > -void nmi_panic(struct pt_regs *regs, const char *msg); > -extern void oops_enter(void); > -extern void oops_exit(void); > -extern bool o
Re: [V4 PATCH 2/2] mips/panic: Replace smp_send_stop() with kdump friendly version in panic path
Sorry this took so long, but I have finally tested this, it seems to work fine: Tested-by: Corey Minyard <cminy...@mvista.com> Reviewed-by: Corey Minyard <cminy...@mvista.com> On 08/10/2016 03:09 AM, Hidehiro Kawai wrote: Daniel Walker reported problems which happens when crash_kexec_post_notifiers kernel option is enabled (https://lkml.org/lkml/2015/6/24/44). In that case, smp_send_stop() is called before entering kdump routines which assume other CPUs are still online. As the result, kdump routines fail to save other CPUs' registers. Additionally for MIPS OCTEON, it misses to stop the watchdog timer. To fix this problem, call a new kdump friendly function, crash_smp_send_stop(), instead of the smp_send_stop() when crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a weak function, and it just call smp_send_stop(). Architecture codes should override it so that kdump can work appropriately. This patch provides MIPS version. Reported-by: Daniel Walker <dwal...@fifo99.com> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option) Signed-off-by: Hidehiro Kawai <hidehiro.kawai...@hitachi.com> Cc: Ralf Baechle <r...@linux-mips.org> Cc: David Daney <david.da...@cavium.com> Cc: Aaro Koskinen <aaro.koski...@iki.fi> Cc: "Steven J. Hill" <steven.h...@cavium.com> Cc: Corey Minyard <cminy...@mvista.com> --- I'm not familiar with MIPS, and I don't have a test environment and just did build tests only. Please don't apply this patch until someone does enough tests, otherwise simply drop this patch. --- arch/mips/cavium-octeon/setup.c | 14 ++ arch/mips/include/asm/kexec.h|1 + arch/mips/kernel/crash.c | 18 +- arch/mips/kernel/machine_kexec.c |1 + 4 files changed, 33 insertions(+), 1 deletion(-) diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c index cb16fcc..5537f95 100644 --- a/arch/mips/cavium-octeon/setup.c +++ b/arch/mips/cavium-octeon/setup.c @@ -267,6 +267,17 @@ static void octeon_crash_shutdown(struct pt_regs *regs) default_machine_crash_shutdown(regs); } +#ifdef CONFIG_SMP +void octeon_crash_smp_send_stop(void) +{ + int cpu; + + /* disable watchdogs */ + for_each_online_cpu(cpu) + cvmx_write_csr(CVMX_CIU_WDOGX(cpu_logical_map(cpu)), 0); +} +#endif + #endif /* CONFIG_KEXEC */ #ifdef CONFIG_CAVIUM_RESERVE32 @@ -911,6 +922,9 @@ void __init prom_init(void) _machine_kexec_shutdown = octeon_shutdown; _machine_crash_shutdown = octeon_crash_shutdown; _machine_kexec_prepare = octeon_kexec_prepare; +#ifdef CONFIG_SMP + _crash_smp_send_stop = octeon_crash_smp_send_stop; +#endif #endif octeon_user_io_init(); diff --git a/arch/mips/include/asm/kexec.h b/arch/mips/include/asm/kexec.h index ee25ebb..493a3cc 100644 --- a/arch/mips/include/asm/kexec.h +++ b/arch/mips/include/asm/kexec.h @@ -45,6 +45,7 @@ extern const unsigned char kexec_smp_wait[]; extern unsigned long secondary_kexec_args[4]; extern void (*relocated_kexec_smp_wait) (void *); extern atomic_t kexec_ready_to_reboot; +extern void (*_crash_smp_send_stop)(void); #endif #endif diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c index 610f0f3..1723b17 100644 --- a/arch/mips/kernel/crash.c +++ b/arch/mips/kernel/crash.c @@ -47,9 +47,14 @@ static void crash_shutdown_secondary(void *passed_regs) static void crash_kexec_prepare_cpus(void) { + static int cpus_stopped; unsigned int msecs; + unsigned int ncpus; - unsigned int ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */ + if (cpus_stopped) + return; + + ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */ dump_send_ipi(crash_shutdown_secondary); smp_wmb(); @@ -64,6 +69,17 @@ static void crash_kexec_prepare_cpus(void) cpu_relax(); mdelay(1); } + + cpus_stopped = 1; +} + +/* Override the weak function in kernel/panic.c */ +void crash_smp_send_stop(void) +{ + if (_crash_smp_send_stop) + _crash_smp_send_stop(); + + crash_kexec_prepare_cpus(); } #else /* !defined(CONFIG_SMP) */ diff --git a/arch/mips/kernel/machine_kexec.c b/arch/mips/kernel/machine_kexec.c index 50980bf3..5972520 100644 --- a/arch/mips/kernel/machine_kexec.c +++ b/arch/mips/kernel/machine_kexec.c @@ -25,6 +25,7 @@ void (*_machine_crash_shutdown)(struct pt_regs *regs) = NULL; #ifdef CONFIG_SMP void (*relocated_kexec_smp_wait) (void *); atomic_t kexec_ready_to_reboot = ATOMIC_INIT(0); +void (*_crash_smp_send_stop)(void) = NULL; #endif int ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [V4 PATCH 2/2] mips/panic: Replace smp_send_stop() with kdump friendly version in panic path
On 08/15/2016 12:06 PM, Corey Minyard wrote: On 08/15/2016 06:35 AM, 河合英宏 / KAWAI,HIDEHIRO wrote: Hi Corey, From: Corey Minyard [mailto:cminy...@mvista.com] Sent: Friday, August 12, 2016 10:56 PM I'll try to test this, but I have one comment inline... Thank you very much! On 08/11/2016 10:17 PM, Dave Young wrote: On 08/10/16 at 05:09pm, Hidehiro Kawai wrote: [snip] diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c index 610f0f3..1723b17 100644 --- a/arch/mips/kernel/crash.c +++ b/arch/mips/kernel/crash.c @@ -47,9 +47,14 @@ static void crash_shutdown_secondary(void *passed_regs) static void crash_kexec_prepare_cpus(void) { +static int cpus_stopped; unsigned int msecs; +unsigned int ncpus; -unsigned int ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */ +if (cpus_stopped) +return; Wouldn't you want an atomic operation and some special handling here to ensure that only one CPU does this? So if a CPU comes in here and another CPU is already in the process stopping the CPUs it won't result in a deadlock. Because this function can be called only one panicking CPU, there is no problem. There are two paths which crash_kexec_prepare_cpus is called. Path 1 (panic path): panic() crash_smp_send_stop() crash_kexec_prepare_cpus() Path 2 (oops path): crash_kexec() __crash_kexec() machine_crash_shutdown() default_machine_crash_shutdown() // for MIPS crash_kexec_prepare_cpus() Here, panic() and crash_kexec() run exclusively via panic_cpu atomic variable. So we can use cpus_stopped as normal variable. Ok, if the code can only be entered once, what's the purpose of cpus_stopped? I guess that's what confused me. You are right, the panic_cpu atomic should keep this on a single CPU. Never mind, I see the path through panic() where that is required. My question below still remains, though. -corey Also, panic() will call panic_smp_self_stop() if it finds another CPU has already called panic, which will just spin with interrupts off by default. I didn't see a definition for it in MIPS, wouldn't it need to be overridden to avoid a deadlock? -corey Best regards, Hidehiro Kawai ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [V4 PATCH 2/2] mips/panic: Replace smp_send_stop() with kdump friendly version in panic path
On 08/15/2016 06:35 AM, 河合英宏 / KAWAI,HIDEHIRO wrote: Hi Corey, From: Corey Minyard [mailto:cminy...@mvista.com] Sent: Friday, August 12, 2016 10:56 PM I'll try to test this, but I have one comment inline... Thank you very much! On 08/11/2016 10:17 PM, Dave Young wrote: On 08/10/16 at 05:09pm, Hidehiro Kawai wrote: [snip] diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c index 610f0f3..1723b17 100644 --- a/arch/mips/kernel/crash.c +++ b/arch/mips/kernel/crash.c @@ -47,9 +47,14 @@ static void crash_shutdown_secondary(void *passed_regs) static void crash_kexec_prepare_cpus(void) { + static int cpus_stopped; unsigned int msecs; + unsigned int ncpus; - unsigned int ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */ + if (cpus_stopped) + return; Wouldn't you want an atomic operation and some special handling here to ensure that only one CPU does this? So if a CPU comes in here and another CPU is already in the process stopping the CPUs it won't result in a deadlock. Because this function can be called only one panicking CPU, there is no problem. There are two paths which crash_kexec_prepare_cpus is called. Path 1 (panic path): panic() crash_smp_send_stop() crash_kexec_prepare_cpus() Path 2 (oops path): crash_kexec() __crash_kexec() machine_crash_shutdown() default_machine_crash_shutdown() // for MIPS crash_kexec_prepare_cpus() Here, panic() and crash_kexec() run exclusively via panic_cpu atomic variable. So we can use cpus_stopped as normal variable. Ok, if the code can only be entered once, what's the purpose of cpus_stopped? I guess that's what confused me. You are right, the panic_cpu atomic should keep this on a single CPU. Also, panic() will call panic_smp_self_stop() if it finds another CPU has already called panic, which will just spin with interrupts off by default. I didn't see a definition for it in MIPS, wouldn't it need to be overridden to avoid a deadlock? -corey Best regards, Hidehiro Kawai ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [V4 PATCH 2/2] mips/panic: Replace smp_send_stop() with kdump friendly version in panic path
I'll try to test this, but I have one comment inline... On 08/11/2016 10:17 PM, Dave Young wrote: On 08/10/16 at 05:09pm, Hidehiro Kawai wrote: Daniel Walker reported problems which happens when crash_kexec_post_notifiers kernel option is enabled (https://lkml.org/lkml/2015/6/24/44). In that case, smp_send_stop() is called before entering kdump routines which assume other CPUs are still online. As the result, kdump routines fail to save other CPUs' registers. Additionally for MIPS OCTEON, it misses to stop the watchdog timer. To fix this problem, call a new kdump friendly function, crash_smp_send_stop(), instead of the smp_send_stop() when crash_kexec_post_notifiers is enabled. crash_smp_send_stop() is a weak function, and it just call smp_send_stop(). Architecture codes should override it so that kdump can work appropriately. This patch provides MIPS version. Reported-by: Daniel Walker <dwal...@fifo99.com> Fixes: f06e5153f4ae (kernel/panic.c: add "crash_kexec_post_notifiers" option) Signed-off-by: Hidehiro Kawai <hidehiro.kawai...@hitachi.com> Cc: Ralf Baechle <r...@linux-mips.org> Cc: David Daney <david.da...@cavium.com> Cc: Aaro Koskinen <aaro.koski...@iki.fi> Cc: "Steven J. Hill" <steven.h...@cavium.com> Cc: Corey Minyard <cminy...@mvista.com> --- I'm not familiar with MIPS, and I don't have a test environment and just did build tests only. Please don't apply this patch until someone does enough tests, otherwise simply drop this patch. --- arch/mips/cavium-octeon/setup.c | 14 ++ arch/mips/include/asm/kexec.h|1 + arch/mips/kernel/crash.c | 18 +- arch/mips/kernel/machine_kexec.c |1 + 4 files changed, 33 insertions(+), 1 deletion(-) diff --git a/arch/mips/cavium-octeon/setup.c b/arch/mips/cavium-octeon/setup.c index cb16fcc..5537f95 100644 --- a/arch/mips/cavium-octeon/setup.c +++ b/arch/mips/cavium-octeon/setup.c @@ -267,6 +267,17 @@ static void octeon_crash_shutdown(struct pt_regs *regs) default_machine_crash_shutdown(regs); } +#ifdef CONFIG_SMP +void octeon_crash_smp_send_stop(void) +{ + int cpu; + + /* disable watchdogs */ + for_each_online_cpu(cpu) + cvmx_write_csr(CVMX_CIU_WDOGX(cpu_logical_map(cpu)), 0); +} +#endif + #endif /* CONFIG_KEXEC */ #ifdef CONFIG_CAVIUM_RESERVE32 @@ -911,6 +922,9 @@ void __init prom_init(void) _machine_kexec_shutdown = octeon_shutdown; _machine_crash_shutdown = octeon_crash_shutdown; _machine_kexec_prepare = octeon_kexec_prepare; +#ifdef CONFIG_SMP + _crash_smp_send_stop = octeon_crash_smp_send_stop; +#endif #endif octeon_user_io_init(); diff --git a/arch/mips/include/asm/kexec.h b/arch/mips/include/asm/kexec.h index ee25ebb..493a3cc 100644 --- a/arch/mips/include/asm/kexec.h +++ b/arch/mips/include/asm/kexec.h @@ -45,6 +45,7 @@ extern const unsigned char kexec_smp_wait[]; extern unsigned long secondary_kexec_args[4]; extern void (*relocated_kexec_smp_wait) (void *); extern atomic_t kexec_ready_to_reboot; +extern void (*_crash_smp_send_stop)(void); #endif #endif diff --git a/arch/mips/kernel/crash.c b/arch/mips/kernel/crash.c index 610f0f3..1723b17 100644 --- a/arch/mips/kernel/crash.c +++ b/arch/mips/kernel/crash.c @@ -47,9 +47,14 @@ static void crash_shutdown_secondary(void *passed_regs) static void crash_kexec_prepare_cpus(void) { + static int cpus_stopped; unsigned int msecs; + unsigned int ncpus; - unsigned int ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */ + if (cpus_stopped) + return; Wouldn't you want an atomic operation and some special handling here to ensure that only one CPU does this? So if a CPU comes in here and another CPU is already in the process stopping the CPUs it won't result in a deadlock. -corey + + ncpus = num_online_cpus() - 1;/* Excluding the panic cpu */ dump_send_ipi(crash_shutdown_secondary); smp_wmb(); @@ -64,6 +69,17 @@ static void crash_kexec_prepare_cpus(void) cpu_relax(); mdelay(1); } + + cpus_stopped = 1; +} + +/* Override the weak function in kernel/panic.c */ +void crash_smp_send_stop(void) +{ + if (_crash_smp_send_stop) + _crash_smp_send_stop(); + + crash_kexec_prepare_cpus(); } #else /* !defined(CONFIG_SMP) */ diff --git a/arch/mips/kernel/machine_kexec.c b/arch/mips/kernel/machine_kexec.c index 50980bf3..5972520 100644 --- a/arch/mips/kernel/machine_kexec.c +++ b/arch/mips/kernel/machine_kexec.c @@ -25,6 +25,7 @@ void (*_machine_crash_shutdown)(struct pt_regs *regs) = NULL; #ifdef CONFIG_SMP void (*relocated_kexec_smp_wait) (void *); atomic_t kexec_ready_to_reboot = ATOMIC_INIT(0); +void (*_crash_smp_send_stop)(void) = NULL; #endif int Can any mips people review this patch and have a test? Thanks
Re: [PATCH v2] kdump: Fix gdb macros work work with newer and 64-bit kernels
On 05/16/2016 04:32 AM, Baoquan He wrote: On 05/10/16 at 07:30pm, miny...@acm.org wrote: From: Corey Minyard <cminy...@mvista.com> Lots of little changes needed to be made to clean these up, remove the four byte pointer assumption and traverse the pid queue properly. Also consolidate the traceback code into a single function instead of having three copies of it. Signed-off-by: Corey Minyard <cminy...@mvista.com> Hi Corey, Today I tried gdbmacro.txt and found dmesg doesn't work. I tested it on the latest 4.6.0 kernel. And I directly copy /proc/vmcore out and use gdb to open it by below command" gdb vmlinux /var/crash/vmcore --"gdbmacros.txt" All macro functions work well except of dmesg since code inside refer to the deprecated variable like "log_end" and "logged_chars". But these have been changed since this commit: commit 7ff9554bb578ba02166071d2d487b7fc7d860d62 Author: Kay Sievers <k...@vrfy.org> Date: Thu May 3 02:29:13 2012 +0200 printk: convert byte-buffer to variable-length record buffer So invoking dmesg will cause an error message printing out: (gdb) dmesg No symbol "log_end" in current context. Yes, I was actually aware of that, but that's a different issue and I hadn't thought about it much. -corey Thanks Baoquan --- Changes since v1: Rename btthreadstruct to btthreadstack Use sizeof(union thread_union) for the stack size instead of a hardcoded 4096. Documentation/kdump/gdbmacros.txt | 96 ++- 1 file changed, 44 insertions(+), 52 deletions(-) diff --git a/Documentation/kdump/gdbmacros.txt b/Documentation/kdump/gdbmacros.txt index 9b9b454..35f6a98 100644 --- a/Documentation/kdump/gdbmacros.txt +++ b/Documentation/kdump/gdbmacros.txt @@ -15,15 +15,16 @@ define bttnobp set $tasks_off=((size_t)&((struct task_struct *)0)->tasks) - set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next) + set $pid_off=((size_t)&((struct task_struct *)0)->thread_group.next) set $init_t=_task set $next_t=(((char *)($init_t->tasks).next) - $tasks_off) + set var $stacksize = sizeof(union thread_union) while ($next_t != $init_t) set $next_t=(struct task_struct *)$next_t printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm printf "===\n" - set var $stackp = $next_t.thread.esp - set var $stack_top = ($stackp & ~4095) + 4096 + set var $stackp = $next_t.thread.sp + set var $stack_top = ($stackp & ~($stacksize - 1)) + $stacksize while ($stackp < $stack_top) if (*($stackp) > _stext && *($stackp) < _sinittext) @@ -31,13 +32,13 @@ define bttnobp end set $stackp += 4 end - set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off) + set $next_th=(((char *)$next_t->thread_group.next) - $pid_off) while ($next_th != $next_t) set $next_th=(struct task_struct *)$next_th printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm printf "===\n" - set var $stackp = $next_t.thread.esp - set var $stack_top = ($stackp & ~4095) + 4096 + set var $stackp = $next_t.thread.sp + set var $stack_top = ($stackp & ~($stacksize - 1)) + stacksize while ($stackp < $stack_top) if (*($stackp) > _stext && *($stackp) < _sinittext) @@ -45,7 +46,7 @@ define bttnobp end set $stackp += 4 end - set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off) + set $next_th=(((char *)$next_th->thread_group.next) - $pid_off) end set $next_t=(char *)($next_t->tasks.next) - $tasks_off end @@ -54,42 +55,44 @@ document bttnobp dump all thread stack traces on a kernel compiled with !CONFIG_FRAME_POINTER end +define btthreadstack + set var $pid_task = $arg0 + + printf "\npid %d; comm %s:\n", $pid_task.pid, $pid_task.comm + printf "task struct: " + print $pid_task + printf "===\n" + set var $stackp = $pid_task.thread.sp + set var $stacksize = sizeof(union thread_union) + set var $stack_top = ($stackp & ~($stacksize - 1)) + $stacksize + set var $stack_bot = ($stackp & ~($stacksize - 1)) + + set $stackp = *((unsigned long *) $stackp) + while (($stac
Re: [PATCH] kdump: Fix gdb macros work work with newer and 64-bit kernels
On 05/09/2016 12:18 AM, Baoquan He wrote: Hi Corey, I am trying to review this patch now, and these fixes contained are very great. Just several concerns are added in inline comment. By the way, did you run this in your side? Yes, I tested on x86, x86_64, ARM and MIPS. Comments inline... Hi Vivek, Member variable was added into task_struct in below commit replacing pids[PIDTYPE_TGID], and from then on nobody complained about it. Seems people rarely use this utility. commit 47e65328a7b1cdfc4e3102e50d60faf94ebba7d3 Author: Oleg Nesterov <o...@tv-sign.ru> Date: Tue Mar 28 16:11:25 2006 -0800 [PATCH] pids: kill PIDTYPE_TGID On 04/27/16 at 07:21am, Corey Minyard wrote: Any comments on this? If no one else cares I'd be willing to take over maintenance of this. -corey On 02/25/2016 07:51 AM, miny...@acm.org wrote: From: Corey Minyard <cminy...@mvista.com> Lots of little changes needed to be made to clean these up, remove the four byte pointer assumption and traverse the pid queue properly. Also consolidate the traceback code into a single function instead of having three copies of it. Signed-off-by: Corey Minyard <cminy...@mvista.com> --- Documentation/kdump/gdbmacros.txt | 90 +-- 1 file changed, 40 insertions(+), 50 deletions(-) I sent this earlier, but I didn't get a response. These are clearly wrong. I'd be happy to take over maintenance of these macros. It might be better to move them someplace else, too, since they are also useful for kgdb. diff --git a/Documentation/kdump/gdbmacros.txt b/Documentation/kdump/gdbmacros.txt index 9b9b454..e5bbd8d 100644 --- a/Documentation/kdump/gdbmacros.txt +++ b/Documentation/kdump/gdbmacros.txt @@ -15,14 +15,14 @@ define bttnobp set $tasks_off=((size_t)&((struct task_struct *)0)->tasks) - set $pid_off=((size_t)&((struct task_struct *)0)->pids[1].pid_list.next) + set $pid_off=((size_t)&((struct task_struct *)0)->thread_group.next) This is a quite nice fix. set $init_t=_task set $next_t=(((char *)($init_t->tasks).next) - $tasks_off) while ($next_t != $init_t) set $next_t=(struct task_struct *)$next_t printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm printf "===\n" - set var $stackp = $next_t.thread.esp + set var $stackp = $next_t.thread.sp set var $stack_top = ($stackp & ~4095) + 4096 while ($stackp < $stack_top) @@ -31,12 +31,12 @@ define bttnobp end set $stackp += 4 end - set $next_th=(((char *)$next_t->pids[1].pid_list.next) - $pid_off) + set $next_th=(((char *)$next_t->thread_group.next) - $pid_off) while ($next_th != $next_t) set $next_th=(struct task_struct *)$next_th printf "\npid %d; comm %s:\n", $next_t.pid, $next_t.comm printf "===\n" - set var $stackp = $next_t.thread.esp + set var $stackp = $next_t.thread.sp set var $stack_top = ($stackp & ~4095) + 4096 while ($stackp < $stack_top) @@ -45,7 +45,7 @@ define bttnobp end set $stackp += 4 end - set $next_th=(((char *)$next_th->pids[1].pid_list.next) - $pid_off) + set $next_th=(((char *)$next_th->thread_group.next) - $pid_off) end set $next_t=(char *)($next_t->tasks.next) - $tasks_off end @@ -54,42 +54,43 @@ document bttnobp dump all thread stack traces on a kernel compiled with !CONFIG_FRAME_POINTER end +define btthreadstruct This is a nice wrapping, but I guess you want to name it as btthreadstack, right? Since I didn't get at all why it's related to thread_struct except of getting 'sp'. The name is based on what is passed into the function. You do a backtrace when given a thread structure. In my experience it is best to name functions based upon how the function's user sees it. Though I'm not stuck on the name, if you would prefer btthreadstack. + set var $pid_task = $arg0 + + printf "\npid %d; comm %s:\n", $pid_task.pid, $pid_task.comm + printf "task struct: " + print $pid_task + printf "===\n" + set var $stackp = $pid_task.thread.sp + set var $stack_top = ($stackp & ~4095) + 4096 + set var $stack_bot = ($stackp & ~4095) + + set $stackp = *((unsigned long *) $stackp) + while (($stackp < $stack_top) && ($stackp > $stack_bot)) + set var
Re: [RFC PATCH 1/4] purgatory/ipmi: Support BMC watchdog timer start/stop in purgatory
A general note here. It does not appear that you implement the error recovery states in your state machine. If the system fails in the middle of doing an IPMI operation, it is likely to fail. If you do this you will need to detect and abort any running operation. Implementing the full state machine is probably the best approach, it should handle this, though it is rather complex. -corey On 01/20/2016 04:37 AM, Hidehiro Kawai wrote: This patch adds an interface to BMC via KCS I/F and a functionality to start/stop BMC watchdog timer in purgatory. Starting the watchdog timer is useful to automatically reboot the server when we fail to boot the second kernel. Stopping the watchdog timer is useful to prevent the second kernel from being stopped by the watchdog timer enabled while the first kernel is running. If you specify --ipmi-wdt-start or --ipmi-wdt-stop option to kexec command, BMC's watchdog timer will start or stop respectively while executing purgatory. You can't specify the both options at the same time. The start operation doesn't change the parameters of the watchdog timer such as initial counter and action, you need to set those parameters in the first OS. On the other hand, the stop operation changes the parameters. You need to reset them when you want to reuse the watchdog timer. Signed-off-by: Hidehiro Kawai--- kexec/ipmi.h |9 ++ kexec/kexec.c | 18 +++ kexec/kexec.h |6 + purgatory/Makefile|1 purgatory/include/purgatory.h |3 + purgatory/ipmi.c | 232 + purgatory/purgatory.c |4 + 7 files changed, 272 insertions(+), 1 deletion(-) create mode 100644 kexec/ipmi.h create mode 100644 purgatory/ipmi.c diff --git a/kexec/ipmi.h b/kexec/ipmi.h new file mode 100644 index 000..395a2c7 --- /dev/null +++ b/kexec/ipmi.h @@ -0,0 +1,9 @@ +#ifndef IPMI_H +#define IPMI_H + +/* Options for IPMI code excuted in purgatory */ +#define IPMI_WDT_DO_NOTHING0 +#define IPMI_WDT_START (1 << 0) +#define IPMI_WDT_STOP (1 << 1) + +#endif /* IPMI_H */ diff --git a/kexec/kexec.c b/kexec/kexec.c index f0bd527..8a8f268 100644 --- a/kexec/kexec.c +++ b/kexec/kexec.c @@ -49,6 +49,7 @@ #include "kexec-sha256.h" #include "kexec-zlib.h" #include "kexec-lzma.h" +#include "ipmi.h" #include unsigned long long mem_min = 0; @@ -57,6 +58,7 @@ static unsigned long kexec_flags = 0; /* Flags for kexec file (fd) based syscall */ static unsigned long kexec_file_flags = 0; int kexec_debug = 0; +int opt_ipmi_wdt = IPMI_WDT_DO_NOTHING; void dbgprint_mem_range(const char *prefix, struct memory_range *mr, int nr_mr) { @@ -643,6 +645,10 @@ static void update_purgatory(struct kexec_info *info) if (!info->rhdr.e_shdr) { return; } + + elf_rel_set_symbol(>rhdr, "ipmi_wdt", _ipmi_wdt, + sizeof(opt_ipmi_wdt)); + arch_update_purgatory(info); memset(region, 0, sizeof(region)); sha256_starts(); @@ -1345,6 +1351,12 @@ int main(int argc, char *argv[]) case OPT_KEXEC_FILE_SYSCALL: /* We already parsed it. Nothing to do. */ break; + case OPT_IPMI_WDT_START: + opt_ipmi_wdt |= IPMI_WDT_START; + break; + case OPT_IPMI_WDT_STOP: + opt_ipmi_wdt |= IPMI_WDT_STOP; + break; default: break; } @@ -1370,6 +1382,12 @@ int main(int argc, char *argv[]) "\"--mem-max\" parameter\n"); } + if ((opt_ipmi_wdt & IPMI_WDT_START) && + (opt_ipmi_wdt & IPMI_WDT_STOP)) { + die("You can't specify both --ipmi-wdt-start and " + "--ipmi-wdt-stop\n"); + } + fileind = optind; /* Reset getopt for the next pass; called in other source modules */ opterr = 1; diff --git a/kexec/kexec.h b/kexec/kexec.h index c02ac8f..4638866 100644 --- a/kexec/kexec.h +++ b/kexec/kexec.h @@ -224,7 +224,9 @@ extern int file_types; #define OPT_LOAD_PRESERVE_CONTEXT 259 #define OPT_LOAD_JUMP_BACK_HELPER 260 #define OPT_ENTRY 261 -#define OPT_MAX262 +#define OPT_IPMI_WDT_START 262 +#define OPT_IPMI_WDT_STOP 263 +#define OPT_MAX264 #define KEXEC_OPTIONS \ { "help", 0, 0, OPT_HELP }, \ { "version", 0, 0, OPT_VERSION }, \ @@ -244,6 +246,8 @@ extern int file_types; { "reuseinitrd", 0, 0, OPT_REUSE_INITRD }, \ { "kexec-file-syscall", 0, 0, OPT_KEXEC_FILE_SYSCALL }, \ { "debug",0, 0, OPT_DEBUG }, \ + { "ipmi-wdt-start", 0, 0, OPT_IPMI_WDT_START }, \ + { "ipmi-wdt-stop",
Re: [RFC PATCH 0/4] purgatory: Add basic support for IPMI command execution
I understand what you are trying to accomplish here, but I'm not sure of the wisdom of this approach. I'll give some more information and the kexec maintainers can decide, I suppose. The KCS interface given here probably covers ~70% of the systems out there right now. Other systems have: * KCS interfaces at a different port or in a different place like memory, PCI, and with different register sizes and spacing. * Other standard interfaces. SMIC (probably not relevant), BT (faster, it does block transfers) and SSIF (which is IPMI over I2C). * Those other standard interfaces can be in different places, just like KCS. Hundreds of I2C interfaces exist. * Non-standard interfaces. Power systems have their own IPMI interfaces, for instance. Some systems have IPMI over serial ports, though hopefully that has pretty much gone away. I'd guess that over half of the IPMI SI driver is discovering and handling all the various interface types, locations from all the sources it can come from. As time goes on that 70% number is decreasing in favour of other faster and more convenient interfaces. I expect that SSIF will become much more popular over time because it has block transfer capability and all the hardware is already there on systems. This is no different, of course, than any other common hardware interface out there. USB, ATA, etc. But it makes it hard to cover all the possibilities in something like purgatory. I know how valuable this information can be. It has saved my butt on occasions, which is why I go through the inconvenience of handling it in the IPMI driver. But it seems to me that the failure rate of doing this in the crashing kernel should be pretty low. Not zero of course. But I have no idea what it is. -corey On 01/20/2016 04:37 AM, Hidehiro Kawai wrote: If the second kernel for crash dumping hangs up while booting, no information related to the first kernel will be saved. This makes crash cause analysis difficult. So, some enterprise users want to save minimal information befor booting the second kernel. One of the approaches is to use panic notifier call or pstore feature. For example, a panic notifier callback registered by IPMI driver saves the panic message to BMC's SEL before booting the second kernel. Similarly, pstore saves kernel logs to a non-volatile memory on the server. However, since these functionalities run with crashed kernel, they may fail to complete their work and boot the second kernel. So, another approach; saving minimal information to BMC's SEL in the purgatory. Since the purgatory code doesn't rely on the crashed kernel, we can run it safely after verifying the hash of the code. This patch set is the first step to the final goal; it provides a basic support for IPMI command execution in purgatory. IPMI specification defines multiple interfaces to BMC, and this patch set uses one of them, KCS I/F, which talks with BMC via I/O port like keyboard controllers. As a use case for that, options to start/stop BMC's watchdog timer before booting the second kernel are also provided. These options are useful for the cases where: - you want to automatically reboot the server when the second kernel hangs up while booting - you want to prevent the second kernel from being stopped by the watchdog timer enabled while the first kernel is running If the BMC doesn't work well, the IPMI command execution can take indefinite time and fail to boot the second kernel. To avoid this, timeout logic based on RTC polling is also implemented. NOTE: This is an RFC version, so some parts are incomplete; these codes are unconditionally built into the kexec binary, and I/O ports for KCS I/F and timeout (5 seconds) are hard-coded, and etc. Future plan: Add an option to save the panic message and instruction pointers to BMC's SEL in purgatory. To realize this, we first need to pass the panic message to the purgatory. Instruction pointers are already passed to the second kernel through ELF notes, so just read them. --- Hidehiro Kawai (4): purgatory/ipmi: Support BMC watchdog timer start/stop in purgatory purgatory: Introduce timeout API purgatory/x86: Support CMOS RTC purgatory/ipmi: Add timeout logic to IPMI command processing kexec/ipmi.h |9 + kexec/kexec.c | 18 ++ kexec/kexec.h |6 + purgatory/Makefile |5 + purgatory/arch/i386/Makefile |1 purgatory/arch/i386/rtc_cmos.c | 104 ++ purgatory/arch/x86_64/Makefile |1 purgatory/include/purgatory.h |3 purgatory/include/time.h | 33 + purgatory/ipmi.c | 293 purgatory/purgatory.c |4 + purgatory/time.c | 58 12 files changed, 533 insertions(+), 2 deletions(-) create mode 100644 kexec/ipmi.h create mode
Tool for creating full vmcore
Hello, I've written a tool that takes an coredump of physical kernel memory and converts it to a coredump of virtual kernel memory that can be directly used by gdb to debug a kernel. I was trying to use /proc/vmcore, but that did not include any of vmalloc memory or vmemmap, which made it kind of useless. I thought about adding all that to the vmcore, but that didn't seem like the proper way to do things. The tool is at https://github.com/cminyard/kdump-tool It can create a physical memory coredump (like the kdump tool) and a virtual memory coredump (either from a physical memory one, or directly from /proc/vmcore and /dev/mem). Two kernel patches are in the kernel-patches directory of the tool, on the master branch: One adds the valid physical memory ranges to the notes in the core file. Memory doesn't always start from zero, some systems have more than one OS running on the same hardware, and memory often has holes and places that are bad to read from. It seems prudent to pass in these ranges so the coredump is only the memory required. It also adds the physical address of the kernel page directory to the notes, so it can parse the page tables. The other is MIPS specific. You need a whole bunch of information about the configuration to parse the MIPS page tables. One cool thing this should be able to do is create a coredump for userspace processes. You can look into kernel memory to find the page table and pass that in to the tool. It should be able to extract the process' resident memory from a physical coredump. Of course, it will only dump resident memory, anything on disk will not be there. Is this interesting to the kdump community? I'd like to include it in kexec, if possible. Thanks, -corey ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec