Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
On Tue, Feb 21, 2017 at 08:37:21PM +0800, Xunlei Pang wrote: > -/* If this CPU is offline, just bail out. */ > -if (cpu_is_offline(smp_processor_id())) { > +/* > + * Cases to bail out to avoid rendezvous process timeout: > + * 1)If crashing_cpu was set, e.g. entering kdump, > + * we need to skip cpus remaining in 1st kernel. > + * 2)If this CPU is offline. > + */ > +if (crashing_cpu != -1 || > +cpu_is_offline(smp_processor_id())) { You're still not letting the crashing_cpu enter the #MC handler. You need to handle the MCE no matter how short the window is. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
On 02/20/2017 at 07:09 PM, Borislav Petkov wrote: > On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote: >> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long >> error_code) >> */ >> int lmce = 1; >> >> -/* If this CPU is offline, just bail out. */ >> -if (cpu_is_offline(smp_processor_id())) { >> +/* If nmi shootdown happened or this CPU is offline, just bail out. */ >> +if (cpus_shotdown() || > I don't like "cpus_shotdown" - it doesn't hint at all that this is > special-handling crash/kdump. > > And more importantly, I want it to be obvious that we do let the > crashing CPU into the MCE handler. Hi Boris, I made some improvements, what do you think the following one? If you think it is fine, I can send out v3. Thanks for your time! --- arch/x86/include/asm/reboot.h| 1 + arch/x86/kernel/cpu/mcheck/mce.c | 11 +-- arch/x86/kernel/reboot.c | 5 +++-- 3 files changed, 13 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h index 2cb1cc2..fc62ba8 100644 --- a/arch/x86/include/asm/reboot.h +++ b/arch/x86/include/asm/reboot.h @@ -15,6 +15,7 @@ struct machine_ops { }; extern struct machine_ops machine_ops; +extern int crashing_cpu; void native_machine_crash_shutdown(struct pt_regs *regs); void native_machine_shutdown(void); diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 8e9725c..7f53145 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -49,6 +49,7 @@ #include #include #include +#include #include "mce-internal.h" @@ -1128,8 +1129,14 @@ void do_machine_check(struct pt_regs *regs, long error_code) */ int lmce = 1; -/* If this CPU is offline, just bail out. */ -if (cpu_is_offline(smp_processor_id())) { +/* + * Cases to bail out to avoid rendezvous process timeout: + * 1)If crashing_cpu was set, e.g. entering kdump, + * we need to skip cpus remaining in 1st kernel. + * 2)If this CPU is offline. + */ +if (crashing_cpu != -1 || +cpu_is_offline(smp_processor_id())) { u64 mcgstatus; mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS); diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index e244c19..92ecf4b 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -749,10 +749,11 @@ void machine_crash_shutdown(struct pt_regs *regs) #endif +/* This keeps a track of which one is crashing cpu. */ +int crashing_cpu = -1; + #if defined(CONFIG_SMP) -/* This keeps a track of which one is crashing cpu. */ -static int crashing_cpu; static nmi_shootdown_cb shootdown_callback; static atomic_t waiting_for_crash_ipi; -- 1.8.3.1 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
On Tue, Feb 21, 2017 at 09:28:20AM +0800, Xunlei Pang wrote: > Not kdump kernel starts dumping, just during nmi_shootdown_cpus(), if some > MCE comes after crashing_cpu was set and we don't skip crashing_cpu, then > the crashing cpu will enter mce handler and trigger the synchronization issue. Ok, I went and looked at __crash_kexec() - so we do nmi_shootdown_cpus() as part of machine_crash_shutdown(). The next thing we do is kexec the new kernel in machine_kexec(kexec_crash_image). The picture is clearer now. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
On 02/21/2017 at 04:26 AM, Borislav Petkov wrote: > On Mon, Feb 20, 2017 at 09:29:24PM +0800, Xunlei Pang wrote: >> There is a small window between crash and kdump kernel boot, so >> if a SRAO comes within this window it will also cause the mce >> synchronization problem on the crashing cpu if we don't bail out the >> crashing cpu. > You mean, in the window between, kdump kernel starts writing out memory > and the second, kexec-ed kernel? Not kdump kernel starts dumping, just during nmi_shootdown_cpus(), if some MCE comes after crashing_cpu was set and we don't skip crashing_cpu, then the crashing cpu will enter mce handler and trigger the synchronization issue. > > If so, please add that information to the place in do_machine_check() > where we check crashing_cpu so that we know why we're doing this > temporary ignore of #MC. Ok, will add, thanks for the feedback. Regards, Xunlei ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
On 02/20/2017 at 09:29 PM, Xunlei Pang wrote: > On 02/20/2017 at 07:09 PM, Borislav Petkov wrote: >> On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote: >>> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long >>> error_code) >>> */ >>> int lmce = 1; >>> >>> - /* If this CPU is offline, just bail out. */ >>> - if (cpu_is_offline(smp_processor_id())) { >>> + /* If nmi shootdown happened or this CPU is offline, just bail out. */ >>> + if (cpus_shotdown() || >> I don't like "cpus_shotdown" - it doesn't hint at all that this is >> special-handling crash/kdump. >> >> And more importantly, I want it to be obvious that we do let the >> crashing CPU into the MCE handler. > Ok, I will export crashing_cpu and use it directly in mce handler. Forget to mention, one reason I introduced cpus_shotdown() is that "crashing_cpu" is defined with CONFIG_SMP=y, so we have to export it unconditionally if we don't want to add the conditional code(i.e. with #ifdef CONFIG_SMP quoted) in mce.c. Regards, Xunlei > >> Why? >> >> If we didn't, you will not handle *any* MCE, even a fatal one, during >> dumping memory so if that dump is corrupted from the MCE, you won't >> know. And I don't want to be the one staring at the corrupted dump and >> wondering why I'm seeing what I'm seeing. >> >> IOW, if we get a fatal MCE during dumping then we should go and die. >> This is much better than silently corrupting the dump and not even >> saying anything about it. >> > My thought is that it doesn't matter after kdump boots as new mce handler > will be installed. If we get a fatal MCE during kdumping, the new handler will > handle the cpus running kdump kernel correctly. > > There is a small window between crash and kdump kernel boot, so if a SRAO > comes > within this window it will also cause the mce synchronization problem on the > crashing > cpu if we don't bail out the crashing cpu. > > Regards, > Xunlei ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
On Mon, Feb 20, 2017 at 09:29:24PM +0800, Xunlei Pang wrote: > There is a small window between crash and kdump kernel boot, so > if a SRAO comes within this window it will also cause the mce > synchronization problem on the crashing cpu if we don't bail out the > crashing cpu. You mean, in the window between, kdump kernel starts writing out memory and the second, kexec-ed kernel? If so, please add that information to the place in do_machine_check() where we check crashing_cpu so that we know why we're doing this temporary ignore of #MC. Thanks. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
On 02/20/2017 at 07:09 PM, Borislav Petkov wrote: > On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote: >> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long >> error_code) >> */ >> int lmce = 1; >> >> -/* If this CPU is offline, just bail out. */ >> -if (cpu_is_offline(smp_processor_id())) { >> +/* If nmi shootdown happened or this CPU is offline, just bail out. */ >> +if (cpus_shotdown() || > I don't like "cpus_shotdown" - it doesn't hint at all that this is > special-handling crash/kdump. > > And more importantly, I want it to be obvious that we do let the > crashing CPU into the MCE handler. Ok, I will export crashing_cpu and use it directly in mce handler. > > Why? > > If we didn't, you will not handle *any* MCE, even a fatal one, during > dumping memory so if that dump is corrupted from the MCE, you won't > know. And I don't want to be the one staring at the corrupted dump and > wondering why I'm seeing what I'm seeing. > > IOW, if we get a fatal MCE during dumping then we should go and die. > This is much better than silently corrupting the dump and not even > saying anything about it. > My thought is that it doesn't matter after kdump boots as new mce handler will be installed. If we get a fatal MCE during kdumping, the new handler will handle the cpus running kdump kernel correctly. There is a small window between crash and kdump kernel boot, so if a SRAO comes within this window it will also cause the mce synchronization problem on the crashing cpu if we don't bail out the crashing cpu. Regards, Xunlei ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote: > @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long > error_code) >*/ > int lmce = 1; > > - /* If this CPU is offline, just bail out. */ > - if (cpu_is_offline(smp_processor_id())) { > + /* If nmi shootdown happened or this CPU is offline, just bail out. */ > + if (cpus_shotdown() || I don't like "cpus_shotdown" - it doesn't hint at all that this is special-handling crash/kdump. And more importantly, I want it to be obvious that we do let the crashing CPU into the MCE handler. Why? If we didn't, you will not handle *any* MCE, even a fatal one, during dumping memory so if that dump is corrupted from the MCE, you won't know. And I don't want to be the one staring at the corrupted dump and wondering why I'm seeing what I'm seeing. IOW, if we get a fatal MCE during dumping then we should go and die. This is much better than silently corrupting the dump and not even saying anything about it. -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec
[PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made
We met an issue for kdump: after kdump kernel boots up, and there comes a broadcasted mce in first kernel, the other cpus remaining in first kernel will enter the old mce handler of first kernel, then timeout and panic due to MCE synchronization, finally reset the kdump cpus. This patch lets cpus stay quiet after nmi_shootdown_cpus(), so before crash cpu shots them down or after kdump boots, they should not do anything except clearing MCG_STATUS in case of broadcasted mce. This is useful for kdump to let the vmcore dumping perform as hard as it can. Previous efforts: https://patchwork.kernel.org/patch/6167631/ https://lists.gt.net/linux/kernel/2146557 Cc: Naoya Horiguchi Suggested-by: Borislav Petkov Signed-off-by: Xunlei Pang --- v1->v2: Using crashing_cpu according to Borislav's suggestion. arch/x86/include/asm/reboot.h| 1 + arch/x86/kernel/cpu/mcheck/mce.c | 6 -- arch/x86/kernel/reboot.c | 16 +++- 3 files changed, 20 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h index 2cb1cc2..ec8657b6 100644 --- a/arch/x86/include/asm/reboot.h +++ b/arch/x86/include/asm/reboot.h @@ -26,5 +26,6 @@ struct machine_ops { typedef void (*nmi_shootdown_cb)(int, struct pt_regs*); void nmi_shootdown_cpus(nmi_shootdown_cb callback); void run_crash_ipi_callback(struct pt_regs *regs); +bool cpus_shotdown(void); #endif /* _ASM_X86_REBOOT_H */ diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c index 8e9725c..3b56710 100644 --- a/arch/x86/kernel/cpu/mcheck/mce.c +++ b/arch/x86/kernel/cpu/mcheck/mce.c @@ -49,6 +49,7 @@ #include #include #include +#include #include "mce-internal.h" @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long error_code) */ int lmce = 1; - /* If this CPU is offline, just bail out. */ - if (cpu_is_offline(smp_processor_id())) { + /* If nmi shootdown happened or this CPU is offline, just bail out. */ + if (cpus_shotdown() || + cpu_is_offline(smp_processor_id())) { u64 mcgstatus; mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS); diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c index e244c19..b301c8d 100644 --- a/arch/x86/kernel/reboot.c +++ b/arch/x86/kernel/reboot.c @@ -752,7 +752,7 @@ void machine_crash_shutdown(struct pt_regs *regs) #if defined(CONFIG_SMP) /* This keeps a track of which one is crashing cpu. */ -static int crashing_cpu; +static int crashing_cpu = -1; static nmi_shootdown_cb shootdown_callback; static atomic_t waiting_for_crash_ipi; @@ -852,6 +852,14 @@ void nmi_panic_self_stop(struct pt_regs *regs) } } +bool cpus_shotdown(void) +{ + if (crashing_cpu != -1) + return true; + + return false; +} + #else /* !CONFIG_SMP */ void nmi_shootdown_cpus(nmi_shootdown_cb callback) { @@ -861,4 +869,10 @@ void nmi_shootdown_cpus(nmi_shootdown_cb callback) void run_crash_ipi_callback(struct pt_regs *regs) { } + +bool cpus_shotdown(void) +{ + return false; +} + #endif -- 1.8.3.1 ___ kexec mailing list kexec@lists.infradead.org http://lists.infradead.org/mailman/listinfo/kexec