Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-21 Thread Borislav Petkov
On Tue, Feb 21, 2017 at 08:37:21PM +0800, Xunlei Pang wrote:
> -/* If this CPU is offline, just bail out. */
> -if (cpu_is_offline(smp_processor_id())) {
> +/*
> + * Cases to bail out to avoid rendezvous process timeout:
> + * 1)If crashing_cpu was set, e.g. entering kdump,
> + *   we need to skip cpus remaining in 1st kernel.
> + * 2)If this CPU is offline.
> + */
> +if (crashing_cpu != -1 ||
> +cpu_is_offline(smp_processor_id())) {

You're still not letting the crashing_cpu enter the #MC handler. You
need to handle the MCE no matter how short the window is.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-21 Thread Borislav Petkov
On Tue, Feb 21, 2017 at 08:37:21PM +0800, Xunlei Pang wrote:
> -/* If this CPU is offline, just bail out. */
> -if (cpu_is_offline(smp_processor_id())) {
> +/*
> + * Cases to bail out to avoid rendezvous process timeout:
> + * 1)If crashing_cpu was set, e.g. entering kdump,
> + *   we need to skip cpus remaining in 1st kernel.
> + * 2)If this CPU is offline.
> + */
> +if (crashing_cpu != -1 ||
> +cpu_is_offline(smp_processor_id())) {

You're still not letting the crashing_cpu enter the #MC handler. You
need to handle the MCE no matter how short the window is.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-21 Thread Xunlei Pang
On 02/20/2017 at 07:09 PM, Borislav Petkov wrote:
> On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote:
>> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long 
>> error_code)
>>   */
>>  int lmce = 1;
>>  
>> -/* If this CPU is offline, just bail out. */
>> -if (cpu_is_offline(smp_processor_id())) {
>> +/* If nmi shootdown happened or this CPU is offline, just bail out. */
>> +if (cpus_shotdown() ||
> I don't like "cpus_shotdown" - it doesn't hint at all that this is
> special-handling crash/kdump.
>
> And more importantly, I want it to be obvious that we do let the
> crashing CPU into the MCE handler.

Hi Boris,

I made some improvements, what do you think the following one?
If you think it is fine, I can send out v3. Thanks for your time!

---
 arch/x86/include/asm/reboot.h|  1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 11 +--
 arch/x86/kernel/reboot.c |  5 +++--
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
index 2cb1cc2..fc62ba8 100644
--- a/arch/x86/include/asm/reboot.h
+++ b/arch/x86/include/asm/reboot.h
@@ -15,6 +15,7 @@ struct machine_ops {
 };
 
 extern struct machine_ops machine_ops;
+extern int crashing_cpu;
 
 void native_machine_crash_shutdown(struct pt_regs *regs);
 void native_machine_shutdown(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 8e9725c..7f53145 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -49,6 +49,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "mce-internal.h"
 
@@ -1128,8 +1129,14 @@ void do_machine_check(struct pt_regs *regs, long 
error_code)
  */
 int lmce = 1;
 
-/* If this CPU is offline, just bail out. */
-if (cpu_is_offline(smp_processor_id())) {
+/*
+ * Cases to bail out to avoid rendezvous process timeout:
+ * 1)If crashing_cpu was set, e.g. entering kdump,
+ *   we need to skip cpus remaining in 1st kernel.
+ * 2)If this CPU is offline.
+ */
+if (crashing_cpu != -1 ||
+cpu_is_offline(smp_processor_id())) {
 u64 mcgstatus;
 
 mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index e244c19..92ecf4b 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -749,10 +749,11 @@ void machine_crash_shutdown(struct pt_regs *regs)
 #endif
 
 
+/* This keeps a track of which one is crashing cpu. */
+int crashing_cpu = -1;
+
 #if defined(CONFIG_SMP)
 
-/* This keeps a track of which one is crashing cpu. */
-static int crashing_cpu;
 static nmi_shootdown_cb shootdown_callback;
 
 static atomic_t waiting_for_crash_ipi;
-- 
1.8.3.1


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-21 Thread Xunlei Pang
On 02/20/2017 at 07:09 PM, Borislav Petkov wrote:
> On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote:
>> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long 
>> error_code)
>>   */
>>  int lmce = 1;
>>  
>> -/* If this CPU is offline, just bail out. */
>> -if (cpu_is_offline(smp_processor_id())) {
>> +/* If nmi shootdown happened or this CPU is offline, just bail out. */
>> +if (cpus_shotdown() ||
> I don't like "cpus_shotdown" - it doesn't hint at all that this is
> special-handling crash/kdump.
>
> And more importantly, I want it to be obvious that we do let the
> crashing CPU into the MCE handler.

Hi Boris,

I made some improvements, what do you think the following one?
If you think it is fine, I can send out v3. Thanks for your time!

---
 arch/x86/include/asm/reboot.h|  1 +
 arch/x86/kernel/cpu/mcheck/mce.c | 11 +--
 arch/x86/kernel/reboot.c |  5 +++--
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/reboot.h b/arch/x86/include/asm/reboot.h
index 2cb1cc2..fc62ba8 100644
--- a/arch/x86/include/asm/reboot.h
+++ b/arch/x86/include/asm/reboot.h
@@ -15,6 +15,7 @@ struct machine_ops {
 };
 
 extern struct machine_ops machine_ops;
+extern int crashing_cpu;
 
 void native_machine_crash_shutdown(struct pt_regs *regs);
 void native_machine_shutdown(void);
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 8e9725c..7f53145 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -49,6 +49,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "mce-internal.h"
 
@@ -1128,8 +1129,14 @@ void do_machine_check(struct pt_regs *regs, long 
error_code)
  */
 int lmce = 1;
 
-/* If this CPU is offline, just bail out. */
-if (cpu_is_offline(smp_processor_id())) {
+/*
+ * Cases to bail out to avoid rendezvous process timeout:
+ * 1)If crashing_cpu was set, e.g. entering kdump,
+ *   we need to skip cpus remaining in 1st kernel.
+ * 2)If this CPU is offline.
+ */
+if (crashing_cpu != -1 ||
+cpu_is_offline(smp_processor_id())) {
 u64 mcgstatus;
 
 mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index e244c19..92ecf4b 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -749,10 +749,11 @@ void machine_crash_shutdown(struct pt_regs *regs)
 #endif
 
 
+/* This keeps a track of which one is crashing cpu. */
+int crashing_cpu = -1;
+
 #if defined(CONFIG_SMP)
 
-/* This keeps a track of which one is crashing cpu. */
-static int crashing_cpu;
 static nmi_shootdown_cb shootdown_callback;
 
 static atomic_t waiting_for_crash_ipi;
-- 
1.8.3.1


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-21 Thread Borislav Petkov
On Tue, Feb 21, 2017 at 09:28:20AM +0800, Xunlei Pang wrote:
> Not kdump kernel starts dumping, just during nmi_shootdown_cpus(), if some
> MCE comes after crashing_cpu was set and we don't skip crashing_cpu, then
> the crashing cpu will enter mce handler and trigger the synchronization issue.

Ok, I went and looked at __crash_kexec() - so we do nmi_shootdown_cpus()
as part of machine_crash_shutdown(). The next thing we do is kexec the
new kernel in machine_kexec(kexec_crash_image). The picture is clearer
now.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-21 Thread Borislav Petkov
On Tue, Feb 21, 2017 at 09:28:20AM +0800, Xunlei Pang wrote:
> Not kdump kernel starts dumping, just during nmi_shootdown_cpus(), if some
> MCE comes after crashing_cpu was set and we don't skip crashing_cpu, then
> the crashing cpu will enter mce handler and trigger the synchronization issue.

Ok, I went and looked at __crash_kexec() - so we do nmi_shootdown_cpus()
as part of machine_crash_shutdown(). The next thing we do is kexec the
new kernel in machine_kexec(kexec_crash_image). The picture is clearer
now.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Xunlei Pang
On 02/21/2017 at 04:26 AM, Borislav Petkov wrote:
> On Mon, Feb 20, 2017 at 09:29:24PM +0800, Xunlei Pang wrote:
>> There is a small window between crash and kdump kernel boot, so
>> if a SRAO comes within this window it will also cause the mce
>> synchronization problem on the crashing cpu if we don't bail out the
>> crashing cpu.
> You mean, in the window between, kdump kernel starts writing out memory
> and the second, kexec-ed kernel?

Not kdump kernel starts dumping, just during nmi_shootdown_cpus(), if some
MCE comes after crashing_cpu was set and we don't skip crashing_cpu, then
the crashing cpu will enter mce handler and trigger the synchronization issue.

>
> If so, please add that information to the place in do_machine_check()
> where we check crashing_cpu so that we know why we're doing this
> temporary ignore of #MC.

Ok, will add, thanks for the feedback.

Regards,
Xunlei


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Xunlei Pang
On 02/21/2017 at 04:26 AM, Borislav Petkov wrote:
> On Mon, Feb 20, 2017 at 09:29:24PM +0800, Xunlei Pang wrote:
>> There is a small window between crash and kdump kernel boot, so
>> if a SRAO comes within this window it will also cause the mce
>> synchronization problem on the crashing cpu if we don't bail out the
>> crashing cpu.
> You mean, in the window between, kdump kernel starts writing out memory
> and the second, kexec-ed kernel?

Not kdump kernel starts dumping, just during nmi_shootdown_cpus(), if some
MCE comes after crashing_cpu was set and we don't skip crashing_cpu, then
the crashing cpu will enter mce handler and trigger the synchronization issue.

>
> If so, please add that information to the place in do_machine_check()
> where we check crashing_cpu so that we know why we're doing this
> temporary ignore of #MC.

Ok, will add, thanks for the feedback.

Regards,
Xunlei


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Xunlei Pang
On 02/20/2017 at 09:29 PM, Xunlei Pang wrote:
> On 02/20/2017 at 07:09 PM, Borislav Petkov wrote:
>> On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote:
>>> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long 
>>> error_code)
>>>  */
>>> int lmce = 1;
>>>  
>>> -   /* If this CPU is offline, just bail out. */
>>> -   if (cpu_is_offline(smp_processor_id())) {
>>> +   /* If nmi shootdown happened or this CPU is offline, just bail out. */
>>> +   if (cpus_shotdown() ||
>> I don't like "cpus_shotdown" - it doesn't hint at all that this is
>> special-handling crash/kdump.
>>
>> And more importantly, I want it to be obvious that we do let the
>> crashing CPU into the MCE handler.
> Ok, I will export crashing_cpu and use it directly in mce handler.

Forget to mention, one reason I introduced cpus_shotdown() is that 
"crashing_cpu"
is defined with CONFIG_SMP=y, so we have to export it unconditionally if we 
don't want
to add the conditional code(i.e. with #ifdef CONFIG_SMP quoted) in mce.c.

Regards,
Xunlei

>
>> Why?
>>
>> If we didn't, you will not handle *any* MCE, even a fatal one, during
>> dumping memory so if that dump is corrupted from the MCE, you won't
>> know. And I don't want to be the one staring at the corrupted dump and
>> wondering why I'm seeing what I'm seeing.
>>
>> IOW, if we get a fatal MCE during dumping then we should go and die.
>> This is much better than silently corrupting the dump and not even
>> saying anything about it.
>>
> My thought is that it doesn't matter after kdump boots as new mce handler
> will be installed. If we get a fatal MCE during kdumping, the new handler will
> handle the cpus running kdump kernel correctly.
>
> There is a small window between crash and kdump kernel boot, so if a SRAO 
> comes
> within this window it will also cause the mce synchronization problem on the 
> crashing
> cpu if we don't bail out the crashing cpu.
>
> Regards,
> Xunlei



Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Xunlei Pang
On 02/20/2017 at 09:29 PM, Xunlei Pang wrote:
> On 02/20/2017 at 07:09 PM, Borislav Petkov wrote:
>> On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote:
>>> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long 
>>> error_code)
>>>  */
>>> int lmce = 1;
>>>  
>>> -   /* If this CPU is offline, just bail out. */
>>> -   if (cpu_is_offline(smp_processor_id())) {
>>> +   /* If nmi shootdown happened or this CPU is offline, just bail out. */
>>> +   if (cpus_shotdown() ||
>> I don't like "cpus_shotdown" - it doesn't hint at all that this is
>> special-handling crash/kdump.
>>
>> And more importantly, I want it to be obvious that we do let the
>> crashing CPU into the MCE handler.
> Ok, I will export crashing_cpu and use it directly in mce handler.

Forget to mention, one reason I introduced cpus_shotdown() is that 
"crashing_cpu"
is defined with CONFIG_SMP=y, so we have to export it unconditionally if we 
don't want
to add the conditional code(i.e. with #ifdef CONFIG_SMP quoted) in mce.c.

Regards,
Xunlei

>
>> Why?
>>
>> If we didn't, you will not handle *any* MCE, even a fatal one, during
>> dumping memory so if that dump is corrupted from the MCE, you won't
>> know. And I don't want to be the one staring at the corrupted dump and
>> wondering why I'm seeing what I'm seeing.
>>
>> IOW, if we get a fatal MCE during dumping then we should go and die.
>> This is much better than silently corrupting the dump and not even
>> saying anything about it.
>>
> My thought is that it doesn't matter after kdump boots as new mce handler
> will be installed. If we get a fatal MCE during kdumping, the new handler will
> handle the cpus running kdump kernel correctly.
>
> There is a small window between crash and kdump kernel boot, so if a SRAO 
> comes
> within this window it will also cause the mce synchronization problem on the 
> crashing
> cpu if we don't bail out the crashing cpu.
>
> Regards,
> Xunlei



Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Borislav Petkov
On Mon, Feb 20, 2017 at 09:29:24PM +0800, Xunlei Pang wrote:
> There is a small window between crash and kdump kernel boot, so
> if a SRAO comes within this window it will also cause the mce
> synchronization problem on the crashing cpu if we don't bail out the
> crashing cpu.

You mean, in the window between, kdump kernel starts writing out memory
and the second, kexec-ed kernel?

If so, please add that information to the place in do_machine_check()
where we check crashing_cpu so that we know why we're doing this
temporary ignore of #MC.

Thanks.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Borislav Petkov
On Mon, Feb 20, 2017 at 09:29:24PM +0800, Xunlei Pang wrote:
> There is a small window between crash and kdump kernel boot, so
> if a SRAO comes within this window it will also cause the mce
> synchronization problem on the crashing cpu if we don't bail out the
> crashing cpu.

You mean, in the window between, kdump kernel starts writing out memory
and the second, kexec-ed kernel?

If so, please add that information to the place in do_machine_check()
where we check crashing_cpu so that we know why we're doing this
temporary ignore of #MC.

Thanks.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Xunlei Pang
On 02/20/2017 at 07:09 PM, Borislav Petkov wrote:
> On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote:
>> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long 
>> error_code)
>>   */
>>  int lmce = 1;
>>  
>> -/* If this CPU is offline, just bail out. */
>> -if (cpu_is_offline(smp_processor_id())) {
>> +/* If nmi shootdown happened or this CPU is offline, just bail out. */
>> +if (cpus_shotdown() ||
> I don't like "cpus_shotdown" - it doesn't hint at all that this is
> special-handling crash/kdump.
>
> And more importantly, I want it to be obvious that we do let the
> crashing CPU into the MCE handler.

Ok, I will export crashing_cpu and use it directly in mce handler.

>
> Why?
>
> If we didn't, you will not handle *any* MCE, even a fatal one, during
> dumping memory so if that dump is corrupted from the MCE, you won't
> know. And I don't want to be the one staring at the corrupted dump and
> wondering why I'm seeing what I'm seeing.
>
> IOW, if we get a fatal MCE during dumping then we should go and die.
> This is much better than silently corrupting the dump and not even
> saying anything about it.
>

My thought is that it doesn't matter after kdump boots as new mce handler
will be installed. If we get a fatal MCE during kdumping, the new handler will
handle the cpus running kdump kernel correctly.

There is a small window between crash and kdump kernel boot, so if a SRAO comes
within this window it will also cause the mce synchronization problem on the 
crashing
cpu if we don't bail out the crashing cpu.

Regards,
Xunlei


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Xunlei Pang
On 02/20/2017 at 07:09 PM, Borislav Petkov wrote:
> On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote:
>> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long 
>> error_code)
>>   */
>>  int lmce = 1;
>>  
>> -/* If this CPU is offline, just bail out. */
>> -if (cpu_is_offline(smp_processor_id())) {
>> +/* If nmi shootdown happened or this CPU is offline, just bail out. */
>> +if (cpus_shotdown() ||
> I don't like "cpus_shotdown" - it doesn't hint at all that this is
> special-handling crash/kdump.
>
> And more importantly, I want it to be obvious that we do let the
> crashing CPU into the MCE handler.

Ok, I will export crashing_cpu and use it directly in mce handler.

>
> Why?
>
> If we didn't, you will not handle *any* MCE, even a fatal one, during
> dumping memory so if that dump is corrupted from the MCE, you won't
> know. And I don't want to be the one staring at the corrupted dump and
> wondering why I'm seeing what I'm seeing.
>
> IOW, if we get a fatal MCE during dumping then we should go and die.
> This is much better than silently corrupting the dump and not even
> saying anything about it.
>

My thought is that it doesn't matter after kdump boots as new mce handler
will be installed. If we get a fatal MCE during kdumping, the new handler will
handle the cpus running kdump kernel correctly.

There is a small window between crash and kdump kernel boot, so if a SRAO comes
within this window it will also cause the mce synchronization problem on the 
crashing
cpu if we don't bail out the crashing cpu.

Regards,
Xunlei


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Borislav Petkov
On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote:
> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long 
> error_code)
>*/
>   int lmce = 1;
>  
> - /* If this CPU is offline, just bail out. */
> - if (cpu_is_offline(smp_processor_id())) {
> + /* If nmi shootdown happened or this CPU is offline, just bail out. */
> + if (cpus_shotdown() ||

I don't like "cpus_shotdown" - it doesn't hint at all that this is
special-handling crash/kdump.

And more importantly, I want it to be obvious that we do let the
crashing CPU into the MCE handler.

Why?

If we didn't, you will not handle *any* MCE, even a fatal one, during
dumping memory so if that dump is corrupted from the MCE, you won't
know. And I don't want to be the one staring at the corrupted dump and
wondering why I'm seeing what I'm seeing.

IOW, if we get a fatal MCE during dumping then we should go and die.
This is much better than silently corrupting the dump and not even
saying anything about it.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH v2] x86/mce: Don't participate in rendezvous process once nmi_shootdown_cpus() was made

2017-02-20 Thread Borislav Petkov
On Mon, Feb 20, 2017 at 02:10:37PM +0800, Xunlei Pang wrote:
> @@ -1128,8 +1129,9 @@ void do_machine_check(struct pt_regs *regs, long 
> error_code)
>*/
>   int lmce = 1;
>  
> - /* If this CPU is offline, just bail out. */
> - if (cpu_is_offline(smp_processor_id())) {
> + /* If nmi shootdown happened or this CPU is offline, just bail out. */
> + if (cpus_shotdown() ||

I don't like "cpus_shotdown" - it doesn't hint at all that this is
special-handling crash/kdump.

And more importantly, I want it to be obvious that we do let the
crashing CPU into the MCE handler.

Why?

If we didn't, you will not handle *any* MCE, even a fatal one, during
dumping memory so if that dump is corrupted from the MCE, you won't
know. And I don't want to be the one staring at the corrupted dump and
wondering why I'm seeing what I'm seeing.

IOW, if we get a fatal MCE during dumping then we should go and die.
This is much better than silently corrupting the dump and not even
saying anything about it.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.