Re: Re: Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don't participate in rendezvous process

2019-05-30 Thread Tony W Wang-oc

On Fri, May 31, 2019, Raj, Ashok wrote:
> On Thu, May 30, 2019 at 09:13:39AM +, Tony W Wang-oc wrote:
> > On Thu, May 30, 2019, Tony W Wang-oc wrote:
> > > Hi Ashok,
> > > I have two questions about this patch, could you help to check:
> > >
> > > 1, for broadcast #MC exceptions, this patch seems require #MC exception
> > > errors set MCG_STATUS_RIPV = 1.
> > > But for Intel CPU, some #MC exception errors set MCG_STATUS_RIPV = 0
> > > (like "Recoverable-not-continuable SRAR Type" Errors), for these errors
> > > the patch doesn't seem to work, is that okay?
> > >
> > > 2, for LMCE exceptions, this patch seems require #MC exception errors
> > > set MCG_STATUS_RIPV = 0 to make sure LMCE be handled normally even
> > > on offline CPU.
> > > For LMCE errors set MCG_STATUS_RIPV = 1, the patch prevents offline CPU
> > > handle these LMCE errors, is that okay?
> > >
> >
> > More specifically, this patch seems require #MC exceptions meet the
> > condition "MCG_STATUS_RIPV ^ MCG_STATUS_LMCES == 1"; But on a Xeon X5650
> > machine (SMP),
> 
> The offline CPU will never get a LMCE=1, since those only happen on the CPU
> that's doing active work. Offline CPUs just sitting in idle.
> 
> The specific error here is a PCC=1, so irrespective of what happens
> We do capture the errors in the per-cpu log, and kernel would panic.
> 
> What specifically this patch tries to achieve is to leave an error
> sitting with MCG-STATUS.MCIP=1 and another recoverable error would shut
> the system down.
Yes, I agree with you on this point.

But for question 1: when some #MC exception errors are broadcast to an offline
CPU, such as "Recoverable-not-continuable SRAR Type" errors, which set
MCG_STATUS_RIPV = 0 and PCC = 0, is there also the problem of "Kernel panic -
not syncing: Timeout: Not all CPUs entered broadcast exception handler"?

Thanks
> 
> I don't see anything wrong with what this patch does..
> 
> > "Data CACHE Level-2 Generic Error" does not meet this condition.
> >
> > I got below message from:
> > https://www.centos.org/forums/viewtopic.php?p=292742
> >
> > Hardware event. This is not a software error.
> > MCE 0
> > CPU 4 BANK 6 TSC b7065eeaa18b0
> > TIME 1545643603 Mon Dec 24 10:26:43 2018
> > MCG status:MCIP
> > MCi status:
> > Uncorrected error
> > Error enabled
> > Processor context corrupt
> > MCA: Data CACHE Level-2 Generic Error
> > STATUS b2008106 MCGSTATUS 4
> > MCGCAP 1c09 APICID 4 SOCKETID 0
> > CPUID Vendor Intel Family 6 Model 44
> >
> > > Thanks
> > > Tony W Wang-oc

Re: Re: Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don't participate in rendezvous process

2019-05-30 Thread David Wang

> -Original Mail-
> Sender: Raj, Ashok 
> Time: 2019.05.31 1:11
> To : Tony W Wang-oc 
> CC: tip...@zytor.com; b...@suse.de; h...@zytor.com;
> linux-e...@vger.kernel.org; linux-kernel@vger.kernel.org;
> linux-tip-comm...@vger.kernel.org; mi...@kernel.org; pet...@infradead.org;
> sta...@vger.kernel.org; t...@linutronix.de; tony.l...@intel.com;
> torva...@linux-foundation.org; David Wang ; Ashok
> Raj 
> Topic: Re: Re: Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don't
> participate in rendezvous process
> 
> On Thu, May 30, 2019 at 09:13:39AM +, Tony W Wang-oc wrote:
> > On Thu, May 30, 2019, Tony W Wang-oc wrote:
> > > Hi Ashok,
> > > I have two questions about this patch, could you help to check:
> > >
> > > 1, for broadcast #MC exceptions, this patch seems require #MC
> > > exception errors set MCG_STATUS_RIPV = 1.
> > > But for Intel CPU, some #MC exception errors set MCG_STATUS_RIPV = 0
> > > (like "Recoverable-not-continuable SRAR Type" Errors), for these
> > > errors the patch doesn't seem to work, is that okay?
> > >
> > > 2, for LMCE exceptions, this patch seems require #MC exception
> > > errors set MCG_STATUS_RIPV = 0 to make sure LMCE be handled normally
> > > even on offline CPU.
> > > For LMCE errors set MCG_STATUS_RIPV = 1, the patch prevents offline
> > > CPU handle these LMCE errors, is that okay?
> > >
> >
> > More specifically, this patch seems require #MC exceptions meet the
> > condition "MCG_STATUS_RIPV ^ MCG_STATUS_LMCES == 1"; But on a Xeon
> > X5650 machine (SMP),
> 
> The offline CPU will never get a LMCE=1, since those only happen on the CPU
> that's doing active work. Offline CPUs just sitting in idle.
So, for Intel CPUs, is LMCE only for thread-level (or core-level) errors? If
not, suppose two threads share a level-2 cache, thread 0 is active, and thread 1
was offlined by software. When an MCE for this level-2 cache occurs, thread 1
will become active. When thread 1 reads mcgstatus.lmce, will the result always
be 0?

Thanks.
> 
> The specific error here is a PCC=1, so irrespective of what happens We do
> capture the errors in the per-cpu log, and kernel would panic.
> 
> What specifically this patch tries to achieve is to leave an error sitting
> with MCG-STATUS.MCIP=1 and another recoverable error would shut the system
> down.
> 
> I don't see anything wrong with what this patch does..
> 
> > "Data CACHE Level-2 Generic Error" does not meet this condition.
> >
> > I got below message from:
> > https://www.centos.org/forums/viewtopic.php?p=292742
> >
> > Hardware event. This is not a software error.
> > MCE 0
> > CPU 4 BANK 6 TSC b7065eeaa18b0
> > TIME 1545643603 Mon Dec 24 10:26:43 2018
> > MCG status:MCIP
> > MCi status:
> > Uncorrected error
> > Error enabled
> > Processor context corrupt
> > MCA: Data CACHE Level-2 Generic Error
> > STATUS b2008106 MCGSTATUS 4
> > MCGCAP 1c09 APICID 4 SOCKETID 0
> > CPUID Vendor Intel Family 6 Model 44
> >
> > > Thanks
> > > Tony W Wang-oc

Re: Re: Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don't participate in rendezvous process

2019-05-30 Thread Raj, Ashok

On Thu, May 30, 2019 at 09:13:39AM +, Tony W Wang-oc wrote:
> On Thu, May 30, 2019, Tony W Wang-oc wrote:
> > Hi Ashok,
> > I have two questions about this patch, could you help to check:
> > 
> > 1, for broadcast #MC exceptions, this patch seems require #MC exception
> > errors set MCG_STATUS_RIPV = 1.
> > But for Intel CPU, some #MC exception errors set MCG_STATUS_RIPV = 0
> > (like "Recoverable-not-continuable SRAR Type" Errors), for these errors
> > the patch doesn't seem to work, is that okay?
> > 
> > 2, for LMCE exceptions, this patch seems require #MC exception errors
> > set MCG_STATUS_RIPV = 0 to make sure LMCE be handled normally even
> > on offline CPU.
> > For LMCE errors set MCG_STATUS_RIPV = 1, the patch prevents offline CPU
> > handle these LMCE errors, is that okay?
> > 
> 
> More specifically, this patch seems require #MC exceptions meet the condition
> "MCG_STATUS_RIPV ^ MCG_STATUS_LMCES == 1"; But on a Xeon X5650 machine (SMP), 

The offline CPU will never get a LMCE=1, since those only happen on the CPU 
that's doing active work. Offline CPUs just sitting in idle.

The specific error here is a PCC=1, so irrespective of what happens
We do capture the errors in the per-cpu log, and kernel would panic. 

What specifically this patch tries to achieve is to leave an error
sitting with MCG-STATUS.MCIP=1 and another recoverable error would shut the
system down.

I don't see anything wrong with what this patch does.. 

> "Data CACHE Level-2 Generic Error" does not meet this condition.
> 
> I got below message from: https://www.centos.org/forums/viewtopic.php?p=292742
> 
> Hardware event. This is not a software error.
> MCE 0
> CPU 4 BANK 6 TSC b7065eeaa18b0 
> TIME 1545643603 Mon Dec 24 10:26:43 2018
> MCG status:MCIP 
> MCi status:
> Uncorrected error
> Error enabled
> Processor context corrupt
> MCA: Data CACHE Level-2 Generic Error
> STATUS b2008106 MCGSTATUS 4
> MCGCAP 1c09 APICID 4 SOCKETID 0 
> CPUID Vendor Intel Family 6 Model 44
> 
> > Thanks
> > Tony W Wang-oc

Re: Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don't participate in rendezvous process

2019-05-30 Thread Tony W Wang-oc

On Thu, May 30, 2019, Tony W Wang-oc wrote:
> Hi Ashok,
> I have two questions about this patch, could you help to check:
> 
> 1, for broadcast #MC exceptions, this patch seems require #MC exception
> errors set MCG_STATUS_RIPV = 1.
> But for Intel CPU, some #MC exception errors set MCG_STATUS_RIPV = 0
> (like "Recoverable-not-continuable SRAR Type" Errors), for these errors
> the patch doesn't seem to work, is that okay?
> 
> 2, for LMCE exceptions, this patch seems require #MC exception errors
> set MCG_STATUS_RIPV = 0 to make sure LMCE be handled normally even
> on offline CPU.
> For LMCE errors set MCG_STATUS_RIPV = 1, the patch prevents offline CPU
> handle these LMCE errors, is that okay?
> 

More specifically, this patch seems to require that #MC exceptions meet the
condition "MCG_STATUS_RIPV ^ MCG_STATUS_LMCES == 1". But on a Xeon X5650
machine (SMP), "Data CACHE Level-2 Generic Error" does not meet this condition.

I got the message below from: https://www.centos.org/forums/viewtopic.php?p=292742

Hardware event. This is not a software error.
MCE 0
CPU 4 BANK 6 TSC b7065eeaa18b0 
TIME 1545643603 Mon Dec 24 10:26:43 2018
MCG status:MCIP 
MCi status:
Uncorrected error
Error enabled
Processor context corrupt
MCA: Data CACHE Level-2 Generic Error
STATUS b2008106 MCGSTATUS 4
MCGCAP 1c09 APICID 4 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 44

> Thanks
> Tony W Wang-oc
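
To make the condition above concrete, here is a minimal C sketch that decodes
the MCGSTATUS value from the record. It assumes the usual IA32_MCG_STATUS bit
layout (RIPV = bit 0, EIPV = bit 1, MCIP = bit 2, LMCES = bit 3), matching the
kernel's MCG_STATUS_* macros. The struct and function names are hypothetical
and exist only for illustration; this is not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_MCG_STATUS bits, mirroring the kernel's MCG_STATUS_* macros. */
#define MCG_STATUS_RIPV  (1ULL << 0)  /* restart IP valid */
#define MCG_STATUS_EIPV  (1ULL << 1)  /* error IP valid */
#define MCG_STATUS_MCIP  (1ULL << 2)  /* machine check in progress */
#define MCG_STATUS_LMCES (1ULL << 3)  /* local machine check signaled */

/* Hypothetical helper type, for illustration only. */
struct mcg_flags {
	bool ripv, eipv, mcip, lmces;
};

/* Split a raw MCG_STATUS value into its individual flags. */
struct mcg_flags decode_mcgstatus(uint64_t v)
{
	struct mcg_flags f = {
		.ripv  = (v & MCG_STATUS_RIPV)  != 0,
		.eipv  = (v & MCG_STATUS_EIPV)  != 0,
		.mcip  = (v & MCG_STATUS_MCIP)  != 0,
		.lmces = (v & MCG_STATUS_LMCES) != 0,
	};
	return f;
}

/* The condition asked about above: exactly one of RIPV and LMCES is set. */
bool ripv_xor_lmces(uint64_t v)
{
	struct mcg_flags f = decode_mcgstatus(v);
	return f.ripv != f.lmces;
}
```

For the record above, decode_mcgstatus(0x4) reports only MCIP set, so
ripv_xor_lmces(0x4) is false, which is the case being asked about.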

Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don't participate in rendezvous process

2019-05-29 Thread Tony W Wang-oc

Hi Ashok,
I have two questions about this patch; could you help check them?

1. For broadcast #MC exceptions, this patch seems to require that #MC exception
errors set MCG_STATUS_RIPV = 1.
But on Intel CPUs, some #MC exception errors set MCG_STATUS_RIPV = 0
(like "Recoverable-not-continuable SRAR Type" errors). For these errors
the patch doesn't seem to work; is that okay?

2. For LMCE exceptions, this patch seems to require that #MC exception errors
set MCG_STATUS_RIPV = 0 so that LMCE is handled normally even
on an offline CPU.
For LMCE errors that set MCG_STATUS_RIPV = 1, the patch prevents the offline
CPU from handling these LMCE errors; is that okay?

Thanks
Tony W Wang-oc

Re: [tip:x86/urgent] x86/mce: Ensure offline CPUs don't participate in rendezvous process

2019-05-29 Thread Tony W Wang-oc

Hi,
Does this patch require all #MC exception errors to set MCG_STATUS_RIPV = 1?
On offline CPUs, for #MC exception errors that set MCG_STATUS_RIPV = 0
(like "Recoverable-not-continuable SRAR Type" errors), this patch doesn't seem
to work. Is this patch's "return;" in the wrong place?

Thanks
Tony W Wang-oc

[tip:x86/urgent] x86/mce: Ensure offline CPUs don't participate in rendezvous process

2015-12-19 Thread tip-bot for Ashok Raj

Commit-ID:  d90167a941f62860f35eb960e1012aa2d30e7e94
Gitweb: http://git.kernel.org/tip/d90167a941f62860f35eb960e1012aa2d30e7e94
Author: Ashok Raj 
AuthorDate: Thu, 10 Dec 2015 11:12:26 +0100
Committer:  Thomas Gleixner 
CommitDate: Sat, 19 Dec 2015 09:55:31 +0100

x86/mce: Ensure offline CPUs don't participate in rendezvous process

Intel's MCA implementation broadcasts MCEs to all CPUs on the
node. This poses a problem for offlined CPUs which cannot
participate in the rendezvous process:

  Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
  Kernel Offset: disabled
  Rebooting in 100 seconds..

More specifically, Linux does a soft offline of a CPU when
writing a 0 to /sys/devices/system/cpu/cpuX/online, which
doesn't prevent the #MC exception from being broadcasted to that
CPU.

Ensure that offline CPUs don't participate in the MCE rendezvous
and clear the RIP valid status bit so that a second MCE won't
cause a shutdown.

Without the patch, mce_start() will increment mce_callin and
wait for all CPUs. Offlined CPUs should avoid participating in
the rendezvous process altogether.

Signed-off-by: Ashok Raj 
[ Massage commit message. ]
Signed-off-by: Borislav Petkov 
Reviewed-by: Tony Luck 
Cc: 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-edac 
Link: http://lkml.kernel.org/r/1449742346-21470-2-git-send-email...@alien8.de
Signed-off-by: Ingo Molnar 
Signed-off-by: Thomas Gleixner 
---
 arch/x86/kernel/cpu/mcheck/mce.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index c5b0d56..7e8a736 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -999,6 +999,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	int flags = MF_ACTION_REQUIRED;
 	int lmce = 0;
 
+	/* If this CPU is offline, just bail out. */
+	if (cpu_is_offline(smp_processor_id())) {
+		u64 mcgstatus;
+
+		mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
+		if (mcgstatus & MCG_STATUS_RIPV) {
+			mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
+			return;
+		}
+	}
+
 	ist_enter(regs);
 
 	this_cpu_inc(mce_exception_count);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
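
The early-exit block in the diff above can be modelled in user space to see
which cases it covers. This is only a sketch under stated assumptions: struct
sim_cpu and offline_early_exit() are hypothetical stand-ins for the per-CPU
state that the real handler reads through cpu_is_offline() and the
MSR_IA32_MCG_STATUS accessors, and RIPV is assumed to be bit 0 of MCG_STATUS.

```c
#include <stdbool.h>
#include <stdint.h>

#define MCG_STATUS_RIPV (1ULL << 0)  /* restart IP valid */

/* Hypothetical stand-in for the state the real handler reads via MSRs. */
struct sim_cpu {
	bool offline;
	uint64_t mcg_status;
};

/*
 * Models the block added by the patch. Returns true when the handler
 * would return early, false when it would fall through to the rest of
 * do_machine_check(), including the rendezvous.
 */
bool offline_early_exit(struct sim_cpu *cpu)
{
	if (cpu->offline && (cpu->mcg_status & MCG_STATUS_RIPV)) {
		/* Mirrors mce_wrmsrl(MSR_IA32_MCG_STATUS, 0) in the patch. */
		cpu->mcg_status = 0;
		return true;
	}
	return false;
}
```

In this model an offline CPU with RIPV = 1 clears MCG_STATUS and returns early,
while an offline CPU with RIPV = 0 is not caught by the check and continues
into the normal handler path.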

[tip:x86/urgent] x86/mce: Ensure offline CPUs don't participate in rendezvous process

2015-12-14 Thread tip-bot for Ashok Raj

Commit-ID:  06f337b7c7eb86254c86e8e717273d1e356d5a1b
Gitweb: http://git.kernel.org/tip/06f337b7c7eb86254c86e8e717273d1e356d5a1b
Author: Ashok Raj 
AuthorDate: Thu, 10 Dec 2015 11:12:26 +0100
Committer:  Ingo Molnar 
CommitDate: Fri, 11 Dec 2015 08:59:48 +0100

x86/mce: Ensure offline CPUs don't participate in rendezvous process

Intel's MCA implementation broadcasts MCEs to all CPUs on the
node. This poses a problem for offlined CPUs which cannot
participate in the rendezvous process:

  Kernel panic - not syncing: Timeout: Not all CPUs entered broadcast exception handler
  Kernel Offset: disabled
  Rebooting in 100 seconds..

More specifically, Linux does a soft offline of a CPU when
writing a 0 to /sys/devices/system/cpu/cpuX/online, which
doesn't prevent the #MC exception from being broadcasted to that
CPU.

Ensure that offline CPUs don't participate in the MCE rendezvous
and clear the RIP valid status bit so that a second MCE won't
cause a shutdown.

Without the patch, mce_start() will increment mce_callin and
wait for all CPUs. Offlined CPUs should avoid participating in
the rendezvous process altogether.

Signed-off-by: Ashok Raj 
[ Massage commit message. ]
Signed-off-by: Borislav Petkov 
Reviewed-by: Tony Luck 
Cc: 
Cc: H. Peter Anvin 
Cc: Linus Torvalds 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: linux-edac 
Link: http://lkml.kernel.org/r/1449742346-21470-2-git-send-email...@alien8.de
Signed-off-by: Ingo Molnar 
---
 arch/x86/kernel/cpu/mcheck/mce.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index c5b0d56..7e8a736 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -999,6 +999,17 @@ void do_machine_check(struct pt_regs *regs, long error_code)
 	int flags = MF_ACTION_REQUIRED;
 	int lmce = 0;
 
+	/* If this CPU is offline, just bail out. */
+	if (cpu_is_offline(smp_processor_id())) {
+		u64 mcgstatus;
+
+		mcgstatus = mce_rdmsrl(MSR_IA32_MCG_STATUS);
+		if (mcgstatus & MCG_STATUS_RIPV) {
+			mce_wrmsrl(MSR_IA32_MCG_STATUS, 0);
+			return;
+		}
+	}
+
 	ist_enter(regs);
 
 	this_cpu_inc(mce_exception_count);
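
The commit message says that mce_start() increments mce_callin and waits for
all CPUs. The sketch below is a simplified, single-threaded model of that kind
of call-in rendezvous, meant only to show why one CPU that never checks in
leads to the timeout panic. It is not the kernel's implementation: the function
name, the arrivals array, and the tick-based timeout are all assumptions made
for illustration.

```c
#include <stdbool.h>

/*
 * Simplified model of a call-in rendezvous.
 *
 * arrivals[i]   tick at which CPU i checks in, or -1 if it never does
 * expected      number of CPUs the waiter expects to check in
 * timeout_ticks how many ticks the waiter spins before giving up
 *
 * Returns true if every expected CPU checked in before the timeout.
 */
bool rendezvous_completes(const int *arrivals, int expected, int timeout_ticks)
{
	for (int tick = 0; tick <= timeout_ticks; tick++) {
		int callin = 0;  /* plays the role of mce_callin */

		for (int i = 0; i < expected; i++)
			if (arrivals[i] >= 0 && arrivals[i] <= tick)
				callin++;

		if (callin == expected)
			return true;
	}

	/* Corresponds to "Not all CPUs entered broadcast exception handler". */
	return false;
}
```

With arrivals of {0, 1, -1, 3} and four expected CPUs, the model never reaches
a full call-in count and reports a timeout, which is the situation the patch
avoids by keeping offline CPUs out of the rendezvous.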
