https://bugzilla.kernel.org/show_bug.cgi?id=202409

            Bug ID: 202409
           Summary: Dell Precision 3520 - MCE hardware errors - CACHE
                    Level-2 Generic Error
           Product: ACPI
           Version: 2.5
    Kernel Version: 5.0.0-rc2
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: BIOS
          Assignee: [email protected]
          Reporter: [email protected]
        Regression: No

Created attachment 280749
  --> https://bugzilla.kernel.org/attachment.cgi?id=280749&action=edit
Sleepgraph timeline

We run around 3000 iterations of S3 suspend in our weekly stress tests, and we
discovered an issue on the Dell Precision 3520. We consistently see MCE
hardware errors in about 33% of test runs (roughly one run in three).
This is the mcelog output from a single test run:

Hardware event. This is not a software error.
MCE 0
CPU 0 BANK 6
MISC 3880000086 ADDR fef20080
TIME 1548375775 Thu Jan 24 16:22:55 2019
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: corrected filtering (some unreported errors in same region)
Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0
MCGCAP c0a APICID 0 SOCKETID 0
MICROCODE c6
CPUID Vendor Intel Family 6 Model 94
Hardware event. This is not a software error.
MCE 1
CPU 0 BANK 7
MISC 3880000086 ADDR fef200c0
TIME 1548375775 Thu Jan 24 16:22:55 2019
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: corrected filtering (some unreported errors in same region)
Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0
MCGCAP c0a APICID 0 SOCKETID 0
MICROCODE c6
CPUID Vendor Intel Family 6 Model 94
Hardware event. This is not a software error.
MCE 2
CPU 0 BANK 8
MISC 3880000086 ADDR fef20000
TIME 1548375775 Thu Jan 24 16:22:55 2019
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: corrected filtering (some unreported errors in same region)
Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0
MCGCAP c0a APICID 0 SOCKETID 0
MICROCODE c6
CPUID Vendor Intel Family 6 Model 94
Hardware event. This is not a software error.
MCE 3
CPU 0 BANK 9
MISC 3880000086 ADDR fef20040
TIME 1548375775 Thu Jan 24 16:22:55 2019
MCG status:
MCi status:
Error overflow
Uncorrected error
MCi_MISC register valid
MCi_ADDR register valid
Processor context corrupt
MCA: corrected filtering (some unreported errors in same region)
Generic CACHE Level-2 Generic Error
STATUS ee2000000040110a MCGSTATUS 0
MCGCAP c0a APICID 0 SOCKETID 0
MICROCODE c6
CPUID Vendor Intel Family 6 Model 94
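
For reference, the STATUS value reported for all four banks can be decoded by
hand. The following is a minimal Python sketch, assuming the architectural
IA32_MCi_STATUS bit layout described in the Intel SDM; the function name
decode_mci_status and the dictionary keys are made up for illustration and are
not part of mcelog or the kernel.

```python
# Sketch only: assumes the architectural IA32_MCi_STATUS layout (Intel SDM).
# Names below are illustrative, not taken from mcelog.
STATUS_FLAGS = {
    63: "VAL",    # register contents valid
    62: "OVER",   # error overflow
    61: "UC",     # uncorrected error
    60: "EN",     # error reporting enabled
    59: "MISCV",  # MCi_MISC register valid
    58: "ADDRV",  # MCi_ADDR register valid
    57: "PCC",    # processor context corrupt
}


def decode_mci_status(status):
    """Split a 64-bit MCi_STATUS value into flags and error-code fields."""
    flags = [
        name
        for bit, name in sorted(STATUS_FLAGS.items(), reverse=True)
        if (status >> bit) & 1
    ]
    mca_code = status & 0xFFFF  # bits 15:0, MCA error code
    return {
        "flags": flags,
        "model_specific_code": (status >> 16) & 0xFFFF,  # bits 31:16
        "mca_code": mca_code,
        # Bit 12 of the MCA error code is the correction report filtering bit.
        "filtered": bool(mca_code & (1 << 12)),
        # Cache hierarchy errors match 000F 0001 RRRR TTLL (F masked out here).
        "is_cache_hierarchy": (mca_code & 0xEF00) == 0x0100,
        # LL field: 0 = L0, 1 = L1, 2 = L2, 3 = generic.
        "cache_level": mca_code & 0x3,
    }


if __name__ == "__main__":
    decoded = decode_mci_status(0xEE2000000040110A)
    for key, value in decoded.items():
        print(key, value)
```

For STATUS ee2000000040110a this gives VAL, OVER, UC, MISCV, ADDRV and PCC set
with EN clear, and MCA error code 0x110a (filter bit set, cache hierarchy
error, level 2), which agrees with the mcelog text above.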

The sleepgraph timeline for this run is attached (this info is in the log as
well).

