Deferred errors indicate error conditions that were not corrected, but
require no action from software (or for which action is optional). These
errors provide information about a latent UC MCE that can occur when
poisoned data is consumed by the processor.

Newer AMD processors can generate deferred errors and can be configured
to generate APIC interrupts on such events.

This patchset introduces a new interrupt handler for deferred errors and
configures the HW if the feature is present.

Patch1: Factor out logging mechanism so we can reuse it for deferred
        errors. No functional change.
Patch2: Read MCx_ADDR(bank) before calling mce_log(). This fixes an issue
        where amd_decode_mce() always prints the error address as 0x0,
        even if a valid address exists.
Patch3: Define SUCCOR cpuid bit. This indicates presence of features
        such as data poisoning and deferred error interrupts in hardware.
Patch4: Implement the interrupt handler.
        - Set up the vector number, build the interrupt and implement the
          handler function in this patch.
Patch5, Patch6: Cleanups in the code. No functional changes are introduced.

Changes from V1:
  - Two prepatches:
        * Factor out logging mechanism so we can reuse it for deferred errors
        * Read MCx_ADDR(bank) before calling mce_log() so we get the relevant
          error address printed in the kernel logs
  - Provide a short description of deferred errors here as well as in the
    commit message of patch2 (per Ingo, Boris)
  - Add comments around mce_flags to define the bitfields better (per Boris)
  - Assign truth values using double negation and 'BIT' macros. Vertically
    align statements while at it. (per Boris)
  - Change definitions of 'deferred_interrupt' to 'deferred_error_interrupt';
    DEFERRED_APIC_VECTOR to DEFERRED_ERROR_VECTOR and irq_deferred_count
    to irq_deferred_error_count (per Andy, Boris)
  - Do the BIOS workaround check for all families, as we are behind a cpuid
    bit anyway, and print a FW_BUG message as needed. (per Boris)
  - Update the timestamp of patch to May 2015 in mce_amd.c

Aravind Gopalakrishnan (6):
  x86/MCE/AMD: Factor out logging mechanism
  x86/MCE/AMD: Read MCx_ADDR(bank) before we log the error
  x86/mce: Define 'SUCCOR' cpuid bit
  x86/MCE/AMD: Introduce deferred error interrupt handler
  x86, irq: Cleanup ordering of vector numbers
  x86/MCE/AMD: Rename setup_APIC_mce

 arch/x86/include/asm/entry_arch.h        |   3 +
 arch/x86/include/asm/hardirq.h           |   3 +
 arch/x86/include/asm/hw_irq.h            |   2 +
 arch/x86/include/asm/irq_vectors.h       |  11 +--
 arch/x86/include/asm/mce.h               |  20 ++++-
 arch/x86/include/asm/trace/irq_vectors.h |   6 ++
 arch/x86/include/asm/traps.h             |   3 +-
 arch/x86/kernel/cpu/mcheck/mce.c         |   3 +-
 arch/x86/kernel/cpu/mcheck/mce_amd.c     | 132 ++++++++++++++++++++++++++++---
 arch/x86/kernel/entry_64.S               |   5 ++
 arch/x86/kernel/irq.c                    |   6 ++
 arch/x86/kernel/irqinit.c                |   4 +
 arch/x86/kernel/traps.c                  |   5 ++
 13 files changed, 182 insertions(+), 21 deletions(-)

-- 
1.9.1
