Document the new hwerr_recovery_stats sysfs directory that exposes hardware error recovery statistics.
Update hw-recoverable-errors.rst to reference the new sysfs interface
for runtime monitoring.

Signed-off-by: Breno Leitao <[email protected]>
---
 .../ABI/testing/sysfs-kernel-hwerr_recovery_stats  | 47 ++++++++++++++++++++++
 Documentation/driver-api/hw-recoverable-errors.rst |  3 +-
 2 files changed, 49 insertions(+), 1 deletion(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-hwerr_recovery_stats b/Documentation/ABI/testing/sysfs-kernel-hwerr_recovery_stats
new file mode 100644
index 0000000000000..4cb9f5a89fba9
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-kernel-hwerr_recovery_stats
@@ -0,0 +1,47 @@
+What:		/sys/kernel/hwerr_recovery_stats/
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao <[email protected]>
+Description:
+		Directory containing hardware error recovery statistics.
+		These statistics track recoverable hardware errors that the
+		kernel has handled since boot.
+
+		Each file contains a single integer representing the count
+		of recovered errors for that subsystem.
+
+What:		/sys/kernel/hwerr_recovery_stats/cpu
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao <[email protected]>
+Description:
+		Count of CPU-related recovered errors (MCE, ARM processor
+		errors).
+
+What:		/sys/kernel/hwerr_recovery_stats/memory
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao <[email protected]>
+Description:
+		Count of memory-related recovered errors.
+
+What:		/sys/kernel/hwerr_recovery_stats/pci
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao <[email protected]>
+Description:
+		Count of PCI/PCIe AER non-fatal recovered errors.
+
+What:		/sys/kernel/hwerr_recovery_stats/cxl
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao <[email protected]>
+Description:
+		Count of CXL (Compute Express Link) recovered errors.
+
+What:		/sys/kernel/hwerr_recovery_stats/others
+Date:		February 2026
+KernelVersion:	6.20
+Contact:	Breno Leitao <[email protected]>
+Description:
+		Count of other hardware recovered errors.
diff --git a/Documentation/driver-api/hw-recoverable-errors.rst b/Documentation/driver-api/hw-recoverable-errors.rst
index fc526c3454bd7..4aefcd103be22 100644
--- a/Documentation/driver-api/hw-recoverable-errors.rst
+++ b/Documentation/driver-api/hw-recoverable-errors.rst
@@ -36,7 +36,8 @@ Data Exposure and Consumption
   types like CPU, memory, PCI, CXL, and others.
 - It is exposed via vmcoreinfo crash dump notes and can be read using tools like
   `crash`, `drgn`, or other kernel crash analysis utilities.
-- There is no other way to read these data other than from crash dumps.
+- It is also exposed via sysfs at ``/sys/kernel/hwerr_recovery_stats/`` for runtime
+  monitoring without requiring a crash dump.
 - These errors are divided by area, which includes CPU, Memory, PCI, CXL and
   others.
 
-- 
2.47.3
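
Not part of the patch: a minimal userspace sketch of how the counters documented above might be polled at runtime. It assumes only what the ABI entry states, namely that each file under /sys/kernel/hwerr_recovery_stats/ holds a single integer. The helper name read_hwerr_stats is hypothetical, and the directory is passed as a parameter so the function can be exercised on a kernel that lacks the interface.

```python
from pathlib import Path

# Directory name taken from the ABI entry in this patch.
STATS_DIR = Path("/sys/kernel/hwerr_recovery_stats")


def read_hwerr_stats(stats_dir=STATS_DIR):
    """Return {file name: count} for every regular file in stats_dir.

    Hypothetical helper: relies only on the documented property that
    each file contains a single integer.
    """
    stats = {}
    for entry in sorted(Path(stats_dir).iterdir()):
        if entry.is_file():
            stats[entry.name] = int(entry.read_text().strip())
    return stats


if __name__ == "__main__":
    if STATS_DIR.is_dir():
        for name, count in read_hwerr_stats().items():
            print(f"{name}: {count}")
    else:
        # Kernels without this patch will not have the directory.
        print(f"{STATS_DIR} is not present on this kernel")
```

A monitoring agent could call the helper periodically and report the difference between successive readings, since the counts are described as cumulative since boot.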
