Re: [RFC PATCH v2 1/9] pstore: move pstore creator id, section type and record struct to common header

2023-09-25 Thread Kees Cook
On Mon, Sep 25, 2023 at 03:44:18PM +0800, Shuai Xue wrote: > Move pstore creator id, section type and record struct to the common > header, so that it can be used by the MCE and GHES drivers. I would prefer this was not in the pstore header -- this is a backend detail that should stay in backend

[RFC PATCH v2 5/9] ACPI: APEI: GHES: Use ERST to serialize APEI generic error before panic

2023-09-25 Thread Shuai Xue
In certain scenarios (i.e., hosts/guests with root filesystems on NFS/iSCSI where networking software and/or hardware fails, and thus kdump fails), it is necessary to serialize hardware error information so that it is available for post-mortem debugging. Save the hardware error log into flash via ERST before go

[RFC PATCH v2 7/9] ACPI: APEI: ESRT: kick ghes_report_chain notifier to report serialized memory errors

2023-09-25 Thread Shuai Xue
Introduce a new pstore_record type, PSTORE_TYPE_CPER_MEM, so that serialized memory errors can be retrieved and saved as a file in the pstore file system. When the serialized errors are retrieved from the ERST backend, kick the ghes_report_chain notifier. Signed-off-by: Shuai Xue ---

[RFC PATCH v2 8/9] ACPI: APEI: ESRT: print AER to report serialized PCIe errors

2023-09-25 Thread Shuai Xue
Introduce a new pstore_record type, PSTORE_TYPE_CPER_PCIE, so that serialized PCIe errors can be retrieved and saved as a file in the pstore file system. When the serialized PCIe errors are retrieved from the ERST backend, print AER information. Signed-off-by: Shuai Xue --- drivers/acpi/apei/erst.c |

[RFC PATCH v2 1/9] pstore: move pstore creator id, section type and record struct to common header

2023-09-25 Thread Shuai Xue
Move pstore creator id, section type and record struct to the common header, so that it can be used by the MCE and GHES drivers. Signed-off-by: Shuai Xue --- drivers/acpi/apei/erst.c | 19 --- include/linux/pstore.h | 24 2 files changed, 24 insertions(+),

[RFC PATCH v2 2/9] ACPI: APEI: Use common ERST struct to read/write serialized MCE record

2023-09-25 Thread Shuai Xue
It is confusing to define two creator IDs with the same GUID number, and unnecessary to define the same data structure twice. Use common ERST struct to read/write MCE record. Signed-off-by: Shuai Xue --- arch/x86/kernel/cpu/mce/apei.c | 82 +++--- 1 file changed, 35

[RFC PATCH v2 0/9] Use ERST for persistent storage of MCE and APEI errors

2023-09-25 Thread Shuai Xue
Changelog since v1: - fix a compile warning by dereferencing rcd pointer before memset - fix a compile error by adding CONFIG_X86_MCE - Link: https://lore.kernel.org/all/20230916130316.65815-3-xuesh...@linux.alibaba.com/ In certain scenarios (i.e., hosts/guests with root filesystems on NFS/iSCSI

[RFC PATCH v2 3/9] ACPI: APEI: ERST: Emit the mce_record tracepoint

2023-09-25 Thread Shuai Xue
After the /dev/mcelog character device was deprecated by commit 5de97c9f6d85 ("x86/mce: Factor out and deprecate the /dev/mcelog driver"), the serialized hardware error log, a.k.a. MCE record, of the previous boot in persistent storage is not collected via APEI ERST. Emit the mce_record tracepoint so that it

[RFC PATCH v2 9/9] ACPI: APEI: ESRT: log ARM processor error

2023-09-25 Thread Shuai Xue
Introduce a new pstore_record type, PSTORE_TYPE_CPER_PROC_ARM, so that serialized ARM processor errors can be retrieved and saved as a file in the pstore file system. When the serialized errors are retrieved from the ERST backend, log them. Signed-off-by: Shuai Xue --- drivers/acpi/apei/erst.c | 6 ++

[RFC PATCH v2 6/9] ACPI: APEI: GHES: export ghes_report_chain

2023-09-25 Thread Shuai Xue
Export ghes_report_chain so that it can be kicked by other drivers. Signed-off-by: Shuai Xue --- drivers/acpi/apei/ghes.c | 2 +- include/acpi/ghes.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c index

[RFC PATCH v2 4/9] ACPI: tables: change section_type of generic error data as guid_t

2023-09-25 Thread Shuai Xue
The section_type of generic error data is now an array of u8. It is a burden to perform explicit type casting from u8[] to guid_t, and to copy the guid_t values to u8[] using memcpy. To alleviate this issue, change the section_type from an array to the type guid_t, which is also consistent with
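
The motivation in the 4/9 description can be illustrated with a small user-space sketch. It is not the patch itself: `guid_t` and `guid_equal` below are simplified stand-ins for the kernel definitions, and `gdata_old`, `gdata_new`, `demo_sec_type` and the four helper functions are hypothetical names invented for this example. It only shows why a `guid_t` field avoids the cast and the `memcpy` that a `u8[16]` field requires.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for the kernel's guid_t: 16 raw bytes. */
typedef struct { uint8_t b[16]; } guid_t;

/* Stand-in for the kernel helper of the same name. */
static bool guid_equal(const guid_t *a, const guid_t *b)
{
    assert(a != NULL && b != NULL);
    return memcmp(a, b, sizeof(*a)) == 0;
}

/* Hypothetical "old" layout: section type kept as a byte array. */
struct gdata_old { uint8_t section_type[16]; };

/* Hypothetical "new" layout: section type kept as guid_t. */
struct gdata_new { guid_t section_type; };

/* Made-up GUID value, used only for this illustration. */
static const guid_t demo_sec_type = {{
    0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08,
    0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f, 0x10
}};

/* Byte-array field: comparing needs an explicit cast to guid_t. */
static bool is_demo_section_old(const struct gdata_old *g)
{
    return guid_equal((const guid_t *)g->section_type, &demo_sec_type);
}

/* guid_t field: the address of the member can be passed directly. */
static bool is_demo_section_new(const struct gdata_new *g)
{
    return guid_equal(&g->section_type, &demo_sec_type);
}

/* Byte-array field: setting the type needs memcpy. */
static void set_demo_section_old(struct gdata_old *g)
{
    memcpy(g->section_type, &demo_sec_type, sizeof(demo_sec_type));
}

/* guid_t field: plain struct assignment is enough. */
static void set_demo_section_new(struct gdata_new *g)
{
    g->section_type = demo_sec_type;
}
```

Both variants behave identically; the `guid_t` version simply drops the cast in the comparison and replaces `memcpy` with an assignment, which is the burden the description refers to.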