On 8/5/2025 11:18 PM, Breno Leitao wrote:
On Tue, Aug 05, 2025 at 10:25:11PM +0800, Ethan Zhao wrote:

It seems you are using the arm64 platform default config item
arch/arm64/configs/defconfig:CONFIG_ACPI_APEI_PCIEAER=y
So the issue wouldn't be triggered on x86_64 with the default config.

Not really, I am running on x86 hosts. Here is the AER part of my
.config:

        # cat .config | grep AER
        CONFIG_ACPI_APEI_PCIEAER=y
        CONFIG_PCIEAER=y
        # CONFIG_PCIEAER_INJECT is not set
        CONFIG_PCIEAER_CXL=y

Okay, if so, I would suggest checking and validating the
struct aer_capability_regs *aer_regs before or in the enqueue function
aer_recover_queue().

e.g.

static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
{
        ...
        memcpy(aer_info, pcie_err->aer_info, sizeof(struct aer_capability_regs));

        /* validate the aer_info here */

        aer_recover_queue(pcie_err->device_id.segment, ...);
}

or

void aer_recover_queue(int domain, unsigned int bus, unsigned int devfn,
                       int severity, struct aer_capability_regs *aer_regs)
{
        /* check and validate aer_regs first here */
        ...
}

Wouldn't that be better than validating on the dequeue side, in
aer_recover_work_func()?
BTW, the cause seems to be that you are using a buggy BIOS.
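
To make the enqueue-side idea more concrete, here is a rough userspace
sketch, not kernel code. Everything in it is an assumption for
illustration: struct aer_regs_sketch is a cut-down stand-in for
struct aer_capability_regs, aer_regs_plausible() and
aer_recover_queue_sketch() are hypothetical names, and the rule itself
(reject a NULL record, an unknown severity, or an empty status register
for the reported severity) is only one possible notion of "valid".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for the kernel's AER severity constants (assumed values). */
enum { AER_NONFATAL = 0, AER_FATAL = 1, AER_CORRECTABLE = 2 };

/* Hypothetical, cut-down stand-in for struct aer_capability_regs. */
struct aer_regs_sketch {
	uint32_t uncor_status;
	uint32_t cor_status;
};

/*
 * Hypothetical check: reject a NULL record, an unknown severity, or a
 * record whose status register for the reported severity has no bits set.
 */
static bool aer_regs_plausible(int severity, const struct aer_regs_sketch *regs)
{
	if (!regs)
		return false;

	switch (severity) {
	case AER_CORRECTABLE:
		return regs->cor_status != 0;
	case AER_NONFATAL:
	case AER_FATAL:
		return regs->uncor_status != 0;
	default:
		return false;
	}
}

/* Returns 0 if the record would be queued, -1 if it is dropped early. */
static int aer_recover_queue_sketch(int severity,
				    const struct aer_regs_sketch *regs)
{
	if (!aer_regs_plausible(severity, regs))
		return -1;	/* never reaches the dequeue-side worker */

	/* ... the real function would queue the entry for the worker here ... */
	return 0;
}
```

With a check like this, a bogus record from firmware is dropped at
enqueue time, so the dequeue side only ever sees entries that already
passed validation.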


Thanks,
Ethan

