Currently, only one CPER buffer (entry) can be delivered and
acknowledged at a time. This conflicts with the scenario where the host
and guest use 64KB and 4KB page sizes, respectively. In this scenario, a
single bad host page can affect 16 guest pages, resulting in up to 16
memory errors in the worst case. Unfortunately, QEMU aborts with a core
dump at (a) because the previous error hasn't been acknowledged and the
current error can't be delivered, as shown in the following call trace:

  kvm_vcpu_thread_fn
    kvm_cpu_exec
      kvm_arch_on_sigbus_vcpu
        kvm_cpu_synchronize_state
        acpi_ghes_memory_errors         (a)
        kvm_inject_arm_sea | abort

Fix the issue by sending 16 consecutive memory CPER entries in one shot
for this specific case. With the series applied, no QEMU core dump is
observed in the test where a memory access in the guest (4KB pages) is
triggered by 'victimd' and a recoverable memory error is injected from
the host (64KB pages).
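
For illustration only, the arithmetic behind the 16 entries can be sketched
as below. This is not code from the series: split_host_page() and its
parameters are hypothetical names, and the sketch assumes both page sizes
are powers of two. It aligns the faulting address down to the host page
boundary and lists one guest-page address per CPER entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of QEMU): fill 'out' with the guest
 * physical address of every guest page covered by the host page that
 * contains 'fault_gpa'. Returns the number of addresses written, or 0
 * if 'out' is too small.
 */
static size_t split_host_page(uint64_t fault_gpa,
                              uint64_t host_page_size,
                              uint64_t guest_page_size,
                              uint64_t *out, size_t max)
{
    /* Assumption: both sizes are non-zero powers of two. */
    assert(host_page_size && !(host_page_size & (host_page_size - 1)));
    assert(guest_page_size && !(guest_page_size & (guest_page_size - 1)));

    /* Align the faulting address down to the host page boundary. */
    uint64_t base = fault_gpa & ~(host_page_size - 1);

    /* A host page no larger than a guest page yields a single entry. */
    size_t count = 1;
    if (host_page_size > guest_page_size) {
        count = (size_t)(host_page_size / guest_page_size);
    }

    if (count > max) {
        return 0;
    }

    for (size_t i = 0; i < count; i++) {
        out[i] = base + (uint64_t)i * guest_page_size;
    }

    return count;
}
```

With a 64KB host page and a 4KB guest page, the helper yields 16
consecutive addresses starting at the 64KB-aligned base, which matches
the worst case described above.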

Changelog
=========
v2:
  * v1: https://lists.nongnu.org/archive/html/qemu-arm/2025-02/msg00897.html
  * Send 16x memory errors for the specific case                 (Jonathan)

Gavin Shan (3):
  acpi/ghes: Extend acpi_ghes_memory_errors() to support multiple CPERs
  kvm/arm/kvm: Introduce helper push_ghes_memory_errors()
  target/arm/kvm: Support multiple memory CPERs injection

 hw/acpi/ghes-stub.c    |  2 +-
 hw/acpi/ghes.c         | 29 ++++++++--------
 include/hw/acpi/ghes.h |  2 +-
 target/arm/kvm.c       | 77 +++++++++++++++++++++++++++++++++++++-----
 4 files changed, 86 insertions(+), 24 deletions(-)

-- 
2.51.0

