On POWER systems, newer processor generations can operate in compatibility
modes corresponding to earlier generations (e.g., a Power11 system running
in Power10 compatibility mode). In such cases, the effective CPU level
exposed to guests differs from the physical processor generation.
This creates a problem for nested virtualization. When booting a nested KVM
guest (L2) under a KVM host (L1) that itself runs in a compatibility mode,
userspace (e.g., QEMU) may derive the CPU model from the raw hardware PVR
and attempt to configure the nested guest accordingly. However, the L1
partition is constrained by the compatibility level negotiated with the
hypervisor (L0), and requests exceeding that level are rejected, leading to
guest boot failures such as:
    KVM-NESTEDv2: couldn't set guest wide elements
This series addresses the issue in two steps:
1. Detect and reject invalid compatibility requests early in KVM to avoid
   late failures.
2. Provide a mechanism for userspace to query the effective CPU
   compatibility modes supported by the host, so it can select an
   appropriate CPU model for nested guests.
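
As a rough illustration of step 1, the check amounts to comparing the
requested level against the effective host level. The sketch below is not
the code from patch 1: compat_rank() and validate_arch_compat() are
hypothetical names, and the ordering of levels is an assumption. Only the
logical PVR values are taken from the kernel's asm/reg.h.

```c
#include <errno.h>
#include <stdint.h>

/* Logical (architected) PVR values, as defined in the kernel's asm/reg.h. */
#define PVR_ARCH_207    0x0f000004u /* ISA v2.07, POWER8  */
#define PVR_ARCH_300    0x0f000005u /* ISA v3.0,  POWER9  */
#define PVR_ARCH_31     0x0f000006u /* ISA v3.1,  Power10 */
#define PVR_ARCH_31_P11 0x0f000007u /* ISA v3.1,  Power11 */

/* Hypothetical helper: order the known levels; -1 means "unknown". */
static int compat_rank(uint32_t logical_pvr)
{
	switch (logical_pvr) {
	case PVR_ARCH_207:    return 0;
	case PVR_ARCH_300:    return 1;
	case PVR_ARCH_31:     return 2;
	case PVR_ARCH_31_P11: return 3;
	default:              return -1;
	}
}

/*
 * Hypothetical check: return 0 if the requested level can be used on a
 * host whose effective level is host_compat, -EINVAL otherwise.
 */
static int validate_arch_compat(uint32_t requested, uint32_t host_compat)
{
	int req = compat_rank(requested);
	int host = compat_rank(host_compat);

	if (req < 0 || host < 0)
		return -EINVAL;	/* unknown level */
	if (req > host)
		return -EINVAL;	/* exceeds the level negotiated for L1 */
	return 0;
}
```

With these assumptions, a Power11 request on a host in Power10
compatibility mode fails immediately with -EINVAL instead of surfacing
later as a guest boot failure.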
To achieve this, the series introduces a new KVM capability and ioctl
(KVM_CAP_PPC_COMPAT_CAPS / KVM_PPC_GET_COMPAT_CAPS) that expose the
compatibility modes supported by the host.
The implementation supports both:
- PowerVM (nested API v2), where compatibility information is obtained
  via the H_GUEST_GET_CAPABILITIES hypercall.
- PowerNV (nested API v1), where compatibility is derived from the device
  tree ("cpu-version") representing the effective processor compatibility
  level.
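
For the PowerNV case, one possible way to turn a "cpu-version" value into
a set of supported modes is sketched below. This is an assumption-laden
illustration, not the code from patch 4: the COMPAT_CAP_* bit positions
and compat_caps_from_cpu_version() are hypothetical, and the sketch
assumes that a given level also implies support for all older levels.

```c
#include <stdint.h>

/* Logical PVR values, as defined in the kernel's asm/reg.h. */
#define PVR_ARCH_300    0x0f000005u /* POWER9  */
#define PVR_ARCH_31     0x0f000006u /* Power10 */
#define PVR_ARCH_31_P11 0x0f000007u /* Power11 */

/* Hypothetical flag bits; the real UAPI layout may differ. */
#define COMPAT_CAP_POWER9  (1ULL << 0)
#define COMPAT_CAP_POWER10 (1ULL << 1)
#define COMPAT_CAP_POWER11 (1ULL << 2)

/*
 * Hypothetical mapping from the effective level ("cpu-version") to a
 * bitmask of modes a nested guest could run in. Unknown values map to 0.
 */
static uint64_t compat_caps_from_cpu_version(uint32_t cpu_version)
{
	switch (cpu_version) {
	case PVR_ARCH_31_P11:
		return COMPAT_CAP_POWER9 | COMPAT_CAP_POWER10 |
		       COMPAT_CAP_POWER11;
	case PVR_ARCH_31:
		return COMPAT_CAP_POWER9 | COMPAT_CAP_POWER10;
	case PVR_ARCH_300:
		return COMPAT_CAP_POWER9;
	default:
		return 0;
	}
}
```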
This allows userspace (e.g., QEMU) to select a CPU model consistent with
the host compatibility mode, avoiding mismatches and enabling successful
nested guest boot.
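
On the userspace side, the selection could look roughly like the sketch
below: given the flags reported by the host, pick the newest CPU model
whose bit is set. The COMPAT_CAP_* bit positions, the model name strings,
and pick_cpu_model() are all hypothetical and only illustrate the idea;
they are not taken from this series or from the QEMU series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits; the real UAPI layout may differ. */
#define COMPAT_CAP_POWER9  (1ULL << 0)
#define COMPAT_CAP_POWER10 (1ULL << 1)
#define COMPAT_CAP_POWER11 (1ULL << 2)

struct compat_model {
	uint64_t cap_bit;
	const char *cpu_model;	/* illustrative model name */
};

/* Ordered newest first, so the first match is the best candidate. */
static const struct compat_model compat_models[] = {
	{ COMPAT_CAP_POWER11, "power11" },
	{ COMPAT_CAP_POWER10, "power10" },
	{ COMPAT_CAP_POWER9,  "power9"  },
};

/* Return the newest model allowed by flags, or NULL if none is. */
static const char *pick_cpu_model(uint64_t flags)
{
	size_t i;

	for (i = 0; i < sizeof(compat_models) / sizeof(compat_models[0]); i++) {
		if (flags & compat_models[i].cap_bit)
			return compat_models[i].cpu_model;
	}
	return NULL;
}
```

Under these assumptions, a Power11 host in Power10 compatibility mode
would report the POWER9 and POWER10 bits only, and the nested guest would
be configured as "power10" rather than from the raw hardware PVR.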
Changes in v2:
- Squashed patches 2 and 3 from v1 (capability introduction and ioctl
wiring) into a single patch for better logical grouping
- Changed kvm_ppc_compat_caps.flags from __u32 to __u64 for consistency
and future extensibility
- Addressed other review comments
- Improved commit messages with clearer explanations of the changes
Patch summary:
[1/5] Validate arch_compat against host compatibility mode
[2/5] Introduce KVM_CAP_PPC_COMPAT_CAPS and wire up ioctl
[3/5] Implement capability retrieval for PowerVM (API v2)
[4/5] Add PowerNV support (API v1)
[5/5] Document the new ioctl
Tested on:
- Power11 pSeries LPAR in Power10 compatibility mode (nested API v2)
- Power10 PowerNV system (and QEMU TCG PowerNV 11) with nested
  virtualization (API v1), using several combinations of KVM L1/L2 guests
  across the supported compatibility modes.
With this series, nested guests boot successfully in configurations where
they previously failed due to compatibility mismatches.
Related QEMU series:
A corresponding QEMU series adds support for querying and using these
compatibility capabilities when configuring nested KVM guests:
https://lore.kernel.org/all/[email protected]/
v1:
https://lore.kernel.org/linuxppc-dev/[email protected]/
Amit Machhiwal (5):
  KVM: PPC: Book3S HV: Validate arch_compat against host compatibility
    mode
  KVM: PPC: Introduce KVM_CAP_PPC_COMPAT_CAPS and wire up ioctl
  KVM: PPC: Book3S HV: Implement compat CPU capability retrieval for KVM
    on PowerVM
  KVM: PPC: Book3S HV: Add support for compat CPU capabilities for KVM
    on PowerNV
  KVM: PPC: Document KVM_PPC_GET_COMPAT_CAPS ioctl
Documentation/virt/kvm/api.rst | 35 ++++++++++++++++
arch/powerpc/include/asm/kvm_ppc.h | 1 +
arch/powerpc/include/uapi/asm/kvm.h | 6 +++
arch/powerpc/kvm/book3s_hv.c | 63 +++++++++++++++++++++++++++++
arch/powerpc/kvm/powerpc.c | 21 ++++++++++
include/uapi/linux/kvm.h | 4 ++
6 files changed, 130 insertions(+)
base-commit: 1d5dcaa3bd65f2e8c9baa14a393d3a2dc5db7524
--
2.50.1 (Apple Git-155)