On Tue, Jun 16, 2026 at 10:04:05PM +0530, Amit Machhiwal wrote:
> On IBM POWER systems, newer processor generations can operate in
> compatibility modes corresponding to earlier generations. This becomes
> relevant for nested virtualization, where nested KVM guests may need to
> run with a specific processor compatibility level.
>
> Currently, when running a nested KVM guest (L2) inside a Power11 pSeries
> logical partition (L1) booted in Power10 compatibility mode, the guest
> fails to boot while setting 'arch_compat'. This happens because the CPU
> class is derived from the hardware PVR (via mfspr()), which reflects the
> physical processor generation (Power11), rather than the effective
> compatibility mode (Power10).
>
> As a result, userspace may request a Power11 arch_compat for the L2
> guest. However, the L1 partition, running in Power10 compatibility, has
> only negotiated support up to Power10 with the Power Hypervisor (L0).
> When H_GUEST_SET_STATE is invoked with a Power11 Logical PVR, the
> hypervisor rejects the request, leading to a late guest boot failure:
>
>   KVM-NESTEDv2: couldn't set guest wide elements
>   [..KVM reg dump..]
>
> This situation should be detected earlier and rejected by KVM. Without
> proper validation, if userspace ignores the error, the guest may continue
> to boot in Power11 raw mode on a Power10 compatibility host, which should
> not be allowed.
>
> Introduce a validation mechanism that detects unsupported arch_compat
> values early in the guest initialization path. When an unsupported
> arch_compat is requested (e.g., Power11 on a Power10 compatibility mode
> host), kvmppc_set_arch_compat() uses cpu_has_feature(CPU_FTR_P11_PVR) to
> detect the mismatch and sets arch_compat to PVR_ARCH_INVALID (0xffffffff).
> This sentinel value is architecturally safe: PAPR specifies that valid
> logical PVR values must have 0x0f as the first byte, ensuring 0xffffffff
> lies permanently outside the specification-defined range. Setting this
> value triggers kvmppc_sanity_check() to mark the vCPU as invalid by
> setting vcpu->arch.sane to false. On the next vCPU run, kvmppc_vcpu_run_hv()
> checks this flag and returns -EINVAL, preventing the guest from running
> with an invalid processor compatibility configuration.
>
> With this, when a Power11 arch_compat is requested on a Power10
> compatibility mode host, the guest fails early during boot with:
>
>   error: kvm run failed Invalid argument
>
> This provides a much clearer failure mode compared to the previous
> behavior where the guest could boot in Power11 raw mode (if userspace
> ignored the error) or fail late during H_GUEST_SET_STATE.
>
> Suggested-by: Vaibhav Jain <[email protected]>
> Reviewed-by: Vaibhav Jain <[email protected]>
> Tested-by: Anushree Mathur <[email protected]>
> Acked-by: Gautam Menghani <[email protected]>
> Cc: [email protected] # v6.13+
> Signed-off-by: Amit Machhiwal <[email protected]>
> ---
> Testing: Both Anushree and I have tested the below scenarios:
> 1. P11 guest on P11 host - Works
> 2. P10 compat guest on P11 host - Works
> 3. P11 guest on compat-P10 host - Correctly fails with "Invalid argument"
> 4. P10 guest on compat-P10 host - Works
>
> Changes in v4:
> * Added documentation for PVR_ARCH_INVALID explaining why 0xffffffff is
>   architecturally safe to use as a sentinel value (PAPR constraint on
>   first byte being 0x0f) [Ritesh]
> * Updated commit message
> * v3:
> https://lore.kernel.org/all/[email protected]/
>
> Changes in v3:
> * Fixed null pointer dereference in kvmppc_sanity_check(): added check for
>   vcpu->arch.vcore before accessing arch_compat, as vcore is NULL for Book3S
>   PR and BookE guests (only Book3S HV uses vcore) [Reported by Sashiko AI]
> * Added Reviewed-by tag from Vaibhav
> * v2:
> https://lore.kernel.org/all/[email protected]/
>
> Changes in v2:
> * Fixed issue where v1 allowed guest to boot in Power11 raw mode when
>   userspace ignored the error, by adding validation in kvmppc_sanity_check()
>   to ensure early failure during vCPU run [Found the issue after posting v1,
>   also reported by Gautam.]
> * Introduced PVR_ARCH_INVALID constant for marking invalid arch_compat
> * Dropped all Reviewed-by and Tested-by tags due to code changes; requesting
>   fresh reviews
> * v1:
> https://lore.kernel.org/all/[email protected]/
>
> Changes in v1:
> * Moved this patch out of the v3 series [1] as discussed here [2]
> * Addressed below review comments from Ritesh:
>   - Based the PVR validation on cpu features
>   - Fixed hcall name typo
>   - Stable backport
>
> [1] https://lore.kernel.org/all/[email protected]/
> [2] https://lore.kernel.org/all/[email protected]/
> ---
>  arch/powerpc/include/asm/reg.h | 12 ++++++++++++
>  arch/powerpc/kvm/book3s_hv.c   | 15 ++++++++++++++-
>  arch/powerpc/kvm/powerpc.c     |  4 ++++
>  3 files changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> index 3449dd2b577d..b9ab9df1e2bc 100644
> --- a/arch/powerpc/include/asm/reg.h
> +++ b/arch/powerpc/include/asm/reg.h
> @@ -1357,6 +1357,18 @@
>  #define PVR_ARCH_31 0x0f000006
>  #define PVR_ARCH_31_P11 0x0f000007
>
> +/*
> + * Kernel-internal sentinel for invalid processor compatibility modes.
> + * PAPR specifies that the first byte of a valid logical PVR value is
> + * 0x0f. So 0xffffffff lies permanently outside the PAPR-defined range
> + * and is safe to repurpose. KVM stores it in vcpu->arch.arch_compat
> + * when userspace requests an unsupported compatibility mode (e.g.,
> + * Power11 PVR on a Power11 host booted in Power10 compat).
> + * kvmppc_sanity_check() detects this and prevents the vCPU from
> + * running with an unsupported arch_compat.
> + */
> +#define PVR_ARCH_INVALID 0xffffffff
> +

nit: I think the description of the invalid value should go into the
commit message rather than here. It creates unnecessary clutter in the
header. If anyone wants to know why it was added, they can always run
git blame and read the description there.

Apart from this,

Reviewed-by: Mukesh Kumar Chaurasiya (IBM) <[email protected]>

>  /* Macros for setting and retrieving special purpose registers */
>  #ifndef __ASSEMBLER__
>
> diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> index 61dbeea317f3..f9380ef65750 100644
> --- a/arch/powerpc/kvm/book3s_hv.c
> +++ b/arch/powerpc/kvm/book3s_hv.c
> @@ -446,7 +446,19 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
>  			guest_pcr_bit = PCR_ARCH_300;
>  			break;
>  		case PVR_ARCH_31:
> +			guest_pcr_bit = PCR_ARCH_31;
> +			break;
>  		case PVR_ARCH_31_P11:
> +			/*
> +			 * Need to check this for ISA 3.1, as Power10 and
> +			 * Power11 share the same PCR. For any subsequent ISA
> +			 * versions, this will be taken care of by the guest vs
> +			 * host PCR comparison below.
> +			 */
> +			if (!cpu_has_feature(CPU_FTR_P11_PVR)) {
> +				arch_compat = PVR_ARCH_INVALID;
> +				goto out;
> +			}
>  			guest_pcr_bit = PCR_ARCH_31;
>  			break;
>  		default:
> @@ -469,6 +481,7 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
>  			return -EINVAL;
>  	}
>
> +out:
>  	spin_lock(&vc->lock);
>  	vc->arch_compat = arch_compat;
>  	kvmhv_nestedv2_mark_dirty(vcpu, KVMPPC_GSID_LOGICAL_PVR);
> @@ -479,7 +492,7 @@ static int kvmppc_set_arch_compat(struct kvm_vcpu *vcpu, u32 arch_compat)
>  	vc->pcr = (host_pcr_bit - guest_pcr_bit) | PCR_MASK;
>  	spin_unlock(&vc->lock);
>
> -	return 0;
> +	return kvmppc_sanity_check(vcpu);
>  }
>
>  static void kvmppc_dump_regs(struct kvm_vcpu *vcpu)
> diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
> index 00302399fc37..98de68379b18 100644
> --- a/arch/powerpc/kvm/powerpc.c
> +++ b/arch/powerpc/kvm/powerpc.c
> @@ -258,6 +258,10 @@ int kvmppc_sanity_check(struct kvm_vcpu *vcpu)
>  	if (!vcpu->arch.pvr)
>  		goto out;
>
> +	if (vcpu->arch.vcore &&
> +	    vcpu->arch.vcore->arch_compat == PVR_ARCH_INVALID)
> +		goto out;
> +
>  	/* PAPR only works with book3s_64 */
>  	if ((vcpu->arch.cpu_type != KVM_CPU_3S_64) && vcpu->arch.papr_enabled)
>  		goto out;
>
> base-commit: 6b5a2b7d9bc156e505f09e698d85d6a1547c1206
> --
> 2.50.1 (Apple Git-155)
>
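
As a side note, the sentinel argument from the commit message can be
sketched in standalone userspace C. This is only a rough model under
stated assumptions, not kernel code: the PVR_ARCH_* values are copied
from the quoted reg.h hunk, while is_logical_pvr() and
effective_arch_compat() are hypothetical helpers invented for
illustration, and the host feature check is reduced to a plain boolean
standing in for cpu_has_feature(CPU_FTR_P11_PVR).

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the quoted arch/powerpc/include/asm/reg.h hunk. */
#define PVR_ARCH_31      0x0f000006u
#define PVR_ARCH_31_P11  0x0f000007u
#define PVR_ARCH_INVALID 0xffffffffu

/*
 * Hypothetical helper (not a kernel API): per the commit message, a
 * valid logical PVR has 0x0f as its first (most significant) byte.
 */
static bool is_logical_pvr(uint32_t pvr)
{
	return (pvr >> 24) == 0x0f;
}

/*
 * Hypothetical model of the new PVR_ARCH_31_P11 case: a Power11
 * arch_compat on a host without the P11 PVR feature is replaced by
 * the sentinel. host_has_p11_pvr is an assumed stand-in for
 * cpu_has_feature(CPU_FTR_P11_PVR).
 */
static uint32_t effective_arch_compat(uint32_t requested, bool host_has_p11_pvr)
{
	if (requested == PVR_ARCH_31_P11 && !host_has_p11_pvr)
		return PVR_ARCH_INVALID;
	return requested;
}
```

With these definitions, is_logical_pvr() accepts both PVR_ARCH_31 and
PVR_ARCH_31_P11 and rejects PVR_ARCH_INVALID, so the sentinel cannot
collide with a value that satisfies the 0x0f first-byte rule.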
