Hi Ritesh,

Thank you for the review and feedback. Please find my response below.

On 2026/06/16 03:17 PM, Ritesh Harjani wrote:
> Amit Machhiwal <[email protected]> writes:
> 
> > On IBM POWER systems, newer processor generations can operate in
> > compatibility modes corresponding to earlier generations. This becomes
> > relevant for nested virtualization, where nested KVM guests may need to
> > run with a specific processor compatibility level.
> >
> > Currently, when running a nested KVM guest (L2) inside a Power11 pSeries
> > logical partition (L1) booted in Power10 compatibility mode, the guest
> > fails to boot while setting 'arch_compat'. This happens because the CPU
> > class is derived from the hardware PVR (via mfspr()), which reflects the
> > physical processor generation (Power11), rather than the effective
> > compatibility mode (Power10).
> >
> > As a result, userspace may request a Power11 arch_compat for the L2
> > guest. However, the L1 partition, running in Power10 compatibility, has
> > only negotiated support up to Power10 with the Power Hypervisor (L0).
> > When H_GUEST_SET_STATE is invoked with a Power11 Logical PVR, the
> > hypervisor rejects the request, leading to a late guest boot failure:
> >
> >   KVM-NESTEDv2: couldn't set guest wide elements
> >   [..KVM reg dump..]
> >
> > This situation should be detected earlier and rejected by KVM. Without
> > proper validation, if userspace ignores the error, the guest may continue
> > to boot in Power11 raw mode on a Power10 compatibility host, which should
> > not be allowed.
> >
> > Introduce a validation mechanism that detects unsupported arch_compat
> > values early in the guest initialization path. When an unsupported
> > arch_compat is requested (e.g., Power11 on a Power10 compatibility mode
> > host), kvmppc_set_arch_compat() uses cpu_has_feature(CPU_FTR_P11_PVR) to
> > detect the mismatch and sets arch_compat to PVR_ARCH_INVALID. This
> > triggers kvmppc_sanity_check() to mark the vCPU as invalid by setting
> > vcpu->arch.sane to false. On the next vCPU run, kvmppc_vcpu_run_hv()
> > checks this flag and returns -EINVAL, preventing the guest from running
> > with an invalid processor compatibility configuration.
> >
> > With this, when a Power11 arch_compat is requested on a Power10
> > compatibility mode host, the guest fails early during boot with:
> >
> >   error: kvm run failed Invalid argument
> >
> > This provides a much clearer failure mode compared to the previous
> > behavior where the guest could boot in Power11 raw mode (if userspace
> > ignored the error) or fail late during H_GUEST_SET_STATE.
> >
> > Suggested-by: Vaibhav Jain <[email protected]>
> > Reviewed-by: Vaibhav Jain <[email protected]>
> > Cc: [email protected] # v6.13+
> > Signed-off-by: Amit Machhiwal <[email protected]>
> > ---
> > Changes in v3:
> > * Fixed null pointer dereference in kvmppc_sanity_check(): added check for
> >   vcpu->arch.vcore before accessing arch_compat, as vcore is NULL for Book3S
> >   PR and BookE guests (only Book3S HV uses vcore) [Reported by Sashiko AI]
> > * Added Reviewed-by tag from Vaibhav
> >
> > Changes in v2:
> > * Fixed issue where v1 allowed guest to boot in Power11 raw mode when
> >   userspace ignored the error, by adding validation in kvmppc_sanity_check()
> >   to ensure early failure during vCPU run [Found the issue after posting v1,
> >   also reported by Gautam.]
> 
> Would be nice if we could post the matrix test results which Gautam
> posted earlier with this v3. I guess you meant you already tested all of
> those - it would be nice if we could explicitely put that info in the 
> changelog.

Regarding the test matrix: Both Anushree and I have tested all the
scenarios comprehensively. Anushree has shared her detailed test results
in her reply to the patch [1], covering:

  - P11 guest on P11 host -> Works
  - P10 guest on P11 host -> Works
  - P11 guest on compat-P10 host -> Correctly fails with "Invalid argument"
  - P10 guest on compat-P10 host -> Works

I have also verified all these scenarios with the same results.

If you'd prefer that I add the test matrix explicitly to the changelog
and send a new version, please let me know and I'll be happy to do so.

[1] 
https://lore.kernel.org/all/[email protected]/

> 
> > * Introduced PVR_ARCH_INVALID constant for marking invalid arch_compat
> > * Dropped all Reviewed-by and Tested-by tags due to code changes; requesting
> >   fresh reviews
> > * v1: 
> > https://lore.kernel.org/all/[email protected]/
> >
> > Changes in v1:
> > * Moved this patch out of the v3 series [1] as discussed here [2]
> > * Addressed below review comments from Ritesh:
> >   - Based the PVR validation on cpu features
> >   - Fixed hcall name typo
> >   - Stable backport
> >
> > [1] 
> > https://lore.kernel.org/all/[email protected]/
> > [2] 
> > https://lore.kernel.org/all/[email protected]/
> > ---
> >  arch/powerpc/include/asm/reg.h |  1 +
> >  arch/powerpc/kvm/book3s_hv.c   | 15 ++++++++++++++-
> >  arch/powerpc/kvm/powerpc.c     |  4 ++++
> >  3 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
> > index 3449dd2b577d..7472b9522f71 100644
> > --- a/arch/powerpc/include/asm/reg.h
> > +++ b/arch/powerpc/include/asm/reg.h
> > @@ -1356,6 +1356,7 @@
> >  #define PVR_ARCH_300       0x0f000005
> >  #define PVR_ARCH_31        0x0f000006
> >  #define PVR_ARCH_31_P11    0x0f000007
> > +#define PVR_ARCH_INVALID   0xffffffff
> 
> Logical processor version is defined as part of the PAPR spec. We should
> ensure that this invalid PVR is also documented in the PAPR spec.
> 
> If you have already taken care of that, then please confirm and feel free to 
> add:

Regarding the PAPR specification documentation: The PAPR spec documents
the valid Processor Version Register (PVR) values for each processor
generation (POWER8, POWER9, POWER10, POWER11, etc.). However, the
PVR_ARCH_INVALID value (0xffffffff) introduced in this patch series is a
KVM implementation detail used internally to mark invalid compatibility
mode requests - it's not an architectural value that would be defined in
PAPR itself.

The validation logic and the use of PVR_ARCH_INVALID as a sentinel value
are documented in the kernel code and commit message.

Please let me know if this addresses your concern, or if you'd like me
to add specific documentation.

> 
> Reviewed-by: Ritesh Harjani (IBM) <[email protected]>

Thanks you again for taking the time out of the review.

~Amit

> 

Reply via email to