Hi Vaibhav,

Thanks for taking a look at this patch. My response is inline.

On 2026/06/03 09:31 AM, Vaibhav Jain wrote:
> Hi Amit,
> 
> Thanks for the patch. My review comments inline below:
> 
> Amit Machhiwal <[email protected]> writes:
> 
> > On POWER systems, the host CPU may run in a compatibility mode (e.g., a
> > Power11 processor operating in Power10 compatibility mode). In such
> > cases, the effective CPU level exposed to guests differs from the
> > physical processor generation.
> >
> > When running nested KVM guests, QEMU derives the host CPU type using
> > mfpvr(), which reflects the physical processor version. This can result
> > in a mismatch between the CPU model selected by QEMU and the
> > compatibility mode enforced by the host, leading to guest boot failures.
> >
> > For example, booting a nested guest on a Power11 LPAR configured in
> > Power10 compatibility mode fails with:
> >
> >   KVM-NESTEDv2: couldn't set guest wide elements
> >   [..KVM reg dump..]
> >
> > This occurs because QEMU selects a CPU model corresponding to the
> > physical processor (via mfpvr()), while the host operates in a lower
> > compatibility mode. As a result, KVM rejects the requested compatibility
> > level during guest initialization.
> >
> > Add support for retrieving host CPU compatibility capabilities for
> > nested guests on PowerVM (PAPR nested API v2). The hypervisor provides
> > the effective compatibility levels via the H_GUEST_GET_CAPABILITIES
> > hcall, which reflects the processor modes negotiated between the Power
> > hypervisor (L0) and the host partition (L1).
> >
> > On pseries systems, obtain the capability bitmap using
> > plpar_guest_get_capabilities() and return it via struct
> > kvm_ppc_compat_caps. This information is then exposed to userspace
> > through the KVM_PPC_GET_COMPAT_CAPS ioctl.
> >
> > Hook the implementation into the Book3S HV kvmppc_ops so that it can be
> > invoked by the generic KVM ioctl handling code.
> >
> > Suggested-by: Vaibhav Jain <[email protected]>
> > Tested-by: Anushree Mathur <[email protected]>
> > Signed-off-by: Amit Machhiwal <[email protected]>
> > ---
> >  arch/powerpc/kvm/book3s_hv.c | 16 ++++++++++++++++
> >  1 file changed, 16 insertions(+)
> >
> > diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
> > index 249d1f2e4e2c..38de7040e2b7 100644
> > --- a/arch/powerpc/kvm/book3s_hv.c
> > +++ b/arch/powerpc/kvm/book3s_hv.c
> > @@ -6522,6 +6522,21 @@ static bool kvmppc_hash_v3_possible(void)
> >     return true;
> >  }
> >  
> > +
> > +static int kvmppc_get_compat_cpu_caps(struct kvm_ppc_compat_caps *host_caps)
> > +{
> > +   unsigned long capabilities = 0;
> > +   long rc = -EINVAL;
> > +
> > +   if (kvmhv_on_pseries()) {
> > +           if (kvmhv_is_nestedv2())
> > +                   rc = plpar_guest_get_capabilities(0, &capabilities);
> 
> Since this value will trickle back to userspace, please apply a mask on
> the hcall return value so that reserved and non-PVR-related bits
> don't leak back to userspace.

Although we currently supply only the bits corresponding to supported
processor versions, it makes sense to mask out unrelated bits so that
they aren't unnecessarily passed on to userspace. I'll make the
change in v4.
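
As a rough, standalone sketch of the masking (not the actual v4 change):
the SKETCH_* names and bit positions below are hypothetical stand-ins
that I am assuming for illustration, and the real mask would be built
from the existing H_GUEST_CAP_* definitions after checking them.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-ins for the hcall capability bits. The positions
 * are assumptions (IBM bit numbering, bit 0 is the MSB) and must be
 * checked against the real H_GUEST_CAP_* definitions.
 */
#define SKETCH_CAP_COPY_MEM	(UINT64_C(1) << (63 - 0))	/* not PVR related */
#define SKETCH_CAP_POWER9	(UINT64_C(1) << (63 - 1))
#define SKETCH_CAP_POWER10	(UINT64_C(1) << (63 - 2))
#define SKETCH_CAP_POWER11	(UINT64_C(1) << (63 - 3))

/* Only the processor-version bits are allowed to reach userspace. */
#define SKETCH_COMPAT_MASK \
	(SKETCH_CAP_POWER9 | SKETCH_CAP_POWER10 | SKETCH_CAP_POWER11)

/* Compile-time check that the non-PVR bit is outside the mask. */
static_assert((SKETCH_COMPAT_MASK & SKETCH_CAP_COPY_MEM) == 0,
	      "non-PVR bit must not be part of the compat mask");

/* Drop reserved and non-PVR bits from the raw hcall return value. */
static inline uint64_t sketch_mask_compat_caps(uint64_t raw)
{
	return raw & SKETCH_COMPAT_MASK;
}
```

With something like this, the value stored in
host_caps->compat_capabilities would be the masked result rather than
the raw bitmap returned by the hcall.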

Thanks,
Amit

> 
> > +           host_caps->compat_capabilities = capabilities;
> > +   }
> > +
> > +   return rc;
> > +}
> > +
> >  static struct kvmppc_ops kvm_ops_hv = {
> >     .get_sregs = kvm_arch_vcpu_ioctl_get_sregs_hv,
> >     .set_sregs = kvm_arch_vcpu_ioctl_set_sregs_hv,
> > @@ -6564,6 +6579,7 @@ static struct kvmppc_ops kvm_ops_hv = {
> >     .hash_v3_possible = kvmppc_hash_v3_possible,
> >     .create_vcpu_debugfs = kvmppc_arch_create_vcpu_debugfs_hv,
> >     .create_vm_debugfs = kvmppc_arch_create_vm_debugfs_hv,
> > +   .get_compat_cpu_ver = kvmppc_get_compat_cpu_caps,
> >  };
> >  
> >  static int kvm_init_subcore_bitmap(void)
> > -- 
> > 2.50.1 (Apple Git-155)
> >
> >
> 
> -- 
> Cheers
> ~ Vaibhav
