On Thu, 2020-07-02 at 11:16 -0700, Sean Christopherson wrote:
> On Thu, Jul 02, 2020 at 08:44:55PM +0300, Maxim Levitsky wrote:
> > There are a few cases where this function created a bogus #GP condition,
> > for example when an AMD host supports STIBP but doesn't support SSBD.
> > 
> > Follow the rules for AMD and Intel strictly instead.
> 
> Can you elaborate on the conditions that are problematic, e.g. what does
> the guest expect to exist that KVM isn't providing?

Hi Sean Christopherson!
Sorry that I didn't explain the issue here. I explained it in detail in the
bugzilla I opened, but forgot to carry that over into the commit message.
https://bugzilla.redhat.com/show_bug.cgi?id=1853447
 
 
The issue is that my CPU (a 3970X) does not support IBRS,
but it does support STIBP, and thus the guest gets the STIBP CPUID bits enabled
(the AMD one, and KVM also enables Intel's CPUID bit for the same feature).
 
Then, when the guest actually starts to use STIBP, it gets a #GP because both of
these conditions
can clear the STIBP bit from the valid mask when IBRS is not supported:
 
        if (!guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
            !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS))
                bits &= ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP);
        if (!boot_cpu_has(X86_FEATURE_SPEC_CTRL) &&
            !boot_cpu_has(X86_FEATURE_AMD_IBRS))
                bits &= ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP);
 
Most likely it fails on the second condition: X86_FEATURE_SPEC_CTRL is
enabled in the guest because the host supports IBPB, which
X86_FEATURE_SPEC_CTRL also covers, so the first condition passes.
 
But the host supports neither X86_FEATURE_SPEC_CTRL nor
X86_FEATURE_AMD_IBRS,
so the second condition wrongly clears SPEC_CTRL_STIBP.
 
In addition to that, I have known for a while that Windows 10 guests on my
machine
started to crash when qemu added the ability to pass through
X86_FEATURE_AMD_IBRS.
I hadn't paid much attention to it other than bisecting it and adding
'-amd-stibp' to my CPU flags.
 
I did notice that I am not the only one with this issue, for example:
https://www.reddit.com/r/VFIO/comments/gf53o8/upgrading_to_qemu_5_broke_my_setup_windows_bsods/
https://forum.level1techs.com/t/amd-fix-for-host-passthrough-on-qemu-5-0-0-kernel-5-6/157710
 
After I debugged this issue on Linux, it occurred to me that it might be
the same issue as on Windows,
and indeed it is. The only difference is that Windows doesn't touch STIBP
when only the Intel-specific
CPUID bit is set on an AMD machine (which KVM has set for a long time), but
does enable it when the AMD-specific
bit, X86_FEATURE_AMD_IBRS, is set. That is the bit qemu recently started to
set, so Windows hits the same #GP and crashes.
 
From the findings on my machine, cross-referenced with the above posts,
I can assume that many Ryzen CPUs have this configuration
of no support for IBRS but support for STIBP.
In fact I don't see the kernel use IBRS much (it seems to only be used around
firmware calls or so), so it makes sense
that AMD chose not to enable it.
 
As for the fix itself,
I could have fixed this by only changing the above condition, but then I read
the AMD whitepaper on
this, which mentions that writes to bits in IA32_SPEC_CTRL don't #GP even if
the bits are not supported.
Implementing that correctly would be too complicated with the current logic,
so I rewrote the logic to be as simple as possible and as close to the
official docs
as possible.
 

> 
> > AMD #GP rules for IA32_SPEC_CTRL can be found here:
> > https://bugzilla.kernel.org/show_bug.cgi?id=199889
> > 
> > Fixes: 6441fa6178f5 ("KVM: x86: avoid incorrect writes to host 
> > MSR_IA32_SPEC_CTRL")
> > 
> > Signed-off-by: Maxim Levitsky <[email protected]>
> > ---
> >  arch/x86/kvm/x86.c | 57 ++++++++++++++++++++++++++++++++++------------
> >  1 file changed, 42 insertions(+), 15 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 00c88c2f34e4..a6bed4670b7f 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -10670,27 +10670,54 @@ bool kvm_arch_no_poll(struct kvm_vcpu *vcpu)
> >  }
> >  EXPORT_SYMBOL_GPL(kvm_arch_no_poll);
> >  
> > -u64 kvm_spec_ctrl_valid_bits(struct kvm_vcpu *vcpu)
> > +
> > +static u64 kvm_spec_ctrl_valid_bits_host(void)
> > +{
> > +   uint64_t bits = 0;
> > +
> > +   if (boot_cpu_has(X86_FEATURE_SPEC_CTRL))
> > +           bits |= SPEC_CTRL_IBRS;
> > +   if (boot_cpu_has(X86_FEATURE_INTEL_STIBP))
> > +           bits |= SPEC_CTRL_STIBP;
> > +   if (boot_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD))
> > +           bits |= SPEC_CTRL_SSBD;
> > +
> > +   if (boot_cpu_has(X86_FEATURE_AMD_IBRS) || 
> > boot_cpu_has(X86_FEATURE_AMD_STIBP))
> > +           bits |= SPEC_CTRL_STIBP | SPEC_CTRL_IBRS;
> > +
> > +   if (boot_cpu_has(X86_FEATURE_AMD_SSBD))
> > +           bits |= SPEC_CTRL_STIBP | SPEC_CTRL_IBRS | SPEC_CTRL_SSBD;
> > +
> > +   return bits;
> > +}
> 
> Rather than compute the mask every time, it can be computed once on module
> load and stashed in a global.  Note, there's a RFC series[*] to support
> reprobing bugs at runtime, but that has bigger issues with existing KVM
> functionality to be addressed, i.e. it's not our problem, yet :-).
> 
> [*] 
> https://lkml.kernel.org/r/[email protected]

Thanks for the pointer!
 
Note though that the above code only runs once: after a single successful
(non-#GP)
write of a non-zero value, interception of the MSR is cleared in the MSR
bitmap for both reads and writes, on
both VMX and SVM.
This is done for performance reasons, which in this case matter more than
absolute correctness.
Thus to some extent the guest checks above are pointless.
 
If you ask
me, I would just remove kvm_spec_ctrl_valid_bits and pass this MSR through
to the guest right away, rather than on first access.
 
I talked with Paolo about this, and his opinion, if I understand correctly,
is that the
above is
a best effort correctness-wise, since at least we emulate the bits correctly
on first access.

> 
> > +
> > +static u64 kvm_spec_ctrl_valid_bits_guest(struct kvm_vcpu *vcpu)
> >  {
> > -   uint64_t bits = SPEC_CTRL_IBRS | SPEC_CTRL_STIBP | SPEC_CTRL_SSBD;
> > +   uint64_t bits = 0;
> >  
> > -   /* The STIBP bit doesn't fault even if it's not advertised */
> > -   if (!guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL) &&
> > -       !guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS))
> > -           bits &= ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP);
> > -   if (!boot_cpu_has(X86_FEATURE_SPEC_CTRL) &&
> > -       !boot_cpu_has(X86_FEATURE_AMD_IBRS))
> > -           bits &= ~(SPEC_CTRL_IBRS | SPEC_CTRL_STIBP);
> > +   if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL))
> > +           bits |= SPEC_CTRL_IBRS;
> > +   if (guest_cpuid_has(vcpu, X86_FEATURE_INTEL_STIBP))
> > +           bits |= SPEC_CTRL_STIBP;
> > +   if (guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL_SSBD))
> > +           bits |= SPEC_CTRL_SSBD;
> >  
> > -   if (!guest_cpuid_has(vcpu, X86_FEATURE_SPEC_CTRL_SSBD) &&
> > -       !guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD))
> > -           bits &= ~SPEC_CTRL_SSBD;
> > -   if (!boot_cpu_has(X86_FEATURE_SPEC_CTRL_SSBD) &&
> > -       !boot_cpu_has(X86_FEATURE_AMD_SSBD))
> > -           bits &= ~SPEC_CTRL_SSBD;
> > +   if (guest_cpuid_has(vcpu, X86_FEATURE_AMD_IBRS) ||
> > +                   guest_cpuid_has(vcpu, X86_FEATURE_AMD_STIBP))
> 
> Bad indentation.
True.

> 
> > +           bits |= SPEC_CTRL_STIBP | SPEC_CTRL_IBRS;
> > +   if (guest_cpuid_has(vcpu, X86_FEATURE_AMD_SSBD))
> > +           bits |= SPEC_CTRL_STIBP | SPEC_CTRL_IBRS | SPEC_CTRL_SSBD;
> 
> Would it be feasible to split into two patches?  The first (tagged Fixes:)
> to make the functional changes without inverting the logic or splitting, and
> then do the cleanup?  It's really hard to review this patch because I can't
> easily tease out what's different in terms of functionality.

The new logic (hopefully) follows both Intel's and AMD's specs.
I will try to split it, though.



> 
> >     return bits;
> >  }
> > +
> > +u64 kvm_spec_ctrl_valid_bits(struct kvm_vcpu *vcpu)
> > +{
> > +   return kvm_spec_ctrl_valid_bits_host() &
> > +          kvm_spec_ctrl_valid_bits_guest(vcpu);
> > +}
> > +
> > +
> >  EXPORT_SYMBOL_GPL(kvm_spec_ctrl_valid_bits);
> >  
> >  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
> > -- 
> > 2.25.4
> > 

Thanks for the review,
        Best regards,
                Maxim Levitsky


