On Monday 24 May 2010 21:36:12 Avi Kivity wrote:
> On 05/24/2010 01:03 PM, Sheng Yang wrote:
> > From: Dexuan Cui<dexuan....@intel.com>
> > 
> > Enable XSAVE/XRSTORE for guest.
> > 
> > Change from V3:
> > 1. Enforced the assumption that host OS would use all available xstate
> > bits. 2. Various fixes, addressed Avi's comments.
> > 
> > I am still not clear about why we need to reload guest xcr0 when
> > cr4.osxsave set...
> 
> When cr4.osxsave=0, then the guest executes with the host xcr0 (since
> xgetbv will trap; this is similar to the guest running with the host fpu
> if cr0.ts=0).  So if cr4.osxsave transtions, we need to transition xcr0
> as well.

Yes...
> 
> > @@ -3354,6 +3356,29 @@ static int handle_wbinvd(struct kvm_vcpu *vcpu)
> > 
> >     return 1;
> >   
> >   }
> > 
> > +static int handle_xsetbv(struct kvm_vcpu *vcpu)
> > +{
> > +   u64 new_bv = kvm_read_edx_eax(vcpu);
> > +
> > +   if (kvm_register_read(vcpu, VCPU_REGS_RCX) != 0)
> > +           goto err;
> > +   if (vmx_get_cpl(vcpu) != 0)
> > +           goto err;
> > +   if (!(new_bv&  XSTATE_FP))
> > +           goto err;
> > +   if ((new_bv&  XSTATE_YMM)&&  !(new_bv&  XSTATE_SSE))
> > +           goto err;
> > +   if (new_bv&  ~XCNTXT_MASK)
> > +           goto err;
> 
> Ok.  This means we must update kvm immediately when XCNTXT_MASK changes.
> 
> (Otherwise we would use KVM_XCNTXT_MASK which is always smaller than
> than XCNTXT_MASK).

I guess use host_xcr0 here is better?
> 
> > +   vcpu->arch.xcr0 = new_bv;
> > +   xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
> > +   skip_emulated_instruction(vcpu);
> > +   return 1;
> > +err:
> > +   kvm_inject_gp(vcpu, 0);
> > +   return 1;
> > +}
> > +
> > 
> > 
> > @@ -4124,6 +4176,8 @@ int kvm_arch_init(void *opaque)
> > 
> >     perf_register_guest_info_callbacks(&kvm_guest_cbs);
> > 
> > +   host_xcr0 = xgetbv(XCR_XFEATURE_ENABLED_MASK);
> > +
> > 
> >     return 0;
> 
> Will fault on old cpu.

...
> 
> >   EXPORT_SYMBOL_GPL(fx_init);
> > 
> > @@ -5134,6 +5195,12 @@ void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
> > 
> >     vcpu->guest_fpu_loaded = 1;
> >     unlazy_fpu(current);
> > 
> > +   /*
> > +    * Restore all possible states in the guest,
> > +    * and assume host would use all available bits.
> > +    */
> > +   if (cpu_has_xsave&&  vcpu->arch.xcr0)
> > +           xsetbv(XCR_XFEATURE_ENABLED_MASK, host_xcr0);
> > 
> >     fpu_restore_checking(&vcpu->arch.guest_fpu);
> 
> I think we need to reload xcr0 now to the guest's value.
> 
> >     trace_kvm_fpu(1);
> >   
> >   }
> > 
> > @@ -5144,6 +5211,13 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
> > 
> >             return;
> >     
> >     vcpu->guest_fpu_loaded = 0;
> > 
> > +   /*
> > +    * Save all possible states in the guest,
> > +    * and assume host would use all available bits.
> > +    * Also load host_xcr0 for host usage.
> > +    */
> > +   if (cpu_has_xsave&&  vcpu->arch.xcr0)
> > +           xsetbv(XCR_XFEATURE_ENABLED_MASK, host_xcr0);
> > 
> >     fpu_save_init(&vcpu->arch.guest_fpu);
> >     ++vcpu->stat.fpu_reload;
> >     set_bit(KVM_REQ_DEACTIVATE_FPU,&vcpu->requests);
> 
> This might be unnecessary.
> 
> So far xcr0 life cycle is almost that of
> save_host_state()/load_host_state(), but not exactly.  When loading the
> guest fpu we switch temporarily to host xcr0, then we have to switch
> back, but only if gcr4.osxsave.  When saving the guest fpu, we're
> 
> already using the host xcr0:
> > void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > {
> > 
> >     kvm_x86_ops->vcpu_put(vcpu);
> >     kvm_put_guest_fpu(vcpu);
> > 
> > }
> 
> One way to simplify this is to have a vcpu->guest_xcr0_loaded flag and
> check it when needed.  So the transition matrix is:
> 
>    save_host_state: if gcr4.osxsave, set guest_xcr0_loaded, load it
>    set gcr4.osxsave: ditto
>    clear gcr4.osxsave: do nothing
>    load_host_state: if guest_xcr0_loaded, clear it, reload host xcr0
>    fpu switching: if (switched) switch; reload fpu; if (switched) switch
> 
> may be simplified if we move xcr0 reload back to guest entry (... :)
> but make it lazy:
> 
>    save_host_state: nothing
>    set cr4.osxsave: nothing
>    clear cr4.osxsave: nothing
>    guest entry: if (gcr4.osxsave && !guest_xcr0_loaded) {
> guest_xcr0_loaded = true, load gxcr0 }
>    load_host_state: if (guest_xcr0_loaded) { guest_xcr0_loaded = false;
> load host xcr0 }
>    fpu switching:  if (guest_xcr0_loaded) { guest_xcr0_loaded = false;
> load host xcr0 }, do fpu stuff
> 
> So we delay xcr0 reload as late as possible for both entry and exit.

I think I got it. But why we need do it at "load_host_state()"? I guess just 
put 
code before fpu testing in kvm_put_guest_fpu() is fine?

--
regards
Yang, Sheng
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to