Re: [PATCH 1/1] KVM: don't allow irq_fpu_usable when the VCPU's XCR0 is loaded

2016-03-19 Thread Paolo Bonzini


On 16/03/2016 04:55, Xiao Guangrong wrote:
>>>>> Probably not. AFAICT KVM does not rely on it being loaded outside that
>>>>> region. xsetbv isn't insanely expensive, is it? Maybe to minimize the
>>>>> time spent with interrupts disabled it was put outside.
>>>>>
>>>>> I do like that your solution would be contained to KVM.
>>>>
>>>> I agree with Andy.  We do want a fix for recent kernels because of the
>>>> !eager_fpu case that Guangrong mentioned.
>
> Relying on interrupt is not easy as XCR0 can not be automatically
> saved/loaded by VMCS... Once interrupt happens, it will use guest's XCR0
> anyway.

Right, that's why an xsetbv while interrupts are disabled is appealing.

Paolo
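
The ordering argument in this message can be illustrated with a small
userspace model. This is only a sketch under stated assumptions: struct
sim_cpu, the sim_* helpers and the two XCR0 values are hypothetical and
are not kernel code. It compares loading the guest's XCR0 before
interrupts are disabled with doing both xsetbv calls inside the
interrupts-disabled region.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only: x87|SSE|AVX for the host, x87|SSE for the guest. */
#define SIM_HOST_XCR0  UINT64_C(0x7)
#define SIM_GUEST_XCR0 UINT64_C(0x3)

/* Hypothetical model of one host CPU; none of this is a kernel API. */
struct sim_cpu {
	uint64_t xcr0;            /* simulated XCR0 contents */
	bool irqs_on;             /* simulated interrupt-enable state */
	bool irq_saw_guest_xcr0;  /* set if a host interrupt could run on guest XCR0 */
};

/* A host interrupt may arrive whenever interrupts are enabled. */
static inline void sim_irq_window(struct sim_cpu *c)
{
	if (c->irqs_on && c->xcr0 != SIM_HOST_XCR0)
		c->irq_saw_guest_xcr0 = true;
}

static inline void sim_xsetbv(struct sim_cpu *c, uint64_t val)
{
	c->xcr0 = val;
	sim_irq_window(c);
}

static inline void sim_set_irqs(struct sim_cpu *c, bool on)
{
	c->irqs_on = on;
	sim_irq_window(c);
}

/* Guest XCR0 is loaded and restored outside the interrupts-disabled region. */
static inline bool sim_swap_outside(void)
{
	struct sim_cpu c = { SIM_HOST_XCR0, true, false };

	sim_xsetbv(&c, SIM_GUEST_XCR0);
	sim_set_irqs(&c, false);
	/* guest would run here */
	sim_set_irqs(&c, true);
	sim_xsetbv(&c, SIM_HOST_XCR0);
	return c.irq_saw_guest_xcr0;
}

/* Both xsetbv calls happen while interrupts are disabled. */
static inline bool sim_swap_inside(void)
{
	struct sim_cpu c = { SIM_HOST_XCR0, true, false };

	sim_set_irqs(&c, false);
	sim_xsetbv(&c, SIM_GUEST_XCR0);
	/* guest would run here */
	sim_xsetbv(&c, SIM_HOST_XCR0);
	sim_set_irqs(&c, true);
	return c.irq_saw_guest_xcr0;
}
```

In this model sim_swap_outside() reports that a host interrupt could
observe the guest's XCR0, while sim_swap_inside() does not. The cost is
two extra xsetbv executions with interrupts off, which is the trade-off
raised earlier in the thread.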


Re: [PATCH 1/1] KVM: don't allow irq_fpu_usable when the VCPU's XCR0 is loaded

2016-03-15 Thread Xiao Guangrong



On 03/16/2016 03:32 AM, Paolo Bonzini wrote:
> On 15/03/2016 19:27, Andy Lutomirski wrote:
>> On Mon, Mar 14, 2016 at 6:17 AM, Paolo Bonzini wrote:
>>> On 11/03/2016 22:33, David Matlack wrote:
>>>>> Is this better than just always keeping the host's XCR0 loaded outside
>>>>> of the KVM interrupts-disabled region?
>>>>
>>>> Probably not. AFAICT KVM does not rely on it being loaded outside that
>>>> region. xsetbv isn't insanely expensive, is it? Maybe to minimize the
>>>> time spent with interrupts disabled it was put outside.
>>>>
>>>> I do like that your solution would be contained to KVM.
>>>
>>> I agree with Andy.  We do want a fix for recent kernels because of the
>>> !eager_fpu case that Guangrong mentioned.

Relying on interrupt is not easy as XCR0 can not be automatically saved/loaded
by VMCS... Once interrupt happens, it will use guest's XCR0 anyway.



Re: [PATCH 1/1] KVM: don't allow irq_fpu_usable when the VCPU's XCR0 is loaded

2016-03-15 Thread Paolo Bonzini


On 15/03/2016 19:27, Andy Lutomirski wrote:
> On Mon, Mar 14, 2016 at 6:17 AM, Paolo Bonzini  wrote:
>>
>>
>> On 11/03/2016 22:33, David Matlack wrote:
>>>> Is this better than just always keeping the host's XCR0 loaded outside
>>>> of the KVM interrupts-disabled region?
>>>
>>> Probably not. AFAICT KVM does not rely on it being loaded outside that
>>> region. xsetbv isn't insanely expensive, is it? Maybe to minimize the
>>> time spent with interrupts disabled it was put outside.
>>>
>>> I do like that your solution would be contained to KVM.
>>
>> I agree with Andy.  We do want a fix for recent kernels because of the
>> !eager_fpu case that Guangrong mentioned.
>>
>> Paolo
>>
>> ps: while Andy is planning to kill lazy FPU, I want to benchmark it with
>> KVM...  Remember that with a single pre-xsave host in your cluster, your
>> virt management might happily default your VMs to a Westmere or Nehalem
>> CPU model.  GCC might be a pretty good testbench for this (e.g. a kernel
>> compile with very high make -j), because outside of the lexer (which
>> plays SIMD games) it never uses the FPU.
> 
> Aren't pre-xsave CPUs really, really old?  A brief search suggests
> that Intel Core added it somewhere in the middle of the cycle.

I am fairly sure it was added in Sandy Bridge, together with AVX. But
what really matters for eager FPU is not xsave, it's xsaveopt, and I
think AMD has never even produced a microprocessor that supports it.

> For pre-xsave, it could indeed hurt performance a tiny bit under
> workloads that use the FPU and then stop completely because the
> xsaveopt and init optimizations aren't available.  But even that is
> probably a very small effect, especially because pre-xsave CPUs have
> smaller FPU state sizes.

It's still a few cache lines.  Benchmarks will tell.

Paolo
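
Which CPUs have xsave and which have xsaveopt can be checked from CPUID
rather than from memory. The following is a minimal sketch, not code
from the patch: the helper names are hypothetical, the bit positions
are the architectural ones (CPUID.01H:ECX bit 26 for XSAVE, bit 28 for
AVX, CPUID.(EAX=0DH,ECX=1):EAX bit 0 for XSAVEOPT), and the hardware
query assumes GCC or clang on x86 with <cpuid.h>.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* CPUID.01H:ECX bit 26: XSAVE/XRSTOR/XSETBV/XGETBV are supported. */
static inline bool cpuid_has_xsave(uint32_t leaf1_ecx)
{
	return (leaf1_ecx >> 26) & 1u;
}

/* CPUID.01H:ECX bit 28: AVX is supported. */
static inline bool cpuid_has_avx(uint32_t leaf1_ecx)
{
	return (leaf1_ecx >> 28) & 1u;
}

/* CPUID.(EAX=0DH,ECX=1):EAX bit 0: XSAVEOPT is supported. */
static inline bool cpuid_has_xsaveopt(uint32_t leafd_sub1_eax)
{
	return leafd_sub1_eax & 1u;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Query the running CPU; returns false if leaf 0DH is not available. */
static inline bool running_cpu_has_xsaveopt(void)
{
	unsigned int eax, ebx, ecx, edx;

	if (__get_cpuid_max(0, NULL) < 0x0d)
		return false;
	__cpuid_count(0x0d, 1, eax, ebx, ecx, edx);
	return cpuid_has_xsaveopt(eax);
}
#endif
```

The decoding helpers take raw register values, so they can also be fed
the CPUID data a guest CPU model exposes, which is the case that matters
when virt management picks an older model for the VM.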


Re: [PATCH 1/1] KVM: don't allow irq_fpu_usable when the VCPU's XCR0 is loaded

2016-03-15 Thread Andy Lutomirski
On Mon, Mar 14, 2016 at 6:17 AM, Paolo Bonzini  wrote:
>
>
> On 11/03/2016 22:33, David Matlack wrote:
>> > Is this better than just always keeping the host's XCR0 loaded outside
>> > of the KVM interrupts-disabled region?
>>
>> Probably not. AFAICT KVM does not rely on it being loaded outside that
>> region. xsetbv isn't insanely expensive, is it? Maybe to minimize the
>> time spent with interrupts disabled it was put outside.
>>
>> I do like that your solution would be contained to KVM.
>
> I agree with Andy.  We do want a fix for recent kernels because of the
> !eager_fpu case that Guangrong mentioned.
>
> Paolo
>
> ps: while Andy is planning to kill lazy FPU, I want to benchmark it with
> KVM...  Remember that with a single pre-xsave host in your cluster, your
> virt management might happily default your VMs to a Westmere or Nehalem
> CPU model.  GCC might be a pretty good testbench for this (e.g. a kernel
> compile with very high make -j), because outside of the lexer (which
> plays SIMD games) it never uses the FPU.

Aren't pre-xsave CPUs really, really old?  A brief search suggests
that Intel Core added it somewhere in the middle of the cycle.

For pre-xsave, it could indeed hurt performance a tiny bit under
workloads that use the FPU and then stop completely because the
xsaveopt and init optimizations aren't available.  But even that is
probably a very small effect, especially because pre-xsave CPUs have
smaller FPU state sizes.


-- 
Andy Lutomirski
AMA Capital Management, LLC


Re: [PATCH 1/1] KVM: don't allow irq_fpu_usable when the VCPU's XCR0 is loaded

2016-03-14 Thread Paolo Bonzini


On 11/03/2016 22:33, David Matlack wrote:
> > Is this better than just always keeping the host's XCR0 loaded outside
> > of the KVM interrupts-disabled region?
> 
> Probably not. AFAICT KVM does not rely on it being loaded outside that
> region. xsetbv isn't insanely expensive, is it? Maybe to minimize the
> time spent with interrupts disabled it was put outside.
> 
> I do like that your solution would be contained to KVM.

I agree with Andy.  We do want a fix for recent kernels because of the
!eager_fpu case that Guangrong mentioned.

Paolo

ps: while Andy is planning to kill lazy FPU, I want to benchmark it with
KVM...  Remember that with a single pre-xsave host in your cluster, your
virt management might happily default your VMs to a Westmere or Nehalem
CPU model.  GCC might be a pretty good testbench for this (e.g. a kernel
compile with very high make -j), because outside of the lexer (which
plays SIMD games) it never uses the FPU.


Re: [PATCH 1/1] KVM: don't allow irq_fpu_usable when the VCPU's XCR0 is loaded

2016-03-11 Thread David Matlack
On Fri, Mar 11, 2016 at 1:14 PM, Andy Lutomirski  wrote:
>
> On Fri, Mar 11, 2016 at 12:47 PM, David Matlack  wrote:
> > From: Eric Northup 
> >
> > Add a percpu boolean, tracking whether a KVM vCPU is running on the
> > host CPU.  KVM will set and clear it as it loads/unloads guest XCR0.
> > (Note that the rest of the guest FPU load/restore is safe, because
> > kvm_load_guest_fpu and kvm_put_guest_fpu call __kernel_fpu_begin()
> > and __kernel_fpu_end(), respectively.)  irq_fpu_usable() will then
> > also check for this percpu boolean.
>
> Is this better than just always keeping the host's XCR0 loaded outside
> of the KVM interrupts-disabled region?

Probably not. AFAICT KVM does not rely on it being loaded outside that
region. xsetbv isn't insanely expensive, is it? Maybe to minimize the
time spent with interrupts disabled it was put outside.

I do like that your solution would be contained to KVM.

>
> --Andy


Re: [PATCH 1/1] KVM: don't allow irq_fpu_usable when the VCPU's XCR0 is loaded

2016-03-11 Thread Andy Lutomirski
On Fri, Mar 11, 2016 at 12:47 PM, David Matlack  wrote:
> From: Eric Northup 
>
> Add a percpu boolean, tracking whether a KVM vCPU is running on the
> host CPU.  KVM will set and clear it as it loads/unloads guest XCR0.
> (Note that the rest of the guest FPU load/restore is safe, because
> kvm_load_guest_fpu and kvm_put_guest_fpu call __kernel_fpu_begin()
> and __kernel_fpu_end(), respectively.)  irq_fpu_usable() will then
> also check for this percpu boolean.

Is this better than just always keeping the host's XCR0 loaded outside
of the KVM interrupts-disabled region?

--Andy


[PATCH 1/1] KVM: don't allow irq_fpu_usable when the VCPU's XCR0 is loaded

2016-03-11 Thread David Matlack
From: Eric Northup 

Add a percpu boolean, tracking whether a KVM vCPU is running on the
host CPU.  KVM will set and clear it as it loads/unloads guest XCR0.
(Note that the rest of the guest FPU load/restore is safe, because
kvm_load_guest_fpu and kvm_put_guest_fpu call __kernel_fpu_begin()
and __kernel_fpu_end(), respectively.)  irq_fpu_usable() will then
also check for this percpu boolean.
---
 arch/x86/include/asm/i387.h |  3 +++
 arch/x86/kernel/i387.c      | 10 ++++++++--
 arch/x86/kvm/x86.c          |  4 ++++
 3 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/i387.h b/arch/x86/include/asm/i387.h
index ed8089d..ca2c173 100644
--- a/arch/x86/include/asm/i387.h
+++ b/arch/x86/include/asm/i387.h
@@ -14,6 +14,7 @@
 
 #include 
 #include 
+#include 
 
 struct pt_regs;
 struct user_i387_struct;
@@ -25,6 +26,8 @@ extern void math_state_restore(void);
 
 extern bool irq_fpu_usable(void);
 
+DECLARE_PER_CPU(bool, kvm_xcr0_loaded);
+
 /*
  * Careful: __kernel_fpu_begin/end() must be called with preempt disabled
  * and they don't touch the preempt state on their own.
diff --git a/arch/x86/kernel/i387.c b/arch/x86/kernel/i387.c
index b627746..9015828 100644
--- a/arch/x86/kernel/i387.c
+++ b/arch/x86/kernel/i387.c
@@ -19,6 +19,9 @@
 #include 
 #include 
 
+DEFINE_PER_CPU(bool, kvm_xcr0_loaded);
+EXPORT_PER_CPU_SYMBOL(kvm_xcr0_loaded);
+
 /*
  * Were we in an interrupt that interrupted kernel mode?
  *
@@ -33,8 +36,11 @@
  */
 static inline bool interrupted_kernel_fpu_idle(void)
 {
-	if (use_eager_fpu())
-		return __thread_has_fpu(current);
+	if (use_eager_fpu()) {
+		/* Preempt already disabled, safe to read percpu. */
+		return __thread_has_fpu(current) &&
+			!__this_cpu_read(kvm_xcr0_loaded);
+	}
 
 	return !__thread_has_fpu(current) &&
 		(read_cr0() & X86_CR0_TS);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d21bce5..f0ba7a1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -557,8 +557,10 @@ EXPORT_SYMBOL_GPL(kvm_lmsw);
 
 static void kvm_load_guest_xcr0(struct kvm_vcpu *vcpu)
 {
+	BUG_ON(this_cpu_read(kvm_xcr0_loaded) != vcpu->guest_xcr0_loaded);
 	if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE) &&
 			!vcpu->guest_xcr0_loaded) {
+		this_cpu_write(kvm_xcr0_loaded, 1);
 		/* kvm_set_xcr() also depends on this */
 		xsetbv(XCR_XFEATURE_ENABLED_MASK, vcpu->arch.xcr0);
 		vcpu->guest_xcr0_loaded = 1;
@@ -571,7 +573,9 @@ static void kvm_put_guest_xcr0(struct kvm_vcpu *vcpu)
 		if (vcpu->arch.xcr0 != host_xcr0)
 			xsetbv(XCR_XFEATURE_ENABLED_MASK, host_xcr0);
 		vcpu->guest_xcr0_loaded = 0;
+		this_cpu_write(kvm_xcr0_loaded, 0);
 	}
+	BUG_ON(this_cpu_read(kvm_xcr0_loaded) != vcpu->guest_xcr0_loaded);
 }
 
 int __kvm_set_xcr(struct kvm_vcpu *vcpu, u32 index, u64 xcr)
-- 
2.7.0.rc3.207.g0ac5344
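
The effect of the patch on interrupted_kernel_fpu_idle() can be followed
in a userspace model. This is a sketch only: struct cpu_model and the
model_* functions are hypothetical stand-ins for the per-CPU flag and
kernel helpers named in the diff, and nothing here touches real hardware
state.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the state of one host CPU. */
struct cpu_model {
	bool kvm_xcr0_loaded;  /* mirrors the percpu flag added by the patch */
	bool thread_has_fpu;   /* stands in for __thread_has_fpu(current) */
	bool cr0_ts;           /* stands in for read_cr0() & X86_CR0_TS */
	bool eager_fpu;        /* stands in for use_eager_fpu() */
};

/* Same decision structure as the patched interrupted_kernel_fpu_idle(). */
static inline bool model_interrupted_kernel_fpu_idle(const struct cpu_model *c)
{
	if (c->eager_fpu)
		return c->thread_has_fpu && !c->kvm_xcr0_loaded;

	/* The lazy path is untouched by the patch and ignores the flag. */
	return !c->thread_has_fpu && c->cr0_ts;
}

/* Flag transitions made by kvm_load_guest_xcr0()/kvm_put_guest_xcr0(). */
static inline void model_load_guest_xcr0(struct cpu_model *c)
{
	c->kvm_xcr0_loaded = true;
}

static inline void model_put_guest_xcr0(struct cpu_model *c)
{
	c->kvm_xcr0_loaded = false;
}
```

With eager_fpu set, the model refuses FPU use from interrupt context
while the guest's XCR0 is loaded and allows it again after the put. With
eager_fpu clear, the flag has no effect on the result, which matches the
diff as posted, where the new check sits only in the eager branch.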


