Re: [PATCH v2 06/13] KVM: x86: save/load state on SMM switch

2015-06-04 Thread Paolo Bonzini


On 03/06/2015 21:02, Radim Krčmář wrote:
>> +	if (r < 0)
>> +		return;
> 
> And if we fail to write it, is there any other option than throwing an error
> to userspace?  (Unsetting HF_SMM_MASK and pretending that nothing happened
> doesn't find much support in the docs.)

Just do nothing, I guess.  If this is ROM, things will work (for some
definition of work) the same as in real hardware.  If it's MMIO, they
will break as soon as the next instruction is executed.

Paolo


Re: [PATCH v2 06/13] KVM: x86: save/load state on SMM switch

2015-06-04 Thread Paolo Bonzini


On 03/06/2015 21:02, Radim Krčmář wrote:
>> +	r = kvm_write_guest(vcpu->kvm, vcpu->arch.smbase + 0xfe00, buf, sizeof(buf));
> 
> The state is saved in SMRAM, but we are accessing it using the non-SMM
> address space ... how did it pass testing?
> (Restore is using SMM address space, so I'm guessing that the mapping
>  from QEMU wasn't really utilizing two separate address spaces.)

At this point of the series there are no separate address spaces yet.
Patch 10 then changes it everywhere:

@@ -6558,7 +6558,7 @@ static void process_smi(struct kvm_vcpu *vcpu)
 	else
 		process_smi_save_state_32(vcpu, buf);
 
-	r = kvm_write_guest(vcpu->kvm, vcpu->arch.smbase + 0xfe00, buf, sizeof(buf));
+	r = kvm_vcpu_write_guest(vcpu, vcpu->arch.smbase + 0xfe00, buf, sizeof(buf));
 	if (r < 0)
 		return;

Why did I order it this way?  Because it is already possible to test
this code with the default SMBASE of 0x30000, and it is already
possible to run the full firmware if you hack it not to close SMRAM
(for this I used q35's high SMRAM).  It is not possible to test the
code partially if you first add the two address spaces, and only
implement the world switch second.
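
For readers following the series, here is a minimal sketch, not the
patch 10 code itself, of why the vcpu-aware accessor matters once two
memslot address spaces exist; the helper name below is illustrative:

/*
 * Illustrative sketch only: with separate SMM and non-SMM memslot
 * address spaces, a vcpu-aware accessor such as kvm_vcpu_write_guest()
 * can pick the memslots that match the vcpu's current mode, while the
 * vm-wide kvm_write_guest() always sees the default address space.
 */
static inline int vcpu_address_space_id(struct kvm_vcpu *vcpu)
{
	/* is_smm() tests HF_SMM_MASK, which this patch sets on SMM entry */
	return is_smm(vcpu) ? 1 : 0;	/* 1 = SMM, 0 = normal */
}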

Thanks,

Paolo


>> +	if (r < 0)
>> +		return;
> 
> And if we fail to write it, is there any other option than throwing an error
> to userspace?  (Unsetting HF_SMM_MASK and pretending that nothing happened
> doesn't find much support in the docs.)
> 
> Thanks.
 


Re: [PATCH v2 06/13] KVM: x86: save/load state on SMM switch

2015-06-04 Thread Radim Krčmář
2015-06-04 08:14+0200, Paolo Bonzini:
> On 03/06/2015 21:02, Radim Krčmář wrote:
>>> +	r = kvm_write_guest(vcpu->kvm, vcpu->arch.smbase + 0xfe00, buf, sizeof(buf));
>> 
>> The state is saved in SMRAM, but we are accessing it using the non-SMM
>> address space ... how did it pass testing?
>> (Restore is using SMM address space, so I'm guessing that the mapping
>>  from QEMU wasn't really utilizing two separate address spaces.)
> 
> At this point of the series there are no separate address spaces yet.
> Patch 10 then changes it everywhere:
> 
> @@ -6558,7 +6558,7 @@ static void process_smi(struct kvm_vcpu *vcpu)

My bad, people using jackhammers at 7am are getting the better of me.

> Why did I order it this way?  Because it is already possible to test
> this code with the default SMBASE of 0x30000, and it is already
> possible to run the full firmware if you hack it not to close SMRAM
> (for this I used q35's high SMRAM).  It is not possible to test the
> code partially if you first add the two address spaces, and only
> implement the world switch second.

The ordering makes sense; I wanted to point out the early return,
noticed this as well, and missed that it was fixed later, sorry.


Re: [PATCH v2 06/13] KVM: x86: save/load state on SMM switch

2015-06-03 Thread Radim Krčmář
2015-05-27 19:05+0200, Paolo Bonzini:
> The big ugly one.  This patch adds support for switching in and out of
> system management mode, respectively upon receiving KVM_REQ_SMI and upon
> executing a RSM instruction.  Both 32- and 64-bit formats are supported
> for the SMM state save area.
> 
> Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
> ---
> 	RFC->v1: shift access rights left by 8 for 32-bit format
> 		 move tracepoint to kvm_set_hflags
> 		 fix NMI handling
> ---
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>  static void process_smi(struct kvm_vcpu *vcpu)
>  {
> +	struct kvm_segment cs, ds;
> +	char buf[512];
> +	u32 cr0;
> +	int r;
> +
>  	if (is_smm(vcpu)) {
>  		vcpu->arch.smi_pending = true;
>  		return;
>  	}
>  
> -	printk_once(KERN_DEBUG "Ignoring guest SMI\n");
> +	trace_kvm_enter_smm(vcpu->vcpu_id, vcpu->arch.smbase, true);
> +	vcpu->arch.hflags |= HF_SMM_MASK;
> +	memset(buf, 0, 512);
> +	if (guest_cpuid_has_longmode(vcpu))
> +		process_smi_save_state_64(vcpu, buf);
> +	else
> +		process_smi_save_state_32(vcpu, buf);
> +
> +	r = kvm_write_guest(vcpu->kvm, vcpu->arch.smbase + 0xfe00, buf, sizeof(buf));

The state is saved in SMRAM, but we are accessing it using the non-SMM
address space ... how did it pass testing?
(Restore is using SMM address space, so I'm guessing that the mapping
 from QEMU wasn't really utilizing two separate address spaces.)

> +	if (r < 0)
> +		return;

And if we fail to write it, is there any other option than throwing an error
to userspace?  (Unsetting HF_SMM_MASK and pretending that nothing happened
doesn't find much support in the docs.)

Thanks.


[PATCH v2 06/13] KVM: x86: save/load state on SMM switch

2015-05-27 Thread Paolo Bonzini
The big ugly one.  This patch adds support for switching in and out of
system management mode, respectively upon receiving KVM_REQ_SMI and upon
executing a RSM instruction.  Both 32- and 64-bit formats are supported
for the SMM state save area.

Signed-off-by: Paolo Bonzini <pbonz...@redhat.com>
---
	RFC->v1: shift access rights left by 8 for 32-bit format
		 move tracepoint to kvm_set_hflags
		 fix NMI handling
---
 arch/x86/kvm/cpuid.h   |   8 ++
 arch/x86/kvm/emulate.c | 248 +++++++++++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/trace.h   |  22 ++
 arch/x86/kvm/x86.c     | 225 +++++++++++++++++++++++++++++++++++++++-
 4 files changed, 501 insertions(+), 2 deletions(-)
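
As a reading aid for the hunks below: the segment-state offsets used
by the rsm_load_seg_* helpers, relative to their smbase argument,
collected in one place (the macro names are illustrative, not part of
the patch):

/*
 * Reading aid, not part of the patch.  Segment numbering follows
 * KVM's VCPU_SREG_* order (ES, CS, SS, DS, FS, GS), hence the n < 3
 * split in the 32-bit layout.
 */
#define SMRAM32_SEG_SEL(n)	(0x7fa8 + (n) * 4)	/* 32-bit: selector */
#define SMRAM32_SEG_DESC(n)	((n) < 3 ? 0x7f84 + (n) * 12 \
					 : 0x7f2c + ((n) - 3) * 12)
#define SMRAM64_SEG(n)		(0x7e00 + (n) * 16)	/* 64-bit: 16 bytes each */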

diff --git a/arch/x86/kvm/cpuid.h b/arch/x86/kvm/cpuid.h
index 496b3695d3d3..dd05b9cef6ae 100644
--- a/arch/x86/kvm/cpuid.h
+++ b/arch/x86/kvm/cpuid.h
@@ -70,6 +70,14 @@ static inline bool guest_cpuid_has_fsgsbase(struct kvm_vcpu *vcpu)
 	return best && (best->ebx & bit(X86_FEATURE_FSGSBASE));
 }
 
+static inline bool guest_cpuid_has_longmode(struct kvm_vcpu *vcpu)
+{
+	struct kvm_cpuid_entry2 *best;
+
+	best = kvm_find_cpuid_entry(vcpu, 0x80000001, 0);
+	return best && (best->edx & bit(X86_FEATURE_LM));
+}
+
 static inline bool guest_cpuid_has_osvw(struct kvm_vcpu *vcpu)
 {
 	struct kvm_cpuid_entry2 *best;
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index e763a9b8c26b..1da4dd88465d 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2259,12 +2259,258 @@ static int em_lseg(struct x86_emulate_ctxt *ctxt)
 	return rc;
 }
 
+static int emulator_has_longmode(struct x86_emulate_ctxt *ctxt)
+{
+	u32 eax, ebx, ecx, edx;
+
+	eax = 0x80000001;
+	ecx = 0;
+	ctxt->ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
+	return edx & bit(X86_FEATURE_LM);
+}
+
+#define get_smstate(type, smbase, offset)				  \
+	({								  \
+	 type __val;							  \
+	 int r = ctxt->ops->read_std(ctxt, smbase + offset, &__val,	  \
+				     sizeof(__val), NULL);		  \
+	 if (r != X86EMUL_CONTINUE)					  \
+		 return X86EMUL_UNHANDLEABLE;				  \
+	 __val;								  \
+	})
+
+static void rsm_set_desc_flags(struct desc_struct *desc, u32 flags)
+{
+	desc->g    = (flags >> 23) & 1;
+	desc->d    = (flags >> 22) & 1;
+	desc->l    = (flags >> 21) & 1;
+	desc->avl  = (flags >> 20) & 1;
+	desc->p    = (flags >> 15) & 1;
+	desc->dpl  = (flags >> 13) & 3;
+	desc->s    = (flags >> 12) & 1;
+	desc->type = (flags >>  8) & 15;
+}
+
+static int rsm_load_seg_32(struct x86_emulate_ctxt *ctxt, u64 smbase, int n)
+{
+	struct desc_struct desc;
+	int offset;
+	u16 selector;
+
+	selector = get_smstate(u32, smbase, 0x7fa8 + n * 4);
+
+	if (n < 3)
+		offset = 0x7f84 + n * 12;
+	else
+		offset = 0x7f2c + (n - 3) * 12;
+
+	set_desc_base(&desc,      get_smstate(u32, smbase, offset + 8));
+	set_desc_limit(&desc,     get_smstate(u32, smbase, offset + 4));
+	rsm_set_desc_flags(&desc, get_smstate(u32, smbase, offset));
+	ctxt->ops->set_segment(ctxt, selector, &desc, 0, n);
+	return X86EMUL_CONTINUE;
+}
+
+static int rsm_load_seg_64(struct x86_emulate_ctxt *ctxt, u64 smbase, int n)
+{
+	struct desc_struct desc;
+	int offset;
+	u16 selector;
+	u32 base3;
+
+	offset = 0x7e00 + n * 16;
+
+	selector =                get_smstate(u16, smbase, offset);
+	rsm_set_desc_flags(&desc, get_smstate(u16, smbase, offset + 2) << 8);
+	set_desc_limit(&desc,     get_smstate(u32, smbase, offset + 4));
+	set_desc_base(&desc,      get_smstate(u32, smbase, offset + 8));
+	base3 =                   get_smstate(u32, smbase, offset + 12);
+
+	ctxt->ops->set_segment(ctxt, selector, &desc, base3, n);
+	return X86EMUL_CONTINUE;
+}
+
+static int rsm_enter_protected_mode(struct x86_emulate_ctxt *ctxt,
+				    u64 cr0, u64 cr4)
+{
+	int bad;
+
+	/*
+	 * First enable PAE, long mode needs it before CR0.PG = 1 is set.
+	 * Then enable protected mode.  However, PCID cannot be enabled
+	 * if EFER.LMA=0, so set it separately.
+	 */
+	bad = ctxt->ops->set_cr(ctxt, 4, cr4 & ~X86_CR4_PCIDE);
+	if (bad)
+		return X86EMUL_UNHANDLEABLE;
+
+	bad = ctxt->ops->set_cr(ctxt, 0, cr0);
+	if (bad)
+		return X86EMUL_UNHANDLEABLE;
+
+	if (cr4 & X86_CR4_PCIDE) {
+		bad = ctxt->ops->set_cr(ctxt, 4, cr4);
+		if (bad)
+			return X86EMUL_UNHANDLEABLE;
+	}
+
+	return X86EMUL_CONTINUE;
+}
+