On Jun 4, 2014, at 10:43 PM, Gabriel L. Somlo <[email protected]> wrote:

My implementation still emulates the instruction as a NOP, but first checks for 
an exception.
> On Wed, Jun 04, 2014 at 10:12:39PM +0300, Nadav Amit wrote:
> 
> I'd be curious how you're dealing with the "hidden" CPU state which
> tells MWAIT to sleep until someone or something writes to the
> monitored memory area set up by a corresponding MONITOR instruction.
>> Regardless of the whole discussion of what the guest is informed about, I 
>> think it might be better to implement mwait and monitor correctly according 
>> to the spec and let the instructions be fully emulated.
>> Both mwait and monitor may encounter exceptions (#GP, #PF, regardless of 
>> #UD), so this behaviour should be correct.
>> If you want, I'll send my version, which looks something like:
>> 
>> static int em_monitor(struct x86_emulate_ctxt *ctxt)
>> {
>>      int rc;
>>      struct segmented_address addr;
>>      u64 rcx = reg_read(ctxt, VCPU_REGS_RCX);
>>      u64 rax = reg_read(ctxt, VCPU_REGS_RAX);
>>      u8 byte;
>> 
>>      rc = check_mwait_supported(ctxt);
>>      if (rc != X86EMUL_CONTINUE)
>>              return rc;
>> 
>>      if (ctxt->mode != X86EMUL_MODE_PROT64)
>>              rcx = (u32)rcx;
>> 
>>      if (rcx != 0)
>>              return emulate_gp(ctxt, 0);
>> 
>>      addr.seg = seg_override(ctxt);
>>      addr.ea = ctxt->ad_bytes == 8 ? rax : (u32)rax;
>> 
>>      rc = segmented_read(ctxt, addr, &byte, 1);
>>      if (rc != X86EMUL_CONTINUE)
>>              return rc;
>> 
>>      return X86EMUL_CONTINUE;
>> }
>> 
>> static int em_mwait(struct x86_emulate_ctxt *ctxt)
>> {
>>      u64 rcx = reg_read(ctxt, VCPU_REGS_RCX);
>>      int rc = check_mwait_supported(ctxt);
>>      if (rc != X86EMUL_CONTINUE)
>>              return rc;
>>      if (ctxt->mode != X86EMUL_MODE_PROT64)
>>              rcx = (u32)rcx;
>> 
>>      if ((rcx & ~(u64)1) != 0)
>>              return emulate_gp(ctxt, 0);
>> 
>>      if (rcx & 1) {
>>              /* Interrupt as break event */
>>              u32 ebx, ecx, edx, eax;
>>              eax = 5;
>>              ecx = 0;
>>              ctxt->ops->get_cpuid(ctxt, &eax, &ebx, &ecx, &edx);
>>              if (!(ecx & 1))
>>                      return emulate_gp(ctxt, 0);
>>      }
>>      return X86EMUL_CONTINUE;
>> }

Anyhow, if you want a real mwait emulation, you can write-protect the page that 
contains the monitored memory area in the EPT of the other VCPUs and register a 
callback that fires once a write to the area takes place. You may want the host 
to cause a spurious wake-up once the write-protection is in place, so that you 
do not miss a write by another VCPU to the monitored area. After the spurious 
wake-up, the VM is likely to issue an additional mwait on the same monitored 
cache line.
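
To make the ordering concrete, here is a rough user-space sketch of that 
scheme. It is a toy model, not KVM code: every name in it (toy_vm, toy_vcpu, 
toy_monitor, toy_mwait, toy_write_fault) is made up for illustration, and the 
EPT write-protection itself is only represented by a flag. It assumes the first 
MWAIT after a fresh write-protection returns immediately (the spurious 
wake-up), and that the guest then re-checks memory and re-issues MONITOR/MWAIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_VCPUS   4
#define TOY_PAGE_SHIFT 12

/* Hypothetical per-VCPU monitor state; not a real KVM structure. */
struct toy_vcpu {
	bool     armed;            /* MONITOR executed, not yet triggered */
	bool     waiting;          /* blocked inside the emulated MWAIT */
	bool     write_protected;  /* page is write-protected for the other VCPUs */
	bool     spurious_pending; /* protection is new: next MWAIT must not block */
	uint64_t gfn;              /* guest frame number of the monitored page */
	unsigned wakeups;          /* wake-ups caused by a write from another VCPU */
};

struct toy_vm {
	struct toy_vcpu vcpu[TOY_NR_VCPUS];
};

/* Emulated MONITOR: arm the monitor and protect the page if it is not yet. */
static void toy_monitor(struct toy_vm *vm, int id, uint64_t gpa)
{
	struct toy_vcpu *v;
	uint64_t gfn = gpa >> TOY_PAGE_SHIFT;

	assert(id >= 0 && id < TOY_NR_VCPUS);
	v = &vm->vcpu[id];

	if (!v->write_protected || v->gfn != gfn) {
		/* Real code would write-protect gfn in the other VCPUs' EPT here. */
		v->write_protected = true;
		v->gfn = gfn;
		/* A write may have slipped in before the protection was set up. */
		v->spurious_pending = true;
	}
	v->armed = true;
}

/* Emulated MWAIT: returns true if the VCPU blocks, false if it falls through. */
static bool toy_mwait(struct toy_vm *vm, int id)
{
	struct toy_vcpu *v;

	assert(id >= 0 && id < TOY_NR_VCPUS);
	v = &vm->vcpu[id];

	if (!v->armed)
		return false;           /* monitor not armed: behave like a NOP */
	if (v->spurious_pending) {
		v->spurious_pending = false;
		v->armed = false;       /* spurious wake-up: guest re-checks memory */
		return false;
	}
	v->waiting = true;
	return true;
}

/* Write-fault callback: `writer` stored to a write-protected page. */
static void toy_write_fault(struct toy_vm *vm, int writer, uint64_t gpa)
{
	uint64_t gfn = gpa >> TOY_PAGE_SHIFT;
	int i;

	for (i = 0; i < TOY_NR_VCPUS; i++) {
		struct toy_vcpu *v = &vm->vcpu[i];

		if (i == writer || !v->write_protected || v->gfn != gfn)
			continue;
		v->write_protected = false; /* drop protection so the writer proceeds */
		v->armed = false;
		if (v->waiting) {
			v->waiting = false;
			v->wakeups++;
		}
	}
}
```

Note that the model tracks whole pages, while MONITOR works on a cache line, so 
a write to a different line of the same page also wakes the waiter. That should 
be harmless, since the guest has to tolerate spurious wake-ups anyway.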

Additional care for DMAs (emulated and paravirtual) might be needed with the 
assistance of QEMU. The complicated case is dealing with the DMAs of assigned 
devices due to the lack of support for I/O page-faults.

Nadav
