On Thu, Apr 11, 2013 at 02:18:15AM +0200, Borislav Petkov wrote:
> On Wed, Apr 10, 2013 at 03:16:39PM +0300, Gleb Natapov wrote:
> > Right, the question is how kernel can tell QEMU that the cpuid bit is
> > supported but should not be set unless explicitly asked by an user.
>
> Actually, this seems to work with the patch below based on whether you
> have "+movbe" in the -cpu option or not.
>
> Anyway, here's the second version with hopefully all comments and
> suggestions addressed.
>
> Thanks.
>
> --
> >From 612fc75a732ad16332f270b7c52a68c89e3565ca Mon Sep 17 00:00:00 2001
> From: Borislav Petkov <[email protected]>
> Date: Thu, 11 Apr 2013 02:06:30 +0200
> Subject: [PATCH] kvm: Emulate MOVBE
>
> This basically came from the need to be able to boot 32-bit Atom SMP
> guests on an AMD host, i.e. host which doesn't support MOVBE. As a
> matter of fact, qemu has since recently received MOVBE support but we
> cannot share that with kvm emulation and thus we have to do this in the
> host.
>
> We piggyback on the #UD path and emulate the MOVBE functionality. With
> it, an 8-core SMP guest boots in under 6 seconds.
>
> Also, requesting MOVBE emulation needs to happen explicitly to work,
> i.e. qemu -cpu n270,+movbe...
>
> Signed-off-by: Andre Przywara <[email protected]>
> Signed-off-by: Borislav Petkov <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 2 +-
> arch/x86/kvm/emulate.c | 39 +++++++++++++++++++++++++++++++++++++--
> 2 files changed, 38 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index a20ecb5b6cbf..2d44fc4fd855 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -273,7 +273,7 @@ static int do_cpuid_ent(struct kvm_cpuid_entry2 *entry,
> u32 function,
> cpuid_mask(&entry->ecx, 4);
> /* we support x2apic emulation even if host does not support
> * it since we emulate x2apic in software */
> - entry->ecx |= F(X2APIC);
> + entry->ecx |= F(X2APIC) | F(MOVBE);
> break;
> /* function 2 entries are STATEFUL. That is, repeated cpuid commands
> * may return different values. This forces us to get_cpu() before
> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
> index a335cc6cde72..9011c7a656ad 100644
> --- a/arch/x86/kvm/emulate.c
> +++ b/arch/x86/kvm/emulate.c
> @@ -152,6 +152,7 @@
> #define Avx ((u64)1 << 43) /* Advanced Vector Extensions */
> #define Fastop ((u64)1 << 44) /* Use opcode::u.fastop */
> #define NoWrite ((u64)1 << 45) /* No writeback */
> +#define EmulateOnUD ((u64)1 << 46) /* emulate if unsupported by the host */
>
Just rename VendorSpecific to EmulateOnUD. The meaning is the same.
> #define X2(x...) x, x
> #define X3(x...) X2(x), x
> @@ -3107,6 +3108,30 @@ static int em_mov(struct x86_emulate_ctxt *ctxt)
> return X86EMUL_CONTINUE;
> }
>
> +static int em_movbe(struct x86_emulate_ctxt *ctxt)
> +{
> + switch (ctxt->op_bytes) {
> + case 2:
> + *(u16 *)ctxt->dst.valptr = swab16(*(u16 *)ctxt->src.valptr);
What's wrong with: ctxt->dst.val = swab16((u16)ctxt->src.val)?
> + break;
> + case 4:
> + *(u32 *)ctxt->dst.valptr = swab32(*(u32 *)ctxt->src.valptr);
> +
> + /*
> + * clear upper dword for 32-bit operand size in 64-bit mode.
> + */
> + if (ctxt->mode == X86EMUL_MODE_PROT64)
> + *((u32 *)ctxt->dst.valptr + 1) = 0x0;
It should be clean without this this.
> + break;
> + case 8:
> + *(u64 *)ctxt->dst.valptr = swab64(*(u64 *)ctxt->src.valptr);
> + break;
> + default:
> + return X86EMUL_PROPAGATE_FAULT;
> + }
> + return X86EMUL_CONTINUE;
> +}
> +
> static int em_cr_write(struct x86_emulate_ctxt *ctxt)
> {
> if (ctxt->ops->set_cr(ctxt, ctxt->modrm_reg, ctxt->src.val))
> @@ -4033,6 +4058,11 @@ static const struct opcode twobyte_table[256] = {
> N, N, N, N, N, N, N, N, N, N, N, N, N, N, N, N
> };
>
> +static const struct opcode threebyte_table[] = {
> + [0xf0] = I(DstReg | SrcMem | ModRM | Mov | EmulateOnUD, em_movbe),
> + [0xf1] = I(DstMem | SrcReg | ModRM | Mov | EmulateOnUD, em_movbe),
> +};
> +
Populate the whole table explicitly please. You can use X16(N) to make it
short. The encoding is also wrong since 0F 38 F0/F1 instruction decoding
depend on a prefix. Same opcode is decoded as CRC32 with f2 prefix. We
have Prefix mechanism for such instructions, but it currently assumes
that 66 and f2/f3 prefixes are mutually exclusive, but in this case they
are not, ugh.
> #undef D
> #undef N
> #undef G
> @@ -4320,6 +4350,9 @@ done_prefixes:
> ctxt->twobyte = 1;
> ctxt->b = insn_fetch(u8, ctxt);
> opcode = twobyte_table[ctxt->b];
> +
> + if (ctxt->b == 0x38)
> + opcode = threebyte_table[insn_fetch(u8, ctxt)];
ctxt->b = insn_fetch(u8, ctxt);
opcode = threebyte_table[ctxt->b];
Otherwise OpImplicit type of decoding will not work for three byte
instructions. Also should rename twobyte into opbytes and set it to 1,
2 or 3. I prefer that three byte instruction decoding will be sent as a
separate patch.
> }
> ctxt->d = opcode.flags;
>
> @@ -4376,8 +4409,10 @@ done_prefixes:
> if (ctxt->d == 0 || (ctxt->d & Undefined))
> return EMULATION_FAILED;
>
> - if (!(ctxt->d & VendorSpecific) && ctxt->only_vendor_specific_insn)
> - return EMULATION_FAILED;
> + if (!(ctxt->d & VendorSpecific) && ctxt->only_vendor_specific_insn) {
> + if (!(ctxt->d & EmulateOnUD))
> + return EMULATION_FAILED;
> + }
>
> if (mode == X86EMUL_MODE_PROT64 && (ctxt->d & Stack))
> ctxt->op_bytes = 8;
> --
> 1.8.2.135.g7b592fa
>
>
> --
> Regards/Gruss,
> Boris.
>
> Sent from a fat crate under my desk. Formatting is fine.
> --
--
Gleb.
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html