On Sun, Apr 14, 2013 at 11:43:03AM +0300, Gleb Natapov wrote:
> > +#define EmulateOnUD ((u64)1 << 46) /* emulate if unsupported by the host
> > */
> >
> Just rename VendorSpecific to EmulateOnUD. The meaning is the same.
done.
> > +static int em_movbe(struct x86_emulate_ctxt *ctxt)
> > +{
> > + switch (ctxt->op_bytes) {
> > + case 2:
> > + *(u16 *)ctxt->dst.valptr = swab16(*(u16 *)ctxt->src.valptr);
> What's wrong with: ctxt->dst.val = swab16((u16)ctxt->src.val)?
This doesn't work for the 16-bit swab because of the following
functionality of MOVBE:
"When the operand size is 16 bits, the upper word of the destination
register remains unchanged."
Now here's what gcc produces here:
movzwl 112(%rdi), %eax # ctxt_5(D)->src.D.27823.val, tmp83
rolw $8, %ax #, tmp83
movzwl %ax, %eax # tmp83, tmp84
movq %rax, 240(%rdi) # tmp84,
See the zero-extension happening twice (second one should not be needed,
even) and it is actually writing the whole 64-bit value in %rax back
which effectively destroys the upper word?
However, this, a bit uglier version works:
*(u16 *)&ctxt->dst.val = swab16((u16)ctxt->src.val);
movzwl 112(%rdi), %eax # ctxt_5(D)->src.D.27823.val, tmp82
rolw $8, %ax #, tmp82
movw %ax, 240(%rdi) # tmp82, MEM[(u16 *)ctxt_5(D) + 240B]
Whichever version we take, I'll slap a comment explaining why we're
doing it differently for 16-bit.
> > + case 4:
> > + *(u32 *)ctxt->dst.valptr = swab32(*(u32 *)ctxt->src.valptr);
> > +
> > + /*
> > + * clear upper dword for 32-bit operand size in 64-bit mode.
> > + */
> > + if (ctxt->mode == X86EMUL_MODE_PROT64)
> > + *((u32 *)ctxt->dst.valptr + 1) = 0x0;
> It should be clean without this this.
Right, this should probably work because for 32-bit, we don't care
about the zero-extended upper dword and in 64-bit mode we want to do it
anyway.
> > + break;
> > + case 8:
> > + *(u64 *)ctxt->dst.valptr = swab64(*(u64 *)ctxt->src.valptr);
Ditto for this one.
> > +static const struct opcode threebyte_table[] = {
> > + [0xf0] = I(DstReg | SrcMem | ModRM | Mov | EmulateOnUD, em_movbe),
> > + [0xf1] = I(DstMem | SrcReg | ModRM | Mov | EmulateOnUD, em_movbe),
> > +};
> > +
> Populate the whole table explicitly please. You can use X16(N) to make it
> short. The encoding is also wrong since 0F 38 F0/F1 instruction decoding
> depend on a prefix.
... on the absence of a prefix, actually.
> Same opcode is decoded as CRC32 with f2 prefix. We have Prefix
> mechanism for such instructions, but it currently assumes that 66 and
> f2/f3 prefixes are mutually exclusive, but in this case they are not,
> ugh.
Yeah, I'm lazy and wanted to leave the honors to the poor soul who
implements the whole 0F_38h opcode map. :-)
Ok, I probably could do something similar to pfx_vmovntpx with the GP
macro for the F0/F1 opcodes. I'll give it a try when I get a chance
soonish.
> > +
> > + if (ctxt->b == 0x38)
> > + opcode = threebyte_table[insn_fetch(u8, ctxt)];
> ctxt->b = insn_fetch(u8, ctxt);
> opcode = threebyte_table[ctxt->b];
>
> Otherwise OpImplicit type of decoding will not work for three byte
> instructions. Also should rename twobyte into opbytes and set it to 1,
> 2 or 3. I prefer that three byte instruction decoding will be sent as a
> separate patch.
Ok.
Thanks.
--
Regards/Gruss,
Boris.
Sent from a fat crate under my desk. Formatting is fine.
--
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html