On 6/5/19 12:06 AM, Jan Kiszka wrote:
> On 04.06.19 23:02, Ralf Ramsauer wrote:
>> mov (%rax), %ax is a 16-bit data MOV_FROM_MEM that will emit
>> 0x66 0x8b 0x00.
>>
>> 0x66 is the operand-size override prefix which we currently do not
>> support.
>>
>> We should support it, as this opcode shows up, for example, in
>> mmconfig space accesses from Linux (e.g., pci_generic_config_read).
>>
>> This also adds appropriate mmio-access tests.
>>
>> Tested in QEMU virtual target.
>>
>> Signed-off-by: Ralf Ramsauer <[email protected]>
>> ---
>> hypervisor/arch/x86/include/asm/processor.h | 1 +
>> hypervisor/arch/x86/mmio.c | 37 +++++++++++++++++----
>> inmates/tests/x86/mmio-access-32.c | 4 +++
>> inmates/tests/x86/mmio-access.c | 4 +++
>> 4 files changed, 40 insertions(+), 6 deletions(-)
>>
>> diff --git a/hypervisor/arch/x86/include/asm/processor.h b/hypervisor/arch/x86/include/asm/processor.h
>> index 70a6c3ff..d8111690 100644
>> --- a/hypervisor/arch/x86/include/asm/processor.h
>> +++ b/hypervisor/arch/x86/include/asm/processor.h
>> @@ -145,6 +145,7 @@
>> #define X86_REX_CODE 4
>> +#define X86_PREFIX_OP_SZ 0x66
>> #define X86_PREFIX_ADDR_SZ 0x67
>> #define X86_OP_MOVZX_OPC1 0x0f
>> diff --git a/hypervisor/arch/x86/mmio.c b/hypervisor/arch/x86/mmio.c
>> index b234bd79..6d9ad1c5 100644
>> --- a/hypervisor/arch/x86/mmio.c
>> +++ b/hypervisor/arch/x86/mmio.c
>> @@ -79,6 +79,26 @@ static unsigned int get_address_width(bool has_addrsz_prefix)
>> (!!(cs_attr & VCPU_CS_DB) ^ has_addrsz_prefix) ? 4 : 2;
>> }
>> +static unsigned int get_op_width(bool has_rex_w, bool has_opsz_prefix)
>
> We should move all the flags into parse_context so that we can pass them
> around more easily.
Good point.
>
>> +{
>> + u16 cs_attr;
>> + bool long_mode;
>> +
>> + /* Op size prefix is ignored if rex.w = 1 */
>> + if (has_rex_w)
>> + return 8;
>> +
>> + cs_attr = vcpu_vendor_get_cs_attr();
>> + long_mode = (vcpu_vendor_get_efer() & EFER_LMA) &&
>> + (cs_attr & VCPU_CS_L);
>
> This may mean accessing the same VMCS regs multiple times. I vaguely
> recall from KVM that it pays off to avoid that and keep the results
> cached (per vmexit).
See comment below…
(BTW: not that it would solve this issue, but we could also consider
inlining the vcpu_vendor_get_* accessors.)
>
>> +
>> + if (long_mode)
>> + /* CS.d is ignored in long mode */
>> + return has_opsz_prefix ? 2 : 4;
>> +
>> + return (!!(cs_attr & VCPU_CS_DB) ^ has_opsz_prefix) ? 4 : 2;
>
> This does the same as get_address_width (minus different output values),
> but its code format is different. Should be aligned.
Yeah. I chose this style because it took me a while to understand what
long_mode ? (has_addrsz_prefix ? 4 : 8) : (!!(cs_attr & VCPU_CS_DB) ^
has_addrsz_prefix) ? 4 : 2;
actually means, given ?:'s operator precedence (which is well-defined
here, but I still got confused). May I propose aligning the line above
instead? (But let's see -- maybe we can consolidate it in any case.)
>
> In fact, I could imagine a combined helper:
>
> void parse_widths(struct parse_context *ctx,
> struct mmio_instruction *inst,
> bool parse_addr_width)
>
> That one would obtain cs_attr and long_mode only once, would do the
> address width thing only if parse_addr_width is true, and would push its
> results directly into *inst.
... as an inlined single-user function, right?
Good idea, that would also save the potential double EFER/cs_attr
access. I'm just curious: is it Intel's vmread that you're worried
about? At least on AMD, the guest's EFER is read directly from the VMCB.
>
>> +}
>> +
>> struct mmio_instruction
>> x86_mmio_parse(const struct guest_paging_structures *pg_structs,
>> bool is_write)
>> {
>> @@ -94,6 +114,7 @@ x86_mmio_parse(const struct guest_paging_structures *pg_structs, bool is_write)
>> bool has_rex_w = false;
>> bool has_rex_r = false;
>> bool has_addrsz_prefix = false;
>> + bool has_opsz_prefix = false;
>> if (!ctx_update(&ctx, &pc, 0, pg_structs))
>> goto error_noinst;
>> @@ -114,9 +135,13 @@ restart:
>> }
>> switch (op[0].raw) {
>> case X86_PREFIX_ADDR_SZ:
>> + case X86_PREFIX_OP_SZ:
>> if (!ctx_update(&ctx, &pc, 1, pg_structs))
>> goto error_noinst;
>> - has_addrsz_prefix = true;
>> + if (op[0].raw == X86_PREFIX_ADDR_SZ)
>> + has_addrsz_prefix = true;
>> + else
>> + has_opsz_prefix = true;
>
> I would avoid the double-dispatching just to save one ctx_update.
Ok.
>
>> goto restart;
>> case X86_OP_MOVZX_OPC1:
>> if (!ctx_update(&ctx, &pc, 1, pg_structs))
>> @@ -134,28 +159,28 @@ restart:
>> does_write = true;
>> break;
>> case X86_OP_MOV_TO_MEM:
>> - inst.access_size = has_rex_w ? 8 : 4;
>> + inst.access_size = get_op_width(has_rex_w, has_opsz_prefix);
>> does_write = true;
>> break;
>> case X86_OP_MOVB_FROM_MEM:
>> inst.access_size = 1;
>> break;
>> case X86_OP_MOV_FROM_MEM:
>> - inst.access_size = has_rex_w ? 8 : 4;
>> + inst.access_size = get_op_width(has_rex_w, has_opsz_prefix);
>> break;
>> case X86_OP_MOV_IMMEDIATE_TO_MEM:
>> - inst.access_size = has_rex_w ? 8 : 4;
>> + inst.access_size = get_op_width(has_rex_w, has_opsz_prefix);
>> has_immediate = true;
>> does_write = true;
>> break;
>> case X86_OP_MOV_MEM_TO_AX:
>> inst.inst_len += get_address_width(has_addrsz_prefix);
>> - inst.access_size = has_rex_w ? 8 : 4;
>> + inst.access_size = get_op_width(has_rex_w, has_opsz_prefix);
>> inst.in_reg_num = 15;
>> goto final;
>> case X86_OP_MOV_AX_TO_MEM:
>> inst.inst_len += get_address_width(has_addrsz_prefix);
>> - inst.access_size = has_rex_w ? 8 : 4;
>> + inst.access_size = get_op_width(has_rex_w, has_opsz_prefix);
>> inst.out_val = guest_regs->by_index[15];
>> does_write = true;
>> goto final;
>> diff --git a/inmates/tests/x86/mmio-access-32.c b/inmates/tests/x86/mmio-access-32.c
>> index 2f47f211..b4f83850 100644
>> --- a/inmates/tests/x86/mmio-access-32.c
>> +++ b/inmates/tests/x86/mmio-access-32.c
>> @@ -41,6 +41,10 @@ void inmate_main(void)
>> mmio_write32(mmio_reg, pattern);
>> EXPECT_EQUAL(*comm_page_reg, pattern);
>> + /* MOV_FROM_MEM (8b), 16-bit data, 32-bit address, OP size prefix */
>> + asm volatile("mov (%%eax), %%ax" : "=a" (reg32) : "a" (mmio_reg));
>> + EXPECT_EQUAL(reg32, (u16)pattern);
>
> We should try to cover all cases of current get_op_width.
Ok.
>
>> +
>> /* MOV_FROM_MEM (8b), 32-bit data, 32-bit address */
>> asm volatile("movl (%%ebx), %%eax"
>> : "=a" (reg32) : "a" (0), "b" (mmio_reg));
>> diff --git a/inmates/tests/x86/mmio-access.c b/inmates/tests/x86/mmio-access.c
>> index 0e6a56b1..86694353 100644
>> --- a/inmates/tests/x86/mmio-access.c
>> +++ b/inmates/tests/x86/mmio-access.c
>> @@ -51,6 +51,10 @@ void inmate_main(void)
>> mmio_write64(mmio_reg, pattern);
>> EXPECT_EQUAL(*comm_page_reg, pattern);
>> + /* MOV_FROM_MEM (8b), 16-bit data, 0x66 OP size prefix */
>> + asm volatile("mov (%%rax), %%ax" : "=a" (reg64) : "a" (mmio_reg));
>> + EXPECT_EQUAL(reg64, (u16)pattern);
>> +
>> /* MOV_FROM_MEM (8b), 64-bit data, mod=0, reg=0, rm=3 */
>> asm volatile("movq (%%rbx), %%rax"
>> : "=a" (reg64) : "a" (0), "b" (mmio_reg));
>>
>
> Thanks for picking this up. I guess we need to complete that aspect of
> the instruction parsing. Eventually, it will be simpler to argue about
> being complete, rather than why not being so.
Yeah, I somehow feel sorry that we need this.
Just had a look at KVM's emulate.c. Oh my dear.
Ralf
>
> Jan
>