On Tue, Oct 22, 2019 at 11:17 AM Mike Larkin <[email protected]> wrote:
>
> On Mon, Oct 21, 2019 at 03:52:52AM +0900, Iori YONEJI wrote:
> > Hello tech@,
> >
> > I have a question (or maybe a suggestion) about vmm(4).
> >
> > I'm writing a small additional feature to sys/arch/amd64/amd64/vmm.c
> > and found a seemingly unneeded vmclear() at the end of
> > vcpu_readregs_vmx(). This function didn't seem to affect VM state at
> > first glance because, obviously, its name is _read_. Against this
> > intuition, this function clears VM state, which makes the vmexit
> > handling stop (eg. "advance rip" failure) and the VM die if it is used
> > in there. Possibly this was added as a counterpart of
> > vcpu_reload_vmcs_vmx() at the very beginning of the read function, but
> > I don't think it does undo the reload.
> >
> > This vmclear can be removed in my understanding and also I confirmed
> > that Alpine Linux and OpenBSD guest VMs can run on the vmm(4) without
> > this vmclear call.
> >
> > Or I am just wrong and it may be actually needed in some corner cases.
> > If so, I will restore (vmptrld) vcpu->vc_control_pa every time the VM
> > context continues after calling readregs function or add another flag
> > to indicate whether vmclear is needed here if all of the corner cases
> > are known.
> >
> > Would you mind to tell me I am correct or not, or how it has to be
> > dealt with?
> >
> >
> > Index: sys/arch/amd64/amd64/vmm.c
> > ===================================================================
> > RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v
> > retrieving revision 1.254
> > diff -u -p -r1.254 vmm.c
> > --- sys/arch/amd64/amd64/vmm.c    22 Sep 2019 08:47:54 -0000    1.254
> > +++ sys/arch/amd64/amd64/vmm.c    20 Oct 2019 18:04:17 -0000
> > @@ -1602,8 +1602,10 @@ vcpu_readregs_vmx(struct vcpu *vcpu, uin
> >  errout:
> >      ret = EINVAL;
> >  out:
> > +    /* XXX Why do we need vmclear here?
> >      if (vmclear(&vcpu->vc_control_pa))
> >          ret = EINVAL;
> > +    */
> >      return (ret);
> >  }
> >
>
> The assumption in vcpu_run_vmx is that we will always come in with a cleared
> VMCS state, unless we are staying in the run loop (the "resume" flag in that
> function tracks this). So any ioctl (including things like read/write regs)
> that might possibly manipulate the VMCS, need to clear the VMCS before they
> return to user space. That was the original reason for that clear at the end
> of the function.
>
> If you are adding new functionality that uses read/write regs internally, then
> you're right; on return, you'll have a cleared VMCS and subsequent vmreads and
> vmwrites will fail.
>
> It looks like there are a couple of other users of readregs/writeregs
> internally - the presently unused GVA translator function as well as the 
> in/out
> exit handler structure preparation code. Both of those look like they got 
> lucky
> in how they were handling things; they aren't touching the VMCS after 
> read/write
> regs or we would have seen them explode like you're saying.
>
> It looks like the start of vcpu_run_vmx *almost* properly handles coming
> in with an arbitrary  VMCS. The "almost" part is the bit under #ifdef 
> VMM_DEBUG
> that handles triple faults. It seems to assume that the proper VMCS is 
> currently
> loaded, which is certainly not true all the time. I'll fix that part.
>
> Based on the fact that the other code paths look clean, I think we can remove
> the vmclear from the end of read/write regs, as you suggest. I'll do that as
> well.
>
> Might I ask what you're working on? Just in case I know of someone else 
> working
> on the same thing? (There are about a dozen vmm side-projects going on that 
> I'm
> aware of and it would be a waste for two people to be working on the same 
> thing
> without knowing of the other). Feel free to mail me off-list if you don't want
> to share in public.
>
> -ml
>
Thank you. I didn't notice that the VMCS must be cleared before
returning to user space.

Talking about the use of this function from outside of the run loop, I
worry about that it would overwrite VMCS memory when another physical
CPU that is running the vcpu's run loop has exited and written back its
new state just before read regs clears its VMCS.
If those control_pa(s) always point different memory addresses, this
will never happen. Otherwise, I think it would (very unlikely) happen.

And thank you to remove the vmclear.

> Might I ask what you're working on?
I am working on adding MMIO handling.

Originally, I encountered very same situation as this report:
Booting NetBSD on vmm fails with uvm_fault
https://marc.info/?l=openbsd-bugs&m=153761926725642&w=2

The address 0xfed40014 is the TPM capability register as defined in TIS
specification, which is not needed for the guest VM for now. I can
manage it by handling this specific issue by just ignoring EPT page fault
at the address (luckily, the destination register seems to be always
cleared before read operation in this case and zero means
"TPM not available").
To advance from this hacky workaround and to make the handling more
general, I am going to parse this instruction (it's at GVA, so I need
to use vmm_translate_gva so as read regs) and store proper data at
specified register.

Currently, only mov instruction is valid and only VMX is supported,
but I will hopefully post the patch soon as "the scaffold" after make
it work on AMD machine.

Reply via email to