On Tue, Oct 22, 2019 at 06:57:32PM +0900, Iori YONEJI wrote:
> On Tue, Oct 22, 2019 at 11:17 AM Mike Larkin <mlar...@nested.page> wrote:
> >
> > On Mon, Oct 21, 2019 at 03:52:52AM +0900, Iori YONEJI wrote:
> > > Hello tech@,
> > >
> > > I have a question (or maybe a suggestion) about vmm(4).
> > >
> > > I'm writing a small additional feature to sys/arch/amd64/amd64/vmm.c
> > > and found a seemingly unneeded vmclear() at the end of
> > > vcpu_readregs_vmx(). This function didn't seem to affect VM state at
> > > first glance because, obviously, its name is _read_. Against this
> > > intuition, this function clears VM state, which makes the vmexit
> > > handling stop (eg. "advance rip" failure) and the VM die if it is used
> > > in there. Possibly this was added as a counterpart of
> > > vcpu_reload_vmcs_vmx() at the very beginning of the read function, but
> > > I don't think it does undo the reload.
> > >
> > > This vmclear can be removed in my understanding and also I confirmed
> > > that Alpine Linux and OpenBSD guest VMs can run on the vmm(4) without
> > > this vmclear call.
> > >
> > > Or I am just wrong and it may be actually needed in some corner cases.
> > > If so, I will restore (vmptrld) vcpu->vc_control_pa every time the VM
> > > context continues after calling readregs function or add another flag
> > > to indicate whether vmclear is needed here if all of the corner cases
> > > are known.
> > >
> > > Would you mind to tell me I am correct or not, or how it has to be
> > > dealt with?
> > >
> > >
> > > Index: sys/arch/amd64/amd64/vmm.c
> > > ===================================================================
> > > RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v
> > > retrieving revision 1.254
> > > diff -u -p -r1.254 vmm.c
> > > --- sys/arch/amd64/amd64/vmm.c    22 Sep 2019 08:47:54 -0000    1.254
> > > +++ sys/arch/amd64/amd64/vmm.c    20 Oct 2019 18:04:17 -0000
> > > @@ -1602,8 +1602,10 @@ vcpu_readregs_vmx(struct vcpu *vcpu, uin
> > >  errout:
> > >      ret = EINVAL;
> > >  out:
> > > +    /* XXX Why do we need vmclear here?
> > >      if (vmclear(&vcpu->vc_control_pa))
> > >          ret = EINVAL;
> > > +    */
> > >      return (ret);
> > >  }
> > >
> >
> > The assumption in vcpu_run_vmx is that we will always come in with a cleared
> > VMCS state, unless we are staying in the run loop (the "resume" flag in that
> > function tracks this). So any ioctl (including things like read/write regs)
> > that might possibly manipulate the VMCS, need to clear the VMCS before they
> > return to user space. That was the original reason for that clear at the end
> > of the function.
> >
> > If you are adding new functionality that uses read/write regs internally, 
> > then
> > you're right; on return, you'll have a cleared VMCS and subsequent vmreads 
> > and
> > vmwrites will fail.
> >
> > It looks like there are a couple of other users of readregs/writeregs
> > internally - the presently unused GVA translator function as well as the 
> > in/out
> > exit handler structure preparation code. Both of those look like they got 
> > lucky
> > in how they were handling things; they aren't touching the VMCS after 
> > read/write
> > regs or we would have seen them explode like you're saying.
> >
> > It looks like the start of vcpu_run_vmx *almost* properly handles coming
> > in with an arbitrary  VMCS. The "almost" part is the bit under #ifdef 
> > VMM_DEBUG
> > that handles triple faults. It seems to assume that the proper VMCS is 
> > currently
> > loaded, which is certainly not true all the time. I'll fix that part.
> >
> > Based on the fact that the other code paths look clean, I think we can 
> > remove
> > the vmclear from the end of read/write regs, as you suggest. I'll do that as
> > well.
> >
> > Might I ask what you're working on? Just in case I know of someone else 
> > working
> > on the same thing? (There are about a dozen vmm side-projects going on that 
> > I'm
> > aware of and it would be a waste for two people to be working on the same 
> > thing
> > without knowing of the other). Feel free to mail me off-list if you don't 
> > want
> > to share in public.
> >
> > -ml
> >
> Thank you. I didn't notice that the VMCS must be cleared before
> returning to user space.
> 
> Talking about the use of this function from outside of the run loop, I
> worry about that it would overwrite VMCS memory when another physical
> CPU that is running the vcpu's run loop has exited and written back its
> new state just before read regs clears its VMCS.
> If those control_pa(s) always point different memory addresses, this
> will never happen. Otherwise, I think it would (very unlikely) happen.
> 
> And thank you to remove the vmclear.
> 
> > Might I ask what you're working on?
> I am working on adding MMIO handling.
> 
> Originally, I encountered very same situation as this report:
> Booting NetBSD on vmm fails with uvm_fault
> https://marc.info/?l=openbsd-bugs&m=153761926725642&w=2
> 
> The address 0xfed40014 is the TPM capability register as defined in TIS
> specification, which is not needed for the guest VM for now. I can
> manage it by handling this specific issue by just ignoring EPT page fault
> at the address (luckily, the destination register seems to be always
> cleared before read operation in this case and zero means
> "TPM not available").
> To advance from this hacky workaround and to make the handling more
> general, I am going to parse this instruction (it's at GVA, so I need
> to use vmm_translate_gva so as read regs) and store proper data at
> specified register.
> 
> Currently, only mov instruction is valid and only VMX is supported,
> but I will hopefully post the patch soon as "the scaffold" after make
> it work on AMD machine.

I fixed the first part described above, but I'm going to leave the clear
in; this ensures we are returning to userspace without leaving the VMCS
loaded, which matches the approach used in the other ioctl functions.

-ml

Reply via email to