On Thu, Dec 12, 2019 at 07:40:45PM -0800, Mike Larkin wrote: > On Tue, Oct 22, 2019 at 06:57:32PM +0900, Iori YONEJI wrote: > > On Tue, Oct 22, 2019 at 11:17 AM Mike Larkin <[email protected]> wrote: > > > > > > On Mon, Oct 21, 2019 at 03:52:52AM +0900, Iori YONEJI wrote: > > > > Hello tech@, > > > > > > > > I have a question (or maybe a suggestion) about vmm(4). > > > > > > > > I'm writing a small additional feature to sys/arch/amd64/amd64/vmm.c > > > > and found a seemingly unneeded vmclear() at the end of > > > > vcpu_readregs_vmx(). This function didn't seem to affect VM state at > > > > first glance because, obviously, its name is _read_. Against this > > > > intuition, this function clears VM state, which makes the vmexit > > > > handling stop (eg. "advance rip" failure) and the VM die if it is used > > > > in there. Possibly this was added as a counterpart of > > > > vcpu_reload_vmcs_vmx() at the very beginning of the read function, but > > > > I don't think it does undo the reload. > > > > > > > > This vmclear can be removed in my understanding and also I confirmed > > > > that Alpine Linux and OpenBSD guest VMs can run on the vmm(4) without > > > > this vmclear call. > > > > > > > > Or I am just wrong and it may be actually needed in some corner cases. > > > > If so, I will restore (vmptrld) vcpu->vc_control_pa every time the VM > > > > context continues after calling readregs function or add another flag > > > > to indicate whether vmclear is needed here if all of the corner cases > > > > are known. > > > > > > > > Would you mind to tell me I am correct or not, or how it has to be > > > > dealt with? > > > > > > > > > > > > Index: sys/arch/amd64/amd64/vmm.c > > > > =================================================================== > > > > RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v > > > > retrieving revision 1.254 > > > > diff -u -p -r1.254 vmm.c > > > > --- sys/arch/amd64/amd64/vmm.c 22 Sep 2019 08:47:54 -0000 1.254 > > > > +++ sys/arch/amd64/amd64/vmm.c 20 Oct 2019 18:04:17 -0000 > > > > @@ -1602,8 +1602,10 @@ vcpu_readregs_vmx(struct vcpu *vcpu, uin > > > > errout: > > > > ret = EINVAL; > > > > out: > > > > + /* XXX Why do we need vmclear here? > > > > if (vmclear(&vcpu->vc_control_pa)) > > > > ret = EINVAL; > > > > + */ > > > > return (ret); > > > > } > > > > > > > > > > The assumption in vcpu_run_vmx is that we will always come in with a > > > cleared > > > VMCS state, unless we are staying in the run loop (the "resume" flag in > > > that > > > function tracks this). So any ioctl (including things like read/write > > > regs) > > > that might possibly manipulate the VMCS, need to clear the VMCS before > > > they > > > return to user space. That was the original reason for that clear at the > > > end > > > of the function. > > > > > > If you are adding new functionality that uses read/write regs internally, > > > then > > > you're right; on return, you'll have a cleared VMCS and subsequent > > > vmreads and > > > vmwrites will fail. > > > > > > It looks like there are a couple of other users of readregs/writeregs > > > internally - the presently unused GVA translator function as well as the > > > in/out > > > exit handler structure preparation code. Both of those look like they got > > > lucky > > > in how they were handling things; they aren't touching the VMCS after > > > read/write > > > regs or we would have seen them explode like you're saying. > > > > > > It looks like the start of vcpu_run_vmx *almost* properly handles coming > > > in with an arbitrary VMCS. The "almost" part is the bit under #ifdef > > > VMM_DEBUG > > > that handles triple faults. It seems to assume that the proper VMCS is > > > currently > > > loaded, which is certainly not true all the time. I'll fix that part. > > > > > > Based on the fact that the other code paths look clean, I think we can > > > remove > > > the vmclear from the end of read/write regs, as you suggest. I'll do that > > > as > > > well. > > > > > > Might I ask what you're working on? Just in case I know of someone else > > > working > > > on the same thing? (There are about a dozen vmm side-projects going on > > > that I'm > > > aware of and it would be a waste for two people to be working on the same > > > thing > > > without knowing of the other). Feel free to mail me off-list if you don't > > > want > > > to share in public. > > > > > > -ml > > > > > Thank you. I didn't notice that the VMCS must be cleared before > > returning to user space. > > > > Talking about the use of this function from outside of the run loop, I > > worry about that it would overwrite VMCS memory when another physical > > CPU that is running the vcpu's run loop has exited and written back its > > new state just before read regs clears its VMCS. > > If those control_pa(s) always point different memory addresses, this > > will never happen. Otherwise, I think it would (very unlikely) happen. > > > > And thank you to remove the vmclear. > > > > > Might I ask what you're working on? > > I am working on adding MMIO handling. > > > > Originally, I encountered very same situation as this report: > > Booting NetBSD on vmm fails with uvm_fault > > https://marc.info/?l=openbsd-bugs&m=153761926725642&w=2 > > > > The address 0xfed40014 is the TPM capability register as defined in TIS > > specification, which is not needed for the guest VM for now. I can > > manage it by handling this specific issue by just ignoring EPT page fault > > at the address (luckily, the destination register seems to be always > > cleared before read operation in this case and zero means > > "TPM not available"). > > To advance from this hacky workaround and to make the handling more > > general, I am going to parse this instruction (it's at GVA, so I need > > to use vmm_translate_gva so as read regs) and store proper data at > > specified register. > > > > Currently, only mov instruction is valid and only VMX is supported, > > but I will hopefully post the patch soon as "the scaffold" after make > > it work on AMD machine. > > I fixed the first part described above, but I'm going to leave the clear > in; this ensures we are returning to userspace without leaving the VMCS > loaded, which matches the approach used in the other ioctl functions. > > -ml
OK, I understood that it should return to userspace with vmclear. I've been away from the modification I mentioned above for a while, but I'll aware about it after (hopefully) back to it. I looked into my old diffs and restored this vmclear once I deleted, and changed to making sure VMCS is loaded before using it again after reading the registers for the emulation. This would make much sense if the emulator may be eventually used for other things which might enter from other paths.
