On Mon, Oct 21, 2019 at 03:52:52AM +0900, Iori YONEJI wrote:
> Hello tech@,
> 
> I have a question (or maybe a suggestion) about vmm(4).
> 
> I'm writing a small additional feature to sys/arch/amd64/amd64/vmm.c
> and found a seemingly unneeded vmclear() at the end of
> vcpu_readregs_vmx(). This function didn't seem to affect VM state at
> first glance because, obviously, its name is _read_. Against this
> intuition, this function clears VM state, which makes the vmexit
> handling stop (eg. "advance rip" failure) and the VM die if it is used
> in there. Possibly this was added as a counterpart of
> vcpu_reload_vmcs_vmx() at the very beginning of the read function, but
> I don't think it does undo the reload.
> 
> This vmclear can be removed in my understanding and also I confirmed
> that Alpine Linux and OpenBSD guest VMs can run on the vmm(4) without
> this vmclear call.
> 
> Or I am just wrong and it may be actually needed in some corner cases.
> If so, I will restore (vmptrld) vcpu->vc_control_pa every time the VM
> context continues after calling readregs function or add another flag
> to indicate whether vmclear is needed here if all of the corner cases
> are known.
> 
> Would you mind to tell me I am correct or not, or how it has to be
> dealt with?
> 
> 
> Index: sys/arch/amd64/amd64/vmm.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v
> retrieving revision 1.254
> diff -u -p -r1.254 vmm.c
> --- sys/arch/amd64/amd64/vmm.c    22 Sep 2019 08:47:54 -0000    1.254
> +++ sys/arch/amd64/amd64/vmm.c    20 Oct 2019 18:04:17 -0000
> @@ -1602,8 +1602,10 @@ vcpu_readregs_vmx(struct vcpu *vcpu, uin
>  errout:
>      ret = EINVAL;
>  out:
> +    /* XXX Why do we need vmclear here?
>      if (vmclear(&vcpu->vc_control_pa))
>          ret = EINVAL;
> +    */
>      return (ret);
>  }
> 

The assumption in vcpu_run_vmx is that we will always come in with a cleared
VMCS state, unless we are staying in the run loop (the "resume" flag in that
function tracks this). So any ioctl (including things like read/write regs)
that might possibly manipulate the VMCS, need to clear the VMCS before they
return to user space. That was the original reason for that clear at the end
of the function.

If you are adding new functionality that uses read/write regs internally, then
you're right; on return, you'll have a cleared VMCS and subsequent vmreads and
vmwrites will fail.

It looks like there are a couple of other users of readregs/writeregs
internally - the presently unused GVA translator function as well as the in/out
exit handler structure preparation code. Both of those look like they got lucky
in how they were handling things; they aren't touching the VMCS after read/write
regs or we would have seen them explode like you're saying.

It looks like the start of vcpu_run_vmx *almost* properly handles coming
in with an arbitrary  VMCS. The "almost" part is the bit under #ifdef VMM_DEBUG
that handles triple faults. It seems to assume that the proper VMCS is currently
loaded, which is certainly not true all the time. I'll fix that part.

Based on the fact that the other code paths look clean, I think we can remove
the vmclear from the end of read/write regs, as you suggest. I'll do that as
well.

Might I ask what you're working on? Just in case I know of someone else working
on the same thing? (There are about a dozen vmm side-projects going on that I'm
aware of and it would be a waste for two people to be working on the same thing
without knowing of the other). Feel free to mail me off-list if you don't want
to share in public.

-ml

Reply via email to