On Wed, Jan 17, 2018 at 09:26:58AM -0800, Mike Larkin wrote:
> On Wed, Jan 17, 2018 at 01:28:51PM +0100, Martin Pieuchot wrote:
> > I'm running an OpenBSD amd64 guest on an amd64 host. Dmesgs
> > attached. The VM config is as follows:
> >
> > vm "qemu" {
> > memory 128M
> > disk "/usr/hack/mpi/qemu/amd64/disk.img"
> > local interface tap0 lladdr 70:5f:ca:21:8d:70
> > disable
> > owner mpi
> > }
> >
> > I run dhclient(8) and NAT the traffic using IP forwarding on my host.
> >
> > vmd(8) exited multiple times with the following error:
> > vmd[98149]: vcpu_run_loop: vm 2 / vcpu 0 run ioctl failed: Invalid argument
> >
> > The last two times it happened at reboot time: once after upgrading the
> > machine via bsd.rd, when vmd(8) crashed after I pressed Enter at the
> > reboot prompt, and once just after typing 'reboot' in an ssh session.
>
> 3rd report of this in the last few weeks. Now it's happening here to me also.
> I'll see if I can repro it.
>
> -ml

One of my VMs just crashed with this:

  vmx_handle_exit: unhandled exit 0x31 (EPT misconfiguration)

(If you could enable VMM_DEBUG in vmm.c, it would be helpful to see if
this is the same problem you are seeing - fair warning, VMM_DEBUG is
fairly chatty).

Misconfigurations are a side effect of how we fault pages into the EPT: we
use a regular pmap and let uvm_fault fault pages into the "guest physical"
area, which in the EPT corresponds to addresses < 512GB. As luck would have
it, that is the same region as userspace in a regular process, so we
essentially trick uvm into doing most of the work for us.
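
To make the 512GB figure concrete: 512GB is 2^39 bytes, which is the span
covered by a single top-level (PML4) entry with 4-level paging. The snippet
below is only an arithmetic sketch; the macro and function names are mine,
not anything taken from vmm.c or pmap.c.

```c
#include <stdint.h>

/* Illustrative names only; these are not the kernel's macros. */
#define SKETCH_PML4_SHIFT   39                  /* each PML4 entry spans 2^39 bytes */
#define SKETCH_PML4_MASK    0x1ffULL            /* 512 entries per table */
#define SKETCH_GUEST_LIMIT  (512ULL << 30)      /* 512GB */

/* Index of the top-level entry that translates addr. */
static unsigned
sketch_pml4_index(uint64_t addr)
{
	return (unsigned)((addr >> SKETCH_PML4_SHIFT) & SKETCH_PML4_MASK);
}
```

Every address below 512GB yields index 0, so the whole "guest physical"
area hangs off one top-level entry; the first address at 512GB moves to
index 1.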

For the most part, it works fine. However, in order to enter a PTE into any
pmap, we load that pmap on curcpu and then make the change, so we have to be
very careful not to touch any bits that are reserved in EPT PTEs while
doing that. Unfortunately, the A/D bits of a regular PTE conflict with
certain bits that must be 0 in an EPT entry. The CPU sets the A/D bits
automatically during normal operation, so while the EPT pmap is loaded it
can set them behind our back; when we then re-enter the guest, the entry is
misconfigured and we get the abort shown above.
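
A rough illustration of the bit conflict, as a userland sketch. The
accessed/dirty positions (bits 5 and 6) are the regular x86-64 PTE layout.
The must-be-zero mask (bits 7:3 of a non-leaf EPT entry) is my reading of
the Intel SDM and should be treated as an assumption, as should every name
here; none of this is the actual vmm/pmap code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Regular x86-64 PTE flags that the CPU sets by itself during a page walk. */
#define SKETCH_PTE_ACCESSED	(1ULL << 5)
#define SKETCH_PTE_DIRTY	(1ULL << 6)

/*
 * Assumption: bits 7:3 of a non-leaf EPT entry (one that points at the
 * next-level table) must be zero.
 */
#define SKETCH_EPT_NONLEAF_MBZ	0xf8ULL

/* True if the entry would be rejected as an EPT misconfiguration. */
static bool
sketch_ept_nonleaf_misconfigured(uint64_t entry)
{
	return (entry & SKETCH_EPT_NONLEAF_MBZ) != 0;
}

/* What a walk through the entry does when it is loaded as a regular pmap. */
static uint64_t
sketch_regular_walk(uint64_t entry)
{
	return entry | SKETCH_PTE_ACCESSED;
}
```

An entry such as 0x1007 (table at 0x1000, low three permission bits set)
passes the check, but the same entry after sketch_regular_walk() has bit 5
set and fails it.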

I'm not sure why this suddenly started happening more often. We used to see
it when the machine went into swap, but this machine certainly didn't do
that. Perhaps this newer CPU I'm using is pickier about some of those bits
and my x230 with its older CPU was more forgiving.

The more correct approach is to detect if we are trying to enter a PTE into
an EPT pmap, and if so, use the direct map to reach into it and set the PTEs
directly (akin to the old alternate PDE space we used a few years back). We
should also KASSERT if we ever try to activate an EPT pmap on a cpu.
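
The shape of that approach, as a self-contained userland model rather than
a diff. Everything here is hypothetical: the types, the sketch_* names, the
fake "physical memory" array standing in for the direct map, and assert()
standing in for KASSERT. A real version would live in pmap.c and look
different.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096ULL
#define SKETCH_NPTE		512
#define SKETCH_NPAGES		4

enum sketch_pmap_type { SKETCH_PMAP_NORMAL, SKETCH_PMAP_EPT };

struct sketch_pmap {
	enum sketch_pmap_type type;
	uint64_t pt_pa;		/* "physical address" of its single page table */
};

/* Fake physical memory; indexing it by pa models the kernel's direct map. */
static uint64_t sketch_physmem[SKETCH_NPAGES][SKETCH_NPTE];

static uint64_t *
sketch_direct_map(uint64_t pa)
{
	assert(pa / SKETCH_PAGE_SIZE < SKETCH_NPAGES);
	return sketch_physmem[pa / SKETCH_PAGE_SIZE];
}

/* The pmap currently "loaded" on the cpu, if any. */
static const struct sketch_pmap *sketch_curpmap;

static void
sketch_pmap_activate(const struct sketch_pmap *pm)
{
	/* Stand-in for the KASSERT: an EPT pmap must never be loaded. */
	assert(pm->type != SKETCH_PMAP_EPT);
	sketch_curpmap = pm;
}

static void
sketch_pmap_enter(const struct sketch_pmap *pm, unsigned idx, uint64_t pte)
{
	assert(idx < SKETCH_NPTE);
	if (pm->type == SKETCH_PMAP_EPT) {
		/* EPT: write through the direct map, never load the pmap. */
		sketch_direct_map(pm->pt_pa)[idx] = pte;
		return;
	}
	/* Regular pmap: load it first, as described above, then write. */
	sketch_pmap_activate(pm);
	sketch_direct_map(pm->pt_pa)[idx] = pte;
}
```

Entering a PTE into an EPT-type sketch_pmap updates the table while
sketch_curpmap stays untouched, so the CPU never gets a chance to walk it
as a regular page table.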

I had always planned to fix this, but never got to it. As luck would have it,
I learned some pmap tricks during the Meltdown effort that may make it easier
to fix now. I'll work under the assumption that the issue you saw is the same
and let you know when I have a diff to look at.

-ml