Dave Voutila <[email protected]> writes:

> Josh Grosse <[email protected]> writes:
>
>> On Wed, Nov 17, 2021 at 08:36:35PM -0500, Dave Voutila wrote:
>>> My work adding an ipi for clearing VMCS didn't touch anything to do with
>>> emulating instructions that touch memory regions.
>>>
>>> Another thing that changed around this time, IIRC, was we bumped seabios
>>> in ports.
>>>
>>> Can you grab an older "vmm-firmware" from the below, unpack it, and boot
>>> the same vm but tell vmctl to use that bios image with -B?
>>>
>>> http://firmware.openbsd.org/firmware/6.9/vmm-firmware-1.11.0p3.tgz
>>>
>>> I would be shocked if my ipi work magically made vmm or vmd start
>>> showing a floppy disk device.
>>
>> There was no change in outcome with firmware 1.11 (6.9) or 1.14 (-current).
>> With the Aug 31 commit applied, I see this message in OpenBSD/amd64 -release 
>> 7.0
>> guests:
>>
>>     fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
>>
>> and the subsequent logging of:
>>
>>     fdcresult: overrun
>>
>> However, with kernels built preceding the Aug 31 commit, these don't occur.
>
> Interesting. Thanks for testing my suggestion. I'm in the middle of
> squashing some other issue in vmm but will take a look at this next. We
> do very little emulation in vmm(4) and send most of it to vmd(8), so
> this is a bit surprising, but clearly something is funky.
>

Ok, I have a hypothesis and a diff based on some detective work.

The commit found through bisection changed how we handle hogging the
physical cpu. Previously, we would volunteer to yield(), but since that
diff was focused on addressing VMCS state corruption when we move
between cpus, I removed the yield() and instead broke out of the run
loop and returned to userland. That's not an uncommon design and is
done by other hypervisors I've looked at.

The problem is we may have just emulated an io instruction. For the io
port range used by fdc(4) we correctly populate parts of EAX with 0xff,
and that is enough to convince fdc(4) there's nothing to be found during
probe. HOWEVER, since we break out of the vcpu run loop, we end up
round-tripping to userland. When we come back, we clobber part or all of
RAX, so when we re-enter the guest and the driver inspects the result of
the operation, it can see something other than 0xff.

The below diff changes the behavior back to yield()ing and staying in
the kernel, but incorporates the proper VMCS dance to make sure we don't
corrupt the VMCS if we resume on another cpu or if another guest vm has
loaded its VMCS on the same cpu in the meantime.

Can you please give this a try and see if you can still reproduce the
fdc(4) issue? I tested it myself by adding printf's to sys/dev/isa/fdc.c
to check the result of the dma reads and confirm they're 0xff, so I'm
reasonably confident this should fix it.

If it resolves the issue I'll validate it on AMD as well once I dust off
that machine, wherever I put it.

-dv


Index: sys/arch/amd64/amd64/vmm.c
===================================================================
RCS file: /opt/cvs/src/sys/arch/amd64/amd64/vmm.c,v
retrieving revision 1.294
diff -u -p -r1.294 vmm.c
--- sys/arch/amd64/amd64/vmm.c  26 Oct 2021 16:29:49 -0000      1.294
+++ sys/arch/amd64/amd64/vmm.c  18 Nov 2021 21:26:18 -0000
@@ -4891,8 +4891,14 @@ vcpu_run_vmx(struct vcpu *vcpu, struct v

                        /* Check if we should yield - don't hog the {p,v}pu */
                        spc = &ci->ci_schedstate;
-                       if (spc->spc_schedflags & SPCF_SHOULDYIELD)
-                               break;
+                       if (spc->spc_schedflags & SPCF_SHOULDYIELD) {
+                               vcpu->vc_last_pcpu = curcpu();
+                               yield();
+                               if (vcpu_reload_vmcs_vmx(vcpu)) {
+                                       ret = EINVAL;
+                                       break;
+                               }
+                       }

                } else {
                        /*
