Mike Larkin writes:

> On Sat, May 08, 2021 at 08:14:35AM -0400, Dave Voutila wrote:
>>
>> Josh Rickmar writes:
>>
>> > On Fri, May 07, 2021 at 04:19:18PM -0400, Dave Voutila wrote:
>> >>
>> >> Josh Rickmar writes:
>> >>
>> >> >>Synopsis:       vmm protection fault trap
>> >> >>Category:       vmm
>> >> >>Environment:
>> >> >         System      : OpenBSD 6.9
>> >> >         Details     : OpenBSD 6.9-current (GENERIC.MP) #6: Thu May  6 10:16:53 MDT 2021
>> >> >                          [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>> >> >
>> >> >         Architecture: OpenBSD.amd64
>> >> >         Machine     : amd64
>> >> >>Description:
>> >> >
>> >> > My nixos vm is causing the host kernel to crash (after cold boot) with
>> >> > 'protection fault trap, code=0'.  The guest is running Linux 5.11.14
>> >> > (guest dmesg included after the host dmesg below).  I've also attached
>> >> > a screenshot of ddb showing the backtrace and registers.
>> >> >
>> >> >>How-To-Repeat:
>> >> >
>> >> > The crash can be reliably triggered by doing heavy disk IO on the vm.
>> >> > Upgrading the VM actually got the nixos install wedged during an
>> >> > initial crash, and attempting to repair it with "nix-build -A system
>> >> > '<nixpkgs/nixos>' --repair" is reliably repeating the crash.
>> >>
>> >> Any chance you've experienced this with a non-NixOS guest? I can't
>> >> reproduce this error on my Ryzen5 Pro host.
>> >>
>>
>> I've reproduced this locally with the help of abieber@. Seems I just
>> need to boot a nixos iso (nixos-21.05pre287333.63586475587-x86_64) and
>> try installing a package like git into the ramdisk:
>>
>>   # nix-env -f '<nixpkgs>' -iA git
>>
>> I still haven't triggered this without nixos, but at least I can
>> reproduce it locally now. :-)
>>
>> -dv
>>
>
> robert@ reported this same bug a long time ago and I could never reproduce it.
>
> I'll see if it repros against my R415 using these instructions.
>
> -ml

So far I haven't managed to trigger the crash with the diff below
applied. I don't know why it helps, but maybe the guest is mucking with
the GDTR? I compared our logic against NetBSD's nvmm, as well as our own
ACPI resume handling, and that's all I can think of to explain it.


Index: sys/arch/amd64/amd64/vmm_support.S
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/vmm_support.S,v
retrieving revision 1.17
diff -u -p -r1.17 vmm_support.S
--- sys/arch/amd64/amd64/vmm_support.S  13 Feb 2021 07:47:37 -0000      1.17
+++ sys/arch/amd64/amd64/vmm_support.S  9 May 2021 13:45:08 -0000
@@ -747,6 +747,7 @@ restore_host_svm:
        popw    %ax             /* ax = saved TR */

        popq    %rdx
+       lgdtq   (%rdx)
        addq    $0x2, %rdx
        movq    (%rdx), %rdx
