Re: kernel protection fault during boot on vmm(4) VM running on AMD EPYC cpu with tsc_identify in trace

Scott Cheloha Mon, 31 Oct 2022 05:40:00 -0700

On Mon, Oct 31, 2022 at 12:43:50PM +0100, Paul de Weerd wrote:
> Hi folks,
> 
> I just upgraded a VM on my AMD EPYC host.  I get the following
> protection fault during boot:
> 
> ddb> bo re
> rebooting...
> Using drive 0, partition 3.
> Loading......
> probing: pc0 com0 mem[638K 3838M 256M a20=on] 
> disk: hd0+
> >> OpenBSD/amd64 BOOT 3.55
> \
> com0: 115200 baud
> switching console to com0
> >> OpenBSD/amd64 BOOT 3.55
> boot> 
> NOTE: random seed is being reused.
> booting hd0a:/bsd: 15615256+3781640+298464+0+1171456 
> [1143945+128+1225080+928182]=0x170d440
> entry point at 0xffffffff81001000
> [ using 3298368 bytes of bsd ELF symbol table ]
> Copyright (c) 1982, 1986, 1989, 1991, 1993
>         The Regents of the University of California.  All rights reserved.
> Copyright (c) 1995-2022 OpenBSD. All rights reserved.  https://www.OpenBSD.org
> 
> OpenBSD 7.2-current (GENERIC) #784: Fri Oct 28 21:50:59 MDT 2022
>     [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC
> real mem = 4278177792 (4079MB)
> avail mem = 4131221504 (3939MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf36b0 (12 entries)
> bios0: vendor SeaBIOS version "1.14.0p0-OpenBSD-vmm" date 01/01/2011
> bios0: OpenBSD VMM
> acpi at bios0 not configured
> cpu0 at mainbus0: (uniprocessor)
> kernel: protection fault trap, code=0
> Stopped at      tsc_identify+0xcd:      rdmsr
> ddb> ps
>    PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
> *    0       0     -1      0  7     0x10200                swapper
> ddb> trace
> tsc_identify(ffffffff822c7ff0,ffffffff822c7ff0,68a34bffd15c67e6,ffffffff822c7ff0,10,ffffffff82714c10)
>  at tsc_identify+0xcd
> identifycpu(ffffffff822c7ff0,ffffffff822c7ff0,bca189629b3de454,ffff80000002c400,ffffffff822c7ff0,ffff80000002c424)
>  at identifycpu+0x2e4
> cpu_attach(ffff80000002c300,ffff80000002c400,ffffffff82714d98,ffff80000002c300,980a70616799eafd,ffff80000002c300)
>  at cpu_attach+0x16f
> config_attach(ffff80000002c300,ffffffff82289250,ffffffff82714d98,ffffffff8138d1b0,6c550c45866795b6,ffffffff82714db8)
>  at config_attach+0x1f4
> mainbus_attach(0,ffff80000002c300,0,0,819b798732a62156,0) at mainbus_attach+0x151
> config_attach(0,ffffffff822891a8,0,0,6c550c4586f4e2c4,0) at config_attach+0x1f4
> cpu_configure(f588b7541b8b8d14,0,0,ffff80000002e000,ffffffff81abb8d3,ffffffff82714f00)
>  at cpu_configure+0x33
> main(0,0,0,0,0,1) at main+0x379
> end trace frame: 0x0, count: -8
> ddb> show reg
> rdi               0xffffffff822a3035    cpu_vendor+0xd
> rsi               0xffffffff81f04410    cmd0646_9_tim_udma+0x170f5
> rbp               0xffffffff82714c30    end+0x314c30
> rbx                       0x20202020
> rdx                                0
> rcx                       0xc0010015
> rax                                0
> r8                                 0
> r9                              0x40
> r10               0x2bc299b68ee7cba5
> r11               0x75a3a544d54dd7b9
> r12                              0x1
> r13               0xffff80000002c424
> r14               0xffffffff822c7ff0    cpu_info_full_primary+0x1ff0
> r15               0xffffffff82714c40    end+0x314c40
> rip               0xffffffff819e1f4d    tsc_identify+0xcd
> cs                               0x8
> rflags                       0x10202    __ALIGN_SIZE+0xf202
> rsp               0xffffffff82714c10    end+0x314c10
> ss                              0x10
> tsc_identify+0xcd:      rdmsr
> ddb>


You get a #GP in your VM when trying to rdmsr(MSR_HWCR): rcx holds
0xc0010015 at the faulting rdmsr in tsc_identify().  My guess is we
need to expand the MSR read bitmap for SVM so the guest is allowed to
read it.

This patch compiles, but I can't test it.  Does it fix the panic?

CC dv@ mlarkin@

Index: vmm.c
===================================================================
RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v
retrieving revision 1.323
diff -u -p -r1.323 vmm.c
--- vmm.c       7 Sep 2022 18:44:09 -0000       1.323
+++ vmm.c       31 Oct 2022 12:38:30 -0000
@@ -2705,6 +2705,10 @@ vcpu_reset_regs_svm(struct vcpu *vcpu, s
        /* allow reading TSC */
        svm_setmsrbr(vcpu, MSR_TSC);
 
+       /* allow reading HWCR and PSTATEDEF for TSC calibration */
+       svm_setmsrbr(vcpu, MSR_HWCR);
+       svm_setmsrbr(vcpu, MSR_PSTATEDEF(0));
+
        /* Guest VCPU ASID */
        if (vmm_alloc_vpid(&asid)) {
                DPRINTF("%s: could not allocate asid\n", __func__);
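
For readers unfamiliar with what "expanding the MSR read bitmap" means,
here is a rough, standalone sketch of how a read-permission bit for an
MSR is located in the SVM MSR permission map.  It assumes the layout as
I understand it from AMD's programmer's manual: an 8 KB map, three
2 KB regions, two bits per MSR with the low bit of each pair
controlling reads, and a set bit meaning "intercept".  The msrpm_*
names are hypothetical and this is not the vmm(4) code; svm_setmsrbr()
in the diff above is the real interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MSRPM_SIZE	0x2000	/* assumed size of the permission map: 8 KB */

/* Hypothetical helper: start with every MSR access intercepted. */
void
msrpm_init(uint8_t *map)
{
	assert(map != NULL);
	memset(map, 0xff, MSRPM_SIZE);
}

/*
 * Hypothetical helper: find the byte offset and bit shift of the
 * 2-bit pair for `msr`.  Returns 0 on success, -1 if the MSR lies
 * outside the three ranges the map is assumed to cover.
 */
int
msrpm_locate(uint32_t msr, uint16_t *byte, uint8_t *shift)
{
	uint32_t base, off;

	if (msr <= 0x1fff) {
		base = 0x0000;
		off = msr;
	} else if (msr >= 0xc0000000 && msr <= 0xc0001fff) {
		base = 0x0800;
		off = msr - 0xc0000000;
	} else if (msr >= 0xc0010000 && msr <= 0xc0011fff) {
		base = 0x1000;
		off = msr - 0xc0010000;
	} else
		return -1;

	*byte = (uint16_t)(base + off / 4);	/* four MSRs per byte */
	*shift = (uint8_t)((off % 4) * 2);	/* read bit is the low bit of the pair */
	return 0;
}

/* Hypothetical helper: clear the read-intercept bit so a guest rdmsr does not exit. */
int
msrpm_allow_read(uint8_t *map, uint32_t msr)
{
	uint16_t byte;
	uint8_t shift;

	assert(map != NULL);
	if (msrpm_locate(msr, &byte, &shift) != 0)
		return -1;
	map[byte] &= (uint8_t)~(1u << shift);
	return 0;
}
```

With these assumptions, the MSR from the register dump (rcx =
0xc0010015) falls in the third region, at byte 0x1005, read bit 2.
Whether clearing that bit is sufficient to avoid the #GP is exactly
what the patch above needs testing for.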
