On Mon, Oct 31, 2022 at 07:39:01AM -0500, Scott Cheloha wrote:
> On Mon, Oct 31, 2022 at 12:43:50PM +0100, Paul de Weerd wrote:
> > Hi folks,
> >
> > I just upgraded a VM on my AMD EPYC host.  I get the following
> > protection fault during boot:
> >
> > ddb> bo re
> > rebooting...
> > Using drive 0, partition 3.
> > Loading......
> > probing: pc0 com0 mem[638K 3838M 256M a20=on]
> > disk: hd0+
> > >> OpenBSD/amd64 BOOT 3.55
> > \
> > com0: 115200 baud
> > switching console to com0
> > >> OpenBSD/amd64 BOOT 3.55
> > boot>
> > NOTE: random seed is being reused.
> > booting hd0a:/bsd: 15615256+3781640+298464+0+1171456
> > [1143945+128+1225080+928182]=0x170d440
> > entry point at 0xffffffff81001000
> > [ using 3298368 bytes of bsd ELF symbol table ]
> > Copyright (c) 1982, 1986, 1989, 1991, 1993
> > The Regents of the University of California.  All rights reserved.
> > Copyright (c) 1995-2022 OpenBSD. All rights reserved.
> > https://www.OpenBSD.org
> >
> > OpenBSD 7.2-current (GENERIC) #784: Fri Oct 28 21:50:59 MDT 2022
> > [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC
> > real mem = 4278177792 (4079MB)
> > avail mem = 4131221504 (3939MB)
> > random: good seed from bootblocks
> > mpath0 at root
> > scsibus0 at mpath0: 256 targets
> > mainbus0 at root
> > bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf36b0 (12 entries)
> > bios0: vendor SeaBIOS version "1.14.0p0-OpenBSD-vmm" date 01/01/2011
> > bios0: OpenBSD VMM
> > acpi at bios0 not configured
> > cpu0 at mainbus0: (uniprocessor)
> > kernel: protection fault trap, code=0
> > Stopped at tsc_identify+0xcd: rdmsr
> > ddb> ps
> >    PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
> > *    0       0     -1      0  7     0x10200                swapper
> > ddb> trace
> > tsc_identify(ffffffff822c7ff0,ffffffff822c7ff0,68a34bffd15c67e6,ffffffff822c7ff0,10,ffffffff82714c10)
> > at tsc_identify+0xcd
> > identifycpu(ffffffff822c7ff0,ffffffff822c7ff0,bca189629b3de454,ffff80000002c400,ffffffff822c7ff0,ffff80000002c424)
> > at identifycpu+0x2e4
> > cpu_attach(ffff80000002c300,ffff80000002c400,ffffffff82714d98,ffff80000002c300,980a70616799eafd,ffff80000002c300)
> > at cpu_attach+0x16f
> > config_attach(ffff80000002c300,ffffffff82289250,ffffffff82714d98,ffffffff8138d1b0,6c550c45866795b6,ffffffff82714db8)
> > at config_attach+0x1f4
> > mainbus_attach(0,ffff80000002c300,0,0,819b798732a62156,0) at
> > mainbus_attach+0x151
> > config_attach(0,ffffffff822891a8,0,0,6c550c4586f4e2c4,0) at
> > config_attach+0x1f4
> > cpu_configure(f588b7541b8b8d14,0,0,ffff80000002e000,ffffffff81abb8d3,ffffffff82714f00)
> > at cpu_configure+0x33
> > main(0,0,0,0,0,1) at main+0x379
> > end trace frame: 0x0, count: -8
> > ddb> show reg
> > rdi    0xffffffff822a3035  cpu_vendor+0xd
> > rsi    0xffffffff81f04410  cmd0646_9_tim_udma+0x170f5
> > rbp    0xffffffff82714c30  end+0x314c30
> > rbx    0x20202020
> > rdx    0
> > rcx    0xc0010015
> > rax    0
> > r8     0
> > r9     0x40
> > r10    0x2bc299b68ee7cba5
> > r11    0x75a3a544d54dd7b9
> > r12    0x1
> > r13    0xffff80000002c424
> > r14    0xffffffff822c7ff0  cpu_info_full_primary+0x1ff0
> > r15    0xffffffff82714c40  end+0x314c40
> > rip    0xffffffff819e1f4d  tsc_identify+0xcd
> > cs     0x8
> > rflags 0x10202  __ALIGN_SIZE+0xf202
> > rsp    0xffffffff82714c10  end+0x314c10
> > ss     0x10
> > tsc_identify+0xcd: rdmsr
> > ddb>
>
> You get a #GP in your VM when trying to rdmsr(MSR_HWCR).  My guess is
> we need to expand the MSR read bitmap for SVM.
>
> This patch compiles, but I can't test it.  Does it fix the panic?
>
> CC dv@ mlarkin@
>
> Index: vmm.c
> ===================================================================
> RCS file: /cvs/src/sys/arch/amd64/amd64/vmm.c,v
> retrieving revision 1.323
> diff -u -p -r1.323 vmm.c
> --- vmm.c	7 Sep 2022 18:44:09 -0000	1.323
> +++ vmm.c	31 Oct 2022 12:38:30 -0000
> @@ -2705,6 +2705,10 @@ vcpu_reset_regs_svm(struct vcpu *vcpu, s
>  	/* allow reading TSC */
>  	svm_setmsrbr(vcpu, MSR_TSC);
>
> +	/* allow reading HWCR and PSTATEDEF for TSC calibration */
> +	svm_setmsrbr(vcpu, MSR_HWCR);
> +	svm_setmsrbr(vcpu, MSR_PSTATEDEF(0));
> +
>  	/* Guest VCPU ASID */
>  	if (vmm_alloc_vpid(&asid)) {
>  		DPRINTF("%s: could not allocate asid\n", __func__);
>

This is the same diff I would have come up with myself, and since it is
reported to fix the issue, ok mlarkin@ on this.

Thanks Scott.

-ml
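
For anyone following along who has not looked at the SVM MSR permission
map: below is a minimal, standalone sketch of what "allowing a read" in
that bitmap amounts to. It is not the vmm.c code. The helper names
(msrpm_locate, msrpm_allow_read, msrpm_read_intercepted) and MSRPM_SIZE
are made up for illustration, and the layout is an assumption based on
my reading of AMD's manual: an 8 KiB map, two bits per MSR (low bit is
the read intercept, high bit the write intercept), covering three MSR
ranges at byte offsets 0x0, 0x800 and 0x1000. A set bit means the access
is intercepted, so permitting a guest rdmsr means clearing the read bit.

```c
#include <stdint.h>

#define MSRPM_SIZE	0x2000	/* assumed 8 KiB map; all-ones intercepts everything */

/*
 * Hypothetical helper: locate the byte and the bit shift of the 2-bit
 * pair belonging to an MSR.  Returns 0 on success, -1 if the MSR lies
 * outside the three ranges the map is assumed to cover.
 */
static int
msrpm_locate(uint32_t msr, uint32_t *byte, unsigned int *shift)
{
	uint32_t base, off;

	if (msr <= 0x1fff) {
		base = 0x0;
		off = msr;
	} else if (msr >= 0xc0000000 && msr <= 0xc0001fff) {
		base = 0x800;
		off = msr - 0xc0000000;
	} else if (msr >= 0xc0010000 && msr <= 0xc0011fff) {
		base = 0x1000;
		off = msr - 0xc0010000;
	} else
		return -1;

	*byte = base + off / 4;		/* four MSRs per byte */
	*shift = (off % 4) * 2;		/* low bit of the pair = read bit */
	return 0;
}

/* Clear the read-intercept bit so a guest rdmsr of this MSR does not trap. */
static int
msrpm_allow_read(uint8_t *map, uint32_t msr)
{
	uint32_t byte;
	unsigned int shift;

	if (msrpm_locate(msr, &byte, &shift) == -1)
		return -1;
	map[byte] &= (uint8_t)~(1u << shift);
	return 0;
}

/*
 * Report whether a guest read would be intercepted.  MSRs outside the
 * map are reported as intercepted (assumption, see above).
 */
static int
msrpm_read_intercepted(const uint8_t *map, uint32_t msr)
{
	uint32_t byte;
	unsigned int shift;

	if (msrpm_locate(msr, &byte, &shift) == -1)
		return 1;
	return (map[byte] >> shift) & 1;
}
```

With a map initialized to all-ones, msrpm_allow_read(map, 0xc0010015)
(the MSR number in rcx in the register dump above) clears one bit in
byte 0x1005 and leaves the neighbouring write bit set, so guest writes
to that MSR would still be intercepted under these assumptions.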
