> From: Dave Voutila <d...@sisu.io> > Date: Sun, 18 May 2025 15:21:28 -0400 > > Claudio Jeker <cje...@diehard.n-r-g.com> writes: > > > On Thu, May 15, 2025 at 01:15:43PM +0200, Martin Pieuchot wrote: > >> Fault below has been triggered by dong 'vmctl $myvm stop' while > >> rebooting a 7.7 amd64 VM: > >> > >> OpenBSD 7.7 (GENERIC) #619: Sun Apr 13 08:19:34 MDT 2025 > >> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC > >> real mem = 117428224 (111MB) > >> avail mem = 87916544 (83MB) > >> random: good seed from bootblocks > >> mpath0 at root > >> scsibus0 at mpath0: 256 targets > >> mainbus0 at root > >> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf2760 (10 entries) > >> bios0: vendor SeaBIOS version "1.16.3p0-OpenBSD-vmm" date 01/01/2011 > >> bios0: OpenBSD VMM > >> acpi at bios0 not configured > >> cpu0 at mainbus0: (uniprocessor) > >> cpu0: Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz, 2394.43 MHz, 06-3d-04 > >> cpu0: cpuid 1 > >> edx=78ba97f<FPU,VME,DE,PSE,TSC,MSR,PAE,CX8,SEP,PGE,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2> > >> ecx=f6d83203<SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV> > >> cpu0: cpuid 7.0 > >> ebx=1c23a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,RDSEED,ADX,SMAP> > >> edx=400<MD_CLEAR> > >> cpu0: cpuid d.1 eax=1<XSAVEOPT> > >> cpu0: cpuid 80000001 edx=24100800<NXE,PAGE1GB,LONG> > >> ecx=121<LAHF,ABM,3DNOWP> > >> cpu0: cpuid 80000007 edx=100<ITSC> > >> cpu0: MELTDOWN > >> cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB > >> 64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache > >> cpu0: smt 0, core 0, package 0 > >> cpu0: using VERW MDS workaround > >> pvbus0 at mainbus0: OpenBSD > >> pvclock0 at pvbus0 > >> pci0 at mainbus0 bus 0 > >> pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00 > >> virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00 > >> viornd0 at virtio0 > >> virtio0: irq 3 > >> virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Network" rev 0x00 > >> vio0 at virtio1: 1 queue, address fe:e1:bb:d1:de:21 > >> virtio1: irq 5 > >> virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Storage" rev 0x00 > >> vioblk0 at virtio2 > >> virtio2: irq 6 > >> scsibus1 at vioblk0: 1 targets > >> sd0 at scsibus1 targ 0 lun 0: <VirtIO, Block Device, > > >> sd0: 3072MB, 512 bytes/sector, 6291456 sectors > >> virtio3 at pci0 dev 4 function 0 "OpenBSD VMM Control" rev 0x00 > >> vmmci0 at virtio3 > >> virtio3: irq 7 > >> isa0 at mainbus0 > >> isadma0 at isa0 > >> com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo > >> com0: console > >> Rebooting in response to request from vmmci0 host > >> uvm_fault(0xffffffff82a56848, 0xf0, 0, 1) -> e > >> kernel: page fault trap, code=0 > >> Stopped at mtx_enter+0x2f: movq 0(%rdi),%rax > >> TID PID UID PRFLAGS PFLAGS CPU COMMAND > >> * 0 0 0 0x10000 0x200 0 swapper > >> mtx_enter(f0,f0,cea0d74fd2320eb6,f0,10,ffffffff82fbdcc0) at mtx_enter+0x2f > >> prsignal(0,2,2,1,ffff800000090400,d) at prsignal+0x26 > >> vmmci_config_change(ffff800000090400,ffff800000090400,86acf72f78ea7d07,0,ffff80 > >> 0000090400,0) at vmmci_config_change+0xf6 > >> virtio_pci_legacy_intr(ffff800000090400,ffff800000090400,c9a59ef2495be15f,fffff > >> fff81004000,0,4) at virtio_pci_legacy_intr+0x63 > >> intr_handler(ffffffff82fbddf0,ffff80000008e500,cbae99a56f027dfe,ffff80000008e48 > >> 0,ffffffff8152e8b6,ffffffff82fbdde0) at intr_handler+0x56 > >> Xintr_legacy7_untramp(0,ffffffff824e65b0,0,18041969,ffffffff81004000,e) at > >> Xint > >> r_legacy7_untramp+0x1a3 > >> Xspllower(0,0,c0d762ae039e8784,ffffffff81004000,ffffffff8152ed5c,ffffffff82fb70 > >> 08) at Xspllower+0x1d > >> cpu_configure(4449abb23155d965,0,0,ffff800000032000,ffffffff812a613f,ffffffff82 > >> fbdf10) at cpu_configure+0xaf > >> main(e,e,0,1,ffffffff816ab787,ffffffff82fbdf40) at main+0x427 > >> end trace frame: 0x0, count: 6 > >> https://www.openbsd.org/ddb.html describes the minimum info required in bug > >> > > > > I guess initprocess is still NULL in pvbus_reboot() and so prsignal blows > > up trying to access the ps_mtx. > > > > I would suggest to use startuphook_establish() to delay the attach of > > vmmci0 until initprocess is set. > > So here's what I'm thinking... > > I first tried delaying the virtio_attach_finish call using > startuphook_establish(9) but there seems to be a lifetime issue with the > virtio_attach_args struct. Specifically, virtio_pci_attach_finish > dereferences it as a virtio_pci_attach_args. At this point, the > pci_attach_args* member vpa_pa seems trashed.
Yes, as soon as you return from the attach function, the attach_args struct will be freed. You can make a copy of the struct and stash it away for later. But even then you have to be careful that pointer in the struct may no longer be valid. > Instead, I took the approach of using kthread_create_deferred(9) to > queue up signalling the shutdown or reboot command. If called late > enough, this ends up a direct call of the callback function instead of > adding it to the deferred kthread queue. > > In my testing, it can still result in the signal hitting initprocess > after it's alive, but before it can handle it. If that happens, it ends > up not causing a shutdown / reboot. To deal with this and the fact that > vmd thinks the command is in flight, I arm a timeout to try again after > 5 seconds. > > It kinda sucks, but it works. I'm all ears as to another approach or if > someone has a clue as to the lifetime issue I'm seeing with deferring > virtio pci attachment. I wonder if it would make sense to have a more generic solution. I suspect a similar issue may arise if you press the (ACPI) powerbutton on a machine during boot. I think an API, that can be safely called from interrupt context, would make sense. Currently we have several ssubtly different implementations of the same code, at least for shutdown, that all have one or more bugs... > diff refs/heads/master refs/heads/vmmci-panic > commit - 3c994bb556f544ca04982fd977c153c1485aab1e > commit + c58acacf15c284967253b4196b7d24aedf9ad1a2 > blob - 984626393cd85a97b6b84a19c82c8175aa2e4379 > blob + 0a4eac6d9ed700c383d332e5360bf0aea0a464de > --- sys/dev/pv/vmmci.c > +++ sys/dev/pv/vmmci.c > @@ -21,6 +21,7 @@ > #include <sys/timeout.h> > #include <sys/device.h> > #include <sys/sensors.h> > +#include <sys/kthread.h> > > #include <machine/bus.h> > > @@ -39,10 +40,12 @@ struct vmmci_softc { > struct device sc_dev; > struct virtio_softc *sc_virtio; > enum vmmci_cmd sc_cmd; > + enum vmmci_cmd sc_cmd_async; > unsigned int sc_interval; > struct ksensordev sc_sensordev; > struct ksensor sc_sensor; > struct timeout sc_tick; > + struct timeout sc_to_shutdown; > }; > > int vmmci_match(struct device *, void *, void *); > @@ -52,6 +55,7 @@ int vmmci_activate(struct device *, int); > int vmmci_config_change(struct virtio_softc *); > void vmmci_tick(void *); > void vmmci_tick_hook(struct device *); > +void vmmci_worker(void *); > > const struct cfattach vmmci_ca = { > sizeof(struct vmmci_softc), > @@ -116,6 +120,8 @@ vmmci_attach(struct device *parent, struct device *sel > config_mountroot(self, vmmci_tick_hook); > } > > + timeout_set(&sc->sc_to_shutdown, vmmci_worker, sc); > + > printf("\n"); > if (virtio_attach_finish(vsc, va) != 0) > goto err; > @@ -171,18 +177,22 @@ vmmci_config_change(struct virtio_softc *vsc) > /* no action */ > break; > case VMMCI_SHUTDOWN: > - pvbus_shutdown(&sc->sc_dev); > - break; > case VMMCI_REBOOT: > - pvbus_reboot(&sc->sc_dev); > + /* > + * Asynchronously process shutdown/reboot to allow us to handle > + * when vmmci(4) receives a command during system startup or > + * shutdown. > + */ > + sc->sc_cmd_async = cmd; > + kthread_create_deferred(vmmci_worker, sc); > break; > case VMMCI_SYNCRTC: > inittodr(gettime()); > sc->sc_cmd = VMMCI_NONE; > - break; > + break; > default: > printf("%s: invalid command %d\n", sc->sc_dev.dv_xname, cmd); > - cmd = VMMCI_NONE; > + cmd = VMMCI_NONE; > break; > } > > @@ -226,3 +236,25 @@ vmmci_tick_hook(struct device *self) > timeout_set(&sc->sc_tick, vmmci_tick, sc); > vmmci_tick(sc); > } > + > +void > +vmmci_worker(void *arg) > +{ > + struct vmmci_softc *sc = (struct vmmci_softc *)arg; > + > + switch (sc->sc_cmd_async) { > + case VMMCI_SHUTDOWN: > + pvbus_shutdown(&sc->sc_dev); > + break; > + case VMMCI_REBOOT: > + pvbus_reboot(&sc->sc_dev); > + break; > + default: > + printf("%s: invalid async command %d\n", sc->sc_dev.dv_xname, > + sc->sc_cmd_async); > + return; > + } > + > + /* Re-arm in case initprocess misses our signal. */ > + timeout_add_sec(&sc->sc_to_shutdown, 5); > +} > >