On 18/05/25(Sun) 23:43, Mark Kettenis wrote: > > From: Dave Voutila <d...@sisu.io> > > Date: Sun, 18 May 2025 15:21:28 -0400 > > > > Claudio Jeker <cje...@diehard.n-r-g.com> writes: > > > > > On Thu, May 15, 2025 at 01:15:43PM +0200, Martin Pieuchot wrote: > > >> Fault below has been triggered by dong 'vmctl $myvm stop' while > > >> rebooting a 7.7 amd64 VM: > > >> > > >> OpenBSD 7.7 (GENERIC) #619: Sun Apr 13 08:19:34 MDT 2025 > > >> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC > > >> real mem = 117428224 (111MB) > > >> avail mem = 87916544 (83MB) > > >> random: good seed from bootblocks > > >> mpath0 at root > > >> scsibus0 at mpath0: 256 targets > > >> mainbus0 at root > > >> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf2760 (10 entries) > > >> bios0: vendor SeaBIOS version "1.16.3p0-OpenBSD-vmm" date 01/01/2011 > > >> bios0: OpenBSD VMM > > >> acpi at bios0 not configured > > >> cpu0 at mainbus0: (uniprocessor) > > >> cpu0: Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz, 2394.43 MHz, 06-3d-04 > > >> cpu0: cpuid 1 > > >> edx=78ba97f<FPU,VME,DE,PSE,TSC,MSR,PAE,CX8,SEP,PGE,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2> > > >> ecx=f6d83203<SSE3,PCLMUL,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV> > > >> cpu0: cpuid 7.0 > > >> ebx=1c23a9<FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,RDSEED,ADX,SMAP> > > >> edx=400<MD_CLEAR> > > >> cpu0: cpuid d.1 eax=1<XSAVEOPT> > > >> cpu0: cpuid 80000001 edx=24100800<NXE,PAGE1GB,LONG> > > >> ecx=121<LAHF,ABM,3DNOWP> > > >> cpu0: cpuid 80000007 edx=100<ITSC> > > >> cpu0: MELTDOWN > > >> cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB > > >> 64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache > > >> cpu0: smt 0, core 0, package 0 > > >> cpu0: using VERW MDS workaround > > >> pvbus0 at mainbus0: OpenBSD > > >> pvclock0 at pvbus0 > > >> pci0 at mainbus0 bus 0 > > >> pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00 > > >> virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00 > > >> viornd0 at virtio0 > > >> virtio0: irq 3 > > >> virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Network" rev 0x00 > > >> vio0 at virtio1: 1 queue, address fe:e1:bb:d1:de:21 > > >> virtio1: irq 5 > > >> virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Storage" rev 0x00 > > >> vioblk0 at virtio2 > > >> virtio2: irq 6 > > >> scsibus1 at vioblk0: 1 targets > > >> sd0 at scsibus1 targ 0 lun 0: <VirtIO, Block Device, > > > >> sd0: 3072MB, 512 bytes/sector, 6291456 sectors > > >> virtio3 at pci0 dev 4 function 0 "OpenBSD VMM Control" rev 0x00 > > >> vmmci0 at virtio3 > > >> virtio3: irq 7 > > >> isa0 at mainbus0 > > >> isadma0 at isa0 > > >> com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo > > >> com0: console > > >> Rebooting in response to request from vmmci0 host > > >> uvm_fault(0xffffffff82a56848, 0xf0, 0, 1) -> e > > >> kernel: page fault trap, code=0 > > >> Stopped at mtx_enter+0x2f: movq 0(%rdi),%rax > > >> TID PID UID PRFLAGS PFLAGS CPU COMMAND > > >> * 0 0 0 0x10000 0x200 0 swapper > > >> mtx_enter(f0,f0,cea0d74fd2320eb6,f0,10,ffffffff82fbdcc0) at > > >> mtx_enter+0x2f > > >> prsignal(0,2,2,1,ffff800000090400,d) at prsignal+0x26 > > >> vmmci_config_change(ffff800000090400,ffff800000090400,86acf72f78ea7d07,0,ffff80 > > >> 0000090400,0) at vmmci_config_change+0xf6 > > >> virtio_pci_legacy_intr(ffff800000090400,ffff800000090400,c9a59ef2495be15f,fffff > > >> fff81004000,0,4) at virtio_pci_legacy_intr+0x63 > > >> intr_handler(ffffffff82fbddf0,ffff80000008e500,cbae99a56f027dfe,ffff80000008e48 > > >> 0,ffffffff8152e8b6,ffffffff82fbdde0) at intr_handler+0x56 > > >> Xintr_legacy7_untramp(0,ffffffff824e65b0,0,18041969,ffffffff81004000,e) > > >> at Xint > > >> r_legacy7_untramp+0x1a3 > > >> Xspllower(0,0,c0d762ae039e8784,ffffffff81004000,ffffffff8152ed5c,ffffffff82fb70 > > >> 08) at Xspllower+0x1d > > >> cpu_configure(4449abb23155d965,0,0,ffff800000032000,ffffffff812a613f,ffffffff82 > > >> fbdf10) at cpu_configure+0xaf > > >> main(e,e,0,1,ffffffff816ab787,ffffffff82fbdf40) at main+0x427 > > >> end trace frame: 0x0, count: 6 > > >> https://www.openbsd.org/ddb.html describes the minimum info required in > > >> bug > > >> > > > > > > I guess initprocess is still NULL in pvbus_reboot() and so prsignal blows > > > up trying to access the ps_mtx. > > > > > > I would suggest to use startuphook_establish() to delay the attach of > > > vmmci0 until initprocess is set. > > > > So here's what I'm thinking... > > > > I first tried delaying the virtio_attach_finish call using > > startuphook_establish(9) but there seems to be a lifetime issue with the > > virtio_attach_args struct. Specifically, virtio_pci_attach_finish > > dereferences it as a virtio_pci_attach_args. At this point, the > > pci_attach_args* member vpa_pa seems trashed. > > Yes, as soon as you return from the attach function, the attach_args > struct will be freed. You can make a copy of the struct and stash it > away for later. But even then you have to be careful that pointer in > the struct may no longer be valid. > > > Instead, I took the approach of using kthread_create_deferred(9) to > > queue up signalling the shutdown or reboot command. If called late > > enough, this ends up a direct call of the callback function instead of > > adding it to the deferred kthread queue. > > > > In my testing, it can still result in the signal hitting initprocess > > after it's alive, but before it can handle it. If that happens, it ends > > up not causing a shutdown / reboot. To deal with this and the fact that > > vmd thinks the command is in flight, I arm a timeout to try again after > > 5 seconds. > > > > It kinda sucks, but it works. I'm all ears as to another approach or if > > someone has a clue as to the lifetime issue I'm seeing with deferring > > virtio pci attachment. > > I wonder if it would make sense to have a more generic solution. I > suspect a similar issue may arise if you press the (ACPI) powerbutton > on a machine during boot. I think an API, that can be safely called > from interrupt context, would make sense. Currently we have several > ssubtly different implementations of the same code, at least for > shutdown, that all have one or more bugs...
Are you suggesting using something in-kernel instead of sending a signal to the init process which might not yet exist? > > diff refs/heads/master refs/heads/vmmci-panic > > commit - 3c994bb556f544ca04982fd977c153c1485aab1e > > commit + c58acacf15c284967253b4196b7d24aedf9ad1a2 > > blob - 984626393cd85a97b6b84a19c82c8175aa2e4379 > > blob + 0a4eac6d9ed700c383d332e5360bf0aea0a464de > > --- sys/dev/pv/vmmci.c > > +++ sys/dev/pv/vmmci.c > > @@ -21,6 +21,7 @@ > > #include <sys/timeout.h> > > #include <sys/device.h> > > #include <sys/sensors.h> > > +#include <sys/kthread.h> > > > > #include <machine/bus.h> > > > > @@ -39,10 +40,12 @@ struct vmmci_softc { > > struct device sc_dev; > > struct virtio_softc *sc_virtio; > > enum vmmci_cmd sc_cmd; > > + enum vmmci_cmd sc_cmd_async; > > unsigned int sc_interval; > > struct ksensordev sc_sensordev; > > struct ksensor sc_sensor; > > struct timeout sc_tick; > > + struct timeout sc_to_shutdown; > > }; > > > > int vmmci_match(struct device *, void *, void *); > > @@ -52,6 +55,7 @@ int vmmci_activate(struct device *, int); > > int vmmci_config_change(struct virtio_softc *); > > void vmmci_tick(void *); > > void vmmci_tick_hook(struct device *); > > +void vmmci_worker(void *); > > > > const struct cfattach vmmci_ca = { > > sizeof(struct vmmci_softc), > > @@ -116,6 +120,8 @@ vmmci_attach(struct device *parent, struct device *sel > > config_mountroot(self, vmmci_tick_hook); > > } > > > > + timeout_set(&sc->sc_to_shutdown, vmmci_worker, sc); > > + > > printf("\n"); > > if (virtio_attach_finish(vsc, va) != 0) > > goto err; > > @@ -171,18 +177,22 @@ vmmci_config_change(struct virtio_softc *vsc) > > /* no action */ > > break; > > case VMMCI_SHUTDOWN: > > - pvbus_shutdown(&sc->sc_dev); > > - break; > > case VMMCI_REBOOT: > > - pvbus_reboot(&sc->sc_dev); > > + /* > > + * Asynchronously process shutdown/reboot to allow us to handle > > + * when vmmci(4) receives a command during system startup or > > + * shutdown. > > + */ > > + sc->sc_cmd_async = cmd; > > + kthread_create_deferred(vmmci_worker, sc); > > break; > > case VMMCI_SYNCRTC: > > inittodr(gettime()); > > sc->sc_cmd = VMMCI_NONE; > > - break; > > + break; > > default: > > printf("%s: invalid command %d\n", sc->sc_dev.dv_xname, cmd); > > - cmd = VMMCI_NONE; > > + cmd = VMMCI_NONE; > > break; > > } > > > > @@ -226,3 +236,25 @@ vmmci_tick_hook(struct device *self) > > timeout_set(&sc->sc_tick, vmmci_tick, sc); > > vmmci_tick(sc); > > } > > + > > +void > > +vmmci_worker(void *arg) > > +{ > > + struct vmmci_softc *sc = (struct vmmci_softc *)arg; > > + > > + switch (sc->sc_cmd_async) { > > + case VMMCI_SHUTDOWN: > > + pvbus_shutdown(&sc->sc_dev); > > + break; > > + case VMMCI_REBOOT: > > + pvbus_reboot(&sc->sc_dev); > > + break; > > + default: > > + printf("%s: invalid async command %d\n", sc->sc_dev.dv_xname, > > + sc->sc_cmd_async); > > + return; > > + } > > + > > + /* Re-arm in case initprocess misses our signal. */ > > + timeout_add_sec(&sc->sc_to_shutdown, 5); > > +} > > > > >