Re: vmd amd64 snapshot, crash in acpiopen triggered by apm -b
Patch installed on my system:

$ apm -b
255
$ apm
Battery state: unknown, 0% remaining, 0 minutes life estimate
AC adapter state: not known
Performance adjustment mode: manual (2300 MHz)

Looks nice, please let me know if there's more I could test.

Thank you.

--
xs

From: Mark Kettenis
To: Tobias Heider
Subject: Re: vmd amd64 snapshot, crash in acpiopen triggered by apm -b
Date: 06/08/2023 15:07:32 Europe/Paris
Cc: xavie...@mailoo.org; bugs@openbsd.org

> Date: Sun, 6 Aug 2023 14:36:30 +0200
> From: Tobias Heider
>
> On Sun, Aug 06, 2023 at 07:55:40AM +0200, Anton Lindqvist wrote:
> > On Sat, Aug 05, 2023 at 10:08:53PM +0200, xavie...@mailoo.org wrote:
> > > Hi,
> > >
> > > I run a 2G/100G virtual machine at openbsd.amsterdam freshly upgraded
> > > from stable to the latest snapshot and I've figured out the panic
> > > by the two steps detailed there:
> > >
> > > 1. The system has a root @reboot crontab entry that start a tmux
> > > session in the background (so always detached from a TTY during the
> > > whole procedure) + a /root/.tmux.conf which is some copy of my usual
> > > tmux config, which appears to call a script that does `apm -b` (we have
> > > our quick workaround by removing it).
> > >
> > > The tmux session and the programs ran inside started just fine at the
> > > exception of the tmux session itself. By attaching that special
> > > session created @reboot, I noticed that tmux somehow fallback'd on the
> > > builtin's default config. (Green bottom status-bar and defaults
> > > keybinds). Which indeed indicated me that something went wrong.
> > >
> > > 2. It's only when I started tmux manually that the .tmux.conf calling
> > > `apm -b` triggered the crash:
> > >
> > > # tmux ^M
> > > campfire.01:ksh* <-- my "on-top" status-bar was loaded this time
> > > uvm_fault(0xfd8078416cf0, 0x39c, 0, 2) -> e
> > > kernel: page fault trap, code=2
> > > Stopped at acpiopen+0x85: orb $0x1,0x39c(%r13)
> > > TID PID UID PRFLAGS PFLAGS CPU COMMAND
> > > *173406 19781 0 0x2 0 0 apm
> > > acpiopen(5300,1,2000,80000b08) at acpiopen+0x85
> > > spec_open(800021648598) at spec_open+0xe0
> > > VOP_OPEN(fd803bb6bcb0,1,fd80691bf550,80000b08) at VOP_OPEN+0x4e
> > > vn_open(8000216487b0,1,0) at vn_open+0x275
> > > doopenat(80000b08,ff9c,f9805daef3b,0,0,800021648980) at doopenat+0x1d1
> > > syscall(8000216489f0) at syscall+0x364
> > > Xsyscall() at Xsyscall+0x128
> > > end of kernel
> > > end trace frame: 0x775e645c8040, count: 8
> >
> > This looks like a regression introduced in the recent acpi_apm.c
> > extraction in which the ENXIO short circuit got lost in
> > acpi{open,close,ioctl}.
> >
> > https://github.com/openbsd/src/commit/c75690924c3df592a3a5078fe57c951f808a8350
> >
>
> Urgh yes, thanks for tracking this down. We are clearly missing at
> least a few checks here. I am working on getting this reproduced
> meanwhile here is a first diff to hopefully fix the crash.

ok kettenis@

> Index: dev/acpi/acpi_apm.c
> ===
> RCS file: /mount/openbsd/cvs/src/sys/dev/acpi/acpi_apm.c,v
> retrieving revision 1.2
> diff -u -p -r1.2 acpi_apm.c
> --- dev/acpi/acpi_apm.c 8 Jul 2023 14:44:43 - 1.2
> +++ dev/acpi/acpi_apm.c 6 Aug 2023 12:29:56 -
> @@ -47,6 +47,9 @@ acpiopen(dev_t dev, int flag, int mode,
>  	struct acpi_softc *sc = acpi_softc;
>  	int s;
>
> +	if (sc == NULL)
> +		return (ENXIO);
> +
>  	s = splbio();
>  	switch (APMDEV(dev)) {
>  	case APMDEV_CTL:
> @@ -82,6 +85,9 @@ acpiclose(dev_t dev, int flag, int mode,
>  	struct acpi_softc *sc = acpi_softc;
>  	int s;
>
> +	if (sc == NULL)
> +		return (ENXIO);
> +
>  	s = splbio();
>  	switch (APMDEV(dev)) {
>  	case APMDEV_CTL:
> @@ -106,6 +112,9 @@ acpiioctl(dev_t dev, u_long cmd, caddr_t
>  	struct apm_power_info *pi = (struct apm_power_info *)data;
>  	int s;
>
> +	if (sc == NULL)
> +		return (ENXIO);
> +
>  	s = splbio();
>  	/* fake APM */
>  	switch (cmd) {
> @@ -167,6 +176,9 @@ acpikqfilter(dev_t dev, struct knote *kn
>  {
>  	struct acpi_softc *sc = acpi_softc;
>  	int s;
> +
> +	if (sc == NULL)
> +		return (ENXIO);
>
>  	switch (kn->kn_filter) {
>  	case EVFILT_READ:
>
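
For readers following along, the guard added by the diff can be tried outside the kernel. What follows is only a minimal userland sketch and not the driver code: `struct toy_softc`, `toy_sc`, `toy_attach()`, `toy_open()` and `TOY_CTL_OPEN` are hypothetical stand-ins for `struct acpi_softc`, `acpi_softc`, the attach path, `acpiopen()` and the driver's open flag. It assumes the global softc pointer stays NULL when the device never attached, which would be consistent with the "acpi at bios0 not configured" line in the reporter's dmesg and with the fault address 0x39c being a small offset from a NULL base.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct acpi_softc. */
struct toy_softc {
	int sc_flags;
};

/* Hypothetical "control device is already open" flag. */
#define TOY_CTL_OPEN	0x1

/* Assumption: stays NULL when the device never attached. */
static struct toy_softc *toy_sc = NULL;

/* Simulates a successful attach by publishing the softc. */
static void
toy_attach(struct toy_softc *sc)
{
	toy_sc = sc;
}

/* Simplified open routine: check the softc before touching it. */
static int
toy_open(void)
{
	struct toy_softc *sc = toy_sc;

	/* The short circuit restored by the diff: no device, no open. */
	if (sc == NULL)
		return ENXIO;

	if (sc->sc_flags & TOY_CTL_OPEN)
		return EBUSY;
	sc->sc_flags |= TOY_CTL_OPEN;
	return 0;
}
```

Without the NULL check, the `sc->sc_flags` access is the kind of store that would fault at a small offset from address zero. With it, the caller simply gets ENXIO.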
vmd amd64 snapshot, crash in acpiopen triggered by apm -b
Hi,

I run a 2G/100G virtual machine at openbsd.amsterdam, freshly upgraded from stable to the latest snapshot, and I have narrowed the panic down to the two steps detailed here:

1. The system has a root @reboot crontab entry that starts a tmux session in the background (so always detached from a TTY during the whole procedure) + a /root/.tmux.conf which is a copy of my usual tmux config, which appears to call a script that does `apm -b` (our quick workaround is to remove it).

The tmux session and the programs run inside it started just fine, with the exception of the tmux session itself. By attaching to that special session created @reboot, I noticed that tmux had somehow fallen back to the built-in default config (green bottom status bar and default keybindings), which indicated to me that something went wrong.

2. It's only when I started tmux manually that the .tmux.conf calling `apm -b` triggered the crash:

# tmux ^M
campfire.01:ksh* <-- my "on-top" status-bar was loaded this time
uvm_fault(0xfd8078416cf0, 0x39c, 0, 2) -> e
kernel: page fault trap, code=2
Stopped at acpiopen+0x85: orb $0x1,0x39c(%r13)
TID PID UID PRFLAGS PFLAGS CPU COMMAND
*173406 19781 0 0x2 0 0 apm
acpiopen(5300,1,2000,80000b08) at acpiopen+0x85
spec_open(800021648598) at spec_open+0xe0
VOP_OPEN(fd803bb6bcb0,1,fd80691bf550,80000b08) at VOP_OPEN+0x4e
vn_open(8000216487b0,1,0) at vn_open+0x275
doopenat(80000b08,ff9c,f9805daef3b,0,0,800021648980) at doopenat+0x1d1
syscall(8000216489f0) at syscall+0x364
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x775e645c8040, count: 8

Thank you for having a look. This is just a surface analysis I was able to do for now. I will take the necessary time on my side to set up the build environment on this host and be ready to test patches.

PS: I would also like to thank misha for his genius idea and the hosting services I have proudly used since 2020.

dmesg:
OpenBSD 7.3-current (GENERIC) #1267: Fri Aug 4 12:41:36 MDT 2023
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
real mem = 2130694144 (2031MB)
avail mem = 2046570496 (1951MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf3740 (10 entries)
bios0: vendor SeaBIOS version "1.14.0p3-OpenBSD-vmm" date 01/01/2011
bios0: OpenBSD VMM
acpi at bios0 not configured
cpu0 at mainbus0: (uniprocessor)
cpu0: Intel(R) Xeon(R) CPU E5-2630 0 @ 2.30GHz, 2300.04 MHz, 06-2d-07
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,CX8,SEP,PGE,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AES,XSAVE,AVX,HV,NXE,PAGE1GB,LONG,LAHF,ITSC,MD_CLEAR,MELTDOWN
cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB 64b/line 8-way L2 cache, 15MB 64b/line 20-way L3 cache
cpu0: smt 0, core 0, package 0
cpu0: using VERW MDS workaround
pvbus0 at mainbus0: OpenBSD
pvclock0 at pvbus0
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00
virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00
viornd0 at virtio0
virtio0: irq 3
virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Network" rev 0x00
vio0 at virtio1: address fe:e1:bb:6f:67:15
virtio1: irq 5
virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Storage" rev 0x00
vioblk0 at virtio2
scsibus1 at vioblk0: 1 targets
sd0 at scsibus1 targ 0 lun 0:
sd0: 102400MB, 512 bytes/sector, 209715200 sectors
virtio2: irq 6
virtio3 at pci0 dev 4 function 0 "OpenBSD VMM Control" rev 0x00
vmmci0 at virtio3
virtio3: irq 7
isa0 at mainbus0
isadma0 at isa0
com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo
com0: console
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (ed63eb54292b967c.a) swap on sd0b dump on sd0b

--
xs
Re: panic at klist_remove_locked triggered by SIGTERM on tail -f /dev/ugen0.01 after unplugging a USB serial interface
New patch applied, same procedure:

se-h1# uname -v
GENERIC.MP#1
se-h1# tail -f /dev/ugen0.01
[user removes the USB device]
tail: /dev/ugen0.01: Input/output error
tail: Lost file /dev/ugen0.01: Input/output error
^C
[ no panic, back to sh ]

I did the test twice, here's the dmesg output:

se-h1# dmesg
[...]
ugen0 at uhub0 port 7 "INNO TECH USB to Serial" rev 1.10/0.02 addr 7
tsleep_nsec: tail[48559]: ugenrintr: trying to sleep zero nanoseconds
ugen0 detached
ugen0 at uhub0 port 7 "INNO TECH USB to Serial" rev 1.10/0.02 addr 7
tsleep_nsec: tail[80471]: ugenrintr: trying to sleep zero nanoseconds
ugen0 detached
ugen0 at uhub0 port 7 "INNO TECH USB to Serial" rev 1.10/0.02 addr 7

It still seems to react as we would expect. To be more confident I did another test with less:

se-h1# less +F -f /dev/ugen0.01
--> Waiting for data...
[user removes the USB device]
--> Read error (press RETURN)
^CQ
[no panic, back to sh]

From: Visa Hankala
To: xavie...@mailoo.org
Subject: Re: panic at klist_remove_locked triggered by SIGTERM on tail -f /dev/ugen0.01 after unplugging a USB serial interface
Date: 18/12/2022 17:14:27 Europe/Paris
Cc: bugs@openbsd.org

On Sun, Dec 18, 2022 at 02:32:09PM +0100, xavie...@mailoo.org wrote:
> I tested, it looks promising:
>
> 0. Rebooted on the new GENERIC.MP built, uname -v
>
> GENERIC.MP#0
>
> 1. Physical plug
>
> ugen0 at uhub0 port 7 "INNO TECH USB to Serial" rev 1.10/0.02 addr 3
>
> 2. Read
>
> tail -f /dev/ugen0.01
>
> 3. Physical unplug
>
> ugen0 detached
>
> 4. Tail reports i/o error as expected
>
> tail: /dev/ugen0.01: Input/output error
> tail: Lost file /dev/ugen0.01: Input/output error
>
> 5. ^C
>
> No panic occurring.

Good, we are on the right track.

Below is a revised patch. Could you replace the initial patch with this one and test again?

This new patch ensures that klist_invalidate() does get called after vdevgone(). Making the call from ugen_do_close() might be brittle if kqueue event registration (ugenkqfilter()) happens to race with detaching (usbd_is_dying() possibly stops the race, though).

Index: dev/usb/ugen.c
===
RCS file: src/sys/dev/usb/ugen.c,v
retrieving revision 1.116
diff -u -p -r1.116 ugen.c
--- dev/usb/ugen.c 2 Jul 2022 08:50:42 - 1.116
+++ dev/usb/ugen.c 18 Dec 2022 15:48:11 -
@@ -798,6 +798,10 @@ ugen_detach(struct device *self, int fla
 	for (endptno = 0; endptno < USB_MAX_ENDPOINTS; endptno++) {
 		if (sc->sc_is_open[endptno])
 			ugen_do_close(sc, endptno, FREAD|FWRITE);
+
+		/* ugenkqfilter() always uses IN. */
+		sce = &sc->sc_endpoints[endptno][IN];
+		klist_invalidate(&sce->rsel.si_note);
 	}
 	return (0);
 }
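
To make the idea behind the patch easier to follow, here is a toy userland model of "invalidate every registered note when the device goes away". It is a simplified sketch and not the kernel's klist implementation: `struct toy_note`, `struct toy_list`, `toy_list_insert()`, `toy_list_invalidate()` and `toy_note_remove()` are hypothetical names, and the assumed behavior (detach each note and flag it as EOF) is only an approximation of what klist_invalidate() does.

```c
#include <stddef.h>

struct toy_list;

/* Hypothetical stand-in for a knote registered on a device. */
struct toy_note {
	struct toy_note *next;
	struct toy_list *list;	/* owning list, NULL once detached */
	int eof;		/* set when the device went away */
};

/* Hypothetical stand-in for the klist living inside the device softc. */
struct toy_list {
	struct toy_note *head;
};

/* Register a note on the device's list. */
static void
toy_list_insert(struct toy_list *l, struct toy_note *n)
{
	n->list = l;
	n->eof = 0;
	n->next = l->head;
	l->head = n;
}

/* On detach: unhook every note so none keeps pointing into the device. */
static void
toy_list_invalidate(struct toy_list *l)
{
	struct toy_note *n;

	while ((n = l->head) != NULL) {
		l->head = n->next;
		n->next = NULL;
		n->list = NULL;
		n->eof = 1;
	}
}

/*
 * Called when the reader goes away (for example after ^C).
 * Returns 1 if the note was unlinked from a live list, 0 if it had
 * already been detached and the device memory is left untouched.
 */
static int
toy_note_remove(struct toy_note *n)
{
	struct toy_list *l = n->list;
	struct toy_note **pp;

	if (l == NULL)
		return 0;
	for (pp = &l->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == n) {
			*pp = n->next;
			break;
		}
	}
	n->next = NULL;
	n->list = NULL;
	return 1;
}
```

In this model, a note that was not invalidated before the device memory is released would still carry a `list` pointer into freed storage, and `toy_note_remove()` would walk it. That is the toy analogue of the stray knote described earlier in the thread.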
Re: panic at klist_remove_locked triggered by SIGTERM on tail -f /dev/ugen0.01 after unplugging a USB serial interface
I tested, it looks promising:

0. Rebooted on the newly built GENERIC.MP, uname -v

GENERIC.MP#0

1. Physical plug

ugen0 at uhub0 port 7 "INNO TECH USB to Serial" rev 1.10/0.02 addr 3

2. Read

tail -f /dev/ugen0.01

3. Physical unplug

ugen0 detached

4. Tail reports i/o error as expected

tail: /dev/ugen0.01: Input/output error
tail: Lost file /dev/ugen0.01: Input/output error

5. ^C

No panic occurring.

Excellent! Well, thank you very much for this fix. If more testing is needed, please let me know.

Cheers.

From: Visa Hankala
To: xavie...@mailoo.org
Subject: Re: panic at klist_remove_locked triggered by SIGTERM on tail -f /dev/ugen0.01 after unplugging a USB serial interface
Date: 18/12/2022 12:49:01 Europe/Paris
Cc: bugs@openbsd.org

On Sat, Dec 17, 2022 at 02:32:03PM +0100, xavie...@mailoo.org wrote:
> While doing the following actions:
>
> 1. Plugging this USB device:
>
> ugen0 at uhub0 port 7 "INNO TECH USB to Serial" rev 1.10/0.02 addr 2
>
> 2. Running (remotely via ssh)
>
> tail -f /dev/ugen0.01
>
> 3. Physically unplugging the USB device
>
> [kernel notices ugen0 detached]
>
> 4. ^C in the terminal attached to the `tail -f /dev/ugen0.01` of step 2
>
> Panic occurs.
>
> [...]
>
> kernel: page fault trap, code x0
>
> Stopped at klist_remove_locked.0x53

This indicates that there was a stray knote that referred to the detached device.

The following patch might help. Could you test it?

Index: dev/usb/ugen.c
===
RCS file: src/sys/dev/usb/ugen.c,v
retrieving revision 1.116
diff -u -p -r1.116 ugen.c
--- dev/usb/ugen.c 2 Jul 2022 08:50:42 - 1.116
+++ dev/usb/ugen.c 18 Dec 2022 11:42:26 -
@@ -474,6 +474,8 @@ ugen_do_close(struct ugen_softc *sc, int
 			free(sce->ibuf, M_USBDEV, sce->ibuflen);
 			sce->ibuf = NULL;
 		}
+
+		klist_invalidate(&sce->rsel.si_note);
 	}
 	sc->sc_is_open[endpt] = 0;
outdated mandoc.db contains bogus man on snapshot
>Synopsis: outdated mandoc.db contains bogus man on snapshot
>Category: mandoc
>Environment:
	System      : OpenBSD 6.5
	Details     : OpenBSD 6.5-current (GENERIC.MP) #184: Wed Aug 7 21:37:16 MDT 2019
		dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
	Architecture: OpenBSD.amd64
	Machine     : amd64
>Description:
	Running `man intro` on a fresh install of a snapshot gives the following output:

```
primary# man intro
man: /usr/share/man/man9/intro.9: No such file or directory
man: outdated mandoc.db contains bogus man9/intro.9 entry, run makewhatis /usr/share/man
man: /usr/share/man/man3/intro.3: No such file or directory
man: outdated mandoc.db contains bogus man3/intro.3 entry, run makewhatis /usr/share/man
man: /usr/share/man/man6/intro.6: No such file or directory
man: outdated mandoc.db contains bogus man6/intro.6 entry, run makewhatis /usr/share/man
man: /usr/share/man/man2/intro.2: No such file or directory
man: outdated mandoc.db contains bogus man2/intro.2 entry, run makewhatis /usr/share/man
INTRO(1) General Commands Manual INTRO(1)

NAME
     intro - introduction to general commands (tools and utilities)

[ ... ]
```

>How-To-Repeat:
	I did an FDE setup in a kvm virtual machine, added only / and swap to my disklabel and rebooted. Once I was connected, I just ran `man intro`.
>Fix:
	As root, `makewhatis /usr/share/man` solved the problem.

: Run sendbug as root if this is an ACPI report!
: dmesg and usbdevs are attached.
: Feel free to delete or use the -D flag if they contain sensitive information.

dmesg:
OpenBSD 6.5-current (GENERIC.MP) #184: Wed Aug 7 21:37:16 MDT 2019
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 1056804864 (1007MB)
avail mem = 1014673408 (967MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0xf57c0 (12 entries)
bios0: vendor SeaBIOS version "1.12.0-1" date 04/01/2014
bios0: QEMU Standard PC (i440FX + PIIX, 1996)
acpi0 at bios0: ACPI 1.0
acpi0: sleep states S5
acpi0: tables DSDT FACP APIC
acpi0: wakeup devices
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel Core Processor (Skylake, IBRS), 2808.37 MHz, 06-5e-03
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,UMIP,MD_CLEAR,IBRS,IBPB,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 1000MHz
cpu1 at mainbus0: apid 1 (application processor)
TSC skew=-2
cpu1: Intel Core Processor (Skylake, IBRS), 2808.03 MHz, 06-5e-03
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,UMIP,MD_CLEAR,IBRS,IBPB,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu1: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu1: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu1: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu1: smt 0, core 0, package 1
cpu2 at mainbus0: apid 2 (application processor)
TSC skew=-9
cpu2: Intel Core Processor (Skylake, IBRS), 2808.03 MHz, 06-5e-03
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SS,SSE3,PCLMUL,SSSE3,FMA3,CX16,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,UMIP,MD_CLEAR,IBRS,IBPB,SSBD,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu2: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB 64b/line 16-way L2 cache
cpu2: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu2: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
cpu2: smt 0, core 0, package 2
cpu3 at mainbus0: apid 3 (application processor)
TSC skew=-7
cpu3: Intel Core Processor (Skylake, IBRS), 2808.03 MHz, 06-5e-03
cpu3:
libressl ocsp aborts with a passphrase in the rkey file
>Synopsis: libressl aborted when starting ocsp with a passphrase in the generated rkey file
>Category: library
>Environment:
	System      : OpenBSD 6.0
	Details     : OpenBSD 6.0-current (GENERIC.MP) #150: Tue Jan 17 17:41:15 MST 2017
		bu...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
	Architecture: OpenBSD.amd64
	Machine     : amd64
>Description:
	LibreSSL aborted when starting the OCSP server for a test domain intermediate CA, while opening the rkey file generated by:

	openssl genrsa -aes256 -out intermediate/private/ocsp.inda.re.key.pem 4096

	Please note that the OCSP server starts correctly if the key file is generated without -aes256.

	Not knowing how to set up a root CA, I followed the procedure at the URL pasted below. Then, running the OCSP server with the arguments shown below resulted in:

	Abort trap (core dumped)

	at the output, and

	openssl(7598): syscall 54 "ioctl"

	in the messages.
>How-To-Repeat:
	# Followed the method as presented on this site:
	# https://jamielinux.com/docs/openssl-certificate-authority/introduction.html
	# Everything goes right with libressl until I attempted to start the OCSP server

	# Generated the keyfile with a passphrase, as shown in the last part of the tutorial
	openssl genrsa -aes256 -out intermediate/private/ocsp.inda.re.key.pem 4096

	# Triggers the abort
	openssl ocsp -port 127.0.0.1:25600 -text -sha256 \
	    -index intermediate/index.txt \
	    -CA intermediate/certs/ca-chain.cert.pem \
	    -rkey intermediate/private/ocsp.inda.re.key.pem \
	    -rsigner intermediate/certs/ocsp.inda.re.cert.pem \
	    -nrequest 1