A laptop of mine (dmesg below) frequently hangs. After some bisecting
and extensive testing I think I found the commit that causes this:
mpi@'s
"Always acquire the `vmobjlock' before incrementing an object's reference."
commit from 2022-04-28.

My definition of "the system hangs": 
 * Display is frozen
 * Switching to ttyC0 using Ctrl+Alt+F1 doesn't do anything
 * System does not respond to keyboard or mouse input
 * Pressing the power button for 1-2 seconds doesn't achieve anything
(usually this initiates a system shutdown)
 * And also the fan starts spinning

The system sometimes hangs very soon after booting the system, I've
seen it happen once while I was typing my username in xenodm to log in.
But sometimes it takes a couple of hours.

For some reason I put
"@reboot while sleep 1 ; do sync ; done"
in my crontab and it *seems* (I'm not sure) that the hangs occur more
frequently this way. Not sure if that is useful information.

I don't see similar problems on my other machines.

It looks like when the system hangs, it's stuck spinning in the new
code that was added in that commit; to confirm that I added some code
(see the diff below) to enter ddb if it's spinning there for 10 seconds
(and then it indeed enters ddb). If my thinking and diff make sense
I think that indeed confirms that is the problem.

Any tips for debugging this?

Caspar


==== 1: the diff to confirm it's the spinning

Index: uvm/uvm_vnode.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
retrieving revision 1.124
diff -u -p -r1.124 uvm_vnode.c
--- uvm/uvm_vnode.c     3 May 2022 21:20:35 -0000       1.124
+++ uvm/uvm_vnode.c     26 Jun 2022 18:21:19 -0000
@@ -1455,6 +1455,10 @@ uvm_vnp_sync(struct mount *mp)
 {
        struct uvm_vnode *uvn;
        struct vnode *vp;
+       struct timespec start;
+       struct timespec limit = { 10, 0 };
+       struct timespec now, diff;
+       int i = 0;
 
        /*
         * step 1: ensure we are only ones using the uvn_sync_q by locking
@@ -1472,9 +1476,18 @@ uvm_vnp_sync(struct mount *mp)
                if (mp && vp->v_mount != mp)
                        continue;
 
+               nanotime(&start);
                /* Spin to ensure `uvn_wlist' isn't modified concurrently. */
                while (rw_enter(uvn->u_obj.vmobjlock, RW_WRITE|RW_NOSLEEP)) {
                        CPU_BUSY_CYCLE();
+                       if (++i % 4096 == 0) {
+                               nanotime(&now);
+                               timespecsub(&now, &start, &diff);
+                               if (timespeccmp(&diff, &limit, >)) {
+                                       printf("i: %d\n", i);
+                                       db_enter();
+                               }
+                       }
                }
 
                /*


==== 2: ddb output

i: 175194112
Stopped at      db_enter+0x10:  popq    %rbp
ddb{1}> ddb{1}> db_enter() at db_enter+0x10
uvm_vnp_sync(ffff80000087f400) at uvm_vnp_sync+0xc9
sys_sync(ffff8000fffc0a90,ffff800033ba0b30,ffff800033ba0b80) at sys_sync+0x85
syscall(ffff800033ba0bf0) at syscall+0x374
Xsyscall() at Xsyscall+0x128
end of kernel
end trace frame: 0x7f7fffff78c0, count: -5
ddb{1}> 


==== 3: dmesg

OpenBSD 7.1-current (GENERIC.MP) #1: Sun Jun 26 19:19:30 CEST 2022
        caspar@laptop:/sys/arch/amd64/compile/GENERIC.MP
real mem = 17032646656 (16243MB)
avail mem = 16499073024 (15734MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xa9764000 (34 entries)
bios0: vendor HP version "P96 Ver. 01.38" date 01/06/2021
bios0: HP HP EliteBook 1040 G4
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SSDT RTMA UEFI SSDT TPM2 SSDT MSDM SLIC WSMT HPET APIC 
MCFG SSDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT DBGP DBG2 DMAR NHLT SSDT ASF! 
FPDT BGRT SSDT
acpi0: wakeup devices GLAN(S4) XHC_(S3) XDCI(S4) HDAS(S4) RP01(S4) PXSX(S4) 
TXHC(S3) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) RP05(S0) 
PXSX(S0) RP06(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 23999999 Hz
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz, 2593.96 MHz, 06-8e-09
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 24MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz, 2592.92 MHz, 06-8e-09
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 1 (application processor)
cpu2: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz, 2593.96 MHz, 06-8e-09
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 1, core 0, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz, 2593.96 MHz, 06-8e-09
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,SGX,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,MPX,RDSEED,ADX,SMAP,CLFLUSHOPT,PT,SRBDS_CTRL,MD_CLEAR,TSXFA,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,XSAVEC,XGETBV1,XSAVES,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 1, core 1, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 120 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf0000000, bus 0-127
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus 1 (RP01)
acpiprt2 at acpi0: bus -1 (RP02)
acpiprt3 at acpi0: bus -1 (RP03)
acpiprt4 at acpi0: bus -1 (RP04)
acpiprt5 at acpi0: bus 108 (RP05)
acpiprt6 at acpi0: bus -1 (RP06)
acpiprt7 at acpi0: bus -1 (RP07)
acpiprt8 at acpi0: bus -1 (RP08)
acpiprt9 at acpi0: bus 109 (RP09)
acpiprt10 at acpi0: bus -1 (RP10)
acpiprt11 at acpi0: bus -1 (RP11)
acpiprt12 at acpi0: bus -1 (RP12)
acpiprt13 at acpi0: bus -1 (RP13)
acpiprt14 at acpi0: bus -1 (RP14)
acpiprt15 at acpi0: bus -1 (RP15)
acpiprt16 at acpi0: bus -1 (RP16)
acpiprt17 at acpi0: bus -1 (RP17)
acpiprt18 at acpi0: bus -1 (RP18)
acpiprt19 at acpi0: bus -1 (RP19)
acpiprt20 at acpi0: bus -1 (RP20)
acpiprt21 at acpi0: bus -1 (RP21)
acpiprt22 at acpi0: bus -1 (RP22)
acpiprt23 at acpi0: bus -1 (RP23)
acpiprt24 at acpi0: bus -1 (RP24)
acpiec0 at acpi0
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
acpipci0 at acpi0 PCI0: 0x00000010 0x00000011 0x00000000
"INT3403" at acpi0 not configured
"INT3403" at acpi0 not configured
"ALP0113" at acpi0 not configured
pchgpio0 at acpi0 GPI0 addr 0xfdaf0000/0x10000 0xfdae0000/0x10000 
0xfdac0000/0x10000 irq 14, 176 pins
"ALP0013" at acpi0 not configured
"PNP0C51" at acpi0 not configured
"PNP0C51" at acpi0 not configured
acpibat0 at acpi0: BAT0 model "Primary" serial 01525 2018/04/20 type LIon oem 
"Hewlett-Packard"
acpiac0 at acpi0: AC unit online
acpibtn0 at acpi0: SLPB
acpibtn1 at acpi0: LID_
acpibtn2 at acpi0: PWRB
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
"HPQ6001" at acpi0 not configured
tpm0 at acpi0 TPM_ 2.0 (TIS) addr 0xfed40000/0x5000, device 0x001b15d1 rev 0x10
"INT3400" at acpi0 not configured
acpicpu0 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: C3(200@1034 mwait.1@0x60), C2(200@151 mwait.1@0x33), 
C1(1000@1 mwait.1), PSS
acpipwrres0 at acpi0: WRST
acpitz0 at acpi0: critical temperature is 128 degC
acpitz1 at acpi0: critical temperature is 128 degC
acpitz2 at acpi0: critical temperature is 128 degC
acpitz3 at acpi0: critical temperature is 128 degC
acpitz4 at acpi0: critical temperature is 128 degC
acpitz5 at acpi0acpitz5: CHGZ: failed to read _CRT
acpitz5: CHGZ: failed to read _CRT
: no critical temperature defined
acpitz6 at acpi0: critical temperature is 128 degC
acpitz7 at acpi0: critical temperature is 128 degC
acpitz8 at acpi0: critical temperature is 128 degC
acpivideo0 at acpi0: GFX0
acpivout0 at acpivideo0: DD1F
acpivout1 at acpivideo0: DD2F
cpu0: using VERW MDS workaround (except on vmm entry)
cpu0: Enhanced SpeedStep 2593 MHz: speeds: 2701, 2700, 2600, 2500, 2400, 2200, 
2000, 1800, 1600, 1500, 1300, 1100, 800, 700, 600, 400 MHz
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Core 7G Host" rev 0x02
inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 620" rev 0x02
drm0 at inteldrm0
inteldrm0: msi, KABYLAKE, gen 9
"Intel Core 6G Thermal" rev 0x02 at pci0 dev 4 function 0 not configured
xhci0 at pci0 dev 20 function 0 "Intel 100 Series xHCI" rev 0x21: msi, xHCI 1.0
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00 
addr 1
pchtemp0 at pci0 dev 20 function 2 "Intel 100 Series Thermal" rev 0x21
dwiic0 at pci0 dev 21 function 0 "Intel 100 Series I2C" rev 0x21: apic 2 int 16
iic0 at dwiic0
dwiic1 at pci0 dev 21 function 1 "Intel 100 Series I2C" rev 0x21: apic 2 int 17
iic1 at dwiic1
ihidev0 at iic1 addr 0x2c gpio 82, vendor 0x44e product 0x120a, ALP0013
ihidev0: 12 report ids
imt0 at ihidev0: clickpad, 5 contacts
wsmouse0 at imt0 mux 0
ims0 at ihidev0 reportid 1: 2 buttons, Z and W dir
wsmouse1 at ims0 mux 0
hid at ihidev0 reportid 3 not configured
hid at ihidev0 reportid 4 not configured
hid at ihidev0 reportid 5 not configured
hid at ihidev0 reportid 6 not configured
hid at ihidev0 reportid 10 not configured
hid at ihidev0 reportid 12 not configured
"Intel 100 Series MEI" rev 0x21 at pci0 dev 22 function 0 not configured
ppb0 at pci0 dev 28 function 0 "Intel 100 Series PCIE" rev 0xf1: msi
pci1 at ppb0 bus 1
ppb1 at pci0 dev 28 function 4 "Intel 100 Series PCIE" rev 0xf1: msi
pci2 at ppb1 bus 108
iwm0 at pci2 dev 0 function 0 "Intel Dual Band Wireless-AC 8265" rev 0x78, msi
ppb2 at pci0 dev 29 function 0 "Intel 100 Series PCIE" rev 0xf1: msi
pci3 at ppb2 bus 109
nvme0 at pci3 dev 0 function 0 "Samsung SM981/PM981 NVMe" rev 0x00: msix, NVMe 
1.2
nvme0: SAMSUNG MZVLB512HAJQ-000H1, firmware EXA72H1Q, serial S3WTNB0K561746
scsibus1 at nvme0: 2 targets, initiator 0
sd0 at scsibus1 targ 1 lun 0: <NVMe, SAMSUNG MZVLB512, EXA7>
sd0: 488386MB, 512 bytes/sector, 1000215216 sectors
pcib0 at pci0 dev 31 function 0 "Intel 200 Series LPC" rev 0x21
"Intel 100 Series PMC" rev 0x21 at pci0 dev 31 function 2 not configured
azalia0 at pci0 dev 31 function 3 "Intel 200 Series HD Audio" rev 0x21: msi
azalia0: codecs: Conexant/0x1fd6, Intel/0x280b, using Conexant/0x1fd6
audio0 at azalia0
ichiic0 at pci0 dev 31 function 4 "Intel 100 Series SMBus" rev 0x21: apic 2 int 
16
iic2 at ichiic0
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5 irq 1 irq 12
pckbd0 at pckbc0 (kbd slot)
wskbd0 at pckbd0: console keyboard
pms0 at pckbc0 (aux slot)
wsmouse2 at pms0 mux 0
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
vmm0 at mainbus0: VMX/EPT
efifb at mainbus0 not configured
cdce0 at uhub0 port 2 configuration 2 interface 0 "HP Inc. HP lt4132 LTE/HSPA+ 
4G Module" rev 2.00/1.02 addr 2
cdce0: could not find data bulk in
ugen0 at uhub0 port 2 configuration 2 "HP Inc. HP lt4132 LTE/HSPA+ 4G Module" 
rev 2.00/1.02 addr 2
ugen1 at uhub0 port 3 "Intel Bluetooth" rev 2.00/0.10 addr 3
uhidev0 at uhub0 port 4 configuration 1 interface 0 "Logitech USB Receiver" rev 
2.00/24.07 addr 4
uhidev0: iclass 3/1
ukbd0 at uhidev0: 8 variable keys, 6 key codes
wskbd1 at ukbd0 mux 1
uhidev1 at uhub0 port 4 configuration 1 interface 1 "Logitech USB Receiver" rev 
2.00/24.07 addr 4
uhidev1: iclass 3/1, 8 report ids
ums0 at uhidev1 reportid 2: 16 buttons, Z and W dir
wsmouse3 at ums0 mux 0
ucc0 at uhidev1 reportid 3: 652 usages, 18 keys, array
wskbd2 at ucc0 mux 1
uhid0 at uhidev1 reportid 4: input=1, output=0, feature=0
uhid1 at uhidev1 reportid 8: input=1, output=0, feature=0
uhidev2 at uhub0 port 4 configuration 1 interface 2 "Logitech USB Receiver" rev 
2.00/24.07 addr 4
uhidev2: iclass 3/0, 33 report ids
uhidpp0 at uhidev2 device 1 mouse "MX Master 2S" serial ae-f6-7f-70
uhid2 at uhidev2 reportid 32: input=14, output=14, feature=0
uhid3 at uhidev2 reportid 33: input=31, output=31, feature=0
uvideo0 at uhub0 port 7 configuration 1 interface 0 "Generic HP HD Camera" rev 
2.00/0.16 addr 5
video0 at uvideo0
uvideo1 at uhub0 port 7 configuration 1 interface 2 "Generic HP HD Camera" rev 
2.00/0.16 addr 5
video1 at uvideo1
ugen2 at uhub0 port 8 "Validity Sensors product 0x0092" rev 2.00/1.64 addr 6
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
sd1 at scsibus3 targ 1 lun 0: <OPENBSD, SR CRYPTO, 006>
sd1: 409599MB, 512 bytes/sector, 838860272 sectors
root on sd1a (06ac9358f268d3c0.a) swap on sd1b dump on sd1b
drm:pid0:intel_ddi_sanitize_encoder_pll_mapping *NOTICE* [drm] [ENCODER:102:DDI 
\M-J/PHY h] is disabled/in DSI mode with an ungated DDI clock, gate it
drm:pid0:intel_ddi_sanitize_encoder_pll_mapping *NOTICE* [drm] [ENCODER:116:DDI 
\M-J/PHY h] is disabled/in DSI mode with an ungated DDI clock, gate it
inteldrm0: 1920x1080, 32bpp
wsdisplay0 at inteldrm0 mux 1: console (std, vt100 emulation), using wskbd0
wskbd1: connecting to wsdisplay0
wskbd2: connecting to wsdisplay0
wsdisplay0: screen 1-5 added (std, vt100 emulation)
iwm0: hw rev 0x230, fw ver 36.ca7b901d.0, address 98:3b:8f:xx:xx:xx

Reply via email to