From l...@md5collisions.eu Thu May 26 19:51:47 2022
Date: Thu, 26 May 2022 19:51:47 +0200
From: Stephan Mending <l...@md5collisions.eu>
To: misc@openbsd.org
Subject: Re: Cron running at 99% CPU for seemingly no reason
Message-ID: <yo++mzjsnihga...@hellisempty.reich.home.arpa>
Mail-Followup-To: Stephan Mending <l...@md5collisions.eu>, misc@openbsd.org
References: <yodqqaodgsfq0...@hellisempty.reich.home.arpa> <yodvfdtukxkx1...@diehard.n-r-g.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <yodvfdtukxkx1...@diehard.n-r-g.com>
X-Operating-System: not-disclosed
Status: RO
Content-Length: 10166
Lines: 244

On Sun, May 15, 2022 at 12:25:24PM +0200, Claudio Jeker wrote:
> On Sun, May 15, 2022 at 12:06:33PM +0200, Stephan Mending wrote:
> > Hi *,
> > I've got a system running -current that keeps crashing on me every couple
> > of days.
> > Output of ddb:
> >
> > Connected to /dev/cuaU0 (speed 115200)
> >
> > ddb{0}> show panic
> > the kernel did not panic
> > ddb{0}> show uvm
> > Current UVM status:
> > pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
> > 482451 VM pages: 43158 active, 132795 inactive, 35 wired, 192336 free (24054 zero)
> > min 10% (25) anon, 10% (25) vnode, 5% (12) vtext
> > freemin=16081, free-target=21441, inactive-target=0, wired-max=160817
> > faults=2487210, traps=2404140, intrs=211883, ctxswitch=1960560 fpuswitch=0
> > softint=3499069, syscalls=2015497, kmapent=9
> > fault counts:
> > noram=0, noanon=0, noamap=0, pgwait=0, pgrele=0
> > ok relocks(total)=192470(193514), anget(retries)=603205(0), amapcopy=177151
> > neighbor anon/obj pg=82033/639788, gets(lock/unlock)=415897/193548
> > cases: anon=570367, anoncow=32838, obj=347149, prcopy=67670, przero=1469152
> > daemon and swap counts:
> > woke=0, revs=0, scans=0, obscans=0, anscans=0
> > busy=0, freed=0, reactivate=0, deactivate=0
> > pageouts=0, pending=0, nswget=0
> > nswapdev=1
> > swpages=526020, swpginuse=0, swpgonly=0 paging=0
> > kernel pointers:
> > objs(kern)=0xffffffff8238a038
> > ddb{0}> show trace
> > No such command
> > ddb{0}> trace
> > icmp_mtudisc_timeout(fffffd807a50b070,0) at icmp_mtudisc_timeout+0x77
> > rt_timer_timer(ffffffff8235d668) at rt_timer_timer+0x1cc
> > softclock_thread(ffff8000fffff260) at softclock_thread+0x13b
> > end trace frame: 0x0, count: -3
> > ddb{0}>
> >
>
> This looks like some bad memory access. This is a fault and not really a
> panic this is why 'show panic' returns 'the kernel did not panic'.
>
> The crash in icmp_mtudisc_timeout() points to some error in the rttimer
> code refactor I made.
> Please try a newer snapshot and include the dmesg.
>
> If it happens again a call to db_show_rtentry in ddb may help better
> understand what is going on:
>
> call db_show_rtentry(fffffd807a50b070, 0, 0)
> Where the first argument is derived from the first argument of
> icmp_mtudisc_timeout from the trace.
>
> --
> :wq Claudio

@Claudio
The error seems to persist: the attached crash occurred a couple of days ago.
(This time I've got the full dmesg for you.)
I tried your "call db_show_rtentry<...>". That somehow returned an error;
just reporting it here.
I'll try to run a kernel with symbols and wait for the next crash.

'Til then.

Best regards,
Stephan

dmesg:
OpenBSD 7.1-current (GENERIC.MP) #525: Sat May 14 23:04:32 MDT 2022
    dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2046992384 (1952MB)
avail mem = 1967644672 (1876MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0x7bef9000 (81 entries)
bios0: vendor American Megatrends Inc. version "5.011" date 10/10/2016
bios0: INTEL Corporation CRESCENTBAY
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT MCFG HPET SSDT UEFI SSDT ASF! SSDT SSDT
acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PEGP(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Pentium(R) 3558U @ 1.70GHz, 1696.36 MHz, 06-45-01
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,XSAVE,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Pentium(R) 3558U @ 1.70GHz, 1696.08 MHz, 06-45-01
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,XSAVE,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 40 pins
acpimadt0: bogus nmi for apid 0
acpimadt0: bogus nmi for apid 2
acpimcfg0 at acpi0
acpimcfg0: addr 0xf8000000, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG0)
acpiprt2 at acpi0: bus -1 (PEG1)
acpiprt3 at acpi0: bus -1 (PEG2)
acpiprt4 at acpi0: bus -1 (RP01)
acpiprt5 at acpi0: bus -1 (RP02)
acpiprt6 at acpi0: bus 1 (RP03)
acpiprt7 at acpi0: bus 2 (RP04)
acpiprt8 at acpi0: bus -1 (RP05)
acpiprt9 at acpi0: bus -1 (RP06)
acpiprt10 at acpi0: bus -1 (RP07)
acpiprt11 at acpi0: bus -1 (RP08)
acpiec0 at acpi0: not present
acpipci0 at acpi0 PCI0: 0x00000000 0x00000011 0x00000001
com0 at acpi0 UAR1 addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at acpi0 UAR2 addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
"NTN0530" at acpi0 not configured
acpicmos0 at acpi0
acpibtn0 at acpi0: SLPB
acpibtn1 at acpi0: PWRB
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
acpipwrres0 at acpi0: PG00, resource for PEG0
acpipwrres1 at acpi0: PG01, resource for PEG1
acpipwrres2 at acpi0: PG02, resource for PEG2
acpipwrres3 at acpi0: FN00, resource for FAN0
acpipwrres4 at acpi0: FN01, resource for FAN1
acpipwrres5 at acpi0: FN02, resource for FAN2
acpipwrres6 at acpi0: FN03, resource for FAN3
acpipwrres7 at acpi0: FN04, resource for FAN4
acpitz0 at acpi0: critical temperature is 105 degC
acpitz1 at acpi0: critical temperature is 105 degC
acpivideo0 at acpi0: GFX0
acpivout0 at acpivideo0: DD1F
cpu0: using VERW MDS workaround (except on vmm entry)
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Core 4G Host" rev 0x0b
inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics" rev 0x0b
drm0 at inteldrm0
inteldrm0: msi, HASWELL, gen 7
azalia0 at pci0 dev 3 function 0 "Intel Core 4G HD Audio" rev 0x0b: msi
azalia0: No codecs found
xhci0 at pci0 dev 20 function 0 "Intel 8 Series xHCI" rev 0x04: msi, xHCI 1.0
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00 addr 1
"Intel 8 Series MEI" rev 0x04 at pci0 dev 22 function 0 not configured
azalia1 at pci0 dev 27 function 0 "Intel 8 Series HD Audio" rev 0x04: msi
azalia1: codecs: Realtek ALC662
audio0 at azalia1
ppb0 at pci0 dev 28 function 0 "Intel 8 Series PCIE" rev 0xe4: msi
pci1 at ppb0 bus 1
re0 at pci1 dev 0 function 0 "Realtek 8168" rev 0x07: RTL8168E/8111E-VL (0x2c80), msi, address 00:97:18:05:00:ed
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 PHY, rev. 5
ppb1 at pci0 dev 28 function 3 "Intel 8 Series PCIE" rev 0xe4: msi
pci2 at ppb1 bus 2
re1 at pci2 dev 0 function 0 "Realtek 8168" rev 0x07: RTL8168E/8111E-VL (0x2c80), msi, address 00:97:18:05:00:ee
rgephy1 at re1 phy 7: RTL8169S/8110S/8211 PHY, rev. 5
ehci0 at pci0 dev 29 function 0 "Intel 8 Series USB" rev 0x04: apic 2 int 23
usb1 at ehci0: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
pcib0 at pci0 dev 31 function 0 "Intel 8 Series LPC" rev 0x04
ahci0 at pci0 dev 31 function 2 "Intel 8 Series AHCI" rev 0x04: msi, AHCI 1.3
ahci0: port 0: 6.0Gb/s
scsibus1 at ahci0: 32 targets
sd0 at scsibus1 targ 0 lun 0: <ATA, TS32GSSD370S, P122> t10.ATA_TS32GSSD370S_A91901F5E48434200240
sd0: 30533MB, 512 bytes/sector, 62533296 sectors, thin
ichiic0 at pci0 dev 31 function 3 "Intel 8 Series SMBus" rev 0x04: apic 2 int 18
iic0 at ichiic0
spdmem0 at iic0 addr 0x50: 2GB DDR3 SDRAM PC3-12800 SO-DIMM
isa0 at pcib0
isadma0 at isa0
vga0 at isa0 port 0x3b0/48 iomem 0xa0000/131072
wsdisplay at vga0 not configured
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
wbsio0 at isa0 port 0x2e/2: NCT5104D rev 0x53
wbsio0 port 0xa10/2 not configured
vmm0 at mainbus0: VMX/EPT
run0 at uhub0 port 4 configuration 1 interface 0 "Ralink 802.11 n WLAN" rev 2.00/1.01 addr 2
run0: MAC/BBP RT5592 (rev 0x0222), RF RT5592 (MIMO 2T2R), address d8:61:62:37:56:c8
uhub2 at uhub1 port 1 configuration 1 interface 0 "Intel Rate Matching Hub" rev 2.00/0.04 addr 2
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (7ec83d15890e2a71.a) swap on sd0b dump on sd0b
inteldrm0: 1024x768, 32bpp
wsdisplay0 at inteldrm0 mux 1
wsdisplay0: screen 0-5 added (std, vt100 emulation)
kernel: protection fault trap, code=0
kernel: protection fault trap, code=0
Stopped at icmp_mtudisc_timeout+0x77: movq 0(%rax),%rcx
ddb{0}>
ddb{0}>
ddb{0}> icmp_mtudisc_timeout(fffffd807a4f2620,0) at icmp_mtudisc_timeout+0x77
rt_timer_timer(ffffffff82370aa8) at rt_timer_timer+0x1cc
softclock_thread(ffff8000ffffe540) at softclock_thread+0x13b
end trace frame: 0x0, count: -3
call db_show_rtentr: Stopped at icmp_mtudisc_timeout+0x77: movq 0(%rax),%rcx
ddb{0}>
ddb{0}>
ddb{0}> icmp_mtudisc_timeout(fffffd807a4f2620,0) at icmp_mtudisc_timeout+0x77
rt_timer_timer(ffffffff82370aa8) at rt_timer_timer+0x1cc
softclock_thread(ffff8000ffffe540) at softclock_thread+0x13b
end trace frame: 0x0, count: -3
ddb{0}>
ddb{0}>
ddb{0}> call db_show_rtentry(fffffd807a4f2620, 0, 0)
Symbol not found
ddb{0}>
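
The argument for the db_show_rtentry call comes from the first argument of the icmp_mtudisc_timeout frame in the trace. Below is a minimal sketch of pulling it out of a saved console log, assuming the trace lines keep the "name(args) at name+offset" form shown above. The helper name build_rtentry_call and the regular expression are illustrative only and not part of ddb. It only builds the command string; the call itself still needs a kernel in which ddb can resolve the symbol.

```python
import re

# Matches a ddb trace frame such as:
#   icmp_mtudisc_timeout(fffffd807a4f2620,0) at icmp_mtudisc_timeout+0x77
FRAME_RE = re.compile(r"^\s*(\w+)\(([^)]*)\) at ")


def build_rtentry_call(trace_text, func="icmp_mtudisc_timeout"):
    """Return the ddb command for the first frame of `func`, or None.

    Hypothetical helper: it takes the first argument of the matching
    frame and places it into the call suggested in the quoted reply.
    """
    for line in trace_text.splitlines():
        m = FRAME_RE.match(line)
        if m and m.group(1) == func:
            args = [a.strip() for a in m.group(2).split(",")]
            if args and args[0]:
                return "call db_show_rtentry(%s, 0, 0)" % args[0]
    return None


trace = """\
icmp_mtudisc_timeout(fffffd807a4f2620,0) at icmp_mtudisc_timeout+0x77
rt_timer_timer(ffffffff82370aa8) at rt_timer_timer+0x1cc
softclock_thread(ffff8000ffffe540) at softclock_thread+0x13b
"""
print(build_rtentry_call(trace))
```

Run against the trace from this crash, it prints the same command that was typed at the ddb prompt above.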