>From l...@md5collisions.eu Thu May 26 19:51:47 2022
Date: Thu, 26 May 2022 19:51:47 +0200
From: Stephan Mending <l...@md5collisions.eu>
To: misc@openbsd.org
Subject: Re: Cron running at 99% CPU for seemingly no reason
Message-ID: <yo++mzjsnihga...@hellisempty.reich.home.arpa>
Mail-Followup-To: Stephan Mending <l...@md5collisions.eu>, misc@openbsd.org
References: <yodqqaodgsfq0...@hellisempty.reich.home.arpa>
 <yodvfdtukxkx1...@diehard.n-r-g.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <yodvfdtukxkx1...@diehard.n-r-g.com>
X-Operating-System: not-disclosed

On Sun, May 15, 2022 at 12:25:24PM +0200, Claudio Jeker wrote:
> On Sun, May 15, 2022 at 12:06:33PM +0200, Stephan Mending wrote:
> > Hi *, 
> > I've got a system running -current that keeps crashing on me every couple 
> > of days. 
> > Output of ddb: 
> > 
> > Connected to /dev/cuaU0 (speed 115200)
> > 
> > ddb{0}> show panic
> > the kernel did not panic
> > ddb{0}> show uvm
> > Current UVM status:
> >   pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
> >   482451 VM pages: 43158 active, 132795 inactive, 35 wired, 192336 free (24054 zero)
> >   min  10% (25) anon, 10% (25) vnode, 5% (12) vtext
> >   freemin=16081, free-target=21441, inactive-target=0, wired-max=160817
> >   faults=2487210, traps=2404140, intrs=211883, ctxswitch=1960560 fpuswitch=0
> >   softint=3499069, syscalls=2015497, kmapent=9
> >   fault counts:
> >     noram=0, noanon=0, noamap=0, pgwait=0, pgrele=0
> >     ok relocks(total)=192470(193514), anget(retries)=603205(0), amapcopy=177151
> >     neighbor anon/obj pg=82033/639788, gets(lock/unlock)=415897/193548
> >     cases: anon=570367, anoncow=32838, obj=347149, prcopy=67670, przero=1469152
> >   daemon and swap counts:
> >     woke=0, revs=0, scans=0, obscans=0, anscans=0
> >     busy=0, freed=0, reactivate=0, deactivate=0
> >     pageouts=0, pending=0, nswget=0
> >     nswapdev=1
> >     swpages=526020, swpginuse=0, swpgonly=0 paging=0
> >   kernel pointers:
> >     objs(kern)=0xffffffff8238a038
> > ddb{0}> show trace
> > No such command
> > ddb{0}> trace
> > icmp_mtudisc_timeout(fffffd807a50b070,0) at icmp_mtudisc_timeout+0x77
> > rt_timer_timer(ffffffff8235d668) at rt_timer_timer+0x1cc
> > softclock_thread(ffff8000fffff260) at softclock_thread+0x13b
> > end trace frame: 0x0, count: -3
> > ddb{0}> 
> > 
> 
> This looks like some bad memory access. This is a fault and not really a
> panic, which is why 'show panic' returns 'the kernel did not panic'.
> 
> The crash in icmp_mtudisc_timeout() points to some error in the rttimer
> code refactor I made. Please try a newer snapshot and include the dmesg.
> 
> If it happens again a call to db_show_rtentry in ddb may help better
> understand what is going on:
> 
> call db_show_rtentry(fffffd807a50b070, 0, 0)
> Where the first argument is derived from the first argument of
> icmp_mtudisc_timeout from the trace.
> 
> -- 
> :wq Claudio

@Claudio

The error seems to persist: the crash shown below occurred a couple of days 
ago. (This time I've got the full dmesg for you.)
I tried your "call db_show_rtentry<...>", but ddb answered "Symbol not found".

Just reporting right here. I'll try and run a kernel with symbols and wait for 
the next crash. 

'Til then. 

Best regards, 
Stephan


dmesg:
OpenBSD 7.1-current (GENERIC.MP) #525: Sat May 14 23:04:32 MDT 2022
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 2046992384 (1952MB)
avail mem = 1967644672 (1876MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.8 @ 0x7bef9000 (81 entries)
bios0: vendor American Megatrends Inc. version "5.011" date 10/10/2016
bios0: INTEL Corporation CRESCENTBAY
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP APIC FPDT FIDT MCFG HPET SSDT UEFI SSDT ASF! SSDT SSDT
acpi0: wakeup devices PEG0(S4) PEGP(S4) PEG1(S4) PEGP(S4) PEG2(S4) PEGP(S4) RP01(S4) PXSX(S4) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) RP05(S4) PXSX(S4) [...]
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Pentium(R) 3558U @ 1.70GHz, 1696.36 MHz, 06-45-01
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,XSAVE,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 2 (application processor)
cpu1: Intel(R) Pentium(R) 3558U @ 1.70GHz, 1696.08 MHz, 06-45-01
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,MOVBE,POPCNT,DEADLINE,XSAVE,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,PERF,ITSC,FSGSBASE,TSC_ADJUST,ERMS,INVPCID,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 0, core 1, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec00000, version 20, 40 pins
acpimadt0: bogus nmi for apid 0
acpimadt0: bogus nmi for apid 2
acpimcfg0 at acpi0
acpimcfg0: addr 0xf8000000, bus 0-63
acpihpet0 at acpi0: 14318179 Hz
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG0)
acpiprt2 at acpi0: bus -1 (PEG1)
acpiprt3 at acpi0: bus -1 (PEG2)
acpiprt4 at acpi0: bus -1 (RP01)
acpiprt5 at acpi0: bus -1 (RP02)
acpiprt6 at acpi0: bus 1 (RP03)
acpiprt7 at acpi0: bus 2 (RP04)
acpiprt8 at acpi0: bus -1 (RP05)
acpiprt9 at acpi0: bus -1 (RP06)
acpiprt10 at acpi0: bus -1 (RP07)
acpiprt11 at acpi0: bus -1 (RP08)
acpiec0 at acpi0: not present
acpipci0 at acpi0 PCI0: 0x00000000 0x00000011 0x00000001
com0 at acpi0 UAR1 addr 0x3f8/0x8 irq 4: ns16550a, 16 byte fifo
com0: console
com1 at acpi0 UAR2 addr 0x2f8/0x8 irq 3: ns16550a, 16 byte fifo
"NTN0530" at acpi0 not configured
acpicmos0 at acpi0
acpibtn0 at acpi0: SLPB
acpibtn1 at acpi0: PWRB
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
"PNP0C0B" at acpi0 not configured
acpicpu0 at acpi0: C1(@1 halt!)
acpicpu1 at acpi0: C1(@1 halt!)
acpipwrres0 at acpi0: PG00, resource for PEG0
acpipwrres1 at acpi0: PG01, resource for PEG1
acpipwrres2 at acpi0: PG02, resource for PEG2
acpipwrres3 at acpi0: FN00, resource for FAN0
acpipwrres4 at acpi0: FN01, resource for FAN1
acpipwrres5 at acpi0: FN02, resource for FAN2
acpipwrres6 at acpi0: FN03, resource for FAN3
acpipwrres7 at acpi0: FN04, resource for FAN4
acpitz0 at acpi0: critical temperature is 105 degC
acpitz1 at acpi0: critical temperature is 105 degC
acpivideo0 at acpi0: GFX0
acpivout0 at acpivideo0: DD1F
cpu0: using VERW MDS workaround (except on vmm entry)
pci0 at mainbus0 bus 0
pchb0 at pci0 dev 0 function 0 "Intel Core 4G Host" rev 0x0b
inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics" rev 0x0b
drm0 at inteldrm0
inteldrm0: msi, HASWELL, gen 7
azalia0 at pci0 dev 3 function 0 "Intel Core 4G HD Audio" rev 0x0b: msi
azalia0: No codecs found
xhci0 at pci0 dev 20 function 0 "Intel 8 Series xHCI" rev 0x04: msi, xHCI 1.0
usb0 at xhci0: USB revision 3.0
uhub0 at usb0 configuration 1 interface 0 "Intel xHCI root hub" rev 3.00/1.00 addr 1
"Intel 8 Series MEI" rev 0x04 at pci0 dev 22 function 0 not configured
azalia1 at pci0 dev 27 function 0 "Intel 8 Series HD Audio" rev 0x04: msi
azalia1: codecs: Realtek ALC662
audio0 at azalia1
ppb0 at pci0 dev 28 function 0 "Intel 8 Series PCIE" rev 0xe4: msi
pci1 at ppb0 bus 1
re0 at pci1 dev 0 function 0 "Realtek 8168" rev 0x07: RTL8168E/8111E-VL (0x2c80), msi, address 00:97:18:05:00:ed
rgephy0 at re0 phy 7: RTL8169S/8110S/8211 PHY, rev. 5
ppb1 at pci0 dev 28 function 3 "Intel 8 Series PCIE" rev 0xe4: msi
pci2 at ppb1 bus 2
re1 at pci2 dev 0 function 0 "Realtek 8168" rev 0x07: RTL8168E/8111E-VL (0x2c80), msi, address 00:97:18:05:00:ee
rgephy1 at re1 phy 7: RTL8169S/8110S/8211 PHY, rev. 5
ehci0 at pci0 dev 29 function 0 "Intel 8 Series USB" rev 0x04: apic 2 int 23
usb1 at ehci0: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "Intel EHCI root hub" rev 2.00/1.00 addr 1
pcib0 at pci0 dev 31 function 0 "Intel 8 Series LPC" rev 0x04
ahci0 at pci0 dev 31 function 2 "Intel 8 Series AHCI" rev 0x04: msi, AHCI 1.3
ahci0: port 0: 6.0Gb/s
scsibus1 at ahci0: 32 targets
sd0 at scsibus1 targ 0 lun 0: <ATA, TS32GSSD370S, P122> t10.ATA_TS32GSSD370S_A91901F5E48434200240
sd0: 30533MB, 512 bytes/sector, 62533296 sectors, thin
ichiic0 at pci0 dev 31 function 3 "Intel 8 Series SMBus" rev 0x04: apic 2 int 18
iic0 at ichiic0
spdmem0 at iic0 addr 0x50: 2GB DDR3 SDRAM PC3-12800 SO-DIMM
isa0 at pcib0
isadma0 at isa0
vga0 at isa0 port 0x3b0/48 iomem 0xa0000/131072
wsdisplay at vga0 not configured
pcppi0 at isa0 port 0x61
spkr0 at pcppi0
wbsio0 at isa0 port 0x2e/2: NCT5104D rev 0x53
wbsio0 port 0xa10/2 not configured
vmm0 at mainbus0: VMX/EPT
run0 at uhub0 port 4 configuration 1 interface 0 "Ralink 802.11 n WLAN" rev 2.00/1.01 addr 2
run0: MAC/BBP RT5592 (rev 0x0222), RF RT5592 (MIMO 2T2R), address d8:61:62:37:56:c8
uhub2 at uhub1 port 1 configuration 1 interface 0 "Intel Rate Matching Hub" rev 2.00/0.04 addr 2
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
root on sd0a (7ec83d15890e2a71.a) swap on sd0b dump on sd0b
inteldrm0: 1024x768, 32bpp
wsdisplay0 at inteldrm0 mux 1
wsdisplay0: screen 0-5 added (std, vt100 emulation)
kernel: protection fault trap, code=0
kernel: protection fault trap, code=0
Stopped at      icmp_mtudisc_timeout+0x77:      movq    0(%rax),%rcx
ddb{0}> ddb{0}> ddb{0}> icmp_mtudisc_timeout(fffffd807a4f2620,0) at icmp_mtudisc_timeout+0x77
rt_timer_timer(ffffffff82370aa8) at rt_timer_timer+0x1cc
softclock_thread(ffff8000ffffe540) at softclock_thread+0x13b
end trace frame: 0x0, count: -3


call db_show_rtentry:

Stopped at      icmp_mtudisc_timeout+0x77:      movq    0(%rax),%rcx
ddb{0}> ddb{0}> ddb{0}> icmp_mtudisc_timeout(fffffd807a4f2620,0) at icmp_mtudisc_timeout+0x77
rt_timer_timer(ffffffff82370aa8) at rt_timer_timer+0x1cc
softclock_thread(ffff8000ffffe540) at softclock_thread+0x13b
end trace frame: 0x0, count: -3
ddb{0}> ddb{0}>
ddb{0}> call db_show_rtentry(fffffd807a4f2620, 0, 0)
Symbol not found
ddb{0}> 
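
For reference, the address handed to db_show_rtentry above is simply the first
argument of the icmp_mtudisc_timeout frame in the trace. A minimal sketch of
that derivation in Python, assuming the trace has been captured from the
serial console as text. The helper name ddb_call_for_frame is hypothetical and
not part of any OpenBSD tool:

```python
import re


def ddb_call_for_frame(trace, func="icmp_mtudisc_timeout"):
    """Build the 'call db_show_rtentry(...)' line from a ddb trace.

    Looks for the first frame of `func` and reuses its first argument
    (printed by ddb as bare hex) as the rtentry address. Returns None
    if the trace contains no such frame.
    """
    # A frame looks like: icmp_mtudisc_timeout(fffffd807a4f2620,0) at ...
    # Only the "name(first_arg" part is needed, so wrapped tails are harmless.
    pattern = re.compile(re.escape(func) + r"\(([0-9a-f]+)[,)]")
    match = pattern.search(trace)
    if match is None:
        return None
    return "call db_show_rtentry({}, 0, 0)".format(match.group(1))


# Trace lines taken from the ddb session above.
trace = """\
icmp_mtudisc_timeout(fffffd807a4f2620,0) at icmp_mtudisc_timeout+0x77
rt_timer_timer(ffffffff82370aa8) at rt_timer_timer+0x1cc
softclock_thread(ffff8000ffffe540) at softclock_thread+0x13b
"""

print(ddb_call_for_frame(trace))
```

This only assembles the command text. Whether ddb can resolve the
db_show_rtentry symbol depends on the running kernel, as the "Symbol not
found" reply above shows.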


