On Sat, 28 Mar 2020 10:18:22 +0100
Martin Pieuchot wrote:

> On 27/03/20(Fri) 22:43, Charlene Wendling wrote:
> > Hi,
> >
> > >Environment:
> > System : OpenBSD 6.6
> > Details : OpenBSD 6.6-current (GENERIC.MP) #676: Fri
> > Feb 14 02:26:37 MST 2020
> > [email protected]:/usr/src/sys/arch/macppc/compile/GENERIC.MP
> >
> > Architecture: OpenBSD.macppc
> > Machine : macppc
> > >Description:
> >
> > Note that it's still reproducible with more recent snapshots.
> >
> > Running GENERIC.MP causes kernel panics if it's under high
> > load. Running GENERIC causes no such issues on the two dual
> > core machines belonging to the macppc ports building cluster.
> >
> > It's happening since early December 2019, but is occurring even
> > more since the last few weeks, at a rate becoming harmful, hence my
> > report.
> >
> > >How-To-Repeat:
> >
> > Start a bulk with dpb(1) with GENERIC.MP, it should panic anytime
> > before 4 days. If you're lucky it will crash straight while listing
> > ports.
>
> Thanks for the report. If you have the patience to continue gather
> such crash please do send the same report every time. It is
> interesting to see that CPU0 is in uvm_swap_io() here.

Given the current context, it's more about time than patience :)
I'll say it again, publicly this time: it badly messes up filesystems,
and I had to rebuild chroots several times, on slow wd(4) disks.

> It would be nice to know if there's a common pattern between what
> seems to be a memory corruption on CPU1 and what CPU0 is doing at
> that moment.
>
> This might be a MD or MI bug, so the more information you get us the
> better :o)

Here is a new one, with:

OpenBSD 6.6-current (GENERIC.MP) #692: Sat Mar 21 10:19:57 MDT 2020
    [email protected]:/usr/src/sys/arch/macppc/compile/GENERIC.MP

and xlights(4) disabled, as hinted by deraadt@, to try to avoid
complete machine freezes.

ddb{1}> machine ddbcpu 0
Stopped at      db_enter+0x10:  lwz r0,36(r1)
db_enter() at db_enter+0xc
openpic_ipi_ddb() at openpic_ipi_ddb+0xc
openpic_ext_intr() at openpic_ext_intr+0x254
extint_call() at extint_call
--- interrupt ---
openpic_splx(100) at openpic_splx+0xa4
splx(6c16e000) at splx+0x1c
tsleep(6360,920000,e62f3c30,0) at tsleep+0x98
biowait(1) at biowait+0x5c
uvm_swap_io(ffffffff,0,0,20000000) at uvm_swap_io+0x5f4
uvm_swap_get(4073ad0,4073ad0,e62f3d40) at uvm_swap_get+0x58
uvmfault_anonget(1,1,e62f3d90) at uvmfault_anonget+0x1ac
uvm_fault(0,40,e62f3dc0,2000d034) at uvm_fault+0x554
trap(d897b0) at trap+0x8d4
trapagain() at trapagain+0x4
--- trap (type 0x10300) ---
End of kernel: 0xfffece40
end trace frame: 0xfffece40, count: 1

ddb{0}> trace
db_enter() at db_enter+0xc
openpic_ipi_ddb() at openpic_ipi_ddb+0xc
openpic_ext_intr() at openpic_ext_intr+0x254
extint_call() at extint_call
--- interrupt ---
openpic_splx(100) at openpic_splx+0xa4
splx(6c16e000) at splx+0x1c
tsleep(6360,920000,e62f3c30,0) at tsleep+0x98
biowait(1) at biowait+0x5c
uvm_swap_io(ffffffff,0,0,20000000) at uvm_swap_io+0x5f4
uvm_swap_get(4073ad0,4073ad0,e62f3d40) at uvm_swap_get+0x58
uvmfault_anonget(1,1,e62f3d90) at uvmfault_anonget+0x1ac
uvm_fault(0,40,e62f3dc0,2000d034) at uvm_fault+0x554
trap(d897b0) at trap+0x8d4
trapagain() at trapagain+0x4
--- trap (type 0x10300) ---
End of kernel: 0xfffece40
end trace frame: 0xfffece40, count: -14

ddb{0}> machine ddbcpu 1
Stopped at      db_enter+0x10:  lwz r0,36(r1)
db_enter() at db_enter+0xc
panic(0) at panic+0xe0
rw_assert_rdlock(e61f0e88) at rw_assert_rdlock+0x60
rw_exit_read(9741b8) at rw_exit_read+0x14
if_input_process(7bba80,e61f0f28) at if_input_process+0x68
ifiq_process(ffffffff) at ifiq_process+0x78
taskq_thread(e0007040) at taskq_thread+0x58
fork_trampoline() at fork_trampoline+0x14
end trace frame: 0x0, count: 7

ddb{1}> trace
db_enter() at db_enter+0xc
panic(0) at panic+0xe0
rw_assert_rdlock(e61f0e88) at rw_assert_rdlock+0x60
rw_exit_read(9741b8) at rw_exit_read+0x14
if_input_process(7bba80,e61f0f28) at if_input_process+0x68
ifiq_process(ffffffff) at ifiq_process+0x78
taskq_thread(e0007040) at taskq_thread+0x58
fork_trampoline() at fork_trampoline+0x14
end trace frame: 0x0, count: -8

ddb{1}> show uvm
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
  505883 VM pages: 31785 active, 35049 inactive, 0 wired, 320740 free (38475 zero)
  min  10% (25) anon, 10% (25) vnode, 5% (12) vtext
  freemin=16862, free-target=22482, inactive-target=118759, wired-max=168627
  faults=545923323, traps=0, intrs=9812526, ctxswitch=27849670 fpuswitch=3680467
  softint=26817811, syscalls=529071026, kmapent=11
  fault counts:
    noram=0, noanon=0, noamap=0, pgwait=0, pgrele=0
    ok relocks(total)=1038445(1038451), anget(retries)=55843728(716327), amapcopy=79091608
    neighbor anon/obj pg=7380110/142389090, gets(lock/unlock)=44677040/322124
    cases: anon=50235600, anoncow=5608127, obj=36021180, prcopy=8655854, przero=445402559
  daemon and swap counts:
    woke=282, revs=277, scans=12460402, obscans=132041, anscans=12328361
    busy=0, freed=1189539, reactivate=0, deactivate=3509379
    pageouts=2805442, pending=66149, nswget=716346
    nswapdev=1
    swpages=327679, swpginuse=3451, swpgonly=2584 paging=0
  kernel pointers:
    objs(kern)=0x93eabc

ddb{1}> show bcstats
Current Buffer Cache status:
numbufs 39909 busymapped 2, delwri 37
kvaslots 358 avail kva slots 356
bufpages 98214, dmapages 98214, dirtypages 148
pendingreads 153, pendingwrites 61
highflips 0, highflops 0, dmaflips 0

ddb{1}> ps
   PID     TID   PPID    UID   S       FLAGS  WAIT          COMMAND
 38573  500579  40607     55   2         0x2                moc
 91399    2819  40607     55   2         0x2                moc
 40607  359914  10617     55   3        0x82  kqread        cmake
 40607   77714  10617     55   3   0x4000082  fsleep        cmake
 40607  414276  10617     55   3   0x4000082  fsleep        cmake
 10617  286603  84946     55   3    0x10008a  pause         sh
 52840  279944  86410     55   3    0x100002  getblk        cksum
 86410  433884  48514     55   3    0x10008a  pause         sh
 48514  197542  62095     55   3    0x10008a  pause         make
 62095  497515  62270     55   3    0x10008a  pause         make
 62270  237411  14981     55   3    0x10008a  pause         sh
 14981   47014  34220     55   3    0x10008a  pause         make
 34220  180947  88741     55   3    0x10008a  pause         sh
 88741  495610  77102     55   3    0x10008a  pause         make
 77102  479278  93162      0   3    0x10008a  pause         ksh
 84946  266688  58589     55   3        0x8a  poll          ninja
 58589  128453  92139     55   3    0x10008a  pause         make
 92139  179407  37215     55   3    0x10008a  pause         make
 37215  120056  60717     55   3    0x10008a  pause         sh
 60717  501110  89399     55   3    0x10008a  pause         make
 89399  323538   9207     55   3    0x10008a  pause         sh
  9207  440167  88340     55   3    0x10008a  pause         make
 88340  416509  93162      0   3    0x10008a  pause         ksh
 93162    2053  72060      0   3        0x92  select        sshd
 11943  387300      1      0   7    0x100003                getty
 95662  201985      1      0   3    0x100098  poll          cron
 32477  238302      1    110   3    0x100090  poll          sndiod
 59661  427557      1     99   3    0x100090  poll          sndiod
 20437  361379   2054     95   3    0x100092  kqread        smtpd
 49929  106186   2054    103   3    0x100092  kqread        smtpd
 13060  507050   2054     95   3    0x100092  kqread        smtpd
 43860  370751   2054     95   3    0x100092  kqread        smtpd
 22026   42648   2054     95   3    0x100092  kqread        smtpd
 11464  274219   2054     95   3    0x100092  kqread        smtpd
  2054  160405      1      0   3    0x100080  kqread        smtpd
 72060  131364      1      0   3        0x80  select        sshd
 18146  130816      0      0   3     0x14280  nfsidl        nfsio
 90798  522596      0      0   2     0x14280                nfsio
 50429  129760      0      0   3     0x14280  nfsidl        nfsio
 38308  389895      0      0   3     0x14200  netio         nfsio
 34449  469395  55543     83   3    0x100092  poll          ntpd
 55543   64576  47380     83   3    0x100092  poll          ntpd
 47380  392784      1      0   3    0x100080  poll          ntpd
  4990   57005  84109     74   3    0x100092  bpf           pflogd
 84109  424693      1      0   3        0x80  netio         pflogd
 85512  132636   9303     73   3    0x100090  kqread        syslogd
  9303  480167      1      0   3    0x100082  netio         syslogd
 50245   27526   3278    115   3    0x100092  kqread        slaacd
 13737   30350   3278    115   3    0x100092  kqread        slaacd
  3278   81907      1      0   3    0x100080  kqread        slaacd
 53754  287148      0      0   3     0x14200  bored         smr
 70873  430137      0      0   3  0x40014200                idle1
 89304  201687      0      0   2     0x14200                zerothread
  2513  418103      0      0   3     0x14200  aiodoned      aiodoned
 75690  291346      0      0   3     0x14200  syncer        update
 24800  418803      0      0   3     0x14200  cleaner       cleaner
 10239  267090      0      0   3     0x14200  reaper        reaper
 93632   33802      0      0   3     0x14200  pgdaemon      pagedaemon
 80864  336862      0      0   3     0x14200  bored         crynlk
 15848  154968      0      0   3     0x14200  bored         crypto
   194  121071      0      0   3     0x14200  usbtsk        usbtask
 63751  148381      0      0   3     0x14200  usbatsk       usbatsk
*38044  165862      0      0   7     0x14200                softnet
 59266  452488      0      0   3     0x14200  bored         systqmp
 86479   74620      0      0   3     0x14200  bored         systq
 15364  523764      0      0   2  0x40014200                softclock
 85711  139071      0      0   3  0x40014200                idle0
     1  145103      0      0   3        0x82  wait          init
     0       0     -1      0   3     0x10200  scheduler     swapper

> dmesg

OpenBSD 6.6-current (GENERIC.MP) #692: Sat Mar 21 10:19:57 MDT 2020
    [email protected]:/usr/src/sys/arch/macppc/compile/GENERIC.MP
real mem = 2147483648 (2048MB)
avail mem = 2071994368 (1976MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root: model RackMac1,1
cpu0 at mainbus0: 7455 (Revision 0x201): 999 MHz: 256KB L2 cache, 2MB L3 cache
cpu1 at mainbus0: 7455 (Revision 0x201): 999 MHz: 256KB L2 cache, 2MB L3 cache
mem0 at mainbus0
spdmem0 at mem0: 512MB DDR SDRAM non-parity PC2100CL2.5
spdmem1 at mem0: 512MB DDR SDRAM non-parity PC2100CL2.5
spdmem2 at mem0: 512MB DDR SDRAM non-parity PC2700CL2.5
spdmem3 at mem0: 512MB DDR SDRAM non-parity PC2700CL2.5
memc0 at mainbus0: uni-n rev 0x24
kiic0 at memc0 offset 0xf8001000
iic0 at kiic0
lmtemp0 at iic0 addr 0xc9: ds1775, fails to respond
mpcpcibr0 at mainbus0 pci: uni-north
pci0 at mpcpcibr0 bus 0
mpcpcibr1 at mainbus0 pci: uni-north
pci1 at mpcpcibr1 bus 0
ppb0 at pci1 dev 13 function 0 "Intel 21154AE/BE" rev 0x00
pci2 at ppb0 bus 1
macobio0 at pci2 dev 7 function 0 "Apple Keylargo" rev 0x03
openpic0 at macobio0 offset 0x40000: version 0x4614 feature 3f0302 LE
macgpio0 at macobio0 offset 0x50
macgpio1 at macgpio0 offset 0x9: irq 47
"programmer-switch" at macgpio0 offset 0x11 not configured
"ringDetect-gpio" at macgpio0 offset 0x8 not configured
"keySwitch-gpio" at macgpio0 offset 0xc not configured
"systemMonitor-gpio" at macgpio0 offset 0x12 not configured
sysbutton0 at macgpio0 offset 0x15: irq 59
"indicatorLED-gpio" at macgpio0 offset 0x20 not configured
"virtual-sound" at macgpio0 not configured
"escc-legacy" at macobio0 offset 0x12000 not configured
zs0 at macobio0 offset 0x13000: irq 22,23
zstty0 at zs0 channel 0: console
zstty1 at zs0 channel 1
"i2s" at macobio0 offset 0x10000 not configured
"timer" at macobio0 offset 0x15000 not configured
adb0 at macobio0 offset 0x16000
apm0 at adb0: battery flags 0x9, 0% charged
piic0 at adb0
iic1 at piic0
"PCA9554" at iic1 addr 0xa0 not configured
"PCA9554" at iic1 addr 0xa1 not configured
"PCA9554" at iic1 addr 0xa2 not configured
"PCA9554" at iic1 addr 0xa3 not configured
"PCA9554" at iic1 addr 0xa4 not configured
kiic1 at macobio0 offset 0x18000
iic2 at kiic1
wdc0 at macobio0 offset 0x1f000 irq 19: DMA
atapiscsi0 at wdc0 channel 0 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <LG, CD-ROM CRN-8245B, AHT9> removable
cd0(wdc0:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 2
ohci0 at pci2 dev 8 function 0 "Apple USB" rev 0x00: irq 27, version 1.0
ohci1 at pci2 dev 9 function 0 "Apple USB" rev 0x00: irq 28, version 1.0
usb0 at ohci0: USB revision 1.0
uhub0 at usb0 configuration 1 interface 0 "Apple OHCI root hub" rev 1.00/1.00 addr 1
usb1 at ohci1: USB revision 1.0
uhub1 at usb1 configuration 1 interface 0 "Apple OHCI root hub" rev 1.00/1.00 addr 1
ppb1 at pci1 dev 17 function 0 "Intel 21154AE/BE" rev 0x00
pci3 at ppb1 bus 2
pciide0 at pci1 dev 21 function 0 "Promise PDC20268R" rev 0x02: DMA, channel 0 configured to native-PCI, channel 1 configured to native-PCI
pciide0: using irq 58 for native-PCI interrupt
wd0 at pciide0 channel 0 drive 0: <WDC WD800JB-00CRA1>
wd0: 16-sector PIO, LBA, 76319MB, 156301488 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
wd1 at pciide0 channel 1 drive 0: <WL500GPA1672>
wd1: 16-sector PIO, LBA48, 476940MB, 976773168 sectors
wd1(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 5
pciide1 at pci1 dev 27 function 0 "Promise PDC20268R" rev 0x02: DMA, channel 0 configured to native-PCI, channel 1 configured to native-PCI
pciide1: using irq 63 for native-PCI interrupt
mpcpcibr2 at mainbus0 pci: uni-north
pci4 at mpcpcibr2 bus 0
"Apple UniNorth Firewire" rev 0x01 at pci4 dev 14 function 0 not configured
gem0 at pci4 dev 15 function 0 "Apple Uni-N2 GMAC" rev 0x00: irq 41, address 00:03:ae:45:3a:9d
brgphy0 at gem0 phy 0: BCM5421 10/100/1000baseT PHY, rev. 1
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
bootpath: /pci@f2000000/AppleKiwi@15/ata-6@0/disk@0:/bsd
root on wd0a (c0bc446965ec65a8.a) swap on wd0b dump on wd0b
WARNING: / was not properly unmounted
