On Sat, 28 Mar 2020 10:18:22 +0100
Martin Pieuchot wrote:

> On 27/03/20(Fri) 22:43, Charlene Wendling wrote:
> > Hi,
> > 
> > >Environment:
> >         System      : OpenBSD 6.6
> >         Details     : OpenBSD 6.6-current (GENERIC.MP) #676: Fri Feb 14 02:26:37 MST 2020
> > [email protected]:/usr/src/sys/arch/macppc/compile/GENERIC.MP
> > 
> >         Architecture: OpenBSD.macppc
> >         Machine     : macppc
> > >Description:
> > 
> > Note that it's still reproducible with more recent snapshots.
> > 
> > Running GENERIC.MP causes kernel panics if it's under high
> > load. Running GENERIC causes no such issues on the two dual
> > core machines belonging to the macppc ports building cluster.
> > 
> > It has been happening since early December 2019, but has been
> > occurring more often over the last few weeks, at a rate that is
> > becoming harmful, hence my report.
> > 
> > >How-To-Repeat:
> > 
> > Start a bulk build with dpb(1) on GENERIC.MP; it should panic within
> > 4 days. If you're lucky it will crash right away, while listing
> > ports.
> 
> Thanks for the report.  If you have the patience to continue gathering
> such crashes, please do send the same report every time.  It is
> interesting to see that CPU0 is in uvm_swap_io() here.

Given the current context, it's more about time than patience :)
I'll say it again, publicly this time: it badly messes up filesystems,
and I had to rebuild chroots several times, on slow wd(4) disks.

> It would be nice to know if there's a common pattern between what
> seems to be a memory corruption on CPU1 and what CPU0 is doing at
> that moment.
>
> This might be an MD or MI bug, so the more information you get us the
> better :o)
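
To help spot a common pattern across several crash reports, the ddb
traces can be reduced to their call chains and compared. The following
is only a minimal sketch, assuming the traces are saved as plain text in
the "func(args) at func+0xoff" format shown in this thread; the helper
names parse_trace and common_frames are hypothetical and not part of any
OpenBSD tool.

```python
import re

# Matches ddb frame lines such as
#   "uvm_swap_io(ffffffff,0,0,20000000) at uvm_swap_io+0x5f4"
# and frames without an offset such as "extint_call() at extint_call".
# Lines like "--- interrupt ---" or "end trace frame: ..." do not match.
FRAME_RE = re.compile(r"^(\w+)\(([^)]*)\) at (\w+)(?:\+0x([0-9a-f]+))?$")


def parse_trace(text):
    """Return the function names of a ddb trace, innermost frame first."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.match(line.strip())
        if m:
            frames.append(m.group(3))
    return frames


def common_frames(trace_a, trace_b):
    """Return functions present in both traces, in trace_a's order."""
    seen = set(trace_b)
    return [f for f in trace_a if f in seen]


if __name__ == "__main__":
    # Hypothetical usage: a few frames copied from a saved ddb session.
    cpu1 = (
        "db_enter() at db_enter+0xc\n"
        "panic(0) at panic+0xe0\n"
        "rw_assert_rdlock(e61f0e88) at rw_assert_rdlock+0x60\n"
    )
    print(parse_trace(cpu1))
```

Running the same comparison over the CPU0 and CPU1 traces of each new
report would show which functions keep appearing together, without
claiming anything about the root cause.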

Here is a new one, with:

OpenBSD 6.6-current (GENERIC.MP) #692: Sat Mar 21 10:19:57 MDT 2020             
   [email protected]:/usr/src/sys/arch/macppc/compile/GENERIC.MP

and xlights(4) disabled, as hinted by deraadt@ to try to avoid
complete machine freezes.

ddb{1}> machine ddbcpu 0
Stopped at      db_enter+0x10:  lwz r0,36(r1)
db_enter() at db_enter+0xc
openpic_ipi_ddb() at openpic_ipi_ddb+0xc
openpic_ext_intr() at openpic_ext_intr+0x254
extint_call() at extint_call
--- interrupt ---
openpic_splx(100) at openpic_splx+0xa4
splx(6c16e000) at splx+0x1c
tsleep(6360,920000,e62f3c30,0) at tsleep+0x98
biowait(1) at biowait+0x5c
uvm_swap_io(ffffffff,0,0,20000000) at uvm_swap_io+0x5f4
uvm_swap_get(4073ad0,4073ad0,e62f3d40) at uvm_swap_get+0x58
uvmfault_anonget(1,1,e62f3d90) at uvmfault_anonget+0x1ac
uvm_fault(0,40,e62f3dc0,2000d034) at uvm_fault+0x554
trap(d897b0) at trap+0x8d4
trapagain() at trapagain+0x4
--- trap (type 0x10300) ---
End of kernel: 0xfffece40
end trace frame: 0xfffece40, count: 1

ddb{0}> trace
db_enter() at db_enter+0xc
openpic_ipi_ddb() at openpic_ipi_ddb+0xc
openpic_ext_intr() at openpic_ext_intr+0x254
extint_call() at extint_call
--- interrupt ---
openpic_splx(100) at openpic_splx+0xa4
splx(6c16e000) at splx+0x1c
tsleep(6360,920000,e62f3c30,0) at tsleep+0x98
biowait(1) at biowait+0x5c
uvm_swap_io(ffffffff,0,0,20000000) at uvm_swap_io+0x5f4
uvm_swap_get(4073ad0,4073ad0,e62f3d40) at uvm_swap_get+0x58
uvmfault_anonget(1,1,e62f3d90) at uvmfault_anonget+0x1ac
uvm_fault(0,40,e62f3dc0,2000d034) at uvm_fault+0x554
trap(d897b0) at trap+0x8d4
trapagain() at trapagain+0x4
--- trap (type 0x10300) ---
End of kernel: 0xfffece40
end trace frame: 0xfffece40, count: -14

ddb{0}> machine ddbcpu 1
Stopped at      db_enter+0x10:  lwz r0,36(r1)
db_enter() at db_enter+0xc
panic(0) at panic+0xe0
rw_assert_rdlock(e61f0e88) at rw_assert_rdlock+0x60
rw_exit_read(9741b8) at rw_exit_read+0x14
if_input_process(7bba80,e61f0f28) at if_input_process+0x68
ifiq_process(ffffffff) at ifiq_process+0x78
taskq_thread(e0007040) at taskq_thread+0x58
fork_trampoline() at fork_trampoline+0x14
end trace frame: 0x0, count: 7

ddb{1}> trace
db_enter() at db_enter+0xc
panic(0) at panic+0xe0
rw_assert_rdlock(e61f0e88) at rw_assert_rdlock+0x60
rw_exit_read(9741b8) at rw_exit_read+0x14
if_input_process(7bba80,e61f0f28) at if_input_process+0x68
ifiq_process(ffffffff) at ifiq_process+0x78
taskq_thread(e0007040) at taskq_thread+0x58
fork_trampoline() at fork_trampoline+0x14
end trace frame: 0x0, count: -8

ddb{1}> show uvm
Current UVM status:
  pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
  505883 VM pages: 31785 active, 35049 inactive, 0 wired, 320740 free (38475 zero)
  min  10% (25) anon, 10% (25) vnode, 5% (12) vtext
  freemin=16862, free-target=22482, inactive-target=118759, wired-max=168627
  faults=545923323, traps=0, intrs=9812526, ctxswitch=27849670 fpuswitch=3680467
  softint=26817811, syscalls=529071026, kmapent=11
  fault counts:
    noram=0, noanon=0, noamap=0, pgwait=0, pgrele=0
    ok relocks(total)=1038445(1038451), anget(retries)=55843728(716327), amapcopy=79091608
    neighbor anon/obj pg=7380110/142389090, gets(lock/unlock)=44677040/322124
    cases: anon=50235600, anoncow=5608127, obj=36021180, prcopy=8655854, przero=445402559
  daemon and swap counts:
    woke=282, revs=277, scans=12460402, obscans=132041, anscans=12328361
    busy=0, freed=1189539, reactivate=0, deactivate=3509379
    pageouts=2805442, pending=66149, nswget=716346
    nswapdev=1
    swpages=327679, swpginuse=3451, swpgonly=2584 paging=0
  kernel pointers:
    objs(kern)=0x93eabc

ddb{1}> show bcstats
Current Buffer Cache status:
numbufs 39909 busymapped 2, delwri 37
kvaslots 358 avail kva slots 356
bufpages 98214, dmapages 98214, dirtypages 148
pendingreads 153, pendingwrites 61
highflips 0, highflops 0, dmaflips 0

ddb{1}> ps 
   PID     TID   PPID    UID  S       FLAGS  WAIT          COMMAND
 38573  500579  40607     55  2         0x2                moc
 91399    2819  40607     55  2         0x2                moc
 40607  359914  10617     55  3        0x82  kqread        cmake
 40607   77714  10617     55  3   0x4000082  fsleep        cmake
 40607  414276  10617     55  3   0x4000082  fsleep        cmake
 10617  286603  84946     55  3    0x10008a  pause         sh
 52840  279944  86410     55  3    0x100002  getblk        cksum
 86410  433884  48514     55  3    0x10008a  pause         sh
 48514  197542  62095     55  3    0x10008a  pause         make
 62095  497515  62270     55  3    0x10008a  pause         make
 62270  237411  14981     55  3    0x10008a  pause         sh
 14981   47014  34220     55  3    0x10008a  pause         make
 34220  180947  88741     55  3    0x10008a  pause         sh
 88741  495610  77102     55  3    0x10008a  pause         make
 77102  479278  93162      0  3    0x10008a  pause         ksh
 84946  266688  58589     55  3        0x8a  poll          ninja
 58589  128453  92139     55  3    0x10008a  pause         make
 92139  179407  37215     55  3    0x10008a  pause         make
 37215  120056  60717     55  3    0x10008a  pause         sh
 60717  501110  89399     55  3    0x10008a  pause         make
 89399  323538   9207     55  3    0x10008a  pause         sh
  9207  440167  88340     55  3    0x10008a  pause         make
 88340  416509  93162      0  3    0x10008a  pause         ksh
 93162    2053  72060      0  3        0x92  select        sshd
 11943  387300      1      0  7    0x100003                getty
 95662  201985      1      0  3    0x100098  poll          cron
 32477  238302      1    110  3    0x100090  poll          sndiod
 59661  427557      1     99  3    0x100090  poll          sndiod
 20437  361379   2054     95  3    0x100092  kqread        smtpd
 49929  106186   2054    103  3    0x100092  kqread        smtpd
 13060  507050   2054     95  3    0x100092  kqread        smtpd
 43860  370751   2054     95  3    0x100092  kqread        smtpd
 22026   42648   2054     95  3    0x100092  kqread        smtpd
 11464  274219   2054     95  3    0x100092  kqread        smtpd
  2054  160405      1      0  3    0x100080  kqread        smtpd
 72060  131364      1      0  3        0x80  select        sshd
 18146  130816      0      0  3     0x14280  nfsidl        nfsio
 90798  522596      0      0  2     0x14280                nfsio
 50429  129760      0      0  3     0x14280  nfsidl        nfsio
 38308  389895      0      0  3     0x14200  netio         nfsio
 34449  469395  55543     83  3    0x100092  poll          ntpd
 55543   64576  47380     83  3    0x100092  poll          ntpd
 47380  392784      1      0  3    0x100080  poll          ntpd
  4990   57005  84109     74  3    0x100092  bpf           pflogd
 84109  424693      1      0  3        0x80  netio         pflogd
 85512  132636   9303     73  3    0x100090  kqread        syslogd
  9303  480167      1      0  3    0x100082  netio         syslogd
 50245   27526   3278    115  3    0x100092  kqread        slaacd
 13737   30350   3278    115  3    0x100092  kqread        slaacd
  3278   81907      1      0  3    0x100080  kqread        slaacd
 53754  287148      0      0  3     0x14200  bored         smr
 70873  430137      0      0  3  0x40014200                idle1
 89304  201687      0      0  2     0x14200                zerothread
  2513  418103      0      0  3     0x14200  aiodoned      aiodoned
 75690  291346      0      0  3     0x14200  syncer        update
 24800  418803      0      0  3     0x14200  cleaner       cleaner
 10239  267090      0      0  3     0x14200  reaper        reaper
 93632   33802      0      0  3     0x14200  pgdaemon      pagedaemon
 80864  336862      0      0  3     0x14200  bored         crynlk
 15848  154968      0      0  3     0x14200  bored         crypto
   194  121071      0      0  3     0x14200  usbtsk        usbtask
 63751  148381      0      0  3     0x14200  usbatsk       usbatsk
*38044  165862      0      0  7     0x14200                softnet
 59266  452488      0      0  3     0x14200  bored         systqmp
 86479   74620      0      0  3     0x14200  bored         systq
 15364  523764      0      0  2  0x40014200                softclock
 85711  139071      0      0  3  0x40014200                idle0
     1  145103      0      0  3        0x82  wait          init
     0       0     -1      0  3     0x10200  scheduler     swapper

> dmesg

OpenBSD 6.6-current (GENERIC.MP) #692: Sat Mar 21 10:19:57 MDT 2020
    [email protected]:/usr/src/sys/arch/macppc/compile/GENERIC.MP
real mem = 2147483648 (2048MB)
avail mem = 2071994368 (1976MB)
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root: model RackMac1,1
cpu0 at mainbus0: 7455 (Revision 0x201): 999 MHz: 256KB L2 cache, 2MB L3 cache
cpu1 at mainbus0: 7455 (Revision 0x201): 999 MHz: 256KB L2 cache, 2MB L3 cache
mem0 at mainbus0
spdmem0 at mem0: 512MB DDR SDRAM non-parity PC2100CL2.5
spdmem1 at mem0: 512MB DDR SDRAM non-parity PC2100CL2.5
spdmem2 at mem0: 512MB DDR SDRAM non-parity PC2700CL2.5
spdmem3 at mem0: 512MB DDR SDRAM non-parity PC2700CL2.5
memc0 at mainbus0: uni-n rev 0x24
kiic0 at memc0 offset 0xf8001000
iic0 at kiic0
lmtemp0 at iic0 addr 0xc9: ds1775, fails to respond
mpcpcibr0 at mainbus0 pci: uni-north
pci0 at mpcpcibr0 bus 0
mpcpcibr1 at mainbus0 pci: uni-north
pci1 at mpcpcibr1 bus 0
ppb0 at pci1 dev 13 function 0 "Intel 21154AE/BE" rev 0x00
pci2 at ppb0 bus 1
macobio0 at pci2 dev 7 function 0 "Apple Keylargo" rev 0x03
openpic0 at macobio0 offset 0x40000: version 0x4614 feature 3f0302 LE
macgpio0 at macobio0 offset 0x50
macgpio1 at macgpio0 offset 0x9: irq 47
"programmer-switch" at macgpio0 offset 0x11 not configured
"ringDetect-gpio" at macgpio0 offset 0x8 not configured
"keySwitch-gpio" at macgpio0 offset 0xc not configured
"systemMonitor-gpio" at macgpio0 offset 0x12 not configured
sysbutton0 at macgpio0 offset 0x15: irq 59
"indicatorLED-gpio" at macgpio0 offset 0x20 not configured
"virtual-sound" at macgpio0 not configured
"escc-legacy" at macobio0 offset 0x12000 not configured
zs0 at macobio0 offset 0x13000: irq 22,23
zstty0 at zs0 channel 0: console
zstty1 at zs0 channel 1
"i2s" at macobio0 offset 0x10000 not configured
"timer" at macobio0 offset 0x15000 not configured
adb0 at macobio0 offset 0x16000
apm0 at adb0: battery flags 0x9, 0% charged
piic0 at adb0
iic1 at piic0
"PCA9554" at iic1 addr 0xa0 not configured
"PCA9554" at iic1 addr 0xa1 not configured
"PCA9554" at iic1 addr 0xa2 not configured
"PCA9554" at iic1 addr 0xa3 not configured
"PCA9554" at iic1 addr 0xa4 not configured
kiic1 at macobio0 offset 0x18000
iic2 at kiic1
wdc0 at macobio0 offset 0x1f000 irq 19: DMA
atapiscsi0 at wdc0 channel 0 drive 0
scsibus1 at atapiscsi0: 2 targets
cd0 at scsibus1 targ 0 lun 0: <LG, CD-ROM CRN-8245B, AHT9> removable
cd0(wdc0:0:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 2
ohci0 at pci2 dev 8 function 0 "Apple USB" rev 0x00: irq 27, version 1.0
ohci1 at pci2 dev 9 function 0 "Apple USB" rev 0x00: irq 28, version 1.0
usb0 at ohci0: USB revision 1.0
uhub0 at usb0 configuration 1 interface 0 "Apple OHCI root hub" rev 1.00/1.00 addr 1
usb1 at ohci1: USB revision 1.0
uhub1 at usb1 configuration 1 interface 0 "Apple OHCI root hub" rev 1.00/1.00 addr 1
ppb1 at pci1 dev 17 function 0 "Intel 21154AE/BE" rev 0x00
pci3 at ppb1 bus 2
pciide0 at pci1 dev 21 function 0 "Promise PDC20268R" rev 0x02: DMA, channel 0 configured to native-PCI, channel 1 configured to native-PCI
pciide0: using irq 58 for native-PCI interrupt
wd0 at pciide0 channel 0 drive 0: <WDC WD800JB-00CRA1>
wd0: 16-sector PIO, LBA, 76319MB, 156301488 sectors
wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 5
wd1 at pciide0 channel 1 drive 0: <WL500GPA1672>
wd1: 16-sector PIO, LBA48, 476940MB, 976773168 sectors
wd1(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 5
pciide1 at pci1 dev 27 function 0 "Promise PDC20268R" rev 0x02: DMA, channel 0 configured to native-PCI, channel 1 configured to native-PCI
pciide1: using irq 63 for native-PCI interrupt
mpcpcibr2 at mainbus0 pci: uni-north
pci4 at mpcpcibr2 bus 0
"Apple UniNorth Firewire" rev 0x01 at pci4 dev 14 function 0 not configured
gem0 at pci4 dev 15 function 0 "Apple Uni-N2 GMAC" rev 0x00: irq 41, address 00:03:ae:45:3a:9d
brgphy0 at gem0 phy 0: BCM5421 10/100/1000baseT PHY, rev. 1
vscsi0 at root
scsibus2 at vscsi0: 256 targets
softraid0 at root
scsibus3 at softraid0: 256 targets
bootpath: /pci@f2000000/AppleKiwi@15/ata-6@0/disk@0:/bsd
root on wd0a (c0bc446965ec65a8.a) swap on wd0b dump on wd0b
WARNING: / was not properly unmounted
