Re: glxsb crypto crash

2007-11-15 Thread Mitja Muženič
For the archives,

the following two commits have fixed this:

--
CVSROOT:/cvs
Module name:src
Changes by: [EMAIL PROTECTED]   2007/11/14 12:10:44

Modified files:
sys/arch/i386/i386: via.c
sys/arch/i386/pci: glxsb.c

Log message:
do not process requests linked to unused sessions. (crypto_freesession
might happen between enqueuing a crypto request and scheduling of
the crypto thread); ok hshoexer
--
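
The idea behind this first commit can be sketched in a few lines of C. This is only an illustration of the guard the log message describes, not the actual change to via.c or glxsb.c: the names `sketch_session`, `sketch_request`, `sketch_done` and `sketch_process` are hypothetical stand-ins, and the real drivers use the kernel's own session and request structures.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_session {
	int used;		/* nonzero while the session is allocated */
};

struct sketch_request {
	unsigned int sid;	/* index of the session this request uses */
	int etype;		/* error code reported back to the caller */
	int done;		/* number of times the completion ran */
};

#define SKETCH_NSESSIONS 4
static struct sketch_session sketch_sessions[SKETCH_NSESSIONS];

/* Completion hook: records the result and counts invocations. */
static void
sketch_done(struct sketch_request *req, int error)
{
	req->etype = error;
	req->done++;
}

/*
 * Process one queued request.  The session may have been freed between
 * enqueuing the request and this call, so check it before touching any
 * per-session state.
 */
static int
sketch_process(struct sketch_request *req)
{
	if (req == NULL)
		return EINVAL;
	if (req->sid >= SKETCH_NSESSIONS || !sketch_sessions[req->sid].used) {
		sketch_done(req, EINVAL);	/* fail the request cleanly */
		return EINVAL;
	}
	/* ... the real cipher work would happen here ... */
	sketch_done(req, 0);
	return 0;
}
```

With this check, a request that outlives its session is completed with an error instead of dereferencing stale session data.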

CVSROOT:/cvs
Module name:src
Changes by: [EMAIL PROTECTED]   2007/11/14 12:12:37

Modified files:
sys/crypto: crypto.c

Log message:
do not call crypto_done() on errors, since the drivers already do this.
otherwise we call the callback twice; fixes panics on crypto errors as
seen on reboot; ok hshoexer
--
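
The double-callback problem from this second commit can be illustrated the same way. This is a sketch under the assumption, taken from the log message, that the driver always completes the request itself. All names here (`cb_request`, `cb_done`, `cb_driver_process`, `dispatch_buggy`, `dispatch_fixed`) are hypothetical and do not appear in crypto.c.

```c
#include <errno.h>

/* Hypothetical request that counts how often its callback ran. */
struct cb_request {
	int etype;	/* error code stored by the driver */
	int callbacks;	/* number of completion callbacks so far */
};

/* Stand-in for the completion routine that invokes the callback. */
static void
cb_done(struct cb_request *req)
{
	req->callbacks++;
}

/* Driver: completes the request itself, on success and on failure. */
static int
cb_driver_process(struct cb_request *req, int fail)
{
	req->etype = fail ? EINVAL : 0;
	cb_done(req);
	return req->etype;
}

/* Old behaviour: completing again on error runs the callback twice. */
static int
dispatch_buggy(struct cb_request *req, int fail)
{
	int error = cb_driver_process(req, fail);

	if (error)
		cb_done(req);	/* second completion of the same request */
	return error;
}

/* Fixed behaviour: leave completion to the driver. */
static int
dispatch_fixed(struct cb_request *req, int fail)
{
	return cb_driver_process(req, fail);
}
```

In the buggy variant a failing request has its callback run twice. In a kernel, where the first callback may already have released the request, that second call is the kind of thing that would end in a panic.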


Regards, Mitja


 -----Original Message-----
 From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Mitja Muženič
 Sent: Saturday, November 10, 2007 10:56 PM
 To: misc@openbsd.org
 Subject: glxsb crypto crash

 Not my day, obviously

 On a net5501 machine that I just upgraded to 4.2 I experienced a sudden
 reboot and found this in dmesg:

 uvm_fault(0xd078d120, 0x0, 0, 1) -> e
 fatal page fault (6) in supervisor mode
 trap type 6 code 0 eip d03d4fab cs 8 eflags 10296 cr2 4 cpl 90
 panic: trap type 6, code=0, pc=d03d4fab
 Starting stack trace...
 panic(0,d6a0215c,da48ed3c,0,d6a0215c) at panic+0x71
 panic(d06a2ca7,6,0,d03d4fab,d067c27d) at panic+0x71
 trap() at trap+0x119
 --- trap (number 6) ---
 swcr_authcompute(d6845068,d68440e0,0,d69a5900,2) at swcr_authcompute+0x33
 glxsb_crypto_freesession(d6845068,d68440e0,0,d69a5900,90) at glxsb_crypto_freesession+0xe3
 glxsb_crypto_process(d6845068,0,da48ef6c,d0332d78) at glxsb_crypto_process+0xd4
 crypto_invoke(d6845068,24,d0690993,0) at crypto_invoke+0xbc
 crypto_thread(d6a0215c) at crypto_thread+0x38
 Bad frame pointer: 0xd08c7eb8
 End of stack trace.
 syncing disks... 22 22 21 19 15 9 2 1 1 1 1 1 1 1 1 1 1 1 1 1 giving up


 It's a moderately loaded machine with approx. 40 IPsec tunnels active; this
 crash happened within an hour of my last reboot.

 Any hints on how to debug this? I'd settle for disabling the use of glxsb
 crypto acceleration if necessary; I don't have that much IPsec traffic (yet).


 Regards, Mitja


glxsb crypto crash

2007-11-10 Thread Mitja Muženič
Not my day, obviously

On a net5501 machine that I just upgraded to 4.2 I experienced a sudden
reboot and found this in dmesg:

uvm_fault(0xd078d120, 0x0, 0, 1) -> e
fatal page fault (6) in supervisor mode
trap type 6 code 0 eip d03d4fab cs 8 eflags 10296 cr2 4 cpl 90
panic: trap type 6, code=0, pc=d03d4fab
Starting stack trace...
panic(0,d6a0215c,da48ed3c,0,d6a0215c) at panic+0x71
panic(d06a2ca7,6,0,d03d4fab,d067c27d) at panic+0x71
trap() at trap+0x119
--- trap (number 6) ---
swcr_authcompute(d6845068,d68440e0,0,d69a5900,2) at swcr_authcompute+0x33
glxsb_crypto_freesession(d6845068,d68440e0,0,d69a5900,90) at glxsb_crypto_freesession+0xe3
glxsb_crypto_process(d6845068,0,da48ef6c,d0332d78) at glxsb_crypto_process+0xd4
crypto_invoke(d6845068,24,d0690993,0) at crypto_invoke+0xbc
crypto_thread(d6a0215c) at crypto_thread+0x38
Bad frame pointer: 0xd08c7eb8
End of stack trace.
syncing disks... 22 22 21 19 15 9 2 1 1 1 1 1 1 1 1 1 1 1 1 1 giving up


It's a moderately loaded machine with approx. 40 IPsec tunnels active; this
crash happened within an hour of my last reboot.

Any hints on how to debug this? I'd settle for disabling the use of glxsb
crypto acceleration if necessary; I don't have that much IPsec traffic (yet).


Regards, Mitja

dmesg:

OpenBSD 4.2 (GENERIC) #375: Tue Aug 28 10:38:44 MDT 2007
[EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: Geode(TM) Integrated Processor by AMD PCS (AuthenticAMD 586-class) 500 MHz
cpu0: FPU,DE,PSE,TSC,MSR,CX8,SEP,PGE,CMOV,CFLUSH,MMX
real mem  = 536440832 (511MB)
avail mem = 511070208 (487MB)
mainbus0 at root
bios0 at mainbus0: AT/286+ BIOS, date 20/70/19, BIOS32 rev. 0 @ 0xfac40
pcibios0 at bios0: rev 2.0 @ 0xf/0x1
pcibios0: pcibios_get_intr_routing - function not supported
pcibios0: PCI IRQ Routing information unavailable.
pcibios0: PCI bus #1 is the last bus
bios0: ROM list: 0xc8000/0xa800
cpu0 at mainbus0
pci0 at mainbus0 bus 0: configuration mode 1 (no bios)
pchb0 at pci0 dev 1 function 0 AMD Geode LX rev 0x31
glxsb0 at pci0 dev 1 function 2 AMD Geode LX Crypto rev 0x00: RNG AES
vr0 at pci0 dev 6 function 0 VIA VT6105M RhineIII rev 0x96: irq 11, address 00:00:24:c8:de:2c
ukphy0 at vr0 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI 0x004063, model 0x0034
vr1 at pci0 dev 7 function 0 VIA VT6105M RhineIII rev 0x96: irq 5, address 00:00:24:c8:de:2d
ukphy1 at vr1 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI 0x004063, model 0x0034
vr2 at pci0 dev 8 function 0 VIA VT6105M RhineIII rev 0x96: irq 9, address 00:00:24:c8:de:2e
ukphy2 at vr2 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI 0x004063, model 0x0034
vr3 at pci0 dev 9 function 0 VIA VT6105M RhineIII rev 0x96: irq 12, address 00:00:24:c8:de:2f
ukphy3 at vr3 phy 1: Generic IEEE 802.3u media interface, rev. 3: OUI 0x004063, model 0x0034
ppb0 at pci0 dev 14 function 0 TI PCI2250 PCI-PCI rev 0x02
pci1 at ppb0 bus 1
sis0 at pci1 dev 0 function 0 NS DP83815 10/100 rev 0x00, DP83816A: irq 10, address 00:00:24:c3:47:fc
nsphyter0 at sis0 phy 0: DP83815 10/100 PHY, rev. 1
sis1 at pci1 dev 1 function 0 NS DP83815 10/100 rev 0x00, DP83816A: irq 7, address 00:00:24:c3:47:fd
nsphyter1 at sis1 phy 0: DP83815 10/100 PHY, rev. 1
pcib0 at pci0 dev 20 function 0 AMD CS5536 ISA rev 0x03
pciide0 at pci0 dev 20 function 2 AMD CS5536 IDE rev 0x01: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
wd0 at pciide0 channel 0 drive 0: SanDisk SDCFH-1024
wd0: 4-sector PIO, LBA, 977MB, 2001888 sectors
wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
pciide0: channel 1 ignored (disabled)
ohci0 at pci0 dev 21 function 0 AMD CS5536 USB rev 0x02: irq 15, version 1.0, legacy support
ehci0 at pci0 dev 21 function 1 AMD CS5536 USB rev 0x02: irq 15
usb0 at ehci0: USB revision 2.0
uhub0 at usb0: AMD EHCI root hub, rev 2.00/1.00, addr 1
isa0 at pcib0
isadma0 at isa0
pckbc0 at isa0 port 0x60/5
pckbd0 at pckbc0 (kbd slot)
pckbc0: using irq 1 for kbd slot
wskbd0 at pckbd0: console keyboard
pcppi0 at isa0 port 0x61
midi0 at pcppi0: PC speaker
spkr0 at pcppi0
nsclpcsio0 at isa0 port 0x2e/2: NSC PC87366 rev 9: GPIO VLM TMS
gpio0 at nsclpcsio0: 29 pins
npx0 at isa0 port 0xf0/16: reported by CPUID; using exception 16
pccom0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
pccom0: console
pccom1 at isa0 port 0x2f8/8 irq 3: ns16550a, 16 byte fifo
usb1 at ohci0: USB revision 1.0
uhub1 at usb1: AMD OHCI root hub, rev 1.00/1.00, addr 1
biomask e145 netmask ffe5 ttymask ffe7
pctr: user-level cycle counter enabled
mtrr: K6-family MTRR support (2 registers)
dkcsum: wd0 matches BIOS drive 0x80
root on wd0a swap on wd0b dump on wd0b
WARNING: / was not properly unmounted