On Tue, Dec 17, 2013 at 07:32:02PM -0500, RD Thrush wrote:
> On 11/22/13 12:03, Stuart Henderson wrote:
> > On 2013/11/22 08:47, RD Thrush wrote:
> >> On 11/11/13 11:22, Stuart Henderson wrote:
> >>> On 2013/11/11 09:53, RD Thrush wrote:
> >>>>> Synopsis: Firewall panic with Nov 10 snapshot
> >>>>> Category: kernel
> >>>>> Environment:
> >>>> System : OpenBSD 5.4
> >>>> Details : OpenBSD 5.4-current (GENERIC) #142: Sun Nov 10 22:52:49
> >>>> MST 2013
> >>>>
> >>>> [email protected]:/usr/src/sys/arch/i386/compile/GENERIC
> >>>> Architecture: OpenBSD.i386
> >>>> Machine : i386
> >>>>> Description:
> >>>> Soekris 5501 firewall panics an hour after booting new snapshot.
> >>>> Appended is
> >>>> some ddb info as well as normal sendbug details.
> >>>>> How-To-Repeat:
> >>>> Don't know.
> >>>>> Fix:
> >>>> Revert to Nov 7 kernel
> >>>
> >>> I've reverted the bpf commit for now, it looks like the change is
> >>> invalidating
> >>> assumptions of the conditional around bpf_read()'s tsleep in bpf.c:439 ..
> >>
> >> It appears this problem still exists. I've had panic's on 3 machines since
> >> upgrading to the Nov 20 snap (2 amd64, 1 i386). I've attached the report
> >> from
> >> the x2 machine and am appending ddb's trace,ps,show registers and callout
> >> from
> >> all three. I have the full serial captures if more info is required.
> >
> > I've just updated things on my router at home and hit this (with
> > ladvd), including with the bpf.c commits reverted.
> >
> > I'm using "ladvd -Lz" and hit the panic pretty much as soon as it starts.
>
> The panic remains with today's snapshot. This problem originated with v1.84
> of sys/net/bpf.c. Despite several reverts since, the problem remains. It is
> easy to reproduce, ie.:
Not sure how we can point the finger at remnants of 1.84 since
$ cd src/sys/net
$ cvs diff -u -r1.83 bpf.c
Index: bpf.c
===================================================================
RCS file: /cvs/src/sys/net/bpf.c,v
retrieving revision 1.83
retrieving revision 1.89
diff -u -p -u -r1.83 -r1.89
--- bpf.c 28 Dec 2012 17:52:06 -0000 1.83
+++ bpf.c 29 Nov 2013 19:28:55 -0000 1.89
@@ -1,4 +1,4 @@
-/* $OpenBSD: bpf.c,v 1.83 2012/12/28 17:52:06 gsoares Exp $ */
+/* $OpenBSD: bpf.c,v 1.89 2013/11/29 19:28:55 tedu Exp $ */
/* $NetBSD: bpf.c,v 1.33 1997/02/21 23:59:35 thorpej Exp $ */
/*
@@ -172,8 +172,8 @@ bpf_movein(struct uio *uio, u_int linkty
case DLT_ATM_RFC1483:
/*
- * en atm driver requires 4-byte atm pseudo header.
- * though it isn't standard, vpi:vci needs to be
+ * An ATM driver requires 4-byte ATM pseudo header.
+ * Though it isn't standard, vpi:vci needs to be
* specified anyway.
*/
sockp->sa_family = AF_UNSPEC;
@@ -401,7 +401,7 @@ bpfread(dev_t dev, struct uio *uio, int
D_GET(d);
/*
- * bd_rdStart is tagged when we start the read, iff there's a timeout.
+ * If there's a timeout, bd_rdStart is tagged when we start the read.
* we can then figure out when we're done reading.
*/
i.e. only comment changes from 1.83 remain. What rev of bpf.c does work
for you?
.... Ken
>
> sudo pkg_add darkstat
> sudo /etc/rc.d/darkstat start
> echo quit | nc localhost 667
>
> I've appended the panic, ddb info + dmesg from a qemu session.
> $ pkg_info
> darkstat-3.0.717p1 network statistics gatherer with graphs
> quirks-1.104 exceptions to pkg_add rules
> $ sudo /etc/rc.d/darkstat start
> darkstat(ok)
> $ echo quit | nc localhost 667
> panic: timeout_add: to_ticks (-1) < 0
> Stopped at Debugger+0x5: leave
> RUN AT LEAST 'trace' AND 'ps' AND INCLUDE OUTPUT WHEN REPORTING THIS PANIC!
> DO NOT EVEN BOTHER REPORTING THIS WITHOUT INCLUDING THAT INFORMATION!
> ddb> trace
> Debugger() at Debugger+0x5
> panic() at panic+0xee
> timeout_add() at timeout_add+0xff
> tsleep() at tsleep+0x88
> bpfread() at bpfread+0x1a4
> spec_read() at spec_read+0x2a8
> VOP_READ() at VOP_READ+0x32
> vn_read() at vn_read+0xa1
> dofilereadv() at dofilereadv+0x18a
> sys_read() at sys_read+0x8f
> syscall() at syscall+0x162
> --- syscall (number 3) ---
> end of kernel
> end trace frame: 0x7f7ffffbf5a8, count: -11
> 0x93482acf1ba:
> ddb> ps
> PID PPID PGRP UID S FLAGS WAIT COMMAND
> 26262 16126 28453 1000 2 0x3 nc
> 26586 6377 26586 587 3 0x90 netio darkstat
> * 6377 1 6377 587 7 0x10 bpf darkstat
> 16126 1 16126 1000 3 0x8b pause ksh
> 32327 1 32327 0 3 0x83 ttyin getty
> 12746 1 12746 0 3 0x83 ttyin getty
> 4079 1 4079 0 3 0x83 ttyin getty
> 5137 1 5137 0 3 0x83 ttyin getty
> 27113 1 27113 0 3 0x83 ttyin getty
> 28129 1 28129 0 3 0x80 select cron
> 1050 1 1050 99 3 0x90 poll sndiod
> 31085 1 31085 0 3 0xb0 select sendmail
> 31327 1 31327 0 3 0x80 select sshd
> 3210 0 0 0 3 0x100280 nfsidl nfsio
> 26471 0 0 0 3 0x100280 nfsidl nfsio
> 30689 0 0 0 3 0x100280 nfsidl nfsio
> 5699 0 0 0 3 0x100280 nfsidl nfsio
> 22462 28066 9855 83 3 0x90 poll ntpd
> 28066 9855 9855 83 3 0x90 poll ntpd
> 9855 1 9855 0 3 0x80 poll ntpd
> 24367 1337 1337 74 3 0x90 bpf pflogd
> 1337 1 1337 0 3 0x80 netio pflogd
> 21976 18728 18728 73 3 0x90 poll syslogd
> 18728 1 18728 0 3 0x80 netio syslogd
> 24426 1 24426 77 3 0x90 poll dhclient
> 3819 1 3819 0 3 0x80 poll dhclient
> 12 0 0 0 3 0x100200 aiodoned aiodoned
> 11 0 0 0 3 0x100200 syncer update
> 10 0 0 0 3 0x100200 cleaner cleaner
> 9 0 0 0 3 0x100200 reaper reaper
> 8 0 0 0 3 0x100200 pgdaemon pagedaemon
> 7 0 0 0 3 0x100200 bored crypto
> 6 0 0 0 3 0x100200 pftm pfpurge
> 5 0 0 0 3 0x40100200 acpi0 acpi0
> 4 0 0 0 3 0x100200 bored systq
> 3 0 0 0 3 0x100200 bored syswq
> 2 0 0 0 3 0x40100200 idle0
> 1 0 1 0 3 0x82 wait init
> 0 -1 0 0 3 0x200 scheduler swapper
> ddb> show registers
> ds 0x292
> es 0x3a50
> fs 0x3a70
> gs 0xfdf1 acpi_pdirpa+0xb891
> rdi 0x1
> rsi 0x5
> rbp 0xffff800001d93a40
> rbx 0xffffffff81783490 addrmask+0x1250
> rdx 0xc
> rcx 0x292
> rax 0x1
> r8 0xffff800001d93960
> r9 0
> r10 0x2d
> r11 0xffffffff8108f680 comcnputc
> r12 0x100
> r13 0xffff800001d93a50
> r14 0
> r15 0xffff800001d93db0
> rip 0xffffffff81342ab5 Debugger+0x5
> cs 0x8
> rflags 0x202
> rsp 0xffff800001d93a40
> ss 0x10
> Debugger+0x5: leave
> ddb> callout
> ticks now: 3638
> ticks wheel arg func
> 1 0d/55 d 0x0 m_cltick
> 6 0d/60 d 0xffffffff81cdb340 pffasttimo
> 17 0d/71 d 0xffffffff81cdb300 pfslowtimo
> 23 0d/77 d 0xfffffe80036544e0 endtsleep
> 26 0d/80 d 0xfffffe8003654c18 endtsleep
> 66 0d/120 d 0xffff800000166000 em_local_timer
> 66 0d/120 d 0xffffffff81cdb440 if_slowtimo
> 66 0d/120 d 0xffffffff81d2db00 pckbc_poll
> 66 0d/120 d 0x0 nd6_timer
> 66 0d/120 d 0xffffffff81cdb560 rt_timer_timer
> 66 0d/120 d 0xffffffff81cdb000 schedcpu
> 71 0d/125 d 0xfffffe8003723c08 endtsleep
> 217 1d/271 d 0xfffffe800346fb00 realitexpire
> 292 1d/271 d 0xfffffe8000abd028 endtsleep
> 1568 1d/276 d 0xfffffe8000abd4f8 endtsleep
> 3291 1d/283 d 0xfffffe800346f010 realitexpire
> 27087 1d/376 d 0xffffffff81cdb5a0 arptimer
> 56362 1d/490 d 0x0 arc4_reinit
> 177682 2d/514 d 0xfffffe800331c480 realitexpire
> 356362 2d/517 d 0x0 nd6_slowtimo
> 719998 2d/523 d 0xfffffe800337e008 tcp_timer_keep
> 719999 2d/523 d 0xfffffe800337e238 tcp_timer_keep
> 8637276 2d/643 d 0xfffffe80036549b0 endtsleep
> ddb> boot crash
>
> dumping to dev 0,1 offset 66455
> dump 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40
> 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14
> 13 12 11 10 9 8 7 6 5 4 3 2 1 0 succeeded
>
>
> booting hd0a:bsd: 6537740+1652172+1079992+0+619808 [80+552096+367479]=0xe50088
> entry point at 0x10001e0 [7205c766, 34000004, 24448b12, 7958a304]
> [ using 920424 bytes of bsd ELF symbol table ]
> Copyright (c) 1982, 1986, 1989, 1991, 1993
> The Regents of the University of California. All rights reserved.
> Copyright (c) 1995-2013 OpenBSD. All rights reserved. http://www.OpenBSD.org
>
> OpenBSD 5.4-current (GENERIC) #170: Tue Dec 17 09:21:47 MST 2013
> [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC
> real mem = 50323456 (47MB)
> avail mem = 41013248 (39MB)
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 2.4 @ 0xf1650 (10 entries)
> bios0: vendor Bochs version "Bochs" date 01/01/2011
> bios0: Bochs Bochs
> acpi0 at bios0: rev 0
> acpi0: sleep states S3 S4 S5
> acpi0: tables DSDT FACP SSDT APIC HPET
> acpi0: wakeup devices
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpimadt0 at acpi0 addr 0xfee00000: PC-AT compat
> acpihpet0 at acpi0: 100000000 Hz
> acpiprt0 at acpi0: bus 0 (PCI0)
> acpicpu0 at acpi0
> mpbios0 at bios0: Intel MP Specification 1.4
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: QEMU Virtual CPU version 1.7.0, 3403.33 MHz
> cpu0:
> FPU,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,CX16,POPCNT,NXE,LONG,LAHF,SVM,ABM,SSE4A
> cpu0: 64KB 64b/line 2-way I-cache, 64KB 64b/line 2-way D-cache, 512KB
> 64b/line 16-way L2 cache
> cpu0: ITLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu0: DTLB 255 4KB entries direct-mapped, 255 4MB entries direct-mapped
> cpu0: apic clock running at 1000MHz
> mpbios0: bus 0 is type PCI
> mpbios0: bus 1 is type ISA
> ioapic0 at mainbus0: apid 0 pa 0xfec00000, version 11, 24 pins
> pci0 at mainbus0 bus 0
> pchb0 at pci0 dev 0 function 0 "Intel 82441FX" rev 0x02
> pcib0 at pci0 dev 1 function 0 "Intel 82371SB ISA" rev 0x00
> pciide0 at pci0 dev 1 function 1 "Intel 82371SB IDE" rev 0x00: DMA, channel 0
> wired to compatibility, channel 1 wired to compatibility
> wd0 at pciide0 channel 0 drive 0: <QEMU HARDDISK>
> wd0: 16-sector PIO, LBA48, 10240MB, 20971520 sectors
> wd0(pciide0:0:0): using PIO mode 4, DMA mode 2
> atapiscsi0 at pciide0 channel 1 drive 0
> scsibus0 at atapiscsi0: 2 targets
> cd0 at scsibus0 targ 0 lun 0: <QEMU, QEMU DVD-ROM, 1.7.> ATAPI 5/cdrom
> removable
> cd0(pciide0:1:0): using PIO mode 4, DMA mode 2
> piixpm0 at pci0 dev 1 function 3 "Intel 82371AB Power" rev 0x03: apic 0 int 9
> iic0 at piixpm0
> vga1 at pci0 dev 2 function 0 "Cirrus Logic CL-GD5446" rev 0x00
> wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
> wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
> em0 at pci0 dev 3 function 0 "Intel 82540EM" rev 0x03: apic 0 int 11, address
> 52:54:00:4e:62:8a
> isa0 at pcib0
> isadma0 at isa0
> com0 at isa0 port 0x3f8/8 irq 4: ns16550a, 16 byte fifo
> com0: console
> pckbc0 at isa0 port 0x60/5
> pckbd0 at pckbc0 (kbd slot)
> pckbc0: using irq 1 for kbd slot
> wskbd0 at pckbd0: console keyboard, using wsdisplay0
> pms0 at pckbc0 (aux slot)
> pckbc0: using irq 12 for aux slot
> wsmouse0 at pms0 mux 0
> pcppi0 at isa0 port 0x61
> spkr0 at pcppi0
> lpt0 at isa0 port 0x378/4 irq 7
> fdc0 at isa0 port 0x3f0/6 irq 6 drq 2
> fd0 at fdc0 drive 1: density unknown
> nvram: invalid checksum
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> vscsi0 at root
> scsibus1 at vscsi0: 256 targets
> softraid0 at root
> scsibus2 at softraid0: 256 targets
> root on wd0a (094007408badcf66.a) swap on wd0b dump on wd0b
> clock: unknown CMOS layout
> Fast boot: skipping disk checks.
> setting tty flags
> pf enabled
> ddb.console: 0 -> 1
> ddb.max_line: 24 -> 0
> machdep.allowaperture: 0 -> 2
> starting network
> DHCPREQUEST on em0 to 255.255.255.255 port 67
> DHCPACK from 10.1.2.18 (14:da:e9:b5:84:cf)
> bound to 10.1.2.236 -- renewal in 302400 seconds.
> starting early daemons: syslogd pflogd ntpd.
> starting RPC daemons:.
> savecore: no core dump
> checking quotas: done.
> clearing /tmp
> starting pre-securelevel daemons:.
> setting kernel security level: kern.securelevel: 0 -> 1
> creating runtime link editor directory cache.
> preserving editor files.
> starting network daemons: sshd sendmail sndiod.
> starting local daemons: cron.
> Tue Dec 17 19:01:23 EST 2013