On Wed, Dec 18, 2013 at 12:46:44AM -0500, RD Thrush wrote:
> On 12/17/13 20:42, RD Thrush wrote:
> > On 12/17/13 20:01, Kenneth R Westerback wrote:
> >> On Tue, Dec 17, 2013 at 07:32:02PM -0500, RD Thrush wrote:
> >>> On 11/22/13 12:03, Stuart Henderson wrote:
> >>>> On 2013/11/22 08:47, RD Thrush wrote:
> >>>>> On 11/11/13 11:22, Stuart Henderson wrote:
> >>>>>> On 2013/11/11 09:53, RD Thrush wrote:
> >>>>>>>> Synopsis: Firewall panic with Nov 10 snapshot
> >>>>>>>> Category: kernel
> >>>>>>>> Environment:
> >>>>>>> System : OpenBSD 5.4
> >>>>>>> Details : OpenBSD 5.4-current (GENERIC) #142: Sun Nov 10
> >>>>>>> 22:52:49 MST 2013
> >>>>>>>
> >>>>>>> [email protected]:/usr/src/sys/arch/i386/compile/GENERIC
> >>>>>>> Architecture: OpenBSD.i386
> >>>>>>> Machine : i386
> >>>>>>>> Description:
> >>>>>>> Soekris 5501 firewall panics an hour after booting new snapshot.
> >>>>>>> Appended is some ddb info as well as normal sendbug details.
> >>>>>>>> How-To-Repeat:
> >>>>>>> Don't know.
> >>>>>>>> Fix:
> >>>>>>> Revert to Nov 7 kernel
> >>>>>>
> >>>>>> I've reverted the bpf commit for now, it looks like the change is
> >>>>>> invalidating assumptions of the conditional around bpf_read()'s
> >>>>>> tsleep in bpf.c:439 ..
> >>>>>
> >>>>> It appears this problem still exists. I've had panics on 3 machines
> >>>>> since upgrading to the Nov 20 snap (2 amd64, 1 i386). I've attached
> >>>>> the report from the x2 machine and am appending ddb's trace, ps,
> >>>>> show registers and callout from all three. I have the full serial
> >>>>> captures if more info is required.
> >>>>
> >>>> I've just updated things on my router at home and hit this (with
> >>>> ladvd), including with the bpf.c commits reverted.
> >>>>
> >>>> I'm using "ladvd -Lz" and hit the panic pretty much as soon as it starts.
> >>>
> >>> The panic remains with today's snapshot. This problem originated with
> >>> v1.84 of sys/net/bpf.c. Despite several reverts since, the problem
> >>> remains. It is easy to reproduce, i.e.:
> >>
> >> Not sure how we can point the finger at remnants of 1.84 since
> >>
> >> [ snip ]
> >> i.e. only comment changes from 1.83 remain. What rev of bpf.c does work
> >> for you?
> >>
> >> .... Ken
> >
> > I believe I had reverted to the Nov. 7 snapshot on the soekris 5501
> > successfully. Since then I've tried various combinations of amd64/i386 &
> > sp/mp, with newer snaps and reported some of them in this thread. I
> > stopped using darkstat on the firewall after that initial report.
> >
> > I'm afraid I should have said the problem started w/ the Nov 11 snapshot
> > rather than incorrectly point the finger at the bpf.c/bpfdesc.h changes.
> >
> > Unfortunately, I no longer have any 5.4 snapshots older than Nov 11...
> >
> > I do have a crash dump (and can easily reproduce another) but am not
> > able to analyze them.
> >
> > What else can I do to help?
>
> FWIW, I built a GENERIC kernel from cvs as of Nov 11 00:00 GMT and that
> kernel did *not* panic. I noticed that although bpf.c was reverted,
> bpfdesc.h was not. Reverting bpfdesc.h to before Nov 11 results in a
> kernel that passes the darkstat exercise.
>
> Here's what I used:
>
> Index: bpfdesc.h
> ===================================================================
> RCS file: /a8v/pub2/cvsroot/OpenBSD/src/sys/net/bpfdesc.h,v
> retrieving revision 1.21
> diff -w -b -u -r1.21 bpfdesc.h
> --- bpfdesc.h   12 Nov 2013 01:12:09 -0000      1.21
> +++ bpfdesc.h   18 Dec 2013 05:24:05 -0000
> @@ -67,8 +67,8 @@
>       int             bd_bufsize;     /* absolute length of buffers */
>
>       struct bpf_if * bd_bif;         /* interface descriptor */
> -     int             bd_rtout;       /* Read timeout in 'ticks' */
> -     int             bd_rdStart;     /* when the read started */
> +     u_long          bd_rtout;       /* Read timeout in 'ticks' */
> +     u_long          bd_rdStart;     /* when the read started */
>       struct bpf_insn *bd_rfilter;    /* read filter code */
>       struct bpf_insn *bd_wfilter;    /* write filter code */
>       u_long          bd_rcount;      /* number of packets received */
>
OK, bpfdesc.h reverted to match the reversion of bpf.c.

.... Ken
