On 12/17/13 20:42, RD Thrush wrote:
> On 12/17/13 20:01, Kenneth R Westerback wrote:
>> On Tue, Dec 17, 2013 at 07:32:02PM -0500, RD Thrush wrote:
>>> On 11/22/13 12:03, Stuart Henderson wrote:
>>>> On 2013/11/22 08:47, RD Thrush wrote:
>>>>> On 11/11/13 11:22, Stuart Henderson wrote:
>>>>>> On 2013/11/11 09:53, RD Thrush wrote:
>>>>>>>> Synopsis: Firewall panic with Nov 10 snapshot
>>>>>>>> Category: kernel
>>>>>>>> Environment:
>>>>>>> System : OpenBSD 5.4
>>>>>>> Details : OpenBSD 5.4-current (GENERIC) #142: Sun Nov 10 22:52:49 MST 2013
>>>>>>>
>>>>>>> [email protected]:/usr/src/sys/arch/i386/compile/GENERIC
>>>>>>> Architecture: OpenBSD.i386
>>>>>>> Machine : i386
>>>>>>>> Description:
>>>>>>> Soekris 5501 firewall panics an hour after booting new snapshot.
>>>>>>> Appended is some ddb info as well as normal sendbug details.
>>>>>>>> How-To-Repeat:
>>>>>>> Don't know.
>>>>>>>> Fix:
>>>>>>> Revert to Nov 7 kernel
>>>>>>
>>>>>> I've reverted the bpf commit for now; it looks like the change is
>>>>>> invalidating assumptions of the conditional around bpf_read()'s tsleep
>>>>>> in bpf.c:439 ..
>>>>>
>>>>> It appears this problem still exists. I've had panics on 3 machines
>>>>> since upgrading to the Nov 20 snap (2 amd64, 1 i386). I've attached the
>>>>> report from the x2 machine and am appending ddb's trace, ps, show
>>>>> registers and callout from all three. I have the full serial captures
>>>>> if more info is required.
>>>>
>>>> I've just updated things on my router at home and hit this (with
>>>> ladvd), including with the bpf.c commits reverted.
>>>>
>>>> I'm using "ladvd -Lz" and hit the panic pretty much as soon as it starts.
>>>
>>> The panic remains with today's snapshot. This problem originated with
>>> v1.84 of sys/net/bpf.c. Despite several reverts since, the problem
>>> remains. It is easy to reproduce, i.e.:
>>
>> Not sure how we can point the finger at remnants of 1.84 since
>>
>> [ snip ]
>> i.e. only comment changes from 1.83 remain. What rev of bpf.c does work
>> for you?
>>
>> .... Ken
>
> I believe I had reverted to the Nov. 7 snapshot on the soekris 5501
> successfully. Since then I've tried various combinations of amd64/i386 &
> sp/mp with newer snaps and reported some of them in this thread. I stopped
> using darkstat on the firewall after that initial report.
>
> I'm afraid I should have said the problem started w/ the Nov 11 snapshot
> rather than incorrectly point the finger at the bpf.c/bpfdesc.h changes.
>
> Unfortunately, I no longer have any 5.4 snapshots older than Nov 11...
>
> I do have a crash dump (and can easily reproduce another) but am not able
> to analyze them.
>
> What else can I do to help?

FWIW, I built a GENERIC kernel from cvs as of Nov 11 00:00 GMT and that kernel
did *not* panic. I noticed that although bpf.c was reverted, bpfdesc.h was not.
Reverting bpfdesc.h to before Nov 11 results in a kernel that passes the
darkstat exercise.

Here's what I used:

Index: bpfdesc.h
===================================================================
RCS file: /a8v/pub2/cvsroot/OpenBSD/src/sys/net/bpfdesc.h,v
retrieving revision 1.21
diff -w -b -u -r1.21 bpfdesc.h
--- bpfdesc.h 12 Nov 2013 01:12:09 -0000 1.21
+++ bpfdesc.h 18 Dec 2013 05:24:05 -0000
@@ -67,8 +67,8 @@
 	int		bd_bufsize;	/* absolute length of buffers */
 	struct bpf_if *	bd_bif;		/* interface descriptor */
-	int		bd_rtout;	/* Read timeout in 'ticks' */
-	int		bd_rdStart;	/* when the read started */
+	u_long		bd_rtout;	/* Read timeout in 'ticks' */
+	u_long		bd_rdStart;	/* when the read started */
 	struct bpf_insn *bd_rfilter;	/* read filter code */
 	struct bpf_insn *bd_wfilter;	/* write filter code */
 	u_long		bd_rcount;	/* number of packets received */
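
The diff above only changes the type of two fields, so here is a small
userland sketch of why int vs. u_long can matter for a "has the timeout
passed?" test. This is NOT the kernel code and I am not claiming it is the
cause of the panic. The struct names, the helper functions, and the use of -1
as a special timeout value are all assumptions made up for illustration; only
the two field types come from the diff.

```c
#include <limits.h>
#include <stdio.h>

/* Hypothetical stand-ins for the two field layouts; not the kernel structs. */
struct timing_int {
	int rtout;		/* read timeout in ticks */
	int rdstart;		/* tick count when the read started */
};

struct timing_ulong {
	unsigned long rtout;	/* read timeout in ticks */
	unsigned long rdstart;	/* tick count when the read started */
};

/* Signed version: returns 1 if start + timeout lies before `now`. */
int
expired_int(const struct timing_int *t, int now)
{
	return (t->rdstart + t->rtout) < now;
}

/*
 * Unsigned version of the same expression. The sum wraps modulo
 * ULONG_MAX + 1, and `now` is converted to unsigned long before the
 * comparison, so negative values become very large ones.
 */
int
expired_ulong(const struct timing_ulong *t, int now)
{
	return (t->rdstart + t->rtout) < (unsigned long)now;
}

/* Print both results side by side for one set of inputs. */
void
show(int rtout, int rdstart, int now)
{
	struct timing_int a = { rtout, rdstart };
	struct timing_ulong b = { (unsigned long)rtout, (unsigned long)rdstart };

	printf("rtout=%d rdstart=%d now=%d: int=%d u_long=%d\n",
	    rtout, rdstart, now, expired_int(&a, now), expired_ulong(&b, now));
}
```

Calling show(5, 10, 100) gives the same answer for both layouts, but
show(-1, 0, 100) and show(10, 0, -5) do not: once a negative value is
involved, the signed and unsigned forms of the same expression disagree. Any
conditional written with one type in mind can therefore behave differently
after the fields change type.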