On Fri, May 11, 2007 at 12:49:20PM -0700, Matthew Dillon wrote:
>
> :The strange thing is I was rebooting my laptop (via icewm) when this
> :occurred.  The interface is re(4) according to the kernel buffer output
> :which follows.
> :
> :Joe
>
>     I'm guessing there's an issue with re_init() or re_stop() that is
>     possibly being triggered by setting the IP address.
>
>     re_init() for the RE interface looks like it is doing some dangerous
>     things... if there is DMA still operating while it is trying to
>     reinitialize the device, that could be causing the NMI.  It seems to be
>     writing 0x00 to the command register, which I guess is supposed to stop
>     device operation, but it is not waiting for the device to actually stop
>     operating before it begins to free the TX and RX rings.
>
>     Most network controllers these days are actually microcontrollers,
>     which means that commands do not instantaneously take effect when
>     you write to the command register.  Usually only the interrupt
>     control registers are hardwired.
>
>     I've got two questions.  First, when you ifconfig the interface with a
>     new IP address, does it normally pause before returning?  That would
>     indicate that it is in fact doing a full device reset when configuring
>     an IP address.  Second, can you reproduce the problem?  Perhaps by
>     re-configuring the device's IP address over and over again in a loop?

There is a small delay, under 2 seconds.  I ran a loop that switched
between two IPs for about 15 minutes and nothing happened.

The kernel buffer output in the corefile was from months ago.  I only
remembered it because I did the same thing this time:

    shutdown now; umount /home; ifconfig re0 ...

I don't know how this can show up in a dump months after the fact, unless
there is stale data in my swap partition from my last coredump that
hasn't been overwritten because I don't do very much swapping.  This idea
may be completely wrong.

I am 100% certain that I'm not looking at a stale dump, as strings on the
kernel and vmcore show them as being from May 9, 2007.  I am also certain
that I was not ifconfig'ing any interface when this happened.

Joe

>
>     We may be able to 'fix' the problem simply by introducing a delay
>     after writing 0x00 to RE_COMMAND, or by calling re_reset() as part
>     of re_stop(), but I'd like a way to verify that doing so will actually
>     fix the problem.
>
>                                               -Matt
>                                               Matthew Dillon
>                                               <[EMAIL PROTECTED]>
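
For illustration, the ordering Matt describes (write the stop command, wait until the device has really stopped, and only then free the rings) can be sketched roughly as below. This is a userland simulation, not the re(4) driver code: struct fake_nic, FAKE_CMD_STOP, and every fake_* function are hypothetical names invented for this sketch, and the "polls until idle" counter only stands in for the time a microcontroller-based NIC might need to act on a command.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical simulated device; NOT the real re(4) register layout. */
struct fake_nic {
    uint8_t command;          /* last value written to the command register */
    int     polls_until_idle; /* polls the simulated firmware still needs */
};

#define FAKE_CMD_STOP 0x00    /* assumed "stop" value, mirroring the 0x00 write */

/* Writing the register only records the request; nothing stops yet. */
static void fake_write_command(struct fake_nic *nic, uint8_t val)
{
    nic->command = val;
}

/*
 * Report whether DMA has stopped. Each call lets the simulated
 * microcontroller make one step of progress toward idle.
 */
static bool fake_dma_idle(struct fake_nic *nic)
{
    if (nic->command != FAKE_CMD_STOP)
        return false;
    if (nic->polls_until_idle > 0) {
        nic->polls_until_idle--;
        return false;
    }
    return true;
}

/*
 * Request a stop, then poll up to max_polls times.
 * Returns true only if the device went idle in time; a caller would
 * free the TX and RX rings only after a true return.
 */
static bool fake_stop_and_wait(struct fake_nic *nic, int max_polls)
{
    fake_write_command(nic, FAKE_CMD_STOP);
    for (int i = 0; i < max_polls; i++) {
        if (fake_dma_idle(nic))
            return true;
        /* A real driver would insert a short delay between polls here. */
    }
    return false;
}
```

The bounded loop is a deliberate choice in this sketch: if the device never reports idle, the caller gets false back and can fall back to a full reset instead of freeing memory the hardware may still be writing to.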
