On 6/24/07, Vijay Sankar <[EMAIL PROTECTED]> wrote:
On Sunday 24 June 2007 13:50, patrick keshishian wrote:
> Hi,
>
> I've been noticing some strange problems with the built-in nfe0
> interface on my desktop.  Actually I've seen it on two such
> computers, but the description below is for my current desktop PC.
>
> The PC is running a build from `cvs up -dP -rOPENBSD_4_1'. I'm including
> netstat, ifconfig output[1] and dmesg below[2].
>
> I've noticed that once in a while the nfe0 interface will stop
> sending and receiving data.  At this point I can not make it work
> again.  The only solution I have is to reboot the box.  I have
> installed a dc0 card in the box since.  The problem seemed
> intermittent and not reliably reproducible.  But I think I found
> a way to reproduce this problem on demand (at least for the time
> being).  I have an ssh session to another box, on which I run
> '/usr/bin/nm somelib.so'.  After a page or two of output the
> terminal "hangs".  At this point nfe0 becomes unresponsive.
>
> I switch to the dc0 interface and the terminal finishes the output.
> Running the nm command while using the dc0 interface doesn't cause
> any problems.

I experienced similar problems last year and can empathize.

The following items improved my situation somewhat:

1) BIOS upgrade
2) Removing dual boot (I had both OpenBSD and Windows 2003 on one
machine. There were more errors if I did not power off after shutting
down Windows 2003 and just did a restart from within Windows. If I did
not unplug the machine after shutting down Windows, most of the time I
saw watchdog timeouts but if I powered off the host, and then powered
it back on, there were fewer errors)

Both boxes I have run solely OpenBSD.


One thing that I did notice was that after switching to the dc0
interface for a short while (5 min or so?), I could switch back
to the nfe0 and it would start responding again. Basically:

# /sbin/ifconfig dc0 delete
# /sbin/route delete default
# /sbin/ifconfig nfe0 inet <IP> netmask <netmask> up
# /sbin/route add default <gateway>

Therefore, a reboot isn't the only way to "fix" the problem ("reset"
the interface) as I had previously thought.  I am not sure exactly
what causes the interface to "reset": idle time, "no carrier", or
something completely random?
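
For anyone who wants to repeat the switch-over without retyping it, the
four commands above can be wrapped in a small sh function. This is only
a sketch: the names run, switch_iface and DRY_RUN are my own and not
part of any OpenBSD tool, and by default it only prints the commands
instead of executing them.

```shell
#!/bin/sh
# Sketch only: wraps the four commands from this message.
# DRY_RUN, run and switch_iface are hypothetical names, not OpenBSD tools.
# DRY_RUN=1 (the default) prints each command; DRY_RUN=0 executes it (needs root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# usage: switch_iface OLD_IF NEW_IF IP NETMASK GATEWAY
switch_iface() {
    run /sbin/ifconfig "$1" delete                     # drop the address on the old interface
    run /sbin/route delete default                     # remove the current default route
    run /sbin/ifconfig "$2" inet "$3" netmask "$4" up  # configure the new interface
    run /sbin/route add default "$5"                   # re-add the default route
}
```

Calling it as `switch_iface dc0 nfe0 <IP> <netmask> <gateway>` prints
the same four lines shown above; set DRY_RUN=0 to actually run them.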


Either way, thanks for all the replies!



I experimented with different combinations and different switches
(10/100/1000, 10/100, and 10BASE-T). When all the hosts connected to a
10/100 switch were running at 100 Mb/s, changing nfe0 from
autoselect to 100baseTX full-duplex using

ifconfig nfe0 media 100baseTX mediaopt full-duplex

seemed to eliminate nfe0 hangs as well as timeouts completely. I am not
sure whether this has any rational basis or is specific to some weird
situation in my network, but that has been my experience.
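
To confirm that the forced setting actually took effect, one can look at
the "media:" line that ifconfig prints for the interface. A minimal
sketch of a filter for that line follows; active_media is a name I made
up, and it assumes the usual "media: ..." line in ifconfig output.

```shell
#!/bin/sh
# Sketch only: active_media is a hypothetical helper, not an OpenBSD tool.
# Reads ifconfig output on stdin and prints whatever follows "media:".
active_media() {
    sed -n 's/^[[:space:]]*media:[[:space:]]*//p'
}
```

Typical use would be `/sbin/ifconfig nfe0 | active_media`, run once
before and once after changing the media setting.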

Vijay


>
> Interestingly enough, if I redirect the output of nm to a file
> and subsequently cat the file the nfe0 interface doesn't seem
> to exhibit the same problem.
>
> I am not sure how to diagnose this problem further.  I've enabled
> debug on the nfe0 interface (/sbin/ifconfig nfe0 debug), but don't
> see any output.
>
> Any and all suggestions are welcome.
> --patrick
