Re: nfe0 problem (obsd 4.1)
Hi!

I've noticed that once in a while the nfe0 interface will stop sending and receiving data. At this point I can not make it work again. The only solution I have is to reboot the box. I have installed a dc0 card in the box since. The problem seemed intermittent and not reliably reproducible.

I had problems like these when I ported OpenBSD to the Xbox ( http://tobias.schroepf.de/doku/doku.php?id=xbox:porting_openbsd_to_the_xbox ).

You can find the patches I made here: http://tobias.schroepf.de/doku/doku.php?id=xbox:patch_the_openbsd_sources_network

But I don't know if this will solve your problem.

Markus Ritzer
Re: nfe0 problem (obsd 4.1)
On Wednesday 27 June 2007 10:50, Tony Lambiris wrote:

You might be interested in some unofficial patches I had created when experiencing the same thing. I hadn't officially released these because of the awful DELAY() timeout hack taken from the original nfe code from DragonFly BSD. Most of the updates were taken from NetBSD.

Either way, what you would be interested in is the encap_delay stuff, specifically the part in nfe.c where it actually assigns the variable:

 case PCI_PRODUCT_NVIDIA_CK804_LAN1:
 case PCI_PRODUCT_NVIDIA_CK804_LAN2:
+	sc->sc_encap_delay = 10;
+	break;

You would obviously have to locate where your interface matches and assign it there. For me, my interface is a CK804. Not sure if it was LAN1 or LAN2, but I assigned the delay to both anyway.

These patches seemed to work good for me, didn't experience any timeouts, YMMV. Let me know if this works. These will apply cleanly against 4.1-RELEASE.

I downloaded your patches and would like to try it out. Thanks very much. Because I don't know what I am doing here, I need a bit more help. How can I find out whether my interface is also a CK804?
scanpci -v gave me the following:

pci bus 0x cardnum 0x08 function 0x00: vendor 0x10de device 0x0373
 nVidia Corporation MCP55 Ethernet
 CardVendor 0x1043 card 0x8239 (ASUSTeK Computer Inc., Card unknown)
  STATUS 0x00b0 COMMAND 0x0007
  CLASS 0x06 0x80 0x00 REVISION 0xa2
  BIST 0x00 HEADER 0x00 LATENCY 0x00 CACHE 0x00
  BASE0 0xfe02a000 addr 0xfe02a000 MEM
  BASE1 0xb001 addr 0xb000 I/O
  BASE2 0xfe029000 addr 0xfe029000 MEM
  BASE3 0xfe028000 addr 0xfe028000 MEM
  MAX_LAT 0x14 MIN_GNT 0x01 INT_PIN 0x01 INT_LINE 0x0a
  BYTE_0 0x43 BYTE_1 0x10 BYTE_2 0x39 BYTE_3 0x82

pci bus 0x cardnum 0x09 function 0x00: vendor 0x10de device 0x0373
 nVidia Corporation MCP55 Ethernet
 CardVendor 0x1043 card 0x8239 (ASUSTeK Computer Inc., Card unknown)
  STATUS 0x00b0 COMMAND 0x0007
  CLASS 0x06 0x80 0x00 REVISION 0xa2
  BIST 0x00 HEADER 0x00 LATENCY 0x00 CACHE 0x00
  BASE0 0xfe027000 addr 0xfe027000 MEM
  BASE1 0xac01 addr 0xac00 I/O
  BASE2 0xfe026000 addr 0xfe026000 MEM
  BASE3 0xfe025000 addr 0xfe025000 MEM
  MAX_LAT 0x14 MIN_GNT 0x01 INT_PIN 0x01 INT_LINE 0x0a
  BYTE_0 0x43 BYTE_1 0x10 BYTE_2 0x39 BYTE_3 0x82

dmesg shows

nfe0 at pci0 dev 8 function 0 NVIDIA MCP55 LAN rev 0xa2: irq 10, address 00:17:31:cb:ee:d1
eephy0 at nfe0 phy 1: Marvell 88E1116 Gigabit PHY, rev. 1
nfe1 at pci0 dev 9 function 0 NVIDIA MCP55 LAN rev 0xa2: irq 10, address 00:17:31:cb:dd:7a
eephy1 at nfe1 phy 1: Marvell 88E1116 Gigabit PHY, rev. 1

http://lysergik.com/~tony/openbsd/

On 6/25/07, patrick keshishian [EMAIL PROTECTED] wrote:
On 6/24/07, Vijay Sankar [EMAIL PROTECTED] wrote:
On Sunday 24 June 2007 13:50, patrick keshishian wrote:

Hi, I've been noticing some strange problems with the built-in nfe0 interface on my desktop. Actually I've seen it on two such computers, but the description below is for my current desktop PC. The PC is running `cvs up -dP -rOPENBSD_4_1' built. I'm including netstat, ifconfig output[1] and dmesg below[2].

I've noticed that once in a while the nfe0 interface will stop sending and receiving data.
At this point I can not make it work again. The only solution I have is to reboot the box. I have installed a dc0 card in the box since. The problem seemed intermittent and not reliably reproducible. But I think I found a way to reproduce this problem on demand (at least for the time being). I have an ssh session to another box, on which I run '/usr/bin/nm somelib.so'. After a page or two of output the terminal hangs. At this point nfe0 becomes unresponsive. I switch to the dc0 interface and the terminal finishes the output. Running the nm command while using the dc0 interface doesn't cause any problems. I experienced similar problems last year and can empathize. The following items improved my situation somewhat: 1) BIOS upgrade 2) Removing dual boot (I had both OpenBSD and Windows 2003 on one machine. There were more errors if I did not power off after shutting down Windows 2003 and just did a restart from within Windows. If I did not unplug the machine after shutting down Windows, most of the time I saw watchdog timeouts but if I powered off the host, and then powered it back on, there were fewer errors) Both boxes I have run solely OpenBSD. One thing that I did notice was that after switching to the dc0 interface for a short while (5 min or so?), I could switch back to the nfe0 and it would start responding again. Basically: # /sbin/ifconfig dc0 delete # /sbin/route delete default # /sbin/ifconfig nfe0 inet IP netmask netmask up # /sbin/route add
Re: nfe0 problem (obsd 4.1)
You might be interested in some unofficial patches I had created when experiencing the same thing. I hadn't officially released these because of the awful DELAY() timeout hack taken from the original nfe code from DragonFly BSD. Most of the updates were taken from NetBSD.

Either way, what you would be interested in is the encap_delay stuff, specifically the part in nfe.c where it actually assigns the variable:

 case PCI_PRODUCT_NVIDIA_CK804_LAN1:
 case PCI_PRODUCT_NVIDIA_CK804_LAN2:
+	sc->sc_encap_delay = 10;
+	break;

You would obviously have to locate where your interface matches and assign it there. For me, my interface is a CK804. Not sure if it was LAN1 or LAN2, but I assigned the delay to both anyway.

These patches seemed to work good for me, didn't experience any timeouts, YMMV. Let me know if this works. These will apply cleanly against 4.1-RELEASE.

http://lysergik.com/~tony/openbsd/

On 6/25/07, patrick keshishian [EMAIL PROTECTED] wrote:
On 6/24/07, Vijay Sankar [EMAIL PROTECTED] wrote:
On Sunday 24 June 2007 13:50, patrick keshishian wrote:

Hi, I've been noticing some strange problems with the built-in nfe0 interface on my desktop. Actually I've seen it on two such computers, but the description below is for my current desktop PC. The PC is running `cvs up -dP -rOPENBSD_4_1' built. I'm including netstat, ifconfig output[1] and dmesg below[2].

I've noticed that once in a while the nfe0 interface will stop sending and receiving data. At this point I can not make it work again. The only solution I have is to reboot the box. I have installed a dc0 card in the box since. The problem seemed intermittent and not reliably reproducible.

But I think I found a way to reproduce this problem on demand (at least for the time being). I have an ssh session to another box, on which I run '/usr/bin/nm somelib.so'. After a page or two of output the terminal hangs. At this point nfe0 becomes unresponsive. I switch to the dc0 interface and the terminal finishes the output.
Running the nm command while using the dc0 interface doesn't cause any problems. I experienced similar problems last year and can empathize. The following items improved my situation somewhat: 1) BIOS upgrade 2) Removing dual boot (I had both OpenBSD and Windows 2003 on one machine. There were more errors if I did not power off after shutting down Windows 2003 and just did a restart from within Windows. If I did not unplug the machine after shutting down Windows, most of the time I saw watchdog timeouts but if I powered off the host, and then powered it back on, there were fewer errors) Both boxes I have run solely OpenBSD. One thing that I did notice was that after switching to the dc0 interface for a short while (5 min or so?), I could switch back to the nfe0 and it would start responding again. Basically: # /sbin/ifconfig dc0 delete # /sbin/route delete default # /sbin/ifconfig nfe0 inet IP netmask netmask up # /sbin/route add default gateway Therefore, a reboot isn't the only way to fix the problem (reset the interface) as I had previously thought. I am not sure exactly what causes the interface to reset: idle time, no carrier, or something completely random? Either way, thanks for all the replies! I experimented with different combinations and different switches (10/100/1000, 10/100, and 10-Base-T). When all the hosts connected to a 10/100 switch were running at 100 MB/s then changing nfe0 from autoselect to full-duplex using ifconfig nfe0 media 100baseTX mediaopt full-duplex seemed to eliminate nfe0 hangs as well as timeouts completely. I am not sure whether this has any rational basis or is specific to some weird situation in my network, but that has been my experience. Vijay Interestingly enough, if I redirect the output of nm to a file and subsequently cat the file the nfe0 interface doesn't seem to exhibit the same problem. I am not sure how to diagnose this problem further. 
I've enabled debug on the nfe0 interface (/sbin/ifconfig nfe0 debug), but don't see any output. Any and all suggestions are welcome. --patrick
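Tony's advice to "locate where your interface matches" comes down to knowing the PCI device ID of the card and the case label that corresponds to it. The sketch below pulls the NVIDIA device IDs out of scanpci-style text and prints a label to look for in the driver. Only the ID 0x0373 appears in this thread; the ID-to-label pairs are my recollection of sys/dev/pci/pcidevs and are assumptions to check against your own source tree. The function names are made up for illustration.

```shell
#!/bin/sh
# Map an NVIDIA Ethernet PCI device ID to the suffix of the
# PCI_PRODUCT_NVIDIA_* case label to search for in the driver.
# ASSUMPTION: these pairs are quoted from memory; verify them in
# sys/dev/pci/pcidevs before editing anything.
nfe_product() {
    case "$1" in
        0x0056) echo "CK804_LAN1" ;;
        0x0057) echo "CK804_LAN2" ;;
        0x0372) echo "MCP55_LAN1" ;;
        0x0373) echo "MCP55_LAN2" ;;
        *)      echo "unknown" ;;
    esac
}

# Read scanpci-style text on stdin and print "ID label" for every
# line that names an NVIDIA (vendor 0x10de) device. Non-Ethernet
# NVIDIA devices will simply show up as "unknown".
scan_ids() {
    sed -n 's/.*vendor 0x10de device \(0x[0-9a-f]*\).*/\1/p' |
    while read -r id; do
        echo "$id $(nfe_product "$id")"
    done
}

# Example with a header line copied from the scanpci output in this thread.
echo "pci bus 0x cardnum 0x08 function 0x00: vendor 0x10de device 0x0373" | scan_ids
```

On a live system the same function could be fed with `scanpci -v | scan_ids`. Whatever label comes out is only a pointer to where the `sc->sc_encap_delay` assignment would go, not a statement that the delay fixes that chipset.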
Re: nfe0 problem (obsd 4.1)
After applying the patches, you want to go into if_nfe.c, and after line 244 (PCI_PRODUCT_NVIDIA_MCP55_LAN2) you would want to put

	sc->sc_encap_delay = 10;

On 6/27/07, Vijay Sankar [EMAIL PROTECTED] wrote:
On Wednesday 27 June 2007 10:50, Tony Lambiris wrote:

You might be interested in some unofficial patches I had created when experiencing the same thing. I hadn't officially released these because of the awful DELAY() timeout hack taken from the original nfe code from DragonFly BSD. Most of the updates were taken from NetBSD.

Either way, what you would be interested in is the encap_delay stuff, specifically the part in nfe.c where it actually assigns the variable:

 case PCI_PRODUCT_NVIDIA_CK804_LAN1:
 case PCI_PRODUCT_NVIDIA_CK804_LAN2:
+	sc->sc_encap_delay = 10;
+	break;

You would obviously have to locate where your interface matches and assign it there. For me, my interface is a CK804. Not sure if it was LAN1 or LAN2, but I assigned the delay to both anyway.

These patches seemed to work good for me, didn't experience any timeouts, YMMV. Let me know if this works. These will apply cleanly against 4.1-RELEASE.

I downloaded your patches and would like to try it out. Thanks very much. Because I don't know what I am doing here, I need a bit more help. How can I find out whether my interface is also a CK804?
Re: nfe0 problem (obsd 4.1)
On 6/24/07, Vijay Sankar [EMAIL PROTECTED] wrote: On Sunday 24 June 2007 13:50, patrick keshishian wrote: Hi, I've been noticing some strange problems with the built-in nfe0 interface on my desktop. Actually I've seen it on two such computers, but the description below is for my current desktop PC. The PC is running `cvs up -dP -rOPENBSD_4_1' built. I'm including netstat, ifconfig output[1] and dmesg below[2]. I've noticed that once in a while the nfe0 interface will stop sending and receiving data. At this point I can not make it work again. The only solution I have is to reboot the box. I have installed a dc0 card in the box since. The problem seemed intermittent and not reliably reproducible. But I think I found a way to reproduce this problem on demand (at least for the time being). I have an ssh session to another box, on which I run '/usr/bin/nm somelib.so'. After a page or two of output the terminal hangs. At this point nfe0 becomes unresponsive. I switch to the dc0 interface and the terminal finishes the output. Running the nm command while using the dc0 interface doesn't cause any problems. I experienced similar problems last year and can empathize. The following items improved my situation somewhat: 1) BIOS upgrade 2) Removing dual boot (I had both OpenBSD and Windows 2003 on one machine. There were more errors if I did not power off after shutting down Windows 2003 and just did a restart from within Windows. If I did not unplug the machine after shutting down Windows, most of the time I saw watchdog timeouts but if I powered off the host, and then powered it back on, there were fewer errors) Both boxes I have run solely OpenBSD. One thing that I did notice was that after switching to the dc0 interface for a short while (5 min or so?), I could switch back to the nfe0 and it would start responding again. 
Basically:

	# /sbin/ifconfig dc0 delete
	# /sbin/route delete default
	# /sbin/ifconfig nfe0 inet IP netmask netmask up
	# /sbin/route add default gateway

Therefore, a reboot isn't the only way to fix the problem (reset the interface) as I had previously thought. I am not sure exactly what causes the interface to reset: idle time, no carrier, or something completely random?

Either way, thanks for all the replies!

I experimented with different combinations and different switches (10/100/1000, 10/100, and 10-Base-T). When all the hosts connected to a 10/100 switch were running at 100 Mb/s, changing nfe0 from autoselect to full-duplex using

	ifconfig nfe0 media 100baseTX mediaopt full-duplex

seemed to eliminate nfe0 hangs as well as timeouts completely. I am not sure whether this has any rational basis or is specific to some weird situation in my network, but that has been my experience.

Vijay

Interestingly enough, if I redirect the output of nm to a file and subsequently cat the file the nfe0 interface doesn't seem to exhibit the same problem.

I am not sure how to diagnose this problem further. I've enabled debug on the nfe0 interface (/sbin/ifconfig nfe0 debug), but don't see any output.

Any and all suggestions are welcome.

--patrick
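The switch-back sequence patrick describes can be written down as a small dry-run helper. This is only a sketch: `switch_iface` is a hypothetical name, the addresses are documentation placeholders (192.0.2.0/24), and the function prints the commands instead of running them so the values can be checked first.

```shell
#!/bin/sh
# Print, without executing, the four commands from the message above
# for moving the address and default route from one interface to another.
# Arguments: old-if new-if ip netmask gateway
switch_iface() {
    old=$1 new=$2 ip=$3 mask=$4 gw=$5
    echo "/sbin/ifconfig $old delete"
    echo "/sbin/route delete default"
    echo "/sbin/ifconfig $new inet $ip netmask $mask up"
    echo "/sbin/route add default $gw"
}

# Placeholder values only; substitute your own address, mask and gateway.
switch_iface dc0 nfe0 192.0.2.2 255.255.255.240 192.0.2.9
```

Once the printed commands look right they could be piped to `sh` as root. Keeping it a dry run by default seemed the safer choice, since a wrong gateway here cuts off a remote session.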
nfe0 problem (obsd 4.1)
Hi,

I've been noticing some strange problems with the built-in nfe0 interface on my desktop. Actually I've seen it on two such computers, but the description below is for my current desktop PC. The PC is running `cvs up -dP -rOPENBSD_4_1' built. I'm including netstat, ifconfig output[1] and dmesg below[2].

I've noticed that once in a while the nfe0 interface will stop sending and receiving data. At this point I can not make it work again. The only solution I have is to reboot the box. I have installed a dc0 card in the box since. The problem seemed intermittent and not reliably reproducible.

But I think I found a way to reproduce this problem on demand (at least for the time being). I have an ssh session to another box, on which I run '/usr/bin/nm somelib.so'. After a page or two of output the terminal hangs. At this point nfe0 becomes unresponsive. I switch to the dc0 interface and the terminal finishes the output. Running the nm command while using the dc0 interface doesn't cause any problems.

Interestingly enough, if I redirect the output of nm to a file and subsequently cat the file the nfe0 interface doesn't seem to exhibit the same problem.

I am not sure how to diagnose this problem further. I've enabled debug on the nfe0 interface (/sbin/ifconfig nfe0 debug), but don't see any output.

Any and all suggestions are welcome.

--patrick

[1] netstat and ifconfig outputs:

$ /usr/bin/netstat -in
Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
lo0     33224 <Link>                               1     0        1     0     0
lo0     33224 127/8       127.0.0.1                1     0        1     0     0
lo0     33224 ::1/128     ::1                      1     0        1     0     0
lo0     33224 fe80::%lo0/ fe80::1%lo0              1     0        1     0     0
dc0     1500  <Link>      00:02:e3:07:cc:df     1713     0      424     7     0
dc0     1500  fe80::%dc0/ fe80::202:e3ff:fe     1713     0      424     7     0
nfe0    1500  <Link>      00:16:e6:82:17:da     1520   613      878     0     0
nfe0    1500  fe80::%nfe0 fe80::216:e6ff:fe     1520   613      878     0     0
nfe0    1500  xx.yy.ww.zz xx.yy.ww.zz2          1520   613      878     0     0
pflog0  33224 <Link>                               0     0        0     0     0
enc0*   1536  <Link>                               0     0        0     0     0

$ /usr/bin/netstat -rnfinet
Routing tables

Internet:
Destination        Gateway            Flags    Refs      Use    Mtu  Interface
default            xx.yy.ww.zz9       UGS         0        0      -   nfe0
xx.yy.ww.zz8/28    link#2             UC          4        0      -   nfe0
xx.yy.ww.zz9       00:20:6f:03:a2:e5  UHLc        1        0      -   nfe0
xx.yy.ww.zz1       link#2             UHLc        0        2      -   nfe0
xx.yy.ww.zz3       00:01:02:c2:a1:b9  UHLc        1      159      -   nfe0
xx.yy.ww.zz0       00:20:e0:68:5d:c8  UHLc        1       11      - L nfe0
127/8              127.0.0.1          UGRS        0        0  33224   lo0
127.0.0.1          127.0.0.1          UH          1        0  33224   lo0
224/4              127.0.0.1          URS         0        0  33224   lo0

$ /sbin/ifconfig
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 33224
        groups: lo
        inet 127.0.0.1 netmask 0xff00
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5
dc0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:02:e3:07:cc:df
        media: Ethernet autoselect (none)
        status: no carrier
        inet6 fe80::202:e3ff:fe07:ccdf%dc0 prefixlen 64 scopeid 0x1
nfe0: flags=8847<UP,BROADCAST,DEBUG,RUNNING,SIMPLEX,MULTICAST> mtu 1500
        lladdr 00:16:e6:82:17:da
        groups: egress
        media: Ethernet autoselect (100baseTX full-duplex)
        status: active
        inet6 fe80::216:e6ff:fe82:17da%nfe0 prefixlen 64 scopeid 0x2
        inet xx.yy.ww.zz2 netmask 0xfff0 broadcast xx.yy.ww.zz3
pflog0: flags=141<UP,RUNNING,PROMISC> mtu 33224
enc0: flags=0<> mtu 1536

[2] dmesg

OpenBSD 4.1-stable (GENERIC) #0: Mon May 28 18:06:28 PDT 2007
    [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC
cpu0: AMD Athlon(tm) 64 Processor 3200+ (AuthenticAMD 686-class, 512KB L2 cache) 2.02 GHz
cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3
cpu0: AMD erratum 89 present, BIOS upgrade may be required
real mem = 536375296 (523804K)
avail mem = 481710080 (470420K)
using 4278 buffers containing 26943488 bytes (26312K) of memory
mainbus0 (root)
bios0 at mainbus0: AT/286+ BIOS, date 05/11/06, BIOS32 rev. 0 @ 0xfb5f0, SMBIOS rev. 2.3 @ 0xf0100 (43 entries)
bios0: Gigabyte Technology Co., Ltd. GA-K8N-SLi / GA-K8N-SLi-RH
apm0 at bios0: Power Management spec V1.2
apm0: AC on, battery charge unknown
apm0: flags 70102 dobusy 1 doidle 1
pcibios0 at bios0: rev 3.0 @ 0xf/0xdd64
pcibios0: PCI IRQ Routing Table rev 1.0 @ 0xfdc00/352 (20 entries)
pcibios0: PCI
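One thing the netstat -in output above does show is a large Ierrs count on nfe0 (613 input errors against 1520 input packets) while dc0 shows none. A rough way to watch that counter is to filter the link-level rows, as in the sketch below. `netstat_ierrs` is a hypothetical helper, and it assumes the column order shown in this message (Ipkts, Ierrs, Opkts, Oerrs, Colls as the last five fields).

```shell
#!/bin/sh
# Read `netstat -in` output on stdin and print "name Ipkts Ierrs"
# for each link-level row. Fields are counted from the end of the
# line because the Address column is empty for some interfaces.
netstat_ierrs() {
    awk '$3 ~ /Link/ && NF >= 8 { print $1, $(NF-4), $(NF-3) }'
}

# Example using two rows from the output in this message.
netstat_ierrs <<'EOF'
Name    Mtu   Network     Address              Ipkts Ierrs    Opkts Oerrs Colls
dc0     1500  <Link>      00:02:e3:07:cc:df     1713     0      424     7     0
nfe0    1500  <Link>      00:16:e6:82:17:da     1520   613      878     0     0
EOF
```

Running `netstat -in | netstat_ierrs` before and after the nm test would show whether the error counter moves when the interface hangs. That is a diagnostic aid only; it says nothing about the cause.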
Re: nfe0 problem (obsd 4.1)
On Sun, Jun 24, 2007 at 11:50:28AM -0700, patrick keshishian wrote:

Hi, I've been noticing some strange problems with the built-in nfe0 interface on my desktop. Actually I've seen it on two such computers, but the description below is for my current desktop PC. The PC is running `cvs up -dP -rOPENBSD_4_1' built. I'm including netstat, ifconfig output[1] and dmesg below[2].

I've noticed that once in a while the nfe0 interface will stop sending and receiving data. At this point I can not make it work again. The only solution I have is to reboot the box. I have installed a dc0 card in the box since. The problem seemed intermittent and not reliably reproducible.

But I think I found a way to reproduce this problem on demand (at least for the time being). I have an ssh session to another box, on which I run '/usr/bin/nm somelib.so'. After a page or two of output the terminal hangs. At this point nfe0 becomes unresponsive.

i used to see these hangs fairly often when doing a cvs up in /usr/src. for some reason i have not seen them for an age. i am unable to hang this box using your method, for example.

nfe(4) is not great. i think CAVEATS says it all. buyer beware ;(

jmc
Re: nfe0 problem (obsd 4.1)
On 6/24/07, patrick keshishian [EMAIL PROTECTED] wrote:

Hi, I've been noticing some strange problems with the built-in nfe0 interface on my desktop. Actually I've seen it on two such computers, but the description below is for my current desktop PC. The PC is running `cvs up -dP -rOPENBSD_4_1' built. I'm including netstat, ifconfig output[1] and dmesg below[2].

I've noticed that once in a while the nfe0 interface will stop sending and receiving data. At this point I can not make it work again. The only solution I have is to reboot the box. I have installed a dc0 card in the box since. The problem seemed intermittent and not reliably reproducible.

But I think I found a way to reproduce this problem on demand (at least for the time being). I have an ssh session to another box, on which I run '/usr/bin/nm somelib.so'. After a page or two of output the terminal hangs. At this point nfe0 becomes unresponsive.

This is a known problem, but probably unfixable due to lack of documentation from nvidia. See http://cvs.openbsd.org/cgi-bin/query-pr-wrapper?full=yes&numbers=5108
Re: nfe0 problem (obsd 4.1)
I have one of the older Sun Ultra 20 systems that also has an nfe(4) in it. It does the same thing every time I try to cvs or put a load on the interface. The only way around it was to install a second NIC. Like someone else mentioned before, until more documentation is available, it probably won't get any better. Until then it won't bother me to run a second NIC.

Regards,
Shane

patrick keshishian wrote:

Hi, I've been noticing some strange problems with the built-in nfe0 interface on my desktop. Actually I've seen it on two such computers, but the description below is for my current desktop PC. The PC is running `cvs up -dP -rOPENBSD_4_1' built. I'm including netstat, ifconfig output[1] and dmesg below[2].

I've noticed that once in a while the nfe0 interface will stop sending and receiving data. At this point I can not make it work again. The only solution I have is to reboot the box. I have installed a dc0 card in the box since. The problem seemed intermittent and not reliably reproducible.

But I think I found a way to reproduce this problem on demand (at least for the time being). I have an ssh session to another box, on which I run '/usr/bin/nm somelib.so'. After a page or two of output the terminal hangs. At this point nfe0 becomes unresponsive. I switch to the dc0 interface and the terminal finishes the output. Running the nm command while using the dc0 interface doesn't cause any problems.

Interestingly enough, if I redirect the output of nm to a file and subsequently cat the file the nfe0 interface doesn't seem to exhibit the same problem.

I am not sure how to diagnose this problem further. I've enabled debug on the nfe0 interface (/sbin/ifconfig nfe0 debug), but don't see any output.

Any and all suggestions are welcome.
--patrick [1] netstat and ifconfig outputs: $ /usr/bin/netstat -in NameMtu Network Address Ipkts IerrsOpkts Oerrs Colls lo0 33224 Link 1 0 1 0 0 lo0 33224 127/8 127.0.0.11 0 1 0 0 lo0 33224 ::1/128 ::1 1 0 1 0 0 lo0 33224 fe80::%lo0/ fe80::1%lo0 1 0 1 0 0 dc0 1500 Link 00:02:e3:07:cc:df 1713 0 424 7 0 dc0 1500 fe80::%dc0/ fe80::202:e3ff:fe 1713 0 424 7 0 nfe01500 Link 00:16:e6:82:17:da 1520 613 878 0 0 nfe01500 fe80::%nfe0 fe80::216:e6ff:fe 1520 613 878 0 0 nfe01500 xx.yy.ww.zz xx.yy.ww.zz2 1520 613 878 0 0 pflog0 33224 Link 0 0 0 0 0 enc0* 1536 Link 0 0 0 0 0 $ /usr/bin/netstat -rnfinet Routing tables Internet: DestinationGatewayFlagsRefs UseMtu Interface defaultxx.yy.ww.zz9 UGS 00 - nfe0 xx.yy.ww.zz8/28link#2 UC 40 - nfe0 xx.yy.ww.zz9 00:20:6f:03:a2:e5 UHLc10 - nfe0 xx.yy.ww.zz1 link#2 UHLc02 - nfe0 xx.yy.ww.zz3 00:01:02:c2:a1:b9 UHLc1 159 - nfe0 xx.yy.ww.zz0 00:20:e0:68:5d:c8 UHLc1 11 - L nfe0 127/8 127.0.0.1 UGRS00 33224 lo0 127.0.0.1 127.0.0.1 UH 10 33224 lo0 224/4 127.0.0.1 URS 00 33224 lo0 $ /sbin/ifconfig lo0: flags=8049UP,LOOPBACK,RUNNING,MULTICAST mtu 33224 groups: lo inet 127.0.0.1 netmask 0xff00 inet6 ::1 prefixlen 128 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5 dc0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST mtu 1500 lladdr 00:02:e3:07:cc:df media: Ethernet autoselect (none) status: no carrier inet6 fe80::202:e3ff:fe07:ccdf%dc0 prefixlen 64 scopeid 0x1 nfe0: flags=8847UP,BROADCAST,DEBUG,RUNNING,SIMPLEX,MULTICAST mtu 1500 lladdr 00:16:e6:82:17:da groups: egress media: Ethernet autoselect (100baseTX full-duplex) status: active inet6 fe80::216:e6ff:fe82:17da%nfe0 prefixlen 64 scopeid 0x2 inet xx.yy.ww.zz2 netmask 0xfff0 broadcast xx.yy.ww.zz3 pflog0: flags=141UP,RUNNING,PROMISC mtu 33224 enc0: flags=0 mtu 1536 [2] dmesg OpenBSD 4.1-stable (GENERIC) #0: Mon May 28 18:06:28 PDT 2007 [EMAIL PROTECTED]:/usr/src/sys/arch/i386/compile/GENERIC cpu0: AMD Athlon(tm) 64 Processor 3200+ (AuthenticAMD 686-class, 512KB L2 cach e) 2.02 GHz cpu0: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CF LUSH,MMX,FXSR,SSE,SSE2,SSE3 cpu0: AMD erratum 89 present, BIOS upgrade may be required real mem = 536375296 (523804K) avail mem = 481710080 (470420K) using 4278 buffers containing 26943488 bytes (26312K) of
Re: nfe0 problem (obsd 4.1)
On Sunday 24 June 2007 13:50, patrick keshishian wrote:

Hi, I've been noticing some strange problems with the built-in nfe0 interface on my desktop. Actually I've seen it on two such computers, but the description below is for my current desktop PC. The PC is running `cvs up -dP -rOPENBSD_4_1' built. I'm including netstat, ifconfig output[1] and dmesg below[2].

I've noticed that once in a while the nfe0 interface will stop sending and receiving data. At this point I can not make it work again. The only solution I have is to reboot the box. I have installed a dc0 card in the box since. The problem seemed intermittent and not reliably reproducible.

But I think I found a way to reproduce this problem on demand (at least for the time being). I have an ssh session to another box, on which I run '/usr/bin/nm somelib.so'. After a page or two of output the terminal hangs. At this point nfe0 becomes unresponsive. I switch to the dc0 interface and the terminal finishes the output. Running the nm command while using the dc0 interface doesn't cause any problems.

I experienced similar problems last year and can empathize. The following items improved my situation somewhat:

1) BIOS upgrade
2) Removing dual boot (I had both OpenBSD and Windows 2003 on one machine. There were more errors if I did not power off after shutting down Windows 2003 and just did a restart from within Windows. If I did not unplug the machine after shutting down Windows, most of the time I saw watchdog timeouts but if I powered off the host, and then powered it back on, there were fewer errors)

I experimented with different combinations and different switches (10/100/1000, 10/100, and 10-Base-T). When all the hosts connected to a 10/100 switch were running at 100 Mb/s, changing nfe0 from autoselect to full-duplex using

	ifconfig nfe0 media 100baseTX mediaopt full-duplex

seemed to eliminate nfe0 hangs as well as timeouts completely.
I am not sure whether this has any rational basis or is specific to some weird situation in my network, but that has been my experience. Vijay Interestingly enough, if I redirect the output of nm to a file and subsequently cat the file the nfe0 interface doesn't seem to exhibit the same problem. I am not sure how to diagnose this problem further. I've enabled debug on the nfe0 interface (/sbin/ifconfig nfe0 debug), but don't see any output. Any and all suggestions are welcome. --patrick [1] netstat and ifconfig outputs: $ /usr/bin/netstat -in NameMtu Network Address Ipkts IerrsOpkts Oerrs Colls lo0 33224 Link 1 01 0 0 lo0 33224 127/8 127.0.0.1 1 01 0 0 lo0 33224 ::1/128 ::1 1 01 0 0 lo0 33224 fe80::%lo0/ fe80::1%lo0 1 01 0 0 dc0 1500 Link 00:02:e3:07:cc:df 1713 0 424 7 0 dc0 1500 fe80::%dc0/ fe80::202:e3ff:fe 1713 0 424 7 0 nfe01500 Link 00:16:e6:82:17:da 1520 613 878 0 0 nfe01500 fe80::%nfe0 fe80::216:e6ff:fe 1520 613 878 0 0 nfe01500 xx.yy.ww.zz xx.yy.ww.zz2 1520 613 878 0 0 pflog0 33224 Link 0 00 0 0 enc0* 1536 Link 0 00 0 0 $ /usr/bin/netstat -rnfinet Routing tables Internet: DestinationGatewayFlagsRefs UseMtu Interface defaultxx.yy.ww.zz9 UGS 0 0 - nfe0 xx.yy.ww.zz8/28link#2 UC 4 0 - nfe0 xx.yy.ww.zz9 00:20:6f:03:a2:e5 UHLc 10 - nfe0 xx.yy.ww.zz1 link#2 UHLc 02 - nfe0 xx.yy.ww.zz3 00:01:02:c2:a1:b9 UHLc1 159 - nfe0 xx.yy.ww.zz0 00:20:e0:68:5d:c8 UHLc1 11 - L nfe0 127/8 127.0.0.1 UGRS00 33224 lo0 127.0.0.1 127.0.0.1 UH 10 33224 lo0 224/4 127.0.0.1 URS 00 33224 lo0 $ /sbin/ifconfig lo0: flags=8049UP,LOOPBACK,RUNNING,MULTICAST mtu 33224 groups: lo inet 127.0.0.1 netmask 0xff00 inet6 ::1 prefixlen 128 inet6 fe80::1%lo0 prefixlen 64 scopeid 0x5 dc0: flags=8843UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST mtu 1500 lladdr 00:02:e3:07:cc:df media: Ethernet autoselect (none) status: no carrier inet6 fe80::202:e3ff:fe07:ccdf%dc0 prefixlen 64 scopeid 0x1 nfe0: flags=8847UP,BROADCAST,DEBUG,RUNNING,SIMPLEX,MULTICAST mtu 1500 lladdr 00:16:e6:82:17:da groups: egress media: Ethernet 
autoselect (100baseTX full-duplex) status: active
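Vijay's workaround of pinning the media instead of leaving it on autoselect can be checked and prepared with a couple of small helpers. This is a sketch under assumptions: both function names are hypothetical, the check only looks for the "media: Ethernet autoselect" text that appears in the ifconfig output in this thread, and the command is printed rather than executed.

```shell
#!/bin/sh
# Succeed if the ifconfig text on stdin reports the media as autoselect.
uses_autoselect() {
    grep -q 'media: Ethernet autoselect'
}

# Print (do not run) the media command quoted in Vijay's message
# for the given interface.
pin_media_cmd() {
    echo "ifconfig $1 media 100baseTX mediaopt full-duplex"
}

# Example using the nfe0 media line from the ifconfig output above.
if printf '\tmedia: Ethernet autoselect (100baseTX full-duplex)\n' | uses_autoselect; then
    pin_media_cmd nfe0
fi
```

On a real machine the check could be fed with `ifconfig nfe0 | uses_autoselect`. As Vijay says himself, whether pinning the media helps may be specific to one network, so treat the printed command as something to try, not a known fix.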