Hi,
I have lots of transmit timeouts with an Intel E1000 card during large
TCP transmissions (remotely viewing a 3000x2000 jpeg image using XV is
an excellent way to trigger it). This is what I get in linux-2.6.8.1:
Jan 10 15:24:41 zurix kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 10 15:24:41 zurix kernel: e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex
Jan 10 15:24:46 zurix kernel: nfs: server abra2 not responding, still trying
Jan 10 15:24:46 zurix kernel: nfs: server abra2 OK
And this is with linux-2.6.15:
Jan 10 06:53:27 zurix kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Jan 10 06:53:27 zurix kernel: TDH <b0>
Jan 10 06:53:27 zurix kernel: TDT <b0>
Jan 10 06:53:27 zurix kernel: next_to_use <b0>
Jan 10 06:53:27 zurix kernel: next_to_clean <c3>
Jan 10 06:53:27 zurix kernel: buffer_info[next_to_clean]
Jan 10 06:53:27 zurix kernel: dma <e938a5e>
Jan 10 06:53:27 zurix kernel: time_stamp <872de93>
Jan 10 06:53:27 zurix kernel: next_to_watch <c3>
Jan 10 06:53:27 zurix kernel: jiffies <872e086>
Jan 10 06:53:27 zurix kernel: next_to_watch.status <0>
Jan 10 06:53:29 zurix kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Jan 10 06:53:29 zurix kernel: TDH <b0>
Jan 10 06:53:29 zurix kernel: TDT <b0>
Jan 10 06:53:29 zurix kernel: next_to_use <b0>
Jan 10 06:53:29 zurix kernel: next_to_clean <c3>
Jan 10 06:53:29 zurix kernel: buffer_info[next_to_clean]
Jan 10 06:53:29 zurix kernel: dma <e938a5e>
Jan 10 06:53:29 zurix kernel: time_stamp <872de93>
Jan 10 06:53:29 zurix kernel: next_to_watch <c3>
Jan 10 06:53:29 zurix kernel: jiffies <872e27a>
Jan 10 06:53:29 zurix kernel: next_to_watch.status <0>
Jan 10 06:53:31 zurix kernel: e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang
Jan 10 06:53:31 zurix kernel: TDH <b0>
Jan 10 06:53:31 zurix kernel: TDT <b0>
Jan 10 06:53:31 zurix kernel: next_to_use <b0>
Jan 10 06:53:31 zurix kernel: next_to_clean <c3>
Jan 10 06:53:31 zurix kernel: buffer_info[next_to_clean]
Jan 10 06:53:31 zurix kernel: dma <e938a5e>
Jan 10 06:53:31 zurix kernel: time_stamp <872de93>
Jan 10 06:53:31 zurix kernel: next_to_watch <c3>
Jan 10 06:53:31 zurix kernel: jiffies <872e46e>
Jan 10 06:53:31 zurix kernel: next_to_watch.status <0>
Jan 10 06:53:32 zurix kernel: nfs: server abra2 not responding, still trying
Jan 10 06:53:33 zurix kernel: NETDEV WATCHDOG: eth0: transmit timed out
Jan 10 06:53:36 zurix kernel: e1000: eth0: e1000_watchdog_task: NIC Link is Up 1000 Mbps Full Duplex
Jan 10 06:53:37 zurix kernel: nfs: server abra2 OK
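As a rough cross-check of the 2.6.15 dumps above, the jiffies values can
be compared with time_stamp and with the syslog times. The sketch below
is only an illustration: the helper names are made up, the tick rate is
estimated from syslog's one-second resolution, and the ring size of 256
is an assumption that is not shown anywhere in the logs.

```python
# Values copied from the three "Detected Tx Unit Hang" dumps (hex in the log).
DUMPS = [
    {"wall": "06:53:27", "jiffies": 0x872e086},
    {"wall": "06:53:29", "jiffies": 0x872e27a},
    {"wall": "06:53:31", "jiffies": 0x872e46e},
]
TIME_STAMP = 0x872de93    # identical in all three dumps
NEXT_TO_USE = 0xb0
NEXT_TO_CLEAN = 0xc3
ASSUMED_RING_SIZE = 256   # assumption, not taken from the log


def wall_seconds(hms):
    """Convert an HH:MM:SS syslog time to seconds since midnight."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s


def estimate_hz(dumps):
    """Estimate ticks per second from the first and last dump."""
    first, last = dumps[0], dumps[-1]
    ticks = last["jiffies"] - first["jiffies"]
    return ticks / (wall_seconds(last["wall"]) - wall_seconds(first["wall"]))


def pending_descriptors(next_to_use, next_to_clean, ring_size):
    """Descriptors queued but not yet cleaned, allowing for ring wrap-around."""
    return (next_to_use - next_to_clean) % ring_size


if __name__ == "__main__":
    hz = estimate_hz(DUMPS)
    print(f"estimated HZ: {hz:.0f}")
    for dump in DUMPS:
        age = dump["jiffies"] - TIME_STAMP
        print(f"{dump['wall']}: buffer age {age} jiffies = {age / hz:.1f} s")
    print("pending:",
          pending_descriptors(NEXT_TO_USE, NEXT_TO_CLEAN, ASSUMED_RING_SIZE))
```

With these numbers the estimate comes out at 250 ticks per second, and
the buffer at next_to_clean has been waiting roughly 2, 4 and 6 seconds
in the three dumps: time_stamp never changes, so the same descriptor
stays stuck. Under the 256-entry assumption, 237 descriptors are still
waiting to be cleaned even though TDH equals TDT.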
The system is an AMD Athlon XP 2000+ running at 1.666 GHz with a VIA
KT400 chipset (Asrock K7VT4APro).
Here's the relevant output from lspci:
0000:00:0b.0 Ethernet controller: Intel Corporation 82541PI Gigabit Ethernet Controller (rev 05)
        Subsystem: Intel Corporation: Unknown device 1376
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV+ VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 32 (63750ns min), Cache Line Size: 0x08 (32 bytes)
        Interrupt: pin A routed to IRQ 19
        Region 0: Memory at dffc0000 (32-bit, non-prefetchable) [size=128K]
        Region 1: Memory at dffa0000 (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at d400 [size=64]
        Expansion ROM at fffe0000 [disabled] [size=128K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [e4] PCI-X non-bridge device.
                Command: DPERE- ERO+ RBC=0 OST=0
                Status: Bus=0 Dev=0 Func=0 64bit- 133MHz- SCD- USC-, DC=simple, DMMRBC=2, DMOST=0, DMCRS=0, RSCEM-
00: 86 80 7c 10 17 00 30 02 05 00 00 02 08 20 00 00
10: 00 00 fc df 00 00 fa df 01 d4 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 76 13
30: 00 00 fe ff dc 00 00 00 00 00 00 00 0c 01 ff 00
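For anyone who wants to check the hex dump against the decoded lspci
text, the first 16 bytes follow the standard PCI configuration header
layout (little-endian). The snippet below is a minimal sketch of that
decoding; the function and dictionary key names are made up for
illustration and are not part of any tool.

```python
import struct

# First 16 bytes of PCI configuration space, copied from the dump (offset 00).
HEADER_HEX = "86 80 7c 10 17 00 30 02 05 00 00 02 08 20 00 00"

# Command register bit names in the order lspci prints them (bit 0 first).
COMMAND_BITS = ["I/O", "Mem", "BusMaster", "SpecCycle", "MemWINV",
                "VGASnoop", "ParErr", "Stepping", "SERR", "FastB2B"]


def decode_header(hex_bytes):
    """Decode the first 16 bytes of a type 0 PCI configuration header."""
    raw = bytes.fromhex(hex_bytes)
    (vendor, device, command, status, revision, prog_if, subclass,
     base_class, cache_line, latency, header_type, bist) = struct.unpack(
        "<HHHHBBBBBBBB", raw)
    return {
        "vendor": vendor,
        "device": device,
        "command": command,
        "status": status,
        "revision": revision,
        "class": (base_class << 8) | subclass,
        # Names of the command bits that are set ("+" in lspci output).
        "command_flags": [name for bit, name in enumerate(COMMAND_BITS)
                          if command & (1 << bit)],
        # The cache line size register counts 32-bit words.
        "cache_line_bytes": cache_line * 4,
        "latency": latency,
    }


if __name__ == "__main__":
    header = decode_header(HEADER_HEX)
    print(f"{header['vendor']:04x}:{header['device']:04x} "
          f"rev {header['revision']:02x} class {header['class']:04x}")
    print("command:", " ".join(header["command_flags"]))
    print("latency:", header["latency"],
          "cache line:", header["cache_line_bytes"], "bytes")
```

The decoded values agree with the lspci text: the same four Control
flags are set, the latency timer is 32, and the cache line size is
32 bytes.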
Loaded modules (with 2.6.8.1): nfsd exportfs sd_mod sg lp sr_mod
autofs4 nfs lockd sunrpc ide_cd cdrom floppy parport_pc parport
8250_pnp 8250 serial_core snd_via82xx snd_ac97_codec snd_pcm_oss
snd_mixer_oss snd_pcm snd_timer snd_page_alloc gameport snd_mpu401_uart
snd_rawmidi snd_seq_device snd soundcore joydev evdev ehci_hcd usbhid
uhci_hcd usbcore sata_via libata e1000 reiserfs mga via_agp agpgart.
So far I have replaced the NIC, the motherboard, the power supply, RAM,
network cable, and gigE switch, but to no avail. I've tried three
different kernels (2.6.8.1, 2.6.11-ac7, and 2.6.15) but the problem
remains. I've been stress testing the system by continuously compiling
kernels (over NFS), but after 288 runs there hasn't been a single error
so I guess the CPU and RAM are OK. Transmit timeouts are less frequent
with linux-2.6.8.1, so for the moment I keep running that version.
We have about 15 other machines using the Intel E1000, but I haven't
seen this kind of problem on any of them. I have run
out of ideas, so I hope somebody knows how to solve this. If you need
more information, just let me know.
Erik
--
+-- Erik Mouw -- www.harddisk-recovery.com -- +31 70 370 12 90 --
| Lab address: Delftechpark 26, 2628 XH, Delft, The Netherlands