Please download and run
http://e1000.sourceforge.net/fixeep-82573-dspd-v2.sh

on both interfaces.

I've recently enhanced the script to load a saved ethtool -e output file, and
I verified that your ethtool output requires the fix.  Let me know if the
script doesn't work.
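
For anyone who wants to inspect a saved ethtool -e dump by hand before running the script, here is a minimal sketch of a parser. It assumes the usual "0xOFFSET: byte byte ..." row layout that ethtool -e prints. The sample bytes are a made-up placeholder pattern, not Vasco's EEPROM contents, and the offset read at the end is purely illustrative, since this thread does not say which word the script checks.

```python
import re

def parse_eeprom_dump(text):
    """Map each EEPROM byte offset to its value from `ethtool -e` text."""
    data = {}
    for line in text.splitlines():
        m = re.match(r"\s*0x([0-9a-fA-F]+):\s+(.*)$", line)
        if not m:
            continue  # skip the "Offset / Values" header and separator rows
        base = int(m.group(1), 16)
        for i, tok in enumerate(m.group(2).split()):
            data[base + i] = int(tok, 16)
    return data

# Placeholder dump (hypothetical values, for illustration only).
SAMPLE = """\
Offset          Values
------          ------
0x0000:         00 11 22 33 44 55 66 77 88 99 aa bb cc dd ee ff
0x0010:         01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f 10
"""

eeprom = parse_eeprom_dump(SAMPLE)
# 0x10 is an arbitrary example offset, not the one the fixeep script uses.
print(hex(eeprom[0x10]))
```

To use it on a real dump, save the output of ethtool -e ethX to a file and pass the file contents to parse_eeprom_dump().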

Search Google for "fixeep" to find more related reports.


-----Original Message-----
From: Vasco Steinmetz [mailto:va...@kyberraum.net]
Sent: Wednesday, January 05, 2011 3:52 AM
To: Brandeburg, Jesse
Subject: Re: [E1000-devel] 2.6.36.1 e1000e - Detected Tx Unit Hang

Hi Jesse,

No, I didn't run anything to check or fix the EEPROMs. How do I do this?

The ethtool -e outputs are attached.


Cheers,
Vasco


On Tuesday, 04 January 2011, 03:11:10, you wrote:
> Hi Vasco, please post your ethtool -e ethX output for both interfaces.
> Have you run the fixeep script or something similar to check/fix your
> EEPROM(s)?
>
> Jesse
>
> -----Original Message-----
> From: Vasco Steinmetz [mailto:va...@kyberraum.net]
> Sent: Thursday, December 23, 2010 2:06 PM
> To: e1000-devel@lists.sourceforge.net
> Subject: [E1000-devel] 2.6.36.1 e1000e - Detected Tx Unit Hang
>
> Hello all,
>
> As I reported exactly one year ago, I am still experiencing Tx Unit
> hangs on two 82573 NICs on a TYAN Toledo i3210W S5211 (4GB RAM) running
> 64-bit Gentoo.
>
> The box has been running since May 2008 and has shown Tx Unit hangs
> continuously since then. The vanilla kernel has been updated quarterly,
> from 2.6.29.x at install to 2.6.36.1 currently.
>
> I also replaced the unmanaged switch with a managed one, but without any
> improvement.
>
> The box was freshly rebooted.
> Sometimes eth0 hangs, sometimes eth1 does.
>
>
> dmesg excerpt:
>
> e1000e 0000:0f:00.0: eth0: Detected Hardware Unit Hang:
>   TDH                  <1>
>   TDT                  <6f>
>   next_to_use          <6f>
>   next_to_clean        <ff>
> buffer_info[next_to_clean]:
>   time_stamp           <10001ae38>
>   next_to_watch        <1>
>   jiffies              <10001b747>
>   next_to_watch.status <0>
> MAC Status             <80080793>
> PHY Status             <796d>
> PHY 1000BASE-T Status  <3c00>
> PHY Extended Status    <3000>
> PCI Status             <10>
> e1000e 0000:0f:00.0: eth0: Detected Hardware Unit Hang:
>   TDH                  <1>
>   TDT                  <6f>
>   next_to_use          <6f>
>   next_to_clean        <ff>
> buffer_info[next_to_clean]:
>   time_stamp           <10001ae38>
>   next_to_watch        <1>
>   jiffies              <10001bb2f>
>   next_to_watch.status <0>
> MAC Status             <80080793>
> PHY Status             <796d>
> PHY 1000BASE-T Status  <3c00>
> PHY Extended Status    <3000>
> PCI Status             <10>
> ------------[ cut here ]------------
> WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0x222/0x230()
> Hardware name: empty
> NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
> Modules linked in: md5 xts gf128mul tun nfsd lockd nfs_acl auth_rpcgss
> sunrpc af_packet bonding ipv6 coretemp lm85 hwmon_vid hwmon ext4 mbcache
> jbd2 crc16 ansi_cprng krng eseqiv rng aes_x86_64 aes_generic cbc cryptomgr
> aead dm_crypt cryptd crypto_hash crypto_wq crypto_blkcipher crypto_algapi
> fan
> cpufreq_ondemand acpi_cpufreq freq_table mperf msr cpuid usbhid hid
> 8250_pnp ehci_hcd uhci_hcd e1000e usbcore 8250 rtc_cmos serial_core
> rtc_core i2c_i801 nls_base rtc_lib i2c_core psmouse evdev pcspkr button sg
> container thermal processor unix [last unloaded: microcode]
> Pid: 0, comm: swapper Not tainted 2.6.36.1 #2
> Call Trace:
>  <IRQ>  [<ffffffff8103b1ba>] warn_slowpath_common+0x7a/0xb0
>  [<ffffffff8103b291>] warn_slowpath_fmt+0x41/0x50
>  [<ffffffff812f4052>] dev_watchdog+0x222/0x230
>  [<ffffffff81046b64>] run_timer_softirq+0x124/0x230
>  [<ffffffff812f3e30>] ? dev_watchdog+0x0/0x230
>  [<ffffffff81060f59>] ? clockevents_program_event+0x59/0xa0
>  [<ffffffff81040dd7>] __do_softirq+0xa7/0x130
>  [<ffffffff810587e3>] ? hrtimer_interrupt+0x133/0x240
>  [<ffffffff8100314c>] call_softirq+0x1c/0x30
>  [<ffffffff81004f4d>] do_softirq+0x4d/0x80
>  [<ffffffff81040aed>] irq_exit+0x7d/0x90
>  [<ffffffff8101c33b>] smp_apic_timer_interrupt+0x6b/0xa0
>  [<ffffffff81002c13>] apic_timer_interrupt+0x13/0x20
>  <EOI>  [<ffffffff8100a3f2>] ? mwait_idle+0x72/0x90
>  [<ffffffff810013b0>] ? enter_idle+0x20/0x30
>  [<ffffffff81001419>] cpu_idle+0x59/0xb0
>  [<ffffffff813437d8>] rest_init+0x68/0x70
>  [<ffffffff814aecff>] start_kernel+0x34b/0x356
>  [<ffffffff814ae31c>] x86_64_start_reservations+0x12c/0x130
>  [<ffffffff814ae407>] x86_64_start_kernel+0xe7/0xee
> ---[ end trace 629a53efa268a4ba ]---
> e1000e 0000:0f:00.0: eth0: Reset adapter
> bonding: bond0: link status definitely down for interface eth0, disabling it
> e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
> bonding: bond0: link status definitely up for interface eth0.
>
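
As a reading aid for the hang report above, here is a small worked calculation. TDH is the transmit descriptor head (advanced by the hardware) and TDT is the tail (advanced by the driver), so the gap between them is the number of descriptors queued but not yet fetched. The ring size of 256 is an assumption; the dump does not state it, although next_to_clean <ff> is consistent with at least 256 entries. The function names are hypothetical helpers, not driver code.

```python
def pending_descriptors(tdh, tdt, ring_size):
    """Descriptors between head and tail, allowing for ring wraparound."""
    return (tdt - tdh) % ring_size

def jiffies_since(time_stamp, jiffies):
    """Ticks elapsed since the buffer at next_to_clean was time-stamped."""
    return jiffies - time_stamp

RING_SIZE = 256  # assumption: not shown anywhere in the dump

# Values copied from the two "Detected Hardware Unit Hang" reports.
backlog = pending_descriptors(0x1, 0x6f, RING_SIZE)
first_age = jiffies_since(0x10001ae38, 0x10001b747)
second_age = jiffies_since(0x10001ae38, 0x10001bb2f)

print(backlog)      # 110
print(first_age)    # 2319
print(second_age)   # 3319
```

The head stays at 1 in both reports while the age of the oldest buffer grows by 1000 jiffies, which matches a transmit unit that has stopped making progress. Converting jiffies to seconds needs the kernel's HZ setting, which is not included in this report.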
> continuity ~ # cat /proc/interrupts
>            CPU0       CPU1
>   0:         45          1   IO-APIC-edge      timer
>   1:          1          1   IO-APIC-edge      i8042
>   4:       3269         33   IO-APIC-edge      serial
>   8:         61         64   IO-APIC-edge      rtc0
>   9:          0          0   IO-APIC-fasteoi   acpi
>  12:          2          3   IO-APIC-edge      i8042
>  16:      85111     324026   IO-APIC-fasteoi   pata_pdc2027x, uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb8, eth1
>  17:     482156       3274   IO-APIC-fasteoi   ahci, uhci_hcd:usb2, uhci_hcd:usb4, eth0
>  18:        183        120   IO-APIC-fasteoi   uhci_hcd:usb5, ehci_hcd:usb7
>  19:          0          0   IO-APIC-fasteoi   uhci_hcd:usb6
> NMI:          0          0   Non-maskable interrupts
> LOC:     200194     152902   Local timer interrupts
> SPU:          0          0   Spurious interrupts
> PMI:          0          0   Performance monitoring interrupts
> PND:          0          0   Performance pending work
> RES:      32585      37266   Rescheduling interrupts
> CAL:         57         75   Function call interrupts
> TLB:      18358      18379   TLB shootdowns
> TRM:          0          0   Thermal event interrupts
> THR:          0          0   Threshold APIC interrupts
> MCE:          0          0   Machine check exceptions
> MCP:          6          6   Machine check polls
> ERR:          0
> MIS:          0
>
>
> continuity ~ # modinfo e1000|grep ^version
> version:        7.3.21-k6-NAPI
>
>
> continuity ~ # ethtool eth0
> Settings for eth0:
>         Supported ports: [ TP ]
>         Supported link modes:   10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Supports auto-negotiation: Yes
>         Advertised link modes:  10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Advertised auto-negotiation: Yes
>         Speed: 1000Mb/s
>         Duplex: Full
>         Port: Twisted Pair
>         PHYAD: 1
>         Transceiver: internal
>         Auto-negotiation: on
>         Supports Wake-on: pumbag
>         Wake-on: g
>         Current message level: 0x00000001 (1)
>         Link detected: yes
> continuity ~ # ethtool eth1
> Settings for eth1:
>         Supported ports: [ TP ]
>         Supported link modes:   10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Supports auto-negotiation: Yes
>         Advertised link modes:  10baseT/Half 10baseT/Full
>                                 100baseT/Half 100baseT/Full
>                                 1000baseT/Full
>         Advertised auto-negotiation: Yes
>         Speed: 1000Mb/s
>         Duplex: Full
>         Port: Twisted Pair
>         PHYAD: 1
>         Transceiver: internal
>         Auto-negotiation: on
>         Supports Wake-on: pumbag
>         Wake-on: g
>         Current message level: 0x00000001 (1)
>         Link detected: yes
>
>
> continuity ~ # ethtool -i eth0
> driver: e1000e
> version: 1.2.7-k2
> firmware-version: 1.0-2
> bus-info: 0000:0f:00.0
> continuity ~ # ethtool -i eth1
> driver: e1000e
> version: 1.2.7-k2
> firmware-version: 1.0-2
> bus-info: 0000:0d:00.0
>
>
>
> continuity ~ # ethtool -k eth0
> Offload parameters for eth0:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp segmentation offload: on
> udp fragmentation offload: off
> generic segmentation offload: on
> continuity ~ # ethtool -k eth1
> Offload parameters for eth1:
> rx-checksumming: on
> tx-checksumming: on
> scatter-gather: on
> tcp segmentation offload: on
> udp fragmentation offload: off
> generic segmentation offload: on
>
>
>
> lspci -vv
>
> 0d:00.0 Ethernet controller: Intel Corporation 82573V Gigabit Ethernet
> Controller (Copper) (rev 03)
>         Subsystem: Tyan Computer Device 5211
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR+ FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 16
>         Region 0: Memory at f4080000 (32-bit, non-prefetchable) [size=128K]
>         Region 1: Memory at f4000000 (32-bit, non-prefetchable) [size=512K]
>         Region 2: I/O ports at 3000 [size=32]
>         Capabilities: [c8] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [e0] Express (v1) Endpoint, MSI 00
>                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
> <512ns, L1 <64us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
> Unsupported+
>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                         MaxPayload 128 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+
> TransPend-
>                 LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM unknown,
> Latency L0 <128ns, L1 <64us
>                         ClockPM- Surprise- LLActRep- BwNot-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain-
> CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
>         Capabilities: [100] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr+ BadTLP- BadDLLP+ Rollover- Timeout-
> NonFatalErr-
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                 AERCap: First Error Pointer: 14, GenCap- CGenEn- ChkCap-
> ChkEn-
>         Capabilities: [140] Device Serial Number 00-e0-81-ff-ff-b1-83-2e
>         Kernel driver in use: e1000e
>         Kernel modules: e1000e
>
>
> 0f:00.0 Ethernet controller: Intel Corporation 82573L Gigabit Ethernet
> Controller
>         Subsystem: Tyan Computer Device 5211
>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr-
> Stepping- SERR+ FastB2B- DisINTx-
>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> <TAbort- <MAbort- >SERR- <PERR- INTx-
>         Latency: 0, Cache Line Size: 64 bytes
>         Interrupt: pin A routed to IRQ 17
>         Region 0: Memory at f4280000 (32-bit, non-prefetchable) [size=128K]
>         Region 1: Memory at f4200000 (32-bit, non-prefetchable) [size=512K]
>         Region 2: I/O ports at 4000 [size=32]
>         Capabilities: [c8] Power Management version 2
>                 Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
> PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                 Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>         Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
>                 Address: 0000000000000000  Data: 0000
>         Capabilities: [e0] Express (v1) Endpoint, MSI 00
>                 DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
> <512ns, L1 <64us
>                         ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>                 DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
> Unsupported+
>                         RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                         MaxPayload 128 bytes, MaxReadReq 512 bytes
>                 DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+
> TransPend-
>                 LnkCap: Port #0, Speed 2.5GT/s, Width x1, ASPM unknown,
> Latency L0 <128ns, L1 <64us
>                         ClockPM+ Surprise- LLActRep- BwNot-
>                 LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain-
> CommClk+
>                         ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                 LnkSta: Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+
> DLActive- BWMgmt- ABWMgmt-
>         Capabilities: [100] Advanced Error Reporting
>                 UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>                 UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                 UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt-
> RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                 CESta:  RxErr+ BadTLP- BadDLLP+ Rollover- Timeout-
> NonFatalErr-
>                 CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
> NonFatalErr-
>                 AERCap: First Error Pointer: 14, GenCap- CGenEn- ChkCap-
> ChkEn-
>         Capabilities: [140] Device Serial Number 00-e0-81-ff-ff-b1-83-2f
>         Kernel driver in use: e1000e
>         Kernel modules: e1000e
>
>
>
> continuity ~ # ifconfig
> bond0     Link encap:Ethernet  HWaddr 00:e0:81:b1:83:2f
>           inet addr:192.168.0.2  Bcast:192.168.0.255  Mask:255.255.255.0
>           inet6 addr: fe80::2e0:81ff:feb1:832f/64 Scope:Link
>           UP BROADCAST RUNNING MASTER MULTICAST  MTU:1500  Metric:1
>           RX packets:730984 errors:0 dropped:8386 overruns:0 frame:0
>           TX packets:1134308 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:0
>           RX bytes:172475094 (164.4 MiB)  TX bytes:1466576683 (1.3 GiB)
>
> eth0      Link encap:Ethernet  HWaddr 00:e0:81:b1:83:2f
>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>           RX packets:672162 errors:0 dropped:8386 overruns:0 frame:0
>           TX packets:4102 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:152695089 (145.6 MiB)  TX bytes:1622744 (1.5 MiB)
>           Interrupt:17 Memory:f4280000-f42a0000
>
> eth1      Link encap:Ethernet  HWaddr 00:e0:81:b1:83:2f
>           UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
>           RX packets:58822 errors:0 dropped:0 overruns:0 frame:0
>           TX packets:1130206 errors:0 dropped:0 overruns:0 carrier:0
>           collisions:0 txqueuelen:1000
>           RX bytes:19780005 (18.8 MiB)  TX bytes:1464953939 (1.3 GiB)
>           Interrupt:16 Memory:f4080000-f40a0000
>
>
>
>
> Kernel config and more information are available upon request.
>
> Bonding is properly configured and trunking is enabled on the switch.
> The hangs also happen without bonding.
>
>
> Any help appreciated.
>
>
> Cheers,
> Vasco
>
> _______________________________________________
> E1000-devel mailing list
> E1000-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/e1000-devel
> To learn more about Intel® Ethernet, visit
> http://communities.intel.com/community/wired

