Re: [E1000-devel] 2.6.36.1 e1000e - Detected Tx Unit Hang

2011-01-05 Thread Brandeburg, Jesse
Please download and run
http://e1000.sourceforge.net/fixeep-82573-dspd-v2.sh

on both interfaces.

I've recently enhanced the script to load an ethtool file and verified that 
your ethtool output requires the fix.  Let me know if the script doesn't work.

Search Google for "fixeep" to find more related reports.
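
For readers who want to look at an `ethtool -e` dump themselves before running the script, here is a minimal Python sketch that reads one 16-bit word out of a saved dump. It is not the fixeep script, and this thread does not say which EEPROM word or bit the script checks, so the offset used below is an arbitrary placeholder. The dump layout (a header followed by `0xNNNN:` rows of hex bytes) and the little-endian word order are assumptions, and the sample dump is synthetic, not taken from the attached files.

```python
import re

# One row of an `ethtool -e` dump: "0x0010:  10 11 12 ..."
ROW = re.compile(r"\s*0x([0-9a-fA-F]+):\s+(.*)$")


def parse_eeprom_dump(text: str) -> bytes:
    """Turn the text of an `ethtool -e ethX` dump into raw bytes."""
    out = bytearray()
    for line in text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # skip the header and separator lines
        base = int(m.group(1), 16)
        values = bytes(int(tok, 16) for tok in m.group(2).split())
        end = base + len(values)
        if len(out) < end:
            out.extend(b"\x00" * (end - len(out)))
        out[base:end] = values
    return bytes(out)


def word_at(eeprom: bytes, byte_offset: int) -> int:
    """Return the 16-bit word at byte_offset (assumed little-endian)."""
    return int.from_bytes(eeprom[byte_offset:byte_offset + 2], "little")


# Synthetic dump for illustration only.
SAMPLE = """\
Offset          Values
------          ------
0x0000:         00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
0x0010:         10 11 12 13 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f
"""

eeprom = parse_eeprom_dump(SAMPLE)
PLACEHOLDER_OFFSET = 0x10  # hypothetical; not the offset the script uses
print(hex(word_at(eeprom, PLACEHOLDER_OFFSET)))  # 0x1110
```

Saving the dump first (for example `ethtool -e eth0 > eth0.txt`) and parsing the file keeps a copy of the original EEPROM contents in case anything needs to be compared later.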


-----Original Message-----
From: Vasco Steinmetz [mailto:va...@kyberraum.net]
Sent: Wednesday, January 05, 2011 3:52 AM
To: Brandeburg, Jesse
Subject: Re: [E1000-devel] 2.6.36.1 e1000e - Detected Tx Unit Hang

Hi Jesse,

No, I didn't run anything to check or fix the EEPROMs. How do I do this?

The ethtool -e outputs are attached.


Cheers,
Vasco


On Tuesday, 04 January 2011, 03:11:10 you wrote:
 Hi Vasco, please post your ethtool -e ethX output for both interfaces.
 Have you run the fixeep script or something similar to check/fix your
 EEPROM(s)?

 Jesse

 -----Original Message-----
 From: Vasco Steinmetz [mailto:va...@kyberraum.net]
 Sent: Thursday, December 23, 2010 2:06 PM
 To: e1000-devel@lists.sourceforge.net
 Subject: [E1000-devel] 2.6.36.1 e1000e - Detected Tx Unit Hang

 Hello all,

 As I already reported exactly one year ago, I am still experiencing Tx Unit
 hangs on two 82573 NICs on a TYAN Toledo i3210W S5211 (4GB RAM) running
 Gentoo 64 Bit.

 The box has been running since May 2008 and has shown continuous Tx Unit
 hangs since then. The vanilla kernel has been updated quarterly, from
 2.6.29.x upon install up to 2.6.36.1 currently.

 I also replaced the unmanaged switch with a managed one but without any
 improvement.

 The box was freshly rebooted.
 Sometimes eth0 hangs, sometimes eth1 does.


 dmesg excerpt:

 e1000e :0f:00.0: eth0: Detected Hardware Unit Hang:
   TDH                  1
   TDT                  6f
   next_to_use          6f
   next_to_clean        ff
 buffer_info[next_to_clean]:
   time_stamp           10001ae38
   next_to_watch        1
   jiffies              10001b747
   next_to_watch.status 0
 MAC Status             80080793
 PHY Status             796d
 PHY 1000BASE-T Status  3c00
 PHY Extended Status    3000
 PCI Status             10
 e1000e :0f:00.0: eth0: Detected Hardware Unit Hang:
   TDH                  1
   TDT                  6f
   next_to_use          6f
   next_to_clean        ff
 buffer_info[next_to_clean]:
   time_stamp           10001ae38
   next_to_watch        1
   jiffies              10001bb2f
   next_to_watch.status 0
 MAC Status             80080793
 PHY Status             796d
 PHY 1000BASE-T Status  3c00
 PHY Extended Status    3000
 PCI Status             10
 [ cut here ]
 WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0x222/0x230()
 Hardware name: empty
 NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
 Modules linked in: md5 xts gf128mul tun nfsd lockd nfs_acl auth_rpcgss
 sunrpc af_packet bonding ipv6 coretemp lm85 hwmon_vid hwmon ext4 mbcache
 jbd2 crc16 ansi_cprng krng eseqiv rng aes_x86_64 aes_generic cbc cryptomgr
 aead dm_crypt cryptd crypto_hash crypto_wq crypto_blkcipher crypto_algapi
 fan
 cpufreq_ondemand acpi_cpufreq freq_table mperf msr cpuid usbhid hid
 8250_pnp ehci_hcd uhci_hcd e1000e usbcore 8250 rtc_cmos serial_core
 rtc_core i2c_i801 nls_base rtc_lib i2c_core psmouse evdev pcspkr button sg
 container thermal processor unix [last unloaded: microcode]
 Pid: 0, comm: swapper Not tainted 2.6.36.1 #2
 Call Trace:
  IRQ  [8103b1ba] warn_slowpath_common+0x7a/0xb0
  [8103b291] warn_slowpath_fmt+0x41/0x50
  [812f4052] dev_watchdog+0x222/0x230
  [81046b64] run_timer_softirq+0x124/0x230
  [812f3e30] ? dev_watchdog+0x0/0x230
  [81060f59] ? clockevents_program_event+0x59/0xa0
  [81040dd7] __do_softirq+0xa7/0x130
  [810587e3] ? hrtimer_interrupt+0x133/0x240
  [8100314c] call_softirq+0x1c/0x30
  [81004f4d] do_softirq+0x4d/0x80
  [81040aed] irq_exit+0x7d/0x90
  [8101c33b] smp_apic_timer_interrupt+0x6b/0xa0
  [81002c13] apic_timer_interrupt+0x13/0x20
  EOI  [8100a3f2] ? mwait_idle+0x72/0x90
  [810013b0] ? enter_idle+0x20/0x30
  [81001419] cpu_idle+0x59/0xb0
  [813437d8] rest_init+0x68/0x70
  [814aecff] start_kernel+0x34b/0x356
  [814ae31c] x86_64_start_reservations+0x12c/0x130
  [814ae407] x86_64_start_kernel+0xe7/0xee
 ---[ end trace 629a53efa268a4ba ]---
 e1000e :0f:00.0: eth0: Reset adapter
 bonding: bond0: link status definitely down for interface eth0, disabling it
 e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
 bonding: bond0: link status definitely up for interface eth0.

 continuity ~ # cat /proc/interrupts
CPU0   CPU1
   0: 45  1   IO-APIC-edge  timer
   1:  1  1   IO-APIC-edge  i8042
   4:   3269 33   IO-APIC-edge  serial
   8: 61 64   IO-APIC-edge  rtc0
   9:  0  0   IO-APIC-fasteoi   acpi
  12:  2  3

Re: [E1000-devel] 2.6.36.1 e1000e - Detected Tx Unit Hang

2011-01-03 Thread Brandeburg, Jesse
Hi Vasco, please post your ethtool -e ethX output for both interfaces.
Have you run the fixeep script or something similar to check/fix your EEPROM(s)?

Jesse

-----Original Message-----
From: Vasco Steinmetz [mailto:va...@kyberraum.net]
Sent: Thursday, December 23, 2010 2:06 PM
To: e1000-devel@lists.sourceforge.net
Subject: [E1000-devel] 2.6.36.1 e1000e - Detected Tx Unit Hang

Hello all,

As I already reported exactly one year ago, I am still experiencing Tx Unit
hangs on two 82573 NICs on a TYAN Toledo i3210W S5211 (4GB RAM) running Gentoo
64 Bit.

The box has been running since May 2008 and has shown continuous Tx Unit hangs
since then. The vanilla kernel has been updated quarterly, from 2.6.29.x upon
install up to 2.6.36.1 currently.

I also replaced the unmanaged switch with a managed one but without any
improvement.

The box was freshly rebooted.
Sometimes eth0 hangs, sometimes eth1 does.


dmesg excerpt:

e1000e :0f:00.0: eth0: Detected Hardware Unit Hang:
  TDH                  1
  TDT                  6f
  next_to_use          6f
  next_to_clean        ff
buffer_info[next_to_clean]:
  time_stamp           10001ae38
  next_to_watch        1
  jiffies              10001b747
  next_to_watch.status 0
MAC Status             80080793
PHY Status             796d
PHY 1000BASE-T Status  3c00
PHY Extended Status    3000
PCI Status             10
e1000e :0f:00.0: eth0: Detected Hardware Unit Hang:
  TDH                  1
  TDT                  6f
  next_to_use          6f
  next_to_clean        ff
buffer_info[next_to_clean]:
  time_stamp           10001ae38
  next_to_watch        1
  jiffies              10001bb2f
  next_to_watch.status 0
MAC Status             80080793
PHY Status             796d
PHY 1000BASE-T Status  3c00
PHY Extended Status    3000
PCI Status             10
[ cut here ]
WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0x222/0x230()
Hardware name: empty
NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Modules linked in: md5 xts gf128mul tun nfsd lockd nfs_acl auth_rpcgss sunrpc
af_packet bonding ipv6 coretemp lm85 hwmon_vid hwmon ext4 mbcache jbd2 crc16
ansi_cprng krng eseqiv rng aes_x86_64 aes_generic cbc cryptomgr aead dm_crypt
cryptd crypto_hash crypto_wq crypto_blkcipher crypto_algapi fan
cpufreq_ondemand acpi_cpufreq freq_table mperf msr cpuid usbhid hid 8250_pnp
ehci_hcd uhci_hcd e1000e usbcore 8250 rtc_cmos serial_core rtc_core i2c_i801
nls_base rtc_lib i2c_core psmouse evdev pcspkr button sg container thermal
processor unix [last unloaded: microcode]
Pid: 0, comm: swapper Not tainted 2.6.36.1 #2
Call Trace:
 IRQ  [8103b1ba] warn_slowpath_common+0x7a/0xb0
 [8103b291] warn_slowpath_fmt+0x41/0x50
 [812f4052] dev_watchdog+0x222/0x230
 [81046b64] run_timer_softirq+0x124/0x230
 [812f3e30] ? dev_watchdog+0x0/0x230
 [81060f59] ? clockevents_program_event+0x59/0xa0
 [81040dd7] __do_softirq+0xa7/0x130
 [810587e3] ? hrtimer_interrupt+0x133/0x240
 [8100314c] call_softirq+0x1c/0x30
 [81004f4d] do_softirq+0x4d/0x80
 [81040aed] irq_exit+0x7d/0x90
 [8101c33b] smp_apic_timer_interrupt+0x6b/0xa0
 [81002c13] apic_timer_interrupt+0x13/0x20
 EOI  [8100a3f2] ? mwait_idle+0x72/0x90
 [810013b0] ? enter_idle+0x20/0x30
 [81001419] cpu_idle+0x59/0xb0
 [813437d8] rest_init+0x68/0x70
 [814aecff] start_kernel+0x34b/0x356
 [814ae31c] x86_64_start_reservations+0x12c/0x130
 [814ae407] x86_64_start_kernel+0xe7/0xee
---[ end trace 629a53efa268a4ba ]---
e1000e :0f:00.0: eth0: Reset adapter
bonding: bond0: link status definitely down for interface eth0, disabling it
e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
bonding: bond0: link status definitely up for interface eth0.
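
As a reading aid for the two hang reports in the dmesg excerpt, the following Python sketch works out how long the stuck buffer had been waiting at each report, using only the hex values printed in the log. The helper names are made up for illustration, and the kernel's CONFIG_HZ is not given in this thread, so the conversion to seconds takes it as a parameter instead of assuming a value.

```python
def stalled_jiffies(time_stamp_hex: str, jiffies_hex: str) -> int:
    """Jiffies elapsed between the buffer's time_stamp and the report."""
    return int(jiffies_hex, 16) - int(time_stamp_hex, 16)


def to_seconds(jiffies: int, hz: int) -> float:
    """Convert jiffies to seconds for a caller-supplied CONFIG_HZ."""
    return jiffies / hz


# Values copied from the two "Detected Hardware Unit Hang" reports.
reports = [
    {"TDH": "1", "TDT": "6f", "time_stamp": "10001ae38", "jiffies": "10001b747"},
    {"TDH": "1", "TDT": "6f", "time_stamp": "10001ae38", "jiffies": "10001bb2f"},
]

ages = [stalled_jiffies(r["time_stamp"], r["jiffies"]) for r in reports]
head_moved = reports[0]["TDH"] != reports[1]["TDH"]

print(ages)               # [2319, 3319]
print(ages[1] - ages[0])  # 1000
print(head_moved)         # False
```

The same time_stamp and an unchanged TDH in both reports show that the second message describes the same stalled descriptor 1000 jiffies later, not a new hang.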

continuity ~ # cat /proc/interrupts
   CPU0   CPU1
  0: 45  1   IO-APIC-edge  timer
  1:  1  1   IO-APIC-edge  i8042
  4:   3269 33   IO-APIC-edge  serial
  8: 61 64   IO-APIC-edge  rtc0
  9:  0  0   IO-APIC-fasteoi   acpi
 12:  2  3   IO-APIC-edge  i8042
 16:  85111 324026   IO-APIC-fasteoi   pata_pdc2027x, uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb8, eth1
 17: 482156   3274   IO-APIC-fasteoi   ahci, uhci_hcd:usb2, uhci_hcd:usb4, eth0
 18:183120   IO-APIC-fasteoi   uhci_hcd:usb5, ehci_hcd:usb7
 19:  0  0   IO-APIC-fasteoi   uhci_hcd:usb6
NMI:  0  0   Non-maskable interrupts
LOC: 200194 152902   Local timer interrupts
SPU:  0  0   Spurious interrupts
PMI:  0  0   Performance monitoring interrupts
PND:  0  0   Performance pending work
RES:  32585  37266   Rescheduling interrupts
CAL: 57 75   Function call interrupts
TLB:  18358  18379   TLB shootdowns
TRM

Re: [E1000-devel] 2.6.36.1 e1000e - Detected Tx Unit Hang

2010-12-23 Thread Allan, Bruce W
Sorry to hear about your problem, and about the lack of a response for so long.
Have you filed a bug in the tracker at
http://sourceforge.net/tracker/?group_id=42302&atid=447449 with this same
information?  If not, I suggest you do so, so that your issue does not fall
through the cracks again.

Thanks,
Bruce.

-----Original Message-----
From: Vasco Steinmetz [mailto:va...@kyberraum.net]
Sent: Thursday, December 23, 2010 2:06 PM
To: e1000-devel@lists.sourceforge.net
Subject: [E1000-devel] 2.6.36.1 e1000e - Detected Tx Unit Hang

Hello all,

As I already reported exactly one year ago, I am still experiencing Tx Unit
hangs on two 82573 NICs on a TYAN Toledo i3210W S5211 (4GB RAM) running Gentoo
64 Bit.

The box has been running since May 2008 and has shown continuous Tx Unit hangs
since then. The vanilla kernel has been updated quarterly, from 2.6.29.x upon
install up to 2.6.36.1 currently.

I also replaced the unmanaged switch with a managed one but without any
improvement.

The box was freshly rebooted.
Sometimes eth0 hangs, sometimes eth1 does.


dmesg excerpt:

e1000e :0f:00.0: eth0: Detected Hardware Unit Hang:
  TDH                  1
  TDT                  6f
  next_to_use          6f
  next_to_clean        ff
buffer_info[next_to_clean]:
  time_stamp           10001ae38
  next_to_watch        1
  jiffies              10001b747
  next_to_watch.status 0
MAC Status             80080793
PHY Status             796d
PHY 1000BASE-T Status  3c00
PHY Extended Status    3000
PCI Status             10
e1000e :0f:00.0: eth0: Detected Hardware Unit Hang:
  TDH                  1
  TDT                  6f
  next_to_use          6f
  next_to_clean        ff
buffer_info[next_to_clean]:
  time_stamp           10001ae38
  next_to_watch        1
  jiffies              10001bb2f
  next_to_watch.status 0
MAC Status             80080793
PHY Status             796d
PHY 1000BASE-T Status  3c00
PHY Extended Status    3000
PCI Status             10
[ cut here ]
WARNING: at net/sched/sch_generic.c:258 dev_watchdog+0x222/0x230()
Hardware name: empty
NETDEV WATCHDOG: eth0 (e1000e): transmit queue 0 timed out
Modules linked in: md5 xts gf128mul tun nfsd lockd nfs_acl auth_rpcgss sunrpc
af_packet bonding ipv6 coretemp lm85 hwmon_vid hwmon ext4 mbcache jbd2 crc16
ansi_cprng krng eseqiv rng aes_x86_64 aes_generic cbc cryptomgr aead dm_crypt
cryptd crypto_hash crypto_wq crypto_blkcipher crypto_algapi fan
cpufreq_ondemand acpi_cpufreq freq_table mperf msr cpuid usbhid hid 8250_pnp
ehci_hcd uhci_hcd e1000e usbcore 8250 rtc_cmos serial_core rtc_core i2c_i801
nls_base rtc_lib i2c_core psmouse evdev pcspkr button sg container thermal
processor unix [last unloaded: microcode]
Pid: 0, comm: swapper Not tainted 2.6.36.1 #2
Call Trace:
 IRQ  [8103b1ba] warn_slowpath_common+0x7a/0xb0
 [8103b291] warn_slowpath_fmt+0x41/0x50
 [812f4052] dev_watchdog+0x222/0x230
 [81046b64] run_timer_softirq+0x124/0x230
 [812f3e30] ? dev_watchdog+0x0/0x230
 [81060f59] ? clockevents_program_event+0x59/0xa0
 [81040dd7] __do_softirq+0xa7/0x130
 [810587e3] ? hrtimer_interrupt+0x133/0x240
 [8100314c] call_softirq+0x1c/0x30
 [81004f4d] do_softirq+0x4d/0x80
 [81040aed] irq_exit+0x7d/0x90
 [8101c33b] smp_apic_timer_interrupt+0x6b/0xa0
 [81002c13] apic_timer_interrupt+0x13/0x20
 EOI  [8100a3f2] ? mwait_idle+0x72/0x90
 [810013b0] ? enter_idle+0x20/0x30
 [81001419] cpu_idle+0x59/0xb0
 [813437d8] rest_init+0x68/0x70
 [814aecff] start_kernel+0x34b/0x356
 [814ae31c] x86_64_start_reservations+0x12c/0x130
 [814ae407] x86_64_start_kernel+0xe7/0xee
---[ end trace 629a53efa268a4ba ]---
e1000e :0f:00.0: eth0: Reset adapter
bonding: bond0: link status definitely down for interface eth0, disabling it
e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
bonding: bond0: link status definitely up for interface eth0.

continuity ~ # cat /proc/interrupts
   CPU0   CPU1
  0: 45  1   IO-APIC-edge  timer
  1:  1  1   IO-APIC-edge  i8042
  4:   3269 33   IO-APIC-edge  serial
  8: 61 64   IO-APIC-edge  rtc0
  9:  0  0   IO-APIC-fasteoi   acpi
 12:  2  3   IO-APIC-edge  i8042
 16:  85111 324026   IO-APIC-fasteoi   pata_pdc2027x, uhci_hcd:usb1, uhci_hcd:usb3, ehci_hcd:usb8, eth1
 17: 482156   3274   IO-APIC-fasteoi   ahci, uhci_hcd:usb2, uhci_hcd:usb4, eth0
 18:183120   IO-APIC-fasteoi   uhci_hcd:usb5, ehci_hcd:usb7
 19:  0  0   IO-APIC-fasteoi   uhci_hcd:usb6
NMI:  0  0   Non-maskable interrupts
LOC: 200194 152902   Local timer interrupts
SPU:  0  0   Spurious interrupts
PMI:  0  0   Performance monitoring interrupts
PND:  0  0   Performance pending work
RES:  32585  37266