[E1000-devel] Detected Tx Unit Hang Issue

2011-01-22 Thread Stephen Palmateer
alias:  pci:v8086d109Asv*sd*bc*sc*i*
alias:  pci:v8086d108Csv*sd*bc*sc*i*
alias:  pci:v8086d108Bsv*sd*bc*sc*i*
alias:  pci:v8086d107Fsv*sd*bc*sc*i*
alias:  pci:v8086d107Esv*sd*bc*sc*i*
alias:  pci:v8086d107Dsv*sd*bc*sc*i*
alias:  pci:v8086d10B9sv*sd*bc*sc*i*
alias:  pci:v8086d10D5sv*sd*bc*sc*i*
alias:  pci:v8086d10DAsv*sd*bc*sc*i*
alias:  pci:v8086d10D9sv*sd*bc*sc*i*
alias:  pci:v8086d1060sv*sd*bc*sc*i*
alias:  pci:v8086d10A5sv*sd*bc*sc*i*
alias:  pci:v8086d10BCsv*sd*bc*sc*i*
alias:  pci:v8086d10A4sv*sd*bc*sc*i*
alias:  pci:v8086d105Fsv*sd*bc*sc*i*
alias:  pci:v8086d105Esv*sd*bc*sc*i*
depends:
vermagic:   2.6.18-164.15.1.el5.netsw SMP mod_unload 686 REGPARM 4KSTACKS gcc-4.1
parm:   copybreak:Maximum size of packet that is copied to a new buffer on receive (uint)
parm:   TxIntDelay:Transmit Interrupt Delay (array of int)
parm:   TxAbsIntDelay:Transmit Absolute Interrupt Delay (array of int)
parm:   RxIntDelay:Receive Interrupt Delay (array of int)
parm:   RxAbsIntDelay:Receive Absolute Interrupt Delay (array of int)
parm:   InterruptThrottleRate:Interrupt Throttling Rate (array of int)
parm:   IntMode:Interrupt Mode (array of int)
parm:   SmartPowerDownEnable:Enable PHY smart power down (array of int)
parm:   KumeranLockLoss:Enable Kumeran lock loss workaround (array of int)
parm:   WriteProtectNVM:Write-protect NVM [WARNING: disabling this can lead to corrupted NVM] (array of int)
parm:   CrcStripping:Enable CRC Stripping, disable if your BMC needs the CRC (array of int)
module_sig: 
883f3504bd5e42e157cdcd4e826d9c611273910a0a1a1171e437af8913da28628a3569983976f83eb0a08c1c88252b77e9b2d0bd8aeb6e966185314d3a52
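
One way to check whether a module's alias list covers a given adapter is to pull the vendor and device IDs out of the alias lines and compare them with the numeric IDs that `lspci -nn` reports. The following is a minimal Python sketch, not part of the original report: the function names are hypothetical, the sample is a handful of alias lines copied from the listing above, and device ID 105E is used only because it appears in that list. It assumes every subsystem and class field is a wildcard, as in this listing.

```python
import re

# Matches the vendor ("v....") and device ("d....") fields of a PCI modalias.
# Modalias hex digits are upper case, so the lower-case "d" separator is safe.
ALIAS_RE = re.compile(r"pci:v([0-9A-F]+)d([0-9A-F]+)sv")


def alias_ids(modinfo_text):
    """Return the set of (vendor, device) integer pairs found in alias lines."""
    ids = set()
    for line in modinfo_text.splitlines():
        match = ALIAS_RE.search(line)
        if match:
            ids.add((int(match.group(1), 16), int(match.group(2), 16)))
    return ids


def module_claims(ids, vendor, device):
    """True if the alias set contains this vendor/device pair."""
    return (vendor, device) in ids


# A few alias lines copied from the modinfo output in this message.
SAMPLE = """\
alias:  pci:v8086d10BCsv*sd*bc*sc*i*
alias:  pci:v8086d10A4sv*sd*bc*sc*i*
alias:  pci:v8086d105Fsv*sd*bc*sc*i*
alias:  pci:v8086d105Esv*sd*bc*sc*i*
depends:
"""

ids = alias_ids(SAMPLE)
print(module_claims(ids, 0x8086, 0x105E))  # prints True
print(module_claims(ids, 0x8086, 0x1234))  # prints False
```
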

Only experiencing this issue with the PCI-Express interfaces.
The igb interfaces prove useless on this production machine.

Is there any other information that you would like me to provide?
Is there anything I can do?  This server is in production!!!

thanks again,
Stephen Palmateer
Netsweeper Inc.


- Original Message -
From: Assem Alwadee assem1...@gmail.com
To: stephen palmateer stephen.palmat...@netsweeper.com
Sent: Saturday, January 22, 2011 10:47:37 AM
Subject: YemenNet netsweeper Installation








Dear Stephen, 

We appreciate your effort , 

This my email and mobile number and the contact detail of ALI  Mohammed 
a...@yemen.net.ye 
Mobile : 777 013 003 
and Mohammed is : 
m...@yemen.net.ye 
Mobile : 777 013 220 



Would you please provide us with the work schedule for the next days. 
best regards 
-- 
Assem Mohammed Alwadee 
Network Engineer  IT Consultant 
Mobile: +967-777014846 
Sanaa -Yemen 



-- 
Stephen Palmateer 
Netsweeper Inc. 
Tel. +1.519.826.5222 
Cell. +1.519.500.7929 
Fax. +1.519.826.5228 
Email: stephen.palmat...@netsweeper.com 

Corporate Head Office 
104 Dawson Road 
Guelph, Ontario. Canada 
N1H 1A7 



___
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit 
http://communities.intel.com/community/wired


Re: [E1000-devel] Detected Tx Unit Hang Issue

2011-01-22 Thread Stephen Palmateer
I just found Intel's "Network Adapter Driver for PCI-E Gigabit Network 
Connections under Linux*", version 1.2.20.
Intel's Readme suggests that this will fix the driver-generated interrupts.

Since our e1000e driver is only version 1.0.2, I'm going to copy the tarball 
provided by Intel to the machine with WinSCP and follow Intel's instructions 
for installation.
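
For reference, the version comparison behind that decision can be sketched in a few lines of Python. This is an illustration only, not something from Intel's instructions: `driver_version_key` is a hypothetical helper, and it assumes the leading dotted-numeric part of the version string is what matters, with suffixes such as "-k2" ignored.

```python
import re


def driver_version_key(version):
    """Return the leading dotted-numeric part of a version as a tuple of ints.

    Assumption: a trailing suffix such as "-k2" does not affect ordering.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"no numeric version in {version!r}")
    return tuple(int(part) for part in match.group(0).split("."))


installed = "1.0.2-k2"  # version reported by ethtool -i in this thread
candidate = "1.2.20"    # version of the driver tarball from Intel

is_newer = driver_version_key(candidate) > driver_version_key(installed)
print(is_newer)  # prints True
```

Comparing tuples of integers avoids the trap of comparing the strings directly, where "1.10" would sort before "1.2".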

Intel's website, 
http://downloadcenter.intel.com/Detail_Desc.aspx?agr=YDwnldID=15817
suggests that this version of the driver is valid for the Intel® 82571EB 
Gigabit Ethernet Controllers we're working with.

I'll update this email thread when I'm finished.

thanks again,
Stephen Palmateer

- Original Message -
From: Stephen Palmateer stephen.palmat...@netsweeper.com
To: E1000-devel@lists.sourceforge.net
Cc: ali a...@yemen.net.ye, Assem Alwadee assem1...@gmail.com, Jeremy 
Erb jeremy@netsweeper.com, Tamer Abu-Elsaad 
tamer.abu-els...@netsweeper.com
Sent: Saturday, January 22, 2011 4:44:01 PM
Subject: Detected Tx Unit Hang Issue

Hello All,

I would like to report a problem with the e1000e driver on a CentOS 5.4 machine 
with a custom kernel.

Experiencing interface timeouts/failure on a regular basis, rendering the 
management interface useless.

Seeing the following error repeatedly in dmesg and stdout:

:04:00.0: eth0: Detected Tx Unit Hang:
  TDH                  143
  TDT                  12e
  next_to_use          12e
  next_to_clean        142
buffer_info[next_to_clean]:
  time_stamp           100de9410
  next_to_watch        144
  jiffies              100de952f
  next_to_watch.status 0
:04:00.0: eth0: Detected Tx Unit Hang:
  TDH                  143
  TDT                  12e
  next_to_use          12e
  next_to_clean        142
buffer_info[next_to_clean]:
  time_stamp           100de9410
  next_to_watch        144
  jiffies              100de95f7
  next_to_watch.status 0
:04:00.0: eth0: Detected Tx Unit Hang:
  TDH                  143
  TDT                  12e
  next_to_use          12e
  next_to_clean        142
buffer_info[next_to_clean]:
  time_stamp           100de9410
  next_to_watch        144
  jiffies              100de96bf
  next_to_watch.status 0
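
The three reports above differ only in the jiffies value, so the same descriptor (time_stamp 100de9410) stays stuck while time advances. A minimal Python sketch for pulling the hexadecimal fields out of such a dump and computing how long the descriptor has been pending is shown here. It is an illustration, not a tool from the driver: the function names are hypothetical, the sample holds only the fields needed for the calculation, and the result is left in jiffies because the kernel's HZ value is not stated in this thread.

```python
import re

# Field names printed by the hang report, longest first so that
# "next_to_watch.status" is not mistaken for "next_to_watch".
FIELDS = ("next_to_watch.status", "next_to_watch", "next_to_clean",
          "next_to_use", "time_stamp", "jiffies", "TDH", "TDT")
FIELD_RE = re.compile(
    r"^\s*(" + "|".join(re.escape(name) for name in FIELDS) + r")"
    r"\s*<?([0-9a-fA-F]+)>?\s*$"
)


def parse_hang_reports(log_text):
    """Split a log into hang reports; each is a dict of field -> int (from hex)."""
    reports = []
    current = None
    for line in log_text.splitlines():
        if "Detected Tx Unit Hang" in line:
            current = {}
            reports.append(current)
            continue
        match = FIELD_RE.match(line)
        if match and current is not None:
            current[match.group(1)] = int(match.group(2), 16)
    return reports


def stalled_jiffies(report):
    """Jiffies elapsed since the stuck buffer was time-stamped."""
    return report["jiffies"] - report["time_stamp"]


# Values copied from the three reports in this message.
SAMPLE = """\
:04:00.0: eth0: Detected Tx Unit Hang:
  TDH  143
  TDT  12e
  time_stamp   100de9410
  jiffies  100de952f
:04:00.0: eth0: Detected Tx Unit Hang:
  TDH  143
  TDT  12e
  time_stamp   100de9410
  jiffies  100de95f7
:04:00.0: eth0: Detected Tx Unit Hang:
  TDH  143
  TDT  12e
  time_stamp   100de9410
  jiffies  100de96bf
"""

reports = parse_hang_reports(SAMPLE)
print([stalled_jiffies(r) for r in reports])  # prints [287, 487, 687]
```

Dividing each figure by the kernel's HZ setting would give the stall time in seconds.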

The weird part is that eth2 has far more traffic on it and is not seeing any issue.

I'll try to provide as much info as I can below.

[admin@filter1 ~]$ uname -a
Linux filter1.yemen.net.ye 2.6.18-164.15.1.el5.netsw #1 SMP Mon Apr 26 15:01:04 EDT 2010 i686 i686 i386 GNU/Linux

[root@filter1 ~]# ethtool -i eth0
driver: e1000e
version: 1.0.2-k2
firmware-version: 5.10-2
bus-info: :05:00.0

[root@filter1 ~]# ethtool -k eth0
Offload parameters for eth0:
Cannot get device udp large send offload settings: Operation not supported
rx-checksumming: on
tx-checksumming: on
scatter-gather: on
tcp segmentation offload: on
udp fragmentation offload: off
generic segmentation offload: off
generic-receive-offload: off

[root@filter1 ~]# lspci -vv | grep net
04:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
    Subsystem: Sun Microsystems Computer Corp. x4 PCI-Express Quad Gigabit Ethernet UTP Low Profile Adapter
04:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
    Subsystem: Sun Microsystems Computer Corp. x4 PCI-Express Quad Gigabit Ethernet UTP Low Profile Adapter
05:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
    Subsystem: Sun Microsystems Computer Corp. x4 PCI-Express Quad Gigabit Ethernet UTP Low Profile Adapter
05:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
    Subsystem: Sun Microsystems Computer Corp. x4 PCI-Express Quad Gigabit Ethernet UTP Low Profile Adapter
07:00.0 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)
07:00.1 Ethernet controller: Intel Corporation 82575EB Gigabit Network Connection (rev 02)

[root@filter1 ~]# modinfo e1000e
filename:   /lib/modules/2.6.18-164.15.1.el5.netsw/kernel/drivers/net/e1000e/e1000e.ko
version:   1.0.2-k2
license:   GPL
description:   Intel(R) PRO/1000 Network Driver
author:   Intel Corporation, linux.n...@intel.com
srcversion:   D6678FCB5D0D64FDE5CC3DF
alias:  pci:v8086d10F0sv*sd*bc*sc*i*
alias:  pci:v8086d10EFsv*sd*bc*sc*i*
alias:  pci:v8086d10EBsv*sd*bc*sc*i*
alias:  pci:v8086d10EAsv*sd*bc*sc*i*
alias:  pci:v8086d10DFsv*sd*bc*sc*i*
alias:  pci:v8086d10DEsv*sd*bc*sc*i*
alias:  pci:v8086d10CEsv*sd*bc*sc*i*
alias:  pci:v8086d10CDsv*sd*bc*sc*i*
alias:  pci:v8086d10CCsv*sd*bc*sc*i*
alias:  pci:v8086d10CBsv*sd*bc*sc*i*
alias:  pci:v8086d10F5sv*sd*bc*sc*i*
alias:  pci:v8086d10BFsv*sd*bc*sc*i*
alias:  pci:v8086d10E5sv*sd*bc*sc*i*
alias