>-----Original Message-----
>From: Andrew Peng [mailto:peng...@gmail.com]
>Sent: Wednesday, July 11, 2012 8:50 AM
>To: e1000-devel@lists.sourceforge.net
>Subject: [E1000-devel] 82571EB - Detected Hardware Unit Hang
>
>Folks, I've been getting some strange error messages in my home server /
>router that I've been having trouble debugging. I'm decently proficient in
>Linux, but I fear I'm in over my head with this one.
>
>The hardware is a HP N40L Microserver - here are the hardware details
>- http://n40l.wikia.com/wiki/Base_Hardware
>
>I am running Debian Squeeze 6.0:
>pengc99@gaia:/$ sudo uname -a
>Linux gaia 2.6.32-5-amd64 #1 SMP Sun May 6 04:00:17 UTC 2012 x86_64
>GNU/Linux
>
>I also subscribe to Ksplice's Uptrack system but since I have the newest
>kernel installed (as released by Debian) there have been no hot-patches
>yet.
>
>This is the message I've been getting in /var/log/kern.log:
>Jul 11 08:55:38 gaia kernel: [402056.009687] e1000e 0000:02:00.0: eth1: Detected Hardware Unit Hang:
>Jul 11 08:55:38 gaia kernel: [402056.009690]   TDH                  <fc>
>Jul 11 08:55:38 gaia kernel: [402056.009692]   TDT                  <fd>
>Jul 11 08:55:38 gaia kernel: [402056.009693]   next_to_use          <fd>
>Jul 11 08:55:38 gaia kernel: [402056.009694]   next_to_clean        <fc>
>Jul 11 08:55:38 gaia kernel: [402056.009695] buffer_info[next_to_clean]:
>Jul 11 08:55:38 gaia kernel: [402056.009697]   time_stamp           <105fc92b2>
>Jul 11 08:55:38 gaia kernel: [402056.009698]   next_to_watch        <fc>
>Jul 11 08:55:38 gaia kernel: [402056.009699]   jiffies              <105fc93da>
>Jul 11 08:55:38 gaia kernel: [402056.009700]   next_to_watch.status <0>
>Jul 11 08:55:38 gaia kernel: [402056.009701] MAC Status             <80383>
>Jul 11 08:55:38 gaia kernel: [402056.009702] PHY Status             <792d>
>Jul 11 08:55:38 gaia kernel: [402056.009703] PHY 1000BASE-T Status  <3800>
>Jul 11 08:55:38 gaia kernel: [402056.009705] PHY Extended Status    <3000>
>Jul 11 08:55:38 gaia kernel: [402056.009706] PCI Status             <10>
>
>Complete output of lspci:
>pengc99@gaia:/$ lspci
>00:00.0 Host bridge: Advanced Micro Devices [AMD] RS880 Host Bridge
>00:01.0 PCI bridge: Hewlett-Packard Company Device 9602
>00:02.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge
>(ext gfx port 0)
>00:06.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge
>(PCIE port 2)
>00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller
>[AHCI mode] (rev 40)
>00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0
>Controller
>00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI
>Controller
>00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0
>Controller
>00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI
>Controller
>00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 42)
>00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller
>(rev 40)
>00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge (rev 40)
>00:16.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0
>Controller
>00:16.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI
>Controller
>00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor
>HyperTransport Configuration
>00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor
>Address Map
>00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor
>DRAM Controller
>00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor
>Miscellaneous Control
>00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor
>Link Control
>01:05.0 VGA compatible controller: ATI Technologies Inc M880G [Mobility
>Radeon HD 4200]
>02:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet
>Controller (rev 06)
>02:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet
>Controller (rev 06)
>03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5723
>Gigabit Ethernet PCIe (rev 10)
>
>Output of lspci -vvv (as root, network adapter section):
>02:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet
>Controller (rev 06)
>        Subsystem: Hewlett-Packard Company NC360T PCI Express Dual Port
>Gigabit Server Adapter
>        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>ParErr- Stepping- SERR+ FastB2B- DisINTx+
>        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
><TAbort- <MAbort- >SERR- <PERR- INTx-
>        Latency: 0, Cache Line Size: 64 bytes
>        Interrupt: pin A routed to IRQ 26
>        Region 0: Memory at fe8e0000 (32-bit, non-prefetchable)
>[size=128K]
>        Region 1: Memory at fe8c0000 (32-bit, non-prefetchable)
>[size=128K]
>        Region 2: I/O ports at e800 [size=32]
>        Expansion ROM at fe8a0000 [disabled] [size=128K]
>        Capabilities: [c8] Power Management version 2
>                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA
>PME(D0+,D1-,D2-,D3hot+,D3cold+)
>                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
>        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
>                Address: 00000000fee0300c  Data: 4191
>        Capabilities: [e0] Express (v1) Endpoint, MSI 00
>                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s
><512ns, L1 <64us
>                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
>                DevCtl: Report errors: Correctable+ Non-Fatal+ Fatal+
>Unsupported+
>                        RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop+
>                        MaxPayload 128 bytes, MaxReadReq 512 bytes
>                DevSta: CorrErr+ UncorrErr+ FatalErr- UnsuppReq+
>AuxPwr+ TransPend-
>                LnkCap: Port #0, Speed 2.5GT/s, Width x4, ASPM L0s,
>Latency L0 <4us, L1 <64us
>                        ClockPM- Surprise- LLActRep- BwNot-
>                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain-
>CommClk+
>                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>                LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train-
>SlotClk+ DLActive- BWMgmt- ABWMgmt-
>        Capabilities: [100 v1] Advanced Error Reporting
>                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
>                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt-
>UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
>                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt-
>UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
>                CESta:  RxErr+ BadTLP+ BadDLLP- Rollover- Timeout-
>NonFatalErr-
>                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout-
>NonFatalErr-
>                AERCap: First Error Pointer: 14, GenCap- CGenEn- ChkCap-
>ChkEn-
>        Capabilities: [140 v1] Device Serial Number 00-1f-29-ff-ff-5b-38-
>56
>        Kernel driver in use: e1000e
>
>The only references I could find online about this problem concern a
>PCI-E power management EEPROM bug:
>http://serverfault.com/questions/193114/linux-e1000e-intel-networking-driver-problems-galore-where-do-i-start
>http://downloadmirror.intel.com/9180/eng/README.txt
>
>Along with the associated fix script:
>http://sourceforge.net/projects/e1000/files/e1000e%20stable/eeprom_fix_82574_or_82583/
>
>However, this appears to apply only to 82574 or 82583 chipsets. This is an
>82571EB. I also checked the EEPROM output and it doesn't look like the fix
>applies:
>
>pengc99@gaia:/var$ sudo ethtool -e eth1 | head
>[sudo] password for pengc99:
>Offset          Values
>------          ------
>0x0000          00 1f 29 5b 38 56 30 15 ff ff b2 50 ff ff ff ff
>0x0010          19 d5 04 30 2f a4 44 70 3c 10 5e 10 86 80 65 b1
>
>
>I'd appreciate any help I can get, and thanks for all the hard work!
>
>--Andrew Peng

Looks like there are PCIe errors detected:
DevSta: CorrErr+ UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-
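
If it helps when comparing lspci dumps before and after the hang, here is a rough Python sketch that pulls the raised bits out of a flag line like the one above. The helper name parse_flags and the choice of which bits count as "error bits" are my own assumptions for illustration, not anything lspci or the driver defines.

```python
import re

def parse_flags(line):
    """Split an lspci flag line ('Name: Flag+ Flag- ...') into (name, {flag: bool})."""
    name, _, rest = line.partition(":")
    flags = {}
    for token in rest.split():
        # Only tokens that end in '+' or '-' are treated as flags.
        m = re.fullmatch(r"([A-Za-z0-9]+)([+-])", token)
        if m:
            flags[m.group(1)] = (m.group(2) == "+")
    return name.strip(), flags

# Line copied from the lspci -vvv output quoted above.
devsta = "DevSta: CorrErr+ UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-"
name, flags = parse_flags(devsta)

# Assumption: these four DevSta bits are the ones worth flagging as errors.
error_bits = ("CorrErr", "UncorrErr", "FatalErr", "UnsuppReq")
raised = [bit for bit in error_bits if flags.get(bit)]
print(name, "error bits set:", raised)
```

The same helper should work on the UESta and CESta lines once their wrapped halves are joined back into a single line.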

      
Would you please attach the full dmesg log and the full lspci -vvv output (run
as root) after the issue occurs?
Please also attach your kernel .config.
Did this issue start after you upgraded the kernel?

A few things to try:
Please load the AER module and see if it logs any errors.
Does the BIOS log report any machine check errors?
Try disabling TSO.
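
For the TSO test, the usual way is "ethtool -K eth1 tso off" as root (and "ethtool -k eth1" to check the current offload settings). If you want to script it, a minimal Python sketch could look like the following; the function names tso_command and set_tso and the dry_run switch are hypothetical, and eth1 is taken from your log.

```python
import subprocess

def tso_command(iface, enable=False):
    """Build the ethtool argv that switches TCP segmentation offload on or off."""
    return ["ethtool", "-K", iface, "tso", "on" if enable else "off"]

def set_tso(iface, enable=False, dry_run=True):
    """Return the command line; run it only when dry_run is False (needs root)."""
    cmd = tso_command(iface, enable)
    if not dry_run:
        # check=True raises CalledProcessError if ethtool exits non-zero.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)

# Dry run: only shows what would be executed for the interface in the log.
print(set_tso("eth1"))
```

The setting does not survive a reboot or a driver reload, so it would need to be reapplied if the test has to run for several days.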

-tushar       




_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit
http://communities.intel.com/community/wired
