Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
-----Original Message-----
From: Joe Jin [mailto:joe@oracle.com]
Sent: Tuesday, July 10, 2012 10:03 PM
To: Dave, Tushar N
Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: Re: 82571EB: Detected Hardware Unit Hang

> On 07/11/12 12:05, Dave, Tushar N wrote:
>> When you said you had this issue with the RHEL5 and RHEL6 drivers, did you
>> install a RHEL5/6 kernel and reproduce it? If so, I think I should install
>> RHEL6 and try to reproduce it locally!
>
> Yes, I reproduced this on both RHEL5 and RHEL6. So far, using scp to copy a
> big file (~1GB) hits it at once.
>
> Thanks,
> Joe

Joe,

Can you please send the lspci -vvv output for the failing port, taken before the issue occurs?

Thanks.

-Tushar

--
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and threat landscape has changed and how IT managers can respond. Discussions will include endpoint security, mobile security and the latest in malware threats.
http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
___
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
On 07/11/12 15:11, Dave, Tushar N wrote:
> Joe,
>
> Can you please send lspci -vvv output for failing port before issue occurs.
>
> Thanks.

# lspci -s 05:00.0 -vvv
05:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (Copper) (rev 06)
        Subsystem: Oracle Corporation x4 PCI-Express Quad Gigabit Ethernet UTP Low Profile Adapter
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx-
        Latency: 0, Cache Line Size: 256 bytes
        Interrupt: pin B routed to IRQ 80
        Region 0: Memory at fbde (32-bit, non-prefetchable) [size=128K]
        Region 1: Memory at fbdc (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at dc00 [size=32]
        Expansion ROM at fbda [disabled] [size=128K]
        Capabilities: [c8] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: fee21000 Data: 40cb
        Capabilities: [e0] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s 512ns, L1 64us
                        ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-
                LnkCap: Port #2, Speed 2.5GT/s, Width x4, ASPM L0s, Latency L0 4us, L1 64us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk-
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        Capabilities: [100 v1] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO+ CmpltAbrt- UnxCmplt- RxOF- MalfTLP+ ECRC- UnsupReq+ ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
                UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
                AERCap: First Error Pointer: 12, GenCap- CGenEn- ChkCap- ChkEn-
        Capabilities: [140 v1] Device Serial Number 00-15-17-ff-ff-b9-77-9c
        Kernel driver in use: e1000e
        Kernel modules: e1000e

Thanks,
Joe
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
On 07/11/12 15:37, Dave, Tushar N wrote:
> [...]
> Was this lspci output taken on a freshly booted system?

Yes. Did you find any issue?

Thanks,
Joe
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
-----Original Message-----
From: Joe Jin [mailto:joe@oracle.com]
Sent: Wednesday, July 11, 2012 12:39 AM
To: Dave, Tushar N
Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: Re: 82571EB: Detected Hardware Unit Hang

> On 07/11/12 15:37, Dave, Tushar N wrote:
>> [...]
>> Was this lspci output taken on a freshly booted system?
>
> Yes. Did you find any issue?
>
> Thanks,
> Joe

The device status and AER sections show some errors that look a little suspicious to me, but I'm not too sure. I will get back to you tomorrow.

-Tushar
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
On 07/11/12 15:50, Dave, Tushar N wrote:
> Device status and AER sections show some errors that looks little suspicious
> to me but I'm not too sure. I will get back tomorrow.

Thanks a lot, Tushar!

Joe

--
Oracle http://www.oracle.com
Joe Jin | Software Development Senior Manager | +8610.6106.5624
ORACLE | Linux and Virtualization
No. 24 Zhongguancun Software Park, Haidian District | 100193 Beijing
[E1000-devel] 82571EB - Detected Hardware Unit Hang
Folks,

I've been getting some strange error messages in my home server / router that I've been having trouble debugging. I'm decently proficient in Linux, but I fear I'm in over my head with this one.

The hardware is an HP N40L Microserver. Here are the hardware details: http://n40l.wikia.com/wiki/Base_Hardware

I am running Debian Squeeze 6.0:

pengc99@gaia:/$ sudo uname -a
Linux gaia 2.6.32-5-amd64 #1 SMP Sun May 6 04:00:17 UTC 2012 x86_64 GNU/Linux

I also subscribe to Ksplice's Uptrack system, but since I have the newest kernel installed (as released by Debian) there have been no hot-patches yet.

This is the message I've been getting in /var/log/kern.log:

Jul 11 08:55:38 gaia kernel: [402056.009687] e1000e :02:00.0: eth1: Detected Hardware Unit Hang:
Jul 11 08:55:38 gaia kernel: [402056.009690]   TDH                  fc
Jul 11 08:55:38 gaia kernel: [402056.009692]   TDT                  fd
Jul 11 08:55:38 gaia kernel: [402056.009693]   next_to_use          fd
Jul 11 08:55:38 gaia kernel: [402056.009694]   next_to_clean        fc
Jul 11 08:55:38 gaia kernel: [402056.009695] buffer_info[next_to_clean]:
Jul 11 08:55:38 gaia kernel: [402056.009697]   time_stamp           105fc92b2
Jul 11 08:55:38 gaia kernel: [402056.009698]   next_to_watch        fc
Jul 11 08:55:38 gaia kernel: [402056.009699]   jiffies              105fc93da
Jul 11 08:55:38 gaia kernel: [402056.009700]   next_to_watch.status 0
Jul 11 08:55:38 gaia kernel: [402056.009701] MAC Status             80383
Jul 11 08:55:38 gaia kernel: [402056.009702] PHY Status             792d
Jul 11 08:55:38 gaia kernel: [402056.009703] PHY 1000BASE-T Status  3800
Jul 11 08:55:38 gaia kernel: [402056.009705] PHY Extended Status    3000
Jul 11 08:55:38 gaia kernel: [402056.009706] PCI Status             10

Complete output of lspci:

pengc99@gaia:/$ lspci
00:00.0 Host bridge: Advanced Micro Devices [AMD] RS880 Host Bridge
00:01.0 PCI bridge: Hewlett-Packard Company Device 9602
00:02.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (ext gfx port 0)
00:06.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 2)
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [AHCI mode] (rev 40)
00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 42)
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller (rev 40)
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge (rev 40)
00:16.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:16.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
01:05.0 VGA compatible controller: ATI Technologies Inc M880G [Mobility Radeon HD 4200]
02:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
02:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5723 Gigabit Ethernet PCIe (rev 10)

Output of lspci -vvv (as root, network adapter section):

02:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
        Subsystem: Hewlett-Packard Company NC360T PCI Express Dual Port Gigabit Server Adapter
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 26
        Region 0: Memory at fe8e (32-bit, non-prefetchable) [size=128K]
        Region 1: Memory at fe8c (32-bit, non-prefetchable) [size=128K]
        Region 2: I/O ports at e800 [size=32]
        Expansion ROM at fe8a [disabled] [size=128K]
        Capabilities: [c8] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
                Address: fee0300c Data: 4191
        Capabilities: [e0] Express (v1) Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0,
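The numbers in the "Detected Hardware Unit Hang" dump above can be read with a little arithmetic. The following is only a rough sketch: the ring size of 256 and the tick rate of 250 Hz are assumptions (neither appears in the report), and the helper names are made up for illustration.

```python
# Values copied from the "Detected Hardware Unit Hang" dump (hexadecimal).
TDH = 0xFC                # transmit descriptor head (advanced by the hardware)
TDT = 0xFD                # transmit descriptor tail (advanced by the driver)
TIME_STAMP = 0x105FC92B2  # jiffies when the stuck buffer was queued
JIFFIES = 0x105FC93DA     # jiffies when the hang was reported

RING_SIZE = 256  # assumption: TX ring size, not stated in the report
HZ = 250         # assumption: kernel tick rate; check CONFIG_HZ on the real system


def pending_descriptors(head, tail, ring_size):
    """Descriptors handed to the NIC that it has not yet consumed (ring wraps)."""
    return (tail - head) % ring_size


def stalled_seconds(time_stamp, jiffies, hz):
    """How long the oldest pending buffer had been waiting when the hang fired."""
    return (jiffies - time_stamp) / hz


if __name__ == "__main__":
    print("pending descriptors:", pending_descriptors(TDH, TDT, RING_SIZE))
    print("stalled jiffies:", JIFFIES - TIME_STAMP)
    print("stalled seconds (at assumed HZ):", stalled_seconds(TIME_STAMP, JIFFIES, HZ))
```

Under those assumptions, a single descriptor was outstanding and had been waiting for a little over a second, which matches the picture of a transmit unit that has stopped making progress rather than one that is merely busy.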
Re: [E1000-devel] Bonding + ixgbe breaks with jumbo frames if the MTU is not set on bond0 before adding slaves
Hi Nathan,

Our engineers have root-caused this issue in the driver, and the fix is currently undergoing validation. Once it passes testing, we'll post it on SourceForge.

Thanks again for the report.

Stephen

-----Original Message-----
From: Nathan March [mailto:nat...@gt.net]
Sent: Tuesday, July 03, 2012 10:53 AM
To: Ko, Stephen S
Cc: e1000-devel@lists.sourceforge.net
Subject: Re: [E1000-devel] Bonding + ixgbe breaks with jumbo frames if the MTU is not set on bond0 before adding slaves

> Here's a slightly cleaner dmesg, with some changes removed that I'd hacked
> in to try to fix it: http://pastebin.com/DkjFEdfc
>
> I'm not doing anything special on bootup anymore; this is a standard Gentoo
> system booting. Configured using:
>
> config_eth2="null"
> config_eth3="null"
> rc_need_bond0="net.eth2 net.eth3"
> config_bond0=( "10.1.14.23 broadcast 10.1.14.255 netmask 255.255.255.0" )
> slaves_bond0="eth2 eth3"
> modules_bond0=( "ifconfig" "!iproute2" )
> mtu_eth2=9000
> mtu_eth3=9000
> mtu_bond0=9000
>
> - Nathan
>
> On 7/3/2012 10:32 AM, Nathan March wrote:
>> Hi Stephen,
>>
>> Sure, dmesg is here: http://pastebin.com/VyH6gA4A
>>
>> xen13 ~ # modinfo bonding | head
>> filename:       /lib/modules/3.2.7/kernel/drivers/net/bonding/bonding.ko
>> alias:          rtnl-link-bond
>> author:         Thomas Davis, tada...@lbl.gov and many others
>> description:    Ethernet Channel Bonding Driver, v3.7.1
>> version:        3.7.1
>> license:        GPL
>> srcversion:     35B9A516B4FC085FFCBEF61
>> depends:
>> intree:         Y
>> vermagic:       3.2.7 SMP mod_unload
>>
>> Happy to provide any other info if needed, or access to a test machine.
>>
>> - Nathan
>>
>> On 6/29/2012 9:01 PM, Ko, Stephen S wrote:
>>> Hi Nathan,
>>>
>>> Thanks for the report. We are trying to reproduce this issue in our lab.
>>> Could you please send us:
>>> - dmesg
>>> - modinfo bonding | head
>>>
>>> Just to narrow down the issue, have you tried this on any interfaces
>>> other than ixgbe (e.g. igb, e1000, or e1000e)?
>>>
>>> We will keep you informed of progress on our end.
>>>
>>> Thanks,
>>> Stephen
>>>
>>> -----Original Message-----
>>> From: Nathan March [mailto:nat...@gt.net]
>>> Sent: Friday, June 29, 2012 3:17 PM
>>> To: e1000-devel@lists.sourceforge.net
>>> Subject: [E1000-devel] Bonding + ixgbe breaks with jumbo frames if the MTU is not set on bond0 before adding slaves
>>>
>>>> Hi All,
>>>>
>>>> I think I've found a bug in the ixgbe driver when using bonding + jumbo
>>>> frames. Adding slaves to the bond device and setting mtu 9000 after
>>>> enslaving results in one of the slaves dropping traffic. The strange
>>>> thing is that putting bond0 into promiscuous mode (by running tcpdump)
>>>> solves the problem (until you close tcpdump).
>>>>
>>>> Here's a test script I've put together to reproduce the problem:
>>>>
>>>> #!/bin/bash -x
>>>> rmmod ixgbe
>>>> rmmod bonding
>>>> modprobe bonding miimon=100 mode=4
>>>> modprobe ixgbe
>>>> ifconfig bond0 up mtu 1500
>>>> ifconfig eth2 up
>>>> ifconfig eth3 up
>>>> ifenslave bond0 eth2 eth3
>>>> ifconfig bond0 10.1.14.23 broadcast 10.1.14.255 netmask 255.255.255.0 mtu 9000
>>>>
>>>> Changing line #6 to be 'mtu 9000' no longer triggers the bug, and
>>>> networking works perfectly.
>>>>
>>>> This is on an Intel X540-T2 connected to a pair of Arista 1050T (mlag)
>>>> on kernel 3.2.7. I'm using the bonding module built into the kernel
>>>> with ixgbe 3.9.17.
>>>>
>>>> - Nathan
>>>>
>>>> --
>>>> Nathan March nat...@gt.net
>>>> Gossamer Threads Inc. http://www.gossamer-threads.com/
>>>> Tel: (604) 687-5804 Fax: (604) 687-5806
>
> --
> Nathan March nat...@gt.net
> Gossamer Threads Inc. http://www.gossamer-threads.com/
> Tel: (604) 687-5804 Fax: (604) 687-5806
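The workaround reported in this thread is purely a matter of ordering: bond0 gets its jumbo MTU before any slave is added. As a hedged sketch only, the ordering can be written as a dry-run command builder. The function name `bond_setup_commands` is hypothetical, and the commands are the same ones used in the reproduction script, just with the MTU applied up front; nothing here touches the driver fix itself.

```python
def bond_setup_commands(bond, slaves, address, broadcast, netmask, mtu):
    """Build (but do not run) the setup commands, MTU first, enslaving later."""
    cmds = [
        ["modprobe", "bonding", "miimon=100", "mode=4"],
        ["modprobe", "ixgbe"],
        # Workaround from the thread: set the final MTU on the bond *before*
        # any slave is attached, instead of raising it afterwards.
        ["ifconfig", bond, "up", "mtu", str(mtu)],
    ]
    cmds += [["ifconfig", slave, "up"] for slave in slaves]
    cmds.append(["ifenslave", bond, *slaves])
    cmds.append(
        ["ifconfig", bond, address, "broadcast", broadcast,
         "netmask", netmask, "mtu", str(mtu)]
    )
    return cmds


if __name__ == "__main__":
    for cmd in bond_setup_commands(
        "bond0", ["eth2", "eth3"],
        "10.1.14.23", "10.1.14.255", "255.255.255.0", 9000,
    ):
        print(" ".join(cmd))
```

Printing the list rather than executing it keeps the sketch safe to run anywhere; on a test machine each entry could be passed to `subprocess.run(cmd, check=True)` as root.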
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
-----Original Message-----
From: Joe Jin [mailto:joe@oracle.com]
Sent: Tuesday, July 10, 2012 10:03 PM
To: Dave, Tushar N
Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: Re: 82571EB: Detected Hardware Unit Hang

> On 07/11/12 12:05, Dave, Tushar N wrote:
>> When you said you had this issue with the RHEL5 and RHEL6 drivers, did you
>> install a RHEL5/6 kernel and reproduce it? If so, I think I should install
>> RHEL6 and try to reproduce it locally!
>
> Yes, I reproduced this on both RHEL5 and RHEL6. So far, using scp to copy a
> big file (~1GB) hits it at once.
>
> Thanks,
> Joe

Joe,

I see a couple of errors in the lspci output. The device status register shows an uncorrectable PCIe error, which means something has certainly gone wrong. The only way to recover from uncorrectable errors is a reset.

        DevSta: CorrErr- *UncorrErr+ FatalErr+ UnsuppReq+ AuxPwr+ TransPend-

The AER section of the lspci output also shows a PCIe completion timeout:

        Capabilities: [100 v1] Advanced Error Reporting
                UESta: DLP- SDES- TLP- FCP- *CmpltTO+ CmpltAbrt- UnxCmplt- RxOF- MalfTLP+ ECRC- UnsupReq+ ACSViol-

I suggest you load the AER driver and check the log for any error messages. Please also check for any error messages reported by the system in the BIOS log. Are there any machine check errors?

When did you first notice this issue? Has the 82571 ever worked on this server?

One more thing: a cache line size of 256 is a little unusual (I have never seen this value before; it is usually 64). Have the BIOS settings been changed, or are you using the default BIOS settings?

Thanks.

-Tushar
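The check described above (looking for flags reported with a trailing "+" in the DevSta and AER status lines) is easy to automate. What follows is a minimal sketch, assuming the text comes from `lspci -vvv`; the function name `set_error_flags` is hypothetical, and treating AuxPwr as a non-error flag is this sketch's assumption, not something stated in the thread. The sample lines are copied from the lspci output Joe posted.

```python
import re

# Status lines copied from the lspci -vvv output posted earlier in the thread.
SAMPLE = """\
DevSta: CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr+ TransPend-
UESta: DLP- SDES- TLP- FCP- CmpltTO+ CmpltAbrt- UnxCmplt- RxOF- MalfTLP+ ECRC- UnsupReq+ ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
"""

# Assumption for this sketch: AuxPwr+ only reports auxiliary power, not an error.
NOT_ERRORS = {"AuxPwr"}


def set_error_flags(text, registers=("DevSta", "UESta", "CESta")):
    """Return {register: [flags shown with '+']} for the chosen status registers."""
    found = {}
    for line in text.splitlines():
        name, sep, rest = line.strip().partition(":")
        if not sep or name not in registers:
            continue
        flags = re.findall(r"(\w+)\+", rest)
        found[name] = [flag for flag in flags if flag not in NOT_ERRORS]
    return found


if __name__ == "__main__":
    for register, flags in set_error_flags(SAMPLE).items():
        print(register, ", ".join(flags) if flags else "clean")
```

Running the same function on output captured right after boot and again after the hang would show whether the error bits were already latched before any traffic was sent, which is the question raised earlier in the thread.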
Re: [E1000-devel] Bonding + ixgbe breaks with jumbo frames if the MTU is not set on bond0 before adding slaves
Hi Nathan,

If you'd like to try it, attached is a patch that fixes the issue. The patch was generated against the 3.9.17 driver.

Thanks,
Stephen

-----Original Message-----
From: Nathan March [mailto:nat...@gt.net]
Sent: Tuesday, July 03, 2012 10:53 AM
To: Ko, Stephen S
Cc: e1000-devel@lists.sourceforge.net
Subject: Re: [E1000-devel] Bonding + ixgbe breaks with jumbo frames if the MTU is not set on bond0 before adding slaves

> [...]

Attachment: ixgbe_fix_mac_flush.patch
Description: ixgbe_fix_mac_flush.patch
[E1000-devel] ixgbe 3.7.21: NULL skb deref in ixgbe_clean_rx_irq_ps()
Using the 3.7.21 version of the ixgbe driver we can reliably produce a crash with this signature:

BUG: unable to handle kernel NULL pointer dereference at 006c
IP: [a005afef] ixgbe_poll+0x9df/0x1710 [ixgbe]
PGD 814c7b067 PUD 8074dd067 PMD 0
Oops: [#1] SMP
last sysfs file: /sys/devices/virtual/bypass/8-9/ping_watchdog
CPU 2
Pid: 18925, comm: sport Tainted: P 2.6.32-perf #1 To Be Filled By O.E.M.
RIP: 0010:[a005afef] [a005afef] ixgbe_poll+0x9df/0x1710 [ixgbe]
RSP: 0018:88080750b8b0 EFLAGS: 00010246
RAX: RBX: 88040f816f00 RCX:
RDX: 0020 RSI: c9000429c000 RDI: 88040f891d80
RBP: 88080750b970 R08: 0100 R09:
R10: 0100 R11: 88080750bfd8 R12:
R13: c900041221b8 R14: 8804077580b0 R15: 000b
FS: 7f61ccda9700() GS:88002828() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 006c CR3: 000814436000 CR4: 06e0
DR0: DR1: DR2:
DR3: DR6: 0ff0 DR7: 0400
Process sport (pid: 18925, threadinfo 88080750a000, task 880814beeb60)
Stack: 8804148b0540 000e 88040703d1c0 0 8804148b0598 88080750b918 815671ac 0001000359c2 0 880410744700 0040 88040f891d80 004011087c9c
Call Trace:
 [815671ac] ? ip_finish_output+0x13c/0x310
 [8152b468] net_rx_action+0xb8/0x400
 [81517a84] ? sock_def_readable+0x44/0x80
 [81066a91] __do_softirq+0xc1/0x1d0
 [8100c1ec] call_softirq+0x1c/0x30
 [8100de25] do_softirq+0x65/0xa0
 [8106699a] local_bh_enable+0x9a/0xb0
 [815176fc] lock_sock_nested+0xac/0xc0
 [81641f0b] ? _spin_unlock_bh+0x1b/0x20
 [81517627] ? release_sock+0xd7/0x100
 [81571838] tcp_recvmsg+0x38/0xe80
 [812d4c19] ? cpumask_next_and+0x29/0x50
 [8104b6f4] ? find_busiest_group+0x244/0xb10
 [810544d2] ? default_wake_function+0x12/0x20
 [81516cf9] sock_common_recvmsg+0x39/0x50
 [81516829] sock_aio_read+0x159/0x160
 [8104dbd3] ? perf_event_task_sched_out+0x33/0x80
 [810097ac] ? __switch_to+0x1ac/0x320
 [815166d0] ? sock_aio_read+0x0/0x160
 [811533bb] do_sync_readv_writev+0xfb/0x140
 [810853b0] ? autoremove_wake_function+0x0/0x40
 [811543df] do_readv_writev+0xcf/0x1f0
 [8156dc0d] ? do_tcp_getsockopt+0x3d/0x5f0
 [81012879] ? read_tsc+0x9/0x20
 [8108fc13] ? ktime_get+0x63/0xe0
 [810650c2] ? ns_to_timeval+0x12/0x40
 [810896af] ? hrtimer_get_remaining+0x3f/0x50
 [811546d3] vfs_readv+0x43/0x60
 [811547d1] sys_readv+0x51/0x80
 [8100b132] system_call_fastpath+0x16/0x1b
Code: c1 e5 03 4c 03 6b 20 4d 8b 65 00 49 c7 45 00 00 00 00 00 0f ae e8 48 8b 53 28 31 c0 f6 c2 10 74 0a 41 f7 06 00 00 1e 00 0f 95 c0 41 8b 74 24 6c 49 8b 8c 24 b0 01 00 00 85 f6 0f 18 09 0f 85 c0
RIP [a005afef] ixgbe_poll+0x9df/0x1710 [ixgbe]
 RSP 88080750b8b0
CR2: 006c
---[ end trace 9db4623b9591cd54 ]---

addr2line says this is happening on line 2028 below - so a NULL skb pointer is being passed to skb_is_nonlinear():

1990 static bool ixgbe_clean_rx_irq_ps(struct ixgbe_q_vector *q_vector,
1991                                   struct ixgbe_ring *rx_ring,
1992                                   int budget)
1993 {
.
2021         rmb();
2022
2023         pkt_is_rsc = ixgbe_get_rsc_state(rx_ring, rx_desc);
2024
2025         prefetch(skb->data);
2026
2027         /* pull the header of the skb in if no data is already present */
2028         if (!skb_is_nonlinear(skb)) {
2029                 __skb_put(skb, ixgbe_get_hlen(rx_ring, rx_desc));

Anyone have a guess as to the cause? Or have you seen similar?

One good clue that we've found is that the problem disappears if we turn off irq balancing.

-- Arthur
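As a cross-check on the addr2line result, the instruction bytes in the Code: line can be decoded by hand. The sketch below is not a general disassembler: it handles only the single encoding form (REX prefix, opcode 0x8B, ModRM with SIB, 8-bit displacement) used by the `41 8b 74 24 6c` sequence. It assumes that sequence is the faulting instruction (the archive has stripped the usual marker, but 43 bytes precede it in the dump, which matches where x86 kernels normally place RIP) and that the blank R12 in the register dump was zero. The function name `decode_mov_load` is made up for illustration.

```python
# Minimal sketch: decode one x86-64 "mov r32, [base + disp8]" instruction.
# Supports only REX + 0x8B + ModRM(mod=01, rm=100) + SIB(no index) + disp8.

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi",
         "r8d", "r9d", "r10d", "r11d", "r12d", "r13d", "r14d", "r15d"]


def decode_mov_load(insn: bytes) -> tuple[str, str, int]:
    """Return (destination register, base register, displacement)."""
    rex, opcode, modrm, sib, disp = insn  # exactly five bytes expected
    if rex & 0xF0 != 0x40 or opcode != 0x8B:
        raise ValueError("not a REX-prefixed 0x8B mov")
    if rex & 0x08:
        raise ValueError("REX.W set: 64-bit destination not handled here")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b01 or rm != 0b100:
        raise ValueError("expected disp8 addressing with a SIB byte")
    index = ((sib >> 3) & 7) | ((rex & 0x02) << 2)  # REX.X extends the index
    if index != 0b100:
        raise ValueError("index register present: not handled here")
    base = (sib & 7) | ((rex & 0x01) << 3)          # REX.B extends the base
    dest = reg | ((rex & 0x04) << 1)                # REX.R extends the dest
    if disp >= 0x80:                                # disp8 is signed
        disp -= 0x100
    return REG32[dest], REG64[base], disp


dest, base, disp = decode_mov_load(bytes.fromhex("418b74246c"))
print(f"mov {disp:#x}(%{base}), %{dest}")
```

With R12 equal to zero, the load address is 0 + 0x6c, which matches the CR2 value in the oops. That is consistent with a 32-bit field being read at offset 0x6c from a NULL skb, as the skb_is_nonlinear() call on line 2028 would do.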
Re: [E1000-devel] 82571EB - Detected Hardware Unit Hang
-Original Message-
From: Andrew Peng [mailto:peng...@gmail.com]
Sent: Wednesday, July 11, 2012 8:50 AM
To: e1000-devel@lists.sourceforge.net
Subject: [E1000-devel] 82571EB - Detected Hardware Unit Hang

Folks,

I've been getting some strange error messages in my home server / router that I've been having trouble debugging. I'm decently proficient in Linux, but I fear I'm in over my head with this one.

The hardware is a HP N40L Microserver - here are the hardware details - http://n40l.wikia.com/wiki/Base_Hardware

I am running Debian Squeeze 6.0:

pengc99@gaia:/$ sudo uname -a
Linux gaia 2.6.32-5-amd64 #1 SMP Sun May 6 04:00:17 UTC 2012 x86_64 GNU/Linux

I also subscribe to Ksplice's Uptrack system but since I have the newest kernel installed (as released by Debian) there have been no hot-patches yet.

This is the message I've been getting in /var/log/kern.log:

Jul 11 08:55:38 gaia kernel: [402056.009687] e1000e :02:00.0: eth1: Detected Hardware Unit Hang:
Jul 11 08:55:38 gaia kernel: [402056.009690] TDH fc
Jul 11 08:55:38 gaia kernel: [402056.009692] TDT fd
Jul 11 08:55:38 gaia kernel: [402056.009693] next_to_use fd
Jul 11 08:55:38 gaia kernel: [402056.009694] next_to_clean fc
Jul 11 08:55:38 gaia kernel: [402056.009695] buffer_info[next_to_clean]:
Jul 11 08:55:38 gaia kernel: [402056.009697] time_stamp 105fc92b2
Jul 11 08:55:38 gaia kernel: [402056.009698] next_to_watch fc
Jul 11 08:55:38 gaia kernel: [402056.009699] jiffies 105fc93da
Jul 11 08:55:38 gaia kernel: [402056.009700] next_to_watch.status 0
Jul 11 08:55:38 gaia kernel: [402056.009701] MAC Status 80383
Jul 11 08:55:38 gaia kernel: [402056.009702] PHY Status 792d
Jul 11 08:55:38 gaia kernel: [402056.009703] PHY 1000BASE-T Status 3800
Jul 11 08:55:38 gaia kernel: [402056.009705] PHY Extended Status 3000
Jul 11 08:55:38 gaia kernel: [402056.009706] PCI Status 10

Complete output of lspci:

pengc99@gaia:/$ lspci
00:00.0 Host bridge: Advanced Micro Devices [AMD] RS880 Host Bridge
00:01.0 PCI bridge: Hewlett-Packard Company Device 9602
00:02.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (ext gfx port 0)
00:06.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 2)
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [AHCI mode] (rev 40)
00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 42)
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller (rev 40)
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge (rev 40)
00:16.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:16.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] Family 10h Processor Link Control
01:05.0 VGA compatible controller: ATI Technologies Inc M880G [Mobility Radeon HD 4200]
02:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
02:00.1 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5723 Gigabit Ethernet PCIe (rev 10)

Output of lspci -vvv (as root, network adapter section):

02:00.0 Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)
    Subsystem: Hewlett-Packard Company NC360T PCI Express Dual Port Gigabit Server Adapter
    Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 26
    Region 0: Memory at fe8e (32-bit, non-prefetchable) [size=128K]
    Region 1: Memory at fe8c (32-bit, non-prefetchable) [size=128K]
    Region 2: I/O ports at e800 [size=32]
    Expansion ROM at fe8a [disabled] [size=128K]
    Capabilities: [c8] Power Management version 2
        Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
        Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable-
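The hex fields in the hang report above can be turned into two quick numbers: how many transmit descriptors the hardware still owes, and how long the oldest one has been waiting. The sketch below is only an illustration. The ring size of 256 and the tick rate of 250 Hz are assumptions (neither appears in the message), and the helper names `parse_hex_fields` and `summarize` are made up.

```python
import re

# Fields copied from the hang report above, with the syslog prefix removed
# and a space restored between each name and its hex value.
HANG_REPORT = """\
TDH fc
TDT fd
next_to_use fd
next_to_clean fc
time_stamp 105fc92b2
next_to_watch fc
jiffies 105fc93da
"""


def parse_hex_fields(text: str) -> dict[str, int]:
    """Map each 'name hexvalue' line to an integer."""
    fields = {}
    for line in text.splitlines():
        match = re.fullmatch(r"\s*(\S+)\s+([0-9a-fA-F]+)\s*", line)
        if match:
            fields[match.group(1)] = int(match.group(2), 16)
    return fields


def summarize(fields: dict[str, int], ring_size: int = 256, hz: int = 250):
    """Return (pending descriptors, age in jiffies, age in seconds).

    ring_size and hz are assumed values, not taken from the report.
    """
    pending = (fields["TDT"] - fields["TDH"]) % ring_size
    age_jiffies = fields["jiffies"] - fields["time_stamp"]
    return pending, age_jiffies, age_jiffies / hz


fields = parse_hex_fields(HANG_REPORT)
pending, age_jiffies, age_seconds = summarize(fields)
print(f"{pending} descriptor(s) pending for {age_jiffies} jiffies "
      f"(~{age_seconds:.3f} s at the assumed tick rate)")
```

Under those assumptions a single descriptor (index fc) has been outstanding for a little over a second, with its status still 0, which is the kind of stall the driver reports as a hang.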
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
On 07/12/12 02:51, Dave, Tushar N wrote:
> Joe,
>
> I see a couple of errors in the lspci output. The device status register shows an uncorrectable PCIe error, which means something has certainly gone wrong. The only way to recover from uncorrectable errors is a reset.
>
>   DevSta: CorrErr- *UncorrErr+ FatalErr+ UnsuppReq+ AuxPwr+ TransPend-
>
> The AER section in the lspci output also shows a PCIe completion timeout.
>
>   Capabilities: [100 v1] Advanced Error Reporting
>     UESta: DLP- SDES- TLP- FCP- *CmpltTO+ CmpltAbrt- UnxCmplt- RxOF- MalfTLP+ ECRC- UnsupReq+ ACSViol-
>
> I suggest you load the AER driver and check for any error messages in the log. Also, please check for any error messages reported by the system in the BIOS log. Are there any machine check errors?
>
> When did you notice this issue? Has the 82571 ever worked on this server before?
>
> One more thing: a cache line size of 256 is a little unusual (I have never seen this value before; mostly it's 64). Have the BIOS settings been changed? Are you using the default BIOS settings?

I checked the BIOS log and found a fault reported from the device. I changed the PCI-E Payload Size from 256 (the default) to 128, and now the device works.

Comparing the lspci output, I found that the Address and Data of the MSI capability have changed:

Old:
  Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Address: fee21000 Data: 40cb
New:
  Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Address: fee24000 Data: 405c

Most likely it's a BIOS bug? Please comment.

Thanks,
Joe
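The DevSta and UESta lines quoted above follow the usual lspci -vvv convention of suffixing each flag with + (set) or - (clear), so the asserted error bits can be pulled out mechanically. This is a small sketch rather than a full lspci parser. The helper name `set_flags` is made up, and treating a leading `*` as the poster's own emphasis marker is an assumption.

```python
import re


def set_flags(line: str) -> list[str]:
    """Return the names of flags marked '+' in one lspci status line."""
    # Drop the "DevSta:" / "UESta:" label, then scan the flag tokens.
    _, _, flags = line.partition(":")
    return [m.group(1)
            for m in re.finditer(r"\*?(\w+)([+-])", flags)
            if m.group(2) == "+"]


devsta = "DevSta: CorrErr- *UncorrErr+ FatalErr+ UnsuppReq+ AuxPwr+ TransPend-"
uesta = ("UESta: DLP- SDES- TLP- FCP- *CmpltTO+ CmpltAbrt- UnxCmplt- "
         "RxOF- MalfTLP+ ECRC- UnsupReq+ ACSViol-")

print(set_flags(devsta))
print(set_flags(uesta))
```

For the two lines in this thread, the set flags are UncorrErr, FatalErr, UnsuppReq and AuxPwr in DevSta, and CmpltTO, MalfTLP and UnsupReq in UESta, so the completion timeout is not the only uncorrectable error recorded.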
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
On 07/12/12 10:52, Dave, Tushar N wrote:
> What are the exact error messages in the BIOS log?

Error message from the BIOS event log:

07/12/12 05:54:00 PCI Express Non-Fatal Error

Thanks,
Joe
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
> -Original Message-
> From: Joe Jin [mailto:joe@oracle.com]
> Sent: Wednesday, July 11, 2012 7:58 PM
> To: Dave, Tushar N
> Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: Re: 82571EB: Detected Hardware Unit Hang
>
> On 07/12/12 10:52, Dave, Tushar N wrote:
>> What is the exact error messages in BIOS log?
>
> Error message from BIOS event log:
> 07/12/12 05:54:00 PCI Express Non-Fatal Error
>
> Thanks,
> Joe

Thanks. I will check with the team tomorrow whether this (the max payload size change) can be treated as a solution to this issue. We can learn more about exactly which non-fatal error occurred if we capture a bus trace.

We should also check the EEPROM on this device to make sure it is up to date. Send me the full EEPROM dump in a file and I will confirm with the team that it is up to date.

Thanks for your work.

-Tushar
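For anyone comparing payload settings at the register level, the two sizes lspci prints on the DevCtl line are encoded fields of the 16-bit PCI Express Device Control register (offset 0x08 in the PCI Express capability): Max_Payload_Size in bits 7:5 and Max_Read_Request_Size in bits 14:12, each meaning 128 bytes shifted left by the field value. The sketch below decodes them. The function name `decode_devctl` is made up, and the sample values 0x2810 and 0x2830 are constructed for illustration, not read from the failing adapter.

```python
def decode_devctl(devctl: int) -> dict[str, int]:
    """Decode the size fields of a PCIe Device Control register value.

    Field values 0..5 map to 128..4096 bytes; 6 and 7 are reserved and
    are not checked here.
    """
    return {
        "max_payload": 128 << ((devctl >> 5) & 0x7),    # bits 7:5
        "max_read_req": 128 << ((devctl >> 12) & 0x7),  # bits 14:12
    }


# Illustrative values only: same read request size, different payload size.
print(decode_devctl(0x2810))  # payload field 0, read request field 2
print(decode_devctl(0x2830))  # payload field 1, read request field 2
```

Decoding the register on both ends of the link (the adapter and the bridge above it) before and after the BIOS change would show whether the two sides ever disagreed on the payload size.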
Re: [E1000-devel] 82571EB: Detected Hardware Unit Hang
> -Original Message-
> From: Joe Jin [mailto:joe@oracle.com]
> Sent: Wednesday, July 11, 2012 8:13 PM
> To: Dave, Tushar N
> Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux-ker...@vger.kernel.org
> Subject: Re: 82571EB: Detected Hardware Unit Hang
>
> On 07/12/12 11:07, Dave, Tushar N wrote:
>> -Original Message-
>> From: Joe Jin [mailto:joe@oracle.com]
>> Sent: Wednesday, July 11, 2012 7:58 PM
>> To: Dave, Tushar N
>> Cc: e1000-de...@lists.sf.net; net...@vger.kernel.org; linux-ker...@vger.kernel.org
>> Subject: Re: 82571EB: Detected Hardware Unit Hang
>>
>> On 07/12/12 10:52, Dave, Tushar N wrote:
>>> What is the exact error messages in BIOS log?
>>
>> Error message from BIOS event log:
>> 07/12/12 05:54:00 PCI Express Non-Fatal Error
>>
>> Thanks,
>> Joe
>
> Hi Tushar,
>
> Please find eeprom from attachment.

Do you have an lspci -vvv dump of the entire system from before and after the issue occurs? If you do, can you send it to me?