> -----Original Message-----
> From: Thomas Jarosch [mailto:thomas.jaro...@intra2net.com]
> Sent: Friday, February 13, 2015 8:15 AM
> To: Brown, Aaron F
> Cc: Kirsher, Jeffrey T; 'Linux Netdev List'; Eric Dumazet; e1000-devel
> Subject: Re: [bisected regression] e1000e: "Detected Hardware Unit Hang"
> 
> Hi Aaron,
> 
> On Thursday, 12. February 2015 23:28:27 Brown, Aaron F wrote:
> > I do not have any real info.  I had been asked to try and reproduce some
> > unit hangs (maybe for this) recently and did not succeed in producing
> > them on the parts I have.  Reading through the thread I see this is
> > showing up in a NAT environment.  Is the port that is getting the unit
> > hang in the NAT system?
> 
> yes, the e1000e NIC is serving the NATed Windows client.
> 
> The setup was outlined here:
> 
>     http://marc.info/?l=linux-netdev&m=142133691713824&w=2
> 
> > I will make some attempts at replicating this with the port in a NAT and
> > or forwarding role.  Has a bug been opened for this?  Or has information
> > for this specific unit hang been entered into one of the other unit hang
> > bugs opened against e1000e?
> 
> I didn't do anything(tm). This report sounds like the same issue:
> 
>     http://ehc.ac/p/e1000/bugs/378/
> 
> Oliver Wagner wrote the problem started to appear
> after updating from kernel 3.5 to 3.8.0.35 (new frag size code).
> 
> I just noticed now he wrote he has two identical boxes:
> 
> ---------------------------------------------------
> - Box with symptoms: Router/Firewall, packet forwarding
>   between different VLANs on eth0 and eth1
> - Box without symptoms: Fileserver, eth0/eth1 bonded
>   (VLANs used, but no forwarding)
> ---------------------------------------------------
> 
> So it looks like it's related to forwarding somehow,
> I've had the same experience IIRC.

Thanks, that (and the multiple bug write-ups on SourceForge) gave me more than 
enough to go on.  I was able to replicate it on a handful of systems in my lab. 
On affected systems, setting up a NAT and stressing the interfaces with even 
moderate traffic levels triggers it pretty quickly.  It appears that the NAT 
part is unnecessary: just setting the systems up as a software router and 
running some traffic across it also triggers it, giving the same apparent 
behavior (tx hang, watchdog timeout trace, port reset).
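
For anyone trying to confirm a reproduction, a minimal sketch of counting hang
reports in captured kernel log text might look like the following.  The
message text is the one quoted in the subject of this thread; the function
name and the sample log lines are hypothetical, and the exact layout of the
surrounding log line is an assumption.

```python
import re

# Message text taken from the subject line of this thread. The rest of the
# log line (driver prefix, PCI address, interface name) is not assumed here.
HANG_PATTERN = re.compile(r"Detected Hardware Unit Hang")


def count_unit_hangs(log_lines):
    """Return the number of lines that contain the unit hang message."""
    return sum(1 for line in log_lines if HANG_PATTERN.search(line))


# Hypothetical sample lines, for illustration only (not real log output).
sample = [
    "e1000e eth0: Detected Hardware Unit Hang:",
    "some unrelated kernel message",
    "e1000e eth0: Detected Hardware Unit Hang:",
]
hang_count = count_unit_hangs(sample)
```

Feeding it the lines of a saved kernel log before and after a forwarding
stress run would give a rough idea of whether a given setup triggers the hang.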

With an internal reproduction of the issue in hand, I have created an internal 
bug report, described my set of reproductions, referenced the similar external 
ones, and assigned it to our current e1000e developer.

Thanks again,
Aaron

> 
> Cheers,
> Thomas

