* Ronciak, John ([email protected]) wrote:
> Hi Dave,
>
> Please see my comments in-line below.
>
> Cheers,
> John
>
>
> > -----Original Message-----
> > From: Dr. David Alan Gilbert [mailto:[email protected]]
> > Sent: Monday, February 03, 2014 12:11 PM
> > To: Ronciak, John
> > Cc: [email protected]
> > Subject: Re: [E1000-devel] dual e100 'exec cuc_dump_reset' vs PCI
> > latency (possibly vs Tulip)
> >
> > * Ronciak, John ([email protected]) wrote:
> > > Some list removed for now
> >
> > Hi John,
> >   Thanks for the reply.
> >
> > > What do the HW stats for the failing port say?
> > > Is it receiving what it thinks are packets that a problem in some way?
> >
> > I'm fairly sure they weren't incrementing at all, and I took a tcpdump
> > that was showing nothing coming from the e100's at that point.
> > Let me know which counters/debug to collect and I'll be happy to gather
> > it.
> Output the stats using 'ethtool -S <ethx>'. Do this before the failure and
> then again after. You can also get us the stack stats using 'netstat -s'.
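
For anyone wanting to do the before/after comparison of 'ethtool -S' output mechanically, here is a rough Python sketch. It assumes the usual "name: value" layout under a 'NIC statistics:' header; the function names parse_ethtool_stats and changed_counters are made up for illustration, and the sample snapshots are invented rather than taken from the real log.

```python
import re


def parse_ethtool_stats(text):
    """Parse the text printed by `ethtool -S <iface>` into {counter: int}."""
    stats = {}
    for line in text.splitlines():
        # Counter lines look like "     rx_packets: 12345". The
        # "NIC statistics:" header has no numeric value, so it is skipped.
        m = re.match(r"\s*(\S[^:]*):\s*(-?\d+)\s*$", line)
        if m:
            stats[m.group(1).strip()] = int(m.group(2))
    return stats


def changed_counters(before, after):
    """Return {name: delta} for counters that differ between two snapshots."""
    return {
        name: after[name] - before[name]
        for name in before
        if name in after and after[name] != before[name]
    }


if __name__ == "__main__":
    # Invented sample snapshots, NOT values from bad-e100.log.
    snap_a = "NIC statistics:\n     rx_packets: 100\n     tx_packets: 50\n"
    snap_b = "NIC statistics:\n     rx_packets: 100\n     tx_packets: 50\n"
    delta = changed_counters(parse_ethtool_stats(snap_a),
                             parse_ethtool_stats(snap_b))
    print(delta or "no counters changed between snapshots")
```

An empty result between two consecutive one-minute snapshots is the "nothing is incrementing" case; a non-empty one shows which counters moved and by how much.
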
The log of ethtool -S and netstat -s is at:

   http://www.treblig.org/daveG/bad-e100.log

It comes from a loop that runs both commands once a minute (the
'NIC statistics' sections are the output of the ethtool).

To line those times up, here are the lines from the dmesg. I powered up
the switch/machines on the end of the e100 here:

[Sat Feb 8 19:40:08 2014] e100 0000:08:04.0 ethdad: NIC Link is Up 100 Mbps Full Duplex

and it was watching the camera a few minutes later, until it died here:

[Sat Feb 8 19:58:06 2014] e100 0000:08:04.0 ethdad: exec cuc_dump_reset failed
<repeats regularly>
[Sat Feb 8 19:58:10 2014] e100 0000:08:04.0 ethdad: exec cuc_dump_reset failed
[Sat Feb 8 19:59:54 2014] e100 0000:08:04.0 ethdad: No space for CB
[Sat Feb 8 19:59:54 2014] e100 0000:08:04.0 ethdad: scb.status=0x50

It recovered itself about here (I'd not seen it recover before, nor seen
the 'No space for CB'/scb.status messages before):

[Sat Feb 8 19:59:54 2014] e100 0000:08:04.0 ethdad: NIC Link is Up 100 Mbps Full Duplex
[Sat Feb 8 20:01:58 2014] e100 0000:08:04.0 ethdad: NIC Link is Down
[Sat Feb 8 20:05:40 2014] e100 0000:08:04.0 ethdad: NIC Link is Up 100 Mbps Full Duplex

If you search bad-e100.log for 19:56, that's before it failed, and
everything seems to be chugging along OK until 19:57; but then there is
no change in the output of the ethtool between 19:58 and 19:59.

Note: there is other stuff going on on other interfaces that the
netstat -s is seeing. I've only got one e100 in use tonight, and the
other Tulips seem to be carrying on OK even when the e100 is upset.

Dave
--
 -----Open up your eyes, open up your mind, open up your code -------
/ Dr. David Alan Gilbert    |       Running GNU/Linux       | Happy  \
\ gro.gilbert @ treblig.org |                               | In Hex /
 \ _________________________|_____ http://www.treblig.org   |_______/
_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
