Re: [E1000-devel] Issue with ixgbe driver with continual lsc_int's

Tantilov, Emil S Fri, 22 Jun 2012 12:47:06 -0700

>-----Original Message-----
>From: Tantilov, Emil S [mailto:[email protected]]
>Sent: Friday, June 22, 2012 11:49 AM
>To: Brian T. O'Neill; [email protected]
>Subject: Re: [E1000-devel] Issue with ixgbe driver with continual lsc_int's
>
>>-----Original Message-----
>>From: Brian T. O'Neill [mailto:[email protected]]
>>Sent: Friday, June 22, 2012 8:07 AM
>>To: [email protected]
>>Subject: [E1000-devel] Issue with ixgbe driver with continual lsc_int's
>>
>>We have an issue with X520-DA cards when they are connected to Arista
>>7124SX switches, and only these switches. All our other Arista switches
>>are fine. The issue is that the lsc_int counter reported by ethtool
>>increases continually; sometimes this takes the link down, sometimes not.
>>It's not the switch ports or cables: a non-Intel 10G card in the same
>>place has no issues. We have seen this, with varying frequency, on dozens
>>of cards: some log hundreds of lsc_int's a minute, others about 100 a day.
>>It has occurred on Dell R610s, Dell 2950s, and Dell 1950s. We have
>>hundreds of X520-DAs working perfectly on 3 other models of Arista
>>switches (which have a higher latency). Nothing on the switch shows any
>>issues. We have seen this with the native CentOS 5.4 and 6.2 ixgbe
>>drivers, as well as with a compiled 3.9.17 driver.
>>
>>Any help with first figuring out what would cause it to keep needing to
>>re-enable its IRQs would be great, and hopefully might lead to root cause.
>>Below is some debug output from the driver as well as ethtool -S output.
>>
>>Thanks,
>>Brian
>>
>>
>>We have compiled it with -DDBG and -DDEBUG to get more info, and this is
>>what we see in syslog:
>>
>>Jun 22 10:54:26 hostname kernel: ixgbe_msix_other: Reg - 0x00800, value - 0x80100000
>
>ixgbe_msix_other() is a function that handles interrupts unrelated to Tx/Rx
>(hence "other").
>
>In this case the EICR register (0x00800) indicates that the interrupt was
>caused by a link status change (bit 20 is set when the link changes from
>down to up or vice versa).
>
>Could you please open a bug on SF and attach the dmesg file in its
>entirety?
>
>Thanks,
>Emil
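
For anyone reading along, the register value in that log line can be decoded
by hand. Below is a minimal Python sketch, not driver code. The only bit
confirmed above is bit 20 (link status change). The meaning given to bit 31
(an "other cause" summary bit) is my assumption from memory of the ixgbe
register definitions, and decode_eicr is a hypothetical helper name. Please
check both against your driver source.

```python
# Cause bits in the EICR register (offset 0x00800).
# Bit 20 is the link status change bit described in this thread.
# Bit 31 is ASSUMED here to be the "other cause" summary bit; verify it
# against the register definitions shipped with your ixgbe driver.
EICR_LSC = 1 << 20
EICR_OTHER = 1 << 31

KNOWN_BITS = {"LSC": EICR_LSC, "OTHER": EICR_OTHER}


def decode_eicr(value):
    """Return the names of the known cause bits set in an EICR value."""
    return [name for name, mask in KNOWN_BITS.items() if value & mask]


def unknown_bits(value):
    """Return whatever bits are set that this sketch does not know about."""
    known_mask = 0
    for mask in KNOWN_BITS.values():
        known_mask |= mask
    return value & ~known_mask


print(decode_eicr(0x80100000), hex(unknown_bits(0x80100000)))
```

For the logged value 0x80100000 only bits 31 and 20 are set, so the interrupt
carries a link status change and nothing else this sketch would flag.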


Also, just to mention: we have seen a similar issue with Cisco switches.

Here's an excerpt from the README related to it:

  Cisco Catalyst 4948-10GE port resets may cause switch to shut down ports
  ------------------------------------------------------------------------
  82598-based hardware can re-establish link quickly and when connected to some 
  switches, rapid resets within the driver may cause the switch port to become 
  isolated due to "link flap". This is typically indicated by a yellow instead
  of a green link light. Several operations may cause this problem, such as 
  repeatedly running ethtool commands that cause a reset.  

  A potential workaround is to use the Cisco IOS command "no errdisable detect 
  cause all" from the Global Configuration prompt which enables the switch to 
  keep the interfaces up, regardless of errors.
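
To put a number on how often a link flaps, you can compare the lsc_int
counter from two "ethtool -S" captures taken some time apart. This is a rough
Python sketch under a few assumptions: the statistics are printed one per
line as "name: value", the function names are hypothetical, and you capture
the ethtool output yourself (for example from "ethtool -S eth0", where eth0
is only a placeholder interface name).

```python
import re


def parse_lsc_int(ethtool_stats):
    """Extract the lsc_int counter from the text of `ethtool -S <iface>`."""
    match = re.search(r"^\s*lsc_int:\s*(\d+)\s*$", ethtool_stats, re.MULTILINE)
    if match is None:
        raise ValueError("no lsc_int counter found in ethtool output")
    return int(match.group(1))


def flaps_per_minute(count_start, count_end, seconds):
    """Average link status change interrupts per minute between two samples."""
    if seconds <= 0:
        raise ValueError("the sampling interval must be positive")
    return (count_end - count_start) * 60.0 / seconds


# Made-up sample text in the assumed "name: value" layout.
first = "NIC statistics:\n     rx_packets: 10\n     lsc_int: 42\n"
second = "NIC statistics:\n     rx_packets: 99\n     lsc_int: 52\n"
rate = flaps_per_minute(parse_lsc_int(first), parse_lsc_int(second), 300)
print(rate)
```

A rate that stays above zero on an otherwise idle link would point to the
kind of repeated link status changes described in this thread.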


Thanks,
Emil

