On 06/13/2011 02:53 PM, Ben Greear wrote:
> On 06/13/2011 01:54 PM, Brandeburg, Jesse wrote:
>> The generator box seems to be resetting the controller (and therefore the
>> link). Check out ethtool stats on both sides and look for tx_timeout.
>>
>
> The emulator (igb) system is steadily increasing every
> few seconds:
>
> tx_timeout_count: 9

I think the problem lies in the same place that I reported the crash last time:
adapter->num_tx_queues is 1, but netdev->num_tx_queues is 16.

I'm guessing that is not a correct state to be in? I believe that must
be confusing the logic that keeps the trans_start timestamp up to date,
but I haven't confirmed that yet.

Thanks,
Ben

>
> We are using a proprietary emulator module on this system, and it
> directly sends pkts to the NIC using ndo_start_xmit(). When we
> add jitter, we would tend to hold off sending pkts for a few ms and
> then burst some, so perhaps we are over-driving the queues somehow.
>
> I tried with our user-space app, but it can only handle around 850Mbps
> instead of full 1G. At any rate, I cannot reproduce the problem with
> just our user-space app.
>
> I don't see any other counters of interest that are not zero
> in the ethtool stats.
>
> We did have a clean run of several hours with the in-kernel driver,
> so it does seem to be software related.
>
> I'll take a quick poke through the igb source and see if I notice
> anything strange around the xmit timeout code.
>
> Thanks,
> Ben
>
>
>> --
>> Jesse Brandeburg's iPhone
>>
>> On Jun 13, 2011, at 1:35 PM, "Ben Greear"<[email protected]> wrote:
>>
>>> I have a traffic generator system with e1000e running pktgen, and emulator
>>> (bridge-ish)
>>> system with igb (3.0.19 driver). When we add jitter on the emulator,
>>> and only then, we start seeing link bounces.
>>>
>>> If we use the stock 2.6.38.8 igb driver on the emulator, we do not see link
>>> bounces. The kernel logs make me *think* that it's the generator that
>>> is resetting the link, however.
>>>
>>> We tried two different generators, but both were e1000e NICs.
>>>
>>> I'm curious if anyone has any suggestions about how to tell which
>>> NIC is causing the resets, and why it might be doing so.
>>>
>>> Here are kernel logs from the generator:
>>>
>>> e1000e: eth2 NIC Link is Down
>>> e1000e 0000:08:00.0: eth2: Reset adapter
>>> e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>> e1000e: eth3 NIC Link is Down
>>> e1000e 0000:08:00.1: eth3: Reset adapter
>>> e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>> e1000e: eth2 NIC Link is Down
>>> e1000e 0000:08:00.0: eth2: Reset adapter
>>> e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>> e1000e: eth3 NIC Link is Down
>>> e1000e 0000:08:00.1: eth3: Reset adapter
>>> e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>> e1000e: eth2 NIC Link is Down
>>> e1000e 0000:08:00.0: eth2: Reset adapter
>>> e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>> [root@simech2 ~]# ethtool -i eth3
>>> driver: e1000e
>>> version: 1.2.20-k2
>>> firmware-version: 5.6-2
>>> bus-info: 0000:08:00.1
>>>
>>> And here is from the emulator:
>>>
>>> igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>>> ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
>>> igb: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>>> ADDRCONF(NETDEV_CHANGE): eth3: link becomes ready
>>> igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>>> ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
>>> igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>>> ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
>>> igb: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>>> ADDRCONF(NETDEV_CHANGE): eth3: link becomes ready
>>>
>>>
>>> Thanks,
>>> Ben
>>>
>>> --
>>> Ben Greear<[email protected]>
>>> Candela Technologies Inc http://www.candelatech.com
>
>

-- 
Ben Greear <[email protected]>
Candela Technologies Inc  http://www.candelatech.com

_______________________________________________
E1000-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
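
To make the queue-count theory above concrete, here is a small user-space model of how a mismatch between adapter->num_tx_queues and netdev->num_tx_queues could, in principle, produce tx timeouts. It is only a sketch of the unconfirmed hypothesis, not igb or kernel code: every model_* name, the struct layouts and the tick values are invented for illustration. It assumes the stack's watchdog checks every queue the netdev advertises for "stopped and no transmit within the timeout", while the driver only ever wakes the rings it actually owns.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_MAX_QUEUES 16

/* Hypothetical stand-in for a per-queue tx state; not the kernel struct. */
struct model_txq {
    unsigned long trans_start; /* tick of the last transmit on this queue */
    bool stopped;              /* queue has been stopped and not yet woken */
};

struct model_dev {
    struct model_txq txq[MODEL_MAX_QUEUES];
    int netdev_queues;            /* models netdev->num_tx_queues */
    int adapter_queues;           /* models adapter->num_tx_queues */
    unsigned long watchdog_timeo; /* timeout in ticks */
};

/* Wrap-safe "a is later than b" comparison on tick counters. */
static bool tick_after(unsigned long a, unsigned long b)
{
    return (long)(b - a) < 0;
}

static void model_init(struct model_dev *d, int netdev_queues,
                       int adapter_queues, unsigned long timeo)
{
    assert(netdev_queues > 0 && netdev_queues <= MODEL_MAX_QUEUES);
    assert(adapter_queues > 0 && adapter_queues <= netdev_queues);
    d->netdev_queues = netdev_queues;
    d->adapter_queues = adapter_queues;
    d->watchdog_timeo = timeo;
    for (int i = 0; i < MODEL_MAX_QUEUES; i++) {
        d->txq[i].trans_start = 0;
        d->txq[i].stopped = false;
    }
}

/* A caller transmits on queue idx at tick now, and the queue then stops. */
static void model_xmit_and_stop(struct model_dev *d, int idx,
                                unsigned long now)
{
    assert(idx >= 0 && idx < d->netdev_queues);
    d->txq[idx].trans_start = now;
    d->txq[idx].stopped = true;
}

/* Driver tx cleanup: only the rings the adapter knows about get woken. */
static void model_driver_clean(struct model_dev *d)
{
    for (int i = 0; i < d->adapter_queues; i++)
        d->txq[i].stopped = false;
}

/*
 * Watchdog pass over every queue the netdev advertises. Returns the index
 * of the first queue that is stopped past the timeout, or -1 if none is.
 */
static int model_watchdog(const struct model_dev *d, unsigned long now)
{
    for (int i = 0; i < d->netdev_queues; i++) {
        const struct model_txq *q = &d->txq[i];

        if (q->stopped &&
            tick_after(now, q->trans_start + d->watchdog_timeo))
            return i;
    }
    return -1;
}
```

With 16 netdev queues and 1 adapter queue, a transmit that lands on any queue other than 0 is never woken by model_driver_clean(), so model_watchdog() reports it once the timeout has passed. With matching counts the same sequence stays clean. Whether the real driver gets into this state the same way is exactly the part that has not been confirmed.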
