On 06/13/2011 02:53 PM, Ben Greear wrote:
> On 06/13/2011 01:54 PM, Brandeburg, Jesse wrote:
>> The generator box seems to be resetting the controller (and therefore the
>> link). Check out the ethtool stats on both sides and look for tx_timeout.
>>
>
> On the emulator (igb) system, this counter is steadily increasing every
> few seconds:
>
>       tx_timeout_count: 9
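
For watching that counter over time, here is a minimal Python sketch. It
assumes `ethtool -S` prints its usual indented `name: value` lines; the
function names (parse_ethtool_stats, tx_timeout_counters, read_stats) are
made up for illustration and are not part of ethtool or the igb driver.

```python
import subprocess


def parse_ethtool_stats(text):
    """Parse `ethtool -S` style output into a {name: int} dict.

    Lines without an integer value (such as the "NIC statistics:"
    header) are skipped.
    """
    stats = {}
    for line in text.splitlines():
        name, sep, value = line.rpartition(":")
        if not sep:
            continue
        try:
            stats[name.strip()] = int(value.strip())
        except ValueError:
            continue
    return stats


def tx_timeout_counters(stats):
    """Return only the counters whose name mentions tx_timeout."""
    return {k: v for k, v in stats.items() if "tx_timeout" in k}


def read_stats(iface):
    """Run `ethtool -S <iface>` and parse the result (needs ethtool installed)."""
    out = subprocess.run(
        ["ethtool", "-S", iface],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_ethtool_stats(out)
```

Calling tx_timeout_counters(read_stats("eth2")) every few seconds on both
boxes should show which side is actually hitting transmit timeouts.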

I think the problem lies in the same place as the crash I reported
last time:

adapter->num_tx_queues is 1, but netdev->num_tx_queues is 16.

I'm guessing that is not a correct state to be in?

I believe that must be confusing the logic that keeps the
trans_start counter up to date, but I haven't confirmed
that yet.


Thanks,
Ben

>
> We are using a proprietary emulator module on this system, and it
> directly sends pkts to the NIC using ndo_start_xmit().  When we
> add jitter, we would tend to hold off sending pkts for a few ms and
> then burst some, so perhaps we are over-driving the queues somehow.
>
> I tried with our user-space app, but it can only handle around 850Mbps
> instead of full 1G.  At any rate, I cannot reproduce the problem with
> just our user-space app.
>
> I don't see any other counters of interest that are not zero
> in the ethtool stats.
>
> We did have a clean run of several hours with the in-kernel driver,
> so it does seem to be software related.
>
> I'll take a quick poke through the igb source and see if I notice
> anything strange around the xmit timeout code.
>
> Thanks,
> Ben
>
>
>> --
>> Jesse Brandeburg's iPhone
>>
>> On Jun 13, 2011, at 1:35 PM, "Ben Greear"<[email protected]>   wrote:
>>
>>> I have a traffic generator system with e1000e running pktgen, and an emulator
>>> (bridge-ish) system with igb (3.0.19 driver).  When we add jitter on the
>>> emulator, and only then, we start seeing link bounces.
>>>
>>> If we use the stock 2.6.38.8 igb driver on the emulator, we do not see link
>>> bounces.  The kernel logs make me *think* that it's the generator that
>>> is resetting the link, however.
>>>
>>> We tried two different generators, but both were e1000e NICs.
>>>
>>> I'm curious if anyone has any suggestions about how to tell which
>>> NIC is causing the resets, and why it might be doing so.
>>>
>>> Here are kernel logs from the generator:
>>>
>>> e1000e: eth2 NIC Link is Down
>>> e1000e 0000:08:00.0: eth2: Reset adapter
>>> e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>> e1000e: eth3 NIC Link is Down
>>> e1000e 0000:08:00.1: eth3: Reset adapter
>>> e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>> e1000e: eth2 NIC Link is Down
>>> e1000e 0000:08:00.0: eth2: Reset adapter
>>> e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>> e1000e: eth3 NIC Link is Down
>>> e1000e 0000:08:00.1: eth3: Reset adapter
>>> e1000e: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>> e1000e: eth2 NIC Link is Down
>>> e1000e 0000:08:00.0: eth2: Reset adapter
>>> e1000e: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
>>> [root@simech2 ~]# ethtool -i eth3
>>> driver: e1000e
>>> version: 1.2.20-k2
>>> firmware-version: 5.6-2
>>> bus-info: 0000:08:00.1
>>>
>>> And here is from the emulator:
>>>
>>> igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>>> ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
>>> igb: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>>> ADDRCONF(NETDEV_CHANGE): eth3: link becomes ready
>>> igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>>> ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
>>> igb: eth2 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>>> ADDRCONF(NETDEV_CHANGE): eth2: link becomes ready
>>> igb: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
>>> ADDRCONF(NETDEV_CHANGE): eth3: link becomes ready
>>>
>>>
>>> Thanks,
>>> Ben
>>>
>>> --
>>> Ben Greear<[email protected]>
>>> Candela Technologies Inc  http://www.candelatech.com
>>>
>>>
>>> _______________________________________________
>>> E1000-devel mailing list
>>> [email protected]
>>> https://lists.sourceforge.net/lists/listinfo/e1000-devel
>>> To learn more about Intel® Ethernet, visit
>>> http://communities.intel.com/community/wired
>
>


-- 
Ben Greear <[email protected]>
Candela Technologies Inc  http://www.candelatech.com

