We'll need your kernel config for the reproduction.

Todd Fujinaka
Software Application Engineer
Networking Division (ND)
Intel Corporation
todd.fujin...@intel.com
(503) 712-4565

-----Original Message-----
From: Tony Battersby [mailto:to...@cybernetics.com] 
Sent: Wednesday, January 07, 2015 3:02 PM
To: Fujinaka, Todd; e1000-devel@lists.sourceforge.net
Subject: Re: [E1000-devel] [PATCH] igb: fix Tx Unit Hang on 82576 with RSS > 1

Actually igb_cache_ring_register *did* change between igb-5.2.5.tar.gz and 
igb-5.2.9.2.tar.gz.  My first patch did nothing more than revert that change to 
make igb_cache_ring_register the same as it was in igb-5.2.5.tar.gz (plus 
adding the fall through comment from the upstream kernel).  I came up with the 
patch by manually bisecting the changes between the two driver versions to 
figure out what broke it.  So the fix that you are against was not my idea at 
all.  I was just reverting the change that caused breakage.

To recap:
igb-5.2.5.tar.gz does work, and does NOT have a "break" in between "case 
e1000_82576" and the following cases.
igb-5.2.9.2.tar.gz and later do NOT work, and they DO have "break" in between 
"case e1000_82576" and the following cases.
in-kernel igb does work, and does NOT have a "break" in between "case 
e1000_82576" and the following cases.

The in-kernel igb also has the "/* Fall through */" comment in place of the 
break.  That was added by upstream commit
b26141d47a4a73f07853986bd6b5a9f4ee6b4fa1 "igb: Cleanups to fix missing break in 
switch statements".  That is why I added the fall through comment in my patch - 
to match the upstream kernel.

The OS is custom built from scratch rather than being a regular distribution.  
It is running a very minimalistic userspace with upstream kernel 3.18.1.  To 
reproduce the problem:

1) boot the machine
2) modprobe igb InterruptThrottleRate=1,1 QueuePairs=1,1 IntMode=2,2 RSS=8,8
3) ifconfig eth0 192.168.136.21 netmask 255.255.255.0 mtu 1500
4) ping 192.168.136.55
5) immediately get "Tx Unit Hang" message, and no ping response
6) try again with my patch, and everything works fine

I have attached files from the system with my patch when everything is working 
fine.  Let me know if you want files from the system without my patch when it 
is not working.

Motherboard: Supermicro X8DTH-6F with BIOS 2.1b
CPU: Intel Xeon X5650 @ 2.67GHz

Tony


On 01/07/2015 05:02 PM, Fujinaka, Todd wrote:
> First, please don't remove the older parts of the thread. We're using outlook 
> here and the threading is awful and sometimes it's hard for me to follow the 
> thread.
>
> Second, I'm not sure what you're doing. Your first email said:
>
> Known-good versions:
>   sourceforge igb-5.2.5.tar.gz and earlier
>   upstream in-kernel igb
>
> Known-bad versions:
>   sourceforge igb-5.2.9.2.tar.gz and later
>
> igb_cache_ring_register hasn't changed between igb-5.2.5.tar.gz and 
> igb-5.2.9.2.tar.gz. While I'm not doubting we changed some behavior, I have 
> to repeat what I said in my first reply, your fix is wrong.
>
> We'll try reproducing your issue locally so we'll need exact repro steps 
> starting from how you installed the OS to what you did to set up your system.
>
> Thanks.
>
> Todd Fujinaka
> Software Application Engineer
> Networking Division (ND)
> Intel Corporation
> todd.fujin...@intel.com
> (503) 712-4565
>
> -----Original Message-----
> From: Tony Battersby [mailto:to...@cybernetics.com]
> Sent: Wednesday, January 07, 2015 11:06 AM
> To: Fujinaka, Todd; e1000-devel@lists.sourceforge.net
> Subject: Re: [E1000-devel] [PATCH] igb: fix Tx Unit Hang on 82576 with 
> RSS > 1
>
> I had igb_cache_ring_register() print out the relevant values and this is 
> what I get:
>
> rss_queues = 8
> vmdq_pools = 0
> num_rx_queues = 8
> num_tx_queues = 8
>
> With those values, the unpatched igb_cache_ring_register() does nothing 
> because of the break.  With my patch, it falls through to the default case.  
> Maybe the break should be contained within the if ((adapter->rss_queues > 1) 
> && adapter->vmdq_pools) { ... } code segment instead, and the 82576 non-VMDq 
> case should fall through?  But I am testing only the non-VMDq case, so I 
> can't confirm if the VMDq case is broken or not.  What do you think of this 
> patch instead:
>
> --- igb-5.2.15/src/igb_main.c.orig    2014-09-18 12:12:17.000000000 -0400
> +++ igb-5.2.15/src/igb_main.c 2015-01-07 14:02:08.000000000 -0500
> @@ -407,8 +407,9 @@ static void igb_cache_ring_register(stru
>                       for (; i < adapter->rss_queues; i++)
>                               adapter->rx_ring[i]->reg_idx = rbase_offset +
>                                                               Q_IDX_82576(i);
> +                     break;
>               }
> -     break;
> +             /* Fall through */
>       case e1000_82575:
>       case e1000_82580:
>       case e1000_i350:
>
>
> On 01/07/2015 01:10 PM, Fujinaka, Todd wrote:
>> I think this is a setup issue. We don't use multiple TX queues per pool when 
>> VMDq is enabled and that's why there's a special case for the 82576.
>>
>> Your patch would invalidate the whole switch statement and set up all parts 
>> the same way.
>>
>> Todd Fujinaka
>> Software Application Engineer
>> Networking Division (ND)
>> Intel Corporation
>> todd.fujin...@intel.com
>> (503) 712-4565
>>
>>
>


------------------------------------------------------------------------------
Dive into the World of Parallel Programming! The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net
_______________________________________________
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel&#174; Ethernet, visit 
http://communities.intel.com/community/wired

Reply via email to