Re: [E1000-devel] [PATCH v2 net-next 0/4] net: low latency Ethernet device polling

2013-05-20 Thread Brandeburg, Jesse
On Sun, 19 May 2013, Or Gerlitz wrote:
> On Sun, May 19, 2013 at 1:25 PM, Eliezer Tamir wrote:
> > This is an updated version of the code we posted on February.
> 
> Last time you've placed a copy of the patchset in the rfc branch of
> git://github.com/jbrandeb/lls.git  - can you repost there V2 too?

done, sorry for the dup, the first post got html munged by gmail web 
interface.

the latest set (the v3 changes) was posted to
the rfcv2 branch on git://github.com/jbrandeb/lls.git

--
AlienVault Unified Security Management (USM) platform delivers complete
security visibility with the essential security capabilities. Easily and
efficiently configure, manage, and operate all of your security controls
from a single console and one unified framework. Download a free trial.
http://p.sf.net/sfu/alienvault_d2d
___
E1000-devel mailing list
E1000-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit 
http://communities.intel.com/community/wired


Re: [E1000-devel] [PATCH v2 net-next 0/4] net: low latency Ethernet device polling

2013-05-20 Thread Jesse Brandeburg
On Mon, May 20, 2013 at 1:09 PM, Jeff Kirsher wrote:

> On Sun, 2013-05-19 at 22:20 +0300, Eliezer Tamir wrote:
> > On 19/05/2013 22:06, Or Gerlitz wrote:
> > > On Sun, May 19, 2013 at 1:25 PM, Eliezer Tamir wrote:
> > >> This is an updated version of the code we posted on February.
> > >
> > > Last time you've placed a copy of the patchset in the rfc branch of
> > > git://github.com/jbrandeb/lls.git  - can you repost there V2 too?
>

the latest set (the v3 changes) was posted to
the rfcv2 branch on git://github.com/jbrandeb/lls.git


Re: [E1000-devel] [PATCH v2 net-next 0/4] net: low latency Ethernet device polling

2013-05-20 Thread Jeff Kirsher
On Sun, 2013-05-19 at 22:20 +0300, Eliezer Tamir wrote:
> On 19/05/2013 22:06, Or Gerlitz wrote:
> > On Sun, May 19, 2013 at 1:25 PM, Eliezer Tamir wrote:
> >> This is an updated version of the code we posted on February.
> >
> > Last time you've placed a copy of the patchset in the rfc branch of
> > git://github.com/jbrandeb/lls.git  - can you repost there V2 too?
> >
> > Or.
> >
> Yes, but it will have to wait for tomorrow.
> It's on Jesse's github account and for him it's still the middle of the 
> weekend.
> 
> -Eliezer

If Jesse has not done so already, I can put your series of patches on a
branch of my kernel net-next tree.  Let me know...




Re: [E1000-devel] [PATCH v2 net-next 0/4] net: low latency Ethernet device polling

2013-05-20 Thread Eliezer Tamir
On 19/05/2013 22:56, Or Gerlitz wrote:
> On Sun, May 19, 2013 at 10:25 PM, Eliezer Tamir <
> eliezer.ta...@linux.intel.com> wrote:
>
>> On 19/05/2013 22:06, Or Gerlitz wrote:
>>
>>> Last time you've placed a copy of the patchset in the rfc branch of
>>> git://github.com/jbrandeb/lls.git - can you repost there V2 too?
>>>
>>> Or.
>>>
>>>   BTW did you try the last version on your HW?
>>
>
>
> nope, you didn't provide mlx4 patch :( we will look into that ...

I would never risk compromising your job security like that



Re: [E1000-devel] [PATCH v2 net-next 0/4] net: low latency Ethernet device polling

2013-05-19 Thread Or Gerlitz
On Sun, May 19, 2013 at 10:25 PM, Eliezer Tamir <
eliezer.ta...@linux.intel.com> wrote:

> On 19/05/2013 22:06, Or Gerlitz wrote:
>
>> Last time you've placed a copy of the patchset in the rfc branch of
>> git://github.com/jbrandeb/lls.git - can you repost there V2 too?
>>
>> Or.
>>
>>  BTW did you try the last version on your HW?
>


nope, you didn't provide mlx4 patch :( we will look into that ...


Re: [E1000-devel] [PATCH v2 net-next 0/4] net: low latency Ethernet device polling

2013-05-19 Thread Eliezer Tamir
On 19/05/2013 22:06, Or Gerlitz wrote:
> Last time you've placed a copy of the patchset in the rfc branch of
> git://github.com/jbrandeb/lls.git  - can you repost there V2 too?
>
> Or.
>
BTW did you try the last version on your HW?



Re: [E1000-devel] [PATCH v2 net-next 0/4] net: low latency Ethernet device polling

2013-05-19 Thread Eliezer Tamir
On 19/05/2013 22:06, Or Gerlitz wrote:
> On Sun, May 19, 2013 at 1:25 PM, Eliezer Tamir wrote:
>> This is an updated version of the code we posted on February.
>
> Last time you've placed a copy of the patchset in the rfc branch of
> git://github.com/jbrandeb/lls.git  - can you repost there V2 too?
>
> Or.
>
Yes, but it will have to wait for tomorrow.
It's on Jesse's github account and for him it's still the middle of the 
weekend.

-Eliezer



Re: [E1000-devel] [PATCH v2 net-next 0/4] net: low latency Ethernet device polling

2013-05-19 Thread Or Gerlitz
On Sun, May 19, 2013 at 1:25 PM, Eliezer Tamir wrote:
> This is an updated version of the code we posted on February.

Last time you've placed a copy of the patchset in the rfc branch of
git://github.com/jbrandeb/lls.git  - can you repost there V2 too?

Or.



[E1000-devel] [PATCH v2 net-next 0/4] net: low latency Ethernet device polling

2013-05-19 Thread Eliezer Tamir
Dave, 

Please consider applying to net-next.

Thanks,
Eliezer

This is an updated version of the code we posted in February.

Patch 1 adds ndo_ll_poll and the IP code to use it.
Patch 2 is an example of how TCP can use ndo_ll_poll.
Patch 3 shows how this method would be implemented for the ixgbe driver.
Patch 4 adds statistics to the ixgbe driver for ndo_ll_poll events.
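
To make the idea behind patches 1 and 2 concrete, here is a minimal
userspace sketch of a receive path that busy-polls a device hook for a
bounded number of microseconds before giving up. It is not code from the
patches. Every sketch_* name, the callback signature, the injected clock
and the fake device are assumptions made for illustration; treating a
non-blocking caller as "poll once, then return" is also an assumption
about what honoring non-blocking operations could look like.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a driver's low-latency poll hook.
 * Returns the number of packets it moved to the socket, or 0. */
typedef int (*sketch_ll_poll_fn)(void *dev);

/* Hypothetical microsecond clock, injected so the loop is easy to test. */
typedef uint64_t (*sketch_now_us_fn)(void *clock);

struct sketch_sock {
    sketch_ll_poll_fn ll_poll;  /* NULL if the device has no poll hook */
    void *dev;
    int rx_queued;              /* packets waiting for the reader */
};

/* Busy-poll the device for up to budget_us microseconds.
 * budget_us == 0 means the feature is off.
 * Returns true if data is available when polling stops. */
static bool sketch_busy_poll(struct sketch_sock *sk, uint64_t budget_us,
                             bool nonblock, sketch_now_us_fn now_us,
                             void *clock)
{
    uint64_t start;

    assert(sk != NULL && now_us != NULL);

    if (sk->rx_queued > 0)
        return true;
    if (budget_us == 0 || sk->ll_poll == NULL)
        return false;

    start = now_us(clock);
    do {
        int n = sk->ll_poll(sk->dev);

        if (n > 0)
            sk->rx_queued += n;
        if (sk->rx_queued > 0)
            return true;
        if (nonblock)
            break;  /* assumption: a non-blocking caller polls once */
    } while (now_us(clock) - start < budget_us);

    return false;
}

/* Fake device for trying the loop out. */
struct sketch_fake_dev {
    int polls;          /* how many times the hook was called */
    int deliver_after;  /* deliver one packet from this poll on; 0 = never */
};

static int sketch_fake_poll(void *dev)
{
    struct sketch_fake_dev *d = dev;

    d->polls++;
    return (d->deliver_after && d->polls >= d->deliver_after) ? 1 : 0;
}

/* Fake clock that advances 10 us on every reading. */
static uint64_t sketch_fake_now_us(void *clock)
{
    uint64_t *t = clock;

    *t += 10;
    return *t;
}
```

With the fake device set to deliver on its third poll and a budget of 50,
the loop returns true after three polls. With a budget of 0 the hook is
never called, which mirrors a knob that defaults to off.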

Changes from previous version:
1. The sysctl knob is now in microseconds; we don't adjust it for CPU
clock changes. The default value is now 0 (off).
The recommended value is around 50.

2. For now the code depends at configure time on CONFIG_X86_TSC to
satisfy both the need for a high-precision get_cycles() and a 64-bit
cycles_t. I looked into using sched_clock(), but it does not appear to
have the required precision on all architectures. Because this is a
config dependency, it would be easy to add other architectures once
some testing has been done on them.

3. The napi reference in struct skb is now a union with the dma cookie
since the former is only used on RX and the latter on TX, as suggested
by Eric Dumazet.

4. We do a better job at honoring non-blocking operations.

5. Removed busy-polling support for tcp_read_sock().
Doing a release_sock() followed by a lock_sock() to get the backlog 
processed is unsafe there.
If there is interest in tcp_read_sock() support we would need another
way to get backlog processing done.
BTW I was not able to find a microbenchmark that uses tcp_read_sock(),
any suggestions?

6. To avoid the overhead of reference counting napi structs by skbs
and sockets in the fastpath, and increasing the size of the skb struct,
we no longer allow unloading the module once this feature has been used.

It seems that for most of the people interested in busy-polling, giving
up the ability to blindly remove the module for a slight but measurable
performance gain is a good tradeoff.
(There is a module parameter to override this behavior and if you know
what you are doing and are careful to stop the processes you can safely
unload, but we don't enforce this.)

7. We no longer try to dynamically turn GRO off when someone is busy-
polling, since this sometimes caused reordering with packets left on
the napi->gro_list by napi. For most workloads you should probably start
by globally disabling GRO with ethtool. In some cases the performance
gain of GRO greatly outweighs the cost of reordering.
Your mileage may vary.

8. Many small changes suggested by people here. I would like to thank
all of the people that took the time to review our code.
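
As an illustration of change 3, the sketch below shows why sharing
storage between an RX-only field and a TX-only field avoids growing the
buffer struct. The types are hypothetical stand-ins, not the kernel's
struct sk_buff, struct napi_struct or dma_cookie_t, and the field names
are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real kernel types are defined differently. */
struct sketch_napi { int id; };
typedef int32_t sketch_dma_cookie_t;

/* Layout with both fields present at once. */
struct sketch_skb_separate {
    unsigned int len;
    struct sketch_napi *rx_napi;    /* only meaningful on receive */
    sketch_dma_cookie_t tx_cookie;  /* only meaningful on transmit */
};

/* Layout where the two fields share storage. */
struct sketch_skb_shared {
    unsigned int len;
    union {
        struct sketch_napi *rx_napi;
        sketch_dma_cookie_t tx_cookie;
    } u;
};

/* Record which napi context a received buffer came from. */
static void sketch_mark_rx(struct sketch_skb_shared *skb,
                           struct sketch_napi *napi)
{
    assert(skb != NULL);
    skb->u.rx_napi = napi;  /* overwrites any tx_cookie stored earlier */
}

/* Bytes saved per buffer by sharing the storage (platform dependent). */
static size_t sketch_bytes_saved(void)
{
    return sizeof(struct sketch_skb_separate) -
           sizeof(struct sketch_skb_shared);
}
```

The saving depends on pointer size and padding, so the exact number
differs between platforms. The trade-off is that only one of the two
fields is valid at any time, which is acceptable only because a given
buffer is on either the receive path or the transmit path.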

The performance is about the same as the last time.
I promised Rick Jones CPU utilization numbers so here are some examples
with these numbers added.

Performance numbers:
setup                            TCP_RR           UDP_RR
kernel  Config     C3/6 rx-usecs tps  cpu% S.dem  tps  cpu% S.dem
patched optimized* on   100      87k  3.13 11.4   94k  3.17 10.7
patched optimized* on   0        71k  3.12 14.0   84k  3.19 12.0
patched optimized* on   adaptive 80k  3.13 12.5   90k  3.46 12.2
patched typical    on   100      72k  3.13 14.0   79k  3.17 12.8
patched typical    on   0        60k  2.13 16.5   71k  3.18 14.0
patched typical    on   adaptive 67k  3.51 16.7   75k  3.36 14.5
3.9     optimized* on   adaptive 25k  1.0  12.7   28k  0.98 11.2
3.9     typical    off  0        48k  1.09 7.3    52k  1.11 4.18
3.9     typical    off  adaptive 35k  1.12 4.08   38k  0.65 5.49
3.9     optimized* off  adaptive 40k  0.82 4.83   43k  0.70 5.23
3.9     optimized* off  0        57k  1.17 4.08   62k  1.04 3.95
*not the same config as the one used in v1.
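
For readers who want to turn the tps columns into latency and relative
gain, the two helpers below show the arithmetic. They are a convenience
sketch, not part of the patchset. The function names are invented, and
the latency conversion assumes a request/response test with a single
transaction outstanding at a time, so that rate and per-transaction time
are reciprocal.

```c
#include <assert.h>

/* Mean time per transaction in microseconds, assuming one
 * request/response outstanding at a time. */
static double sketch_usec_per_transaction(double tps)
{
    assert(tps > 0.0);
    return 1e6 / tps;
}

/* Relative throughput change of `patched` over `baseline`, in percent. */
static double sketch_gain_percent(double patched, double baseline)
{
    assert(baseline > 0.0);
    return (patched - baseline) / baseline * 100.0;
}
```

Taking the best TCP_RR row for each kernel (87k for patched with
rx-usecs 100, 57k for 3.9 with C3/6 off and rx-usecs 0), this works out
to roughly 11.5 us versus 17.5 us per transaction, a gain of roughly 53%.
Those two rows differ in more than the patch, so this is a best-to-best
comparison and not a controlled one.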

Test setup details:
Machines: each with two Intel Xeon 2680 CPUs and X520 (82599) optical NICs
Tests: Netperf tcp_rr and udp_rr, 1 byte (round trips per second)
Kernel: unmodified 3.9 and patched 3.9
Config: typical is derived from RH6.2, optimized is a stripped down config.
Interrupt coalescing (ethtool rx-usecs) settings: 0=off, 1=adaptive, 100 us
When C3/6 states were turned on (via BIOS) the performance governor was used.

This is not the same exact optimized config that I used last time.
When trying it on kernel 3.9 my machines would not boot.
So I re-did it and I removed a slightly different set of options.
As a result it is a bit faster on the patched kernel.
This is also probably the explanation for a slight regression in 
the performance of the unpatched 3.9 kernel with the optimized config
compared to the 3.8 results.

how to test:
(changes from v1 are highlighted with ***)

1. The patchset should apply cleanly to net-next.
If someone wants a set for 3.9 I can give it to them.
(don't forget to configure INET_LL_RX_POLL and INET_LL_TCP_POLL).

2. The ethtool -c setting for rx-usecs should be on the order of 100.

3. *** Use ethtool -K to disable GRO and LRO
(You are encouraged to try it both ways. If you find that your workload
does better with GRO on do tell us.)

4. *** The sysctl value net.ipv4.ip_low_latency_poll controls how long
(in us) to busy-wait for more data. You are encouraged to pla