Hi Alexei,

On 07/09/2016 8:34 PM, Alexei Starovoitov wrote:
> On Wed, Sep 07, 2016 at 03:42:25PM +0300, Saeed Mahameed wrote:
>> For non-striding RQ configuration before this patch we had a ring
>> with pre-allocated SKBs and mapped the SKB->data buffers for
>> device.
>>
>> For robustness and better RX data buffers management, we allocate a
>> page per packet and build_skb around it.
>>
>> This patch (which is a prerequisite for XDP) will actually reduce
>> performance for normal stack usage, because we are now hitting a bottleneck
>> in the page allocator. A later patch of page reuse mechanism will be
>> needed to restore or even improve performance in comparison to the old
>> RX scheme.
>>
>> Packet rate performance testing was done with pktgen 64B packets on xmit
>> side and TC drop action on RX side.
>>
>> CPU: Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
>>
>> Comparison is done between:
>>   1.Baseline, before 'net/mlx5e: Build RX SKB on demand'
>>   2.Build SKB with RX page cache (This patch)
>>
>> Streams    Baseline    Build SKB+page-cache    Improvement
>> -----------------------------------------------------------
>> 1          4.33Mpps      5.51Mpps                27%
>> 2          7.35Mpps      11.5Mpps                52%
>> 4          14.0Mpps      16.3Mpps                16%
>> 8          22.2Mpps      29.6Mpps                20%
>> 16         24.8Mpps      34.0Mpps                17%
>
> Impressive gains for build_skb. I think it should help ip forwarding too
> and likely tcp_rr. tcp_stream shouldn't see any difference.
> If you can benchmark that along with pktgen+tc_drop it would
> help to better understand the impact of the changes.

Why do you expect an improvement in tcp_rr?
I don't see one in my tests.
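
For readers following the thread, the page reuse idea mentioned in the patch
description can be pictured roughly as below. This is only a user-space sketch
under stated assumptions, not the mlx5e code: the names rx_page_cache,
rx_page_get and rx_page_put, the cache size, and the use of malloc() as a
stand-in for the page allocator are all hypothetical. It only illustrates why
a small per-ring cache can keep the RX path out of the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define RX_PAGE_SIZE  4096u  /* assumed page size, for illustration only */
#define CACHE_ENTRIES 8u     /* assumed cache capacity, for illustration only */

/* Hypothetical per-ring cache: a small LIFO stack of free pages. */
struct rx_page_cache {
	void  *pages[CACHE_ENTRIES];
	size_t count;   /* number of pages currently cached */
	size_t hits;    /* pages served from the cache */
	size_t misses;  /* pages that had to come from the allocator */
};

/* Get a page for the next RX descriptor, reusing a cached one if possible. */
static void *rx_page_get(struct rx_page_cache *c)
{
	if (c->count > 0) {
		c->hits++;
		return c->pages[--c->count];
	}
	c->misses++;
	return malloc(RX_PAGE_SIZE);  /* stand-in for the kernel page allocator */
}

/* Release a page once its packet is consumed: cache it, or free it if full. */
static void rx_page_put(struct rx_page_cache *c, void *page)
{
	assert(page != NULL);
	if (c->count < CACHE_ENTRIES) {
		c->pages[c->count++] = page;
		return;
	}
	free(page);
}
```

In a workload such as pktgen + TC drop, where pages come back almost
immediately, nearly every rx_page_get() in this model would be a hit. A real
driver would additionally have to make sure a page is no longer referenced by
the stack before handing it out again, which this sketch ignores.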

