>I have retested your "Output patches batching" v6 in our standard PVP L3-
>VPN/VXLAN benchmark setup [1]. The configuration is a single PMD serving a
>physical 10G port and a VM running DPDK testpmd as IP reflector with 4
>equally loaded vhostuser ports. The tests are run with 64-byte packets. Below
>are Mpps values averaged over four 10-second runs:
>
>                patch                patch
>        master  tx-flush-interval=0  tx-flush-interval=50
>Flows   Mpps    Mpps    delta        Mpps     delta
>8       4.419   4.342   -1.7%        4.749    7.5%
>100     4.026   3.956   -1.7%        4.281    6.3%
>1000    3.630   3.632    0.1%        3.760    3.6%
>2000    3.394   3.390   -0.1%        3.490    2.8%
>5000    2.989   2.938   -1.7%        2.994    0.2%
>10000   2.756   2.711   -1.6%        2.746   -0.4%
>20000   2.641   2.598   -1.6%        2.622   -0.7%
>50000   2.604   2.558   -1.8%        2.579   -1.0%
>100000  2.598   2.552   -1.8%        2.572   -1.0%
>500000  2.598   2.550   -1.8%        2.571   -1.0%
>
>As expected, output batching within rx bursts (tx-flush-interval=0) provides
>little or no benefit in this scenario. The test results show roughly a 1.7%
>performance penalty due to the tx batching overhead. This overhead is
>measurable, but in my eyes it should not be a blocker for merging this patch
>series.

I made a similar observation when testing for regressions in the non-batching
scenario.
https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/339719.html

As tx-flush-interval defaults to 0 (instant send) and this causes a performance
degradation, I recommend documenting this in one of the commits and adding a
link to these performance numbers (e.g. with a Tested-at tag), so that users
can tune tx-flush-interval accordingly.
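
For reference, the tuning itself would then be a one-liner (assuming the
other_config:tx-flush-interval knob this series adds to vswitch.xml):

```shell
# Hold packets in the output batch for up to 50 us before flushing.
# The default of 0 flushes after every rx burst (instant send).
ovs-vsctl set Open_vSwitch . other_config:tx-flush-interval=50
```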

>
>Interestingly, tests with time-based tx batching and a minimum flush interval
>of 50 microseconds show a consistent and significant performance increase
>for small numbers of flows (in the regime where the EMC is effective) and a
>reduced penalty of 1% for many flows. I don't have a good explanation yet for
>this phenomenon. I would be interested to see whether other benchmark results
>support the general positive impact of time-based tx batching on throughput,
>also for synthetic DPDK applications in the VM. The average ping RTT increases
>by 20-30 us, as expected.

I think this depends on tx-flush-interval and should also be documented.
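
To illustrate how tx-flush-interval drives the two behaviors discussed above
(this is purely a sketch with hypothetical names, not the actual dpif-netdev
code):

```c
#include <stdbool.h>

/* Hypothetical helper, not the actual OVS implementation: decide whether a
 * port's queued output batch should be flushed now.  'interval_us' models
 * other_config:tx-flush-interval: 0 means instant send (flush after every
 * rx burst), otherwise packets are held until 'flush_time_us', recorded
 * when the first packet was queued, has been reached. */
bool
need_flush(long long now_us, long long flush_time_us,
           int batch_cnt, long long interval_us)
{
    if (batch_cnt == 0) {
        return false;                /* Nothing queued on this port. */
    }
    if (interval_us == 0) {
        return true;                 /* Default: instant send. */
    }
    return now_us >= flush_time_us;  /* Time-based batching. */
}
```

With interval_us == 0 every rx burst still pays the batching bookkeeping,
which would be consistent with the ~1.7% penalty above, while a 50 us
interval lets several bursts share one send at the cost of added latency.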

- Bhanuprakash.

>
>We will also retest the performance improvement of time-based tx batching
>on interrupt driven Linux kernel applications (such as iperf3).
>
>BR, Jan
>
>> -----Original Message-----
>> From: Ilya Maximets [mailto:i.maxim...@samsung.com]
>> Sent: Friday, 01 December, 2017 16:44
>> To: ovs-dev@openvswitch.org; Bhanuprakash Bodireddy <bhanuprakash.bodire...@intel.com>
>> Cc: Heetae Ahn <heetae82....@samsung.com>; Antonio Fischetti <antonio.fische...@intel.com>; Eelco Chaudron <echau...@redhat.com>; Ciara Loftus <ciara.lof...@intel.com>; Kevin Traynor <ktray...@redhat.com>; Jan Scheurich <jan.scheur...@ericsson.com>; Ian Stokes <ian.sto...@intel.com>; Ilya Maximets <i.maxim...@samsung.com>
>> Subject: [PATCH v6 0/7] Output packet batching.
>>
>> This patch set was inspired by [1] from Bhanuprakash Bodireddy.
>> The implementation in [1] looks very complex and introduces many pitfalls [2]
>> for later code modifications, such as packets getting stuck.
>>
>> This version aims to provide simple and flexible output packet batching at a
>> higher level, without adding complexity to the netdev layer (and even
>> simplifying it).
>>
>> Basic testing of the 'PVP with OVS bonding on phy ports' scenario shows a
>> significant performance improvement.
>>
>> Test results for time-based batching for v3:
>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-September/338247.html
>>
>> Test results for v4:
>> https://mail.openvswitch.org/pipermail/ovs-dev/2017-October/339624.html
>>
>> [1] [PATCH v4 0/5] netdev-dpdk: Use intermediate queue during packet transmission.
>>     https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337019.html
>>
>> [2] For example:
>>     https://mail.openvswitch.org/pipermail/ovs-dev/2017-August/337133.html
>>
>> Version 6:
>>      * Rebased on current master:
>>        - Added new patch to refactor dp_netdev_pmd_thread structure
>>          according to following suggestion:
>>          https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341230.html
>>
>>        NOTE: I still prefer reverting the padding-related patch.
>>              The rebase was done to avoid blocking acceptance of this series.
>>              Revert patch and discussion here:
>>              https://mail.openvswitch.org/pipermail/ovs-dev/2017-November/341153.html
>>
>>      * Added comment about pmd_thread_ctx_time_update() usage.
>>
>> Version 5:
>>      * pmd_thread_ctx_time_update() calls moved to different places to
>>        call them only from dp_netdev_process_rxq_port() and main
>>        polling functions:
>>              pmd_thread_main, dpif_netdev_run and dpif_netdev_execute.
>>        All other functions should use cached time from pmd->ctx.now.
>>        It's guaranteed to be updated at least once per polling cycle.
>>      * 'may_steal' patch returned to version from v3 because
>>        'may_steal' in qos is a completely different variable. This
>>        patch only removes 'may_steal' from netdev API.
>>      * 2 more usec functions added to timeval to have complete public API.
>>      * Checking of 'output_cnt' turned to assertion.
>>
>> Version 4:
>>      * Rebased on current master.
>>      * Rebased on top of "Keep latest measured time for PMD thread."
>>        (Jan Scheurich)
>>      * Microsecond resolution related patches integrated.
>>      * Time-based batching without RFC tag.
>>      * 'output_time' renamed to 'flush_time'. (Jan Scheurich)
>>      * 'flush_time' update moved to 'dp_netdev_pmd_flush_output_on_port'.
>>        (Jan Scheurich)
>>      * 'output-max-latency' renamed to 'tx-flush-interval'.
>>      * Added patch for output batching statistics.
>>
>> Version 3:
>>
>>      * Rebased on current master.
>>      * Time based RFC: fixed assert on n_output_batches <= 0.
>>
>> Version 2:
>>
>>      * Rebased on current master.
>>      * Added time based batching RFC patch.
>>      * Fixed mixing packets with different sources in same batch.
>>
>>
>> Ilya Maximets (7):
>>   dpif-netdev: Refactor PMD thread structure for further extension.
>>   dpif-netdev: Keep latest measured time for PMD thread.
>>   dpif-netdev: Output packet batching.
>>   netdev: Remove unused may_steal.
>>   netdev: Remove useless cutlen.
>>   dpif-netdev: Time based output batching.
>>   dpif-netdev: Count sent packets and batches.
>>
>>  lib/dpif-netdev.c     | 412 +++++++++++++++++++++++++++++++++++++-----------
>>  lib/netdev-bsd.c      |   6 +-
>>  lib/netdev-dpdk.c     |  30 ++--
>>  lib/netdev-dummy.c    |   6 +-
>>  lib/netdev-linux.c    |   8 +-
>>  lib/netdev-provider.h |   7 +-
>>  lib/netdev.c          |  12 +-
>>  lib/netdev.h          |   2 +-
>>  vswitchd/vswitch.xml  |  16 ++
>>  9 files changed, 349 insertions(+), 150 deletions(-)
>>
>> --
>> 2.7.4

_______________________________________________
dev mailing list
d...@openvswitch.org
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
