I am OK with Ilya's proposal. 

The only correction I need to make is that, for us, 50 us showed good
improvements in a "PVP" scenario with iperf3 as the kernel application in the
guest (not VM-VM).
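
As a side note, the three load regimes in Ilya's proposed text below can be
sketched numerically. This is only an illustration of the stated formulas; the
50 us interval and the 32-packet maximum batch size are the values discussed
in this thread, not universal constants:

```python
# Back-of-envelope model of the average latency increase caused by
# tx-flush-interval, following the three regimes described in the text.
MAX_BATCH = 32  # maximum number of packets per send batch

def avg_latency_increase_us(packet_rate_pps, tx_flush_interval_us):
    """Estimated average added latency (us) for a given packet rate (pps)."""
    interval_s = tx_flush_interval_us / 1e6
    if packet_rate_pps < 1 / interval_s:
        # Low traffic: every packet is flushed immediately.
        return 0.0
    if packet_rate_pps < MAX_BATCH / interval_s:
        # Intermediate load: packets wait on average up to half the interval.
        return tx_flush_interval_us / 2
    # Very high traffic: batches fill up before the timer expires.
    return MAX_BATCH * 1e6 / (2 * packet_rate_pps)

# With tx-flush-interval=50 the regime boundaries are 20 kpps and 640 kpps.
print(avg_latency_increase_us(10_000, 50))      # low traffic -> 0.0
print(avg_latency_increase_us(100_000, 50))     # intermediate -> 25.0
print(avg_latency_increase_us(10_000_000, 50))  # very high -> 1.6
```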

BR, Jan

> -----Original Message-----
> From: Stokes, Ian [mailto:[email protected]]
> Sent: Tuesday, 19 December, 2017 12:40
> To: Ilya Maximets <[email protected]>; Jan Scheurich 
> <[email protected]>; [email protected]; Bodireddy,
> Bhanuprakash <[email protected]>
> Cc: Heetae Ahn <[email protected]>; Fischetti, Antonio 
> <[email protected]>; Eelco Chaudron
> <[email protected]>; Loftus, Ciara <[email protected]>; Kevin Traynor 
> <[email protected]>
> Subject: RE: [RFC v2] docs: Describe output packet batching in DPDK guide.
> 
> > On 19.12.2017 13:39, Stokes, Ian wrote:
> > >> Hi Ilya,
> > >>
> > >> Some more suggestions below to expand a bit on the use cases for
> > >> tx-flush- interval.
> > >>
> > >> BR, Jan
> > >
> > > Hi Ilya,
> > >
> > > I agree with Jan's input here. I've finished validating the output
> > > batching patchset today but would like to include this documentation also.
> > >
> > > Are you planning to re-spin a new version of this patch with the
> > > required changes?
> >
> > I have concerns about suggesting an exact value of 50 microseconds, or even
> > saying that increasing 'tx-flush-interval' will improve performance,
> > without stating the exact testing scenario and environment, including
> > hardware.
> 
> Sure, this could vary from case to case.
> >
> > Testing shows that the optimal value of 'tx-flush-interval' depends heavily
> > on the scenario and possible traffic patterns. For example, tx-flush-interval=50
> > significantly degrades performance in a PVP scenario with bonded HW NICs on
> > x86.
> > I haven't finished the full testing, but it also degrades performance in a
> > VM-VM scenario with Linux kernel guests (interrupt based) on my ARMv8 system.
> >
> > So, I prefer to avoid saying that this value will increase performance, at
> > least without a full description of the testing scenario.
> 
> Sure, I think over time as more testing of different scenarios is completed 
> we can add to this.
> 
> It probably makes sense to give just a general comment for people getting 
> started with the caveat that users need to experiment to tune
> for their own needs in specific deployments.
> 
> >
> > I'll try to modify Jan's comments according to the above concerns.
> >
> > What about something like this:
> > ----------------------
> > To take advantage of batched transmit functions, OVS collects packets in
> > intermediate queues before sending when processing a batch of received
> > packets.
> > Even if packets are matched by different flows, OVS uses a single send
> > operation for all packets destined to the same output port.
> >
> > Furthermore, OVS is able to buffer packets in these intermediate queues
> > for a configurable amount of time to reduce the frequency of send bursts
> > at medium load levels when the packet receive rate is high, but the
> > receive batch size is still very small. This is particularly beneficial for
> > packets transmitted to VMs using an interrupt-driven virtio driver, where
> > the interrupt overhead is significant for the OVS PMD, the host operating
> > system and the guest driver.
> >
> > The ``tx-flush-interval`` parameter can be used to specify the time in
> > microseconds OVS should wait between two send bursts to a given port
> > (default is ``0``). When the intermediate queue fills up before that time
> > is over, the buffered packet batch is sent immediately::
> >
> >     $ ovs-vsctl set Open_vSwitch . other_config:tx-flush-interval=50
> >
> > This parameter influences both throughput and latency, depending on the
> > traffic load on the port. In general lower values decrease latency while
> > higher values may be useful to achieve higher throughput.
> >
> > Low traffic (``packet rate < 1 / tx-flush-interval``) should not
> > experience any significant latency or throughput increase as packets are
> > forwarded immediately.
> >
> > At intermediate load levels
> > (``1 / tx-flush-interval < packet rate < 32 / tx-flush-interval``) traffic
> > should experience an average latency increase of up to
> > ``1 / 2 * tx-flush-interval`` and a possible throughput improvement.
> >
> > Very high traffic (``packet rate >> 32 / tx-flush-interval``) should
> > experience the average latency increase equal to ``32 / (2 * packet
> > rate)``. Most send batches in this case will contain the maximum number of
> > packets (``32``).
> >
> > A ``tx-flush-interval`` value of ``50`` microseconds has been shown to
> > provide a good performance increase in a ``VM-VM`` scenario on an x86 system
> > for interrupt-driven guests while keeping the latency increase at a
> > reasonable level.
> >
> > .. note::
> >   The throughput impact of this option depends significantly on the scenario
> >   and the traffic patterns. For example: a ``tx-flush-interval`` value of
> >   ``50`` microseconds shows performance degradation in a PVP with bonded PHY
> >   scenario while testing with ``256 - 1024`` packet flows:
> >
> >     https://mail.openvswitch.org/pipermail/ovs-dev/2017-December/341700.html
> >
> 
> Overall I think the above looks good. We'll never be able to detail every test
> scenario, but this gets the general latency/throughput trade-off across to a
> user in relation to the timing parameter.
> 
> I'd be happy to take something like the above for the initial merge, and it
> could be expanded upon as people test going forward.
> 
> Ian
> 
> > ----------------------
> >
> >
> > >
> > > Thanks
> > > Ian
> > >>
> > >>> -----Original Message-----
> > >>> From: Ilya Maximets [mailto:[email protected]]
> > >>> Sent: Tuesday, 12 December, 2017 14:07
> > >>> To: [email protected]; Bhanuprakash Bodireddy
> > >>> <[email protected]>
> > >>> Cc: Heetae Ahn <[email protected]>; Antonio Fischetti
> > >>> <[email protected]>; Eelco Chaudron <[email protected]>;
> > >>> Ciara Loftus <[email protected]>; Kevin Traynor
> > >>> <[email protected]>; Jan Scheurich <[email protected]>;
> > >>> Ian Stokes <[email protected]>; Ilya Maximets
> > >>> <[email protected]>
> > >>> Subject: [RFC v2] docs: Describe output packet batching in DPDK guide.
> > >>>
> > >>> Added information about output packet batching and a way to
> > >>> configure 'tx-flush-interval'.
> > >>>
> > >>> Signed-off-by: Ilya Maximets <[email protected]>
> > >>> ---
> > >>>
> > >>> Version 2:
> > >>>         * Some grammar/wording corrections. (Eelco Chaudron)
> > >>>
> > >>>  Documentation/intro/install/dpdk.rst | 24 ++++++++++++++++++++++++
> > >>>  1 file changed, 24 insertions(+)
> > >>>
> > >>> diff --git a/Documentation/intro/install/dpdk.rst
> > >>> b/Documentation/intro/install/dpdk.rst
> > >>> index 3fecb5c..5485dbc 100644
> > >>> --- a/Documentation/intro/install/dpdk.rst
> > >>> +++ b/Documentation/intro/install/dpdk.rst
> > >>> @@ -568,6 +568,30 @@ not needed i.e. jumbo frames are not needed, it
> > >>> can be forced off by adding  chains of descriptors it will make more
> > >>> individual virtio descriptors available  for rx to the guest using
> > >>> dpdkvhost ports and this can improve performance.
> > >>>
> > >>> +Output Packet Batching
> > >>> +~~~~~~~~~~~~~~~~~~~~~~
> > >>> +
> > >>> +To get advantages of the batched send functions OVS collects
> > >>> +packets in intermediate queues before sending. This allows using a
> > >>> +single send for packets matched by different flows but having the
> > >>> +same output action.
> > >>> +Furthermore, OVS is able to collect packets for some reasonable
> > >>> +amount of time before batch sending them which might help when
> > >>> +input batches are small.
> > >>
> > >> To take advantage of batched transmit functions, OVS collects packets
> > >> in intermediate queues before sending when processing a batch of
> > >> received packets. Even if packets are matched by different flows, OVS
> > >> uses a single send operation for all packets destined to the same
> > output port.
> > >>
> > >> Furthermore, OVS is able to buffer packets in these intermediate
> > >> queues for a configurable amount of time to reduce the frequency of
> > >> send bursts at medium load levels when the packet receive rate is
> > >> high, but the receive batch size is still very small. This is
> > >> particularly beneficial for packets transmitted to VMs using an
> > >> interrupt-driven virtio driver, where the interrupt overhead is
> > >> significant for the OVS PMD, the host operating system and the guest
> > driver.
> > >>
> > >>> +
> > >>> +``tx-flush-interval`` config could be used to specify the time in
> > >>> +microseconds that a packet can wait in an output queue for sending
> > >>> +(default is ``0``)::
> > >>
> > >> The ``tx-flush-interval`` parameter can be used to specify the time
> > >> in microseconds OVS should wait between two send bursts to a given
> > >> port (default is ``0``). When the intermediate queue fills up before
> > >> that time is over, the buffered packet batch is sent immediately::
> > >>
> > >>> +
> > >>> +    $ ovs-vsctl set Open_vSwitch .
> > >>> + other_config:tx-flush-interval=50
> > >>> +
> > >>> +Lower values decrease latency while higher values may be useful to
> > >>> +achieve higher performance. For example, increasing of
> > >>> +``tx-flush-interval`` can be used to decrease the number of
> > >>> +interrupts for interrupt based guest drivers.
> > >>> +This may significantly affect the performance. Zero value means
> > >>> +immediate send at the end of processing a single input batch.
> > >>
> > >> This parameter influences both throughput and latency, depending on
> > >> the traffic load on the port. In general lower values decrease
> > >> latency while higher values may be useful to achieve higher throughput.
> > >>
> > >> Low traffic (packet rate < 1/tx-flush-interval) should not experience
> > >> any significant latency or throughput increase as packets are
> > >> forwarded immediately.
> > >>
> > >> At intermediate load levels (1/tx-flush-interval < packet rate <
> > >> 32/tx-
> > >> flush-interval) traffic should experience an average latency increase
> > >> of up to 1/2 * tx-flush-interval and a throughput improvement that
> > >> depends on the average size of send bursts and grows with the traffic
> > rate.
> > >>
> > >> Very high traffic (packet rate >> 32/tx-flush-interval) should
> > >> experience improved throughput as most send batches contain the
> > >> maximum number of packets (32). The average latency increase should
> > >> equal 32/(2 * packet rate).
> > >>
> > >> A tx-flush-interval value of 50 microseconds has been shown to provide
> > >> a good performance increase for interrupt-driven guests while keeping
> > >> the latency increase at a reasonable level.
> > >>
> > >>> +
> > >>> +Average number of packets per output batch could be checked in PMD
> > >>> +stats::
> > >>
> > >> The average number of packets per output batch can be checked in PMD
> > >> stats::
> > >>
> > >>> +
> > >>> +    $ ovs-appctl dpif-netdev/pmd-stats-show
> > >>> +
> > >>>  Limitations
> > >>>  ------------
> > >>>
> > >>> --
> > >>> 2.7.4
> > >
> > >
> > >
> > >