On 1/11/23 11:24, David Marchand wrote:
> On Wed, Jan 11, 2023 at 10:35 AM Kevin Traynor <[email protected]> wrote:
>>
>> Sleep for an incremental amount of time if none of the Rx queues
>> assigned to a PMD have at least half a batch of packets (i.e. 16 pkts)
>> on a polling iteration of the PMD.
>>
>> Upon detecting the threshold of >= 16 pkts on an Rxq, reset the
>> sleep time to zero (i.e. no sleep).
>>
>> Sleep time will be increased on each iteration where the low load
>> conditions remain up to a total of the max sleep time which is set
>> by the user e.g:
>> ovs-vsctl set Open_vSwitch . other_config:pmd-maxsleep=500
>>
>> The default pmd-maxsleep value is 0, which means that no sleeps
>> will occur and the default behaviour is unchanged from previously.
>>
>> Also add new stats to pmd-perf-show to get visibility of operation
>> e.g.
>> ...
>>    - sleep iterations:       153994  ( 76.8 % of iterations)
>>    Sleep time (us):         9159399  ( 46 us/iteration avg.)
>> ...
>>
>> Reviewed-by: Robin Jarry <[email protected]>
>> Reviewed-by: David Marchand <[email protected]>
>> Signed-off-by: Kevin Traynor <[email protected]>
> 
> Checked v4 -> v5 diff.
> Reviewed-by: David Marchand <[email protected]>

Thanks, Kevin, David and Robin!

As Kevin and I discussed off-list, I changed the 'us/iteration avg.'
statistic to report the average time per sleep iteration rather than
the average over all iterations.  The average sleep time among sleep
iterations is more intuitive and much more useful, as it conveys how
long the sleeps actually take.  The average over all iterations
doesn't seem to be useful in any way, from my perspective.  With that,
I applied the set.
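
For the example output quoted above, that means roughly:

  9159399 us / ~200500 total iterations  = ~46 us   (old behavior)
  9159399 us /  153994 sleep iterations  = ~59 us   (new behavior)

where ~200500 is 153994 / 0.768, i.e. the total iteration count
implied by the 76.8 % figure.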


There is, however, some unwanted behavior that I noticed during
testing and that will need some follow-up fixes.

I'm using my usual setup: OVS with 2 PMD threads and 2 testpmd
applications with virtio-user ports, one in txonly and the other in
mac forwarding mode.  So, the traffic is almost bi-directional:

  txonly --> vhost0 --> PMD#0 --> vhost1 --> mac --> vhost1 --> PMD#1 --> drop

Since the first testpmd is in txonly mode, it doesn't receive any
packets and they end up dropped on a send attempt by PMD#1.

The load on PMD#0 is higher than on PMD#1, because dropping packets
is much faster than actually sending them.  So, only 66% of the
cycles on PMD#1 are busy cycles; the rest are idle.  PMD#0 uses 100%
of its cycles for forwarding.

With pmd-maxsleep not set (the default), both threads forward the
same amount of traffic (8.2 Mpps in my case).

However, once pmd-maxsleep is set, PMD#1 starts forwarding only 70%
of the previous amount of traffic, while still consuming only 60%
of its CPU cycles.  That is strange behavior from a user's
perspective, as we're seemingly prioritizing sleeps at the cost of
packet drops.

Here is what happens:

1. The thread has idle cycles, so it decides that it can sleep.
2. Every time it sleeps, it sleeps for 60+ us: the 50 us default
   timer slack in the kernel plus the 10 us that OVS requests.
3. 60+ us is enough to overflow the rxq, and the testpmd in mac mode
   starts dropping packets on transmit (rough numbers below).
4. PMD#1 wakes up, quickly clears the queue in a few iterations.
5. Goto step 1.

This cycle continues, resulting in a constant drop rate of 30% of
the incoming traffic.
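
To put rough numbers on step 3: at the ~8.2 Mpps rate above, a 60 us
sleep means that about

  8.2 Mpps * 60 us = ~490 packets

arrive while the thread sleeps, which is more than enough to fill a
virtio-user ring at its usual 256-descriptor default (an assumption;
I didn't check the exact queue size in this setup).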

The main problem in this case is the extra 50 us of slack on the
timer.  Setting PR_SET_TIMERSLACK to 1 us solves the problem, as a
10 us sleep is not enough to overflow the queue in this scenario.
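
For reference, the fix should essentially be a per-thread one-liner.
A minimal sketch (the function name is made up, and the actual patch
still needs to pick the right place in the PMD thread setup to call
it):

  #include <stdio.h>
  #include <sys/prctl.h>

  /* PR_SET_TIMERSLACK takes nanoseconds; the kernel default is
   * 50000 ns (50 us) and a value of 0 restores that default, so
   * 1000 ns (1 us) is about the lowest useful setting. */
  static void
  pmd_reduce_timer_slack(void)
  {
      if (prctl(PR_SET_TIMERSLACK, 1000UL, 0, 0, 0) == -1) {
          perror("prctl(PR_SET_TIMERSLACK)");
      }
  }

Timer slack is a per-thread attribute, so it would have to be set for
the PMD threads themselves (or inherited from the thread that creates
them).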

Another thing worth changing, once the timer slack is reduced, is
setting the sleep increment back to 1 us.  That makes the behavior
smoother.  It will also increase the ramp-up time to the maxsleep
value, i.e. sleeps stay short for longer, allowing OVS to react to
bursty traffic faster.
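
For context, with a 1 us increment the per-iteration decision in the
PMD main loop would look roughly like this (a simplified sketch; the
identifiers are illustrative, not the actual dpif-netdev ones, and
xnanosleep() is the existing OVS helper):

  /* max_sleep_us comes from pmd-maxsleep (us); 16 packets is half
   * of a full 32-packet rx batch, as in the commit message above. */
  if (max_rx_batch >= 16 || !max_sleep_us) {
      /* Enough load (or the feature is off): reset, no sleep. */
      cur_sleep_us = 0;
  } else if (cur_sleep_us < max_sleep_us) {
      /* Low-load iteration: add 1 us, up to pmd-maxsleep. */
      cur_sleep_us += 1;
  }
  if (cur_sleep_us) {
      /* The sleep request is in nanoseconds. */
      xnanosleep(cur_sleep_us * 1000);
  }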

We may think of other mitigation strategies that will help to avoid
packet drops in the future.  But for now, I think we should consider
getting the timer slack fix and the sleep increment reduction in, so
that we don't fall into such pathological cases so easily.

Thoughts?

Best regards, Ilya Maximets.