Kevin Traynor, Jan 06, 2023 at 15:59:
> Sleep for an incremental amount of time if none of the Rx queues
> assigned to a PMD have at least half a batch of packets (i.e. 16 pkts)
> on a polling iteration of the PMD.
>
> Upon detecting the threshold of >= 16 pkts on an Rxq, reset the
> sleep time to zero (i.e. no sleep).
>
> Sleep time will be increased on each iteration where the low load
> conditions remain up to a total of the max sleep time which is set
> by the user e.g:
> ovs-vsctl set Open_vSwitch . other_config:pmd-maxsleep=500
>
> The default pmd-maxsleep value is 0, which means that no sleeps
> will occur and the default behaviour is unchanged from previously.
>
> Also add new stats to pmd-perf-show to get visibility of operation
> e.g.
> ...
>    - sleep iterations:       153994  ( 76.8 % of iterations)
>    Sleep time:               9159399  us ( 46 us/iteration avg.)
> ...
>
> Signed-off-by: Kevin Traynor <[email protected]>

Hi Kevin,

For the record, here are a few numbers that were gathered on an HP DL360
Gen9 server (Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz) with and without
this patch series applied.

Single socket, physical-to-physical test, 2 cores in pmd-cpu-mask, power
measured with pcm-power:

+------------+------------+------------+--------------+-----------------+
|            | Reference: | Powersave: | pmd-maxsleep | Power off       |
|            | disabled   |            | 500us        | unused cores    |
|            | c-states   | C6 enabled | C6 enabled   | (X remaining)   |
+------------+------------+------------+--------------+-----------------+
| No OvS     | 33W        | 11.30W     | N/A          | 2 cores online  |
|            |            |            |              | All OFF: 11.30W |
+------------+------------+------------+--------------+-----------------+
| No traffic | 37W        | 26.5W      | 12W          | 12W             |
| 0 pps      |            |            |              |                 |
+------------+------------+------------+--------------+-----------------+
| Idle       | 37W        | 26.5W      | 12W          | 12W             |
| 1k pps     |            |            |              |                 |
+------------+------------+------------+--------------+-----------------+
| Medium     | 37W        | 27W        | 15-20W       | 15-20W          |
| 1 Mpps     |            |            |              |                 |
+------------+------------+------------+--------------+-----------------+
| High       | 38W        | 28W        | 28W          | 28W             |
| 14 Mpps    |            |            |              |                 |
+------------+------------+------------+--------------+-----------------+

> diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst
> index 9006fd40f..89f6b3052 100644
> --- a/Documentation/topics/dpdk/pmd.rst
> +++ b/Documentation/topics/dpdk/pmd.rst
> @@ -325,4 +325,55 @@ reassignment due to PMD Auto Load Balance. For example, this could be set
>  (in min) such that a reassignment is triggered at most every few hours.
>  
> +PMD Power Saving (Experimental)
> +-------------------------------
> +
> +PMD threads constantly poll Rx queues which are assigned to them. In order to
> +reduce the CPU cycles they use, they can sleep for small periods of time
> +when there is no load or very-low load on all the Rx queues they poll.
> +
> +This can be enabled by setting the max requested sleep time (in microseconds)
> +for a PMD thread::
> +
> +    $ ovs-vsctl set open_vswitch . other_config:pmd-maxsleep=500
> +
> +Non-zero values will be rounded up to the nearest 10 microseconds to avoid
> +requesting very small sleep times.
> +
> +With a non-zero max value a PMD may request to sleep by an incrementing amount
> +of time up to the maximum time. If at any point at least half a batch of
> +packets (i.e. 16) is received from an Rx queue that the PMD is polling, the
> +requested sleep time will be reset to 0. At that point no sleeps will occur
> +until the no/low load conditions return.
> +
> +Sleeping in a PMD thread will mean there is a period of time when the PMD
> +thread will not process packets. Sleep times requested are not guaranteed
> +and can differ significantly depending on system configuration. The actual
> +time not processing packets will be determined by the sleep and processor
> +wake-up times and should be tested with each system configuration.
> +
> +Sleep time statistics for 10 secs can be seen with::
> +
> +    $ ovs-appctl dpif-netdev/pmd-stats-clear \
> +        && sleep 10 && ovs-appctl dpif-netdev/pmd-perf-show
> +
> +Example output, showing that during the last 10 seconds, 76.8% of iterations
> +had a sleep of some length. The total amount of sleep time was 9.16 seconds and
> +the average sleep time per iteration was 46 microseconds::
> +
> +   - sleep iterations:       153994  ( 76.8 % of iterations)
> +   Sleep time:               9159399  us ( 46 us/iteration avg.)
> +
> +.. note::
> +
> +    If there is a sudden spike of packets while the PMD thread is sleeping and
> +    the processor is in a low-power state it may result in some lost packets or
> +    extra latency before the PMD thread returns to processing packets at full
> +    rate.
> +
> +.. note::
> +
> +    Default Linux kernel hrtimer resolution is set to 50 microseconds so this
> +    will add overhead to requested sleep time.

I wonder if it would make sense to round up to the nearest hrtimer
resolution (if such info can be retrieved at runtime).
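
To make the suggestion a bit more concrete, here is a minimal sketch of the
rounding step only. The helper name round_up_us() and its granularity
parameter are hypothetical and not part of the patch; how to obtain the real
granularity at runtime is left open, so it is simply passed in by the caller.

```c
#include <stdint.h>

/* Hypothetical helper, not part of the patch: round a requested sleep
 * time 'us' up to the next multiple of 'granularity_us'.
 *
 * A request of 0 is returned unchanged so that "no sleep" stays
 * "no sleep", and a granularity of 0 means "unknown", in which case
 * the request is returned as-is. */
static uint64_t
round_up_us(uint64_t us, uint64_t granularity_us)
{
    if (us == 0 || granularity_us == 0) {
        return us;
    }
    /* Integer ceiling division, then scale back to microseconds. */
    return (us + granularity_us - 1) / granularity_us * granularity_us;
}
```

With the 50 us figure from the note above used as the granularity, a 10 us
request would become 50 us, a 60 us request would become 100 us, and a 500 us
request would stay at 500 us.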

Cheers,

Reviewed-by: Robin Jarry <[email protected]>
