Kevin Traynor, Jan 06, 2023 at 15:59:
> Sleep for an incremental amount of time if none of the Rx queues
> assigned to a PMD have at least half a batch of packets (i.e. 16 pkts)
> on an polling iteration of the PMD.
>
> Upon detecting the threshold of >= 16 pkts on an Rxq, reset the
> sleep time to zero (i.e. no sleep).
>
> Sleep time will be increased on each iteration where the low load
> conditions remain up to a total of the max sleep time which is set
> by the user e.g:
> ovs-vsctl set Open_vSwitch . other_config:pmd-maxsleep=500
>
> The default pmd-maxsleep value is 0, which means that no sleeps
> will occur and the default behaviour is unchanged from previously.
>
> Also add new stats to pmd-perf-show to get visibility of operation
> e.g.
> ...
>    - sleep iterations:       153994  ( 76.8 % of iterations)
>      Sleep time:            9159399  us ( 46 us/iteration avg.)
> ...
>
> Signed-off-by: Kevin Traynor <[email protected]>

Hi Kevin,

For the record, here are a few numbers that were gathered on an HP DL360
Gen9 server (Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz) with and without
this patch series applied.

Single socket, physical-to-physical test, 2 cores in pmd-cpu-mask, power
measurement with pcm-power:

+------------+------------+------------+--------------+-----------------+
|            | Reference: | Powersave: | pmd-maxsleep | Power off       |
|            | disabled   |            | 500us        | unused cores    |
|            | c-states   | C6 enabled | C6 enabled   | (X remaining)   |
+------------+------------+------------+--------------+-----------------+
| No OvS     | 33 W       | 11.30W     | N/A          | 2 cores online  |
|            |            |            |              | All OFF: 11.30W |
+------------+------------+------------+--------------+-----------------+
| No traffic | 37W        | 26.5W      | 12W          | 12W             |
| 0 PPS      |            |            |              |                 |
+------------+------------+------------+--------------+-----------------+
| Idle       | 37W        | 26.5W      | 12W          | 12W             |
| 1k pps     |            |            |              |                 |
+------------+------------+------------+--------------+-----------------+
| Medium     | 37W        | 27W        | 15-20W       | 15-20W          |
| 1 Mpps     |            |            |              |                 |
+------------+------------+------------+--------------+-----------------+
| High       | 38W        | 28W        | 28W          | 28W             |
| 14 Mpps    |            |            |              |                 |
+------------+------------+------------+--------------+-----------------+

> diff --git a/Documentation/topics/dpdk/pmd.rst b/Documentation/topics/dpdk/pmd.rst
> index 9006fd40f..89f6b3052 100644
> --- a/Documentation/topics/dpdk/pmd.rst
> +++ b/Documentation/topics/dpdk/pmd.rst
> @@ -325,4 +325,55 @@ reassignment due to PMD Auto Load Balance. For example, this could be set
>  (in min) such that a reassignment is triggered at most every few hours.
>
> +PMD Power Saving (Experimental)
> +-------------------------------
> +
> +PMD threads constantly poll Rx queues which are assigned to them. In order to
> +reduce the CPU cycles they use, they can sleep for small periods of time
> +when there is no load or very-low load on all the Rx queues they poll.
> +
> +This can be enabled by setting the max requested sleep time (in microseconds)
> +for a PMD thread::
> +
> +    $ ovs-vsctl set open_vswitch . other_config:pmd-maxsleep=500
> +
> +Non-zero values will be rounded up to the nearest 10 microseconds to avoid
> +requesting very small sleep times.
> +
> +With a non-zero max value a PMD may request to sleep by an incrementing amount
> +of time up to the maximum time. If at any point the threshold of at least half
> +a batch of packets (i.e. 16) is received from an Rx queue that the PMD is
> +polling is met, the requested sleep time will be reset to 0. At that point no
> +sleeps will occur until the no/low load conditions return.
> +
> +Sleeping in a PMD thread will mean there is a period of time when the PMD
> +thread will not process packets. Sleep times requested are not guaranteed
> +and can differ significantly depending on system configuration. The actual
> +time not processing packets will be determined by the sleep and processor
> +wake-up times and should be tested with each system configuration.
> +
> +Sleep time statistics for 10 secs can be seen with::
> +
> +    $ ovs-appctl dpif-netdev/pmd-stats-clear \
> +        && sleep 10 && ovs-appctl dpif-netdev/pmd-perf-show
> +
> +Example output, showing that during the last 10 seconds, 76.8% of iterations
> +had a sleep of some length. The total amount of sleep time was 9.15 seconds and
> +the average sleep time per iteration was 46 microseconds::
> +
> +   - sleep iterations:       153994  ( 76.8 % of iterations)
> +     Sleep time:            9159399  us ( 46 us/iteration avg.)
> +
> +.. note::
> +
> +    If there is a sudden spike of packets while the PMD thread is sleeping and
> +    the processor is in a low-power state it may result in some lost packets or
> +    extra latency before the PMD thread returns to processing packets at full
> +    rate.
> +
> +.. note::
> +
> +    Default Linux kernel hrtimer resolution is set to 50 microseconds so this
> +    will add overhead to requested sleep time.
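
For readers following along, the behaviour described in the quoted
documentation can be modelled in a few lines of Python. This is only a
rough sketch and not the OVS C code: the function names and the 10 us
increment step (SLEEP_INC_US) are assumptions made for illustration. The
quoted text only gives the threshold (16 packets), the rounding of the
maximum to 10 us and the user-set maximum itself.

```python
SLEEP_THRESH_PKTS = 16  # half a batch, per the commit message
SLEEP_INC_US = 10       # ASSUMED step; the quoted text does not state one


def round_up(value_us, granularity_us=10):
    """Round a non-zero value up to the next multiple of granularity_us."""
    if value_us <= 0:
        return 0
    # Ceiling division using integer arithmetic only.
    return -(-value_us // granularity_us) * granularity_us


def next_sleep_us(current_us, max_rx_pkts, max_sleep_us):
    """Return the sleep time to request after one polling iteration.

    max_rx_pkts is the largest number of packets received from any single
    Rx queue polled in this iteration.
    """
    if max_sleep_us == 0 or max_rx_pkts >= SLEEP_THRESH_PKTS:
        # Feature disabled, or load detected: no sleep.
        return 0
    # Low load persists: grow the request, capped at the maximum.
    return min(current_us + SLEEP_INC_US, max_sleep_us)


if __name__ == "__main__":
    max_sleep = round_up(495)  # a configured value of 495 becomes 500
    sleep_us = 0
    for pkts in [0, 3, 0, 20, 1]:
        sleep_us = next_sleep_us(sleep_us, pkts, max_sleep)
        print(f"rx pkts={pkts:2d} -> requested sleep={sleep_us} us")
```

In this model a single iteration with 16 or more packets on any Rx queue
drops the request straight back to zero, after which it has to ramp up
again from the first step.
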

I wonder if it would make sense to round up to the nearest multiple of
the hrtimer resolution instead (if that information can be retrieved at
runtime).

Cheers,

Reviewed-by: Robin Jarry <[email protected]>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev
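
A minimal sketch of the rounding idea raised in the review comment, in
Python for brevity. The helper name is hypothetical, and the granularity
is passed in as a parameter because how (or whether) it can be read at
runtime is left open here. The 50 us used in the example is simply the
figure from the quoted note.

```python
def round_up_to_granularity(requested_us, granularity_us):
    """Round requested_us up to the next multiple of granularity_us.

    Zero (sleeping disabled) is left untouched.
    """
    if granularity_us <= 0:
        raise ValueError("granularity_us must be positive")
    if requested_us <= 0:
        return 0
    # Ceiling division using integer arithmetic only.
    return -(-requested_us // granularity_us) * granularity_us


if __name__ == "__main__":
    # 50 us is the timer figure quoted in the documentation note.
    for requested in (0, 10, 60, 500):
        print(requested, "->", round_up_to_granularity(requested, 50))
```

With a 50 us granularity, a request of 10 us would become 50 us, 60 us
would become 100 us, and 500 us would stay at 500 us.
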
