> v16 Summary:
> - Rebased to master, fixing NEWS file conflicts.
> - Included Acks from Flavio on patches 4 and 5.

Thanks all for the great work on patches/reviews/testing/re-work. GitHub Actions/AppVeyor/Read the Docs are in the green. As all issues seem resolved from review and acked for the DPIF series, I have merged this to master.

Regards
Ian

>
> v15 Summary:
> - Add Flavio's Acked-by tag to the appropriate commits.
> - Remove "dpif-netdev: Split HWOL out to own header file." commit. It's no
>   longer necessary because of the rework in AVX512 DPIF to use
>   dp_netdev_hw_flow().
> - Fix an issue with prefetching packets ahead in AVX512 DPIF with a batch size
>   of 1. This fix is for patch 03: "dpif-avx512: Add ISA implementation of
>   dpif.".
> - Address Flavio's comments on the following patches:
>   - Patch 04: "dpif-netdev: Add command to switch dpif implementation.".
>   - Patch 05: "dpif-netdev: Add command to get dpif implementations.".
>
> v14 Summary:
> - Prefetch 2 packets ahead when processing in AVX512 DPIF. This was found to
>   perform best when testing.
> - Make stores to a PMD's DPIF function pointer atomic.
> - Update AVX512 DPIF to use the latest PHWOL implementation. This
>   changed in the scalar DPIF from v13.
> - Change CMD names to dpif-impl-get/set.
> - Add currently running DPIF info to dpif-impl-get CMD.
> - Add Flavio's Acked-by tag to the appropriate commits.
> - Remove "dpif-netdev-unixctl.man: Document subtable-lookup-* CMDs"
>   commit to be sent separately.
>
> v13 Summary:
> - Squash DPCLS function rename commit into the first refactor commit.
> - Add NEWS items in the commits where the features are added.
> - Add documentation in the commits where the features are added.
> - Squash commit which adds HWOL support to AVX512 DPIF into commit which
>   adds the AVX512 DPIF.
> - Add EMC and SMC batch insert functions for better handling of EMC and
>   SMC in AVX512 DPIF.
> - Document added commands in manpages as well as rST.
>
> v12 Summary:
> - Add a partial HWOL PMD statistic. This is added for both the scalar
>   and AVX512 DPIFs.
>
> v11 Summary:
> - Improve the dp_netdev_impl_get_default() function so PMD threads created
>   after running "dpif-set" command will use the DPIF implementation that was
>   set.
> - Fix small comment formatting issues.
>
> v10 Summary:
> - Removed AVX512 POC work for DPIF and MFEX which was added in v9
> -- MFEX patches will be sent separately
> - Rebase additions to NEWS entries
> - Update copyright notices
>
> v9 Summary:
> - Added AVX512 POC work for DPIF and MFEX in single patch at end
> -- Note that the AVX512 MFEX is for Ether()/IP()/UDP() traffic.
> -- A significant performance boost is possible with these optimizations.
>
> v8 Summary:
> - Added NEWS entries for significant changes
> - Added scalar optimizations for datapath TX
> - Patchset is now ready for merge in my opinion.
>
> v7 summary:
> - OVS Conference included DPIF overview, youtube link:
> --- https://youtu.be/5dWyPxiXEhg
> - Rebased and tested on the DPDK 20.11 v4 patch
> --- Link: https://patchwork.ozlabs.org/project/openvswitch/list/?series=220645
> --- Tested this series for shared/static builds
> --- Tested this series with/without -march=<native,skylake,nehalem>
> - Minor code improvements in DPIF component (see commits for details)
> - Improved CPU ISA checks, caching results
> - Commit message improvements (.'s etc)
> - Added performance data of patchset
> --- Note that the benchmark below does not utilize the AVX512-vpopcntdq
> --- optimizations, and performance is expected to improve when used.
> --- Further optimizations are planned that continue.
>
> Benchmark Details & Results
> ===========================
>
> Intel® Xeon® Gold 6230 CPU @2.10GHz
> OVS*-DPDK* Phy-Phy Performance 4x 25G Ports - Total 1 million flows
> 1C1T-4P, 64-byte frame size, performance in mpps:
>
> Results Table:
> -------------------------------------------
> DPIF  | Scalar | Scalar | AVX512 | AVX512 |
> DPCLS | Scalar | AVX512 | Scalar | AVX512 |
> -------------------------------------------
> mpps  |  6.955 |  7.530 |  7.530 |  7.962 |
>
> By enabling both AVX512 DPIF and DPCLS, packet forwarding
> is 7.962 / 6.955 = 1.1447x faster, aka 14% speedup.
>
>
> v6 summary:
> - Rebase to DPDK 20.11 enabling patch
> --- This creates a dependency, expect CI build failures on the last
>     patch in this series if it is not applied!
> - Small improvements to DPIF layer
> --- EMC/SMC enabling in AVX512 DPIF cleanups
> - CPU ISA flags are cached, lowering overhead
> - Wildcard Classifier DPCLS
> --- Refactor and cleanups for function names
> --- Enable more subtable specializations
> --- Enable AVX512 vpopcount instruction
>
>
> v5 summary:
> - Dropped MFEX optimizations, retargeting to a later release
> --- This allows focus of community reviews & development on DPIF
> --- Note OVS Conference talk still introduces both DPIF and MFEX topics
> - DPIF improvements
> --- Better EMC/SMC handling
> --- HWOL is enabled in the avx512 DPIF
> --- Documentation & NEWS items added
> --- Various smaller improvements
>
> v4 summary:
> - Updated and improved DPIF component
> --- SMC now implemented
> --- EMC handling improved
> --- Novel batching method using AVX512 implemented
> --- see commits for details
> - Updated Miniflow Extract component
> --- Improved AVX512 code path performance
> --- Implemented multiple TODO items in v3
> --- Add "disable" implementation to return to scalar miniflow only
> --- More fixes planned for v5/future revisions:
> ---- Rename command to better reflect usage
> ---- Improve dynamicness of patterns
> ---- Add more demo protocols to show usage
> - Future work
> --- Documentation/NEWS items
> --- Statistics for optimized MFEX
> - Note that this patchset will be discussed/presented at OvsConf soon :)
>
> v3 update summary:
> (Cian Ferriter helping with rebases, review and code cleanups)
> - Split out partially related changes (these will be sent separately)
> --- netdev output action optimization
> --- avx512 dpcls 16-block support optimization
> - Squash commit which moves netdev struct flow into the refactor commit:
> --- Squash dpif-netdev: move netdev flow struct to header
> --- Into dpif-netdev: Refactor to multiple header files
> - Implement Miniflow extract for AVX-512 DPIF
> --- A generic method of matching patterns and packets is implemented,
>     providing traffic-pattern specific miniflow-extract acceleration.
> --- The patterns today are hard-coded, however in a future patchset it
>     is intended to make these runtime configurable, allowing users to
>     optimize the SIMD miniflow extract for active traffic types.
> - Notes:
> --- 32 bit builds will be fixed in next release by adding flexible
>     miniflow extract optimization selection.
> --- AVX-512 VBMI ISA is not yet supported in OVS due to requiring the
>     DPDK 20.11 update for RTE_CPUFLAG_*. Once on a newer DPDK this will
>     be added.
>
> v2 updates:
> - Includes DPIF command switching at runtime
> - Includes AVX512 DPIF implementation
> - Includes some partially related changes (can be split out of set?)
> --- netdev output action optimization
> --- avx512 dpcls 16-block support optimization
>
>
> This patchset is a v7 for making the DPIF components of the
> userspace datapath more flexible. It has been refactored to be
> more modular to encourage code-reuse, and scalable in that ISA
> optimized implementations can be added and selected at runtime.
>
> The same approach as has been previously used for DPCLS is used
> here, where a function pointer allows selection of an implementation
> at runtime.
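
For readers unfamiliar with the pattern, the function-pointer selection described above can be sketched roughly as follows. This is only an illustration, not the code from the series: every name in it (dpif_input_func, struct pmd, dpif_impl_set(), dpif_impl_get(), the "unrolled" and "unsupported" entries, the probe functions) is hypothetical, a "packet batch" is reduced to an array of lengths, and the probe functions stand in for real CPU ISA checks. It assumes a C11 compiler that provides <stdatomic.h>.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical input function: returns the total length of a packet batch. */
typedef unsigned (*dpif_input_func)(const unsigned *pkt_len, size_t count);

/* Baseline implementation: one packet per iteration. */
static unsigned
input_scalar(const unsigned *pkt_len, size_t count)
{
    unsigned total = 0;
    for (size_t i = 0; i < count; i++) {
        total += pkt_len[i];
    }
    return total;
}

/* Stand-in for an ISA-optimized variant: four packets per iteration. */
static unsigned
input_unrolled(const unsigned *pkt_len, size_t count)
{
    unsigned total = 0;
    size_t i = 0;
    for (; i + 4 <= count; i += 4) {
        total += pkt_len[i] + pkt_len[i + 1] + pkt_len[i + 2] + pkt_len[i + 3];
    }
    for (; i < count; i++) {
        total += pkt_len[i];
    }
    return total;
}

/* Placeholder probes; a real probe would check CPU ISA flags. */
static int probe_always(void) { return 1; }
static int probe_never(void) { return 0; }

struct dpif_impl {
    const char *name;
    int (*probe)(void);         /* Non-zero if usable on this CPU. */
    dpif_input_func input;
};

static const struct dpif_impl impls[] = {
    { "scalar", probe_always, input_scalar },
    { "unrolled", probe_always, input_unrolled },
    { "unsupported", probe_never, input_unrolled },
};
#define N_IMPLS (sizeof impls / sizeof impls[0])

/* Per-thread state; the pointer is atomic so it can be swapped at runtime. */
struct pmd {
    _Atomic(dpif_input_func) input;
};

static void
pmd_init(struct pmd *pmd)
{
    atomic_init(&pmd->input, input_scalar);
}

/* Selects an implementation by name. Returns 0 on success, -1 if the name
 * is unknown or its probe fails; on failure the pointer is left unchanged. */
static int
dpif_impl_set(struct pmd *pmd, const char *name)
{
    assert(pmd != NULL && name != NULL);
    for (size_t i = 0; i < N_IMPLS; i++) {
        if (!strcmp(impls[i].name, name)) {
            if (!impls[i].probe()) {
                return -1;
            }
            atomic_store(&pmd->input, impls[i].input);
            return 0;
        }
    }
    return -1;
}

/* Returns the name of the first table entry matching the active pointer. */
static const char *
dpif_impl_get(struct pmd *pmd)
{
    dpif_input_func cur = atomic_load(&pmd->input);
    for (size_t i = 0; i < N_IMPLS; i++) {
        if (impls[i].input == cur) {
            return impls[i].name;
        }
    }
    return "unknown";
}

/* Hot path: load the pointer once per batch, then call through it. */
static unsigned
pmd_process(struct pmd *pmd, const unsigned *pkt_len, size_t count)
{
    dpif_input_func input = atomic_load(&pmd->input);
    return input(pkt_len, count);
}
```

In this sketch a caller would run pmd_init() once, call pmd_process() per batch, and call dpif_impl_set(&pmd, "unrolled") at any point to switch. Both implementations compute the same total for the same batch, so the switch changes how the work is done and not what it produces. Loading the pointer once per batch means a batch is always processed by a single implementation even if the selection changes mid-run.
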
>
> Datapath features such as EMC, SMC and HWOL are shared between
> implementations, hence they are refactored into separate header files.
> The file splitting also improves maintainability, as dpif_netdev.c
> has ~9000 LOC, and is very hard to modify due to many structs defined
> locally in the .c file, ruling out re-usability in other .c files.
>
> Questions welcomed!
> Regards, -Harry
