This patch series introduces an AVX512-specific classify implementation for the ACL library. Internally it contains two code paths: one uses mostly 256-bit instructions/registers and can process up to 16 flows in parallel; the second uses 512-bit instructions/registers in most places and can process up to 32 flows in parallel. The code path is selected internally based on the input burst size and is completely opaque to the user. On my SKX box, test-acl shows ~20-65% improvement (depending on rule set and input burst size) when switching from the AVX2 to the AVX512 classify algorithm.
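
To illustrate the burst-size based selection described above, here is a minimal sketch. It is not the code from this series: the names (`select_classify_path`, `ACL_PATH_256BIT`, `ACL_PATH_512BIT`, `WIDE_PATH_MIN_FLOWS`) are hypothetical, and the threshold of 32 flows is an assumption derived only from the "up to 32 flows in parallel" figure; the actual cut-over point in the patches may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical identifiers for the two internal code paths. */
enum acl_path {
	ACL_PATH_256BIT, /* mostly 256-bit registers, up to 16 flows in parallel */
	ACL_PATH_512BIT, /* mostly 512-bit registers, up to 32 flows in parallel */
};

/*
 * Assumed threshold: use the wide path only when the burst can fill
 * all 32 parallel flows. This value is an illustration, not taken
 * from the patches.
 */
#define WIDE_PATH_MIN_FLOWS 32u

/* Pick a code path from the input burst size; the caller never sees this. */
static enum acl_path
select_classify_path(uint32_t num_flows)
{
	if (num_flows >= WIDE_PATH_MIN_FLOWS)
		return ACL_PATH_512BIT;
	return ACL_PATH_256BIT;
}

/* Human-readable name of a path, e.g. for debug logging. */
static const char *
classify_path_name(enum acl_path path)
{
	switch (path) {
	case ACL_PATH_256BIT:
		return "avx512 (256-bit)";
	case ACL_PATH_512BIT:
		return "avx512 (512-bit)";
	}
	assert(0 && "unknown code path");
	return "unknown";
}
```

With this sketch, a burst of 8 flows would take the 256-bit path and a burst of 64 flows the 512-bit path; the user-visible classify call stays the same in both cases.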

Note that this change introduces a formal ABI incompatibility with previous versions of the ACL library.

TODO list:
- Deduplicate 8/16 code paths
- Update default algorithm selection
- Update docs

This patch series depends on:
https://patches.dpdk.org/patch/70429/
which must be applied first.

Konstantin Ananyev (7):
  acl: fix x86 build when compiler doesn't support AVX2
  app/acl: few small improvements
  acl: remove of unused enum value
  acl: add infrastructure to support AVX512 classify
  app/acl: add AVX512 classify support
  acl: introduce AVX512 classify implementation
  acl: enhance AVX512 classify implementation

 app/test-acl/main.c                |  19 +-
 config/x86/meson.build             |   3 +-
 lib/librte_acl/Makefile            |  26 ++
 lib/librte_acl/acl.h               |   4 +
 lib/librte_acl/acl_run_avx512.c    | 140 +++++++
 lib/librte_acl/acl_run_avx512x16.h | 635 +++++++++++++++++++++++++++++
 lib/librte_acl/acl_run_avx512x8.h  | 614 ++++++++++++++++++++++++++++
 lib/librte_acl/meson.build         |  39 ++
 lib/librte_acl/rte_acl.c           |  19 +-
 lib/librte_acl/rte_acl.h           |   2 +-
 10 files changed, 1493 insertions(+), 8 deletions(-)
 create mode 100644 lib/librte_acl/acl_run_avx512.c
 create mode 100644 lib/librte_acl/acl_run_avx512x16.h
 create mode 100644 lib/librte_acl/acl_run_avx512x8.h

--
2.17.1