From: Emma Finn <emma.f...@intel.com> This commit adds the AVX512 implementation of the action functionality.
Usage: $ ovs-appctl odp-execute/action-impl-set avx512 Signed-off-by: Emma Finn <emma.f...@intel.com> Signed-off-by: Harry van Haaren <harry.van.haa...@intel.com> --- v10: - Move location and rework documentation (Ilya) - Improve ISA checks to fix CI build issue v9: rebase conflict on NEWS --- Documentation/topics/dpdk/bridge.rst | 30 ++++++++++++++++++++++++++++ Documentation/topics/testing.rst | 24 ++++++++++++++-------- NEWS | 1 + acinclude.m4 | 1 + lib/cpu.c | 1 + lib/cpu.h | 1 + lib/odp-execute-private.c | 8 ++++++++ lib/odp-execute-private.h | 12 +++++++++++ m4/openvswitch.m4 | 29 +++++++++++++++++++++++++++ 9 files changed, 99 insertions(+), 8 deletions(-) diff --git a/Documentation/topics/dpdk/bridge.rst b/Documentation/topics/dpdk/bridge.rst index 1f626c7c2..354f1ced1 100644 --- a/Documentation/topics/dpdk/bridge.rst +++ b/Documentation/topics/dpdk/bridge.rst @@ -321,3 +321,33 @@ following command:: ``scalar`` can be selected on core ``3`` by the following command:: $ ovs-appctl dpif-netdev/miniflow-parser-set -pmd 3 scalar + + +Actions Implementations (Experimental) +-------------------------------------- + +Actions describe what processing or modification should be performed on a +packet when it matches a given flow. Similar to the datapath interface, +DPCLS and MFEX (see above), the implementation of these actions can be +accelerated using SIMD instructions, resulting in improved performance. + +OVS provides multiple implementations of the actions, however some +implementations requiring a CPU capable of executing the required SIMD +instructions. + +Available implementations can be listed with the following command:: + + $ ovs-appctl odp-execute/action-impl-show + Available Actions implementations: + scalar (available: Yes, active: Yes) + autovalidator (available: Yes, active: No) + avx512 (available: Yes, active: No) + +By default, ``scalar`` is used. Implementations can be selected by +name:: + + $ ovs-appctl odp-execute/action-impl-set avx512 + Action implementation set to avx512. + + $ ovs-appctl odp-execute/action-impl-set scalar + Action implementation set to scalar. diff --git a/Documentation/topics/testing.rst b/Documentation/topics/testing.rst index c15d5b38f..a6c747b18 100644 --- a/Documentation/topics/testing.rst +++ b/Documentation/topics/testing.rst @@ -361,12 +361,12 @@ testsuite. Userspace datapath: Testing and Validation of CPU-specific Optimizations '''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''' -As multiple versions of the datapath classifier and packet parsing functions -can co-exist, each with different CPU ISA optimizations, it is important to -validate that they all give the exact same results. To easily test all the -implementations, an ``autovalidator`` implementation of them exists. This -implementation runs all other available implementations, and verifies that the -results are identical. +As multiple versions of the datapath classifier, packet parsing functions and +actions can co-exist, each with different CPU ISA optimizations, it is +important to validate that they all give the exact same results. To easily +test all the implementations, an ``autovalidator`` implementation of them +exists. This implementation runs all other available implementations, and +verifies that the results are identical. Running the OVS unit tests with the autovalidator enabled ensures all implementations provide the same results. Note that the performance of the @@ -382,18 +382,26 @@ To set the autovalidator for the packet parser, use this command:: $ ovs-appctl dpif-netdev/miniflow-parser-set autovalidator +To set the autovalidator for actions, use this command:: + + $ ovs-appctl odp-execute/action-impl-set autovalidator + To run the OVS unit test suite with the autovalidator as the default implementation, it is required to recompile OVS. During the recompilation, the default priority of the `autovalidator` implementation is set to the -maximum priority, ensuring every test will be run with every implementation:: +maximum priority, ensuring every test will be run with every implementation. +Priority is only related to mfex autovalidator and not the actions +autovalidator.:: - $ ./configure --enable-autovalidator --enable-mfex-default-autovalidator + $ ./configure --enable-autovalidator --enable-mfex-default-autovalidator \ + --enable-actions-default-autovalidator The following line should be seen in the configuration log when the above options are used:: checking whether DPCLS Autovalidator is default implementation... yes checking whether MFEX Autovalidator is default implementation... yes + checking whether actions Autovalidator is default implementation... yes Compile OVS in debug mode to have `ovs_assert` statements error out if there is a mis-match in the datapath classifier lookup or packet parser diff --git a/NEWS b/NEWS index bb03457f6..94f9d7f66 100644 --- a/NEWS +++ b/NEWS @@ -56,6 +56,7 @@ Post-v2.17.0 implementations available at run time. * Add build time configure command to enable auto-validator as default actions implementation at build time. + * Add AVX512 implementation of actions. - Linux datapath: * Add offloading meter tc police. * Add support for offloading the check_pkt_len action. diff --git a/acinclude.m4 b/acinclude.m4 index 84fad425c..555e50956 100644 --- a/acinclude.m4 +++ b/acinclude.m4 @@ -96,6 +96,7 @@ dnl dnl Checks if compiler and binutils supports various AVX512 ISA. AC_DEFUN([OVS_CHECK_AVX512], [ OVS_CHECK_BINUTILS_AVX512 + OVS_CHECK_GCC_AVX512VL OVS_CONDITIONAL_CC_OPTION_DEFINE([-mavx512f], [HAVE_AVX512F]) OVS_CONDITIONAL_CC_OPTION_DEFINE([-mavx512bw], [HAVE_AVX512BW]) OVS_CONDITIONAL_CC_OPTION_DEFINE([-mavx512vl], [HAVE_AVX512VL]) diff --git a/lib/cpu.c b/lib/cpu.c index 2df003c51..0292f715e 100644 --- a/lib/cpu.c +++ b/lib/cpu.c @@ -53,6 +53,7 @@ X86_ISA(X86_EXT_FEATURES_LEAF, EBX, 16, OVS_CPU_ISA_X86_AVX512F) X86_ISA(X86_EXT_FEATURES_LEAF, EBX, 30, OVS_CPU_ISA_X86_AVX512BW) X86_ISA(X86_EXT_FEATURES_LEAF, ECX, 1, OVS_CPU_ISA_X86_AVX512VBMI) X86_ISA(X86_EXT_FEATURES_LEAF, ECX, 14, OVS_CPU_ISA_X86_VPOPCNTDQ) +X86_ISA(X86_EXT_FEATURES_LEAF, EBX, 31, OVS_CPU_ISA_X86_AVX512VL) #endif bool diff --git a/lib/cpu.h b/lib/cpu.h index 92897bb71..3215229bc 100644 --- a/lib/cpu.h +++ b/lib/cpu.h @@ -25,6 +25,7 @@ enum ovs_cpu_isa { OVS_CPU_ISA_X86_AVX512F, OVS_CPU_ISA_X86_AVX512BW, OVS_CPU_ISA_X86_AVX512VBMI, + OVS_CPU_ISA_X86_AVX512VL, OVS_CPU_ISA_X86_VPOPCNTDQ, OVS_CPU_ISA_X86_LAST = OVS_CPU_ISA_X86_VPOPCNTDQ, }; diff --git a/lib/odp-execute-private.c b/lib/odp-execute-private.c index 38be22ec9..3591da2e5 100644 --- a/lib/odp-execute-private.c +++ b/lib/odp-execute-private.c @@ -41,6 +41,14 @@ static struct odp_execute_action_impl action_impls[] = { .name = "scalar", .init_func = odp_action_scalar_init, }, + +#if ACTION_IMPL_AVX512_CHECK + [ACTION_IMPL_AVX512] = { + .available = false, + .name = "avx512", + .init_func = NULL, + }, +#endif }; static void diff --git a/lib/odp-execute-private.h b/lib/odp-execute-private.h index d6eebbf37..1943eb600 100644 --- a/lib/odp-execute-private.h +++ b/lib/odp-execute-private.h @@ -22,6 +22,15 @@ #include "odp-netlink.h" #include "ovs-atomic.h" +/* Combine all required ISA and Linker checks into a single #define + * for readability and simplicity where the checks are needed. Note + * that it is always #defined, so code must use the #if preprocesor + * directive (not #ifdef). + */ +#define ACTION_IMPL_AVX512_CHECK (__x86_64__ && HAVE_AVX512F \ + && HAVE_LD_AVX512_GOOD && __SSE4_2__ && HAVE_AVX512BW && HAVE_AVX512VL \ + && HAVE_GCC_AVX512VL_GOOD) + /* Forward declaration for typedef. */ struct odp_execute_action_impl; @@ -59,6 +68,9 @@ enum odp_execute_action_impl_idx { * Do not change the autovalidator position in this list without updating * the define below. */ +#if ACTION_IMPL_AVX512_CHECK + ACTION_IMPL_AVX512, +#endif ACTION_IMPL_MAX, }; diff --git a/m4/openvswitch.m4 b/m4/openvswitch.m4 index 4c3bace6e..fe51455b4 100644 --- a/m4/openvswitch.m4 +++ b/m4/openvswitch.m4 @@ -421,6 +421,35 @@ AC_DEFUN([OVS_CHECK_SPHINX], AC_ARG_VAR([SPHINXBUILD]) AM_CONDITIONAL([HAVE_SPHINX], [test "$SPHINXBUILD" != none])]) + +dnl Checks for compiler correctly emitting AVX512-VL vpermd instruction. +dnl GCC5 says it exports AVX512-VL, but it doesn't implement "vpermd" instruction +dnl resulting in compilation failures. To workaround this "reported vs actual" +dnl mismatch, we compile a small snippet, and conditionally enable AVX512-VL. +AC_DEFUN([OVS_CHECK_GCC_AVX512VL], [ + AC_MSG_CHECKING([whether compiler correctly emits AVX512-VL]) + AC_COMPILE_IFELSE( + [AC_LANG_PROGRAM([#include <immintrin.h> + static void __attribute__((__target__("avx512vl"))) + check_permutexvar(void) + { + __m256i v_swap32a = _mm256_setr_epi32(0x0, 0x4, 0xF, + 0xF, 0xF, 0xF, + 0xF, 0xF); + v_swap32a = _mm256_permutexvar_epi32(v_swap32a, + v_swap32a); + }],[])], + [AC_MSG_RESULT([yes]) + ovs_cv_gcc_avx512vl_good=yes], + [AC_MSG_RESULT([no]) + ovs_cv_gcc_avx512vl_good=no]) + if test "$ovs_cv_gcc_avx512vl_good" = yes; then + AC_DEFINE([HAVE_GCC_AVX512VL_GOOD], [1], + [Define to 1 if gcc implements the vpermd instruction.]) + fi + AM_CONDITIONAL([HAVE_GCC_AVX512VL_GOOD], + [test "$ovs_cv_gcc_avx512vl_good" = yes])]) + dnl Checks for binutils/assembler known issue with AVX512. dnl Due to backports, we probe assembling a reproducer instead of checking dnl binutils version string. More details, including ASM dumps and debug here: -- 2.32.0 _______________________________________________ dev mailing list d...@openvswitch.org https://mail.openvswitch.org/mailman/listinfo/ovs-dev