From: Scott <[email protected]>

This series optimizes __rte_raw_cksum by replacing memcpy with direct
pointer access, enabling compiler vectorization on both GCC and Clang.
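For readers unfamiliar with the idiom, the difference between the two
access styles can be sketched roughly as below. This is only an
illustration, not the code from the patches: the sketch_* names are
hypothetical, the typedef merely stands in for unaligned_uint16_t using
GCC/Clang attribute syntax directly, and the final fold to 16 bits is
omitted.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an unaligned, may-alias 16-bit type.
 * Assumes GCC or Clang attribute syntax. */
typedef uint16_t sketch_unaligned_u16
	__attribute__((__may_alias__, __aligned__(1)));

/* memcpy style: each 16-bit word is copied into a local before use. */
static uint32_t sketch_sum_memcpy(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		uint16_t v;

		memcpy(&v, p + i, sizeof(v));
		sum += v;
	}
	if (len & 1) {
		/* Trailing odd byte goes into the first byte of a zeroed word. */
		uint16_t last = 0;

		memcpy(&last, p + len - 1, 1);
		sum += last;
	}
	return sum;
}

/* Direct style: words are read through the unaligned, may-alias pointer. */
static uint32_t sketch_sum_direct(const void *buf, size_t len)
{
	const sketch_unaligned_u16 *p16 = buf;
	const sketch_unaligned_u16 *end = p16 + len / 2;
	uint32_t sum = 0;

	for (; p16 != end; p16++)
		sum += *p16;
	if (len & 1) {
		uint16_t last = 0;

		*(uint8_t *)&last = *(const uint8_t *)end;
		sum += last;
	}
	return sum;
}

/* Usage sketch: both variants should agree, even at an odd offset. */
static int sketch_variants_agree(void)
{
	uint8_t data[9] = {1, 2, 3, 4, 5, 6, 7, 8, 9};

	return sketch_sum_memcpy(data + 1, 7) == sketch_sum_direct(data + 1, 7);
}
```

Both functions compute the same unfolded sum. The second form expresses
the loads as plain pointer dereferences, which is the shape this series
relies on for vectorization.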
Patch 1 adds __rte_may_alias and __rte_aligned(1) to unaligned typedefs
to prevent a GCC strict-aliasing bug where struct initialization is
incorrectly elided, and avoid UB by clarifying access can be from any
address.

Patch 2 uses the improved unaligned_uint16_t type in __rte_raw_cksum to
enable compiler optimizations while maintaining correctness across all
architectures (including strict-alignment platforms).

Performance results show significant improvements (40% for small
buffers, up to 8x for larger buffers) on Intel Xeon with Clang 18.1.

Changes in v17:
- Use __rte_aligned(1) unconditionally on unaligned type aliases
- test_cksum_fuzz uses unit_test_suite_runner
- test_cksum_fuzz reference method rename to
  test_cksum_fuzz_cksum_reference

Changes in v16:
- Add Fixes tag and Cc stable/author for backporting (patch 1)

Changes in v15:
- Use NOHUGE_OK and ASAN_OK constants in REGISTER_FAST_TEST

Changes in v14:
- Split into two patches: EAL typedef fix and checksum optimization
- Use unaligned_uint16_t directly instead of wrapper struct
- Added __rte_may_alias to unaligned typedefs to prevent GCC bug

Scott Mitchell (2):
  eal: add __rte_may_alias and __rte_aligned to unaligned typedefs
  net: __rte_raw_cksum pointers enable compiler optimizations

 app/test/meson.build         |   1 +
 app/test/test_cksum_fuzz.c   | 234 +++++++++++++++++++++++++++++++++++
 app/test/test_cksum_perf.c   |   2 +-
 lib/eal/include/rte_common.h |  39 +++---
 lib/net/rte_cksum.h          |  14 +--
 5 files changed, 264 insertions(+), 26 deletions(-)
 create mode 100644 app/test/test_cksum_fuzz.c

-- 
2.39.5 (Apple Git-154)

