From: Scott <[email protected]>

This series optimizes __rte_raw_cksum by replacing memcpy with direct
pointer access, enabling compiler vectorization on both GCC and Clang.

Patch 1 adds __rte_may_alias and __rte_aligned(1) to the unaligned
typedefs. This prevents a GCC strict-aliasing bug in which struct
initialization is incorrectly elided, and avoids UB by making explicit
that accesses through these types may be to any address.
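
As a rough illustration of the idea in patch 1 (not the actual
rte_common.h change), the sketch below declares a 16-bit type with both
attributes using GCC/Clang attribute syntax directly. The names
sketch_unaligned_u16, sketch_load_u16 and sketch_load_u16_memcpy are
hypothetical and exist only for this example; MSVC is not covered.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for an unaligned typedef (GCC/Clang only).
 * may_alias: accesses through this type may alias objects of any type.
 * aligned(1): the compiler must not assume natural 2-byte alignment.
 */
typedef uint16_t sketch_unaligned_u16
	__attribute__((__may_alias__, __aligned__(1)));

/* The typedef lowers the alignment requirement to a single byte. */
static_assert(_Alignof(sketch_unaligned_u16) == 1,
	      "typedef should have byte alignment");

/* Direct load through the unaligned, may-alias type. */
static uint16_t
sketch_load_u16(const void *p)
{
	return *(const sketch_unaligned_u16 *)p;
}

/* Reference load via memcpy, which is always well defined. */
static uint16_t
sketch_load_u16_memcpy(const void *p)
{
	uint16_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

Both helpers should return the same value for any address, including
odd ones; the attributes are what let the direct form be used without
relying on alignment or effective-type assumptions.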

Patch 2 uses the improved unaligned_uint16_t type in __rte_raw_cksum
to enable compiler optimizations while maintaining correctness across
all architectures (including strict-alignment platforms).
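
To make the memcpy versus direct-pointer distinction concrete, here is
a minimal sketch of a 16-bit word summation loop written both ways. It
is not the code from rte_cksum.h: the function names, the typedef and
the handling of the trailing odd byte are assumptions made for this
example, and the typedef again uses GCC/Clang attribute syntax.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical unaligned, may-alias 16-bit type (GCC/Clang only). */
typedef uint16_t sketch_unaligned_u16
	__attribute__((__may_alias__, __aligned__(1)));

/* Variant A: each 16-bit word is fetched with memcpy. */
static uint32_t
sketch_sum_memcpy(const void *buf, size_t len, uint32_t sum)
{
	const uint8_t *p = buf;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		uint16_t v;

		memcpy(&v, p + i, sizeof(v));
		sum += v;
	}
	if (len & 1) {
		uint16_t last = 0;

		/* Trailing byte fills the first byte of a zeroed word. */
		memcpy(&last, p + len - 1, 1);
		sum += last;
	}
	return sum;
}

/* Variant B: words are read directly through the unaligned type. */
static uint32_t
sketch_sum_direct(const void *buf, size_t len, uint32_t sum)
{
	const sketch_unaligned_u16 *w = buf;
	const sketch_unaligned_u16 *end = w + len / 2;

	for (; w != end; w++)
		sum += *w;
	if (len & 1) {
		uint16_t last = 0;

		/* Same trailing-byte handling as variant A. */
		*(uint8_t *)&last = *(const uint8_t *)end;
		sum += last;
	}
	return sum;
}
```

The two variants are meant to compute identical sums for any buffer
offset and length. The direct form is a plain indexed loop over a
pointer, which is the shape the series relies on for vectorization.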

Performance results show significant improvements (40% for small buffers,
up to 8x for larger buffers) on Intel Xeon with Clang 18.1.

Changes in v18:
- Fix MSVC compile error: __rte_aligned(1) must come before the type
- Fix test_hash_functions incorrect usage of unaligned_uint32_t

Changes in v17:
- Use __rte_aligned(1) unconditionally on unaligned type aliases
- test_cksum_fuzz uses unit_test_suite_runner
- Rename the test_cksum_fuzz reference method to
  test_cksum_fuzz_cksum_reference

Changes in v16:
- Add Fixes tag and Cc stable/author for backporting (patch 1)

Changes in v15:
- Use NOHUGE_OK and ASAN_OK constants in REGISTER_FAST_TEST

Changes in v14:
- Split into two patches: EAL typedef fix and checksum optimization
- Use unaligned_uint16_t directly instead of wrapper struct
- Added __rte_may_alias to unaligned typedefs to prevent GCC bug

Scott Mitchell (2):
  eal: add __rte_may_alias and __rte_aligned to unaligned typedefs
  net: __rte_raw_cksum pointers enable compiler optimizations

 app/test/meson.build           |   1 +
 app/test/test_cksum_fuzz.c     | 234 +++++++++++++++++++++++++++++++++
 app/test/test_cksum_perf.c     |   2 +-
 app/test/test_hash_functions.c |   2 +-
 lib/eal/include/rte_common.h   |  45 ++++---
 lib/net/rte_cksum.h            |  14 +-
 6 files changed, 271 insertions(+), 27 deletions(-)
 create mode 100644 app/test/test_cksum_fuzz.c

--
2.39.5 (Apple Git-154)
