> > From: Scott Mitchell <[email protected]>
> >
> > This series optimizes __rte_raw_cksum by replacing memcpy with direct
> > pointer access, enabling compiler vectorization on both GCC and Clang.
> >
> > Patch 1 adds __rte_may_alias to unaligned typedefs to prevent a GCC
> > strict-aliasing bug where struct initialization is incorrectly elided.
> >
> > Patch 2 uses the improved unaligned_uint16_t type in __rte_raw_cksum
> > to enable compiler optimizations while maintaining correctness across
> > all architectures (including strict-alignment platforms).
> >
> > Performance results show significant improvements (40% for small buffers,
> > up to 8x for larger buffers) on Intel Xeon with Clang 18.1.
> >
> > Changes in v15:
> > - Use NOHUGE_OK and ASAN_OK constants in REGISTER_FAST_TEST
> >
> > Changes in v14:
> > - Split into two patches: EAL typedef fix and checksum optimization
> > - Use unaligned_uint16_t directly instead of wrapper struct
> > - Added __rte_may_alias to unaligned typedefs to prevent GCC bug
> >
> > Scott Mitchell (2):
> >   eal: add __rte_may_alias to unaligned typedefs
> >   net: __rte_raw_cksum pointers enable compiler optimizations
> >
> >  app/test/meson.build         |   1 +
> >  app/test/test_cksum_fuzz.c   | 240 +++++++++++++++++++++++++++++++++++
> >  app/test/test_cksum_perf.c   |   2 +-
> >  lib/eal/include/rte_common.h |  34 ++---
> >  lib/net/rte_cksum.h          |  14 +-
> >  5 files changed, 266 insertions(+), 25 deletions(-)
> >  create mode 100644 app/test/test_cksum_fuzz.c
> >
> > --
> > 2.39.5 (Apple Git-154)
> >
>
> Looks good now.
> Acked-by: Stephen Hemminger <[email protected]>
LGTM too.
Acked-by: Morten Brørup <[email protected]>

Thank you for the effort and prompt reaction to feedback, Scott.
It has been a pleasure reviewing this series!

