> > From: Scott Mitchell <[email protected]>
> >
> > This series optimizes __rte_raw_cksum by replacing memcpy with direct
> > pointer access, enabling compiler vectorization on both GCC and Clang.
> >
> > Patch 1 adds __rte_may_alias to unaligned typedefs to prevent a GCC
> > strict-aliasing bug where struct initialization is incorrectly elided.
> >
> > Patch 2 uses the improved unaligned_uint16_t type in __rte_raw_cksum
> > to enable compiler optimizations while maintaining correctness across
> > all architectures (including strict-alignment platforms).
> >
> > Performance results show significant improvements (40% for small buffers,
> > up to 8x for larger buffers) on Intel Xeon with Clang 18.1.
> >
> > Changes in v15:
> > - Use NOHUGE_OK and ASAN_OK constants in REGISTER_FAST_TEST
> >
> > Changes in v14:
> > - Split into two patches: EAL typedef fix and checksum optimization
> > - Use unaligned_uint16_t directly instead of wrapper struct
> > - Added __rte_may_alias to unaligned typedefs to prevent GCC bug
> >
> > Scott Mitchell (2):
> >   eal: add __rte_may_alias to unaligned typedefs
> >   net: __rte_raw_cksum pointers enable compiler optimizations
> >
> >  app/test/meson.build         |   1 +
> >  app/test/test_cksum_fuzz.c   | 240 +++++++++++++++++++++++++++++++++++
> >  app/test/test_cksum_perf.c   |   2 +-
> >  lib/eal/include/rte_common.h |  34 ++---
> >  lib/net/rte_cksum.h          |  14 +-
> >  5 files changed, 266 insertions(+), 25 deletions(-)
> >  create mode 100644 app/test/test_cksum_fuzz.c
> >
> > --
> > 2.39.5 (Apple Git-154)
> >
> 
> Looks good now.
> Acked-by: Stephen Hemminger <[email protected]>

LGTM too.
Acked-by: Morten Brørup <[email protected]>

Thank you for the effort and prompt reaction to feedback, Scott.
It has been a pleasure reviewing this series!
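
For anyone reading this thread later, here is a minimal sketch of the idea
described in the cover letter: summing 16-bit words through an unaligned,
may-alias typedef instead of a memcpy per word. This is not the code from the
patches. The names sketch_unaligned_u16, sketch_sum_memcpy, sketch_sum_direct
and sketch_fold are made up for illustration, and the attribute spelling
assumes GCC or Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an unaligned 16-bit type (GCC/Clang only).
 * aligned(1) allows loads from any address; may_alias tells the compiler
 * this type may alias objects of other types. */
typedef uint16_t sketch_unaligned_u16
	__attribute__((__may_alias__, __aligned__(1)));

/* Baseline: copy each 16-bit word out with memcpy before adding it.
 * Assumes len is small enough that the 32-bit sum cannot overflow. */
static uint32_t
sketch_sum_memcpy(const void *buf, size_t len)
{
	const uint8_t *p = buf;
	uint32_t sum = 0;

	assert(buf != NULL);
	for (; len >= 2; len -= 2, p += 2) {
		uint16_t w;

		memcpy(&w, p, sizeof(w));
		sum += w;
	}
	if (len == 1) {
		uint16_t last = 0;

		/* Trailing odd byte, padded with a zero byte. */
		memcpy(&last, p, 1);
		sum += last;
	}
	return sum;
}

/* Variant: read the words directly through the unaligned typedef.
 * Same result as above, but a plain load loop that is easier for the
 * compiler to vectorize. */
static uint32_t
sketch_sum_direct(const void *buf, size_t len)
{
	const sketch_unaligned_u16 *w = buf;
	const sketch_unaligned_u16 *end = w + len / 2;
	uint32_t sum = 0;

	assert(buf != NULL);
	for (; w < end; w++)
		sum += *w;
	if (len & 1) {
		uint16_t last = 0;

		memcpy(&last, w, 1);
		sum += last;
	}
	return sum;
}

/* Fold the 32-bit accumulator into 16 bits (ones' complement addition). */
static uint16_t
sketch_fold(uint32_t sum)
{
	sum = (sum >> 16) + (sum & 0xffff);
	sum = (sum >> 16) + (sum & 0xffff);
	return (uint16_t)sum;
}
```

Both variants should return the same sum for any buffer and offset, including
odd start addresses and odd lengths. Whether the direct version actually
vectorizes depends on the compiler and flags.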
