On Sat, 17 Jan 2026 13:21:12 -0800 [email protected] wrote: > From: Scott Mitchell <[email protected]> > > This series optimizes __rte_raw_cksum by replacing memcpy with direct > pointer access, enabling compiler vectorization on both GCC and Clang. > > Patch 1 adds __rte_may_alias to unaligned typedefs to prevent a GCC > strict-aliasing bug where struct initialization is incorrectly elided. > > Patch 2 uses the improved unaligned_uint16_t type in __rte_raw_cksum > to enable compiler optimizations while maintaining correctness across > all architectures (including strict-alignment platforms). > > Performance results show significant improvements (40% for small buffers, > up to 8x for larger buffers) on Intel Xeon with Clang 18.1. > > Changes in v15: > - Use NOHUGE_OK and ASAN_OK constants in REGISTER_FAST_TEST > > Changes in v14: > - Split into two patches: EAL typedef fix and checksum optimization > - Use unaligned_uint16_t directly instead of wrapper struct > - Added __rte_may_alias to unaligned typedefs to prevent GCC bug > > Scott Mitchell (2): > eal: add __rte_may_alias to unaligned typedefs > net: __rte_raw_cksum pointers enable compiler optimizations > > app/test/meson.build | 1 + > app/test/test_cksum_fuzz.c | 240 +++++++++++++++++++++++++++++++++++ > app/test/test_cksum_perf.c | 2 +- > lib/eal/include/rte_common.h | 34 ++--- > lib/net/rte_cksum.h | 14 +- > 5 files changed, 266 insertions(+), 25 deletions(-) > create mode 100644 app/test/test_cksum_fuzz.c > > -- > 2.39.5 (Apple Git-154) >
Looks good now. Acked-by: Stephen Hemminger <[email protected]> AI review agrees with me... ## Patch Review: [PATCH v15 1/2] eal: add __rte_may_alias to unaligned typedefs ### Commit Message Analysis | Criterion | Status | Notes | |-----------|--------|-------| | Subject ≤60 chars | ✅ Pass | 47 characters | | Lowercase after colon | ✅ Pass | "add __rte_may_alias..." | | Imperative mood | ✅ Pass | "add" | | No trailing period | ✅ Pass | | | Correct prefix | ✅ Pass | "eal:" for lib/eal/ files | | Body ≤75 chars/line | ✅ Pass | Lines appear within limit | | Body doesn't start with "It" | ✅ Pass | Starts with "Add" | | Signed-off-by present | ✅ Pass | `Signed-off-by: Scott Mitchell <[email protected]>` | ### Missing Tags (Warning) **No `Fixes:` tag**: The commit message describes fixing "GCC strict-aliasing optimization bugs" and "incorrect optimization." This sounds like a bug fix that should reference the original commit introducing the unaligned typedefs. Consider adding: ``` Fixes: <12-char-sha> ("original commit introducing unaligned typedefs") ``` **No `Cc: [email protected]`**: If this fixes a real bug causing reads from uninitialized memory, it's likely a stable backport candidate. ### Code Review **Positive aspects:** - Proper Doxygen comment added for `__rte_may_alias` macro - Good explanation of the GCC bug workaround - MSVC fallback handled correctly - Macro moved before its use (necessary for the typedefs) **Minor observations:** - The second comment block (lines 121-124 in the diff) is somewhat redundant with the first Doxygen comment. Consider consolidating. --- ## Patch Review: [PATCH v15 2/2] net: use unaligned type for raw checksum ### Commit Message Analysis The mbox was truncated, but based on what's visible: | Criterion | Status | Notes | |-----------|--------|-------| | Correct prefix | ✅ Pass | "net:" for lib/net/ files | ### Code Review - lib/net/rte_cksum.h **The core change:** ```c // OLD (memcpy-based): for (end = RTE_PTR_ADD(buf, RTE_ALIGN_FLOOR(len, sizeof(uint16_t))); buf != end; buf = RTE_PTR_ADD(buf, sizeof(uint16_t))) { uint16_t v; memcpy(&v, buf, sizeof(uint16_t)); sum += v; } // NEW (direct access via unaligned type): const unaligned_uint16_t *buf16 = (const unaligned_uint16_t *)buf; const unaligned_uint16_t *end = buf16 + (len / sizeof(*buf16)); for (; buf16 != end; buf16++) sum += *buf16; ``` **Positive aspects:** - Cleaner, more readable code - Relies on the `__rte_may_alias` attribute from patch 1 to prevent aliasing bugs - Comment explains vectorization benefit: "GCC/Clang vectorize the loop" - Good dependency ordering (patch 1 must come before patch 2) **Style observations:** - ✅ Line length within 100 chars - ✅ Proper use of `const` ### Code Review - app/test/test_cksum_fuzz.c (New File) **Positive aspects:** - ✅ Uses `TEST_SUCCESS`/`TEST_FAILED` correctly - ✅ Uses `REGISTER_FAST_TEST` macro properly - ✅ `printf()` usage is acceptable in test code per AGENTS.md - ✅ `rte_malloc()` usage acceptable in test code - ✅ Comprehensive edge case testing (power-of-2 boundaries, MTU sizes, GRO boundaries) - ✅ Tests both aligned and unaligned cases - ✅ Tests with zero and random initial sums **Issues to verify** (file header not visible in truncated mbox): - Ensure SPDX license identifier present on first line - Ensure copyright line follows SPDX - Ensure blank line before includes **Style warning (lines 394-396):** ```c printf("MISMATCH at len=%zu aligned='%s' initial_sum=0x%08x ref=0x%08x opt=0x%08x\n", len, aligned ? "aligned" : "unaligned", initial_sum, sum_ref, sum_opt); ``` Line length appears to be ~95 chars which is acceptable (<100). ### Code Review - app/test/test_cksum_perf.c Minor change extending test coverage - looks fine. --- ## Summary ### Errors (Must Fix) None identified. ### Warnings (Should Fix) | Issue | Patch | Recommendation | |-------|-------|----------------| | Missing `Fixes:` tag | 1/2 | Add if this fixes a regression from a specific commit | | Missing `Cc: [email protected]` | 1/2 | Consider if this should be backported | | Verify SPDX header | 2/2 | Ensure test_cksum_fuzz.c has proper license header | ### Info (Consider) 1. **Patch 1**: The two comment blocks for `__rte_may_alias` could be consolidated into a single, more comprehensive Doxygen comment. 2. **Patch 2**: The new fuzz test is well-structured and follows DPDK test conventions. Good use of the `unit_test_suite_runner`-style approach with `REGISTER_FAST_TEST`. 3. **Series overall**: Good logical ordering - patch 1 provides the infrastructure, patch 2 uses it. Each commit should compile independently. --- **Verdict**: This is a well-structured patch series at v15. The code changes are clean and the test coverage is thorough. The main actionable items are adding appropriate `Fixes:` and `Cc: stable` tags if this is indeed a bug fix worth backporting.

