> From: Stephen Hemminger [mailto:[email protected]]
> Sent: Monday, 19 January 2026 23.48
>
> On Fri, 16 Jan 2026 10:32:52 +0100
> Morten Brørup <[email protected]> wrote:
>
> > > From: Stephen Hemminger [mailto:[email protected]]
> > > Sent: Friday, 16 January 2026 07.46
> > >
> > > When building with LTO (Link Time Optimization), GCC performs
> > > aggressive cross-compilation-unit inlining. This causes the compiler
> > > to analyze all code paths in __rte_ring_do_dequeue_elems(), including
> > > the 16-byte element path (__rte_ring_dequeue_elems_128), even when
> > > the runtime element size is only 4 bytes.
> > >
> > > The static analyzer sees that the 16-byte path would copy
> > > 32 elements * 16 bytes = 512 bytes into a 128-byte buffer
> > > (uint32_t[32]),
> > > triggering -Wstringop-overflow warnings.

The element size is not an inline function parameter, but fetched from the "esize" field in the rte_soring structure, so the compiler cannot see that the element size is 4 bytes. It therefore has to consider all possible element sizes.

> > >
> > > The existing #pragma GCC diagnostic suppression in rte_ring_elem_pvt.h
> > > doesn't help because with LTO the warning context shifts to the test
> > > file where the inlined code is instantiated.
> > >
> > > Fix by sizing all buffers passed to soring acquire/dequeue functions
> > > for the worst-case element size (16 bytes = 4 * sizeof(uint32_t)).
> > > This satisfies the static analyzer without changing runtime behavior.
> >
> > Using wildly oversized buffers doesn't seem like a recommendable solution.
> > If the ring library is ever updated to support cache size elements
> > (64 byte), the buffers would have to be oversized by a factor of 16.
>
> The analysis (from AI) is that the compiler is getting confused.

That would be my analysis too.

> Since there is no good
> way other than turning off LTO for the test to tell the compiler

There is another way to tell the compiler: __rte_assume()

> >
> > Maybe adding __rte_assume(sor->esize == sizeof(uint32_t));
> > immediately before calling each of the affected soring functions would
> > fix the problem instead?

The soring functions are inline, so adding __rte_assume(sor->esize == sizeof(uint32_t)) before calling them tells the compiler that only the code path for 4 byte element size is relevant, so the compiler can eliminate the code paths for other element sizes.

Using __rte_assume() has worked for me in other cases, not just for optimization purposes, but also when the compiler gets confused about potential buffer overruns (that we know cannot occur, i.e. false positives). It might solve this problem too.

IMO, it is generally preferable to help the compiler (by providing hints) rather than implementing workarounds (using oversized buffers in this case).

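To illustrate the idea, here is a minimal standalone sketch. It is not taken from the DPDK tree: TOY_ASSUME(), struct toy_ring, toy_dequeue() and toy_read_u32() are hypothetical stand-ins for __rte_assume(), the soring structure and its inline dequeue path, and the macro relies on the GCC/Clang __builtin_unreachable() builtin. It only shows where the hint goes relative to the inline call; I have not verified that it silences the LTO warning in the actual test.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for __rte_assume(); GCC/Clang only. */
#define TOY_ASSUME(cond) \
	do { if (!(cond)) __builtin_unreachable(); } while (0)

/* Toy ring: the element size lives in the structure, like "esize". */
struct toy_ring {
	uint32_t esize;     /* element size in bytes: 4, 8 or 16 */
	uint32_t count;     /* number of elements currently stored */
	uint8_t data[512];  /* backing storage */
};

/* Inline dequeue with one copy path per supported element size. */
static inline uint32_t
toy_dequeue(const struct toy_ring *r, void *dst, uint32_t n)
{
	if (n > r->count)
		n = r->count;

	switch (r->esize) {
	case 4:
		memcpy(dst, r->data, (size_t)n * 4);
		break;
	case 8:
		memcpy(dst, r->data, (size_t)n * 8);
		break;
	case 16:
		memcpy(dst, r->data, (size_t)n * 16);
		break;
	default:
		return 0;
	}
	return n;
}

/* Caller that knows the ring was created with 4-byte elements. */
static uint32_t
toy_read_u32(const struct toy_ring *r, uint32_t out[32])
{
	/* Runtime check in debug builds; the hint is undefined behaviour if false. */
	assert(r->esize == sizeof(uint32_t));
	/* Tell the optimizer that only the 4-byte path is reachable from here. */
	TOY_ASSUME(r->esize == sizeof(uint32_t));
	return toy_dequeue(r, out, 32);
}
```

With the hint in place the destination buffer can stay at uint32_t[32], sized for the element size actually in use, instead of being padded for the largest element size the library supports.
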
> >
> > It's only a test application, so oversized buffers as a workaround is
> > acceptable.
> >
> > But if it serves as guidance for real applications, a better
> > solution/workaround would be preferable.

