> > From: Stephen Hemminger [mailto:[email protected]]
> > Sent: Tuesday, 20 January 2026 15.34
> >
> > On Tue, 20 Jan 2026 09:49:44 +0100
> > Morten Brørup <[email protected]> wrote:
> >
> > > > From: Stephen Hemminger [mailto:[email protected]]
> > > > Sent: Monday, 19 January 2026 23.48
> > > >
> > > > On Fri, 16 Jan 2026 10:32:52 +0100
> > > > Morten Brørup <[email protected]> wrote:
> > > >
> > > > > > From: Stephen Hemminger [mailto:[email protected]]
> > > > > > Sent: Friday, 16 January 2026 07.46
> > > > > >
> > > > > > When building with LTO (Link Time Optimization), GCC performs
> > > > > > aggressive cross-compilation-unit inlining. This causes the compiler
> > > > > > to analyze all code paths in __rte_ring_do_dequeue_elems(), including
> > > > > > the 16-byte element path (__rte_ring_dequeue_elems_128), even when
> > > > > > the runtime element size is only 4 bytes.
> > > > > >
> > > > > > The static analyzer sees that the 16-byte path would copy
> > > > > > 32 elements * 16 bytes = 512 bytes into a 128-byte buffer
> > > > > > (uint32_t[32]),
> > > > > > triggering -Wstringop-overflow warnings.
> > >
> > > The element size is not an inline function parameter, but fetched
> > > from the "esize" field in the rte_soring structure, so the compiler
> > > cannot see that the element size is 4 bytes. And thus it needs to
> > > consider all possible element sizes.
> > >
> > > > > >
> > > > > > The existing #pragma GCC diagnostic suppression in rte_ring_elem_pvt.h
> > > > > > doesn't help because with LTO the warning context shifts to the test
> > > > > > file where the inlined code is instantiated.
> > > > > >
> > > > > > Fix by sizing all buffers passed to soring acquire/dequeue functions
> > > > > > for the worst-case element size (16 bytes = 4 * sizeof(uint32_t)).
> > > > > > This satisfies the static analyzer without changing runtime behavior.
> > > > >
> > > > > Using wildly oversized buffers doesn't seem like a recommendable solution.
> > > > > If the ring library is ever updated to support cache size elements
> > > > > (64 byte), the buffers would have to be oversize by factor 16.
> > > >
> > > > The analysis (from AI) is that compiler is getting confused.
> > >
> > > That would be my analysis too.
> > >
> > > > Since there is no good
> > > > way other than turning of LTO for the test to tell the compiler
> > >
> > > There is another way to tell the compiler: __rte_assume()
> >
> > Tried that but it doesn't work because doesn't get propagated deep
> > enough to impact here.
>
> Does this fix generally imply that when using LTO, using an SORING with
> elements smaller than 16 bytes requires oversize buffers?
> That's not good. :-(
>
> The SORING is still experimental.
> Maybe the element size and metadata size need to be passed as parameters to
> the SORING functions, like the RING functions take element size as parameter
> (except the functions that are hardcoded for using pointers as element size).

Personally, I am not a big fan of that idea...
I wonder whether it is possible to just disable LTO for soring.o?
Another thought: if all the problems come from the 128-bit version of
enqueue/dequeue, would using memcpy() instead of the size-specific copy
functions help to mitigate the problem?
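
To make the memcpy() idea a bit more concrete, here is a rough sketch.
It is not the rte_ring/soring code: sketch_dequeue_memcpy() and all of its
parameters are made-up names for illustration only, and whether a copy
written this way actually silences -Wstringop-overflow under LTO is
untested.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (NOT the DPDK implementation): copy n elements of
 * esize bytes each out of a circular buffer with 'size' slots, starting
 * at slot 'idx', into the flat buffer 'obj'.
 */
static void
sketch_dequeue_memcpy(const void *ring, uint32_t size, uint32_t idx,
		uint32_t esize, void *obj, uint32_t n)
{
	const uint8_t *src = ring;
	uint8_t *dst = obj;
	uint32_t first;

	assert(idx < size && n <= size);

	/* Number of elements that fit before the end of the ring storage. */
	first = (n < size - idx) ? n : size - idx;

	/* Copy the part up to the end of the storage... */
	memcpy(dst, src + (size_t)idx * esize, (size_t)first * esize);
	/* ...then wrap around and copy the remainder from slot 0. */
	memcpy(dst + (size_t)first * esize, src, (size_t)(n - first) * esize);
}
```

With a copy like this the byte count is always n * esize, computed from the
runtime esize, so there is no separate 16-byte path for the compiler to
analyze. The obvious open question is the performance cost compared with
the current size-specific copy loops.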

