> > From: Stephen Hemminger [mailto:[email protected]]
> > Sent: Tuesday, 20 January 2026 15.34
> >
> > On Tue, 20 Jan 2026 09:49:44 +0100
> > Morten Brørup <[email protected]> wrote:
> >
> > > > From: Stephen Hemminger [mailto:[email protected]]
> > > > Sent: Monday, 19 January 2026 23.48
> > > >
> > > > On Fri, 16 Jan 2026 10:32:52 +0100
> > > > Morten Brørup <[email protected]> wrote:
> > > >
> > > > > > From: Stephen Hemminger [mailto:[email protected]]
> > > > > > Sent: Friday, 16 January 2026 07.46
> > > > > >
> > > > > > When building with LTO (Link Time Optimization), GCC performs
> > > > > > aggressive cross-compilation-unit inlining. This causes the compiler
> > > > > > to analyze all code paths in __rte_ring_do_dequeue_elems(), including
> > > > > > the 16-byte element path (__rte_ring_dequeue_elems_128), even when
> > > > > > the runtime element size is only 4 bytes.
> > > > > >
> > > > > > The static analyzer sees that the 16-byte path would copy
> > > > > > 32 elements * 16 bytes = 512 bytes into a 128-byte buffer
> > > > > > (uint32_t[32]),
> > > > > > triggering -Wstringop-overflow warnings.
> > >
> > > The element size is not an inline function parameter, but fetched
> > > from the "esize" field in the rte_soring structure, so the compiler
> > > cannot see that the element size is 4 bytes. And thus it needs to
> > > consider all possible element sizes.
> > >
> > > > > >
> > > > > > The existing #pragma GCC diagnostic suppression in rte_ring_elem_pvt.h
> > > > > > doesn't help because with LTO the warning context shifts to the test
> > > > > > file where the inlined code is instantiated.
> > > > > >
> > > > > > Fix by sizing all buffers passed to soring acquire/dequeue functions
> > > > > > for the worst-case element size (16 bytes = 4 * sizeof(uint32_t)).
> > > > > > This satisfies the static analyzer without changing runtime behavior.
> > > > >
> > > > > Using wildly oversized buffers doesn't seem like a recommendable
> > > > > solution.
> > > > > If the ring library is ever updated to support cache size elements
> > > > > (64 bytes), the buffers would have to be oversized by a factor of 16.
> > > >
> > > > The analysis (from AI) is that the compiler is getting confused.
> > >
> > > That would be my analysis too.
> > >
> > > > Since there is no good
> > > > way other than turning off LTO for the test to tell the compiler
> > >
> > > There is another way to tell the compiler: __rte_assume()
> >
> > Tried that, but it doesn't work because it doesn't get propagated deep
> > enough to have an impact here.
> 
> Does this fix generally imply that when using LTO, using an SORING with
> elements smaller than 16 bytes requires oversize buffers?
> That's not good. :-(
> 
> The SORING is still experimental.
> Maybe the element size and metadata size need to be passed as parameters to
> the SORING functions, like the RING functions take element size as parameter
> (except the functions that are hardcoded for using pointers as element size).

Personally, I am not a big fan of that idea...
I wonder whether it is possible to just disable LTO for soring.o?
Another thought: if all the problems come from the 128-bit version of
enqueue/dequeue, would using memcpy() instead of the specialized functions
help to mitigate the problem?
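
To make the memcpy() question concrete, here is a minimal sketch. It is not
the actual rte_ring/soring code: the helper name sketch_dequeue_copy() and
its parameters are hypothetical, it assumes a power-of-two ring size, and it
takes the element size as a runtime byte count. The copy length is always
n * esize, so there is no separate 16-byte path. Whether GCC with LTO would
stop warning on such code is untested.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not a DPDK API): copy n elements of esize bytes
 * each from ring storage into obj_table, starting at index head and
 * handling wrap-around at the end of the ring.
 */
static void
sketch_dequeue_copy(const void *ring_data, uint32_t ring_size,
		uint32_t head, void *obj_table, uint32_t esize, uint32_t n)
{
	const uint8_t *src = ring_data;
	uint8_t *dst = obj_table;
	uint32_t idx = head & (ring_size - 1);
	uint32_t first = n;

	/* Assumption of this sketch: ring_size is a power of two. */
	assert((ring_size & (ring_size - 1)) == 0);
	assert(n <= ring_size);

	/* Elements that fit before the end of the ring storage. */
	if (idx + n > ring_size)
		first = ring_size - idx;

	memcpy(dst, src + (size_t)idx * esize, (size_t)first * esize);
	/* Remaining elements (possibly zero) wrap to the start. */
	memcpy(dst + (size_t)first * esize, src, (size_t)(n - first) * esize);
}
```

The trade-off is that a generic memcpy() with a runtime length may be slower
than the size-specialized copy loops, which would need to be measured.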

