On Tue, 20 Jan 2026 15:40:21 +0000
Konstantin Ananyev <[email protected]> wrote:

> > > From: Stephen Hemminger [mailto:[email protected]]
> > > Sent: Tuesday, 20 January 2026 15.34
> > >
> > > On Tue, 20 Jan 2026 09:49:44 +0100
> > > Morten Brørup <[email protected]> wrote:
> > >  
> > > > > From: Stephen Hemminger [mailto:[email protected]]
> > > > > Sent: Monday, 19 January 2026 23.48
> > > > >
> > > > > On Fri, 16 Jan 2026 10:32:52 +0100
> > > > > Morten Brørup <[email protected]> wrote:
> > > > >  
> > > > > > > From: Stephen Hemminger [mailto:[email protected]]
> > > > > > > Sent: Friday, 16 January 2026 07.46
> > > > > > >
> > > > > > > When building with LTO (Link Time Optimization), GCC performs
> > > > > > > aggressive cross-compilation-unit inlining. This causes the  
> > > > > compiler  
> > > > > > > to analyze all code paths in __rte_ring_do_dequeue_elems(),  
> > > > > including  
> > > > > > > the 16-byte element path (__rte_ring_dequeue_elems_128), even  
> > > when  
> > > > > > > the runtime element size is only 4 bytes.
> > > > > > >
> > > > > > > The static analyzer sees that the 16-byte path would copy
> > > > > > > 32 elements * 16 bytes = 512 bytes into a 128-byte buffer
> > > > > > > (uint32_t[32]),
> > > > > > > triggering -Wstringop-overflow warnings.  
> > > >
> > > > The element size is not an inline function parameter, but fetched  
> > > from the "esize" field in the rte_soring structure, so the compiler
> > > cannot see that the element size is 4 bytes. And thus it needs to
> > > consider all possible element sizes.  
> > > >  
> > > > > > >
> > > > > > > The existing #pragma GCC diagnostic suppression in  
> > > > > rte_ring_elem_pvt.h  
> > > > > > > doesn't help because with LTO the warning context shifts to the  
> > > > > test  
> > > > > > > file where the inlined code is instantiated.
> > > > > > >
> > > > > > > Fix by sizing all buffers passed to soring acquire/dequeue  
> > > > > functions  
> > > > > > > for the worst-case element size (16 bytes = 4 *  
> > > sizeof(uint32_t)).  
> > > > > > > This satisfies the static analyzer without changing runtime  
> > > > > behavior.  
> > > > > >
> > > > > > Using wildly oversized buffers doesn't seem like a recommendable  
> > > > > solution.  
> > > > > > If the ring library is ever updated to support cache size  
> > > elements  
> > > > > (64 byte), the buffers would have to be oversize by factor 16.
> > > > >
> > > > > The analysis (from AI) is that compiler is getting confused.  
> > > >
> > > > That would be my analysis too.
> > > >  
> > > > > Since there is no good
> > > > > way other than turning of LTO for the test to tell the compiler  
> > > >
> > > > There is another way to tell the compiler: __rte_assume()  
> > >
> > > Tried that but it doesn't work because doesn't get propagated deep
> > > enough to impact here.  
> > 
> > Does this fix generally imply that when using LTO, using an SORING with 
> > elements
> > smaller than 16 bytes requires oversize buffers?
> > That's not good. :-(
> > 
> > The SORING is still experimental.
> > Maybe the element size and metadata size need to be passed as parameters to
> > the SORING functions, like the RING functions take element size as parameter
> > (except the functions that are hardcoded for using pointers as element 
> > size).  
> 
> Personally, I am not a big fan of such idea...
> Wonder is that possible just to disable LTO for soring.o?
> Another thought - if all the problems come from 128 bit version of 
> enque/dequeue,
> would using memcpy() instead  of specific functions help to mitigate the 
> problem?  
> 
> 

I did some more experiments using pragmas and attributes.
The good news is that it works for some versions of GCC.
The bad news is that pragmas and per-function optimization changes
seem to crash old compilers, and they are disabled in GCC 16, which reports:

../app/test/test_soring.c:154:1: warning: bad option ‘-fno-lto’ to attribute ‘optimize’ [-Wattributes]
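
For reference, the size mismatch behind the original -Wstringop-overflow
report (quoted above) can be checked with a few lines of C. This is only
a minimal sketch of the arithmetic: the macro and function names are
hypothetical and are not part of the DPDK ring or soring API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants for illustration; not DPDK definitions. */
#define BURST_ELEMS   32u  /* elements per dequeue in the test */
#define ESIZE_RUNTIME 4u   /* element size actually used: sizeof(uint32_t) */
#define ESIZE_WORST   16u  /* largest element path (128-bit copy) */

/* The test buffer, uint32_t[32], holds 128 bytes. */
static_assert(sizeof(uint32_t[BURST_ELEMS]) == 128,
	      "test buffer is 128 bytes");

/* Bytes that may be written when copying n elements of esize bytes each. */
static inline size_t copy_bytes(size_t n, size_t esize)
{
	return n * esize;
}

/* Number of uint32_t slots needed to cover the worst-case copy. */
static inline size_t worst_case_slots(size_t n)
{
	return copy_bytes(n, ESIZE_WORST) / sizeof(uint32_t);
}
```

With these assumed values, the 4-byte path fits the buffer exactly, while
the 16-byte path the compiler also considers would need 4 times as many
uint32_t slots, which is the oversizing the patch applies.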



