> > When LTO is enabled, GCC inlines through the entire soring call chain
> > from test code into the ring element copy functions. With always_inline,
> > the compiler is forced to inline __rte_ring_dequeue_elems_128() which
> > copies 32 bytes per element. GCC's static analysis then warns about
> > potential buffer overflow because it cannot prove the 128-bit element
> > path is unreachable when the ring is configured for 4-byte elements:
> >
> >   warning: writing 32 bytes into a region of size 0 [-Wstringop-overflow=]
> >
> > By using plain inline instead of always_inline on the soring enqueue
> > and dequeue functions, the compiler regains discretion over inlining
> > decisions. This introduces an analysis boundary that prevents GCC from
> > connecting the test's buffer sizes to the unreachable 128-bit code path,
> > eliminating the false positive warning.
> >
> > Performance impact is expected to be negligible. At -O2/-O3, the
> > compiler will still inline these small, hot functions based on its
> > own heuristics. The difference only matters in debug builds or with
> > -Os, where slightly less aggressive inlining is acceptable.
> >
> > Signed-off-by: Stephen Hemminger <[email protected]>
> > ---
> >  lib/ring/soring.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/lib/ring/soring.c b/lib/ring/soring.c
> > index 797484d6bf..3b90521bdb 100644
> > --- a/lib/ring/soring.c
> > +++ b/lib/ring/soring.c
> > @@ -249,7 +249,7 @@ __rte_soring_stage_move_head(struct soring_stage_headtail *d,
> >     return n;
> >  }
> >
> > -static __rte_always_inline uint32_t
> > +static inline uint32_t
> >  soring_enqueue(struct rte_soring *r, const void *objs,
> >     const void *meta, uint32_t n, enum rte_ring_queue_behavior behavior,
> >     uint32_t *free_space)
> > @@ -278,7 +278,7 @@ soring_enqueue(struct rte_soring *r, const void *objs,
> >     return n;
> >  }
> >
> > -static __rte_always_inline uint32_t
> > +static inline uint32_t
> >  soring_dequeue(struct rte_soring *r, void *objs, void *meta,
> >     uint32_t num, enum rte_ring_queue_behavior behavior,
> >     uint32_t *available)
> > --
> 
> Ran a quick test; no performance degradation noticed.
> Acked-by: Konstantin Ananyev <[email protected]>
> Tested-by: Konstantin Ananyev <[email protected]>

We want LTO to work, and the alternatives discussed in another thread [1] are 
not popular.
Acked-by: Morten Brørup <[email protected]>

[1]: https://inbox.dpdk.org/dev/[email protected]/
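
For readers following along, here is a minimal sketch of the pattern the quoted commit message describes: a runtime dispatch on element size whose wide-element path copies 32 bytes at a time, wrapped by an entry point that is plain inline rather than always_inline. All names (DEMO_ALWAYS_INLINE, demo_copy_elems_128, demo_copy_elems_32, demo_copy_elems, demo_dequeue) are hypothetical stand-ins and not the DPDK code, and the sketch is not claimed to reproduce the warning itself; it only illustrates where the inlining boundary sits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration only; this is not DPDK code. */
#if defined(__GNUC__)
#define DEMO_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define DEMO_ALWAYS_INLINE inline
#endif

/* 16-byte element path: copies two elements (32 bytes) per memcpy,
 * then one trailing element if n is odd. */
static DEMO_ALWAYS_INLINE void
demo_copy_elems_128(void *dst, const void *src, uint32_t n)
{
	uint8_t *d = dst;
	const uint8_t *s = src;
	uint32_t i;

	for (i = 0; i + 1 < n; i += 2)
		memcpy(d + 16 * (size_t)i, s + 16 * (size_t)i, 32);
	if (i < n)
		memcpy(d + 16 * (size_t)i, s + 16 * (size_t)i, 16);
}

/* 4-byte element path. */
static DEMO_ALWAYS_INLINE void
demo_copy_elems_32(void *dst, const void *src, uint32_t n)
{
	memcpy(dst, src, (size_t)n * 4);
}

/* Runtime dispatch on element size; this sketch assumes esize is
 * either 4 or 16. When everything is force-inlined into a caller that
 * passes 4-byte buffers, the compiler still sees the 32-byte memcpy
 * unless it can prove esize != 16. */
static DEMO_ALWAYS_INLINE void
demo_copy_elems(void *dst, const void *src, uint32_t esize, uint32_t n)
{
	if (esize == 16)
		demo_copy_elems_128(dst, src, n);
	else
		demo_copy_elems_32(dst, src, n);
}

/* Plain inline entry point: the compiler decides whether to inline it
 * into callers, which is the change the patch makes. */
static inline uint32_t
demo_dequeue(void *dst, const void *src, uint32_t esize, uint32_t n)
{
	demo_copy_elems(dst, src, esize, n);
	return n;
}
```

Swapping `static inline` for `static DEMO_ALWAYS_INLINE` on demo_dequeue() restores the forced-inlining variant, which makes it easy to compare the two under the same compiler flags.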
