On Wed, Mar 25, 2026 at 09:52:37AM -0700, Stephen Hemminger wrote:
> On Wed, 25 Mar 2026 16:22:45 +0000
> Bruce Richardson <[email protected]> wrote:
> 
> > On Wed, Mar 25, 2026 at 09:19:21AM -0700, Stephen Hemminger wrote:
> > > On Wed, 25 Mar 2026 10:36:56 +0100
> > > Morten Brørup <[email protected]> wrote:
> > >   
> > > > If an application clones packets instead of copying them, it is 
> > > > probably for performance reasons.
> > > > If the drivers start copying those clones, it may defeat the 
> > > > performance purpose.
> > > > 
> > > > <brainstorming>
> > > > Maybe segmentation can be used instead of copying the full packet:
> > > > Make the "copy" packet of two (or more) segments, where the header is 
> > > > copied into a new mbuf (where the VLAN tag is added), and the remaining 
> > > > part of the packet uses an indirect mbuf referring to the "original" 
> > > > packet at the offset after the header.
> > > > </brainstorming>
> > > > 
> > > > Furthermore...
> > > > If drivers start copying packets in the Tx function, the Tx queue 
> > > > should have its own mbuf pool to allocate these mbufs from.
> > > > Drivers should not steal mbufs from the pools used by the packets being 
> > > > transmitted.
> > > > E.g. if a segmented packet has a small mbuf for the first few bytes, 
> > > > followed by a large mbuf (from another pool) for the remaining bytes.
> > > > Or if the "original" mbuf comes from a mempool allocated on different 
> > > > CPU socket, the "copy" would too.  
> > > 
> > > 
> > > The problem with the Tx function is how backpressure gets handled.
> > > Not sure that it is documented well enough that if a packet is not sent
> > > due to backpressure, the mbuf in the array may still have been replaced.  
> > 
> > Most drivers should be able to check for space in a Tx ring, or whatever
> > other backpressure mechanism is being used, before modifying a buffer.
> > 
> > /Bruce
> 
> Not in case of drivers that need syscall to push packets.

Modifications will have to be rolled back in that case. Alternatively, the
driver just doesn't offer the offload, which is IMHO a perfectly reasonable
approach to take.
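
To make the rollback option concrete, here is a rough sketch in plain C.
It is not taken from any DPDK driver: struct toy_pkt, toy_vlan_insert(),
toy_vlan_strip(), toy_tx_one() and the send callbacks are all hypothetical
stand-ins, and it assumes the driver is the sole owner of the buffer (i.e.
not a clone or indirect mbuf) and that the backend can only report
backpressure after being handed the already modified frame.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOY_HEADROOM  16  /* spare bytes in front of the packet data */
#define VLAN_HLEN      4  /* 802.1Q tag: TPID (2 bytes) + TCI (2 bytes) */
#define ETH_ADDRS_LEN 12  /* destination MAC + source MAC */

/* Hypothetical stand-in for an mbuf: one buffer with headroom. */
struct toy_pkt {
	uint8_t buf[TOY_HEADROOM + 64];
	uint16_t data_off;  /* offset of the first packet byte in buf */
	uint16_t data_len;  /* number of valid packet bytes */
};

/* Hypothetical backend; returns false when it cannot take the frame. */
typedef bool (*toy_send_fn)(const uint8_t *data, uint16_t len);

/* Software VLAN insert: slide the MACs into the headroom, write the tag. */
bool toy_vlan_insert(struct toy_pkt *p, uint16_t tci)
{
	if (p->data_off < VLAN_HLEN || p->data_len < ETH_ADDRS_LEN)
		return false;

	uint8_t *old_head = p->buf + p->data_off;
	uint8_t *head = old_head - VLAN_HLEN;

	memmove(head, old_head, ETH_ADDRS_LEN);
	head[12] = 0x81;                  /* TPID 0x8100, network order */
	head[13] = 0x00;
	head[14] = (uint8_t)(tci >> 8);   /* TCI, network order */
	head[15] = (uint8_t)(tci & 0xff);

	p->data_off -= VLAN_HLEN;
	p->data_len += VLAN_HLEN;
	return true;
}

/* Exact inverse of toy_vlan_insert(): slide the MACs back over the tag. */
void toy_vlan_strip(struct toy_pkt *p)
{
	assert(p->data_len >= ETH_ADDRS_LEN + VLAN_HLEN);

	uint8_t *head = p->buf + p->data_off;

	memmove(head + VLAN_HLEN, head, ETH_ADDRS_LEN);
	p->data_off += VLAN_HLEN;
	p->data_len -= VLAN_HLEN;
}

/* Tx of one packet: modify, try to send, undo the change on backpressure. */
bool toy_tx_one(struct toy_pkt *p, uint16_t tci, toy_send_fn send)
{
	if (!toy_vlan_insert(p, tci))
		return false;
	if (send(p->buf + p->data_off, p->data_len))
		return true;

	toy_vlan_strip(p);  /* caller gets the packet back unmodified */
	return false;
}

/* Demo backends: one that always accepts, one that always pushes back. */
bool toy_send_ok(const uint8_t *data, uint16_t len)
{
	(void)data; (void)len;
	return true;
}

bool toy_send_busy(const uint8_t *data, uint16_t len)
{
	(void)data; (void)len;
	return false;
}
```

With toy_send_busy as the backend, toy_tx_one() returns false and leaves
data_off, data_len and the frame bytes as they were on entry, so the
caller can retry the same packet later. This only works when the data is
not shared; for cloned packets an in-place change is visible to the other
references even if it is undone afterwards.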

/Bruce
