On Wed, May 13, 2026 at 1:49 PM David Laight <[email protected]> wrote: > > On Wed, 13 May 2026 09:28:40 -0700 > John Ousterhout <[email protected]> wrote: > > > On Wed, May 13, 2026 at 2:07 AM David Laight > > <[email protected]> wrote: > > > > > > On Tue, 12 May 2026 11:19:53 -0700 > > > John Ousterhout <[email protected]> wrote: > > > > > > > Consider the following sequence of events: > > > > * The bottom half of a buffer page is filled with data from > > > > packet A. The page has a net reference count (reference count > > > > - bias) of 1. The page is returned to the NIC, flipped to > > > > use the top half. > > > > * Before the reference on the page is released, the NIC returns > > > > the page with no data in it ('size' is zero in ice_clean_rx_irq). > > > > In this case the bias does not get decremented. The page still > > > > has a net reference count of 1, so it gets returned to the NIC. > > > > However, ice_put_rx_mbuf flipped the page so that the bottom > > > > half is active. > > > > * If the NIC stores another packet in the page before packet A > > > > has released its reference, the data in packet A will be > > > > overwritten with data from the new packet. > > > > * Unfortunately zero-length buffers occur frequently: they seem > > > > to occur whenever a packet uses every available byte in a > > > > buffer, ending precisely at the end of the buffer. When this > > > > happens the NIC seems to generate an extra zero-length > > > > buffer. > > > > The fix is for ice_put_rx_mbuf not to flip pages that have a > > > > size of 0. > > > > > > How is this different from packet B (in the top half) being > > > freed before packet A (in the bottom half)? > > > > I'm not sure exactly what you're referring to here. Are you asking > > about a situation where both halves of the page get filled with packet > > data and then the second half to be filled is the first to be freed? I > > believe that the ICE driver abandons a page if both halves are ever > > occupied simultaneously; the page will be returned to the system once > > both halves have dropped their references. Thus it doesn't matter > > which half is freed first. > > That is what I was thinking, seems like the logic is over complicated. > > If you need to put 4k pages into some kind of iommu rather than 2k buffers > (to contain 1536 byte ethernet packets) then I'd have thought you'd > initially put both halves into adjacent tx ring entries. > If a rx buffer is discarded (eg a zero length fragment or a CRC error, > or even 'copy break' for short packets) then, as an optimisation, > you could reuse the buffer for another receive. > The same could be done if the page is freed by an application. > > However it sounds like it doesn't use the 2nd half until the first > completes - otherwise you'd never 'flip' to make the other half > active. > > Thinks... > By only putting half of each 4k 'page' into the rx ring the code > will usually save (expensive) iommu setup in the (probably) normal > case where the buffers are freed 'reasonably quickly'. > But that really requires a 'free/with_nic/busy' state for each half > rather then trying to guess from a reference count. > > But if the low-level code is recycling the rx buffer (for any reason) > it wants to use the same buffer. > > The ethernet driver I wrote (a long time ago, early 90s) allocated > 64k as 128 512byte buffers and did an aligned word-sized copy of > every receive frame - most frames were in contiguous memory. > The simplicity of it made up for the cost of the copy, especially > since that was an iommu system.
I'm not here to defend the logic (and it has been replaced with something that is probably simpler and more efficient); I'm just suggesting a bug fix for the stable releases that still have this logic. -John-
