On Wed, May 13, 2026 at 2:07 AM David Laight
<[email protected]> wrote:
>
> On Tue, 12 May 2026 11:19:53 -0700
> John Ousterhout <[email protected]> wrote:
>
> > Consider the following sequence of events:
> > * The bottom half of a buffer page is filled with data from
> >   packet A. The page has a net reference count (reference count
> >   - bias) of 1. The page is returned to the NIC, flipped to
> >   use the top half.
> > * Before the reference on the page is released, the NIC returns
> >   the page with no data in it ('size' is zero in ice_clean_rx_irq).
> >   In this case the bias does not get decremented. The page still
> >   has a net reference count of 1, so it gets returned to the NIC.
> >   However, ice_put_rx_mbuf flipped the page so that the bottom
> >   half is active.
> > * If the NIC stores another packet in the page before packet A
> >   has released its reference, the data in packet A will be
> >   overwritten with data from the new packet.
> > * Unfortunately zero-length buffers occur frequently: they seem
> >   to occur whenever a packet uses every available byte in a
> >   buffer, ending precisely at the end of the buffer. When this
> >   happens the NIC seems to generate an extra zero-length
> >   buffer.
> > The fix is for ice_put_rx_mbuf not to flip pages that have a
> > size of 0.
>
> How is this different from packet B (in the top half) being
> freed before packet A (in the bottom half)?

I'm not sure exactly what you're referring to here. Are you asking
about a situation where both halves of the page get filled with packet
data and the second half to be filled is the first to be freed? I
believe that the ICE driver abandons a page if both halves are ever
occupied simultaneously; the page is returned to the system once
both halves have dropped their references, so it doesn't matter
which half is freed first. The bug is different: the page goes back
to the NIC while packet A still holds its reference, and the half
containing packet A has become the active half again.
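
To make the sequence from the original description concrete, here is a
minimal user-space sketch of it. This is a toy model, not driver code:
every name in it (model_rx_buf, model_put_rx_buf, HALF_SIZE and so on)
is hypothetical, the half-page size is an assumed value, and the reuse
test is simplified to "net reference count of at most 1", as in the
description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define HALF_SIZE 2048u /* assumed half-page size, for illustration only */

/* Toy stand-in for one receive buffer page; not the driver's struct. */
struct model_rx_buf {
	unsigned char page[2 * HALF_SIZE];
	unsigned int offset;   /* half the NIC will write to next */
	unsigned int net_refs; /* models page refcount minus bias */
};

/* NIC side: store len bytes of 'fill' in the currently active half. */
static void model_nic_write(struct model_rx_buf *buf, unsigned char fill,
			    unsigned int len)
{
	assert(len <= HALF_SIZE);
	memset(buf->page + buf->offset, fill, len);
}

/*
 * Driver side: process one completed buffer of the given size.
 * Returns true if the page may be handed back to the NIC.
 */
static bool model_put_rx_buf(struct model_rx_buf *buf, unsigned int size,
			     bool flip_on_zero_len)
{
	if (size != 0)
		buf->net_refs++;          /* the stack now holds a reference */
	if (size != 0 || flip_on_zero_len)
		buf->offset ^= HALF_SIZE; /* "flip" to the other half */
	return buf->net_refs <= 1;        /* at most one half still in use */
}

/* Replay the sequence; report whether packet A's data is still intact. */
static bool packet_a_survives(bool flip_on_zero_len)
{
	struct model_rx_buf buf = { .offset = 0, .net_refs = 0 };

	/* Packet A lands in the bottom half and keeps its reference. */
	model_nic_write(&buf, 'A', HALF_SIZE);
	if (!model_put_rx_buf(&buf, HALF_SIZE, flip_on_zero_len))
		return true; /* page abandoned, nothing can overwrite A */

	/* The NIC returns the page with no data in it. */
	if (!model_put_rx_buf(&buf, 0, flip_on_zero_len))
		return true;

	/* A new packet arrives before packet A drops its reference. */
	model_nic_write(&buf, 'B', HALF_SIZE);
	return buf.page[0] == 'A';
}
```

In this model, packet_a_survives(true) replays the current behaviour
(flip even when size is 0) and packet_a_survives(false) replays the
behaviour with the patch applied.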

> > This patch applies directly to longterm stable versions 6.18.27
> > and 6.12.86; it also seems relevant for 6.6.137 but would need
> > modifications for that version. I have not examined earlier
> > versions.
> >
> > Unfortunately there is no upstream commit id for this patch because
> > the ICE driver has undergone a major revision (libeth refactor and
> > pagepool conversion) that eliminated the buggy code. Thus the
> > problem no longer exists in the main line.
> >
> > Cc: [email protected] # 6.12+
> > Signed-off-by: John Ousterhout <[email protected]>
> > ---
> >  drivers/net/ethernet/intel/ice/ice_txrx.c | 23 ++++++++++++++++++++---
> >  1 file changed, 20 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > index 51c459a3e722..081c7a7392b7 100644
> > --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> > +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> > @@ -1215,6 +1215,13 @@ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> >               xdp_frags = xdp_get_shared_info_from_buff(xdp)->nr_frags;
> >
> >       while (idx != ntc) {
> > +             union ice_32b_rx_flex_desc *rx_desc;
> > +             unsigned int size;
> > +
> > +             rx_desc = ICE_RX_DESC(rx_ring, idx);
> > +             size = le16_to_cpu(rx_desc->wb.pkt_len) &
> > +                    ICE_RX_FLX_DESC_PKT_LEN_M;
> > +
>
> Looks like you only need to calculate 'size' for the !ICE_XDP_CONSUMED path.
> You could also use the (likely cheaper) test for zero:
>                 if (!(rx_desc->wb.pkt_len & cpu_to_le16(ICE_RX_FLX_DESC_PKT_LEN_M)))
>
> -- David
>
> >               buf = &rx_ring->rx_buf[idx];
> >               if (++idx == cnt)
> >                       idx = 0;
> > @@ -1224,10 +1231,20 @@ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> >                * To do this, only adjust pagecnt_bias for fragments up to
> >                * the total remaining after the XDP program has run.
> >                */
> > -             if (verdict != ICE_XDP_CONSUMED)
> > -                     ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
> > -             else if (i++ <= xdp_frags)
> > +             if (verdict != ICE_XDP_CONSUMED) {
> > +                     /* Don't "flip" the page if size is 0: in this case
> > +                      * the data in the current half will not be used so
> > +                      * it's OK to reuse that half. And, since the bias
> > +                      * didn't get decremented for this half, the page can
> > +                      * be returned to the NIC even if the other half is
> > +                      * still in use, so flipping the page could cause
> > +                      * live packet data to be overwritten.
> > +                      */
> > +                     if (size != 0)
> > +                             ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
> > +             } else if (i++ <= xdp_frags) {
> >                       buf->pagecnt_bias++;
> > +             }
> >
> >               ice_put_rx_buf(rx_ring, buf);
> >       }
>
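
On the cheaper zero test suggested above: masking the raw little-endian
field with a byte-swapped constant should give the same zero/non-zero
answer as converting first and masking afterwards, because byte
swapping only permutes bits. The user-space sketch below checks that
for every 16-bit value. The model_* helpers are hypothetical stand-ins
for the kernel's le16 conversions, and the mask value 0x3FFF is an
assumption about ICE_RX_FLX_DESC_PKT_LEN_M.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PKT_LEN_MASK 0x3FFFu /* assumed value of the length mask */

/* A 16-bit value whose in-memory bytes are in little-endian order. */
typedef uint16_t model_le16;

/* Hypothetical stand-in for cpu_to_le16(); works on any host. */
static model_le16 model_cpu_to_le16(uint16_t v)
{
	unsigned char b[2] = { (unsigned char)(v & 0xFF),
			       (unsigned char)(v >> 8) };
	model_le16 out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Hypothetical stand-in for le16_to_cpu(); works on any host. */
static uint16_t model_le16_to_cpu(model_le16 v)
{
	unsigned char b[2];

	memcpy(b, &v, sizeof(b));
	return (uint16_t)(b[0] | (b[1] << 8));
}

/* Zero test as written in the patch: convert, then mask. */
static bool len_is_zero_converted(model_le16 raw)
{
	return (model_le16_to_cpu(raw) & MODEL_PKT_LEN_MASK) == 0;
}

/* Zero test as suggested in review: mask the raw field directly. */
static bool len_is_zero_raw(model_le16 raw)
{
	return !(raw & model_cpu_to_le16(MODEL_PKT_LEN_MASK));
}

/* Compare both tests over every possible 16-bit field value. */
static bool zero_tests_agree(void)
{
	for (uint32_t v = 0; v <= 0xFFFFu; v++) {
		model_le16 raw = model_cpu_to_le16((uint16_t)v);

		assert(model_le16_to_cpu(raw) == v); /* round-trip sanity */
		if (len_is_zero_converted(raw) != len_is_zero_raw(raw))
			return false;
	}
	return true;
}
```

The raw form moves the conversion onto a constant, which is where any
saving on a big-endian host would come from; on a little-endian host
both forms should compile to the same mask-and-test.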
