On Tue, 12 May 2026 11:19:53 -0700
John Ousterhout <[email protected]> wrote:
> Consider the following sequence of events:
> * The bottom half of a buffer page is filled with data from
> packet A. The page has a net reference count (reference count
> - bias) of 1. The page is returned to the NIC, flipped to
> use the top half.
> * Before the reference on the page is released, the NIC returns
> the page with no data in it ('size' is zero in ice_clean_rx_irq).
> In this case the bias does not get decremented. The page still
> has a net reference count of 1, so it gets returned to the NIC.
> However, ice_put_rx_mbuf flipped the page so that the bottom
> half is active.
> * If the NIC stores another packet in the page before packet A
> has released its reference, the data in packet A will be
> overwritten with data from the new packet.
> * Unfortunately zero-length buffers occur frequently: they seem
> to occur whenever a packet uses every available byte in a
> buffer, ending precisely at the end of the buffer. When this
> happens the NIC seems to generate an extra zero-length
> buffer.
> The fix is for ice_put_rx_mbuf not to flip pages that have a
> size of 0.
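The sequence above can be replayed in a small userspace model. This is
only a sketch of the description, not driver code: toy_page, toy_rx and
toy_replay are hypothetical names, the 2048-byte size is arbitrary, and
the "reuse while net references <= 1" rule is taken from the text above
rather than from ice_txrx.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one two-half RX page (hypothetical, not the ice structs). */
struct toy_page {
	int net_refs;    /* models (reference count - bias) */
	int active_half; /* 0 = bottom, 1 = top: where the NIC writes next */
	int owner[2];    /* id of packet still referencing each half, 0 = none */
	bool with_nic;   /* page was handed back to the NIC for reuse */
};

/* Process one returned buffer; returns true if live data was overwritten. */
static bool toy_rx(struct toy_page *pg, int pkt_id, unsigned int size,
		   bool skip_flip_on_zero)
{
	bool clobbered = false;

	if (size != 0) {
		/* The NIC wrote into the active half. */
		clobbered = pg->owner[pg->active_half] != 0;
		pg->owner[pg->active_half] = pkt_id;
		pg->net_refs++; /* models the bias being decremented */
	}

	/* Unpatched behaviour always flips; the fix skips zero-size buffers. */
	if (size != 0 || !skip_flip_on_zero)
		pg->active_half ^= 1;

	/* Assumed reuse rule: at most one outstanding reference. */
	pg->with_nic = pg->net_refs <= 1;
	return clobbered;
}

/* Packet A, then a zero-length buffer, then another packet. */
static bool toy_replay(bool skip_flip_on_zero)
{
	struct toy_page pg = { .net_refs = 0, .active_half = 0,
			       .owner = { 0, 0 }, .with_nic = true };
	bool clobbered = false;

	clobbered |= toy_rx(&pg, 1, 2048, skip_flip_on_zero);
	if (pg.with_nic)
		clobbered |= toy_rx(&pg, 0, 0, skip_flip_on_zero);
	if (pg.with_nic)
		clobbered |= toy_rx(&pg, 2, 2048, skip_flip_on_zero);
	return clobbered;
}
```

In this model toy_replay(false) reports an overwrite of packet A's half,
while toy_replay(true) places the next packet in the top half and then
stops reusing the page because two references are outstanding.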
How is this different from packet B (in the top half) being
freed before packet A (in the bottom half)?
> This patch applies directly to longterm stable versions 6.18.27
> and 6.12.86; it also seems relevant for 6.6.137 but would need
> modifications for that version. I have not examined earlier
> versions.
>
> Unfortunately there is no upstream commit id for this patch because
> the ICE driver has undergone a major revision (libeth refactor and
> pagepool conversion) that eliminated the buggy code. Thus the
> problem no longer exists in the main line.
>
> Cc: [email protected] # 6.12+
> Signed-off-by: John Ousterhout <[email protected]>
> ---
> drivers/net/ethernet/intel/ice/ice_txrx.c | 23 ++++++++++++++++++++---
> 1 file changed, 20 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
> index 51c459a3e722..081c7a7392b7 100644
> --- a/drivers/net/ethernet/intel/ice/ice_txrx.c
> +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
> @@ -1215,6 +1215,13 @@ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> xdp_frags = xdp_get_shared_info_from_buff(xdp)->nr_frags;
>
> while (idx != ntc) {
> + union ice_32b_rx_flex_desc *rx_desc;
> + unsigned int size;
> +
> + rx_desc = ICE_RX_DESC(rx_ring, idx);
> + size = le16_to_cpu(rx_desc->wb.pkt_len) &
> + ICE_RX_FLX_DESC_PKT_LEN_M;
> +
Looks like you only need to calculate 'size' for the !ICE_XDP_CONSUMED path.
You could also use the (likely cheaper) test for zero:
	if (!(rx_desc->wb.pkt_len &
	      cpu_to_le16(ICE_RX_FLX_DESC_PKT_LEN_M)))
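A userspace sketch of why the two zero tests should agree: a byte swap
only permutes bits, so masking the little-endian value with the
converted mask is zero exactly when the CPU-order length is zero. The
sketch_* helpers are stand-ins for the kernel's cpu_to_le16() and
le16_to_cpu(), and the 0x3FFF mask value is an assumption about
ICE_RX_FLX_DESC_PKT_LEN_M.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PKT_LEN_MASK 0x3FFFu /* assumed value of ICE_RX_FLX_DESC_PKT_LEN_M */

/* Stand-in for cpu_to_le16(): lay the value out as little-endian bytes. */
static uint16_t sketch_cpu_to_le16(uint16_t v)
{
	uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
	uint16_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Stand-in for le16_to_cpu(): read little-endian bytes back. */
static uint16_t sketch_le16_to_cpu(uint16_t v)
{
	uint8_t b[2];

	memcpy(b, &v, sizeof(b));
	return (uint16_t)(b[0] | (b[1] << 8));
}

/* Test as written in the patch: convert the field, then mask. */
static int len_is_zero_convert(uint16_t le_pkt_len)
{
	return (sketch_le16_to_cpu(le_pkt_len) & PKT_LEN_MASK) == 0;
}

/* Suggested test: mask the raw field with a converted constant. */
static int len_is_zero_masked(uint16_t le_pkt_len)
{
	return !(le_pkt_len & sketch_cpu_to_le16(PKT_LEN_MASK));
}

/* Compare both tests over every possible 16-bit descriptor value. */
static unsigned int count_mismatches(void)
{
	unsigned int bad = 0;

	for (uint32_t v = 0; v <= 0xFFFFu; v++) {
		uint16_t le = sketch_cpu_to_le16((uint16_t)v);

		if (len_is_zero_convert(le) != len_is_zero_masked(le))
			bad++;
	}
	return bad;
}
```

In the kernel the converted mask is a compile-time constant, so the
masked form would avoid a byte swap of the loaded field on big-endian
machines; on little-endian machines both forms should compile to the
same thing.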
-- David
> buf = &rx_ring->rx_buf[idx];
> if (++idx == cnt)
> idx = 0;
> @@ -1224,10 +1231,20 @@ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
> * To do this, only adjust pagecnt_bias for fragments up to
> * the total remaining after the XDP program has run.
> */
> - if (verdict != ICE_XDP_CONSUMED)
> - ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
> - else if (i++ <= xdp_frags)
> + if (verdict != ICE_XDP_CONSUMED) {
> + /* Don't "flip" the page if size is 0: in this case
> + * the data in the current half will not be used so
> + * it's OK to reuse that half. And, since the bias
> + * didn't get decremented for this half, the page can
> + * be returned to the NIC even if the other half is
> + * still in use, so flipping the page could cause
> + * live packet data to be overwritten.
> + */
> + if (size != 0)
> + ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
> + } else if (i++ <= xdp_frags) {
> buf->pagecnt_bias++;
> + }
>
> ice_put_rx_buf(rx_ring, buf);
> }