Consider the following sequence of events:
* The bottom half of a buffer page is filled with data from
  packet A. The page has a net reference count (reference count
  - bias) of 1. The page is returned to the NIC, flipped to
  use the top half.
* Before the reference on the page is released, the NIC returns
  the page with no data in it ('size' is zero in ice_clean_rx_irq).
  In this case the bias does not get decremented. The page still
  has a net reference count of 1, so it gets returned to the NIC.
  However, ice_put_rx_mbuf has flipped the page again, so the
  bottom half (still holding packet A's data) is active.
* If the NIC stores another packet in the page before packet A
  has released its reference, the data in packet A will be
  overwritten with data from the new packet.
* Unfortunately, zero-length buffers occur frequently: whenever
  a packet uses every available byte in a buffer, ending
  precisely at the end of the buffer, the NIC seems to generate
  an extra zero-length buffer.
The fix is for ice_put_rx_mbuf not to flip the page when the
buffer has a size of 0.

This patch applies directly to longterm stable versions 6.18.27
and 6.12.86; it also seems relevant for 6.6.137 but would need
modifications for that version. I have not examined earlier
versions.

Unfortunately there is no upstream commit id for this patch because
the ICE driver has undergone a major revision (libeth refactor and
pagepool conversion) that eliminated the buggy code. Thus the
problem no longer exists in mainline.

Cc: [email protected] # 6.12+
Signed-off-by: John Ousterhout <[email protected]>
---
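The sequence described in the commit message can be reproduced with a
small userspace model. This is only an illustrative sketch, not driver
code: struct toy_page, toy_rx() and toy_scenario() are hypothetical
names, the refcount and bias fields merely stand in for page_count()
and pagecnt_bias, and the reuse test is reduced to the single
"net reference count <= 1" check described above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one Rx page split into two halves. All names here are
 * hypothetical; this is not the driver's data structure.
 */
struct toy_page {
	int refcount;	/* stands in for page_count() */
	int bias;	/* stands in for pagecnt_bias */
	int active;	/* half the NIC writes next: 0 = bottom, 1 = top */
	int owner[2];	/* packet whose data is live in each half, 0 = none */
	bool corrupted;	/* set if the NIC overwrote live packet data */
};

/* Process one buffer handed back by the NIC. Returns true if the page
 * may be returned to the NIC for reuse.
 */
static bool toy_rx(struct toy_page *p, int size, int pkt,
		   bool skip_flip_on_zero)
{
	if (size != 0) {
		/* The NIC stored data in the active half. */
		if (p->owner[p->active] != 0)
			p->corrupted = true;
		p->owner[p->active] = pkt;
		/* One reference is handed to the stack. */
		p->bias--;
	}

	/* Without the fix the page is flipped unconditionally. */
	if (size != 0 || !skip_flip_on_zero)
		p->active ^= 1;

	assert(p->refcount >= p->bias);
	/* Reusable only while the net reference count is at most 1. */
	return p->refcount - p->bias <= 1;
}

/* Replay the sequence: packet A, a zero-length buffer, then packet B,
 * with packet A never released. Returns true if A was overwritten.
 */
bool toy_scenario(bool skip_flip_on_zero)
{
	struct toy_page p = { .refcount = 100, .bias = 100 };

	/* Packet A lands in the bottom half; the page flips to the top. */
	if (!toy_rx(&p, 2048, 1, skip_flip_on_zero))
		return p.corrupted;
	/* Zero-length buffer: bias is not decremented. */
	if (!toy_rx(&p, 0, 0, skip_flip_on_zero))
		return p.corrupted;
	/* Packet B arrives before packet A drops its reference. */
	toy_rx(&p, 2048, 2, skip_flip_on_zero);
	return p.corrupted;
}
```

In this model toy_scenario(false) reports that packet A's half was
overwritten, while toy_scenario(true), which skips the flip for a
zero-length buffer, does not.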
 drivers/net/ethernet/intel/ice/ice_txrx.c | 23 ++++++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index 51c459a3e722..081c7a7392b7 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -1215,6 +1215,13 @@ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
                xdp_frags = xdp_get_shared_info_from_buff(xdp)->nr_frags;
 
        while (idx != ntc) {
+               union ice_32b_rx_flex_desc *rx_desc;
+               unsigned int size;
+
+               rx_desc = ICE_RX_DESC(rx_ring, idx);
+               size = le16_to_cpu(rx_desc->wb.pkt_len) &
+                      ICE_RX_FLX_DESC_PKT_LEN_M;
+
                buf = &rx_ring->rx_buf[idx];
                if (++idx == cnt)
                        idx = 0;
@@ -1224,10 +1231,20 @@ static void ice_put_rx_mbuf(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp,
                 * To do this, only adjust pagecnt_bias for fragments up to
                 * the total remaining after the XDP program has run.
                 */
-               if (verdict != ICE_XDP_CONSUMED)
-                       ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
-               else if (i++ <= xdp_frags)
+               if (verdict != ICE_XDP_CONSUMED) {
+                       /* Don't "flip" the page if size is 0: in this case
+                        * the data in the current half will not be used so
+                        * it's OK to reuse that half. And, since the bias
+                        * didn't get decremented for this half, the page can
+                        * be returned to the NIC even if the other half is
+                        * still in use, so flipping the page could cause
+                        * live packet data to be overwritten.
+                        */
+                       if (size != 0)
+                               ice_rx_buf_adjust_pg_offset(buf, xdp->frame_sz);
+               } else if (i++ <= xdp_frags) {
                        buf->pagecnt_bias++;
+               }
 
                ice_put_rx_buf(rx_ring, buf);
        }
-- 
2.43.0
