http://bugs.dpdk.org/show_bug.cgi?id=1886

            Bug ID: 1886
           Summary: net/avp: tx_burst callbacks free mbufs then return
                    short count, causing double-free
           Product: DPDK
           Version: 25.11
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Normal
         Component: ethdev
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

Recent audit of how tx_pkt_burst is (or is not) being handled in drivers found
this.

The AVP PMD tx_burst callbacks (`avp_xmit_pkts` and `avp_xmit_scattered_pkts`)
violate the `rte_eth_tx_burst()` mbuf ownership contract.  Both functions free
all mbufs they process, but return the result of `avp_fifo_put()` which may be
smaller.  When the caller follows the standard pattern:

```c
n = rte_eth_tx_burst(port, txq, mbufs, nb_pkts);
for (i = n; i < nb_pkts; i++)
    rte_pktmbuf_free(mbufs[i]);
```

any mbufs that were freed by the driver but not reported as consumed will be
freed a second time by the caller.

There are several additional bugs in the same code paths, listed below.

### Bug 1: Double-free in avp_xmit_pkts (avp_ethdev.c ~line 1917 vs 1926)

The function frees each mbuf at line 1917 (`rte_pktmbuf_free(m)`) inside the
processing loop which runs for `count` packets.  At line 1924 it calls
`avp_fifo_put()` and returns the FIFO result `n` at line 1926.  If the FIFO
fills between the earlier free-count check and the put, `n < count`, and
`tx_pkts[n..count-1]` get double-freed.

Fix: return `count` instead of `n`, since all `count` mbufs have already been
consumed.

### Bug 2: Double-free in avp_xmit_scattered_pkts (line 1793 vs 1809)

Same pattern.  All `nb_pkts` mbufs are freed in the loop (line 1793) but
`avp_fifo_put()` result is returned (line 1809).

Fix: return `nb_pkts` instead of `n`.

### Bug 3: AVP host buffer leak on partial avp_fifo_get (lines 1862-1866)

```c
n = avp_fifo_get(alloc_q, (void **)&avp_bufs, count);
if (unlikely(n != count)) {
    txq->errors++;
    return 0;
}
```

If `avp_fifo_get()` returns a partial result (0 < n < count), those `n` AVP
buffers have been dequeued from the allocation queue but are never returned.
They are permanently leaked.  The same pattern exists in
`avp_xmit_scattered_pkts` at lines 1769-1774.

Fix: return partial buffers to alloc_q before returning:

```c
if (unlikely(n != count)) {
    if (n > 0)
        avp_fifo_put(alloc_q, (void **)&avp_bufs[0], n);
    return 0;
}
```

### Bug 4: Wrong variable in error comparison (line 1806)

```c
if (unlikely(n != orig_nb_pkts))
    txq->errors += (orig_nb_pkts - n);
```

`n` is the result of putting `nb_pkts` entries, but it is compared against
`orig_nb_pkts` which is the original caller count before clamping by
`AVP_MAX_TX_BURST` and resource limits.  These values differ whenever the
burst was clamped, so the error count is wrong.

Fix: compare against `nb_pkts`.

### Bug 5: Spurious error counting for unconsumed packets

Both TX functions increment `txq->errors` for all `nb_pkts` when returning 0
due to resource exhaustion or detach (lines 1714, 1761, 1832, 1855).  Since
the function returns 0, no mbufs were consumed — the caller still owns all of
them.  These are not transmission errors; they are backpressure.  Inflating
`oerrors` is misleading.

Fix: remove the error increments on the early-return-zero paths.

### Bug 6: Oversized packets silently truncated (lines 1886-1899)

When a packet exceeds both `guest_mbuf_size` and `host_mbuf_size`,
`avp_xmit_pkts` truncates it to the smaller buffer size and transmits a
corrupt frame.  Per the tx_burst contract, the driver should consume and
free the mbuf and count it in tx_errors, not send garbage.

Fix: set the AVP buffer length to 0 (skip the copy) and count the error.

## Steps to Reproduce

Code inspection of `drivers/net/avp/avp_ethdev.c` at tag v26.03-rc1.

The double-free (bugs 1-2) can trigger when the tx FIFO fills between the
`avp_fifo_free_count()` check and the `avp_fifo_put()` call.  This is a race
against the host consumer and is timing-dependent.

The buffer leak (bug 3) triggers when `avp_fifo_get()` from the alloc queue
returns a partial result, which can occur under host-side memory pressure.

## Suggested Patch

A patch addressing all six issues is available.  The key changes are:

- Return the number of mbufs freed (consumed) rather than the FIFO put count
- Return partially dequeued AVP buffers on allocation failure
- Remove error counting for unconsumed packets (backpressure is not an error)
- Drop oversized packets instead of truncating
- Fix the orig_nb_pkts vs nb_pkts comparison

-- 
You are receiving this mail because:
You are the assignee for the bug.

Reply via email to