On Wed, Feb 18, 2026 at 09:48:04AM +0100, Morten Brørup wrote:
> > From: Stephen Hemminger [mailto:[email protected]]
> > Sent: Monday, 16 February 2026 19.00
> > 
> > The documentation for rte_eth_tx_burst() uses the word "sent" to
> > describe the return value, which is misleading. Packets returned as
> > consumed may not have been transmitted yet; they have been accepted
> > by the driver and are no longer the caller's responsibility.
> > 
> > This matters because the common usage pattern is:
> > 
> >     n = rte_eth_tx_burst(port, txq, mbufs, nb_pkts);
> >     for (i = n; i < nb_pkts; i++)
> >         rte_pktmbuf_free(mbufs[i]);
> > 
> > For this to work correctly, the contract must be:
> >  - tx_pkts[0..n-1]: ownership transferred to the driver.
> >  - tx_pkts[n..nb_pkts-1]: untouched, still owned by the caller.
> > 
> > Several drivers (and AI-assisted reviews) misinterpret the current
> > wording and treat packets with errors as unconsumed, returning a
> > short count. This causes callers to retry those packets indefinitely.
> > The correct behavior is that the driver must consume (and free)
> > erroneous packets, counting them via tx_errors.
> > 
> > Replace "sent" with "consumed" in the return value description,
> > spell out the mbuf ownership contract, clarify the error handling
> > expectation, and update the @return block to match.
> > 
> > Signed-off-by: Stephen Hemminger <[email protected]>
> > ---
> >  lib/ethdev/rte_ethdev.h | 21 ++++++++++++++++-----
> >  1 file changed, 16 insertions(+), 5 deletions(-)
> > 
> > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> > index 0d8e2d0236..9e49c4a945 100644
> > --- a/lib/ethdev/rte_ethdev.h
> > +++ b/lib/ethdev/rte_ethdev.h
> > @@ -6639,13 +6639,24 @@ uint16_t rte_eth_call_tx_callbacks(uint16_t
> > port_id, uint16_t queue_id,
> >   * of the ring.
> >   *
> >   * The rte_eth_tx_burst() function returns the number of packets it
> > - * actually sent. A return value equal to *nb_pkts* means that all
> > packets
> > - * have been sent, and this is likely to signify that other output
> > packets
> > + * has consumed from the *tx_pkts* array. The driver takes ownership
> > of
> > + * the mbufs for all consumed packets (tx_pkts[0] to tx_pkts[n-1]);
> > + * the caller must not access them afterward. The remaining packets
> > + * (tx_pkts[n] to tx_pkts[nb_pkts-1]) are not modified and remain the
> > + * caller's responsibility.
> > + *
> > + * A return value equal to *nb_pkts* means that all packets have been
> > + * consumed, and this is likely to signify that other output packets
> >   * could be immediately transmitted again. Applications that implement
> > a
> >   * "send as many packets to transmit as possible" policy can check
> > this
> >   * specific case and keep invoking the rte_eth_tx_burst() function
> > until
> >   * a value less than *nb_pkts* is returned.
> >   *
> > + * If a packet cannot be transmitted due to an error (for example, an
> > + * invalid offload flag), the driver must still consume it and free
> > the
> > + * mbuf, rather than stopping at that point. Such packets should be
> > + * counted in the *tx_errors* port statistic.
> 
> The above paragraph is driver centric, it should be application centric.
> Suggest rephrasing as:
> 
> If a packet cannot be transmitted due to an error (for example, an invalid 
> offload flag), the rte_eth_tx_burst() function will still consume it, rather 
> than stopping at that point.
> Such packets are counted in the *oerrors* port statistic.
> 
> NB: In struct rte_eth_stats [1], the error counter is named "oerrors", not 
> "tx_errors".
> 
> [1]: https://elixir.bootlin.com/dpdk/v25.11/source/lib/ethdev/rte_ethdev.h#L273
> 
> While discussing details...
> Let's say a packet has 4 segments, and the driver only has 2 descriptors 
> remaining available.
> In that case, I think the driver should not consume the packet, but leave it 
> for the application to either drop it or retry transmitting it later.
> Do we want to mention this case too, or is it a semi-obvious case of the 
> descriptor ring having no more room?
> 
I would lean towards treating it as covered by the descriptor ring not
having room. If we try to cover every edge case here, the documentation will
get too long and therefore be less likely to be read.

/Bruce
