From: Maciej Fijalkowski <[email protected]>
Date: Thu, 24 Apr 2025 14:48:39 +0200
> On Tue, Apr 15, 2025 at 07:28:12PM +0200, Alexander Lobakin wrote:
>> Start adding XDP-specific code to libeth, namely handling XDP_TX buffers
>> (only sending).
[...]
>> +static __always_inline u32
>> +libeth_xdp_tx_xmit_bulk(const struct libeth_xdp_tx_frame *bulk, void *xdpsq,
>> + u32 n, bool unroll, u64 priv,
>> + u32 (*prep)(void *xdpsq, struct libeth_xdpsq *sq),
>> + struct libeth_xdp_tx_desc
>> + (*fill)(struct libeth_xdp_tx_frame frm, u32 i,
>> + const struct libeth_xdpsq *sq, u64 priv),
>> + void (*xmit)(struct libeth_xdp_tx_desc desc, u32 i,
>> + const struct libeth_xdpsq *sq, u64 priv))
>> +{
>> + u32 this, batched, off = 0;
>> + struct libeth_xdpsq sq;
>> + u32 ntu, i = 0;
>> +
>> + n = min(n, prep(xdpsq, &sq));
>> + if (unlikely(!n))
>> + return 0;
>> +
>> + ntu = *sq.ntu;
>> +
>> + this = sq.count - ntu;
>
> maybe something more self-descriptive than 'this'? :)
> this is available space in sq, right?
'this' means "this batch", i.e. the number of frames we'll send for sure
this iteration: @n clamped to the descriptors left before the ring end,
so at most one wrap-around is ever needed.
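Roughly, as a standalone userspace model (illustrative names, not the
actual libeth code; 'this' is renamed to 'this_batch' to stay valid
outside the kernel tree):

```c
#include <assert.h>

/* Minimal model of the batching logic in libeth_xdp_tx_xmit_bulk():
 * the current batch is clamped to the free slots before the ring end,
 * so the 'again' path runs at most once per call.
 */
struct ring_model {
	unsigned int count;	/* ring size */
	unsigned int ntu;	/* next slot to use */
};

/* Writes the slot index used for each of the @n frames into @slots,
 * mimicking the 'ntu + i - off' arithmetic of the original.
 */
static unsigned int model_xmit_bulk(struct ring_model *r, unsigned int n,
				    unsigned int *slots)
{
	unsigned int this_batch, off = 0, i = 0;
	unsigned int ntu = r->ntu;

	this_batch = r->count - ntu;	/* slots before the ring wraps */
	if (this_batch > n)
		this_batch = n;

again:
	for ( ; i < off + this_batch; i++)
		slots[i] = ntu + i - off;

	ntu += this_batch;
	if (ntu >= r->count)
		ntu = 0;		/* wrapped: restart from slot 0 */

	if (i < n) {			/* frames left after the wrap */
		this_batch = n - i;
		off = i;
		goto again;
	}

	r->ntu = ntu;
	return n;
}
```

After the first pass, either everything was sent or @ntu hit the ring
end and wrapped to 0, hence only one 'goto again' can ever fire.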
>
>> + if (likely(this > n))
>> + this = n;
>> +
>> +again:
>> + if (!unroll)
>
> who takes this decision? a caller or is this dependent on some constraints
> of underlying system? when would you like to not unroll?
XDP_TX, ndo_xdp_xmit and the XSk XDP_TX wrappers pass `false` here, only
XSk xmit passes `true`. In the cases other than XSk xmit, unrolling had
no measurable positive impact, while the object code bloat was huge --
XSk xmit doesn't fill &libeth_sqe, only a Tx descriptor, while all the
rest do.
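To sketch why a constant @unroll costs nothing at runtime (userspace
sketch, illustrative names; that the compiler drops the dead path after
inlining a constant argument is my assumption about what GCC/Clang do
here, not something this patch states):

```c
#include <assert.h>

/* Core helper: each wrapper passes a compile-time constant 'unroll',
 * so after inlining only one of the two loop shapes survives in a
 * given wrapper -- no runtime branch for the other path.
 */
static inline unsigned int fill_bulk(unsigned int *dst, unsigned int n,
				     int unroll)
{
	unsigned int i = 0;

	if (unroll) {
		/* batched path: 8 slots per iteration */
		for ( ; i + 8 <= n; i += 8)
			for (unsigned int j = 0; j < 8; j++)
				dst[i + j] = i + j;
	}

	/* linear tail (or the whole loop when !unroll) */
	for ( ; i < n; i++)
		dst[i] = i;

	return n;
}

/* XSk-xmit-style wrapper: constant true */
static unsigned int fill_unrolled(unsigned int *dst, unsigned int n)
{
	return fill_bulk(dst, n, 1);
}

/* XDP_TX-style wrapper: constant false */
static unsigned int fill_linear(unsigned int *dst, unsigned int n)
{
	return fill_bulk(dst, n, 0);
}
```

Both wrappers produce identical results; only the generated loop shape
(and thus the object code size) differs.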
>
>> + goto linear;
>> +
>> + batched = ALIGN_DOWN(this, LIBETH_XDP_TX_BATCH);
>> +
>> + for ( ; i < off + batched; i += LIBETH_XDP_TX_BATCH) {
>> + u32 base = ntu + i - off;
>> +
>> + unrolled_count(LIBETH_XDP_TX_BATCH)
>> + for (u32 j = 0; j < LIBETH_XDP_TX_BATCH; j++)
>> + xmit(fill(bulk[i + j], base + j, &sq, priv),
>> + base + j, &sq, priv);
>> + }
>> +
>> + if (batched < this) {
>> +linear:
>> + for ( ; i < off + this; i++)
>> + xmit(fill(bulk[i], ntu + i - off, &sq, priv),
>> + ntu + i - off, &sq, priv);
>> + }
>> +
>> + ntu += this;
>> + if (likely(ntu < sq.count))
>> + goto out;
>> +
>> + ntu = 0;
>> +
>> + if (i < n) {
>> + this = n - i;
>> + off = i;
>> +
>> + goto again;
>> + }
>> +
>> +out:
>> + *sq.ntu = ntu;
>> + *sq.pending += n;
>> + if (sq.xdp_tx)
>> + *sq.xdp_tx += n;
>> +
>> + return n;
>> +}
[...]
>> +/**
>> + * __libeth_xdp_tx_flush_bulk - internal helper to flush one XDP Tx bulk
>> + * @bq: bulk to flush
>> + * @flags: XDP TX flags
>> + * @prep: driver-specific callback to prepare the queue for sending
>> + * @fill: libeth_xdp callback to fill &libeth_sqe and &libeth_xdp_tx_desc
>
> Could you explain why this has to be implemented as a callback? for now
> this might just be directly called within libeth_xdp_tx_xmit_bulk() ?
>
> What I currently understand is this is not something that driver would
> provide. If its libeth internal routine then call this directly and
> simplify the code.
XSk XDP_TX passes a different callback here :> Anyway, all callbacks
within libeth_xdp get inlined or (sometimes) converted to direct calls,
so there are no runtime indirections.
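The pattern, in a minimal userspace form (illustrative names, not the
real libeth API): the core helper is __always_inline and every user
binds the callbacks to known static functions, so the compiler sees a
constant function pointer at each call site and can emit direct calls
or inline them entirely.

```c
#include <assert.h>

#ifndef __always_inline
#define __always_inline inline __attribute__((__always_inline__))
#endif

/* Core helper taking the per-flavor callback as a parameter */
static __always_inline unsigned int
core_xmit(unsigned int n, unsigned int (*fill)(unsigned int))
{
	unsigned int sum = 0;

	for (unsigned int i = 0; i < n; i++)
		sum += fill(i);	/* 'fill' is a constant here */

	return sum;
}

/* one flavor's callback */
static unsigned int fill_desc(unsigned int i)
{
	return i * 2;
}

/* driver-facing wrapper: the only place 'fill_desc' is bound,
 * so after inlining core_xmit() the call is direct
 */
static unsigned int drv_xmit(unsigned int n)
{
	return core_xmit(n, fill_desc);
}
```

Another flavor (e.g. XSk) would add its own wrapper binding its own
callback, without the core helper ever doing a real indirect call.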
>
> Besides, thanks a lot for this series and split up! I think we're on a
> good path.
Thanks,
Olek