> From: Bruce Richardson [mailto:[email protected]]
> Sent: Thursday, 12 February 2026 10.15
>
> On Wed, Feb 11, 2026 at 10:51:56PM +0100, Morten Brørup wrote:
> > > Rather than writing a last_id for each individual descriptor, we can
> > > write one only for places where the "report status" (RS) bit is set,
> > > i.e. the descriptors which will be written back when done. The method
> > > used for marking which descriptors are free is also changed in the
> > > process: even if the last descriptor with the "done" bits set is past
> > > the expected point, we only track up to the expected point, and leave
> > > the rest to be counted as freed next time. This means that we always
> > > have the RS/DD bits set at fixed intervals, and we always track free
> > > slots in units of the same tx_free_thresh intervals.
> >
> > I'm not saying it's good or bad, I'm simply trying to understand the
> > performance tradeoff...
> > I'm wondering if spreading fields over two separate arrays is
> > beneficial when considering cache misses.
> >
> > This patch introduces a separate array, uint16_t txq->rs_last_id[],
> > which is not in the same cache line as the txe array.
> >
> > So now, two separate cache lines must be updated, rs_last_id and txe.
> >
> > Previously, only txe needed updating.
> >
> > Assuming both rings are cold, how many cache misses would a burst of
> > 32 (single-segment) packets cause...
> > Number of cache misses in the txe ring (before this patch, and after)?
> > Number of cache misses in the rs_last_id ring (after this patch)?
>
> The main txe ring has 4 elements per cacheline both before and after
> this patch. As an aside, 6 bytes of each 16 are wasted, and I have
> tried a couple of times to get us down to an 8-byte ring element, and
> each time performance has dropped, presumably due to the extra
> computation and branching for ring wraparound when we remove the
> precomputed next index values.
>
> Anyway, for 32 single-segment packets, we obviously touch 8 cachelines
> in the txe array both before and after this patch. In the next_rs
> array, we should just read a single 16-bit index, so the overall array
> has a fairly small cache footprint: 2 bytes per 32 ring entries [or one
> cacheline for a 1024-sized ring].
>
> However, the performance improvements given by the rework of the
> cleanup handling are very noticeable, and they are not really down to
> memory/cache footprint so much as to doing less work and less
> branching. For example:
>
> * Before this patch, on transmit we stored the index of the last
>   descriptor for each segment of each packet. These stores went to a
>   hot cacheline, but there were still (if no coalescing) potentially 32
>   of them per burst. After this patch, we instead only store the index
>   of the last segment of the packet crossing the next multiple-of-32
>   boundary - going from potentially 32 stores to 1.
>
> * When doing cleanup post transmit, we still have an array lookup to
>   determine the location of the NIC write-back that indicates the
>   descriptors are done. However, while this lookup previously went to
>   the txe array - which was written (ring_size - 32) packets ago - it
>   is now done by reading the new, tiny rs_next_id array, which is much
>   more likely to be in cache.
>
> * Finally, when doing the actual buffer freeing, we previously had an
>   arbitrary number of buffers - though normally 32 - to be freed from
>   the ring, starting at an arbitrary location. After this patch, we
>   always have exactly 32 (or, more correctly, rs_thresh, which must
>   divide evenly into ring_size) buffers to free, starting at an index
>   which is a multiple of 32 - which, most importantly, guarantees that
>   we never have any wraparound as part of the buffer free process. This
>   means no branches for idx >= ring_size etc., and we are free to treat
>   the buffers to be freed as being stored in a linear array.
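
[Editor's note: to make the fixed-interval cleanup described above more
concrete, here is a minimal sketch in C. It follows the description in
the thread rather than the actual driver code; the names (sketch_txq,
DESC_DD, rs_last_id, free_buf) and the single-segment simplification
are assumptions.]

/*
 * Minimal sketch of the fixed-interval cleanup described above.
 * All names here are illustrative assumptions, not the actual
 * driver definitions.
 */
#include <stdbool.h>
#include <stdint.h>

#define DESC_DD 0x1ULL		/* assumed "descriptor done" write-back flag */

struct sketch_txq {
	volatile uint64_t *status;	/* per-descriptor write-back words   */
	void **sw_bufs;			/* buffer attached to each ring slot */
	uint16_t *rs_last_id;		/* RS descriptor id, one per block   */
	uint16_t nb_desc;		/* ring size, e.g. 1024              */
	uint16_t rs_thresh;		/* e.g. 32; divides nb_desc evenly   */
	uint16_t next_to_clean;		/* always a multiple of rs_thresh    */
};

/* Reclaim exactly one rs_thresh-sized block of buffers, if the NIC has
 * finished with it.  Returns true if a block was freed. */
static bool
sketch_tx_clean(struct sketch_txq *q, void (*free_buf)(void *))
{
	uint16_t start = q->next_to_clean;
	/* One read from the small rs_last_id array tells us which
	 * descriptor carries the RS bit for this block. */
	uint16_t rs_id = q->rs_last_id[start / q->rs_thresh];

	/* A single DD check covers the whole block. */
	if (!(q->status[rs_id] & DESC_DD))
		return false;

	/* start is a multiple of rs_thresh and rs_thresh divides nb_desc,
	 * so [start, start + rs_thresh) never wraps: no idx >= nb_desc
	 * branch inside the loop; the buffers form a linear array. */
	for (uint16_t i = 0; i < q->rs_thresh; i++) {
		free_buf(q->sw_bufs[start + i]); /* rte_pktmbuf_free() in a real driver */
		q->sw_bufs[start + i] = NULL;
	}

	/* The only wrap handling: once per block, outside the loop. */
	start += q->rs_thresh;
	q->next_to_clean = (start == q->nb_desc) ? 0 : start;
	return true;
}

In this sketch the single wrap check happens once per block, after the
free loop, which is the "no branches for idx >= ring_size" point above;
and for a 1024-entry ring with rs_thresh = 32, rs_last_id holds 32
two-byte entries, i.e. one cacheline.
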
OK. Thank you for the detailed explanation.

Acked-by: Morten Brørup <[email protected]>

> > >
> > > Signed-off-by: Bruce Richardson <[email protected]>
> > > Acked-by: Anatoly Burakov <[email protected]>
> > > ---
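
[Editor's note: for completeness, the transmit-side half of the change
described in the first bullet of the reply above (one RS/last-id store
per rs_thresh descriptors instead of one per packet) might look roughly
as follows. Again the names are assumptions and only the single-segment
case is shown; with multi-segment packets the recorded id would be the
last descriptor of the packet that crosses the boundary.]

/* Sketch of the per-block RS marking on the transmit side; the flag
 * value and function name are assumptions, not the actual driver code. */
#include <stdint.h>

#define DESC_RS 0x2ULL		/* assumed "report status" request flag */

static void
sketch_tx_mark_rs(uint64_t *desc_cmd, uint16_t *rs_last_id,
		  uint16_t idx, uint16_t rs_thresh)
{
	/* Only the descriptor that ends an rs_thresh-sized block gets the
	 * RS bit and a store into the small rs_last_id array: one store
	 * per rs_thresh descriptors rather than one per packet/segment. */
	if ((idx + 1) % rs_thresh == 0) {
		*desc_cmd |= DESC_RS;
		rs_last_id[idx / rs_thresh] = idx;
	}
}
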

