On Thu, Feb 05, 2026 at 05:54:08PM -0800, Jakub Kicinski wrote:
> On Thu, 5 Feb 2026 15:40:46 +0200 Vladimir Oltean wrote:
> > > > I mean, it should "work" given the caveat that calling
> > > > bpf_xdp_adjust_tail() on a first-half page buffer with a large
> > > > offset risks leaking into the second half, which may also be in
> > > > use, and this will go undetected, right?
> > > > Although the practical chances of that happening are low, the
> > > > requested offset needs to be in the order of hundreds still.
> > >
> > > Oh, I did get carried away there...
> > > Well, one thing is shared page memory model in enetc and i40e,
> > > another thing is xsk_buff_pool, where chunk size can be between 2K
> > > and PAGE_SIZE. What about
> > >
> > > tailroom = rxq->frag_size - skb_frag_size(frag) -
> > >            (skb_frag_off(frag) % rxq->frag_size);
> > >
> > > When frag_size is set to 2K, headroom is let's say 256, so aligned
> > > DMA write size is 1420.
> > > last frag at the start of the page: offset=256, size<=1420
> > >     tailroom >= 2K - 1420 - 256 = 372
> > > last frag in the middle of the page: offset=256+2K, size<=1420
> > >     tailroom >= 2K - 1420 - ((256 + 2K) % 2K) = 372
> > >
> > > And for drivers that do not fragment pages for multi-buffer packets,
> > > nothing changes, since offset is always less than rxq->frag_size.
> > >
> > > This brings us back to rxq->frag_size being half of a page for enetc
> > > and i40e, and seems like in ZC mode it should be pool->chunk_size to
> > > work properly.
> >
> > With skb_frag_off() taken into account modulo 2K for the tailroom
> > calculation, I can confirm bpf_xdp_frags_increase_tail() works well for
> > ENETC. I haven't fully considered the side effects, though.
>
> +1, also seems to me like it would work tho I haven't thought thru all
> the cases. We do need to document and name things well, tho, 'cause
> subtleties are piling up ;) Maybe it's time for an ASCII art
> for xdp layout?

Yeah, for AF_XDP mbuf in i40e we actually recently discovered another
buffer-size-calculation-related issue, so some visual aid would be useful.
I will think about how it should look.

> FWIW my feeling is that instead of nickel and diming leftover space
> in the frags if someone actually cared about growing mbufs we should
> have the helper allocate a new page from the PP and append it to the
> shinfo. Much simpler, "infinite space", and works regardless of the
> driver. I don't mean that to suggest you implement it, purely to point
> out that I think nobody really uses positive offsets.. So we can as
> well switch more complicated drivers back to xdp_rxq_info_reg().

As Vladimir has mentioned, if the driver does not use header split, frags
will have tailroom the size of skb_shared_info, so tail growing does work
in practice. Allocating a page_pool buffer (given the XDP queue has one
attached) is certainly an option, although I am not sure if anyone needs it.
Furthermore, growing the tail would still fail in the single-buf case.
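
For reference, the tailroom arithmetic quoted above can be checked with a
small userspace model. This is only a sketch, not kernel code: the function
name frag_tailroom() and its plain-integer parameters are made up for
illustration and stand in for rxq->frag_size, skb_frag_off(frag) and
skb_frag_size(frag). The signed return type is also an assumption of the
model, chosen so that an overrun shows up as a negative number.

```c
#include <assert.h>

/*
 * Userspace model of the proposed tailroom calculation (hypothetical
 * helper, not a kernel function).
 *
 *   frag_size - stands in for rxq->frag_size (buffer size, e.g. 2K)
 *   off       - stands in for skb_frag_off(frag), offset within the page
 *   len       - stands in for skb_frag_size(frag), bytes already used
 *
 * Taking the offset modulo frag_size makes the result independent of
 * which buffer-sized slice of the page the frag lives in.
 */
static int frag_tailroom(int frag_size, int off, int len)
{
	return frag_size - len - (off % frag_size);
}

/* Re-checks the two cases worked through in the quoted text. */
static void frag_tailroom_selfcheck(void)
{
	const int frag_size = 2048;	/* 2K buffer */
	const int headroom = 256;	/* example headroom from the thread */
	const int max_len = 1420;	/* example aligned DMA write size */

	/* Last frag at the start of the page. */
	assert(frag_tailroom(frag_size, headroom, max_len) == 372);

	/* Last frag in the second 2K half of the page. */
	assert(frag_tailroom(frag_size, headroom + frag_size, max_len) == 372);
}
```

Without the modulo, the second case would evaluate to
2048 - 1420 - 2304 = -1676, which is why the offset has to be reduced to
the position inside the buffer before it is subtracted.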

