From: Lorenz Brun <[email protected]>
Date: Tue, 12 May 2026 17:26:56 +0200
> xdp_build_skb_from_zc() allocated xdp->frame_sz bytes from the per-cpu
> system_page_pool and built the skb head with napi_build_skb(). The
> latter places skb_shared_info at the tail of the buffer, but the
> helper sized the allocation as if the whole frame_sz were usable for
> data. Whenever the packet plus reserved headroom approached frame_sz,
> the head memcpy overran shinfo with packet content, corrupting
> ->flags (SKBFL_ZEROCOPY_ENABLE) and ->nr_frags, which then drove
> skb_copy_ubufs() off the end of frags[] on the RX path:
>
> UBSAN: array-index-out-of-bounds in include/linux/skbuff.h:2541
> index 113 is out of range for type 'skb_frag_t [17]'
> skb_copy_ubufs+0x7da/0x960
> ip_local_deliver_finish+0xcd/0x110
> ice_napi_poll+0xe4/0x2a0 [ice]
>
> The overrun bytes come from the packet, so an on-wire sender can
> corrupt kernel memory remotely whenever the XDP program returns
> XDP_PASS.
>
> Rather than patch the sizing math, switch to the pattern used by other
> in-tree AF_XDP zero-copy drivers like mlx5 and i40e which use
> napi_alloc_skb() sized to the actual packet plus skb_put_data().
> This sizes the head exactly for the data being copied, drops the
> system_page_pool local_lock from this path, and removes the
> structural mismatch between frame_sz and the skb head buffer. Frags
> are allocated with alloc_page() per frag, matching the other drivers.

I used napi_build_skb() + the system page_pool to enable PP recycling,
which improves XSk XDP_PASS performance a lot.
Are you sure there's no other way to approach this?
napi_alloc_skb(), as used in other drivers, works, but it's sorta an old
approach and it's way slower.
System page_pools always allocate a full page, so how can this create an
skb prone to overruns?
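
For reference, the sizing mismatch described in the quoted commit message can be modeled in userspace. This is only a sketch under stated assumptions, not kernel code: the `MODEL_*` constants and `model_*` helpers are hypothetical stand-ins, and the real sizeof(struct skb_shared_info) and cache-line alignment depend on architecture and config. It shows only the arithmetic: when shinfo sits at the tail of a frame_sz-byte buffer, headroom plus data must fit in what is left in front of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; real kernel values vary by arch and config. */
#define MODEL_SMP_CACHE_BYTES 64u   /* assumed cache-line size */
#define MODEL_SHINFO_SIZE     320u  /* assumed sizeof(struct skb_shared_info) */

/* Round n up to the assumed cache-line size (mirrors the idea of SKB_DATA_ALIGN). */
static size_t model_align(size_t n)
{
	/* The mask trick below only works for power-of-two alignments. */
	assert((MODEL_SMP_CACHE_BYTES & (MODEL_SMP_CACHE_BYTES - 1)) == 0);
	return (n + MODEL_SMP_CACHE_BYTES - 1) & ~(size_t)(MODEL_SMP_CACHE_BYTES - 1);
}

/* Bytes left for headroom + data when shinfo occupies the tail of the buffer. */
static size_t model_usable(size_t frame_sz)
{
	size_t tail = model_align(MODEL_SHINFO_SIZE);

	return frame_sz > tail ? frame_sz - tail : 0;
}

/* True if copying len bytes after headroom would run into the shinfo area. */
static bool model_head_copy_overruns(size_t frame_sz, size_t headroom, size_t len)
{
	return headroom + len > model_usable(frame_sz);
}
```

With these assumed numbers, a 4096-byte frame leaves 3776 bytes in front of shinfo, so a copy only stays in bounds if it is checked against that value rather than against frame_sz itself.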
>
> Fixes: 560d958c6c68 ("xsk: add generic XSk &xdp_buff -> skb conversion")
> Cc: [email protected]
> Signed-off-by: Lorenz Brun <[email protected]>
Thanks,
Olek