> From: Frank Du [mailto:frank...@intel.com]
> Sent: Thursday, 23 May 2024 10.08
> 
> The current calculation assumes that the mbufs are contiguous. However,
> this assumption is incorrect when the mbuf memory spans across huge page
> boundaries. To ensure that each mbuf resides exclusively within a single
> page, deliberate spacing gaps are left when allocating mbufs across the
> boundaries.

I agree that this patch is an improvement over what existed previously.
But I still don't understand the patch description. To me, it looks like the 
patch adds a missing check for contiguous memory, and the patch itself has 
nothing to do with huge pages. Anyway, if the maintainer agrees with the 
description, I don't mind not grasping it. ;-)

However, while trying to understand what is happening, I think I found one more 
(already existing) bug.
I will walk through an example inline below.

> 
> Correct this by reading the size directly from the mempool memory chunk.
> 
> Fixes: d8a210774e1d ("net/af_xdp: support unaligned umem chunks")
> Cc: sta...@dpdk.org
> 
> Signed-off-by: Frank Du <frank...@intel.com>
> 
> ---
> v2:
> * Add virtual contiguous detection for multiple memhdrs
> v3:
> * Use RTE_ALIGN_FLOOR to get the aligned addr
> * Add check on the first memhdr of memory chunks
> v4:
> * Replace the iterating with simple nb_mem_chunks check
> ---
>  drivers/net/af_xdp/rte_eth_af_xdp.c | 33 +++++++++++++++++++++++------
>  1 file changed, 26 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/af_xdp/rte_eth_af_xdp.c b/drivers/net/af_xdp/rte_eth_af_xdp.c
> index 6ba455bb9b..d0431ec089 100644
> --- a/drivers/net/af_xdp/rte_eth_af_xdp.c
> +++ b/drivers/net/af_xdp/rte_eth_af_xdp.c
> @@ -1040,16 +1040,32 @@ eth_link_update(struct rte_eth_dev *dev __rte_unused,
>  }
> 
>  #if defined(XDP_UMEM_UNALIGNED_CHUNK_FLAG)
> -static inline uintptr_t get_base_addr(struct rte_mempool *mp, uint64_t *align)
> +static inline uintptr_t
> +get_memhdr_info(const struct rte_mempool *mp, uint64_t *align, size_t *len)
>  {
>       struct rte_mempool_memhdr *memhdr;
>       uintptr_t memhdr_addr, aligned_addr;
> 
> +     if (mp->nb_mem_chunks != 1) {
> +             /*
> +              * A mempool with multiple chunks is not virtually contiguous,
> +              * but xsk umem only supports a single virtual region mapping.
> +              */
> +             AF_XDP_LOG(ERR, "The mempool contains multiple (%u) memory chunks\n",
> +                        mp->nb_mem_chunks);
> +             return 0;
> +     }
> +
> +     /* Get the mempool base addr and align from the header now */
>       memhdr = STAILQ_FIRST(&mp->mem_list);
> +     if (!memhdr) {
> +             AF_XDP_LOG(ERR, "The mempool is not populated\n");
> +             return 0;
> +     }
>       memhdr_addr = (uintptr_t)memhdr->addr;
> -     aligned_addr = memhdr_addr & ~(getpagesize() - 1);
> +     aligned_addr = RTE_ALIGN_FLOOR(memhdr_addr, getpagesize());
>       *align = memhdr_addr - aligned_addr;
> -
> +     *len = memhdr->len;
>       return aligned_addr;

On x86_64, the page size is 4 KB = 0x1000.

Let's look at an example where memhdr->addr is not aligned to the page size:

In the example,
memhdr->addr is 0x700100, and
memhdr->len is 0x20000.

Then
aligned_addr becomes 0x700000,
*align becomes 0x100, and
*len becomes 0x20000.
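
For concreteness, here is a minimal standalone sketch of that arithmetic
(not driver code; ALIGN_FLOOR is a local stand-in for RTE_ALIGN_FLOOR from
rte_common.h, so it builds without the DPDK headers):

#include <stdint.h>
#include <stdio.h>

/* Local stand-in for RTE_ALIGN_FLOOR from rte_common.h. */
#define ALIGN_FLOOR(val, align) ((val) & ~((uintptr_t)(align) - 1))

int main(void)
{
        uintptr_t memhdr_addr = 0x700100; /* memhdr->addr, not page aligned */
        size_t len = 0x20000;             /* memhdr->len */
        uintptr_t pagesz = 0x1000;        /* 4 KB pages on x86_64 */

        uintptr_t aligned_addr = ALIGN_FLOOR(memhdr_addr, pagesz);
        uint64_t align = memhdr_addr - aligned_addr;

        /* Prints: aligned_addr=0x700000 align=0x100 len=0x20000 */
        printf("aligned_addr=%#lx align=%#lx len=%#zx\n",
               (unsigned long)aligned_addr, (unsigned long)align, len);
        return 0;
}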

>  }
> 
> @@ -1126,6 +1142,7 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals,
>       void *base_addr = NULL;
>       struct rte_mempool *mb_pool = rxq->mb_pool;
>       uint64_t umem_size, align = 0;
> +     size_t len = 0;
> 
>       if (internals->shared_umem) {
>               if (get_shared_umem(rxq, internals->if_name, &umem) < 0)
> @@ -1157,10 +1174,12 @@ xsk_umem_info *xdp_umem_configure(struct pmd_internals *internals,
>               }
> 
>               umem->mb_pool = mb_pool;
> -             base_addr = (void *)get_base_addr(mb_pool, &align);
> -             umem_size = (uint64_t)mb_pool->populated_size *
> -                             (uint64_t)usr_config.frame_size +
> -                             align;
> +             base_addr = (void *)get_memhdr_info(mb_pool, &align, &len);
> +             if (!base_addr) {
> +                     AF_XDP_LOG(ERR, "The memory pool can't be mapped as umem\n");
> +                     goto err;
> +             }
> +             umem_size = (uint64_t)len + align;

Here, umem_size becomes 0x20100.

> 
>               ret = xsk_umem__create(&umem->umem, base_addr, umem_size,
>                               &rxq->fq, &rxq->cq, &usr_config);

Here, xsk_umem__create() is called with a base_addr (0x700000) preceding the 
address of the memory chunk (0x700100).
It looks like a bug, causing a buffer underrun: won't it access memory 
starting at base_addr, i.e. 0x100 bytes before the chunk?

If I'm correct, the code should probably do this for alignment instead:

aligned_addr = RTE_ALIGN_CEIL(memhdr_addr, getpagesize());
*align = aligned_addr - memhdr_addr;
umem_size = (uint64_t)len - align;
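
If that is right, the same standalone sketch with the corrected arithmetic
(ALIGN_CEIL again being a local stand-in for RTE_ALIGN_CEIL) shows the
mapping would then stay inside the chunk:

#include <stdint.h>
#include <stdio.h>

/* Local stand-ins for RTE_ALIGN_FLOOR/RTE_ALIGN_CEIL from rte_common.h. */
#define ALIGN_FLOOR(val, align) ((val) & ~((uintptr_t)(align) - 1))
#define ALIGN_CEIL(val, align)  ALIGN_FLOOR((val) + (align) - 1, align)

int main(void)
{
        uintptr_t memhdr_addr = 0x700100; /* memhdr->addr */
        size_t len = 0x20000;             /* memhdr->len */
        uintptr_t pagesz = 0x1000;

        uintptr_t aligned_addr = ALIGN_CEIL(memhdr_addr, pagesz);
        uint64_t align = aligned_addr - memhdr_addr;
        uint64_t umem_size = (uint64_t)len - align;

        /*
         * Prints: base=0x701000 size=0x1f100
         * base + size = 0x720100 = memhdr->addr + memhdr->len, so the
         * mapping no longer extends below the start of the chunk.
         */
        printf("base=%#lx size=%#lx\n",
               (unsigned long)aligned_addr, (unsigned long)umem_size);
        return 0;
}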


Disclaimer: I don't know much about the AF_XDP implementation, so maybe I just 
don't understand what is going on.

> --
> 2.34.1
