Thank you for your contribution! Sashiko AI review found 3 potential issue(s) 
to consider:

New issues:
- [High] Missing ST index reference acquisition in `create_real_mr` fallback 
path causes a refcount underflow upon MR destruction.
- [Medium] TPH hints for pooled MRs (FRMRs) are stored in metadata but never 
programmed into the hardware Memory Key Context.

Pre-existing issues:
- [High] Deallocating an ST index leaks the `mlx5_st_idx_data` structure in 
`mlx5_st_dealloc_index`.
--

commit 0757112498e4ebc4a6ffc6fad46c40d302a22056
Author: Zhiping Zhang <[email protected]>

RDMA/mlx5: get tph for p2p access when registering dma-buf mr
    
This patch wires up mlx5_ib to query the exporter's chosen PCIe TLP Processing
Hint (TPH) Steering Tag at MR registration time for p2p access. It uses this
metadata to program requester-side TPH on the outbound mkey and translates
exporter-provided steering tags into local ST entries.

> diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
> index e6b74955d95dc..7aced3f554561 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c

[ ... ]

> @@ -335,6 +364,7 @@ static int mlx5r_build_frmr_key(struct ib_device *device,
>               get_unchangeable_access_flags(dev, in->access_flags);
>       out->vendor_key = in->vendor_key;
>       out->num_dma_blocks = in->num_dma_blocks;
> +     out->kernel_vendor_key = in->kernel_vendor_key;
>  
>       return 0;
>  }

[Severity: Medium]
While the steering tag and processing hint are stored inside the pool key here,
is there any code that actually extracts these values to program the pcie_tph_en
bit or steering tag into the hardware Memory Key Context during FRMR creation or
UMR WQE posting?

It appears that pooled MRs silently fall back to default non-TPH routing.

[ ... ]

> @@ -755,6 +788,12 @@ static struct ib_mr *create_real_mr(struct ib_pd *pd, 
> struct ib_umem *umem,
>  
>       xlt_with_umr = mlx5r_umr_can_load_pas(dev, umem->length);
>       if (xlt_with_umr) {
> +             err = mlx5_ib_get_frmr_st_handle_ref(dev, st_index);
> +             if (err) {
> +                     ib_umem_release(umem);
> +                     return ERR_PTR(err);
> +             }
> +
>               mr = alloc_cacheable_mr(pd, umem, iova, access_flags,
>                                       MLX5_MKC_ACCESS_MODE_MTT,
>                                       st_index, ph);
> @@ -769,6 +808,8 @@ static struct ib_mr *create_real_mr(struct ib_pd *pd, 
> struct ib_umem *umem,
>               mutex_unlock(&dev->slow_path_mutex);
>       }

[Severity: High]
Does this create a refcount underflow for the ST index?

In the else branch (when xlt_with_umr is false), reg_create() is called
and embeds st_index into the MR's kernel_vendor_key, but it misses calling
mlx5_ib_get_frmr_st_handle_ref() to acquire a reference for the steering tag
index. 

Later, mlx5r_handle_mkey_cleanup() unconditionally extracts st_index and
decrements its reference count via mlx5_ib_put_frmr_st_handle_ref(). Dropping
a reference that was never acquired can lead to a refcount underflow and
premature deallocation.

[ ... ]

> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c 
> b/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c
> index 997be91f0a135..bf14a7a6df15e 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/st.c
> @@ -92,23 +92,18 @@ void mlx5_st_destroy(struct mlx5_core_dev *dev)
>       kfree(st);
>  }
>  
> -int mlx5_st_alloc_index(struct mlx5_core_dev *dev, enum tph_mem_type 
> mem_type,
> -                     unsigned int cpu_uid, u16 *st_index)
> +int mlx5_st_alloc_index_by_tag(struct mlx5_core_dev *dev, u16 tag,
> +                            u16 *st_index)
>  {
>       struct mlx5_st_idx_data *idx_data;

[Severity: High]
This is a pre-existing issue, but does this lead to a memory leak when
deallocating an ST index?

In mlx5_st_alloc_index_by_tag() (or previously mlx5_st_alloc_index()), a new
struct mlx5_st_idx_data object is dynamically allocated with kzalloc_obj() and
inserted into the st->idx_xa XArray. 

However, in mlx5_st_dealloc_index(), when the reference count drops to zero, the
entry is removed from the XArray using xa_erase():

    if (refcount_dec_and_test(&idx_data->usecount)) {
        xa_erase(&st->idx_xa, st_index);
        /* We leave PCI config space as was before, no mkey will refer to it */
    }

The dynamically allocated idx_data pointer is never freed with kfree(). Because
userspace can repeatedly trigger this leak by creating and destroying DMA hints,
can this serve as a DoS memory exhaustion vector?

-- 
Sashiko AI review ยท 
https://sashiko.dev/#/patchset/[email protected]?part=4

Reply via email to