amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush

Lazar, Lijo Mon, 03 Mar 2025 23:11:54 -0800

On 2/28/2025 3:57 PM, [email protected] wrote:
> From: "[email protected]" <[email protected]>
> 
> - Modify the VM invalidation engine allocation logic to handle SDMA page rings.
>   SDMA page rings now share the VM invalidation engine with SDMA gfx rings instead of
>   allocating a separate engine. This change ensures efficient resource management and
>   avoids the issue of insufficient VM invalidation engines.
> 
> - Add synchronization for GPU TLB flush operations in gmc_v9_0.c.
>   Use spin_lock and spin_unlock to ensure thread safety and prevent race conditions
>   during TLB flush operations. This improves the stability and reliability of the driver,
>   especially in multi-threaded environments.
> 
>  v2: replace the sdma ring check with a function `amdgpu_sdma_is_page_queue`
>  to check if a ring is an SDMA page queue.(Lijo)
> 
>  v3: Add GC version check, only enabled on GC9.4.3/9.4.4/9.5.0
>  v4: Fix code style and add more detailed description (Christian)
> 
> Suggested-by: Lijo Lazar <[email protected]>
> Signed-off-by: Jesse Zhang <[email protected]>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c  | 12 ++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 25 +++++++++++++++++++++++-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h |  1 +
>  3 files changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index 4eefa17fa39b..aad3c5ea8557 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -602,8 +602,20 @@ int amdgpu_gmc_allocate_vm_inv_eng(struct amdgpu_device *adev)
>                       return -EINVAL;
>               }
>  
> +     if (amdgpu_sdma_is_shared_inv_eng(adev, ring)) {
> +             /* SDMA has a special packet which allows it to use the same
> +              * invalidation engine for all the rings in one instance.
> +              * Therefore, we do not allocate a separate VM invalidation engine
> +              * for SDMA page rings. Instead, they share the VM invalidation
> +              * engine with the SDMA gfx ring. This change ensures efficient
> +              * resource management and avoids the issue of insufficient VM
> +              * invalidation engines.
> +              */
> +             ring->vm_inv_eng = inv_eng - 1;

As mentioned in a previous comment, I strongly discourage assuming
vm_inv_eng based only on loop order. If two rings of the same instance
need to share the inv_eng, call them out explicitly and assign it.
There shouldn't be any implicit assumption that the previous/next ring
in the loop order is part of the same engine.
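
To illustrate the explicit assignment described above, here is a minimal
standalone C sketch. It is not the driver code: the struct layouts and the
helpers alloc_inv_eng() and assign_sdma_inv_engs() are hypothetical
simplifications, and the free-engine bitmap only loosely mirrors
vm_inv_engs[vmhub]. The point is that each page ring takes its engine from
the gfx ring of its own instance, so the result does not depend on where the
rings sit in any global ring array.

```c
#include <assert.h>
#include <stddef.h>

#define INV_ENG_NONE (-1)

/* Hypothetical, simplified stand-ins for the amdgpu ring structures. */
struct ring {
	const char *name;
	int vm_inv_eng;
};

struct sdma_instance {
	struct ring ring;	/* gfx ring */
	struct ring page;	/* page ring */
};

/* Take the lowest free engine from the bitmap, or INV_ENG_NONE if empty. */
static int alloc_inv_eng(unsigned int *free_mask)
{
	for (int i = 0; i < 32; i++) {
		if (*free_mask & (1u << i)) {
			*free_mask &= ~(1u << i);
			return i;
		}
	}
	return INV_ENG_NONE;
}

/*
 * Allocate one engine per SDMA instance and assign it explicitly to both
 * rings of that instance. No assumption is made about the order in which
 * rings are visited elsewhere.
 */
int assign_sdma_inv_engs(struct sdma_instance *inst, int n,
			 unsigned int *free_mask)
{
	assert(inst != NULL && free_mask != NULL);

	for (int i = 0; i < n; i++) {
		int eng = alloc_inv_eng(free_mask);

		if (eng == INV_ENG_NONE)
			return -1;	/* out of invalidation engines */

		inst[i].ring.vm_inv_eng = eng;
		inst[i].page.vm_inv_eng = eng;	/* shared, stated explicitly */
	}
	return 0;
}
```

In this sketch the sharing is a property of the instance rather than of
iteration order, so reordering or adding rings cannot silently pair a page
ring with the wrong engine.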

Thanks,
Lijo

> +     } else {
>               ring->vm_inv_eng = inv_eng - 1;
>               vm_inv_engs[vmhub] &= ~(1 << ring->vm_inv_eng);
> +     }
>  
>               dev_info(adev->dev, "ring %s uses VM inv eng %u on hub %u\n",
>                        ring->name, ring->vm_inv_eng, ring->vm_hub);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> index 42a7b86e41c3..9b958d6202bc 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> @@ -462,6 +462,29 @@ void amdgpu_sdma_sysfs_reset_mask_fini(struct amdgpu_device *adev)
>       }
>  }
>  
> +/**
> +* amdgpu_sdma_is_shared_inv_eng - Check if a ring is an SDMA ring that shares a VM invalidation engine
> +* @adev: Pointer to the AMDGPU device structure
> +* @ring: Pointer to the ring structure to check
> +*
> +* This function checks if the given ring is an SDMA ring that shares a VM invalidation engine.
> +* It returns true if the ring is such an SDMA ring, false otherwise.
> +*/
> +bool amdgpu_sdma_is_shared_inv_eng(struct amdgpu_device *adev, struct amdgpu_ring *ring)
> +{
> +     int i = ring->me;
> +
> +     if (!adev->sdma.has_page_queue || i >= adev->sdma.num_instances)
> +             return false;
> +
> +     if (amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 3) ||
> +         amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 4, 4) ||
> +         amdgpu_ip_version(adev, GC_HWIP, 0) == IP_VERSION(9, 5, 0))
> +             return (ring == &adev->sdma.instance[i].ring);
> +     else
> +             return false;
> +}
> +
>  /**
>   * amdgpu_sdma_register_on_reset_callbacks - Register SDMA reset callbacks
>   * @funcs: Pointer to the callback structure containing pre_reset and post_reset functions
> @@ -503,7 +526,7 @@ int amdgpu_sdma_reset_engine(struct amdgpu_device *adev, uint32_t instance_id, b
>  {
>       struct sdma_on_reset_funcs *funcs;
>       int ret = 0;
> -     struct amdgpu_sdma_instance *sdma_instance = &adev->sdma.instance[instance_id];;
> +     struct amdgpu_sdma_instance *sdma_instance = &adev->sdma.instance[instance_id];
>       struct amdgpu_ring *gfx_ring = &sdma_instance->ring;
>       struct amdgpu_ring *page_ring = &sdma_instance->page;
>       bool gfx_sched_stopped = false, page_sched_stopped = false;
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> index 965169320065..1fa2049da6c3 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> @@ -194,4 +194,5 @@ int amdgpu_sdma_ras_sw_init(struct amdgpu_device *adev);
>  void amdgpu_debugfs_sdma_sched_mask_init(struct amdgpu_device *adev);
>  int amdgpu_sdma_sysfs_reset_mask_init(struct amdgpu_device *adev);
>  void amdgpu_sdma_sysfs_reset_mask_fini(struct amdgpu_device *adev);
> +bool amdgpu_sdma_is_shared_inv_eng(struct amdgpu_device *adev, struct amdgpu_ring *ring);
>  #endif
Re: [PATCH V6 2/3] drm/amdgpu: Optimize VM invalidation engine allocation and synchronize GPU TLB flush
