On 2/19/2025 1:30 PM, [email protected] wrote:
> From: "[email protected]" <[email protected]>
> 
> - Modify the VM invalidation engine allocation logic to handle SDMA page rings.
>   SDMA page rings now share the VM invalidation engine with their SDMA gfx
>   ring instead of allocating a separate engine, which avoids exhausting the
>   limited pool of VM invalidation engines.
> 
> - Add synchronization for GPU TLB flush operations in gmc_v9_0.c.
>   Use spin_lock and spin_unlock to prevent race conditions during TLB flush
>   operations, improving driver stability in multi-threaded environments.
> 
>  Replace the sdma ring check with a function `amdgpu_sdma_is_page_queue`
>  to check if a ring is an SDMA page queue. (Lijo)
> 
> Suggested-by: Lijo Lazar <[email protected]>
> Signed-off-by: Jesse Zhang <[email protected]>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c  |  7 +++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c | 18 ++++++++++++++++++
>  drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h |  1 +
>  drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c    |  2 ++
>  4 files changed, 28 insertions(+)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> index cb914ce82eb5..da719ec6c6c7 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gmc.c
> @@ -601,8 +601,15 @@ int amdgpu_gmc_allocate_vm_inv_eng(struct amdgpu_device *adev)
>                       return -EINVAL;
>               }
>  
> +     if(amdgpu_sdma_is_page_queue(adev, ring)) {

Sorry, didn't mean to exclude the ring type check.

BTW, there is another problem. If the previous ring is a regular sdma ring,

	vm_inv_engs[vmhub] &= ~(1 << ring->vm_inv_eng);

will already have cleared that bit in the bitmap, so the invalidation
engine picked in the next loop iteration is not the same one.

What you may want to do is -

After allocating the sdma ring's invalidation engine, assign the same inv
engine to the page ring corresponding to that sdma instance:

        ring->vm_inv_eng = inv_eng - 1;
        if (ring->funcs->type == AMDGPU_RING_TYPE_SDMA) {
                /* suggested helper returning &adev->sdma.instance[ring->me].page */
                page_ring = amdgpu_sdma_get_page_ring(adev, ring->me);
                if (page_ring)
                        page_ring->vm_inv_eng = inv_eng - 1;
        }
        vm_inv_engs[vmhub] &= ~(1 << ring->vm_inv_eng);

Then skip any page rings in the generic loop:

        if (ring->funcs->type == AMDGPU_RING_TYPE_SDMA &&
            amdgpu_sdma_is_page_queue(adev, ring))
                continue;

Thanks,
Lijo

> +             /* Do not allocate a separate VM invalidation engine for SDMA
> +              * page rings; they share the VM invalidation engine with the
> +              * sdma gfx ring of the same instance.
> +              */
> +             ring->vm_inv_eng = inv_eng - 1;
> +     } else {
> +             ring->vm_inv_eng = inv_eng - 1;
> +             vm_inv_engs[vmhub] &= ~(1 << ring->vm_inv_eng);
> +     }
>  
>               dev_info(adev->dev, "ring %s uses VM inv eng %u on hub %u\n",
>                        ring->name, ring->vm_inv_eng, ring->vm_hub);
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> index 8de214a8ba6d..96df544feb67 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.c
> @@ -503,6 +503,24 @@ void amdgpu_sdma_sysfs_reset_mask_fini(struct amdgpu_device *adev)
>       }
>  }
>  
> +/**
> + * amdgpu_sdma_is_page_queue - Check if a ring is an SDMA page queue
> + * @adev: Pointer to the AMDGPU device structure
> + * @ring: Pointer to the ring structure to check
> + *
> + * Returns true if the given ring is an SDMA page queue, false otherwise.
> + */
> +bool amdgpu_sdma_is_page_queue(struct amdgpu_device *adev, struct amdgpu_ring *ring)
> +{
> +     int i = ring->me;
> +
> +     if (!adev->sdma.has_page_queue || i >= adev->sdma.num_instances)
> +             return false;
> +
> +     return (ring == &adev->sdma.instance[i].page);
> +}
> +
>  /**
>   * amdgpu_sdma_register_on_reset_callbacks - Register SDMA reset callbacks
>   * @funcs: Pointer to the callback structure containing pre_reset and post_reset functions
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> index 7effc2673466..c2df9c3ab882 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_sdma.h
> @@ -194,4 +194,5 @@ int amdgpu_sdma_ras_sw_init(struct amdgpu_device *adev);
>  void amdgpu_debugfs_sdma_sched_mask_init(struct amdgpu_device *adev);
>  int amdgpu_sdma_sysfs_reset_mask_init(struct amdgpu_device *adev);
>  void amdgpu_sdma_sysfs_reset_mask_fini(struct amdgpu_device *adev);
> +bool amdgpu_sdma_is_page_queue(struct amdgpu_device *adev, struct amdgpu_ring *ring);
>  #endif
> diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> index 2aa87fdf715f..2599da8677da 100644
> --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c
> @@ -1000,6 +1000,7 @@ static uint64_t gmc_v9_0_emit_flush_gpu_tlb(struct amdgpu_ring *ring,
>        * to WA the Issue
>        */
>  
> +     spin_lock(&adev->gmc.invalidate_lock);
>       /* TODO: It needs to continue working on debugging with semaphore for GFXHUB as well. */
>       if (use_semaphore)
>               /* a read return value of 1 means semaphore acuqire */
> @@ -1030,6 +1031,7 @@ static uint64_t gmc_v9_0_emit_flush_gpu_tlb(struct amdgpu_ring *ring,
>               amdgpu_ring_emit_wreg(ring, hub->vm_inv_eng0_sem +
>                                     hub->eng_distance * eng, 0);
>  
> +     spin_unlock(&adev->gmc.invalidate_lock);
>       return pd_addr;
>  }
>  