On 1/24/2024 2:28 PM, Le Ma wrote:
> This patch eliminates the interrupt warning below:
> 
>   "[drm] Fence fallback timer expired on ring sdma0.0".
> 
> An early VM page table (PT) clearing job is sent to SDMA before interrupts
> are enabled. This was introduced by the patch below:
> 
>   - drm/amdkfd: Export DMABufs from KFD using GEM handles

I think the fix belongs in the patch above. In this flow, the client is
initialized before the DRM device is even registered.

Thanks,
Lijo

> Signed-off-by: Le Ma <[email protected]>
> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 56d9dfa61290..c8aa07282366 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -2833,12 +2833,6 @@ static int amdgpu_device_ip_init(struct amdgpu_device *adev)
>       if (r)
>               goto init_failed;
>  
> -     /* Don't init kfd if whole hive need to be reset during init */
> -     if (!adev->gmc.xgmi.pending_reset) {
> -             kgd2kfd_init_zone_device(adev);
> -             amdgpu_amdkfd_device_init(adev);
> -     }
> -
>       amdgpu_fru_get_product_info(adev);
>  
>  init_failed:
> @@ -4204,6 +4198,12 @@ int amdgpu_device_init(struct amdgpu_device *adev,
>  
>       amdgpu_fence_driver_hw_init(adev);
>  
> +     /* Don't init kfd if whole hive need to be reset during init */
> +     if (!adev->gmc.xgmi.pending_reset) {
> +             kgd2kfd_init_zone_device(adev);
> +             amdgpu_amdkfd_device_init(adev);
> +     }
> +
>       dev_info(adev->dev,
>               "SE %d, SH per SE %d, CU per SH %d, active_cu_number %d\n",
>                       adev->gfx.config.max_shader_engines,
