Re: [PATCH] SWDEV-206718 drm/amdgpu: Fix tdr3 could hang with slow compute issue
Acked-by: Alex Deucher

From: amd-gfx on behalf of Deng, Emily
Sent: Saturday, October 12, 2019 1:36 AM
To: Deng, Emily; amd-gfx@lists.freedesktop.org
Subject: RE: [PATCH] SWDEV-206718 drm/amdgpu: Fix tdr3 could hang with slow compute issue

Ping

Best wishes
Emily Deng

_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
RE: [PATCH] SWDEV-206718 drm/amdgpu: Fix tdr3 could hang with slow compute issue
Ping

Best wishes
Emily Deng

>-----Original Message-----
>From: Emily Deng
>Sent: Wednesday, October 9, 2019 6:52 PM
>To: amd-gfx@lists.freedesktop.org
>Cc: Deng, Emily
>Subject: [PATCH] SWDEV-206718 drm/amdgpu: Fix tdr3 could hang with slow
>compute issue
>
>When index is 1, need to set compute ring timeout for sriov and passthrough.
>
>Signed-off-by: Emily Deng
[PATCH] SWDEV-206718 drm/amdgpu: Fix tdr3 could hang with slow compute issue
When index is 1, need to set compute ring timeout for sriov and passthrough.

Signed-off-by: Emily Deng
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 5 ++++-
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c    | 6 ++++--
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index 53ce227..2f5a015 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -2664,8 +2664,11 @@ static int amdgpu_device_get_job_timeout_settings(struct amdgpu_device *adev)
 		 * There is only one value specified and
 		 * it should apply to all non-compute jobs.
 		 */
-		if (index == 1)
+		if (index == 1) {
 			adev->sdma_timeout = adev->video_timeout = adev->gfx_timeout;
+			if (amdgpu_sriov_vf(adev) || amdgpu_passthrough(adev))
+				adev->compute_timeout = adev->gfx_timeout;
+		}
 	}
 
 	return ret;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index a88ea74..311abc8 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -250,9 +250,11 @@ module_param_named(msi, amdgpu_msi, int, 0444);
  * By default(with no lockup_timeout settings), the timeout for all non-compute(GFX, SDMA and Video)
  * jobs is 10000. And there is no timeout enforced on compute jobs.
  */
-MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default: 10000 for non-compute jobs and infinity timeout for compute jobs."
+MODULE_PARM_DESC(lockup_timeout, "GPU lockup timeout in ms (default: for bare metal 10000 for non-compute jobs and infinity timeout for compute jobs; "
+		"for passthrough or sriov, 10000 for all jobs."
 		" 0: keep default value. negative: infinity timeout), "
-		"format is [Non-Compute] or [GFX,Compute,SDMA,Video]");
+		"format: for bare metal [Non-Compute] or [GFX,Compute,SDMA,Video]; "
+		"for passthrough or sriov [all jobs] or [GFX,Compute,SDMA,Video].");
 module_param_string(lockup_timeout, amdgpu_lockup_timeout, sizeof(amdgpu_lockup_timeout), 0444);
 
 /**
-- 
2.7.4
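
For readers following the thread, the rule the patch introduces can be sketched in userspace C. This is only an illustration under assumptions, not the driver code: struct timeouts, parse_lockup_timeout(), TIMEOUT_DEFAULT_MS, TIMEOUT_INFINITE and the is_virtualized flag are hypothetical names. The flag stands in for amdgpu_sriov_vf(adev) || amdgpu_passthrough(adev), and the starting defaults follow the driver comment quoted in the diff (10000 ms for non-compute jobs, no timeout for compute). How the driver picks defaults for sriov or passthrough when no parameter is given is not shown in the diff and is not modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define TIMEOUT_DEFAULT_MS 10000L   /* non-compute default from the driver comment */
#define TIMEOUT_INFINITE   (-1L)    /* hypothetical marker for "no timeout" */

/* Hypothetical stand-in for the four timeout fields of struct amdgpu_device. */
struct timeouts {
	long gfx;
	long compute;
	long sdma;
	long video;
};

/*
 * Parse "N" or "GFX,Compute,SDMA,Video" (values in ms).
 * 0 keeps the default, a negative value means no timeout.
 * Returns 0 on success, -1 on malformed input.
 */
int parse_lockup_timeout(const char *spec, bool is_virtualized,
			 struct timeouts *t)
{
	const char *cursor = spec;
	char *end;
	int index = 0;

	assert(spec != NULL && t != NULL);

	t->gfx = t->sdma = t->video = TIMEOUT_DEFAULT_MS;
	t->compute = TIMEOUT_INFINITE;

	while (*cursor != '\0' && index < 4) {
		long value = strtol(cursor, &end, 10);

		if (end == cursor)
			return -1;              /* no digits found */
		if (value < 0)
			value = TIMEOUT_INFINITE;

		if (value != 0) {               /* 0: keep the default */
			switch (index) {
			case 0: t->gfx = value; break;
			case 1: t->compute = value; break;
			case 2: t->sdma = value; break;
			case 3: t->video = value; break;
			}
		}
		index++;

		if (*end == ',')
			cursor = end + 1;
		else if (*end == '\0')
			cursor = end;
		else
			return -1;              /* unexpected separator */
	}

	/* A single value applies to all non-compute jobs ... */
	if (index == 1) {
		t->sdma = t->video = t->gfx;
		/* ... and, with this patch, to compute as well under sriov/passthrough. */
		if (is_virtualized)
			t->compute = t->gfx;
	}
	return 0;
}
```

With this sketch, parsing "5000" with is_virtualized set to false leaves compute without a timeout, while the same string with is_virtualized set to true gives compute the 5000 ms value too. A four-value string is applied field by field in both cases.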