Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-30 Thread Christian König
On 25.03.20 at 12:03, Nirmoy wrote: On 3/25/20 10:23 AM, Pan, Xinhui wrote: On 2020-03-25 at 15:48, Koenig, Christian wrote: On 25.03.20 at 06:47, xinhui pan wrote: Hit panic during GPU recovery test. drm_sched_entity_select_rq might set rq to NULL. So add a check like drm_sched_job_init does.

Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Pan, Xinhui
Well, submitting a job with the HW disabled should do no harm. The only concern is that we might use up IBs if we park the scheduler thread during recovery. I have seen recovery get stuck in the sa new function. The ring test allocates IBs to test whether recovery succeeded or not. But if there are not enough IBs it will wait for fences

Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Pan, Xinhui
From: Koenig, Christian Sent: Wednesday, March 25, 2020 7:13:13 PM To: Das, Nirmoy Cc: Pan, Xinhui; amd-gfx@lists.freedesktop.org; Deucher, Alexander; Kuehling, Felix Subject: Re: [PATCH] drm/amdgpu: Check entity rq Hi guys, thanks for pointing this out Nirmoy. Yeah, could be that I

Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Koenig, Christian
Hi guys, thanks for pointing this out Nirmoy. Yeah, could be that I forgot to commit the patch. Currently I don't know at which end of the chaos I should start to clean up. Christian. On 25.03.2020 at 12:09, "Das, Nirmoy" wrote: Hi Xinhui, Can you please check if you can reproduce the crash w

Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Nirmoy
Hi Xinhui, Can you please check if you can reproduce the crash with https://lists.freedesktop.org/archives/amd-gfx/2020-February/046414.html Christian fixed it earlier, I think he forgot to push it. Regards, Nirmoy On 3/25/20 12:07 PM, xinhui pan wrote: gpu recover will call sdma suspend/r

Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Nirmoy
On 3/25/20 10:23 AM, Pan, Xinhui wrote: On 2020-03-25 at 15:48, Koenig, Christian wrote: On 25.03.20 at 06:47, xinhui pan wrote: Hit panic during GPU recovery test. drm_sched_entity_select_rq might set rq to NULL. So add a check like drm_sched_job_init does. NAK, the rq should never be set to NULL

Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Pan, Xinhui
> On 2020-03-25 at 17:23, Pan, Xinhui wrote: > > > >> On 2020-03-25 at 15:48, Koenig, Christian wrote: >> >> On 25.03.20 at 06:47, xinhui pan wrote: >>> Hit panic during GPU recovery test. drm_sched_entity_select_rq might >>> set rq to NULL. So add a check like drm_sched_job_init does. >> >> NAK, the rq shou

Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Pan, Xinhui
> On 2020-03-25 at 15:48, Koenig, Christian wrote: > > On 25.03.20 at 06:47, xinhui pan wrote: >> Hit panic during GPU recovery test. drm_sched_entity_select_rq might >> set rq to NULL. So add a check like drm_sched_job_init does. > > NAK, the rq should never be set to NULL in the first place. > > How

Re: [PATCH] drm/amdgpu: Check entity rq

2020-03-25 Thread Christian König
On 25.03.20 at 06:47, xinhui pan wrote: Hit panic during GPU recovery test. drm_sched_entity_select_rq might set rq to NULL. So add a check like drm_sched_job_init does. NAK, the rq should never be set to NULL in the first place. How did that happen? Regards, Christian. Cc: Christian Kö
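
For readers following the thread, the kind of guard the patch description refers to can be sketched in plain user-space C. This is only an illustration of the pattern (select a run queue, then bail out if none was chosen). Every name below (mock_rq, mock_entity, mock_entity_select_rq, mock_job_init) is a hypothetical stand-in, not the kernel's code, and the -ENOENT return value is an assumption about what such a check would return. Note that the check was NAKed in this thread on the grounds that rq should never become NULL in the first place.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the scheduler types; NOT the kernel definitions. */
struct mock_rq {
	int id;
};

struct mock_entity {
	struct mock_rq *rq;        /* currently selected run queue, may be NULL */
	struct mock_rq *candidate; /* what the next selection step would pick */
};

/* Mimics a selection step that can leave entity->rq set to NULL. */
static void mock_entity_select_rq(struct mock_entity *entity)
{
	entity->rq = entity->candidate;
}

/*
 * Mimics a job-init path that re-selects the run queue and refuses to
 * continue when none is available, instead of dereferencing NULL later.
 * Returning -ENOENT here is an assumption made for this sketch.
 */
static int mock_job_init(struct mock_entity *entity)
{
	mock_entity_select_rq(entity);
	if (!entity->rq)
		return -ENOENT;
	return 0;
}
```

A caller would treat a negative return as "no run queue, do not submit". Whether to guard at the call site or to guarantee a non-NULL rq upstream is exactly the design question the replies debate.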