On 25.03.20 at 12:03, Nirmoy wrote:
On 3/25/20 10:23 AM, Pan, Xinhui wrote:
On 25 March 2020 at 15:48, Koenig, Christian wrote:
On 25.03.20 at 06:47, xinhui pan wrote:
Hit a panic during the GPU recovery test. drm_sched_entity_select_rq might
set rq to NULL, so add a check like drm_sched_job_init does.
Well, submitting a job with the HW disabled should do no harm.
The only concern is that we might use up the IBs if we park the scheduler
thread during recovery.
I have seen recovery get stuck in the sa new function.
The ring test allocates IBs to check whether recovery succeeded. But if there
are not enough IBs, it will wait on fences.
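For readers following the thread, the guard proposed in the patch description can be sketched roughly as below. This is only a simplified user-space model, not the kernel code: every `toy_*` name is hypothetical, the selection logic is a stand-in, and returning `-ENOENT` is an assumption about how such a check would report the missing run queue. It does not address the objection raised in review that rq should never become NULL in the first place.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the scheduler types. */
struct toy_rq {
	int id;
};

struct toy_entity {
	struct toy_rq *rq;        /* currently selected run queue, may be NULL */
	struct toy_rq **rq_list;  /* candidate run queues */
	size_t num_rqs;           /* number of usable candidates */
};

struct toy_job {
	int rq_id;                /* run queue the job was bound to */
};

/* Models a selection step that can leave entity->rq NULL
 * when no run queue is usable, e.g. while recovery is running. */
static void toy_entity_select_rq(struct toy_entity *entity)
{
	entity->rq = entity->num_rqs ? entity->rq_list[0] : NULL;
}

/* Models a submit path with the proposed guard: re-select the run
 * queue, then refuse the job instead of dereferencing a NULL rq. */
static int toy_job_submit(struct toy_entity *entity, struct toy_job *job)
{
	toy_entity_select_rq(entity);
	if (!entity->rq)
		return -ENOENT;   /* assumed error code for "no run queue" */

	job->rq_id = entity->rq->id;  /* safe: rq was checked above */
	return 0;
}
```

In this model the caller sees an error instead of a NULL dereference; whether a check like this belongs in the submit path at all is exactly what the thread is debating.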
From: Koenig, Christian
Sent: Wednesday, March 25, 2020 7:13:13 PM
To: Das, Nirmoy
Cc: Pan, Xinhui; amd-gfx@lists.freedesktop.org; Deucher, Alexander; Kuehling, Felix
Subject: Re: [PATCH] drm/amdgpu: Check entity rq
Hi guys,
thanks for pointing this out Nirmoy.
Yeah, could be that I forgot to commit the patch. Currently I don't know at
which end of the chaos I should start to clean up.
Christian.
On 25.03.2020 at 12:09, "Das, Nirmoy" wrote:
Hi Xinhui,
Can you please check if you can reproduce the crash with
https://lists.freedesktop.org/archives/amd-gfx/2020-February/046414.html
Christian fixed it earlier; I think he forgot to push it.
Regards,
Nirmoy
On 3/25/20 12:07 PM, xinhui pan wrote:
gpu recover will call sdma suspend/r
On 25.03.20 at 06:47, xinhui pan wrote:
Hit a panic during the GPU recovery test. drm_sched_entity_select_rq might
set rq to NULL, so add a check like drm_sched_job_init does.
NAK, the rq should never be set to NULL in the first place.
How did that happen?
Regards,
Christian.
Cc: Christian Kö