Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
On 2021-09-01 12:40 a.m., Jingwen Chen wrote: On Wed Sep 01, 2021 at 12:28:59AM -0400, Andrey Grodzovsky wrote: On 2021-09-01 12:25 a.m., Jingwen Chen wrote: On Wed Sep 01, 2021 at 12:04:47AM -0400, Andrey Grodzovsky wrote: I will answer everything here - On 2021-08-31 9:58 p.m., Liu, Monk wrote: [AMD Official Use Only] In the previous discussion, you guys stated that we should drop the “kthread_should_park” in cleanup_job. @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; - /* -* Don't destroy jobs while the timeout worker is running OR thread -* is being parked and hence assumed to not touch pending_list -*/ - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && - !cancel_delayed_work(>work_tdr)) || - kthread_should_park()) - return NULL; But I suddenly have a question here: if return the timedout job no matter kthread_should_park() or not, then we are backing to the original problem again: that the timedout_job is suddenly signaling and cleanup_job still returns it to sched_main and job is freed while it is still handling by vendor’s timeout callback If we return NULL when kthread_should_park() in cleanup_job, we can prevent above scenario from happening: once a job is processed by job_timedout we can stop its scheduler, and after that even this job suddenly signaled the cleanup_job won’t return it so sched_main won’t free it in parallel … What do you think ? Is your analysis above takes into account that you also submit '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' then I don't see a problem - Hi Andrey, Monk has talked to me and we agreed that as there're multiple opinions about the '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' and patch 1 is an independent patch to fix some error. So we should not take the patch 2 into analysis. I think that as long as you put kthread_park(sched->thread) BEFORE fetching next bad job from pending list (under spinlock) there is no such issue as in the case you describe because this potential bad job that became signaled will be removed from pending list before you even fetch the next job and by the time you fetch it the scheduler thread is already stopped anyway If you don't submit and we keep the removal hack for now then also no problem because we temporary remove the job we fetch for TDR from pending list under spinlock exactly to avoid this race So can you help review [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)? patch v3 keeps this kthread_should_park check. But since in both cases looks like there is no danger of use after free then I see no reason to keep kthread_should_park check. Andrey OK, I get it. So patch v4 has removed this check, can you help review [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)? Sure Andrey Best Regards, JingWen Thanks -- Monk Liu | Cloud-GPU Core team -- From: Liu, Monk Sent: Wednesday, September 1, 2021 9:23 AM To: Koenig, Christian ; Grodzovsky, Andrey ; Daniel Vetter ; Chen, JingWen Cc: DRI Development ; amd-gfx@lists.freedesktop.org Subject: [diagnostic TDR mode patches] unify our solution opinions/ suggestions in one thread [AMD Official Use Only] Hi Daniel/Christian/Andrey It looks the voice from you three are spread over those email floods to me, the feature we are working on (diagnostic TDR scheme) is pending there for more than 6 month (we started it from feb 2021). Honestly speaking the email ways that we are using now is not friendly and quite painful to me …. Can we try to put all our opinions, suggestions, or even objects here together, let’s go through them one by one, it’s too hard for us to reply each email on different questions . For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4) This is a fixing patch on the timeout timer in scheduler, can we complete this one first ? it should already resolved all the questions and suggestions. I have no objections for this one besides getting rid of the kthread_should_park()) return null part, if my answer above is not wrong then it seems superfluous to me For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler I think I already explained the questions raised by Daniel in other thread , regarding why I use __kthread_should_park() Is this race free ? Can't the other thread execute kthread_park after the check ? For other aspects, can we put all our opinion
Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
On Wed Sep 01, 2021 at 12:28:59AM -0400, Andrey Grodzovsky wrote: > > On 2021-09-01 12:25 a.m., Jingwen Chen wrote: > > On Wed Sep 01, 2021 at 12:04:47AM -0400, Andrey Grodzovsky wrote: > > > I will answer everything here - > > > > > > On 2021-08-31 9:58 p.m., Liu, Monk wrote: > > > > > > > > > [AMD Official Use Only] > > > > > > > > > In the previous discussion, you guys stated that we should drop the > > > “kthread_should_park” in cleanup_job. > > > > > > > > > @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct > > > drm_gpu_scheduler > > > *sched) > > > > > > { > > > > > > struct drm_sched_job *job, *next; > > > > > > > > > - /* > > > > > > -* Don't destroy jobs while the timeout worker is running OR > > > thread > > > > > > -* is being parked and hence assumed to not touch > > > pending_list > > > > > > -*/ > > > > > > - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && > > > > > > - !cancel_delayed_work(>work_tdr)) || > > > > > > - kthread_should_park()) > > > > > > - return NULL; > > > > > > > > > But I suddenly have a question here: if return the timedout job no > > > matter > > > kthread_should_park() or not, then we are backing to the original > > > problem > > > again: that the timedout_job is suddenly signaling and cleanup_job > > > still > > > returns it to sched_main and job is freed while it is still handling > > > by > > > vendor’s timeout callback > > > > > > > > > If we return NULL when kthread_should_park() in cleanup_job, we can > > > prevent > > > above scenario from happening: once a job is processed by > > > job_timedout we > > > can stop its scheduler, and after that even this job suddenly > > > signaled the > > > cleanup_job won’t return it so sched_main won’t free it in parallel … > > > > > > > > > What do you think ? > > > > > > > > > Is your analysis above takes into account that you also submit > > > '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' then I don't > > > see a > > > problem - > > Hi Andrey, > > Monk has talked to me and we agreed that as there're multiple opinions > > about the > > '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' and patch > > 1 is an independent patch to fix some error. So we should not take the > > patch 2 into > > analysis. > > > > > I think that as long as you put kthread_park(sched->thread) BEFORE > > > fetching next bad job from pending list (under spinlock) there is no > > > such issue as in the case you describe because this potential bad job > > > that became signaled will be removed from pending list before you > > > even fetch the next job and by the time you fetch it the scheduler > > > thread is already stopped anyway > > > > > > If you don't submit and we keep the removal hack for now then also no > > > problem > > > because > > > we temporary remove the job we fetch for TDR from pending list under > > > spinlock > > > exactly to avoid this race > > > > > So can you help review [PATCH 1/2] drm/sched: fix the bug of time out > > calculation(v3)? > > patch v3 keeps this kthread_should_park check. > > > But since in both cases looks like there is no danger of use after free > then I see no reason to keep kthread_should_park check. > > Andrey OK, I get it. So patch v4 has removed this check, can you help review [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)? > > > > > > Best Regards, > > JingWen > > > > > > Thanks > > > > > > > > > -- > > > > > > Monk Liu | Cloud-GPU Core team > > > > > > -- > > > > > > > > > From: Liu, Monk > > > Sent: Wednesday, September 1, 2021 9:23 AM > > > To: Koenig, Christian ; Grodzovsky, Andrey > > > ; Daniel Vetter ; Chen, > > > JingWen > > > > > > Cc: DRI Development ; > > > amd-gfx@lists.freedesktop.org > > > Subject: [diagnostic TDR mode patches] unify our solution opinions/ > > > suggestions in one thread > > > > > > > > > [AMD Official Use Only] > > > > > > > > > Hi Daniel/Christian/Andrey > > > > > > > > > It looks the voice from you three are spread over those email floods > > > to me, > > > the feature we are working on (diagnostic TDR scheme) is pending > > > there for > > > more than 6 month (we started it from feb 2021). > > > > > > > > > Honestly speaking the email ways that we are using now is not > > > friendly and > > > quite painful to me …. > > > > > > Can we try to put all our opinions, suggestions, or even objects here > > > together, let’s go through them one by one, it’s too hard for us to > > > reply > > > each email on different questions . > > > > > > > > > For [PATCH 1/2]
Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
On 2021-09-01 12:25 a.m., Jingwen Chen wrote: On Wed Sep 01, 2021 at 12:04:47AM -0400, Andrey Grodzovsky wrote: I will answer everything here - On 2021-08-31 9:58 p.m., Liu, Monk wrote: [AMD Official Use Only] In the previous discussion, you guys stated that we should drop the “kthread_should_park” in cleanup_job. @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; - /* -* Don't destroy jobs while the timeout worker is running OR thread -* is being parked and hence assumed to not touch pending_list -*/ - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && - !cancel_delayed_work(>work_tdr)) || - kthread_should_park()) - return NULL; But I suddenly have a question here: if return the timedout job no matter kthread_should_park() or not, then we are backing to the original problem again: that the timedout_job is suddenly signaling and cleanup_job still returns it to sched_main and job is freed while it is still handling by vendor’s timeout callback If we return NULL when kthread_should_park() in cleanup_job, we can prevent above scenario from happening: once a job is processed by job_timedout we can stop its scheduler, and after that even this job suddenly signaled the cleanup_job won’t return it so sched_main won’t free it in parallel … What do you think ? Is your analysis above takes into account that you also submit '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' then I don't see a problem - Hi Andrey, Monk has talked to me and we agreed that as there're multiple opinions about the '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' and patch 1 is an independent patch to fix some error. So we should not take the patch 2 into analysis. I think that as long as you put kthread_park(sched->thread) BEFORE fetching next bad job from pending list (under spinlock) there is no such issue as in the case you describe because this potential bad job that became signaled will be removed from pending list before you even fetch the next job and by the time you fetch it the scheduler thread is already stopped anyway If you don't submit and we keep the removal hack for now then also no problem because we temporary remove the job we fetch for TDR from pending list under spinlock exactly to avoid this race So can you help review [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)? patch v3 keeps this kthread_should_park check. But since in both cases looks like there is no danger of use after free then I see no reason to keep kthread_should_park check. Andrey Best Regards, JingWen Thanks -- Monk Liu | Cloud-GPU Core team -- From: Liu, Monk Sent: Wednesday, September 1, 2021 9:23 AM To: Koenig, Christian ; Grodzovsky, Andrey ; Daniel Vetter ; Chen, JingWen Cc: DRI Development ; amd-gfx@lists.freedesktop.org Subject: [diagnostic TDR mode patches] unify our solution opinions/ suggestions in one thread [AMD Official Use Only] Hi Daniel/Christian/Andrey It looks the voice from you three are spread over those email floods to me, the feature we are working on (diagnostic TDR scheme) is pending there for more than 6 month (we started it from feb 2021). Honestly speaking the email ways that we are using now is not friendly and quite painful to me …. Can we try to put all our opinions, suggestions, or even objects here together, let’s go through them one by one, it’s too hard for us to reply each email on different questions . For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4) This is a fixing patch on the timeout timer in scheduler, can we complete this one first ? it should already resolved all the questions and suggestions. I have no objections for this one besides getting rid of the kthread_should_park()) return null part, if my answer above is not wrong then it seems superfluous to me For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler I think I already explained the questions raised by Daniel in other thread , regarding why I use __kthread_should_park() Is this race free ? Can't the other thread execute kthread_park after the check ? For other aspects, can we put all our opinion synthesized here ? So to summarize from previous threads I think that the best solution to the problem being solved in this patch is if we do HW fence embedding at the drm_sched_job level instead
Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
On Wed Sep 01, 2021 at 12:04:47AM -0400, Andrey Grodzovsky wrote: > I will answer everything here - > > On 2021-08-31 9:58 p.m., Liu, Monk wrote: > > > [AMD Official Use Only] > > > > In the previous discussion, you guys stated that we should drop the > “kthread_should_park” in cleanup_job. > > > > @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler > *sched) > > { > > struct drm_sched_job *job, *next; > > > > - /* > > -* Don't destroy jobs while the timeout worker is running OR > thread > > -* is being parked and hence assumed to not touch pending_list > > -*/ > > - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && > > - !cancel_delayed_work(>work_tdr)) || > > - kthread_should_park()) > > - return NULL; > > > > But I suddenly have a question here: if return the timedout job no matter > kthread_should_park() or not, then we are backing to the original problem > again: that the timedout_job is suddenly signaling and cleanup_job still > returns it to sched_main and job is freed while it is still handling by > vendor’s timeout callback > > > > If we return NULL when kthread_should_park() in cleanup_job, we can > prevent > above scenario from happening: once a job is processed by job_timedout we > can stop its scheduler, and after that even this job suddenly signaled the > cleanup_job won’t return it so sched_main won’t free it in parallel … > > > > What do you think ? > > > Is your analysis above takes into account that you also submit > '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' then I don't see > a > problem - Hi Andrey, Monk has talked to me and we agreed that as there're multiple opinions about the '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' and patch 1 is an independent patch to fix some error. So we should not take the patch 2 into analysis. > I think that as long as you put kthread_park(sched->thread) BEFORE > fetching next bad job from pending list (under spinlock) there is no > such issue as in the case you describe because this potential bad job > that became signaled will be removed from pending list before you > even fetch the next job and by the time you fetch it the scheduler > thread is already stopped anyway > > If you don't submit and we keep the removal hack for now then also no problem > because > we temporary remove the job we fetch for TDR from pending list under spinlock > exactly to avoid this race > So can you help review [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)? patch v3 keeps this kthread_should_park check. Best Regards, JingWen > > > Thanks > > > > -- > > Monk Liu | Cloud-GPU Core team > > -- > > > > From: Liu, Monk > Sent: Wednesday, September 1, 2021 9:23 AM > To: Koenig, Christian ; Grodzovsky, Andrey > ; Daniel Vetter ; Chen, > JingWen > > Cc: DRI Development ; > amd-gfx@lists.freedesktop.org > Subject: [diagnostic TDR mode patches] unify our solution opinions/ > suggestions in one thread > > > > [AMD Official Use Only] > > > > Hi Daniel/Christian/Andrey > > > > It looks the voice from you three are spread over those email floods to > me, > the feature we are working on (diagnostic TDR scheme) is pending there for > more than 6 month (we started it from feb 2021). > > > > Honestly speaking the email ways that we are using now is not friendly and > quite painful to me …. > > Can we try to put all our opinions, suggestions, or even objects here > together, let’s go through them one by one, it’s too hard for us to reply > each email on different questions . > > > > For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4) > > > > This is a fixing patch on the timeout timer in scheduler, can we complete > this one first ? it should already resolved all the questions and > suggestions. > > > I have no objections for this one besides getting rid of the > kthread_should_park()) return null part, > if my answer above is not wrong then it seems superfluous to me > > > > > For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler > > > > I think I already explained the questions raised by Daniel in other thread > , regarding why I use __kthread_should_park() > > > Is this race free ? Can't the other thread execute kthread_park after the > check > ? > > > For other aspects, can we put all our opinion synthesized here ? > > > So to summarize from previous threads I think that the best solution > to the problem being solved in this patch is if we do HW fence
Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
I will answer everything here - On 2021-08-31 9:58 p.m., Liu, Monk wrote: [AMD Official Use Only] In the previous discussion, you guys stated that we should drop the “kthread_should_park” in cleanup_job. @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; - /* - * Don't destroy jobs while the timeout worker is running OR thread - * is being parked and hence assumed to not touch pending_list - */ - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && - !cancel_delayed_work(>work_tdr)) || - kthread_should_park()) - return NULL; But I suddenly have a question here: if return the timedout job no matter kthread_should_park() or not, then we are backing to the original problem again: that the timedout_job is suddenly signaling and cleanup_job still returns it to sched_main and job is freed while it is still handling by vendor’s timeout callback If we return NULL when kthread_should_park() in cleanup_job, we can prevent above scenario from happening: once a job is processed by job_timedout we can stop its scheduler, and after that even this job suddenly signaled the cleanup_job won’t return it so sched_main won’t free it in parallel … What do you think ? Is your analysis above takes into account that you also submit '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' then I don't see a problem - I think that as long as you put kthread_park(sched->thread) BEFORE fetching next bad job from pending list (under spinlock) there is no such issue as in the case you describe because this potential bad job that became signaled will be removed from pending list before you even fetch the next job and by the time you fetch it the scheduler thread is already stopped anyway If you don't submit and we keep the removal hack for now then also no problem because we temporary remove the job we fetch for TDR from pending list under spinlock exactly to avoid this race Thanks -- Monk Liu | Cloud-GPU Core team -- *From:* Liu, Monk *Sent:* Wednesday, September 1, 2021 9:23 AM *To:* Koenig, Christian ; Grodzovsky, Andrey ; Daniel Vetter ; Chen, JingWen *Cc:* DRI Development ; amd-gfx@lists.freedesktop.org *Subject:* [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread [AMD Official Use Only] Hi Daniel/Christian/Andrey It looks the voice from you three are spread over those email floods to me, the feature we are working on (diagnostic TDR scheme) is pending there for more than 6 month (we started it from feb 2021). Honestly speaking the email ways that we are using now is not friendly and quite painful to me …. Can we try to put all our opinions, suggestions, or even objects here together, let’s go through them one by one, it’s too hard for us to reply each email on different questions . For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4) This is a fixing patch on the timeout timer in scheduler, can we complete this one first ? it should already resolved all the questions and suggestions. I have no objections for this one besides getting rid of the kthread_should_park()) return null part, if my answer above is not wrong then it seems superfluous to me For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler I think I already explained the questions raised by Daniel in other thread , regarding why I use __kthread_should_park() Is this race free ? Can't the other thread execute kthread_park after the check ? For other aspects, can we put all our opinion synthesized here ? So to summarize from previous threads I think that the best solution to the problem being solved in this patch is if we do HW fence embedding at the drm_sched_job level instead of doing it only for amdgpu, and modifying all the drivers to support this we can both remove this hack and solve the race against concurrent drm_sched_cleanup_jobs job freeing just by taking reference to the hw fence of the job at the beginning of drm_sched_job_timedout If doing this refactoring for all the drivers is not an option now and you need a quick solution such as the serialization you do here then take into account other concurrent TDR handlers that might run, as mentioned before, without serializing against other TDR handlers from other schedulers you just race here against them, e.g. you parked it now but another one in progress will unpark it as part of calling drm_sched_start for other rings. So you either need a global lock or dedicated single threaded queue as Daniel suggested. At minimum we should change cancel_delayed_work in drm_sched_stop to cancel_delayed_work_sync to cancel and flush all concurrent TDRs and probably move it to the begining of the function, after kthread_park and before we start playing
[PATCH] drm/amdkfd: drop process ref count when xnack disable
During svm restore pages interrupt handler, kfd_process ref count was never dropped when xnack was disabled. Therefore, the object was never released. Signed-off-by: Alex Sierra --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 8f9b5b53dab5..110c46cd7fac 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -2484,7 +2484,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, unsigned int pasid, } if (!p->xnack_enabled) { pr_debug("XNACK not enabled for pasid 0x%x\n", pasid); - return -EFAULT; + r = -EFAULT; + goto out; } svms = >svms; -- 2.32.0
RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
[AMD Official Use Only] In the previous discussion, you guys stated that we should drop the "kthread_should_park" in cleanup_job. @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; - /* -* Don't destroy jobs while the timeout worker is running OR thread -* is being parked and hence assumed to not touch pending_list -*/ - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && - !cancel_delayed_work(>work_tdr)) || - kthread_should_park()) - return NULL; But I suddenly have a question here: if return the timedout job no matter kthread_should_park() or not, then we are backing to the original problem again: that the timedout_job is suddenly signaling and cleanup_job still returns it to sched_main and job is freed while it is still handling by vendor's timeout callback If we return NULL when kthread_should_park() in cleanup_job, we can prevent above scenario from happening: once a job is processed by job_timedout we can stop its scheduler, and after that even this job suddenly signaled the cleanup_job won't return it so sched_main won't free it in parallel ... What do you think ? Thanks -- Monk Liu | Cloud-GPU Core team -- From: Liu, Monk Sent: Wednesday, September 1, 2021 9:23 AM To: Koenig, Christian ; Grodzovsky, Andrey ; Daniel Vetter ; Chen, JingWen Cc: DRI Development ; amd-gfx@lists.freedesktop.org Subject: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread [AMD Official Use Only] Hi Daniel/Christian/Andrey It looks the voice from you three are spread over those email floods to me, the feature we are working on (diagnostic TDR scheme) is pending there for more than 6 month (we started it from feb 2021). Honestly speaking the email ways that we are using now is not friendly and quite painful to me Can we try to put all our opinions, suggestions, or even objects here together, let's go through them one by one, it's too hard for us to reply each email on different questions . For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4) This is a fixing patch on the timeout timer in scheduler, can we complete this one first ? it should already resolved all the questions and suggestions. For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler I think I already explained the questions raised by Daniel in other thread , regarding why I use __kthread_should_park() For other aspects, can we put all our opinion synthesized here ? Thanks ! -- Monk Liu | Cloud-GPU Core team --
RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)
[AMD Official Use Only] That' really matter in practice, when two jobs from different process scheduled to the ring close to each other, if we don't discriminate A from B then B will be considered a bad job due to A's timeout, which will force B's process to exit (e.g.: X server) Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Tuesday, August 31, 2021 9:09 PM To: Koenig, Christian Cc: Grodzovsky, Andrey ; Christian König ; Liu, Monk ; amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3) On Fri, Aug 27, 2021 at 08:30:32PM +0200, Christian König wrote: > Yeah, that's what I meant with that the start of processing a job is a > bit swampy defined. > > Jobs overload, but we simply don't have another good indicator that a > job started except that the previous one completed. > > It's still better than starting the timer when pushing the job to the > ring buffer, because that is completely off. Not sure this matters that much really in practice, and in some cases we might want to actually just reset it all if the built up backlog is way too long. For anything that really runs way too long you want preempt-ctx/revoke fences anyway, not end-of-cs fences with tdr. -Daniel > > Christian. > > Am 27.08.21 um 20:22 schrieb Andrey Grodzovsky: > > As I mentioned to Monk before - what about cases such as in this > > test - > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgi > > tlab.freedesktop.org%2Fmesa%2Fdrm%2F-%2Fcommit%2Fbc21168fa924d3fc4a0 > > 00492e861f50a1a135b25data=04%7C01%7CMonk.Liu%40amd.com%7Cbd1847 > > 4429e34f8eaac208d96c80710e%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C > > 0%7C637660121179715855%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiL > > CJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=1WTD3 > > opiBtT29bbbqJub5nfhWgX5vMNppiFKgWDe%2FoQ%3Dreserved=0 > > > > Here you don't have serialized sequence where when jobs finishes > > processing and second starts, they execute together concurrently - > > for those cases seems to me restarting the timer for the second job > > from scratch will let it hang much longer then allowed by TO value. > > > > Andrey > > > > On 2021-08-27 10:29 a.m., Christian König wrote: > > > I don't think that makes sense. > > > > > > See we don't want to start the time when the job is inserted into > > > the ring buffer, but rather when it starts processing. > > > > > > Starting processing is a bit swampy defined, but just starting the > > > timer when the previous job completes should be fine enough. > > > > > > Christian. > > > > > > Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky: > > > > The TS represents the point in time when the job was inserted > > > > into the pending list. > > > > I don't think it matters when it actually starts to be > > > > processed, what matters is when this job was inserted into > > > > pending list because right at that point you arm the TO timer > > > > (when no other is running already) and so when the previous job > > > > completes and you cancel and rearm again you can use that TS > > > > from the next job in pending list to calculate how much time has > > > > actually left for it to run before TDR must be initiated and not > > > > just give it again full TO value to run even if it has already > > > > been running for a while. > > > > > > > > Also, i am not sure also about the assumption that what we > > > > measure is processing by HW, what we measure is from the moment > > > > it was scheduled to ring to the moment the job completed (EOP > > > > event). At least that what our TDR timer measures and so it > > > > makes sense to set the TS at this point. > > > > > > > > Andrey > > > > > > > > On 2021-08-27 3:20 a.m., Liu, Monk wrote: > > > > > [AMD Official Use Only] > > > > > > > > > > what is that 'ts' representing for ? it looks to me the > > > > > jiffies that it get scheduled to the ring, but a job > > > > > scheduled to the ring doesn't represent it's being processed > > > > > by hw. > > > > > > > > > > Thanks > > > > > > > > > > -- > > > > > Monk Liu | Cloud-GPU Core team > > > > > -- > > > > > > > > > > -Original Message- > > > > > From: Grodzovsky, Andrey > > > > > Sent: Friday, August 27, 2021 4:14 AM > > > > > To: Liu, Monk ; > > > > > amd-gfx@lists.freedesktop.org; Koenig, Christian > > > > > > > > > > Cc: dri-de...@lists.freedesktop.org > > > > > Subject: Re: [PATCH] drm/sched: fix the bug of time out > > > > > calculation(v3) > > > > > > > > > > Attached quick patch for per job TTL calculation to make more > > > > > precises next timer expiration. It's on top of the patch in > > > > > this thread. Let me know if this makes sense. > > > > >
RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
[AMD Official Use Only] Okay, I will reprepare this patch Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Tuesday, August 31, 2021 9:02 PM To: Liu, Monk Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; Chen, Jingwen Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote: > Can we please have some actual commit message here, with detailed > explanation of the race/bug/whatever, how you fix it and why this is > the best option? > > On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote: > > tested-by: jingwen chen > > Signed-off-by: Monk Liu > > Signed-off-by: jingwen chen > > --- > > drivers/gpu/drm/scheduler/sched_main.c | 24 > > > > 1 file changed, 4 insertions(+), 20 deletions(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > b/drivers/gpu/drm/scheduler/sched_main.c > > index ecf8140..894fdb24 100644 > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct > > *work) > > sched = container_of(work, struct drm_gpu_scheduler, > > work_tdr.work); > > > > /* Protects against concurrent deletion in > > drm_sched_get_cleanup_job */ > > + if (!__kthread_should_park(sched->thread)) > > This is a __ function, i.e. considered internal, and it's lockless > atomic, i.e. unordered. And you're not explaining why this works. > > Iow it's probably buggy, and an just unconditionally parking the > kthread is probably the right thing to do. If it's not the right thing > to do, there's a bug here for sure. Also why don't we reuse the function drivers already have to stop a scheduler thread? We seem to have two kthread_park now, that's probably one too much. -Daniel > > + kthread_park(sched->thread); > > + > > spin_lock(>job_list_lock); > > job = list_first_entry_or_null(>pending_list, > >struct drm_sched_job, list); > > > > if (job) { > > - /* > > -* Remove the bad job so it cannot be freed by concurrent > > -* drm_sched_cleanup_jobs. It will be reinserted back after > > sched->thread > > -* is parked at which point it's safe. > > -*/ > > - list_del_init(>list); > > spin_unlock(>job_list_lock); > > > > + /* vendor's timeout_job should call drm_sched_start() */ > > status = job->sched->ops->timedout_job(job); > > > > /* > > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, > > struct drm_sched_job *bad) > > kthread_park(sched->thread); > > > > /* > > -* Reinsert back the bad job here - now it's safe as > > -* drm_sched_get_cleanup_job cannot race against us and release the > > -* bad job at this point - we parked (waited for) any in progress > > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called > > -* now until the scheduler thread is unparked. > > -*/ > > - if (bad && bad->sched == sched) > > - /* > > -* Add at the head of the queue to reflect it was the earliest > > -* job extracted. > > -*/ > > - list_add(>list, >pending_list); > > - > > - /* > > * Iterate the job list from later to earlier one and either deactive > > * their HW callbacks or remove them from pending list if they already > > * signaled. > > -- > > 2.7.4 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog. > ffwll.ch%2Fdata=04%7C01%7CMonk.Liu%40amd.com%7C298815bea18f4fbf76 > b308d96c7f7a8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C6376601170 > 51194614%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiL > CJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=QzgCU7%2BPdA0aWL5%2BJLg > KeKbGaMMGqeGI9KE0P0LXlN4%3Dreserved=0 -- Daniel Vetter Software Engineer, Intel Corporation https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog.ffwll.ch%2Fdata=04%7C01%7CMonk.Liu%40amd.com%7C298815bea18f4fbf76b308d96c7f7a8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660117051194614%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=QzgCU7%2BPdA0aWL5%2BJLgKeKbGaMMGqeGI9KE0P0LXlN4%3Dreserved=0
[diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread
[AMD Official Use Only] Hi Daniel/Christian/Andrey It looks the voice from you three are spread over those email floods to me, the feature we are working on (diagnostic TDR scheme) is pending there for more than 6 month (we started it from feb 2021). Honestly speaking the email ways that we are using now is not friendly and quite painful to me Can we try to put all our opinions, suggestions, or even objects here together, let's go through them one by one, it's too hard for us to reply each email on different questions . For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4) This is a fixing patch on the timeout timer in scheduler, can we complete this one first ? it should already resolved all the questions and suggestions. For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler I think I already explained the questions raised by Daniel in other thread , regarding why I use __kthread_should_park() For other aspects, can we put all our opinion synthesized here ? Thanks ! -- Monk Liu | Cloud-GPU Core team --
RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
[AMD Official Use Only] >> Also why don't we reuse the function drivers already have to stop a >> scheduler thread? We seem to have two kthread_park now, that's probably one >> too much. Are you referring to drm_sched_stop ? That's different, we don't need the logic from it, see that it go through pending list and remove all callbacks , etc... meanwhile vendor's timeout callback will call drm_sched_stop in a proper way, All we want in my patch is to simply park scheduler, Besides, even you call drm_sched_stop in job_timeout you still run into the warning issue I hit. Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Tuesday, August 31, 2021 9:02 PM To: Liu, Monk Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; Chen, Jingwen Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote: > Can we please have some actual commit message here, with detailed > explanation of the race/bug/whatever, how you fix it and why this is > the best option? > > On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote: > > tested-by: jingwen chen > > Signed-off-by: Monk Liu > > Signed-off-by: jingwen chen > > --- > > drivers/gpu/drm/scheduler/sched_main.c | 24 > > > > 1 file changed, 4 insertions(+), 20 deletions(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > b/drivers/gpu/drm/scheduler/sched_main.c > > index ecf8140..894fdb24 100644 > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct > > *work) > > sched = container_of(work, struct drm_gpu_scheduler, > > work_tdr.work); > > > > /* Protects against concurrent deletion in > > drm_sched_get_cleanup_job */ > > + if (!__kthread_should_park(sched->thread)) > > This is a __ function, i.e. considered internal, and it's lockless > atomic, i.e. unordered. And you're not explaining why this works. > > Iow it's probably buggy, and an just unconditionally parking the > kthread is probably the right thing to do. If it's not the right thing > to do, there's a bug here for sure. Also why don't we reuse the function drivers already have to stop a scheduler thread? We seem to have two kthread_park now, that's probably one too much. -Daniel > > + kthread_park(sched->thread); > > + > > spin_lock(>job_list_lock); > > job = list_first_entry_or_null(>pending_list, > >struct drm_sched_job, list); > > > > if (job) { > > - /* > > -* Remove the bad job so it cannot be freed by concurrent > > -* drm_sched_cleanup_jobs. It will be reinserted back after > > sched->thread > > -* is parked at which point it's safe. > > -*/ > > - list_del_init(>list); > > spin_unlock(>job_list_lock); > > > > + /* vendor's timeout_job should call drm_sched_start() */ > > status = job->sched->ops->timedout_job(job); > > > > /* > > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, > > struct drm_sched_job *bad) > > kthread_park(sched->thread); > > > > /* > > -* Reinsert back the bad job here - now it's safe as > > -* drm_sched_get_cleanup_job cannot race against us and release the > > -* bad job at this point - we parked (waited for) any in progress > > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called > > -* now until the scheduler thread is unparked. > > -*/ > > - if (bad && bad->sched == sched) > > - /* > > -* Add at the head of the queue to reflect it was the earliest > > -* job extracted. > > -*/ > > - list_add(>list, >pending_list); > > - > > - /* > > * Iterate the job list from later to earlier one and either deactive > > * their HW callbacks or remove them from pending list if they already > > * signaled. > > -- > > 2.7.4 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Fblog. > ffwll.ch%2Fdata=04%7C01%7CMonk.Liu%40amd.com%7C298815bea18f4fbf76 > b308d96c7f7a8b%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C6376601170 > 51194614%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiL > CJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=QzgCU7%2BPdA0aWL5%2BJLg > KeKbGaMMGqeGI9KE0P0LXlN4%3Dreserved=0 -- Daniel Vetter Software Engineer, Intel Corporation
RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
[AMD Official Use Only] >> This is a __ function, i.e. considered internal, and it's lockless atomic, >> i.e. unordered. And you're not explaining why this works. It's not a traditional habit from what I can see that put explain in code, but we can do that in mails , We want to park the scheduler in job_timeout to serialize the job accessing from both sched and TO handler , but inside vendor's callback timeout_job at least both panfrost and amd They both call drm_sched_stop() on all schedulers. If we unconditionally call "kthread_park" in job_timedout then the bailing job's timedout will try to call "kthread_park" again on its scheduler and introduce "warning" The scenario is : 1,Job1 on sched1 triggers timedout, and sched1 is parked, 2,vendor callback runs, it will usually stop all schedulers. 3,Job2 on sched2 triggers timedout, so the job_timedout also try to park sched2, but sched2 was stopped already by above step. (job2's timeout is introduced by job1, or by other VF) ---So there will be "warning" in kernel log from above step... after this "__kthread_should_park()" here we can avoid the warning, that's the only reason I need this __function. 4,Before vendor callback exit, it will unpark all schedulers. >From another hand if we don't do the kthread_park() and still delete the job >here (drop deleting/reinserting the job from pending_list is what we want), >we still have a small windows to hit the race issue: That cleanup_job from sched thread is freeing the job while job is under processing by job_timedout or vendor's call back. And the reason we want to avoid deleting/reinserting the timedout job is because we (amd vendor) have our own way to do a diagnostic on all jobs in pending list from all scheduler, we want to cherry-pick the real bad job >From all scheduler's pending list that causes this JOB TIMEOUT. Besides, it is also much reasonable to park scheduler when job_timedout is running there, they should exclusively access those common members like sched_job. (due to spin_lock is off before running into vendor's calback) Hope I explained ourselves well enough. Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Daniel Vetter Sent: Tuesday, August 31, 2021 8:59 PM To: Liu, Monk Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; Chen, Jingwen Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler Can we please have some actual commit message here, with detailed explanation of the race/bug/whatever, how you fix it and why this is the best option? On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote: > tested-by: jingwen chen > Signed-off-by: Monk Liu > Signed-off-by: jingwen chen > --- > drivers/gpu/drm/scheduler/sched_main.c | 24 > 1 file changed, 4 insertions(+), 20 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index ecf8140..894fdb24 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct > *work) > sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); > > /* Protects against concurrent deletion in drm_sched_get_cleanup_job > */ > + if (!__kthread_should_park(sched->thread)) This is a __ function, i.e. considered internal, and it's lockless atomic, i.e. unordered. And you're not explaining why this works. Iow it's probably buggy, and an just unconditionally parking the kthread is probably the right thing to do. If it's not the right thing to do, there's a bug here for sure. -Daniel > + kthread_park(sched->thread); > + > spin_lock(>job_list_lock); > job = list_first_entry_or_null(>pending_list, > struct drm_sched_job, list); > > if (job) { > - /* > - * Remove the bad job so it cannot be freed by concurrent > - * drm_sched_cleanup_jobs. It will be reinserted back after > sched->thread > - * is parked at which point it's safe. > - */ > - list_del_init(>list); > spin_unlock(>job_list_lock); > > + /* vendor's timeout_job should call drm_sched_start() */ > status = job->sched->ops->timedout_job(job); > > /* > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, > struct drm_sched_job *bad) > kthread_park(sched->thread); > > /* > - * Reinsert back the bad job here - now it's safe as > - * drm_sched_get_cleanup_job cannot race against us and release the > - * bad job at this point - we parked (waited for) any in progress > - * (earlier) cleanups and drm_sched_get_cleanup_job will not be called >
[PATCH 2/2] drm/sched: serialize job_timeout and scheduler
tested-by: jingwen chen Signed-off-by: Monk Liu Signed-off-by: jingwen chen --- drivers/gpu/drm/scheduler/sched_main.c | 24 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index 3e0bbc7..87d72e9 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct *work) sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ + if (!__kthread_should_park(sched->thread)) + kthread_park(sched->thread); + spin_lock(>job_list_lock); job = list_first_entry_or_null(>pending_list, struct drm_sched_job, list); if (job) { - /* -* Remove the bad job so it cannot be freed by concurrent -* drm_sched_cleanup_jobs. It will be reinserted back after sched->thread -* is parked at which point it's safe. -*/ - list_del_init(>list); spin_unlock(>job_list_lock); + /* vendor's timeout_job should call drm_sched_start() */ status = job->sched->ops->timedout_job(job); /* @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad) kthread_park(sched->thread); /* -* Reinsert back the bad job here - now it's safe as -* drm_sched_get_cleanup_job cannot race against us and release the -* bad job at this point - we parked (waited for) any in progress -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called -* now until the scheduler thread is unparked. -*/ - if (bad && bad->sched == sched) - /* -* Add at the head of the queue to reflect it was the earliest -* job extracted. -*/ - list_add(>list, >pending_list); - - /* * Iterate the job list from later to earlier one and either deactive * their HW callbacks or remove them from pending list if they already * signaled. -- 2.7.4
[PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)
issue: in cleanup_job the cancle_delayed_work will cancel a TO timer even the its corresponding job is still running. fix: do not cancel the timer in cleanup_job, instead do the cancelling only when the heading job is signaled, and if there is a "next" job we start_timeout again. v2: further cleanup the logic, and do the TDR timer cancelling if the signaled job is the last one in its scheduler. v3: change the issue description remove the cancel_delayed_work in the begining of the cleanup_job recover the implement of drm_sched_job_begin. v4: remove the kthread_should_park() checking in cleanup_job routine, we should cleanup the signaled job asap TODO: 1)introduce pause/resume scheduler in job_timeout to serial the handling of scheduler and job_timeout. 2)drop the bad job's del and insert in scheduler due to above serialization (no race issue anymore with the serialization) tested-by: jingwen Signed-off-by: Monk Liu --- drivers/gpu/drm/scheduler/sched_main.c | 26 +- 1 file changed, 9 insertions(+), 17 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index a2a9536..3e0bbc7 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; - /* -* Don't destroy jobs while the timeout worker is running OR thread -* is being parked and hence assumed to not touch pending_list -*/ - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && - !cancel_delayed_work(>work_tdr)) || - kthread_should_park()) - return NULL; - spin_lock(>job_list_lock); job = list_first_entry_or_null(>pending_list, @@ -693,17 +684,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) if (job && dma_fence_is_signaled(>s_fence->finished)) { /* remove job from pending_list */ list_del_init(>list); + + /* cancel this job's TO timer */ + cancel_delayed_work(>work_tdr); /* make the scheduled timestamp more accurate */ next = list_first_entry_or_null(>pending_list, typeof(*next), list); - if (next) + + if (next) { next->s_fence->scheduled.timestamp = job->s_fence->finished.timestamp; - + /* start TO timer for next job */ + drm_sched_start_timeout(sched); + } } else { job = NULL; - /* queue timeout for next job */ - drm_sched_start_timeout(sched); } spin_unlock(>job_list_lock); @@ -791,11 +786,8 @@ static int drm_sched_main(void *param) (entity = drm_sched_select_entity(sched))) || kthread_should_stop()); - if (cleanup_job) { + if (cleanup_job) sched->ops->free_job(cleanup_job); - /* queue timeout for next job */ - drm_sched_start_timeout(sched); - } if (!entity) continue; -- 2.7.4
Re: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)
What about removing (kthread_should_park()) ? We decided it's useless as far as I remember. Andrey From: amd-gfx on behalf of Liu, Monk Sent: 31 August 2021 20:24 To: Liu, Monk ; amd-gfx@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org Subject: RE: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3) [AMD Official Use Only] Ping Christian, Andrey Can we merge this patch first ? this is a standalone patch for the timer Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Monk Liu Sent: Tuesday, August 31, 2021 6:36 PM To: amd-gfx@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org; Liu, Monk Subject: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3) issue: in cleanup_job the cancle_delayed_work will cancel a TO timer even the its corresponding job is still running. fix: do not cancel the timer in cleanup_job, instead do the cancelling only when the heading job is signaled, and if there is a "next" job we start_timeout again. v2: further cleanup the logic, and do the TDR timer cancelling if the signaled job is the last one in its scheduler. v3: change the issue description remove the cancel_delayed_work in the begining of the cleanup_job recover the implement of drm_sched_job_begin. TODO: 1)introduce pause/resume scheduler in job_timeout to serial the handling of scheduler and job_timeout. 2)drop the bad job's del and insert in scheduler due to above serialization (no race issue anymore with the serialization) tested-by: jingwen Signed-off-by: Monk Liu --- drivers/gpu/drm/scheduler/sched_main.c | 25 ++--- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index a2a9536..ecf8140 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; - /* -* Don't destroy jobs while the timeout worker is running OR thread -* is being parked and hence assumed to not touch pending_list -*/ - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && - !cancel_delayed_work(>work_tdr)) || - kthread_should_park()) + if (kthread_should_park()) return NULL; spin_lock(>job_list_lock); @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) if (job && dma_fence_is_signaled(>s_fence->finished)) { /* remove job from pending_list */ list_del_init(>list); + + /* cancel this job's TO timer */ + cancel_delayed_work(>work_tdr); /* make the scheduled timestamp more accurate */ next = list_first_entry_or_null(>pending_list, typeof(*next), list); - if (next) + + if (next) { next->s_fence->scheduled.timestamp = job->s_fence->finished.timestamp; - + /* start TO timer for next job */ + drm_sched_start_timeout(sched); + } } else { job = NULL; - /* queue timeout for next job */ - drm_sched_start_timeout(sched); } spin_unlock(>job_list_lock); @@ -791,11 +789,8 @@ static int drm_sched_main(void *param) (entity = drm_sched_select_entity(sched))) || kthread_should_stop()); - if (cleanup_job) { + if (cleanup_job) sched->ops->free_job(cleanup_job); - /* queue timeout for next job */ - drm_sched_start_timeout(sched); - } if (!entity) continue; -- 2.7.4
RE: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)
[AMD Official Use Only] Ping Christian, Andrey Can we merge this patch first ? this is a standalone patch for the timer Thanks -- Monk Liu | Cloud-GPU Core team -- -Original Message- From: Monk Liu Sent: Tuesday, August 31, 2021 6:36 PM To: amd-gfx@lists.freedesktop.org Cc: dri-de...@lists.freedesktop.org; Liu, Monk Subject: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3) issue: in cleanup_job the cancle_delayed_work will cancel a TO timer even the its corresponding job is still running. fix: do not cancel the timer in cleanup_job, instead do the cancelling only when the heading job is signaled, and if there is a "next" job we start_timeout again. v2: further cleanup the logic, and do the TDR timer cancelling if the signaled job is the last one in its scheduler. v3: change the issue description remove the cancel_delayed_work in the begining of the cleanup_job recover the implement of drm_sched_job_begin. TODO: 1)introduce pause/resume scheduler in job_timeout to serial the handling of scheduler and job_timeout. 2)drop the bad job's del and insert in scheduler due to above serialization (no race issue anymore with the serialization) tested-by: jingwen Signed-off-by: Monk Liu --- drivers/gpu/drm/scheduler/sched_main.c | 25 ++--- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index a2a9536..ecf8140 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; - /* -* Don't destroy jobs while the timeout worker is running OR thread -* is being parked and hence assumed to not touch pending_list -*/ - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && - !cancel_delayed_work(>work_tdr)) || - kthread_should_park()) + if (kthread_should_park()) return NULL; spin_lock(>job_list_lock); @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) if (job && dma_fence_is_signaled(>s_fence->finished)) { /* remove job from pending_list */ list_del_init(>list); + + /* cancel this job's TO timer */ + cancel_delayed_work(>work_tdr); /* make the scheduled timestamp more accurate */ next = list_first_entry_or_null(>pending_list, typeof(*next), list); - if (next) + + if (next) { next->s_fence->scheduled.timestamp = job->s_fence->finished.timestamp; - + /* start TO timer for next job */ + drm_sched_start_timeout(sched); + } } else { job = NULL; - /* queue timeout for next job */ - drm_sched_start_timeout(sched); } spin_unlock(>job_list_lock); @@ -791,11 +789,8 @@ static int drm_sched_main(void *param) (entity = drm_sched_select_entity(sched))) || kthread_should_stop()); - if (cleanup_job) { + if (cleanup_job) sched->ops->free_job(cleanup_job); - /* queue timeout for next job */ - drm_sched_start_timeout(sched); - } if (!entity) continue; -- 2.7.4
Re: [PATCH 1/1] drm/amdkfd: make needs_pcie_atomics FW-version dependent
On 2021-08-31 6:09 p.m., Zeng, Oak wrote: A nit-pick inline. Otherwise this patch is Reviewed-by: Oak Zeng Regards, Oak On 2021-08-31, 5:57 PM, "amd-gfx on behalf of Felix Kuehling" wrote: On some GPUs the PCIe atomic requirement for KFD depends on the MEC firmware version. Add a firmware version check for this. The minimum firmware version that works without atomics can be updated in the device_info structure for each GPU type. Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 9 +++-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 + 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 16a57b70cc1a..655ee5733229 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -688,6 +688,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct kfd_dev *kfd; const struct kfd_device_info *device_info; const struct kfd2kgd_calls *f2g; + uint32_t fw_version; if (asic_type >= sizeof(kfd_supported_devices) / (sizeof(void *) * 2) || asic_type >= sizeof(kfd2kgd_funcs) / sizeof(void *)) { @@ -713,8 +714,12 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, * supported. */ kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd); - if (device_info->needs_pci_atomics && - !kfd->pci_atomic_requested) { + fw_version = amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1); + if (!kfd->pci_atomic_requested && + device_info->needs_pci_atomics && + (!device_info->no_atomic_fw_version || +amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1) < You already get the fw_version above __ I'll fix that. I forgot to remove the local variable after I decided to move the function call into the condition. Thanks, Felix + device_info->no_atomic_fw_version)) { dev_info(kfd_device, "skipped device %x:%x, PCI rejects atomics\n", pdev->vendor, pdev->device); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index ab83b0de6b22..6d8f9bb2d905 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -207,6 +207,7 @@ struct kfd_device_info { bool supports_cwsr; bool needs_iommu_device; bool needs_pci_atomics; + uint32_t no_atomic_fw_version; unsigned int num_sdma_engines; unsigned int num_xgmi_sdma_engines; unsigned int num_sdma_queues_per_engine; -- 2.32.0
Re: [PATCH 1/1] drm/amdkfd: make needs_pcie_atomics FW-version dependent
A nit-pick inline. Otherwise this patch is Reviewed-by: Oak Zeng Regards, Oak On 2021-08-31, 5:57 PM, "amd-gfx on behalf of Felix Kuehling" wrote: On some GPUs the PCIe atomic requirement for KFD depends on the MEC firmware version. Add a firmware version check for this. The minimum firmware version that works without atomics can be updated in the device_info structure for each GPU type. Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 9 +++-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 + 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 16a57b70cc1a..655ee5733229 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -688,6 +688,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct kfd_dev *kfd; const struct kfd_device_info *device_info; const struct kfd2kgd_calls *f2g; + uint32_t fw_version; if (asic_type >= sizeof(kfd_supported_devices) / (sizeof(void *) * 2) || asic_type >= sizeof(kfd2kgd_funcs) / sizeof(void *)) { @@ -713,8 +714,12 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, * supported. */ kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd); - if (device_info->needs_pci_atomics && - !kfd->pci_atomic_requested) { + fw_version = amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1); + if (!kfd->pci_atomic_requested && + device_info->needs_pci_atomics && + (!device_info->no_atomic_fw_version || + amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1) < You already get the fw_version above __ + device_info->no_atomic_fw_version)) { dev_info(kfd_device, "skipped device %x:%x, PCI rejects atomics\n", pdev->vendor, pdev->device); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index ab83b0de6b22..6d8f9bb2d905 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -207,6 +207,7 @@ struct kfd_device_info { bool supports_cwsr; bool needs_iommu_device; bool needs_pci_atomics; + uint32_t no_atomic_fw_version; unsigned int num_sdma_engines; unsigned int num_xgmi_sdma_engines; unsigned int num_sdma_queues_per_engine; -- 2.32.0
[PATCH 1/1] drm/amdkfd: make needs_pcie_atomics FW-version dependent
On some GPUs the PCIe atomic requirement for KFD depends on the MEC firmware version. Add a firmware version check for this. The minimum firmware version that works without atomics can be updated in the device_info structure for each GPU type. Signed-off-by: Felix Kuehling --- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 9 +++-- drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 + 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 16a57b70cc1a..655ee5733229 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -688,6 +688,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, struct kfd_dev *kfd; const struct kfd_device_info *device_info; const struct kfd2kgd_calls *f2g; + uint32_t fw_version; if (asic_type >= sizeof(kfd_supported_devices) / (sizeof(void *) * 2) || asic_type >= sizeof(kfd2kgd_funcs) / sizeof(void *)) { @@ -713,8 +714,12 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd, * supported. */ kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd); - if (device_info->needs_pci_atomics && - !kfd->pci_atomic_requested) { + fw_version = amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1); + if (!kfd->pci_atomic_requested && + device_info->needs_pci_atomics && + (!device_info->no_atomic_fw_version || + amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1) < + device_info->no_atomic_fw_version)) { dev_info(kfd_device, "skipped device %x:%x, PCI rejects atomics\n", pdev->vendor, pdev->device); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index ab83b0de6b22..6d8f9bb2d905 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -207,6 +207,7 @@ struct kfd_device_info { bool supports_cwsr; bool needs_iommu_device; bool needs_pci_atomics; + uint32_t no_atomic_fw_version; unsigned int num_sdma_engines; unsigned int num_xgmi_sdma_engines; unsigned int num_sdma_queues_per_engine; -- 2.32.0
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
On 2021-08-31 16:56, Andrey Grodzovsky wrote: > On 2021-08-31 12:01 p.m., Luben Tuikov wrote: >> On 2021-08-31 11:23, Andrey Grodzovsky wrote: >>> On 2021-08-31 10:38 a.m., Daniel Vetter wrote: On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote: > On 2021-08-31 10:03 a.m., Daniel Vetter wrote: >> On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote: >>> It's says patch [2/2] but i can't find patch 1 >>> >>> On 2021-08-31 6:35 a.m., Monk Liu wrote: tested-by: jingwen chen Signed-off-by: Monk Liu Signed-off-by: jingwen chen --- drivers/gpu/drm/scheduler/sched_main.c | 24 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index ecf8140..894fdb24 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct *work) sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ + if (!__kthread_should_park(sched->thread)) + kthread_park(sched->thread); + >>> As mentioned before, without serializing against other TDR handlers from >>> other >>> schedulers you just race here against them, e.g. you parked it now but >>> another >>> one in progress will unpark it as part of calling drm_sched_start for >>> other >>> rings[1] >>> Unless I am missing something since I haven't found patch [1/2] >>> >>> [1] - >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L5041data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=PrrvFHAwDeLlbcOctlKHsCFs9%2F56XNVzoLVcT1RoJgI%3Dreserved=0 >> You need to have your own wq and run all your tdr work on the same wq if >> your reset has any cross-engine impact. > IMHO what is problematic in serializing vs. locking (with trylock and bail > out like we do in [1]) is for multiple TO events arising from same reason > like maybe one job just waits for another and once first is hanged the > second will also appear to be hanged triggering it's own TO event. > In this case multiple TOs event will trigger multiple resets if we > serialize > but if we use lock with trylock the second one will quietly bail out. > > [1] > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L4903data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=kxSWBoshVTLMMFIFZdPsP4MBgUAoC%2F3szo9GUemSRLY%3Dreserved=0 Hm so I guess a single wq here, that will hold up all other TO. And they should recheck whether the job is moving meanwhile. >>> Can you clarify about this ? What job should be moving ? The dependent job ? >>> >>> Also unless you use hw semaphores the job shouldn't even start before the deps are singalled, so not sure how this goes wrong? >>> What about a simple example where >>> we actually can submit a shader on one ring and a simple >>> WAIT_REG_MEM packet on another to wait for the shader to write >>> a specific value to specific memory location. Here you have both of them >>> started >>> in close proximity and no explicit dependencies involved (at the >>> scheduler level) >>> and yet if the shader hangs also the WAIT_REG_MEM job will hang. >>> >>> The vm_id flush stuff can make things a bit more fun for your specific case, but in your specific case you have to run all TO handlers on the same ordered workqueue anyway (because trying to paper over this in other ways doesn't work imo). >>> I didn't get this one. >> So, awhile back I tried to "serialize" this by moving timed-out jobs >> into their own timed-out-dedicated list, then freeing them asynchronously, >> but I never got it to work reliably due to races with low-level drivers and >> assumptions made way back. >> >> My idea was to atomic-move timed-out jobs into their own list, at the time of >> timeout, and later asynchronously to free them (or better yet, inquire about >> their state, and free them or move them back--ideally the
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
On 2021-08-31 12:01 p.m., Luben Tuikov wrote: On 2021-08-31 11:23, Andrey Grodzovsky wrote: On 2021-08-31 10:38 a.m., Daniel Vetter wrote: On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote: On 2021-08-31 10:03 a.m., Daniel Vetter wrote: On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote: It's says patch [2/2] but i can't find patch 1 On 2021-08-31 6:35 a.m., Monk Liu wrote: tested-by: jingwen chen Signed-off-by: Monk Liu Signed-off-by: jingwen chen --- drivers/gpu/drm/scheduler/sched_main.c | 24 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index ecf8140..894fdb24 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct *work) sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ + if (!__kthread_should_park(sched->thread)) + kthread_park(sched->thread); + As mentioned before, without serializing against other TDR handlers from other schedulers you just race here against them, e.g. you parked it now but another one in progress will unpark it as part of calling drm_sched_start for other rings[1] Unless I am missing something since I haven't found patch [1/2] [1] - https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L5041data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=PrrvFHAwDeLlbcOctlKHsCFs9%2F56XNVzoLVcT1RoJgI%3Dreserved=0 You need to have your own wq and run all your tdr work on the same wq if your reset has any cross-engine impact. IMHO what is problematic in serializing vs. locking (with trylock and bail out like we do in [1]) is for multiple TO events arising from same reason like maybe one job just waits for another and once first is hanged the second will also appear to be hanged triggering it's own TO event. In this case multiple TOs event will trigger multiple resets if we serialize but if we use lock with trylock the second one will quietly bail out. [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L4903data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=kxSWBoshVTLMMFIFZdPsP4MBgUAoC%2F3szo9GUemSRLY%3Dreserved=0 Hm so I guess a single wq here, that will hold up all other TO. And they should recheck whether the job is moving meanwhile. Can you clarify about this ? What job should be moving ? The dependent job ? Also unless you use hw semaphores the job shouldn't even start before the deps are singalled, so not sure how this goes wrong? What about a simple example where we actually can submit a shader on one ring and a simple WAIT_REG_MEM packet on another to wait for the shader to write a specific value to specific memory location. Here you have both of them started in close proximity and no explicit dependencies involved (at the scheduler level) and yet if the shader hangs also the WAIT_REG_MEM job will hang. The vm_id flush stuff can make things a bit more fun for your specific case, but in your specific case you have to run all TO handlers on the same ordered workqueue anyway (because trying to paper over this in other ways doesn't work imo). I didn't get this one. So, awhile back I tried to "serialize" this by moving timed-out jobs into their own timed-out-dedicated list, then freeing them asynchronously, but I never got it to work reliably due to races with low-level drivers and assumptions made way back. My idea was to atomic-move timed-out jobs into their own list, at the time of timeout, and later asynchronously to free them (or better yet, inquire about their state, and free them or move them back--ideally the inquiry is atomic and done at timeout time before being moved to the timeout list). Anyway... But I found out that all these knobs and levers weren't in place and I was getting problems with it and it never materialized. The paradigm was loosely "let someone else do it", like, "on an event, move it to a list, and let someone else handle it", or "on an event, mark it, and let someone else handle it". (loosely borrowed from an iSCSI target I did many many years ago--it worked well and there were no races, even with out-of-order executions.) If
[PATCH v7 8/8] nouveau: fold multiple DRM_DEBUG_DRIVERs together
With DRM_USE_DYNAMIC_DEBUG, each callsite record requires 56 bytes. We can combine 12 into one here and save ~620 bytes. Signed-off-by: Jim Cromie --- drivers/gpu/drm/nouveau/nouveau_drm.c | 36 +-- 1 file changed, 23 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c index ba4cd5f83725..0f45399535bf 100644 --- a/drivers/gpu/drm/nouveau/nouveau_drm.c +++ b/drivers/gpu/drm/nouveau/nouveau_drm.c @@ -1245,19 +1245,29 @@ nouveau_drm_pci_table[] = { static void nouveau_display_options(void) { - DRM_DEBUG_DRIVER("Loading Nouveau with parameters:\n"); - - DRM_DEBUG_DRIVER("... tv_disable : %d\n", nouveau_tv_disable); - DRM_DEBUG_DRIVER("... ignorelid: %d\n", nouveau_ignorelid); - DRM_DEBUG_DRIVER("... duallink : %d\n", nouveau_duallink); - DRM_DEBUG_DRIVER("... nofbaccel: %d\n", nouveau_nofbaccel); - DRM_DEBUG_DRIVER("... config : %s\n", nouveau_config); - DRM_DEBUG_DRIVER("... debug: %s\n", nouveau_debug); - DRM_DEBUG_DRIVER("... noaccel : %d\n", nouveau_noaccel); - DRM_DEBUG_DRIVER("... modeset : %d\n", nouveau_modeset); - DRM_DEBUG_DRIVER("... runpm: %d\n", nouveau_runtime_pm); - DRM_DEBUG_DRIVER("... vram_pushbuf : %d\n", nouveau_vram_pushbuf); - DRM_DEBUG_DRIVER("... hdmimhz : %d\n", nouveau_hdmimhz); + DRM_DEBUG_DRIVER("Loading Nouveau with parameters:\n" +"... tv_disable : %d\n" +"... ignorelid: %d\n" +"... duallink : %d\n" +"... nofbaccel: %d\n" +"... config : %s\n" +"... debug: %s\n" +"... noaccel : %d\n" +"... modeset : %d\n" +"... runpm: %d\n" +"... vram_pushbuf : %d\n" +"... hdmimhz : %d\n" +, nouveau_tv_disable +, nouveau_ignorelid +, nouveau_duallink +, nouveau_nofbaccel +, nouveau_config +, nouveau_debug +, nouveau_noaccel +, nouveau_modeset +, nouveau_runtime_pm +, nouveau_vram_pushbuf +, nouveau_hdmimhz); } static const struct dev_pm_ops nouveau_pm_ops = { -- 2.31.1
[PATCH v7 5/8] drm_print: add choice to use dynamic debug in drm-debug
drm's debug system writes 10 distinct categories of messages to syslog using a small API[1]: drm_dbg*(10 names), DRM_DEV_DEBUG*(3 names), DRM_DEBUG*(8 names). There are thousands of these callsites, each categorized in this systematized way. These callsites can be enabled at runtime by their category, each controlled by a bit in drm.debug (/sys/modules/drm/parameter/debug). In the current "basic" implementation, drm_debug_enabled() tests these bits in __drm_debug each time an API[1] call is executed; while cheap individually, the costs accumulate with uptime. This patch uses dynamic-debug with jump-label to patch enabled calls onto their respective NOOP slots, avoiding all runtime bit-checks of __drm_debug by drm_debug_enabled(). Dynamic debug has no concept of category, but we can emulate one by replacing enum categories with a set of prefix-strings; "drm:core:", "drm:kms:" "drm:driver:" etc, and prepend them (at compile time) to the given formats. Then we can use: `echo module drm format "^drm:core: " +p > control` to enable the whole category with one query. This conversion yields many new prdbg callsites: dyndbg: 195 debug prints in module drm_kms_helper dyndbg: 298 debug prints in module drm dyndbg: 1630 debug prints in module i915 dyndbg: ~3500 debug prints in module amdgpu CONFIG_DRM_USE_DYNAMIC_DEBUG enables this, and is available if CONFIG_DYNAMIC_DEBUG or CONFIG_DYNAMIC_DEBUG_CORE is chosen, and if CONFIG_JUMP_LABEL is enabled; this because its required to get the promised optimizations. The "basic" -> "dyndbg" switchover is layered into the macro scheme A. A "prefix" version of DRM_UT_ map, named DRM_DBG_CAT_ "basic": DRM_DBG_CAT_ <=== DRM_UT_. Identity map. "dyndbg": #define DRM_DBG_CAT_KMS"drm:kms: " #define DRM_DBG_CAT_PRIME "drm:prime: " #define DRM_DBG_CAT_ATOMIC "drm:atomic: " In v3, had older name, DRM_DBG_CLASS_ was countered, I had agreed, but this seems better still; CATEGORY is already DRM's term-of-art, and adding a near-synonym 'CLASS' only adds ambiguity. DRM_UT_* are preserved, since theyre used elsewhere. We can probably reduce their use further, but thats a separate thing. B. drm_dev_dbg() & drm_debug() are interposed with macros basic:forward to renamed fn, with args preserved enabled: redirect to pr_debug, dev_dbg, with CATEGORY format catenated This is where drm_debug_enabled() is avoided. The prefix is prepended at compile-time, no category at runtime. C. API[1] uses DRM_DBG_CAT_s these already use (B), now they use (A) too, to get the correct token type for "basic" and "dyndbg" configs. D. use DEFINE_DYNAMIC_DEBUG_CATEGORIES() This defines the map using DRM_CAT_s, and creates the /sysfs bitmap to control those categories. NOTES: Because the dyndbg callback is watching __drm_debug, it is coherent with drm_debug_enabled() and its remaining users; the switchover should be transparent. Code Review is expected to catch the lack of correspondence between bit=>prefix definitions (the selector) and the prefixes used in the API[1] layer above pr_debug() I've coded the search-prefixes/categories with a trailing space, which excludes any sub-categories added later. This convention protects any "drm:atomic:fail:" callsites from getting stomped on by `echo 0 > debug`. Other categories could differ, but we need some default. Dyndbg requires that the prefix be in the compiled-in format string; run-time prefixing evades callsite selection by category. pr_debug("%s: ...", __func__, ...) // not ideal With "lineno X" in a query, its possible to enable single callsites, but it is tedious, and useless in a category context. Unfortunately __func__ is not a macro, and cannot be catenated at preprocess/compile time. Signed-off-by: Jim Cromie --- v5: . use DEFINE_DYNAMIC_DEBUG_CATEGORIES in drm_print.c . s/DRM_DBG_CLASS_/DRM_DBG_CAT_/ - dont need another term . default=y in KBuild entry - per @DanVet . move some commit-log prose to dyndbg commit . add-prototyes to (param_get/set)_dyndbg . more wrinkles found by . relocate ratelimit chunk from elsewhere v6: . add kernel doc . fix cpp paste, drop '#' v7: . change __drm_debug to long, to fit with DEFINE_DYNAMIC_DEBUG_CATEGORIES . add -DDYNAMIC_DEBUG_MODULE to ccflags if DRM_USE_DYNAMIC_DEBUG --- drivers/gpu/drm/Kconfig | 13 drivers/gpu/drm/Makefile| 3 + drivers/gpu/drm/drm_print.c | 53 + include/drm/drm_print.h | 144 4 files changed, 166 insertions(+), 47 deletions(-) diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig index 7ff89690a976..97e38d86fd27 100644 --- a/drivers/gpu/drm/Kconfig +++ b/drivers/gpu/drm/Kconfig @@ -57,6 +57,19 @@ config DRM_DEBUG_MM If in doubt, say "N". +config DRM_USE_DYNAMIC_DEBUG + bool "use dynamic debug to implement drm.debug" + default y + depends on DRM + depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE + depends on
[PATCH v7 7/8] amdgpu_ucode: reduce number of pr_debug calls
There are blocks of DRM_DEBUG calls, consolidate their args into single calls. With dynamic-debug in use, each callsite consumes 56 bytes of callsite data, and this patch removes about 65 calls, so it saves ~3.5kb. no functional changes. RFC: this creates multi-line log messages, does that break any syslog conventions ? Signed-off-by: Jim Cromie --- drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 293 -- 1 file changed, 158 insertions(+), 135 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c index 2834981f8c08..14a9fef1f4c6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c @@ -30,17 +30,26 @@ static void amdgpu_ucode_print_common_hdr(const struct common_firmware_header *hdr) { - DRM_DEBUG("size_bytes: %u\n", le32_to_cpu(hdr->size_bytes)); - DRM_DEBUG("header_size_bytes: %u\n", le32_to_cpu(hdr->header_size_bytes)); - DRM_DEBUG("header_version_major: %u\n", le16_to_cpu(hdr->header_version_major)); - DRM_DEBUG("header_version_minor: %u\n", le16_to_cpu(hdr->header_version_minor)); - DRM_DEBUG("ip_version_major: %u\n", le16_to_cpu(hdr->ip_version_major)); - DRM_DEBUG("ip_version_minor: %u\n", le16_to_cpu(hdr->ip_version_minor)); - DRM_DEBUG("ucode_version: 0x%08x\n", le32_to_cpu(hdr->ucode_version)); - DRM_DEBUG("ucode_size_bytes: %u\n", le32_to_cpu(hdr->ucode_size_bytes)); - DRM_DEBUG("ucode_array_offset_bytes: %u\n", - le32_to_cpu(hdr->ucode_array_offset_bytes)); - DRM_DEBUG("crc32: 0x%08x\n", le32_to_cpu(hdr->crc32)); + DRM_DEBUG("size_bytes: %u\n" + "header_size_bytes: %u\n" + "header_version_major: %u\n" + "header_version_minor: %u\n" + "ip_version_major: %u\n" + "ip_version_minor: %u\n" + "ucode_version: 0x%08x\n" + "ucode_size_bytes: %u\n" + "ucode_array_offset_bytes: %u\n" + "crc32: 0x%08x\n", + le32_to_cpu(hdr->size_bytes), + le32_to_cpu(hdr->header_size_bytes), + le16_to_cpu(hdr->header_version_major), + le16_to_cpu(hdr->header_version_minor), + le16_to_cpu(hdr->ip_version_major), + le16_to_cpu(hdr->ip_version_minor), + le32_to_cpu(hdr->ucode_version), + le32_to_cpu(hdr->ucode_size_bytes), + le32_to_cpu(hdr->ucode_array_offset_bytes), + le32_to_cpu(hdr->crc32)); } void amdgpu_ucode_print_mc_hdr(const struct common_firmware_header *hdr) @@ -55,9 +64,9 @@ void amdgpu_ucode_print_mc_hdr(const struct common_firmware_header *hdr) const struct mc_firmware_header_v1_0 *mc_hdr = container_of(hdr, struct mc_firmware_header_v1_0, header); - DRM_DEBUG("io_debug_size_bytes: %u\n", - le32_to_cpu(mc_hdr->io_debug_size_bytes)); - DRM_DEBUG("io_debug_array_offset_bytes: %u\n", + DRM_DEBUG("io_debug_size_bytes: %u\n" + "io_debug_array_offset_bytes: %u\n", + le32_to_cpu(mc_hdr->io_debug_size_bytes), le32_to_cpu(mc_hdr->io_debug_array_offset_bytes)); } else { DRM_ERROR("Unknown MC ucode version: %u.%u\n", version_major, version_minor); @@ -82,13 +91,17 @@ void amdgpu_ucode_print_smc_hdr(const struct common_firmware_header *hdr) switch (version_minor) { case 0: v2_0_hdr = container_of(hdr, struct smc_firmware_header_v2_0, v1_0.header); - DRM_DEBUG("ppt_offset_bytes: %u\n", le32_to_cpu(v2_0_hdr->ppt_offset_bytes)); - DRM_DEBUG("ppt_size_bytes: %u\n", le32_to_cpu(v2_0_hdr->ppt_size_bytes)); + DRM_DEBUG("ppt_offset_bytes: %u\n" + "ppt_size_bytes: %u\n", + le32_to_cpu(v2_0_hdr->ppt_offset_bytes), + le32_to_cpu(v2_0_hdr->ppt_size_bytes)); break; case 1: v2_1_hdr = container_of(hdr, struct smc_firmware_header_v2_1, v1_0.header); - DRM_DEBUG("pptable_count: %u\n", le32_to_cpu(v2_1_hdr->pptable_count)); - DRM_DEBUG("pptable_entry_offset: %u\n", le32_to_cpu(v2_1_hdr->pptable_entry_offset)); + DRM_DEBUG("pptable_count: %u\n" + "pptable_entry_offset: %u\n", + le32_to_cpu(v2_1_hdr->pptable_count), + le32_to_cpu(v2_1_hdr->pptable_entry_offset)); break; default: break; @@ -111,10 +124,12 @@ void
[PATCH v7 6/8] drm_print: instrument drm_debug_enabled
Duplicate drm_debug_enabled() code into both "basic" and "dyndbg" ifdef branches. Then add a pr_debug("todo: ...") into the "dyndbg" branch. Then convert the "dyndbg" branch's code to a macro, so that its pr_debug() get its callsite info from the invoking function, instead of from drm_debug_enabled() itself. This gives us unique callsite info for the 8 remaining users of drm_debug_enabled(), and lets us enable them individually to see how much logging traffic they generate. The oft-visited callsites can then be reviewed for runtime cost and possible optimizations. Heres what we get: bash-5.1# modprobe drm dyndbg: 384 debug prints in module drm bash-5.1# grep todo: /proc/dynamic_debug/control drivers/gpu/drm/drm_edid.c:1843 [drm]connector_bad_edid =_ "todo: maybe avoid via dyndbg\012" drivers/gpu/drm/drm_print.c:309 [drm]___drm_dbg =p "todo: maybe avoid via dyndbg\012" drivers/gpu/drm/drm_print.c:286 [drm]__drm_dev_dbg =p "todo: maybe avoid via dyndbg\012" drivers/gpu/drm/drm_vblank.c:1491 [drm]drm_vblank_restore =_ "todo: maybe avoid via dyndbg\012" drivers/gpu/drm/drm_vblank.c:787 [drm]drm_crtc_vblank_helper_get_vblank_timestamp_internal =_ "todo: maybe avoid via dyndbg\012" drivers/gpu/drm/drm_vblank.c:410 [drm]drm_crtc_accurate_vblank_count =_ "todo: maybe avoid via dyndbg\012" drivers/gpu/drm/drm_atomic_uapi.c:1457 [drm]drm_mode_atomic_ioctl =_ "todo: maybe avoid via dyndbg\012" drivers/gpu/drm/drm_edid_load.c:178 [drm]edid_load =_ "todo: maybe avoid via dyndbg\012" At quick glance, edid won't qualify, drm_print might, drm_vblank is strongest chance, maybe atomic-ioctl too. Signed-off-by: Jim Cromie --- include/drm/drm_print.h | 16 +++- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h index 973443040561..03f9ae83fade 100644 --- a/include/drm/drm_print.h +++ b/include/drm/drm_print.h @@ -378,6 +378,11 @@ enum drm_debug_category { #define DRM_DBG_CAT_DP DRM_UT_DP #define DRM_DBG_CAT_DRMRES DRM_UT_DRMRES +static inline bool drm_debug_enabled(enum drm_debug_category category) +{ + return unlikely(__drm_debug & category); +} + #else /* CONFIG_DRM_USE_DYNAMIC_DEBUG */ /* join prefix+format in cpp so dyndbg can see it */ @@ -397,12 +402,13 @@ enum drm_debug_category { #define DRM_DBG_CAT_DP "drm:dp: " #define DRM_DBG_CAT_DRMRES "drm:res: " /* not in MODULE_PARM_DESC */ -#endif /* CONFIG_DRM_USE_DYNAMIC_DEBUG */ +#define drm_debug_enabled(category)\ + ({ \ + pr_debug("todo: maybe avoid via dyndbg\n"); \ + unlikely(__drm_debug & category); \ + }) -static inline bool drm_debug_enabled(enum drm_debug_category category) -{ - return unlikely(__drm_debug & category); -} +#endif /* CONFIG_DRM_USE_DYNAMIC_DEBUG */ /* * struct device based logging -- 2.31.1
[PATCH v7 3/8] i915/gvt: use DEFINE_DYNAMIC_DEBUG_CATEGORIES to create "gvt:core:" etc categories
The gvt component of this driver has ~120 pr_debugs, in 9 categories quite similar to those in DRM. Following the interface model of drm.debug, add a parameter to map bits to these categorizations. DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_gvt, __gvt_debug, "dyndbg bitmap desc", { "gvt:cmd: ", "command processing" }, { "gvt:core: ", "core help" }, { "gvt:dpy: ", "display help" }, { "gvt:el: ", "help" }, { "gvt:irq: ", "help" }, { "gvt:mm: ", "help" }, { "gvt:mmio: ", "help" }, { "gvt:render: ", "help" }, { "gvt:sched: " "help" }); The actual patch has a few details different, cmd_help() macro emits the initialization construct. if CONFIG_DRM_USE_DYNAMIC_DEBUG, then -DDYNAMIC_DEBUG_MODULE is added cflags, by gvt/Makefile. Signed-off-by: Jim Cromie --- v5: . static decl of vector of bit->class descriptors - Emil.V . relocate gvt-makefile chunk from elsewhere v7: . move ccflags addition up to i915/Makefile from i915/gvt --- drivers/gpu/drm/i915/Makefile | 4 drivers/gpu/drm/i915/i915_params.c | 35 ++ 2 files changed, 39 insertions(+) diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile index 4f22cac1c49b..5a4e371a3ec2 100644 --- a/drivers/gpu/drm/i915/Makefile +++ b/drivers/gpu/drm/i915/Makefile @@ -30,6 +30,10 @@ CFLAGS_display/intel_fbdev.o = $(call cc-disable-warning, override-init) subdir-ccflags-y += -I$(srctree)/$(src) +#ifdef CONFIG_DRM_USE_DYNAMIC_DEBUG +ccflags-y += -DDYNAMIC_DEBUG_MODULE +#endif + # Please keep these build lists sorted! # core driver code diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c index e07f4cfea63a..e645e149485e 100644 --- a/drivers/gpu/drm/i915/i915_params.c +++ b/drivers/gpu/drm/i915/i915_params.c @@ -265,3 +265,38 @@ void i915_params_free(struct i915_params *params) I915_PARAMS_FOR_EACH(FREE); #undef FREE } + +#ifdef CONFIG_DRM_USE_DYNAMIC_DEBUG +/* todo: needs DYNAMIC_DEBUG_MODULE in some cases */ + +unsigned long __gvt_debug; +EXPORT_SYMBOL(__gvt_debug); + +#define _help(key) "\t\"" key "\"\t: help for " key "\n" + +#define I915_GVT_CATEGORIES(name) \ + " Enable debug output via /sys/module/i915/parameters/" #name \ + ", where each bit enables a debug category.\n" \ + _help("gvt:cmd:") \ + _help("gvt:core:") \ + _help("gvt:dpy:") \ + _help("gvt:el:")\ + _help("gvt:irq:") \ + _help("gvt:mm:")\ + _help("gvt:mmio:") \ + _help("gvt:render:")\ + _help("gvt:sched:") + +DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_gvt, __gvt_debug, + I915_GVT_CATEGORIES(debug_gvt), + _DD_cat_("gvt:cmd:"), + _DD_cat_("gvt:core:"), + _DD_cat_("gvt:dpy:"), + _DD_cat_("gvt:el:"), + _DD_cat_("gvt:irq:"), + _DD_cat_("gvt:mm:"), + _DD_cat_("gvt:mmio:"), + _DD_cat_("gvt:render:"), + _DD_cat_("gvt:sched:")); + +#endif -- 2.31.1
[PATCH v7 4/8] amdgpu: use DEFINE_DYNAMIC_DEBUG_CATEGORIES
logger_types.h defines many DC_LOG_*() categorized debug wrappers. Most of these use DRM debug API, so are controllable using drm.debug, but others use bare pr_debug("$prefix: .."), each with a different class-prefix matching "^\[\w+\]:" Use DEFINE_DYNAMIC_DEBUG_CATEGORIES to create a /sys debug_dc parameter, modinfos, and to specify a map from bits -> categorized pr_debugs to be controlled. Signed-off-by: Jim Cromie --- .../gpu/drm/amd/display/dc/core/dc_debug.c| 44 ++- 1 file changed, 43 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c index 21be2a684393..9fd43254db88 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c @@ -36,8 +36,50 @@ #include "resource.h" -#define DC_LOGGER_INIT(logger) +#ifdef CONFIG_DRM_USE_DYNAMIC_DEBUG +/* define a drm.debug style dyndbg pr-debug control point */ +#include + +unsigned long __debug_dc; +EXPORT_SYMBOL(__debug_dc); + +#define _help_(key)"\t " key "\t- help for " key "\n" + +/* Id like to do these inside DEFINE_DYNAMIC_DEBUG_CATEGORIES, if possible */ +#define DC_DYNDBG_BITMAP_DESC(name)\ + "Control pr_debugs via /sys/module/amdgpu/parameters/" #name\ + ", where each bit controls a debug category.\n" \ + _help_("[SURFACE]:")\ + _help_("[CURSOR]:") \ + _help_("[PFLIP]:") \ + _help_("[VBLANK]:") \ + _help_("[HW_LINK_TRAINING]:") \ + _help_("[HW_AUDIO]:") \ + _help_("[SCALER]:") \ + _help_("[BIOS]:") \ + _help_("[BANDWIDTH_CALCS]:")\ + _help_("[DML]:")\ + _help_("[IF_TRACE]:") \ + _help_("[GAMMA]:") \ + _help_("[SMU_MSG]:") + +DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_dc, __debug_dc, + DC_DYNDBG_BITMAP_DESC(debug_dc), + _DD_cat_("[CURSOR]:"), + _DD_cat_("[PFLIP]:"), + _DD_cat_("[VBLANK]:"), + _DD_cat_("[HW_LINK_TRAINING]:"), + _DD_cat_("[HW_AUDIO]:"), + _DD_cat_("[SCALER]:"), + _DD_cat_("[BIOS]:"), + _DD_cat_("[BANDWIDTH_CALCS]:"), + _DD_cat_("[DML]:"), + _DD_cat_("[IF_TRACE]:"), + _DD_cat_("[GAMMA]:"), + _DD_cat_("[SMU_MSG]:")); +#endif +#define DC_LOGGER_INIT(logger) #define SURFACE_TRACE(...) do {\ if (dc->debug.surface_trace) \ -- 2.31.1
[PATCH v7 2/8] dyndbg: remove spaces in pr_debug "gvt: core:" etc prefixes
Taking embedded spaces out of existing prefixes makes them better class-prefixes; simplifying the extra quoting needed otherwise: $> echo format "^gvt: core:" +p >control Dropping the internal spaces means any trailing space in a query will more clearly terminate the prefix being searched for. Consider a generic drm-debug example: # turn off ATOMIC reports echo format "^drm:atomic: " -p > control # turn off all ATOMIC:* reports, including any sub-categories echo format "^drm:atomic:" -p > control # turn on ATOMIC:FAIL: reports echo format "^drm:atomic:fail: " +p > control Removing embedded spaces in the class-prefixes simplifies the corresponding match-prefix. This means that "quoted" match-prefixes are only needed when the trailing space is desired, in order to exclude explicitly sub-categorized pr-debugs; in this example, "drm:atomic:fail:". RFC: maybe the prefix catenation should paste in the " " class-prefix terminator explicitly. A pr_debug_() flavor could exclude the " ", allowing ad-hoc sub-categorization by appending for example, "fail:" to "drm:atomic:" without the default " " insertion. Signed-off-by: Jim Cromie --- drivers/gpu/drm/i915/gvt/debug.h | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/drm/i915/gvt/debug.h b/drivers/gpu/drm/i915/gvt/debug.h index c6027125c1ec..b4021f41c546 100644 --- a/drivers/gpu/drm/i915/gvt/debug.h +++ b/drivers/gpu/drm/i915/gvt/debug.h @@ -36,30 +36,30 @@ do { \ } while (0) #define gvt_dbg_core(fmt, args...) \ - pr_debug("gvt: core: "fmt, ##args) + pr_debug("gvt:core: "fmt, ##args) #define gvt_dbg_irq(fmt, args...) \ - pr_debug("gvt: irq: "fmt, ##args) + pr_debug("gvt:irq: "fmt, ##args) #define gvt_dbg_mm(fmt, args...) \ - pr_debug("gvt: mm: "fmt, ##args) + pr_debug("gvt:mm: "fmt, ##args) #define gvt_dbg_mmio(fmt, args...) \ - pr_debug("gvt: mmio: "fmt, ##args) + pr_debug("gvt:mmio: "fmt, ##args) #define gvt_dbg_dpy(fmt, args...) \ - pr_debug("gvt: dpy: "fmt, ##args) + pr_debug("gvt:dpy: "fmt, ##args) #define gvt_dbg_el(fmt, args...) \ - pr_debug("gvt: el: "fmt, ##args) + pr_debug("gvt:el: "fmt, ##args) #define gvt_dbg_sched(fmt, args...) \ - pr_debug("gvt: sched: "fmt, ##args) + pr_debug("gvt:sched: "fmt, ##args) #define gvt_dbg_render(fmt, args...) \ - pr_debug("gvt: render: "fmt, ##args) + pr_debug("gvt:render: "fmt, ##args) #define gvt_dbg_cmd(fmt, args...) \ - pr_debug("gvt: cmd: "fmt, ##args) + pr_debug("gvt:cmd: "fmt, ##args) #endif -- 2.31.1
[PATCH v7 1/8] dyndbg: add DEFINE_DYNAMIC_DEBUG_CATEGORIES and callbacks
DEFINE_DYNAMIC_DEBUG_CATEGORIES(name, var, bitmap_desc, @bit_descs) allows users to define a drm.debug style (bitmap) sysfs interface, and to specify the desired mapping from bits[0-N] to the format-prefix'd pr_debug()s to be controlled. DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_gvt, __gvt_debug, "i915/gvt bitmap desc", /** * search-prefixes, passed to dd-exec_queries * defines bits 0-N in order. * leading ^ is tacitly inserted (by callback currently) * trailing space used here excludes subcats. * helper macro needs more work * macro to autogen ++$i, 0x%x$i ? */ _DD_cat_("gvt:cmd: "), _DD_cat_("gvt:core: "), _DD_cat_("gvt:dpy: "), _DD_cat_("gvt:el: "), _DD_cat_("gvt:irq: "), _DD_cat_("gvt:mm: "), _DD_cat_("gvt:mmio: "), _DD_cat_("gvt:render: "), _DD_cat_("gvt:sched: ")); dynamic_debug.c: add 2 new elements: - int param_set_dyndbg() - int param_get_dyndbg() - struct kernel_param_ops param_ops_dyndbg Following the model of kernel/params.c STANDARD_PARAM_DEFS, these are non-static and exported. get/set use an augmented kernel_param; the arg refs a new struct dyndbg_bitmap_param containing: - the map (array, indexed by bitpos) of format-prefix strings, which define the set/category of prdbgs to be changed per each bit. - pointer to the user's ulong holding the bits/state. by sharing state, we coordinate with code that still uses it directly This will allow drm-debug to be converted incrementally, while still using __drm_debug & drm_debug_enabled() in other parts. param_set_dyndbg() compares new vs old bits, and only updates prdbgs on changes. This maximally preserves the underlying state, which may have been customized via later `echo $cmd >control`. So if a user really wants to know that all prdbgs are set precisely, they must pre-clear then set. TLDR: this also doesn't affect the decorator flags "mflt" set per prdbg. dynamic_debug.h: Add DEFINE_DYNAMIC_DEBUG_CATEGORIES() described above, and a stub throwing a BUILD_BUG (RFC) when used without DYNAMIC_DEBUG support. Add structs dyndbg_bitdesc, dyndbg_bitmap_param to support the macro. Note that it also calls MODULE_PARM_DESC for the user, but expects the user to catenate all the bit-descriptions together (as is done in drm.debug), and in the following uses in amdgpu, i915. The intent is to regenerate this output from per-bit help given in VA_ARGS, including a bit_label(); but this can wait. Also externs the struct kernel_param param_ops_dyndbg symbol, as is done in moduleparams.h for all the STANDARD params. USAGE NOTES: Using dyndbg to query on "format ^$prefix" requires that the prefix be present in the compiled-in format string; where run-time prefixing is used, that format would be "%s...", which is not usefully selectable. Using DEFINE_DYNAMIC_DEBUG_CATEGORIES wo support gets a BUILD_BUG. ISTM there is already action at a declarative distance, nobody needs mystery as to why the /sysfs thingy didn't appear. Dyndbg is completely agnostic wrt the categorization scheme used, in order to play well with any prefix convention already in use in the codebase. Ad-hoc categories and sub-categories are implicitly allowed, author discipline and review is expected. Here are some examples: "1","2","3" 2 doesn't imply 1. otherwize, sorta like printk levels "1:","2:","3:" are better, avoiding [1-9]\d+ ambiguity "hi:","mid:","low:" are reasonable, and imply independence "todo:","rfc:","fixme:" might be handy "A:".."Z:" uhm, yeah Hierarchical classes/categories are natural: "drm::"is used in a later commit "drm:::" is a natural extension. "drm:atomic:fail:" has been proposed, sounds directly useful NB: in a real sense we abandon enum strictures here, and lose some compiler help, on spelling errs for example. Obviously "drm:" != "DRM:". Some properties of a hierarchical category deserve explication: Trailing spaces matter ! With 1..3-space ("drm: ", "drm:atomic: ", "drm:atomic:fail: "), the ":" doesn't terminate the search-space, the trailing space does. So a "drm:" search spec will match all DRM categories & subcategories, and will not be useful in an interface where all categories are already controlled together. That said, "drm:atomic:" & "drm:atomic: " are different, and both are useful in cases. Ad-Hoc sub-categories: These have a caveat wrt wrapper macros adding prefixes like "drm:atomic: "; the trailing space in the prefix means that drm_dbg_atomic("fail: ...") pastes as "drm:atomic: fail: ", which obviously isn't ideal wrt clear and simple bitmaps. A possible solution is to have a FOO_() version of every FOO() macro which (anti-mnemonically) elides the trailing space, which is normally inserted by a newer FOO(). IE: drm_dbg_atomic_("fail: ..."); // trailing _ means ad-hoc subcat Summarizing:
[PATCH v7 0/8] use DYNAMIC_DEBUG to implement DRM.debug
Hi Jason, DRM folks, In DRM-debug currently, drm_debug_enabled() is called a lot to decide whether or not to write debug messages. Each test is cheap, but costs continue with uptime. DYNAMIC_DEBUG "dyndbg", when built with JUMP_LABEL, replaces each of those tests with a patchable NOOP, for "zero cost when off" debugs. this is v7: - drops RFC distractions -JBaron - drops patch-1: kp->data addition in moduleparams.h not needed -JBaron - rework dyndbg callbacks to use kp->arg instead -JBaron - fixes for problem configs reported -lkp - others noted per patch below --- Broadly, this patchset does 3 things: Adds DEFINE_DYNAMIC_DEBUG_CATEGORIES, which defines a mapping from bits to "categorized" pr_debugs, a sysfs interface, and callbacks to implement the bits as dynamic_debug >controls. This is modelled after DRM's debug interface. Uses the new macro in amdgpu & i915 to control existing pr_debugs according to their ad-hoc categorizations. Adapts the drm-debug API (~20 macros) to configurably "replace" drm_dbg & drm_dev_dbg with pr_debug & dev_dbg, which avoids drm_debug_enabled() overheads. DEFINE_DYNAMIC_DEBUG_CATEGORIES is used to create the controlling bitmap, which is wired to __drm_debug var so remaining drm_debug_enabled() users stay in sync. on a virtual box: bash-5.1# for m in i915 amdgpu nouveau; do modprobe $m; done [43833.332326] dyndbg: 384 debug prints in module drm [43833.609577] dyndbg: 211 debug prints in module drm_kms_helper [43833.707124] dyndbg: 2 debug prints in module ttm [43837.471166] dyndbg: 1727 debug prints in module i915 [43838.030774] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel [43838.031905] AMD-Vi: AMD IOMMUv2 functionality not available on this system [43846.209583] dyndbg: 3852 debug prints in module amdgpu [43846.261391] [drm] amdgpu kernel modesetting enabled. [43846.262512] amdgpu: CRAT table not found [43846.263264] amdgpu: Virtual CRAT table created for CPU [43846.264293] amdgpu: Topology: Add CPU node [43846.743781] dyndbg: 3 debug prints in module wmi [43852.517497] dyndbg: 92 debug prints in module nouveau There are a few multi-second delays there, just before dyndbg initializes the large blocks of debug prints. Jim Cromie (8): dyndbg: add DEFINE_DYNAMIC_DEBUG_CATEGORIES and callbacks dyndbg: remove spaces in pr_debug "gvt: core:" etc prefixes i915/gvt: use DEFINE_DYNAMIC_DEBUG_CATEGORIES to create "gvt:core:" etc categories amdgpu: use DEFINE_DYNAMIC_DEBUG_CATEGORIES drm_print: add choice to use dynamic debug in drm-debug drm_print: instrument drm_debug_enabled amdgpu_ucode: reduce number of pr_debug calls nouveau: fold multiple DRM_DEBUG_DRIVERs together drivers/gpu/drm/Kconfig | 13 + drivers/gpu/drm/Makefile | 3 + drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 293 ++ .../gpu/drm/amd/display/dc/core/dc_debug.c| 44 ++- drivers/gpu/drm/drm_print.c | 53 +++- drivers/gpu/drm/i915/Makefile | 4 + drivers/gpu/drm/i915/gvt/debug.h | 18 +- drivers/gpu/drm/i915/i915_params.c| 35 +++ drivers/gpu/drm/nouveau/nouveau_drm.c | 36 ++- include/drm/drm_print.h | 150 +++-- include/linux/dynamic_debug.h | 60 lib/dynamic_debug.c | 79 - 12 files changed, 582 insertions(+), 206 deletions(-) -- 2.31.1
Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."
On 2021-08-31 9:11 a.m., Daniel Vetter wrote: On Thu, Aug 26, 2021 at 11:04:14AM +0200, Daniel Vetter wrote: On Thu, Aug 19, 2021 at 11:25:09AM -0400, Andrey Grodzovsky wrote: On 2021-08-19 5:30 a.m., Daniel Vetter wrote: On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote: On 2021-08-18 10:42 a.m., Daniel Vetter wrote: On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote: On 2021-08-18 10:32 a.m., Daniel Vetter wrote: On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote: On 2021-08-18 10:02 a.m., Alex Deucher wrote: + dri-devel Since scheduler is a shared component, please add dri-devel on all scheduler patches. On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen wrote: [Why] for bailing job, this commit will delete it from pending list thus the bailing job will never have a chance to be resubmitted even in advance tdr mode. [How] after embeded hw_fence into amdgpu_job is done, the race condition that this commit tries to work around is completely solved.So revert this commit. This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90. v2: add dma_fence_get/put() around timedout_job to avoid concurrent delete during processing timedout_job Signed-off-by: Jingwen Chen --- drivers/gpu/drm/scheduler/sched_main.c | 23 +-- 1 file changed, 5 insertions(+), 18 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index a2a953693b45..f9b9b3aefc4a 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work) { struct drm_gpu_scheduler *sched; struct drm_sched_job *job; + struct dma_fence *fence; enum drm_gpu_sched_stat status = DRM_GPU_SCHED_STAT_NOMINAL; sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); @@ -325,11 +326,10 @@ static void drm_sched_job_timedout(struct work_struct *work) if (job) { /* -* Remove the bad job so it cannot be freed by concurrent -* drm_sched_cleanup_jobs. It will be reinserted back after sched->thread -* is parked at which point it's safe. +* Get job->s_fence->parent here to avoid concurrent delete during +* processing timedout_job */ - list_del_init(>list); + fence = dma_fence_get(job->s_fence->parent); While this is true for amdgpu, it has no meaning for other drivers for whom we haven't done the refactoring of embedding HW fence (parent) into the job structure. In fact thinking about it, unless you do the HW fence embedding for all the drivers using the scheduler you cannot revert this patch or you will just break them. btw, why did you do that embedding? I do still have my patches with dma_fence annotations floating around, but my idea at least was to fix that issue with a mempool, not with embeddeding. What was the motivation for embedding the wh fence? -Daniel The motivation was 2 fold, avoid memory allocation during jobs submissions (HW fence allocation) because as Christian explained this leads to deadlock with mm code during evictions due to memory pressure (Christian can clarify if I messed Yeah that's the exact same thing I've chased with my dma_fence annotations, but thus far zero to none interested in getting it sorted. I think it'd be good to have some cross-driver agreement on how this should be solved before someone just charges ahead ... this explanation). Second is to exactly revert this patch because while it solved the issue described in the patch it created another with drivers who baildc out early during TDR handling for various reason and the job would just leak because it was already removed form pending list. Can't we reinsert it before we restart the scheduler thread? It might need a separate list for that due to the lockless queue tricks. Or am I thinking about the wrong kind of "we lost the job"? -Danile If you look at the original patch it would reinsert it even earlier - right after stopping the SW scheduler thread, and even then it was to late for some drivers as they would decide to return back from their TDR handler even before that. It is solvable but in an ugly way as far as I see, you need to require each driver in his code to put the job back in the list if they do it before reaching the place where scheduler framework does it. Kind of spaghetti code seems to me. Hm yeah I didn't realize this all happens before we stop the scheduler thread. Why can't we stop the scheduler thread first, so that there's guaranteed no race? I've recently had a lot of discussions with panfrost folks about their reset that spawns across engines, and without stopping the scheduler thread first before you touch anything it's just plain impossible. Talked with
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
On 2021-08-31 11:23, Andrey Grodzovsky wrote: > On 2021-08-31 10:38 a.m., Daniel Vetter wrote: >> On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote: >>> On 2021-08-31 10:03 a.m., Daniel Vetter wrote: On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote: > It's says patch [2/2] but i can't find patch 1 > > On 2021-08-31 6:35 a.m., Monk Liu wrote: >> tested-by: jingwen chen >> Signed-off-by: Monk Liu >> Signed-off-by: jingwen chen >> --- >> drivers/gpu/drm/scheduler/sched_main.c | 24 >> 1 file changed, 4 insertions(+), 20 deletions(-) >> >> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >> b/drivers/gpu/drm/scheduler/sched_main.c >> index ecf8140..894fdb24 100644 >> --- a/drivers/gpu/drm/scheduler/sched_main.c >> +++ b/drivers/gpu/drm/scheduler/sched_main.c >> @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct >> work_struct *work) >> sched = container_of(work, struct drm_gpu_scheduler, >> work_tdr.work); >> /* Protects against concurrent deletion in >> drm_sched_get_cleanup_job */ >> +if (!__kthread_should_park(sched->thread)) >> +kthread_park(sched->thread); >> + > As mentioned before, without serializing against other TDR handlers from > other > schedulers you just race here against them, e.g. you parked it now but > another > one in progress will unpark it as part of calling drm_sched_start for > other > rings[1] > Unless I am missing something since I haven't found patch [1/2] > > [1] - > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L5041data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=PrrvFHAwDeLlbcOctlKHsCFs9%2F56XNVzoLVcT1RoJgI%3Dreserved=0 You need to have your own wq and run all your tdr work on the same wq if your reset has any cross-engine impact. >>> IMHO what is problematic in serializing vs. locking (with trylock and bail >>> out like we do in [1]) is for multiple TO events arising from same reason >>> like maybe one job just waits for another and once first is hanged the >>> second will also appear to be hanged triggering it's own TO event. >>> In this case multiple TOs event will trigger multiple resets if we serialize >>> but if we use lock with trylock the second one will quietly bail out. >>> >>> [1] >>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L4903data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=kxSWBoshVTLMMFIFZdPsP4MBgUAoC%2F3szo9GUemSRLY%3Dreserved=0 >> Hm so I guess a single wq here, that will hold up all other TO. And they >> should recheck whether the job is moving meanwhile. > > Can you clarify about this ? What job should be moving ? The dependent job ? > > >> Also unless you use hw semaphores the job shouldn't even start before the >> deps are singalled, so not sure how this goes wrong? > > What about a simple example where > we actually can submit a shader on one ring and a simple > WAIT_REG_MEM packet on another to wait for the shader to write > a specific value to specific memory location. Here you have both of them > started > in close proximity and no explicit dependencies involved (at the > scheduler level) > and yet if the shader hangs also the WAIT_REG_MEM job will hang. > > >> The vm_id flush stuff can make things a bit more fun for your specific >> case, but in your specific case you have to run all TO handlers on the >> same ordered workqueue anyway (because trying to paper over this in other >> ways doesn't work imo). > > I didn't get this one. So, awhile back I tried to "serialize" this by moving timed-out jobs into their own timed-out-dedicated list, then freeing them asynchronously, but I never got it to work reliably due to races with low-level drivers and assumptions made way back. My idea was to atomic-move timed-out jobs into their own list, at the time of timeout, and later asynchronously to free them (or better yet, inquire about their state, and free them or move them back--ideally the inquiry is atomic and done at timeout time before being moved to the timeout list). Anyway... But I found out that all these knobs and levers weren't in place and I was getting problems with it and it never materialized. The paradigm was loosely "let someone else do
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
On 2021-08-31 10:38 a.m., Daniel Vetter wrote: On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote: On 2021-08-31 10:03 a.m., Daniel Vetter wrote: On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote: It's says patch [2/2] but i can't find patch 1 On 2021-08-31 6:35 a.m., Monk Liu wrote: tested-by: jingwen chen Signed-off-by: Monk Liu Signed-off-by: jingwen chen --- drivers/gpu/drm/scheduler/sched_main.c | 24 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index ecf8140..894fdb24 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct *work) sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ + if (!__kthread_should_park(sched->thread)) + kthread_park(sched->thread); + As mentioned before, without serializing against other TDR handlers from other schedulers you just race here against them, e.g. you parked it now but another one in progress will unpark it as part of calling drm_sched_start for other rings[1] Unless I am missing something since I haven't found patch [1/2] [1] - https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L5041data=04%7C01%7Candrey.grodzovsky%40amd.com%7C86b39a7bbcd34a18c6e908d96c8cf862%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660174991641911%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=tz7lxvL%2BR6NrpcdfIg1Mjw5lZ55%2F5HTPF%2Fkwu7wqvqE%3Dreserved=0 You need to have your own wq and run all your tdr work on the same wq if your reset has any cross-engine impact. IMHO what is problematic in serializing vs. locking (with trylock and bail out like we do in [1]) is for multiple TO events arising from same reason like maybe one job just waits for another and once first is hanged the second will also appear to be hanged triggering it's own TO event. In this case multiple TOs event will trigger multiple resets if we serialize but if we use lock with trylock the second one will quietly bail out. [1] https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L4903data=04%7C01%7Candrey.grodzovsky%40amd.com%7C86b39a7bbcd34a18c6e908d96c8cf862%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660174991651903%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=SpirDOLVdw5kIZAq0LHjnB0Qy6apwPLDPFjm61Wc2ko%3Dreserved=0 Hm so I guess a single wq here, that will hold up all other TO. And they should recheck whether the job is moving meanwhile. Can you clarify about this ? What job should be moving ? The dependent job ? Also unless you use hw semaphores the job shouldn't even start before the deps are singalled, so not sure how this goes wrong? What about a simple example where we actually can submit a shader on one ring and a simple WAIT_REG_MEM packet on another to wait for the shader to write a specific value to specific memory location. Here you have both of them started in close proximity and no explicit dependencies involved (at the scheduler level) and yet if the shader hangs also the WAIT_REG_MEM job will hang. The vm_id flush stuff can make things a bit more fun for your specific case, but in your specific case you have to run all TO handlers on the same ordered workqueue anyway (because trying to paper over this in other ways doesn't work imo). I didn't get this one. Andrey So I think this should all work, no need for tricky cross-scheduler locking. -Daniel Andrey See https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdri.freedesktop.org%2Fdocs%2Fdrm%2Fgpu%2Fdrm-mm.html%23c.drm_sched_backend_opsdata=04%7C01%7Candrey.grodzovsky%40amd.com%7C86b39a7bbcd34a18c6e908d96c8cf862%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660174991651903%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=uV4s%2Fsu%2FKabZvsRMd1PAyd36JRSz91aPfYEn8PlvFlM%3Dreserved=0 for the ->timeout_job callback docs. I thought I brought this up already? -Daniel Yes, this discussion is a continuation of your comment about serializing, I mentioned before that you proposed it. Andrey Andrey spin_lock(>job_list_lock); job = list_first_entry_or_null(>pending_list, struct drm_sched_job, list); if (job) { - /* -* Remove the bad job so it cannot be freed by concurrent -*
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
On 2021-08-31 08:59, Daniel Vetter wrote: > Can we please have some actual commit message here, with detailed > explanation of the race/bug/whatever, how you fix it and why this is the > best option? I agree with Daniel--a narrative form of a commit message is so much easier for humans to digest. The "[what]"/"[why]"/"[how]" and "issue"/"fix" format is somewhat dry and uninformative, and leaves much to be desired. Regards, Luben > > On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote: >> tested-by: jingwen chen >> Signed-off-by: Monk Liu >> Signed-off-by: jingwen chen >> --- >> drivers/gpu/drm/scheduler/sched_main.c | 24 >> 1 file changed, 4 insertions(+), 20 deletions(-) >> >> diff --git a/drivers/gpu/drm/scheduler/sched_main.c >> b/drivers/gpu/drm/scheduler/sched_main.c >> index ecf8140..894fdb24 100644 >> --- a/drivers/gpu/drm/scheduler/sched_main.c >> +++ b/drivers/gpu/drm/scheduler/sched_main.c >> @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct >> *work) >> sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); >> >> /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ >> +if (!__kthread_should_park(sched->thread)) > This is a __ function, i.e. considered internal, and it's lockless atomic, > i.e. unordered. And you're not explaining why this works. > > Iow it's probably buggy, and an just unconditionally parking the kthread > is probably the right thing to do. If it's not the right thing to do, > there's a bug here for sure. > -Daniel > >> +kthread_park(sched->thread); >> + >> spin_lock(>job_list_lock); >> job = list_first_entry_or_null(>pending_list, >> struct drm_sched_job, list); >> >> if (job) { >> -/* >> - * Remove the bad job so it cannot be freed by concurrent >> - * drm_sched_cleanup_jobs. It will be reinserted back after >> sched->thread >> - * is parked at which point it's safe. >> - */ >> -list_del_init(>list); >> spin_unlock(>job_list_lock); >> >> +/* vendor's timeout_job should call drm_sched_start() */ >> status = job->sched->ops->timedout_job(job); >> >> /* >> @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, >> struct drm_sched_job *bad) >> kthread_park(sched->thread); >> >> /* >> - * Reinsert back the bad job here - now it's safe as >> - * drm_sched_get_cleanup_job cannot race against us and release the >> - * bad job at this point - we parked (waited for) any in progress >> - * (earlier) cleanups and drm_sched_get_cleanup_job will not be called >> - * now until the scheduler thread is unparked. >> - */ >> -if (bad && bad->sched == sched) >> -/* >> - * Add at the head of the queue to reflect it was the earliest >> - * job extracted. >> - */ >> -list_add(>list, >pending_list); >> - >> -/* >> * Iterate the job list from later to earlier one and either deactive >> * their HW callbacks or remove them from pending list if they already >> * signaled. >> -- >> 2.7.4 >>
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote: > > On 2021-08-31 10:03 a.m., Daniel Vetter wrote: > > On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote: > > > It's says patch [2/2] but i can't find patch 1 > > > > > > On 2021-08-31 6:35 a.m., Monk Liu wrote: > > > > tested-by: jingwen chen > > > > Signed-off-by: Monk Liu > > > > Signed-off-by: jingwen chen > > > > --- > > > >drivers/gpu/drm/scheduler/sched_main.c | 24 > > > >1 file changed, 4 insertions(+), 20 deletions(-) > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > > > b/drivers/gpu/drm/scheduler/sched_main.c > > > > index ecf8140..894fdb24 100644 > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct > > > > work_struct *work) > > > > sched = container_of(work, struct drm_gpu_scheduler, > > > > work_tdr.work); > > > > /* Protects against concurrent deletion in > > > > drm_sched_get_cleanup_job */ > > > > + if (!__kthread_should_park(sched->thread)) > > > > + kthread_park(sched->thread); > > > > + > > > > > > As mentioned before, without serializing against other TDR handlers from > > > other > > > schedulers you just race here against them, e.g. you parked it now but > > > another > > > one in progress will unpark it as part of calling drm_sched_start for > > > other > > > rings[1] > > > Unless I am missing something since I haven't found patch [1/2] > > > > > > [1] - > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L5041data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cc697c75898664f678f4b08d96c8820e7%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660154199259544%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=1Y8Tuh2fLtexYsGrmQD2ITTSIfUVJmqTylwgMryDjxw%3Dreserved=0 > > You need to have your own wq and run all your tdr work on the same wq if > > your reset has any cross-engine impact. > > > IMHO what is problematic in serializing vs. locking (with trylock and bail > out like we do in [1]) is for multiple TO events arising from same reason > like maybe one job just waits for another and once first is hanged the > second will also appear to be hanged triggering it's own TO event. > In this case multiple TOs event will trigger multiple resets if we serialize > but if we use lock with trylock the second one will quietly bail out. > > [1] > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4903 Hm so I guess a single wq here, that will hold up all other TO. And they should recheck whether the job is moving meanwhile. Also unless you use hw semaphores the job shouldn't even start before the deps are singalled, so not sure how this goes wrong? The vm_id flush stuff can make things a bit more fun for your specific case, but in your specific case you have to run all TO handlers on the same ordered workqueue anyway (because trying to paper over this in other ways doesn't work imo). So I think this should all work, no need for tricky cross-scheduler locking. -Daniel > > Andrey > > > > > > See > > > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdri.freedesktop.org%2Fdocs%2Fdrm%2Fgpu%2Fdrm-mm.html%23c.drm_sched_backend_opsdata=04%7C01%7Candrey.grodzovsky%40amd.com%7Cc697c75898664f678f4b08d96c8820e7%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660154199259544%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=tLjFaN7mREYjjydxHszbQlTk3lwH4bQtBDVzHFHvPJg%3Dreserved=0 > > > > for the ->timeout_job callback docs. I thought I brought this up already? > > -Daniel > > > Yes, this discussion is a continuation of your comment about serializing, I > mentioned before that you proposed it. > > Andrey > > > > > > > Andrey > > > > > > > > > > spin_lock(>job_list_lock); > > > > job = list_first_entry_or_null(>pending_list, > > > >struct drm_sched_job, list); > > > > if (job) { > > > > - /* > > > > -* Remove the bad job so it cannot be freed by > > > > concurrent > > > > -* drm_sched_cleanup_jobs. It will be reinserted back > > > > after sched->thread > > > > -* is parked at which point it's safe. > > > > -*/ > > > > - list_del_init(>list); > > > > spin_unlock(>job_list_lock); > > > > + /* vendor's timeout_job should call drm_sched_start() */ > > > > status = job->sched->ops->timedout_job(job); > > > > /* > > > > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler > > > >
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
On 2021-08-31 10:03 a.m., Daniel Vetter wrote: On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote: It's says patch [2/2] but i can't find patch 1 On 2021-08-31 6:35 a.m., Monk Liu wrote: tested-by: jingwen chen Signed-off-by: Monk Liu Signed-off-by: jingwen chen --- drivers/gpu/drm/scheduler/sched_main.c | 24 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index ecf8140..894fdb24 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct *work) sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ + if (!__kthread_should_park(sched->thread)) + kthread_park(sched->thread); + As mentioned before, without serializing against other TDR handlers from other schedulers you just race here against them, e.g. you parked it now but another one in progress will unpark it as part of calling drm_sched_start for other rings[1] Unless I am missing something since I haven't found patch [1/2] [1] - https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L5041data=04%7C01%7Candrey.grodzovsky%40amd.com%7Cc697c75898664f678f4b08d96c8820e7%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660154199259544%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=1Y8Tuh2fLtexYsGrmQD2ITTSIfUVJmqTylwgMryDjxw%3Dreserved=0 You need to have your own wq and run all your tdr work on the same wq if your reset has any cross-engine impact. IMHO what is problematic in serializing vs. locking (with trylock and bail out like we do in [1]) is for multiple TO events arising from same reason like maybe one job just waits for another and once first is hanged the second will also appear to be hanged triggering it's own TO event. In this case multiple TOs event will trigger multiple resets if we serialize but if we use lock with trylock the second one will quietly bail out. [1] https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4903 Andrey See https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdri.freedesktop.org%2Fdocs%2Fdrm%2Fgpu%2Fdrm-mm.html%23c.drm_sched_backend_opsdata=04%7C01%7Candrey.grodzovsky%40amd.com%7Cc697c75898664f678f4b08d96c8820e7%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660154199259544%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=tLjFaN7mREYjjydxHszbQlTk3lwH4bQtBDVzHFHvPJg%3Dreserved=0 for the ->timeout_job callback docs. I thought I brought this up already? -Daniel Yes, this discussion is a continuation of your comment about serializing, I mentioned before that you proposed it. Andrey Andrey spin_lock(>job_list_lock); job = list_first_entry_or_null(>pending_list, struct drm_sched_job, list); if (job) { - /* -* Remove the bad job so it cannot be freed by concurrent -* drm_sched_cleanup_jobs. It will be reinserted back after sched->thread -* is parked at which point it's safe. -*/ - list_del_init(>list); spin_unlock(>job_list_lock); + /* vendor's timeout_job should call drm_sched_start() */ status = job->sched->ops->timedout_job(job); /* @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad) kthread_park(sched->thread); /* -* Reinsert back the bad job here - now it's safe as -* drm_sched_get_cleanup_job cannot race against us and release the -* bad job at this point - we parked (waited for) any in progress -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called -* now until the scheduler thread is unparked. -*/ - if (bad && bad->sched == sched) - /* -* Add at the head of the queue to reflect it was the earliest -* job extracted. -*/ - list_add(>list, >pending_list); - - /* * Iterate the job list from later to earlier one and either deactive * their HW callbacks or remove them from pending list if they already * signaled.
Re: [PATCH] drm/amdgpu: stop scheduler when calling hw_fini (v2)
On Mon, Aug 30, 2021 at 2:24 AM Guchun Chen wrote: > > This gurantees no more work on the ring can be submitted > to hardware in suspend/resume case, otherwise a potential > race will occur and the ring will get no chance to stay > empty before suspend. > > v2: Call drm_sched_resubmit_job before drm_sched_start to > restart jobs from the pending list. > > Suggested-by: Andrey Grodzovsky > Suggested-by: Christian König > Signed-off-by: Guchun Chen Maybe add: Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)") ? Alex > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 > 1 file changed, 8 insertions(+) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > index b439eb7d4177..fd4ba076ff8a 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c > @@ -552,6 +552,9 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device > *adev) > if (!ring || !ring->fence_drv.initialized) > continue; > > + if (!ring->no_scheduler) > + drm_sched_stop(>sched, NULL); > + > /* You can't wait for HW to signal if it's gone */ > if (!drm_dev_is_unplugged(>ddev)) > r = amdgpu_fence_wait_empty(ring); > @@ -611,6 +614,11 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device > *adev) > if (!ring || !ring->fence_drv.initialized) > continue; > > + if (!ring->no_scheduler) { > + drm_sched_resubmit_jobs(>sched); > + drm_sched_start(>sched, true); > + } > + > /* enable the interrupt */ > if (ring->fence_drv.irq_src) > amdgpu_irq_get(adev, ring->fence_drv.irq_src, > -- > 2.17.1 >
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote: > It's says patch [2/2] but i can't find patch 1 > > On 2021-08-31 6:35 a.m., Monk Liu wrote: > > tested-by: jingwen chen > > Signed-off-by: Monk Liu > > Signed-off-by: jingwen chen > > --- > > drivers/gpu/drm/scheduler/sched_main.c | 24 > > 1 file changed, 4 insertions(+), 20 deletions(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > b/drivers/gpu/drm/scheduler/sched_main.c > > index ecf8140..894fdb24 100644 > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct > > *work) > > sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); > > /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ > > + if (!__kthread_should_park(sched->thread)) > > + kthread_park(sched->thread); > > + > > > As mentioned before, without serializing against other TDR handlers from > other > schedulers you just race here against them, e.g. you parked it now but > another > one in progress will unpark it as part of calling drm_sched_start for other > rings[1] > Unless I am missing something since I haven't found patch [1/2] > > [1] - > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L5041 You need to have your own wq and run all your tdr work on the same wq if your reset has any cross-engine impact. See https://dri.freedesktop.org/docs/drm/gpu/drm-mm.html#c.drm_sched_backend_ops for the ->timeout_job callback docs. I thought I brought this up already? -Daniel > > Andrey > > > > spin_lock(>job_list_lock); > > job = list_first_entry_or_null(>pending_list, > >struct drm_sched_job, list); > > if (job) { > > - /* > > -* Remove the bad job so it cannot be freed by concurrent > > -* drm_sched_cleanup_jobs. It will be reinserted back after > > sched->thread > > -* is parked at which point it's safe. > > -*/ > > - list_del_init(>list); > > spin_unlock(>job_list_lock); > > + /* vendor's timeout_job should call drm_sched_start() */ > > status = job->sched->ops->timedout_job(job); > > /* > > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, > > struct drm_sched_job *bad) > > kthread_park(sched->thread); > > /* > > -* Reinsert back the bad job here - now it's safe as > > -* drm_sched_get_cleanup_job cannot race against us and release the > > -* bad job at this point - we parked (waited for) any in progress > > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called > > -* now until the scheduler thread is unparked. > > -*/ > > - if (bad && bad->sched == sched) > > - /* > > -* Add at the head of the queue to reflect it was the earliest > > -* job extracted. > > -*/ > > - list_add(>list, >pending_list); > > - > > - /* > > * Iterate the job list from later to earlier one and either deactive > > * their HW callbacks or remove them from pending list if they already > > * signaled. -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
It's says patch [2/2] but i can't find patch 1 On 2021-08-31 6:35 a.m., Monk Liu wrote: tested-by: jingwen chen Signed-off-by: Monk Liu Signed-off-by: jingwen chen --- drivers/gpu/drm/scheduler/sched_main.c | 24 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index ecf8140..894fdb24 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct *work) sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ + if (!__kthread_should_park(sched->thread)) + kthread_park(sched->thread); + As mentioned before, without serializing against other TDR handlers from other schedulers you just race here against them, e.g. you parked it now but another one in progress will unpark it as part of calling drm_sched_start for other rings[1] Unless I am missing something since I haven't found patch [1/2] [1] - https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L5041 Andrey spin_lock(>job_list_lock); job = list_first_entry_or_null(>pending_list, struct drm_sched_job, list); if (job) { - /* -* Remove the bad job so it cannot be freed by concurrent -* drm_sched_cleanup_jobs. It will be reinserted back after sched->thread -* is parked at which point it's safe. -*/ - list_del_init(>list); spin_unlock(>job_list_lock); + /* vendor's timeout_job should call drm_sched_start() */ status = job->sched->ops->timedout_job(job); /* @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad) kthread_park(sched->thread); /* -* Reinsert back the bad job here - now it's safe as -* drm_sched_get_cleanup_job cannot race against us and release the -* bad job at this point - we parked (waited for) any in progress -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called -* now until the scheduler thread is unparked. -*/ - if (bad && bad->sched == sched) - /* -* Add at the head of the queue to reflect it was the earliest -* job extracted. -*/ - list_add(>list, >pending_list); - - /* * Iterate the job list from later to earlier one and either deactive * their HW callbacks or remove them from pending list if they already * signaled.
Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."
On Thu, Aug 26, 2021 at 11:04:14AM +0200, Daniel Vetter wrote: > On Thu, Aug 19, 2021 at 11:25:09AM -0400, Andrey Grodzovsky wrote: > > > > On 2021-08-19 5:30 a.m., Daniel Vetter wrote: > > > On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote: > > > > On 2021-08-18 10:42 a.m., Daniel Vetter wrote: > > > > > On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote: > > > > > > On 2021-08-18 10:32 a.m., Daniel Vetter wrote: > > > > > > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote: > > > > > > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote: > > > > > > > > > > > > > > > > > + dri-devel > > > > > > > > > > > > > > > > > > Since scheduler is a shared component, please add dri-devel > > > > > > > > > on all > > > > > > > > > scheduler patches. > > > > > > > > > > > > > > > > > > On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen > > > > > > > > > wrote: > > > > > > > > > > [Why] > > > > > > > > > > for bailing job, this commit will delete it from pending > > > > > > > > > > list thus the > > > > > > > > > > bailing job will never have a chance to be resubmitted even > > > > > > > > > > in advance > > > > > > > > > > tdr mode. > > > > > > > > > > > > > > > > > > > > [How] > > > > > > > > > > after embeded hw_fence into amdgpu_job is done, the race > > > > > > > > > > condition that > > > > > > > > > > this commit tries to work around is completely solved.So > > > > > > > > > > revert this > > > > > > > > > > commit. > > > > > > > > > > This reverts commit > > > > > > > > > > 135517d3565b48f4def3b1b82008bc17eb5d1c90. > > > > > > > > > > v2: > > > > > > > > > > add dma_fence_get/put() around timedout_job to avoid > > > > > > > > > > concurrent delete > > > > > > > > > > during processing timedout_job > > > > > > > > > > > > > > > > > > > > Signed-off-by: Jingwen Chen > > > > > > > > > > --- > > > > > > > > > > drivers/gpu/drm/scheduler/sched_main.c | 23 > > > > > > > > > > +-- > > > > > > > > > > 1 file changed, 5 insertions(+), 18 deletions(-) > > > > > > > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > > > b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > > > index a2a953693b45..f9b9b3aefc4a 100644 > > > > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > > > > > > > > > @@ -314,6 +314,7 @@ static void > > > > > > > > > > drm_sched_job_timedout(struct work_struct *work) > > > > > > > > > > { > > > > > > > > > > struct drm_gpu_scheduler *sched; > > > > > > > > > > struct drm_sched_job *job; > > > > > > > > > > + struct dma_fence *fence; > > > > > > > > > > enum drm_gpu_sched_stat status = > > > > > > > > > > DRM_GPU_SCHED_STAT_NOMINAL; > > > > > > > > > > > > > > > > > > > > sched = container_of(work, struct > > > > > > > > > > drm_gpu_scheduler, work_tdr.work); > > > > > > > > > > @@ -325,11 +326,10 @@ static void > > > > > > > > > > drm_sched_job_timedout(struct work_struct *work) > > > > > > > > > > > > > > > > > > > > if (job) { > > > > > > > > > > /* > > > > > > > > > > -* Remove the bad job so it cannot be freed > > > > > > > > > > by concurrent > > > > > > > > > > -* drm_sched_cleanup_jobs. It will be > > > > > > > > > > reinserted back after sched->thread > > > > > > > > > > -* is parked at which point it's safe. > > > > > > > > > > +* Get job->s_fence->parent here to avoid > > > > > > > > > > concurrent delete during > > > > > > > > > > +* processing timedout_job > > > > > > > > > > */ > > > > > > > > > > - list_del_init(>list); > > > > > > > > > > + fence = dma_fence_get(job->s_fence->parent); > > > > > > > > While this is true for amdgpu, it has no meaning for other > > > > > > > > drivers for whom > > > > > > > > we haven't > > > > > > > > done the refactoring of embedding HW fence (parent) into the > > > > > > > > job structure. > > > > > > > > In fact thinking > > > > > > > > about it, unless you do the HW fence embedding for all the > > > > > > > > drivers using the > > > > > > > > scheduler you cannot > > > > > > > > revert this patch or you will just break them. > > > > > > > btw, why did you do that embedding? I do still have my patches > > > > > > > with > > > > > > > dma_fence annotations floating around, but my idea at least was > > > > > > > to fix > > > > > > > that issue with a mempool, not with embeddeding. What was the > > > > > > > motivation > > > > > > > for embedding the wh fence? > > > > > > > -Daniel > > > > > > The motivation was 2 fold, avoid memory allocation during jobs > > > > > > submissions > > > > > > (HW fence allocation) because as Christian explained this leads to > > > > > > deadlock > > > > > > with > > > > > > mm code
Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)
On Fri, Aug 27, 2021 at 08:30:32PM +0200, Christian König wrote: > Yeah, that's what I meant with that the start of processing a job is a bit > swampy defined. > > Jobs overload, but we simply don't have another good indicator that a job > started except that the previous one completed. > > It's still better than starting the timer when pushing the job to the ring > buffer, because that is completely off. Not sure this matters that much really in practice, and in some cases we might want to actually just reset it all if the built up backlog is way too long. For anything that really runs way too long you want preempt-ctx/revoke fences anyway, not end-of-cs fences with tdr. -Daniel > > Christian. > > Am 27.08.21 um 20:22 schrieb Andrey Grodzovsky: > > As I mentioned to Monk before - what about cases such as in this test - > > https://gitlab.freedesktop.org/mesa/drm/-/commit/bc21168fa924d3fc4a000492e861f50a1a135b25 > > > > Here you don't have serialized sequence where when jobs finishes > > processing and second starts, they execute together concurrently - for > > those cases seems > > to me restarting the timer for the second job from scratch will let it > > hang much longer then allowed by TO value. > > > > Andrey > > > > On 2021-08-27 10:29 a.m., Christian König wrote: > > > I don't think that makes sense. > > > > > > See we don't want to start the time when the job is inserted into > > > the ring buffer, but rather when it starts processing. > > > > > > Starting processing is a bit swampy defined, but just starting the > > > timer when the previous job completes should be fine enough. > > > > > > Christian. > > > > > > Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky: > > > > The TS represents the point in time when the job was inserted > > > > into the pending list. > > > > I don't think it matters when it actually starts to be > > > > processed, what matters is when this job was inserted into > > > > pending list because right at that point you arm the TO timer > > > > (when no other is running already) > > > > and so when the previous job completes and you cancel and rearm > > > > again you can use that TS from the next job in pending list to > > > > calculate how much time has actually left for it to run before > > > > TDR must be initiated > > > > and not just give it again full TO value to run even if it has > > > > already been running for a while. > > > > > > > > Also, i am not sure also about the assumption that what we > > > > measure is processing by HW, what we measure is from the moment > > > > it was scheduled to ring to the moment the job completed (EOP > > > > event). At least that what our TDR timer measures and so it > > > > makes sense to set the TS at this point. > > > > > > > > Andrey > > > > > > > > On 2021-08-27 3:20 a.m., Liu, Monk wrote: > > > > > [AMD Official Use Only] > > > > > > > > > > what is that 'ts' representing for ? it looks to me the > > > > > jiffies that it get scheduled to the ring, but a job > > > > > scheduled to the ring doesn't represent it's being processed > > > > > by hw. > > > > > > > > > > Thanks > > > > > > > > > > -- > > > > > Monk Liu | Cloud-GPU Core team > > > > > -- > > > > > > > > > > -Original Message- > > > > > From: Grodzovsky, Andrey > > > > > Sent: Friday, August 27, 2021 4:14 AM > > > > > To: Liu, Monk ; > > > > > amd-gfx@lists.freedesktop.org; Koenig, Christian > > > > > > > > > > Cc: dri-de...@lists.freedesktop.org > > > > > Subject: Re: [PATCH] drm/sched: fix the bug of time out > > > > > calculation(v3) > > > > > > > > > > Attached quick patch for per job TTL calculation to make > > > > > more precises next timer expiration. It's on top of the > > > > > patch in this thread. Let me know if this makes sense. > > > > > > > > > > Andrey > > > > > > > > > > On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote: > > > > > > On 2021-08-26 12:55 a.m., Monk Liu wrote: > > > > > > > issue: > > > > > > > in cleanup_job the cancle_delayed_work will cancel a TO timer even > > > > > > > the its corresponding job is still running. > > > > > > > > > > > > > > fix: > > > > > > > do not cancel the timer in cleanup_job, instead do the cancelling > > > > > > > only when the heading job is signaled, and if there is a "next" > > > > > > > job > > > > > > > we start_timeout again. > > > > > > > > > > > > > > v2: > > > > > > > further cleanup the logic, and do the TDR timer cancelling if the > > > > > > > signaled job is the last one in its scheduler. > > > > > > > > > > > > > > v3: > > > > > > > change the issue description > > > > > > > remove the cancel_delayed_work in the begining of the cleanup_job > > > > > > > recover the implement of drm_sched_job_begin. > > > > > > > > > > > > > > TODO: > > > > > > > 1)introduce pause/resume scheduler in job_timeout to serial the > > > > > > > handling of scheduler and job_timeout. > > > > > > > 2)drop the
Re: [drm/amdgpu] Driver crashes on 5.13.9 kernel
On Mon, Aug 30, 2021 at 07:15:29PM +0300, Skyler Mäntysaari wrote: > I have tried kernel 5.13.13, without any difference and I haven't > tried with an older kernel, as this hardware is that new that I have > very little faith in less than 5.x kernel would even have support for > the needed GPU. Yeah might be. > What do you mean with git bisect? I have checked that the crash > happens somewhere in the monitor connection code: If we found some older version which work. Then with git bisect we can track which patch made this bug. But if it never work then git bisect won't help. You may still want to test with 5.14 as it was released today. If that does not work then I cannot help more. It is still very important to know if it is also broken in 5.14. Hopefully someone will able to help you. Argillander > [ 9605.269927] Call Trace: > [ 9605.269931] core_link_enable_stream+0x746/0x870 [amdgpu] > [ 9605.270038] dce110_apply_ctx_to_hw+0x519/0x560 [amdgpu] > [ 9605.270146] dc_commit_state+0x2f6/0xa50 [amdgpu] > [ 9605.270249] amdgpu_dm_atomic_commit_tail+0x569/0x26a0 [amdgpu] > [ 9605.270326] ? kfree+0xc3/0x460 > [ 9605.270329] ? dcn30_validate_bandwidth+0x11f/0x270 [amdgpu] > [ 9605.270402] ? dcn30_validate_bandwidth+0x11f/0x270 [amdgpu] > [ 9605.270469] ? dm_plane_helper_prepare_fb+0x19c/0x250 [amdgpu] > [ 9605.270542] ? __cond_resched+0x16/0x40 > [ 9605.270544] ? __wait_for_common+0x3b/0x160 > [ 9605.270545] ? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e > [ 9605.270548] commit_tail+0x94/0x130 [drm_kms_helper] > [ 9605.270557] drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper] > [ 9605.270562] drm_atomic_helper_set_config+0x70/0xb0 [drm_kms_helper] > [ 9605.270568] drm_mode_setcrtc+0x1d3/0x6d0 [drm] > [ 9605.270582] ? drm_mode_getcrtc+0x180/0x180 [drm] > [ 9605.270590] drm_ioctl_kernel+0xaa/0xf0 [drm] > [ 9605.270600] drm_ioctl+0x220/0x3c0 [drm] > [ 9605.270609] ? drm_mode_getcrtc+0x180/0x180 [drm] > [ 9605.270618] amdgpu_drm_ioctl+0x49/0x80 [amdgpu] > [ 9605.270673] __x64_sys_ioctl+0x83/0xb0 > [ 9605.270675] do_syscall_64+0x40/0xb0 > [ 9605.270677] entry_SYSCALL_64_after_hwframe+0x44/0xae > > > On Sun, Aug 29, 2021, at 20:34, Kari Argillander wrote: > > On Sun, Aug 29, 2021 at 06:38:39PM +0300, Skyler Mäntysaari wrote: > > > Hello everyone on the list, > > > > This is universal kernel list and it is not read by many. I have added > > hopefully right list (amd-gfx@lists.freedesktop.org). > > > > > Subject: Re: [drm/amdgpu] Driver crashes on 5.13.9 kernel > > > > I have no influence or knowledge about this driver, but I still try to > > help because it seems good bug report. Have you test with 5.13.13 or > > 5.14-rc7. Does this work with some other kernel? If needed can you git > > bisect if needed? You will probably get some support for it if needed. > > > > Argillander > > > > > I thought that this should probably be discussed here, so I came > > > across weird issue to me which is driver crashing while trying to get > > > one of my monitors working on Gentoo. I would like to ask here how > > > that would happen that the Display appears to jump from DisplayPort-6 > > > (physical port) to DisplayPort-7 (which doesn't exist physically)? Has > > > anyone else experienced this? > > > > > > It seems that the driver sees a rather large amount of inputs for the > > > GPU, even though I only have 4, 3 of which are DisplayPort, and the > > > issue monitor is also on DisplayPort. > > > > > > Hardware: > > > CPU: AMD Ryzen 5800X > > > GPU: AMD Radeon RX 6800 > > > System Memory: 32GB of DDR4 3200Mhz > > > Display(s): BenQ Zowie XL2430 (1080p), DELL U2414H (1080p), DELL U2415 > > > (1920x1200) > > > Type of Diplay Connection: All are connected via Display-Port > > > > > > Related DRM issue: > > > https://gitlab.freedesktop.org/drm/amd/-/issues/1621 which includes > > > logs too. > > > > > > > > > Best regards, > > > Skyler Mäntysaari
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote: > Can we please have some actual commit message here, with detailed > explanation of the race/bug/whatever, how you fix it and why this is the > best option? > > On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote: > > tested-by: jingwen chen > > Signed-off-by: Monk Liu > > Signed-off-by: jingwen chen > > --- > > drivers/gpu/drm/scheduler/sched_main.c | 24 > > 1 file changed, 4 insertions(+), 20 deletions(-) > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > > b/drivers/gpu/drm/scheduler/sched_main.c > > index ecf8140..894fdb24 100644 > > --- a/drivers/gpu/drm/scheduler/sched_main.c > > +++ b/drivers/gpu/drm/scheduler/sched_main.c > > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct > > *work) > > sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); > > > > /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ > > + if (!__kthread_should_park(sched->thread)) > > This is a __ function, i.e. considered internal, and it's lockless atomic, > i.e. unordered. And you're not explaining why this works. > > Iow it's probably buggy, and an just unconditionally parking the kthread > is probably the right thing to do. If it's not the right thing to do, > there's a bug here for sure. Also why don't we reuse the function drivers already have to stop a scheduler thread? We seem to have two kthread_park now, that's probably one too much. -Daniel > > + kthread_park(sched->thread); > > + > > spin_lock(>job_list_lock); > > job = list_first_entry_or_null(>pending_list, > >struct drm_sched_job, list); > > > > if (job) { > > - /* > > -* Remove the bad job so it cannot be freed by concurrent > > -* drm_sched_cleanup_jobs. It will be reinserted back after > > sched->thread > > -* is parked at which point it's safe. > > -*/ > > - list_del_init(>list); > > spin_unlock(>job_list_lock); > > > > + /* vendor's timeout_job should call drm_sched_start() */ > > status = job->sched->ops->timedout_job(job); > > > > /* > > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, > > struct drm_sched_job *bad) > > kthread_park(sched->thread); > > > > /* > > -* Reinsert back the bad job here - now it's safe as > > -* drm_sched_get_cleanup_job cannot race against us and release the > > -* bad job at this point - we parked (waited for) any in progress > > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called > > -* now until the scheduler thread is unparked. > > -*/ > > - if (bad && bad->sched == sched) > > - /* > > -* Add at the head of the queue to reflect it was the earliest > > -* job extracted. > > -*/ > > - list_add(>list, >pending_list); > > - > > - /* > > * Iterate the job list from later to earlier one and either deactive > > * their HW callbacks or remove them from pending list if they already > > * signaled. > > -- > > 2.7.4 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
Can we please have some actual commit message here, with detailed explanation of the race/bug/whatever, how you fix it and why this is the best option? On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote: > tested-by: jingwen chen > Signed-off-by: Monk Liu > Signed-off-by: jingwen chen > --- > drivers/gpu/drm/scheduler/sched_main.c | 24 > 1 file changed, 4 insertions(+), 20 deletions(-) > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c > b/drivers/gpu/drm/scheduler/sched_main.c > index ecf8140..894fdb24 100644 > --- a/drivers/gpu/drm/scheduler/sched_main.c > +++ b/drivers/gpu/drm/scheduler/sched_main.c > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct > *work) > sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); > > /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ > + if (!__kthread_should_park(sched->thread)) This is a __ function, i.e. considered internal, and it's lockless atomic, i.e. unordered. And you're not explaining why this works. Iow it's probably buggy, and an just unconditionally parking the kthread is probably the right thing to do. If it's not the right thing to do, there's a bug here for sure. -Daniel > + kthread_park(sched->thread); > + > spin_lock(>job_list_lock); > job = list_first_entry_or_null(>pending_list, > struct drm_sched_job, list); > > if (job) { > - /* > - * Remove the bad job so it cannot be freed by concurrent > - * drm_sched_cleanup_jobs. It will be reinserted back after > sched->thread > - * is parked at which point it's safe. > - */ > - list_del_init(>list); > spin_unlock(>job_list_lock); > > + /* vendor's timeout_job should call drm_sched_start() */ > status = job->sched->ops->timedout_job(job); > > /* > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, > struct drm_sched_job *bad) > kthread_park(sched->thread); > > /* > - * Reinsert back the bad job here - now it's safe as > - * drm_sched_get_cleanup_job cannot race against us and release the > - * bad job at this point - we parked (waited for) any in progress > - * (earlier) cleanups and drm_sched_get_cleanup_job will not be called > - * now until the scheduler thread is unparked. > - */ > - if (bad && bad->sched == sched) > - /* > - * Add at the head of the queue to reflect it was the earliest > - * job extracted. > - */ > - list_add(>list, >pending_list); > - > - /* >* Iterate the job list from later to earlier one and either deactive >* their HW callbacks or remove them from pending list if they already >* signaled. > -- > 2.7.4 > -- Daniel Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
[PATCH 2/2] drm/sched: serialize job_timeout and scheduler
tested-by: jingwen chen Signed-off-by: Monk Liu Signed-off-by: jingwen chen --- drivers/gpu/drm/scheduler/sched_main.c | 24 1 file changed, 4 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index ecf8140..894fdb24 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct *work) sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work); /* Protects against concurrent deletion in drm_sched_get_cleanup_job */ + if (!__kthread_should_park(sched->thread)) + kthread_park(sched->thread); + spin_lock(>job_list_lock); job = list_first_entry_or_null(>pending_list, struct drm_sched_job, list); if (job) { - /* -* Remove the bad job so it cannot be freed by concurrent -* drm_sched_cleanup_jobs. It will be reinserted back after sched->thread -* is parked at which point it's safe. -*/ - list_del_init(>list); spin_unlock(>job_list_lock); + /* vendor's timeout_job should call drm_sched_start() */ status = job->sched->ops->timedout_job(job); /* @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, struct drm_sched_job *bad) kthread_park(sched->thread); /* -* Reinsert back the bad job here - now it's safe as -* drm_sched_get_cleanup_job cannot race against us and release the -* bad job at this point - we parked (waited for) any in progress -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called -* now until the scheduler thread is unparked. -*/ - if (bad && bad->sched == sched) - /* -* Add at the head of the queue to reflect it was the earliest -* job extracted. -*/ - list_add(>list, >pending_list); - - /* * Iterate the job list from later to earlier one and either deactive * their HW callbacks or remove them from pending list if they already * signaled. -- 2.7.4
[PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)
issue: in cleanup_job the cancle_delayed_work will cancel a TO timer even the its corresponding job is still running. fix: do not cancel the timer in cleanup_job, instead do the cancelling only when the heading job is signaled, and if there is a "next" job we start_timeout again. v2: further cleanup the logic, and do the TDR timer cancelling if the signaled job is the last one in its scheduler. v3: change the issue description remove the cancel_delayed_work in the begining of the cleanup_job recover the implement of drm_sched_job_begin. TODO: 1)introduce pause/resume scheduler in job_timeout to serial the handling of scheduler and job_timeout. 2)drop the bad job's del and insert in scheduler due to above serialization (no race issue anymore with the serialization) tested-by: jingwen Signed-off-by: Monk Liu --- drivers/gpu/drm/scheduler/sched_main.c | 25 ++--- 1 file changed, 10 insertions(+), 15 deletions(-) diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c index a2a9536..ecf8140 100644 --- a/drivers/gpu/drm/scheduler/sched_main.c +++ b/drivers/gpu/drm/scheduler/sched_main.c @@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) { struct drm_sched_job *job, *next; - /* -* Don't destroy jobs while the timeout worker is running OR thread -* is being parked and hence assumed to not touch pending_list -*/ - if ((sched->timeout != MAX_SCHEDULE_TIMEOUT && - !cancel_delayed_work(>work_tdr)) || - kthread_should_park()) + if (kthread_should_park()) return NULL; spin_lock(>job_list_lock); @@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) if (job && dma_fence_is_signaled(>s_fence->finished)) { /* remove job from pending_list */ list_del_init(>list); + + /* cancel this job's TO timer */ + cancel_delayed_work(>work_tdr); /* make the scheduled timestamp more accurate */ next = list_first_entry_or_null(>pending_list, typeof(*next), list); - if (next) + + if (next) { next->s_fence->scheduled.timestamp = job->s_fence->finished.timestamp; - + /* start TO timer for next job */ + drm_sched_start_timeout(sched); + } } else { job = NULL; - /* queue timeout for next job */ - drm_sched_start_timeout(sched); } spin_unlock(>job_list_lock); @@ -791,11 +789,8 @@ static int drm_sched_main(void *param) (entity = drm_sched_select_entity(sched))) || kthread_should_stop()); - if (cleanup_job) { + if (cleanup_job) sched->ops->free_job(cleanup_job); - /* queue timeout for next job */ - drm_sched_start_timeout(sched); - } if (!entity) continue; -- 2.7.4
[PATCH] drm/amdgpu: Clear RAS interrupt status on aldebaran
[AMD Official Use Only] Submitting patch to resolve incorrect register address' on Aldebaran affecting RAS interrupt handling 0001-drm-amdgpu-Clear-RAS-interrupt-status-on-aldebaran.patch Description: 0001-drm-amdgpu-Clear-RAS-interrupt-status-on-aldebaran.patch
Re: [PATCH v3] drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
Am 31.08.21 um 09:08 schrieb Pan, Xinhui: Fall through to handle the error instead of return. Fixes: f8aab60422c37 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs") Cc: sta...@vger.kernel.org Signed-off-by: xinhui pan Reviewed-by: Christian König --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 23 ++- 1 file changed, 10 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 85b292ed5c43..9e2525b96d04 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -341,21 +341,18 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data, r = amdgpu_gem_object_create(adev, size, args->in.alignment, initial_domain, flags, ttm_bo_type_device, resv, ); - if (r) { - if (r != -ERESTARTSYS) { - if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) { - flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; - goto retry; - } + if (r && r != -ERESTARTSYS) { + if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) { + flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; + goto retry; + } - if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) { - initial_domain |= AMDGPU_GEM_DOMAIN_GTT; - goto retry; - } - DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n", - size, initial_domain, args->in.alignment, r); + if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) { + initial_domain |= AMDGPU_GEM_DOMAIN_GTT; + goto retry; } - return r; + DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n", + size, initial_domain, args->in.alignment, r); } if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
[PATCH v3] drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
Fall through to handle the error instead of return. Fixes: f8aab60422c37 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs") Cc: sta...@vger.kernel.org Signed-off-by: xinhui pan --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 23 ++- 1 file changed, 10 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 85b292ed5c43..9e2525b96d04 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -341,21 +341,18 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data, r = amdgpu_gem_object_create(adev, size, args->in.alignment, initial_domain, flags, ttm_bo_type_device, resv, ); - if (r) { - if (r != -ERESTARTSYS) { - if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) { - flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; - goto retry; - } + if (r && r != -ERESTARTSYS) { + if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) { + flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED; + goto retry; + } - if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) { - initial_domain |= AMDGPU_GEM_DOMAIN_GTT; - goto retry; - } - DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n", - size, initial_domain, args->in.alignment, r); + if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) { + initial_domain |= AMDGPU_GEM_DOMAIN_GTT; + goto retry; } - return r; + DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n", + size, initial_domain, args->in.alignment, r); } if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) { -- 2.25.1
RE: [PATCH] drm/amdgpu: Clear RAS interrupt status on aldebaran
[AMD Official Use Only] Reviewed-by: Candice Li mailto:candice...@amd.com>> Thanks, Candice From: Clements, John Sent: Tuesday, August 31, 2021 2:27 PM To: amd-gfx@lists.freedesktop.org Cc: Li, Candice Subject: [PATCH] drm/amdgpu: Clear RAS interrupt status on aldebaran [AMD Official Use Only] Submitting patch to resolve incorrect register address' on Aldebaran affecting RAS interrupt handling
Re: [PATCH v2] drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
Am 31.08.21 um 07:55 schrieb Pan, Xinhui: Fall through to handle the error instead of return. Can you also clean up the "if (r) {.. if (r !=..." above? And please figure out which patch introduced the problem and add a Fixes: tag and maybe CC: stable as well. Thanks for the help, Christian. Signed-off-by: xinhui pan --- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 1 - 1 file changed, 1 deletion(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 85b292ed5c43..7ddd429052ea 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -355,7 +355,6 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data, DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n", size, initial_domain, args->in.alignment, r); } - return r; } if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
Re: [PATCH] drm/amdgpu: Fix a deadlock if previous GEM object allocation fails
Am 31.08.21 um 07:47 schrieb Pan, Xinhui: 在 2021/8/31 13:38,“Pan, Xinhui” 写入: 在 2021/8/31 12:03,“Grodzovsky, Andrey” 写入: On 2021-08-30 11:24 p.m., Pan, Xinhui wrote: > [AMD Official Use Only] > > [AMD Official Use Only] > > Unreserve root BO before return otherwise next allocation got deadlock. > > Signed-off-by: xinhui pan > --- > drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 11 +-- > 1 file changed, 5 insertions(+), 6 deletions(-) > > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > index 85b292ed5c43..c9db7d2c5816 100644 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c > @@ -355,19 +355,18 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data, > DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n", >size, initial_domain, args->in.alignment, r); > } > + > + if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) > + amdgpu_bo_unreserve(vm->root.bo); > return r; > } > > if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) { > - if (!r) { > - struct amdgpu_bo *abo = gem_to_amdgpu_bo(gobj); > + struct amdgpu_bo *abo = gem_to_amdgpu_bo(gobj); > > - abo->parent = amdgpu_bo_ref(vm->root.bo); > - } > + abo->parent = amdgpu_bo_ref(vm->root.bo); > amdgpu_bo_unreserve(vm->root.bo); > } > - if (r) > - return r; The above early return seems to be needed for -ERESTARTSYS case. Andrey There are two returns. ERESTARTSYS and other error after retry are already handled by the first return in if {}. So the second return is not needed. Thanks Xinhui Also we can do something like below which is simpler. Yeah, I think that would be better. We could also change the "if (r) { if (r != -ERESTARTSYS) {" into just "if (r != -ERESTARTSYS)" then. But good catch anyway. Regards, Christian. @@ -355,7 +355,6 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data, DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, %d)\n", size, initial_domain, args->in.alignment, r); } - return r; } if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) { > > r = drm_gem_handle_create(filp, gobj, ); > /* drop reference from allocate - handle holds it now */ > -- > 2.25.1