Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-08-31 Thread Andrey Grodzovsky



On 2021-09-01 12:40 a.m., Jingwen Chen wrote:

On Wed Sep 01, 2021 at 12:28:59AM -0400, Andrey Grodzovsky wrote:

On 2021-09-01 12:25 a.m., Jingwen Chen wrote:

On Wed Sep 01, 2021 at 12:04:47AM -0400, Andrey Grodzovsky wrote:

I will answer everything here -

On 2021-08-31 9:58 p.m., Liu, Monk wrote:


  [AMD Official Use Only]


  In the previous discussion, you guys stated that we should drop the
  “kthread_should_park” in cleanup_job.


  @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler
  *sched)

  {

  struct drm_sched_job *job, *next;


  -   /*

  -* Don't destroy jobs while the timeout worker is running  OR
  thread

  -* is being parked and hence assumed to not touch pending_list

  -*/

  -   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&

  -   !cancel_delayed_work(&sched->work_tdr)) ||

  -   kthread_should_park())

  -   return NULL;


  But I suddenly have a question here: if we return the timed-out job
  regardless of kthread_should_park(), then we are back to the original
  problem: the timedout_job suddenly signals and cleanup_job still returns
  it to sched_main, so the job is freed while it is still being handled by
  the vendor's timeout callback


  If we return NULL when kthread_should_park() in cleanup_job, we can
  prevent the above scenario from happening: once a job is being processed
  by job_timedout we can stop its scheduler, and after that, even if this
  job suddenly signals, cleanup_job won't return it, so sched_main won't
  free it in parallel …


  What do you think ?


Does your analysis above take into account that you also submitted
'[PATCH 2/2] drm/sched: serialize job_timeout and scheduler'? If so, then
I don't see a problem -

Hi Andrey,
Monk has talked to me and we agreed that, as there are multiple opinions about
'[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' and patch 1 is an
independent patch fixing a separate error, we should not take patch 2 into the
analysis.


I think that as long as you put kthread_park(sched->thread) BEFORE
fetching the next bad job from the pending list (under the spinlock),
there is no such issue as in the case you describe, because this
potential bad job that became signaled will be removed from the pending
list before you even fetch the next job, and by the time you fetch it
the scheduler thread is already stopped anyway.

If you don't submit it and we keep the removal hack for now, then there is
also no problem, because we temporarily remove the job we fetch for TDR
from the pending list under the spinlock exactly to avoid this race.
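
To illustrate, a minimal sketch of that ordering (a hypothetical handler,
not the upstream code; the fields and callbacks follow the patches quoted
further below):

#include <drm/gpu_scheduler.h>
#include <linux/kthread.h>
#include <linux/list.h>
#include <linux/spinlock.h>

/* Sketch: park first, then fetch the bad job under the lock. */
static void sketch_sched_job_timedout(struct drm_gpu_scheduler *sched)
{
	struct drm_sched_job *bad;

	/* 1. Stop the scheduler thread, so drm_sched_get_cleanup_job() can
	 *    no longer free jobs concurrently. */
	kthread_park(sched->thread);

	/* 2. Only now look at the pending list: a job that signaled before
	 *    the park was already removed by cleanup, so whatever we fetch
	 *    here cannot be freed under us. */
	spin_lock(&sched->job_list_lock);
	bad = list_first_entry_or_null(&sched->pending_list,
				       struct drm_sched_job, list);
	spin_unlock(&sched->job_list_lock);

	if (bad)
		sched->ops->timedout_job(bad);

	/* the vendor callback is expected to restart the scheduler */
}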


So can you help review [PATCH 1/2] drm/sched: fix the bug of time out
calculation(v3)? Patch v3 keeps this kthread_should_park check.


But since in both cases it looks like there is no danger of use-after-free,
I see no reason to keep the kthread_should_park check.

Andrey

OK, I get it. So patch v4 has removed this check; can you help review
[PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)?



Sure

Andrey





Best Regards,
JingWen

  Thanks


  --

  Monk Liu | Cloud-GPU Core team

  --


  From: Liu, Monk
  Sent: Wednesday, September 1, 2021 9:23 AM
  To: Koenig, Christian ; Grodzovsky, Andrey
  ; Daniel Vetter ; Chen, 
JingWen
  
  Cc: DRI Development ;
  amd-gfx@lists.freedesktop.org
  Subject: [diagnostic TDR mode patches] unify our solution opinions/
  suggestions in one thread


  [AMD Official Use Only]


  Hi Daniel/Christian/Andrey


  It looks like the opinions from you three are spread across the email
  flood; the feature we are working on (the diagnostic TDR scheme) has been
  pending for more than 6 months (we started it in Feb 2021).


  Honestly speaking, the email approach we are using now is not friendly and
  is quite painful to me ….

  Can we try to put all our opinions, suggestions, or even objections here
  together and go through them one by one? It's too hard for us to reply to
  each email on different questions.


  For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)


  This is a fix for the timeout timer in the scheduler; can we complete
  this one first? It should have already resolved all the questions and
  suggestions.


I have no objections to this one besides getting rid of the
kthread_should_park() return-NULL part;
if my answer above is not wrong then it seems superfluous to me.



  For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler


  I think I already explained the questions raised by Daniel in the other
  thread regarding why I use __kthread_should_park()


Is this race-free? Can't the other thread execute kthread_park after the
check?


  For other aspects, can we put all our opinions together here?

Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-08-31 Thread Jingwen Chen
On Wed Sep 01, 2021 at 12:28:59AM -0400, Andrey Grodzovsky wrote:
> 
> On 2021-09-01 12:25 a.m., Jingwen Chen wrote:
> > On Wed Sep 01, 2021 at 12:04:47AM -0400, Andrey Grodzovsky wrote:
> > > I will answer everything here -
> > > 
> > > On 2021-08-31 9:58 p.m., Liu, Monk wrote:
> > > 
> > > 
> > >  [AMD Official Use Only]
> > > 
> > > 
> > >  In the previous discussion, you guys stated that we should drop the
> > >  “kthread_should_park” in cleanup_job.
> > > 
> > > 
> > >  @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct 
> > > drm_gpu_scheduler
> > >  *sched)
> > > 
> > >  {
> > > 
> > >  struct drm_sched_job *job, *next;
> > > 
> > > 
> > >  -   /*
> > > 
> > >  -* Don't destroy jobs while the timeout worker is running  OR
> > >  thread
> > > 
> > >  -* is being parked and hence assumed to not touch 
> > > pending_list
> > > 
> > >  -*/
> > > 
> > >  -   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> > > 
> > >  -   !cancel_delayed_work(&sched->work_tdr)) ||
> > > 
> > >  -   kthread_should_park())
> > > 
> > >  -   return NULL;
> > > 
> > > 
> > >  But I suddenly have a question here: if we return the timed-out job
> > >  regardless of kthread_should_park(), then we are back to the original
> > >  problem: the timedout_job suddenly signals and cleanup_job still
> > >  returns it to sched_main, so the job is freed while it is still being
> > >  handled by the vendor's timeout callback
> > > 
> > > 
> > >  If we return NULL when kthread_should_park() in cleanup_job, we can
> > >  prevent the above scenario from happening: once a job is being
> > >  processed by job_timedout we can stop its scheduler, and after that,
> > >  even if this job suddenly signals, cleanup_job won't return it, so
> > >  sched_main won't free it in parallel …
> > > 
> > > 
> > >  What do you think ?
> > > 
> > > 
> > > Does your analysis above take into account that you also submitted
> > > '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler'? If so,
> > > then I don't see a problem -
> > Hi Andrey,
> > Monk has talked to me and we agreed that, as there are multiple opinions
> > about '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' and
> > patch 1 is an independent patch fixing a separate error, we should not
> > take patch 2 into the analysis.
> > 
> > > I think that as long as you put kthread_park(sched->thread) BEFORE
> > > fetching the next bad job from the pending list (under the spinlock),
> > > there is no such issue as in the case you describe, because this
> > > potential bad job that became signaled will be removed from the pending
> > > list before you even fetch the next job, and by the time you fetch it
> > > the scheduler thread is already stopped anyway.
> > > 
> > > If you don't submit it and we keep the removal hack for now, then there
> > > is also no problem, because we temporarily remove the job we fetch for
> > > TDR from the pending list under the spinlock exactly to avoid this race.
> > > 
> > So can you help review [PATCH 1/2] drm/sched: fix the bug of time out
> > calculation(v3)?
> > Patch v3 keeps this kthread_should_park check.
> 
> 
> But since in both cases it looks like there is no danger of use-after-free,
> I see no reason to keep the kthread_should_park check.
> 
> Andrey
OK, I get it. So patch v4 has removed this check; can you help review
[PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)?
> 
> 
> > 
> > Best Regards,
> > JingWen
> > > 
> > >  Thanks
> > > 
> > > 
> > >  --
> > > 
> > >  Monk Liu | Cloud-GPU Core team
> > > 
> > >  --
> > > 
> > > 
> > >  From: Liu, Monk
> > >  Sent: Wednesday, September 1, 2021 9:23 AM
> > >  To: Koenig, Christian ; Grodzovsky, Andrey
> > >  ; Daniel Vetter ; Chen, 
> > > JingWen
> > >  
> > >  Cc: DRI Development ;
> > >  amd-gfx@lists.freedesktop.org
> > >  Subject: [diagnostic TDR mode patches] unify our solution opinions/
> > >  suggestions in one thread
> > > 
> > > 
> > >  [AMD Official Use Only]
> > > 
> > > 
> > >  Hi Daniel/Christian/Andrey
> > > 
> > > 
> > >  It looks like the opinions from you three are spread across the email
> > >  flood; the feature we are working on (the diagnostic TDR scheme) has
> > >  been pending for more than 6 months (we started it in Feb 2021).
> > > 
> > > 
> > >  Honestly speaking, the email approach we are using now is not friendly
> > >  and is quite painful to me ….
> > > 
> > >  Can we try to put all our opinions, suggestions, or even objections
> > >  here together and go through them one by one? It's too hard for us to
> > >  reply to each email on different questions.
> > > 
> > > 
> > >  For [PATCH 1/2] 

Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-08-31 Thread Andrey Grodzovsky



On 2021-09-01 12:25 a.m., Jingwen Chen wrote:

On Wed Sep 01, 2021 at 12:04:47AM -0400, Andrey Grodzovsky wrote:

I will answer everything here -

On 2021-08-31 9:58 p.m., Liu, Monk wrote:


 [AMD Official Use Only]

  


 In the previous discussion, you guys stated that we should drop the
 “kthread_should_park” in cleanup_job.

  


 @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler
 *sched)

 {

 struct drm_sched_job *job, *next;

  


 -   /*

 -* Don't destroy jobs while the timeout worker is running  OR
 thread

 -* is being parked and hence assumed to not touch pending_list

 -*/

 -   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&

 -   !cancel_delayed_work(&sched->work_tdr)) ||

 -   kthread_should_park())

 -   return NULL;

  


 But I suddenly have a question here: if we return the timed-out job
 regardless of kthread_should_park(), then we are back to the original
 problem: the timedout_job suddenly signals and cleanup_job still returns
 it to sched_main, so the job is freed while it is still being handled by
 the vendor's timeout callback

  


 If we return NULL when kthread_should_park() in cleanup_job, we can prevent
 the above scenario from happening: once a job is being processed by
 job_timedout we can stop its scheduler, and after that, even if this job
 suddenly signals, cleanup_job won't return it, so sched_main won't free it
 in parallel …

  


 What do you think ?


Does your analysis above take into account that you also submitted
'[PATCH 2/2] drm/sched: serialize job_timeout and scheduler'? If so, then
I don't see a problem -

Hi Andrey,
Monk has talked to me and we agreed that, as there are multiple opinions about
'[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' and patch 1 is an
independent patch fixing a separate error, we should not take patch 2 into the
analysis.


I think that as long as you put kthread_park(sched->thread) BEFORE
fetching the next bad job from the pending list (under the spinlock),
there is no such issue as in the case you describe, because this
potential bad job that became signaled will be removed from the pending
list before you even fetch the next job, and by the time you fetch it
the scheduler thread is already stopped anyway.

If you don't submit it and we keep the removal hack for now, then there is
also no problem, because we temporarily remove the job we fetch for TDR
from the pending list under the spinlock exactly to avoid this race.


So can you help review [PATCH 1/2] drm/sched: fix the bug of time out
calculation(v3)?
Patch v3 keeps this kthread_should_park check.



But since in both cases it looks like there is no danger of use-after-free,
I see no reason to keep the kthread_should_park check.

Andrey




Best Regards,
JingWen


 Thanks

  


 --

 Monk Liu | Cloud-GPU Core team

 --

  


 From: Liu, Monk
 Sent: Wednesday, September 1, 2021 9:23 AM
 To: Koenig, Christian ; Grodzovsky, Andrey
 ; Daniel Vetter ; Chen, JingWen
 
 Cc: DRI Development ;
 amd-gfx@lists.freedesktop.org
 Subject: [diagnostic TDR mode patches] unify our solution opinions/
 suggestions in one thread

  


 [AMD Official Use Only]

  


 Hi Daniel/Christian/Andrey

  


 It looks like the opinions from you three are spread across the email
 flood; the feature we are working on (the diagnostic TDR scheme) has been
 pending for more than 6 months (we started it in Feb 2021).

  


 Honestly speaking, the email approach we are using now is not friendly and
 is quite painful to me ….

 Can we try to put all our opinions, suggestions, or even objections here
 together and go through them one by one? It's too hard for us to reply to
 each email on different questions.

  


 For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)

  


 This is a fix for the timeout timer in the scheduler; can we complete
 this one first? It should have already resolved all the questions and
 suggestions.


I have no objections to this one besides getting rid of the
kthread_should_park() return-NULL part;
if my answer above is not wrong then it seems superfluous to me.


  


 For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

  


 I think I already explained the questions raised by Daniel in the other
 thread regarding why I use __kthread_should_park()


Is this race-free? Can't the other thread execute kthread_park after the
check?


 For other aspects, can we put all our opinions together here?


So to summarize from previous threads: I think that the best solution
to the problem being solved in this patch is to do HW fence embedding
at the drm_sched_job level instead 

Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-08-31 Thread Jingwen Chen
On Wed Sep 01, 2021 at 12:04:47AM -0400, Andrey Grodzovsky wrote:
> I will answer everything here -
> 
> On 2021-08-31 9:58 p.m., Liu, Monk wrote:
> 
> 
> [AMD Official Use Only]
> 
>  
> 
> In the previous discussion, you guys stated that we should drop the
> “kthread_should_park” in cleanup_job.
> 
>  
> 
> @@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler
> *sched)
> 
> {
> 
> struct drm_sched_job *job, *next;
> 
>  
> 
> -   /*
> 
> -* Don't destroy jobs while the timeout worker is running  OR
> thread
> 
> -* is being parked and hence assumed to not touch pending_list
> 
> -*/
> 
> -   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
> 
> -   !cancel_delayed_work(&sched->work_tdr)) ||
> 
> -   kthread_should_park())
> 
> -   return NULL;
> 
>  
> 
> But I suddenly have a question here: if we return the timed-out job
> regardless of kthread_should_park(), then we are back to the original
> problem: the timedout_job suddenly signals and cleanup_job still returns
> it to sched_main, so the job is freed while it is still being handled by
> the vendor's timeout callback
> 
>  
> 
> If we return NULL when kthread_should_park() in cleanup_job, we can
> prevent the above scenario from happening: once a job is being processed
> by job_timedout we can stop its scheduler, and after that, even if this
> job suddenly signals, cleanup_job won't return it, so sched_main won't
> free it in parallel …
> 
>  
> 
> What do you think ?
> 
> 
> Does your analysis above take into account that you also submitted
> '[PATCH 2/2] drm/sched: serialize job_timeout and scheduler'? If so, then
> I don't see a problem -
Hi Andrey,
Monk has talked to me and we agreed that, as there are multiple opinions about
'[PATCH 2/2] drm/sched: serialize job_timeout and scheduler' and patch 1 is an
independent patch fixing a separate error, we should not take patch 2 into the
analysis.

> I think that as long as you put kthread_park(sched->thread) BEFORE
> fetching the next bad job from the pending list (under the spinlock),
> there is no such issue as in the case you describe, because this
> potential bad job that became signaled will be removed from the pending
> list before you even fetch the next job, and by the time you fetch it
> the scheduler thread is already stopped anyway.
> 
> If you don't submit it and we keep the removal hack for now, then there is
> also no problem, because we temporarily remove the job we fetch for TDR
> from the pending list under the spinlock exactly to avoid this race.
> 
So can you help review [PATCH 1/2] drm/sched: fix the bug of time out
calculation(v3)?
Patch v3 keeps this kthread_should_park check.

Best Regards,
JingWen
> 
> 
> Thanks
> 
>  
> 
> --
> 
> Monk Liu | Cloud-GPU Core team
> 
> --
> 
>  
> 
> From: Liu, Monk
> Sent: Wednesday, September 1, 2021 9:23 AM
> To: Koenig, Christian ; Grodzovsky, Andrey
> ; Daniel Vetter ; Chen, 
> JingWen
> 
> Cc: DRI Development ;
> amd-gfx@lists.freedesktop.org
> Subject: [diagnostic TDR mode patches] unify our solution opinions/
> suggestions in one thread
> 
>  
> 
> [AMD Official Use Only]
> 
>  
> 
> Hi Daniel/Christian/Andrey
> 
>  
> 
> It looks like the opinions from you three are spread across the email
> flood; the feature we are working on (the diagnostic TDR scheme) has been
> pending for more than 6 months (we started it in Feb 2021).
> 
>  
> 
> Honestly speaking, the email approach we are using now is not friendly and
> is quite painful to me ….
> 
> Can we try to put all our opinions, suggestions, or even objections here
> together and go through them one by one? It's too hard for us to reply to
> each email on different questions.
> 
>  
> 
> For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)
> 
>  
> 
> This is a fix for the timeout timer in the scheduler; can we complete
> this one first? It should have already resolved all the questions and
> suggestions.
> 
> 
> I have no objections to this one besides getting rid of the
> kthread_should_park() return-NULL part;
> if my answer above is not wrong then it seems superfluous to me.
> 
> 
>  
> 
> For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler
> 
>  
> 
> I think I already explained the questions raised by Daniel in the other
> thread regarding why I use __kthread_should_park()
> 
> 
> Is this race-free? Can't the other thread execute kthread_park after the
> check?
> 
> 
> For other aspects, can we put all our opinions together here?
> 
> 
> So to summarize from previous threads: I think that the best solution
> to the problem being solved in this patch is to do HW fence 

Re: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-08-31 Thread Andrey Grodzovsky

I will answer everything here -

On 2021-08-31 9:58 p.m., Liu, Monk wrote:


[AMD Official Use Only]

In the previous discussion, you guys stated that we should drop the 
“kthread_should_park” in cleanup_job.


@@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct 
drm_gpu_scheduler *sched)


{

    struct drm_sched_job *job, *next;

-   /*

-    * Don't destroy jobs while the timeout worker is running  OR 
thread


-    * is being parked and hence assumed to not touch pending_list

-    */

-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&

- !cancel_delayed_work(&sched->work_tdr)) ||

-   kthread_should_park())

-   return NULL;

But I suddenly have a question here: if we return the timed-out job
regardless of kthread_should_park(), then we are back to the original
problem: the timedout_job suddenly signals and cleanup_job still returns
it to sched_main, so the job is freed while it is still being handled by
the vendor's timeout callback


If we return NULL when kthread_should_park() in cleanup_job, we can
prevent the above scenario from happening: once a job is being processed
by job_timedout we can stop its scheduler, and after that, even if this
job suddenly signals, cleanup_job won't return it, so sched_main won't
free it in parallel …


What do you think ?



Does your analysis above take into account that you also submitted
'[PATCH 2/2] drm/sched: serialize job_timeout and scheduler'? If so, I
don't see a problem -

I think that as long as you put kthread_park(sched->thread) BEFORE
fetching the next bad job from the pending list (under the spinlock),
there is no such issue as in the case you describe, because this
potential bad job that became signaled will be removed from the pending
list before you even fetch the next job, and by the time you fetch it
the scheduler thread is already stopped anyway.

If you don't submit it and we keep the removal hack for now, then there
is also no problem, because we temporarily remove the job we fetch for
TDR from the pending list under the spinlock exactly to avoid this race.



Thanks

--

Monk Liu | Cloud-GPU Core team

--

*From:* Liu, Monk
*Sent:* Wednesday, September 1, 2021 9:23 AM
*To:* Koenig, Christian ; Grodzovsky, Andrey 
; Daniel Vetter ; Chen, 
JingWen 
*Cc:* DRI Development ; 
amd-gfx@lists.freedesktop.org
*Subject:* [diagnostic TDR mode patches] unify our solution 
opinions/suggestions in one thread


[AMD Official Use Only]

Hi Daniel/Christian/Andrey

It looks like the opinions from you three are spread across the email
flood; the feature we are working on (the diagnostic TDR scheme) has been
pending for more than 6 months (we started it in Feb 2021).


Honestly speaking, the email approach we are using now is not friendly
and is quite painful to me ….


Can we try to put all our opinions, suggestions, or even objections here
together and go through them one by one? It's too hard for us to reply
to each email on different questions.


For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)

This is a fix for the timeout timer in the scheduler; can we complete
this one first? It should have already resolved all the questions
and suggestions.




I have no objections to this one besides getting rid of the
kthread_should_park() return-NULL part;

if my answer above is not wrong then it seems superfluous to me.



For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

I think I already explained the questions raised by Daniel in the other
thread regarding why I use __kthread_should_park()




Is this race-free? Can't the other thread execute kthread_park after
the check?




For other aspects, can we put all our opinions together here?



So to summarize from previous threads: I think that the best solution
to the problem being solved in this patch is to do HW fence embedding
at the drm_sched_job level instead of doing it only for amdgpu. By
modifying all the drivers to support this, we can both remove this hack
and solve the race against concurrent drm_sched_cleanup_jobs job
freeing, just by taking a reference to the HW fence of the job at the
beginning of drm_sched_job_timedout.

If doing this refactoring for all the drivers is not an option now and
you need a quick solution such as the serialization you do here, then
take into account other concurrent TDR handlers that might run. As
mentioned before, without serializing against TDR handlers from other
schedulers you just race against them; e.g. you parked this one now,
but another handler in progress will unpark it as part of calling
drm_sched_start for other rings. So you either need a global lock or a
dedicated single-threaded queue, as Daniel suggested.
At minimum we should change cancel_delayed_work in drm_sched_stop to
cancel_delayed_work_sync, to cancel and flush all concurrent TDRs, and
probably move it to the beginning of the function, after kthread_park

and before we start playing 
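
A minimal sketch of the refcount idea above, assuming a HW fence embedded
in drm_sched_job (hypothetical; today that embedding is amdgpu-specific):

#include <drm/gpu_scheduler.h>
#include <linux/dma-fence.h>

/* Sketch: pin the job's HW fence for the duration of the timeout handler
 * so that a concurrent drm_sched_get_cleanup_job() freeing the job cannot
 * leave us with a dangling pointer.  job->hw_fence is an assumed member. */
static void sketch_job_timedout(struct drm_sched_job *job)
{
	struct dma_fence *fence = dma_fence_get(&job->hw_fence);

	job->sched->ops->timedout_job(job);

	/* If cleanup freed the job meanwhile, this reference kept the
	 * fence (and the job embedding it) alive until here. */
	dma_fence_put(fence);
}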

[PATCH] drm/amdkfd: drop process ref count when xnack disable

2021-08-31 Thread Alex Sierra
In the SVM restore-pages interrupt handler, the kfd_process ref count was
never dropped when XNACK was disabled. Therefore, the object was never
released.

Signed-off-by: Alex Sierra 
---
 drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
index 8f9b5b53dab5..110c46cd7fac 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c
@@ -2484,7 +2484,8 @@ svm_range_restore_pages(struct amdgpu_device *adev, 
unsigned int pasid,
}
if (!p->xnack_enabled) {
pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);
-   return -EFAULT;
+   r = -EFAULT;
+   goto out;
}
svms = >svms;
 
-- 
2.32.0
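
For context, a minimal sketch of the get/put pairing this restores (the
lookup and the "out" label follow the usual kfd_svm pattern and are assumed
here, not quoted from the file):

	struct kfd_process *p;
	int r = 0;

	p = kfd_lookup_process_by_pasid(pasid);	/* takes a reference */
	if (!p)
		return -ESRCH;

	if (!p->xnack_enabled) {
		pr_debug("XNACK not enabled for pasid 0x%x\n", pasid);
		r = -EFAULT;
		goto out;	/* was: return -EFAULT, leaking the reference */
	}

	/* ... fault handling elided ... */

out:
	kfd_unref_process(p);	/* drop the lookup reference exactly once */
	return r;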



RE: [diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

In the previous discussion, you guys stated that we should drop the 
"kthread_should_park" in cleanup_job.

@@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
{
struct drm_sched_job *job, *next;

-   /*
-* Don't destroy jobs while the timeout worker is running  OR thread
-* is being parked and hence assumed to not touch pending_list
-*/
-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-   !cancel_delayed_work(&sched->work_tdr)) ||
-   kthread_should_park())
-   return NULL;

But I suddenly have a question here: if we return the timed-out job regardless
of kthread_should_park(), then we are back to the original problem: the
timedout_job suddenly signals and cleanup_job still returns it to sched_main,
so the job is freed while it is still being handled by the vendor's timeout
callback.

If we return NULL when kthread_should_park() in cleanup_job, we can prevent
the above scenario from happening: once a job is being processed by
job_timedout we can stop its scheduler, and after that, even if this job
suddenly signals, cleanup_job won't return it, so sched_main won't free it in
parallel ...
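
A sketch of where that check sits (this mirrors the hunk quoted above, with
comments added for illustration):

	/* In drm_sched_get_cleanup_job(): while the timeout handler has
	 * parked (or is parking) the scheduler thread, hand no job to
	 * sched_main for freeing; even a just-signaled one may still be
	 * in use by the vendor's timeout callback. */
	if (kthread_should_park())
		return NULL;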

What do you think ?
Thanks

--
Monk Liu | Cloud-GPU Core team
--

From: Liu, Monk
Sent: Wednesday, September 1, 2021 9:23 AM
To: Koenig, Christian ; Grodzovsky, Andrey 
; Daniel Vetter ; Chen, JingWen 

Cc: DRI Development ; 
amd-gfx@lists.freedesktop.org
Subject: [diagnostic TDR mode patches] unify our solution opinions/suggestions 
in one thread


[AMD Official Use Only]

Hi Daniel/Christian/Andrey

It looks like the opinions from you three are spread across the email flood;
the feature we are working on (the diagnostic TDR scheme) has been pending for
more than 6 months (we started it in Feb 2021).

Honestly speaking, the email approach we are using now is not friendly and is
quite painful to me.
Can we try to put all our opinions, suggestions, or even objections here
together and go through them one by one? It's too hard for us to reply to each
email on different questions.

For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)

This is a fix for the timeout timer in the scheduler; can we complete this one
first? It should have already resolved all the questions and suggestions.

For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

I think I already explained the questions raised by Daniel in the other thread
regarding why I use __kthread_should_park().
For other aspects, can we put all our opinions together here?

Thanks !

--
Monk Liu | Cloud-GPU Core team
--



RE: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

That really matters in practice: when two jobs from different processes are
scheduled to the ring close to each other, if we don't discriminate A from B
then B will be considered a bad job due to A's timeout, which will force B's
process to exit (e.g. the X server).

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter  
Sent: Tuesday, August 31, 2021 9:09 PM
To: Koenig, Christian 
Cc: Grodzovsky, Andrey ; Christian König 
; Liu, Monk ; 
amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org
Subject: Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

On Fri, Aug 27, 2021 at 08:30:32PM +0200, Christian König wrote:
> Yeah, that's what I meant with the start of processing a job being a 
> bit swampily defined.
> 
> Jobs overlap, but we simply don't have another good indicator that a 
> job started except that the previous one completed.
> 
> It's still better than starting the timer when pushing the job to the 
> ring buffer, because that is completely off.

Not sure this matters that much really in practice, and in some cases we might 
want to actually just reset it all if the built up backlog is way too long.

For anything that really runs way too long you want preempt-ctx/revoke fences 
anyway, not end-of-cs fences with tdr.
-Daniel

> 
> Christian.
> 
> Am 27.08.21 um 20:22 schrieb Andrey Grodzovsky:
> > As I mentioned to Monk before - what about cases such as in this 
> > test - 
> > https://gitlab.freedesktop.org/mesa/drm/-/commit/bc21168fa924d3fc4a000492e861f50a1a135b25
> > 
> > Here you don't have a serialized sequence where, when one job finishes 
> > processing, the second starts; they execute together concurrently - 
> > for those cases it seems to me that restarting the timer for the second 
> > job from scratch will let it hang much longer than allowed by the TO value.
> > 
> > Andrey
> > 
> > On 2021-08-27 10:29 a.m., Christian König wrote:
> > > I don't think that makes sense.
> > > 
> > > See, we don't want to start the timer when the job is inserted into 
> > > the ring buffer, but rather when it starts processing.
> > > 
> > > When processing starts is a bit swampily defined, but just starting the 
> > > timer when the previous job completes should be fine enough.
> > > 
> > > Christian.
> > > 
> > > Am 27.08.21 um 15:57 schrieb Andrey Grodzovsky:
> > > > The TS represents the point in time when the job was inserted 
> > > > into the pending list.
> > > > I don't think it matters when it actually starts to be 
> > > > processed; what matters is when this job was inserted into the 
> > > > pending list, because right at that point you arm the TO timer 
> > > > (when no other is running already). So when the previous job 
> > > > completes and you cancel and rearm again, you can use that TS 
> > > > from the next job in the pending list to calculate how much time 
> > > > it actually has left to run before TDR must be initiated, and 
> > > > not just give it the full TO value again even if it has already 
> > > > been running for a while.
> > > > 
> > > > Also, I am not sure about the assumption that what we measure 
> > > > is processing by HW; what we measure is from the moment the job 
> > > > was scheduled to the ring to the moment it completed (the EOP 
> > > > event). At least that is what our TDR timer measures, and so it 
> > > > makes sense to set the TS at this point.
> > > > 
> > > > Andrey
> > > > 
> > > > On 2021-08-27 3:20 a.m., Liu, Monk wrote:
> > > > > [AMD Official Use Only]
> > > > > 
> > > > > What is that 'ts' representing? It looks to me like the 
> > > > > jiffies when the job gets scheduled to the ring, but a job 
> > > > > scheduled to the ring doesn't mean it's being processed 
> > > > > by hw.
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > --
> > > > > Monk Liu | Cloud-GPU Core team
> > > > > --
> > > > > 
> > > > > -Original Message-
> > > > > From: Grodzovsky, Andrey 
> > > > > Sent: Friday, August 27, 2021 4:14 AM
> > > > > To: Liu, Monk ; 
> > > > > amd-gfx@lists.freedesktop.org; Koenig, Christian 
> > > > > 
> > > > > Cc: dri-de...@lists.freedesktop.org
> > > > > Subject: Re: [PATCH] drm/sched: fix the bug of time out
> > > > > calculation(v3)
> > > > > 
> > > > > Attached is a quick patch for per-job TTL calculation to make 
> > > > > the next timer expiration more precise. It's on top of the 
> > > > > patch in this thread. Let me know if this makes sense.
> > > > > 
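
A sketch of the per-job TTL idea mentioned above (hypothetical helper; it
assumes the pending-list timestamp lives in the scheduled fence, as
described in the thread, and converts its ktime to jiffies):

#include <drm/gpu_scheduler.h>
#include <linux/jiffies.h>
#include <linux/kernel.h>
#include <linux/ktime.h>

/* Sketch: how much TO budget the next pending job has left, based on the
 * timestamp recorded when it entered the pending list, so the rearmed
 * timer does not grant it a full sched->timeout again. */
static unsigned long sketch_remaining_timeout(struct drm_gpu_scheduler *sched,
					      struct drm_sched_job *next)
{
	s64 elapsed_ms = ktime_ms_delta(ktime_get(),
					next->s_fence->scheduled.timestamp);
	unsigned long elapsed = msecs_to_jiffies(max_t(s64, elapsed_ms, 0));

	return elapsed < sched->timeout ? sched->timeout - elapsed : 0;
}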

RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

Okay, I will rework this patch

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter  
Sent: Tuesday, August 31, 2021 9:02 PM
To: Liu, Monk 
Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; Chen, 
Jingwen 
Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote:
> Can we please have some actual commit message here, with detailed 
> explanation of the race/bug/whatever, how you fix it and why this is 
> the best option?
> 
> On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote:
> > tested-by: jingwen chen 
> > Signed-off-by: Monk Liu 
> > Signed-off-by: jingwen chen 
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 24 
> > 
> >  1 file changed, 4 insertions(+), 20 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index ecf8140..894fdb24 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
> > *work)
> > sched = container_of(work, struct drm_gpu_scheduler, 
> > work_tdr.work);
> >  
> > /* Protects against concurrent deletion in 
> > drm_sched_get_cleanup_job */
> > +   if (!__kthread_should_park(sched->thread))
> 
> This is a __ function, i.e. considered internal, and it's lockless 
> atomic, i.e. unordered. And you're not explaining why this works.
> 
> Iow it's probably buggy, and an just unconditionally parking the 
> kthread is probably the right thing to do. If it's not the right thing 
> to do, there's a bug here for sure.

Also why don't we reuse the function drivers already have to stop a scheduler 
thread? We seem to have two kthread_park now, that's probably one too much.
-Daniel

> > +   kthread_park(sched->thread);
> > +
> > spin_lock(&sched->job_list_lock);
> > job = list_first_entry_or_null(&sched->pending_list,
> >struct drm_sched_job, list);
> >  
> > if (job) {
> > -   /*
> > -* Remove the bad job so it cannot be freed by concurrent
> > -* drm_sched_cleanup_jobs. It will be reinserted back after 
> > sched->thread
> > -* is parked at which point it's safe.
> > -*/
> > -   list_del_init(&job->list);
> > spin_unlock(&sched->job_list_lock);
> >  
> > +   /* vendor's timeout_job should call drm_sched_start() */
> > status = job->sched->ops->timedout_job(job);
> >  
> > /*
> > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> > struct drm_sched_job *bad)
> > kthread_park(sched->thread);
> >  
> > /*
> > -* Reinsert back the bad job here - now it's safe as
> > -* drm_sched_get_cleanup_job cannot race against us and release the
> > -* bad job at this point - we parked (waited for) any in progress
> > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> > -* now until the scheduler thread is unparked.
> > -*/
> > -   if (bad && bad->sched == sched)
> > -   /*
> > -* Add at the head of the queue to reflect it was the earliest
> > -* job extracted.
> > -*/
> > -   list_add(&bad->list, &sched->pending_list);
> > -
> > -   /*
> >  * Iterate the job list from later to  earlier one and either deactive
> >  * their HW callbacks or remove them from pending list if they already
> >  * signaled.
> > --
> > 2.7.4
> > 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch/

--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch/


[diagnostic TDR mode patches] unify our solution opinions/suggestions in one thread

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

Hi Daniel/Christian/Andrey

It looks like the opinions from you three are spread across the email flood;
the feature we are working on (the diagnostic TDR scheme) has been pending for
more than 6 months (we started it in Feb 2021).

Honestly speaking, the email approach we are using now is not friendly and is
quite painful to me.
Can we try to put all our opinions, suggestions, or even objections here
together and go through them one by one? It's too hard for us to reply to each
email on different questions.

For [PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)

This is a fix for the timeout timer in the scheduler; can we complete this
one first? It should have already resolved all the questions and suggestions.

For [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

I think I already explained the questions raised by Daniel in the other thread
regarding why I use __kthread_should_park().
For other aspects, can we put all our opinions together here?

Thanks !

--
Monk Liu | Cloud-GPU Core team
--



RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

>> Also why don't we reuse the function drivers already have to stop a 
>> scheduler thread? We seem to have two kthread_park now, that's probably one 
>> too much.
Are you referring to drm_sched_stop?

That's different; we don't need the logic from it. It goes through the
pending list and removes all callbacks, etc., while the vendor's timeout
callback will call drm_sched_stop in the proper way.
All we want in my patch is to simply park the scheduler.
Besides, even if you call drm_sched_stop in job_timeout, you still run into
the warning issue I hit.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter  
Sent: Tuesday, August 31, 2021 9:02 PM
To: Liu, Monk 
Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; Chen, 
Jingwen 
Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote:
> Can we please have some actual commit message here, with detailed 
> explanation of the race/bug/whatever, how you fix it and why this is 
> the best option?
> 
> On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote:
> > tested-by: jingwen chen 
> > Signed-off-by: Monk Liu 
> > Signed-off-by: jingwen chen 
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 24 
> > 
> >  1 file changed, 4 insertions(+), 20 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index ecf8140..894fdb24 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
> > *work)
> > sched = container_of(work, struct drm_gpu_scheduler, 
> > work_tdr.work);
> >  
> > /* Protects against concurrent deletion in 
> > drm_sched_get_cleanup_job */
> > +   if (!__kthread_should_park(sched->thread))
> 
> This is a __ function, i.e. considered internal, and it's lockless 
> atomic, i.e. unordered. And you're not explaining why this works.
> 
> Iow it's probably buggy, and an just unconditionally parking the 
> kthread is probably the right thing to do. If it's not the right thing 
> to do, there's a bug here for sure.

Also why don't we reuse the function drivers already have to stop a scheduler 
thread? We seem to have two kthread_park now, that's probably one too much.
-Daniel

> > +   kthread_park(sched->thread);
> > +
> > spin_lock(&sched->job_list_lock);
> > job = list_first_entry_or_null(&sched->pending_list,
> >struct drm_sched_job, list);
> >  
> > if (job) {
> > -   /*
> > -* Remove the bad job so it cannot be freed by concurrent
> > -* drm_sched_cleanup_jobs. It will be reinserted back after 
> > sched->thread
> > -* is parked at which point it's safe.
> > -*/
> > -   list_del_init(&job->list);
> > spin_unlock(&sched->job_list_lock);
> >  
> > +   /* vendor's timeout_job should call drm_sched_start() */
> > status = job->sched->ops->timedout_job(job);
> >  
> > /*
> > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> > struct drm_sched_job *bad)
> > kthread_park(sched->thread);
> >  
> > /*
> > -* Reinsert back the bad job here - now it's safe as
> > -* drm_sched_get_cleanup_job cannot race against us and release the
> > -* bad job at this point - we parked (waited for) any in progress
> > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> > -* now until the scheduler thread is unparked.
> > -*/
> > -   if (bad && bad->sched == sched)
> > -   /*
> > -* Add at the head of the queue to reflect it was the earliest
> > -* job extracted.
> > -*/
> > -   list_add(&bad->list, &sched->pending_list);
> > -
> > -   /*
> >  * Iterate the job list from later to  earlier one and either deactive
> >  * their HW callbacks or remove them from pending list if they already
> >  * signaled.
> > --
> > 2.7.4
> > 
> 
> --
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch/

--
Daniel Vetter
Software Engineer, Intel Corporation

RE: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

>> This is a __ function, i.e. considered internal, and it's lockless atomic, 
>> i.e. unordered. And you're not explaining why this works.

From what I can see it's not the traditional habit to put the explanation in
the code, but we can do that in mails.
We want to park the scheduler in job_timeout to serialize job access from
both the scheduler and the TO handler; inside the vendor's timeout_job
callback, at least both panfrost and amd
call drm_sched_stop() on all schedulers.

If we unconditionally call "kthread_park" in job_timedout, then the bailing
job's timeout will try to call "kthread_park" again on its scheduler and
introduce a "warning".

The scenario is (see the sketch below):
1. Job1 on sched1 triggers a timeout, and sched1 is parked.
2. The vendor callback runs; it will usually stop all schedulers.
3. Job2 on sched2 triggers a timeout, so its job_timedout also tries to park
   sched2, but sched2 was already stopped by the step above. (Job2's timeout
   is introduced by Job1, or by another VF.)
   ---So there will be a "warning" in the kernel log from the step above...
   with "__kthread_should_park()" here we can avoid the warning; that's the
   only reason I need this __function.
4. Before the vendor callback exits, it will unpark all schedulers.
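
A sketch of that guard with the reasoning as comments (illustrative;
kthread_park() in kernel/kthread.c WARNs when called on an already-parked
thread, which is the warning described above):

	/* Park the scheduler thread before touching the pending list, but
	 * skip the park if another TDR handler (e.g. for a different ring)
	 * already parked it: kthread_park() WARNs on an already-parked
	 * thread.  __kthread_should_park() only reads the SHOULD_PARK bit,
	 * so this check-then-park is not atomic -- which is the race
	 * Daniel asks about below. */
	if (!__kthread_should_park(sched->thread))
		kthread_park(sched->thread);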

On the other hand, if we don't do the kthread_park() and still delete the job
here (dropping the deleting/reinserting of the job from the pending_list is
what we want), we still have a small window to hit the race issue:
cleanup_job from the sched thread frees the job while the job is still being
processed by job_timedout or the vendor's callback.

And the reason we want to avoid deleting/reinserting the timed-out job is
that we (the AMD vendor) have our own way of doing a diagnostic on all jobs
in the pending lists of all schedulers; we want to cherry-pick the real bad
job that caused this JOB TIMEOUT from all the schedulers' pending lists.

Besides, it is also much more reasonable to park the scheduler while
job_timedout is running; they should have exclusive access to common members
like sched_job (since the spin_lock is dropped before running into the
vendor's callback).

Hope I explained it well enough.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Daniel Vetter  
Sent: Tuesday, August 31, 2021 8:59 PM
To: Liu, Monk 
Cc: amd-gfx@lists.freedesktop.org; dri-de...@lists.freedesktop.org; Chen, 
Jingwen 
Subject: Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

Can we please have some actual commit message here, with detailed explanation 
of the race/bug/whatever, how you fix it and why this is the best option?

On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote:
> tested-by: jingwen chen 
> Signed-off-by: Monk Liu 
> Signed-off-by: jingwen chen 
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 24 
>  1 file changed, 4 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index ecf8140..894fdb24 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
> *work)
>   sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>  
>   /* Protects against concurrent deletion in drm_sched_get_cleanup_job 
> */
> + if (!__kthread_should_park(sched->thread))

This is a __ function, i.e. considered internal, and it's lockless atomic, i.e. 
unordered. And you're not explaining why this works.

Iow it's probably buggy, and an just unconditionally parking the kthread is 
probably the right thing to do. If it's not the right thing to do, there's a 
bug here for sure.
-Daniel

> + kthread_park(sched->thread);
> +
>   spin_lock(&sched->job_list_lock);
>   job = list_first_entry_or_null(&sched->pending_list,
>  struct drm_sched_job, list);
>  
>   if (job) {
> - /*
> -  * Remove the bad job so it cannot be freed by concurrent
> -  * drm_sched_cleanup_jobs. It will be reinserted back after 
> sched->thread
> -  * is parked at which point it's safe.
> -  */
> - list_del_init(&job->list);
>   spin_unlock(&sched->job_list_lock);
>  
> + /* vendor's timeout_job should call drm_sched_start() */
>   status = job->sched->ops->timedout_job(job);
>  
>   /*
> @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> struct drm_sched_job *bad)
>   kthread_park(sched->thread);
>  
>   /*
> -  * Reinsert back the bad job here - now it's safe as
> -  * drm_sched_get_cleanup_job cannot race against us and release the
> -  * bad job at this point - we parked (waited for) any in progress
> -  * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> 

[PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Monk Liu
tested-by: jingwen chen 
Signed-off-by: Monk Liu 
Signed-off-by: jingwen chen 
---
 drivers/gpu/drm/scheduler/sched_main.c | 24 
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index 3e0bbc7..87d72e9 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+   if (!__kthread_should_park(sched->thread))
+   kthread_park(sched->thread);
+
spin_lock(&sched->job_list_lock);
job = list_first_entry_or_null(&sched->pending_list,
   struct drm_sched_job, list);
 
if (job) {
-   /*
-* Remove the bad job so it cannot be freed by concurrent
-* drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
-* is parked at which point it's safe.
-*/
-   list_del_init(&job->list);
spin_unlock(&sched->job_list_lock);
 
+   /* vendor's timeout_job should call drm_sched_start() */
status = job->sched->ops->timedout_job(job);
 
/*
@@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
struct drm_sched_job *bad)
kthread_park(sched->thread);
 
/*
-* Reinsert back the bad job here - now it's safe as
-* drm_sched_get_cleanup_job cannot race against us and release the
-* bad job at this point - we parked (waited for) any in progress
-* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
-* now until the scheduler thread is unparked.
-*/
-   if (bad && bad->sched == sched)
-   /*
-* Add at the head of the queue to reflect it was the earliest
-* job extracted.
-*/
-   list_add(&bad->list, &sched->pending_list);
-
-   /*
 * Iterate the job list from later to  earlier one and either deactive
 * their HW callbacks or remove them from pending list if they already
 * signaled.
-- 
2.7.4



[PATCH 1/2] drm/sched: fix the bug of time out calculation(v4)

2021-08-31 Thread Monk Liu
issue:
in cleanup_job, cancel_delayed_work will cancel a TO timer even though
its corresponding job is still running.

fix:
do not cancel the timer in cleanup_job; instead do the cancelling
only when the heading job is signaled, and if there is a "next" job,
start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the signaled job
is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work at the beginning of cleanup_job
restore the implementation of drm_sched_job_begin.

v4:
remove the kthread_should_park() check in the cleanup_job routine;
we should clean up the signaled job asap

TODO:
1) introduce pause/resume of the scheduler in job_timeout to serialize the
handling of the scheduler and job_timeout.
2) drop the bad job's delete and reinsert in the scheduler, enabled by the
above serialization (no race issue anymore with the serialization)

tested-by: jingwen 
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/scheduler/sched_main.c | 26 +-
 1 file changed, 9 insertions(+), 17 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..3e0bbc7 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,15 +676,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 {
struct drm_sched_job *job, *next;
 
-   /*
-* Don't destroy jobs while the timeout worker is running  OR thread
-* is being parked and hence assumed to not touch pending_list
-*/
-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-   !cancel_delayed_work(&sched->work_tdr)) ||
-   kthread_should_park())
-   return NULL;
-
spin_lock(&sched->job_list_lock);

job = list_first_entry_or_null(&sched->pending_list,
@@ -693,17 +684,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
/* remove job from pending_list */
list_del_init(&job->list);
+
+   /* cancel this job's TO timer */
+   cancel_delayed_work(&sched->work_tdr);
/* make the scheduled timestamp more accurate */
next = list_first_entry_or_null(&sched->pending_list,
typeof(*next), list);
-   if (next)
+
+   if (next) {
next->s_fence->scheduled.timestamp =
job->s_fence->finished.timestamp;
-
+   /* start TO timer for next job */
+   drm_sched_start_timeout(sched);
+   }
} else {
job = NULL;
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
}
 
spin_unlock(&sched->job_list_lock);
@@ -791,11 +786,8 @@ static int drm_sched_main(void *param)
  (entity = 
drm_sched_select_entity(sched))) ||
 kthread_should_stop());
 
-   if (cleanup_job) {
+   if (cleanup_job)
sched->ops->free_job(cleanup_job);
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
-   }
 
if (!entity)
continue;
-- 
2.7.4



Re: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)

2021-08-31 Thread Grodzovsky, Andrey
What about removing the kthread_should_park() check? We decided it's useless
as far as I remember.

Andrey

From: amd-gfx  on behalf of Liu, Monk 

Sent: 31 August 2021 20:24
To: Liu, Monk ; amd-gfx@lists.freedesktop.org 

Cc: dri-de...@lists.freedesktop.org 
Subject: RE: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)

[AMD Official Use Only]

Ping Christian, Andrey

Can we merge this patch first? This is a standalone patch for the timer.

Thanks

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Monk Liu 
Sent: Tuesday, August 31, 2021 6:36 PM
To: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Liu, Monk 
Subject: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)

issue:
in cleanup_job, cancel_delayed_work will cancel a TO timer even though its
corresponding job is still running.

fix:
do not cancel the timer in cleanup_job; instead do the cancelling only when
the heading job is signaled, and if there is a "next" job, start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the signaled job 
is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work at the beginning of cleanup_job and restore
the implementation of drm_sched_job_begin.

TODO:
1) introduce pause/resume of the scheduler in job_timeout to serialize the
handling of the scheduler and job_timeout.
2) drop the bad job's delete and reinsert in the scheduler, enabled by the
above serialization (no race issue anymore with the serialization)

tested-by: jingwen 
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..ecf8140 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) 
 {
 struct drm_sched_job *job, *next;

-   /*
-* Don't destroy jobs while the timeout worker is running  OR thread
-* is being parked and hence assumed to not touch pending_list
-*/
-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-   !cancel_delayed_work(&sched->work_tdr)) ||
-   kthread_should_park())
+   if (kthread_should_park())
 return NULL;

 spin_lock(&sched->job_list_lock);
@@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
 /* remove job from pending_list */
 list_del_init(&job->list);
+
+   /* cancel this job's TO timer */
+   cancel_delayed_work(&sched->work_tdr);
 /* make the scheduled timestamp more accurate */
 next = list_first_entry_or_null(&sched->pending_list,
 typeof(*next), list);
-   if (next)
+
+   if (next) {
 next->s_fence->scheduled.timestamp =
 job->s_fence->finished.timestamp;
-
+   /* start TO timer for next job */
+   drm_sched_start_timeout(sched);
+   }
 } else {
 job = NULL;
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
 }

 spin_unlock(&sched->job_list_lock);
@@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
   (entity = 
drm_sched_select_entity(sched))) ||
  kthread_should_stop());

-   if (cleanup_job) {
+   if (cleanup_job)
 sched->ops->free_job(cleanup_job);
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
-   }

 if (!entity)
 continue;
--
2.7.4


RE: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)

2021-08-31 Thread Liu, Monk
[AMD Official Use Only]

Ping Christian, Andrey

Can we merge this patch first? This is a standalone patch for the timer.

Thanks 

--
Monk Liu | Cloud-GPU Core team
--

-Original Message-
From: Monk Liu  
Sent: Tuesday, August 31, 2021 6:36 PM
To: amd-gfx@lists.freedesktop.org
Cc: dri-de...@lists.freedesktop.org; Liu, Monk 
Subject: [PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)

issue:
in cleanup_job, cancel_delayed_work will cancel a TO timer even though its
corresponding job is still running.

fix:
do not cancel the timer in cleanup_job; instead do the cancelling only when
the heading job is signaled, and if there is a "next" job, start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the signaled job 
is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work at the beginning of cleanup_job and restore
the implementation of drm_sched_job_begin.

TODO:
1) introduce pause/resume of the scheduler in job_timeout to serialize the
handling of the scheduler and job_timeout.
2) drop the bad job's delete and reinsert in the scheduler, enabled by the
above serialization (no race issue anymore with the serialization)

tested-by: jingwen 
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..ecf8140 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched) 
 {
struct drm_sched_job *job, *next;
 
-   /*
-* Don't destroy jobs while the timeout worker is running  OR thread
-* is being parked and hence assumed to not touch pending_list
-*/
-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-   !cancel_delayed_work(&sched->work_tdr)) ||
-   kthread_should_park())
+   if (kthread_should_park())
return NULL;
 
	spin_lock(&sched->job_list_lock);
@@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
	if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
/* remove job from pending_list */
		list_del_init(&job->list);
+
+   /* cancel this job's TO timer */
+   cancel_delayed_work(&sched->work_tdr);
/* make the scheduled timestamp more accurate */
		next = list_first_entry_or_null(&sched->pending_list,
typeof(*next), list);
-   if (next)
+
+   if (next) {
next->s_fence->scheduled.timestamp =
job->s_fence->finished.timestamp;
-
+   /* start TO timer for next job */
+   drm_sched_start_timeout(sched);
+   }
} else {
job = NULL;
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
}
 
	spin_unlock(&sched->job_list_lock);
@@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
  (entity = 
drm_sched_select_entity(sched))) ||
 kthread_should_stop());
 
-   if (cleanup_job) {
+   if (cleanup_job)
sched->ops->free_job(cleanup_job);
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
-   }
 
if (!entity)
continue;
--
2.7.4


Re: [PATCH 1/1] drm/amdkfd: make needs_pcie_atomics FW-version dependent

2021-08-31 Thread Felix Kuehling

On 2021-08-31 6:09 p.m., Zeng, Oak wrote:

A nit-pick inline. Otherwise this patch is Reviewed-by: Oak Zeng 


Regards,
Oak

  


On 2021-08-31, 5:57 PM, "amd-gfx on behalf of Felix Kuehling" 
 wrote:

 On some GPUs the PCIe atomic requirement for KFD depends on the MEC
 firmware version. Add a firmware version check for this. The minimum
 firmware version that works without atomics can be updated in the
 device_info structure for each GPU type.

 Signed-off-by: Felix Kuehling 
 ---
  drivers/gpu/drm/amd/amdkfd/kfd_device.c | 9 +++--
  drivers/gpu/drm/amd/amdkfd/kfd_priv.h   | 1 +
  2 files changed, 8 insertions(+), 2 deletions(-)

 diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
 index 16a57b70cc1a..655ee5733229 100644
 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
 +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
 @@ -688,6 +688,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
struct kfd_dev *kfd;
const struct kfd_device_info *device_info;
const struct kfd2kgd_calls *f2g;
 +  uint32_t fw_version;

if (asic_type >= sizeof(kfd_supported_devices) / (sizeof(void *) * 2)
|| asic_type >= sizeof(kfd2kgd_funcs) / sizeof(void *)) {
 @@ -713,8 +714,12 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 * supported.
 */
kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd);
 -  if (device_info->needs_pci_atomics &&
 -  !kfd->pci_atomic_requested) {
 +  fw_version = amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1);
 +  if (!kfd->pci_atomic_requested &&
 +  device_info->needs_pci_atomics &&
 +  (!device_info->no_atomic_fw_version ||
 +amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1) <
You already get the fw_version above


I'll fix that. I forgot to remove the local variable after I decided to 
move the function call into the condition.


Thanks,
  Felix



 +  device_info->no_atomic_fw_version)) {
dev_info(kfd_device,
 "skipped device %x:%x, PCI rejects atomics\n",
 pdev->vendor, pdev->device);
 diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
 index ab83b0de6b22..6d8f9bb2d905 100644
 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
 +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
 @@ -207,6 +207,7 @@ struct kfd_device_info {
bool supports_cwsr;
bool needs_iommu_device;
bool needs_pci_atomics;
 +  uint32_t no_atomic_fw_version;
unsigned int num_sdma_engines;
unsigned int num_xgmi_sdma_engines;
unsigned int num_sdma_queues_per_engine;
 --
 2.32.0




Re: [PATCH 1/1] drm/amdkfd: make needs_pcie_atomics FW-version dependent

2021-08-31 Thread Zeng, Oak
A nit-pick inline. Otherwise this patch is Reviewed-by: Oak Zeng 


Regards,
Oak 

 

On 2021-08-31, 5:57 PM, "amd-gfx on behalf of Felix Kuehling" 
 
wrote:

On some GPUs the PCIe atomic requirement for KFD depends on the MEC
firmware version. Add a firmware version check for this. The minimum
firmware version that works without atomics can be updated in the
device_info structure for each GPU type.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 9 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h   | 1 +
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 16a57b70cc1a..655ee5733229 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -688,6 +688,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
struct kfd_dev *kfd;
const struct kfd_device_info *device_info;
const struct kfd2kgd_calls *f2g;
+   uint32_t fw_version;

if (asic_type >= sizeof(kfd_supported_devices) / (sizeof(void *) * 2)
|| asic_type >= sizeof(kfd2kgd_funcs) / sizeof(void *)) {
@@ -713,8 +714,12 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 * supported.
 */
kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd);
-   if (device_info->needs_pci_atomics &&
-   !kfd->pci_atomic_requested) {
+   fw_version = amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1);
+   if (!kfd->pci_atomic_requested &&
+   device_info->needs_pci_atomics &&
+   (!device_info->no_atomic_fw_version ||
+ amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1) <
You already get the fw_version above
+   device_info->no_atomic_fw_version)) {
dev_info(kfd_device,
 "skipped device %x:%x, PCI rejects atomics\n",
 pdev->vendor, pdev->device);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index ab83b0de6b22..6d8f9bb2d905 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -207,6 +207,7 @@ struct kfd_device_info {
bool supports_cwsr;
bool needs_iommu_device;
bool needs_pci_atomics;
+   uint32_t no_atomic_fw_version;
unsigned int num_sdma_engines;
unsigned int num_xgmi_sdma_engines;
unsigned int num_sdma_queues_per_engine;
-- 
2.32.0




[PATCH 1/1] drm/amdkfd: make needs_pcie_atomics FW-version dependent

2021-08-31 Thread Felix Kuehling
On some GPUs the PCIe atomic requirement for KFD depends on the MEC
firmware version. Add a firmware version check for this. The minimum
firmware version that works without atomics can be updated in the
device_info structure for each GPU type.

Signed-off-by: Felix Kuehling 
---
 drivers/gpu/drm/amd/amdkfd/kfd_device.c | 9 +++--
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h   | 1 +
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c 
b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
index 16a57b70cc1a..655ee5733229 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c
@@ -688,6 +688,7 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
struct kfd_dev *kfd;
const struct kfd_device_info *device_info;
const struct kfd2kgd_calls *f2g;
+   uint32_t fw_version;
 
if (asic_type >= sizeof(kfd_supported_devices) / (sizeof(void *) * 2)
|| asic_type >= sizeof(kfd2kgd_funcs) / sizeof(void *)) {
@@ -713,8 +714,12 @@ struct kfd_dev *kgd2kfd_probe(struct kgd_dev *kgd,
 * supported.
 */
kfd->pci_atomic_requested = amdgpu_amdkfd_have_atomics_support(kgd);
-   if (device_info->needs_pci_atomics &&
-   !kfd->pci_atomic_requested) {
+   fw_version = amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1);
+   if (!kfd->pci_atomic_requested &&
+   device_info->needs_pci_atomics &&
+   (!device_info->no_atomic_fw_version ||
+ amdgpu_amdkfd_get_fw_version(kgd, KGD_ENGINE_MEC1) <
+   device_info->no_atomic_fw_version)) {
dev_info(kfd_device,
 "skipped device %x:%x, PCI rejects atomics\n",
 pdev->vendor, pdev->device);
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h 
b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index ab83b0de6b22..6d8f9bb2d905 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -207,6 +207,7 @@ struct kfd_device_info {
bool supports_cwsr;
bool needs_iommu_device;
bool needs_pci_atomics;
+   uint32_t no_atomic_fw_version;
unsigned int num_sdma_engines;
unsigned int num_xgmi_sdma_engines;
unsigned int num_sdma_queues_per_engine;
-- 
2.32.0
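
For illustration, a hypothetical device_info entry using the new field could
look like this (the value 145 is made up for the example, not taken from the
patch; the real minimum is set per GPU type):

static const struct kfd_device_info example_device_info = {
	/* ... other fields elided ... */
	.needs_pci_atomics = true,
	/* assumed: MEC1 firmware >= 145 works without PCIe atomics */
	.no_atomic_fw_version = 145,
};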



Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Luben Tuikov
On 2021-08-31 16:56, Andrey Grodzovsky wrote:
> On 2021-08-31 12:01 p.m., Luben Tuikov wrote:
>> On 2021-08-31 11:23, Andrey Grodzovsky wrote:
>>> On 2021-08-31 10:38 a.m., Daniel Vetter wrote:
 On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote:
> On 2021-08-31 10:03 a.m., Daniel Vetter wrote:
>> On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote:
>>> It says patch [2/2] but I can't find patch 1
>>>
>>> On 2021-08-31 6:35 a.m., Monk Liu wrote:
 tested-by: jingwen chen 
 Signed-off-by: Monk Liu 
 Signed-off-by: jingwen chen 
 ---
  drivers/gpu/drm/scheduler/sched_main.c | 24 
 
  1 file changed, 4 insertions(+), 20 deletions(-)

 diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
 b/drivers/gpu/drm/scheduler/sched_main.c
 index ecf8140..894fdb24 100644
 --- a/drivers/gpu/drm/scheduler/sched_main.c
 +++ b/drivers/gpu/drm/scheduler/sched_main.c
 @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct 
 work_struct *work)
sched = container_of(work, struct drm_gpu_scheduler, 
 work_tdr.work);
/* Protects against concurrent deletion in 
 drm_sched_get_cleanup_job */
 +  if (!__kthread_should_park(sched->thread))
 +  kthread_park(sched->thread);
 +
>>> As mentioned before, without serializing against other TDR handlers from
>>> other
>>> schedulers you just race here against them, e.g. you parked it now but
>>> another
>>> one in progress will unpark it as part of calling  drm_sched_start for 
>>> other
>>> rings[1]
>>> Unless I am missing something since I haven't found patch [1/2]
>>>
>>> [1] - 
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L5041data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=PrrvFHAwDeLlbcOctlKHsCFs9%2F56XNVzoLVcT1RoJgI%3Dreserved=0
>> You need to have your own wq and run all your tdr work on the same wq if
>> your reset has any cross-engine impact.
> IMHO what is problematic in serializing vs. locking (with trylock and bail
>>> out like we do in [1]) is for multiple TO events arising from the same reason,
>>> like maybe one job just waits for another, and once the first hangs the
>>> second will also appear to be hung, triggering its own TO event.
>>> In this case multiple TO events will trigger multiple resets if we
>>> serialize,
>>> but if we use a lock with trylock the second one will quietly bail out.
>
> [1] 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L4903data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=kxSWBoshVTLMMFIFZdPsP4MBgUAoC%2F3szo9GUemSRLY%3Dreserved=0
 Hm so I guess a single wq here, that will hold up all other TO. And they
 should recheck whether the job is moving meanwhile.
>>> Can you clarify about this ? What job should be moving ? The dependent job ?
>>>
>>>
 Also unless you use hw semaphores the job shouldn't even start before the
 deps are signalled, so not sure how this goes wrong?
>>> What about a simple example where
>>> we actually can submit a shader on one ring and a simple
>>> WAIT_REG_MEM packet on another to wait for the shader to write
>>> a specific value to specific memory location. Here you have both of them
>>> started
>>> in close proximity and no explicit dependencies involved (at the
>>> scheduler level)
>>> and yet if the shader hangs also the WAIT_REG_MEM job will hang.
>>>
>>>
 The vm_id flush stuff can make things a bit more fun for your specific
 case, but in your specific case you have to run all TO handlers on the
 same ordered workqueue anyway (because trying to paper over this in other
 ways doesn't work imo).
>>> I didn't get this one.
>> So, a while back I tried to "serialize" this by moving timed-out jobs
>> into their own timed-out-dedicated list, then freeing them asynchronously,
>> but I never got it to work reliably due to races with low-level drivers and
>> assumptions made way back.
>>
>> My idea was to atomic-move timed-out jobs into their own list, at the time of
>> timeout, and later asynchronously to free them (or better yet, inquire about
>> their state, and free them or move them back--ideally the 

Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Andrey Grodzovsky



On 2021-08-31 12:01 p.m., Luben Tuikov wrote:

On 2021-08-31 11:23, Andrey Grodzovsky wrote:

On 2021-08-31 10:38 a.m., Daniel Vetter wrote:

On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote:

On 2021-08-31 10:03 a.m., Daniel Vetter wrote:

On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote:

It says patch [2/2] but I can't find patch 1

On 2021-08-31 6:35 a.m., Monk Liu wrote:

tested-by: jingwen chen 
Signed-off-by: Monk Liu 
Signed-off-by: jingwen chen 
---
 drivers/gpu/drm/scheduler/sched_main.c | 24 
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index ecf8140..894fdb24 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+   if (!__kthread_should_park(sched->thread))
+   kthread_park(sched->thread);
+

As mentioned before, without serializing against other TDR handlers from
other
schedulers you just race here against them, e.g. you parked it now but
another
one in progress will unpark it as part of calling  drm_sched_start for other
rings[1]
Unless I am missing something since I haven't found patch [1/2]

[1] - 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L5041data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=PrrvFHAwDeLlbcOctlKHsCFs9%2F56XNVzoLVcT1RoJgI%3Dreserved=0

You need to have your own wq and run all your tdr work on the same wq if
your reset has any cross-engine impact.

IMHO what is problematic in serializing vs. locking (with trylock and bail
out like we do in [1]) is for multiple TO events arising from the same reason,
like maybe one job just waits for another, and once the first hangs the
second will also appear to be hung, triggering its own TO event.
In this case multiple TO events will trigger multiple resets if we serialize,
but if we use a lock with trylock the second one will quietly bail out.

[1] 
https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L4903data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=kxSWBoshVTLMMFIFZdPsP4MBgUAoC%2F3szo9GUemSRLY%3Dreserved=0
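
A minimal sketch of the trylock-and-bail pattern described above (the device
struct and lock name here are illustrative; amdgpu's actual locking in [1]
differs in detail):

static enum drm_gpu_sched_stat example_timedout_job(struct drm_sched_job *sched_job)
{
	/* edev / reset_lock are illustrative driver-private names */
	struct example_device *edev = example_job_to_device(sched_job);

	/* another engine's TDR already owns the reset: quietly bail out */
	if (!mutex_trylock(&edev->reset_lock))
		return DRM_GPU_SCHED_STAT_NOMINAL;

	/* ... perform the cross-engine reset ... */

	mutex_unlock(&edev->reset_lock);
	return DRM_GPU_SCHED_STAT_NOMINAL;
}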

Hm so I guess a single wq here, that will hold up all other TO. And they
should recheck whether the job is moving meanwhile.

Can you clarify about this ? What job should be moving ? The dependent job ?



Also unless you use hw semaphores the job shouldn't even start before the
deps are signalled, so not sure how this goes wrong?

What about a simple example where
we actually can submit a shader on one ring and a simple
WAIT_REG_MEM packet on another to wait for the shader to write
a specific value to specific memory location. Here you have both of them
started
in close proximity and no explicit dependencies involved (at the
scheduler level)
and yet if the shader hangs also the WAIT_REG_MEM job will hang.



The vm_id flush stuff can make things a bit more fun for your specific
case, but in your specific case you have to run all TO handlers on the
same ordered workqueue anyway (because trying to paper over this in other
ways doesn't work imo).

I didn't get this one.

So, a while back I tried to "serialize" this by moving timed-out jobs
into their own timed-out-dedicated list, then freeing them asynchronously,
but I never got it to work reliably due to races with low-level drivers and
assumptions made way back.

My idea was to atomic-move timed-out jobs into their own list, at the time of
timeout, and later asynchronously to free them (or better yet, inquire about
their state, and free them or move them back--ideally the inquiry is atomic
and done at timeout time before being moved to the timeout list). Anyway...

But I found out that all these knobs and levers weren't in place and I was
getting problems with it and it never materialized.

The paradigm was loosely "let someone else do it", like, "on an event,
move it to a list, and let someone else handle it", or "on an event, mark
it, and let someone else handle it". (loosely borrowed from an iSCSI target
I did many many years ago--it worked well and there were no races, even with
out-of-order executions.)

If 

[PATCH v7 8/8] nouveau: fold multiple DRM_DEBUG_DRIVERs together

2021-08-31 Thread Jim Cromie
With DRM_USE_DYNAMIC_DEBUG, each callsite record requires 56 bytes.
We can combine 12 into one here and save ~620 bytes.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/nouveau/nouveau_drm.c | 36 +--
 1 file changed, 23 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c 
b/drivers/gpu/drm/nouveau/nouveau_drm.c
index ba4cd5f83725..0f45399535bf 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -1245,19 +1245,29 @@ nouveau_drm_pci_table[] = {
 
 static void nouveau_display_options(void)
 {
-   DRM_DEBUG_DRIVER("Loading Nouveau with parameters:\n");
-
-   DRM_DEBUG_DRIVER("... tv_disable   : %d\n", nouveau_tv_disable);
-   DRM_DEBUG_DRIVER("... ignorelid: %d\n", nouveau_ignorelid);
-   DRM_DEBUG_DRIVER("... duallink : %d\n", nouveau_duallink);
-   DRM_DEBUG_DRIVER("... nofbaccel: %d\n", nouveau_nofbaccel);
-   DRM_DEBUG_DRIVER("... config   : %s\n", nouveau_config);
-   DRM_DEBUG_DRIVER("... debug: %s\n", nouveau_debug);
-   DRM_DEBUG_DRIVER("... noaccel  : %d\n", nouveau_noaccel);
-   DRM_DEBUG_DRIVER("... modeset  : %d\n", nouveau_modeset);
-   DRM_DEBUG_DRIVER("... runpm: %d\n", nouveau_runtime_pm);
-   DRM_DEBUG_DRIVER("... vram_pushbuf : %d\n", nouveau_vram_pushbuf);
-   DRM_DEBUG_DRIVER("... hdmimhz  : %d\n", nouveau_hdmimhz);
+   DRM_DEBUG_DRIVER("Loading Nouveau with parameters:\n"
+"... tv_disable   : %d\n"
+"... ignorelid: %d\n"
+"... duallink : %d\n"
+"... nofbaccel: %d\n"
+"... config   : %s\n"
+"... debug: %s\n"
+"... noaccel  : %d\n"
+"... modeset  : %d\n"
+"... runpm: %d\n"
+"... vram_pushbuf : %d\n"
+"... hdmimhz  : %d\n"
+, nouveau_tv_disable
+, nouveau_ignorelid
+, nouveau_duallink
+, nouveau_nofbaccel
+, nouveau_config
+, nouveau_debug
+, nouveau_noaccel
+, nouveau_modeset
+, nouveau_runtime_pm
+, nouveau_vram_pushbuf
+, nouveau_hdmimhz);
 }
 
 static const struct dev_pm_ops nouveau_pm_ops = {
-- 
2.31.1



[PATCH v7 5/8] drm_print: add choice to use dynamic debug in drm-debug

2021-08-31 Thread Jim Cromie
drm's debug system writes 10 distinct categories of messages to syslog
using a small API[1]: drm_dbg*(10 names), DRM_DEV_DEBUG*(3 names),
DRM_DEBUG*(8 names).  There are thousands of these callsites, each
categorized in this systematized way.

These callsites can be enabled at runtime by their category, each
controlled by a bit in drm.debug (/sys/modules/drm/parameter/debug).
In the current "basic" implementation, drm_debug_enabled() tests these
bits in __drm_debug each time an API[1] call is executed; while cheap
individually, the costs accumulate with uptime.

This patch uses dynamic-debug with jump-label to patch enabled calls
onto their respective NOOP slots, avoiding all runtime bit-checks of
__drm_debug by drm_debug_enabled().

Dynamic debug has no concept of category, but we can emulate one by
replacing enum categories with a set of prefix-strings: "drm:core:",
"drm:kms:", "drm:driver:", etc., and prepending them (at compile time) to
the given formats.

Then we can use:
  `echo module drm format "^drm:core: " +p > control`

to enable the whole category with one query.

This conversion yields many new prdbg callsites:

  dyndbg: 195 debug prints in module drm_kms_helper
  dyndbg: 298 debug prints in module drm
  dyndbg: 1630 debug prints in module i915
  dyndbg: ~3500 debug prints in module amdgpu

CONFIG_DRM_USE_DYNAMIC_DEBUG enables this, and is available if
CONFIG_DYNAMIC_DEBUG or CONFIG_DYNAMIC_DEBUG_CORE is chosen, and if
CONFIG_JUMP_LABEL is enabled; this is because it's required to get the
promised optimizations.

The "basic" -> "dyndbg" switchover is layered into the macro scheme

A. A "prefix" version of DRM_UT_ map, named DRM_DBG_CAT_

"basic":  DRM_DBG_CAT_  <===  DRM_UT_.  Identity map.
"dyndbg":
   #define DRM_DBG_CAT_KMS    "drm:kms: "
   #define DRM_DBG_CAT_PRIME  "drm:prime: "
   #define DRM_DBG_CAT_ATOMIC "drm:atomic: "

In v3, the older name DRM_DBG_CLASS_<CAT> was countered; I had agreed, but
this seems better still: CATEGORY is already DRM's term-of-art, and adding a
near-synonym 'CLASS' only adds ambiguity.

DRM_UT_* are preserved, since they're used elsewhere.  We can probably
reduce their use further, but that's a separate thing.

B. drm_dev_dbg() & drm_debug() are interposed with macros

basic:    forward to renamed fn, with args preserved
enabled:  redirect to pr_debug, dev_dbg, with CATEGORY format catenated

This is where drm_debug_enabled() is avoided.  The prefix is prepended
at compile-time, no category at runtime.
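
Roughly, the interposition has this shape (a sketch using the names above,
not the literal patch):

#ifndef CONFIG_DRM_USE_DYNAMIC_DEBUG
/* "basic": forward to the renamed fn, args preserved; cat is an enum */
#define __drm_dbg(cat, fmt, ...) ___drm_dbg(cat, fmt, ##__VA_ARGS__)
#else
/* "dyndbg": cat is a prefix string like "drm:kms: ", catenated into the
 * format at compile time; pr_debug supplies the jump-label NOOP */
#define __drm_dbg(cat, fmt, ...) pr_debug(cat fmt, ##__VA_ARGS__)
#endif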

C. API[1] uses DRM_DBG_CAT_<CAT>s

these already use (B), now they use (A) too, to get the correct token
type for "basic" and "dyndbg" configs.

D. use DEFINE_DYNAMIC_DEBUG_CATEGORIES()

This defines the map using DRM_CAT_s, and creates the /sysfs
bitmap to control those categories.

NOTES:

Because the dyndbg callback is watching __drm_debug, it is coherent
with drm_debug_enabled() and its remaining users; the switchover
should be transparent.

Code Review is expected to catch the lack of correspondence between
bit=>prefix definitions (the selector) and the prefixes used in the
API[1] layer above pr_debug()

I've coded the search-prefixes/categories with a trailing space, which
excludes any sub-categories added later.  This convention protects any
"drm:atomic:fail:" callsites from getting stomped on by `echo 0 > debug`.
Other categories could differ, but we need some default.

Dyndbg requires that the prefix be in the compiled-in format string;
run-time prefixing evades callsite selection by category.

pr_debug("%s: ...", __func__, ...) // not ideal

With "lineno X" in a query, its possible to enable single callsites,
but it is tedious, and useless in a category context.

Unfortunately __func__ is not a macro, and cannot be catenated at
preprocess/compile time.

Signed-off-by: Jim Cromie 
---
v5:
. use DEFINE_DYNAMIC_DEBUG_CATEGORIES in drm_print.c
. s/DRM_DBG_CLASS_/DRM_DBG_CAT_/ - dont need another term
. default=y in KBuild entry - per @DanVet
. move some commit-log prose to dyndbg commit
. add-prototyes to (param_get/set)_dyndbg
. more wrinkles found by 
. relocate ratelimit chunk from elsewhere
v6:
. add kernel doc
. fix cpp paste, drop '#'
v7:
. change __drm_debug to long, to fit with DEFINE_DYNAMIC_DEBUG_CATEGORIES
. add -DDYNAMIC_DEBUG_MODULE to ccflags if DRM_USE_DYNAMIC_DEBUG
---
 drivers/gpu/drm/Kconfig |  13 
 drivers/gpu/drm/Makefile|   3 +
 drivers/gpu/drm/drm_print.c |  53 +
 include/drm/drm_print.h | 144 
 4 files changed, 166 insertions(+), 47 deletions(-)

diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
index 7ff89690a976..97e38d86fd27 100644
--- a/drivers/gpu/drm/Kconfig
+++ b/drivers/gpu/drm/Kconfig
@@ -57,6 +57,19 @@ config DRM_DEBUG_MM
 
  If in doubt, say "N".
 
+config DRM_USE_DYNAMIC_DEBUG
+   bool "use dynamic debug to implement drm.debug"
+   default y
+   depends on DRM
+   depends on DYNAMIC_DEBUG || DYNAMIC_DEBUG_CORE
+   depends on 

[PATCH v7 7/8] amdgpu_ucode: reduce number of pr_debug calls

2021-08-31 Thread Jim Cromie
There are blocks of DRM_DEBUG calls, consolidate their args into
single calls.  With dynamic-debug in use, each callsite consumes 56
bytes of callsite data, and this patch removes about 65 calls, so
it saves ~3.5kb.

no functional changes.

RFC: this creates multi-line log messages; does that break any syslog
conventions?

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 293 --
 1 file changed, 158 insertions(+), 135 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
index 2834981f8c08..14a9fef1f4c6 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c
@@ -30,17 +30,26 @@
 
 static void amdgpu_ucode_print_common_hdr(const struct common_firmware_header 
*hdr)
 {
-   DRM_DEBUG("size_bytes: %u\n", le32_to_cpu(hdr->size_bytes));
-   DRM_DEBUG("header_size_bytes: %u\n", 
le32_to_cpu(hdr->header_size_bytes));
-   DRM_DEBUG("header_version_major: %u\n", 
le16_to_cpu(hdr->header_version_major));
-   DRM_DEBUG("header_version_minor: %u\n", 
le16_to_cpu(hdr->header_version_minor));
-   DRM_DEBUG("ip_version_major: %u\n", le16_to_cpu(hdr->ip_version_major));
-   DRM_DEBUG("ip_version_minor: %u\n", le16_to_cpu(hdr->ip_version_minor));
-   DRM_DEBUG("ucode_version: 0x%08x\n", le32_to_cpu(hdr->ucode_version));
-   DRM_DEBUG("ucode_size_bytes: %u\n", le32_to_cpu(hdr->ucode_size_bytes));
-   DRM_DEBUG("ucode_array_offset_bytes: %u\n",
- le32_to_cpu(hdr->ucode_array_offset_bytes));
-   DRM_DEBUG("crc32: 0x%08x\n", le32_to_cpu(hdr->crc32));
+   DRM_DEBUG("size_bytes: %u\n"
+ "header_size_bytes: %u\n"
+ "header_version_major: %u\n"
+ "header_version_minor: %u\n"
+ "ip_version_major: %u\n"
+ "ip_version_minor: %u\n"
+ "ucode_version: 0x%08x\n"
+ "ucode_size_bytes: %u\n"
+ "ucode_array_offset_bytes: %u\n"
+ "crc32: 0x%08x\n",
+ le32_to_cpu(hdr->size_bytes),
+ le32_to_cpu(hdr->header_size_bytes),
+ le16_to_cpu(hdr->header_version_major),
+ le16_to_cpu(hdr->header_version_minor),
+ le16_to_cpu(hdr->ip_version_major),
+ le16_to_cpu(hdr->ip_version_minor),
+ le32_to_cpu(hdr->ucode_version),
+ le32_to_cpu(hdr->ucode_size_bytes),
+ le32_to_cpu(hdr->ucode_array_offset_bytes),
+ le32_to_cpu(hdr->crc32));
 }
 
 void amdgpu_ucode_print_mc_hdr(const struct common_firmware_header *hdr)
@@ -55,9 +64,9 @@ void amdgpu_ucode_print_mc_hdr(const struct 
common_firmware_header *hdr)
const struct mc_firmware_header_v1_0 *mc_hdr =
container_of(hdr, struct mc_firmware_header_v1_0, 
header);
 
-   DRM_DEBUG("io_debug_size_bytes: %u\n",
- le32_to_cpu(mc_hdr->io_debug_size_bytes));
-   DRM_DEBUG("io_debug_array_offset_bytes: %u\n",
+   DRM_DEBUG("io_debug_size_bytes: %u\n"
+ "io_debug_array_offset_bytes: %u\n",
+ le32_to_cpu(mc_hdr->io_debug_size_bytes),
  le32_to_cpu(mc_hdr->io_debug_array_offset_bytes));
} else {
DRM_ERROR("Unknown MC ucode version: %u.%u\n", version_major, 
version_minor);
@@ -82,13 +91,17 @@ void amdgpu_ucode_print_smc_hdr(const struct 
common_firmware_header *hdr)
switch (version_minor) {
case 0:
v2_0_hdr = container_of(hdr, struct 
smc_firmware_header_v2_0, v1_0.header);
-   DRM_DEBUG("ppt_offset_bytes: %u\n", 
le32_to_cpu(v2_0_hdr->ppt_offset_bytes));
-   DRM_DEBUG("ppt_size_bytes: %u\n", 
le32_to_cpu(v2_0_hdr->ppt_size_bytes));
+   DRM_DEBUG("ppt_offset_bytes: %u\n"
+ "ppt_size_bytes: %u\n",
+ le32_to_cpu(v2_0_hdr->ppt_offset_bytes),
+ le32_to_cpu(v2_0_hdr->ppt_size_bytes));
break;
case 1:
v2_1_hdr = container_of(hdr, struct 
smc_firmware_header_v2_1, v1_0.header);
-   DRM_DEBUG("pptable_count: %u\n", 
le32_to_cpu(v2_1_hdr->pptable_count));
-   DRM_DEBUG("pptable_entry_offset: %u\n", 
le32_to_cpu(v2_1_hdr->pptable_entry_offset));
+   DRM_DEBUG("pptable_count: %u\n"
+ "pptable_entry_offset: %u\n",
+ le32_to_cpu(v2_1_hdr->pptable_count),
+ le32_to_cpu(v2_1_hdr->pptable_entry_offset));
break;
default:
break;
@@ -111,10 +124,12 @@ void 

[PATCH v7 6/8] drm_print: instrument drm_debug_enabled

2021-08-31 Thread Jim Cromie
Duplicate drm_debug_enabled() code into both "basic" and "dyndbg"
ifdef branches.  Then add a pr_debug("todo: ...") into the "dyndbg"
branch.

Then convert the "dyndbg" branch's code to a macro, so that its
pr_debug() get its callsite info from the invoking function, instead
of from drm_debug_enabled() itself.

This gives us unique callsite info for the 8 remaining users of
drm_debug_enabled(), and lets us enable them individually to see how
much logging traffic they generate.  The oft-visited callsites can
then be reviewed for runtime cost and possible optimizations.

Here's what we get:

bash-5.1# modprobe drm
dyndbg: 384 debug prints in module drm
bash-5.1# grep todo: /proc/dynamic_debug/control
drivers/gpu/drm/drm_edid.c:1843 [drm]connector_bad_edid =_ "todo: maybe avoid 
via dyndbg\012"
drivers/gpu/drm/drm_print.c:309 [drm]___drm_dbg =p "todo: maybe avoid via 
dyndbg\012"
drivers/gpu/drm/drm_print.c:286 [drm]__drm_dev_dbg =p "todo: maybe avoid via 
dyndbg\012"
drivers/gpu/drm/drm_vblank.c:1491 [drm]drm_vblank_restore =_ "todo: maybe avoid 
via dyndbg\012"
drivers/gpu/drm/drm_vblank.c:787 
[drm]drm_crtc_vblank_helper_get_vblank_timestamp_internal =_ "todo: maybe avoid 
via dyndbg\012"
drivers/gpu/drm/drm_vblank.c:410 [drm]drm_crtc_accurate_vblank_count =_ "todo: 
maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_atomic_uapi.c:1457 [drm]drm_mode_atomic_ioctl =_ "todo: 
maybe avoid via dyndbg\012"
drivers/gpu/drm/drm_edid_load.c:178 [drm]edid_load =_ "todo: maybe avoid via 
dyndbg\012"

At a quick glance, edid won't qualify, drm_print might, drm_vblank has the
strongest chance, maybe atomic-ioctl too.

Signed-off-by: Jim Cromie 
---
 include/drm/drm_print.h | 16 +++-
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/drm/drm_print.h b/include/drm/drm_print.h
index 973443040561..03f9ae83fade 100644
--- a/include/drm/drm_print.h
+++ b/include/drm/drm_print.h
@@ -378,6 +378,11 @@ enum drm_debug_category {
 #define DRM_DBG_CAT_DP DRM_UT_DP
 #define DRM_DBG_CAT_DRMRES DRM_UT_DRMRES
 
+static inline bool drm_debug_enabled(enum drm_debug_category category)
+{
+   return unlikely(__drm_debug & category);
+}
+
 #else /* CONFIG_DRM_USE_DYNAMIC_DEBUG */
 
 /* join prefix+format in cpp so dyndbg can see it */
@@ -397,12 +402,13 @@ enum drm_debug_category {
 #define DRM_DBG_CAT_DP "drm:dp: "
 #define DRM_DBG_CAT_DRMRES "drm:res: " /* not in MODULE_PARM_DESC */
 
-#endif /* CONFIG_DRM_USE_DYNAMIC_DEBUG */
+#define drm_debug_enabled(category)\
+   ({  \
+   pr_debug("todo: maybe avoid via dyndbg\n"); \
+   unlikely(__drm_debug & category);   \
+   })
 
-static inline bool drm_debug_enabled(enum drm_debug_category category)
-{
-   return unlikely(__drm_debug & category);
-}
+#endif /* CONFIG_DRM_USE_DYNAMIC_DEBUG */
 
 /*
  * struct device based logging
-- 
2.31.1



[PATCH v7 3/8] i915/gvt: use DEFINE_DYNAMIC_DEBUG_CATEGORIES to create "gvt:core:" etc categories

2021-08-31 Thread Jim Cromie
The gvt component of this driver has ~120 pr_debugs, in 9 categories
quite similar to those in DRM.  Following the interface model of
drm.debug, add a parameter to map bits to these categorizations.

DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_gvt, __gvt_debug,
"dyndbg bitmap desc",
{ "gvt:cmd: ",  "command processing" },
{ "gvt:core: ", "core help" },
{ "gvt:dpy: ",  "display help" },
{ "gvt:el: ",   "help" },
{ "gvt:irq: ",  "help" },
{ "gvt:mm: ",   "help" },
{ "gvt:mmio: ", "help" },
{ "gvt:render: ", "help" },
{ "gvt:sched: " "help" });

The actual patch differs in a few details; a cmd_help() macro emits
the initialization construct.

If CONFIG_DRM_USE_DYNAMIC_DEBUG is set, then -DDYNAMIC_DEBUG_MODULE is added
to cflags by gvt/Makefile.

Signed-off-by: Jim Cromie 
---
v5:
. static decl of vector of bit->class descriptors - Emil.V
. relocate gvt-makefile chunk from elsewhere
v7:
. move ccflags addition up to i915/Makefile from i915/gvt
---
 drivers/gpu/drm/i915/Makefile  |  4 
 drivers/gpu/drm/i915/i915_params.c | 35 ++
 2 files changed, 39 insertions(+)

diff --git a/drivers/gpu/drm/i915/Makefile b/drivers/gpu/drm/i915/Makefile
index 4f22cac1c49b..5a4e371a3ec2 100644
--- a/drivers/gpu/drm/i915/Makefile
+++ b/drivers/gpu/drm/i915/Makefile
@@ -30,6 +30,10 @@ CFLAGS_display/intel_fbdev.o = $(call cc-disable-warning, 
override-init)
 
 subdir-ccflags-y += -I$(srctree)/$(src)
 
+#ifdef CONFIG_DRM_USE_DYNAMIC_DEBUG
+ccflags-y += -DDYNAMIC_DEBUG_MODULE
+#endif
+
 # Please keep these build lists sorted!
 
 # core driver code
diff --git a/drivers/gpu/drm/i915/i915_params.c 
b/drivers/gpu/drm/i915/i915_params.c
index e07f4cfea63a..e645e149485e 100644
--- a/drivers/gpu/drm/i915/i915_params.c
+++ b/drivers/gpu/drm/i915/i915_params.c
@@ -265,3 +265,38 @@ void i915_params_free(struct i915_params *params)
I915_PARAMS_FOR_EACH(FREE);
 #undef FREE
 }
+
+#ifdef CONFIG_DRM_USE_DYNAMIC_DEBUG
+/* todo: needs DYNAMIC_DEBUG_MODULE in some cases */
+
+unsigned long __gvt_debug;
+EXPORT_SYMBOL(__gvt_debug);
+
+#define _help(key) "\t\"" key "\"\t: help for " key "\n"
+
+#define I915_GVT_CATEGORIES(name) \
+   " Enable debug output via /sys/module/i915/parameters/" #name   \
+   ", where each bit enables a debug category.\n"  \
+   _help("gvt:cmd:")   \
+   _help("gvt:core:")  \
+   _help("gvt:dpy:")   \
+   _help("gvt:el:")\
+   _help("gvt:irq:")   \
+   _help("gvt:mm:")\
+   _help("gvt:mmio:")  \
+   _help("gvt:render:")\
+   _help("gvt:sched:")
+
+DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_gvt, __gvt_debug,
+   I915_GVT_CATEGORIES(debug_gvt),
+   _DD_cat_("gvt:cmd:"),
+   _DD_cat_("gvt:core:"),
+   _DD_cat_("gvt:dpy:"),
+   _DD_cat_("gvt:el:"),
+   _DD_cat_("gvt:irq:"),
+   _DD_cat_("gvt:mm:"),
+   _DD_cat_("gvt:mmio:"),
+   _DD_cat_("gvt:render:"),
+   _DD_cat_("gvt:sched:"));
+
+#endif
-- 
2.31.1



[PATCH v7 4/8] amdgpu: use DEFINE_DYNAMIC_DEBUG_CATEGORIES

2021-08-31 Thread Jim Cromie
logger_types.h defines many DC_LOG_*() categorized debug wrappers.
Most of these use DRM debug API, so are controllable using drm.debug,
but others use bare pr_debug("$prefix: .."), each with a different
class-prefix matching "^\[\w+\]:"

Use DEFINE_DYNAMIC_DEBUG_CATEGORIES to create a /sys debug_dc
parameter, modinfos, and to specify a map from bits -> categorized
pr_debugs to be controlled.

Signed-off-by: Jim Cromie 
---
 .../gpu/drm/amd/display/dc/core/dc_debug.c| 44 ++-
 1 file changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c 
b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
index 21be2a684393..9fd43254db88 100644
--- a/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
+++ b/drivers/gpu/drm/amd/display/dc/core/dc_debug.c
@@ -36,8 +36,50 @@
 
 #include "resource.h"
 
-#define DC_LOGGER_INIT(logger)
+#ifdef CONFIG_DRM_USE_DYNAMIC_DEBUG
+/* define a drm.debug style dyndbg pr-debug control point */
+#include 
+
+unsigned long __debug_dc;
+EXPORT_SYMBOL(__debug_dc);
+
+#define _help_(key)	"\t   " key "\t- help for " key "\n"
+
+/* Id like to do these inside DEFINE_DYNAMIC_DEBUG_CATEGORIES, if possible */
+#define DC_DYNDBG_BITMAP_DESC(name)\
+   "Control pr_debugs via /sys/module/amdgpu/parameters/" #name\
+   ", where each bit controls a debug category.\n" \
+   _help_("[SURFACE]:")\
+   _help_("[CURSOR]:") \
+   _help_("[PFLIP]:")  \
+   _help_("[VBLANK]:") \
+   _help_("[HW_LINK_TRAINING]:")   \
+   _help_("[HW_AUDIO]:")   \
+   _help_("[SCALER]:") \
+   _help_("[BIOS]:")   \
+   _help_("[BANDWIDTH_CALCS]:")\
+   _help_("[DML]:")\
+   _help_("[IF_TRACE]:")   \
+   _help_("[GAMMA]:")  \
+   _help_("[SMU_MSG]:")
+
+DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_dc, __debug_dc,
+   DC_DYNDBG_BITMAP_DESC(debug_dc),
+   _DD_cat_("[CURSOR]:"),
+   _DD_cat_("[PFLIP]:"),
+   _DD_cat_("[VBLANK]:"),
+   _DD_cat_("[HW_LINK_TRAINING]:"),
+   _DD_cat_("[HW_AUDIO]:"),
+   _DD_cat_("[SCALER]:"),
+   _DD_cat_("[BIOS]:"),
+   _DD_cat_("[BANDWIDTH_CALCS]:"),
+   _DD_cat_("[DML]:"),
+   _DD_cat_("[IF_TRACE]:"),
+   _DD_cat_("[GAMMA]:"),
+   _DD_cat_("[SMU_MSG]:"));
+#endif
 
+#define DC_LOGGER_INIT(logger)
 
 #define SURFACE_TRACE(...) do {\
if (dc->debug.surface_trace) \
-- 
2.31.1



[PATCH v7 2/8] dyndbg: remove spaces in pr_debug "gvt: core:" etc prefixes

2021-08-31 Thread Jim Cromie
Taking embedded spaces out of existing prefixes makes them better
class-prefixes, simplifying the extra quoting otherwise needed:

  $> echo format "^gvt: core:" +p >control

Dropping the internal spaces means any trailing space in a query will
more clearly terminate the prefix being searched for.

Consider a generic drm-debug example:

  # turn off ATOMIC reports
  echo format "^drm:atomic: " -p > control

  # turn off all ATOMIC:* reports, including any sub-categories
  echo format "^drm:atomic:" -p > control

  # turn on ATOMIC:FAIL: reports
  echo format "^drm:atomic:fail: " +p > control

Removing embedded spaces in the class-prefixes simplifies the
corresponding match-prefix.  This means that "quoted" match-prefixes
are only needed when the trailing space is desired, in order to
exclude explicitly sub-categorized pr-debugs; in this example,
"drm:atomic:fail:".

RFC: maybe the prefix catenation should paste in the " " class-prefix
terminator explicitly.  A pr_debug_() flavor could exclude the " ",
allowing ad-hoc sub-categorization by appending for example, "fail:"
to "drm:atomic:" without the default " " insertion.

Signed-off-by: Jim Cromie 
---
 drivers/gpu/drm/i915/gvt/debug.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/i915/gvt/debug.h b/drivers/gpu/drm/i915/gvt/debug.h
index c6027125c1ec..b4021f41c546 100644
--- a/drivers/gpu/drm/i915/gvt/debug.h
+++ b/drivers/gpu/drm/i915/gvt/debug.h
@@ -36,30 +36,30 @@ do {
\
 } while (0)
 
 #define gvt_dbg_core(fmt, args...) \
-   pr_debug("gvt: core: "fmt, ##args)
+   pr_debug("gvt:core: "fmt, ##args)
 
 #define gvt_dbg_irq(fmt, args...) \
-   pr_debug("gvt: irq: "fmt, ##args)
+   pr_debug("gvt:irq: "fmt, ##args)
 
 #define gvt_dbg_mm(fmt, args...) \
-   pr_debug("gvt: mm: "fmt, ##args)
+   pr_debug("gvt:mm: "fmt, ##args)
 
 #define gvt_dbg_mmio(fmt, args...) \
-   pr_debug("gvt: mmio: "fmt, ##args)
+   pr_debug("gvt:mmio: "fmt, ##args)
 
 #define gvt_dbg_dpy(fmt, args...) \
-   pr_debug("gvt: dpy: "fmt, ##args)
+   pr_debug("gvt:dpy: "fmt, ##args)
 
 #define gvt_dbg_el(fmt, args...) \
-   pr_debug("gvt: el: "fmt, ##args)
+   pr_debug("gvt:el: "fmt, ##args)
 
 #define gvt_dbg_sched(fmt, args...) \
-   pr_debug("gvt: sched: "fmt, ##args)
+   pr_debug("gvt:sched: "fmt, ##args)
 
 #define gvt_dbg_render(fmt, args...) \
-   pr_debug("gvt: render: "fmt, ##args)
+   pr_debug("gvt:render: "fmt, ##args)
 
 #define gvt_dbg_cmd(fmt, args...) \
-   pr_debug("gvt: cmd: "fmt, ##args)
+   pr_debug("gvt:cmd: "fmt, ##args)
 
 #endif
-- 
2.31.1



[PATCH v7 1/8] dyndbg: add DEFINE_DYNAMIC_DEBUG_CATEGORIES and callbacks

2021-08-31 Thread Jim Cromie
DEFINE_DYNAMIC_DEBUG_CATEGORIES(name, var, bitmap_desc, @bit_descs)
allows users to define a drm.debug style (bitmap) sysfs interface, and
to specify the desired mapping from bits[0-N] to the format-prefix'd
pr_debug()s to be controlled.

DEFINE_DYNAMIC_DEBUG_CATEGORIES(debug_gvt, __gvt_debug,
"i915/gvt bitmap desc",
/**
 * search-prefixes, passed to dd-exec_queries
 * defines bits 0-N in order.
 * leading ^ is tacitly inserted (by callback currently)
 * trailing space used here excludes subcats.
 * helper macro needs more work
 * macro to autogen ++$i, 0x%x$i ?
 */
_DD_cat_("gvt:cmd: "),
_DD_cat_("gvt:core: "),
_DD_cat_("gvt:dpy: "),
_DD_cat_("gvt:el: "),
_DD_cat_("gvt:irq: "),
_DD_cat_("gvt:mm: "),
_DD_cat_("gvt:mmio: "),
_DD_cat_("gvt:render: "),
_DD_cat_("gvt:sched: "));

dynamic_debug.c: add 3 new elements:

 - int param_set_dyndbg()
 - int param_get_dyndbg()
 - struct kernel_param_ops param_ops_dyndbg

Following the model of kernel/params.c STANDARD_PARAM_DEFS, these are
non-static and exported.

get/set use an augmented kernel_param; the arg refs a new struct
dyndbg_bitmap_param containing:

- the map (array, indexed by bitpos) of format-prefix strings, which
  define the set/category of prdbgs to be changed per each bit.

- pointer to the user's ulong holding the bits/state.
  by sharing state, we coordinate with code that still uses it directly
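
A plausible shape for these structs, per the description above (field names
here are assumptions, not the literal patch):

struct dyndbg_bitdesc {
	const char *prefix;	/* format-prefix selecting a set of prdbgs */
};

struct dyndbg_bitmap_param {
	unsigned long *bits;			/* user's ulong holding the state */
	const struct dyndbg_bitdesc *map;	/* indexed by bit position */
};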

This will allow drm-debug to be converted incrementally, while still
using __drm_debug & drm_debug_enabled() in other parts.

param_set_dyndbg() compares new vs old bits, and only updates prdbgs
on changes.  This maximally preserves the underlying state, which may
have been customized via later `echo $cmd >control`.  So if a user
really wants to know that all prdbgs are set precisely, they must
pre-clear then set.
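
For illustration, the compare-and-update could look roughly like this (a
sketch assuming the structs above, a NULL-terminated map, and the exported
dynamic_debug_exec_queries(); not the literal patch):

int param_set_dyndbg(const char *instr, const struct kernel_param *kp)
{
	const struct dyndbg_bitmap_param *p = kp->arg;
	unsigned long inbits, old = *p->bits;
	char query[64];
	int rc, i;

	rc = kstrtoul(instr, 0, &inbits);
	if (rc)
		return rc;

	for (i = 0; p->map[i].prefix; i++) {
		if ((inbits & BIT(i)) == (old & BIT(i)))
			continue;	/* unchanged bit: preserve callsite state */
		snprintf(query, sizeof(query), "format '^%s' %cp",
			 p->map[i].prefix, (inbits & BIT(i)) ? '+' : '-');
		dynamic_debug_exec_queries(query, NULL);
	}
	*p->bits = inbits;
	return 0;
}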

TLDR: this also doesn't affect the decorator flags "mflt" set per prdbg.

dynamic_debug.h:

Add DEFINE_DYNAMIC_DEBUG_CATEGORIES() described above, and a stub
throwing a BUILD_BUG (RFC) when used without DYNAMIC_DEBUG support.
Add structs dyndbg_bitdesc, dyndbg_bitmap_param to support the macro.

Note that it also calls MODULE_PARM_DESC for the user, but expects the
user to catenate all the bit-descriptions together (as is done in
drm.debug), and in the following uses in amdgpu, i915.

The intent is to regenerate this output from per-bit help given in
VA_ARGS, including a bit_label(); but this can wait.

Also externs the struct kernel_param_ops param_ops_dyndbg symbol, as is
done in moduleparam.h for all the STANDARD params.

USAGE NOTES:

Using dyndbg to query on "format ^$prefix" requires that the prefix be
present in the compiled-in format string; where run-time prefixing is
used, that format would be "%s...", which is not usefully selectable.

Using DEFINE_DYNAMIC_DEBUG_CATEGORIES without dyndbg support gets a BUILD_BUG.
ISTM there is already action at a declarative distance, nobody needs
mystery as to why the /sysfs thingy didn't appear.

Dyndbg is completely agnostic wrt the categorization scheme used, in
order to play well with any prefix convention already in use in the
codebase.  Ad-hoc categories and sub-categories are implicitly
allowed, author discipline and review is expected.

Here are some examples:

"1","2","3" 2 doesn't imply 1.
otherwise, sorta like printk levels
"1:","2:","3:"  are better, avoiding [1-9]\d+ ambiguity
"hi:","mid:","low:" are reasonable, and imply independence
"todo:","rfc:","fixme:" might be handy
"A:".."Z:"  uhm, yeah

Hierarchical classes/categories are natural:

"drm::"is used in a later commit
"drm:::"  is a natural extension.
"drm:atomic:fail:"  has been proposed, sounds directly useful

NB: in a real sense we abandon enum strictures here, and lose some
compiler help, on spelling errs for example.  Obviously "drm:" != "DRM:".

Some properties of a hierarchical category deserve explication:

Trailing spaces matter !

With 1..3-space ("drm: ", "drm:atomic: ", "drm:atomic:fail: "), the
":" doesn't terminate the search-space, the trailing space does.  So a
"drm:" search spec will match all DRM categories & subcategories, and
will not be useful in an interface where all categories are already
controlled together.  That said, "drm:atomic:" & "drm:atomic: " are
different, and both are useful in cases.

Ad-Hoc sub-categories:

These have a caveat wrt wrapper macros adding prefixes like
"drm:atomic: "; the trailing space in the prefix means that
drm_dbg_atomic("fail: ...") pastes as "drm:atomic: fail: ", which
obviously isn't ideal wrt clear and simple bitmaps.

A possible solution is to have a FOO_() version of every FOO() macro
which (anti-mnemonically) elides the trailing space, which is normally
inserted by a newer FOO().

IE: drm_dbg_atomic_("fail: ..."); // trailing _ means ad-hoc subcat

Summarizing:

 

[PATCH v7 0/8] use DYNAMIC_DEBUG to implement DRM.debug

2021-08-31 Thread Jim Cromie
Hi Jason, DRM folks,

In DRM-debug currently, drm_debug_enabled() is called a lot to decide
whether or not to write debug messages.  Each test is cheap, but costs
continue with uptime.  DYNAMIC_DEBUG "dyndbg", when built with
JUMP_LABEL, replaces each of those tests with a patchable NOOP, for
"zero cost when off" debugs.

this is v7:
- drops RFC distractions -JBaron
- drops patch-1: kp->data addition in moduleparams.h not needed -JBaron
- rework dyndbg callbacks to use kp->arg instead -JBaron
- fixes for problem configs reported -lkp 
- others noted per patch below ---

Broadly, this patchset does 3 things:

Adds DEFINE_DYNAMIC_DEBUG_CATEGORIES, which defines a mapping from
bits to "categorized" pr_debugs, a sysfs interface, and callbacks to
implement the bits as dynamic_debug >controls.  This is modelled after
DRM's debug interface.

Uses the new macro in amdgpu & i915 to control existing pr_debugs
according to their ad-hoc categorizations.

Adapts the drm-debug API (~20 macros) to configurably "replace"
drm_dbg & drm_dev_dbg with pr_debug & dev_dbg, which avoids
drm_debug_enabled() overheads.  DEFINE_DYNAMIC_DEBUG_CATEGORIES is
used to create the controlling bitmap, which is wired to __drm_debug
var so remaining drm_debug_enabled() users stay in sync.

on a virtual box:
bash-5.1# for m in i915 amdgpu nouveau; do modprobe $m; done
[43833.332326] dyndbg: 384 debug prints in module drm
[43833.609577] dyndbg: 211 debug prints in module drm_kms_helper
[43833.707124] dyndbg:   2 debug prints in module ttm
[43837.471166] dyndbg: 1727 debug prints in module i915
[43838.030774] AMD-Vi: AMD IOMMUv2 driver by Joerg Roedel 
[43838.031905] AMD-Vi: AMD IOMMUv2 functionality not available on this system
[43846.209583] dyndbg: 3852 debug prints in module amdgpu
[43846.261391] [drm] amdgpu kernel modesetting enabled.
[43846.262512] amdgpu: CRAT table not found
[43846.263264] amdgpu: Virtual CRAT table created for CPU
[43846.264293] amdgpu: Topology: Add CPU node
[43846.743781] dyndbg:   3 debug prints in module wmi
[43852.517497] dyndbg:  92 debug prints in module nouveau

There are a few multi-second delays there, just before dyndbg
initializes the large blocks of debug prints.


Jim Cromie (8):
  dyndbg: add DEFINE_DYNAMIC_DEBUG_CATEGORIES and callbacks
  dyndbg: remove spaces in pr_debug "gvt: core:" etc prefixes
  i915/gvt: use DEFINE_DYNAMIC_DEBUG_CATEGORIES to create "gvt:core:"
etc categories
  amdgpu: use DEFINE_DYNAMIC_DEBUG_CATEGORIES
  drm_print: add choice to use dynamic debug in drm-debug
  drm_print: instrument drm_debug_enabled
  amdgpu_ucode: reduce number of pr_debug calls
  nouveau: fold multiple DRM_DEBUG_DRIVERs together

 drivers/gpu/drm/Kconfig   |  13 +
 drivers/gpu/drm/Makefile  |   3 +
 drivers/gpu/drm/amd/amdgpu/amdgpu_ucode.c | 293 ++
 .../gpu/drm/amd/display/dc/core/dc_debug.c|  44 ++-
 drivers/gpu/drm/drm_print.c   |  53 +++-
 drivers/gpu/drm/i915/Makefile |   4 +
 drivers/gpu/drm/i915/gvt/debug.h  |  18 +-
 drivers/gpu/drm/i915/i915_params.c|  35 +++
 drivers/gpu/drm/nouveau/nouveau_drm.c |  36 ++-
 include/drm/drm_print.h   | 150 +++--
 include/linux/dynamic_debug.h |  60 
 lib/dynamic_debug.c   |  79 -
 12 files changed, 582 insertions(+), 206 deletions(-)

-- 
2.31.1



Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-31 Thread Andrey Grodzovsky



On 2021-08-31 9:11 a.m., Daniel Vetter wrote:

On Thu, Aug 26, 2021 at 11:04:14AM +0200, Daniel Vetter wrote:

On Thu, Aug 19, 2021 at 11:25:09AM -0400, Andrey Grodzovsky wrote:

On 2021-08-19 5:30 a.m., Daniel Vetter wrote:

On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote:

On 2021-08-18 10:42 a.m., Daniel Vetter wrote:

On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:

On 2021-08-18 10:32 a.m., Daniel Vetter wrote:

On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:

On 2021-08-18 10:02 a.m., Alex Deucher wrote:


+ dri-devel

Since scheduler is a shared component, please add dri-devel on all
scheduler patches.

On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen  wrote:

[Why]
for a bailing job, this commit will delete it from the pending list, thus the
bailing job will never have a chance to be resubmitted, even in advanced
TDR mode.

[How]
after embedding the hw_fence into amdgpu_job is done, the race condition that
this commit tries to work around is completely solved. So revert this
commit.
This reverts commit 135517d3565b48f4def3b1b82008bc17eb5d1c90.
v2:
add dma_fence_get/put() around timedout_job to avoid concurrent delete
during processing timedout_job

Signed-off-by: Jingwen Chen 
---
  drivers/gpu/drm/scheduler/sched_main.c | 23 +--
  1 file changed, 5 insertions(+), 18 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a953693b45..f9b9b3aefc4a 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -314,6 +314,7 @@ static void drm_sched_job_timedout(struct work_struct *work)
  {
 struct drm_gpu_scheduler *sched;
 struct drm_sched_job *job;
+   struct dma_fence *fence;
 enum drm_gpu_sched_stat status = DRM_GPU_SCHED_STAT_NOMINAL;

 sched = container_of(work, struct drm_gpu_scheduler, 
work_tdr.work);
@@ -325,11 +326,10 @@ static void drm_sched_job_timedout(struct work_struct 
*work)

 if (job) {
 /*
-* Remove the bad job so it cannot be freed by concurrent
-* drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
-* is parked at which point it's safe.
+* Get job->s_fence->parent here to avoid concurrent delete 
during
+* processing timedout_job
  */
-   list_del_init(&job->list);
+   fence = dma_fence_get(job->s_fence->parent);

While this is true for amdgpu, it has no meaning for other drivers for which
we haven't
done the refactoring of embedding the HW fence (parent) into the job structure.
In fact, thinking
about it, unless you do the HW fence embedding for all the drivers using the
scheduler, you cannot
revert this patch or you will just break them.

btw, why did you do that embedding? I do still have my patches with
dma_fence annotations floating around, but my idea at least was to fix
that issue with a mempool, not with embedding. What was the motivation
for embedding the hw fence?
-Daniel

The motivation was twofold: avoid memory allocation during job submission
(HW fence allocation) because, as Christian explained, this leads to deadlock
with
mm code during evictions due to memory pressure (Christian can clarify if I
messed up

Yeah that's the exact same thing I've chased with my dma_fence
annotations, but thus far zero to no interest in getting it sorted. I
think it'd be good to have some cross-driver agreement on how this should
be solved before someone just charges ahead ...


this explanation). Second is to enable reverting exactly this patch: while it
solved the issue
described in the patch, it created another one with drivers that bailed out early
during TDR handling
for various reasons, and the job would just leak because it was already
removed from the pending list.

Can't we reinsert it before we restart the scheduler thread? It might need
a separate list for that due to the lockless queue tricks. Or am I
thinking about the wrong kind of "we lost the job"?
-Daniel

If you look at the original patch it would reinsert it even earlier - right
after stopping the SW scheduler thread, and even then it was too late for
some drivers as they would decide to return back from their TDR handler even
before that. It is solvable, but in an ugly way as far as I see: you need to
require each driver, in its own code, to put the job back in the list if it
does so before reaching the place where the scheduler framework does it. Kind
of spaghetti code, it seems to me.
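
A minimal sketch of the reinsertion being discussed, assuming the scheduler
thread is parked at that point (list and lock names from drm_sched; where
exactly this lives is the open question):

/* put the bad job back so the normal free path can find it */
spin_lock(&sched->job_list_lock);
list_add(&bad_job->list, &sched->pending_list);
spin_unlock(&sched->job_list_lock);

/* only then let the scheduler thread run again */
kthread_unpark(sched->thread);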

Hm yeah I didn't realize this all happens before we stop the scheduler
thread.

Why can't we stop the scheduler thread first, so that there's guaranteed
no race? I've recently had a lot of discussions with panfrost folks about
their reset that spans across engines, and without stopping the scheduler
thread first before you touch anything it's just plain impossible.


Talked with 

Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Luben Tuikov
On 2021-08-31 11:23, Andrey Grodzovsky wrote:
> On 2021-08-31 10:38 a.m., Daniel Vetter wrote:
>> On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote:
>>> On 2021-08-31 10:03 a.m., Daniel Vetter wrote:
 On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote:
> It says patch [2/2] but I can't find patch 1
>
> On 2021-08-31 6:35 a.m., Monk Liu wrote:
>> tested-by: jingwen chen 
>> Signed-off-by: Monk Liu 
>> Signed-off-by: jingwen chen 
>> ---
>> drivers/gpu/drm/scheduler/sched_main.c | 24 
>> 1 file changed, 4 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index ecf8140..894fdb24 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct 
>> work_struct *work)
>>  sched = container_of(work, struct drm_gpu_scheduler, 
>> work_tdr.work);
>>  /* Protects against concurrent deletion in 
>> drm_sched_get_cleanup_job */
>> +if (!__kthread_should_park(sched->thread))
>> +kthread_park(sched->thread);
>> +
> As mentioned before, without serializing against other TDR handlers from
> other
> schedulers you just race here against them, e.g. you parked it now but
> another
> one in progress will unpark it as part of calling  drm_sched_start for 
> other
> rings[1]
> Unless I am missing something since I haven't found patch [1/2]
>
> [1] - 
> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L5041data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=PrrvFHAwDeLlbcOctlKHsCFs9%2F56XNVzoLVcT1RoJgI%3Dreserved=0
 You need to have your own wq and run all your tdr work on the same wq if
 your reset has any cross-engine impact.
>>> IMHO what is problematic in serializing vs. locking (with trylock and bail
>>> out like we do in [1]) is for multiple TO events arising from the same reason,
>>> like maybe one job just waits for another, and once the first hangs the
>>> second will also appear to be hung, triggering its own TO event.
>>> In this case multiple TO events will trigger multiple resets if we serialize,
>>> but if we use a lock with trylock the second one will quietly bail out.
>>>
>>> [1] 
>>> https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Felixir.bootlin.com%2Flinux%2Flatest%2Fsource%2Fdrivers%2Fgpu%2Fdrm%2Famd%2Famdgpu%2Famdgpu_device.c%23L4903data=04%7C01%7Cluben.tuikov%40amd.com%7C228bd1600c914efe24aa08d96c934bbb%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637660202148713283%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000sdata=kxSWBoshVTLMMFIFZdPsP4MBgUAoC%2F3szo9GUemSRLY%3Dreserved=0
>> Hm so I guess a single wq here, that will hold up all other TO. And they
>> should recheck whether the job is moving meanwhile.
>
> Can you clarify about this ? What job should be moving ? The dependent job ?
>
>
>> Also unless you use hw semaphores the job shouldn't even start before the
>> deps are signalled, so not sure how this goes wrong?
>
> What about a simple example where
> we actually can submit a shader on one ring and a simple
> WAIT_REG_MEM packet on another to wait for the shader to write
> a specific value to a specific memory location? Here you have both of them
> started in close proximity, with no explicit dependencies involved (at the
> scheduler level),
> and yet if the shader hangs the WAIT_REG_MEM job will hang too.
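
A pseudo-C rendering of that scenario; both helpers are hypothetical,
for illustration only, not amdgpu APIs:

#include <linux/types.h>

struct example_ring;
void example_submit_shader(struct example_ring *r, u64 addr);              /* hypothetical */
void example_emit_wait_reg_mem(struct example_ring *r, u64 addr, u32 val); /* hypothetical */

static void example_implicit_cross_ring_dep(struct example_ring *a,
                                            struct example_ring *b, u64 addr)
{
        /* Ring A: shader that eventually writes 1 to addr. */
        example_submit_shader(a, addr);

        /* Ring B: hardware busy-waits until *addr == 1. The scheduler sees
         * no dependency between the two jobs, yet if the shader hangs this
         * job hangs too and fires its own, independent TO event. */
        example_emit_wait_reg_mem(b, addr, 1);
}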
>
>
>> The vm_id flush stuff can make things a bit more fun for your specific
>> case, but in your specific case you have to run all TO handlers on the
>> same ordered workqueue anyway (because trying to paper over this in other
>> ways doesn't work imo).
>
> I didn't get this one.

So, a while back I tried to "serialize" this by moving timed-out jobs
into their own dedicated timed-out list, then freeing them asynchronously,
but I never got it to work reliably due to races with low-level drivers and
assumptions made way back.

My idea was to atomically move timed-out jobs into their own list at the time
of timeout, and later asynchronously free them (or better yet, inquire about
their state, and free them or move them back--ideally the inquiry is atomic
and done at timeout time before being moved to the timeout list). Anyway...

But I found out that all these knobs and levers weren't in place and I was
getting problems with it and it never materialized.
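
A sketch of that atomic move, assuming a dedicated timed-out list next to
pending_list (the extra list is the hypothetical part):

#include <linux/list.h>
#include <drm/gpu_scheduler.h>

static void example_move_to_timedout(struct drm_gpu_scheduler *sched,
                                     struct drm_sched_job *bad,
                                     struct list_head *timedout)
{
        spin_lock(&sched->job_list_lock);
        /* Off pending_list atomically, so drm_sched_get_cleanup_job()
         * can never free this job behind our back. */
        list_move(&bad->list, timedout);
        spin_unlock(&sched->job_list_lock);

        /* Later, asynchronously: inquire about the job's state and either
         * free it or list_move() it back onto pending_list. */
}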

The paradigm was loosely "let someone else do 

Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Andrey Grodzovsky



On 2021-08-31 10:38 a.m., Daniel Vetter wrote:

On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote:

On 2021-08-31 10:03 a.m., Daniel Vetter wrote:

On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote:

It says patch [2/2] but I can't find patch 1

On 2021-08-31 6:35 a.m., Monk Liu wrote:

tested-by: jingwen chen 
Signed-off-by: Monk Liu 
Signed-off-by: jingwen chen 
---
drivers/gpu/drm/scheduler/sched_main.c | 24 
1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index ecf8140..894fdb24 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+   if (!__kthread_should_park(sched->thread))
+   kthread_park(sched->thread);
+

As mentioned before, without serializing against other TDR handlers from
other
schedulers you just race here against them, e.g. you parked it now but
another
one in progress will unpark it as part of calling  drm_sched_start for other
rings[1]
Unless I am missing something since I haven't found patch [1/2]

[1] - 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L5041

You need to have your own wq and run all your tdr work on the same wq if
your reset has any cross-engine impact.


IMHO what is problematic in serializing vs. locking (with trylock and bail
out like we do in [1]) is multiple TO events arising from the same reason:
maybe one job just waits for another, and once the first hangs the
second will also appear to be hung, triggering its own TO event.
In this case multiple TO events will trigger multiple resets if we serialize,
but if we use a lock with trylock the second one will quietly bail out.

[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4903

Hm so I guess a single wq here, that will hold up all other TO. And they
should recheck whether the job is moving meanwhile.



Can you clarify this? Which job should be moving? The dependent job?




Also unless you use hw semaphores the job shouldn't even start before the
deps are signalled, so not sure how this goes wrong?



What about a simple example where
we actually can submit a shader on one ring and a simple
WAIT_REG_MEM packet on another to wait for the shader to write
a specific value to a specific memory location? Here you have both of them
started in close proximity, with no explicit dependencies involved (at the
scheduler level),

and yet if the shader hangs the WAIT_REG_MEM job will hang too.




The vm_id flush stuff can make things a bit more fun for your specific
case, but in your specific case you have to run all TO handlers on the
same ordered workqueue anyway (because trying to paper over this in other
ways doesn't work imo).



I didn't get this one.

Andrey




So I think this should all work, no need for tricky cross-scheduler
locking.
-Daniel


Andrey



See

https://dri.freedesktop.org/docs/drm/gpu/drm-mm.html#c.drm_sched_backend_ops

for the ->timeout_job callback docs. I thought I brought this up already?
-Daniel


Yes, this discussion is a continuation of your comment about serializing; I
mentioned before that you proposed it.

Andrey



Andrey



spin_lock(&sched->job_list_lock);
job = list_first_entry_or_null(&sched->pending_list,
   struct drm_sched_job, list);
if (job) {
-   /*
-* Remove the bad job so it cannot be freed by concurrent
-* 

Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Luben Tuikov
On 2021-08-31 08:59, Daniel Vetter wrote:
> Can we please have some actual commit message here, with detailed
> explanation of the race/bug/whatever, how you fix it and why this is the
> best option?

I agree with Daniel--a narrative form of a commit message is so much easier
for humans to digest. The "[what]"/"[why]"/"[how]" and "issue"/"fix" format is
somewhat dry and uninformative, and leaves much to be desired.

Regards,
Luben

>
> On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote:
>> tested-by: jingwen chen 
>> Signed-off-by: Monk Liu 
>> Signed-off-by: jingwen chen 
>> ---
>>  drivers/gpu/drm/scheduler/sched_main.c | 24 
>>  1 file changed, 4 insertions(+), 20 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
>> b/drivers/gpu/drm/scheduler/sched_main.c
>> index ecf8140..894fdb24 100644
>> --- a/drivers/gpu/drm/scheduler/sched_main.c
>> +++ b/drivers/gpu/drm/scheduler/sched_main.c
>> @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
>> *work)
>>  sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>>  
>>  /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
>> +if (!__kthread_should_park(sched->thread))
> This is a __ function, i.e. considered internal, and it's lockless atomic,
> i.e. unordered. And you're not explaining why this works.
>
> Iow it's probably buggy, and just unconditionally parking the kthread
> is probably the right thing to do. If it's not the right thing to do,
> there's a bug here for sure.
> -Daniel
>
>> +kthread_park(sched->thread);
>> +
>>  spin_lock(&sched->job_list_lock);
>>  job = list_first_entry_or_null(&sched->pending_list,
>> struct drm_sched_job, list);
>>  
>>  if (job) {
>> -/*
>> - * Remove the bad job so it cannot be freed by concurrent
>> - * drm_sched_cleanup_jobs. It will be reinserted back after 
>> sched->thread
>> - * is parked at which point it's safe.
>> - */
>> -list_del_init(&job->list);
>>  spin_unlock(&sched->job_list_lock);
>>  
>> +/* vendor's timeout_job should call drm_sched_start() */
>>  status = job->sched->ops->timedout_job(job);
>>  
>>  /*
>> @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
>> struct drm_sched_job *bad)
>>  kthread_park(sched->thread);
>>  
>>  /*
>> - * Reinsert back the bad job here - now it's safe as
>> - * drm_sched_get_cleanup_job cannot race against us and release the
>> - * bad job at this point - we parked (waited for) any in progress
>> - * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
>> - * now until the scheduler thread is unparked.
>> - */
>> -if (bad && bad->sched == sched)
>> -/*
>> - * Add at the head of the queue to reflect it was the earliest
>> - * job extracted.
>> - */
>> -list_add(&bad->list, &sched->pending_list);
>> -
>> -/*
>>   * Iterate the job list from later to earlier one and either deactivate
>>   * their HW callbacks or remove them from pending list if they already
>>   * signaled.
>> -- 
>> 2.7.4
>>



Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Daniel Vetter
On Tue, Aug 31, 2021 at 10:20:40AM -0400, Andrey Grodzovsky wrote:
> 
> On 2021-08-31 10:03 a.m., Daniel Vetter wrote:
> > On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote:
> > > It says patch [2/2] but I can't find patch 1
> > > 
> > > On 2021-08-31 6:35 a.m., Monk Liu wrote:
> > > > tested-by: jingwen chen 
> > > > Signed-off-by: Monk Liu 
> > > > Signed-off-by: jingwen chen 
> > > > ---
> > > >drivers/gpu/drm/scheduler/sched_main.c | 24 
> > > >1 file changed, 4 insertions(+), 20 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > > index ecf8140..894fdb24 100644
> > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct 
> > > > work_struct *work)
> > > > sched = container_of(work, struct drm_gpu_scheduler, 
> > > > work_tdr.work);
> > > > /* Protects against concurrent deletion in 
> > > > drm_sched_get_cleanup_job */
> > > > +   if (!__kthread_should_park(sched->thread))
> > > > +   kthread_park(sched->thread);
> > > > +
> > > 
> > > As mentioned before, without serializing against other TDR handlers from
> > > other
> > > schedulers you just race here against them, e.g. you parked it now but
> > > another
> > > one in progress will unpark it as part of calling  drm_sched_start for 
> > > other
> > > rings[1]
> > > Unless I am missing something since I haven't found patch [1/2]
> > > 
> > > [1] - 
> > > https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L5041
> > You need to have your own wq and run all your tdr work on the same wq if
> > your reset has any cross-engine impact.
> 
> 
> IMHO what is problematic in serializing vs. locking (with trylock and bail
> out like we do in [1]) is multiple TO events arising from the same reason:
> maybe one job just waits for another, and once the first hangs the
> second will also appear to be hung, triggering its own TO event.
> In this case multiple TO events will trigger multiple resets if we serialize,
> but if we use a lock with trylock the second one will quietly bail out.
> 
> [1] 
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4903

Hm so I guess a single wq here, that will hold up all other TO. And they
should recheck whether the job is moving meanwhile.

Also unless you use hw semaphores the job shouldn't even start before the
deps are signalled, so not sure how this goes wrong?

The vm_id flush stuff can make things a bit more fun for your specific
case, but in your specific case you have to run all TO handlers on the
same ordered workqueue anyway (because trying to paper over this in other
ways doesn't work imo).

So I think this should all work, no need for tricky cross-scheduler
locking.
-Daniel

> 
> Andrey
> 
> 
> > 
> > See
> > 
> > https://dri.freedesktop.org/docs/drm/gpu/drm-mm.html#c.drm_sched_backend_ops
> > 
> > for the ->timeout_job callback docs. I thought I brought this up already?
> > -Daniel
> 
> 
> Yes, this discussion is a continuation of your comment about serializing; I
> mentioned before that you proposed it.
> 
> Andrey
> 
> 
> > 
> > > Andrey
> > > 
> > > 
> > > > spin_lock(&sched->job_list_lock);
> > > > job = list_first_entry_or_null(&sched->pending_list,
> > > >struct drm_sched_job, list);
> > > > if (job) {
> > > > -   /*
> > > > -* Remove the bad job so it cannot be freed by 
> > > > concurrent
> > > > -* drm_sched_cleanup_jobs. It will be reinserted back 
> > > > after sched->thread
> > > > -* is parked at which point it's safe.
> > > > -*/
> > > > -   list_del_init(&job->list);
> > > > spin_unlock(&sched->job_list_lock);
> > > > +   /* vendor's timeout_job should call drm_sched_start() */
> > > > status = job->sched->ops->timedout_job(job);
> > > > /*
> > > > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler 
> > > > 

Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Andrey Grodzovsky



On 2021-08-31 10:03 a.m., Daniel Vetter wrote:

On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote:

It says patch [2/2] but I can't find patch 1

On 2021-08-31 6:35 a.m., Monk Liu wrote:

tested-by: jingwen chen 
Signed-off-by: Monk Liu 
Signed-off-by: jingwen chen 
---
   drivers/gpu/drm/scheduler/sched_main.c | 24 
   1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index ecf8140..894fdb24 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+   if (!__kthread_should_park(sched->thread))
+   kthread_park(sched->thread);
+


As mentioned before, without serializing against other TDR handlers from
other
schedulers you just race here against them, e.g. you parked it now but
another
one in progress will unpark it as part of calling  drm_sched_start for other
rings[1]
Unless I am missing something since I haven't found patch [1/2]

[1] - 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L5041

You need to have your own wq and run all your tdr work on the same wq if
your reset has any cross-engine impact.



IMHO what is problematic in serializing vs. locking (with trylock and
bail out like we do in [1]) is multiple TO events arising from the same
reason:
maybe one job just waits for another, and once the first hangs the
second will also appear to be hung, triggering its own TO event.
In this case multiple TO events will trigger multiple resets if we
serialize, but if we use a lock with trylock the second one will quietly
bail out.


[1] 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L4903


Andrey




See

https://dri.freedesktop.org/docs/drm/gpu/drm-mm.html#c.drm_sched_backend_ops

for the ->timeout_job callback docs. I thought I brought this up already?
-Daniel



Yes, this discussion is a continuation of your comment about 
serializing; I mentioned before that you proposed it.


Andrey





Andrey



spin_lock(&sched->job_list_lock);
job = list_first_entry_or_null(&sched->pending_list,
   struct drm_sched_job, list);
if (job) {
-   /*
-* Remove the bad job so it cannot be freed by concurrent
-* drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
-* is parked at which point it's safe.
-*/
-   list_del_init(&job->list);
spin_unlock(&sched->job_list_lock);
+   /* vendor's timeout_job should call drm_sched_start() */
status = job->sched->ops->timedout_job(job);
/*
@@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
struct drm_sched_job *bad)
kthread_park(sched->thread);
/*
-* Reinsert back the bad job here - now it's safe as
-* drm_sched_get_cleanup_job cannot race against us and release the
-* bad job at this point - we parked (waited for) any in progress
-* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
-* now until the scheduler thread is unparked.
-*/
-   if (bad && bad->sched == sched)
-   /*
-* Add at the head of the queue to reflect it was the earliest
-* job extracted.
-*/
-   list_add(&bad->list, &sched->pending_list);
-
-   /*
 * Iterate the job list from later to earlier one and either deactivate
 * their HW callbacks or remove them from pending list if they already
 * signaled.


Re: [PATCH] drm/amdgpu: stop scheduler when calling hw_fini (v2)

2021-08-31 Thread Alex Deucher
On Mon, Aug 30, 2021 at 2:24 AM Guchun Chen  wrote:
>
> This guarantees no more work on the ring can be submitted
> to hardware in the suspend/resume case, otherwise a potential
> race will occur and the ring will get no chance to stay
> empty before suspend.
>
> v2: Call drm_sched_resubmit_jobs before drm_sched_start to
> restart jobs from the pending list.
>
> Suggested-by: Andrey Grodzovsky 
> Suggested-by: Christian König 
> Signed-off-by: Guchun Chen 

Maybe add:
Fixes: 067f44c8b459 ("drm/amdgpu: avoid over-handle of fence driver
fini in s3 test (v2)")
?

Alex

> ---
>  drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c | 8 
>  1 file changed, 8 insertions(+)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> index b439eb7d4177..fd4ba076ff8a 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_fence.c
> @@ -552,6 +552,9 @@ void amdgpu_fence_driver_hw_fini(struct amdgpu_device 
> *adev)
> if (!ring || !ring->fence_drv.initialized)
> continue;
>
> +   if (!ring->no_scheduler)
> +   drm_sched_stop(&ring->sched, NULL);
> +
> /* You can't wait for HW to signal if it's gone */
> if (!drm_dev_is_unplugged(&adev->ddev))
> r = amdgpu_fence_wait_empty(ring);
> @@ -611,6 +614,11 @@ void amdgpu_fence_driver_hw_init(struct amdgpu_device 
> *adev)
> if (!ring || !ring->fence_drv.initialized)
> continue;
>
> +   if (!ring->no_scheduler) {
> +   drm_sched_resubmit_jobs(&ring->sched);
> +   drm_sched_start(&ring->sched, true);
> +   }
> +
> /* enable the interrupt */
> if (ring->fence_drv.irq_src)
> amdgpu_irq_get(adev, ring->fence_drv.irq_src,
> --
> 2.17.1
>


Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Daniel Vetter
On Tue, Aug 31, 2021 at 09:53:36AM -0400, Andrey Grodzovsky wrote:
> It says patch [2/2] but I can't find patch 1
> 
> On 2021-08-31 6:35 a.m., Monk Liu wrote:
> > tested-by: jingwen chen 
> > Signed-off-by: Monk Liu 
> > Signed-off-by: jingwen chen 
> > ---
> >   drivers/gpu/drm/scheduler/sched_main.c | 24 
> >   1 file changed, 4 insertions(+), 20 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index ecf8140..894fdb24 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
> > *work)
> > sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> > /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> > +   if (!__kthread_should_park(sched->thread))
> > +   kthread_park(sched->thread);
> > +
> 
> 
> As mentioned before, without serializing against other TDR handlers from
> other
> schedulers you just race here against them, e.g. you parked it now but
> another
> one in progress will unpark it as part of calling  drm_sched_start for other
> rings[1]
> Unless I am missing something since I haven't found patch [1/2]
> 
> [1] - 
> https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L5041

You need to have your own wq and run all your tdr work on the same wq if
your reset has any cross-engine impact.

See

https://dri.freedesktop.org/docs/drm/gpu/drm-mm.html#c.drm_sched_backend_ops

for the ->timeout_job callback docs. I thought I brought this up already?
-Daniel
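
A sketch of that setup; the ordered workqueue API is real, while wiring it
into the scheduler's timeout handling is the assumed part:

#include <linux/workqueue.h>

static struct workqueue_struct *tdr_wq;

static int example_tdr_wq_init(void)
{
        /* Ordered queue: max_active is fixed at 1, so timeout handlers
         * queued here run strictly one after another, in queueing order. */
        tdr_wq = alloc_ordered_workqueue("gpu-tdr", 0);
        return tdr_wq ? 0 : -ENOMEM;
}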

> 
> Andrey
> 
> 
> > spin_lock(&sched->job_list_lock);
> > job = list_first_entry_or_null(&sched->pending_list,
> >struct drm_sched_job, list);
> > if (job) {
> > -   /*
> > -* Remove the bad job so it cannot be freed by concurrent
> > -* drm_sched_cleanup_jobs. It will be reinserted back after 
> > sched->thread
> > -* is parked at which point it's safe.
> > -*/
> > -   list_del_init(&job->list);
> > spin_unlock(&sched->job_list_lock);
> > +   /* vendor's timeout_job should call drm_sched_start() */
> > status = job->sched->ops->timedout_job(job);
> > /*
> > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> > struct drm_sched_job *bad)
> > kthread_park(sched->thread);
> > /*
> > -* Reinsert back the bad job here - now it's safe as
> > -* drm_sched_get_cleanup_job cannot race against us and release the
> > -* bad job at this point - we parked (waited for) any in progress
> > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> > -* now until the scheduler thread is unparked.
> > -*/
> > -   if (bad && bad->sched == sched)
> > -   /*
> > -* Add at the head of the queue to reflect it was the earliest
> > -* job extracted.
> > -*/
> > -   list_add(&bad->list, &sched->pending_list);
> > -
> > -   /*
> >  * Iterate the job list from later to earlier one and either deactivate
> >  * their HW callbacks or remove them from pending list if they already
> >  * signaled.

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Andrey Grodzovsky

It says patch [2/2] but I can't find patch 1

On 2021-08-31 6:35 a.m., Monk Liu wrote:

tested-by: jingwen chen 
Signed-off-by: Monk Liu 
Signed-off-by: jingwen chen 
---
  drivers/gpu/drm/scheduler/sched_main.c | 24 
  1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index ecf8140..894fdb24 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
  
  	/* Protects against concurrent deletion in drm_sched_get_cleanup_job */

+   if (!__kthread_should_park(sched->thread))
+   kthread_park(sched->thread);
+



As mentioned before, without serializing against other TDR handlers from 
other
schedulers you just race here against them, e.g. you parked it now but 
another
one in progress will unpark it as part of calling  drm_sched_start for 
other rings[1]

Unless I am missing something since I haven't found patch [1/2]

[1] - 
https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c#L5041


Andrey



spin_lock(&sched->job_list_lock);
job = list_first_entry_or_null(&sched->pending_list,
   struct drm_sched_job, list);
  
  	if (job) {

-   /*
-* Remove the bad job so it cannot be freed by concurrent
-* drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
-* is parked at which point it's safe.
-*/
-   list_del_init(&job->list);
spin_unlock(&sched->job_list_lock);
  
+		/* vendor's timeout_job should call drm_sched_start() */

status = job->sched->ops->timedout_job(job);
  
  		/*

@@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
struct drm_sched_job *bad)
kthread_park(sched->thread);
  
  	/*

-* Reinsert back the bad job here - now it's safe as
-* drm_sched_get_cleanup_job cannot race against us and release the
-* bad job at this point - we parked (waited for) any in progress
-* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
-* now until the scheduler thread is unparked.
-*/
-   if (bad && bad->sched == sched)
-   /*
-* Add at the head of the queue to reflect it was the earliest
-* job extracted.
-*/
-   list_add(&bad->list, &sched->pending_list);
-
-   /*
 * Iterate the job list from later to earlier one and either deactivate
 * their HW callbacks or remove them from pending list if they already
 * signaled.


Re: [PATCH v2] Revert "drm/scheduler: Avoid accessing freed bad job."

2021-08-31 Thread Daniel Vetter
On Thu, Aug 26, 2021 at 11:04:14AM +0200, Daniel Vetter wrote:
> On Thu, Aug 19, 2021 at 11:25:09AM -0400, Andrey Grodzovsky wrote:
> > 
> > On 2021-08-19 5:30 a.m., Daniel Vetter wrote:
> > > On Wed, Aug 18, 2021 at 10:51:00AM -0400, Andrey Grodzovsky wrote:
> > > > On 2021-08-18 10:42 a.m., Daniel Vetter wrote:
> > > > > On Wed, Aug 18, 2021 at 10:36:32AM -0400, Andrey Grodzovsky wrote:
> > > > > > On 2021-08-18 10:32 a.m., Daniel Vetter wrote:
> > > > > > > On Wed, Aug 18, 2021 at 10:26:25AM -0400, Andrey Grodzovsky wrote:
> > > > > > > > On 2021-08-18 10:02 a.m., Alex Deucher wrote:
> > > > > > > > 
> > > > > > > > > + dri-devel
> > > > > > > > > 
> > > > > > > > > Since the scheduler is a shared component, please add dri-devel 
> > > > > > > > > on all
> > > > > > > > > scheduler patches.
> > > > > > > > > 
> > > > > > > > > On Wed, Aug 18, 2021 at 7:21 AM Jingwen Chen 
> > > > > > > > >  wrote:
> > > > > > > > > > [Why]
> > > > > > > > > > for a bailing job, this commit will delete it from the pending
> > > > > > > > > > list, thus the
> > > > > > > > > > bailing job will never have a chance to be resubmitted even
> > > > > > > > > > in advance
> > > > > > > > > > tdr mode.
> > > > > > > > > > 
> > > > > > > > > > [How]
> > > > > > > > > > after embeded hw_fence into amdgpu_job is done, the race 
> > > > > > > > > > condition that
> > > > > > > > > > this commit tries to work around is completely solved.So 
> > > > > > > > > > revert this
> > > > > > > > > > commit.
> > > > > > > > > > This reverts commit 
> > > > > > > > > > 135517d3565b48f4def3b1b82008bc17eb5d1c90.
> > > > > > > > > > v2:
> > > > > > > > > > add dma_fence_get/put() around timedout_job to avoid 
> > > > > > > > > > concurrent delete
> > > > > > > > > > during processing timedout_job
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Jingwen Chen 
> > > > > > > > > > ---
> > > > > > > > > >  drivers/gpu/drm/scheduler/sched_main.c | 23 
> > > > > > > > > > +--
> > > > > > > > > >  1 file changed, 5 insertions(+), 18 deletions(-)
> > > > > > > > > > 
> > > > > > > > > > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > > > > > > > > > b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > > index a2a953693b45..f9b9b3aefc4a 100644
> > > > > > > > > > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > > > > > > > > > @@ -314,6 +314,7 @@ static void 
> > > > > > > > > > drm_sched_job_timedout(struct work_struct *work)
> > > > > > > > > >  {
> > > > > > > > > > struct drm_gpu_scheduler *sched;
> > > > > > > > > > struct drm_sched_job *job;
> > > > > > > > > > +   struct dma_fence *fence;
> > > > > > > > > > enum drm_gpu_sched_stat status = 
> > > > > > > > > > DRM_GPU_SCHED_STAT_NOMINAL;
> > > > > > > > > > 
> > > > > > > > > > sched = container_of(work, struct 
> > > > > > > > > > drm_gpu_scheduler, work_tdr.work);
> > > > > > > > > > @@ -325,11 +326,10 @@ static void 
> > > > > > > > > > drm_sched_job_timedout(struct work_struct *work)
> > > > > > > > > > 
> > > > > > > > > > if (job) {
> > > > > > > > > > /*
> > > > > > > > > > -* Remove the bad job so it cannot be freed 
> > > > > > > > > > by concurrent
> > > > > > > > > > -* drm_sched_cleanup_jobs. It will be 
> > > > > > > > > > reinserted back after sched->thread
> > > > > > > > > > -* is parked at which point it's safe.
> > > > > > > > > > +* Get job->s_fence->parent here to avoid 
> > > > > > > > > > concurrent delete during
> > > > > > > > > > +* processing timedout_job
> > > > > > > > > >  */
> > > > > > > > > > -   list_del_init(&job->list);
> > > > > > > > > > +   fence = dma_fence_get(job->s_fence->parent);
> > > > > > > > While this is true for amdgpu, it has no meaning for other 
> > > > > > > > drivers for whom
> > > > > > > > we haven't
> > > > > > > > done the refactoring of embedding HW fence (parent) into the 
> > > > > > > > job structure.
> > > > > > > > In fact thinking
> > > > > > > > about it, unless you do the HW fence embedding for all the 
> > > > > > > > drivers using the
> > > > > > > > scheduler you cannot
> > > > > > > > revert this patch or you will just break them.
> > > > > > > btw, why did you do that embedding? I do still have my patches 
> > > > > > > with
> > > > > > > dma_fence annotations floating around, but my idea at least was 
> > > > > > > to fix
> > > > > > > that issue with a mempool, not with embedding. What was the
> > > > > > > motivation
> > > > > > > for embedding the hw fence?
> > > > > > > -Daniel
> > > > > > The motivation was twofold: avoid memory allocation during job
> > > > > > submission
> > > > > > (HW fence allocation) because as Christian explained this leads to 
> > > > > > deadlock
> > > > > > with
> > > > > > mm code 

Re: [PATCH] drm/sched: fix the bug of time out calculation(v3)

2021-08-31 Thread Daniel Vetter
On Fri, Aug 27, 2021 at 08:30:32PM +0200, Christian König wrote:
> Yeah, that's what I meant with the start of processing a job being a bit
> swampily defined.
> 
> Jobs overlap, but we simply don't have another good indicator that a job
> started except that the previous one completed.
> 
> It's still better than starting the timer when pushing the job to the ring
> buffer, because that is completely off.

Not sure this matters that much really in practice, and in some cases we
might want to actually just reset it all if the built up backlog is way
too long.

For anything that really runs way too long you want preempt-ctx/revoke
fences anyway, not end-of-cs fences with tdr.
-Daniel

> 
> Christian.
> 
> On 27.08.21 at 20:22, Andrey Grodzovsky wrote:
> > As I mentioned to Monk before - what about cases such as in this test - 
> > https://gitlab.freedesktop.org/mesa/drm/-/commit/bc21168fa924d3fc4a000492e861f50a1a135b25
> > 
> > Here you don't have a serialized sequence where one job finishes
> > processing before the second starts - they execute concurrently. For
> > those cases it seems
> > to me restarting the timer for the second job from scratch will let it
> > hang much longer than allowed by the TO value.
> > 
> > Andrey
> > 
> > On 2021-08-27 10:29 a.m., Christian König wrote:
> > > I don't think that makes sense.
> > > 
> > > See, we don't want to start the timer when the job is inserted into
> > > the ring buffer, but rather when it starts processing.
> > > 
> > > Starting processing is a bit swampily defined, but just starting the
> > > timer when the previous job completes should be fine enough.
> > > 
> > > Christian.
> > > 
> > > On 27.08.21 at 15:57, Andrey Grodzovsky wrote:
> > > > The TS represents the point in time when the job was inserted
> > > > into the pending list.
> > > > I don't think it matters when it actually starts to be
> > > > processed; what matters is when this job was inserted into the
> > > > pending list, because right at that point you arm the TO timer
> > > > (when no other is running already),
> > > > and so when the previous job completes and you cancel and rearm
> > > > again, you can use that TS from the next job in the pending list to
> > > > calculate how much time it actually has left to run before
> > > > TDR must be initiated,
> > > > and not just give it the full TO value again even if it has
> > > > already been running for a while.
> > > > 
> > > > Also, I am not sure about the assumption that what we
> > > > measure is processing by HW; what we measure is from the moment
> > > > the job was scheduled to the ring to the moment it completed (EOP
> > > > event). At least that is what our TDR timer measures, and so it
> > > > makes sense to set the TS at this point.
> > > > 
> > > > Andrey
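
A sketch of that per-job TTL arithmetic; the jiffies timestamp recorded at
pending-list insertion (job_ts) is the assumption here:

#include <linux/jiffies.h>

static unsigned long example_remaining_ttl(unsigned long job_ts,
                                           unsigned long timeout)
{
        unsigned long elapsed = jiffies - job_ts;

        /* Already past its budget: TDR should fire right away. */
        if (elapsed >= timeout)
                return 0;

        /* Otherwise re-arm work_tdr only for the time the job has left. */
        return timeout - elapsed;
}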
> > > > 
> > > > On 2021-08-27 3:20 a.m., Liu, Monk wrote:
> > > > > [AMD Official Use Only]
> > > > > 
> > > > > What is that 'ts' representing? It looks to me like the
> > > > > jiffies when it gets scheduled to the ring, but a job
> > > > > scheduled to the ring doesn't mean it's being processed
> > > > > by hw.
> > > > > 
> > > > > Thanks
> > > > > 
> > > > > --
> > > > > Monk Liu | Cloud-GPU Core team
> > > > > --
> > > > > 
> > > > > -Original Message-
> > > > > From: Grodzovsky, Andrey 
> > > > > Sent: Friday, August 27, 2021 4:14 AM
> > > > > To: Liu, Monk ;
> > > > > amd-gfx@lists.freedesktop.org; Koenig, Christian
> > > > > 
> > > > > Cc: dri-de...@lists.freedesktop.org
> > > > > Subject: Re: [PATCH] drm/sched: fix the bug of time out
> > > > > calculation(v3)
> > > > > 
> > > > > Attached a quick patch for per-job TTL calculation to make
> > > > > the next timer expiration more precise. It's on top of the
> > > > > patch in this thread. Let me know if this makes sense.
> > > > > 
> > > > > Andrey
> > > > > 
> > > > > On 2021-08-26 10:03 a.m., Andrey Grodzovsky wrote:
> > > > > > On 2021-08-26 12:55 a.m., Monk Liu wrote:
> > > > > > > issue:
> > > > > > > in cleanup_job the cancel_delayed_work will cancel a TO timer even
> > > > > > > though its corresponding job is still running.
> > > > > > > 
> > > > > > > fix:
> > > > > > > do not cancel the timer in cleanup_job, instead do the cancelling
> > > > > > > only when the heading job is signaled, and if there is a "next" 
> > > > > > > job
> > > > > > > we start_timeout again.
> > > > > > > 
> > > > > > > v2:
> > > > > > > further cleanup the logic, and do the TDR timer cancelling if the
> > > > > > > signaled job is the last one in its scheduler.
> > > > > > > 
> > > > > > > v3:
> > > > > > > change the issue description
> > > > > > > remove the cancel_delayed_work in the beginning of cleanup_job
> > > > > > > restore the implementation of drm_sched_job_begin.
> > > > > > > 
> > > > > > > TODO:
> > > > > > > 1)introduce pause/resume scheduler in job_timeout to serialize the
> > > > > > > handling of scheduler and job_timeout.
> > > > > > > 2)drop the bad job's del and insert in scheduler due to above serialization

Re: [drm/amdgpu] Driver crashes on 5.13.9 kernel

2021-08-31 Thread Kari Argillander
On Mon, Aug 30, 2021 at 07:15:29PM +0300, Skyler Mäntysaari wrote:
> I have tried kernel 5.13.13 without any difference, and I haven't
> tried an older kernel, as this hardware is so new that I have
> very little faith that a pre-5.x kernel would even have support for
> the needed GPU.

Yeah might be. 

> What do you mean by git bisect? I have checked that the crash
> happens somewhere in the monitor connection code:

If we find some older version which works, then with git bisect we can
track down which patch introduced this bug. But if it never worked, git bisect
won't help.

You may still want to test with 5.14 as it was released today.

If that does not work then I cannot help more. It is still very
important to know if it is also broken in 5.14. Hopefully someone will be
able to help you. 

  Argillander

> [ 9605.269927] Call Trace:
> [ 9605.269931]  core_link_enable_stream+0x746/0x870 [amdgpu]
> [ 9605.270038]  dce110_apply_ctx_to_hw+0x519/0x560 [amdgpu]
> [ 9605.270146]  dc_commit_state+0x2f6/0xa50 [amdgpu]
> [ 9605.270249]  amdgpu_dm_atomic_commit_tail+0x569/0x26a0 [amdgpu]
> [ 9605.270326]  ? kfree+0xc3/0x460
> [ 9605.270329]  ? dcn30_validate_bandwidth+0x11f/0x270 [amdgpu]
> [ 9605.270402]  ? dcn30_validate_bandwidth+0x11f/0x270 [amdgpu]
> [ 9605.270469]  ? dm_plane_helper_prepare_fb+0x19c/0x250 [amdgpu]
> [ 9605.270542]  ? __cond_resched+0x16/0x40
> [ 9605.270544]  ? __wait_for_common+0x3b/0x160
> [ 9605.270545]  ? __raw_callee_save___native_queued_spin_unlock+0x11/0x1e
> [ 9605.270548]  commit_tail+0x94/0x130 [drm_kms_helper]
> [ 9605.270557]  drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper]
> [ 9605.270562]  drm_atomic_helper_set_config+0x70/0xb0 [drm_kms_helper]
> [ 9605.270568]  drm_mode_setcrtc+0x1d3/0x6d0 [drm]
> [ 9605.270582]  ? drm_mode_getcrtc+0x180/0x180 [drm]
> [ 9605.270590]  drm_ioctl_kernel+0xaa/0xf0 [drm]
> [ 9605.270600]  drm_ioctl+0x220/0x3c0 [drm]
> [ 9605.270609]  ? drm_mode_getcrtc+0x180/0x180 [drm]
> [ 9605.270618]  amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
> [ 9605.270673]  __x64_sys_ioctl+0x83/0xb0
> [ 9605.270675]  do_syscall_64+0x40/0xb0
> [ 9605.270677]  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> 
> On Sun, Aug 29, 2021, at 20:34, Kari Argillander wrote:
> > On Sun, Aug 29, 2021 at 06:38:39PM +0300, Skyler Mäntysaari wrote:
> > > Hello everyone on the list,
> > 
> > This is a universal kernel list and it is not read by many. I have added
> > the hopefully right list (amd-gfx@lists.freedesktop.org).
> > 
> > > Subject: Re: [drm/amdgpu] Driver crashes on 5.13.9 kernel
> > 
> > I have no influence or knowledge about this driver, but I still try to
> > help because it seems a good bug report. Have you tested with 5.13.13 or
> > 5.14-rc7? Does this work with some other kernel? Can you git
> > bisect if needed? You will probably get some support for it.
> > 
> > Argillander
> > 
> > > I thought that this should probably be discussed here, so I came
> > > across a weird issue, which is the driver crashing while trying to get
> > > one of my monitors working on Gentoo. I would like to ask here how
> > > it can happen that the display appears to jump from DisplayPort-6
> > > (physical port) to DisplayPort-7 (which doesn't exist physically)? Has
> > > anyone else experienced this?
> > > 
> > > It seems that the driver sees a rather large number of inputs for the
> > > GPU, even though I only have 4, 3 of which are DisplayPort, and the
> > > problem monitor is also on DisplayPort. 
> > > 
> > > Hardware:
> > > CPU: AMD Ryzen 5800X
> > > GPU: AMD Radeon RX 6800
> > > System Memory: 32GB of DDR4 3200Mhz
> > > Display(s): BenQ Zowie XL2430 (1080p), DELL U2414H (1080p), DELL U2415 
> > > (1920x1200)
> > > Type of Diplay Connection: All are connected via Display-Port
> > > 
> > > Related DRM issue:
> > > https://gitlab.freedesktop.org/drm/amd/-/issues/1621 which includes
> > > logs too.
> > > 
> > > 
> > > Best regards,
> > > Skyler Mäntysaari


Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Daniel Vetter
On Tue, Aug 31, 2021 at 02:59:02PM +0200, Daniel Vetter wrote:
> Can we please have some actual commit message here, with detailed
> explanation of the race/bug/whatever, how you fix it and why this is the
> best option?
> 
> On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote:
> > tested-by: jingwen chen 
> > Signed-off-by: Monk Liu 
> > Signed-off-by: jingwen chen 
> > ---
> >  drivers/gpu/drm/scheduler/sched_main.c | 24 
> >  1 file changed, 4 insertions(+), 20 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> > b/drivers/gpu/drm/scheduler/sched_main.c
> > index ecf8140..894fdb24 100644
> > --- a/drivers/gpu/drm/scheduler/sched_main.c
> > +++ b/drivers/gpu/drm/scheduler/sched_main.c
> > @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
> > *work)
> > sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
> >  
> > /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> > +   if (!__kthread_should_park(sched->thread))
> 
> This is a __ function, i.e. considered internal, and it's lockless atomic,
> i.e. unordered. And you're not explaining why this works.
> 
> Iow it's probably buggy, and just unconditionally parking the kthread
> is probably the right thing to do. If it's not the right thing to do,
> there's a bug here for sure.

Also why don't we reuse the function drivers already have to stop a
scheduler thread? We seem to have two kthread_park now, that's probably
one too many.
-Daniel
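
A sketch of that reuse from a driver's ->timedout_job() callback, with the
hardware reset itself elided:

#include <drm/gpu_scheduler.h>

static enum drm_gpu_sched_stat
example_timedout_job(struct drm_sched_job *bad)
{
        struct drm_gpu_scheduler *sched = bad->sched;

        drm_sched_stop(sched, bad);     /* parks the scheduler thread */
        /* ... reset the hardware ... */
        drm_sched_resubmit_jobs(sched);
        drm_sched_start(sched, true);   /* unparks the thread */

        return DRM_GPU_SCHED_STAT_NOMINAL;
}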

> > +   kthread_park(sched->thread);
> > +
> > spin_lock(&sched->job_list_lock);
> > job = list_first_entry_or_null(&sched->pending_list,
> >struct drm_sched_job, list);
> >  
> > if (job) {
> > -   /*
> > -* Remove the bad job so it cannot be freed by concurrent
> > -* drm_sched_cleanup_jobs. It will be reinserted back after 
> > sched->thread
> > -* is parked at which point it's safe.
> > -*/
> > -   list_del_init(&job->list);
> > spin_unlock(&sched->job_list_lock);
> >  
> > +   /* vendor's timeout_job should call drm_sched_start() */
> > status = job->sched->ops->timedout_job(job);
> >  
> > /*
> > @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> > struct drm_sched_job *bad)
> > kthread_park(sched->thread);
> >  
> > /*
> > -* Reinsert back the bad job here - now it's safe as
> > -* drm_sched_get_cleanup_job cannot race against us and release the
> > -* bad job at this point - we parked (waited for) any in progress
> > -* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> > -* now until the scheduler thread is unparked.
> > -*/
> > -   if (bad && bad->sched == sched)
> > -   /*
> > -* Add at the head of the queue to reflect it was the earliest
> > -* job extracted.
> > -*/
> > -   list_add(&bad->list, &sched->pending_list);
> > -
> > -   /*
> >  * Iterate the job list from later to earlier one and either deactivate
> >  * their HW callbacks or remove them from pending list if they already
> >  * signaled.
> > -- 
> > 2.7.4
> > 
> 
> -- 
> Daniel Vetter
> Software Engineer, Intel Corporation
> http://blog.ffwll.ch

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


Re: [PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Daniel Vetter
Can we please have some actual commit message here, with detailed
explanation of the race/bug/whatever, how you fix it and why this is the
best option?

On Tue, Aug 31, 2021 at 06:35:39PM +0800, Monk Liu wrote:
> tested-by: jingwen chen 
> Signed-off-by: Monk Liu 
> Signed-off-by: jingwen chen 
> ---
>  drivers/gpu/drm/scheduler/sched_main.c | 24 
>  1 file changed, 4 insertions(+), 20 deletions(-)
> 
> diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
> b/drivers/gpu/drm/scheduler/sched_main.c
> index ecf8140..894fdb24 100644
> --- a/drivers/gpu/drm/scheduler/sched_main.c
> +++ b/drivers/gpu/drm/scheduler/sched_main.c
> @@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
> *work)
>   sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
>  
>   /* Protects against concurrent deletion in drm_sched_get_cleanup_job */
> + if (!__kthread_should_park(sched->thread))

This is a __ function, i.e. considered internal, and it's lockless atomic,
i.e. unordered. And you're not explaining why this works.

Iow it's probably buggy, and just unconditionally parking the kthread
is probably the right thing to do. If it's not the right thing to do,
there's a bug here for sure.
-Daniel

> + kthread_park(sched->thread);
> +
>   spin_lock(&sched->job_list_lock);
>   job = list_first_entry_or_null(&sched->pending_list,
>  struct drm_sched_job, list);
>  
>   if (job) {
> - /*
> -  * Remove the bad job so it cannot be freed by concurrent
> -  * drm_sched_cleanup_jobs. It will be reinserted back after 
> sched->thread
> -  * is parked at which point it's safe.
> -  */
> - list_del_init(&job->list);
>   spin_unlock(&sched->job_list_lock);
>  
> + /* vendor's timeout_job should call drm_sched_start() */
>   status = job->sched->ops->timedout_job(job);
>  
>   /*
> @@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
> struct drm_sched_job *bad)
>   kthread_park(sched->thread);
>  
>   /*
> -  * Reinsert back the bad job here - now it's safe as
> -  * drm_sched_get_cleanup_job cannot race against us and release the
> -  * bad job at this point - we parked (waited for) any in progress
> -  * (earlier) cleanups and drm_sched_get_cleanup_job will not be called
> -  * now until the scheduler thread is unparked.
> -  */
> - if (bad && bad->sched == sched)
> - /*
> -  * Add at the head of the queue to reflect it was the earliest
> -  * job extracted.
> -  */
> - list_add(&bad->list, &sched->pending_list);
> -
> - /*
>* Iterate the job list from later to earlier one and either deactivate
>* their HW callbacks or remove them from pending list if they already
>* signaled.
> -- 
> 2.7.4
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH 2/2] drm/sched: serialize job_timeout and scheduler

2021-08-31 Thread Monk Liu
tested-by: jingwen chen 
Signed-off-by: Monk Liu 
Signed-off-by: jingwen chen 
---
 drivers/gpu/drm/scheduler/sched_main.c | 24 
 1 file changed, 4 insertions(+), 20 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index ecf8140..894fdb24 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -319,19 +319,17 @@ static void drm_sched_job_timedout(struct work_struct 
*work)
sched = container_of(work, struct drm_gpu_scheduler, work_tdr.work);
 
/* Protects against concurrent deletion in drm_sched_get_cleanup_job */
+   if (!__kthread_should_park(sched->thread))
+   kthread_park(sched->thread);
+
spin_lock(&sched->job_list_lock);
job = list_first_entry_or_null(&sched->pending_list,
   struct drm_sched_job, list);
 
if (job) {
-   /*
-* Remove the bad job so it cannot be freed by concurrent
-* drm_sched_cleanup_jobs. It will be reinserted back after 
sched->thread
-* is parked at which point it's safe.
-*/
-   list_del_init(&job->list);
spin_unlock(&sched->job_list_lock);
 
+   /* vendor's timeout_job should call drm_sched_start() */
status = job->sched->ops->timedout_job(job);
 
/*
@@ -393,20 +391,6 @@ void drm_sched_stop(struct drm_gpu_scheduler *sched, 
struct drm_sched_job *bad)
kthread_park(sched->thread);
 
/*
-* Reinsert back the bad job here - now it's safe as
-* drm_sched_get_cleanup_job cannot race against us and release the
-* bad job at this point - we parked (waited for) any in progress
-* (earlier) cleanups and drm_sched_get_cleanup_job will not be called
-* now until the scheduler thread is unparked.
-*/
-   if (bad && bad->sched == sched)
-   /*
-* Add at the head of the queue to reflect it was the earliest
-* job extracted.
-*/
-   list_add(&bad->list, &sched->pending_list);
-
-   /*
 * Iterate the job list from later to earlier one and either deactivate
 * their HW callbacks or remove them from pending list if they already
 * signaled.
-- 
2.7.4



[PATCH 1/2] drm/sched: fix the bug of time out calculation(v3)

2021-08-31 Thread Monk Liu
issue:
in cleanup_job the cancel_delayed_work will cancel a TO timer
even though its corresponding job is still running.

fix:
do not cancel the timer in cleanup_job, instead do the cancelling
only when the head job is signaled, and if there is a "next" job
we start_timeout again.

v2:
further cleanup the logic, and do the TDR timer cancelling if the signaled job
is the last one in its scheduler.

v3:
change the issue description
remove the cancel_delayed_work in the beginning of cleanup_job
restore the implementation of drm_sched_job_begin.

TODO:
1)introduce pause/resume scheduler in job_timeout to serialize the handling
of scheduler and job_timeout.
2)drop the bad job's del and insert in scheduler due to above serialization
(no race issue anymore with the serialization)

tested-by: jingwen 
Signed-off-by: Monk Liu 
---
 drivers/gpu/drm/scheduler/sched_main.c | 25 ++---
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_main.c 
b/drivers/gpu/drm/scheduler/sched_main.c
index a2a9536..ecf8140 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -676,13 +676,7 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
 {
struct drm_sched_job *job, *next;
 
-   /*
-* Don't destroy jobs while the timeout worker is running  OR thread
-* is being parked and hence assumed to not touch pending_list
-*/
-   if ((sched->timeout != MAX_SCHEDULE_TIMEOUT &&
-   !cancel_delayed_work(&sched->work_tdr)) ||
-   kthread_should_park())
+   if (kthread_should_park())
return NULL;
 
spin_lock(&sched->job_list_lock);
@@ -693,17 +687,21 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
if (job && dma_fence_is_signaled(&job->s_fence->finished)) {
/* remove job from pending_list */
list_del_init(&job->list);
+
+   /* cancel this job's TO timer */
+   cancel_delayed_work(&sched->work_tdr);
/* make the scheduled timestamp more accurate */
next = list_first_entry_or_null(&sched->pending_list,
typeof(*next), list);
-   if (next)
+
+   if (next) {
next->s_fence->scheduled.timestamp =
job->s_fence->finished.timestamp;
-
+   /* start TO timer for next job */
+   drm_sched_start_timeout(sched);
+   }
} else {
job = NULL;
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
}
 
spin_unlock(&sched->job_list_lock);
@@ -791,11 +789,8 @@ static int drm_sched_main(void *param)
  (entity = 
drm_sched_select_entity(sched))) ||
 kthread_should_stop());
 
-   if (cleanup_job) {
+   if (cleanup_job)
sched->ops->free_job(cleanup_job);
-   /* queue timeout for next job */
-   drm_sched_start_timeout(sched);
-   }
 
if (!entity)
continue;
-- 
2.7.4



[PATCH] drm/amdgpu: Clear RAS interrupt status on aldebaran

2021-08-31 Thread Clements, John
[AMD Official Use Only]

Submitting a patch to resolve incorrect register addresses on Aldebaran affecting 
RAS interrupt handling


0001-drm-amdgpu-Clear-RAS-interrupt-status-on-aldebaran.patch
Description: 0001-drm-amdgpu-Clear-RAS-interrupt-status-on-aldebaran.patch


Re: [PATCH v3] drm/amdgpu: Fix a deadlock if previous GEM object allocation fails

2021-08-31 Thread Christian König

On 31.08.21 at 09:08, Pan, Xinhui wrote:

Fall through to handle the error instead of returning.

Fixes: f8aab60422c37 ("drm/amdgpu: Initialise drm_gem_object_funcs for
imported BOs")
Cc: sta...@vger.kernel.org
Signed-off-by: xinhui pan 


Reviewed-by: Christian König 


---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 23 ++-
  1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 85b292ed5c43..9e2525b96d04 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -341,21 +341,18 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
*data,
r = amdgpu_gem_object_create(adev, size, args->in.alignment,
 initial_domain,
flags, ttm_bo_type_device, resv, &gobj);
-   if (r) {
-   if (r != -ERESTARTSYS) {
-   if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
-   flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
-   goto retry;
-   }
+   if (r && r != -ERESTARTSYS) {
+   if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
+   flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+   goto retry;
+   }
  
-			if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {

-   initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
-   goto retry;
-   }
-   DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, 
%d)\n",
- size, initial_domain, args->in.alignment, r);
+   if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
+   initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
+   goto retry;
}
-   return r;
+   DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, 
%d)\n",
+   size, initial_domain, args->in.alignment, r);
}
  
  	if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {




[PATCH v3] drm/amdgpu: Fix a deadlock if previous GEM object allocation fails

2021-08-31 Thread Pan, Xinhui
Fall through to handle the error instead of returning.

Fixes: f8aab60422c37 ("drm/amdgpu: Initialise drm_gem_object_funcs for
imported BOs")
Cc: sta...@vger.kernel.org
Signed-off-by: xinhui pan 
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 23 ++-
 1 file changed, 10 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 85b292ed5c43..9e2525b96d04 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -341,21 +341,18 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
*data,
r = amdgpu_gem_object_create(adev, size, args->in.alignment,
 initial_domain,
flags, ttm_bo_type_device, resv, &gobj);
-   if (r) {
-   if (r != -ERESTARTSYS) {
-   if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
-   flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
-   goto retry;
-   }
+   if (r && r != -ERESTARTSYS) {
+   if (flags & AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED) {
+   flags &= ~AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED;
+   goto retry;
+   }
 
-   if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
-   initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
-   goto retry;
-   }
-   DRM_DEBUG("Failed to allocate GEM object (%llu, %d, 
%llu, %d)\n",
- size, initial_domain, args->in.alignment, r);
+   if (initial_domain == AMDGPU_GEM_DOMAIN_VRAM) {
+   initial_domain |= AMDGPU_GEM_DOMAIN_GTT;
+   goto retry;
}
-   return r;
+   DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, 
%d)\n",
+   size, initial_domain, args->in.alignment, r);
}
 
if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
-- 
2.25.1




RE: [PATCH] drm/amdgpu: Clear RAS interrupt status on aldebaran

2021-08-31 Thread Li, Candice
[AMD Official Use Only]

Reviewed-by: Candice Li <candice...@amd.com>



Thanks,
Candice

From: Clements, John 
Sent: Tuesday, August 31, 2021 2:27 PM
To: amd-gfx@lists.freedesktop.org
Cc: Li, Candice 
Subject: [PATCH] drm/amdgpu: Clear RAS interrupt status on aldebaran


[AMD Official Use Only]

Submitting a patch to resolve incorrect register addresses on Aldebaran affecting 
RAS interrupt handling


Re: [PATCH v2] drm/amdgpu: Fix a deadlock if previous GEM object allocation fails

2021-08-31 Thread Christian König

On 31.08.21 at 07:55, Pan, Xinhui wrote:

Fall through to handle the error instead of returning.


Can you also clean up the "if (r) {.. if (r !=..." above?

And please figure out which patch introduced the problem and add a 
Fixes: tag and maybe CC: stable as well.


Thanks for the help,
Christian.



Signed-off-by: xinhui pan 
---
  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 1 -
  1 file changed, 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 85b292ed5c43..7ddd429052ea 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -355,7 +355,6 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
*data,
DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, 
%d)\n",
  size, initial_domain, args->in.alignment, r);
}
-   return r;
}
  
  	if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {




Re: [PATCH] drm/amdgpu: Fix a deadlock if previous GEM object allocation fails

2021-08-31 Thread Christian König




On 31.08.21 at 07:47, Pan, Xinhui wrote:


On 2021/8/31 13:38, "Pan, Xinhui" wrote:

 
 
On 2021/8/31 12:03, "Grodzovsky, Andrey" wrote:
 
 
 On 2021-08-30 11:24 p.m., Pan, Xinhui wrote:

 > [AMD Official Use Only]
 >
 > Unreserve the root BO before returning, otherwise the next allocation will
deadlock.
 >
 > Signed-off-by: xinhui pan 
 > ---
 >   drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 11 +--
 >   1 file changed, 5 insertions(+), 6 deletions(-)
 >
 > diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
 > index 85b292ed5c43..c9db7d2c5816 100644
 > --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
 > +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
 > @@ -355,19 +355,18 @@ int amdgpu_gem_create_ioctl(struct drm_device 
*dev, void *data,
 >  DRM_DEBUG("Failed to allocate GEM object (%llu, 
%d, %llu, %d)\n",
 >size, initial_domain, 
args->in.alignment, r);
 >  }
 > +
 > +   if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID)
 > +   amdgpu_bo_unreserve(vm->root.bo);
 >  return r;
 >  }
 >
 >  if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
 > -   if (!r) {
 > -   struct amdgpu_bo *abo = 
gem_to_amdgpu_bo(gobj);
 > +   struct amdgpu_bo *abo = gem_to_amdgpu_bo(gobj);
 >
 > -   abo->parent = amdgpu_bo_ref(vm->root.bo);
 > -   }
 > +   abo->parent = amdgpu_bo_ref(vm->root.bo);
 >  amdgpu_bo_unreserve(vm->root.bo);
 >  }
 > -   if (r)
 > -   return r;
 
 
The above early return seems to be needed for the -ERESTARTSYS case.
 
 Andrey
 
There are two returns. -ERESTARTSYS and other errors after retry are already handled by the first return inside the if {} block, so the second return is not needed.
 
 Thanks

 Xinhui

Also we can do something like below which is simpler.


Yeah, I think that would be better. We could also change the "if (r) { 
if (r != -ERESTARTSYS) {" into just "if (r != -ERESTARTSYS)" then.


But good catch anyway.

Regards,
Christian.


@@ -355,7 +355,6 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void 
*data,
 DRM_DEBUG("Failed to allocate GEM object (%llu, %d, %llu, 
%d)\n",
   size, initial_domain, args->in.alignment, r);
 }
-   return r;
 }
  
 if (flags & AMDGPU_GEM_CREATE_VM_ALWAYS_VALID) {
  
 
 >

 >  r = drm_gem_handle_create(filp, gobj, &handle);
 >  /* drop reference from allocate - handle holds it now */
 > --
 > 2.25.1