Hi Christian,

On Friday, 2018-08-03 at 15:51 +0200, Christian König wrote:
> Hi Lucas,
> 
> thanks a lot for taking care of that, but there is one thing you have 
> missed:
> 
> It is perfectly possible that the job is the last one on the list, and 
> list_next_entry() doesn't test for that, i.e. it never returns NULL.

Urgh, right. For some reason I was thinking I would get the same
sched_job back in that case, hence the "next != s_job" in the condition
below. But of course the next pointer of the last entry points to the
list head in the scheduler, so list_next_entry() hands back a bogus
job there. Must be the temperature in here causing bit flips in my
brain. ;)
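
The fix should just be a list_is_last() check before looking at the
next entry. Something along these lines (untested sketch, the updated
patch may end up looking slightly different):

	if (sched->timeout != MAX_SCHEDULE_TIMEOUT &&
	    !list_is_last(&s_job->node, &sched->ring_mirror_list)) {
		struct drm_sched_job *next = list_next_entry(s_job, node);

		/* queue TDR for the next job, but only if it hasn't
		 * finished yet; a signaled job doesn't need a timeout
		 * handler anymore */
		if (!dma_fence_is_signaled(&next->s_fence->finished))
			schedule_delayed_work(&next->work_tdr, sched->timeout);
	}

list_is_last() simply tests node->next == head, which is exactly the
condition list_next_entry() can't report through its return value.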

Updated patch coming up in a few minutes.

> BTW: There are also quite a lot of other things we could optimize here. 
> So if you need something to do, just ping me :)

Hehe, I think we are all in the same boat here: too much stuff on our
plates.

Regards,
Lucas

> Cheers,
> Christian.
> 
> On 03.08.2018 at 15:18, Lucas Stach wrote:
> > drm_sched_job_finish() is a work item scheduled for each finished job on
> > an unbound system workqueue. This means the workers can execute out of order
> > with regard to the real hardware job completions.
> > 
> > If this happens, queueing a timeout worker for the first job on the ring
> > mirror list is wrong, as this job may have already finished executing.
> > Fix this by reorganizing the code to always queue the worker for the next
> > job on the list, if that job hasn't finished yet. This is robust against
> > a potential reordering of the finish workers.
> > 
> > Also move the timeout worker cancelling out, so that we don't need to
> > take the job list lock twice. As a small optimization, list_del() is used
> > to remove the job from the ring mirror list, as there is no need to
> > reinit the list head in the job we are about to free.
> > 
> > Signed-off-by: Lucas Stach <l.st...@pengutronix.de>
> > ---
> >  drivers/gpu/drm/scheduler/gpu_scheduler.c | 20 +++++++++-----------
> >  1 file changed, 9 insertions(+), 11 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/scheduler/gpu_scheduler.c b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > index 44d480768dfe..0be2859d7b80 100644
> > --- a/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > +++ b/drivers/gpu/drm/scheduler/gpu_scheduler.c
> > @@ -452,24 +452,22 @@ static void drm_sched_job_finish(struct work_struct *work)
> >  						   finish_work);
> >  	struct drm_gpu_scheduler *sched = s_job->sched;
> >  
> > -	/* remove job from ring_mirror_list */
> > +	if (sched->timeout != MAX_SCHEDULE_TIMEOUT)
> > +		cancel_delayed_work_sync(&s_job->work_tdr);
> > +
> >  	spin_lock(&sched->job_list_lock);
> > -	list_del_init(&s_job->node);
> >  	if (sched->timeout != MAX_SCHEDULE_TIMEOUT) {
> > -		struct drm_sched_job *next;
> > -
> > -		spin_unlock(&sched->job_list_lock);
> > -		cancel_delayed_work_sync(&s_job->work_tdr);
> > -		spin_lock(&sched->job_list_lock);
> > +		struct drm_sched_job *next = list_next_entry(s_job, node);
> >  
> >  		/* queue TDR for next job */
> > -		next = list_first_entry_or_null(&sched->ring_mirror_list,
> > -						struct drm_sched_job, node);
> > -
> > -		if (next)
> > +		if (next != s_job &&
> > +		    !dma_fence_is_signaled(&s_job->s_fence->finished))
> >  			schedule_delayed_work(&next->work_tdr, sched->timeout);
> >  	}
> > +	/* remove job from ring_mirror_list */
> > +	list_del(&s_job->node);
> >  	spin_unlock(&sched->job_list_lock);
> > +
> >  	dma_fence_put(&s_job->s_fence->finished);
> >  	sched->ops->free_job(s_job);
> >  }
> 
> 
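
P.S.: Regarding the list_del() vs. list_del_init() point in the commit
message: the difference is just whether the node is still usable after
unlinking. Simplified from include/linux/list.h (debug checks omitted):

	static inline void list_del(struct list_head *entry)
	{
		__list_del_entry(entry);
		/* poison the pointers, the entry must not be reused */
		entry->next = LIST_POISON1;
		entry->prev = LIST_POISON2;
	}

	static inline void list_del_init(struct list_head *entry)
	{
		__list_del_entry(entry);
		/* reinitialize, so the entry can be linked in again */
		INIT_LIST_HEAD(entry);
	}

As s_job is freed right after being unlinked, reinitializing its list
head would be pure overhead.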