On la, 2016-11-05 at 18:35 +0000, Chris Wilson wrote:
> On Sat, Nov 05, 2016 at 10:35:59AM +0200, Imre Deak wrote:
> > There is a small race where a new request can be submitted and retired
> > after the idle worker started to run which leads to idling the GPU too
> > early. Fix this by deferring the idling to the pending instance of the
> > worker.
> > 
> > This scenario was pointed out by Chris.
> > 
> > Cc: Chris Wilson <[email protected]>
> > Signed-off-by: Imre Deak <[email protected]>
> > ---
> >  drivers/gpu/drm/i915/i915_gem.c | 7 +++++++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c 
> > b/drivers/gpu/drm/i915/i915_gem.c
> > index 0dbf38c..36a16df 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -2766,6 +2766,13 @@ i915_gem_idle_work_handler(struct work_struct *work)
> >             goto out_rearm;
> >     }
> >  
> > +   /*
> > +    * New request retired after this work handler started, extend active
> > +    * period until next instance of the work.
> > +    */
> > +   if (work_pending(work))
> > +           goto out_unlock;
> 
> Ok, it took some digging around inside kernel/workqueue.c to come to
> agreement that this works.
> 
> WORK_STRUCT_PENDING_BIT is set when we queue the work and released just
> prior to execution of the work->func. A race on active_requests before
> we take the struct_mutex will leave this set.
> 
> Reviewed-by: Chris Wilson <[email protected]>
> 
> For bonus points
> 
> diff --git a/drivers/gpu/drm/i915/i915_gem_request.c 
> b/drivers/gpu/drm/i915/i915_gem_request.c
> index cfda095..5d349e0 100644
> --- a/drivers/gpu/drm/i915/i915_gem_request.c
> +++ b/drivers/gpu/drm/i915/i915_gem_request.c
> @@ -1146,7 +1146,7 @@ void i915_gem_retire_requests(struct drm_i915_private 
> *dev_priv)
>                 engine_retire_requests(engine);
>  
>         if (!dev_priv->gt.active_requests)
> -               queue_delayed_work(dev_priv->wq,
> -                                  &dev_priv->gt.idle_work,
> -                                  msecs_to_jiffies(100));
> +               mod_delayed_work(dev_priv->wq,
> +                               &dev_priv->gt.idle_work,
> +                               msecs_to_jiffies(100));
>  }

Yep, without this we could still do early idling and that scenario has
a bigger likelihood:/ But I think it's still a separate patch, I can
follow up with it. Good to know the difference between the above two
helpers!

--Imre
_______________________________________________
Intel-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to