On Thu, 1 Nov 2012 16:29:35 +0100
Daniel Vetter <[email protected]> wrote:

> On Thu, Nov 01, 2012 at 03:18:46PM +0000, Chris Wilson wrote:
> > On Thu, 1 Nov 2012 08:07:59 -0700, Jesse Barnes <[email protected]> 
> > wrote:
> > > On Thu,  1 Nov 2012 09:26:26 +0000
> > > Chris Wilson <[email protected]> wrote:
> > > 
> > > > If we accumulate unpin tasks because we are pageflipping faster than the
> > > > system can schedule its workers, we can effectively create a
> > > > pin-leak. The solution taken here is to limit the number of unpin tasks
> > > > we have per-crtc and to flush those outstanding tasks if we accumulate
> > > > too many. This should prevent any jitter in the normal case, and also
> > > > prevent the hang if we should run too fast.
> > > > 
> > > > Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=46991
> > > > Reported-and-tested-by: Tvrtko Ursulin <[email protected]>
> > > > Signed-off-by: Chris Wilson <[email protected]>
> > > > ---
> > > >  drivers/gpu/drm/i915/intel_display.c |   22 ++++++++++++++++------
> > > >  drivers/gpu/drm/i915/intel_drv.h     |    4 +++-
> > > >  2 files changed, 19 insertions(+), 7 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/i915/intel_display.c 
> > > > b/drivers/gpu/drm/i915/intel_display.c
> > > > index 69b1739..800b195 100644
> > > > --- a/drivers/gpu/drm/i915/intel_display.c
> > > > +++ b/drivers/gpu/drm/i915/intel_display.c
> > > > @@ -6908,14 +6908,19 @@ static void intel_unpin_work_fn(struct 
> > > > work_struct *__work)
> > > >  {
> > > >         struct intel_unpin_work *work =
> > > >                 container_of(__work, struct intel_unpin_work, work);
> > > > +       struct drm_device *dev = work->crtc->dev;
> > > >  
> > > > -       mutex_lock(&work->dev->struct_mutex);
> > > > +       mutex_lock(&dev->struct_mutex);
> > > >         intel_unpin_fb_obj(work->old_fb_obj);
> > > >         drm_gem_object_unreference(&work->pending_flip_obj->base);
> > > >         drm_gem_object_unreference(&work->old_fb_obj->base);
> > > >  
> > > > -       intel_update_fbc(work->dev);
> > > > -       mutex_unlock(&work->dev->struct_mutex);
> > > > +       intel_update_fbc(dev);
> > > > +       mutex_unlock(&dev->struct_mutex);
> > > > +
> > > > +       
> > > > BUG_ON(atomic_read(&to_intel_crtc(work->crtc)->unpin_work_count) == 0);
> > > > +       atomic_dec(&to_intel_crtc(work->crtc)->unpin_work_count);
> > > > +
> > > >         kfree(work);
> > > >  }
> > > >  
> > > > @@ -6963,9 +6968,9 @@ static void do_intel_finish_page_flip(struct 
> > > > drm_device *dev,
> > > >  
> > > >         atomic_clear_mask(1 << intel_crtc->plane,
> > > >                           &obj->pending_flip.counter);
> > > > -
> > > >         wake_up(&dev_priv->pending_flip_queue);
> > > > -       schedule_work(&work->work);
> > > > +
> > > > +       queue_work(dev_priv->wq, &work->work);
> > > >  
> > > >         trace_i915_flip_complete(intel_crtc->plane, 
> > > > work->pending_flip_obj);
> > > >  }
> > > > @@ -7266,7 +7271,7 @@ static int intel_crtc_page_flip(struct drm_crtc 
> > > > *crtc,
> > > >                 return -ENOMEM;
> > > >  
> > > >         work->event = event;
> > > > -       work->dev = crtc->dev;
> > > > +       work->crtc = crtc;
> > > >         intel_fb = to_intel_framebuffer(crtc->fb);
> > > >         work->old_fb_obj = intel_fb->obj;
> > > >         INIT_WORK(&work->work, intel_unpin_work_fn);
> > > > @@ -7291,6 +7296,9 @@ static int intel_crtc_page_flip(struct drm_crtc 
> > > > *crtc,
> > > >         intel_fb = to_intel_framebuffer(fb);
> > > >         obj = intel_fb->obj;
> > > >  
> > > > +       if (atomic_read(&intel_crtc->unpin_work_count) >= 2)
> > > > +               flush_workqueue(dev_priv->wq);
> > > > +
> > > 
> > > Have you by chance tested this with the async flip patch?  I wonder if
> > > in that case whether 2 is too small, and something like 100 might be
> > > better (though really async flips are for cases where we can't keep up
> > > with refresh, so a small number shouldn't hurt too much there either).
> > 
> > The limit on 2 is due to the limited resolution of pincount. Hence my
> > earlier fear for your async flip patch.
> 
> I think for asyn flips we simply need to have a real flip queue in our
> code, instead of abusing the implicit list in the workqueue code ...
> 
> One other thing is that with async flips we don't have a natural limit on
> the number of pinned framebuffers any more, which means we can easily
> exhaust all mappable GTT space. Hence we need to integrate that new,
> explicit flip queue into our eviction code, too.
> 
> For now I'm rather happy with the flush_wq ducttape presented here ;-)

Yeah I don't have a problem with it as long as we don't block when
queuing flips in real life. :)

-- 
Jesse Barnes, Intel Open Source Technology Center
_______________________________________________
Intel-gfx mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/intel-gfx

Reply via email to