On Tue, 18 Aug 2009 08:08:24 +0200 Thomas Hellström <tho...@shipmail.org> wrote:
> Kristian Høgsberg wrote: > > 2009/8/17 Thomas Hellström <tho...@shipmail.org>: > > > >> Kristian Høgsberg wrote: > >> > >>> This patch adds a vblank synced pageflip ioctl for to the > >>> modesetting family of ioctls. The ioctl takes a crtc and an fb > >>> and schedules a pageflip to the new fb at the next coming > >>> vertical blank event. This feature lets userspace implement > >>> tear-free updating of the screen contents with hw-guaranteed low > >>> latency page flipping. > >>> > >>> The ioctl is asynchronous in that it returns immediately and then > >>> later notifies the client by making an event available for > >>> reading on the drm fd. This lets applications add the drm fd to > >>> their main loop and handle other tasks while waiting for the flip > >>> to happen. The event includes the time of the flip, the frame > >>> counter and a 64 bit opaque token provided by user space in the > >>> ioctl. > >>> > >>> Based on work and suggestions from > >>> Jesse Barnes <jbar...@virtuousgeek.org>, > >>> Jakob Bornecrantz <wallbra...@gmail.com>, > >>> Chris Wilson <ch...@chris-wilson.co.uk> > >>> > >>> Signed-off-by: Kristian Høgsberg <k...@redhat.com> > >>> Signed-off-by: Jesse Barnes <jbar...@virtuousgeek.org> > >>> --- > >>> > >>> Ok, another version of this patch. This one has fixes to work > >>> with radeon kms plus a missing list head init that would cause an > >>> oops in the case where we schedule a flip before the previous one > >>> has been queued. > >>> > >>> I'm now ready to propose this patch for the 2.6.32 merge window. > >>> > >>> > >>> > >> Hi! > >> First a general question: There is some hardware (for example > >> Unichromes and Chrome9) that can schedule > >> page-flips in the command stream after optional vblank barriers. > >> For this kind of hardware the pageflip would be a matter of > >> scheduling the flip and fence the old and new buffers. No need > >> for delayed unpins and explicit user-space notifications, although > >> the workqueue could be handy to avoid command processing stalls in > >> the vblank barriers. How would such a scheme fit into the > >> framework below? > > > > By sending an event back on the file descriptor we allow users of > > the API to wait on the flip to finish in a standard poll or select > > mainloop, where it can handle input from other sources while > > waiting. If you rely on fences, the application will block when it > > tries to access the buffers protected by the fence, and is unable > > to handle input from other sources while it's blocking. > > > > Even with the vsync barrier in the command stream (which intel hw > > also supports) you do need to keep the current front buffer pinned > > for the remainder of the frame, and then you need notification when > > the swap has happened so that you can unpin the old buffer. I'm > > sure Unichrome has a command to send an interrupt, which can be > > queued after the vsync barrier to run the workqueue and trigger > > this cleanup as well as sending the userspace notification. > > > > > >>> Kristian > >>> > >>> drivers/gpu/drm/drm_crtc.c | 170 > >>> ++++++++++++++++++++++++++++++- > >>> drivers/gpu/drm/drm_crtc_helper.c | 12 ++ > >>> drivers/gpu/drm/drm_drv.c | 1 + > >>> drivers/gpu/drm/drm_fops.c | 68 ++++++++++++- > >>> drivers/gpu/drm/drm_irq.c | 43 ++++++++ > >>> drivers/gpu/drm/i915/i915_drv.c | 1 + > >>> drivers/gpu/drm/i915/intel_display.c | 24 +++-- > >>> drivers/gpu/drm/radeon/radeon_display.c | 3 +- > >>> include/drm/drm.h | 25 +++++ > >>> include/drm/drmP.h | 32 ++++++ > >>> include/drm/drm_crtc.h | 27 +++++ > >>> include/drm/drm_crtc_helper.h | 4 + > >>> include/drm/drm_mode.h | 16 +++ 13 files > >>> changed, 414 insertions(+), 12 deletions(-) > >>> > >>> diff --git a/drivers/gpu/drm/drm_crtc.c > >>> b/drivers/gpu/drm/drm_crtc.c index 8fab789..0906cb3 100644 > >>> --- a/drivers/gpu/drm/drm_crtc.c > >>> +++ b/drivers/gpu/drm/drm_crtc.c > >>> @@ -34,6 +34,8 @@ > >>> #include "drmP.h" > >>> #include "drm_crtc.h" > >>> > >>> +#undef set_base > >>> + > >>> struct drm_prop_enum_list { > >>> int type; > >>> char *name; > >>> @@ -342,6 +344,34 @@ void drm_framebuffer_cleanup(struct > >>> drm_framebuffer *fb) EXPORT_SYMBOL(drm_framebuffer_cleanup); > >>> > >>> /** > >>> + * drm_crtc_async_flip - do a set_base call from a work queue > >>> + * @work: work struct > >>> + * > >>> + * Called when a set_base call is queued by the page flip code. > >>> This > >>> + * allows the flip ioctl itself to return immediately and allow > >>> userspace > >>> + * to continue working. > >>> + */ > >>> +static void drm_crtc_async_flip(struct work_struct *work) > >>> +{ > >>> + struct drm_crtc *crtc = container_of(work, struct drm_crtc, > >>> async_flip); > >>> + struct drm_device *dev = crtc->dev; > >>> + struct drm_pending_flip *pending; > >>> + > >>> + BUG_ON(crtc->pending_flip == NULL); > >>> + > >>> + mutex_lock(&dev->struct_mutex); > >>> > >>> > >> Here the struct mutex is taken before a call to a function that > >> doesn't even have dev as an argument. > >> > > > > It's a work queue callback, there's no way we can change that to > > include a dev argument :) > > > > > >> Exactly what in the generic modesetting code is protected by the > >> struct mutex here? > >> > > > > I'm a little fuzzy on the details, this part of the patch was > > carried over from Jesse's original work. I believe set_base is > > supposed to be called with the struct mutex held. > > > > > > We shouldn't be this sloppy about locking, and in particular we > shouldn't protect a callback with a lock without knowing why. A lock > should protect data, not serialize calls to functions, and it should > ideally be documented somewhere. Why does set_base need the _caller_ > to take the struct_mutex lock? Are there more driver callbacks that > need the struct_mutex held? Yes; set_base is called from several places. The struct_mutex protect all the mode setting related fields & lists, so if we're going to change the fb (or any other setting) we need to hold it. Pushing the locking for it out to callers was necessary because in the flip ioctl, we already hold the lock. And we don't want to drop it and re-acquire since that opens up a big window where the config could change out from under us. > I mean if I implement a modesetting driver that takes care to use > device private locks not visible to the caller to avoid polluting the > struct_mutex with possible contention and lockdep errors as a result, > I'd still find the generic code taking the struct_mutex for me becase > it assumes that I need it? This can't be right and should go away. To do that you'd need to avoid all the code in drm_crtc, drm_crtc_helper and probably elsewhere. struct_mutex is our "big lock" at the moment. So far it hasn't been a big performance issue afaik. -- Jesse Barnes, Intel Open Source Technology Center ------------------------------------------------------------------------------ Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july -- _______________________________________________ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel