Kristian Høgsberg wrote: > 2009/8/17 Thomas Hellström <tho...@shipmail.org>: > >> Kristian Høgsberg wrote: >> >>> This patch adds a vblank synced pageflip ioctl for to the modesetting >>> family of ioctls. The ioctl takes a crtc and an fb and schedules a >>> pageflip to the new fb at the next coming vertical blank event. This >>> feature lets userspace implement tear-free updating of the screen contents >>> with hw-guaranteed low latency page flipping. >>> >>> The ioctl is asynchronous in that it returns immediately and then later >>> notifies the client by making an event available for reading on the drm fd. >>> This lets applications add the drm fd to their main loop and handle other >>> tasks while waiting for the flip to happen. The event includes the time >>> of the flip, the frame counter and a 64 bit opaque token provided by >>> user space in the ioctl. >>> >>> Based on work and suggestions from >>> Jesse Barnes <jbar...@virtuousgeek.org>, >>> Jakob Bornecrantz <wallbra...@gmail.com>, >>> Chris Wilson <ch...@chris-wilson.co.uk> >>> >>> Signed-off-by: Kristian Høgsberg <k...@redhat.com> >>> Signed-off-by: Jesse Barnes <jbar...@virtuousgeek.org> >>> --- >>> >>> Ok, another version of this patch. This one has fixes to work with radeon >>> kms plus a missing list head init that would cause an oops in the case >>> where we schedule a flip before the previous one has been queued. >>> >>> I'm now ready to propose this patch for the 2.6.32 merge window. >>> >>> >>> >> Hi! >> First a general question: There is some hardware (for example Unichromes >> and Chrome9) that can schedule >> page-flips in the command stream after optional vblank barriers. For >> this kind of hardware the pageflip would be a matter of scheduling the >> flip and fence the old and new buffers. No need for delayed unpins and >> explicit user-space notifications, although the workqueue could be handy >> to avoid command processing stalls in the vblank barriers. How would >> such a scheme fit into the framework below? >> > > By sending an event back on the file descriptor we allow users of the > API to wait on the flip to finish in a standard poll or select > mainloop, where it can handle input from other sources while waiting. > If you rely on fences, the application will block when it tries to > access the buffers protected by the fence, and is unable to handle > input from other sources while it's blocking. > > Even with the vsync barrier in the command stream (which intel hw also > supports) you do need to keep the current front buffer pinned for the > remainder of the frame, and then you need notification when the swap > has happened so that you can unpin the old buffer. I'm sure Unichrome > has a command to send an interrupt, which can be queued after the > vsync barrier to run the workqueue and trigger this cleanup as well as > sending the userspace notification. > > >>> Kristian >>> >>> drivers/gpu/drm/drm_crtc.c | 170 >>> ++++++++++++++++++++++++++++++- >>> drivers/gpu/drm/drm_crtc_helper.c | 12 ++ >>> drivers/gpu/drm/drm_drv.c | 1 + >>> drivers/gpu/drm/drm_fops.c | 68 ++++++++++++- >>> drivers/gpu/drm/drm_irq.c | 43 ++++++++ >>> drivers/gpu/drm/i915/i915_drv.c | 1 + >>> drivers/gpu/drm/i915/intel_display.c | 24 +++-- >>> drivers/gpu/drm/radeon/radeon_display.c | 3 +- >>> include/drm/drm.h | 25 +++++ >>> include/drm/drmP.h | 32 ++++++ >>> include/drm/drm_crtc.h | 27 +++++ >>> include/drm/drm_crtc_helper.h | 4 + >>> include/drm/drm_mode.h | 16 +++ >>> 13 files changed, 414 insertions(+), 12 deletions(-) >>> >>> diff --git a/drivers/gpu/drm/drm_crtc.c b/drivers/gpu/drm/drm_crtc.c >>> index 8fab789..0906cb3 100644 >>> --- a/drivers/gpu/drm/drm_crtc.c >>> +++ b/drivers/gpu/drm/drm_crtc.c >>> @@ -34,6 +34,8 @@ >>> #include "drmP.h" >>> #include "drm_crtc.h" >>> >>> +#undef set_base >>> + >>> struct drm_prop_enum_list { >>> int type; >>> char *name; >>> @@ -342,6 +344,34 @@ void drm_framebuffer_cleanup(struct drm_framebuffer >>> *fb) >>> EXPORT_SYMBOL(drm_framebuffer_cleanup); >>> >>> /** >>> + * drm_crtc_async_flip - do a set_base call from a work queue >>> + * @work: work struct >>> + * >>> + * Called when a set_base call is queued by the page flip code. This >>> + * allows the flip ioctl itself to return immediately and allow userspace >>> + * to continue working. >>> + */ >>> +static void drm_crtc_async_flip(struct work_struct *work) >>> +{ >>> + struct drm_crtc *crtc = container_of(work, struct drm_crtc, >>> async_flip); >>> + struct drm_device *dev = crtc->dev; >>> + struct drm_pending_flip *pending; >>> + >>> + BUG_ON(crtc->pending_flip == NULL); >>> + >>> + mutex_lock(&dev->struct_mutex); >>> >>> >> Here the struct mutex is taken before a call to a function that doesn't >> even have dev as an argument. >> > > It's a work queue callback, there's no way we can change that to > include a dev argument :) > > >> Exactly what in the generic modesetting code is protected by the struct >> mutex here? >> > > I'm a little fuzzy on the details, this part of the patch was carried > over from Jesse's original work. I believe set_base is supposed to be > called with the struct mutex held. > >
We shouldn't be this sloppy about locking, and in particular we shouldn't protect a callback with a lock without knowing why. A lock should protect data, not serialize calls to functions, and it should ideally be documented somewhere. Why does set_base need the _caller_ to take the struct_mutex lock? Are there more driver callbacks that need the struct_mutex held? I mean if I implement a modesetting driver that takes care to use device private locks not visible to the caller to avoid polluting the struct_mutex with possible contention and lockdep errors as a result, I'd still find the generic code taking the struct_mutex for me becase it assumes that I need it? This can't be right and should go away. > > > Protecting mode_set_base again. > > > Same arguments again... :) > > Perhaps TTM bo read/write could use ioctls like the gem pwrite/pread > ioctls? The only way we can get asynchronous notifications from the > drm to userspace is through readable events on the drm fd. How were > you planning to implement read and write from multiple bo's through > one fd anyway? Reusing the fake offset we use for mmap? I think the > ioctl approach is cleaner since you can just pass in the object handle > and the offset into the object. Maybe even add src and dst stride > arguments. > > It's a common misconception that the TTM bo offsets are fake offsets from outer space. The offsets aren't fake offsets but real offsets into the device address space, so a logical consequence of mapping a part of that address space is to be able to read and write to the same address space, That is read / write implemented using syscalls as intended. That said, rectangular bo blit ioctls is probably a good idea as well, and some drivers, like via, already have them. But I do think that if we claim device reads and writes for modesetting, we should move bo IO to another device to be consistent. >>> +{ >>> + struct drm_file *file_priv = filp->private_data; >>> + struct drm_device *dev = file_priv->minor->dev; >>> + struct drm_pending_event *event; >>> + ssize_t total, ret; >>> + >>> + ret = wait_event_interruptible(file_priv->event_wait, >>> + !list_empty(&file_priv->event_list)); >>> + if (ret < 0) >>> + return ret; >>> + >>> + total = 0; >>> + while (!list_empty(&file_priv->event_list)) { >>> + mutex_lock(&dev->struct_mutex); >>> + event = list_first_entry(&file_priv->event_list, >>> + struct drm_pending_event, link); >>> + if (total + event->event->length > count) { >>> + mutex_unlock(&dev->struct_mutex); >>> + break; >>> + } >>> + list_del(&event->link); >>> + mutex_unlock(&dev->struct_mutex); >>> + >>> + if (copy_to_user(buffer + total, >>> + event->event, event->event->length)) { >>> + total = -EFAULT; >>> + break; >>> + } >>> + >>> + total += event->event->length; >>> + event->destroy(event); >>> + } >>> + >>> + return total; >>> +} >>> +EXPORT_SYMBOL(drm_read); >>> + >>> unsigned int drm_poll(struct file *filp, struct poll_table_struct *wait) >>> { >>> >>> >> A couple of years ago, any attempt to return anything else than 0 from >> drm poll resulted in an X server error. >> http://freedesktop.org/bugzilla/show_bug.cgi?id=1505. The fix mentioned >> in the bug was actually to return 0 from drm poll, and a comment about >> this is still present in drm.git. The above breaks drm for old X servers >> and all drivers, which I think is against drm policy? >> > > That can't be the real problem. The X server polls on a ton of file > descriptors already; sockets from clients, dbus, input devices. They > all have poll implementations that don't return 0... I mean, otherwise > they wouldn't work. Look at evdev_poll() in drivers/input/evdev.c for > the evdev poll implementation, for example. > You're probably right, but we should probably find out what went wrong and make sure it doesn't happen again with non-modesetting drivers + dri1 before pushing this. > cheers, > Kristian > Thanks, Thomas ------------------------------------------------------------------------------ Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july -- _______________________________________________ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel