[PATCH 2/2] drm: Shortcircuit vblank queries
On Tue, Apr 14, 2015 at 08:43:12PM +0200, Mario Kleiner wrote:
> On 04/05/2015 05:40 PM, Chris Wilson wrote:
> >Bypass all the spinlocks and return the last timestamp and counter from
> >the last vblank if the driver declares that it is accurate (and stable
> >across on/off), and the vblank is currently enabled.
> >
> >Signed-off-by: Chris Wilson
> >Cc: Ville Syrjälä
> >Cc: Daniel Vetter
> >Cc: Michel Dänzer
> >Cc: Laurent Pinchart
> >Cc: Dave Airlie
> >Cc: Mario Kleiner
> >---
> >  drivers/gpu/drm/drm_irq.c | 26 ++
> >  1 file changed, 26 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> >index ba80b51b4b00..be9c210bb22e 100644
> >--- a/drivers/gpu/drm/drm_irq.c
> >+++ b/drivers/gpu/drm/drm_irq.c
> >@@ -1538,6 +1538,17 @@ err_put:
> >  	return ret;
> >  }
> >
> >+static bool drm_wait_vblank_is_query(union drm_wait_vblank *vblwait)
> >+{
> >+	if (vblwait->request.sequence)
> >+		return false;
> >+
> >+	return _DRM_VBLANK_RELATIVE ==
> >+		(vblwait->request.type & (_DRM_VBLANK_TYPES_MASK |
> >+					  _DRM_VBLANK_EVENT |
> >+					  _DRM_VBLANK_NEXTONMISS));
> >+}
> >+
> >  /*
> >   * Wait for VBLANK.
> >   *
> >@@ -1587,6 +1598,21 @@ int drm_wait_vblank(struct drm_device *dev, void *data,
> >
> >  	vblank = &dev->vblank[crtc];
> >
> >+	/* If the counter is currently enabled and accurate, short-circuit queries
> >+	 * to return the cached timestamp of the last vblank.
> >+	 */
>
> Maybe somehow stress in the comment that this location in drm_wait_vblank
> is really the only place where it is ok'ish to call
> drm_vblank_count_and_time() without wrapping it into a
> drm_vblank_get/put(), so nobody thinks this approach is ok anywhere else.
>
> >+	if (dev->vblank_disable_immediate &&
> >+	    drm_wait_vblank_is_query(vblwait) &&
> >+	    vblank->enabled) {
>
> You should also check for (drm_vblank_offdelay != 0) whenever checking for
> dev->vblank_disable_immediate. This is so one can override all the
> vblank_disable_immediate related logic via the drm vblankoffdelay module
> parameter, both for debugging and as a safety switch for desperate users
> in case some driver+gpu combo screws up wrt. immediate disable and that
> makes it into distro kernels.
>
> The other thing i'm not sure about is whether it would be a good idea to
> have some kind of write memory barrier in vblank_disable_and_save() after
> setting vblank->enabled = false; and some read memory barrier here before
> your check for vblank->enabled? I don't have a feeling for how much time
> can pass between one core executing the disable and the other core
> receiving the news that vblank->enabled is no longer true if those bits
> run on different cores?
>
> I've run your patches through my standard tests on x86_64 and they don't
> seem to introduce errors or more skipped frames. Normally it would be so
> wrong to do this without drm_vblank_get/put(), but i think here potential
> errors introduced wouldn't be worse than what a userspace client would see
> due to preemption or other execution delays at the wrong moment, so it's
> probably ok. But i don't know if lack of memory barriers etc. could
> introduce large delays and trouble on other architectures?

Barriers don't reduce that latency, they only enforce ordering. And you
always need two of them, one on the sending side of some piece of data and
the other on the receiving side. From that pov drm_vblank_count_and_time is
broken since it doesn't fully bracket the timestamp read against the
counter update (you'd need a barrier both before and after), and the
barrier on the write side is missing. And then it's also too heavy: as long
as we only have 1 updater we don't need atomics for the counter.

I think I'll review this properly and then write a patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch
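
The pairing rule in the reply above (one barrier on the writer, one on the reader, with the data read bracketed on both sides) can be illustrated outside the kernel. What follows is only a minimal userspace sketch, assuming C11 atomics and fences as stand-ins for the kernel's smp_wmb()/smp_rmb() and a single updater. The names vblank_sample, vblank_store and vblank_load are hypothetical, and this is not the patch that was proposed in this thread.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical container for one (counter, timestamp) pair. */
struct vblank_sample {
	atomic_uint seq;		/* even: stable, odd: update in progress */
	_Atomic uint32_t count;		/* vblank counter */
	_Atomic int64_t time_us;	/* timestamp of that vblank */
};

/* Writer side. Assumes exactly one updater, so no lock or RMW is needed. */
static void vblank_store(struct vblank_sample *s, uint32_t count,
			 int64_t time_us)
{
	unsigned int seq = atomic_load_explicit(&s->seq, memory_order_relaxed);

	atomic_store_explicit(&s->seq, seq + 1, memory_order_relaxed);
	/* Write barrier: the odd sequence is visible before the new data. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->count, count, memory_order_relaxed);
	atomic_store_explicit(&s->time_us, time_us, memory_order_relaxed);
	/* Release store: the new data is visible before the even sequence. */
	atomic_store_explicit(&s->seq, seq + 2, memory_order_release);
}

/* Reader side. Retries until both reads fall inside one stable window. */
static uint32_t vblank_load(struct vblank_sample *s, int64_t *time_us)
{
	unsigned int begin, end;
	uint32_t count;

	do {
		/* Barrier before the data reads (acquire load). */
		begin = atomic_load_explicit(&s->seq, memory_order_acquire);
		count = atomic_load_explicit(&s->count, memory_order_relaxed);
		*time_us = atomic_load_explicit(&s->time_us,
						memory_order_relaxed);
		/* Barrier after the data reads, pairing with the writer. */
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->seq, memory_order_relaxed);
	} while (begin != end || (begin & 1));

	return count;
}
```

The reader never blocks the writer and takes no lock. If an update races with the read, the sequence check fails and the reader simply tries again, so it never returns a counter from one vblank paired with the timestamp of another.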
[PATCH 2/2] drm: Shortcircuit vblank queries
On 04/05/2015 05:40 PM, Chris Wilson wrote:
> Bypass all the spinlocks and return the last timestamp and counter from
> the last vblank if the driver declares that it is accurate (and stable
> across on/off), and the vblank is currently enabled.
>
> Signed-off-by: Chris Wilson
> Cc: Ville Syrjälä
> Cc: Daniel Vetter
> Cc: Michel Dänzer
> Cc: Laurent Pinchart
> Cc: Dave Airlie
> Cc: Mario Kleiner
> ---
>   drivers/gpu/drm/drm_irq.c | 26 ++
>   1 file changed, 26 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index ba80b51b4b00..be9c210bb22e 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -1538,6 +1538,17 @@ err_put:
>   	return ret;
>   }
>
> +static bool drm_wait_vblank_is_query(union drm_wait_vblank *vblwait)
> +{
> +	if (vblwait->request.sequence)
> +		return false;
> +
> +	return _DRM_VBLANK_RELATIVE ==
> +		(vblwait->request.type & (_DRM_VBLANK_TYPES_MASK |
> +					  _DRM_VBLANK_EVENT |
> +					  _DRM_VBLANK_NEXTONMISS));
> +}
> +
>   /*
>    * Wait for VBLANK.
>    *
> @@ -1587,6 +1598,21 @@ int drm_wait_vblank(struct drm_device *dev, void *data,
>
>   	vblank = &dev->vblank[crtc];
>
> +	/* If the counter is currently enabled and accurate, short-circuit queries
> +	 * to return the cached timestamp of the last vblank.
> +	 */

Maybe somehow stress in the comment that this location in drm_wait_vblank
is really the only place where it is ok'ish to call
drm_vblank_count_and_time() without wrapping it into a
drm_vblank_get/put(), so nobody thinks this approach is ok anywhere else.

> +	if (dev->vblank_disable_immediate &&
> +	    drm_wait_vblank_is_query(vblwait) &&
> +	    vblank->enabled) {

You should also check for (drm_vblank_offdelay != 0) whenever checking for
dev->vblank_disable_immediate. This is so one can override all the
vblank_disable_immediate related logic via the drm vblankoffdelay module
parameter, both for debugging and as a safety switch for desperate users in
case some driver+gpu combo screws up wrt. immediate disable and that makes
it into distro kernels.

The other thing i'm not sure about is whether it would be a good idea to
have some kind of write memory barrier in vblank_disable_and_save() after
setting vblank->enabled = false; and some read memory barrier here before
your check for vblank->enabled? I don't have a feeling for how much time
can pass between one core executing the disable and the other core
receiving the news that vblank->enabled is no longer true if those bits run
on different cores?

I've run your patches through my standard tests on x86_64 and they don't
seem to introduce errors or more skipped frames. Normally it would be so
wrong to do this without drm_vblank_get/put(), but i think here potential
errors introduced wouldn't be worse than what a userspace client would see
due to preemption or other execution delays at the wrong moment, so it's
probably ok. But i don't know if lack of memory barriers etc. could
introduce large delays and trouble on other architectures?

> +		struct timeval now;
> +
> +		vblwait->reply.sequence =
> +			drm_vblank_count_and_time(dev, crtc, &now);
> +		vblwait->reply.tval_sec = now.tv_sec;
> +		vblwait->reply.tval_usec = now.tv_usec;

Have some DRM_DEBUG here, so one can follow the client doing the instant
query through this path.

> +		return 0;
> +	}
> +
>   	ret = drm_vblank_get(dev, crtc);
>   	if (ret) {
>   		DRM_DEBUG("failed to acquire vblank counter, %d\n", ret);
>

With the above addressed i'd give you a Reviewed-and-tested-by, but it
would be good if somebody else could look over it as well.

-mario
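
The extra condition requested in this review, that a module parameter value of 0 overrides the immediate-disable logic, can be modelled in isolation. This is a rough standalone sketch rather than kernel code: struct fastpath_state and can_shortcircuit() are hypothetical names, and the fields merely mirror dev->vblank_disable_immediate, drm_vblank_offdelay and vblank->enabled from the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel state consulted by the fast path. */
struct fastpath_state {
	bool disable_immediate;	/* models dev->vblank_disable_immediate */
	int offdelay;		/* models the drm_vblank_offdelay parameter */
	bool vblank_enabled;	/* models vblank->enabled */
};

/*
 * Returns true when a request may take the lock-free query path.
 * offdelay == 0 is treated as the override asked for in the review:
 * it turns the fast path off regardless of what the driver declared.
 */
static bool can_shortcircuit(const struct fastpath_state *st, bool is_query)
{
	return st->disable_immediate &&
	       st->offdelay != 0 &&
	       is_query &&
	       st->vblank_enabled;
}
```

With this gate, setting the parameter to 0 sends every request down the existing drm_vblank_get() path, which is the safety switch described above.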
[PATCH 2/2] drm: Shortcircuit vblank queries
Bypass all the spinlocks and return the last timestamp and counter from
the last vblank if the driver declares that it is accurate (and stable
across on/off), and the vblank is currently enabled.

Signed-off-by: Chris Wilson
Cc: Ville Syrjälä
Cc: Daniel Vetter
Cc: Michel Dänzer
Cc: Laurent Pinchart
Cc: Dave Airlie
Cc: Mario Kleiner
---
 drivers/gpu/drm/drm_irq.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index ba80b51b4b00..be9c210bb22e 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -1538,6 +1538,17 @@ err_put:
 	return ret;
 }
 
+static bool drm_wait_vblank_is_query(union drm_wait_vblank *vblwait)
+{
+	if (vblwait->request.sequence)
+		return false;
+
+	return _DRM_VBLANK_RELATIVE ==
+		(vblwait->request.type & (_DRM_VBLANK_TYPES_MASK |
+					  _DRM_VBLANK_EVENT |
+					  _DRM_VBLANK_NEXTONMISS));
+}
+
 /*
  * Wait for VBLANK.
  *
@@ -1587,6 +1598,21 @@ int drm_wait_vblank(struct drm_device *dev, void *data,
 
 	vblank = &dev->vblank[crtc];
 
+	/* If the counter is currently enabled and accurate, short-circuit queries
+	 * to return the cached timestamp of the last vblank.
+	 */
+	if (dev->vblank_disable_immediate &&
+	    drm_wait_vblank_is_query(vblwait) &&
+	    vblank->enabled) {
+		struct timeval now;
+
+		vblwait->reply.sequence =
+			drm_vblank_count_and_time(dev, crtc, &now);
+		vblwait->reply.tval_sec = now.tv_sec;
+		vblwait->reply.tval_usec = now.tv_usec;
+		return 0;
+	}
+
 	ret = drm_vblank_get(dev, crtc);
 	if (ret) {
 		DRM_DEBUG("failed to acquire vblank counter, %d\n", ret);
-- 
2.1.4
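
To see which requests the drm_wait_vblank_is_query() test in the patch would accept, the mask check can be reproduced in a small standalone program. This is a sketch under assumptions: the EX_VBLANK_* constants are local copies of the flag values as they appear in the DRM uapi header to the best of my knowledge (verify against your own drm.h), and is_vblank_query() is a hypothetical userspace mirror of the kernel helper, not the helper itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed copies of the _DRM_VBLANK_* flag values; check against drm.h. */
#define EX_VBLANK_ABSOLUTE	0x0u		/* wait for an absolute count */
#define EX_VBLANK_RELATIVE	0x1u		/* wait for N vblanks from now */
#define EX_VBLANK_EVENT		0x4000000u	/* deliver an event, don't block */
#define EX_VBLANK_NEXTONMISS	0x10000000u	/* if missed, wait for the next */
#define EX_VBLANK_SECONDARY	0x20000000u	/* secondary display controller */
#define EX_VBLANK_TYPES_MASK	(EX_VBLANK_ABSOLUTE | EX_VBLANK_RELATIVE)

/*
 * A request counts as a pure query when it is relative, asks for zero
 * further vblanks, and sets neither the EVENT nor the NEXTONMISS flag.
 */
static bool is_vblank_query(uint32_t type, uint32_t sequence)
{
	if (sequence)
		return false;

	return EX_VBLANK_RELATIVE ==
		(type & (EX_VBLANK_TYPES_MASK |
			 EX_VBLANK_EVENT |
			 EX_VBLANK_NEXTONMISS));
}
```

Flags outside the mask, such as the secondary-CRTC bit, do not affect the result, so a relative request with sequence 0 on a secondary CRTC still qualifies as a query.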