[PATCH 2/2] drm: Shortcircuit vblank queries

2015-04-15 Thread Daniel Vetter
On Tue, Apr 14, 2015 at 08:43:12PM +0200, Mario Kleiner wrote:
> On 04/05/2015 05:40 PM, Chris Wilson wrote:
> >Bypass all the spinlocks and return the last timestamp and counter from
> >the last vblank if the driver declares that it is accurate (and stable
> >across on/off), and the vblank is currently enabled.
> >
> >Signed-off-by: Chris Wilson 
> >Cc: Ville Syrjälä 
> >Cc: Daniel Vetter 
> >Cc: Michel Dänzer 
> >Cc: Laurent Pinchart 
> >Cc: Dave Airlie ,
> >Cc: Mario Kleiner 
> >---
> >  drivers/gpu/drm/drm_irq.c | 26 ++
> >  1 file changed, 26 insertions(+)
> >
> >diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> >index ba80b51b4b00..be9c210bb22e 100644
> >--- a/drivers/gpu/drm/drm_irq.c
> >+++ b/drivers/gpu/drm/drm_irq.c
> >@@ -1538,6 +1538,17 @@ err_put:
> > return ret;
> >  }
> >
> >+static bool drm_wait_vblank_is_query(union drm_wait_vblank *vblwait)
> >+{
> >+	if (vblwait->request.sequence)
> >+		return false;
> >+
> >+	return _DRM_VBLANK_RELATIVE ==
> >+		(vblwait->request.type & (_DRM_VBLANK_TYPES_MASK |
> >+					  _DRM_VBLANK_EVENT |
> >+					  _DRM_VBLANK_NEXTONMISS));
> >+}
> >+
> >  /*
> >   * Wait for VBLANK.
> >   *
> >@@ -1587,6 +1598,21 @@ int drm_wait_vblank(struct drm_device *dev, void *data,
> >
> > 	vblank = &dev->vblank[crtc];
> >
> >+	/* If the counter is currently enabled and accurate, short-circuit queries
> >+	 * to return the cached timestamp of the last vblank.
> >+	 */
> 
> Maybe somehow stress in the comment that this location in drm_wait_vblank is
> really the only place where it is ok'ish to call
> drm_vblank_count_and_time() without wrapping it into a drm_vblank_get/put(),
> so nobody thinks this approach is ok anywhere else.
> 
> >+	if (dev->vblank_disable_immediate &&
> >+	    drm_wait_vblank_is_query(vblwait) &&
> >+	    vblank->enabled) {
> 
> You should also check for (drm_vblank_offdelay != 0) whenever checking for
> dev->vblank_disable_immediate. This is so one can override all the
> vblank_disable_immediate related logic via the drm vblankoffdelay module
> parameter, both for debugging and as a safety switch for desperate users in
> case some driver+gpu combo screws up wrt. immediate disable and that makes
> it into distro kernels.
> 
> The other thing i'm not sure about is whether it would be a good idea to have some
> kind of write memory barrier in vblank_disable_and_save() after setting
> vblank->enabled = false; and some read memory barrier here before your check
> for vblank->enabled? I don't have a feeling for how much time can pass
> between one core executing the disable and the other core receiving the news
> that vblank->enabled is no longer true if those bits run on different cores?
> 
> I've run your patches through my standard tests on x86_64 and they don't
> seem to introduce errors or more skipped frames. Normally it would be so
> wrong to do this without drm_vblank_get/put(), but i think here potential
> errors introduced wouldn't be worse than what a userspace client would see
> due to preemption or other execution delays at the wrong moment, so it's
> probably ok. But i don't know if lack of memory barriers etc. could
> introduce large delays and trouble on other architectures?

Barriers don't reduce that latency but only enforce ordering. And you
always need two of them, one on the sending side of some piece of data and
the other on the receiving side. From that pov drm_vblank_count_and_time
is broken since it doesn't fully bracket the timestamp read against the
counter update (you'd need a barrier both before and after), and the
barrier on the write side is missing. And then it's also too heavy, as
long as we only have 1 updater we don't need atomics for the counter.
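
A minimal userspace sketch of the pairing described above, written with C11
atomics instead of the kernel's smp_wmb()/smp_rmb(). The struct and the two
function names are hypothetical and are not taken from drm_irq.c. The single
updater bumps a sequence word around plain stores of counter and timestamp
(no atomic read-modify-write), and the reader brackets its loads with an
ordering point on both sides, retrying if an update raced with it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-CRTC vblank bookkeeping. */
struct vblank_sample {
	atomic_uint seq;          /* odd while an update is in flight */
	_Atomic uint32_t count;   /* vblank counter */
	_Atomic int64_t time_us;  /* timestamp belonging to that counter value */
};

/* Write side: assumes exactly one updater, e.g. the vblank irq handler. */
static void vblank_publish(struct vblank_sample *v, uint32_t count,
			   int64_t time_us)
{
	unsigned int s = atomic_load_explicit(&v->seq, memory_order_relaxed);

	assert((s & 1) == 0);	/* a second concurrent updater would trip this */

	atomic_store_explicit(&v->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* order before the data */
	atomic_store_explicit(&v->count, count, memory_order_relaxed);
	atomic_store_explicit(&v->time_us, time_us, memory_order_relaxed);
	/* release store: the data is visible before the even sequence value */
	atomic_store_explicit(&v->seq, s + 2, memory_order_release);
}

/* Read side: retry until counter and timestamp come from the same update. */
static uint32_t vblank_read(struct vblank_sample *v, int64_t *time_us)
{
	unsigned int s1, s2;
	uint32_t count;

	do {
		/* acquire load: ordering point before the data loads */
		s1 = atomic_load_explicit(&v->seq, memory_order_acquire);
		count = atomic_load_explicit(&v->count, memory_order_relaxed);
		*time_us = atomic_load_explicit(&v->time_us,
						memory_order_relaxed);
		/* acquire fence: ordering point after the data loads */
		atomic_thread_fence(memory_order_acquire);
		s2 = atomic_load_explicit(&v->seq, memory_order_relaxed);
	} while (s1 != s2 || (s1 & 1));

	return count;
}
```

Note that none of this shortens the time until another core observes the
update; it only guarantees that a reader never pairs a new counter with an
old timestamp or vice versa.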

I think I'll review this properly and then write a patch.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch


[PATCH 2/2] drm: Shortcircuit vblank queries

2015-04-14 Thread Mario Kleiner
On 04/05/2015 05:40 PM, Chris Wilson wrote:
> Bypass all the spinlocks and return the last timestamp and counter from
> the last vblank if the driver declares that it is accurate (and stable
> across on/off), and the vblank is currently enabled.
>
> Signed-off-by: Chris Wilson 
> Cc: Ville Syrjälä 
> Cc: Daniel Vetter 
> Cc: Michel Dänzer 
> Cc: Laurent Pinchart 
> Cc: Dave Airlie ,
> Cc: Mario Kleiner 
> ---
>   drivers/gpu/drm/drm_irq.c | 26 ++
>   1 file changed, 26 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index ba80b51b4b00..be9c210bb22e 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -1538,6 +1538,17 @@ err_put:
>   return ret;
>   }
>
> +static bool drm_wait_vblank_is_query(union drm_wait_vblank *vblwait)
> +{
> +	if (vblwait->request.sequence)
> +		return false;
> +
> +	return _DRM_VBLANK_RELATIVE ==
> +		(vblwait->request.type & (_DRM_VBLANK_TYPES_MASK |
> +					  _DRM_VBLANK_EVENT |
> +					  _DRM_VBLANK_NEXTONMISS));
> +}
> +
>   /*
>* Wait for VBLANK.
>*
> @@ -1587,6 +1598,21 @@ int drm_wait_vblank(struct drm_device *dev, void *data,
>
>  	vblank = &dev->vblank[crtc];
>
> +	/* If the counter is currently enabled and accurate, short-circuit queries
> +	 * to return the cached timestamp of the last vblank.
> +	 */

Maybe somehow stress in the comment that this location in 
drm_wait_vblank is really the only place where it is ok'ish to call
drm_vblank_count_and_time() without wrapping it into a 
drm_vblank_get/put(), so nobody thinks this approach is ok anywhere else.

> +	if (dev->vblank_disable_immediate &&
> +	    drm_wait_vblank_is_query(vblwait) &&
> +	    vblank->enabled) {

You should also check for (drm_vblank_offdelay != 0) whenever checking 
for dev->vblank_disable_immediate. This is so one can override all the
vblank_disable_immediate related logic via the drm vblankoffdelay module 
parameter, both for debugging and as a safety switch for desperate users 
in case some driver+gpu combo screws up wrt. immediate disable and that 
makes it into distro kernels.
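
As a rough sketch of the combined check, assuming stand-in types instead of
the real struct drm_device (sketch_dev, immediate_disable_active() and
can_shortcircuit_query() are hypothetical names, and offdelay is passed in
as a parameter where the kernel would read the drm_vblank_offdelay global):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the flag in struct drm_device. */
struct sketch_dev {
	bool vblank_disable_immediate;
};

/*
 * offdelay stands in for the drm_vblank_offdelay module parameter. Per the
 * suggestion above, a value of 0 is the user's switch to turn all of the
 * immediate-disable logic off.
 */
static bool immediate_disable_active(const struct sketch_dev *dev, int offdelay)
{
	return dev->vblank_disable_immediate && offdelay != 0;
}

/* The lockless query path is only taken when every condition holds. */
static bool can_shortcircuit_query(const struct sketch_dev *dev, int offdelay,
				   bool is_query, bool vblank_enabled)
{
	return immediate_disable_active(dev, offdelay) &&
	       is_query && vblank_enabled;
}

/* Usage example: offdelay == 0 must force the slow, locked path. */
static void sketch_selfcheck(void)
{
	struct sketch_dev dev = { .vblank_disable_immediate = true };

	assert(can_shortcircuit_query(&dev, 5000, true, true));
	assert(!can_shortcircuit_query(&dev, 0, true, true));
}
```

Funnelling every vblank_disable_immediate test through one helper would
keep the override from being forgotten at a future call site.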

The other thing i'm not sure about is whether it would be a good idea to have 
some kind of write memory barrier in vblank_disable_and_save() after 
setting vblank->enabled = false; and some read memory barrier here 
before your check for vblank->enabled? I don't have a feeling for how 
much time can pass between one core executing the disable and the other 
core receiving the news that vblank->enabled is no longer true if those 
bits run on different cores?

I've run your patches through my standard tests on x86_64 and they don't 
seem to introduce errors or more skipped frames. Normally it would be so 
wrong to do this without drm_vblank_get/put(), but i think here 
potential errors introduced wouldn't be worse than what a userspace 
client would see due to preemption or other execution delays at the 
wrong moment, so it's probably ok. But i don't know if lack of memory 
barriers etc. could introduce large delays and trouble on other 
architectures?

> +		struct timeval now;
> +
> +		vblwait->reply.sequence =
> +			drm_vblank_count_and_time(dev, crtc, &now);
> +		vblwait->reply.tval_sec = now.tv_sec;
> +		vblwait->reply.tval_usec = now.tv_usec;

Have some DRM_DEBUG here, so one can follow the client doing the instant 
query through this path.

> +		return 0;
> +	}
> +
>  	ret = drm_vblank_get(dev, crtc);
>  	if (ret) {
>  		DRM_DEBUG("failed to acquire vblank counter, %d\n", ret);
>

With the above addressed i'd give you a Reviewed-and-tested-by, but it 
would be good if somebody else could look over it as well.

-mario


[PATCH 2/2] drm: Shortcircuit vblank queries

2015-04-05 Thread Chris Wilson
Bypass all the spinlocks and return the last timestamp and counter from
the last vblank if the driver declares that it is accurate (and stable
across on/off), and the vblank is currently enabled.

Signed-off-by: Chris Wilson 
Cc: Ville Syrjälä 
Cc: Daniel Vetter 
Cc: Michel Dänzer 
Cc: Laurent Pinchart 
Cc: Dave Airlie ,
Cc: Mario Kleiner 
---
 drivers/gpu/drm/drm_irq.c | 26 ++
 1 file changed, 26 insertions(+)

diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
index ba80b51b4b00..be9c210bb22e 100644
--- a/drivers/gpu/drm/drm_irq.c
+++ b/drivers/gpu/drm/drm_irq.c
@@ -1538,6 +1538,17 @@ err_put:
 	return ret;
 }

+static bool drm_wait_vblank_is_query(union drm_wait_vblank *vblwait)
+{
+	if (vblwait->request.sequence)
+		return false;
+
+	return _DRM_VBLANK_RELATIVE ==
+		(vblwait->request.type & (_DRM_VBLANK_TYPES_MASK |
+					  _DRM_VBLANK_EVENT |
+					  _DRM_VBLANK_NEXTONMISS));
+}
+
 /*
  * Wait for VBLANK.
  *
@@ -1587,6 +1598,21 @@ int drm_wait_vblank(struct drm_device *dev, void *data,

 	vblank = &dev->vblank[crtc];

+	/* If the counter is currently enabled and accurate, short-circuit queries
+	 * to return the cached timestamp of the last vblank.
+	 */
+	if (dev->vblank_disable_immediate &&
+	    drm_wait_vblank_is_query(vblwait) &&
+	    vblank->enabled) {
+		struct timeval now;
+
+		vblwait->reply.sequence =
+			drm_vblank_count_and_time(dev, crtc, &now);
+		vblwait->reply.tval_sec = now.tv_sec;
+		vblwait->reply.tval_usec = now.tv_usec;
+		return 0;
+	}
+
 	ret = drm_vblank_get(dev, crtc);
 	if (ret) {
 		DRM_DEBUG("failed to acquire vblank counter, %d\n", ret);
-- 
2.1.4
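
For readers who want to see which requests the predicate in this patch
treats as a pure query, here is a standalone userspace sketch. The EX_*
constants are illustrative copies of the _DRM_VBLANK_* flag values written
down from memory and should be checked against drm.h before being relied
on; ex_is_query() is a hypothetical name that mirrors the logic of
drm_wait_vblank_is_query() without the kernel union.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative copies of the flag values; verify against drm.h. */
#define EX_VBLANK_ABSOLUTE	0x0u
#define EX_VBLANK_RELATIVE	0x1u
#define EX_VBLANK_EVENT		0x4000000u
#define EX_VBLANK_NEXTONMISS	0x10000000u
#define EX_VBLANK_SECONDARY	0x20000000u
#define EX_VBLANK_TYPES_MASK	(EX_VBLANK_ABSOLUTE | EX_VBLANK_RELATIVE)

/*
 * A query is a relative wait for zero vblanks with neither an event nor
 * next-on-miss requested. Bits outside the mask (such as the CRTC
 * selection bits) do not affect the result.
 */
static bool ex_is_query(uint32_t type, uint32_t sequence)
{
	if (sequence)
		return false;

	return EX_VBLANK_RELATIVE ==
		(type & (EX_VBLANK_TYPES_MASK |
			 EX_VBLANK_EVENT |
			 EX_VBLANK_NEXTONMISS));
}

/* Usage example covering the basic accept and reject cases. */
static void ex_selfcheck(void)
{
	assert(ex_is_query(EX_VBLANK_RELATIVE, 0));
	assert(!ex_is_query(EX_VBLANK_RELATIVE, 1));
	assert(!ex_is_query(EX_VBLANK_RELATIVE | EX_VBLANK_EVENT, 0));
}
```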