[PATCH 05/19] drm: Have the vblank counter account for the time between vblank irq disable and drm_vblank_off()

Mario Kleiner Tue, 02 Sep 2014 21:33:18 +0200

Hi Ville,

went through the vblank rework patch set, mostly looks good to me. I 
couldn't find any bugs in the code. A first quick test-run on my old 
Intel GMA-950 (Gen 3'ish i think?) also didn't show apparent problems 
with the OML_sync_control functions. I'll try to test more carefully 
with that card and maybe with a few more cards in the next days, if i 
can get my hands on something more recent.

The problematic bits:

Patch 3/19 [Don't clear vblank timestamp...] in combination with [5/19 
below]:

I agree that not clearing the timestamps during drm_vblank_off() is 
probably the better thing to do for userspace. The idea behind clearing 
the timestamps was that a ust timestamp of zero signals to userspace 
that the timestamp is invalid/undefined at the moment, so the client 
should retry the query if it needs a valid timestamp. This worked in 
practice insofar as a value of zero can't happen normally, unless a 
client would query a timestamp during the first microsecond since 
machine powerup. But i guess returning the last valid (msc, ust) pair to 
a client during vblank off may be better for things like compositors 
etc. I also wonder if we ever documented this ust == 0 -> -EAGAIN behaviour?

The problem with patch 5/19 is gpus/drivers which don't provide precise 
instantaneous vblank timestamps - which are afaik everything except 
intel, radeon and nouveau. On such drivers, the old code would return a 
zero ust timestamp during queries after the first drm_vblank_get() and 
until the first vblank irq happens and initializes the timestamps to 
something valid. The zero ust would signal "please retry" to the client. 
With patch 5/19 you'd get an updated vblank counter with an outdated 
vblank timestamp - whatever is stored in the ringbuffer from the past, 
because drm_update_vblank_count() can't update the timestamp without 
support for the optional vblank-timestamp driver function. A mismatched 
msc, ust would be very confusing to clients.

The only way one could get valid msc + ust on such drivers would be to 
enable vblank irq's and then wait for the next vblank irq to actually 
update the timestamp, at the cost of a couple of msecs waiting time.

So either have drm_update_vblank_count() itself sleep until next vblank 
"if (!rc) ..." at the very end, as a rc == 0 would signal an 
imprecise/wrong vblank timestamp. Or have all callers of it do this, if 
locking makes it neccessary. Or only care about it for the 
drm_vblank_off() special case, e.g., if !vblank->enabled there, then 
drm_vblank_get() -> wait for a valid timestamp and count to show up -> 
drm_vblank_put() -> vblank_disable_and_save().

For Patch 11/19 [Add dev->vblank_disable_immediate flag]: Can we make it 
so that a drm_vblank_offdelay module parameter of zero always overrides 
the flag? The only reason why a user wants to set drm_vblank_offdelay to 
zero is if that user absolutely needs precise and reliable vblank 
counts/timestamps and finds out that something is broken with his 
driver+gpu, so uses this as an override to temporarily fix a broken 
driver. That doesn't work if the vblank_disable_immediate flag overrides 
the override from the user - the user couldn't opt out of the trouble.

This might not be such an issue with Intel cards, as you have test 
suites and a QA team, and i assume/hope you tested every single intel 
gpu shipped in the last decade or so if the whole vblank off/on logic 
really is perfectly race-free now? At least it seems to work with that 
one gen-3 card i quickly tested. But for most other drivers with small 
teams and no dedicated QA this could end badly quickly for the user 
without any manual override.

The docs should probably clarify that a hw vblank counter isn't enough 
for the vblank_disable_immediate flag to be set. Their vblank 
off/on/hardware counter query implementation must be completely race 
free. iirc this means the hw counter query must behave as if the vblank 
counter always increments at the leading edge of vblank. E.g., radeon 
has hw counter queries, but the counter increments either at the 
trailing edge, or somewhere in the middle of vblank, so there it 
wouldn't work without races, causing off-by-one errors sometimes.

For Patch 14/19 [Don't update vblank timestamp when the counter didn't 
change]

That would go wrong if a driver doesn't implement a proper vblank 
counter query. E.g., nouveau has precise vblank timestamping since Linux 
3.14, but still no functional hw counter query.

Almost all embedded gpu drivers currently implement completely bogus hw 
vblank counter queries, because that driver hook is mandatory. I think 
it would make sense if we would make that hook optional, allow a NULL 
function pointer and adapt to the lack of that query, e.g., by never 
disabling vblank irq's, except in drm_vblank_off() when a kms-driver 
insists on disabling its irq during modeset/dpms off/suspend etc.

With these remarks somehow taken into account you have my

Reviewed-by: Mario Kleiner <mario.kleiner.de at gmail.com>

for the whole series, if you want.

thanks,
-mario

On 08/06/2014 01:49 PM, ville.syrjala at linux.intel.com wrote:
> From: Ville Syrj?l? <ville.syrjala at linux.intel.com>
>
> If the vblank irq has already been disabled (via the disable timer) when
> we call drm_vblank_off() sample the counter and timestamp one last time.
> This will make the sure that the user space visible counter will account
> for time between vblank irq disable and drm_vblank_off().
>
> Reviewed-by: Matt Roper <matthew.d.roper at intel.com>
> Reviewed-by: Daniel Vetter <daniel.vetter at ffwll.ch>
> Signed-off-by: Ville Syrj?l? <ville.syrjala at linux.intel.com>
> ---
>   drivers/gpu/drm/drm_irq.c | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
> index af96517..1f86f6c 100644
> --- a/drivers/gpu/drm/drm_irq.c
> +++ b/drivers/gpu/drm/drm_irq.c
> @@ -140,6 +140,19 @@ static void vblank_disable_and_save(struct drm_device 
> *dev, int crtc)
>        */
>       spin_lock_irqsave(&dev->vblank_time_lock, irqflags);
>   
> +     /*
> +      * If the vblank interrupt was already disbled update the count
> +      * and timestamp to maintain the appearance that the counter
> +      * has been ticking all along until this time. This makes the
> +      * count account for the entire time between drm_vblank_on() and
> +      * drm_vblank_off().
> +      */
> +     if (!dev->vblank[crtc].enabled) {
> +             drm_update_vblank_count(dev, crtc);
> +             spin_unlock_irqrestore(&dev->vblank_time_lock, irqflags);
> +             return;
> +     }
> +
>       dev->driver->disable_vblank(dev, crtc);
>       dev->vblank[crtc].enabled = false;
>

[PATCH 05/19] drm: Have the vblank counter account for the time between vblank irq disable and drm_vblank_off()

Reply via email to