Re: [RFC] Reduce idle vblank wakeups

Mario Kleiner Wed, 16 Nov 2011 16:27:17 -0800

On Nov 16, 2011, at 7:48 PM, Matthew Garrett wrote:

On Wed, Nov 16, 2011 at 07:27:51PM +0100, Mario Kleiner wrote:

It's not broken hardware, but fast ping-ponging it on and off can
make the vblank counter and vblank timestamps unreliable for apps
that need high timing precision, especially for the ones that use
the OML_sync_control extensions for precise swap scheduling. My
target application is vision science
 neuroscience, where (sub-)milliseconds often matter for visual
stimulation.


I'll admit that I'm struggling to understand the issue here. If the
vblank counter is incremented at the time of vblank (which isn't the
case for radeon, it seems, but as far as I can tell is the case for
Intel) then how does ping-ponging the IRQ matter?
vblank_disable_and_save() appears to handle this case.

The drm software vblank counter which is used for scheduling swapsetc. gets incremented in the gpu's vblank irq handler. The gpu'shardware counter gets incremented somewhere inside the vblankinterval. The increments don't happen at the same point in time. Therace is between the vblank on/off code, which gets scheduled eitherby the timeout timer for the "off" case, or by usercode for the "on"case and the gpu's hardware vblank counter.

The existing code uses a lock (vblank_time_lock) to prevent someraces between itself and the vblank irq handler, but it can't "lock"the gpu, so if the enable/disable code executes while the gpu isscanning out the vblank interval, it is undefined if the final vblankcount will be correct or off by one. Vblank duration is somewhere upto 4.5% of refresh interval duration, so there's your up to 4% chanceof getting it wrong.

If one could reliably avoid enabling/disabling in the problematictime period, the problem would go away.

I think making the vblank off delay driver specific via these
patches is a good idea. Lowering the timeout to something like a few
refresh cycles, maybe somewhere between 50 msecs and 100 msecs would
be also fine by me. I still would like to keep some drm config
option to disable or override the vblank off delay by users.


Does the timeout serve any purpose other than letting software
effectively prevent vblanks from being disabled?

With perfect drivers and gpu's in a perfect world, no. In realitythere's the race i described above, and nouveau and all other driversexcept intel and radeon. The vblank irq also drives timestamping ofvblanks, one update per vblank. The timestamps are cached if a clientneeds them inbetween updates. Turning off vblank irq invalidates thetimestamps. radeon and intel can recreate the timestamp anytime asneeded, but nouveau lacks this atm., so timestamps remain invalid fora whole video refresh cycle after vblank irq on. We have patches fornouveau kms almost ready, so only the race mentioned above would remain.

The intel and radeon kms drivers implement everything that's needed
to make it mostly work. Except for a small race between the cpu and
gpu in the vblank_disable_and_save() function <http://lxr.free-
electrons.com/source/drivers/gpu/drm/drm_irq.c#L101> and
drm_update_vblank_count(). It can cause an off-by-one error when
reinitializing the drm vblank counter from the gpu's hardware
counter if the enable/disable function is called at the wrong moment
while the gpu's scanout is inside the vblank interval, see comments
in the code. I have some sketchy idea for a patch that could detect
when the race happens and retry hw counter queries to fix this.
Without that patch, there's some chance between 0% and 4% of being
off-by-one.

For Radeon, I'd have thought you could handle this by scheduling anirq

for the beginning of scanout (avivo has a register for that) and
delaying the vblank disable until you hit it?

For Radeon there is such an irq, but iirc we had some discussions onthis, also with Alex Deucher, a while ago and some irq's weren'tconsidered very reliable, or already used for other stuff. The idea ihad goes like this:

Use the crtc scanout position queries together with the vblankcounter queries inside some calibration loop, maybe executed aftereach modeset, to find out the scanline range in which the hardwarevblank counter increments -- basically a forbidden range of scanlinepositions where the race would happen. Then at each vblank off/on,query scanout position before and after the hw vblank counter query.If according to the scanout positions the vblank counter queryhappened within the forbidden time window, retry the query. With awell working calibration that should add no delay in most cases and adelay to the on/off code of a few dozen microseconds (=duration of afew scanlines) worst case.

With that and the pending nouveau patches in place i think wewouldn't need the vblank off delay anymore on such drivers.

On current nouveau kms, disabling vblank irqs guarantees you wrong
vblank counts and wrong vblank timestamps after turning them on
again, because the kms driver doesn't implement the hook for
hardware vblank counter query correctly. The drm vblank counter
misses all counts during the vblank irq off period. Other timing
related hooks are missing as well. I have a couple of patches queued
up and some more to come for the ddx and kms driver to mostly fix
this. NVidia gpu's only have hardware vblank counters for NV-50 and
later, fixing this for earlier gpu's would require some emulation of
a hw vblank counter inside the kms driver.


I've no problem with all of this work being case by case.

Me neither. I just say if you'd go to the extreme and disable vblankirq's immediately with zero delay, without reliably fixing theproblem i mentioned, you'd get those off by one counts, which wouldbe very bad for apps that rely on precise timing. Doing so wouldbasically make the whole oml_sync_control implementation mostly useless.

Apps that rely on the vblank counter being totally reliable over
long periods of time currently would be in a bad situation with a
lowered vblank off delay, but that's probably generally not a good
assumption. Toolkits like mine, which are more paranoid, currently
can work fine as long as the off delay is at least a few video
refresh cycles. I do the following for scheduling a reliably timed
swap:

1. Query current vblank counter current_msc and vblank timestamp
current_ust.
2. Calculate a target vblank count target_msc, based on current_msc,
current_ust and some target time from usercode.
3. Schedule bufferswap for target_msc.

As long as the vblank off delay is long enough so that vblanks don't
get turned off between 1. and 3, everything is fine, otherwise bad
things will happen.
Keeping a way to override the default off delay would be good to
allow poor scientists to work around potentially broken drivers or
gpu's in the future. @Matthew: I'm appealing here to your ex-
Drosophila biologist heritage ;-)

If vblanks are disabled and then re-enabled between 1 and 3, what'sthe

negative outcome?

1. glXGetSyncValuesOML() gives you the current vblank count and atimestamp of when exactly scanout of the corresponding video framestarted. These values are the baseline for any vblank based swapscheduling or any other vblank synchronized actions, e.g., viaglXWaitForMscOML().

2. App calculates stuff based on 1. but gets preempted or experiencesscheduling delay on return from 1. - or the x-server gets preemptedor scheduled away. Whatever, time passes...

3. glXSwapBuffersMscOML() or glXWaitForMscOML() is used to schedulesomething to occur at a target vblank count, based on the results of 1.

As long as vblanks don't get disabled between 1 and 3, everything isconsistent. Scheduling delays/preemption of more than a few (dozen)milliseconds worst case are very rare, so a lowered vblank off delayof even one or two refresh cycles would solve the problem. I assumethat most toolkits with needs for timing precision would follow athree step strategy like this one.

If vblank irq's get disabled immediately after step 1. and reenabledat step 3., and this triggers an off by one error due to a race, thenthe calculated target vblank count from steps 1/2 is invalid anduseless for step 3 and many vision scientists which just started tomake friendship with Linux lately will make very sad faces. Due tothe nature of many of the stimulation paradigms used, there is also ahigh chance that many of the enables and disables would happen insideor very close to the problematic vblank interval when off by oneerror is most likely.

I agree with your patches, i just don't want vblank irq's to bedisabled immediately with a zero timeout before remaining races arefixed, that would sure break things at least for scientificapplications. And i'd like to keep some optional parameter thatallows desparate users to disable the vblank off mechanism in casethey would encounter a bug.


-mario


*********************************************************************
Mario Kleiner
Max Planck Institute for Biological Cybernetics
Spemannstr. 38
72076 Tuebingen
Germany

e-mail: mario.klei...@tuebingen.mpg.de
office: +49 (0)7071/601-1623
fax:    +49 (0)7071/601-616
www:    http://www.kyb.tuebingen.mpg.de/~kleinerm
*********************************************************************
"For a successful technology, reality must take precedence
over public relations, for Nature cannot be fooled."
(Richard Feynman)

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [RFC] Reduce idle vblank wakeups

Reply via email to