On 05/07/2015 01:56 PM, Peter Hurley wrote:
> On 05/06/2015 04:56 AM, Daniel Vetter wrote:
>> On Tue, May 05, 2015 at 11:57:42AM -0400, Peter Hurley wrote:
>>> On 05/05/2015 11:42 AM, Daniel Vetter wrote:
>>>> On Tue, May 05, 2015 at 10:36:24AM -0400, Peter Hurley wrote:
>>>>> On 05/04/2015 12:52 AM, Mario Kleiner wrote:
>>>>>> On 04/16/2015 03:03 PM, Daniel Vetter wrote:
>>>>>>> On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote:
>>>>>>>> On 04/15/2015 01:31 PM, Daniel Vetter wrote:
>>>>>>>>> On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote:
>>>>>>>>>> Hi Daniel,
>>>>>>>>>>
>>>>>>>>>> On 04/15/2015 03:17 AM, Daniel Vetter wrote:
>>>>>>>>>>> This was a bit too much cargo-culted, so let's make it solid:
>>>>>>>>>>> - vblank->count doesn't need to be an atomic, writes are always done
>>>>>>>>>>>   under the protection of dev->vblank_time_lock. Switch to an unsigned
>>>>>>>>>>>   long instead and update comments. Note that atomic_read is just a
>>>>>>>>>>>   normal read of a volatile variable, so no need to audit all the
>>>>>>>>>>>   read-side access specifically.
>>>>>>>>>>>
>>>>>>>>>>> - The barriers for the vblank counter seqlock weren't complete: the
>>>>>>>>>>>   read side was missing the first barrier between the counter read and
>>>>>>>>>>>   the timestamp read; it only had a barrier between the ts and the
>>>>>>>>>>>   counter read. We need both.
>>>>>>>>>>>
>>>>>>>>>>> - Barriers weren't properly documented. Since barriers only work if
>>>>>>>>>>>   you have them on both sides of the transaction it's prudent to
>>>>>>>>>>>   reference where the other side is. To avoid duplicating the
>>>>>>>>>>>   write-side comment 3 times, extract a little store_vblank() helper.
>>>>>>>>>>>   In that helper also assert that we do indeed hold
>>>>>>>>>>>   dev->vblank_time_lock, since in some cases the lock is acquired a
>>>>>>>>>>>   few functions up in the callchain.
>>>>>>>>>>>
>>>>>>>>>>> Spotted while reviewing a patch from Chris Wilson to add a fastpath
>>>>>>>>>>> to the vblank_wait ioctl.
>>>>>>>>>>>
>>>>>>>>>>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>>>>>>>>>> Cc: Mario Kleiner <mario.kleiner.de at gmail.com>
>>>>>>>>>>> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
>>>>>>>>>>> Cc: Michel Dänzer <michel at daenzer.net>
>>>>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>>>>>>>>>>> ---
>>>>>>>>>>>  drivers/gpu/drm/drm_irq.c | 92 ++++++++++++++++++++++++-----------------------
>>>>>>>>>>>  include/drm/drmP.h        |  8 +++--
>>>>>>>>>>>  2 files changed, 54 insertions(+), 46 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
>>>>>>>>>>> index c8a34476570a..23bfbc61a494 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/drm_irq.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_irq.c
>>>>>>>>>>> @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay, drm_vblank_offdelay, int, 0600);
>>>>>>>>>>>  module_param_named(timestamp_precision_usec, drm_timestamp_precision, int, 0600);
>>>>>>>>>>>  module_param_named(timestamp_monotonic, drm_timestamp_monotonic, int, 0600);
>>>>>>>>>>>
>>>>>>>>>>> +static void store_vblank(struct drm_device *dev, int crtc,
>>>>>>>>>>> +			 unsigned vblank_count_inc,
>>>>>>>>>>> +			 struct timeval *t_vblank)
>>>>>>>>>>> +{
>>>>>>>>>>> +	struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
>>>>>>>>>>> +	u32 tslot;
>>>>>>>>>>> +
>>>>>>>>>>> +	assert_spin_locked(&dev->vblank_time_lock);
>>>>>>>>>>> +
>>>>>>>>>>> +	if (t_vblank) {
>>>>>>>>>>> +		tslot = vblank->count + vblank_count_inc;
>>>>>>>>>>> +		vblanktimestamp(dev, crtc, tslot) = *t_vblank;
>>>>>>>>>>> +	}
>>>>>>>>>>> +
>>>>>>>>>>> +	/*
>>>>>>>>>>> +	 * vblank timestamp updates are protected on the write side with
>>>>>>>>>>> +	 * vblank_time_lock, but on the read side done locklessly using a
>>>>>>>>>>> +	 * sequence-lock
>>>>>>>>>>> +	 * on the vblank counter. Ensure correct ordering using
>>>>>>>>>>> +	 * memory barriers. We need the barrier both before and also after the
>>>>>>>>>>> +	 * counter update to synchronize with the next timestamp write.
>>>>>>>>>>> +	 * The read-side barriers for this are in drm_vblank_count_and_time.
>>>>>>>>>>> +	 */
>>>>>>>>>>> +	smp_wmb();
>>>>>>>>>>> +	vblank->count += vblank_count_inc;
>>>>>>>>>>> +	smp_wmb();
>>>>>>>>>>
>>>>>>>>>> The comment and the code are each self-contradictory.
>>>>>>>>>>
>>>>>>>>>> If vblank->count writes are always protected by vblank_time_lock (something I
>>>>>>>>>> did not verify but that the comment above asserts), then the trailing write
>>>>>>>>>> barrier is not required (and the assertion that it is in the comment is incorrect).
>>>>>>>>>>
>>>>>>>>>> A spin unlock operation is always a write barrier.
>>>>>>>>>
>>>>>>>>> Hm yeah. Otoh to me that's bordering on "code too clever for my own good".
>>>>>>>>> That the spinlock is held I can assure. That no one goes around and does
>>>>>>>>> multiple vblank updates (because somehow that code raced with the hw
>>>>>>>>> itself) I can't easily assure with a simple assert or something similar.
>>>>>>>>> It's not the case right now, but that can change.
>>>>>>>>
>>>>>>>> The algorithm would be broken if multiple updates for the same vblank
>>>>>>>> count were allowed; that's why it checks to see if the vblank count has
>>>>>>>> not advanced before storing a new timestamp.
>>>>>>>>
>>>>>>>> Otherwise, the read side would not be able to determine that the
>>>>>>>> timestamp is valid by double-checking that the vblank count has not
>>>>>>>> changed.
>>>>>>>>
>>>>>>>> And besides, even if the code looped without dropping the spinlock,
>>>>>>>> the correct write order would still be observed because it would still
>>>>>>>> be executing on the same cpu.
>>>>>>>>
>>>>>>>> My objection to the write memory barrier is not about optimization;
>>>>>>>> it's about correct code.
>>>>>>>
>>>>>>> Well diff=0 is not allowed, I guess I could enforce this with some
>>>>>>> WARN_ON. And I still think my point of non-local correctness is solid.
>>>>>>> With the smp_wmb() removed the following still works correctly:
>>>>>>>
>>>>>>> spin_lock(vblank_time_lock);
>>>>>>> store_vblank(dev, crtc, 1, ts1);
>>>>>>> spin_unlock(vblank_time_lock);
>>>>>>>
>>>>>>> spin_lock(vblank_time_lock);
>>>>>>> store_vblank(dev, crtc, 1, ts2);
>>>>>>> spin_unlock(vblank_time_lock);
>>>>>>>
>>>>>>> But with the smp_wmb() removed the following would be broken:
>>>>>>>
>>>>>>> spin_lock(vblank_time_lock);
>>>>>>> store_vblank(dev, crtc, 1, ts1);
>>>>>>> store_vblank(dev, crtc, 1, ts2);
>>>>>>> spin_unlock(vblank_time_lock);
>>>>>>>
>>>>>>> because the compiler/cpu is free to reorder the store for vblank->count
>>>>>>> _ahead_ of the store for the timestamp. And that would trick readers into
>>>>>>> believing that they have a valid timestamp when they potentially raced.
>>>>>>>
>>>>>>> Now you're correct that right now there's no such thing going on, and it's
>>>>>>> unlikely to happen (given the nature of vblank updates). But my point is
>>>>>>> that if we optimize this then the correctness can't be proven locally
>>>>>>> anymore by just looking at store_vblank, but instead you must audit all
>>>>>>> the callers. And leaking locking/barriers like that is too fragile a design
>>>>>>> for my taste.
>>>>>>>
>>>>>>> But you insist that my approach is broken somehow and dropping the smp_wmb
>>>>>>> is needed for correctness. I don't see how that's the case at all.
>>>>>
>>>>> Daniel,
>>>>>
>>>>> I've been really busy this last week; my apologies for not replying promptly.
>>>>>
>>>>>> Fwiw, I spent some time reeducating myself about memory barriers (thanks
>>>>>> for your explanations) and thinking about this, and the last version of
>>>>>> your patch looks good to me. It also makes sense to me to leave that
>>>>>> last smp_wmb() in place to make future use of the helper robust - for
>>>>>> non-local correctness, to avoid having to audit all future callers of
>>>>>> that helper.
>>>>>
>>>>> My concern wrt unnecessary barriers in this algorithm is that the trailing
>>>>> barrier now appears mandatory, when in fact it is not.
>>>>>
>>>>> Moreover, this algorithm is, in general, fragile and not designed to handle
>>>>> random or poorly-researched changes.
>>>>
>>>> Less fragility is exactly why I want that surplus barrier. But I've run
>>>> out of new ideas for how to explain that ...
>>>>
>>>>> For example, if only the read and store operations are considered, it's
>>>>> obviously unsafe, since a read may unwittingly retrieve a store in progress.
>>>>>
>>>>> CPU 0                             | CPU 1
>>>>>                                   |
>>>>> /* vblank->count == 0 */          |
>>>>>                                   |
>>>>> drm_vblank_count_and_time()       | store_vblank(.., inc = 2, ...)
>>>>>                                   |
>>>>> cur_vblank <= LOAD vblank->count  |
>>>>>                                   | tslot = vblank->count + 2
>>>>>                                   | /* tslot == 2 */
>>>>>                                   | STORE vblanktime[0]
>>>>
>>>> This line here is wrong, it should be "STORE vblanktime[2]"
>>>>
>>>> The "STORE vblanktime[0]" happened way earlier, before 2 smp_wmb and the
>>>> previous updating of vblank->count.
>>>
>>> &vblanktime[0] == &vblanktime[2]
>>>
>>> That's why I keep trying to explain that you actually have to look at and
>>> understand the algorithm before blindly assuming local behavior is
>>> sufficient.
>>
>> Ok now I think I got it, the issue is when the array (which is only 2
>> elements big) wraps around. And that's racy because we don't touch the
>> increment before _and_ after the write side update. But that seems like a
>> bug that's always been there?
>
> I'm not sure if those conditions can actually occur; it's been a long time
> since I analyzed vblank timestamping.
>
They shouldn't occur under correct use. Normally one has to wrap any call to
drm_vblank_count() or drm_vblank_count_and_time() in a drm_vblank_get() ->
query -> drm_vblank_put() pair. Only drm_vblank_get() calls
drm_update_vblank_count() on a refcount 0 -> 1 transition if vblanks were
previously off, and only that function bumps the count by more than +1. Iow.,
the overflow case isn't executed in parallel with queries -> problem avoided.

A proper _get() -> query -> _put() sequence is used for queueing vblank
events, waiting for target vblank counts, and during pageflip completion.

The one exception would be Chris' recently proposed "lockless instant query"
patch, where a pure query is done - that's the patch that triggered Daniel's
cleanup patch. I was just about to ok that one, and testing with my timing
tests and under normal use didn't show problems. There
drm_vblank_count_and_time() is used outside a get/put protected path. I'm not
sure if some race there could happen under realistic conditions.

-mario