On 11/25/2015 08:36 PM, Ville Syrjälä wrote:
> On Wed, Nov 25, 2015 at 08:04:26PM +0100, Mario Kleiner wrote:
>> On 11/25/2015 06:46 PM, Ville Syrjälä wrote:

...

>> Attached is my current patch i wanted to submit for the drm core's
>> drm_update_vblank_count(). I think it's good to make the core somewhat
>> robust against potential kms driver bugs or glitches. But if you
>> wouldn't like that patch, there wouldn't be much of a point sending it
>> out at all.
>>
>> thanks,
>> -mario
>>
>
>> >From 2d5d58a1c575ad002ce2cb643f395d0e4757d959 Mon Sep 17 00:00:00 2001
>> From: Mario Kleiner <mario.kleiner.de at gmail.com>
>> Date: Wed, 25 Nov 2015 18:48:31 +0100
>> Subject: [PATCH] drm/irq: Make drm_update_vblank_count() more robust.
>>
>> The changes to drm_update_vblank_count() for Linux 4.4-rc
>> made the function more fragile wrt. some hw quirks. E.g.,
>> at dev->driver->enable_vblank(), AMD gpu's fire a spurious
>> redundant vblank irq shortly after enabling vblank irqs, not
>> locked to vblank. This causes a redundant call which needs
>> to be suppressed to avoid miscounting.
>>
>> To increase robustness, shuffle things around a bit:
>>
>> On drivers with high precision vblank timestamping always
>> evaluate the timestamp difference between current timestamp
>> and previously recorded timestamp to detect such redundant
>> invocations and no-op in that case.
>>
>> Also detect and warn about timestamps going backwards to
>> catch potential kms driver bugs.
>>
>> This patch is meant for Linux 4.4-rc and later.
>>
>> Signed-off-by: Mario Kleiner <mario.kleiner.de at gmail.com>
>> ---
>>   drivers/gpu/drm/drm_irq.c | 53 
>> ++++++++++++++++++++++++++++++++++++-----------
>>   1 file changed, 41 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
>> index 819b8c1..8728c3c 100644
>> --- a/drivers/gpu/drm/drm_irq.c
>> +++ b/drivers/gpu/drm/drm_irq.c
>> @@ -172,9 +172,11 @@ static void drm_update_vblank_count(struct drm_device 
>> *dev, unsigned int pipe,
>>                                  unsigned long flags)
>>   {
>>      struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
>> -    u32 cur_vblank, diff;
>> +    u32 cur_vblank, diff = 0;
>>      bool rc;
>>      struct timeval t_vblank;
>> +    const struct timeval *t_old;
>> +    u64 diff_ns;
>>      int count = DRM_TIMESTAMP_MAXRETRIES;
>>      int framedur_ns = vblank->framedur_ns;
>>
>> @@ -195,13 +197,15 @@ static void drm_update_vblank_count(struct drm_device 
>> *dev, unsigned int pipe,
>>              rc = drm_get_last_vbltimestamp(dev, pipe, &t_vblank, flags);
>>      } while (cur_vblank != dev->driver->get_vblank_counter(dev, pipe) && 
>> --count > 0);
>>
>> -    if (dev->max_vblank_count != 0) {
>> -            /* trust the hw counter when it's around */
>> -            diff = (cur_vblank - vblank->last) & dev->max_vblank_count;
>> -    } else if (rc && framedur_ns) {
>> -            const struct timeval *t_old;
>> -            u64 diff_ns;
>> -
>> +    /*
>> +     * Always use vblank timestamping based method if supported to reject
>> +     * redundant vblank irqs. E.g., AMD hardware needs this to not screw up
>> +     * due to some irq handling quirk.
>> +     *
>> +     * This also sets the diff value for use as fallback below in case the
>> +     * hw does not support a suitable hw vblank counter.
>> +     */
>> +    if (rc && framedur_ns) {
>
> If you fudged everything properly why do you still need this? With
> working hw counter there should be no need to do this stuff.
>

As far as testing on one DCE4 card goes, i don't need it anymore with my 
fudged hw counters and timestamps. The fudging so far seems to work 
nicely. I just wanted to have a bit of extra robustness and a bit of 
extra available debug output against future or other broken drivers, or 
mistakes in fudging on the current driver, e.g., against things like 
timestamps going backwards. Especially since i can only test on two AMD 
cards atm., quite a limited sample. There are 3 display engine 
generations before and 5 generations after my test sample.

-mario


>>              t_old = &vblanktimestamp(dev, pipe, vblank->count);
>>              diff_ns = timeval_to_ns(&t_vblank) - timeval_to_ns(t_old);
>>
>> @@ -212,11 +216,36 @@ static void drm_update_vblank_count(struct drm_device 
>> *dev, unsigned int pipe,
>>               */
>>              diff = DIV_ROUND_CLOSEST_ULL(diff_ns, framedur_ns);
>>
>> -            if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ)
>> +            /* Catch driver timestamping bugs and prevent worse. */
>> +            if ((s32) diff < 0) {
>> +                    DRM_DEBUG_VBL("crtc %u: Timestamp going backward!"
>> +                    " diff_ns = %lld, framedur_ns = %d)\n",
>> +                    pipe, (long long) diff_ns, framedur_ns);
>> +                    return;
>> +            }
>> +
>> +            if (diff == 0 && flags & DRM_CALLED_FROM_VBLIRQ) {
>>                      DRM_DEBUG_VBL("crtc %u: Redundant vblirq ignored."
>> -                                  " diff_ns = %lld, framedur_ns = %d)\n",
>> -                                  pipe, (long long) diff_ns, framedur_ns);
>> -    } else {
>> +                    " diff_ns = %lld, framedur_ns = %d)\n",
>> +                    pipe, (long long) diff_ns, framedur_ns);
>> +
>> +                    /* Treat this redundant invocation as no-op. */
>> +                    WARN_ON_ONCE(cur_vblank != vblank->last);
>> +                    return;
>> +            }
>> +    }
>> +
>> +    /*
>> +     * hw counters, if marked as supported via max_vblank_count != 0,
>> +     * *must* increment at leading edge of vblank or at least before
>> +     * the firing of the hw vblank irq, otherwise we have a race here
>> +     * between gpu and us, where small variations in our execution
>> +     * timing can cause off-by-one miscounting of vblanks.
>> +     */
>> +    if (dev->max_vblank_count != 0) {
>> +            /* trust the hw counter when it's around */
>> +            diff = (cur_vblank - vblank->last) & dev->max_vblank_count;
>> +    } else if (!(rc && framedur_ns)) {
>>              /* some kind of default for drivers w/o accurate vbl 
>> timestamping */
>>              diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0;
>>      }
>> --
>> 1.9.1
>>
>
>

Reply via email to