Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Mario Kleiner via amd-gfx
On Wed, Feb 13, 2019 at 5:03 PM Daniel Vetter  wrote:
>
> On Wed, Feb 13, 2019 at 4:46 PM Kazlauskas, Nicholas
>  wrote:
> >
> > On 2/13/19 10:14 AM, Daniel Vetter wrote:
> > > On Wed, Feb 13, 2019 at 3:33 PM Kazlauskas, Nicholas
> > >  wrote:
> > >>
> > >> On 2/13/19 4:50 AM, Daniel Vetter wrote:
> > >>> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> >  On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:
> > >
> > > On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
> > >  wrote:
> > >>
> > >> On 2/11/19 3:35 AM, Daniel Vetter wrote:
> > >>> On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
> >  The pageflip completion timestamps transmitted to userspace
> >  via pageflip completion events are supposed to describe the
> >  time at which the first pixel of the new post-pageflip scanout
> >  buffer leaves the video output of the gpu. This time is
> >  identical to end of vblank, when active scanout starts.
> > 
> >  For a crtc in standard fixed refresh rate, the end of vblank
> >  is identical to the vblank timestamps calculated by
> >  drm_update_vblank_count() at each vblank interrupt, or each
> >  vblank dis-/enable. Therefore pageflip events just carry
> >  that vblank timestamp as their pageflip timestamp.
> > 
> >  For a crtc switched to variable refresh rate mode (vrr), the
> >  pageflip completion timestamps are identical to the vblank
> >  timestamps iff the pageflip was executed early in vblank,
> >  before the minimum vblank duration elapsed. In this case
> >  the time of display onset is identical to when the crtc
> >  is running in fixed refresh rate.
> > 
> >  However, if a pageflip completes later in the vblank, inside
> >  the "extended front porch" in vrr mode, then the vblank will
> >  terminate at a fixed (back porch) duration after flip, so
> >  the display onset time is delayed correspondingly. In this
> >  case the vblank timestamp computed at vblank irq time would
> >  be too early, and we need a way to calculate an estimated
> >  pageflip timestamp that will be later than the vblank timestamp.
> > 
> >  How a driver determines such a "late flip" timestamp is hw
> >  and driver specific, but this patch adds a new helper function
> >  that allows the driver to propose such an alternate "late flip"
> >  timestamp for use in pageflip events:
> > 
> >  drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> > 
> >  When sending out pageflip events, we now compare that proposed
> >  flip_timestamp against the vblank timestamp of the current
> >  vblank of flip completion and choose to send out the greater/
> >  later timestamp as flip completion timestamp.
> > 
> >  The most simple way for a kms driver to supply a suitable
> >  flip_timestamp in vrr mode would be to simply take a timestamp
> >  at start of the pageflip completion handler, e.g., pageflip
> >  irq handler: flip_timestamp = ktime_get(); and then set that
> >  as proposed "late" alternative timestamp via ...
> >  drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> > 
> >  More clever approaches could try to add some corrective offset
> >  for fixed back porch duration, or ideally use hardware features
> >  like hw timestamps to calculate the exact end time of vblank.
> > 
> >  Signed-off-by: Mario Kleiner 
> >  Cc: Nicholas Kazlauskas 
> >  Cc: Harry Wentland 
> >  Cc: Alex Deucher 
> > >>>
> > >>> Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only 
> > >>> give us
> > >>> the right timestampe, once? With this I guess if you do a vblank 
> > >>> query in
> > >>> between the wrong and the right vblank you'll get the bogus value. 
> > >>> Not
> > >>> really great for userspace.
> > >>> -Daniel
> > >>
> > >> I think we calculate the timestamp and send the vblank event both 
> > >> within
> > >> the pageflip IRQ handler so calculating the right pageflip timestamp
> > >> once could probably be done. I'm not sure if it's easier than 
> > >> proposing
> > >> a later flip time with an API like this though.
> > >>
> > >> The actual scanout time should be known from the page-flip handler so
> > >> the semantics for VRR on/off remain the same. This is because the
> > >> page-flip triggers entering the back porch if we're in the extended
> > >> front porch.
> > >>
> > >> But scanout time from vblank events for something like
> > >> DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are 
> > >> only
> > >> 

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Mario Kleiner via amd-gfx
On Wed, Feb 13, 2019 at 10:50 AM Daniel Vetter  wrote:
>
> On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> > On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:
> > >
> > > On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
> > >  wrote:
> > > >
> > > > On 2/11/19 3:35 AM, Daniel Vetter wrote:
> > > > > On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
> > > > >> The pageflip completion timestamps transmitted to userspace
...
> > > > >
> > > > > Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only 
> > > > > give us
> > > > > the right timestampe, once? With this I guess if you do a vblank 
> > > > > query in
> > > > > between the wrong and the right vblank you'll get the bogus value. Not
> > > > > really great for userspace.
> > > > > -Daniel
> > > >
> > > > I think we calculate the timestamp and send the vblank event both within
> > > > the pageflip IRQ handler so calculating the right pageflip timestamp
> > > > once could probably be done. I'm not sure if it's easier than proposing
> > > > a later flip time with an API like this though.
> > > >
> > > > The actual scanout time should be known from the page-flip handler so
> > > > the semantics for VRR on/off remain the same. This is because the
> > > > page-flip triggers entering the back porch if we're in the extended
> > > > front porch.
> > > >
> > > > But scanout time from vblank events for something like
> > > > DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only
> > > > treated as estimates. If we're in the regular front porch then the
> > > > timing to scanout is based on the fixed duration front porch for the
> > > > current mode. If we're in the extended back porch then it's technically
> > > > driver defined but the most reasonable guess is to assume that the front
> > > > porch is going to end at any moment, so just return the length of the
> > > > back porch for getting the scanout time.
> > > >
> > > > Proposing the late timestamp shouldn't affect vblank event in the
> > > > DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip
> > > > event case. I'm not sure if that's what's guaranteed to happen with this
> > > > patch though. There doesn't seem to be any locking on either
> > > > dev->vblank_time_lock or the vblank->seqlock so while it's likely to get
> > > > the same vblank event back as the one just stored I don't think it's
> > > > guaranteed.
> > >
> > > That's the inconsistency I mean to highlight - the timestamp for the
> > > same frame as observed through flip complete and through the
> > > wait_vblank ioctl can differ. Which they really shouldn't.
> > >
> >
> > Ideally they shouldn't differ. The kernel docs for drm_crtc_state say
> > that vblank and pageflip timestamps should always match. But then the
> > kernel docs for "Variable refresh properties" in drm_connector.c for
> > vblank timestamps were changed for the VRR implementation in Linux
> > 5.0-rc to redefine them when in VRR mode. They are defined, but
> > probably rather useless for any practical purpose, like this:
> >
> > "The semantics for the vertical blank timestamp differ when
> > variable refresh rate is active. The vertical blank timestamp
> > is defined to be an estimate using the current mode's fixed
> > refresh rate timings. The semantics for the page-flip event
> > timestamp remain the same."
>
> Uh I missed that. That sounds like nonsense tbh.
>
> > So our docs contradict each other as of Linux 5.0-rc. Certainly having
> > useful vblank timetamps would be useful.
>
> Yup, imo vblank should still match page_flip. Otherwise I expect a lot of
> hilarity will ensue.
>
> > > Now added complication is that amdgpu sends out vblank events really
> > > early, which is used by userspace to do frontbuffer rendering in the
> > > vblank time. But I don't think anyone wants to do both VRR and
> >
> > I think all kms drivers try to call drm_crtc_handle_vblank() at start
> > of vblank to give Mesa the most time for frontbuffer rendering for
> > classic X. But vblank events are also used for scheduling bufferswaps
> > or other stuff for redirected windowed rendering, or via api's like
> > OML_sync_controls glXWaitForMscOML, so there might be other things
> > affected by a more delayed vblank handling.
>
> The frontbuffer rendering is very much X driver specific, and I think
> -amdgpu/radeon is the only one that requires this. No i915 driver ever
> used the vblank interrupt to schedule frontbuffer blits, we use some
> CS-side stalls.
>
> Wrt scheduling pageflips: The rule is that right after you've received the
> vblank for frame X, then an immediately schedule pageflip should hit X+1,
> but not X. amdgpu had this broken, but it's fixed since a while.
>
> > > frontbuffer rendering, hence I think it should be possible to create
> > > correct vblank timestamps for the VRR case, while leaving the current
> > > logic in place for everything else. But that means moving the entire
> > > vblank 

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-13 Thread Mario Kleiner via amd-gfx
On Wed, Feb 13, 2019 at 10:56 AM Chris Wilson  wrote:
>
> Quoting Daniel Vetter (2019-02-13 09:50:55)
> > On Tue, Feb 12, 2019 at 10:32:31PM +0100, Mario Kleiner wrote:
> > > I think all kms drivers try to call drm_crtc_handle_vblank() at start
> > > of vblank to give Mesa the most time for frontbuffer rendering for
> > > classic X. But vblank events are also used for scheduling bufferswaps
> > > or other stuff for redirected windowed rendering, or via api's like
> > > OML_sync_controls glXWaitForMscOML, so there might be other things
> > > affected by a more delayed vblank handling.
> >
> > The frontbuffer rendering is very much X driver specific, and I think
> > -amdgpu/radeon is the only one that requires this. No i915 driver ever
> > used the vblank interrupt to schedule frontbuffer blits, we use some
> > CS-side stalls.
>
> Fwiw, the Present midlayer does use vblank scheduling for inplace copy
> updates. Not that I wish to encourage anyone to use frontbuffer
> rendering.
> -Chris

Yes, that's what i meant. Under DRI2 at least AMD, Intel and nouveau
have throttling based on CS stalls to avoid tearing and do throttling.
DRI3/Present last time i checked just waited for a vblank event and
then triggered the blit - something that causes tearing even when
triggered at start of vblank.

-mario
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 2/3] drm: Add basic helper to allow precise pageflip timestamps in vrr.

2019-02-12 Thread Mario Kleiner via amd-gfx
On Mon, Feb 11, 2019 at 6:04 PM Daniel Vetter  wrote:
>
> On Mon, Feb 11, 2019 at 4:01 PM Kazlauskas, Nicholas
>  wrote:
> >
> > On 2/11/19 3:35 AM, Daniel Vetter wrote:
> > > On Mon, Feb 11, 2019 at 04:22:24AM +0100, Mario Kleiner wrote:
> > >> The pageflip completion timestamps transmitted to userspace
> > >> via pageflip completion events are supposed to describe the
> > >> time at which the first pixel of the new post-pageflip scanout
> > >> buffer leaves the video output of the gpu. This time is
> > >> identical to end of vblank, when active scanout starts.
> > >>
> > >> For a crtc in standard fixed refresh rate, the end of vblank
> > >> is identical to the vblank timestamps calculated by
> > >> drm_update_vblank_count() at each vblank interrupt, or each
> > >> vblank dis-/enable. Therefore pageflip events just carry
> > >> that vblank timestamp as their pageflip timestamp.
> > >>
> > >> For a crtc switched to variable refresh rate mode (vrr), the
> > >> pageflip completion timestamps are identical to the vblank
> > >> timestamps iff the pageflip was executed early in vblank,
> > >> before the minimum vblank duration elapsed. In this case
> > >> the time of display onset is identical to when the crtc
> > >> is running in fixed refresh rate.
> > >>
> > >> However, if a pageflip completes later in the vblank, inside
> > >> the "extended front porch" in vrr mode, then the vblank will
> > >> terminate at a fixed (back porch) duration after flip, so
> > >> the display onset time is delayed correspondingly. In this
> > >> case the vblank timestamp computed at vblank irq time would
> > >> be too early, and we need a way to calculate an estimated
> > >> pageflip timestamp that will be later than the vblank timestamp.
> > >>
> > >> How a driver determines such a "late flip" timestamp is hw
> > >> and driver specific, but this patch adds a new helper function
> > >> that allows the driver to propose such an alternate "late flip"
> > >> timestamp for use in pageflip events:
> > >>
> > >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> > >>
> > >> When sending out pageflip events, we now compare that proposed
> > >> flip_timestamp against the vblank timestamp of the current
> > >> vblank of flip completion and choose to send out the greater/
> > >> later timestamp as flip completion timestamp.
> > >>
> > >> The most simple way for a kms driver to supply a suitable
> > >> flip_timestamp in vrr mode would be to simply take a timestamp
> > >> at start of the pageflip completion handler, e.g., pageflip
> > >> irq handler: flip_timestamp = ktime_get(); and then set that
> > >> as proposed "late" alternative timestamp via ...
> > >> drm_crtc_set_vrr_pageflip_timestamp(crtc, flip_timestamp);
> > >>
> > >> More clever approaches could try to add some corrective offset
> > >> for fixed back porch duration, or ideally use hardware features
> > >> like hw timestamps to calculate the exact end time of vblank.
> > >>
> > >> Signed-off-by: Mario Kleiner 
> > >> Cc: Nicholas Kazlauskas 
> > >> Cc: Harry Wentland 
> > >> Cc: Alex Deucher 
> > >
> > > Uh, this looks like a pretty bad hack. Can't we fix amdgpu to only give us
> > > the right timestampe, once? With this I guess if you do a vblank query in
> > > between the wrong and the right vblank you'll get the bogus value. Not
> > > really great for userspace.
> > > -Daniel
> >
> > I think we calculate the timestamp and send the vblank event both within
> > the pageflip IRQ handler so calculating the right pageflip timestamp
> > once could probably be done. I'm not sure if it's easier than proposing
> > a later flip time with an API like this though.
> >
> > The actual scanout time should be known from the page-flip handler so
> > the semantics for VRR on/off remain the same. This is because the
> > page-flip triggers entering the back porch if we're in the extended
> > front porch.
> >
> > But scanout time from vblank events for something like
> > DRM_IOCTL_WAIT_VBLANK are going to be wrong in most cases and are only
> > treated as estimates. If we're in the regular front porch then the
> > timing to scanout is based on the fixed duration front porch for the
> > current mode. If we're in the extended back porch then it's technically
> > driver defined but the most reasonable guess is to assume that the front
> > porch is going to end at any moment, so just return the length of the
> > back porch for getting the scanout time.
> >
> > Proposing the late timestamp shouldn't affect vblank event in the
> > DRM_IOCTL_WAIT_VBLANK case and should only be used in the page-flip
> > event case. I'm not sure if that's what's guaranteed to happen with this
> > patch though. There doesn't seem to be any locking on either
> > dev->vblank_time_lock or the vblank->seqlock so while it's likely to get
> > the same vblank event back as the one just stored I don't think it's
> > guaranteed.
>
> That's the inconsistency I mean to highlight - the timestamp for the
> same frame as 

Re: [PATCH] drm/amd/display: Use vrr friendly pageflip throttling in DC.

2019-02-12 Thread Mario Kleiner via amd-gfx
Oops, dropped the mailing ist from my reply, so again...

On Mon, Feb 11, 2019 at 4:01 PM Michel Dänzer  wrote:
>
> On 2019-02-09 7:52 a.m., Mario Kleiner wrote:
> > In VRR mode, keep track of the vblank count of the last
> > completed pageflip in amdgpu_crtc->last_flip_vblank, as
> > recorded in the pageflip completion handler after each
> > completed flip.
> >
> > Use that count to prevent mmio programming a new pageflip
> > within the same vblank in which the last pageflip completed,
> > iow. to throttle pageflips to at most one flip per video
> > frame, while at the same time allowing to request a flip
> > not only before start of vblank, but also anywhere within
> > vblank.
> >
> > The old logic did the same, and made sense for regular fixed
> > refresh rate flipping, but in vrr mode it prevents requesting
> > a flip anywhere inside the possibly huge vblank, thereby
> > reducing framerate in vrr mode instead of improving it, by
> > delaying a slightly delayed flip requests up to a maximum
> > vblank duration + 1 scanout duration. This would limit VRR
> > usefulness to only help applications with a very high GPU
> > demand, which can submit the flip request before start of
> > vblank, but then have to wait long for fences to complete.
> >
> > With this method a flip can be both requested and - after
> > fences have completed - executed, ie. it doesn't matter if
> > the request (amdgpu_dm_do_flip()) gets delayed until deep
> > into the extended vblank due to cpu execution delays. This
> > also allows clients which want to regulate framerate within
> > the vrr range a much more fine-grained control of flip timing,
> > a feature that might be useful for video playback, and is
> > very useful for neuroscience/vision research applications.
> >
> > In regular non-VRR mode, retain the old flip submission
> > behavior. This to keep flip scheduling for fullscreen X11/GLX
> > OpenGL clients intact, if they use the GLX_OML_sync_control
> > extensions glXSwapBufferMscOML(, ..., target_msc,...) function
> > with a specific target_msc target vblank count.
> >
> > glXSwapBuffersMscOML() or DRI3/Present PresentPixmap() will
> > not flip at the proper target_msc for a non-zero target_msc
> > if VRR mode is active with this patch. They'd often flip one
> > frame too early. However, this limitation should not matter
> > much in VRR mode, as scheduling based on vblank counts is
> > pretty futile/unusable under variable refresh duration
> > anyway, so no real extra harm is done.
> >
> > According to some testing already done with this patch by
> > Nicholas on top of my tests, IGT tests didn't report any
> > problems. If fixes stuttering and flickering when flipping
> > at rates below the minimum vrr refresh rate.
> >
> > Fixes: bb47de736661 ("drm/amdgpu: Set FreeSync state using drm VRR
> > properties")
> > Signed-off-by: Mario Kleiner 
> > Cc: 
> > Cc: Nicholas Kazlauskas 
> > Cc: Harry Wentland 
> > Cc: Alex Deucher 
> > Cc: Michel Dänzer 
>
> I wonder if this couldn't be solved in a simpler / cleaner way by making
> use of the target MSC passed to the page_flip_target hook.
>
If DisplayCore would implement the page_flip_target hook, one could do
the same implementation in userspace, ie. tracking msc of last
completed flip and setting target msc accordingly to throttle. But i
don't think we'd be better of with that. Same solution, but now we'd
have to let userspace know if the crtc is currently in VRR active mode
or not. And implement page_flip_target hook in DC. And implement the
tracking logic in every userspace driver that wants good performance
from VRR, e.g., also the modesetting ddx and all wayland compositors
in addition to amdgpu ddx.

What i'd like to have in the long term would be a way to pass a target
presentation time for VRR instead of a target msc.

>
> --
> Earthling Michel Dänzer   |   http://www.amd.com
> Libre software enthusiast | Mesa and X developer
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH] drm/amd/display: Use vrr friendly pageflip throttling in DC.

2019-02-12 Thread Mario Kleiner via amd-gfx
On Mon, Feb 11, 2019 at 7:44 PM Kazlauskas, Nicholas
 wrote:
>
> On 2/11/19 10:01 AM, Michel Dänzer wrote:
> > On 2019-02-09 7:52 a.m., Mario Kleiner wrote:
> >> In VRR mode, keep track of the vblank count of the last
> >> completed pageflip in amdgpu_crtc->last_flip_vblank, as
> >> recorded in the pageflip completion handler after each
> >> completed flip.
...
> >
> > I wonder if this couldn't be solved in a simpler / cleaner way by making
> > use of the target MSC passed to the page_flip_target hook.
> >
> >
>
> I'm not sure if this has a good userspace solution.
>
> I think the problem with all of this is the difference between the
> hardware vblank and the DRM vblank.
>
> The hardware vblank counter wraps around at the front porch, DRM wraps
> around at the back porch. The behavior in amdgpu_get_vblank_counter_kms
> is to return the hardware count if vpos < 0 and return the
> hardware_count + 1 if vpos >= 0.
>

Small correction: It's actually the other way around. DRM wraps at
start of vblank ~ start of front porch, whereas the hw counter wraps
at start of back porch. I think that removes the caveat you mention
below, if i understand what you meant.

One way i tested if this patch does what i expect it to do was do
flips (glXSwapbuffers calls) with different durations between calls,
and continuously change that wait time in between calls to see if the
delta between consecutive flip timestamps follows those wait times,
ie. without any big discontinous "jumps" inbetween.

> The value for vpos being used here is from
> amdgpu_display_vblank_get_counter which returns a value < 0 if we're in
> vblank and >= 0 if we're in scanout. The value returned in the extended
> front porch will always be considered -vbl_end so it'll never be
> considered within scanout.
>

Another small correction: Afaik we don't actually use the sign of vpos
for "in vblank" detection anywhere, but instead the returned status
flag. Which is good, given that patch 1/3 of that other vrr fixes
series slightly change the meaning of the sign in vrr mode.

-mario

> The target vblank for the "wait for vblank" logic is
> amdgpu_get_vblank_counter_kms(...) + 1.
>
> The "wait for vblank" logic waits while we're within vblank and if the
> target - amdgpu_get_vblank_counter_kms(...) > 0.
>
> There are 3 cases that can happen here:
> (1) The wait starts in the same vblank interval as the last flip
> (2) The wait starts in scanout
> (3) The wait starts in the next vblank interval as the last flip
>
> In case (1) the wait breaks as soon as we enter scanout.
> In case (2) the wait breaks immediately.
> In case (3) we wait for the next scanout.
>
> By DRM logic the programming should be happening at the start of the
> back porch, but it doesn't really matter since it's both targeting the
> right scanout for cases (1) and (2).
>
> However, for case (3) we have an issue. If the wait starts within vblank
> front porch then it's not going to be programmed for the upcoming
> scanout, but the one after that.
>
> This is the case that's trying to be addressed with the patch - with a
> caveat.
>
> If the last_flip_vblank != drm_crtc_vblank_count then the target_vblank
> should be < amdgpu_get_vblank_counter_kms which would cause the wait to
> exit immediately. If we're in the back porch of the vblank in this
> instance then we should be waiting until scanout to begin programming
> (and target the scanout after next).
>
> Nicholas Kazlauskas
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx