On Mon, Feb 10, 2014 at 3:53 AM, Pekka Paalanen <ppaala...@gmail.com
<mailto:ppaala...@gmail.com>> wrote:
On Sat, 8 Feb 2014 15:23:29 -0600
Jason Ekstrand <ja...@jlekstrand.net
<mailto:ja...@jlekstrand.net>> wrote:
> Pekka,
> First off, I think you've done a great job over-all. I think it
will both
> cover most cases and work well I've got a few comments below.
Thank you for the review. :-)
Replies below.
> On Thu, Jan 30, 2014 at 9:35 AM, Pekka Paalanen
<ppaala...@gmail.com <mailto:ppaala...@gmail.com>> wrote:
>
> > Hi,
> >
> > it's time for a take two on the Wayland presentation extension.
> >
> >
> > 1. Introduction
> >
> > The v1 proposal is here:
> >
> >
http://lists.freedesktop.org/archives/wayland-devel/2013-October/011496.html
> >
> > In v2 the basic idea is the same: you can queue frames with a
> > target presentation time, and you can get accurate presentation
> > feedback. All the details are new, though. The re-design started
> > from the wish to handle resizing better, preferably without
> > clearing the buffer queue.
> >
> > All the changed details are probably too much to describe here,
> > so it is maybe better to look at this as a new proposal. It
> > still does build on Frederic's work, and everyone who commented
> > on it. Special thanks to Axel Davy for his counter-proposal and
> > fighting with me on IRC. :-)
> >
> > Some highlights:
> >
> > - Accurate presentation feedback is possible also without
> > queueing.
> >
> > - You can queue also EGL-based rendering, and get presentation
> > feedback if you want. Also EGL can do this internally, too, as
> > long as EGL and the app do not try to use queueing at the
same time.
> >
> > - More detailed presentation feedback to better allow predicting
> > future display refreshes.
> >
> > - If wl_viewport is used, neither video resolution changes nor
> > surface (window) size changes alone require clearing the queue.
> > Video can continue playing even during resizes.
...
> > <interface name="presentation" version="1">
> > <description summary="timed presentation related
wl_surface requests">
> > The main features of this interface are accurate
presentation
> > timing feedback, and queued wl_surface content updates
to ensure
> > smooth video playback while maintaining audio/video
> > synchronization. Some features use the concept of a
presentation
> > clock, which is defined in presentation.clock_id event.
> >
> > Requests 'feedback' and 'queue' can be regarded as
additional
> > wl_surface methods. They are part of the double-buffered
> > surface state update mechanism, where other requests
first set
> > up the state and then wl_surface.commit atomically
applies the
> > state into use. In other words, wl_surface.commit submits a
> > content update.
> >
> > Interface wl_surface has requests to set surface related
state
> > and buffer related state, because there is no separate
interface
> > for buffer state alone. Queueing requires separating the
surface
> > from buffer state, and buffer state can be queued while
surface
> > state cannot.
> >
> > Buffer state includes the wl_buffer from
wl_surface.attach, the
> > state assigned by wl_surface requests frame,
> > set_buffer_transform and set_buffer_scale, and any
> > buffer-related state from extensions, for instance
> > wl_viewport.set_source. This state is inherent to the buffer
> > and the content update, rather than the surface.
> >
> > Surface state includes all other state associated with
> > wl_surfaces, like the x,y arguments of
wl_surface.attach, input
> > and opaque regions, damage, and extension state like
> > wl_viewport.destination. In general, anything expressed in
> > surface local coordinates is better as surface state.
> >
> > The standard way of posting new content to a surface
using the
> > wl_surface requests damage, attach, and commit is called
> > immediate content submission. This happens when a
> > presentation.queue request has not been sent since the last
> > wl_surface.commit.
> >
> > The new way of posting a content update is a queued content
> > update submission. This happens on a wl_surface.commit
when a
> > presentation.queue request has been sent since the last
> > wl_surface.commit.
> >
> > Queued content updates do not get applied immediately in the
> > compositor but are pushed to a queue on receiving the
> > wl_surface.commit. The queue is ordered by the
submission target
> > timestamp. Each item in the queue contains the
wl_buffer, the
> > target timestamp, and all the buffer state as defined
above. All
> > the queued state is taken from the pending wl_surface
state at
> > the time of the commit, exactly like an immediate commit
would
> > have taken it.
> >
> > For instance on a queueing commit, the pending buffer is
queued
> > and no buffer is pending afterwards. The stored values
of the
> > x,y parameters of wl_surface.attach are reset to zero,
but they
> > also are not queued; queued content updates do not carry the
> > attach offsets. All other surface state (that is not
queued),
> > e.g. damage, is not applied nor reset.
> >
> > Issuing a queueing commit without a wl_surface.attach is
> > undefined. However, queueing a commit with explicitly
attached
> > NULL wl_buffer works; when and if the content update is
> > executed, the surface content is removed as defined for
> > wl_surface.attach.
> >
> > If a queued content update has been submitted, and the
wl_buffer
> > used in the update is destroyed before the wl_buffer.release
> > event, the results are undefined. The compositor may or
may not
> > have executed the update, therefore the surface contents
become
> > undefined as explained in wl_surface.attach. Whether any
> > presentation feedback or frame callbacks occur is undefined.
> >
> > For each surface, the compositor maintains an
association to a
> > single output that is considered as the main output for the
> > surface. Queued content updates are synchronized to the
> > surface's main output, to provide a consistent and
meaningful
> > definition of the moment the update is displayed to the
user.
> > When a compositor updates an output, it processes only the
> > queues of the surfaces whose main output is the one being
> > updated. The queues of other surfaces, even if they are
part of
> > the redrawing, are not processed at that time.
> >
> > When a compositor chooses to update an output, it must
predict
> > the presentation clock value when the display update
will occur.
> > For the definition of the moment of display update, see
> > presentation_feedback.presented. Therefore if the
prediction is
> > absolutely perfect, presentation_feedback.presented will
carry
> > the same clock value.
> >
> > For each surface with queued content updates and
matching main
> > output, the compositor picks the update with the highest
> > timestamp no later than a half frame period after the
predicted
> > presentation time. The intent is to pick the content update
> > whose target timestamp as rounded to the output refresh
period
> > granularity matches the same display update as the
compositor is
> > targeting, while not displaying any content update more
than a
> >
>
> I'm not really following 100% here. It's not your fault, this is
just a
> terribly awkward sort of thing to try and put into English. It
sounds to
> me like the following: If P0 is the time of the next present and
P1 is the
> time of the one after that, you look for the largest thing less
than the
> average of P1 and P2. Is this correct? Why go for the average?
The
> client is going to have to adjust anyway.
>
>
> > half frame period too early. If all the updates in the
queue are
> > already late, the highest timestamp update is taken
regardless
> > of how late it is. Once an update in a queue has been
chosen,
> > all remaining updates with an earlier timestamp in the
queue are
> > discarded.
> >
>
> Ok, I think what you are saying works. Again, it's difficult to
parse but
> these things always are.
>
Yes, it is hard to write a generic algorithm in English. Axel did a
nice job clarifying it. I hope I can improve on the language after I
have actually implemented this and any possible changes we need to
this.
Also, the inline documentation in the XML file is getting a bit out of
hand, lacking in expressional power. I would have liked to use
sub-headings, the algorithm could use pseudo-code, etc, but they just
don't really exist here. Yet, I want these things to be part of the
protocol spec, so the semantics of the protocol get properly defined.
> > 4.5. The frame callback and swap interval
> >
> > The frame callback needs to be with the buffer state, so it gets
> > queued. If a client makes e.g. EGL's commits queued, EGL may
> > still rely on frame callbacks for blocking apps properly, and
> > that is related to presenting the buffer, not just the very next
> > output refresh. EGL may also internally use queueing and
> > feedback to implement swap interval > 1.
> >
>
> Doesn't this mean that you need eglSwapInterval(0) if you're
queueing?
> This is probably the case anyway, but it might be worth noting
explicitly.
> I think what you're doing with frame callbacks is sane, but I'm
not sure.
Yeah, swapinterval zero is needed indeed. Personally I would be more
worried about whether an EGL implementation agrees to allocate new
buffers if the app is queueing in advance. I suspect queueing many
frames in advance won't work with EGL in practice.
But you can still queue a frame at a time, that might be enough for
e.g. GL-based video players under good conditions. That might not need
swapinterval zero, either.
> My one latent concern is that I still don't think we're entirely
handling
> the case that QtQuick wants. What they want is to do their
rendering a few
> frames in advance in case of CPU/GPU jitter. Technically, this
extension
> handles this by the client simply doing a good job of guessing
presentation
> times on a one-per-frame baseis. However, it doesn't allow for
any damage
> tracking. In the case of QtQuick they want a linear queue of
buffers where
> no buffer ever gets skipped. In this case, you could do damage
tracking by
> allowing it to accumulate from one frame to another and you get
all of the
> damage-tracking advantages that you had before. I'm not sure
how much this
> matters, but it might be worth thinking about it.
Does it really want to display *every* frame regardless of time? It
doesn't matter that if a deadline is missed, the animation slows down
rather than jumps to keep up with intended velocity?
That is my understanding of how it works now. I *think* they figure
the compositor isn't the bottle-kneck and that it will git its 60 FPS.
That said, I don't actually work on QtQuick. I'm just trying to make
sure they don't get completely left out in the cold.
Axel has a good point, cannot this be just done client side and
immediate updates based on frame callbacks?
Probably not. They're using GLES and EGL so they can't draw early
and just stash the buffer.