Re: [RFC v2] Wayland presentation extension (video protocol)

Axel Davy Mon, 10 Feb 2014 08:31:02 -0800

On 10/02/2014, Jason Ekstrand wrote:

On Mon, Feb 10, 2014 at 3:53 AM, Pekka Paalanen <ppaala...@gmail.com> wrote:


    On Sat, 8 Feb 2014 15:23:29 -0600
    Jason Ekstrand <ja...@jlekstrand.net> wrote:

    > Pekka,
    > First off, I think you've done a great job overall.  I think it will both
    > cover most cases and work well.  I've got a few comments below.

    Thank you for the review. :-)
    Replies below.

    > On Thu, Jan 30, 2014 at 9:35 AM, Pekka Paalanen <ppaala...@gmail.com> wrote:
    >
    > > Hi,
    > >
    > > it's time for a take two on the Wayland presentation extension.
    > >
    > >
    > >                 1. Introduction
    > >
    > > The v1 proposal is here:
    > >
    > >
    > > http://lists.freedesktop.org/archives/wayland-devel/2013-October/011496.html
    > >
    > > In v2 the basic idea is the same: you can queue frames with a
    > > target presentation time, and you can get accurate presentation
    > > feedback. All the details are new, though. The re-design started
    > > from the wish to handle resizing better, preferably without
    > > clearing the buffer queue.
    > >
    > > All the changed details are probably too much to describe here,
    > > so it is maybe better to look at this as a new proposal. It
    > > still does build on Frederic's work, and everyone who commented
    > > on it. Special thanks to Axel Davy for his counter-proposal and
    > > fighting with me on IRC. :-)
    > >
    > > Some highlights:
    > >
    > > - Accurate presentation feedback is possible also without
    > >   queueing.
    > >
    > > - You can queue also EGL-based rendering, and get presentation
    > >   feedback if you want. Also EGL can do this internally, too, as
    > >   long as EGL and the app do not try to use queueing at the same time.
    > >
    > > - More detailed presentation feedback to better allow predicting
    > >   future display refreshes.
    > >
    > > - If wl_viewport is used, neither video resolution changes nor
    > >   surface (window) size changes alone require clearing the queue.
    > >   Video can continue playing even during resizes.
    ...
    > >   <interface name="presentation" version="1">
    > >     <description summary="timed presentation related wl_surface requests">
    > >       The main features of this interface are accurate presentation
    > >       timing feedback, and queued wl_surface content updates to ensure
    > >       smooth video playback while maintaining audio/video
    > >       synchronization. Some features use the concept of a presentation
    > >       clock, which is defined in presentation.clock_id event.
    > >
    > >       Requests 'feedback' and 'queue' can be regarded as additional
    > >       wl_surface methods. They are part of the double-buffered
    > >       surface state update mechanism, where other requests first set
    > >       up the state and then wl_surface.commit atomically applies the
    > >       state into use. In other words, wl_surface.commit submits a
    > >       content update.
    > >
    > >       Interface wl_surface has requests to set surface related state
    > >       and buffer related state, because there is no separate interface
    > >       for buffer state alone. Queueing requires separating the surface
    > >       from buffer state, and buffer state can be queued while surface
    > >       state cannot.
    > >
    > >       Buffer state includes the wl_buffer from wl_surface.attach, the
    > >       state assigned by wl_surface requests frame,
    > >       set_buffer_transform and set_buffer_scale, and any
    > >       buffer-related state from extensions, for instance
    > >       wl_viewport.set_source. This state is inherent to the buffer
    > >       and the content update, rather than the surface.
    > >
    > >       Surface state includes all other state associated with
    > >       wl_surfaces, like the x,y arguments of wl_surface.attach, input
    > >       and opaque regions, damage, and extension state like
    > >       wl_viewport.destination. In general, anything expressed in
    > >       surface local coordinates is better as surface state.
    > >
    > >       The standard way of posting new content to a surface using the
    > >       wl_surface requests damage, attach, and commit is called
    > >       immediate content submission. This happens when a
    > >       presentation.queue request has not been sent since the last
    > >       wl_surface.commit.
    > >
    > >       The new way of posting a content update is a queued content
    > >       update submission. This happens on a wl_surface.commit when a
    > >       presentation.queue request has been sent since the last
    > >       wl_surface.commit.
    > >
    > >       Queued content updates do not get applied immediately in the
    > >       compositor but are pushed to a queue on receiving the
    > >       wl_surface.commit. The queue is ordered by the submission target
    > >       timestamp. Each item in the queue contains the wl_buffer, the
    > >       target timestamp, and all the buffer state as defined above. All
    > >       the queued state is taken from the pending wl_surface state at
    > >       the time of the commit, exactly like an immediate commit would
    > >       have taken it.
    > >
    > >       For instance on a queueing commit, the pending buffer is queued
    > >       and no buffer is pending afterwards. The stored values of the
    > >       x,y parameters of wl_surface.attach are reset to zero, but they
    > >       also are not queued; queued content updates do not carry the
    > >       attach offsets. All other surface state (that is not queued),
    > >       e.g. damage, is not applied nor reset.
    > >
    > >       Issuing a queueing commit without a wl_surface.attach is
    > >       undefined. However, queueing a commit with explicitly attached
    > >       NULL wl_buffer works; when and if the content update is
    > >       executed, the surface content is removed as defined for
    > >       wl_surface.attach.
    > >
    > >       If a queued content update has been submitted, and the wl_buffer
    > >       used in the update is destroyed before the wl_buffer.release
    > >       event, the results are undefined. The compositor may or may not
    > >       have executed the update, therefore the surface contents become
    > >       undefined as explained in wl_surface.attach. Whether any
    > >       presentation feedback or frame callbacks occur is undefined.
    > >
    > >       For each surface, the compositor maintains an association to a
    > >       single output that is considered as the main output for the
    > >       surface. Queued content updates are synchronized to the
    > >       surface's main output, to provide a consistent and meaningful
    > >       definition of the moment the update is displayed to the user.
    > >       When a compositor updates an output, it processes only the
    > >       queues of the surfaces whose main output is the one being
    > >       updated. The queues of other surfaces, even if they are part of
    > >       the redrawing, are not processed at that time.
    > >
    > >       When a compositor chooses to update an output, it must predict
    > >       the presentation clock value when the display update will occur.
    > >       For the definition of the moment of display update, see
    > >       presentation_feedback.presented. Therefore if the prediction is
    > >       absolutely perfect, presentation_feedback.presented will carry
    > >       the same clock value.
    > >
    > >       For each surface with queued content updates and matching main
    > >       output, the compositor picks the update with the highest
    > >       timestamp no later than a half frame period after the predicted
    > >       presentation time. The intent is to pick the content update
    > >       whose target timestamp as rounded to the output refresh period
    > >       granularity matches the same display update as the compositor is
    > >       targeting, while not displaying any content update more than a
    > >
    >
    > I'm not really following 100% here. It's not your fault, this is just a
    > terribly awkward sort of thing to try and put into English.  It sounds to
    > me like the following: If P0 is the time of the next present and P1 is the
    > time of the one after that, you look for the largest thing less than the
    > average of P0 and P1.  Is this correct?  Why go for the average?  The
    > client is going to have to adjust anyway.
    >
    >
    > >       half frame period too early. If all the updates in the queue are
    > >       already late, the highest timestamp update is taken regardless
    > >       of how late it is. Once an update in a queue has been chosen,
    > >       all remaining updates with an earlier timestamp in the queue are
    > >       discarded.
    > >
    >
    > Ok, I think what you are saying works.  Again, it's difficult to parse but
    > these things always are.
    >

    Yes, it is hard to write a generic algorithm in English. Axel did a
    nice job clarifying it. I hope I can improve on the language after I
    have actually implemented this and any possible changes we need to
    this.
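
The selection rule discussed above can be sketched in a few lines. This is only an illustration under stated assumptions, not the proposed implementation: the names QueuedUpdate and pick_update are hypothetical, timestamps are plain integer nanoseconds on the presentation clock, and ties between equal target timestamps go to the later queue entry, which the description does not specify. With P0 as the predicted presentation time and P1 as the following refresh, the cutoff used here is the midpoint of P0 and P1.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Any, List, Optional, Tuple


@dataclass
class QueuedUpdate:
    """One queued content update (hypothetical model, not protocol API)."""
    target_ns: int   # target presentation time on the presentation clock
    buffer: Any      # stands in for the wl_buffer and its buffer state


def pick_update(
    queue: List[QueuedUpdate], predicted_ns: int, refresh_ns: int
) -> Tuple[Optional[QueuedUpdate], List[QueuedUpdate]]:
    """Return (chosen update or None, remaining queue).

    `queue` must already be sorted by target_ns, as the description
    says the compositor keeps it.
    """
    # Eligible updates target no later than half a frame period after
    # the predicted presentation time.
    cutoff = predicted_ns + refresh_ns // 2
    targets = [u.target_ns for u in queue]
    n = bisect_right(targets, cutoff)   # queue[:n] are eligible
    if n == 0:
        # Every update is still too far in the future: show nothing new.
        return None, list(queue)
    # The highest eligible timestamp wins, even if it is already late;
    # everything earlier than it is discarded.
    return queue[n - 1], list(queue[n:])


# Example: refresh period 100, predicted presentation at 1000, cutoff 1050.
q = [QueuedUpdate(t, f"buf{t}") for t in (900, 990, 1040, 1060)]
chosen, rest = pick_update(q, predicted_ns=1000, refresh_ns=100)
# chosen.target_ns == 1040; rest holds only the update targeting 1060.
```

The "all updates are already late" case needs no special branch in this model: late updates all fall below the cutoff, so the latest of them is chosen and the rest are dropped.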

    Also, the inline documentation in the XML file is getting a bit out of
    hand, lacking in expressional power. I would have liked to use
    sub-headings, the algorithm could use pseudo-code, etc, but they just
    don't really exist here. Yet, I want these things to be part of the
    protocol spec, so the semantics of the protocol get properly defined.
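
As an example of the kind of pseudo-code that could sit next to the description, the difference between an immediate commit and a queueing commit might be modelled as below. This is a rough sketch under assumptions: SurfaceModel and its fields are hypothetical names, only the buffer, attach offsets and damage are modelled, and raising an error for a queueing commit without an attach is a choice of this model, since the description only calls that case undefined.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class SurfaceModel:
    """Toy model of wl_surface pending state (hypothetical, not real API)."""
    # Pending state, set by requests before wl_surface.commit.
    pending_buffer: Any = None
    buffer_attached: bool = False
    pending_offset: Tuple[int, int] = (0, 0)
    pending_damage: List[tuple] = field(default_factory=list)
    pending_target_ns: Optional[int] = None   # set by presentation.queue

    # Applied content and the per-surface queue of (target, buffer).
    current_buffer: Any = None
    queued: List[Tuple[int, Any]] = field(default_factory=list)

    def attach(self, buffer: Any, x: int = 0, y: int = 0) -> None:
        self.pending_buffer = buffer      # may be None (NULL wl_buffer)
        self.buffer_attached = True
        self.pending_offset = (x, y)

    def damage(self, rect: tuple) -> None:
        self.pending_damage.append(rect)

    def queue(self, target_ns: int) -> None:
        self.pending_target_ns = target_ns

    def commit(self) -> None:
        if self.pending_target_ns is None:
            # Immediate content submission: apply buffer and damage now.
            if self.buffer_attached:
                self.current_buffer = self.pending_buffer
            self.pending_damage = []
        else:
            # Queueing commit: only buffer state goes into the queue.
            if not self.buffer_attached:
                raise ValueError("queueing commit without attach is undefined")
            self.queued.append((self.pending_target_ns, self.pending_buffer))
            self.queued.sort(key=lambda item: item[0])  # order by target
            self.pending_target_ns = None
            # Damage is surface state: neither applied nor reset here.
        # In both cases no buffer stays pending and the offsets reset to
        # zero; on a queueing commit the offsets are not queued.
        self.pending_buffer = None
        self.buffer_attached = False
        self.pending_offset = (0, 0)
```

In this model a queueing commit leaves pending damage untouched, so it is picked up by the next immediate commit, which matches the statement that non-queued surface state is "not applied nor reset".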

    > >         4.5. The frame callback and swap interval
    > >
    > > The frame callback needs to be with the buffer state, so it gets
    > > queued. If a client makes e.g. EGL's commits queued, EGL may
    > > still rely on frame callbacks for blocking apps properly, and
    > > that is related to presenting the buffer, not just the very next
    > > output refresh. EGL may also internally use queueing and
    > > feedback to implement swap interval > 1.
    > >
    >
    > Doesn't this mean that you need eglSwapInterval(0) if you're queueing?
    > This is probably the case anyway, but it might be worth noting explicitly.
    > I think what you're doing with frame callbacks is sane, but I'm not sure.

    Yeah, swapinterval zero is needed indeed. Personally I would be more
    worried about whether an EGL implementation agrees to allocate new
    buffers if the app is queueing in advance. I suspect queueing many
    frames in advance won't work with EGL in practice.

    But you can still queue a frame at a time, that might be enough for
    e.g. GL-based video players under good conditions. That might not need
    swapinterval zero, either.

    > My one latent concern is that I still don't think we're entirely handling
    > the case that QtQuick wants.  What they want is to do their rendering a few
    > frames in advance in case of CPU/GPU jitter.  Technically, this extension
    > handles this by the client simply doing a good job of guessing presentation
    > times on a one-per-frame basis.  However, it doesn't allow for any damage
    > tracking.  In the case of QtQuick they want a linear queue of buffers where
    > no buffer ever gets skipped.  In this case, you could do damage tracking by
    > allowing it to accumulate from one frame to another and you get all of the
    > damage-tracking advantages that you had before.  I'm not sure how much this
    > matters, but it might be worth thinking about it.

    Does it really want to display *every* frame regardless of time? It
    doesn't matter that if a deadline is missed, the animation slows down
    rather than jumps to keep up with intended velocity?

That is my understanding of how it works now. I *think* they figure
the compositor isn't the bottleneck and that it will get its 60 FPS.
That said, I don't actually work on QtQuick. I'm just trying to make
sure they don't get completely left out in the cold.



    Axel has a good point, cannot this be just done client side and
    immediate updates based on frame callbacks?

Probably not. They're using GLES and EGL so they can't draw early
and just stash the buffer.

That's not a problem.

They can render to an FBO linked to an EGLImage, and we can get a
wl_buffer from an EGLImage.


Axel Davy

_______________________________________________
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/wayland-devel
