Re: wp_presentation vs wl_surface_frame

2022-05-30 Thread Maksim Sisov
I see! Thank you!

On 2022-05-30 11:58, Pekka Paalanen wrote:
> On Fri, 27 May 2022 20:07:27 +0300
> Maksim Sisov  wrote:
> 
>> On 2022-05-27 11:43, Pekka Paalanen wrote:
>> > On Thu, 26 May 2022 10:25:24 +0300
>> > Maksim Sisov  wrote:
>> >
>> >> Hello!
>> >>
>> >> I'm currently working on some begin frame issues in Chromium and I
>> >> started to wonder if we use available mechanism for synchronization
>> >> correctly.
>> >>
>> >> Wayland compositors may support wl_surface_frame and wp_presentation or
>> >> just wl_surface_frame. At the moment, Chromium uses both wp_presentation
>> >> to adjust own begin frame scheduling based on the presentation time.
>> >> Also, it utilizes wl_surface_frame just before attaching new wl_buffers
>> >> to wl_surfaces (we use dmabuf + zwp_linux_dmabuf).
>> >>
>> >> I wonder if using wl_surface_frame along with wp_presentation is
>> >> necessary except the cases when a window is fully occluded.
>> >>
>> >
>> > Hi,
>> >
>> > presentation-time feedback and frame callbacks are not quite
>> > equivalent. It is possible to get frame callbacks before the previous
>> > frame was actually presented, allowing you more time to draw and
>> > submit. Feedback with "presented" OTOH can only occur once the
>> > presentation timestamp is known, which usually means the KMS atomic
>> > commit has been replied with a KMS pageflip event in the compositor.
>> >
>> > If your question is about finding the optimal time to start drawing in
>> > order to minimize latency to screen, then I don't think there is a
>> > clear answer to that yet. Some compositors may automatically adjust when
>> > they send frame callbacks, others don't. Feedback "presented"
>> > definitely occurs after the current frame was presented, but it has no
>> > information about when the deadline for the next one is, so all you can
>> > do is draw immediately and hope the deadline didn't pass already.
>> >
>> > I didn't quite understand your explanation of what Chromium does.
>> >
>> > Can you describe the problem you have?
>> >
>> >
>> > Thanks,
>> > pq
>>
>> Chromium has internal begin frame scheduling that adjusts self according
>> to the begin frame scheduling. It can adjust self using vsync
>> information from the presentation feedback it receives. The presentation
>> feedback won't be used as a signal to start drawing, but rather to
>> adjust self better to hit next vblank.
> 
> Sounds nice.
> 
> The frame submission deadline is always some time earlier than the
> next vblank. How much, depends on the compositor, might vary, and
> currently there is no protocol to find that out (without occasionally
> missing the deadline which causes visual glitches).
> 
> There are discussions about improving this, e.g.:
> https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/45
> https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/103
> 
>> And that is something that already happens. However, Chromium also use
>> wl_surface.frame that is waited if a buffers are submitted earlier than
>> that arrives [1]. It's basically what mesa wayland egl does - whenever
>> eglSwapBuffers is called, it checks for the frame callback. If it's not
>> received yet, it waits until it's received and proceeds submitting
>> buffers.
> 
> Yeah, that is the basic mechanism for throttling frame rate, and for
> throttling occluded windows.
> 
> Mesa is using frame callbacks for eglSwapInterval=1, because nothing
> else was possible really when Mesa's EGL Wayland implementation was
> written, and it stuck.
> 
> Since then, it has caused problems with proposed solutions like
> https://gitlab.freedesktop.org/wayland/wayland-protocols/-/merge_requests/99
> 
> I recall having talked about disconnecting eglSwapInterval from frame
> callbacks, but I've forgot where those comments were. They could be
> buried somewhere in the links above.
> 
>> However, I wonder if that is even necessary to do. It looks like a
>> redundant functionality given internal timer adjusts self according to
>> presentation times.
>>
>> [1]
>> https://source.chromium.org/chromium/chromium/src/+/main:ui/ozone/platform/wayland/host/wayland_frame_manager.cc;l=352;drc=67172ba3828a038c491384620c3f854bd6d0ece9?q=wayland_frame_manager.cc=chromium
>>
> 
> Is it required by protocol? No.
> 
> Is it a good idea? Not absolutely always, but in many cases yes.
> 
> Is it a good idea for Chromium? Hmm. If you do have presentation-time
> feedback, maybe you can do without frame callbacks. But you have the
> caveat of not knowing when the deadline is, as I explained above. OTOH,
> sounds like you will hit that caveat with the current policy anyway.
> 
> 
> Thanks,
> pq

-- 
Best Regards,
Maksim Sisov
* Usual work time - 08:00-16:00 (EEST).


Re: wp_presentation vs wl_surface_frame

2022-05-27 Thread Maksim Sisov
On 2022-05-27 11:43, Pekka Paalanen wrote:
> On Thu, 26 May 2022 10:25:24 +0300
> Maksim Sisov  wrote:
> 
>> Hello!
>>
>> I'm currently working on some begin frame issues in Chromium and I
>> started to wonder if we use available mechanism for synchronization
>> correctly.
>>
>> Wayland compositors may support wl_surface_frame and wp_presentation or
>> just wl_surface_frame. At the moment, Chromium uses both wp_presentation
>> to adjust own begin frame scheduling based on the presentation time.
>> Also, it utilizes wl_surface_frame just before attaching new wl_buffers
>> to wl_surfaces (we use dmabuf + zwp_linux_dmabuf).
>>
>> I wonder if using wl_surface_frame along with wp_presentation is
>> necessary except the cases when a window is fully occluded.
>>
> 
> Hi,
> 
> presentation-time feedback and frame callbacks are not quite
> equivalent. It is possible to get frame callbacks before the previous
> frame was actually presented, allowing you more time to draw and
> submit. Feedback with "presented" OTOH can only occur once the
> presentation timestamp is known, which usually means the KMS atomic
> commit has been replied with a KMS pageflip event in the compositor.
> 
> If your question is about finding the optimal time to start drawing in
> order to minimize latency to screen, then I don't think there is a
> clear answer to that yet. Some compositors may automatically adjust when
> they send frame callbacks, others don't. Feedback "presented"
> definitely occurs after the current frame was presented, but it has no
> information about when the deadline for the next one is, so all you can
> do is draw immediately and hope the deadline didn't pass already.
> 
> I didn't quite understand your explanation of what Chromium does.
> 
> Can you describe the problem you have?
> 
> 
> Thanks,
> pq

Chromium has internal begin frame scheduling that adjusts self according
to the begin frame scheduling. It can adjust self using vsync
information from the presentation feedback it receives. The presentation
feedback won't be used as a signal to start drawing, but rather to
adjust self better to hit next vblank.

And that is something that already happens. However, Chromium also use
wl_surface.frame that is waited if a buffers are submitted earlier than
that arrives [1]. It's basically what mesa wayland egl does - whenever
eglSwapBuffers is called, it checks for the frame callback. If it's not
received yet, it waits until it's received and proceeds submitting
buffers.

However, I wonder if that is even necessary to do. It looks like a
redundant functionality given internal timer adjusts self according to
presentation times.

[1]
https://source.chromium.org/chromium/chromium/src/+/main:ui/ozone/platform/wayland/host/wayland_frame_manager.cc;l=352;drc=67172ba3828a038c491384620c3f854bd6d0ece9?q=wayland_frame_manager.cc=chromium

-- 
Best Regards,
Maksim Sisov
* Usual work time - 08:00-16:00 (EEST).


wp_presentation vs wl_surface_frame

2022-05-26 Thread Maksim Sisov
Hello!

I'm currently working on some begin frame issues in Chromium and I
started to wonder if we use available mechanism for synchronization
correctly.

Wayland compositors may support wl_surface_frame and wp_presentation or
just wl_surface_frame. At the moment, Chromium uses both wp_presentation
to adjust own begin frame scheduling based on the presentation time.
Also, it utilizes wl_surface_frame just before attaching new wl_buffers
to wl_surfaces (we use dmabuf + zwp_linux_dmabuf).

I wonder if using wl_surface_frame along with wp_presentation is
necessary except the cases when a window is fully occluded.

-- 
Best Regards,
Maksim Sisov


Re: Wayland fd watching

2022-01-14 Thread Maksim Sisov
On 2022-01-13 15:46, Pekka Paalanen wrote:
> On Thu, 13 Jan 2022 14:05:47 +0200
> Maksim Sisov  wrote:
> 
>> On 2022-01-13 11:42, Pekka Paalanen wrote:
>> > On Thu, 13 Jan 2022 10:05:56 +0200
>> > Maksim Sisov  wrote:
>> >
> 
> ...
> 
>> >> I see. The only only problem here is that we don't use poll, but rather
>> >> libevent, which is wrapped around Chromium's MessagePumpLibevent. It
>> >> constantly notifies us that a fd is ready to be read.
>> >
>> > You say "constantly", do you mean it never waits?
>> >
>> >> Here is a diagram that shows the flow without a dedicated thread -
>> >> https://drive.google.com/file/d/1Jq8aX4cWszwL5fleE8ayGOeond70xChS/view?usp=sharing
>> >> .
>> >>
>> >> As you can see, it returns from OnCanRead and waits until libevent
>> >> notifies us the next time that we can actually read events (if prepared
>> >> to read previously). And if the main thread blocks because of the above
>> >> mentioned reason (libevent didn't have a chance to notify us), the
>> >> events will never be read.
>> >
>> > Are you describing the old approach or the new approach with the
>> > separate polling thread?
>>
>> I'm describing the old one as having a dedicated thread doesn't really
>> bring much benefits as I said in my first message.
>>
>> >
>> > Doing prepare_read from OnCanRead handler seems a bit backwards. The
>> > prepare_read dance is to make sure that when you are going to wait for
>> > new events (block), you have processed all incoming events so far and
>> > flushed out all requests that might have resulted from those, so that
>> > if you are waiting for replies to those requests, they will actually
>> > arrive. So the prepare_read/dispatch/flush dance should be done from a
>> > "the event loop is going to sleep now" hook, not from "the fd has bytes
>> > to read" hook.
>>
>> I checked libevent - they added prepare/check watchers in
>> https://github.com/libevent/libevent/pull/793 in Spring, 2019. Chromium
>> still uses libevent 1.4.15
>> (https://source.chromium.org/chromium/chromium/src/+/main:base/third_party/libevent/README.chromium),
>> which looks to be old. It seems like we need to make an update that
>> hasn't been done for ages. Moreover, it seems to be patched :\ I'll need
>> to check if we can upgrade.
> 
> For your reference, SardemFF7 (in OFTC IRC) just pointed me to
> 
> https://github.com/sardemff7/libgwater/blob/master/wayland/libgwater-wayland.c
> 
> as an example of hooking libwayland-client into glib event loop.
> 
> An important detail is that "check" (read) and "dispatch" are separate
> steps as well. Those should help to avoid the below I think.

Thanks. gtk/gdk also implements event loop the same way using glib.
However, Chromium can either use libevent or libglib based on the
configuration and I'm afraid we have to use support both. Even though,
glib is default on Desktop Linux, we need to make it work correctly with
libevent as well. I'm not aware about the hook that libevent does before
going into sleep except the watcher callbacks I mentioned before that
are not part of stable versions yet.

> 
>> > Hmm, yeah... I suppose when your event loop decides to do something
>> > else than actually read the Wayland socket after it was prepared for
>> > reading, and that something else is a blocking thing, you would have to
>> > cancel the read first just in case, and then prepare_read again to be
>> > able to read afterwards.
> 
> If check only reads, and dispatch is the step that might e.g. tear down
> other threads synchronously, you shouldn't need to cancel_read in order
> to do synchronous things between threads.
> 
> Also the time it takes from Wayland socket becoming readable to
> actually calling read_events should be minimized, because other threads
> might have done prepare_read, have been woken up, and are now waiting
> for the last reader to enter the read_events. Not doing long operations
> in the check/read step should help with that too.
> 
> 
> Thanks,
> pq

-- 
Best Regards,
Maksim Sisov
* Usual work time - 08:00-16:00 (EEST).


Re: Wayland fd watching

2022-01-13 Thread Maksim Sisov
On 2022-01-13 11:42, Pekka Paalanen wrote:
> On Thu, 13 Jan 2022 10:05:56 +0200
> Maksim Sisov  wrote:
> 
>> + wayland-devel ML.
>>
>> Hi Pekka,
>>
>> Thanks for your answers! Please see my questions inlined.
>>
>> On 2022-01-12 16:16, Pekka Paalanen wrote:
>>
>> > On Mon, 10 Jan 2022 08:53:50 +0200
>> > Maksim Sisov  wrote:
>> >
>> >> Hi Pekka,
>> >>
>> >> Thanks for answering all my previous questions before.
>> >
>> > Hi Maksim,
>> >
>> > looks like that was a year ago. I had forgot. :-)
>>
>> Time flies!
>>
>> >
>> >> I came up with a new question and wondered what was your opinion.
>> >>
>> >> In Chromium, we have two modes - one is normal when a GPU service runs
>> >> in a separate sandboxed process and uses surfaceless path with libgbm
>> >> (we pass dmabuf to the browser process where Wayland connection is...),
>> >> and another one is --in-process-gpu when the GPU service runs in the
>> >> same browser process, but in a different thread. That also uses
>> >> surfaceless path by default.
>> >>
>> >> However, when either libgbm is not available or a system in question
>> >> doesn't support drm render nodes, the browser may fall back to use
>> >> Wayland EGL. However, it is only used if the browser runs with
>> >> --in-process-gpu flag as we need to access a wl_surface somehow. That
>> >> means the GPU service on a separate thread that use Wayland EGL and the
>> >> main UI thread that actually does all the UI stuff, manages the
>> >> connection, etc.
>> >>
>> >> Long time ago, we figured out there was a problem with the
>> >> --in-process-gpu + Wayland EGL. But given both Chromium as a Wayland
>> >> client and Wayland EGL that is another client prepare/read/dispatch
>> >> events (each own event queue, of course), we started to experience
>> >> deadlocks.
>> >
>> > Surely they are the same client (connection)?
>>
>> Yes, they are.
>>
>> >
>> >> The deadlock happened when Chromium was closing a native window and our
>> >> event loop had already prepared to read for the display, but it then
>> >> didn't read as the thread was closing the GPU thread, which was calling
>> >> eglSwapBuffers, which internally also called prepare and then read, but
>> >> it didn't read, but rather was blocked as another thread already
>> >> prepared to read, but didn't read either. So, the UI thread could make a
>> >> blocking wait until the GPU thread is done doing its stuff before
>> >> tearing it down.
>> >>
>> >> As you can see from my description, the UI thread prepared to read the
>> >> wayland display, then switched to tearing down the GPU thread, which was
>> >> busy with processing our gl commands, and it had to wait until its done.
>> >> The GPU thread used Wayland EGL, which also prepared to read and the
>> >> read call was blocked because there were other clients willing to read.
>> >
>> > I see. Yes, there is the assumption that if a thread prepares to read,
>> > then it will either read or cancel the read in finite time (ASAP).
>> >
>> >> To overcome this problem, it was decided to use a separate polling
>> >> thread. What it does now is that the thread prepares/polls/reads (we use
>> >> libevent or libglib, btw. this decision is based on how chromium is
>> >> configured), while the UI thread calls dispatch events whenever our
>> >> "polling" thread asks. This way, the polling thread can always call
>> >> wl_display_read_events and the deadlock doesn't ever happen.
>> >
>> > Right, I can see that working.
>> >
>> >> I wonder if this is the right way to do so. I also wonder if doing this
>> >> stuff on a separate thread gives any performance benefits?
>> >
>> > It works, but it's not a "proposed" design in my opinion. Doing
>> > all of prepare/read/dispatch in each thread is the "intended" usage.
>> >
>> > I think the core of the problem is the thread that prepares to read and
>> > then doesn't read but goes to do something else. That is basically a
>> > violation of the API contract of libwayland-client. Unfortunately in
>> > this case it leads to a solid deadlock. The documentation is very c

Re: Wayland fd watching

2022-01-13 Thread Maksim Sisov
+ wayland-devel ML.

Hi Pekka, 

Thanks for your answers! Please see my questions inlined. 

On 2022-01-12 16:16, Pekka Paalanen wrote:

> On Mon, 10 Jan 2022 08:53:50 +0200
> Maksim Sisov  wrote:
> 
>> Hi Pekka, 
>> 
>> Thanks for answering all my previous questions before.
> 
> Hi Maksim,
> 
> looks like that was a year ago. I had forgot. :-) 

Time flies!

> 
>> I came up with a new question and wondered what was your opinion. 
>> 
>> In Chromium, we have two modes - one is normal when a GPU service runs
>> in a separate sandboxed process and uses surfaceless path with libgbm
>> (we pass dmabuf to the browser process where Wayland connection is...),
>> and another one is --in-process-gpu when the GPU service runs in the
>> same browser process, but in a different thread. That also uses
>> surfaceless path by default. 
>> 
>> However, when either libgbm is not available or a system in question
>> doesn't support drm render nodes, the browser may fall back to use
>> Wayland EGL. However, it is only used if the browser runs with
>> --in-process-gpu flag as we need to access a wl_surface somehow. That
>> means the GPU service on a separate thread that use Wayland EGL and the
>> main UI thread that actually does all the UI stuff, manages the
>> connection, etc. 
>> 
>> Long time ago, we figured out there was a problem with the
>> --in-process-gpu + Wayland EGL. But given both Chromium as a Wayland
>> client and Wayland EGL that is another client prepare/read/dispatch
>> events (each own event queue, of course), we started to experience
>> deadlocks.
> 
> Surely they are the same client (connection)?

Yes, they are.

> 
>> The deadlock happened when Chromium was closing a native window and our
>> event loop had already prepared to read for the display, but it then
>> didn't read as the thread was closing the GPU thread, which was calling
>> eglSwapBuffers, which internally also called prepare and then read, but
>> it didn't read, but rather was blocked as another thread already
>> prepared to read, but didn't read either. So, the UI thread could make a
>> blocking wait until the GPU thread is done doing its stuff before
>> tearing it down. 
>> 
>> As you can see from my description, the UI thread prepared to read the
>> wayland display, then switched to tearing down the GPU thread, which was
>> busy with processing our gl commands, and it had to wait until its done.
>> The GPU thread used Wayland EGL, which also prepared to read and the
>> read call was blocked because there were other clients willing to read.
> 
> I see. Yes, there is the assumption that if a thread prepares to read,
> then it will either read or cancel the read in finite time (ASAP).
> 
>> To overcome this problem, it was decided to use a separate polling
>> thread. What it does now is that the thread prepares/polls/reads (we use
>> libevent or libglib, btw. this decision is based on how chromium is
>> configured), while the UI thread calls dispatch events whenever our
>> "polling" thread asks. This way, the polling thread can always call
>> wl_display_read_events and the deadlock doesn't ever happen.
> 
> Right, I can see that working.
> 
>> I wonder if this is the right way to do so. I also wonder if doing this
>> stuff on a separate thread gives any performance benefits?
> 
> It works, but it's not a "proposed" design in my opinion. Doing
> all of prepare/read/dispatch in each thread is the "intended" usage.
> 
> I think the core of the problem is the thread that prepares to read and
> then doesn't read but goes to do something else. That is basically a
> violation of the API contract of libwayland-client. Unfortunately in
> this case it leads to a solid deadlock. The documentation is very clear
> about it:
> 
> * This function (or wl_display_prepare_read()) must be called before reading
> * from the file descriptor using wl_display_read_events(). Calling
> * wl_display_prepare_read_queue() announces the calling thread's intention to
> * read and ensures that until the thread is ready to read and calls
> * wl_display_read_events(), no other thread will read from the file 
> descriptor.
> 
> You could also call wl_display_cancel_read() instead:
> 
> * After a thread successfully called wl_display_prepare_read() it must
> * either call wl_display_read_events() or wl_display_cancel_read().
> * If the threads do not follow this rule it will lead to deadlock.
> 
> But if you cancel, then you must not read without preparing again first.
> 
> The separate polling thread does not sound bad though. It is a co

Sharing wl display/surface with another process.

2018-03-02 Thread Maksim Sisov
Hi,

I have a question about a proper client side of implementation of
wayland with egl.

The idea is that we have wayland ui related stuff on ui process
(creating wayland window, managing mouse/keyboard inputs and etc), and
have egl related stuff created and rendering on another gpu process.

That gpu process creates an egl window, but it has to use an wl_surface
handle, which is available on the ui process side.

What would be the best way to achieve that? Can I just pass the handle
to another process or is there a better way how to achieve that? And how
I can share the wl_display across the process? Or should I better have
another wayland connection established?

What is the best approach in this kind of cases?

Regards,
Maksim
___
wayland-devel mailing list
wayland-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/wayland-devel