On 11/10/25 04:00, Kasireddy, Vivek wrote:
> Hi Dmitry,
>
>> Subject: Re: [PATCH v4] drm/virtgpu: Make vblank event dependent on the
>> external sync
>>
>> On 10/31/25 08:04, Kasireddy, Vivek wrote:
>>> Hi Dmitry,
>>>
>>>> Subject: Re: [PATCH v4] drm/virtgpu: Make vblank event dependent on the
>>>> external sync
>>>>
>>>> On 10/29/25 08:47, Kasireddy, Vivek wrote:
>>>>> I applied this patch and tested 3 different scenarios:
>>>>> 1) blob=true + imported buffers
>>>>> root@localhost:weston_upstream/weston# ./build/clients/weston-simple-egl -o &
>>>>> Using config: r8g8b8a0
>>>>> has EGL_EXT_buffer_age and EGL_EXT_swap_buffers_with_damage
>>>>> has EGL_EXT_surface_compression
>>>>> 245 frames in 5 seconds: 49.000000 fps
>>>>> 236 frames in 5 seconds: 47.200001 fps
>>>>> 279 frames in 5 seconds: 55.799999 fps
>>>>> 240 frames in 5 seconds: 48.000000 fps
>>>>> 244 frames in 5 seconds: 48.799999 fps
>>>>> 283 frames in 5 seconds: 56.599998 fps
>>>>> 274 frames in 5 seconds: 54.799999 fps
>>>>>
>>>>> 2) blob=true
>>>>> root@localhost:/weston_upstream/weston# ./build/clients/weston-simple-egl -o &
>>>>> Using config: r8g8b8a0
>>>>> has EGL_EXT_buffer_age and EGL_EXT_swap_buffers_with_damage
>>>>> has EGL_EXT_surface_compression
>>>>> 167 frames in 5 seconds: 33.400002 fps
>>>>> 208 frames in 5 seconds: 41.599998 fps
>>>>> 203 frames in 5 seconds: 40.599998 fps
>>>>> 252 frames in 5 seconds: 50.400002 fps
>>>>> 180 frames in 5 seconds: 36.000000 fps
>>>>> 219 frames in 5 seconds: 43.799999 fps
>>>>> 176 frames in 5 seconds: 35.200001 fps
>>>>> 220 frames in 5 seconds: 44.000000 fps
>>>>
>>>> Thanks for the testing.
>>>>
>>>> For the second scenario I'm getting better result, it's 55-60fps for me.
>>>> Given that this is a SW rendering, curious whether that's because you're
>>>> testing on a lower end machine(?)
>>> I am testing with a dGPU (B60) on the Host where I am running Gnome Xorg
>>> but I am running Gnome Wayland in the Guest. What compositors are you
>>> running on the Host and Guest? Kwin? Also, could you please share your
>>> Qemu launch parameters as well so that we can try to figure out the reason
>>> for the performance discrepancy?
>>
>> Did further testing on older IvyBridge machine and getting comparable
>> result to yours on it.
> Thank you for running these tests.
>
>>
>> I'm using exactly same QEMU setup as you do with nothing special in the
>> cmdline. Tested with both KDE and Gnome on host.
>>
>> On IvyBridge getting:
>>
>> - 46-55fps jittery with GTK display + blob + vblank_timer
>> - 46-55fps jittery with GTK display + blob + vblank_timer +
>> vblank_mode=1 to disable vsync on host
>> - constant 60fps with GTK display without blob + vblank_timer
>> - constant 60fps with GTK display + blob without vblank_timer
>> - constant 60fps with SDL display + blob + vblank_timer
> What happens in the case of:
> - SDL display + blob without vblank_timer
> - SDL display + vblank_timer without blob
>
>>
>> So, discrepancy of the results is directly tied to CPU performance + GTK
>> oddity when SW+blob rendering is used on guest.
>>
>> Why SDL performs much better than GTK I don't know.
> AFAICS, there are some key differences between SDL and GTK UIs that might
> be playing a role:
> - SDL UI does not implement dmabuf ops such as dpy_gl_scanout_dmabuf()
> but it does implement dpy_gl_scanout_texture() which I see is only called
> from virtio-gpu-virgl.c. I am wondering how does the display get updated
> in non-virgl usecases, especially when blob=true and given that this option
> is specifically meant for UI's dmabuf ops?
>
> - GTK asynchronously waits (using an EGL fence) until the GPU Blit is
> completed (before releasing Guest FB) which other UIs do not do when
> blob=true. So, this might incur some extra delay which is not the case
> with SDL.
>
> - SDL's display updates (sent to the Host compositor) seem to be UI timer
> driven while GTK's updates are repaint callback driven.

My QEMU branch contains a patch adding dpy_gl_scanout_dmabuf() support to
the SDL display. I haven't checked without it, will check next week.

I assume this is a QEMU-specific issue with how the QEMU display logic
works. It's unclear to me how severe this problem is in practice.

Meanwhile, it should be okay to add a workaround for guest blobs, with a
comment explaining why it's needed, if this is a significant enough
regression for your use cases. Feel free to send v5.

-- 
Best regards,
Dmitry
