On 11/10/25 04:00, Kasireddy, Vivek wrote:
> Hi Dmitry,
> 
>> Subject: Re: [PATCH v4] drm/virtgpu: Make vblank event dependent on the
>> external sync
>>
>> On 10/31/25 08:04, Kasireddy, Vivek wrote:
>>> Hi Dmitry,
>>>
>>>> Subject: Re: [PATCH v4] drm/virtgpu: Make vblank event dependent on the
>>>> external sync
>>>>
>>>> On 10/29/25 08:47, Kasireddy, Vivek wrote:
>>>>> I applied this patch and tested 3 different scenarios:
>>>>> 1) blob=true + imported buffers
>>>>> root@localhost:weston_upstream/weston# ./build/clients/weston-simple-egl -o &
>>>>> Using config: r8g8b8a0
>>>>> has EGL_EXT_buffer_age and EGL_EXT_swap_buffers_with_damage
>>>>> has EGL_EXT_surface_compression
>>>>> 245 frames in 5 seconds: 49.000000 fps
>>>>> 236 frames in 5 seconds: 47.200001 fps
>>>>> 279 frames in 5 seconds: 55.799999 fps
>>>>> 240 frames in 5 seconds: 48.000000 fps
>>>>> 244 frames in 5 seconds: 48.799999 fps
>>>>> 283 frames in 5 seconds: 56.599998 fps
>>>>> 274 frames in 5 seconds: 54.799999 fps
>>>>>
>>>>> 2) blob=true
>>>>> root@localhost:/weston_upstream/weston# ./build/clients/weston-simple-egl -o &
>>>>> Using config: r8g8b8a0
>>>>> has EGL_EXT_buffer_age and EGL_EXT_swap_buffers_with_damage
>>>>> has EGL_EXT_surface_compression
>>>>> 167 frames in 5 seconds: 33.400002 fps
>>>>> 208 frames in 5 seconds: 41.599998 fps
>>>>> 203 frames in 5 seconds: 40.599998 fps
>>>>> 252 frames in 5 seconds: 50.400002 fps
>>>>> 180 frames in 5 seconds: 36.000000 fps
>>>>> 219 frames in 5 seconds: 43.799999 fps
>>>>> 176 frames in 5 seconds: 35.200001 fps
>>>>> 220 frames in 5 seconds: 44.000000 fps
>>>>
>>>> Thanks for the testing.
>>>>
>>>> For the second scenario I'm getting better result, it's 55-60fps for me.
>>>> Given that this is a SW rendering, curious whether that's because you're
>>>> testing on a lower end machine(?)
>>> I am testing with a dGPU (B60) on the Host where I am running Gnome Xorg
>>> but I am running Gnome Wayland in the Guest. What compositors are you
>>> running on the Host and Guest? Kwin? Also, could you please share your
>>> Qemu launch parameters as well so that we can try to figure out the reason
>>> for the performance discrepancy?
>>
>> Did further testing on older IvyBridge machine and getting comparable
>> result to yours on it.
> Thank you for running these tests.
> 
>>
>> I'm using exactly same QEMU setup as you do with nothing special in the
>> cmdline. Tested with both KDE and Gnome on host.
>>
>> On IvyBridge getting:
>>
>> - 46-55fps jittery with GTK display + blob + vblank_timer
>> - 46-55fps jittery with GTK display + blob + vblank_timer +
>> vblank_mode=1 to disable vsync on host
>> - constant 60fps with GTK display without blob + vblank_timer
>> - constant 60fps with GTK display + blob without vblank_timer
>> - constant 60fps with SDL display + blob + vblank_timer
> What happens in the case of:
> - SDL display + blob without vblank_timer
> - SDL display + vblank_timer without blob
> 
>>
>> So, discrepancy of the results is directly tied to CPU performance + GTK
>> oddity when SW+blob rendering is used on guest.
>>
>> Why SDL performs much better than GTK I don't know.
> AFAICS, there are some key differences between SDL and GTK UIs that might
> be playing a role:
> - SDL UI does not implement dmabuf ops such as dpy_gl_scanout_dmabuf()
> but it does implement dpy_gl_scanout_texture() which I see is only called
> from virtio-gpu-virgl.c. I am wondering how does the display get updated
> in non-virgl usecases, especially when blob=true and given that this option
> is specifically meant for UI's dmabuf ops?
> 
> - GTK asynchronously waits (using an EGL fence) until the GPU Blit is
> completed (before releasing Guest FB) which other UIs do not do when
> blob=true. So, this might incur some extra delay which is not the case
> with SDL.
> 
> - SDL's display updates (sent to the Host compositor) seem to be UI timer
> driven while GTK's updates are repaint callback driven.

My QEMU branch contains a patch adding dpy_gl_scanout_dmabuf() support to
the SDL display. I haven't checked without it; I will check next week.

I assume this is a QEMU-specific issue with how the QEMU display logic works.
It's unclear to me how severe this problem is in practice.
Meanwhile, it should be okay to add a workaround for guest blobs, with a
comment explaining why it's needed, if this is a significant enough
regression for your use cases. Feel free to send v5.

-- 
Best regards,
Dmitry
