Re: [PULL 12/17] virtio-gpu: support context init multiple timeline

2025-06-20 Thread Akihiko Odaki

On 2025/06/21 4:47, Yiwei Zhang wrote:

On Thu, Jun 19, 2025 at 11:45 PM Alex Bennée  wrote:


Yiwei Zhang  writes:


On Sun, Jun 8, 2025 at 1:24 AM Akihiko Odaki
 wrote:


On 2025/06/06 1:26, Alex Bennée wrote:

From: Yiwei Zhang 

Venus and later native contexts have their own fence context along with
multiple timelines within. Fences wtih VIRTIO_GPU_FLAG_INFO_RING_IDX in
the flags must be dispatched to be created on the target context. Fence
signaling also has to be handled on the specific timeline within that
target context.

Before this change, venus fencing is completely broken if the host
driver doesn't support implicit fencing with external memory objects.
Frames can go backwards along with random artifacts on screen if the
host driver doesn't attach an implicit fence to the render target. The
symptom could be hidden by certain guest wsi backend that waits on a
venus native VkFence object for the actual payload with limited present
modes or under special configs. e.g. x11 mailbox or xwayland.

After this change, everything related to venus fencing starts making
sense. Confirmed this via guest and host side perfetto tracing.

Cc: [email protected]
Fixes: 94d0ea1c1928 ("virtio-gpu: Support Venus context")


I suggest moving this in the front of the patch series to ease backporting.

I also wonder if "[PULL 11/17] ui/gtk-gl-area: Remove extra draw call in
refresh" requires Cc: [email protected]. Fixing -display gtk,gl=on
for Wayland sounds as important as this patch.

Regards,
Akihiko Odaki


Hi Alex,

+1 for Akihiko's point. I'm also curious when will the venus fix land
in-tree?


We have a problem that there are two contradictory bugs - one that shows
up in the x86/kvm case and one in the aarch64/tcg case. Both are caused
by the weird lifetime semantics of the virgl resource vs QEMU memory
region and we haven't found a solution that solves both yet.


Sounds like irrelevant to the venus fix. Might be worth filing a
virglrenderer issue with some details. More eyes would be helpful if
this turns out to be some known kvm issue seen before on other vmms.


This patch itself looks good to me so:

Reviewed-by: Akihiko Odaki 

It's up to Alex whether to wait until other patches (the blob 
mapping/memory region fix in particular) piles up or not.


Regards,
Akihiko Odaki



Re: [PULL 12/17] virtio-gpu: support context init multiple timeline

2025-06-20 Thread Yiwei Zhang
On Thu, Jun 19, 2025 at 11:45 PM Alex Bennée  wrote:
>
> Yiwei Zhang  writes:
>
> > On Sun, Jun 8, 2025 at 1:24 AM Akihiko Odaki
> >  wrote:
> >>
> >> On 2025/06/06 1:26, Alex Bennée wrote:
> >> > From: Yiwei Zhang 
> >> >
> >> > Venus and later native contexts have their own fence context along with
> >> > multiple timelines within. Fences wtih VIRTIO_GPU_FLAG_INFO_RING_IDX in
> >> > the flags must be dispatched to be created on the target context. Fence
> >> > signaling also has to be handled on the specific timeline within that
> >> > target context.
> >> >
> >> > Before this change, venus fencing is completely broken if the host
> >> > driver doesn't support implicit fencing with external memory objects.
> >> > Frames can go backwards along with random artifacts on screen if the
> >> > host driver doesn't attach an implicit fence to the render target. The
> >> > symptom could be hidden by certain guest wsi backend that waits on a
> >> > venus native VkFence object for the actual payload with limited present
> >> > modes or under special configs. e.g. x11 mailbox or xwayland.
> >> >
> >> > After this change, everything related to venus fencing starts making
> >> > sense. Confirmed this via guest and host side perfetto tracing.
> >> >
> >> > Cc: [email protected]
> >> > Fixes: 94d0ea1c1928 ("virtio-gpu: Support Venus context")
> >>
> >> I suggest moving this in the front of the patch series to ease backporting.
> >>
> >> I also wonder if "[PULL 11/17] ui/gtk-gl-area: Remove extra draw call in
> >> refresh" requires Cc: [email protected]. Fixing -display gtk,gl=on
> >> for Wayland sounds as important as this patch.
> >>
> >> Regards,
> >> Akihiko Odaki
> >
> > Hi Alex,
> >
> > +1 for Akihiko's point. I'm also curious when will the venus fix land
> > in-tree?
>
> We have a problem that there are two contradictory bugs - one that shows
> up in the x86/kvm case and one in the aarch64/tcg case. Both are caused
> by the weird lifetime semantics of the virgl resource vs QEMU memory
> region and we haven't found a solution that solves both yet.

Sounds like irrelevant to the venus fix. Might be worth filing a
virglrenderer issue with some details. More eyes would be helpful if
this turns out to be some known kvm issue seen before on other vmms.

>
> I'm currently busy with other stuff and need to do a sweep of my other
> maintainer trees for 10.1. Once I've done that I'll have another go at
> coming up with a solution unless someone else beats me to it.

Understood. Hope there's a chance to move the venus fix up in the
queue to land first when you get to this again. Thanks!

>
> --
> Alex Bennée
> Virtualisation Tech Lead @ Linaro



Re: [PULL 12/17] virtio-gpu: support context init multiple timeline

2025-06-19 Thread Alex Bennée
Yiwei Zhang  writes:

> On Sun, Jun 8, 2025 at 1:24 AM Akihiko Odaki
>  wrote:
>>
>> On 2025/06/06 1:26, Alex Bennée wrote:
>> > From: Yiwei Zhang 
>> >
>> > Venus and later native contexts have their own fence context along with
>> > multiple timelines within. Fences wtih VIRTIO_GPU_FLAG_INFO_RING_IDX in
>> > the flags must be dispatched to be created on the target context. Fence
>> > signaling also has to be handled on the specific timeline within that
>> > target context.
>> >
>> > Before this change, venus fencing is completely broken if the host
>> > driver doesn't support implicit fencing with external memory objects.
>> > Frames can go backwards along with random artifacts on screen if the
>> > host driver doesn't attach an implicit fence to the render target. The
>> > symptom could be hidden by certain guest wsi backend that waits on a
>> > venus native VkFence object for the actual payload with limited present
>> > modes or under special configs. e.g. x11 mailbox or xwayland.
>> >
>> > After this change, everything related to venus fencing starts making
>> > sense. Confirmed this via guest and host side perfetto tracing.
>> >
>> > Cc: [email protected]
>> > Fixes: 94d0ea1c1928 ("virtio-gpu: Support Venus context")
>>
>> I suggest moving this in the front of the patch series to ease backporting.
>>
>> I also wonder if "[PULL 11/17] ui/gtk-gl-area: Remove extra draw call in
>> refresh" requires Cc: [email protected]. Fixing -display gtk,gl=on
>> for Wayland sounds as important as this patch.
>>
>> Regards,
>> Akihiko Odaki
>
> Hi Alex,
>
> +1 for Akihiko's point. I'm also curious when will the venus fix land
> in-tree?

We have a problem that there are two contradictory bugs - one that shows
up in the x86/kvm case and one in the aarch64/tcg case. Both are caused
by the weird lifetime semantics of the virgl resource vs QEMU memory
region and we haven't found a solution that solves both yet.

I'm currently busy with other stuff and need to do a sweep of my other
maintainer trees for 10.1. Once I've done that I'll have another go at
coming up with a solution unless someone else beats me to it.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro



Re: [PULL 12/17] virtio-gpu: support context init multiple timeline

2025-06-19 Thread Yiwei Zhang
On Sun, Jun 8, 2025 at 1:24 AM Akihiko Odaki
 wrote:
>
> On 2025/06/06 1:26, Alex Bennée wrote:
> > From: Yiwei Zhang 
> >
> > Venus and later native contexts have their own fence context along with
> > multiple timelines within. Fences wtih VIRTIO_GPU_FLAG_INFO_RING_IDX in
> > the flags must be dispatched to be created on the target context. Fence
> > signaling also has to be handled on the specific timeline within that
> > target context.
> >
> > Before this change, venus fencing is completely broken if the host
> > driver doesn't support implicit fencing with external memory objects.
> > Frames can go backwards along with random artifacts on screen if the
> > host driver doesn't attach an implicit fence to the render target. The
> > symptom could be hidden by certain guest wsi backend that waits on a
> > venus native VkFence object for the actual payload with limited present
> > modes or under special configs. e.g. x11 mailbox or xwayland.
> >
> > After this change, everything related to venus fencing starts making
> > sense. Confirmed this via guest and host side perfetto tracing.
> >
> > Cc: [email protected]
> > Fixes: 94d0ea1c1928 ("virtio-gpu: Support Venus context")
>
> I suggest moving this in the front of the patch series to ease backporting.
>
> I also wonder if "[PULL 11/17] ui/gtk-gl-area: Remove extra draw call in
> refresh" requires Cc: [email protected]. Fixing -display gtk,gl=on
> for Wayland sounds as important as this patch.
>
> Regards,
> Akihiko Odaki

Hi Alex,

+1 for Akihiko's point. I'm also curious when will the venus fix land in-tree?

Best,
Yiwei

>
> > Signed-off-by: Yiwei Zhang 
> > Reviewed-by: Dmitry Osipenko 
> > Message-Id: <[email protected]>
> > [AJB: remove version history from commit message]
> > Tested-by: Dmitry Osipenko 
> > Signed-off-by: Alex Bennée 
> > Message-ID: <[email protected]>
> >
> > diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> > index b4aa8abb96..cea2e12eb9 100644
> > --- a/hw/display/virtio-gpu-virgl.c
> > +++ b/hw/display/virtio-gpu-virgl.c
> > @@ -978,6 +978,15 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
> >   }
> >
> >   trace_virtio_gpu_fence_ctrl(cmd->cmd_hdr.fence_id, cmd->cmd_hdr.type);
> > +#if VIRGL_VERSION_MAJOR >= 1
> > +if (cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_INFO_RING_IDX) {
> > +virgl_renderer_context_create_fence(cmd->cmd_hdr.ctx_id,
> > +
> > VIRGL_RENDERER_FENCE_FLAG_MERGEABLE,
> > +cmd->cmd_hdr.ring_idx,
> > +cmd->cmd_hdr.fence_id);
> > +return;
> > +}
> > +#endif
> >   virgl_renderer_create_fence(cmd->cmd_hdr.fence_id, cmd->cmd_hdr.type);
> >   }
> >
> > @@ -991,6 +1000,11 @@ static void virgl_write_fence(void *opaque, uint32_t 
> > fence)
> >* the guest can end up emitting fences out of order
> >* so we should check all fenced cmds not just the first one.
> >*/
> > +#if VIRGL_VERSION_MAJOR >= 1
> > +if (cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_INFO_RING_IDX) {
> > +continue;
> > +}
> > +#endif
> >   if (cmd->cmd_hdr.fence_id > fence) {
> >   continue;
> >   }
> > @@ -1005,6 +1019,29 @@ static void virgl_write_fence(void *opaque, uint32_t 
> > fence)
> >   }
> >   }
> >
> > +#if VIRGL_VERSION_MAJOR >= 1
> > +static void virgl_write_context_fence(void *opaque, uint32_t ctx_id,
> > +  uint32_t ring_idx, uint64_t 
> > fence_id) {
> > +VirtIOGPU *g = opaque;
> > +struct virtio_gpu_ctrl_command *cmd, *tmp;
> > +
> > +QTAILQ_FOREACH_SAFE(cmd, &g->fenceq, next, tmp) {
> > +if (cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_INFO_RING_IDX &&
> > +cmd->cmd_hdr.ctx_id == ctx_id && cmd->cmd_hdr.ring_idx == 
> > ring_idx &&
> > +cmd->cmd_hdr.fence_id <= fence_id) {
> > +trace_virtio_gpu_fence_resp(cmd->cmd_hdr.fence_id);
> > +virtio_gpu_ctrl_response_nodata(g, cmd, 
> > VIRTIO_GPU_RESP_OK_NODATA);
> > +QTAILQ_REMOVE(&g->fenceq, cmd, next);
> > +g_free(cmd);
> > +g->inflight--;
> > +if (virtio_gpu_stats_enabled(g->parent_obj.conf)) {
> > +trace_virtio_gpu_dec_inflight_fences(g->inflight);
> > +}
> > +}
> > +}
> > +}
> > +#endif
> > +
> >   static virgl_renderer_gl_context
> >   virgl_create_context(void *opaque, int scanout_idx,
> >struct virgl_renderer_gl_ctx_param *params)
> > @@ -1039,11 +1076,18 @@ static int virgl_make_context_current(void *opaque, 
> > int scanout_idx,
> >   }
> >
> >   static struct virgl_renderer_callbacks virtio_gpu_3d_cbs = {
> > +#if VIRGL_VERSION_MAJOR >= 1
> > +.version = 3,
> > +#else
> >   .version = 1,
> > +#endif
> >   

Re: [PULL 12/17] virtio-gpu: support context init multiple timeline

2025-06-08 Thread Akihiko Odaki

On 2025/06/06 1:26, Alex Bennée wrote:

From: Yiwei Zhang 

Venus and later native contexts have their own fence context along with
multiple timelines within. Fences wtih VIRTIO_GPU_FLAG_INFO_RING_IDX in
the flags must be dispatched to be created on the target context. Fence
signaling also has to be handled on the specific timeline within that
target context.

Before this change, venus fencing is completely broken if the host
driver doesn't support implicit fencing with external memory objects.
Frames can go backwards along with random artifacts on screen if the
host driver doesn't attach an implicit fence to the render target. The
symptom could be hidden by certain guest wsi backend that waits on a
venus native VkFence object for the actual payload with limited present
modes or under special configs. e.g. x11 mailbox or xwayland.

After this change, everything related to venus fencing starts making
sense. Confirmed this via guest and host side perfetto tracing.

Cc: [email protected]
Fixes: 94d0ea1c1928 ("virtio-gpu: Support Venus context")


I suggest moving this in the front of the patch series to ease backporting.

I also wonder if "[PULL 11/17] ui/gtk-gl-area: Remove extra draw call in 
refresh" requires Cc: [email protected]. Fixing -display gtk,gl=on 
for Wayland sounds as important as this patch.


Regards,
Akihiko Odaki


Signed-off-by: Yiwei Zhang 
Reviewed-by: Dmitry Osipenko 
Message-Id: <[email protected]>
[AJB: remove version history from commit message]
Tested-by: Dmitry Osipenko 
Signed-off-by: Alex Bennée 
Message-ID: <[email protected]>

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index b4aa8abb96..cea2e12eb9 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -978,6 +978,15 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
  }
  
  trace_virtio_gpu_fence_ctrl(cmd->cmd_hdr.fence_id, cmd->cmd_hdr.type);

+#if VIRGL_VERSION_MAJOR >= 1
+if (cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_INFO_RING_IDX) {
+virgl_renderer_context_create_fence(cmd->cmd_hdr.ctx_id,
+
VIRGL_RENDERER_FENCE_FLAG_MERGEABLE,
+cmd->cmd_hdr.ring_idx,
+cmd->cmd_hdr.fence_id);
+return;
+}
+#endif
  virgl_renderer_create_fence(cmd->cmd_hdr.fence_id, cmd->cmd_hdr.type);
  }
  
@@ -991,6 +1000,11 @@ static void virgl_write_fence(void *opaque, uint32_t fence)

   * the guest can end up emitting fences out of order
   * so we should check all fenced cmds not just the first one.
   */
+#if VIRGL_VERSION_MAJOR >= 1
+if (cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_INFO_RING_IDX) {
+continue;
+}
+#endif
  if (cmd->cmd_hdr.fence_id > fence) {
  continue;
  }
@@ -1005,6 +1019,29 @@ static void virgl_write_fence(void *opaque, uint32_t 
fence)
  }
  }
  
+#if VIRGL_VERSION_MAJOR >= 1

+static void virgl_write_context_fence(void *opaque, uint32_t ctx_id,
+  uint32_t ring_idx, uint64_t fence_id) {
+VirtIOGPU *g = opaque;
+struct virtio_gpu_ctrl_command *cmd, *tmp;
+
+QTAILQ_FOREACH_SAFE(cmd, &g->fenceq, next, tmp) {
+if (cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_INFO_RING_IDX &&
+cmd->cmd_hdr.ctx_id == ctx_id && cmd->cmd_hdr.ring_idx == ring_idx 
&&
+cmd->cmd_hdr.fence_id <= fence_id) {
+trace_virtio_gpu_fence_resp(cmd->cmd_hdr.fence_id);
+virtio_gpu_ctrl_response_nodata(g, cmd, VIRTIO_GPU_RESP_OK_NODATA);
+QTAILQ_REMOVE(&g->fenceq, cmd, next);
+g_free(cmd);
+g->inflight--;
+if (virtio_gpu_stats_enabled(g->parent_obj.conf)) {
+trace_virtio_gpu_dec_inflight_fences(g->inflight);
+}
+}
+}
+}
+#endif
+
  static virgl_renderer_gl_context
  virgl_create_context(void *opaque, int scanout_idx,
   struct virgl_renderer_gl_ctx_param *params)
@@ -1039,11 +1076,18 @@ static int virgl_make_context_current(void *opaque, int 
scanout_idx,
  }
  
  static struct virgl_renderer_callbacks virtio_gpu_3d_cbs = {

+#if VIRGL_VERSION_MAJOR >= 1
+.version = 3,
+#else
  .version = 1,
+#endif
  .write_fence = virgl_write_fence,
  .create_gl_context   = virgl_create_context,
  .destroy_gl_context  = virgl_destroy_context,
  .make_current= virgl_make_context_current,
+#if VIRGL_VERSION_MAJOR >= 1
+.write_context_fence = virgl_write_context_fence,
+#endif
  };
  
  static void virtio_gpu_print_stats(void *opaque)





[PULL 12/17] virtio-gpu: support context init multiple timeline

2025-06-05 Thread Alex Bennée
From: Yiwei Zhang 

Venus and later native contexts have their own fence context along with
multiple timelines within. Fences wtih VIRTIO_GPU_FLAG_INFO_RING_IDX in
the flags must be dispatched to be created on the target context. Fence
signaling also has to be handled on the specific timeline within that
target context.

Before this change, venus fencing is completely broken if the host
driver doesn't support implicit fencing with external memory objects.
Frames can go backwards along with random artifacts on screen if the
host driver doesn't attach an implicit fence to the render target. The
symptom could be hidden by certain guest wsi backend that waits on a
venus native VkFence object for the actual payload with limited present
modes or under special configs. e.g. x11 mailbox or xwayland.

After this change, everything related to venus fencing starts making
sense. Confirmed this via guest and host side perfetto tracing.

Cc: [email protected]
Fixes: 94d0ea1c1928 ("virtio-gpu: Support Venus context")
Signed-off-by: Yiwei Zhang 
Reviewed-by: Dmitry Osipenko 
Message-Id: <[email protected]>
[AJB: remove version history from commit message]
Tested-by: Dmitry Osipenko 
Signed-off-by: Alex Bennée 
Message-ID: <[email protected]>

diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index b4aa8abb96..cea2e12eb9 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -978,6 +978,15 @@ void virtio_gpu_virgl_process_cmd(VirtIOGPU *g,
 }
 
 trace_virtio_gpu_fence_ctrl(cmd->cmd_hdr.fence_id, cmd->cmd_hdr.type);
+#if VIRGL_VERSION_MAJOR >= 1
+if (cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_INFO_RING_IDX) {
+virgl_renderer_context_create_fence(cmd->cmd_hdr.ctx_id,
+
VIRGL_RENDERER_FENCE_FLAG_MERGEABLE,
+cmd->cmd_hdr.ring_idx,
+cmd->cmd_hdr.fence_id);
+return;
+}
+#endif
 virgl_renderer_create_fence(cmd->cmd_hdr.fence_id, cmd->cmd_hdr.type);
 }
 
@@ -991,6 +1000,11 @@ static void virgl_write_fence(void *opaque, uint32_t 
fence)
  * the guest can end up emitting fences out of order
  * so we should check all fenced cmds not just the first one.
  */
+#if VIRGL_VERSION_MAJOR >= 1
+if (cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_INFO_RING_IDX) {
+continue;
+}
+#endif
 if (cmd->cmd_hdr.fence_id > fence) {
 continue;
 }
@@ -1005,6 +1019,29 @@ static void virgl_write_fence(void *opaque, uint32_t 
fence)
 }
 }
 
+#if VIRGL_VERSION_MAJOR >= 1
+static void virgl_write_context_fence(void *opaque, uint32_t ctx_id,
+  uint32_t ring_idx, uint64_t fence_id) {
+VirtIOGPU *g = opaque;
+struct virtio_gpu_ctrl_command *cmd, *tmp;
+
+QTAILQ_FOREACH_SAFE(cmd, &g->fenceq, next, tmp) {
+if (cmd->cmd_hdr.flags & VIRTIO_GPU_FLAG_INFO_RING_IDX &&
+cmd->cmd_hdr.ctx_id == ctx_id && cmd->cmd_hdr.ring_idx == ring_idx 
&&
+cmd->cmd_hdr.fence_id <= fence_id) {
+trace_virtio_gpu_fence_resp(cmd->cmd_hdr.fence_id);
+virtio_gpu_ctrl_response_nodata(g, cmd, VIRTIO_GPU_RESP_OK_NODATA);
+QTAILQ_REMOVE(&g->fenceq, cmd, next);
+g_free(cmd);
+g->inflight--;
+if (virtio_gpu_stats_enabled(g->parent_obj.conf)) {
+trace_virtio_gpu_dec_inflight_fences(g->inflight);
+}
+}
+}
+}
+#endif
+
 static virgl_renderer_gl_context
 virgl_create_context(void *opaque, int scanout_idx,
  struct virgl_renderer_gl_ctx_param *params)
@@ -1039,11 +1076,18 @@ static int virgl_make_context_current(void *opaque, int 
scanout_idx,
 }
 
 static struct virgl_renderer_callbacks virtio_gpu_3d_cbs = {
+#if VIRGL_VERSION_MAJOR >= 1
+.version = 3,
+#else
 .version = 1,
+#endif
 .write_fence = virgl_write_fence,
 .create_gl_context   = virgl_create_context,
 .destroy_gl_context  = virgl_destroy_context,
 .make_current= virgl_make_context_current,
+#if VIRGL_VERSION_MAJOR >= 1
+.write_context_fence = virgl_write_context_fence,
+#endif
 };
 
 static void virtio_gpu_print_stats(void *opaque)
-- 
2.47.2