On Wed, Feb 07, 2024 at 04:05:29PM +0800, Cindy Lu wrote:
Jason Wang <jasow...@redhat.com> 于2024年2月7日周三 11:17写道:

On Tue, Feb 6, 2024 at 4:31 PM Stefano Garzarella <sgarz...@redhat.com> wrote:
>
> On Tue, Feb 06, 2024 at 10:47:40AM +0800, Jason Wang wrote:
> >On Mon, Feb 5, 2024 at 6:51 PM Stefano Garzarella <sgarz...@redhat.com> 
wrote:
> >>
> >> On Fri, Feb 02, 2024 at 02:25:21PM +0100, Kevin Wolf wrote:
> >> >VDUSE requires that virtqueues are first enabled before the DRIVER_OK
> >> >status flag is set; with the current API of the kernel module, it is
> >> >impossible to enable the opposite order in our block export code because
> >> >userspace is not notified when a virtqueue is enabled.
> >
> >Did this mean virtio-blk will enable a virtqueue after DRIVER_OK?
>
> It's not specific to virtio-blk, but to the generic vdpa device we have
> in QEMU (i.e. vhost-vdpa-device). Yep, after commit
> 6c4825476a4351530bcac17abab72295b75ffe98, virtqueues are enabled after
> DRIVER_OK.

Right.

>
> >Sepc is not clear about this and that's why we introduce
> >VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK.
>
> Ah, I didn't know about this new feature. So after commit
> 6c4825476a4351530bcac17abab72295b75ffe98 the vhost-vdpa-device is not
> complying with the specification, right?

Kind of, but as stated, it's just because spec is unclear about the
behaviour. There's a chance that spec will explicitly support it in
the future.

>
> >
> >>
> >> Yeah, IMHO the VDUSE protocol is missing a VDUSE_SET_VQ_READY message,
> >
> >I think you meant when VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is
> >negotiated.
>
> At this point yes. But if VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK is not
> negotiated, should we return an error in vhost-vdpa kernel module if
> VHOST_VDPA_SET_VRING_ENABLE is called when DRIVER_OK is already set?

I'm not sure if this can break some setups or not. It might be better
to leave it as is?

Without VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK, we don't know if
parent support vq_ready after driver_ok.
With VHOST_BACKEND_F_ENABLE_AFTER_DRIVER_OK, we know parent support
vq_ready after driver_ok.

>
> >If this is truth, it seems a little more complicated, for
> >example the get_backend_features needs to be forward to the userspace?
>
> I'm not understanding, don't we already have VHOST_GET_BACKEND_FEATURES
> for this? Or do you mean userspace on the VDUSE side?

Yes, since in this case the parent is in the userspace, there's no way
for VDUSE to know if user space supports vq_ready after driver_ok or
not.

As you may have noticed, we don't have a message for vq_ready which
implies that vq_ready after driver_ok can't be supported.

>
> >This seems suboptimal to implement this in the spec first and then we
> >can leverage the features. Or we can have another parameter for the
> >ioctl that creates the vduse device.
>
> I got a little lost, though in vhost-user, the device can always expect
> a vring_enable/disable, so I thought it was not complicated in VDUSE.

Yes, the problem is assuming we have a message for vq_ready, there
could be  a "legacy" userspace that doesn't support that.  So in that
case, VDUSE needs to know if the userspace parent can support that or
not.

>
> >
> >> I'll start another thread about that, but in the meantime I agree that
> >> we should fix QEMU since we need to work properly with old kernels as
> >> well.
> >>
> >> >
> >> >This requirement also mathces the normal initialisation order as done by
> >> >the generic vhost code in QEMU. However, commit 6c482547 accidentally
> >> >changed the order for vdpa-dev and broke access to VDUSE devices with
> >> >this.
> >> >
> >> >This changes vdpa-dev to use the normal order again and use the standard
> >> >vhost callback .vhost_set_vring_enable for this. VDUSE devices can be
> >> >used with vdpa-dev again after this fix.
> >>
> >> I like this approach and the patch LGTM, but I'm a bit worried about
> >> this function in hw/net/vhost_net.c:
> >>
> >>      int vhost_set_vring_enable(NetClientState *nc, int enable)
> >>      {
> >>          VHostNetState *net = get_vhost_net(nc);
> >>          const VhostOps *vhost_ops = net->dev.vhost_ops;
> >>
> >>          nc->vring_enable = enable;
> >>
> >>          if (vhost_ops && vhost_ops->vhost_set_vring_enable) {
> >>              return vhost_ops->vhost_set_vring_enable(&net->dev, enable);
> >>          }
> >>
> >>          return 0;
> >>      }
> >>
> >> @Eugenio, @Jason, should we change some things there if vhost-vdpa
> >> implements the vhost_set_vring_enable callback?
> >
> >Eugenio may know more, I remember we need to enable cvq first for
> >shadow virtqueue to restore some states.
> >
> >>
> >> Do you remember why we didn't implement it from the beginning?
> >
> >It seems the vrings parameter is introduced after vhost-vdpa is
> >implemented.
>
> Sorry, I mean why we didn't implement the vhost_set_vring_enable
> callback for vhost-vdpa from the beginning.

Adding Cindy who writes those codes for more thoughts.

it's a really long time ago, and I can't remember it clearly. It seems
like there might be an issue with the sequence being called for
dev_start and vhost_set_vring_enable?
I have searched my mail but find nothing. I think we should do a full
test if we want to this

IMHO in hw/net/vhost_net.c we should check the type of backend (e.g. vhost-vdpa) to decide whether to call vhost_ops->vhost_set_vring_enable() or not, instead of just checking whether the callback is implemented or not, because in the future we might have other backends (e.g. a native vdpa-blk driver) that might want that callback implemented.

Something like this (untested):

diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index e8e1661646..1fea678e29 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -543,7 +543,8 @@ int vhost_set_vring_enable(NetClientState *nc, int enable)

     nc->vring_enable = enable;

-    if (vhost_ops && vhost_ops->vhost_set_vring_enable) {
+    if (nc->info->type != NET_CLIENT_DRIVER_VHOST_VDPA &&
+            vhost_ops && vhost_ops->vhost_set_vring_enable) {
         return vhost_ops->vhost_set_vring_enable(&net->dev, enable);
     }

WDYT?

Thanks,
Stefano


Reply via email to