On Thu, Oct 14, 2021 at 2:00 PM Eugenio Perez Martin <epere...@redhat.com> wrote: > > On Wed, Oct 13, 2021 at 5:27 AM Jason Wang <jasow...@redhat.com> wrote: > > > > > > 在 2021/10/1 下午3:05, Eugenio Pérez 写道: > > > Shadow virtqueue notifications forwarding is disabled when vhost_dev > > > stops, so code flow follows usual cleanup. > > > > > > Also, host notifiers must be disabled at SVQ start, > > > > > > Any reason for this? > > > > It will be addressed in a later series, sorry. > > > > > > and they will not > > > start if SVQ has been enabled when device is stopped. This is trivial > > > to address, but it is left out for simplicity at this moment. > > > > > > It looks to me this patch also contains the following logics > > > > 1) codes to enable svq > > > > 2) codes to let svq to be enabled from QMP. > > > > I think they need to be split out, > > I agree that we can split this more, with the code that belongs to SVQ > and the code that belongs to vhost-vdpa. it will be addressed in > future series. > > > we may endup with the following > > series of patches > > > > With "series of patches" do you mean to send every step in a separated > series? There are odds of having the need of modifying code already > sent & merged with later ones. If you confirm to me that it is fine, I > can do it that way for sure. > > > 1) svq skeleton with enable/disable > > 2) route host notifier to svq > > 3) route guest notifier to svq > > 4) codes to enable svq > > 5) enable svq via QMP > > > > I'm totally fine with that, but there is code that is never called if > the qmp command is not added. The compiler complains about static > functions that are not called, making impossible things like bisecting > through these commits, unless I use attribute((unused)) or similar. Or > have I missed something? > > We could do that way with the code that belongs to SVQ though, since > all of it is declared in headers. But to delay the "enable svq via > qmp" to the last one makes debugging harder, as we cannot just enable > notifications forwarding with no buffers forwarding. > > If I introduce a change in the notifications code, I can simply go to > these commits and enable SVQ for notifications. This way I can have an > idea of what part is failing. A similar logic can be applied to other > devices than vp_vdpa. We would lose it if we > > > > > > > > > Signed-off-by: Eugenio Pérez <epere...@redhat.com> > > > --- > > > qapi/net.json | 2 +- > > > hw/virtio/vhost-shadow-virtqueue.h | 8 ++ > > > include/hw/virtio/vhost-vdpa.h | 4 + > > > hw/virtio/vhost-shadow-virtqueue.c | 138 ++++++++++++++++++++++++++++- > > > hw/virtio/vhost-vdpa.c | 116 +++++++++++++++++++++++- > > > 5 files changed, 264 insertions(+), 4 deletions(-) > > > > > > diff --git a/qapi/net.json b/qapi/net.json > > > index a2c30fd455..fe546b0e7c 100644 > > > --- a/qapi/net.json > > > +++ b/qapi/net.json > > > @@ -88,7 +88,7 @@ > > > # > > > # @enable: true to use the alternate shadow VQ notifications > > > # > > > -# Returns: Always error, since SVQ is not implemented at the moment. > > > +# Returns: Error if failure, or 'no error' for success. > > > # > > > # Since: 6.2 > > > # > > > diff --git a/hw/virtio/vhost-shadow-virtqueue.h > > > b/hw/virtio/vhost-shadow-virtqueue.h > > > index 27ac6388fa..237cfceb9c 100644 > > > --- a/hw/virtio/vhost-shadow-virtqueue.h > > > +++ b/hw/virtio/vhost-shadow-virtqueue.h > > > @@ -14,6 +14,14 @@ > > > > > > typedef struct VhostShadowVirtqueue VhostShadowVirtqueue; > > > > > > +EventNotifier *vhost_svq_get_svq_call_notifier(VhostShadowVirtqueue > > > *svq); > > > > > > Let's move this function to another patch since it's unrelated to the > > guest->host routing. > > > > Right, I missed it while squashing commits and at later reviews. > > > > > > +void vhost_svq_set_guest_call_notifier(VhostShadowVirtqueue *svq, int > > > call_fd); > > > + > > > +bool vhost_svq_start(struct vhost_dev *dev, unsigned idx, > > > + VhostShadowVirtqueue *svq); > > > +void vhost_svq_stop(struct vhost_dev *dev, unsigned idx, > > > + VhostShadowVirtqueue *svq); > > > + > > > VhostShadowVirtqueue *vhost_svq_new(struct vhost_dev *dev, int idx); > > > > > > void vhost_svq_free(VhostShadowVirtqueue *vq); > > > diff --git a/include/hw/virtio/vhost-vdpa.h > > > b/include/hw/virtio/vhost-vdpa.h > > > index 0d565bb5bd..48aae59d8e 100644 > > > --- a/include/hw/virtio/vhost-vdpa.h > > > +++ b/include/hw/virtio/vhost-vdpa.h > > > @@ -12,6 +12,8 @@ > > > #ifndef HW_VIRTIO_VHOST_VDPA_H > > > #define HW_VIRTIO_VHOST_VDPA_H > > > > > > +#include <gmodule.h> > > > + > > > #include "qemu/queue.h" > > > #include "hw/virtio/virtio.h" > > > > > > @@ -24,6 +26,8 @@ typedef struct vhost_vdpa { > > > int device_fd; > > > uint32_t msg_type; > > > MemoryListener listener; > > > + bool shadow_vqs_enabled; > > > + GPtrArray *shadow_vqs; > > > struct vhost_dev *dev; > > > QLIST_ENTRY(vhost_vdpa) entry; > > > VhostVDPAHostNotifier notifier[VIRTIO_QUEUE_MAX]; > > > diff --git a/hw/virtio/vhost-shadow-virtqueue.c > > > b/hw/virtio/vhost-shadow-virtqueue.c > > > index c4826a1b56..21dc99ab5d 100644 > > > --- a/hw/virtio/vhost-shadow-virtqueue.c > > > +++ b/hw/virtio/vhost-shadow-virtqueue.c > > > @@ -9,9 +9,12 @@ > > > > > > #include "qemu/osdep.h" > > > #include "hw/virtio/vhost-shadow-virtqueue.h" > > > +#include "hw/virtio/vhost.h" > > > + > > > +#include "standard-headers/linux/vhost_types.h" > > > > > > #include "qemu/error-report.h" > > > -#include "qemu/event_notifier.h" > > > +#include "qemu/main-loop.h" > > > > > > /* Shadow virtqueue to relay notifications */ > > > typedef struct VhostShadowVirtqueue { > > > @@ -19,14 +22,146 @@ typedef struct VhostShadowVirtqueue { > > > EventNotifier kick_notifier; > > > /* Shadow call notifier, sent to vhost */ > > > EventNotifier call_notifier; > > > + > > > + /* > > > + * Borrowed virtqueue's guest to host notifier. > > > + * To borrow it in this event notifier allows to register on the > > > event > > > + * loop and access the associated shadow virtqueue easily. If we use > > > the > > > + * VirtQueue, we don't have an easy way to retrieve it. > > > + * > > > + * So shadow virtqueue must not clean it, or we would lose VirtQueue > > > one. > > > + */ > > > + EventNotifier host_notifier; > > > + > > > + /* Guest's call notifier, where SVQ calls guest. */ > > > + EventNotifier guest_call_notifier; > > > > > > To be consistent, let's simply use "guest_notifier" here. > > > > It could be confused when the series adds a guest -> qemu kick > notifier then. Actually, I think it would be better to rename > host_notifier to something like host_svq_notifier. Or host_call and > guest_call, since "notifier" is already in the type, making the name > to be a little bit "Hungarian notation". > > > > > > + > > > + /* Virtio queue shadowing */ > > > + VirtQueue *vq; > > > } VhostShadowVirtqueue; > > > > > > +/* Forward guest notifications */ > > > +static void vhost_handle_guest_kick(EventNotifier *n) > > > +{ > > > + VhostShadowVirtqueue *svq = container_of(n, VhostShadowVirtqueue, > > > + host_notifier); > > > + > > > + if (unlikely(!event_notifier_test_and_clear(n))) { > > > + return; > > > + } > > > > > > Is there a chance that we may stop the processing of available buffers > > during the svq enabling? There could be no kick from the guest in this case. > > > > Actually, yes, I think you are right. The guest kick eventfd could > have been consumed by vhost but there may be still pending buffers. > > I think it would be better to check for available buffers first, then > clear the notifier unconditionally, and then re-check and process them > if any [1]. > > However, this problem arises later in the series: At this moment the > device is not reset and guest's host notifier is not replaced, so > either vhost/device receives the kick, or SVQ does and forwards it. > > Does it make sense to you? > > > > > > + > > > + event_notifier_set(&svq->kick_notifier); > > > +} > > > + > > > +/* > > > + * Obtain the SVQ call notifier, where vhost device notifies SVQ that > > > there > > > + * exists pending used buffers. > > > + * > > > + * @svq Shadow Virtqueue > > > + */ > > > +EventNotifier *vhost_svq_get_svq_call_notifier(VhostShadowVirtqueue *svq) > > > +{ > > > + return &svq->call_notifier; > > > +} > > > + > > > +/* > > > + * Set the call notifier for the SVQ to call the guest > > > + * > > > + * @svq Shadow virtqueue > > > + * @call_fd call notifier > > > + * > > > + * Called on BQL context. > > > + */ > > > +void vhost_svq_set_guest_call_notifier(VhostShadowVirtqueue *svq, int > > > call_fd) > > > +{ > > > + event_notifier_init_fd(&svq->guest_call_notifier, call_fd); > > > +} > > > + > > > +/* > > > + * Restore the vhost guest to host notifier, i.e., disables svq effect. > > > + */ > > > +static int vhost_svq_restore_vdev_host_notifier(struct vhost_dev *dev, > > > + unsigned vhost_index, > > > + VhostShadowVirtqueue > > > *svq) > > > +{ > > > + EventNotifier *vq_host_notifier = > > > virtio_queue_get_host_notifier(svq->vq); > > > + struct vhost_vring_file file = { > > > + .index = vhost_index, > > > + .fd = event_notifier_get_fd(vq_host_notifier), > > > + }; > > > + int r; > > > + > > > + /* Restore vhost kick */ > > > + r = dev->vhost_ops->vhost_set_vring_kick(dev, &file); > > > > > > And remap the notification area if necessary. > > Totally right, that step is missed in this series. > > However, remapping guest host notifier memory region has no advantages > over using ioeventfd to perform guest -> SVQ notifications, doesn't > it? By both methods flow needs to go through the hypervisor kernel. > > > > > > > > + return r ? -errno : 0; > > > +} > > > + > > > +/* > > > + * Start shadow virtqueue operation. > > > + * @dev vhost device > > > + * @hidx vhost virtqueue index > > > + * @svq Shadow Virtqueue > > > + */ > > > +bool vhost_svq_start(struct vhost_dev *dev, unsigned idx, > > > + VhostShadowVirtqueue *svq) > > > +{ > > > + EventNotifier *vq_host_notifier = > > > virtio_queue_get_host_notifier(svq->vq); > > > + struct vhost_vring_file file = { > > > + .index = dev->vhost_ops->vhost_get_vq_index(dev, dev->vq_index + > > > idx), > > > + .fd = event_notifier_get_fd(&svq->kick_notifier), > > > + }; > > > + int r; > > > + > > > + /* Check that notifications are still going directly to vhost dev */ > > > + assert(virtio_queue_is_host_notifier_enabled(svq->vq)); > > > + > > > + /* > > > + * event_notifier_set_handler already checks for guest's > > > notifications if > > > + * they arrive in the switch, so there is no need to explicitely > > > check for > > > + * them. > > > + */ > > > > > > If this is true, shouldn't we call vhost_set_vring_kick() before the > > event_notifier_set_handler()? > > > > Not at this point of the series, but it could be another solution when > we need to reset the device and we are unsure if all buffers have been > read. But I think I prefer the solution exposed in [1] and to > explicitly call vhost_handle_guest_kick here. Do you think > differently? > > > Btw, I think we should update the fd if set_vring_kick() was called > > after this function? > > > > Kind of. This is currently bad in the code, but... > > Backend callbacks vhost_ops->vhost_set_vring_kick and > vhost_ops->vhost_set_vring_addr are only called at > vhost_virtqueue_start. And they are always called with known data > already stored in VirtQueue. > > To avoid storing more state in vhost_vdpa, I think that we should > avoid duplicating them, but ignore new kick_fd or address in SVQ mode, > and retrieve them again at the moment the device is (re)started in SVQ > mode. Qemu already avoids things like allowing the guest to set > addresses at random time, using the VirtIOPCIProxy to store them. > > I also see how duplicating that status could protect vdpa SVQ code > against future changes to vhost code, but that would make this series > bigger and more complex with no need at this moment in my opinion. > > Do you agree? > > > > > > + event_notifier_init_fd(&svq->host_notifier, > > > + event_notifier_get_fd(vq_host_notifier)); > > > + event_notifier_set_handler(&svq->host_notifier, > > > vhost_handle_guest_kick); > > > + > > > + r = dev->vhost_ops->vhost_set_vring_kick(dev, &file); > > > > > > And we need to stop the notification area mmap. > > > > Right. > > > > > > + if (unlikely(r != 0)) { > > > + error_report("Couldn't set kick fd: %s", strerror(errno)); > > > + goto err_set_vring_kick; > > > + } > > > + > > > + return true; > > > + > > > +err_set_vring_kick: > > > + event_notifier_set_handler(&svq->host_notifier, NULL); > > > + > > > + return false; > > > +} > > > + > > > +/* > > > + * Stop shadow virtqueue operation. > > > + * @dev vhost device > > > + * @idx vhost queue index > > > + * @svq Shadow Virtqueue > > > + */ > > > +void vhost_svq_stop(struct vhost_dev *dev, unsigned idx, > > > + VhostShadowVirtqueue *svq) > > > +{ > > > + int r = vhost_svq_restore_vdev_host_notifier(dev, idx, svq); > > > + if (unlikely(r < 0)) { > > > + error_report("Couldn't restore vq kick fd: %s", strerror(-r)); > > > + } > > > + > > > + event_notifier_set_handler(&svq->host_notifier, NULL); > > > +} > > > + > > > /* > > > * Creates vhost shadow virtqueue, and instruct vhost device to use the > > > shadow > > > * methods and file descriptors. > > > */ > > > VhostShadowVirtqueue *vhost_svq_new(struct vhost_dev *dev, int idx) > > > { > > > + int vq_idx = dev->vq_index + idx; > > > g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, > > > 1); > > > int r; > > > > > > @@ -44,6 +179,7 @@ VhostShadowVirtqueue *vhost_svq_new(struct vhost_dev > > > *dev, int idx) > > > goto err_init_call_notifier; > > > } > > > > > > + svq->vq = virtio_get_queue(dev->vdev, vq_idx); > > > return g_steal_pointer(&svq); > > > > > > err_init_call_notifier: > > > diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c > > > index e0dc7508c3..36c954a779 100644 > > > --- a/hw/virtio/vhost-vdpa.c > > > +++ b/hw/virtio/vhost-vdpa.c > > > @@ -17,6 +17,7 @@ > > > #include "hw/virtio/vhost.h" > > > #include "hw/virtio/vhost-backend.h" > > > #include "hw/virtio/virtio-net.h" > > > +#include "hw/virtio/vhost-shadow-virtqueue.h" > > > #include "hw/virtio/vhost-vdpa.h" > > > #include "exec/address-spaces.h" > > > #include "qemu/main-loop.h" > > > @@ -272,6 +273,16 @@ static void vhost_vdpa_add_status(struct vhost_dev > > > *dev, uint8_t status) > > > vhost_vdpa_call(dev, VHOST_VDPA_SET_STATUS, &s); > > > } > > > > > > +/** > > > + * Adaptor function to free shadow virtqueue through gpointer > > > + * > > > + * @svq The Shadow Virtqueue > > > + */ > > > +static void vhost_psvq_free(gpointer svq) > > > +{ > > > + vhost_svq_free(svq); > > > +} > > > + > > > static int vhost_vdpa_init(struct vhost_dev *dev, void *opaque, Error > > > **errp) > > > { > > > struct vhost_vdpa *v; > > > @@ -283,6 +294,7 @@ static int vhost_vdpa_init(struct vhost_dev *dev, > > > void *opaque, Error **errp) > > > dev->opaque = opaque ; > > > v->listener = vhost_vdpa_memory_listener; > > > v->msg_type = VHOST_IOTLB_MSG_V2; > > > + v->shadow_vqs = g_ptr_array_new_full(dev->nvqs, vhost_psvq_free); > > > QLIST_INSERT_HEAD(&vhost_vdpa_devices, v, entry); > > > > > > vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE | > > > @@ -373,6 +385,17 @@ err: > > > return; > > > } > > > > > > +static void vhost_vdpa_svq_cleanup(struct vhost_dev *dev) > > > +{ > > > + struct vhost_vdpa *v = dev->opaque; > > > + size_t idx; > > > + > > > + for (idx = 0; idx < v->shadow_vqs->len; ++idx) { > > > + vhost_svq_stop(dev, idx, g_ptr_array_index(v->shadow_vqs, idx)); > > > + } > > > + g_ptr_array_free(v->shadow_vqs, true); > > > +} > > > + > > > static int vhost_vdpa_cleanup(struct vhost_dev *dev) > > > { > > > struct vhost_vdpa *v; > > > @@ -381,6 +404,7 @@ static int vhost_vdpa_cleanup(struct vhost_dev *dev) > > > trace_vhost_vdpa_cleanup(dev, v); > > > vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs); > > > memory_listener_unregister(&v->listener); > > > + vhost_vdpa_svq_cleanup(dev); > > > QLIST_REMOVE(v, entry); > > > > > > dev->opaque = NULL; > > > @@ -557,7 +581,9 @@ static int vhost_vdpa_dev_start(struct vhost_dev > > > *dev, bool started) > > > if (started) { > > > uint8_t status = 0; > > > memory_listener_register(&v->listener, &address_space_memory); > > > - vhost_vdpa_host_notifiers_init(dev); > > > + if (!v->shadow_vqs_enabled) { > > > + vhost_vdpa_host_notifiers_init(dev); > > > + } > > > > > > This looks like a trick, why not check and setup shadow_vqs inside: > > > > 1) vhost_vdpa_host_notifiers_init() > > > > and > > > > 2) vhost_vdpa_set_vring_kick() > > > > Ok I will move the checks there. > > > > > > vhost_vdpa_set_vring_ready(dev); > > > vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_DRIVER_OK); > > > vhost_vdpa_call(dev, VHOST_VDPA_GET_STATUS, &status); > > > @@ -663,10 +689,96 @@ static bool vhost_vdpa_force_iommu(struct > > > vhost_dev *dev) > > > return true; > > > } > > > > > > +/* > > > + * Start shadow virtqueue. > > > + */ > > > +static bool vhost_vdpa_svq_start_vq(struct vhost_dev *dev, unsigned idx) > > > +{ > > > + struct vhost_vdpa *v = dev->opaque; > > > + VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, idx); > > > + return vhost_svq_start(dev, idx, svq); > > > +} > > > + > > > +static unsigned vhost_vdpa_enable_svq(struct vhost_vdpa *v, bool enable) > > > +{ > > > + struct vhost_dev *hdev = v->dev; > > > + unsigned n; > > > + > > > + if (enable == v->shadow_vqs_enabled) { > > > + return hdev->nvqs; > > > + } > > > + > > > + if (enable) { > > > + /* Allocate resources */ > > > + assert(v->shadow_vqs->len == 0); > > > + for (n = 0; n < hdev->nvqs; ++n) { > > > + VhostShadowVirtqueue *svq = vhost_svq_new(hdev, n); > > > + bool ok; > > > + > > > + if (unlikely(!svq)) { > > > + g_ptr_array_set_size(v->shadow_vqs, 0); > > > + return 0; > > > + } > > > + g_ptr_array_add(v->shadow_vqs, svq); > > > + > > > + ok = vhost_vdpa_svq_start_vq(hdev, n); > > > + if (unlikely(!ok)) { > > > + /* Free still not started svqs */ > > > + g_ptr_array_set_size(v->shadow_vqs, n); > > > + enable = false; > > [2] > > > > + break; > > > + } > > > + } > > > > > > Since there's almost no logic could be shared between enable and > > disable. Let's split those logic out into dedicated functions where the > > codes looks more easy to be reviewed (e.g have a better error handling etc). > > > > Maybe it could be more clear in the code, but the reused logic is the > disabling of SVQ and the fallback in case it cannot be enabled with > [2]. But I'm not against splitting in two different functions if it > makes review easier. > > > > > > + } > > > + > > > + v->shadow_vqs_enabled = enable; > > > + > > > + if (!enable) { > > > + /* Disable all queues or clean up failed start */ > > > + for (n = 0; n < v->shadow_vqs->len; ++n) { > > > + unsigned vq_idx = vhost_vdpa_get_vq_index(hdev, n); > > > + VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, > > > n); > > > + vhost_svq_stop(hdev, n, svq); > > > + vhost_virtqueue_start(hdev, hdev->vdev, &hdev->vqs[n], > > > vq_idx); > > > + } > > > + > > > + /* Resources cleanup */ > > > + g_ptr_array_set_size(v->shadow_vqs, 0); > > > + } > > > + > > > + return n; > > > +} > > > > > > void qmp_x_vhost_enable_shadow_vq(const char *name, bool enable, Error > > > **errp) > > > { > > > - error_setg(errp, "Shadow virtqueue still not implemented"); > > > + struct vhost_vdpa *v; > > > + const char *err_cause = NULL; > > > + bool r; > > > + > > > + QLIST_FOREACH(v, &vhost_vdpa_devices, entry) { > > > + if (v->dev->vdev && 0 == strcmp(v->dev->vdev->name, name)) { > > > + break; > > > + } > > > + } > > > > > > I think you can iterate the NetClientStates to ge tthe vhost-vdpa backends. > > > > Right, I missed it. >
Actually, that would always miss other device types like blk (isn't it?). But using just the name is definitely a bad idea. > > > > > + > > > + if (!v) { > > > + err_cause = "Device not found"; > > > + goto err; > > > + } else if (v->notifier[0].addr) { > > > + err_cause = "Device has host notifiers enabled"; > > > > > > I don't get this. > > > > At this moment of the series you can enable guest -> SVQ -> 'vdpa > device' if the device is not using the host notifiers memory region. > The right solution is to disable it for the guest, and to handle it in > SVQ. Otherwise, guest kick will bypass SVQ and > > It can be done in the same patch, or at least to disable (as unmap) > them at this moment and handle them in a posterior patch. but for > prototyping the solution I just ignored it in this series. It will be > handled some way or another in the next one. I prefer the last one, to > handle in a different patch, but let me know if you think it is better > otherwise. > > > Btw this function should be implemented in an independent patch after > > svq is fully functional. > > > > (Reasons for that are already commented at the top of this mail :) ). > > Thanks! > > > Thanks > > > > > > > + goto err; > > > + } > > > + > > > + r = vhost_vdpa_enable_svq(v, enable); > > > + if (unlikely(!r)) { > > > + err_cause = "Error enabling (see monitor)"; > > > + goto err; > > > + } > > > + > > > +err: > > > + if (err_cause) { > > > + error_setg(errp, "Can't enable shadow vq on %s: %s", name, > > > err_cause); > > > + } > > > } > > > > > > const VhostOps vdpa_ops = { > >