Re: [PATCH v3 00/17] virtio pci support VIRTIO_F_RING_RESET
On Tue, 08 Feb 2022 11:14:45 +0800, Xuan Zhuo wrote: > On Tue, 8 Feb 2022 10:59:48 +0800, Jason Wang wrote: > > > > On 2022/2/7 2:02 PM, Xuan Zhuo wrote: > > > On Mon, 7 Feb 2022 11:39:36 +0800, Jason Wang wrote: > > >> On Wed, Jan 26, 2022 at 3:35 PM Xuan Zhuo > > >> wrote: > > >>> > > >>> The virtio spec already supports the virtio queue reset function. This > > >>> patch set > > >>> adds this function to the kernel. The relevant virtio spec > > >>> information is > > >>> here: > > >>> > > >>> https://github.com/oasis-tcs/virtio-spec/issues/124 > > >>> > > >>> Also regarding MMIO support for queue reset, I plan to support it after > > >>> this > > >>> patch set is accepted. > > >>> > > >>> #14-#17 are the disable/enable functions of the rx/tx pair implemented by > > >>> virtio-net > > >>> using the new helper. > > >> One thing that came to mind is the steering. E.g. if we disable an RX queue, > > >> should we stop steering packets to that queue? Regarding this spec, if multiple queues are disabled at the same time, selecting a queue becomes a troublesome problem for the backend, so I want to define that only one queue is allowed to be reset at a time. Do you think this is appropriate? In terms of the implementation of backend queue reselection, it would be more convenient to implement if we drop packets directly. Do you think we must implement this reselection function? Thanks. > > > Yes, we should reselect a queue. > > > > > > Thanks. > > > > > > Maybe a spec patch for that? > > Yes, I also realized this. Although virtio-net's disable/enable is implemented > based on queue reset, virtio-net still has to define its own flag and define > some more detailed implementations. > > I'll think about it and post a spec patch. > > Thanks. > > > > > Thanks > > > > > > > > > >> Thanks > > >> > > >>> This function is not currently referenced by other > > >>> functions.
It is more to show the usage of the new helper, I'm not sure > > >>> if they > > >>> are going to be merged together. > > >>> > > >>> Please review. Thanks. > > >>> > > >>> v3: > > >>>1. keep vq, irq unreleased > > >>> > > >>> Xuan Zhuo (17): > > >>>virtio_pci: struct virtio_pci_common_cfg add queue_notify_data > > >>>virtio: queue_reset: add VIRTIO_F_RING_RESET > > >>>virtio: queue_reset: struct virtio_config_ops add callbacks for > > >>> queue_reset > > >>>virtio: queue_reset: add helper > > >>>vritio_ring: queue_reset: extract the release function of the vq ring > > >>>virtio_ring: queue_reset: split: add __vring_init_virtqueue() > > >>>virtio_ring: queue_reset: split: support enable reset queue > > >>>virtio_ring: queue_reset: packed: support enable reset queue > > >>>virtio_ring: queue_reset: add vring_reset_virtqueue() > > >>>virtio_pci: queue_reset: update struct virtio_pci_common_cfg and > > >>> option functions > > >>>virtio_pci: queue_reset: release vq by vp_dev->vqs > > >>>virtio_pci: queue_reset: setup_vq use vring_setup_virtqueue() > > >>>virtio_pci: queue_reset: support VIRTIO_F_RING_RESET > > >>>virtio_net: virtnet_tx_timeout() fix style > > >>>virtio_net: virtnet_tx_timeout() stop ref sq->vq > > >>>virtio_net: split free_unused_bufs() > > >>>virtio_net: support pair disable/enable > > >>> > > >>> drivers/net/virtio_net.c | 220 ++--- > > >>> drivers/virtio/virtio_pci_common.c | 62 --- > > >>> drivers/virtio/virtio_pci_common.h | 11 +- > > >>> drivers/virtio/virtio_pci_legacy.c | 5 +- > > >>> drivers/virtio/virtio_pci_modern.c | 120 +- > > >>> drivers/virtio/virtio_pci_modern_dev.c | 28 > > >>> drivers/virtio/virtio_ring.c | 144 +++- > > >>> include/linux/virtio.h | 1 + > > >>> include/linux/virtio_config.h | 75 - > > >>> include/linux/virtio_pci_modern.h | 2 + > > >>> include/linux/virtio_ring.h| 42 +++-- > > >>> include/uapi/linux/virtio_config.h | 7 +- > > >>> include/uapi/linux/virtio_pci.h| 2 + > > >>> 13 files changed, 618 insertions(+), 101
deletions(-) > > >>> > > >>> -- > > >>> 2.31.0 > > >>> > > > ___ > Virtualization mailing list > Virtualization@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/virtualization
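The two-step reset flow this series implements (reset_vq, then enable_reset_vq) can be pictured with a small mock. This is a hedged sketch under assumed names — mock_vq, mock_reset_vq and mock_enable_reset_vq are hypothetical stand-ins, not the kernel API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a virtqueue undergoing reset. */
struct mock_vq {
    uint16_t index;
    bool reset;       /* queue is currently in the reset state */
    int num_unused;   /* buffers still owned by the device */
};

/* Step 1: notify the hardware, recycle outstanding buffers,
 * release the ring (modeled here by zeroing num_unused). */
int mock_reset_vq(struct mock_vq *vq)
{
    vq->reset = true;      /* e.g. write the per-queue reset register */
    vq->num_unused = 0;    /* recycle buffers back to the driver */
    return 0;
}

/* Step 2: re-enable the queue, reusing the original vq/info. */
int mock_enable_reset_vq(struct mock_vq *vq)
{
    if (!vq->reset)
        return -1;         /* only a reset queue may be re-enabled */
    vq->reset = false;
    return 0;
}
```

The point of keeping the vq and irq unreleased (the v3 change noted above) is that step 2 can reuse the original objects instead of reallocating them.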
Re: [PATCH v3 13/17] virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
On Tue, 8 Feb 2022 10:55:37 +0800, Jason Wang wrote: > On Mon, Feb 7, 2022 at 4:19 PM Xuan Zhuo wrote: > > > > On Mon, 7 Feb 2022 14:57:13 +0800, Jason Wang wrote: > > > > > > On 2022/1/26 3:35 PM, Xuan Zhuo wrote: > > > > This patch implements virtio pci support for QUEUE RESET. > > > > > > > > Performing reset on a queue is divided into two steps: > > > > > > > > 1. reset_vq: reset one vq > > > > 2. enable_reset_vq: re-enable the reset queue > > > > > > > > In the first step, these tasks will be completed: > > > > 1. notify the hardware queue to reset > > > > 2. recycle the buffer from vq > > > > 3. release the ring of the vq > > > > > > > > The process of enable reset vq: > > > > vp_modern_enable_reset_vq() > > > > vp_enable_reset_vq() > > > > __vp_setup_vq() > > > > setup_vq() > > > > vring_setup_virtqueue() > > > > > > > > In this process, we added two parameters, vq and num, and finally passed > > > > them to vring_setup_virtqueue(). And reuse the original info and vq. > > > > > > > > Signed-off-by: Xuan Zhuo > > > > --- > > > > drivers/virtio/virtio_pci_common.c | 36 +++ > > > > drivers/virtio/virtio_pci_common.h | 5 ++ > > > > drivers/virtio/virtio_pci_modern.c | 100 + > > > > 3 files changed, 128 insertions(+), 13 deletions(-) > > > > > > > > diff --git a/drivers/virtio/virtio_pci_common.c > > > > b/drivers/virtio/virtio_pci_common.c > > > > index c02936d29a31..ad21638fbf66 100644 > > > > --- a/drivers/virtio/virtio_pci_common.c > > > > +++ b/drivers/virtio/virtio_pci_common.c > > > > @@ -205,23 +205,28 @@ static int vp_request_msix_vectors(struct > > > > virtio_device *vdev, int nvectors, > > > > return err; > > > > } > > > > > > > > -static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, > > > > unsigned index, > > > > -void (*callback)(struct virtqueue *vq), > > > > -const char *name, > > > > -bool ctx, > > > > -u16 msix_vec, u16 num) > > > > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned > > > > index, > > > > + void
(*callback)(struct virtqueue *vq), > > > > + const char *name, > > > > + bool ctx, > > > > + u16 msix_vec, u16 num) > > > > { > > > > struct virtio_pci_device *vp_dev = to_vp_device(vdev); > > > > - struct virtio_pci_vq_info *info = kmalloc(sizeof *info, GFP_KERNEL); > > > > + struct virtio_pci_vq_info *info; > > > > struct virtqueue *vq; > > > > unsigned long flags; > > > > > > > > - /* fill out our structure that represents an active queue */ > > > > - if (!info) > > > > - return ERR_PTR(-ENOMEM); > > > > + info = vp_dev->vqs[index]; > > > > + if (!info) { > > > > + info = kzalloc(sizeof *info, GFP_KERNEL); > > > > + > > > > + /* fill out our structure that represents an active queue */ > > > > + if (!info) > > > > + return ERR_PTR(-ENOMEM); > > > > + } > > > > > > > > vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, ctx, > > > > - msix_vec, NULL, num); > > > > + msix_vec, info->vq, num); > > > > if (IS_ERR(vq)) > > > > goto out_info; > > > > > > > > @@ -238,6 +243,9 @@ static struct virtqueue *vp_setup_vq(struct > > > > virtio_device *vdev, unsigned index, > > > > return vq; > > > > > > > > out_info: > > > > + if (info->vq && info->vq->reset) > > > > + return vq; > > > > + > > > > kfree(info); > > > > return vq; > > > > } > > > > @@ -248,9 +256,11 @@ static void vp_del_vq(struct virtqueue *vq) > > > > struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index]; > > > > unsigned long flags; > > > > > > > > - spin_lock_irqsave(&vp_dev->lock, flags); > > > > - list_del(&info->node); > > > > - spin_unlock_irqrestore(&vp_dev->lock, flags); > > > > + if (!vq->reset) { > > > > + spin_lock_irqsave(&vp_dev->lock, flags); > > > > + list_del(&info->node); > > > > + spin_unlock_irqrestore(&vp_dev->lock, flags); > > > > + } > > > > > > > > vp_dev->del_vq(info); > > > > kfree(info); > > > > diff --git a/drivers/virtio/virtio_pci_common.h > > > > b/drivers/virtio/virtio_pci_common.h > > > > index 65db92245e41..c1d15f7c0be4 100644 > > > > --- a/drivers/virtio/virtio_pci_common.h > > > > +++
b/drivers/virtio/virtio_pci_common.h > > > > @@ -119,6 +119,11 @@ int vp_find_vqs(struct virtio_device *vdev, > > > > unsigned nvqs, > > > > struct virtqueue *vqs[], vq_callback_t *callbacks[], > > > > const char * const names[], const bool *ctx, > > > > struct irq_affinity *desc); > > > > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned > > > > index, > > > > +
Re: [PATCH 16/31] vhost: pass queue index to vhost_vq_get_addr
On 2022/2/1 1:44 AM, Eugenio Perez Martin wrote: On Sat, Jan 29, 2022 at 9:20 AM Jason Wang wrote: On 2022/1/22 4:27 AM, Eugenio Pérez wrote: Doing it that way allows the vhost backend to know what address to return. Signed-off-by: Eugenio Pérez --- hw/virtio/vhost.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c index 7b03efccec..64b955ba0c 100644 --- a/hw/virtio/vhost.c +++ b/hw/virtio/vhost.c @@ -798,9 +798,10 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev, struct vhost_virtqueue *vq, unsigned idx, bool enable_log) { -struct vhost_vring_addr addr; +struct vhost_vring_addr addr = { +.index = idx, +}; int r; -memset(&addr, 0, sizeof(struct vhost_vring_addr)); if (dev->vhost_ops->vhost_vq_get_addr) { r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq); @@ -813,7 +814,6 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev, addr.avail_user_addr = (uint64_t)(unsigned long)vq->avail; addr.used_user_addr = (uint64_t)(unsigned long)vq->used; } I'm a bit lost in the logic above, any reason we need to call vhost_vq_get_addr() :) ? It's the way vhost_virtqueue_set_addr works if the backend has a vhost_vq_get_addr operation (currently, only vhost-vdpa). vhost first asks the backend for the address and then sets it. Right, it's because vhost-vdpa doesn't use VA but GPA. But I'm not sure it's worth a dedicated vhost_ops. But considering we introduce the shadow virtqueue stuff, it should be ok now. (In the future, we may consider generalizing non-vhost-vdpa-specific stuff to VhostShadowVirtqueue; then we can get rid of this vhost_ops.) Previously, the index was not needed because all the information was in vhost_virtqueue. However, extracting the queue index from vhost_virtqueue is tricky, so I think it's easier to simply have that information in the request, something similar to get_base or get_num when asking the vdpa device. We can extract the index from vq - dev->vqs or something similar if it's preferred.
It looks odd for the caller to tell the index considering vhost_virtqueue is already passed. So I think we need to deduce it from vhost_virtqueue as you mentioned here. Thanks Thanks! Thanks -addr.index = idx; addr.log_guest_addr = vq->used_phys; addr.flags = enable_log ? (1 << VHOST_VRING_F_LOG) : 0; r = dev->vhost_ops->vhost_set_vring_addr(dev, &addr);
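The "vq - dev->vqs" idea mentioned above can be shown in a minimal sketch. The struct names are simplified stand-ins for the QEMU ones, and this assumes the vhost_virtqueue objects live in a contiguous array owned by the device:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for QEMU's vhost structures. */
struct mock_virtqueue { int dummy; };

struct mock_vhost_dev {
    struct mock_virtqueue vqs[8];  /* contiguous vq array */
    int nvqs;
};

/* Deduce a virtqueue's index from its position in dev->vqs:
 * pointer difference between two elements of the same array
 * yields the element index. */
unsigned mock_vq_index(const struct mock_vhost_dev *dev,
                       const struct mock_virtqueue *vq)
{
    return (unsigned)(vq - dev->vqs);
}
```

This is why the thread concludes the caller need not pass idx explicitly: the index is recoverable from the vhost_virtqueue pointer itself.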
Re: [PATCH v3 13/17] virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
On 2022/2/8 10:55 AM, Jason Wang wrote: On Mon, Feb 7, 2022 at 4:19 PM Xuan Zhuo wrote: On Mon, 7 Feb 2022 14:57:13 +0800, Jason Wang wrote: On 2022/1/26 3:35 PM, Xuan Zhuo wrote: This patch implements virtio pci support for QUEUE RESET. Performing reset on a queue is divided into two steps: 1. reset_vq: reset one vq 2. enable_reset_vq: re-enable the reset queue In the first step, these tasks will be completed: 1. notify the hardware queue to reset 2. recycle the buffer from vq 3. release the ring of the vq The process of enable reset vq: vp_modern_enable_reset_vq() vp_enable_reset_vq() __vp_setup_vq() setup_vq() vring_setup_virtqueue() In this process, we added two parameters, vq and num, and finally passed them to vring_setup_virtqueue(). And reuse the original info and vq. Signed-off-by: Xuan Zhuo --- drivers/virtio/virtio_pci_common.c | 36 +++ drivers/virtio/virtio_pci_common.h | 5 ++ drivers/virtio/virtio_pci_modern.c | 100 + 3 files changed, 128 insertions(+), 13 deletions(-) diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index c02936d29a31..ad21638fbf66 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -205,23 +205,28 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors, return err; } -static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index, -void (*callback)(struct virtqueue *vq), -const char *name, -bool ctx, -u16 msix_vec, u16 num) +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index, + void (*callback)(struct virtqueue *vq), + const char *name, + bool ctx, + u16 msix_vec, u16 num) { struct virtio_pci_device *vp_dev = to_vp_device(vdev); - struct virtio_pci_vq_info *info = kmalloc(sizeof *info, GFP_KERNEL); + struct virtio_pci_vq_info *info; struct virtqueue *vq; unsigned long flags; - /* fill out our structure that represents an active queue */ - if (!info) - return ERR_PTR(-ENOMEM); + info = vp_dev->vqs[index];
+ if (!info) { + info = kzalloc(sizeof *info, GFP_KERNEL); + + /* fill out our structure that represents an active queue */ + if (!info) + return ERR_PTR(-ENOMEM); + } vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, ctx, - msix_vec, NULL, num); + msix_vec, info->vq, num); if (IS_ERR(vq)) goto out_info; @@ -238,6 +243,9 @@ static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index, return vq; out_info: + if (info->vq && info->vq->reset) + return vq; + kfree(info); return vq; } @@ -248,9 +256,11 @@ static void vp_del_vq(struct virtqueue *vq) struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index]; unsigned long flags; - spin_lock_irqsave(&vp_dev->lock, flags); - list_del(&info->node); - spin_unlock_irqrestore(&vp_dev->lock, flags); + if (!vq->reset) { + spin_lock_irqsave(&vp_dev->lock, flags); + list_del(&info->node); + spin_unlock_irqrestore(&vp_dev->lock, flags); + } vp_dev->del_vq(info); kfree(info); diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h index 65db92245e41..c1d15f7c0be4 100644 --- a/drivers/virtio/virtio_pci_common.h +++ b/drivers/virtio/virtio_pci_common.h @@ -119,6 +119,11 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs, struct virtqueue *vqs[], vq_callback_t *callbacks[], const char * const names[], const bool *ctx, struct irq_affinity *desc); +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index, + void (*callback)(struct virtqueue *vq), + const char *name, + bool ctx, + u16 msix_vec, u16 num); const char *vp_bus_name(struct virtio_device *vdev); /* Setup the affinity for a virtqueue: diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c index 2ce58de549de..6789411169e4 100644 --- a/drivers/virtio/virtio_pci_modern.c +++ b/drivers/virtio/virtio_pci_modern.c @@ -34,6 +34,9 @@ static void vp_transport_features(struct virtio_device *vdev, u64 features) if ((features & BIT_ULL(VIRTIO_F_SR_IOV)) && pci_find_ext_capability(pci_dev,
PCI_EXT_CAP_ID_SRIOV)) __virtio_set_bit(vdev, VIRTIO_F_SR_IOV); + + if (features & BIT_ULL(VIRTIO_F_RING_RESET)) + __virtio_set_bit(vdev, VIRTIO_F_RING_RESET); } /* virtio config->finalize_features()
Re: [PATCH 17/31] vdpa: adapt vhost_ops callbacks to svq
在 2022/2/1 上午2:58, Eugenio Perez Martin 写道: On Sun, Jan 30, 2022 at 5:03 AM Jason Wang wrote: 在 2022/1/22 上午4:27, Eugenio Pérez 写道: First half of the buffers forwarding part, preparing vhost-vdpa callbacks to SVQ to offer it. QEMU cannot enable it at this moment, so this is effectively dead code at the moment, but it helps to reduce patch size. Signed-off-by: Eugenio Pérez --- hw/virtio/vhost-shadow-virtqueue.h | 2 +- hw/virtio/vhost-shadow-virtqueue.c | 21 - hw/virtio/vhost-vdpa.c | 133 ++--- 3 files changed, 143 insertions(+), 13 deletions(-) diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h index 035207a469..39aef5ffdf 100644 --- a/hw/virtio/vhost-shadow-virtqueue.h +++ b/hw/virtio/vhost-shadow-virtqueue.h @@ -35,7 +35,7 @@ size_t vhost_svq_device_area_size(const VhostShadowVirtqueue *svq); void vhost_svq_stop(VhostShadowVirtqueue *svq); -VhostShadowVirtqueue *vhost_svq_new(void); +VhostShadowVirtqueue *vhost_svq_new(uint16_t qsize); void vhost_svq_free(VhostShadowVirtqueue *vq); diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c index f129ec8395..7c168075d7 100644 --- a/hw/virtio/vhost-shadow-virtqueue.c +++ b/hw/virtio/vhost-shadow-virtqueue.c @@ -277,9 +277,17 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq) /** * Creates vhost shadow virtqueue, and instruct vhost device to use the shadow * methods and file descriptors. + * + * @qsize Shadow VirtQueue size + * + * Returns the new virtqueue or NULL. + * + * In case of error, reason is reported through error_report. 
*/ -VhostShadowVirtqueue *vhost_svq_new(void) +VhostShadowVirtqueue *vhost_svq_new(uint16_t qsize) { +size_t desc_size = sizeof(vring_desc_t) * qsize; +size_t device_size, driver_size; g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1); int r; @@ -300,6 +308,15 @@ VhostShadowVirtqueue *vhost_svq_new(void) /* Placeholder descriptor, it should be deleted at set_kick_fd */ event_notifier_init_fd(&svq->svq_kick, INVALID_SVQ_KICK_FD); +svq->vring.num = qsize; I wonder if this is the best. E.g. some hardware can support up to 32K queue size. So this will probably end up with: 1) SVQ uses 32K queue size 2) hardware queue uses 256 In that case SVQ vring queue size will be 32K and guest's vring can negotiate any number with SVQ equal to or less than 32K, Sorry for being unclear; what I meant is actually 1) SVQ uses 32K queue size 2) guest vq uses 256 This looks like a burden that needs extra logic and may damage the performance. And this can lead to another interesting situation: 1) SVQ uses 256 2) guest vq uses 1024 where a lot more SVQ logic is needed. including 256. Is that what you mean? I mean, it looks to me the logic will be much more simplified if we just allocate the shadow virtqueue with the size the guest can see (guest vring). Then we don't need to think about whether the difference of the queue size can have any side effects. If with hardware queues you mean guest's vring, not sure why it is "probably 256". I'd say that in that case with the virtio-net kernel driver the ring size will be the same as the device exports, for example, isn't it? The implementation should support any combination of sizes, but the ring size exposed to the guest is never bigger than the hardware one. ? Or SVQ can stick to 256, but will this cause trouble if we want to add event index support? I think we should not have any problem with event idx.
If you mean that the guest could mark more buffers available than SVQ vring's size, that should not happen because there must be fewer entries in the guest than in SVQ. But if I understood you correctly, a similar situation could happen if a guest's contiguous buffer is scattered across many qemu VA chunks. Even if that would happen, the situation should be ok too: SVQ knows the guest's avail idx and, if SVQ is full, it will continue forwarding avail buffers when the device uses more buffers. Does that make sense to you? Yes. Thanks
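One sizing policy consistent with the discussion above — the ring size exposed to the guest is never bigger than the hardware one, so the shadow virtqueue can simply use the guest-visible size — can be sketched as follows. This is a hypothetical helper, not QEMU code:

```c
#include <assert.h>
#include <stdint.h>

/* Pick the shadow virtqueue size: never larger than what the
 * hardware supports, and never larger than what the guest sees.
 * With the invariant "guest size <= hardware size" this reduces
 * to just the guest-visible size. */
uint16_t mock_svq_size(uint16_t guest_qsize, uint16_t hw_qsize)
{
    return guest_qsize <= hw_qsize ? guest_qsize : hw_qsize;
}
```

Allocating SVQ with the guest-visible size avoids the "SVQ 32K / guest 256" and "SVQ 256 / guest 1024" mismatches debated above, since the guest can never mark more buffers available than SVQ can hold.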
Re: [PATCH 11/31] vhost: Add vhost_svq_valid_device_features to shadow vq
On 2022/2/1 6:57 PM, Eugenio Perez Martin wrote: On Mon, Jan 31, 2022 at 4:49 PM Eugenio Perez Martin wrote: On Sat, Jan 29, 2022 at 9:11 AM Jason Wang wrote: On 2022/1/22 4:27 AM, Eugenio Pérez wrote: This allows SVQ to negotiate features with the device. For the device, SVQ is a driver. While this function needs to bypass all non-transport features, it needs to disable the features that SVQ does not support when forwarding buffers. This includes packed vq layout, indirect descriptors or event idx. Signed-off-by: Eugenio Pérez --- hw/virtio/vhost-shadow-virtqueue.h | 2 ++ hw/virtio/vhost-shadow-virtqueue.c | 44 ++ hw/virtio/vhost-vdpa.c | 21 ++ 3 files changed, 67 insertions(+) diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h index c9ffa11fce..d963867a04 100644 --- a/hw/virtio/vhost-shadow-virtqueue.h +++ b/hw/virtio/vhost-shadow-virtqueue.h @@ -15,6 +15,8 @@ typedef struct VhostShadowVirtqueue VhostShadowVirtqueue; +bool vhost_svq_valid_device_features(uint64_t *features); + void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd); void vhost_svq_set_guest_call_notifier(VhostShadowVirtqueue *svq, int call_fd); const EventNotifier *vhost_svq_get_dev_kick_notifier( diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c index 9619c8082c..51442b3dbf 100644 --- a/hw/virtio/vhost-shadow-virtqueue.c +++ b/hw/virtio/vhost-shadow-virtqueue.c @@ -45,6 +45,50 @@ const EventNotifier *vhost_svq_get_dev_kick_notifier( return &svq->hdev_kick; } +/** + * Validate the transport device features that SVQ can use with the device + * + * @dev_features The device features. If success, the acknowledged features. + * + * Returns true if SVQ can go with a subset of these, false otherwise.
+ */ +bool vhost_svq_valid_device_features(uint64_t *dev_features) +{ +bool r = true; + +for (uint64_t b = VIRTIO_TRANSPORT_F_START; b <= VIRTIO_TRANSPORT_F_END; + ++b) { +switch (b) { +case VIRTIO_F_NOTIFY_ON_EMPTY: +case VIRTIO_F_ANY_LAYOUT: +continue; + +case VIRTIO_F_ACCESS_PLATFORM: +/* SVQ does not know how to translate addresses */ I may miss something but any reason that we need to disable ACCESS_PLATFORM? I'd expect the vring helper we used for shadow virtqueue can deal with vIOMMU perfectly. This function is validating SVQ <-> Device communications features, that may or may not be the same as guest <-> SVQ. These feature flags are valid for guest <-> SVQ communication, same as with indirect descriptors one. Having said that, there is a point in the series where VIRTIO_F_ACCESS_PLATFORM is actually mandatory, so I think we could use the latter addition of x-svq cmdline parameter and delay the feature validations where it makes more sense. +if (*dev_features & BIT_ULL(b)) { +clear_bit(b, dev_features); +r = false; +} +break; + +case VIRTIO_F_VERSION_1: I had the same question here. For VERSION_1 it's easier to assume that guest is little endian at some points, but we could try harder to support both endianness if needed. Re-thinking the SVQ feature isolation stuff for this first iteration based on your comments. Maybe it's easier to simply fail if the device does not *match* the expected feature set, and add all of the "feature isolation" later. While a lot of guest <-> SVQ communication details are already solved for free with qemu's VirtQueue (indirect, packed, ...), we may simplify this series in particular and add the support for it later. For example, at this moment would be valid for the device to export indirect descriptors feature flag, and SVQ simply forward that feature flag offering to the guest. 
So the guest <-> SVQ communication could have indirect descriptors (qemu's VirtQueue code handles it for free), but SVQ would not acknowledge it for the device. As a side note, to negotiate it would have been harmless actually, but it's not the case of packed vq. So maybe for the v2 we can simply force the device to just export the strictly needed features and nothing else with qemu cmdline, and then enable the feature negotiation isolation for each side of SVQ? Yes, that's exactly my point. Thanks Thanks! Thanks! Thanks +/* SVQ trust that guest vring is little endian */ +if (!(*dev_features & BIT_ULL(b))) { +set_bit(b, dev_features); +r = false; +} +continue; + +default: +if (*dev_features & BIT_ULL(b)) { +clear_bit(b, dev_features); +} +} +} + +return r; +} + /* Forward guest notifications */ static void vhost_handle_guest_kick(EventNotifier *n) { diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c index
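The feature-isolation idea above — acknowledge to the device only the subset SVQ can forward — can be sketched with illustrative bit values. These constants are not the virtio spec ones, and the helper is a hypothetical analogue of vhost_svq_valid_device_features:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative feature bits (not the virtio spec values). */
enum {
    MOCK_F_INDIRECT  = 1u << 0,
    MOCK_F_EVENT_IDX = 1u << 1,
    MOCK_F_VERSION_1 = 1u << 2,
    MOCK_F_PACKED    = 1u << 3,
};

/* Clear feature bits SVQ cannot forward to the device; return true
 * only if no adjustment was needed (mirroring the patch's contract
 * of "SVQ can go with a subset of these"). */
bool mock_svq_valid_features(uint64_t *features)
{
    /* SVQ forwards buffers itself, so it must not ack these. */
    uint64_t unsupported = MOCK_F_INDIRECT | MOCK_F_EVENT_IDX | MOCK_F_PACKED;
    bool ok = true;

    if (*features & unsupported) {
        *features &= ~unsupported;
        ok = false;
    }
    if (!(*features & MOCK_F_VERSION_1)) {
        ok = false;  /* SVQ assumes a little-endian vring */
    }
    return ok;
}
```

Note the asymmetry discussed in the thread: the guest-facing side may still be offered features like indirect descriptors (handled by qemu's VirtQueue code), while the device-facing side masks them out.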
Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
On Tue, Feb 8, 2022 at 11:29 AM Xuan Zhuo wrote: > > On Tue, 8 Feb 2022 11:24:13 +0800, Jason Wang wrote: > > On Tue, Feb 8, 2022 at 11:20 AM Xuan Zhuo > > wrote: > > > > > > On Tue, 8 Feb 2022 11:03:17 +0800, Jason Wang wrote: > > > > On Tue, Feb 8, 2022 at 10:17 AM Xuan Zhuo > > > > wrote: > > > > > > > > > > On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang > > > > > wrote: > > > > > > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo > > > > > > wrote: > > > > > > > > > > > > > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang > > > > > > > wrote: > > > > > > > > > > > > > > > > 在 2022/1/26 下午3:35, Xuan Zhuo 写道: > > > > > > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which > > > > > > > > > comes from > > > > > > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89 > > > > > > > > > > > > > > > > > > Since I want to add queue_reset after it, I submitted this > > > > > > > > > patch first. > > > > > > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo > > > > > > > > > --- > > > > > > > > > include/uapi/linux/virtio_pci.h | 1 + > > > > > > > > > 1 file changed, 1 insertion(+) > > > > > > > > > > > > > > > > > > diff --git a/include/uapi/linux/virtio_pci.h > > > > > > > > > b/include/uapi/linux/virtio_pci.h > > > > > > > > > index 3a86f36d7e3d..492c89f56c6a 100644 > > > > > > > > > --- a/include/uapi/linux/virtio_pci.h > > > > > > > > > +++ b/include/uapi/linux/virtio_pci.h > > > > > > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg { > > > > > > > > > __le32 queue_avail_hi; /* read-write */ > > > > > > > > > __le32 queue_used_lo; /* read-write */ > > > > > > > > > __le32 queue_used_hi; /* read-write */ > > > > > > > > > + __le16 queue_notify_data; /* read-write */ > > > > > > > > > }; > > > > > > > > > > > > > > > > > > > > > > > > So I had the same concern as previous version. > > > > > > > > > > > > > > > > This breaks uABI where program may try to use sizeof(struct > > > > > > > > virtio_pci_common_cfg). 
> > > > > > > > > > > > > > > > We probably need a container structure here. > > > > > > > > > > > > > > I see, I plan to add a struct like this, do you think it's > > > > > > > appropriate? > > > > > > > > > > > > > > struct virtio_pci_common_cfg_v1 { > > > > > > > struct virtio_pci_common_cfg cfg; > > > > > > > __le16 queue_notify_data; /* read-write */ > > > > > > > } > > > > > > > > > > > > Something like this but we probably need a better name. > > > > > > > > > > > > > > > how about this? > > > > > > > > > > /* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */ > > > > > struct virtio_pci_common_cfg_ext { > > > > > struct virtio_pci_common_cfg cfg; > > > > > > > > > > __le16 queue_notify_data; /* read-write */ > > > > > > > > > > __le16 reserved0; > > > > > __le16 reserved1; > > > > > __le16 reserved2; > > > > > __le16 reserved3; > > > > > __le16 reserved4; > > > > > __le16 reserved5; > > > > > __le16 reserved6; > > > > > __le16 reserved7; > > > > > __le16 reserved8; > > > > > __le16 reserved9; > > > > > __le16 reserved10; > > > > > __le16 reserved11; > > > > > __le16 reserved12; > > > > > __le16 reserved13; > > > > > __le16 reserved14; > > > > > }; > > > > > > > > I still think the container without padding is better. Otherwise > > > > userspace needs to use offset_of() trick instead of sizeof(). > > > > > > In this case, as virtio_pci_common_cfg_ext adds new members in the > > > future, we > > > will add more container structures. > > > > > > In that case, I think virtio_pci_common_cfg_v1 is a good name instead. > > > > Something like "virtio_pci_common_cfg_notify" might be a little bit better. 
> > Although there is only one notify_data in this patch, I plan to look like this > after my patch set: > > struct virtio_pci_common_cfg_v1 { > struct virtio_pci_common_cfg cfg; > > __le16 queue_notify_data; /* read-write */ > __le16 queue_reset; /* read-write */ > } > > If we use virtio_pci_common_cfg_notify, then we will get two structures after > this patch set: > > struct virtio_pci_common_cfg_notify { > struct virtio_pci_common_cfg cfg; > > __le16 queue_notify_data; /* read-write */ > } > > struct virtio_pci_common_cfg_reset { > struct virtio_pci_common_cfg_notify cfg; > > __le16 queue_reset; /* read-write */ > } Right, this is sub-optimal, and we need padding in cfg_notify probably. But I couldn't think of a better idea currently, maybe we can listen from others opinion But we use something like this for
Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
On Tue, 8 Feb 2022 11:24:13 +0800, Jason Wang wrote: > On Tue, Feb 8, 2022 at 11:20 AM Xuan Zhuo wrote: > > > > On Tue, 8 Feb 2022 11:03:17 +0800, Jason Wang wrote: > > > On Tue, Feb 8, 2022 at 10:17 AM Xuan Zhuo > > > wrote: > > > > > > > > On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang > > > > wrote: > > > > > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo > > > > > wrote: > > > > > > > > > > > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang > > > > > > wrote: > > > > > > > > > > > > > > 在 2022/1/26 下午3:35, Xuan Zhuo 写道: > > > > > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which > > > > > > > > comes from > > > > > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89 > > > > > > > > > > > > > > > > Since I want to add queue_reset after it, I submitted this > > > > > > > > patch first. > > > > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo > > > > > > > > --- > > > > > > > > include/uapi/linux/virtio_pci.h | 1 + > > > > > > > > 1 file changed, 1 insertion(+) > > > > > > > > > > > > > > > > diff --git a/include/uapi/linux/virtio_pci.h > > > > > > > > b/include/uapi/linux/virtio_pci.h > > > > > > > > index 3a86f36d7e3d..492c89f56c6a 100644 > > > > > > > > --- a/include/uapi/linux/virtio_pci.h > > > > > > > > +++ b/include/uapi/linux/virtio_pci.h > > > > > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg { > > > > > > > > __le32 queue_avail_hi; /* read-write */ > > > > > > > > __le32 queue_used_lo; /* read-write */ > > > > > > > > __le32 queue_used_hi; /* read-write */ > > > > > > > > + __le16 queue_notify_data; /* read-write */ > > > > > > > > }; > > > > > > > > > > > > > > > > > > > > > So I had the same concern as previous version. > > > > > > > > > > > > > > This breaks uABI where program may try to use sizeof(struct > > > > > > > virtio_pci_common_cfg). > > > > > > > > > > > > > > We probably need a container structure here. > > > > > > > > > > > > I see, I plan to add a struct like this, do you think it's > > > > > > appropriate? 
> > > > > > > > > > > > struct virtio_pci_common_cfg_v1 { > > > > > > struct virtio_pci_common_cfg cfg; > > > > > > __le16 queue_notify_data; /* read-write */ > > > > > > } > > > > > > > > > > Something like this but we probably need a better name. > > > > > > > > > > > > how about this? > > > > > > > > /* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */ > > > > struct virtio_pci_common_cfg_ext { > > > > struct virtio_pci_common_cfg cfg; > > > > > > > > __le16 queue_notify_data; /* read-write */ > > > > > > > > __le16 reserved0; > > > > __le16 reserved1; > > > > __le16 reserved2; > > > > __le16 reserved3; > > > > __le16 reserved4; > > > > __le16 reserved5; > > > > __le16 reserved6; > > > > __le16 reserved7; > > > > __le16 reserved8; > > > > __le16 reserved9; > > > > __le16 reserved10; > > > > __le16 reserved11; > > > > __le16 reserved12; > > > > __le16 reserved13; > > > > __le16 reserved14; > > > > }; > > > > > > I still think the container without padding is better. Otherwise > > > userspace needs to use offset_of() trick instead of sizeof(). > > > > In this case, as virtio_pci_common_cfg_ext adds new members in the future, > > we > > will add more container structures. > > > > In that case, I think virtio_pci_common_cfg_v1 is a good name instead. > > Something like "virtio_pci_common_cfg_notify" might be a little bit better. Although there is only one notify_data in this patch, I plan to look like this after my patch set: struct virtio_pci_common_cfg_v1 { struct virtio_pci_common_cfg cfg; __le16 queue_notify_data; /* read-write */ __le16 queue_reset; /* read-write */ } If we use virtio_pci_common_cfg_notify, then we will get two structures after this patch set: struct virtio_pci_common_cfg_notify { struct virtio_pci_common_cfg cfg; __le16 queue_notify_data; /* read-write */ } struct virtio_pci_common_cfg_reset { struct virtio_pci_common_cfg_notify cfg; __le16 queue_reset; /* read-write */ } Thanks. > > Thanks > > > > > Thanks. 
> > > > > > > > > > Thanks > > > > > > > > > > > Thanks > > > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > > > > THanks > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */ > > > > > > > > > > > > > > > > > > > > > > > > > > > > ___ Virtualization mailing list
Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
On Tue, Feb 8, 2022 at 11:20 AM Xuan Zhuo wrote: > > On Tue, 8 Feb 2022 11:03:17 +0800, Jason Wang wrote: > > On Tue, Feb 8, 2022 at 10:17 AM Xuan Zhuo > > wrote: > > > > > > On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang wrote: > > > > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo > > > > wrote: > > > > > > > > > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang > > > > > wrote: > > > > > > > > > > > > 在 2022/1/26 下午3:35, Xuan Zhuo 写道: > > > > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which > > > > > > > comes from > > > > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89 > > > > > > > > > > > > > > Since I want to add queue_reset after it, I submitted this patch > > > > > > > first. > > > > > > > > > > > > > > Signed-off-by: Xuan Zhuo > > > > > > > --- > > > > > > > include/uapi/linux/virtio_pci.h | 1 + > > > > > > > 1 file changed, 1 insertion(+) > > > > > > > > > > > > > > diff --git a/include/uapi/linux/virtio_pci.h > > > > > > > b/include/uapi/linux/virtio_pci.h > > > > > > > index 3a86f36d7e3d..492c89f56c6a 100644 > > > > > > > --- a/include/uapi/linux/virtio_pci.h > > > > > > > +++ b/include/uapi/linux/virtio_pci.h > > > > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg { > > > > > > > __le32 queue_avail_hi; /* read-write */ > > > > > > > __le32 queue_used_lo; /* read-write */ > > > > > > > __le32 queue_used_hi; /* read-write */ > > > > > > > + __le16 queue_notify_data; /* read-write */ > > > > > > > }; > > > > > > > > > > > > > > > > > > So I had the same concern as previous version. > > > > > > > > > > > > This breaks uABI where program may try to use sizeof(struct > > > > > > virtio_pci_common_cfg). > > > > > > > > > > > > We probably need a container structure here. > > > > > > > > > > I see, I plan to add a struct like this, do you think it's > > > > > appropriate? 
> > > > > > > > > > struct virtio_pci_common_cfg_v1 { > > > > > struct virtio_pci_common_cfg cfg; > > > > > __le16 queue_notify_data; /* read-write */ > > > > > } > > > > > > > > Something like this but we probably need a better name. > > > > > > > > > how about this? > > > > > > /* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */ > > > struct virtio_pci_common_cfg_ext { > > > struct virtio_pci_common_cfg cfg; > > > > > > __le16 queue_notify_data; /* read-write */ > > > > > > __le16 reserved0; > > > __le16 reserved1; > > > __le16 reserved2; > > > __le16 reserved3; > > > __le16 reserved4; > > > __le16 reserved5; > > > __le16 reserved6; > > > __le16 reserved7; > > > __le16 reserved8; > > > __le16 reserved9; > > > __le16 reserved10; > > > __le16 reserved11; > > > __le16 reserved12; > > > __le16 reserved13; > > > __le16 reserved14; > > > }; > > > > I still think the container without padding is better. Otherwise > > userspace needs to use offset_of() trick instead of sizeof(). > > In this case, as virtio_pci_common_cfg_ext adds new members in the future, we > will add more container structures. > > In that case, I think virtio_pci_common_cfg_v1 is a good name instead. Something like "virtio_pci_common_cfg_notify" might be a little bit better. Thanks > > Thanks. > > > > > > Thanks > > > > > > > > Thanks > > > > > > > > > > > Thanks > > > > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > > > > THanks > > > > > > > > > > > > > > > > > > > > > > > > > > /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */ > > > > > > > > > > > > > > > > > > > > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 09/31] vhost-vdpa: Take into account SVQ in vhost_vdpa_set_vring_call
在 2022/1/31 下午11:34, Eugenio Perez Martin 写道: On Sat, Jan 29, 2022 at 9:06 AM Jason Wang wrote: 在 2022/1/22 上午4:27, Eugenio Pérez 写道: Signed-off-by: Eugenio Pérez --- hw/virtio/vhost-vdpa.c | 20 ++-- 1 file changed, 18 insertions(+), 2 deletions(-) diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c index 18de14f0fb..029f98feee 100644 --- a/hw/virtio/vhost-vdpa.c +++ b/hw/virtio/vhost-vdpa.c @@ -687,13 +687,29 @@ static int vhost_vdpa_set_vring_kick(struct vhost_dev *dev, } } -static int vhost_vdpa_set_vring_call(struct vhost_dev *dev, - struct vhost_vring_file *file) +static int vhost_vdpa_set_vring_dev_call(struct vhost_dev *dev, + struct vhost_vring_file *file) { trace_vhost_vdpa_set_vring_call(dev, file->index, file->fd); return vhost_vdpa_call(dev, VHOST_SET_VRING_CALL, file); } +static int vhost_vdpa_set_vring_call(struct vhost_dev *dev, + struct vhost_vring_file *file) +{ +struct vhost_vdpa *v = dev->opaque; + +if (v->shadow_vqs_enabled) { +int vdpa_idx = vhost_vdpa_get_vq_index(dev, file->index); +VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, vdpa_idx); + +vhost_svq_set_guest_call_notifier(svq, file->fd); Two questions here (had similar questions for vring kick): 1) Any reason that we setup the eventfd for vhost-vdpa in vhost_vdpa_svq_setup() not here? I'm not sure what you mean. The guest->SVQ call and kick fds are set here and at vhost_vdpa_set_vring_kick. The event notifier handler of the guest -> SVQ kick_fd is set at vhost_vdpa_set_vring_kick / vhost_svq_set_svq_kick_fd. The guest -> SVQ call fd has no event notifier handler since we don't poll it. On the other hand, the connection SVQ <-> device uses the same fds from the beginning to the end, and they will not change with, for example, call fd masking. That's why it's setup from vhost_vdpa_svq_setup. Delaying to vhost_vdpa_set_vring_call would make us add way more logic there. More logic in general shadow vq code but less codes for vhost-vdpa specific code I think. 
E.g for we can move the kick set logic from vhost_vdpa_svq_set_fds() to here. Thanks 2) The call could be disabled by using -1 as the fd, I don't see any code to deal with that. Right, I didn't take that into account. vhost-kernel takes also -1 as kick_fd to unbind, so SVQ can be reworked to take that into account for sure. Thanks! Thanks +return 0; +} else { +return vhost_vdpa_set_vring_dev_call(dev, file); +} +} + /** * Set shadow virtqueue descriptors to the device * ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
On Tue, 8 Feb 2022 11:03:17 +0800, Jason Wang wrote: > On Tue, Feb 8, 2022 at 10:17 AM Xuan Zhuo wrote: > > > > On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang wrote: > > > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo > > > wrote: > > > > > > > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang > > > > wrote: > > > > > > > > > > 在 2022/1/26 下午3:35, Xuan Zhuo 写道: > > > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which comes > > > > > > from > > > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89 > > > > > > > > > > > > Since I want to add queue_reset after it, I submitted this patch > > > > > > first. > > > > > > > > > > > > Signed-off-by: Xuan Zhuo > > > > > > --- > > > > > > include/uapi/linux/virtio_pci.h | 1 + > > > > > > 1 file changed, 1 insertion(+) > > > > > > > > > > > > diff --git a/include/uapi/linux/virtio_pci.h > > > > > > b/include/uapi/linux/virtio_pci.h > > > > > > index 3a86f36d7e3d..492c89f56c6a 100644 > > > > > > --- a/include/uapi/linux/virtio_pci.h > > > > > > +++ b/include/uapi/linux/virtio_pci.h > > > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg { > > > > > > __le32 queue_avail_hi; /* read-write */ > > > > > > __le32 queue_used_lo; /* read-write */ > > > > > > __le32 queue_used_hi; /* read-write */ > > > > > > + __le16 queue_notify_data; /* read-write */ > > > > > > }; > > > > > > > > > > > > > > > So I had the same concern as previous version. > > > > > > > > > > This breaks uABI where program may try to use sizeof(struct > > > > > virtio_pci_common_cfg). > > > > > > > > > > We probably need a container structure here. > > > > > > > > I see, I plan to add a struct like this, do you think it's appropriate? > > > > > > > > struct virtio_pci_common_cfg_v1 { > > > > struct virtio_pci_common_cfg cfg; > > > > __le16 queue_notify_data; /* read-write */ > > > > } > > > > > > Something like this but we probably need a better name. > > > > > > how about this? 
> > > > /* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */ > > struct virtio_pci_common_cfg_ext { > > struct virtio_pci_common_cfg cfg; > > > > __le16 queue_notify_data; /* read-write */ > > > > __le16 reserved0; > > __le16 reserved1; > > __le16 reserved2; > > __le16 reserved3; > > __le16 reserved4; > > __le16 reserved5; > > __le16 reserved6; > > __le16 reserved7; > > __le16 reserved8; > > __le16 reserved9; > > __le16 reserved10; > > __le16 reserved11; > > __le16 reserved12; > > __le16 reserved13; > > __le16 reserved14; > > }; > > I still think the container without padding is better. Otherwise > userspace needs to use offset_of() trick instead of sizeof(). In this case, as virtio_pci_common_cfg_ext adds new members in the future, we will add more container structures. In that case, I think virtio_pci_common_cfg_v1 is a good name instead. Thanks. > > Thanks > > > > > Thanks > > > > > > > > Thanks > > > > > > > > > > > Thanks. > > > > > > > > > > > > > > THanks > > > > > > > > > > > > > > > > > > > > > > /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */ > > > > > > > > > > > > > > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 00/17] virtio pci support VIRTIO_F_RING_RESET
On Tue, 8 Feb 2022 10:59:48 +0800, Jason Wang wrote: > > 在 2022/2/7 下午2:02, Xuan Zhuo 写道: > > On Mon, 7 Feb 2022 11:39:36 +0800, Jason Wang wrote: > >> On Wed, Jan 26, 2022 at 3:35 PM Xuan Zhuo > >> wrote: > >>> > >>> The virtio spec already supports the virtio queue reset function. This > >>> patch set > >>> is to add this function to the kernel. The relevant virtio spec > >>> information is > >>> here: > >>> > >>> https://github.com/oasis-tcs/virtio-spec/issues/124 > >>> > >>> Also regarding MMIO support for queue reset, I plan to support it after > >>> this > >>> patch is passed. > >>> > >>> #14-#17 is the disable/enable function of rx/tx pair implemented by > >>> virtio-net > >>> using the new helper. > >> One thing that came to mind is the steering. E.g if we disable an RX, > >> should we stop steering packets to that queue? > > Yes, we should reselect a queue. > > > > Thanks. > > > Maybe a spec patch for that? Yes, I also realized this. Although virtio-net's disable/enable is implemented based on queue reset, virtio-net still has to define its own flag and define some more detailed implementations. I'll think about it and post a spec patch. Thanks. > > Thanks > > > > > >> Thanks > >> > >>> This function is not currently referenced by other > >>> functions. It is more to show the usage of the new helper, I not sure if > >>> they > >>> are going to be merged together. > >>> > >>> Please review. Thanks. > >>> > >>> v3: > >>>1. 
keep vq, irq unreleased > >>> > >>> Xuan Zhuo (17): > >>>virtio_pci: struct virtio_pci_common_cfg add queue_notify_data > >>>virtio: queue_reset: add VIRTIO_F_RING_RESET > >>>virtio: queue_reset: struct virtio_config_ops add callbacks for > >>> queue_reset > >>>virtio: queue_reset: add helper > >>>vritio_ring: queue_reset: extract the release function of the vq ring > >>>virtio_ring: queue_reset: split: add __vring_init_virtqueue() > >>>virtio_ring: queue_reset: split: support enable reset queue > >>>virtio_ring: queue_reset: packed: support enable reset queue > >>>virtio_ring: queue_reset: add vring_reset_virtqueue() > >>>virtio_pci: queue_reset: update struct virtio_pci_common_cfg and > >>> option functions > >>>virtio_pci: queue_reset: release vq by vp_dev->vqs > >>>virtio_pci: queue_reset: setup_vq use vring_setup_virtqueue() > >>>virtio_pci: queue_reset: support VIRTIO_F_RING_RESET > >>>virtio_net: virtnet_tx_timeout() fix style > >>>virtio_net: virtnet_tx_timeout() stop ref sq->vq > >>>virtio_net: split free_unused_bufs() > >>>virtio_net: support pair disable/enable > >>> > >>> drivers/net/virtio_net.c | 220 ++--- > >>> drivers/virtio/virtio_pci_common.c | 62 --- > >>> drivers/virtio/virtio_pci_common.h | 11 +- > >>> drivers/virtio/virtio_pci_legacy.c | 5 +- > >>> drivers/virtio/virtio_pci_modern.c | 120 +- > >>> drivers/virtio/virtio_pci_modern_dev.c | 28 > >>> drivers/virtio/virtio_ring.c | 144 +++- > >>> include/linux/virtio.h | 1 + > >>> include/linux/virtio_config.h | 75 - > >>> include/linux/virtio_pci_modern.h | 2 + > >>> include/linux/virtio_ring.h| 42 +++-- > >>> include/uapi/linux/virtio_config.h | 7 +- > >>> include/uapi/linux/virtio_pci.h| 2 + > >>> 13 files changed, 618 insertions(+), 101 deletions(-) > >>> > >>> -- > >>> 2.31.0 > >>> > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
On Tue, Feb 8, 2022 at 10:17 AM Xuan Zhuo wrote: > > On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang wrote: > > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo wrote: > > > > > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang wrote: > > > > > > > > 在 2022/1/26 下午3:35, Xuan Zhuo 写道: > > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which comes > > > > > from > > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89 > > > > > > > > > > Since I want to add queue_reset after it, I submitted this patch > > > > > first. > > > > > > > > > > Signed-off-by: Xuan Zhuo > > > > > --- > > > > > include/uapi/linux/virtio_pci.h | 1 + > > > > > 1 file changed, 1 insertion(+) > > > > > > > > > > diff --git a/include/uapi/linux/virtio_pci.h > > > > > b/include/uapi/linux/virtio_pci.h > > > > > index 3a86f36d7e3d..492c89f56c6a 100644 > > > > > --- a/include/uapi/linux/virtio_pci.h > > > > > +++ b/include/uapi/linux/virtio_pci.h > > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg { > > > > > __le32 queue_avail_hi; /* read-write */ > > > > > __le32 queue_used_lo; /* read-write */ > > > > > __le32 queue_used_hi; /* read-write */ > > > > > + __le16 queue_notify_data; /* read-write */ > > > > > }; > > > > > > > > > > > > So I had the same concern as previous version. > > > > > > > > This breaks uABI where program may try to use sizeof(struct > > > > virtio_pci_common_cfg). > > > > > > > > We probably need a container structure here. > > > > > > I see, I plan to add a struct like this, do you think it's appropriate? > > > > > > struct virtio_pci_common_cfg_v1 { > > > struct virtio_pci_common_cfg cfg; > > > __le16 queue_notify_data; /* read-write */ > > > } > > > > Something like this but we probably need a better name. > > > how about this? 
> > /* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */ > struct virtio_pci_common_cfg_ext { > struct virtio_pci_common_cfg cfg; > > __le16 queue_notify_data; /* read-write */ > > __le16 reserved0; > __le16 reserved1; > __le16 reserved2; > __le16 reserved3; > __le16 reserved4; > __le16 reserved5; > __le16 reserved6; > __le16 reserved7; > __le16 reserved8; > __le16 reserved9; > __le16 reserved10; > __le16 reserved11; > __le16 reserved12; > __le16 reserved13; > __le16 reserved14; > }; I still think the container without padding is better. Otherwise userspace needs to use offset_of() trick instead of sizeof(). Thanks > > Thanks > > > > > Thanks > > > > > > > > Thanks. > > > > > > > > > > > THanks > > > > > > > > > > > > > > > > > > /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */ > > > > > > > > > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 03/17] virtio: queue_reset: struct virtio_config_ops add callbacks for queue_reset
On Tue, 8 Feb 2022 10:58:45 +0800, Jason Wang wrote: > > 2022/2/7 pm 3:19, Xuan Zhuo : > > On Mon, 7 Feb 2022 14:45:02 +0800, Jason Wang wrote: > >> 2022/1/26 pm 3:35, Xuan Zhuo : > >>> Performing reset on a queue is divided into two steps: > >>> > >>> 1. reset_vq: reset one vq > >>> 2. enable_reset_vq: re-enable the reset queue > >>> > >>> In the first step, these tasks will be completed: > >>> 1. notify the hardware queue to reset > >>> 2. recycle the buffer from vq > >>> 3. release the ring of the vq > >>> > >>> The second step is similar to find vqs, > >> > >> Not sure, since find_vqs will usually try to allocate interrupts. > >> > >> > > Yes. > > > > > >>>passing parameters callback and > >>> name, etc. Based on the original vq, the ring is re-allocated and > >>> configured to the backend. > >> > >> I wonder whether we really have such requirement. > >> > >> For example, do we really have a use case that may change: > >> > >> vq callback, ctx, ring_num or even re-create the virtqueue? > > 1. virtqueue is not recreated > > 2. ring_num can be used to modify ring num by ethtool -G > > > It looks to me we don't support this right now. I am trying to implement this function based on virtio queue reset. It will be added to the next version. > > > > > > There is really no scene to modify vq callback, ctx, name. > > > > Do you mean we still use the old one instead of adding these parameters? > > > Yes, I think for driver we need to implement the function that is needed > for the first user (e.g AF_XDP). If there's no use case, we can leave > those extension for the future. OK. Thanks. > > Thanks > > > > > > Thanks. > > > >> Thanks > >> > >> > >>> So add two callbacks reset_vq, enable_reset_vq to struct > >>> virtio_config_ops. > >>> > >>> Add a structure for passing parameters. This will facilitate subsequent > >>> expansion of the parameters of enable reset vq. > >>> There is currently only one default extended parameter ring_num. 
> >>> > >>> Signed-off-by: Xuan Zhuo > >>> --- > >>>include/linux/virtio_config.h | 43 ++- > >>>1 file changed, 42 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h > >>> index 4d107ad31149..51dd8461d1b6 100644 > >>> --- a/include/linux/virtio_config.h > >>> +++ b/include/linux/virtio_config.h > >>> @@ -16,6 +16,44 @@ struct virtio_shm_region { > >>> u64 len; > >>>}; > >>> > >>> +typedef void vq_callback_t(struct virtqueue *); > >>> + > >>> +/* virtio_reset_vq: specify parameters for queue_reset > >>> + * > >>> + * vdev: the device > >>> + * queue_index: the queue index > >>> + * > >>> + * free_unused_cb: callback to free unused bufs > >>> + * data: used by free_unused_cb > >>> + * > >>> + * callback: callback for the virtqueue, NULL for vq that do not > >>> need a > >>> + * callback > >>> + * name: virtqueue names (mainly for debugging), NULL for vq > >>> unused by > >>> + * driver > >>> + * ctx: ctx > >>> + * > >>> + * ring_num: specify ring num for the vq to be re-enabled. 0 means > >>> use the > >>> + * default value. MUST be a power of 2. > >>> + */ > >>> +struct virtio_reset_vq; > >>> +typedef void vq_reset_callback_t(struct virtio_reset_vq *param, void > >>> *buf); > >>> +struct virtio_reset_vq { > >>> + struct virtio_device *vdev; > >>> + u16 queue_index; > >>> + > >>> + /* reset vq param */ > >>> + vq_reset_callback_t *free_unused_cb; > >>> + void *data; > >>> + > >>> + /* enable reset vq param */ > >>> + vq_callback_t *callback; > >>> + const char *name; > >>> + const bool *ctx; > >>> + > >>> + /* ext enable reset vq param */ > >>> + u16 ring_num; > >>> +}; > >>> + > >>>/** > >>> * virtio_config_ops - operations for configuring a virtio device > >>> * Note: Do not assume that a transport implements all of the > >>> operations > >>> @@ -74,8 +112,9 @@ struct virtio_shm_region { > >>> * @set_vq_affinity: set the affinity for a virtqueue (optional). 
> >>> * @get_vq_affinity: get the affinity for a virtqueue (optional). > >>> * @get_shm_region: get a shared memory region based on the index. > >>> + * @reset_vq: reset a queue individually > >>> + * @enable_reset_vq: enable a reset queue > >>> */ > >>> -typedef void vq_callback_t(struct virtqueue *); > >>>struct virtio_config_ops { > >>> void (*enable_cbs)(struct virtio_device *vdev); > >>> void (*get)(struct virtio_device *vdev, unsigned offset, > >>> @@ -100,6 +139,8 @@ struct virtio_config_ops { > >>> int index); > >>> bool (*get_shm_region)(struct virtio_device *vdev, > >>> struct virtio_shm_region *region, u8 id); > >>> + int (*reset_vq)(struct virtio_reset_vq *param); > >>> + struct virtqueue *(*enable_reset_vq)(struct virtio_reset_vq *param); > >>>}; >
Re: [PATCH v3 00/17] virtio pci support VIRTIO_F_RING_RESET
在 2022/2/7 下午2:02, Xuan Zhuo 写道: On Mon, 7 Feb 2022 11:39:36 +0800, Jason Wang wrote: On Wed, Jan 26, 2022 at 3:35 PM Xuan Zhuo wrote: The virtio spec already supports the virtio queue reset function. This patch set is to add this function to the kernel. The relevant virtio spec information is here: https://github.com/oasis-tcs/virtio-spec/issues/124 Also regarding MMIO support for queue reset, I plan to support it after this patch is passed. #14-#17 is the disable/enable function of rx/tx pair implemented by virtio-net using the new helper. One thing that came to mind is the steering. E.g if we disable an RX, should we stop steering packets to that queue? Yes, we should reselect a queue. Thanks. Maybe a spec patch for that? Thanks Thanks This function is not currently referenced by other functions. It is more to show the usage of the new helper, I not sure if they are going to be merged together. Please review. Thanks. v3: 1. keep vq, irq unreleased Xuan Zhuo (17): virtio_pci: struct virtio_pci_common_cfg add queue_notify_data virtio: queue_reset: add VIRTIO_F_RING_RESET virtio: queue_reset: struct virtio_config_ops add callbacks for queue_reset virtio: queue_reset: add helper vritio_ring: queue_reset: extract the release function of the vq ring virtio_ring: queue_reset: split: add __vring_init_virtqueue() virtio_ring: queue_reset: split: support enable reset queue virtio_ring: queue_reset: packed: support enable reset queue virtio_ring: queue_reset: add vring_reset_virtqueue() virtio_pci: queue_reset: update struct virtio_pci_common_cfg and option functions virtio_pci: queue_reset: release vq by vp_dev->vqs virtio_pci: queue_reset: setup_vq use vring_setup_virtqueue() virtio_pci: queue_reset: support VIRTIO_F_RING_RESET virtio_net: virtnet_tx_timeout() fix style virtio_net: virtnet_tx_timeout() stop ref sq->vq virtio_net: split free_unused_bufs() virtio_net: support pair disable/enable drivers/net/virtio_net.c | 220 ++--- drivers/virtio/virtio_pci_common.c | 62 
--- drivers/virtio/virtio_pci_common.h | 11 +- drivers/virtio/virtio_pci_legacy.c | 5 +- drivers/virtio/virtio_pci_modern.c | 120 +- drivers/virtio/virtio_pci_modern_dev.c | 28 drivers/virtio/virtio_ring.c | 144 +++- include/linux/virtio.h | 1 + include/linux/virtio_config.h | 75 - include/linux/virtio_pci_modern.h | 2 + include/linux/virtio_ring.h| 42 +++-- include/uapi/linux/virtio_config.h | 7 +- include/uapi/linux/virtio_pci.h| 2 + 13 files changed, 618 insertions(+), 101 deletions(-) -- 2.31.0 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 03/17] virtio: queue_reset: struct virtio_config_ops add callbacks for queue_reset
On 2022/2/7 3:19 PM, Xuan Zhuo wrote: On Mon, 7 Feb 2022 14:45:02 +0800, Jason Wang wrote: On 2022/1/26 3:35 PM, Xuan Zhuo wrote: Performing reset on a queue is divided into two steps: 1. reset_vq: reset one vq 2. enable_reset_vq: re-enable the reset queue In the first step, these tasks will be completed: 1. notify the hardware queue to reset 2. recycle the buffers from the vq 3. release the ring of the vq The second step is similar to find_vqs, Not sure, since find_vqs will usually try to allocate interrupts. Yes. passing parameters callback and name, etc. Based on the original vq, the ring is re-allocated and configured to the backend. I wonder whether we really have such a requirement. For example, do we really have a use case that may change: vq callback, ctx, ring_num, or even re-create the virtqueue? 1. the virtqueue is not recreated 2. ring_num can be used to modify the ring num via ethtool -G It looks to me we don't support this right now. There is really no scenario for modifying the vq callback, ctx, or name. Do you mean we still use the old ones instead of adding these parameters? Yes, I think for the driver we need to implement the function that is needed by the first user (e.g. AF_XDP). If there's no use case, we can leave those extensions for the future. Thanks Thanks. Thanks So add two callbacks, reset_vq and enable_reset_vq, to struct virtio_config_ops. Add a structure for passing parameters. This will facilitate subsequent expansion of the parameters of enable reset vq. There is currently only one default extended parameter, ring_num.
Signed-off-by: Xuan Zhuo --- include/linux/virtio_config.h | 43 ++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h index 4d107ad31149..51dd8461d1b6 100644 --- a/include/linux/virtio_config.h +++ b/include/linux/virtio_config.h @@ -16,6 +16,44 @@ struct virtio_shm_region { u64 len; }; +typedef void vq_callback_t(struct virtqueue *); + +/* virtio_reset_vq: specify parameters for queue_reset + * + * vdev: the device + * queue_index: the queue index + * + * free_unused_cb: callback to free unused bufs + * data: used by free_unused_cb + * + * callback: callback for the virtqueue, NULL for vq that do not need a + * callback + * name: virtqueue names (mainly for debugging), NULL for vq unused by + * driver + * ctx: ctx + * + * ring_num: specify ring num for the vq to be re-enabled. 0 means use the + * default value. MUST be a power of 2. + */ +struct virtio_reset_vq; +typedef void vq_reset_callback_t(struct virtio_reset_vq *param, void *buf); +struct virtio_reset_vq { + struct virtio_device *vdev; + u16 queue_index; + + /* reset vq param */ + vq_reset_callback_t *free_unused_cb; + void *data; + + /* enable reset vq param */ + vq_callback_t *callback; + const char *name; + const bool *ctx; + + /* ext enable reset vq param */ + u16 ring_num; +}; + /** * virtio_config_ops - operations for configuring a virtio device * Note: Do not assume that a transport implements all of the operations @@ -74,8 +112,9 @@ struct virtio_shm_region { * @set_vq_affinity: set the affinity for a virtqueue (optional). * @get_vq_affinity: get the affinity for a virtqueue (optional). * @get_shm_region: get a shared memory region based on the index. 
+ * @reset_vq: reset a queue individually + * @enable_reset_vq: enable a reset queue */ -typedef void vq_callback_t(struct virtqueue *); struct virtio_config_ops { void (*enable_cbs)(struct virtio_device *vdev); void (*get)(struct virtio_device *vdev, unsigned offset, @@ -100,6 +139,8 @@ struct virtio_config_ops { int index); bool (*get_shm_region)(struct virtio_device *vdev, struct virtio_shm_region *region, u8 id); + int (*reset_vq)(struct virtio_reset_vq *param); + struct virtqueue *(*enable_reset_vq)(struct virtio_reset_vq *param); }; /* If driver didn't advertise the feature, it will never appear. */ ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 13/17] virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
On Mon, Feb 7, 2022 at 4:19 PM Xuan Zhuo wrote: > > On Mon, 7 Feb 2022 14:57:13 +0800, Jason Wang wrote: > > > > 在 2022/1/26 下午3:35, Xuan Zhuo 写道: > > > This patch implements virtio pci support for QUEUE RESET. > > > > > > Performing reset on a queue is divided into two steps: > > > > > > 1. reset_vq: reset one vq > > > 2. enable_reset_vq: re-enable the reset queue > > > > > > In the first step, these tasks will be completed: > > > 1. notify the hardware queue to reset > > > 2. recycle the buffer from vq > > > 3. release the ring of the vq > > > > > > The process of enable reset vq: > > > vp_modern_enable_reset_vq() > > > vp_enable_reset_vq() > > > __vp_setup_vq() > > > setup_vq() > > > vring_setup_virtqueue() > > > > > > In this process, we added two parameters, vq and num, and finally passed > > > them to vring_setup_virtqueue(). And reuse the original info and vq. > > > > > > Signed-off-by: Xuan Zhuo > > > --- > > > drivers/virtio/virtio_pci_common.c | 36 +++ > > > drivers/virtio/virtio_pci_common.h | 5 ++ > > > drivers/virtio/virtio_pci_modern.c | 100 + > > > 3 files changed, 128 insertions(+), 13 deletions(-) > > > > > > diff --git a/drivers/virtio/virtio_pci_common.c > > > b/drivers/virtio/virtio_pci_common.c > > > index c02936d29a31..ad21638fbf66 100644 > > > --- a/drivers/virtio/virtio_pci_common.c > > > +++ b/drivers/virtio/virtio_pci_common.c > > > @@ -205,23 +205,28 @@ static int vp_request_msix_vectors(struct > > > virtio_device *vdev, int nvectors, > > > return err; > > > } > > > > > > -static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, > > > unsigned index, > > > -void (*callback)(struct virtqueue *vq), > > > -const char *name, > > > -bool ctx, > > > -u16 msix_vec, u16 num) > > > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index, > > > + void (*callback)(struct virtqueue *vq), > > > + const char *name, > > > + bool ctx, > > > + u16 msix_vec, u16 num) > > > { > > > struct virtio_pci_device *vp_dev = 
to_vp_device(vdev); > > > - struct virtio_pci_vq_info *info = kmalloc(sizeof *info, GFP_KERNEL); > > > + struct virtio_pci_vq_info *info; > > > struct virtqueue *vq; > > > unsigned long flags; > > > > > > - /* fill out our structure that represents an active queue */ > > > - if (!info) > > > - return ERR_PTR(-ENOMEM); > > > + info = vp_dev->vqs[index]; > > > + if (!info) { > > > + info = kzalloc(sizeof *info, GFP_KERNEL); > > > + > > > + /* fill out our structure that represents an active queue */ > > > + if (!info) > > > + return ERR_PTR(-ENOMEM); > > > + } > > > > > > vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, ctx, > > > - msix_vec, NULL, num); > > > + msix_vec, info->vq, num); > > > if (IS_ERR(vq)) > > > goto out_info; > > > > > > @@ -238,6 +243,9 @@ static struct virtqueue *vp_setup_vq(struct > > > virtio_device *vdev, unsigned index, > > > return vq; > > > > > > out_info: > > > + if (info->vq && info->vq->reset) > > > + return vq; > > > + > > > kfree(info); > > > return vq; > > > } > > > @@ -248,9 +256,11 @@ static void vp_del_vq(struct virtqueue *vq) > > > struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index]; > > > unsigned long flags; > > > > > > - spin_lock_irqsave(_dev->lock, flags); > > > - list_del(>node); > > > - spin_unlock_irqrestore(_dev->lock, flags); > > > + if (!vq->reset) { > > > + spin_lock_irqsave(_dev->lock, flags); > > > + list_del(>node); > > > + spin_unlock_irqrestore(_dev->lock, flags); > > > + } > > > > > > vp_dev->del_vq(info); > > > kfree(info); > > > diff --git a/drivers/virtio/virtio_pci_common.h > > > b/drivers/virtio/virtio_pci_common.h > > > index 65db92245e41..c1d15f7c0be4 100644 > > > --- a/drivers/virtio/virtio_pci_common.h > > > +++ b/drivers/virtio/virtio_pci_common.h > > > @@ -119,6 +119,11 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned > > > nvqs, > > > struct virtqueue *vqs[], vq_callback_t *callbacks[], > > > const char * const names[], const bool *ctx, > > > struct irq_affinity *desc); > > > 
+struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index, > > > + void (*callback)(struct virtqueue *vq), > > > + const char *name, > > > + bool ctx, > > > + u16 msix_vec, u16 num); > > > const char *vp_bus_name(struct virtio_device *vdev); > > > > > > /* Setup the affinity for a virtqueue: > > >
Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang wrote: > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo wrote: > > > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang wrote: > > > > > > 在 2022/1/26 下午3:35, Xuan Zhuo 写道: > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which comes from > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89 > > > > > > > > Since I want to add queue_reset after it, I submitted this patch first. > > > > > > > > Signed-off-by: Xuan Zhuo > > > > --- > > > > include/uapi/linux/virtio_pci.h | 1 + > > > > 1 file changed, 1 insertion(+) > > > > > > > > diff --git a/include/uapi/linux/virtio_pci.h > > > > b/include/uapi/linux/virtio_pci.h > > > > index 3a86f36d7e3d..492c89f56c6a 100644 > > > > --- a/include/uapi/linux/virtio_pci.h > > > > +++ b/include/uapi/linux/virtio_pci.h > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg { > > > > __le32 queue_avail_hi; /* read-write */ > > > > __le32 queue_used_lo; /* read-write */ > > > > __le32 queue_used_hi; /* read-write */ > > > > + __le16 queue_notify_data; /* read-write */ > > > > }; > > > > > > > > > So I had the same concern as previous version. > > > > > > This breaks uABI where program may try to use sizeof(struct > > > virtio_pci_common_cfg). > > > > > > We probably need a container structure here. > > > > I see, I plan to add a struct like this, do you think it's appropriate? > > > > struct virtio_pci_common_cfg_v1 { > > struct virtio_pci_common_cfg cfg; > > __le16 queue_notify_data; /* read-write */ > > } > > Something like this but we probably need a better name. how about this? 
/* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */
struct virtio_pci_common_cfg_ext {
	struct virtio_pci_common_cfg cfg;

	__le16 queue_notify_data;	/* read-write */

	__le16 reserved0;
	__le16 reserved1;
	__le16 reserved2;
	__le16 reserved3;
	__le16 reserved4;
	__le16 reserved5;
	__le16 reserved6;
	__le16 reserved7;
	__le16 reserved8;
	__le16 reserved9;
	__le16 reserved10;
	__le16 reserved11;
	__le16 reserved12;
	__le16 reserved13;
	__le16 reserved14;
};

Thanks

> > Thanks > > > > Thanks. > > > > > > THanks > > > > > > > > > > > > > > /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */ > > > > > > ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
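The sizeof()-based uABI concern discussed above can be checked with a small userspace model. This is an illustrative sketch, not the kernel headers: `le16`/`le32` stand in for the kernel's `__le16`/`__le32`, and the field list restates the common-cfg layout quoted earlier in the thread. The point is that the container approach keeps `sizeof(struct virtio_pci_common_cfg)` unchanged, with the new field starting exactly where the old struct ends.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Stand-ins for the kernel's little-endian types. */
typedef uint16_t le16;
typedef uint32_t le32;
typedef uint8_t  u8;

/* Existing layout (as quoted in the thread); naturally aligned, no padding. */
struct virtio_pci_common_cfg {
	le32 device_feature_select;	/* read-write */
	le32 device_feature;		/* read-only */
	le32 guest_feature_select;	/* read-write */
	le32 guest_feature;		/* read-write */
	le16 msix_config;		/* read-write */
	le16 num_queues;		/* read-only */
	u8 device_status;		/* read-write */
	u8 config_generation;		/* read-only */
	le16 queue_select;		/* read-write */
	le16 queue_size;		/* read-write */
	le16 queue_msix_vector;		/* read-write */
	le16 queue_enable;		/* read-write */
	le16 queue_notify_off;		/* read-only */
	le32 queue_desc_lo;		/* read-write */
	le32 queue_desc_hi;		/* read-write */
	le32 queue_avail_lo;		/* read-write */
	le32 queue_avail_hi;		/* read-write */
	le32 queue_used_lo;		/* read-write */
	le32 queue_used_hi;		/* read-write */
};

/* Container approach: the old type is embedded, so its sizeof() is
 * untouched, and queue_notify_data begins at the old struct's end. */
struct virtio_pci_common_cfg_ext {
	struct virtio_pci_common_cfg cfg;
	le16 queue_notify_data;		/* read-write */
};
```

On mainstream ABIs the embedded struct stays at 56 bytes and the new field lands at offset 56, which is what makes the extension backward compatible for userspace that hardcodes the old size.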
[PATCH v3 8/8] VMCI: dma dg: add support for DMA datagrams receive
Use the DMA based receive operation instead of the ioread8_rep based datagram receive when DMA datagrams are supported. In the receive operation, configure the header to point to the page aligned VMCI_MAX_DG_SIZE part of the receive buffer using s/g configuration for the header. This ensures that the existing dispatch routine can be used with little modification. Initiate the receive by writing the lower 32 bit of the buffer to the VMCI_DATA_IN_LOW_ADDR register, and wait for the busy flag to be changed by the device using a wait queue. The existing dispatch routine for received datagrams is reused for the DMA datagrams with a few modifications: - the receive buffer is always the maximum size for DMA datagrams (IO ports would try with a shorter buffer first to reduce overhead of the ioread8_rep operation). - for DMA datagrams, datagrams are provided contiguous in the buffer as opposed to IO port datagrams, where they can start on any page boundary Reviewed-by: Vishnu Dasa Signed-off-by: Jorgen Hansen --- drivers/misc/vmw_vmci/vmci_guest.c | 103 ++--- 1 file changed, 79 insertions(+), 24 deletions(-) diff --git a/drivers/misc/vmw_vmci/vmci_guest.c b/drivers/misc/vmw_vmci/vmci_guest.c index bf524217914e..aa61a687b3e2 100644 --- a/drivers/misc/vmw_vmci/vmci_guest.c +++ b/drivers/misc/vmw_vmci/vmci_guest.c @@ -58,6 +58,7 @@ struct vmci_guest_device { struct tasklet_struct datagram_tasklet; struct tasklet_struct bm_tasklet; + struct wait_queue_head inout_wq; void *data_buffer; dma_addr_t data_buffer_base; @@ -115,6 +116,36 @@ static void vmci_write_reg(struct vmci_guest_device *dev, u32 val, u32 reg) iowrite32(val, dev->iobase + reg); } +static void vmci_read_data(struct vmci_guest_device *vmci_dev, + void *dest, size_t size) +{ + if (vmci_dev->mmio_base == NULL) + ioread8_rep(vmci_dev->iobase + VMCI_DATA_IN_ADDR, + dest, size); + else { + /* +* For DMA datagrams, the data_buffer will contain the header on the +* first page, followed by the incoming datagram(s) on the 
following +* pages. The header uses an S/G element immediately following the +* header on the first page to point to the data area. +*/ + struct vmci_data_in_out_header *buffer_header = vmci_dev->data_buffer; + struct vmci_sg_elem *sg_array = (struct vmci_sg_elem *)(buffer_header + 1); + size_t buffer_offset = dest - vmci_dev->data_buffer; + + buffer_header->opcode = 1; + buffer_header->size = 1; + buffer_header->busy = 0; + sg_array[0].addr = vmci_dev->data_buffer_base + buffer_offset; + sg_array[0].size = size; + + vmci_write_reg(vmci_dev, lower_32_bits(vmci_dev->data_buffer_base), + VMCI_DATA_IN_LOW_ADDR); + + wait_event(vmci_dev->inout_wq, buffer_header->busy == 1); + } +} + static int vmci_write_data(struct vmci_guest_device *dev, struct vmci_datagram *dg) { @@ -261,15 +292,17 @@ static int vmci_check_host_caps(struct pci_dev *pdev) } /* - * Reads datagrams from the data in port and dispatches them. We - * always start reading datagrams into only the first page of the - * datagram buffer. If the datagrams don't fit into one page, we - * use the maximum datagram buffer size for the remainder of the - * invocation. This is a simple heuristic for not penalizing - * small datagrams. + * Reads datagrams from the device and dispatches them. For IO port + * based access to the device, we always start reading datagrams into + * only the first page of the datagram buffer. If the datagrams don't + * fit into one page, we use the maximum datagram buffer size for the + * remainder of the invocation. This is a simple heuristic for not + * penalizing small datagrams. For DMA-based datagrams, we always + * use the maximum datagram buffer size, since there is no performance + * penalty for doing so. * * This function assumes that it has exclusive access to the data - * in port for the duration of the call. + * in register(s) for the duration of the call. 
*/ static void vmci_dispatch_dgs(unsigned long data) { @@ -277,23 +310,41 @@ static void vmci_dispatch_dgs(unsigned long data) u8 *dg_in_buffer = vmci_dev->data_buffer; struct vmci_datagram *dg; size_t dg_in_buffer_size = VMCI_MAX_DG_SIZE; - size_t current_dg_in_buffer_size = PAGE_SIZE; + size_t current_dg_in_buffer_size; size_t remaining_bytes; + bool is_io_port = vmci_dev->mmio_base == NULL; BUILD_BUG_ON(VMCI_MAX_DG_SIZE < PAGE_SIZE); - ioread8_rep(vmci_dev->iobase + VMCI_DATA_IN_ADDR, - vmci_dev->data_buffer,
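The receive-buffer layout the patch describes — a header plus one S/G element on the first page, with the element pointing at the page-aligned data area — can be sketched as a compiler-checkable userspace model. The struct definitions below are simplified assumptions for illustration (field names mirror the patch, but these are not the kernel's definitions), and the real doorbell write to VMCI_DATA_IN_LOW_ADDR is omitted.

```c
#include <stdint.h>
#include <stddef.h>

/* Simplified model of the DMA datagram buffer header (assumed layout). */
struct vmci_data_in_out_header {
	uint32_t busy;		/* device sets to 1 when data has arrived */
	uint32_t opcode;
	uint32_t size;
	uint32_t rsvd;
	uint64_t result;
};

struct vmci_sg_elem {
	uint64_t addr;
	uint64_t size;
};

/* Program the header in buf for a receive into the area at dest_off,
 * given the buffer's DMA base address; returns the data-area offset
 * the device would DMA incoming datagrams into. */
static uint64_t setup_receive(void *buf, uint64_t dma_base,
			      size_t dest_off, size_t size)
{
	struct vmci_data_in_out_header *hdr = buf;
	/* One S/G element immediately follows the header on the first page. */
	struct vmci_sg_elem *sg = (struct vmci_sg_elem *)(hdr + 1);

	hdr->opcode = 1;	/* incoming (receive) */
	hdr->size = 1;		/* header is followed by one S/G element */
	hdr->busy = 0;		/* device flips this to 1 when done */
	sg[0].addr = dma_base + dest_off;
	sg[0].size = size;
	return sg[0].addr - dma_base;
}
```

This mirrors why the existing dispatch routine needs little modification: the dispatch code only ever sees the page-aligned data area, while the header/S-G bookkeeping stays confined to the first page.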
[PATCH v3 7/8] VMCI: dma dg: add support for DMA datagrams sends
Use DMA based send operation from the transmit buffer instead of the iowrite8_rep based datagram send when DMA datagrams are supported. The outgoing datagram is sent as inline data in the VMCI transmit buffer. Once the header has been configured, the send is initiated by writing the lower 32 bit of the buffer base address to the VMCI_DATA_OUT_LOW_ADDR register. Only then will the device process the header and the datagram itself. Following that, the driver busy waits (it isn't possible to sleep on the send path) for the header busy flag to change - indicating that the send is complete. Reviewed-by: Vishnu Dasa Signed-off-by: Jorgen Hansen --- drivers/misc/vmw_vmci/vmci_guest.c | 45 -- include/linux/vmw_vmci_defs.h | 34 ++ 2 files changed, 77 insertions(+), 2 deletions(-) diff --git a/drivers/misc/vmw_vmci/vmci_guest.c b/drivers/misc/vmw_vmci/vmci_guest.c index 36eade15ba87..bf524217914e 100644 --- a/drivers/misc/vmw_vmci/vmci_guest.c +++ b/drivers/misc/vmw_vmci/vmci_guest.c @@ -13,6 +13,7 @@ #include #include #include +#include #include #include #include @@ -114,6 +115,47 @@ static void vmci_write_reg(struct vmci_guest_device *dev, u32 val, u32 reg) iowrite32(val, dev->iobase + reg); } +static int vmci_write_data(struct vmci_guest_device *dev, + struct vmci_datagram *dg) +{ + int result; + + if (dev->mmio_base != NULL) { + struct vmci_data_in_out_header *buffer_header = dev->tx_buffer; + u8 *dg_out_buffer = (u8 *)(buffer_header + 1); + + if (VMCI_DG_SIZE(dg) > VMCI_MAX_DG_SIZE) + return VMCI_ERROR_INVALID_ARGS; + + /* +* Initialize send buffer with outgoing datagram +* and set up header for inline data. Device will +* not access buffer asynchronously - only after +* the write to VMCI_DATA_OUT_LOW_ADDR. 
+*/ + memcpy(dg_out_buffer, dg, VMCI_DG_SIZE(dg)); + buffer_header->opcode = 0; + buffer_header->size = VMCI_DG_SIZE(dg); + buffer_header->busy = 1; + + vmci_write_reg(dev, lower_32_bits(dev->tx_buffer_base), + VMCI_DATA_OUT_LOW_ADDR); + + /* Caller holds a spinlock, so cannot block. */ + spin_until_cond(buffer_header->busy == 0); + + result = vmci_read_reg(vmci_dev_g, VMCI_RESULT_LOW_ADDR); + if (result == VMCI_SUCCESS) + result = (int)buffer_header->result; + } else { + iowrite8_rep(dev->iobase + VMCI_DATA_OUT_ADDR, +dg, VMCI_DG_SIZE(dg)); + result = vmci_read_reg(vmci_dev_g, VMCI_RESULT_LOW_ADDR); + } + + return result; +} + /* * VM to hypervisor call mechanism. We use the standard VMware naming * convention since shared code is calling this function as well. @@ -139,8 +181,7 @@ int vmci_send_datagram(struct vmci_datagram *dg) spin_lock_irqsave(_dev_spinlock, flags); if (vmci_dev_g) { - iowrite8_rep(vmci_dev_g->iobase + VMCI_DATA_OUT_ADDR, -dg, VMCI_DG_SIZE(dg)); + vmci_write_data(vmci_dev_g, dg); result = vmci_read_reg(vmci_dev_g, VMCI_RESULT_LOW_ADDR); } else { result = VMCI_ERROR_UNAVAILABLE; diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h index 8bc37d8244a8..6fb663b36f72 100644 --- a/include/linux/vmw_vmci_defs.h +++ b/include/linux/vmw_vmci_defs.h @@ -110,6 +110,40 @@ enum { #define VMCI_MMIO_ACCESS_OFFSET((size_t)(128 * 1024)) #define VMCI_MMIO_ACCESS_SIZE ((size_t)(64 * 1024)) +/* + * For VMCI devices supporting the VMCI_CAPS_DMA_DATAGRAM capability, the + * sending and receiving of datagrams can be performed using DMA to/from + * a driver allocated buffer. + * Sending and receiving will be handled as follows: + * - when sending datagrams, the driver initializes the buffer where the + * data part will refer to the outgoing VMCI datagram, sets the busy flag + * to 1 and writes the address of the buffer to VMCI_DATA_OUT_HIGH_ADDR + * and VMCI_DATA_OUT_LOW_ADDR. 
Writing to VMCI_DATA_OUT_LOW_ADDR triggers + * the device processing of the buffer. When the device has processed the + * buffer, it will write the result value to the buffer and then clear the + * busy flag. + * - when receiving datagrams, the driver initializes the buffer where the + * data part will describe the receive buffer, clears the busy flag and + * writes the address of the buffer to VMCI_DATA_IN_HIGH_ADDR and + * VMCI_DATA_IN_LOW_ADDR. Writing to VMCI_DATA_IN_LOW_ADDR triggers the + * device processing of the buffer. The device will copy as many available + * datagrams into the buffer as possible, and then sets the
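The send-side ownership handshake described above (driver fills the buffer inline, sets busy, rings the doorbell, then busy-waits for the device to clear the flag) can be modeled in userspace. This is a single-threaded sketch under assumed struct layouts: the doorbell write and the device are replaced by a stand-in function, and the spin loop trivially terminates because the "device" has already completed.

```c
#include <stdint.h>
#include <string.h>

#define VMCI_SUCCESS 0

/* Simplified model of the DMA datagram buffer header (assumed layout). */
struct vmci_data_in_out_header {
	uint32_t busy;		/* 1 = owned by device, 0 = owned by driver */
	uint32_t opcode;
	uint32_t size;
	uint32_t rsvd;
	uint64_t result;
};

/* Driver side: copy the datagram inline after the header and hand the
 * buffer to the device. In the real driver the handoff is the write of
 * the buffer base to VMCI_DATA_OUT_LOW_ADDR; here it is just a return. */
static void send_prepare(struct vmci_data_in_out_header *hdr,
			 const void *dg, uint32_t dg_size)
{
	uint8_t *out = (uint8_t *)(hdr + 1);

	memcpy(out, dg, dg_size);
	hdr->opcode = 0;	/* outgoing send, inline data */
	hdr->size = dg_size;
	hdr->busy = 1;		/* device clears this when the send is done */
}

/* Stand-in for the device: post the result first, clear busy last, so
 * the driver's spin_until_cond(busy == 0) observes a completed send. */
static void fake_device_complete(struct vmci_data_in_out_header *hdr)
{
	hdr->result = VMCI_SUCCESS;
	hdr->busy = 0;
}
```

The ordering in `fake_device_complete` is the interesting part of the protocol: the busy flag is the last thing to change hands, so once the driver's busy-wait exits, `result` is guaranteed to be valid.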
[PATCH v3 5/8] VMCI: dma dg: register dummy IRQ handlers for DMA datagrams
Register dummy interrupt handlers for DMA datagrams in preparation for DMA datagram receive operations. Reviewed-by: Vishnu Dasa Signed-off-by: Jorgen Hansen --- drivers/misc/vmw_vmci/vmci_guest.c | 42 +++--- include/linux/vmw_vmci_defs.h | 14 -- 2 files changed, 51 insertions(+), 5 deletions(-) diff --git a/drivers/misc/vmw_vmci/vmci_guest.c b/drivers/misc/vmw_vmci/vmci_guest.c index ced187e7ac08..acef19c562b3 100644 --- a/drivers/misc/vmw_vmci/vmci_guest.c +++ b/drivers/misc/vmw_vmci/vmci_guest.c @@ -414,6 +414,9 @@ static irqreturn_t vmci_interrupt(int irq, void *_dev) icr &= ~VMCI_ICR_NOTIFICATION; } + if (icr & VMCI_ICR_DMA_DATAGRAM) + icr &= ~VMCI_ICR_DMA_DATAGRAM; + if (icr != 0) dev_warn(dev->dev, "Ignoring unknown interrupt cause (%d)\n", @@ -438,6 +441,16 @@ static irqreturn_t vmci_interrupt_bm(int irq, void *_dev) return IRQ_HANDLED; } +/* + * Interrupt handler for MSI-X interrupt vector VMCI_INTR_DMA_DATAGRAM, + * which is for the completion of a DMA datagram send or receive operation. + * Will only get called if we are using MSI-X with exclusive vectors. + */ +static irqreturn_t vmci_interrupt_dma_datagram(int irq, void *_dev) +{ + return IRQ_HANDLED; +} + /* * Most of the initialization at module load time is done here. */ @@ -447,6 +460,7 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, struct vmci_guest_device *vmci_dev; void __iomem *iobase = NULL; void __iomem *mmio_base = NULL; + unsigned int num_irq_vectors; unsigned int capabilities; unsigned int caps_in_use; unsigned long cmd; @@ -627,8 +641,12 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, * Enable interrupts. Try MSI-X first, then MSI, and then fallback on * legacy interrupts. 
*/ - error = pci_alloc_irq_vectors(pdev, VMCI_MAX_INTRS, VMCI_MAX_INTRS, - PCI_IRQ_MSIX); + if (vmci_dev->mmio_base != NULL) + num_irq_vectors = VMCI_MAX_INTRS; + else + num_irq_vectors = VMCI_MAX_INTRS_NOTIFICATION; + error = pci_alloc_irq_vectors(pdev, num_irq_vectors, num_irq_vectors, + PCI_IRQ_MSIX); if (error < 0) { error = pci_alloc_irq_vectors(pdev, 1, 1, PCI_IRQ_MSIX | PCI_IRQ_MSI | PCI_IRQ_LEGACY); @@ -666,6 +684,17 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, pci_irq_vector(pdev, 1), error); goto err_free_irq; } + if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM) { + error = request_irq(pci_irq_vector(pdev, 2), + vmci_interrupt_dma_datagram, + 0, KBUILD_MODNAME, vmci_dev); + if (error) { + dev_err(>dev, + "Failed to allocate irq %u: %d\n", + pci_irq_vector(pdev, 2), error); + goto err_free_bm_irq; + } + } } dev_dbg(>dev, "Registered device\n"); @@ -676,6 +705,8 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, cmd = VMCI_IMR_DATAGRAM; if (caps_in_use & VMCI_CAPS_NOTIFICATIONS) cmd |= VMCI_IMR_NOTIFICATION; + if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM) + cmd |= VMCI_IMR_DMA_DATAGRAM; vmci_write_reg(vmci_dev, cmd, VMCI_IMR_ADDR); /* Enable interrupts. */ @@ -686,6 +717,8 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, vmci_call_vsock_callback(false); return 0; +err_free_bm_irq: + free_irq(pci_irq_vector(pdev, 1), vmci_dev); err_free_irq: free_irq(pci_irq_vector(pdev, 0), vmci_dev); tasklet_kill(_dev->datagram_tasklet); @@ -751,8 +784,11 @@ static void vmci_guest_remove_device(struct pci_dev *pdev) * MSI-X, we might have multiple vectors, each with their own * IRQ, which we must free too. 
*/ - if (vmci_dev->exclusive_vectors) + if (vmci_dev->exclusive_vectors) { free_irq(pci_irq_vector(pdev, 1), vmci_dev); + if (vmci_dev->mmio_base != NULL) + free_irq(pci_irq_vector(pdev, 2), vmci_dev); + } free_irq(pci_irq_vector(pdev, 0), vmci_dev); pci_free_irq_vectors(pdev); diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h index 4167779469fd..2b70c024dacb 100644 --- a/include/linux/vmw_vmci_defs.h +++ b/include/linux/vmw_vmci_defs.h @@ -45,13 +45,22 @@ /* Interrupt Cause register bits. */ #define
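The vector-count decision in the hunk above reduces to a one-liner: devices with MMIO (and therefore DMA datagram) support request a third MSI-X vector for DMA completions, while older devices keep requesting two (datagram + notification). A sketch, restating the patch's VMCI_MAX_INTRS constants as assumptions:

```c
#include <stdbool.h>

/* Constants mirror the patch's values (assumed here for illustration). */
#define VMCI_MAX_INTRS_NOTIFICATION 2	/* datagram + notification */
#define VMCI_MAX_INTRS 3		/* + DMA datagram completion */

/* Number of MSI-X vectors to request from pci_alloc_irq_vectors(). */
static unsigned int num_irq_vectors(bool has_mmio)
{
	return has_mmio ? VMCI_MAX_INTRS : VMCI_MAX_INTRS_NOTIFICATION;
}
```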
[PATCH v3 4/8] VMCI: dma dg: set OS page size
Tell the device the page size used by the OS. Reviewed-by: Vishnu Dasa Signed-off-by: Jorgen Hansen --- drivers/misc/vmw_vmci/vmci_guest.c | 4 include/linux/vmw_vmci_defs.h | 1 + 2 files changed, 5 insertions(+) diff --git a/drivers/misc/vmw_vmci/vmci_guest.c b/drivers/misc/vmw_vmci/vmci_guest.c index b93afe7f7119..ced187e7ac08 100644 --- a/drivers/misc/vmw_vmci/vmci_guest.c +++ b/drivers/misc/vmw_vmci/vmci_guest.c @@ -578,6 +578,10 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, /* Let the host know which capabilities we intend to use. */ vmci_write_reg(vmci_dev, caps_in_use, VMCI_CAPS_ADDR); + /* Let the device know the size for pages passed down. */ + if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM) + vmci_write_reg(vmci_dev, PAGE_SHIFT, VMCI_GUEST_PAGE_SHIFT); + /* Set up global device so that we can start sending datagrams */ spin_lock_irq(&vmci_dev_spinlock); vmci_dev_g = vmci_dev; diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h index 1ce2cffdc3ae..4167779469fd 100644 --- a/include/linux/vmw_vmci_defs.h +++ b/include/linux/vmw_vmci_defs.h @@ -21,6 +21,7 @@ #define VMCI_CAPS_ADDR 0x18 #define VMCI_RESULT_LOW_ADDR0x1c #define VMCI_RESULT_HIGH_ADDR 0x20 +#define VMCI_GUEST_PAGE_SHIFT 0x34 /* Max number of devices. */ #define VMCI_MAX_DEVICES 1 -- 2.25.1
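Worth noting: the value written to VMCI_GUEST_PAGE_SHIFT is PAGE_SHIFT, i.e. the base-2 logarithm of the page size, not the size itself. The device recovers the size by shifting, as in this trivial model (12 for the common 4 KiB page is an illustrative value):

```c
#include <stdint.h>

/* The device-side view: reconstruct the guest page size from the
 * shift value the driver programmed into VMCI_GUEST_PAGE_SHIFT. */
static uint64_t page_size_from_shift(uint32_t page_shift)
{
	return 1ULL << page_shift;
}
```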
[PATCH v3 3/8] VMCI: dma dg: detect DMA datagram capability
Detect the VMCI DMA datagram capability, and if present, ack it to the device. Reviewed-by: Vishnu Dasa Signed-off-by: Jorgen Hansen --- drivers/misc/vmw_vmci/vmci_guest.c | 11 +++ include/linux/vmw_vmci_defs.h | 1 + 2 files changed, 12 insertions(+) diff --git a/drivers/misc/vmw_vmci/vmci_guest.c b/drivers/misc/vmw_vmci/vmci_guest.c index d30d66258e52..b93afe7f7119 100644 --- a/drivers/misc/vmw_vmci/vmci_guest.c +++ b/drivers/misc/vmw_vmci/vmci_guest.c @@ -562,6 +562,17 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, } } + if (mmio_base != NULL) { + if (capabilities & VMCI_CAPS_DMA_DATAGRAM) { + caps_in_use |= VMCI_CAPS_DMA_DATAGRAM; + } else { + dev_err(&pdev->dev, + "Missing capability: VMCI_CAPS_DMA_DATAGRAM\n"); + error = -ENXIO; + goto err_free_data_buffer; + } + } + dev_info(&pdev->dev, "Using capabilities 0x%x\n", caps_in_use); /* Let the host know which capabilities we intend to use. */ diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h index 8fc00e2685cf..1ce2cffdc3ae 100644 --- a/include/linux/vmw_vmci_defs.h +++ b/include/linux/vmw_vmci_defs.h @@ -39,6 +39,7 @@ #define VMCI_CAPS_DATAGRAM BIT(2) #define VMCI_CAPS_NOTIFICATIONS BIT(3) #define VMCI_CAPS_PPN64 BIT(4) +#define VMCI_CAPS_DMA_DATAGRAM BIT(5) /* Interrupt Cause register bits. */ #define VMCI_ICR_DATAGRAM BIT(0) -- 2.25.1
[PATCH v3 6/8] VMCI: dma dg: allocate send and receive buffers for DMA datagrams
If DMA datagrams are used, allocate send and receive buffers in coherent DMA memory. This is done in preparation for the send and receive datagram operations, where the buffers are used for the exchange of data between driver and device. Reviewed-by: Vishnu Dasa Signed-off-by: Jorgen Hansen --- drivers/misc/vmw_vmci/vmci_guest.c | 71 ++ include/linux/vmw_vmci_defs.h | 4 ++ 2 files changed, 66 insertions(+), 9 deletions(-) diff --git a/drivers/misc/vmw_vmci/vmci_guest.c b/drivers/misc/vmw_vmci/vmci_guest.c index acef19c562b3..36eade15ba87 100644 --- a/drivers/misc/vmw_vmci/vmci_guest.c +++ b/drivers/misc/vmw_vmci/vmci_guest.c @@ -31,6 +31,12 @@ #define VMCI_UTIL_NUM_RESOURCES 1 +/* + * Datagram buffers for DMA send/receive must accommodate at least + * a maximum sized datagram and the header. + */ +#define VMCI_DMA_DG_BUFFER_SIZE (VMCI_MAX_DG_SIZE + PAGE_SIZE) + static bool vmci_disable_msi; module_param_named(disable_msi, vmci_disable_msi, bool, 0); MODULE_PARM_DESC(disable_msi, "Disable MSI use in driver - (default=0)"); @@ -53,6 +59,9 @@ struct vmci_guest_device { struct tasklet_struct bm_tasklet; void *data_buffer; + dma_addr_t data_buffer_base; + void *tx_buffer; + dma_addr_t tx_buffer_base; void *notification_bitmap; dma_addr_t notification_base; }; @@ -451,6 +460,24 @@ static irqreturn_t vmci_interrupt_dma_datagram(int irq, void *_dev) return IRQ_HANDLED; } +static void vmci_free_dg_buffers(struct vmci_guest_device *vmci_dev) +{ + if (vmci_dev->mmio_base != NULL) { + if (vmci_dev->tx_buffer != NULL) + dma_free_coherent(vmci_dev->dev, + VMCI_DMA_DG_BUFFER_SIZE, + vmci_dev->tx_buffer, + vmci_dev->tx_buffer_base); + if (vmci_dev->data_buffer != NULL) + dma_free_coherent(vmci_dev->dev, + VMCI_DMA_DG_BUFFER_SIZE, + vmci_dev->data_buffer, + vmci_dev->data_buffer_base); + } else { + vfree(vmci_dev->data_buffer); + } +} + /* * Most of the initialization at module load time is done here. 
*/ @@ -517,11 +544,27 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, tasklet_init(_dev->bm_tasklet, vmci_process_bitmap, (unsigned long)vmci_dev); - vmci_dev->data_buffer = vmalloc(VMCI_MAX_DG_SIZE); + if (mmio_base != NULL) { + vmci_dev->tx_buffer = dma_alloc_coherent(>dev, VMCI_DMA_DG_BUFFER_SIZE, + _dev->tx_buffer_base, +GFP_KERNEL); + if (!vmci_dev->tx_buffer) { + dev_err(>dev, + "Can't allocate memory for datagram tx buffer\n"); + return -ENOMEM; + } + + vmci_dev->data_buffer = dma_alloc_coherent(>dev, VMCI_DMA_DG_BUFFER_SIZE, + _dev->data_buffer_base, + GFP_KERNEL); + } else { + vmci_dev->data_buffer = vmalloc(VMCI_MAX_DG_SIZE); + } if (!vmci_dev->data_buffer) { dev_err(>dev, "Can't allocate memory for datagram buffer\n"); - return -ENOMEM; + error = -ENOMEM; + goto err_free_data_buffers; } pci_set_master(pdev); /* To enable queue_pair functionality. */ @@ -539,7 +582,7 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, if (!(capabilities & VMCI_CAPS_DATAGRAM)) { dev_err(>dev, "Device does not support datagrams\n"); error = -ENXIO; - goto err_free_data_buffer; + goto err_free_data_buffers; } caps_in_use = VMCI_CAPS_DATAGRAM; @@ -583,7 +626,7 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, dev_err(>dev, "Missing capability: VMCI_CAPS_DMA_DATAGRAM\n"); error = -ENXIO; - goto err_free_data_buffer; + goto err_free_data_buffers; } } @@ -592,10 +635,17 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, /* Let the host know which capabilities we intend to use. */ vmci_write_reg(vmci_dev, caps_in_use, VMCI_CAPS_ADDR); - /* Let the device know the size for pages passed down. */ - if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM) + if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM) { + /* Let the device know the size for pages passed
[PATCH v3 2/8] VMCI: dma dg: add MMIO access to registers
Detect the support for MMIO access through examination of the length of the region requested in BAR1. If it is 256KB, the VMCI device supports MMIO access to registers. If MMIO access is supported, map the area of the region used for MMIO access (64KB size at offset 128KB). Add wrapper functions for accessing 32 bit register accesses through either MMIO or IO ports based on device configuration. Sending and receiving datagrams through iowrite8_rep/ioread8_rep is left unchanged for now, and will be addressed in a later change. Reviewed-by: Vishnu Dasa Signed-off-by: Jorgen Hansen --- drivers/misc/vmw_vmci/vmci_guest.c | 67 +- include/linux/vmw_vmci_defs.h | 12 ++ 2 files changed, 60 insertions(+), 19 deletions(-) diff --git a/drivers/misc/vmw_vmci/vmci_guest.c b/drivers/misc/vmw_vmci/vmci_guest.c index 1018dc77269d..d30d66258e52 100644 --- a/drivers/misc/vmw_vmci/vmci_guest.c +++ b/drivers/misc/vmw_vmci/vmci_guest.c @@ -45,6 +45,7 @@ static u32 vm_context_id = VMCI_INVALID_ID; struct vmci_guest_device { struct device *dev; /* PCI device we are attached to */ void __iomem *iobase; + void __iomem *mmio_base; bool exclusive_vectors; @@ -89,6 +90,21 @@ u32 vmci_get_vm_context_id(void) return vm_context_id; } +static unsigned int vmci_read_reg(struct vmci_guest_device *dev, u32 reg) +{ + if (dev->mmio_base != NULL) + return readl(dev->mmio_base + reg); + return ioread32(dev->iobase + reg); +} + +static void vmci_write_reg(struct vmci_guest_device *dev, u32 val, u32 reg) +{ + if (dev->mmio_base != NULL) + writel(val, dev->mmio_base + reg); + else + iowrite32(val, dev->iobase + reg); +} + /* * VM to hypervisor call mechanism. We use the standard VMware naming * convention since shared code is calling this function as well. 
@@ -116,7 +132,7 @@ int vmci_send_datagram(struct vmci_datagram *dg) if (vmci_dev_g) { iowrite8_rep(vmci_dev_g->iobase + VMCI_DATA_OUT_ADDR, dg, VMCI_DG_SIZE(dg)); - result = ioread32(vmci_dev_g->iobase + VMCI_RESULT_LOW_ADDR); + result = vmci_read_reg(vmci_dev_g, VMCI_RESULT_LOW_ADDR); } else { result = VMCI_ERROR_UNAVAILABLE; } @@ -384,7 +400,7 @@ static irqreturn_t vmci_interrupt(int irq, void *_dev) unsigned int icr; /* Acknowledge interrupt and determine what needs doing. */ - icr = ioread32(dev->iobase + VMCI_ICR_ADDR); + icr = vmci_read_reg(dev, VMCI_ICR_ADDR); if (icr == 0 || icr == ~0) return IRQ_NONE; @@ -429,7 +445,8 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, const struct pci_device_id *id) { struct vmci_guest_device *vmci_dev; - void __iomem *iobase; + void __iomem *iobase = NULL; + void __iomem *mmio_base = NULL; unsigned int capabilities; unsigned int caps_in_use; unsigned long cmd; @@ -445,16 +462,29 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, return error; } - error = pcim_iomap_regions(pdev, 1 << 0, KBUILD_MODNAME); - if (error) { - dev_err(>dev, "Failed to reserve/map IO regions\n"); - return error; - } + /* +* The VMCI device with mmio access to registers requests 256KB +* for BAR1. If present, driver will use new VMCI device +* functionality for register access and datagram send/recv. +*/ - iobase = pcim_iomap_table(pdev)[0]; + if (pci_resource_len(pdev, 1) == VMCI_WITH_MMIO_ACCESS_BAR_SIZE) { + dev_info(>dev, "MMIO register access is available\n"); + mmio_base = pci_iomap_range(pdev, 1, VMCI_MMIO_ACCESS_OFFSET, + VMCI_MMIO_ACCESS_SIZE); + /* If the map fails, we fall back to IOIO access. 
*/ + if (!mmio_base) + dev_warn(>dev, "Failed to map MMIO register access\n"); + } - dev_info(>dev, "Found VMCI PCI device at %#lx, irq %u\n", -(unsigned long)iobase, pdev->irq); + if (!mmio_base) { + error = pcim_iomap_regions(pdev, BIT(0), KBUILD_MODNAME); + if (error) { + dev_err(>dev, "Failed to reserve/map IO regions\n"); + return error; + } + iobase = pcim_iomap_table(pdev)[0]; + } vmci_dev = devm_kzalloc(>dev, sizeof(*vmci_dev), GFP_KERNEL); if (!vmci_dev) { @@ -466,6 +496,7 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, vmci_dev->dev = >dev; vmci_dev->exclusive_vectors = false; vmci_dev->iobase = iobase; + vmci_dev->mmio_base = mmio_base;
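The `vmci_read_reg()`/`vmci_write_reg()` dispatch that patch 2 introduces — one code path per device generation, selected by whether an MMIO mapping exists — can be modeled in userspace. Real MMIO and port I/O (`readl`/`writel` vs `ioread32`/`iowrite32`) are replaced by plain arrays here; only the dispatch shape is the point.

```c
#include <stdint.h>
#include <stddef.h>

#define REG_SPACE 64	/* registers modeled, in 32-bit words */

/* Model of struct vmci_guest_device's two access paths. */
struct fake_vmci_dev {
	uint32_t *mmio_base;	/* NULL on IO-port-only devices */
	uint32_t *iobase;
};

static uint32_t fake_read_reg(struct fake_vmci_dev *dev, uint32_t reg)
{
	if (dev->mmio_base != NULL)
		return dev->mmio_base[reg / 4];	/* models readl() */
	return dev->iobase[reg / 4];		/* models ioread32() */
}

static void fake_write_reg(struct fake_vmci_dev *dev, uint32_t val, uint32_t reg)
{
	if (dev->mmio_base != NULL)
		dev->mmio_base[reg / 4] = val;	/* models writel() */
	else
		dev->iobase[reg / 4] = val;	/* models iowrite32() */
}
```

Centralizing the NULL check in two helpers is what lets the rest of the driver stay generation-agnostic: callers pass a register offset and never ask which access method is in use.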
[PATCH v3 1/8] VMCI: dma dg: whitespace formatting change for vmci register defines
Update formatting of existing register defines in preparation for adding additional register definitions for the VMCI device. Reviewed-by: Vishnu Dasa Signed-off-by: Jorgen Hansen --- include/linux/vmw_vmci_defs.h | 18 +- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h index e36cb114c188..9911ecfc18ba 100644 --- a/include/linux/vmw_vmci_defs.h +++ b/include/linux/vmw_vmci_defs.h @@ -12,15 +12,15 @@ #include /* Register offsets. */ -#define VMCI_STATUS_ADDR 0x00 -#define VMCI_CONTROL_ADDR 0x04 -#define VMCI_ICR_ADDR0x08 -#define VMCI_IMR_ADDR 0x0c -#define VMCI_DATA_OUT_ADDR0x10 -#define VMCI_DATA_IN_ADDR 0x14 -#define VMCI_CAPS_ADDR0x18 -#define VMCI_RESULT_LOW_ADDR 0x1c -#define VMCI_RESULT_HIGH_ADDR 0x20 +#define VMCI_STATUS_ADDR0x00 +#define VMCI_CONTROL_ADDR 0x04 +#define VMCI_ICR_ADDR 0x08 +#define VMCI_IMR_ADDR 0x0c +#define VMCI_DATA_OUT_ADDR 0x10 +#define VMCI_DATA_IN_ADDR 0x14 +#define VMCI_CAPS_ADDR 0x18 +#define VMCI_RESULT_LOW_ADDR0x1c +#define VMCI_RESULT_HIGH_ADDR 0x20 /* Max number of devices. */ #define VMCI_MAX_DEVICES 1 -- 2.25.1
[PATCH v3 0/8] VMCI: dma dg: Add support for DMA datagrams
A new version of the VMCI device will introduce two new major changes: - support MMIO access to device registers - support send/receive of datagrams using DMA transfers instead of ioread8_rep/iowrite8_rep operations This patch series updates the VMCI driver to support these new features while maintaining backwards compatibility. The DMA based datagram operations use a send and a receive buffer allocated at module load time. The buffer contains a header describing the layout of the buffer followed by either an SG list or inline data. The header also contains a flag indicating whether the buffer is currently owned by the driver or the device. Both for send and receive, the driver will initialize the buffer, transfer ownership to the device by writing the buffer address to a register, and then wait for the ownership to be transferred back. The device will generate an interrupt when this happens. v2 (fixes issues flagged by kernel test robot ): - changed type of mmio_base to void __iomem * - made vmci_read_reg, vmci_write_reg and vmci_write_data static functions v3: - removed log messages for page size and BAR resources Jorgen Hansen (8): VMCI: dma dg: whitespace formatting change for vmci register defines VMCI: dma dg: add MMIO access to registers VMCI: dma dg: detect DMA datagram capability VMCI: dma dg: set OS page size VMCI: dma dg: register dummy IRQ handlers for DMA datagrams VMCI: dma dg: allocate send and receive buffers for DMA datagrams VMCI: dma dg: add support for DMA datagrams sends VMCI: dma dg: add support for DMA datagrams receive drivers/misc/vmw_vmci/vmci_guest.c | 335 - include/linux/vmw_vmci_defs.h | 84 +++- 2 files changed, 355 insertions(+), 64 deletions(-) -- 2.25.1 ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v2 2/8] VMCI: dma dg: add MMIO access to registers
> On 4 Feb 2022, at 16.12, Greg KH wrote: > > On Thu, Feb 03, 2022 at 05:12:31AM -0800, Jorgen Hansen wrote: >> Detect the support for MMIO access through examination of the length >> of the region requested in BAR1. If it is 256KB, the VMCI device >> supports MMIO access to registers. >> >> If MMIO access is supported, map the area of the region used for >> MMIO access (64KB size at offset 128KB). >> >> Add wrapper functions for accessing 32 bit register accesses through >> either MMIO or IO ports based on device configuration. >> >> Sending and receiving datagrams through iowrite8_rep/ioread8_rep is >> left unchanged for now, and will be addressed in a later change. >> >> Reviewed-by: Vishnu Dasa >> Signed-off-by: Jorgen Hansen >> --- >> drivers/misc/vmw_vmci/vmci_guest.c | 68 ++ >> include/linux/vmw_vmci_defs.h | 12 ++ >> 2 files changed, 62 insertions(+), 18 deletions(-) >> >> diff --git a/drivers/misc/vmw_vmci/vmci_guest.c >> b/drivers/misc/vmw_vmci/vmci_guest.c >> index 1018dc77269d..38ee7ed32ab9 100644 >> --- a/drivers/misc/vmw_vmci/vmci_guest.c >> +++ b/drivers/misc/vmw_vmci/vmci_guest.c >> @@ -45,6 +45,7 @@ static u32 vm_context_id = VMCI_INVALID_ID; >> struct vmci_guest_device { >> struct device *dev; /* PCI device we are attached to */ >> void __iomem *iobase; >> +void __iomem *mmio_base; >> >> bool exclusive_vectors; >> >> @@ -89,6 +90,21 @@ u32 vmci_get_vm_context_id(void) >> return vm_context_id; >> } >> >> +static unsigned int vmci_read_reg(struct vmci_guest_device *dev, u32 reg) >> +{ >> +if (dev->mmio_base != NULL) >> +return readl(dev->mmio_base + reg); >> +return ioread32(dev->iobase + reg); >> +} >> + >> +static void vmci_write_reg(struct vmci_guest_device *dev, u32 val, u32 reg) >> +{ >> +if (dev->mmio_base != NULL) >> +writel(val, dev->mmio_base + reg); >> +else >> +iowrite32(val, dev->iobase + reg); >> +} >> + >> /* >> * VM to hypervisor call mechanism. 
We use the standard VMware naming >> * convention since shared code is calling this function as well. >> @@ -116,7 +132,7 @@ int vmci_send_datagram(struct vmci_datagram *dg) >> if (vmci_dev_g) { >> iowrite8_rep(vmci_dev_g->iobase + VMCI_DATA_OUT_ADDR, >> dg, VMCI_DG_SIZE(dg)); >> -result = ioread32(vmci_dev_g->iobase + VMCI_RESULT_LOW_ADDR); >> +result = vmci_read_reg(vmci_dev_g, VMCI_RESULT_LOW_ADDR); >> } else { >> result = VMCI_ERROR_UNAVAILABLE; >> } >> @@ -384,7 +400,7 @@ static irqreturn_t vmci_interrupt(int irq, void *_dev) >> unsigned int icr; >> >> /* Acknowledge interrupt and determine what needs doing. */ >> -icr = ioread32(dev->iobase + VMCI_ICR_ADDR); >> +icr = vmci_read_reg(dev, VMCI_ICR_ADDR); >> if (icr == 0 || icr == ~0) >> return IRQ_NONE; >> >> @@ -429,7 +445,8 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, >> const struct pci_device_id *id) >> { >> struct vmci_guest_device *vmci_dev; >> -void __iomem *iobase; >> +void __iomem *iobase = NULL; >> +void __iomem *mmio_base = NULL; >> unsigned int capabilities; >> unsigned int caps_in_use; >> unsigned long cmd; >> @@ -445,16 +462,32 @@ static int vmci_guest_probe_device(struct pci_dev >> *pdev, >> return error; >> } >> >> -error = pcim_iomap_regions(pdev, 1 << 0, KBUILD_MODNAME); >> -if (error) { >> -dev_err(>dev, "Failed to reserve/map IO regions\n"); >> -return error; >> +/* >> + * The VMCI device with mmio access to registers requests 256KB >> + * for BAR1. If present, driver will use new VMCI device >> + * functionality for register access and datagram send/recv. >> + */ >> + >> +if (pci_resource_len(pdev, 1) == VMCI_WITH_MMIO_ACCESS_BAR_SIZE) { >> +dev_info(>dev, "MMIO register access is available\n"); >> +mmio_base = pci_iomap_range(pdev, 1, VMCI_MMIO_ACCESS_OFFSET, >> +VMCI_MMIO_ACCESS_SIZE); >> +/* If the map fails, we fall back to IOIO access. 
*/ >> +if (!mmio_base) >> +dev_warn(&pdev->dev, "Failed to map MMIO register >> access\n"); >> } >> >> -iobase = pcim_iomap_table(pdev)[0]; >> +if (!mmio_base) { >> +error = pcim_iomap_regions(pdev, BIT(0), KBUILD_MODNAME); >> +if (error) { >> +dev_err(&pdev->dev, "Failed to reserve/map IO >> regions\n"); >> +return error; >> +} >> +iobase = pcim_iomap_table(pdev)[0]; >> +} >> >> -dev_info(&pdev->dev, "Found VMCI PCI device at %#lx, irq %u\n", >> - (unsigned long)iobase, pdev->irq); >> +dev_info(&pdev->dev, "Found VMCI PCI device at %#lx, %#lx, irq %u\n",
Re: [PATCH v2 4/8] VMCI: dma dg: set OS page size
> On 4 Feb 2022, at 16.12, Greg KH wrote: > > On Thu, Feb 03, 2022 at 05:12:33AM -0800, Jorgen Hansen wrote: >> Tell the device the page size used by the OS. >> >> Reviewed-by: Vishnu Dasa >> Signed-off-by: Jorgen Hansen >> --- >> drivers/misc/vmw_vmci/vmci_guest.c | 9 + >> include/linux/vmw_vmci_defs.h | 1 + >> 2 files changed, 10 insertions(+) >> >> diff --git a/drivers/misc/vmw_vmci/vmci_guest.c >> b/drivers/misc/vmw_vmci/vmci_guest.c >> index 5a99d8e27873..808680dc0820 100644 >> --- a/drivers/misc/vmw_vmci/vmci_guest.c >> +++ b/drivers/misc/vmw_vmci/vmci_guest.c >> @@ -581,6 +581,15 @@ static int vmci_guest_probe_device(struct pci_dev *pdev, >> /* Let the host know which capabilities we intend to use. */ >> vmci_write_reg(vmci_dev, caps_in_use, VMCI_CAPS_ADDR); >> >> +if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM) { >> +uint32_t page_shift; >> + >> +/* Let the device know the size for pages passed down. */ >> +vmci_write_reg(vmci_dev, PAGE_SHIFT, VMCI_GUEST_PAGE_SHIFT); >> +page_shift = vmci_read_reg(vmci_dev, VMCI_GUEST_PAGE_SHIFT); >> +dev_info(&pdev->dev, "Using page shift %d\n", page_shift); > > Please do not print out debugging stuff like this to the kernel log. OK, I’ll remove it. > When drivers are working properly, they are quiet. > > thanks, > > greg k-h Thanks, Jorgen ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH v3 13/17] virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
On Mon, 7 Feb 2022 14:57:13 +0800, Jason Wang wrote: > > 在 2022/1/26 下午3:35, Xuan Zhuo 写道: > > This patch implements virtio pci support for QUEUE RESET. > > > > Performing reset on a queue is divided into two steps: > > > > 1. reset_vq: reset one vq > > 2. enable_reset_vq: re-enable the reset queue > > > > In the first step, these tasks will be completed: > > 1. notify the hardware queue to reset > > 2. recycle the buffer from vq > > 3. release the ring of the vq > > > > The process of enable reset vq: > > vp_modern_enable_reset_vq() > > vp_enable_reset_vq() > > __vp_setup_vq() > > setup_vq() > > vring_setup_virtqueue() > > > > In this process, we added two parameters, vq and num, and finally passed > > them to vring_setup_virtqueue(). And reuse the original info and vq. > > > > Signed-off-by: Xuan Zhuo > > --- > > drivers/virtio/virtio_pci_common.c | 36 +++ > > drivers/virtio/virtio_pci_common.h | 5 ++ > > drivers/virtio/virtio_pci_modern.c | 100 + > > 3 files changed, 128 insertions(+), 13 deletions(-) > > > > diff --git a/drivers/virtio/virtio_pci_common.c > > b/drivers/virtio/virtio_pci_common.c > > index c02936d29a31..ad21638fbf66 100644 > > --- a/drivers/virtio/virtio_pci_common.c > > +++ b/drivers/virtio/virtio_pci_common.c > > @@ -205,23 +205,28 @@ static int vp_request_msix_vectors(struct > > virtio_device *vdev, int nvectors, > > return err; > > } > > > > -static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned > > index, > > -void (*callback)(struct virtqueue *vq), > > -const char *name, > > -bool ctx, > > -u16 msix_vec, u16 num) > > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index, > > + void (*callback)(struct virtqueue *vq), > > + const char *name, > > + bool ctx, > > + u16 msix_vec, u16 num) > > { > > struct virtio_pci_device *vp_dev = to_vp_device(vdev); > > - struct virtio_pci_vq_info *info = kmalloc(sizeof *info, GFP_KERNEL); > > + struct virtio_pci_vq_info *info; > > struct virtqueue *vq; > > unsigned 
long flags; > > > > - /* fill out our structure that represents an active queue */ > > - if (!info) > > - return ERR_PTR(-ENOMEM); > > + info = vp_dev->vqs[index]; > > + if (!info) { > > + info = kzalloc(sizeof *info, GFP_KERNEL); > > + > > + /* fill out our structure that represents an active queue */ > > + if (!info) > > + return ERR_PTR(-ENOMEM); > > + } > > > > vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, ctx, > > - msix_vec, NULL, num); > > + msix_vec, info->vq, num); > > if (IS_ERR(vq)) > > goto out_info; > > > > @@ -238,6 +243,9 @@ static struct virtqueue *vp_setup_vq(struct > > virtio_device *vdev, unsigned index, > > return vq; > > > > out_info: > > + if (info->vq && info->vq->reset) > > + return vq; > > + > > kfree(info); > > return vq; > > } > > @@ -248,9 +256,11 @@ static void vp_del_vq(struct virtqueue *vq) > > struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index]; > > unsigned long flags; > > > > - spin_lock_irqsave(&vp_dev->lock, flags); > > - list_del(&info->node); > > - spin_unlock_irqrestore(&vp_dev->lock, flags); > > + if (!vq->reset) { > > + spin_lock_irqsave(&vp_dev->lock, flags); > > + list_del(&info->node); > > + spin_unlock_irqrestore(&vp_dev->lock, flags); > > + } > > > > vp_dev->del_vq(info); > > kfree(info); > > diff --git a/drivers/virtio/virtio_pci_common.h > > b/drivers/virtio/virtio_pci_common.h > > index 65db92245e41..c1d15f7c0be4 100644 > > --- a/drivers/virtio/virtio_pci_common.h > > +++ b/drivers/virtio/virtio_pci_common.h > > @@ -119,6 +119,11 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned > > nvqs, > > struct virtqueue *vqs[], vq_callback_t *callbacks[], > > const char * const names[], const bool *ctx, > > struct irq_affinity *desc); > > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index, > > + void (*callback)(struct virtqueue *vq), > > + const char *name, > > + bool ctx, > > + u16 msix_vec, u16 num); > > const char *vp_bus_name(struct virtio_device *vdev); > > > > /* Setup the affinity for a virtqueue:
> > diff --git a/drivers/virtio/virtio_pci_modern.c > > b/drivers/virtio/virtio_pci_modern.c > > index 2ce58de549de..6789411169e4 100644 > > --- a/drivers/virtio/virtio_pci_modern.c > > +++ b/drivers/virtio/virtio_pci_modern.c > > @@ -34,6 +34,9 @@ static void vp_transport_features(struct virtio_device
Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo wrote: > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang wrote: > > > > 在 2022/1/26 下午3:35, Xuan Zhuo 写道: > > > Add queue_notify_data in struct virtio_pci_common_cfg, which comes from > > > here https://github.com/oasis-tcs/virtio-spec/issues/89 > > > > > > Since I want to add queue_reset after it, I submitted this patch first. > > > > > > Signed-off-by: Xuan Zhuo > > > --- > > > include/uapi/linux/virtio_pci.h | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/include/uapi/linux/virtio_pci.h > > > b/include/uapi/linux/virtio_pci.h > > > index 3a86f36d7e3d..492c89f56c6a 100644 > > > --- a/include/uapi/linux/virtio_pci.h > > > +++ b/include/uapi/linux/virtio_pci.h > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg { > > > __le32 queue_avail_hi; /* read-write */ > > > __le32 queue_used_lo; /* read-write */ > > > __le32 queue_used_hi; /* read-write */ > > > + __le16 queue_notify_data; /* read-write */ > > > }; > > > > > > So I had the same concern as previous version. > > > > This breaks uABI where program may try to use sizeof(struct > > virtio_pci_common_cfg). > > > > We probably need a container structure here. > > I see, I plan to add a struct like this, do you think it's appropriate? > > struct virtio_pci_common_cfg_v1 { > struct virtio_pci_common_cfg cfg; > __le16 queue_notify_data; /* read-write */ > } Something like this but we probably need a better name. Thanks > > Thanks. > > > > > THanks > > > > > > > > > > /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */