Re: [PATCH v3 00/17] virtio pci support VIRTIO_F_RING_RESET

2022-02-07 Thread Xuan Zhuo
On Tue, 08 Feb 2022 11:14:45 +0800, Xuan Zhuo  
wrote:
> On Tue, 8 Feb 2022 10:59:48 +0800, Jason Wang  wrote:
> >
> > On 2022/2/7 2:02 PM, Xuan Zhuo wrote:
> > > On Mon, 7 Feb 2022 11:39:36 +0800, Jason Wang  wrote:
> > >> On Wed, Jan 26, 2022 at 3:35 PM Xuan Zhuo  
> > >> wrote:
> > >>> 
> > >>> The virtio spec already supports the virtio queue reset function. This 
> > >>> patch set
> > >>> is to add this function to the kernel. The relevant virtio spec 
> > >>> information is
> > >>> here:
> > >>>
> > >>>  https://github.com/oasis-tcs/virtio-spec/issues/124
> > >>>
> > >>> Also, regarding MMIO support for queue reset, I plan to support it
> > >>> after this patch set is accepted.
> > >>>
> > >>> #14-#17 is the disable/enable function of rx/tx pair implemented by 
> > >>> virtio-net
> > >>> using the new helper.
> > >> One thing that came to mind is the steering. E.g if we disable an RX,
> > >> should we stop steering packets to that queue?

Regarding the spec: if multiple queues are disabled at the same time, selecting
a replacement queue becomes a troublesome problem for the backend, so I want to
define that only one queue is allowed to be in reset at a time. Do you think
this is appropriate?

As for the backend's queue reselection, it would be more convenient to
implement if we simply drop the packets. Do you think we must implement this
reselection function?

Thanks.
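
To make the reselection concrete, here is a hypothetical backend-side
sketch (names and layout are illustrative, not from this patch set): when
the steering/RSS decision lands on a queue that is under reset, fall back
to the first enabled queue instead of dropping the packet.

/* Hypothetical helper for a device backend: pick an rx queue, skipping
 * queues that are currently disabled or in reset. */
static u16 steer_rx_queue(u16 hashed_q, const bool *q_enabled, u16 nqueues)
{
    u16 q;

    if (q_enabled[hashed_q])
        return hashed_q;            /* common case: queue is live */

    for (q = 0; q < nqueues; q++)   /* queue under reset: reselect */
        if (q_enabled[q])
            return q;

    return hashed_q;                /* nothing enabled: caller drops */
}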

> > > Yes, we should reselect a queue.
> > >
> > > Thanks.
> >
> >
> > Maybe a spec patch for that?
>
> Yes, I also realized this. Although virtio-net's disable/enable is implemented
> based on queue reset, virtio-net still has to define its own flag and spell
> out some implementation details.
>
> I'll think about it and post a spec patch.
>
> Thanks.
>
> >
> > Thanks
> >
> >
> > >
> > >> Thanks
> > >>
> > >>> This function is not currently referenced by other
> > >>> functions. It is more to show the usage of the new helper; I'm not sure
> > >>> whether they are going to be merged together.
> > >>>
> > >>> Please review. Thanks.
> > >>>
> > >>> v3:
> > >>>1. keep vq, irq unreleased
> > >>>
> > >>> Xuan Zhuo (17):
> > >>>virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
> > >>>virtio: queue_reset: add VIRTIO_F_RING_RESET
> > >>>virtio: queue_reset: struct virtio_config_ops add callbacks for
> > >>>  queue_reset
> > >>>virtio: queue_reset: add helper
> > >>>virtio_ring: queue_reset: extract the release function of the vq ring
> > >>>virtio_ring: queue_reset: split: add __vring_init_virtqueue()
> > >>>virtio_ring: queue_reset: split: support enable reset queue
> > >>>virtio_ring: queue_reset: packed: support enable reset queue
> > >>>virtio_ring: queue_reset: add vring_reset_virtqueue()
> > >>>virtio_pci: queue_reset: update struct virtio_pci_common_cfg and
> > >>>  option functions
> > >>>virtio_pci: queue_reset: release vq by vp_dev->vqs
> > >>>virtio_pci: queue_reset: setup_vq use vring_setup_virtqueue()
> > >>>virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
> > >>>virtio_net: virtnet_tx_timeout() fix style
> > >>>virtio_net: virtnet_tx_timeout() stop ref sq->vq
> > >>>virtio_net: split free_unused_bufs()
> > >>>virtio_net: support pair disable/enable
> > >>>
> > >>>   drivers/net/virtio_net.c   | 220 ++---
> > >>>   drivers/virtio/virtio_pci_common.c |  62 ---
> > >>>   drivers/virtio/virtio_pci_common.h |  11 +-
> > >>>   drivers/virtio/virtio_pci_legacy.c |   5 +-
> > >>>   drivers/virtio/virtio_pci_modern.c | 120 +-
> > >>>   drivers/virtio/virtio_pci_modern_dev.c |  28 
> > >>>   drivers/virtio/virtio_ring.c   | 144 +++-
> > >>>   include/linux/virtio.h |   1 +
> > >>>   include/linux/virtio_config.h  |  75 -
> > >>>   include/linux/virtio_pci_modern.h  |   2 +
> > >>>   include/linux/virtio_ring.h|  42 +++--
> > >>>   include/uapi/linux/virtio_config.h |   7 +-
> > >>>   include/uapi/linux/virtio_pci.h|   2 +
> > >>>   13 files changed, 618 insertions(+), 101 deletions(-)
> > >>>
> > >>> --
> > >>> 2.31.0
> > >>>
> >

Re: [PATCH v3 13/17] virtio_pci: queue_reset: support VIRTIO_F_RING_RESET

2022-02-07 Thread Xuan Zhuo
On Tue, 8 Feb 2022 10:55:37 +0800, Jason Wang  wrote:
> On Mon, Feb 7, 2022 at 4:19 PM Xuan Zhuo  wrote:
> >
> > On Mon, 7 Feb 2022 14:57:13 +0800, Jason Wang  wrote:
> > >
> > > On 2022/1/26 3:35 PM, Xuan Zhuo wrote:
> > > > This patch implements virtio pci support for QUEUE RESET.
> > > >
> > > > Performing reset on a queue is divided into two steps:
> > > >
> > > > 1. reset_vq: reset one vq
> > > > 2. enable_reset_vq: re-enable the reset queue
> > > >
> > > > In the first step, these tasks will be completed:
> > > > 1. notify the hardware queue to reset
> > > > 2. recycle the buffer from vq
> > > > 3. release the ring of the vq
> > > >
> > > > The process of enable reset vq:
> > > >  vp_modern_enable_reset_vq()
> > > >  vp_enable_reset_vq()
> > > >  __vp_setup_vq()
> > > >  setup_vq()
> > > >  vring_setup_virtqueue()
> > > >
> > > > In this process, we add two parameters, vq and num, and finally pass
> > > > them to vring_setup_virtqueue(), reusing the original info and vq.
> > > >
> > > > Signed-off-by: Xuan Zhuo 
> > > > ---
> > > >   drivers/virtio/virtio_pci_common.c |  36 +++
> > > >   drivers/virtio/virtio_pci_common.h |   5 ++
> > > >   drivers/virtio/virtio_pci_modern.c | 100 +
> > > >   3 files changed, 128 insertions(+), 13 deletions(-)
> > > >
> > > > diff --git a/drivers/virtio/virtio_pci_common.c 
> > > > b/drivers/virtio/virtio_pci_common.c
> > > > index c02936d29a31..ad21638fbf66 100644
> > > > --- a/drivers/virtio/virtio_pci_common.c
> > > > +++ b/drivers/virtio/virtio_pci_common.c
> > > > @@ -205,23 +205,28 @@ static int vp_request_msix_vectors(struct 
> > > > virtio_device *vdev, int nvectors,
> > > > return err;
> > > >   }
> > > >
> > > > -static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, 
> > > > unsigned index,
> > > > -void (*callback)(struct virtqueue *vq),
> > > > -const char *name,
> > > > -bool ctx,
> > > > -u16 msix_vec, u16 num)
> > > > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned 
> > > > index,
> > > > + void (*callback)(struct virtqueue *vq),
> > > > + const char *name,
> > > > + bool ctx,
> > > > + u16 msix_vec, u16 num)
> > > >   {
> > > > struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> > > > -   struct virtio_pci_vq_info *info = kmalloc(sizeof *info, GFP_KERNEL);
> > > > +   struct virtio_pci_vq_info *info;
> > > > struct virtqueue *vq;
> > > > unsigned long flags;
> > > >
> > > > -   /* fill out our structure that represents an active queue */
> > > > -   if (!info)
> > > > -   return ERR_PTR(-ENOMEM);
> > > > +   info = vp_dev->vqs[index];
> > > > +   if (!info) {
> > > > +   info = kzalloc(sizeof *info, GFP_KERNEL);
> > > > +
> > > > +   /* fill out our structure that represents an active queue */
> > > > +   if (!info)
> > > > +   return ERR_PTR(-ENOMEM);
> > > > +   }
> > > >
> > > > vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, ctx,
> > > > - msix_vec, NULL, num);
> > > > + msix_vec, info->vq, num);
> > > > if (IS_ERR(vq))
> > > > goto out_info;
> > > >
> > > > @@ -238,6 +243,9 @@ static struct virtqueue *vp_setup_vq(struct 
> > > > virtio_device *vdev, unsigned index,
> > > > return vq;
> > > >
> > > >   out_info:
> > > > +   if (info->vq && info->vq->reset)
> > > > +   return vq;
> > > > +
> > > > kfree(info);
> > > > return vq;
> > > >   }
> > > > @@ -248,9 +256,11 @@ static void vp_del_vq(struct virtqueue *vq)
> > > > struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index];
> > > > unsigned long flags;
> > > >
> > > > > -   spin_lock_irqsave(&vp_dev->lock, flags);
> > > > > -   list_del(&info->node);
> > > > > -   spin_unlock_irqrestore(&vp_dev->lock, flags);
> > > > > +   if (!vq->reset) {
> > > > > +   spin_lock_irqsave(&vp_dev->lock, flags);
> > > > > +   list_del(&info->node);
> > > > > +   spin_unlock_irqrestore(&vp_dev->lock, flags);
> > > > +   }
> > > >
> > > > vp_dev->del_vq(info);
> > > > kfree(info);
> > > > diff --git a/drivers/virtio/virtio_pci_common.h 
> > > > b/drivers/virtio/virtio_pci_common.h
> > > > index 65db92245e41..c1d15f7c0be4 100644
> > > > --- a/drivers/virtio/virtio_pci_common.h
> > > > +++ b/drivers/virtio/virtio_pci_common.h
> > > > @@ -119,6 +119,11 @@ int vp_find_vqs(struct virtio_device *vdev, 
> > > > unsigned nvqs,
> > > > struct virtqueue *vqs[], vq_callback_t *callbacks[],
> > > > const char * const names[], const bool *ctx,
> > > > struct irq_affinity *desc);
> > > > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned 
> > > > index,
> > > > + void (*callback)(struct virtqueue *vq),
> > > > + const char *name,
> > > > + bool ctx,
> > > > + u16 msix_vec, u16 num);

Re: [PATCH 16/31] vhost: pass queue index to vhost_vq_get_addr

2022-02-07 Thread Jason Wang


On 2022/2/1 1:44 AM, Eugenio Perez Martin wrote:

On Sat, Jan 29, 2022 at 9:20 AM Jason Wang  wrote:


On 2022/1/22 4:27 AM, Eugenio Pérez wrote:

Doing that way allows vhost backend to know what address to return.

Signed-off-by: Eugenio Pérez 
---
   hw/virtio/vhost.c | 6 +++---
   1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 7b03efccec..64b955ba0c 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -798,9 +798,10 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
   struct vhost_virtqueue *vq,
   unsigned idx, bool enable_log)
   {
-struct vhost_vring_addr addr;
+struct vhost_vring_addr addr = {
+.index = idx,
+};
   int r;
-memset(&addr, 0, sizeof(struct vhost_vring_addr));

   if (dev->vhost_ops->vhost_vq_get_addr) {
    r = dev->vhost_ops->vhost_vq_get_addr(dev, &addr, vq);
@@ -813,7 +814,6 @@ static int vhost_virtqueue_set_addr(struct vhost_dev *dev,
   addr.avail_user_addr = (uint64_t)(unsigned long)vq->avail;
   addr.used_user_addr = (uint64_t)(unsigned long)vq->used;
   }


I'm a bit lost in the logic above; any reason we need to call
vhost_vq_get_addr() :) ?


It's the way vhost_virtqueue_set_addr works if the backend has a
vhost_vq_get_addr operation (currently, only vhost-vdpa). vhost first
asks the backend for the address and then sets it.



Right, it's because vhost-vdpa doesn't use VA but GPA. I'm not sure
it's worth a dedicated vhost_op, but considering we are introducing the
shadow virtqueue stuff, it should be OK now.


(In the future, we may consider generalizing the non-vhost-vdpa-specific
stuff into VhostShadowVirtqueue; then we can get rid of this vhost_op.)





Previously, the index was not needed because all the information was in
vhost_virtqueue. However, extracting the queue index from vhost_virtqueue
is tricky, so I think it's easier to simply have that information in the
request, similar to get_base or get_num when asking the vdpa device. We
can extract the index from vq - dev->vqs or something similar if that is
preferred.



It looks odd for the caller to pass the index considering vhost_virtqueue
is already passed. So I think we should deduce it from vhost_virtqueue, as
you mentioned here.


Thanks




Thanks!


Thanks



-addr.index = idx;
   addr.log_guest_addr = vq->used_phys;
   addr.flags = enable_log ? (1 << VHOST_VRING_F_LOG) : 0;
    r = dev->vhost_ops->vhost_set_vring_addr(dev, &addr);
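
As a minimal sketch of the deduction discussed above (assuming QEMU's
struct vhost_dev layout, where dev->vqs is the vhost_virtqueue array and
dev->nvqs its length; the helper name is illustrative):

#include <assert.h>

static unsigned vhost_vq_rel_index(struct vhost_dev *dev,
                                   struct vhost_virtqueue *vq)
{
    /* vq must point into the dev->vqs array */
    assert(vq >= dev->vqs && vq < dev->vqs + dev->nvqs);
    return vq - dev->vqs;           /* index relative to this vhost_dev */
}

A backend that needs the device-wide vring index would additionally
offset this by dev->vq_index.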



Re: [PATCH v3 13/17] virtio_pci: queue_reset: support VIRTIO_F_RING_RESET

2022-02-07 Thread xuanzhuo



On 2022/2/8 10:55 AM, Jason Wang wrote:

On Mon, Feb 7, 2022 at 4:19 PM Xuan Zhuo  wrote:

On Mon, 7 Feb 2022 14:57:13 +0800, Jason Wang  wrote:

On 2022/1/26 3:35 PM, Xuan Zhuo wrote:

This patch implements virtio pci support for QUEUE RESET.

Performing reset on a queue is divided into two steps:

1. reset_vq: reset one vq
2. enable_reset_vq: re-enable the reset queue

In the first step, these tasks will be completed:
 1. notify the hardware queue to reset
 2. recycle the buffer from vq
 3. release the ring of the vq

The process of enable reset vq:
  vp_modern_enable_reset_vq()
  vp_enable_reset_vq()
  __vp_setup_vq()
  setup_vq()
  vring_setup_virtqueue()

In this process, we add two parameters, vq and num, and finally pass
them to vring_setup_virtqueue(), reusing the original info and vq.

Signed-off-by: Xuan Zhuo 
---
   drivers/virtio/virtio_pci_common.c |  36 +++
   drivers/virtio/virtio_pci_common.h |   5 ++
   drivers/virtio/virtio_pci_modern.c | 100 +
   3 files changed, 128 insertions(+), 13 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c 
b/drivers/virtio/virtio_pci_common.c
index c02936d29a31..ad21638fbf66 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -205,23 +205,28 @@ static int vp_request_msix_vectors(struct virtio_device 
*vdev, int nvectors,
 return err;
   }

-static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned 
index,
-void (*callback)(struct virtqueue *vq),
-const char *name,
-bool ctx,
-u16 msix_vec, u16 num)
+struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index,
+ void (*callback)(struct virtqueue *vq),
+ const char *name,
+ bool ctx,
+ u16 msix_vec, u16 num)
   {
 struct virtio_pci_device *vp_dev = to_vp_device(vdev);
-   struct virtio_pci_vq_info *info = kmalloc(sizeof *info, GFP_KERNEL);
+   struct virtio_pci_vq_info *info;
 struct virtqueue *vq;
 unsigned long flags;

-   /* fill out our structure that represents an active queue */
-   if (!info)
-   return ERR_PTR(-ENOMEM);
+   info = vp_dev->vqs[index];
+   if (!info) {
+   info = kzalloc(sizeof *info, GFP_KERNEL);
+
+   /* fill out our structure that represents an active queue */
+   if (!info)
+   return ERR_PTR(-ENOMEM);
+   }

 vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, ctx,
- msix_vec, NULL, num);
+ msix_vec, info->vq, num);
 if (IS_ERR(vq))
 goto out_info;

@@ -238,6 +243,9 @@ static struct virtqueue *vp_setup_vq(struct virtio_device 
*vdev, unsigned index,
 return vq;

   out_info:
+   if (info->vq && info->vq->reset)
+   return vq;
+
 kfree(info);
 return vq;
   }
@@ -248,9 +256,11 @@ static void vp_del_vq(struct virtqueue *vq)
 struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index];
 unsigned long flags;

-   spin_lock_irqsave(&vp_dev->lock, flags);
-   list_del(&info->node);
-   spin_unlock_irqrestore(&vp_dev->lock, flags);
+   if (!vq->reset) {
+   spin_lock_irqsave(&vp_dev->lock, flags);
+   list_del(&info->node);
+   spin_unlock_irqrestore(&vp_dev->lock, flags);
+   }

 vp_dev->del_vq(info);
 kfree(info);
diff --git a/drivers/virtio/virtio_pci_common.h 
b/drivers/virtio/virtio_pci_common.h
index 65db92245e41..c1d15f7c0be4 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -119,6 +119,11 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
 struct virtqueue *vqs[], vq_callback_t *callbacks[],
 const char * const names[], const bool *ctx,
 struct irq_affinity *desc);
+struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index,
+ void (*callback)(struct virtqueue *vq),
+ const char *name,
+ bool ctx,
+ u16 msix_vec, u16 num);
   const char *vp_bus_name(struct virtio_device *vdev);

   /* Setup the affinity for a virtqueue:
diff --git a/drivers/virtio/virtio_pci_modern.c 
b/drivers/virtio/virtio_pci_modern.c
index 2ce58de549de..6789411169e4 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -34,6 +34,9 @@ static void vp_transport_features(struct virtio_device *vdev, 
u64 features)
 if ((features & BIT_ULL(VIRTIO_F_SR_IOV)) &&
 pci_find_ext_capability(pci_dev, PCI_EXT_CAP_ID_SRIOV))
 __virtio_set_bit(vdev, VIRTIO_F_SR_IOV);
+
+   if (features & BIT_ULL(VIRTIO_F_RING_RESET))
+   __virtio_set_bit(vdev, VIRTIO_F_RING_RESET);
   }

   /* virtio config->finalize_features() 

Re: [PATCH 17/31] vdpa: adapt vhost_ops callbacks to svq

2022-02-07 Thread Jason Wang


On 2022/2/1 2:58 AM, Eugenio Perez Martin wrote:

On Sun, Jan 30, 2022 at 5:03 AM Jason Wang  wrote:


On 2022/1/22 4:27 AM, Eugenio Pérez wrote:

First half of the buffers forwarding part, preparing vhost-vdpa
callbacks to SVQ to offer it. QEMU cannot enable it at this moment, so
this is effectively dead code at the moment, but it helps to reduce
patch size.

Signed-off-by: Eugenio Pérez 
---
   hw/virtio/vhost-shadow-virtqueue.h |   2 +-
   hw/virtio/vhost-shadow-virtqueue.c |  21 -
   hw/virtio/vhost-vdpa.c | 133 ++---
   3 files changed, 143 insertions(+), 13 deletions(-)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
b/hw/virtio/vhost-shadow-virtqueue.h
index 035207a469..39aef5ffdf 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -35,7 +35,7 @@ size_t vhost_svq_device_area_size(const VhostShadowVirtqueue 
*svq);

   void vhost_svq_stop(VhostShadowVirtqueue *svq);

-VhostShadowVirtqueue *vhost_svq_new(void);
+VhostShadowVirtqueue *vhost_svq_new(uint16_t qsize);

   void vhost_svq_free(VhostShadowVirtqueue *vq);

diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index f129ec8395..7c168075d7 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -277,9 +277,17 @@ void vhost_svq_stop(VhostShadowVirtqueue *svq)
   /**
* Creates vhost shadow virtqueue, and instruct vhost device to use the 
shadow
* methods and file descriptors.
+ *
+ * @qsize Shadow VirtQueue size
+ *
+ * Returns the new virtqueue or NULL.
+ *
+ * In case of error, reason is reported through error_report.
*/
-VhostShadowVirtqueue *vhost_svq_new(void)
+VhostShadowVirtqueue *vhost_svq_new(uint16_t qsize)
   {
+size_t desc_size = sizeof(vring_desc_t) * qsize;
+size_t device_size, driver_size;
   g_autofree VhostShadowVirtqueue *svq = g_new0(VhostShadowVirtqueue, 1);
   int r;

@@ -300,6 +308,15 @@ VhostShadowVirtqueue *vhost_svq_new(void)
   /* Placeholder descriptor, it should be deleted at set_kick_fd */
    event_notifier_init_fd(&svq->svq_kick, INVALID_SVQ_KICK_FD);

+svq->vring.num = qsize;


I wonder if this is best. E.g. some hardware can support up to a 32K
queue size. So this will probably end up with:

1) SVQ uses a 32K queue size
2) the hardware queue uses 256


In that case the SVQ vring queue size will be 32K, and the guest's vring can
negotiate any number with SVQ equal to or less than 32K,



Sorry for being unclear what I meant is actually

1) SVQ uses 32K queue size

2) guest vq uses 256

This looks like a burden that needs extra logic and may hurt
performance.


And this can lead to another interesting situation:

1) SVQ uses 256

2) guest vq uses 1024

Where a lot of more SVQ logic is needed.



including 256.
Is that what you mean?



I mean, it looks to me that the logic would be much simpler if we just
allocate the shadow virtqueue with the size the guest can see (the guest
vring).


Then we don't need to think about whether the queue size difference can
have any side effects.





If by hardware queues you mean the guest's vring, I'm not sure why it is
"probably 256". I'd say that in that case, with the virtio-net kernel
driver, the ring size will be the same as the one the device exports,
won't it?

The implementation should support any combination of sizes, but the
ring size exposed to the guest is never bigger than the hardware one.


Or SVQ can stick to 256, but will this cause trouble if we want
to add event index support?


I think we should not have any problem with event idx. If you mean
that the guest could mark more buffers available than the SVQ vring's
size, that should not happen, because there must be fewer entries in the
guest than in SVQ.

But if I understood you correctly, a similar situation could happen if
a guest's contiguous buffer is scattered across many of qemu's VA chunks.
Even if that happened, the situation should be OK too: SVQ knows
the guest's avail idx and, if SVQ is full, it will continue forwarding
avail buffers when the device uses more buffers.

Does that make sense to you?



Yes.

Thanks
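
The sizing rule this converges on fits in a small sketch (function and
parameter names are illustrative): the shadow vring simply mirrors the
size the guest negotiates, with the device maximum acting only as an
upper bound, so no size-translation logic is needed.

static uint16_t svq_choose_ring_num(uint16_t guest_num, uint16_t dev_max)
{
    /* SVQ mirrors the guest vring; never exceed what the device takes. */
    return guest_num < dev_max ? guest_num : dev_max;
}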


Re: [PATCH 11/31] vhost: Add vhost_svq_valid_device_features to shadow vq

2022-02-07 Thread Jason Wang


On 2022/2/1 6:57 PM, Eugenio Perez Martin wrote:

On Mon, Jan 31, 2022 at 4:49 PM Eugenio Perez Martin
 wrote:

On Sat, Jan 29, 2022 at 9:11 AM Jason Wang  wrote:


On 2022/1/22 4:27 AM, Eugenio Pérez wrote:

This allows SVQ to negotiate features with the device. For the device,
SVQ is a driver. While this function needs to bypass all non-transport
features, it needs to disable the features that SVQ does not support
when forwarding buffers. This includes packed vq layout, indirect
descriptors or event idx.

Signed-off-by: Eugenio Pérez 
---
   hw/virtio/vhost-shadow-virtqueue.h |  2 ++
   hw/virtio/vhost-shadow-virtqueue.c | 44 ++
   hw/virtio/vhost-vdpa.c | 21 ++
   3 files changed, 67 insertions(+)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h 
b/hw/virtio/vhost-shadow-virtqueue.h
index c9ffa11fce..d963867a04 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -15,6 +15,8 @@

   typedef struct VhostShadowVirtqueue VhostShadowVirtqueue;

+bool vhost_svq_valid_device_features(uint64_t *features);
+
   void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
   void vhost_svq_set_guest_call_notifier(VhostShadowVirtqueue *svq, int 
call_fd);
   const EventNotifier *vhost_svq_get_dev_kick_notifier(
diff --git a/hw/virtio/vhost-shadow-virtqueue.c 
b/hw/virtio/vhost-shadow-virtqueue.c
index 9619c8082c..51442b3dbf 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -45,6 +45,50 @@ const EventNotifier *vhost_svq_get_dev_kick_notifier(
    return &svq->hdev_kick;
   }

+/**
+ * Validate the transport device features that SVQ can use with the device
+ *
+ * @dev_features  The device features. If success, the acknowledged features.
+ *
+ * Returns true if SVQ can go with a subset of these, false otherwise.
+ */
+bool vhost_svq_valid_device_features(uint64_t *dev_features)
+{
+bool r = true;
+
+for (uint64_t b = VIRTIO_TRANSPORT_F_START; b <= VIRTIO_TRANSPORT_F_END;
+ ++b) {
+switch (b) {
+case VIRTIO_F_NOTIFY_ON_EMPTY:
+case VIRTIO_F_ANY_LAYOUT:
+continue;
+
+case VIRTIO_F_ACCESS_PLATFORM:
+/* SVQ does not know how to translate addresses */


I may be missing something, but is there any reason we need to disable
ACCESS_PLATFORM? I'd expect the vring helpers we use for the shadow
virtqueue to deal with vIOMMU perfectly.


This function validates SVQ <-> device communication features, which
may or may not be the same as guest <-> SVQ. These feature flags
are valid for guest <-> SVQ communication, the same as with the indirect
descriptors one.

Having said that, there is a point in the series where
VIRTIO_F_ACCESS_PLATFORM is actually mandatory, so I think we could
use the latter addition of x-svq cmdline parameter and delay the
feature validations where it makes more sense.


+if (*dev_features & BIT_ULL(b)) {
+clear_bit(b, dev_features);
+r = false;
+}
+break;
+
+case VIRTIO_F_VERSION_1:


I had the same question here.


For VERSION_1 it's easier to assume that the guest is little-endian at
some points, but we could try harder to support both endiannesses if
needed.


Re-thinking the SVQ feature isolation stuff for this first iteration
based on your comments.

Maybe it's easier to simply fail if the device does not *match* the
expected feature set, and add all of the "feature isolation" later.
While a lot of guest <-> SVQ communication details are already solved
for free with qemu's VirtQueue (indirect, packed, ...), we may
simplify this series in particular and add the support for it later.

For example, at this moment would be valid for the device to export
indirect descriptors feature flag, and SVQ simply forward that feature
flag offering to the guest. So the guest <-> SVQ communication could
have indirect descriptors (qemu's VirtQueue code handles it for free),
but SVQ would not acknowledge it for the device. As a side note,
negotiating it would actually have been harmless, but that's not the case
for packed vq.

So maybe for the v2 we can simply force the device to just export the
strictly needed features and nothing else with qemu cmdline, and then
enable the feature negotiation isolation for each side of SVQ?



Yes, that's exactly my point.

Thanks




Thanks!



Thanks!


Thanks



+/* SVQ trust that guest vring is little endian */
+if (!(*dev_features & BIT_ULL(b))) {
+set_bit(b, dev_features);
+r = false;
+}
+continue;
+
+default:
+if (*dev_features & BIT_ULL(b)) {
+clear_bit(b, dev_features);
+}
+}
+}
+
+return r;
+}
+
   /* Forward guest notifications */
   static void vhost_handle_guest_kick(EventNotifier *n)
   {
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 
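
For context, a sketch of how a caller might consume the validator above;
error_report() is QEMU's, PRIx64 comes from <inttypes.h>, and the wrapper
name and fail-on-mismatch policy are assumptions in line with the "simply
fail if the device does not match" idea discussed earlier:

static int svq_check_features(uint64_t dev_features, uint64_t *acked)
{
    *acked = dev_features;
    if (!vhost_svq_valid_device_features(acked)) {
        /* *acked now holds the usable subset: unsupported transport
         * bits cleared, mandatory ones (e.g. VERSION_1) forced on. */
        error_report("SVQ: cannot fully ack device features 0x%" PRIx64,
                     dev_features);
        return -ENOTSUP;
    }
    return 0;
}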

Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data

2022-02-07 Thread Jason Wang
On Tue, Feb 8, 2022 at 11:29 AM Xuan Zhuo  wrote:
>
> On Tue, 8 Feb 2022 11:24:13 +0800, Jason Wang  wrote:
> > On Tue, Feb 8, 2022 at 11:20 AM Xuan Zhuo  
> > wrote:
> > >
> > > On Tue, 8 Feb 2022 11:03:17 +0800, Jason Wang  wrote:
> > > > On Tue, Feb 8, 2022 at 10:17 AM Xuan Zhuo  
> > > > wrote:
> > > > >
> > > > > On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang  
> > > > > wrote:
> > > > > > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo 
> > > > > >  wrote:
> > > > > > >
> > > > > > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > On 2022/1/26 3:35 PM, Xuan Zhuo wrote:
> > > > > > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which 
> > > > > > > > > comes from
> > > > > > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89
> > > > > > > > >
> > > > > > > > > Since I want to add queue_reset after it, I submitted this 
> > > > > > > > > patch first.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Xuan Zhuo 
> > > > > > > > > ---
> > > > > > > > >   include/uapi/linux/virtio_pci.h | 1 +
> > > > > > > > >   1 file changed, 1 insertion(+)
> > > > > > > > >
> > > > > > > > > diff --git a/include/uapi/linux/virtio_pci.h 
> > > > > > > > > b/include/uapi/linux/virtio_pci.h
> > > > > > > > > index 3a86f36d7e3d..492c89f56c6a 100644
> > > > > > > > > --- a/include/uapi/linux/virtio_pci.h
> > > > > > > > > +++ b/include/uapi/linux/virtio_pci.h
> > > > > > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg {
> > > > > > > > > __le32 queue_avail_hi;  /* read-write */
> > > > > > > > > __le32 queue_used_lo;   /* read-write */
> > > > > > > > > __le32 queue_used_hi;   /* read-write */
> > > > > > > > > +   __le16 queue_notify_data;   /* read-write */
> > > > > > > > >   };
> > > > > > > >
> > > > > > > >
> > > > > > > > So I had the same concern as previous version.
> > > > > > > >
> > > > > > > > This breaks the uABI, where a program may try to use sizeof(struct
> > > > > > > > virtio_pci_common_cfg).
> > > > > > > >
> > > > > > > > We probably need a container structure here.
> > > > > > >
> > > > > > > I see, I plan to add a struct like this, do you think it's 
> > > > > > > appropriate?
> > > > > > >
> > > > > > > struct virtio_pci_common_cfg_v1 {
> > > > > > > struct virtio_pci_common_cfg cfg;
> > > > > > > __le16 queue_notify_data;   /* read-write */
> > > > > > > }
> > > > > >
> > > > > > Something like this but we probably need a better name.
> > > > >
> > > > >
> > > > > how about this?
> > > > >
> > > > > /* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */
> > > > > struct virtio_pci_common_cfg_ext {
> > > > > struct virtio_pci_common_cfg cfg;
> > > > >
> > > > > __le16 queue_notify_data;   /* read-write */
> > > > >
> > > > > __le16 reserved0;
> > > > > __le16 reserved1;
> > > > > __le16 reserved2;
> > > > > __le16 reserved3;
> > > > > __le16 reserved4;
> > > > > __le16 reserved5;
> > > > > __le16 reserved6;
> > > > > __le16 reserved7;
> > > > > __le16 reserved8;
> > > > > __le16 reserved9;
> > > > > __le16 reserved10;
> > > > > __le16 reserved11;
> > > > > __le16 reserved12;
> > > > > __le16 reserved13;
> > > > > __le16 reserved14;
> > > > > };
> > > >
> > > > I still think the container without padding is better. Otherwise
> > > > userspace needs to use the offsetof() trick instead of sizeof().
> > >
> > > In this case, as virtio_pci_common_cfg_ext adds new members in the 
> > > future, we
> > > will add more container structures.
> > >
> > > In that case, I think virtio_pci_common_cfg_v1 is a good name instead.
> >
> > Something like "virtio_pci_common_cfg_notify" might be a little bit better.
>
> Although there is only one notify_data field in this patch, I plan for it to
> look like this after my patch set:
>
> struct virtio_pci_common_cfg_v1 {
> struct virtio_pci_common_cfg cfg;
>
> __le16 queue_notify_data;   /* read-write */
> __le16 queue_reset;   /* read-write */
> }
>
> If we use virtio_pci_common_cfg_notify, then we will get two structures after
> this patch set:
>
> struct virtio_pci_common_cfg_notify {
> struct virtio_pci_common_cfg cfg;
>
> __le16 queue_notify_data;   /* read-write */
> }
>
> struct virtio_pci_common_cfg_reset {
> struct virtio_pci_common_cfg_notify cfg;
>
> __le16 queue_reset;   /* read-write */
> }

Right, this is sub-optimal, and we probably need padding in cfg_notify.
But I can't think of a better idea currently; maybe we can listen to
others' opinions.

But we use something like this for 
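
To illustrate the sizeof()/offsetof() trade-off with toy structures
(these are deliberately not the spec layouts):

#include <stddef.h>

struct cfg    { unsigned int q_size; };                    /* existing uABI */
struct cfg_v1 { struct cfg cfg; unsigned short notify; };  /* container */

/* With an unpadded container, userspace can validate the capability
 * length against sizeof(struct cfg_v1) directly.  With reserved padding
 * instead, sizeof() also covers the reserved tail, and code that wants
 * the end of the currently defined fields is back to offsetof()
 * arithmetic. */
_Static_assert(offsetof(struct cfg_v1, notify) == sizeof(struct cfg),
               "extension fields start where the old struct ends");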

Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data

2022-02-07 Thread Xuan Zhuo
On Tue, 8 Feb 2022 11:24:13 +0800, Jason Wang  wrote:
> On Tue, Feb 8, 2022 at 11:20 AM Xuan Zhuo  wrote:
> >
> > On Tue, 8 Feb 2022 11:03:17 +0800, Jason Wang  wrote:
> > > On Tue, Feb 8, 2022 at 10:17 AM Xuan Zhuo  
> > > wrote:
> > > >
> > > > On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang  
> > > > wrote:
> > > > > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo  
> > > > > wrote:
> > > > > >
> > > > > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang  
> > > > > > wrote:
> > > > > > >
> > > > > > On 2022/1/26 3:35 PM, Xuan Zhuo wrote:
> > > > > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which 
> > > > > > > > comes from
> > > > > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89
> > > > > > > >
> > > > > > > > Since I want to add queue_reset after it, I submitted this 
> > > > > > > > patch first.
> > > > > > > >
> > > > > > > > Signed-off-by: Xuan Zhuo 
> > > > > > > > ---
> > > > > > > >   include/uapi/linux/virtio_pci.h | 1 +
> > > > > > > >   1 file changed, 1 insertion(+)
> > > > > > > >
> > > > > > > > diff --git a/include/uapi/linux/virtio_pci.h 
> > > > > > > > b/include/uapi/linux/virtio_pci.h
> > > > > > > > index 3a86f36d7e3d..492c89f56c6a 100644
> > > > > > > > --- a/include/uapi/linux/virtio_pci.h
> > > > > > > > +++ b/include/uapi/linux/virtio_pci.h
> > > > > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg {
> > > > > > > > __le32 queue_avail_hi;  /* read-write */
> > > > > > > > __le32 queue_used_lo;   /* read-write */
> > > > > > > > __le32 queue_used_hi;   /* read-write */
> > > > > > > > +   __le16 queue_notify_data;   /* read-write */
> > > > > > > >   };
> > > > > > >
> > > > > > >
> > > > > > > So I had the same concern as previous version.
> > > > > > >
> > > > > > > This breaks the uABI, where a program may try to use sizeof(struct
> > > > > > > virtio_pci_common_cfg).
> > > > > > >
> > > > > > > We probably need a container structure here.
> > > > > >
> > > > > > I see, I plan to add a struct like this, do you think it's 
> > > > > > appropriate?
> > > > > >
> > > > > > struct virtio_pci_common_cfg_v1 {
> > > > > > struct virtio_pci_common_cfg cfg;
> > > > > > __le16 queue_notify_data;   /* read-write */
> > > > > > }
> > > > >
> > > > > Something like this but we probably need a better name.
> > > >
> > > >
> > > > how about this?
> > > >
> > > > /* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */
> > > > struct virtio_pci_common_cfg_ext {
> > > > struct virtio_pci_common_cfg cfg;
> > > >
> > > > __le16 queue_notify_data;   /* read-write */
> > > >
> > > > __le16 reserved0;
> > > > __le16 reserved1;
> > > > __le16 reserved2;
> > > > __le16 reserved3;
> > > > __le16 reserved4;
> > > > __le16 reserved5;
> > > > __le16 reserved6;
> > > > __le16 reserved7;
> > > > __le16 reserved8;
> > > > __le16 reserved9;
> > > > __le16 reserved10;
> > > > __le16 reserved11;
> > > > __le16 reserved12;
> > > > __le16 reserved13;
> > > > __le16 reserved14;
> > > > };
> > >
> > > I still think the container without padding is better. Otherwise
> > > userspace needs to use the offsetof() trick instead of sizeof().
> >
> > In this case, as virtio_pci_common_cfg_ext adds new members in the future, 
> > we
> > will add more container structures.
> >
> > In that case, I think virtio_pci_common_cfg_v1 is a good name instead.
>
> Something like "virtio_pci_common_cfg_notify" might be a little bit better.

Although there is only one notify_data field in this patch, I plan for it to
look like this after my patch set:

struct virtio_pci_common_cfg_v1 {
struct virtio_pci_common_cfg cfg;

__le16 queue_notify_data;   /* read-write */
__le16 queue_reset;   /* read-write */
}

If we use virtio_pci_common_cfg_notify, then we will get two structures after
this patch set:

struct virtio_pci_common_cfg_notify {
struct virtio_pci_common_cfg cfg;

__le16 queue_notify_data;   /* read-write */
}

struct virtio_pci_common_cfg_reset {
struct virtio_pci_common_cfg_notify cfg;

__le16 queue_reset;   /* read-write */
}


Thanks.

>
> Thanks
>
> >
> > Thanks.
> >
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > >   /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data

2022-02-07 Thread Jason Wang
On Tue, Feb 8, 2022 at 11:20 AM Xuan Zhuo  wrote:
>
> On Tue, 8 Feb 2022 11:03:17 +0800, Jason Wang  wrote:
> > On Tue, Feb 8, 2022 at 10:17 AM Xuan Zhuo  
> > wrote:
> > >
> > > On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang  wrote:
> > > > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo  
> > > > wrote:
> > > > >
> > > > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang  
> > > > > wrote:
> > > > > >
> > > > > > On 2022/1/26 3:35 PM, Xuan Zhuo wrote:
> > > > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which 
> > > > > > > comes from
> > > > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89
> > > > > > >
> > > > > > > Since I want to add queue_reset after it, I submitted this patch 
> > > > > > > first.
> > > > > > >
> > > > > > > Signed-off-by: Xuan Zhuo 
> > > > > > > ---
> > > > > > >   include/uapi/linux/virtio_pci.h | 1 +
> > > > > > >   1 file changed, 1 insertion(+)
> > > > > > >
> > > > > > > diff --git a/include/uapi/linux/virtio_pci.h 
> > > > > > > b/include/uapi/linux/virtio_pci.h
> > > > > > > index 3a86f36d7e3d..492c89f56c6a 100644
> > > > > > > --- a/include/uapi/linux/virtio_pci.h
> > > > > > > +++ b/include/uapi/linux/virtio_pci.h
> > > > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg {
> > > > > > > __le32 queue_avail_hi;  /* read-write */
> > > > > > > __le32 queue_used_lo;   /* read-write */
> > > > > > > __le32 queue_used_hi;   /* read-write */
> > > > > > > +   __le16 queue_notify_data;   /* read-write */
> > > > > > >   };
> > > > > >
> > > > > >
> > > > > > So I had the same concern as previous version.
> > > > > >
> > > > > > This breaks the uABI, where a program may try to use sizeof(struct
> > > > > > virtio_pci_common_cfg).
> > > > > >
> > > > > > We probably need a container structure here.
> > > > >
> > > > > I see, I plan to add a struct like this, do you think it's 
> > > > > appropriate?
> > > > >
> > > > > struct virtio_pci_common_cfg_v1 {
> > > > > struct virtio_pci_common_cfg cfg;
> > > > > __le16 queue_notify_data;   /* read-write */
> > > > > }
> > > >
> > > > Something like this but we probably need a better name.
> > >
> > >
> > > how about this?
> > >
> > > /* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */
> > > struct virtio_pci_common_cfg_ext {
> > > struct virtio_pci_common_cfg cfg;
> > >
> > > __le16 queue_notify_data;   /* read-write */
> > >
> > > __le16 reserved0;
> > > __le16 reserved1;
> > > __le16 reserved2;
> > > __le16 reserved3;
> > > __le16 reserved4;
> > > __le16 reserved5;
> > > __le16 reserved6;
> > > __le16 reserved7;
> > > __le16 reserved8;
> > > __le16 reserved9;
> > > __le16 reserved10;
> > > __le16 reserved11;
> > > __le16 reserved12;
> > > __le16 reserved13;
> > > __le16 reserved14;
> > > };
> >
> > I still think the container without padding is better. Otherwise
> > userspace needs to use the offsetof() trick instead of sizeof().
>
> In this case, as virtio_pci_common_cfg_ext adds new members in the future, we
> will add more container structures.
>
> In that case, I think virtio_pci_common_cfg_v1 is a good name instead.

Something like "virtio_pci_common_cfg_notify" might be a little bit better.

Thanks

>
> Thanks.
>
>
> >
> > Thanks
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > >
> > > > > > >   /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [PATCH 09/31] vhost-vdpa: Take into account SVQ in vhost_vdpa_set_vring_call

2022-02-07 Thread Jason Wang


On 2022/1/31 11:34 PM, Eugenio Perez Martin wrote:

On Sat, Jan 29, 2022 at 9:06 AM Jason Wang  wrote:


On 2022/1/22 4:27 AM, Eugenio Pérez wrote:

Signed-off-by: Eugenio Pérez 
---
   hw/virtio/vhost-vdpa.c | 20 ++--
   1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 18de14f0fb..029f98feee 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -687,13 +687,29 @@ static int vhost_vdpa_set_vring_kick(struct vhost_dev 
*dev,
   }
   }

-static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
-   struct vhost_vring_file *file)
+static int vhost_vdpa_set_vring_dev_call(struct vhost_dev *dev,
+ struct vhost_vring_file *file)
   {
   trace_vhost_vdpa_set_vring_call(dev, file->index, file->fd);
   return vhost_vdpa_call(dev, VHOST_SET_VRING_CALL, file);
   }

+static int vhost_vdpa_set_vring_call(struct vhost_dev *dev,
+ struct vhost_vring_file *file)
+{
+struct vhost_vdpa *v = dev->opaque;
+
+if (v->shadow_vqs_enabled) {
+int vdpa_idx = vhost_vdpa_get_vq_index(dev, file->index);
+VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs, vdpa_idx);
+
+vhost_svq_set_guest_call_notifier(svq, file->fd);


Two questions here (had similar questions for vring kick):

1) Any reason that we set up the eventfd for vhost-vdpa in
vhost_vdpa_svq_setup() and not here?


I'm not sure what you mean.

The guest->SVQ call and kick fds are set here and at
vhost_vdpa_set_vring_kick. The event notifier handler of the guest ->
SVQ kick_fd is set at vhost_vdpa_set_vring_kick /
vhost_svq_set_svq_kick_fd. The guest -> SVQ call fd has no event
notifier handler since we don't poll it.

On the other hand, the connection SVQ <-> device uses the same fds
from the beginning to the end, and they will not change with, for
example, call fd masking. That's why it's set up from
vhost_vdpa_svq_setup. Delaying it to vhost_vdpa_set_vring_call would
make us add way more logic there.



More logic in the general shadow vq code, but less vhost-vdpa-specific
code, I think.


E.g. we can move the kick setup logic from vhost_vdpa_svq_set_fds() to
here.


Thanks





2) The call could be disabled by using -1 as the fd; I don't see any
code to deal with that.


Right, I didn't take that into account. vhost-kernel also takes -1 as
the kick_fd to unbind, so SVQ can certainly be reworked to take that
into account.

Thanks!


Thanks



+return 0;
+} else {
+return vhost_vdpa_set_vring_dev_call(dev, file);
+}
+}
+
   /**
* Set shadow virtqueue descriptors to the device
*
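
A sketch of the missing fd == -1 handling discussed above, assuming
VHOST_FILE_UNBIND from the vhost uAPI (defined as -1) and assuming
vhost_svq_set_guest_call_notifier() is taught to treat -1 as "no
notifier" (that behaviour is not in this patch yet):

    if (v->shadow_vqs_enabled) {
        int vdpa_idx = vhost_vdpa_get_vq_index(dev, file->index);
        VhostShadowVirtqueue *svq = g_ptr_array_index(v->shadow_vqs,
                                                      vdpa_idx);

        if (file->fd == VHOST_FILE_UNBIND) {
            /* Guest masked the call fd: stop forwarding used-buffer
             * notifications instead of treating -1 as a real eventfd. */
            vhost_svq_set_guest_call_notifier(svq, -1);
        } else {
            vhost_svq_set_guest_call_notifier(svq, file->fd);
        }
        return 0;
    }
    return vhost_vdpa_set_vring_dev_call(dev, file);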



Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data

2022-02-07 Thread Xuan Zhuo
On Tue, 8 Feb 2022 11:03:17 +0800, Jason Wang  wrote:
> On Tue, Feb 8, 2022 at 10:17 AM Xuan Zhuo  wrote:
> >
> > On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang  wrote:
> > > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo  
> > > wrote:
> > > >
> > > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang  
> > > > wrote:
> > > > >
> > > > > On 2022/1/26 3:35 PM, Xuan Zhuo wrote:
> > > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which comes 
> > > > > > from
> > > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89
> > > > > >
> > > > > > Since I want to add queue_reset after it, I submitted this patch 
> > > > > > first.
> > > > > >
> > > > > > Signed-off-by: Xuan Zhuo 
> > > > > > ---
> > > > > >   include/uapi/linux/virtio_pci.h | 1 +
> > > > > >   1 file changed, 1 insertion(+)
> > > > > >
> > > > > > diff --git a/include/uapi/linux/virtio_pci.h 
> > > > > > b/include/uapi/linux/virtio_pci.h
> > > > > > index 3a86f36d7e3d..492c89f56c6a 100644
> > > > > > --- a/include/uapi/linux/virtio_pci.h
> > > > > > +++ b/include/uapi/linux/virtio_pci.h
> > > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg {
> > > > > > __le32 queue_avail_hi;  /* read-write */
> > > > > > __le32 queue_used_lo;   /* read-write */
> > > > > > __le32 queue_used_hi;   /* read-write */
> > > > > > +   __le16 queue_notify_data;   /* read-write */
> > > > > >   };
> > > > >
> > > > >
> > > > > So I had the same concern as previous version.
> > > > >
> > > > > This breaks the uABI, where a program may try to use sizeof(struct
> > > > > virtio_pci_common_cfg).
> > > > >
> > > > > We probably need a container structure here.
> > > >
> > > > I see, I plan to add a struct like this, do you think it's appropriate?
> > > >
> > > > struct virtio_pci_common_cfg_v1 {
> > > > struct virtio_pci_common_cfg cfg;
> > > > __le16 queue_notify_data;   /* read-write */
> > > > }
> > >
> > > Something like this but we probably need a better name.
> >
> >
> > how about this?
> >
> > /* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */
> > struct virtio_pci_common_cfg_ext {
> > struct virtio_pci_common_cfg cfg;
> >
> > __le16 queue_notify_data;   /* read-write */
> >
> > __le16 reserved0;
> > __le16 reserved1;
> > __le16 reserved2;
> > __le16 reserved3;
> > __le16 reserved4;
> > __le16 reserved5;
> > __le16 reserved6;
> > __le16 reserved7;
> > __le16 reserved8;
> > __le16 reserved9;
> > __le16 reserved10;
> > __le16 reserved11;
> > __le16 reserved12;
> > __le16 reserved13;
> > __le16 reserved14;
> > };
>
> I still think the container without padding is better. Otherwise
> userspace needs to use the offsetof() trick instead of sizeof().

In this case, as virtio_pci_common_cfg_ext adds new members in the future, we
will add more container structures.

In that case, I think virtio_pci_common_cfg_v1 is a good name instead.

Thanks.


>
> Thanks
>
> >
> > Thanks
> >
> > >
> > > Thanks
> > >
> > > >
> > > > Thanks.
> > > >
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > >
> > > > > >   /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */
> > > > >
> > > >
> > >
> >
>

Re: [PATCH v3 00/17] virtio pci support VIRTIO_F_RING_RESET

2022-02-07 Thread Xuan Zhuo
On Tue, 8 Feb 2022 10:59:48 +0800, Jason Wang  wrote:
>
> On 2022/2/7 2:02 PM, Xuan Zhuo wrote:
> > On Mon, 7 Feb 2022 11:39:36 +0800, Jason Wang  wrote:
> >> On Wed, Jan 26, 2022 at 3:35 PM Xuan Zhuo  
> >> wrote:
> >>> 
> >>> The virtio spec already supports the virtio queue reset function. This 
> >>> patch set
> >>> is to add this function to the kernel. The relevant virtio spec 
> >>> information is
> >>> here:
> >>>
> >>>  https://github.com/oasis-tcs/virtio-spec/issues/124
> >>>
> >>> Also, regarding MMIO support for queue reset, I plan to support it after
> >>> this patch set is accepted.
> >>>
> >>> #14-#17 is the disable/enable function of rx/tx pair implemented by 
> >>> virtio-net
> >>> using the new helper.
> >> One thing that came to mind is the steering. E.g if we disable an RX,
> >> should we stop steering packets to that queue?
> > Yes, we should reselect a queue.
> >
> > Thanks.
>
>
> Maybe a spec patch for that?

Yes, I also realized this. Although virtio-net's disable/enable is implemented
based on queue reset, virtio-net still has to define its own flag and spell
out some implementation details.

I'll think about it and post a spec patch.

Thanks.

>
> Thanks
>
>
> >
> >> Thanks
> >>
> >>> This function is not currently referenced by other
> >>> functions. It is more to show the usage of the new helper; I'm not sure
> >>> whether they are going to be merged together.
> >>>
> >>> Please review. Thanks.
> >>>
> >>> v3:
> >>>1. keep vq, irq unreleased
> >>>
> >>> Xuan Zhuo (17):
> >>>virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
> >>>virtio: queue_reset: add VIRTIO_F_RING_RESET
> >>>virtio: queue_reset: struct virtio_config_ops add callbacks for
> >>>  queue_reset
> >>>virtio: queue_reset: add helper
> >>>virtio_ring: queue_reset: extract the release function of the vq ring
> >>>virtio_ring: queue_reset: split: add __vring_init_virtqueue()
> >>>virtio_ring: queue_reset: split: support enable reset queue
> >>>virtio_ring: queue_reset: packed: support enable reset queue
> >>>virtio_ring: queue_reset: add vring_reset_virtqueue()
> >>>virtio_pci: queue_reset: update struct virtio_pci_common_cfg and
> >>>  option functions
> >>>virtio_pci: queue_reset: release vq by vp_dev->vqs
> >>>virtio_pci: queue_reset: setup_vq use vring_setup_virtqueue()
> >>>virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
> >>>virtio_net: virtnet_tx_timeout() fix style
> >>>virtio_net: virtnet_tx_timeout() stop ref sq->vq
> >>>virtio_net: split free_unused_bufs()
> >>>virtio_net: support pair disable/enable
> >>>
> >>>   drivers/net/virtio_net.c   | 220 ++---
> >>>   drivers/virtio/virtio_pci_common.c |  62 ---
> >>>   drivers/virtio/virtio_pci_common.h |  11 +-
> >>>   drivers/virtio/virtio_pci_legacy.c |   5 +-
> >>>   drivers/virtio/virtio_pci_modern.c | 120 +-
> >>>   drivers/virtio/virtio_pci_modern_dev.c |  28 
> >>>   drivers/virtio/virtio_ring.c   | 144 +++-
> >>>   include/linux/virtio.h |   1 +
> >>>   include/linux/virtio_config.h  |  75 -
> >>>   include/linux/virtio_pci_modern.h  |   2 +
> >>>   include/linux/virtio_ring.h|  42 +++--
> >>>   include/uapi/linux/virtio_config.h |   7 +-
> >>>   include/uapi/linux/virtio_pci.h|   2 +
> >>>   13 files changed, 618 insertions(+), 101 deletions(-)
> >>>
> >>> --
> >>> 2.31.0
> >>>
>

Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data

2022-02-07 Thread Jason Wang
On Tue, Feb 8, 2022 at 10:17 AM Xuan Zhuo  wrote:
>
> On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang  wrote:
> > On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo  wrote:
> > >
> > > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang  wrote:
> > > >
> > > > On 2022/1/26 3:35 PM, Xuan Zhuo wrote:
> > > > > Add queue_notify_data in struct virtio_pci_common_cfg, which comes 
> > > > > from
> > > > > here https://github.com/oasis-tcs/virtio-spec/issues/89
> > > > >
> > > > > Since I want to add queue_reset after it, I submitted this patch 
> > > > > first.
> > > > >
> > > > > Signed-off-by: Xuan Zhuo 
> > > > > ---
> > > > >   include/uapi/linux/virtio_pci.h | 1 +
> > > > >   1 file changed, 1 insertion(+)
> > > > >
> > > > > diff --git a/include/uapi/linux/virtio_pci.h 
> > > > > b/include/uapi/linux/virtio_pci.h
> > > > > index 3a86f36d7e3d..492c89f56c6a 100644
> > > > > --- a/include/uapi/linux/virtio_pci.h
> > > > > +++ b/include/uapi/linux/virtio_pci.h
> > > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg {
> > > > > __le32 queue_avail_hi;  /* read-write */
> > > > > __le32 queue_used_lo;   /* read-write */
> > > > > __le32 queue_used_hi;   /* read-write */
> > > > > +   __le16 queue_notify_data;   /* read-write */
> > > > >   };
> > > >
> > > >
> > > > So I had the same concern as previous version.
> > > >
> > > > This breaks the uABI, where a program may try to use sizeof(struct
> > > > virtio_pci_common_cfg).
> > > >
> > > > We probably need a container structure here.
> > >
> > > I see, I plan to add a struct like this, do you think it's appropriate?
> > >
> > > struct virtio_pci_common_cfg_v1 {
> > > struct virtio_pci_common_cfg cfg;
> > > __le16 queue_notify_data;   /* read-write */
> > > }
> >
> > Something like this but we probably need a better name.
>
>
> how about this?
>
> /* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */
> struct virtio_pci_common_cfg_ext {
> struct virtio_pci_common_cfg cfg;
>
> __le16 queue_notify_data;   /* read-write */
>
> __le16 reserved0;
> __le16 reserved1;
> __le16 reserved2;
> __le16 reserved3;
> __le16 reserved4;
> __le16 reserved5;
> __le16 reserved6;
> __le16 reserved7;
> __le16 reserved8;
> __le16 reserved9;
> __le16 reserved10;
> __le16 reserved11;
> __le16 reserved12;
> __le16 reserved13;
> __le16 reserved14;
> };

I still think the container without padding is better. Otherwise
userspace needs to use the offsetof() trick instead of sizeof().

Thanks

>
> Thanks
>
> >
> > Thanks
> >
> > >
> > > Thanks.
> > >
> > > >
> > > > Thanks
> > > >
> > > >
> > > > >
> > > > >   /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */
> > > >
> > >
> >
>


Re: [PATCH v3 03/17] virtio: queue_reset: struct virtio_config_ops add callbacks for queue_reset

2022-02-07 Thread Xuan Zhuo
On Tue, 8 Feb 2022 10:58:45 +0800, Jason Wang  wrote:
>
> On 2022/2/7 3:19 PM, Xuan Zhuo wrote:
> > On Mon, 7 Feb 2022 14:45:02 +0800, Jason Wang  wrote:
> >> On 2022/1/26 3:35 PM, Xuan Zhuo wrote:
> >>> Performing reset on a queue is divided into two steps:
> >>>
> >>> 1. reset_vq: reset one vq
> >>> 2. enable_reset_vq: re-enable the reset queue
> >>>
> >>> In the first step, these tasks will be completed:
> >>>   1. notify the hardware queue to reset
> >>>   2. recycle the buffer from vq
> >>>   3. release the ring of the vq
> >>>
> >>> The second step is similar to find vqs,
> >>
> >> Not sure, since find_vqs will usually try to allocate interrupts.
> >>
> >>
> > Yes.
> >
> >
> >>>passing parameters callback and
> >>> name, etc. Based on the original vq, the ring is re-allocated and
> >>> configured to the backend.
> >>
> >> I wonder whether we really have such a requirement.
> >>
> >> For example, do we really have a use case that may change:
> >>
> >> vq callback, ctx, ring_num or even re-create the virtqueue?
> > 1. virtqueue is not recreated
> > 2. ring_num can be used to change the ring size via ethtool -G
>
>
> It looks to me we don't support this right now.

I am trying to implement this function based on virtio queue reset. It will be
added in the next version.

>
>
> >
> > There is really no scenario that would modify the vq callback, ctx, or name.
> >
> > Do you mean we should keep using the old ones instead of adding these parameters?
>
>
> Yes, I think for the driver we need to implement the functionality that is
> needed for the first user (e.g. AF_XDP). If there's no use case, we can leave
> those extensions for the future.

OK.

Thanks.


>
> Thanks
>
>
> >
> > Thanks.
> >
> >> Thanks
> >>
> >>
> >>> So add two callbacks reset_vq, enable_reset_vq to struct
> >>> virtio_config_ops.
> >>>
> >>> Add a structure for passing parameters. This will facilitate subsequent
> >>> expansion of the parameters of enable reset vq.
> >>> There is currently only one extended parameter, ring_num.
> >>>
> >>> Signed-off-by: Xuan Zhuo 
> >>> ---
> >>>include/linux/virtio_config.h | 43 ++-
> >>>1 file changed, 42 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
> >>> index 4d107ad31149..51dd8461d1b6 100644
> >>> --- a/include/linux/virtio_config.h
> >>> +++ b/include/linux/virtio_config.h
> >>> @@ -16,6 +16,44 @@ struct virtio_shm_region {
> >>>   u64 len;
> >>>};
> >>>
> >>> +typedef void vq_callback_t(struct virtqueue *);
> >>> +
> >>> +/* virtio_reset_vq: specify parameters for queue_reset
> >>> + *
> >>> + *   vdev: the device
> >>> + *   queue_index: the queue index
> >>> + *
> >>> + *   free_unused_cb: callback to free unused bufs
> >>> + *   data: used by free_unused_cb
> >>> + *
> >>> + *   callback: callback for the virtqueue, NULL for vq that do not 
> >>> need a
> >>> + * callback
> >>> + *   name: virtqueue names (mainly for debugging), NULL for vq 
> >>> unused by
> >>> + * driver
> >>> + *   ctx: ctx
> >>> + *
> >>> + *   ring_num: specify ring num for the vq to be re-enabled. 0 means 
> >>> use the
> >>> + * default value. MUST be a power of 2.
> >>> + */
> >>> +struct virtio_reset_vq;
> >>> +typedef void vq_reset_callback_t(struct virtio_reset_vq *param, void 
> >>> *buf);
> >>> +struct virtio_reset_vq {
> >>> + struct virtio_device *vdev;
> >>> + u16 queue_index;
> >>> +
> >>> + /* reset vq param */
> >>> + vq_reset_callback_t *free_unused_cb;
> >>> + void *data;
> >>> +
> >>> + /* enable reset vq param */
> >>> + vq_callback_t *callback;
> >>> + const char *name;
> >>> + const bool *ctx;
> >>> +
> >>> + /* ext enable reset vq param */
> >>> + u16 ring_num;
> >>> +};
> >>> +
> >>>/**
> >>> * virtio_config_ops - operations for configuring a virtio device
> >>> * Note: Do not assume that a transport implements all of the 
> >>> operations
> >>> @@ -74,8 +112,9 @@ struct virtio_shm_region {
> >>> * @set_vq_affinity: set the affinity for a virtqueue (optional).
> >>> * @get_vq_affinity: get the affinity for a virtqueue (optional).
> >>> * @get_shm_region: get a shared memory region based on the index.
> >>> + * @reset_vq: reset a queue individually
> >>> + * @enable_reset_vq: enable a reset queue
> >>> */
> >>> -typedef void vq_callback_t(struct virtqueue *);
> >>>struct virtio_config_ops {
> >>>   void (*enable_cbs)(struct virtio_device *vdev);
> >>>   void (*get)(struct virtio_device *vdev, unsigned offset,
> >>> @@ -100,6 +139,8 @@ struct virtio_config_ops {
> >>>   int index);
> >>>   bool (*get_shm_region)(struct virtio_device *vdev,
> >>>  struct virtio_shm_region *region, u8 id);
> >>> + int (*reset_vq)(struct virtio_reset_vq *param);
> >>> + struct virtqueue *(*enable_reset_vq)(struct virtio_reset_vq *param);
> >>>};
> 
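
Tying the two ops together, a sketch of how a driver might use them end
to end with the struct virtio_reset_vq fields quoted above (the helper
name and error handling are illustrative, PTR_ERR_OR_ZERO() is from
<linux/err.h>, and this assumes enable_reset_vq() reports failure via
ERR_PTR()):

static int virtnet_reset_and_reenable_vq(struct virtio_device *vdev,
                                         u16 index,
                                         vq_reset_callback_t *free_cb,
                                         void *data)
{
    struct virtio_reset_vq param = {
        .vdev = vdev,
        .queue_index = index,
        .free_unused_cb = free_cb,  /* recycles unused buffers */
        .data = data,
        .ring_num = 0,              /* 0: keep the default ring size */
    };
    struct virtqueue *vq;
    int err;

    err = vdev->config->reset_vq(&param);       /* step 1: reset */
    if (err)
        return err;

    vq = vdev->config->enable_reset_vq(&param); /* step 2: re-enable */
    return PTR_ERR_OR_ZERO(vq);
}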

Re: [PATCH v3 00/17] virtio pci support VIRTIO_F_RING_RESET

2022-02-07 Thread Jason Wang


On 2022/2/7 2:02 PM, Xuan Zhuo wrote:

On Mon, 7 Feb 2022 11:39:36 +0800, Jason Wang  wrote:

On Wed, Jan 26, 2022 at 3:35 PM Xuan Zhuo  wrote:


The virtio spec already supports the virtio queue reset function. This patch set
is to add this function to the kernel. The relevant virtio spec information is
here:

 https://github.com/oasis-tcs/virtio-spec/issues/124

Also, regarding MMIO support for queue reset, I plan to support it after this
patch set is accepted.

#14-#17 is the disable/enable function of rx/tx pair implemented by virtio-net
using the new helper.

One thing that came to mind is the steering. E.g if we disable an RX,
should we stop steering packets to that queue?

Yes, we should reselect a queue.

Thanks.



Maybe a spec patch for that?

Thanks





Thanks


This function is not currently referenced by other
functions. It is more to show the usage of the new helper; I'm not sure whether
they are going to be merged together.

Please review. Thanks.

v3:
   1. keep vq, irq unreleased

Xuan Zhuo (17):
   virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
   virtio: queue_reset: add VIRTIO_F_RING_RESET
   virtio: queue_reset: struct virtio_config_ops add callbacks for
 queue_reset
   virtio: queue_reset: add helper
   virtio_ring: queue_reset: extract the release function of the vq ring
   virtio_ring: queue_reset: split: add __vring_init_virtqueue()
   virtio_ring: queue_reset: split: support enable reset queue
   virtio_ring: queue_reset: packed: support enable reset queue
   virtio_ring: queue_reset: add vring_reset_virtqueue()
   virtio_pci: queue_reset: update struct virtio_pci_common_cfg and
 option functions
   virtio_pci: queue_reset: release vq by vp_dev->vqs
   virtio_pci: queue_reset: setup_vq use vring_setup_virtqueue()
   virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
   virtio_net: virtnet_tx_timeout() fix style
   virtio_net: virtnet_tx_timeout() stop ref sq->vq
   virtio_net: split free_unused_bufs()
   virtio_net: support pair disable/enable

  drivers/net/virtio_net.c   | 220 ++---
  drivers/virtio/virtio_pci_common.c |  62 ---
  drivers/virtio/virtio_pci_common.h |  11 +-
  drivers/virtio/virtio_pci_legacy.c |   5 +-
  drivers/virtio/virtio_pci_modern.c | 120 +-
  drivers/virtio/virtio_pci_modern_dev.c |  28 
  drivers/virtio/virtio_ring.c   | 144 +++-
  include/linux/virtio.h |   1 +
  include/linux/virtio_config.h  |  75 -
  include/linux/virtio_pci_modern.h  |   2 +
  include/linux/virtio_ring.h|  42 +++--
  include/uapi/linux/virtio_config.h |   7 +-
  include/uapi/linux/virtio_pci.h|   2 +
  13 files changed, 618 insertions(+), 101 deletions(-)

--
2.31.0




Re: [PATCH v3 03/17] virtio: queue_reset: struct virtio_config_ops add callbacks for queue_reset

2022-02-07 Thread Jason Wang


在 2022/2/7 下午3:19, Xuan Zhuo 写道:

On Mon, 7 Feb 2022 14:45:02 +0800, Jason Wang  wrote:

在 2022/1/26 下午3:35, Xuan Zhuo 写道:

Performing reset on a queue is divided into two steps:

1. reset_vq: reset one vq
2. enable_reset_vq: re-enable the reset queue

In the first step, these tasks will be completed:
  1. notify the hardware queue to reset
  2. recycle the buffer from vq
  3. release the ring of the vq

The second step is similar to find vqs,


Not sure, since find_vqs will usually try to allocate interrupts.



Yes.



   passing parameters such as the callback and
name. Based on the original vq, the ring is re-allocated and
configured in the backend.


I wonder whether we really have such a requirement.

For example, do we really have a use case that may change:

vq callback, ctx, ring_num or even re-create the virtqueue?

1. virtqueue is not recreated
2. ring_num can be used to modify ring num by ethtool -G



It looks to me like we don't support this right now.




There is really no scenario where the vq callback, ctx, or name would be modified.

Do you mean we still use the old one instead of adding these parameters?



Yes, I think for the driver we need to implement only the functionality that is
needed by the first user (e.g. AF_XDP). If there's no use case, we can leave
those extensions for the future.


Thanks




Thanks.


Thanks



So add two callbacks reset_vq, enable_reset_vq to struct
virtio_config_ops.

Add a structure for passing parameters. This will facilitate subsequent
expansion of the parameters of enable reset vq.
There is currently only one default extended parameter ring_num.

Signed-off-by: Xuan Zhuo 
---
   include/linux/virtio_config.h | 43 ++-
   1 file changed, 42 insertions(+), 1 deletion(-)

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 4d107ad31149..51dd8461d1b6 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -16,6 +16,44 @@ struct virtio_shm_region {
u64 len;
   };

+typedef void vq_callback_t(struct virtqueue *);
+
+/* virtio_reset_vq: specify parameters for queue_reset
+ *
+ * vdev: the device
+ * queue_index: the queue index
+ *
+ * free_unused_cb: callback to free unused bufs
+ * data: used by free_unused_cb
+ *
+ * callback: callback for the virtqueue, NULL for vqs that do not need a
+ *   callback
+ * name: virtqueue names (mainly for debugging), NULL for vq unused by
+ *   driver
+ * ctx: per-virtqueue context flag (same meaning as in find_vqs)
+ *
+ * ring_num: specify ring num for the vq to be re-enabled. 0 means use the
+ *   default value. MUST be a power of 2.
+ */
+struct virtio_reset_vq;
+typedef void vq_reset_callback_t(struct virtio_reset_vq *param, void *buf);
+struct virtio_reset_vq {
+   struct virtio_device *vdev;
+   u16 queue_index;
+
+   /* reset vq param */
+   vq_reset_callback_t *free_unused_cb;
+   void *data;
+
+   /* enable reset vq param */
+   vq_callback_t *callback;
+   const char *name;
+   const bool *ctx;
+
+   /* ext enable reset vq param */
+   u16 ring_num;
+};
+
   /**
* virtio_config_ops - operations for configuring a virtio device
* Note: Do not assume that a transport implements all of the operations
@@ -74,8 +112,9 @@ struct virtio_shm_region {
* @set_vq_affinity: set the affinity for a virtqueue (optional).
* @get_vq_affinity: get the affinity for a virtqueue (optional).
* @get_shm_region: get a shared memory region based on the index.
+ * @reset_vq: reset a queue individually
+ * @enable_reset_vq: enable a reset queue
*/
-typedef void vq_callback_t(struct virtqueue *);
   struct virtio_config_ops {
void (*enable_cbs)(struct virtio_device *vdev);
void (*get)(struct virtio_device *vdev, unsigned offset,
@@ -100,6 +139,8 @@ struct virtio_config_ops {
int index);
bool (*get_shm_region)(struct virtio_device *vdev,
   struct virtio_shm_region *region, u8 id);
+   int (*reset_vq)(struct virtio_reset_vq *param);
+   struct virtqueue *(*enable_reset_vq)(struct virtio_reset_vq *param);
   };

   /* If driver didn't advertise the feature, it will never appear. */
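
For orientation, a sketch of how a driver could drive the two new ops to
pause and resume a single queue. This is an illustration only, not part of
the patch: drop_unused() and pause_and_resume_vq() are hypothetical names,
the buffer-freeing logic is a placeholder, and the real virtio-net glue is
what patches #14-#17 of the series provide.

	static void drop_unused(struct virtio_reset_vq *param, void *buf)
	{
		/* param->data carries whatever context the driver passed in. */
		kfree(buf);	/* placeholder: free according to the buffer type */
	}

	static int pause_and_resume_vq(struct virtio_device *vdev, u16 index)
	{
		struct virtio_reset_vq param = {
			.vdev		= vdev,
			.queue_index	= index,
			.free_unused_cb	= drop_unused,
			.data		= NULL,
			.callback	= NULL,	/* no callback in this example */
			.name		= "demo_vq",
			.ctx		= NULL,
			.ring_num	= 0,	/* 0: keep the default ring size */
		};
		struct virtqueue *vq;
		int err;

		/* Step 1: stop the queue, recycle buffers, release the ring. */
		err = vdev->config->reset_vq(&param);
		if (err)
			return err;

		/* Step 2: re-allocate the ring and re-enable the queue. */
		vq = vdev->config->enable_reset_vq(&param);
		return IS_ERR(vq) ? PTR_ERR(vq) : 0;
	}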



Re: [PATCH v3 13/17] virtio_pci: queue_reset: support VIRTIO_F_RING_RESET

2022-02-07 Thread Jason Wang
On Mon, Feb 7, 2022 at 4:19 PM Xuan Zhuo  wrote:
>
> On Mon, 7 Feb 2022 14:57:13 +0800, Jason Wang  wrote:
> >
> > 在 2022/1/26 下午3:35, Xuan Zhuo 写道:
> > > This patch implements virtio pci support for QUEUE RESET.
> > >
> > > Performing reset on a queue is divided into two steps:
> > >
> > > 1. reset_vq: reset one vq
> > > 2. enable_reset_vq: re-enable the reset queue
> > >
> > > In the first step, these tasks will be completed:
> > > 1. notify the hardware queue to reset
> > > 2. recycle the buffer from vq
> > > 3. release the ring of the vq
> > >
> > > The process of enable reset vq:
> > >  vp_modern_enable_reset_vq()
> > >  vp_enable_reset_vq()
> > >  __vp_setup_vq()
> > >  setup_vq()
> > >  vring_setup_virtqueue()
> > >
> > > In this process, we add two parameters, vq and num, and finally pass
> > > them to vring_setup_virtqueue(), reusing the original info and vq.
> > >
> > > Signed-off-by: Xuan Zhuo 
> > > ---
> > >   drivers/virtio/virtio_pci_common.c |  36 +++
> > >   drivers/virtio/virtio_pci_common.h |   5 ++
> > >   drivers/virtio/virtio_pci_modern.c | 100 +
> > >   3 files changed, 128 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/drivers/virtio/virtio_pci_common.c 
> > > b/drivers/virtio/virtio_pci_common.c
> > > index c02936d29a31..ad21638fbf66 100644
> > > --- a/drivers/virtio/virtio_pci_common.c
> > > +++ b/drivers/virtio/virtio_pci_common.c
> > > @@ -205,23 +205,28 @@ static int vp_request_msix_vectors(struct 
> > > virtio_device *vdev, int nvectors,
> > > return err;
> > >   }
> > >
> > > -static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, 
> > > unsigned index,
> > > -void (*callback)(struct virtqueue *vq),
> > > -const char *name,
> > > -bool ctx,
> > > -u16 msix_vec, u16 num)
> > > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index,
> > > + void (*callback)(struct virtqueue *vq),
> > > + const char *name,
> > > + bool ctx,
> > > + u16 msix_vec, u16 num)
> > >   {
> > > struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> > > -   struct virtio_pci_vq_info *info = kmalloc(sizeof *info, GFP_KERNEL);
> > > +   struct virtio_pci_vq_info *info;
> > > struct virtqueue *vq;
> > > unsigned long flags;
> > >
> > > -   /* fill out our structure that represents an active queue */
> > > -   if (!info)
> > > -   return ERR_PTR(-ENOMEM);
> > > +   info = vp_dev->vqs[index];
> > > +   if (!info) {
> > > +   info = kzalloc(sizeof *info, GFP_KERNEL);
> > > +
> > > +   /* fill out our structure that represents an active queue */
> > > +   if (!info)
> > > +   return ERR_PTR(-ENOMEM);
> > > +   }
> > >
> > > vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, ctx,
> > > - msix_vec, NULL, num);
> > > + msix_vec, info->vq, num);
> > > if (IS_ERR(vq))
> > > goto out_info;
> > >
> > > @@ -238,6 +243,9 @@ static struct virtqueue *vp_setup_vq(struct 
> > > virtio_device *vdev, unsigned index,
> > > return vq;
> > >
> > >   out_info:
> > > +   if (info->vq && info->vq->reset)
> > > +   return vq;
> > > +
> > > kfree(info);
> > > return vq;
> > >   }
> > > @@ -248,9 +256,11 @@ static void vp_del_vq(struct virtqueue *vq)
> > > struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index];
> > > unsigned long flags;
> > >
> > > -   spin_lock_irqsave(&vp_dev->lock, flags);
> > > -   list_del(&info->node);
> > > -   spin_unlock_irqrestore(&vp_dev->lock, flags);
> > > +   if (!vq->reset) {
> > > +   spin_lock_irqsave(&vp_dev->lock, flags);
> > > +   list_del(&info->node);
> > > +   spin_unlock_irqrestore(&vp_dev->lock, flags);
> > > +   }
> > >
> > > vp_dev->del_vq(info);
> > > kfree(info);
> > > diff --git a/drivers/virtio/virtio_pci_common.h 
> > > b/drivers/virtio/virtio_pci_common.h
> > > index 65db92245e41..c1d15f7c0be4 100644
> > > --- a/drivers/virtio/virtio_pci_common.h
> > > +++ b/drivers/virtio/virtio_pci_common.h
> > > @@ -119,6 +119,11 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned 
> > > nvqs,
> > > struct virtqueue *vqs[], vq_callback_t *callbacks[],
> > > const char * const names[], const bool *ctx,
> > > struct irq_affinity *desc);
> > > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index,
> > > + void (*callback)(struct virtqueue *vq),
> > > + const char *name,
> > > + bool ctx,
> > > + u16 msix_vec, u16 num);
> > >   const char *vp_bus_name(struct virtio_device *vdev);
> > >
> > >   /* Setup the affinity for a virtqueue:
> > > 

Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data

2022-02-07 Thread Xuan Zhuo
On Mon, 7 Feb 2022 16:06:15 +0800, Jason Wang  wrote:
> On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo  wrote:
> >
> > On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang  wrote:
> > >
> > > 在 2022/1/26 下午3:35, Xuan Zhuo 写道:
> > > > Add queue_notify_data in struct virtio_pci_common_cfg, which comes from
> > > > here https://github.com/oasis-tcs/virtio-spec/issues/89
> > > >
> > > > Since I want to add queue_reset after it, I submitted this patch first.
> > > >
> > > > Signed-off-by: Xuan Zhuo 
> > > > ---
> > > >   include/uapi/linux/virtio_pci.h | 1 +
> > > >   1 file changed, 1 insertion(+)
> > > >
> > > > diff --git a/include/uapi/linux/virtio_pci.h 
> > > > b/include/uapi/linux/virtio_pci.h
> > > > index 3a86f36d7e3d..492c89f56c6a 100644
> > > > --- a/include/uapi/linux/virtio_pci.h
> > > > +++ b/include/uapi/linux/virtio_pci.h
> > > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg {
> > > > __le32 queue_avail_hi;  /* read-write */
> > > > __le32 queue_used_lo;   /* read-write */
> > > > __le32 queue_used_hi;   /* read-write */
> > > > +   __le16 queue_notify_data;   /* read-write */
> > > >   };
> > >
> > >
> > > So I had the same concern as previous version.
> > >
> > > This breaks uABI where program may try to use sizeof(struct
> > > virtio_pci_common_cfg).
> > >
> > > We probably need a container structure here.
> >
> > I see, I plan to add a struct like this, do you think it's appropriate?
> >
> > struct virtio_pci_common_cfg_v1 {
> > struct virtio_pci_common_cfg cfg;
> > __le16 queue_notify_data;   /* read-write */
> > }
>
> Something like this but we probably need a better name.


how about this?

/* Ext Fields in VIRTIO_PCI_CAP_COMMON_CFG: */
struct virtio_pci_common_cfg_ext {
struct virtio_pci_common_cfg cfg;

__le16 queue_notify_data;   /* read-write */

__le16 reserved0;
__le16 reserved1;
__le16 reserved2;
__le16 reserved3;
__le16 reserved4;
__le16 reserved5;
__le16 reserved6;
__le16 reserved7;
__le16 reserved8;
__le16 reserved9;
__le16 reserved10;
__le16 reserved11;
__le16 reserved12;
__le16 reserved13;
__le16 reserved14;
};
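
For illustration (not part of the proposal): because the original struct is
embedded as the first member, existing users of sizeof(struct
virtio_pci_common_cfg) keep working, and new code reaches the new field
through the container. A minimal sketch, with cfg_base standing in for the
mapped common-cfg area:

	struct virtio_pci_common_cfg_ext __iomem *cfg = cfg_base;

	/* Old fields keep their offsets via the embedded cfg member... */
	u16 queue_size = ioread16(&cfg->cfg.queue_size);
	/* ...and the new field sits directly after them. */
	u16 notify_data = ioread16(&cfg->queue_notify_data);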

Thanks

>
> Thanks
>
> >
> > Thanks.
> >
> > >
> > > THanks
> > >
> > >
> > > >
> > > >   /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */
> > >
> >
>

[PATCH v3 8/8] VMCI: dma dg: add support for DMA datagrams receive

2022-02-07 Thread Jorgen Hansen
Use the DMA based receive operation instead of the ioread8_rep
based datagram receive when DMA datagrams are supported.

In the receive operation, configure the header to point to the
page-aligned VMCI_MAX_DG_SIZE part of the receive buffer, using
an S/G entry in the header. This ensures that the existing
dispatch routine can be used with little modification. Initiate
the receive by writing the lower 32 bits of the buffer address
to the VMCI_DATA_IN_LOW_ADDR register, and wait on a wait queue
for the device to change the busy flag.

The existing dispatch routine for received datagrams is reused
for the DMA datagrams with a few modifications:
- the receive buffer is always the maximum size for DMA datagrams
  (IO ports would try with a shorter buffer first to reduce
  overhead of the ioread8_rep operation).
- for DMA datagrams, datagrams are provided contiguous in the
  buffer as opposed to IO port datagrams, where they can start
  on any page boundary

Reviewed-by: Vishnu Dasa 
Signed-off-by: Jorgen Hansen 
---
 drivers/misc/vmw_vmci/vmci_guest.c | 103 ++---
 1 file changed, 79 insertions(+), 24 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_guest.c 
b/drivers/misc/vmw_vmci/vmci_guest.c
index bf524217914e..aa61a687b3e2 100644
--- a/drivers/misc/vmw_vmci/vmci_guest.c
+++ b/drivers/misc/vmw_vmci/vmci_guest.c
@@ -58,6 +58,7 @@ struct vmci_guest_device {
 
struct tasklet_struct datagram_tasklet;
struct tasklet_struct bm_tasklet;
+   struct wait_queue_head inout_wq;
 
void *data_buffer;
dma_addr_t data_buffer_base;
@@ -115,6 +116,36 @@ static void vmci_write_reg(struct vmci_guest_device *dev, 
u32 val, u32 reg)
iowrite32(val, dev->iobase + reg);
 }
 
+static void vmci_read_data(struct vmci_guest_device *vmci_dev,
+  void *dest, size_t size)
+{
+   if (vmci_dev->mmio_base == NULL)
+   ioread8_rep(vmci_dev->iobase + VMCI_DATA_IN_ADDR,
+   dest, size);
+   else {
+   /*
+* For DMA datagrams, the data_buffer will contain the header 
on the
+* first page, followed by the incoming datagram(s) on the 
following
+* pages. The header uses an S/G element immediately following 
the
+* header on the first page to point to the data area.
+*/
+   struct vmci_data_in_out_header *buffer_header = 
vmci_dev->data_buffer;
+   struct vmci_sg_elem *sg_array = (struct vmci_sg_elem 
*)(buffer_header + 1);
+   size_t buffer_offset = dest - vmci_dev->data_buffer;
+
+   buffer_header->opcode = 1;
+   buffer_header->size = 1;
+   buffer_header->busy = 0;
+   sg_array[0].addr = vmci_dev->data_buffer_base + buffer_offset;
+   sg_array[0].size = size;
+
+   vmci_write_reg(vmci_dev, 
lower_32_bits(vmci_dev->data_buffer_base),
+  VMCI_DATA_IN_LOW_ADDR);
+
+   wait_event(vmci_dev->inout_wq, buffer_header->busy == 1);
+   }
+}
+
 static int vmci_write_data(struct vmci_guest_device *dev,
   struct vmci_datagram *dg)
 {
@@ -261,15 +292,17 @@ static int vmci_check_host_caps(struct pci_dev *pdev)
 }
 
 /*
- * Reads datagrams from the data in port and dispatches them. We
- * always start reading datagrams into only the first page of the
- * datagram buffer. If the datagrams don't fit into one page, we
- * use the maximum datagram buffer size for the remainder of the
- * invocation. This is a simple heuristic for not penalizing
- * small datagrams.
+ * Reads datagrams from the device and dispatches them. For IO port
+ * based access to the device, we always start reading datagrams into
+ * only the first page of the datagram buffer. If the datagrams don't
+ * fit into one page, we use the maximum datagram buffer size for the
+ * remainder of the invocation. This is a simple heuristic for not
+ * penalizing small datagrams. For DMA-based datagrams, we always
+ * use the maximum datagram buffer size, since there is no performance
+ * penalty for doing so.
  *
  * This function assumes that it has exclusive access to the data
- * in port for the duration of the call.
+ * in register(s) for the duration of the call.
  */
 static void vmci_dispatch_dgs(unsigned long data)
 {
@@ -277,23 +310,41 @@ static void vmci_dispatch_dgs(unsigned long data)
u8 *dg_in_buffer = vmci_dev->data_buffer;
struct vmci_datagram *dg;
size_t dg_in_buffer_size = VMCI_MAX_DG_SIZE;
-   size_t current_dg_in_buffer_size = PAGE_SIZE;
+   size_t current_dg_in_buffer_size;
size_t remaining_bytes;
+   bool is_io_port = vmci_dev->mmio_base == NULL;
 
BUILD_BUG_ON(VMCI_MAX_DG_SIZE < PAGE_SIZE);
 
-   ioread8_rep(vmci_dev->iobase + VMCI_DATA_IN_ADDR,
-   vmci_dev->data_buffer, 
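
For context, the wait_event() in vmci_read_data() above relies on the DMA
datagram interrupt waking the queue once the device flips the busy flag. A
minimal sketch of that counterpart, assuming the handler only needs to wake
the waiter:

	static irqreturn_t vmci_interrupt_dma_datagram(int irq, void *_dev)
	{
		struct vmci_guest_device *dev = _dev;

		/* The device finished a DMA send/receive; wake any waiter. */
		wake_up(&dev->inout_wq);
		return IRQ_HANDLED;
	}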

[PATCH v3 7/8] VMCI: dma dg: add support for DMA datagrams sends

2022-02-07 Thread Jorgen Hansen
Use DMA based send operation from the transmit buffer instead of the
iowrite8_rep based datagram send when DMA datagrams are supported.

The outgoing datagram is sent as inline data in the VMCI transmit
buffer. Once the header has been configured, the send is initiated
by writing the lower 32 bits of the buffer base address to the
VMCI_DATA_OUT_LOW_ADDR register. Only then will the device process
the header and the datagram itself. Following that, the driver
busy-waits (it isn't possible to sleep on the send path) for the
header busy flag to change, indicating that the send is complete.

Reviewed-by: Vishnu Dasa 
Signed-off-by: Jorgen Hansen 
---
 drivers/misc/vmw_vmci/vmci_guest.c | 45 --
 include/linux/vmw_vmci_defs.h  | 34 ++
 2 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_guest.c 
b/drivers/misc/vmw_vmci/vmci_guest.c
index 36eade15ba87..bf524217914e 100644
--- a/drivers/misc/vmw_vmci/vmci_guest.c
+++ b/drivers/misc/vmw_vmci/vmci_guest.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -114,6 +115,47 @@ static void vmci_write_reg(struct vmci_guest_device *dev, 
u32 val, u32 reg)
iowrite32(val, dev->iobase + reg);
 }
 
+static int vmci_write_data(struct vmci_guest_device *dev,
+  struct vmci_datagram *dg)
+{
+   int result;
+
+   if (dev->mmio_base != NULL) {
+   struct vmci_data_in_out_header *buffer_header = dev->tx_buffer;
+   u8 *dg_out_buffer = (u8 *)(buffer_header + 1);
+
+   if (VMCI_DG_SIZE(dg) > VMCI_MAX_DG_SIZE)
+   return VMCI_ERROR_INVALID_ARGS;
+
+   /*
+* Initialize send buffer with outgoing datagram
+* and set up header for inline data. Device will
+* not access buffer asynchronously - only after
+* the write to VMCI_DATA_OUT_LOW_ADDR.
+*/
+   memcpy(dg_out_buffer, dg, VMCI_DG_SIZE(dg));
+   buffer_header->opcode = 0;
+   buffer_header->size = VMCI_DG_SIZE(dg);
+   buffer_header->busy = 1;
+
+   vmci_write_reg(dev, lower_32_bits(dev->tx_buffer_base),
+  VMCI_DATA_OUT_LOW_ADDR);
+
+   /* Caller holds a spinlock, so cannot block. */
+   spin_until_cond(buffer_header->busy == 0);
+
+   result = vmci_read_reg(vmci_dev_g, VMCI_RESULT_LOW_ADDR);
+   if (result == VMCI_SUCCESS)
+   result = (int)buffer_header->result;
+   } else {
+   iowrite8_rep(dev->iobase + VMCI_DATA_OUT_ADDR,
+dg, VMCI_DG_SIZE(dg));
+   result = vmci_read_reg(vmci_dev_g, VMCI_RESULT_LOW_ADDR);
+   }
+
+   return result;
+}
+
 /*
  * VM to hypervisor call mechanism. We use the standard VMware naming
  * convention since shared code is calling this function as well.
@@ -139,8 +181,7 @@ int vmci_send_datagram(struct vmci_datagram *dg)
spin_lock_irqsave(&vmci_dev_spinlock, flags);
 
if (vmci_dev_g) {
-   iowrite8_rep(vmci_dev_g->iobase + VMCI_DATA_OUT_ADDR,
-dg, VMCI_DG_SIZE(dg));
+   vmci_write_data(vmci_dev_g, dg);
result = vmci_read_reg(vmci_dev_g, VMCI_RESULT_LOW_ADDR);
} else {
result = VMCI_ERROR_UNAVAILABLE;
diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h
index 8bc37d8244a8..6fb663b36f72 100644
--- a/include/linux/vmw_vmci_defs.h
+++ b/include/linux/vmw_vmci_defs.h
@@ -110,6 +110,40 @@ enum {
 #define VMCI_MMIO_ACCESS_OFFSET((size_t)(128 * 1024))
 #define VMCI_MMIO_ACCESS_SIZE  ((size_t)(64 * 1024))
 
+/*
+ * For VMCI devices supporting the VMCI_CAPS_DMA_DATAGRAM capability, the
+ * sending and receiving of datagrams can be performed using DMA to/from
+ * a driver allocated buffer.
+ * Sending and receiving will be handled as follows:
+ * - when sending datagrams, the driver initializes the buffer where the
+ *   data part will refer to the outgoing VMCI datagram, sets the busy flag
+ *   to 1 and writes the address of the buffer to VMCI_DATA_OUT_HIGH_ADDR
+ *   and VMCI_DATA_OUT_LOW_ADDR. Writing to VMCI_DATA_OUT_LOW_ADDR triggers
+ *   the device processing of the buffer. When the device has processed the
+ *   buffer, it will write the result value to the buffer and then clear the
+ *   busy flag.
+ * - when receiving datagrams, the driver initializes the buffer where the
+ *   data part will describe the receive buffer, clears the busy flag and
+ *   writes the address of the buffer to VMCI_DATA_IN_HIGH_ADDR and
+ *   VMCI_DATA_IN_LOW_ADDR. Writing to VMCI_DATA_IN_LOW_ADDR triggers the
+ *   device processing of the buffer. The device will copy as many available
+ *   datagrams into the buffer as possible, and then sets the 

[PATCH v3 5/8] VMCI: dma dg: register dummy IRQ handlers for DMA datagrams

2022-02-07 Thread Jorgen Hansen
Register dummy interrupt handlers for DMA datagrams in preparation for
DMA datagram receive operations.

Reviewed-by: Vishnu Dasa 
Signed-off-by: Jorgen Hansen 
---
 drivers/misc/vmw_vmci/vmci_guest.c | 42 +++---
 include/linux/vmw_vmci_defs.h  | 14 --
 2 files changed, 51 insertions(+), 5 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_guest.c 
b/drivers/misc/vmw_vmci/vmci_guest.c
index ced187e7ac08..acef19c562b3 100644
--- a/drivers/misc/vmw_vmci/vmci_guest.c
+++ b/drivers/misc/vmw_vmci/vmci_guest.c
@@ -414,6 +414,9 @@ static irqreturn_t vmci_interrupt(int irq, void *_dev)
icr &= ~VMCI_ICR_NOTIFICATION;
}
 
+   if (icr & VMCI_ICR_DMA_DATAGRAM)
+   icr &= ~VMCI_ICR_DMA_DATAGRAM;
+
if (icr != 0)
dev_warn(dev->dev,
 "Ignoring unknown interrupt cause (%d)\n",
@@ -438,6 +441,16 @@ static irqreturn_t vmci_interrupt_bm(int irq, void *_dev)
return IRQ_HANDLED;
 }
 
+/*
+ * Interrupt handler for MSI-X interrupt vector VMCI_INTR_DMA_DATAGRAM,
+ * which is for the completion of a DMA datagram send or receive operation.
+ * Will only get called if we are using MSI-X with exclusive vectors.
+ */
+static irqreturn_t vmci_interrupt_dma_datagram(int irq, void *_dev)
+{
+   return IRQ_HANDLED;
+}
+
 /*
  * Most of the initialization at module load time is done here.
  */
@@ -447,6 +460,7 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
struct vmci_guest_device *vmci_dev;
void __iomem *iobase = NULL;
void __iomem *mmio_base = NULL;
+   unsigned int num_irq_vectors;
unsigned int capabilities;
unsigned int caps_in_use;
unsigned long cmd;
@@ -627,8 +641,12 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
 * Enable interrupts.  Try MSI-X first, then MSI, and then fallback on
 * legacy interrupts.
 */
-   error = pci_alloc_irq_vectors(pdev, VMCI_MAX_INTRS, VMCI_MAX_INTRS,
-   PCI_IRQ_MSIX);
+   if (vmci_dev->mmio_base != NULL)
+   num_irq_vectors = VMCI_MAX_INTRS;
+   else
+   num_irq_vectors = VMCI_MAX_INTRS_NOTIFICATION;
+   error = pci_alloc_irq_vectors(pdev, num_irq_vectors, num_irq_vectors,
+ PCI_IRQ_MSIX);
if (error < 0) {
error = pci_alloc_irq_vectors(pdev, 1, 1,
PCI_IRQ_MSIX | PCI_IRQ_MSI | PCI_IRQ_LEGACY);
@@ -666,6 +684,17 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
pci_irq_vector(pdev, 1), error);
goto err_free_irq;
}
+   if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM) {
+   error = request_irq(pci_irq_vector(pdev, 2),
+   vmci_interrupt_dma_datagram,
+   0, KBUILD_MODNAME, vmci_dev);
+   if (error) {
+   dev_err(&pdev->dev,
+   "Failed to allocate irq %u: %d\n",
+   pci_irq_vector(pdev, 2), error);
+   goto err_free_bm_irq;
+   }
+   }
}
 
dev_dbg(>dev, "Registered device\n");
@@ -676,6 +705,8 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
cmd = VMCI_IMR_DATAGRAM;
if (caps_in_use & VMCI_CAPS_NOTIFICATIONS)
cmd |= VMCI_IMR_NOTIFICATION;
+   if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM)
+   cmd |= VMCI_IMR_DMA_DATAGRAM;
vmci_write_reg(vmci_dev, cmd, VMCI_IMR_ADDR);
 
/* Enable interrupts. */
@@ -686,6 +717,8 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
vmci_call_vsock_callback(false);
return 0;
 
+err_free_bm_irq:
+   free_irq(pci_irq_vector(pdev, 1), vmci_dev);
 err_free_irq:
free_irq(pci_irq_vector(pdev, 0), vmci_dev);
tasklet_kill(&vmci_dev->datagram_tasklet);
@@ -751,8 +784,11 @@ static void vmci_guest_remove_device(struct pci_dev *pdev)
 * MSI-X, we might have multiple vectors, each with their own
 * IRQ, which we must free too.
 */
-   if (vmci_dev->exclusive_vectors)
+   if (vmci_dev->exclusive_vectors) {
free_irq(pci_irq_vector(pdev, 1), vmci_dev);
+   if (vmci_dev->mmio_base != NULL)
+   free_irq(pci_irq_vector(pdev, 2), vmci_dev);
+   }
free_irq(pci_irq_vector(pdev, 0), vmci_dev);
pci_free_irq_vectors(pdev);
 
diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h
index 4167779469fd..2b70c024dacb 100644
--- a/include/linux/vmw_vmci_defs.h
+++ b/include/linux/vmw_vmci_defs.h
@@ -45,13 +45,22 @@
 /* Interrupt Cause register bits. */
 #define 

[PATCH v3 4/8] VMCI: dma dg: set OS page size

2022-02-07 Thread Jorgen Hansen
Tell the device the page size used by the OS.

Reviewed-by: Vishnu Dasa 
Signed-off-by: Jorgen Hansen 
---
 drivers/misc/vmw_vmci/vmci_guest.c | 4 
 include/linux/vmw_vmci_defs.h  | 1 +
 2 files changed, 5 insertions(+)

diff --git a/drivers/misc/vmw_vmci/vmci_guest.c 
b/drivers/misc/vmw_vmci/vmci_guest.c
index b93afe7f7119..ced187e7ac08 100644
--- a/drivers/misc/vmw_vmci/vmci_guest.c
+++ b/drivers/misc/vmw_vmci/vmci_guest.c
@@ -578,6 +578,10 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
/* Let the host know which capabilities we intend to use. */
vmci_write_reg(vmci_dev, caps_in_use, VMCI_CAPS_ADDR);
 
+   /* Let the device know the size for pages passed down. */
+   if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM)
+   vmci_write_reg(vmci_dev, PAGE_SHIFT, VMCI_GUEST_PAGE_SHIFT);
+
/* Set up global device so that we can start sending datagrams */
spin_lock_irq(&vmci_dev_spinlock);
vmci_dev_g = vmci_dev;
diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h
index 1ce2cffdc3ae..4167779469fd 100644
--- a/include/linux/vmw_vmci_defs.h
+++ b/include/linux/vmw_vmci_defs.h
@@ -21,6 +21,7 @@
 #define VMCI_CAPS_ADDR  0x18
 #define VMCI_RESULT_LOW_ADDR0x1c
 #define VMCI_RESULT_HIGH_ADDR   0x20
+#define VMCI_GUEST_PAGE_SHIFT   0x34
 
 /* Max number of devices. */
 #define VMCI_MAX_DEVICES 1
-- 
2.25.1



[PATCH v3 3/8] VMCI: dma dg: detect DMA datagram capability

2022-02-07 Thread Jorgen Hansen
Detect the VMCI DMA datagram capability, and if present, ack it
to the device.

Reviewed-by: Vishnu Dasa 
Signed-off-by: Jorgen Hansen 
---
 drivers/misc/vmw_vmci/vmci_guest.c | 11 +++
 include/linux/vmw_vmci_defs.h  |  1 +
 2 files changed, 12 insertions(+)

diff --git a/drivers/misc/vmw_vmci/vmci_guest.c 
b/drivers/misc/vmw_vmci/vmci_guest.c
index d30d66258e52..b93afe7f7119 100644
--- a/drivers/misc/vmw_vmci/vmci_guest.c
+++ b/drivers/misc/vmw_vmci/vmci_guest.c
@@ -562,6 +562,17 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
}
}
 
+   if (mmio_base != NULL) {
+   if (capabilities & VMCI_CAPS_DMA_DATAGRAM) {
+   caps_in_use |= VMCI_CAPS_DMA_DATAGRAM;
+   } else {
+   dev_err(&pdev->dev,
+   "Missing capability: VMCI_CAPS_DMA_DATAGRAM\n");
+   error = -ENXIO;
+   goto err_free_data_buffer;
+   }
+   }
+
dev_info(>dev, "Using capabilities 0x%x\n", caps_in_use);
 
/* Let the host know which capabilities we intend to use. */
diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h
index 8fc00e2685cf..1ce2cffdc3ae 100644
--- a/include/linux/vmw_vmci_defs.h
+++ b/include/linux/vmw_vmci_defs.h
@@ -39,6 +39,7 @@
 #define VMCI_CAPS_DATAGRAM  BIT(2)
 #define VMCI_CAPS_NOTIFICATIONS BIT(3)
 #define VMCI_CAPS_PPN64 BIT(4)
+#define VMCI_CAPS_DMA_DATAGRAM  BIT(5)
 
 /* Interrupt Cause register bits. */
 #define VMCI_ICR_DATAGRAM  BIT(0)
-- 
2.25.1



[PATCH v3 6/8] VMCI: dma dg: allocate send and receive buffers for DMA datagrams

2022-02-07 Thread Jorgen Hansen
If DMA datagrams are used, allocate send and receive buffers
in coherent DMA memory.

This is done in preparation for the send and receive datagram
operations, where the buffers are used for the exchange of data
between driver and device.

Reviewed-by: Vishnu Dasa 
Signed-off-by: Jorgen Hansen 
---
 drivers/misc/vmw_vmci/vmci_guest.c | 71 ++
 include/linux/vmw_vmci_defs.h  |  4 ++
 2 files changed, 66 insertions(+), 9 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_guest.c 
b/drivers/misc/vmw_vmci/vmci_guest.c
index acef19c562b3..36eade15ba87 100644
--- a/drivers/misc/vmw_vmci/vmci_guest.c
+++ b/drivers/misc/vmw_vmci/vmci_guest.c
@@ -31,6 +31,12 @@
 
 #define VMCI_UTIL_NUM_RESOURCES 1
 
+/*
+ * Datagram buffers for DMA send/receive must accommodate at least
+ * a maximum sized datagram and the header.
+ */
+#define VMCI_DMA_DG_BUFFER_SIZE (VMCI_MAX_DG_SIZE + PAGE_SIZE)
+
 static bool vmci_disable_msi;
 module_param_named(disable_msi, vmci_disable_msi, bool, 0);
 MODULE_PARM_DESC(disable_msi, "Disable MSI use in driver - (default=0)");
@@ -53,6 +59,9 @@ struct vmci_guest_device {
struct tasklet_struct bm_tasklet;
 
void *data_buffer;
+   dma_addr_t data_buffer_base;
+   void *tx_buffer;
+   dma_addr_t tx_buffer_base;
void *notification_bitmap;
dma_addr_t notification_base;
 };
@@ -451,6 +460,24 @@ static irqreturn_t vmci_interrupt_dma_datagram(int irq, 
void *_dev)
return IRQ_HANDLED;
 }
 
+static void vmci_free_dg_buffers(struct vmci_guest_device *vmci_dev)
+{
+   if (vmci_dev->mmio_base != NULL) {
+   if (vmci_dev->tx_buffer != NULL)
+   dma_free_coherent(vmci_dev->dev,
+ VMCI_DMA_DG_BUFFER_SIZE,
+ vmci_dev->tx_buffer,
+ vmci_dev->tx_buffer_base);
+   if (vmci_dev->data_buffer != NULL)
+   dma_free_coherent(vmci_dev->dev,
+ VMCI_DMA_DG_BUFFER_SIZE,
+ vmci_dev->data_buffer,
+ vmci_dev->data_buffer_base);
+   } else {
+   vfree(vmci_dev->data_buffer);
+   }
+}
+
 /*
  * Most of the initialization at module load time is done here.
  */
@@ -517,11 +544,27 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
tasklet_init(&vmci_dev->bm_tasklet,
 vmci_process_bitmap, (unsigned long)vmci_dev);
 
-   vmci_dev->data_buffer = vmalloc(VMCI_MAX_DG_SIZE);
+   if (mmio_base != NULL) {
+   vmci_dev->tx_buffer = dma_alloc_coherent(&pdev->dev, VMCI_DMA_DG_BUFFER_SIZE,
+&vmci_dev->tx_buffer_base,
+GFP_KERNEL);
+   if (!vmci_dev->tx_buffer) {
+   dev_err(&pdev->dev,
+   "Can't allocate memory for datagram tx buffer\n");
+   return -ENOMEM;
+   }
+
+   vmci_dev->data_buffer = dma_alloc_coherent(&pdev->dev, VMCI_DMA_DG_BUFFER_SIZE,
+  &vmci_dev->data_buffer_base,
+  GFP_KERNEL);
+   } else {
+   vmci_dev->data_buffer = vmalloc(VMCI_MAX_DG_SIZE);
+   }
if (!vmci_dev->data_buffer) {
dev_err(&pdev->dev,
"Can't allocate memory for datagram buffer\n");
-   return -ENOMEM;
+   error = -ENOMEM;
+   goto err_free_data_buffers;
}
 
pci_set_master(pdev);   /* To enable queue_pair functionality. */
@@ -539,7 +582,7 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
if (!(capabilities & VMCI_CAPS_DATAGRAM)) {
dev_err(>dev, "Device does not support datagrams\n");
error = -ENXIO;
-   goto err_free_data_buffer;
+   goto err_free_data_buffers;
}
caps_in_use = VMCI_CAPS_DATAGRAM;
 
@@ -583,7 +626,7 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
dev_err(&pdev->dev,
"Missing capability: VMCI_CAPS_DMA_DATAGRAM\n");
error = -ENXIO;
-   goto err_free_data_buffer;
+   goto err_free_data_buffers;
}
}
 
@@ -592,10 +635,17 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
/* Let the host know which capabilities we intend to use. */
vmci_write_reg(vmci_dev, caps_in_use, VMCI_CAPS_ADDR);
 
-   /* Let the device know the size for pages passed down. */
-   if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM)
+   if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM) {
+   /* Let the device know the size for pages passed 

[PATCH v3 2/8] VMCI: dma dg: add MMIO access to registers

2022-02-07 Thread Jorgen Hansen
Detect the support for MMIO access through examination of the length
of the region requested in BAR1. If it is 256KB, the VMCI device
supports MMIO access to registers.

If MMIO access is supported, map the area of the region used for
MMIO access (64KB size at offset 128KB).

Add wrapper functions for 32-bit register accesses through either
MMIO or IO ports, based on the device configuration.

Sending and receiving datagrams through iowrite8_rep/ioread8_rep is
left unchanged for now, and will be addressed in a later change.

Reviewed-by: Vishnu Dasa 
Signed-off-by: Jorgen Hansen 
---
 drivers/misc/vmw_vmci/vmci_guest.c | 67 +-
 include/linux/vmw_vmci_defs.h  | 12 ++
 2 files changed, 60 insertions(+), 19 deletions(-)

diff --git a/drivers/misc/vmw_vmci/vmci_guest.c 
b/drivers/misc/vmw_vmci/vmci_guest.c
index 1018dc77269d..d30d66258e52 100644
--- a/drivers/misc/vmw_vmci/vmci_guest.c
+++ b/drivers/misc/vmw_vmci/vmci_guest.c
@@ -45,6 +45,7 @@ static u32 vm_context_id = VMCI_INVALID_ID;
 struct vmci_guest_device {
struct device *dev; /* PCI device we are attached to */
void __iomem *iobase;
+   void __iomem *mmio_base;
 
bool exclusive_vectors;
 
@@ -89,6 +90,21 @@ u32 vmci_get_vm_context_id(void)
return vm_context_id;
 }
 
+static unsigned int vmci_read_reg(struct vmci_guest_device *dev, u32 reg)
+{
+   if (dev->mmio_base != NULL)
+   return readl(dev->mmio_base + reg);
+   return ioread32(dev->iobase + reg);
+}
+
+static void vmci_write_reg(struct vmci_guest_device *dev, u32 val, u32 reg)
+{
+   if (dev->mmio_base != NULL)
+   writel(val, dev->mmio_base + reg);
+   else
+   iowrite32(val, dev->iobase + reg);
+}
+
 /*
  * VM to hypervisor call mechanism. We use the standard VMware naming
  * convention since shared code is calling this function as well.
@@ -116,7 +132,7 @@ int vmci_send_datagram(struct vmci_datagram *dg)
if (vmci_dev_g) {
iowrite8_rep(vmci_dev_g->iobase + VMCI_DATA_OUT_ADDR,
 dg, VMCI_DG_SIZE(dg));
-   result = ioread32(vmci_dev_g->iobase + VMCI_RESULT_LOW_ADDR);
+   result = vmci_read_reg(vmci_dev_g, VMCI_RESULT_LOW_ADDR);
} else {
result = VMCI_ERROR_UNAVAILABLE;
}
@@ -384,7 +400,7 @@ static irqreturn_t vmci_interrupt(int irq, void *_dev)
unsigned int icr;
 
/* Acknowledge interrupt and determine what needs doing. */
-   icr = ioread32(dev->iobase + VMCI_ICR_ADDR);
+   icr = vmci_read_reg(dev, VMCI_ICR_ADDR);
if (icr == 0 || icr == ~0)
return IRQ_NONE;
 
@@ -429,7 +445,8 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
   const struct pci_device_id *id)
 {
struct vmci_guest_device *vmci_dev;
-   void __iomem *iobase;
+   void __iomem *iobase = NULL;
+   void __iomem *mmio_base = NULL;
unsigned int capabilities;
unsigned int caps_in_use;
unsigned long cmd;
@@ -445,16 +462,29 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
return error;
}
 
-   error = pcim_iomap_regions(pdev, 1 << 0, KBUILD_MODNAME);
-   if (error) {
-   dev_err(&pdev->dev, "Failed to reserve/map IO regions\n");
-   return error;
-   }
+   /*
+* The VMCI device with mmio access to registers requests 256KB
+* for BAR1. If present, driver will use new VMCI device
+* functionality for register access and datagram send/recv.
+*/
 
-   iobase = pcim_iomap_table(pdev)[0];
+   if (pci_resource_len(pdev, 1) == VMCI_WITH_MMIO_ACCESS_BAR_SIZE) {
+   dev_info(&pdev->dev, "MMIO register access is available\n");
+   mmio_base = pci_iomap_range(pdev, 1, VMCI_MMIO_ACCESS_OFFSET,
+   VMCI_MMIO_ACCESS_SIZE);
+   /* If the map fails, we fall back to IOIO access. */
+   if (!mmio_base)
+   dev_warn(&pdev->dev, "Failed to map MMIO register access\n");
+   }
 
-   dev_info(&pdev->dev, "Found VMCI PCI device at %#lx, irq %u\n",
-(unsigned long)iobase, pdev->irq);
+   if (!mmio_base) {
+   error = pcim_iomap_regions(pdev, BIT(0), KBUILD_MODNAME);
+   if (error) {
+   dev_err(&pdev->dev, "Failed to reserve/map IO regions\n");
+   return error;
+   }
+   iobase = pcim_iomap_table(pdev)[0];
+   }
 
vmci_dev = devm_kzalloc(&pdev->dev, sizeof(*vmci_dev), GFP_KERNEL);
if (!vmci_dev) {
@@ -466,6 +496,7 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
vmci_dev->dev = >dev;
vmci_dev->exclusive_vectors = false;
vmci_dev->iobase = iobase;
+   vmci_dev->mmio_base = mmio_base;
 
   

[PATCH v3 1/8] VMCI: dma dg: whitespace formatting change for vmci register defines

2022-02-07 Thread Jorgen Hansen
Update formatting of existing register defines in preparation for
adding additional register definitions for the VMCI device.

Reviewed-by: Vishnu Dasa 
Signed-off-by: Jorgen Hansen 
---
 include/linux/vmw_vmci_defs.h | 18 +-
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/vmw_vmci_defs.h b/include/linux/vmw_vmci_defs.h
index e36cb114c188..9911ecfc18ba 100644
--- a/include/linux/vmw_vmci_defs.h
+++ b/include/linux/vmw_vmci_defs.h
@@ -12,15 +12,15 @@
 #include 
 
 /* Register offsets. */
-#define VMCI_STATUS_ADDR  0x00
-#define VMCI_CONTROL_ADDR 0x04
-#define VMCI_ICR_ADDR0x08
-#define VMCI_IMR_ADDR 0x0c
-#define VMCI_DATA_OUT_ADDR0x10
-#define VMCI_DATA_IN_ADDR 0x14
-#define VMCI_CAPS_ADDR0x18
-#define VMCI_RESULT_LOW_ADDR  0x1c
-#define VMCI_RESULT_HIGH_ADDR 0x20
+#define VMCI_STATUS_ADDR0x00
+#define VMCI_CONTROL_ADDR   0x04
+#define VMCI_ICR_ADDR   0x08
+#define VMCI_IMR_ADDR   0x0c
+#define VMCI_DATA_OUT_ADDR  0x10
+#define VMCI_DATA_IN_ADDR   0x14
+#define VMCI_CAPS_ADDR  0x18
+#define VMCI_RESULT_LOW_ADDR0x1c
+#define VMCI_RESULT_HIGH_ADDR   0x20
 
 /* Max number of devices. */
 #define VMCI_MAX_DEVICES 1
-- 
2.25.1



[PATCH v3 0/8] VMCI: dma dg: Add support for DMA datagrams

2022-02-07 Thread Jorgen Hansen
A new version of the VMCI device introduces two major changes:
- support MMIO access to device registers
- support send/receive of datagrams using DMA transfers instead of
  ioread8_rep/iowrite8_rep operations
This patch series updates the VMCI driver to support these new
features while maintaining backwards compatibility.

The DMA based datagram operations use a send and a receive buffer
allocated at module load time. The buffer contains a header
describing the layout of the buffer followed by either an SG list or
inline data. The header also contains a flag indicating whether the
buffer is currently owned by the driver or the device. Both for send
and receive, the driver will initialize the buffer, transfer ownership
to the device by writing the buffer address to a register, and then
wait for the ownership to be transferred back. The device will
generate an interrupt when this happens.
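
Condensed to its essentials, the send side of that handshake looks roughly
like this (a sketch using the helper and register names from patches 7/8
and 8/8, not a standalone function):

	/* Driver owns the buffer: fill it and mark it busy. */
	memcpy(dg_out_buffer, dg, VMCI_DG_SIZE(dg));
	buffer_header->busy = 1;

	/* Writing the address transfers ownership and kicks the device. */
	vmci_write_reg(dev, lower_32_bits(dev->tx_buffer_base),
		       VMCI_DATA_OUT_LOW_ADDR);

	/* The device writes the result and clears busy when done. */
	spin_until_cond(buffer_header->busy == 0);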

v2 (fixes issues flagged by kernel test robot ):
- changed type of mmio_base to void __iomem *
- made vmci_read_reg, vmci_write_reg and vmci_write_data static functions

v3:
- removed log messages for page size and BAR resources

Jorgen Hansen (8):
  VMCI: dma dg: whitespace formatting change for vmci register defines
  VMCI: dma dg: add MMIO access to registers
  VMCI: dma dg: detect DMA datagram capability
  VMCI: dma dg: set OS page size
  VMCI: dma dg: register dummy IRQ handlers for DMA datagrams
  VMCI: dma dg: allocate send and receive buffers for DMA datagrams
  VMCI: dma dg: add support for DMA datagrams sends
  VMCI: dma dg: add support for DMA datagrams receive

 drivers/misc/vmw_vmci/vmci_guest.c | 335 -
 include/linux/vmw_vmci_defs.h  |  84 +++-
 2 files changed, 355 insertions(+), 64 deletions(-)

-- 
2.25.1



Re: [PATCH v2 2/8] VMCI: dma dg: add MMIO access to registers

2022-02-07 Thread Jorgen Hansen


> On 4 Feb 2022, at 16.12, Greg KH  wrote:
> 
> On Thu, Feb 03, 2022 at 05:12:31AM -0800, Jorgen Hansen wrote:
>> Detect the support for MMIO access through examination of the length
>> of the region requested in BAR1. If it is 256KB, the VMCI device
>> supports MMIO access to registers.
>> 
>> If MMIO access is supported, map the area of the region used for
>> MMIO access (64KB size at offset 128KB).
>> 
>> Add wrapper functions for 32-bit register accesses through either
>> MMIO or IO ports, based on the device configuration.
>> 
>> Sending and receiving datagrams through iowrite8_rep/ioread8_rep is
>> left unchanged for now, and will be addressed in a later change.
>> 
>> Reviewed-by: Vishnu Dasa 
>> Signed-off-by: Jorgen Hansen 
>> ---
>> drivers/misc/vmw_vmci/vmci_guest.c | 68 ++
>> include/linux/vmw_vmci_defs.h  | 12 ++
>> 2 files changed, 62 insertions(+), 18 deletions(-)
>> 
>> diff --git a/drivers/misc/vmw_vmci/vmci_guest.c 
>> b/drivers/misc/vmw_vmci/vmci_guest.c
>> index 1018dc77269d..38ee7ed32ab9 100644
>> --- a/drivers/misc/vmw_vmci/vmci_guest.c
>> +++ b/drivers/misc/vmw_vmci/vmci_guest.c
>> @@ -45,6 +45,7 @@ static u32 vm_context_id = VMCI_INVALID_ID;
>> struct vmci_guest_device {
>>  struct device *dev; /* PCI device we are attached to */
>>  void __iomem *iobase;
>> +void __iomem *mmio_base;
>> 
>>  bool exclusive_vectors;
>> 
>> @@ -89,6 +90,21 @@ u32 vmci_get_vm_context_id(void)
>>  return vm_context_id;
>> }
>> 
>> +static unsigned int vmci_read_reg(struct vmci_guest_device *dev, u32 reg)
>> +{
>> +if (dev->mmio_base != NULL)
>> +return readl(dev->mmio_base + reg);
>> +return ioread32(dev->iobase + reg);
>> +}
>> +
>> +static void vmci_write_reg(struct vmci_guest_device *dev, u32 val, u32 reg)
>> +{
>> +if (dev->mmio_base != NULL)
>> +writel(val, dev->mmio_base + reg);
>> +else
>> +iowrite32(val, dev->iobase + reg);
>> +}
>> +
>> /*
>>  * VM to hypervisor call mechanism. We use the standard VMware naming
>>  * convention since shared code is calling this function as well.
>> @@ -116,7 +132,7 @@ int vmci_send_datagram(struct vmci_datagram *dg)
>>  if (vmci_dev_g) {
>>  iowrite8_rep(vmci_dev_g->iobase + VMCI_DATA_OUT_ADDR,
>>   dg, VMCI_DG_SIZE(dg));
>> -result = ioread32(vmci_dev_g->iobase + VMCI_RESULT_LOW_ADDR);
>> +result = vmci_read_reg(vmci_dev_g, VMCI_RESULT_LOW_ADDR);
>>  } else {
>>  result = VMCI_ERROR_UNAVAILABLE;
>>  }
>> @@ -384,7 +400,7 @@ static irqreturn_t vmci_interrupt(int irq, void *_dev)
>>  unsigned int icr;
>> 
>>  /* Acknowledge interrupt and determine what needs doing. */
>> -icr = ioread32(dev->iobase + VMCI_ICR_ADDR);
>> +icr = vmci_read_reg(dev, VMCI_ICR_ADDR);
>>  if (icr == 0 || icr == ~0)
>>  return IRQ_NONE;
>> 
>> @@ -429,7 +445,8 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
>> const struct pci_device_id *id)
>> {
>>  struct vmci_guest_device *vmci_dev;
>> -void __iomem *iobase;
>> +void __iomem *iobase = NULL;
>> +void __iomem *mmio_base = NULL;
>>  unsigned int capabilities;
>>  unsigned int caps_in_use;
>>  unsigned long cmd;
>> @@ -445,16 +462,32 @@ static int vmci_guest_probe_device(struct pci_dev 
>> *pdev,
>>  return error;
>>  }
>> 
>> -error = pcim_iomap_regions(pdev, 1 << 0, KBUILD_MODNAME);
>> -if (error) {
>> -dev_err(&pdev->dev, "Failed to reserve/map IO regions\n");
>> -return error;
>> +/*
>> + * The VMCI device with mmio access to registers requests 256KB
>> + * for BAR1. If present, driver will use new VMCI device
>> + * functionality for register access and datagram send/recv.
>> + */
>> +
>> +if (pci_resource_len(pdev, 1) == VMCI_WITH_MMIO_ACCESS_BAR_SIZE) {
>> +dev_info(&pdev->dev, "MMIO register access is available\n");
>> +mmio_base = pci_iomap_range(pdev, 1, VMCI_MMIO_ACCESS_OFFSET,
>> +VMCI_MMIO_ACCESS_SIZE);
>> +/* If the map fails, we fall back to IOIO access. */
>> +if (!mmio_base)
>> +dev_warn(>dev, "Failed to map MMIO register 
>> access\n");
>>  }
>> 
>> -iobase = pcim_iomap_table(pdev)[0];
>> +if (!mmio_base) {
>> +error = pcim_iomap_regions(pdev, BIT(0), KBUILD_MODNAME);
>> +if (error) {
>> +dev_err(&pdev->dev, "Failed to reserve/map IO regions\n");
>> +return error;
>> +}
>> +iobase = pcim_iomap_table(pdev)[0];
>> +}
>> 
>> -dev_info(&pdev->dev, "Found VMCI PCI device at %#lx, irq %u\n",
>> - (unsigned long)iobase, pdev->irq);
>> +dev_info(&pdev->dev, "Found VMCI PCI device at %#lx, %#lx, irq %u\n",

Re: [PATCH v2 4/8] VMCI: dma dg: set OS page size

2022-02-07 Thread Jorgen Hansen


> On 4 Feb 2022, at 16.12, Greg KH  wrote:
> 
> On Thu, Feb 03, 2022 at 05:12:33AM -0800, Jorgen Hansen wrote:
>> Tell the device the page size used by the OS.
>> 
>> Reviewed-by: Vishnu Dasa 
>> Signed-off-by: Jorgen Hansen 
>> ---
>> drivers/misc/vmw_vmci/vmci_guest.c | 9 +
>> include/linux/vmw_vmci_defs.h  | 1 +
>> 2 files changed, 10 insertions(+)
>> 
>> diff --git a/drivers/misc/vmw_vmci/vmci_guest.c 
>> b/drivers/misc/vmw_vmci/vmci_guest.c
>> index 5a99d8e27873..808680dc0820 100644
>> --- a/drivers/misc/vmw_vmci/vmci_guest.c
>> +++ b/drivers/misc/vmw_vmci/vmci_guest.c
>> @@ -581,6 +581,15 @@ static int vmci_guest_probe_device(struct pci_dev *pdev,
>>  /* Let the host know which capabilities we intend to use. */
>>  vmci_write_reg(vmci_dev, caps_in_use, VMCI_CAPS_ADDR);
>> 
>> +if (caps_in_use & VMCI_CAPS_DMA_DATAGRAM) {
>> +uint32_t page_shift;
>> +
>> +/* Let the device know the size for pages passed down. */
>> +vmci_write_reg(vmci_dev, PAGE_SHIFT, VMCI_GUEST_PAGE_SHIFT);
>> +page_shift = vmci_read_reg(vmci_dev, VMCI_GUEST_PAGE_SHIFT);
>> +dev_info(&pdev->dev, "Using page shift %d\n", page_shift);
> 
> Please do not print out debugging stuff like this to the kernel log.

OK, I’ll remove it.

> When drivers are working properly, they are quiet.
> 
> thanks,
> 
> greg k-h

Thanks,
Jorgen

Re: [PATCH v3 13/17] virtio_pci: queue_reset: support VIRTIO_F_RING_RESET

2022-02-07 Thread Xuan Zhuo
On Mon, 7 Feb 2022 14:57:13 +0800, Jason Wang  wrote:
>
> 在 2022/1/26 下午3:35, Xuan Zhuo 写道:
> > This patch implements virtio pci support for QUEUE RESET.
> >
> > Performing reset on a queue is divided into two steps:
> >
> > 1. reset_vq: reset one vq
> > 2. enable_reset_vq: re-enable the reset queue
> >
> > In the first step, these tasks will be completed:
> > 1. notify the hardware queue to reset
> > 2. recycle the buffer from vq
> > 3. release the ring of the vq
> >
> > The process of enable reset vq:
> >  vp_modern_enable_reset_vq()
> >  vp_enable_reset_vq()
> >  __vp_setup_vq()
> >  setup_vq()
> >  vring_setup_virtqueue()
> >
> > In this process, we add two parameters, vq and num, and finally pass
> > them to vring_setup_virtqueue(), reusing the original info and vq.
> >
> > Signed-off-by: Xuan Zhuo 
> > ---
> >   drivers/virtio/virtio_pci_common.c |  36 +++
> >   drivers/virtio/virtio_pci_common.h |   5 ++
> >   drivers/virtio/virtio_pci_modern.c | 100 +
> >   3 files changed, 128 insertions(+), 13 deletions(-)
> >
> > diff --git a/drivers/virtio/virtio_pci_common.c 
> > b/drivers/virtio/virtio_pci_common.c
> > index c02936d29a31..ad21638fbf66 100644
> > --- a/drivers/virtio/virtio_pci_common.c
> > +++ b/drivers/virtio/virtio_pci_common.c
> > @@ -205,23 +205,28 @@ static int vp_request_msix_vectors(struct 
> > virtio_device *vdev, int nvectors,
> > return err;
> >   }
> >
> > -static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned 
> > index,
> > -void (*callback)(struct virtqueue *vq),
> > -const char *name,
> > -bool ctx,
> > -u16 msix_vec, u16 num)
> > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index,
> > + void (*callback)(struct virtqueue *vq),
> > + const char *name,
> > + bool ctx,
> > + u16 msix_vec, u16 num)
> >   {
> > struct virtio_pci_device *vp_dev = to_vp_device(vdev);
> > -   struct virtio_pci_vq_info *info = kmalloc(sizeof *info, GFP_KERNEL);
> > +   struct virtio_pci_vq_info *info;
> > struct virtqueue *vq;
> > unsigned long flags;
> >
> > -   /* fill out our structure that represents an active queue */
> > -   if (!info)
> > -   return ERR_PTR(-ENOMEM);
> > +   info = vp_dev->vqs[index];
> > +   if (!info) {
> > +   info = kzalloc(sizeof *info, GFP_KERNEL);
> > +
> > +   /* fill out our structure that represents an active queue */
> > +   if (!info)
> > +   return ERR_PTR(-ENOMEM);
> > +   }
> >
> > vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, ctx,
> > - msix_vec, NULL, num);
> > + msix_vec, info->vq, num);
> > if (IS_ERR(vq))
> > goto out_info;
> >
> > @@ -238,6 +243,9 @@ static struct virtqueue *vp_setup_vq(struct 
> > virtio_device *vdev, unsigned index,
> > return vq;
> >
> >   out_info:
> > +   if (info->vq && info->vq->reset)
> > +   return vq;
> > +
> > kfree(info);
> > return vq;
> >   }
> > @@ -248,9 +256,11 @@ static void vp_del_vq(struct virtqueue *vq)
> > struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index];
> > unsigned long flags;
> >
> > -   spin_lock_irqsave(&vp_dev->lock, flags);
> > -   list_del(&info->node);
> > -   spin_unlock_irqrestore(&vp_dev->lock, flags);
> > +   if (!vq->reset) {
> > +   spin_lock_irqsave(&vp_dev->lock, flags);
> > +   list_del(&info->node);
> > +   spin_unlock_irqrestore(&vp_dev->lock, flags);
> > +   }
> >
> > vp_dev->del_vq(info);
> > kfree(info);
> > diff --git a/drivers/virtio/virtio_pci_common.h 
> > b/drivers/virtio/virtio_pci_common.h
> > index 65db92245e41..c1d15f7c0be4 100644
> > --- a/drivers/virtio/virtio_pci_common.h
> > +++ b/drivers/virtio/virtio_pci_common.h
> > @@ -119,6 +119,11 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned 
> > nvqs,
> > struct virtqueue *vqs[], vq_callback_t *callbacks[],
> > const char * const names[], const bool *ctx,
> > struct irq_affinity *desc);
> > +struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index,
> > + void (*callback)(struct virtqueue *vq),
> > + const char *name,
> > + bool ctx,
> > + u16 msix_vec, u16 num);
> >   const char *vp_bus_name(struct virtio_device *vdev);
> >
> >   /* Setup the affinity for a virtqueue:
> > diff --git a/drivers/virtio/virtio_pci_modern.c 
> > b/drivers/virtio/virtio_pci_modern.c
> > index 2ce58de549de..6789411169e4 100644
> > --- a/drivers/virtio/virtio_pci_modern.c
> > +++ b/drivers/virtio/virtio_pci_modern.c
> > @@ -34,6 +34,9 @@ static void vp_transport_features(struct virtio_device 

Re: [PATCH v3 01/17] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data

2022-02-07 Thread Jason Wang
On Mon, Feb 7, 2022 at 2:07 PM Xuan Zhuo  wrote:
>
> On Mon, 7 Feb 2022 11:41:06 +0800, Jason Wang  wrote:
> >
> > 在 2022/1/26 下午3:35, Xuan Zhuo 写道:
> > > Add queue_notify_data in struct virtio_pci_common_cfg, which comes from
> > > here https://github.com/oasis-tcs/virtio-spec/issues/89
> > >
> > > Since I want to add queue_reset after it, I submitted this patch first.
> > >
> > > Signed-off-by: Xuan Zhuo 
> > > ---
> > >   include/uapi/linux/virtio_pci.h | 1 +
> > >   1 file changed, 1 insertion(+)
> > >
> > > diff --git a/include/uapi/linux/virtio_pci.h 
> > > b/include/uapi/linux/virtio_pci.h
> > > index 3a86f36d7e3d..492c89f56c6a 100644
> > > --- a/include/uapi/linux/virtio_pci.h
> > > +++ b/include/uapi/linux/virtio_pci.h
> > > @@ -164,6 +164,7 @@ struct virtio_pci_common_cfg {
> > > __le32 queue_avail_hi;  /* read-write */
> > > __le32 queue_used_lo;   /* read-write */
> > > __le32 queue_used_hi;   /* read-write */
> > > +   __le16 queue_notify_data;   /* read-write */
> > >   };
> >
> >
> > So I had the same concern as previous version.
> >
> > This breaks uABI where program may try to use sizeof(struct
> > virtio_pci_common_cfg).
> >
> > We probably need a container structure here.
>
> I see, I plan to add a struct like this, do you think it's appropriate?
>
> struct virtio_pci_common_cfg_v1 {
> struct virtio_pci_common_cfg cfg;
> __le16 queue_notify_data;   /* read-write */
> }

Something like this but we probably need a better name.

Thanks

>
> Thanks.
>
> >
> > THanks
> >
> >
> > >
> > >   /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */
> >
>
