Re: [PATCH 1/5] iommu: Replace uses of IOMMU_CAP_CACHE_COHERENCY with dev_is_dma_coherent()

2022-04-05 Thread Christoph Hellwig
On Tue, Apr 05, 2022 at 01:16:00PM -0300, Jason Gunthorpe wrote:
> diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c 
> b/drivers/infiniband/hw/usnic/usnic_uiom.c
> index 760b254ba42d6b..24d118198ac756 100644
> --- a/drivers/infiniband/hw/usnic/usnic_uiom.c
> +++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
> @@ -42,6 +42,7 @@
>  #include 
>  #include 
>  #include 
> +#include <linux/dma-map-ops.h>
>  
>  #include "usnic_log.h"
>  #include "usnic_uiom.h"
> @@ -474,6 +475,12 @@ int usnic_uiom_attach_dev_to_pd(struct usnic_uiom_pd 
> *pd, struct device *dev)
>   struct usnic_uiom_dev *uiom_dev;
>   int err;
>  
> + if (!dev_is_dma_coherent(dev)) {

Which part of the comment at the top of dma-map-ops.h is not clear
enough to you?
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v5 2/2] virtio-blk: support mq_ops->queue_rqs()

2022-04-05 Thread Christoph Hellwig
Looks good:

Reviewed-by: Christoph Hellwig 


Re: [PATCH v5 1/2] virtio-blk: support polling I/O

2022-04-05 Thread Christoph Hellwig
On Wed, Apr 06, 2022 at 12:09:23AM +0900, Suwan Kim wrote:
> +for (i = 0; i < num_vqs - num_poll_vqs; i++) {
> +callbacks[i] = virtblk_done;
> +snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req.%d", i);
> +names[i] = vblk->vqs[i].name;
> +}
> +
> +for (; i < num_vqs; i++) {
> +callbacks[i] = NULL;
> +snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req_poll.%d", i);
> +names[i] = vblk->vqs[i].name;
> +}

This uses spaces for indentation.

> + /*
> +  * Regular queues have interrupts and hence CPU affinity is
> +  * defined by the core virtio code, but polling queues have
> +  * no interrupts so we let the block layer assign CPU affinity.
> +  */
> + if (i != HCTX_TYPE_POLL)
> + blk_mq_virtio_map_queues(&set->map[i], vblk->vdev, 0);
> + else
> + blk_mq_map_queues(&set->map[i]);

Nit, but I would have just done a "positive" check here as that is a bit
easier to read:

	if (i == HCTX_TYPE_POLL)
		blk_mq_map_queues(&set->map[i]);
	else
		blk_mq_virtio_map_queues(&set->map[i], vblk->vdev, 0);

Otherwise looks good:

Reviewed-by: Christoph Hellwig 


[PATCH v9 31/32] virtio_net: support rx/tx queue resize

2022-04-05 Thread Xuan Zhuo
This patch implements the resize function for the rx and tx queues.
Based on this function, the ring size of a queue can be modified.

There may be an exception during the resize process: the resize may
fail, or the vq may no longer be usable. Either way, we must call
napi_enable(), because napi_disable() acts like a lock and every
napi_disable() must be followed by a matching napi_enable().

Signed-off-by: Xuan Zhuo 
---
 drivers/net/virtio_net.c | 81 
 1 file changed, 81 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b8bf00525177..ba6859f305f7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -251,6 +251,9 @@ struct padded_vnet_hdr {
char padding[4];
 };
 
+static void virtnet_sq_free_unused_buf(struct virtqueue *vq, void *buf);
+static void virtnet_rq_free_unused_buf(struct virtqueue *vq, void *buf);
+
 static bool is_xdp_frame(void *ptr)
 {
return (unsigned long)ptr & VIRTIO_XDP_FLAG;
@@ -1369,6 +1372,15 @@ static void virtnet_napi_enable(struct virtqueue *vq, 
struct napi_struct *napi)
 {
napi_enable(napi);
 
+   /* Check if vq is in reset state. The normal reset/resize process is
+* protected by napi, but that protection only lasts for the duration
+* of the operation. If re-enabling the vq fails during a resize, the
+* vq will remain unavailable in the reset state.
+*/
+   if (vq->reset)
+   return;
+
/* If all buffers were filled by other side before we napi_enabled, we
 * won't get another interrupt, so process any outstanding packets now.
 * Call local_bh_enable after to trigger softIRQ processing.
@@ -1413,6 +1425,15 @@ static void refill_work(struct work_struct *work)
struct receive_queue *rq = &vi->rq[i];
 
napi_disable(&rq->napi);
+
+   /* Check if vq is in reset state. See more in
+* virtnet_napi_enable()
+*/
+   if (rq->vq->reset) {
+   virtnet_napi_enable(rq->vq, &rq->napi);
+   continue;
+   }
+
still_empty = !try_fill_recv(vi, rq, GFP_KERNEL);
virtnet_napi_enable(rq->vq, &rq->napi);
 
@@ -1523,6 +1544,10 @@ static void virtnet_poll_cleantx(struct receive_queue 
*rq)
if (!sq->napi.weight || is_xdp_raw_buffer_queue(vi, index))
return;
 
+   /* Check if vq is in reset state. See more in virtnet_napi_enable() */
+   if (sq->vq->reset)
+   return;
+
if (__netif_tx_trylock(txq)) {
do {
virtqueue_disable_cb(sq->vq);
@@ -1769,6 +1794,62 @@ static netdev_tx_t start_xmit(struct sk_buff *skb, 
struct net_device *dev)
return NETDEV_TX_OK;
 }
 
+static int virtnet_rx_resize(struct virtnet_info *vi,
+struct receive_queue *rq, u32 ring_num)
+{
+   int err;
+
+   napi_disable(&rq->napi);
+
+   err = virtqueue_resize(rq->vq, ring_num, virtnet_rq_free_unused_buf);
+   if (err)
+   goto err;
+
+   if (!try_fill_recv(vi, rq, GFP_KERNEL))
+   schedule_delayed_work(&vi->refill, 0);
+
+   virtnet_napi_enable(rq->vq, &rq->napi);
+   return 0;
+
+err:
+   netdev_err(vi->dev,
+  "reset rx reset vq fail: rx queue index: %td err: %d\n",
+  rq - vi->rq, err);
+   virtnet_napi_enable(rq->vq, &rq->napi);
+   return err;
+}
+
+static int virtnet_tx_resize(struct virtnet_info *vi,
+struct send_queue *sq, u32 ring_num)
+{
+   struct netdev_queue *txq;
+   int err, qindex;
+
+   qindex = sq - vi->sq;
+
+   virtnet_napi_tx_disable(&sq->napi);
+
+   txq = netdev_get_tx_queue(vi->dev, qindex);
+   __netif_tx_lock_bh(txq);
+   netif_stop_subqueue(vi->dev, qindex);
+   __netif_tx_unlock_bh(txq);
+
+   err = virtqueue_resize(sq->vq, ring_num, virtnet_sq_free_unused_buf);
+   if (err)
+   goto err;
+
+   netif_start_subqueue(vi->dev, qindex);
+   virtnet_napi_tx_enable(vi, sq->vq, &sq->napi);
+   return 0;
+
+err:
+   netdev_err(vi->dev,
+  "reset tx reset vq fail: tx queue index: %td err: %d\n",
+  sq - vi->sq, err);
+   virtnet_napi_tx_enable(vi, sq->vq, &sq->napi);
+   return err;
+}
+
 /*
  * Send command via the control virtqueue and check status.  Commands
  * supported by the hypervisor, as indicated by feature bits, should
-- 
2.31.0



[PATCH v9 32/32] virtio_net: support set_ringparam

2022-04-05 Thread Xuan Zhuo
Support set_ringparam based on virtio queue reset.

Users can use "ethtool -G eth0 <ring num>" to modify the ring size of
virtio-net.
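As a usage sketch, the following only builds and prints the ethtool command, so it runs without a real virtio-net device; the device name and ring sizes are illustrative values, and an actual resize would execute the command on a kernel with this series applied.

```shell
# Build the ethtool resize command for a virtio-net device.
# DEV/RX/TX are example values; a real run would execute "$CMD".
DEV=eth0
RX=1024
TX=1024
CMD="ethtool -G $DEV rx $RX tx $TX"
echo "$CMD"
```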

Signed-off-by: Xuan Zhuo 
---
 drivers/net/virtio_net.c | 47 
 1 file changed, 47 insertions(+)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ba6859f305f7..37e4e27f1e4e 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2264,6 +2264,52 @@ static void virtnet_get_ringparam(struct net_device *dev,
ring->tx_pending = virtqueue_get_vring_size(vi->sq[0].vq);
 }
 
+static int virtnet_set_ringparam(struct net_device *dev,
+struct ethtool_ringparam *ring,
+struct kernel_ethtool_ringparam *kernel_ring,
+struct netlink_ext_ack *extack)
+{
+   struct virtnet_info *vi = netdev_priv(dev);
+   u32 rx_pending, tx_pending;
+   struct receive_queue *rq;
+   struct send_queue *sq;
+   int i, err;
+
+   if (ring->rx_mini_pending || ring->rx_jumbo_pending)
+   return -EINVAL;
+
+   rx_pending = virtqueue_get_vring_size(vi->rq[0].vq);
+   tx_pending = virtqueue_get_vring_size(vi->sq[0].vq);
+
+   if (ring->rx_pending == rx_pending &&
+   ring->tx_pending == tx_pending)
+   return 0;
+
+   if (ring->rx_pending > virtqueue_get_vring_max_size(vi->rq[0].vq))
+   return -EINVAL;
+
+   if (ring->tx_pending > virtqueue_get_vring_max_size(vi->sq[0].vq))
+   return -EINVAL;
+
+   for (i = 0; i < vi->max_queue_pairs; i++) {
+   rq = vi->rq + i;
+   sq = vi->sq + i;
+
+   if (ring->tx_pending != tx_pending) {
+   err = virtnet_tx_resize(vi, sq, ring->tx_pending);
+   if (err)
+   return err;
+   }
+
+   if (ring->rx_pending != rx_pending) {
+   err = virtnet_rx_resize(vi, rq, ring->rx_pending);
+   if (err)
+   return err;
+   }
+   }
+
+   return 0;
+}
 
 static void virtnet_get_drvinfo(struct net_device *dev,
struct ethtool_drvinfo *info)
@@ -2497,6 +2543,7 @@ static const struct ethtool_ops virtnet_ethtool_ops = {
.get_drvinfo = virtnet_get_drvinfo,
.get_link = ethtool_op_get_link,
.get_ringparam = virtnet_get_ringparam,
+   .set_ringparam = virtnet_set_ringparam,
.get_strings = virtnet_get_strings,
.get_sset_count = virtnet_get_sset_count,
.get_ethtool_stats = virtnet_get_ethtool_stats,
-- 
2.31.0



[PATCH v9 30/32] virtio_net: split free_unused_bufs()

2022-04-05 Thread Xuan Zhuo
This patch separates out two functions from free_unused_bufs(): one for
freeing an sq buf and one for freeing an rq buf.

When enabling/disabling a tx/rx queue is supported in the future, it
will be necessary to recover the bufs of a single sq or rq separately.

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
---
 drivers/net/virtio_net.c | 41 
 1 file changed, 25 insertions(+), 16 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 96d96c666c8c..b8bf00525177 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2804,6 +2804,27 @@ static void free_receive_page_frags(struct virtnet_info 
*vi)
put_page(vi->rq[i].alloc_frag.page);
 }
 
+static void virtnet_sq_free_unused_buf(struct virtqueue *vq, void *buf)
+{
+   if (!is_xdp_frame(buf))
+   dev_kfree_skb(buf);
+   else
+   xdp_return_frame(ptr_to_xdp(buf));
+}
+
+static void virtnet_rq_free_unused_buf(struct virtqueue *vq, void *buf)
+{
+   struct virtnet_info *vi = vq->vdev->priv;
+   int i = vq2rxq(vq);
+
+   if (vi->mergeable_rx_bufs)
+   put_page(virt_to_head_page(buf));
+   else if (vi->big_packets)
+   give_pages(>rq[i], buf);
+   else
+   put_page(virt_to_head_page(buf));
+}
+
 static void free_unused_bufs(struct virtnet_info *vi)
 {
void *buf;
@@ -2811,26 +2832,14 @@ static void free_unused_bufs(struct virtnet_info *vi)
 
for (i = 0; i < vi->max_queue_pairs; i++) {
struct virtqueue *vq = vi->sq[i].vq;
-   while ((buf = virtqueue_detach_unused_buf(vq)) != NULL) {
-   if (!is_xdp_frame(buf))
-   dev_kfree_skb(buf);
-   else
-   xdp_return_frame(ptr_to_xdp(buf));
-   }
+   while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
+   virtnet_sq_free_unused_buf(vq, buf);
}
 
for (i = 0; i < vi->max_queue_pairs; i++) {
struct virtqueue *vq = vi->rq[i].vq;
-
-   while ((buf = virtqueue_detach_unused_buf(vq)) != NULL) {
-   if (vi->mergeable_rx_bufs) {
-   put_page(virt_to_head_page(buf));
-   } else if (vi->big_packets) {
-   give_pages(>rq[i], buf);
-   } else {
-   put_page(virt_to_head_page(buf));
-   }
-   }
+   while ((buf = virtqueue_detach_unused_buf(vq)) != NULL)
+   virtnet_rq_free_unused_buf(vq, buf);
}
 }
 
-- 
2.31.0



[PATCH v9 29/32] virtio_net: get ringparam by virtqueue_get_vring_max_size()

2022-04-05 Thread Xuan Zhuo
Use virtqueue_get_vring_max_size() in virtnet_get_ringparam() to set
tx_max_pending and rx_max_pending.

Signed-off-by: Xuan Zhuo 
---
 drivers/net/virtio_net.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index dad497a47b3a..96d96c666c8c 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2177,10 +2177,10 @@ static void virtnet_get_ringparam(struct net_device 
*dev,
 {
struct virtnet_info *vi = netdev_priv(dev);
 
-   ring->rx_max_pending = virtqueue_get_vring_size(vi->rq[0].vq);
-   ring->tx_max_pending = virtqueue_get_vring_size(vi->sq[0].vq);
-   ring->rx_pending = ring->rx_max_pending;
-   ring->tx_pending = ring->tx_max_pending;
+   ring->rx_max_pending = virtqueue_get_vring_max_size(vi->rq[0].vq);
+   ring->tx_max_pending = virtqueue_get_vring_max_size(vi->sq[0].vq);
+   ring->rx_pending = virtqueue_get_vring_size(vi->rq[0].vq);
+   ring->tx_pending = virtqueue_get_vring_size(vi->sq[0].vq);
 }
 
 
-- 
2.31.0



[PATCH v9 28/32] virtio_net: set the default max ring size by find_vqs()

2022-04-05 Thread Xuan Zhuo
Use virtio_find_vqs_ctx_size() to specify the maximum ring sizes of the
tx and rx queues at the same time.

speed                     | rx/tx ring size
--------------------------|----------------
speed == UNKNOWN or < 10G | 1024
speed < 40G               | 4096
speed >= 40G              | 8192

Call virtnet_update_settings() once before calling init_vqs() to update
speed.

Signed-off-by: Xuan Zhuo 
---
 drivers/net/virtio_net.c | 42 
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index a801ea40908f..dad497a47b3a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2861,6 +2861,29 @@ static unsigned int mergeable_min_buf_len(struct 
virtnet_info *vi, struct virtqu
   (unsigned int)GOOD_PACKET_LEN);
 }
 
+static void virtnet_config_sizes(struct virtnet_info *vi, u32 *sizes)
+{
+   u32 i, rx_size, tx_size;
+
+   if (vi->speed == SPEED_UNKNOWN || vi->speed < SPEED_10000) {
+   rx_size = 1024;
+   tx_size = 1024;
+
+   } else if (vi->speed < SPEED_40000) {
+   rx_size = 1024 * 4;
+   tx_size = 1024 * 4;
+
+   } else {
+   rx_size = 1024 * 8;
+   tx_size = 1024 * 8;
+   }
+
+   for (i = 0; i < vi->max_queue_pairs; i++) {
+   sizes[rxq2vq(i)] = rx_size;
+   sizes[txq2vq(i)] = tx_size;
+   }
+}
+
 static int virtnet_find_vqs(struct virtnet_info *vi)
 {
vq_callback_t **callbacks;
@@ -2868,6 +2891,7 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
int ret = -ENOMEM;
int i, total_vqs;
const char **names;
+   u32 *sizes;
bool *ctx;
 
/* We expect 1 RX virtqueue followed by 1 TX virtqueue, followed by
@@ -2895,10 +2919,15 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
ctx = NULL;
}
 
+   sizes = kmalloc_array(total_vqs, sizeof(*sizes), GFP_KERNEL);
+   if (!sizes)
+   goto err_sizes;
+
/* Parameters for control virtqueue, if any */
if (vi->has_cvq) {
callbacks[total_vqs - 1] = NULL;
names[total_vqs - 1] = "control";
+   sizes[total_vqs - 1] = 64;
}
 
/* Allocate/initialize parameters for send/receive virtqueues */
@@ -2913,8 +2942,10 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
ctx[rxq2vq(i)] = true;
}
 
-   ret = virtio_find_vqs_ctx(vi->vdev, total_vqs, vqs, callbacks,
- names, ctx, NULL);
+   virtnet_config_sizes(vi, sizes);
+
+   ret = virtio_find_vqs_ctx_size(vi->vdev, total_vqs, vqs, callbacks,
+  names, sizes, ctx, NULL);
if (ret)
goto err_find;
 
@@ -2934,6 +2965,8 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
 
 
 err_find:
+   kfree(sizes);
+err_sizes:
kfree(ctx);
 err_ctx:
kfree(names);
@@ -3252,6 +3285,9 @@ static int virtnet_probe(struct virtio_device *vdev)
vi->curr_queue_pairs = num_online_cpus();
vi->max_queue_pairs = max_queue_pairs;
 
+   virtnet_init_settings(dev);
+   virtnet_update_settings(vi);
+
/* Allocate/initialize the rx/tx queues, and invoke find_vqs */
err = init_vqs(vi);
if (err)
@@ -3264,8 +3300,6 @@ static int virtnet_probe(struct virtio_device *vdev)
netif_set_real_num_tx_queues(dev, vi->curr_queue_pairs);
netif_set_real_num_rx_queues(dev, vi->curr_queue_pairs);
 
-   virtnet_init_settings(dev);
-
if (virtio_has_feature(vdev, VIRTIO_NET_F_STANDBY)) {
vi->failover = net_failover_create(vi->dev);
if (IS_ERR(vi->failover)) {
-- 
2.31.0



[PATCH v9 27/32] virtio: add helper virtio_find_vqs_ctx_size()

2022-04-05 Thread Xuan Zhuo
Introduce helper virtio_find_vqs_ctx_size() to call find_vqs and specify
the maximum size of each vq ring.

Signed-off-by: Xuan Zhuo 
---
 include/linux/virtio_config.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 0f7def7ddfd2..22e29c926946 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -235,6 +235,18 @@ int virtio_find_vqs_ctx(struct virtio_device *vdev, 
unsigned nvqs,
  ctx, desc);
 }
 
+static inline
+int virtio_find_vqs_ctx_size(struct virtio_device *vdev, u32 nvqs,
+struct virtqueue *vqs[],
+vq_callback_t *callbacks[],
+const char * const names[],
+u32 sizes[],
+const bool *ctx, struct irq_affinity *desc)
+{
+   return vdev->config->find_vqs(vdev, nvqs, vqs, callbacks, names, sizes,
+ ctx, desc);
+}
+
 /**
  * virtio_device_ready - enable vq use in probe function
  * @vdev: the device
-- 
2.31.0



[PATCH v9 26/32] virtio_mmio: support the arg sizes of find_vqs()

2022-04-05 Thread Xuan Zhuo
Virtio MMIO supports the new parameter sizes of find_vqs().

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_mmio.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 9d5a674bdeec..51cf51764a92 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -347,7 +347,7 @@ static void vm_del_vqs(struct virtio_device *vdev)
 
 static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned 
index,
  void (*callback)(struct virtqueue *vq),
- const char *name, bool ctx)
+ const char *name, u32 size, bool ctx)
 {
struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
struct virtio_mmio_vq_info *info;
@@ -382,8 +382,11 @@ static struct virtqueue *vm_setup_vq(struct virtio_device 
*vdev, unsigned index,
goto error_new_virtqueue;
}
 
+   if (!size || size > num)
+   size = num;
+
/* Create the vring */
-   vq = vring_create_virtqueue(index, num, VIRTIO_MMIO_VRING_ALIGN, vdev,
+   vq = vring_create_virtqueue(index, size, VIRTIO_MMIO_VRING_ALIGN, vdev,
 true, true, ctx, vm_notify, callback, name);
if (!vq) {
err = -ENOMEM;
@@ -484,6 +487,7 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned 
nvqs,
}
 
vqs[i] = vm_setup_vq(vdev, queue_idx++, callbacks[i], names[i],
+sizes ? sizes[i] : 0,
 ctx ? ctx[i] : false);
if (IS_ERR(vqs[i])) {
vm_del_vqs(vdev);
-- 
2.31.0



[PATCH v9 24/32] virtio: find_vqs() add arg sizes

2022-04-05 Thread Xuan Zhuo
find_vqs() adds a new parameter sizes to specify the size of each vq
vring.

0 means use the maximum size supported by the backend.

In the split scenario, size is an upper bound: because the allocation
may be limited by memory, the virtio core will try a smaller size. The
size is always a power of 2.

Signed-off-by: Xuan Zhuo 
Acked-by: Hans de Goede 
Reviewed-by: Mathieu Poirier 
---
 arch/um/drivers/virtio_uml.c |  2 +-
 drivers/platform/mellanox/mlxbf-tmfifo.c |  1 +
 drivers/remoteproc/remoteproc_virtio.c   |  1 +
 drivers/s390/virtio/virtio_ccw.c |  1 +
 drivers/virtio/virtio_mmio.c |  1 +
 drivers/virtio/virtio_pci_common.c   |  2 +-
 drivers/virtio/virtio_pci_common.h   |  2 +-
 drivers/virtio/virtio_pci_modern.c   |  7 +--
 drivers/virtio/virtio_vdpa.c |  1 +
 include/linux/virtio_config.h| 14 +-
 10 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/arch/um/drivers/virtio_uml.c b/arch/um/drivers/virtio_uml.c
index 904993d15a85..6af98d130840 100644
--- a/arch/um/drivers/virtio_uml.c
+++ b/arch/um/drivers/virtio_uml.c
@@ -998,7 +998,7 @@ static struct virtqueue *vu_setup_vq(struct virtio_device 
*vdev,
 
 static int vu_find_vqs(struct virtio_device *vdev, unsigned nvqs,
   struct virtqueue *vqs[], vq_callback_t *callbacks[],
-  const char * const names[], const bool *ctx,
+  const char * const names[], u32 sizes[], const bool *ctx,
   struct irq_affinity *desc)
 {
struct virtio_uml_device *vu_dev = to_virtio_uml_device(vdev);
diff --git a/drivers/platform/mellanox/mlxbf-tmfifo.c 
b/drivers/platform/mellanox/mlxbf-tmfifo.c
index 1ae3c56b66b0..8be13d416f48 100644
--- a/drivers/platform/mellanox/mlxbf-tmfifo.c
+++ b/drivers/platform/mellanox/mlxbf-tmfifo.c
@@ -928,6 +928,7 @@ static int mlxbf_tmfifo_virtio_find_vqs(struct 
virtio_device *vdev,
struct virtqueue *vqs[],
vq_callback_t *callbacks[],
const char * const names[],
+   u32 sizes[],
const bool *ctx,
struct irq_affinity *desc)
 {
diff --git a/drivers/remoteproc/remoteproc_virtio.c 
b/drivers/remoteproc/remoteproc_virtio.c
index 7611755d0ae2..baad31c9da45 100644
--- a/drivers/remoteproc/remoteproc_virtio.c
+++ b/drivers/remoteproc/remoteproc_virtio.c
@@ -158,6 +158,7 @@ static int rproc_virtio_find_vqs(struct virtio_device 
*vdev, unsigned int nvqs,
 struct virtqueue *vqs[],
 vq_callback_t *callbacks[],
 const char * const names[],
+u32 sizes[],
 const bool * ctx,
 struct irq_affinity *desc)
 {
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 468da60b56c5..f0c814a54e78 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -634,6 +634,7 @@ static int virtio_ccw_find_vqs(struct virtio_device *vdev, 
unsigned nvqs,
   struct virtqueue *vqs[],
   vq_callback_t *callbacks[],
   const char * const names[],
+  u32 sizes[],
   const bool *ctx,
   struct irq_affinity *desc)
 {
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index a41abc8051b9..9d5a674bdeec 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -461,6 +461,7 @@ static int vm_find_vqs(struct virtio_device *vdev, unsigned 
nvqs,
   struct virtqueue *vqs[],
   vq_callback_t *callbacks[],
   const char * const names[],
+  u32 sizes[],
   const bool *ctx,
   struct irq_affinity *desc)
 {
diff --git a/drivers/virtio/virtio_pci_common.c 
b/drivers/virtio/virtio_pci_common.c
index 863d3a8a0956..826ea2e35d54 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -427,7 +427,7 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, 
unsigned nvqs,
 /* the config->find_vqs() implementation */
 int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
struct virtqueue *vqs[], vq_callback_t *callbacks[],
-   const char * const names[], const bool *ctx,
+   const char * const names[], u32 sizes[], const bool *ctx,
struct irq_affinity *desc)
 {
int err;
diff --git a/drivers/virtio/virtio_pci_common.h 
b/drivers/virtio/virtio_pci_common.h
index 23f6c5c678d5..859eed559e10 100644
--- 

[PATCH v9 25/32] virtio_pci: support the arg sizes of find_vqs()

2022-04-05 Thread Xuan Zhuo
Virtio PCI supports the new parameter sizes of find_vqs().

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_pci_common.c | 18 ++
 drivers/virtio/virtio_pci_common.h |  1 +
 drivers/virtio/virtio_pci_legacy.c |  6 +-
 drivers/virtio/virtio_pci_modern.c | 10 +++---
 4 files changed, 23 insertions(+), 12 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c 
b/drivers/virtio/virtio_pci_common.c
index 826ea2e35d54..23976c61583f 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -208,6 +208,7 @@ static int vp_request_msix_vectors(struct virtio_device 
*vdev, int nvectors,
 static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned 
index,
 void (*callback)(struct virtqueue *vq),
 const char *name,
+u32 size,
 bool ctx,
 u16 msix_vec)
 {
@@ -220,7 +221,7 @@ static struct virtqueue *vp_setup_vq(struct virtio_device 
*vdev, unsigned index,
if (!info)
return ERR_PTR(-ENOMEM);
 
-   vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, ctx,
+   vq = vp_dev->setup_vq(vp_dev, info, index, callback, name, size, ctx,
  msix_vec);
if (IS_ERR(vq))
goto out_info;
@@ -314,7 +315,7 @@ void vp_del_vqs(struct virtio_device *vdev)
 
 static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
struct virtqueue *vqs[], vq_callback_t *callbacks[],
-   const char * const names[], bool per_vq_vectors,
+   const char * const names[], u32 sizes[], bool per_vq_vectors,
const bool *ctx,
struct irq_affinity *desc)
 {
@@ -357,8 +358,8 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, 
unsigned nvqs,
else
msix_vec = VP_MSIX_VQ_VECTOR;
vqs[i] = vp_setup_vq(vdev, queue_idx++, callbacks[i], names[i],
-ctx ? ctx[i] : false,
-msix_vec);
+sizes ? sizes[i] : 0,
+ctx ? ctx[i] : false, msix_vec);
if (IS_ERR(vqs[i])) {
err = PTR_ERR(vqs[i]);
goto error_find;
@@ -388,7 +389,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, 
unsigned nvqs,
 
 static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned nvqs,
struct virtqueue *vqs[], vq_callback_t *callbacks[],
-   const char * const names[], const bool *ctx)
+   const char * const names[], u32 sizes[], const bool *ctx)
 {
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
int i, err, queue_idx = 0;
@@ -410,6 +411,7 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, 
unsigned nvqs,
continue;
}
vqs[i] = vp_setup_vq(vdev, queue_idx++, callbacks[i], names[i],
+sizes ? sizes[i] : 0,
 ctx ? ctx[i] : false,
 VIRTIO_MSI_NO_VECTOR);
if (IS_ERR(vqs[i])) {
@@ -433,15 +435,15 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
int err;
 
/* Try MSI-X with one vector per queue. */
-   err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, true, ctx, desc);
+   err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, sizes, true, ctx, desc);
if (!err)
return 0;
/* Fallback: MSI-X with one vector for config, one shared for queues. */
-   err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, false, ctx, desc);
+   err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, sizes, false, ctx, desc);
if (!err)
return 0;
/* Finally fall back to regular interrupts. */
-   return vp_find_vqs_intx(vdev, nvqs, vqs, callbacks, names, ctx);
+   return vp_find_vqs_intx(vdev, nvqs, vqs, callbacks, names, sizes, ctx);
 }
 
 const char *vp_bus_name(struct virtio_device *vdev)
diff --git a/drivers/virtio/virtio_pci_common.h 
b/drivers/virtio/virtio_pci_common.h
index 859eed559e10..fbf5a6d4b164 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -81,6 +81,7 @@ struct virtio_pci_device {
  unsigned idx,
  void (*callback)(struct virtqueue *vq),
  const char *name,
+ u32 size,
  bool ctx,
  u16 msix_vec);
void (*del_vq)(struct virtio_pci_vq_info *info);
diff --git 

[PATCH v9 23/32] virtio_pci: queue_reset: support VIRTIO_F_RING_RESET

2022-04-05 Thread Xuan Zhuo
This patch implements virtio pci support for QUEUE RESET.

Performing reset on a queue is divided into these steps:

 1. notify the device to reset the queue
 2. recycle the buffer submitted
 3. reset the vring (may re-alloc)
 4. mmap vring to device, and enable the queue

This patch implements virtio_reset_vq(), virtio_enable_resetq() in the
pci scenario.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_pci_common.c |  8 +--
 drivers/virtio/virtio_pci_modern.c | 84 ++
 drivers/virtio/virtio_ring.c   |  2 +
 include/linux/virtio.h |  1 +
 4 files changed, 92 insertions(+), 3 deletions(-)

diff --git a/drivers/virtio/virtio_pci_common.c 
b/drivers/virtio/virtio_pci_common.c
index fdbde1db5ec5..863d3a8a0956 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -248,9 +248,11 @@ static void vp_del_vq(struct virtqueue *vq)
struct virtio_pci_vq_info *info = vp_dev->vqs[vq->index];
unsigned long flags;
 
-   spin_lock_irqsave(&vp_dev->lock, flags);
-   list_del(&info->node);
-   spin_unlock_irqrestore(&vp_dev->lock, flags);
+   if (!vq->reset) {
+   spin_lock_irqsave(&vp_dev->lock, flags);
+   list_del(&info->node);
+   spin_unlock_irqrestore(&vp_dev->lock, flags);
+   }
 
vp_dev->del_vq(info);
kfree(info);
diff --git a/drivers/virtio/virtio_pci_modern.c 
b/drivers/virtio/virtio_pci_modern.c
index 49a4493732cf..cb5d38f1c9c8 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -34,6 +34,9 @@ static void vp_transport_features(struct virtio_device *vdev, 
u64 features)
if ((features & BIT_ULL(VIRTIO_F_SR_IOV)) &&
pci_find_ext_capability(pci_dev, PCI_EXT_CAP_ID_SRIOV))
__virtio_set_bit(vdev, VIRTIO_F_SR_IOV);
+
+   if (features & BIT_ULL(VIRTIO_F_RING_RESET))
+   __virtio_set_bit(vdev, VIRTIO_F_RING_RESET);
 }
 
 /* virtio config->finalize_features() implementation */
@@ -199,6 +202,83 @@ static int vp_active_vq(struct virtqueue *vq, u16 msix_vec)
return 0;
 }
 
+static int vp_modern_reset_vq(struct virtqueue *vq)
+{
+   struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
+   struct virtio_pci_modern_device *mdev = &vp_dev->mdev;
+   struct virtio_pci_vq_info *info;
+   unsigned long flags;
+
+   if (!virtio_has_feature(vq->vdev, VIRTIO_F_RING_RESET))
+   return -ENOENT;
+
+   vp_modern_set_queue_reset(mdev, vq->index);
+
+   info = vp_dev->vqs[vq->index];
+
+   /* delete vq from irq handler */
+   spin_lock_irqsave(&vp_dev->lock, flags);
+   list_del(&info->node);
+   spin_unlock_irqrestore(&vp_dev->lock, flags);
+
+   INIT_LIST_HEAD(&info->node);
+
+   /* If the vq has an exclusive irq, call disable_irq() to prevent the
+* irq from being received again and to mask any pending irq.
+*
+* With shared interrupts, the irq handler looks the vq up in the
+* virtqueues list. Since the list_del() above removed this vq from
+* that list, its callback can no longer be invoked, so there is no
+* need to disable the corresponding interrupt.
+*/
+   if (vp_dev->per_vq_vectors && info->msix_vector != VIRTIO_MSI_NO_VECTOR)
+   disable_irq(pci_irq_vector(vp_dev->pci_dev, info->msix_vector));
+
+   vq->reset = true;
+
+   return 0;
+}
+
+static int vp_modern_enable_reset_vq(struct virtqueue *vq)
+{
+   struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
+   struct virtio_pci_modern_device *mdev = &vp_dev->mdev;
+   struct virtio_pci_vq_info *info;
+   unsigned long flags, index;
+   int err;
+
+   if (!vq->reset)
+   return -EBUSY;
+
+   index = vq->index;
+   info = vp_dev->vqs[index];
+
+   /* check queue reset status */
+   if (vp_modern_get_queue_reset(mdev, index) != 1)
+   return -EBUSY;
+
+   err = vp_active_vq(vq, info->msix_vector);
+   if (err)
+   return err;
+
+   if (vq->callback) {
+   spin_lock_irqsave(&vp_dev->lock, flags);
+   list_add(&info->node, &vp_dev->virtqueues);
+   spin_unlock_irqrestore(&vp_dev->lock, flags);
+   } else {
+   INIT_LIST_HEAD(&info->node);
+   }
+
+   vp_modern_set_queue_enable(&vp_dev->mdev, index, true);
+
+   if (vp_dev->per_vq_vectors && info->msix_vector != VIRTIO_MSI_NO_VECTOR)
+   enable_irq(pci_irq_vector(vp_dev->pci_dev, info->msix_vector));
+
+   vq->reset = false;
+
+   return 0;
+}
+
 static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector)
 {
return vp_modern_config_vector(&vp_dev->mdev, vector);
@@ -407,6 +487,8 @@ static const struct virtio_config_ops 
virtio_pci_config_nodev_ops = {
.set_vq_affinity = vp_set_vq_affinity,
.get_vq_affinity = vp_get_vq_affinity,

[PATCH v9 22/32] virtio_pci: queue_reset: extract the logic of active vq for modern pci

2022-04-05 Thread Xuan Zhuo
Introduce vp_active_vq() to configure the vring to the backend after the
vq is attached to the vring, and to configure the vq vector if necessary.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_pci_modern.c | 46 ++
 1 file changed, 28 insertions(+), 18 deletions(-)

diff --git a/drivers/virtio/virtio_pci_modern.c 
b/drivers/virtio/virtio_pci_modern.c
index 86d301f272b8..49a4493732cf 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -176,6 +176,29 @@ static void vp_reset(struct virtio_device *vdev)
vp_disable_cbs(vdev);
 }
 
+static int vp_active_vq(struct virtqueue *vq, u16 msix_vec)
+{
+   struct virtio_pci_device *vp_dev = to_vp_device(vq->vdev);
+   struct virtio_pci_modern_device *mdev = &vp_dev->mdev;
+   unsigned long index;
+
+   index = vq->index;
+
+   /* activate the queue */
+   vp_modern_set_queue_size(mdev, index, virtqueue_get_vring_size(vq));
+   vp_modern_queue_address(mdev, index, virtqueue_get_desc_addr(vq),
+   virtqueue_get_avail_addr(vq),
+   virtqueue_get_used_addr(vq));
+
+   if (msix_vec != VIRTIO_MSI_NO_VECTOR) {
+   msix_vec = vp_modern_queue_vector(mdev, index, msix_vec);
+   if (msix_vec == VIRTIO_MSI_NO_VECTOR)
+   return -EBUSY;
+   }
+
+   return 0;
+}
+
 static u16 vp_config_vector(struct virtio_pci_device *vp_dev, u16 vector)
 {
	return vp_modern_config_vector(&vp_dev->mdev, vector);
@@ -220,32 +243,19 @@ static struct virtqueue *setup_vq(struct 
virtio_pci_device *vp_dev,
 
vq->num_max = num;
 
-   /* activate the queue */
-   vp_modern_set_queue_size(mdev, index, virtqueue_get_vring_size(vq));
-   vp_modern_queue_address(mdev, index, virtqueue_get_desc_addr(vq),
-   virtqueue_get_avail_addr(vq),
-   virtqueue_get_used_addr(vq));
+   err = vp_active_vq(vq, msix_vec);
+   if (err)
+   goto err;
 
vq->priv = (void __force *)vp_modern_map_vq_notify(mdev, index, NULL);
if (!vq->priv) {
err = -ENOMEM;
-   goto err_map_notify;
-   }
-
-   if (msix_vec != VIRTIO_MSI_NO_VECTOR) {
-   msix_vec = vp_modern_queue_vector(mdev, index, msix_vec);
-   if (msix_vec == VIRTIO_MSI_NO_VECTOR) {
-   err = -EBUSY;
-   goto err_assign_vector;
-   }
+   goto err;
}
 
return vq;
 
-err_assign_vector:
-   if (!mdev->notify_base)
-   pci_iounmap(mdev->pci_dev, (void __iomem __force *)vq->priv);
-err_map_notify:
+err:
vring_del_virtqueue(vq);
return ERR_PTR(err);
 }
-- 
2.31.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v9 21/32] virtio_pci: queue_reset: update struct virtio_pci_common_cfg and option functions

2022-04-05 Thread Xuan Zhuo
Add queue_reset in virtio_pci_common_cfg, and add related operation
functions.

To avoid breaking the uABI, add a new struct virtio_pci_common_cfg_reset.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_pci_modern_dev.c | 36 ++
 include/linux/virtio_pci_modern.h  |  2 ++
 include/uapi/linux/virtio_pci.h|  7 +
 3 files changed, 45 insertions(+)

diff --git a/drivers/virtio/virtio_pci_modern_dev.c b/drivers/virtio/virtio_pci_modern_dev.c
index e8b3ff2b9fbc..8c74b00bc511 100644
--- a/drivers/virtio/virtio_pci_modern_dev.c
+++ b/drivers/virtio/virtio_pci_modern_dev.c
@@ -3,6 +3,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * vp_modern_map_capability - map a part of virtio pci capability
@@ -463,6 +464,41 @@ void vp_modern_set_status(struct virtio_pci_modern_device 
*mdev,
 }
 EXPORT_SYMBOL_GPL(vp_modern_set_status);
 
+/*
+ * vp_modern_get_queue_reset - get the queue reset status
+ * @mdev: the modern virtio-pci device
+ * @index: queue index
+ */
+int vp_modern_get_queue_reset(struct virtio_pci_modern_device *mdev, u16 index)
+{
+   struct virtio_pci_common_cfg_reset __iomem *cfg;
+
+   cfg = (struct virtio_pci_common_cfg_reset __iomem *)mdev->common;
+
+   vp_iowrite16(index, &cfg->cfg.queue_select);
+   return vp_ioread16(&cfg->queue_reset);
+}
+EXPORT_SYMBOL_GPL(vp_modern_get_queue_reset);
+
+/*
+ * vp_modern_set_queue_reset - reset the queue
+ * @mdev: the modern virtio-pci device
+ * @index: queue index
+ */
+void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev, u16 index)
+{
+   struct virtio_pci_common_cfg_reset __iomem *cfg;
+
+   cfg = (struct virtio_pci_common_cfg_reset __iomem *)mdev->common;
+
+   vp_iowrite16(index, &cfg->cfg.queue_select);
+   vp_iowrite16(1, &cfg->queue_reset);
+
+   while (vp_ioread16(&cfg->queue_reset) != 1)
+   msleep(1);
+}
+EXPORT_SYMBOL_GPL(vp_modern_set_queue_reset);
+
 /*
  * vp_modern_queue_vector - set the MSIX vector for a specific virtqueue
  * @mdev: the modern virtio-pci device
diff --git a/include/linux/virtio_pci_modern.h b/include/linux/virtio_pci_modern.h
index eb2bd9b4077d..cc4154dd7b28 100644
--- a/include/linux/virtio_pci_modern.h
+++ b/include/linux/virtio_pci_modern.h
@@ -106,4 +106,6 @@ void __iomem * vp_modern_map_vq_notify(struct virtio_pci_modern_device *mdev,
   u16 index, resource_size_t *pa);
 int vp_modern_probe(struct virtio_pci_modern_device *mdev);
 void vp_modern_remove(struct virtio_pci_modern_device *mdev);
+int vp_modern_get_queue_reset(struct virtio_pci_modern_device *mdev, u16 index);
+void vp_modern_set_queue_reset(struct virtio_pci_modern_device *mdev, u16 index);
 #endif
diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index 22bec9bd0dfc..d9462efd6ce8 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -173,6 +173,13 @@ struct virtio_pci_common_cfg_notify {
__le16 padding;
 };
 
+struct virtio_pci_common_cfg_reset {
+   struct virtio_pci_common_cfg cfg;
+
+   __le16 queue_notify_data;   /* read-write */
+   __le16 queue_reset; /* read-write */
+};
+
 /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */
 struct virtio_pci_cfg_cap {
struct virtio_pci_cap cap;
-- 
2.31.0
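For reference, the select/write/poll sequence that vp_modern_set_queue_reset() performs can be modeled in plain C. This is a hypothetical userspace sketch, not kernel code: the mock names (`mock_common_cfg`, `mock_set_queue_reset`) are invented, and the mock "device" completes the reset immediately, whereas real hardware may take time (which is why the kernel sleeps via msleep(1) between reads).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model of the queue_reset handshake:
 * select the queue, write 1 to queue_reset, then poll until the
 * device reports the reset as complete. */
struct mock_common_cfg {
	uint16_t queue_select;
	uint16_t queue_reset;	/* read-write per the proposed layout */
};

/* Mock register read; the kernel version uses vp_ioread16(). */
static uint16_t mock_read_reset(const struct mock_common_cfg *cfg)
{
	return cfg->queue_reset;
}

static void mock_set_queue_reset(struct mock_common_cfg *cfg, uint16_t index)
{
	cfg->queue_select = index;	/* vp_iowrite16(index, ...) */
	cfg->queue_reset = 1;		/* vp_iowrite16(1, ...) */

	while (mock_read_reset(cfg) != 1)
		;			/* kernel: msleep(1) between polls */
}
```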



[PATCH v9 19/32] virtio_pci: struct virtio_pci_common_cfg add queue_notify_data

2022-04-05 Thread Xuan Zhuo
Add queue_notify_data in struct virtio_pci_common_cfg; it comes from
https://github.com/oasis-tcs/virtio-spec/issues/89

To avoid breaking the uABI, add a new struct virtio_pci_common_cfg_notify.

Since I want to add queue_reset after queue_notify_data, I submitted
this patch first.

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
---
 include/uapi/linux/virtio_pci.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index 3a86f36d7e3d..22bec9bd0dfc 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -166,6 +166,13 @@ struct virtio_pci_common_cfg {
__le32 queue_used_hi;   /* read-write */
 };
 
+struct virtio_pci_common_cfg_notify {
+   struct virtio_pci_common_cfg cfg;
+
+   __le16 queue_notify_data;   /* read-write */
+   __le16 padding;
+};
+
 /* Fields in VIRTIO_PCI_CAP_PCI_CFG: */
 struct virtio_pci_cfg_cap {
struct virtio_pci_cap cap;
-- 
2.31.0
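The "extend, don't modify" uABI pattern above can be checked in a small userspace sketch. The layout of the base struct follows the virtio 1.x specification (uint16_t/uint32_t stand in for __le16/__le32 here); on ABIs with natural alignment the base struct has no padding, so the appended field lands immediately after it at offset 56.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Base common configuration layout per the virtio 1.x spec. */
struct virtio_pci_common_cfg {
	uint32_t device_feature_select;
	uint32_t device_feature;
	uint32_t guest_feature_select;
	uint32_t guest_feature;
	uint16_t msix_config;
	uint16_t num_queues;
	uint8_t  device_status;
	uint8_t  config_generation;
	uint16_t queue_select;
	uint16_t queue_size;
	uint16_t queue_msix_vector;
	uint16_t queue_enable;
	uint16_t queue_notify_off;
	uint32_t queue_desc_lo;
	uint32_t queue_desc_hi;
	uint32_t queue_avail_lo;
	uint32_t queue_avail_hi;
	uint32_t queue_used_lo;
	uint32_t queue_used_hi;
};

/* The patch's extension: embed the old struct, append new registers. */
struct virtio_pci_common_cfg_notify {
	struct virtio_pci_common_cfg cfg;

	uint16_t queue_notify_data;	/* read-write */
	uint16_t padding;
};
```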



[PATCH v9 20/32] virtio: queue_reset: add VIRTIO_F_RING_RESET

2022-04-05 Thread Xuan Zhuo
Add VIRTIO_F_RING_RESET; it comes from
https://github.com/oasis-tcs/virtio-spec/issues/124

This feature indicates that the driver can reset a queue individually.

Signed-off-by: Xuan Zhuo 
Acked-by: Jason Wang 
---
 include/uapi/linux/virtio_config.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/virtio_config.h b/include/uapi/linux/virtio_config.h
index b5eda06f0d57..0862be802ff8 100644
--- a/include/uapi/linux/virtio_config.h
+++ b/include/uapi/linux/virtio_config.h
@@ -52,7 +52,7 @@
  * rest are per-device feature bits.
  */
 #define VIRTIO_TRANSPORT_F_START   28
-#define VIRTIO_TRANSPORT_F_END 38
+#define VIRTIO_TRANSPORT_F_END 41
 
 #ifndef VIRTIO_CONFIG_NO_LEGACY
 /* Do we get callbacks when the ring is completely used, even if we've
@@ -92,4 +92,9 @@
  * Does the device support Single Root I/O Virtualization?
  */
 #define VIRTIO_F_SR_IOV37
+
+/*
+ * This feature indicates that the driver can reset a queue individually.
+ */
+#define VIRTIO_F_RING_RESET40
 #endif /* _UAPI_LINUX_VIRTIO_CONFIG_H */
-- 
2.31.0
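The bump of VIRTIO_TRANSPORT_F_END to 41 is what keeps bit 40 inside the transport-feature window (the kernel's transport-feature loop treats F_END as exclusive). A minimal sketch of the 64-bit feature-word check; the helper name `has_ring_reset` is invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants mirror the patch. */
#define VIRTIO_TRANSPORT_F_START 28
#define VIRTIO_TRANSPORT_F_END   41
#define VIRTIO_F_RING_RESET      40

/* Hypothetical helper: does the negotiated feature word carry bit 40? */
static bool has_ring_reset(uint64_t features)
{
	return features & (1ULL << VIRTIO_F_RING_RESET);
}
```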



[PATCH v9 18/32] virtio_ring: introduce virtqueue_resize()

2022-04-05 Thread Xuan Zhuo
Introduce virtqueue_resize() to implement the resize of vring.
Based on these, the driver can dynamically adjust the size of the vring.
For example: ethtool -G.

virtqueue_resize() implements resize based on the vq reset function. In
case of failure to allocate a new vring, it will give up resize and use
the original vring.

During this process, if re-enabling the reset vq fails, the vq can no
longer be used, although the probability of this situation is low.

The parameter recycle is used to recycle the buffer that is no longer
used.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 69 
 include/linux/virtio.h   |  3 ++
 2 files changed, 72 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 06f66b15c86c..6250e19fc5bf 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2554,6 +2554,75 @@ struct virtqueue *vring_create_virtqueue(
 }
 EXPORT_SYMBOL_GPL(vring_create_virtqueue);
 
+/**
+ * virtqueue_resize - resize the vring of vq
+ * @_vq: the struct virtqueue we're talking about.
+ * @num: new ring num
+ * @recycle: callback for recycle the useless buffer
+ *
+ * When it is really necessary to create a new vring, it will set the current vq
+ * into the reset state. Then call the passed callback to recycle the buffer
+ * that is no longer used. Only after the new vring is successfully created, the
+ * old vring will be released.
+ *
+ * Caller must ensure we don't call this with other virtqueue operations
+ * at the same time (except where noted).
+ *
+ * Returns zero or a negative error.
+ */
+int virtqueue_resize(struct virtqueue *_vq, u32 num,
+void (*recycle)(struct virtqueue *vq, void *buf))
+{
+   struct vring_virtqueue *vq = to_vvq(_vq);
+   struct virtio_device *vdev = vq->vq.vdev;
+   bool packed;
+   void *buf;
+   int err;
+
+   if (!vq->we_own_ring)
+   return -EINVAL;
+
+   if (num > vq->vq.num_max)
+   return -E2BIG;
+
+   if (!num)
+   return -EINVAL;
+
+   packed = virtio_has_feature(vdev, VIRTIO_F_RING_PACKED) ? true : false;
+
+   if ((packed ? vq->packed.vring.num : vq->split.vring.num) == num)
+   return 0;
+
+   if (!vdev->config->reset_vq)
+   return -ENOENT;
+
+   if (!vdev->config->enable_reset_vq)
+   return -ENOENT;
+
+   err = vdev->config->reset_vq(_vq);
+   if (err)
+   return err;
+
+   while ((buf = virtqueue_detach_unused_buf(_vq)) != NULL)
+   recycle(_vq, buf);
+
+   if (packed) {
+   err = virtqueue_resize_packed(_vq, num);
+   if (err)
+   virtqueue_reinit_packed(vq);
+   } else {
+   err = virtqueue_resize_split(_vq, num);
+   if (err)
+   virtqueue_reinit_split(vq);
+   }
+
+   if (vdev->config->enable_reset_vq(_vq))
+   return -EBUSY;
+
+   return err;
+}
+EXPORT_SYMBOL_GPL(virtqueue_resize);
+
 /* Only available for split ring */
 struct virtqueue *vring_new_virtqueue(unsigned int index,
  unsigned int num,
diff --git a/include/linux/virtio.h b/include/linux/virtio.h
index d59adc4be068..c86ff02e0ca0 100644
--- a/include/linux/virtio.h
+++ b/include/linux/virtio.h
@@ -91,6 +91,9 @@ dma_addr_t virtqueue_get_desc_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_avail_addr(struct virtqueue *vq);
 dma_addr_t virtqueue_get_used_addr(struct virtqueue *vq);
 
+int virtqueue_resize(struct virtqueue *vq, u32 num,
+void (*recycle)(struct virtqueue *vq, void *buf));
+
 /**
  * virtio_device - representation of a device using virtio
  * @index: unique position on the virtio bus
-- 
2.31.0
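The argument checks virtqueue_resize() performs before touching the ring can be isolated as a pure function. A sketch under one stated assumption: the `1` return value is my own marker for "proceed with reset + reallocation"; the real function goes on to call reset_vq()/resize/enable_reset_vq() instead.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Prechecks mirroring virtqueue_resize(): reject oversized and zero
 * ring sizes, and treat a same-size request as a no-op. */
static int resize_precheck(uint32_t num, uint32_t num_max, uint32_t cur_num)
{
	if (num > num_max)
		return -E2BIG;		/* larger than the device maximum */
	if (!num)
		return -EINVAL;		/* a zero-sized ring is meaningless */
	if (num == cur_num)
		return 0;		/* already that size: nothing to do */
	return 1;			/* illustration-only: go ahead and resize */
}
```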



[PATCH v9 17/32] virtio_ring: packed: introduce virtqueue_resize_packed()

2022-04-05 Thread Xuan Zhuo
virtio ring packed supports resize.

Only after the new vring is successfully allocated based on the new num
do we release the old vring. If an error is returned, the vring still
points to the old vring.

In the case of an error, the caller must re-initialize the virtqueue
(via virtqueue_reinit_packed()) to ensure that the vring can be used.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 9a4f2db718bd..06f66b15c86c 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2059,6 +2059,46 @@ static struct virtqueue *vring_create_virtqueue_packed(
return NULL;
 }
 
+static int virtqueue_resize_packed(struct virtqueue *_vq, u32 num)
+{
+   dma_addr_t ring_dma_addr, driver_event_dma_addr, device_event_dma_addr;
+   struct vring_packed_desc_event *driver, *device;
+   size_t ring_size_in_bytes, event_size_in_bytes;
+   struct vring_virtqueue *vq = to_vvq(_vq);
+   struct virtio_device *vdev = _vq->vdev;
+   struct vring_desc_state_packed *state;
+   struct vring_desc_extra *extra;
+   struct vring_packed_desc *ring;
+   int err;
+
+   if (vring_alloc_queue_packed(vdev, num, &ring, &driver, &device,
+				&ring_dma_addr, &driver_event_dma_addr,
+				&device_event_dma_addr,
+				&ring_size_in_bytes, &event_size_in_bytes))
+   goto err_ring;
+
+
+   err = vring_alloc_state_extra_packed(num, &state, &extra);
+   if (err)
+   goto err_state_extra;
+
+   vring_free(&vq->vq);
+
+   vring_virtqueue_attach_packed(vq, num, ring, driver, device,
+ ring_dma_addr, driver_event_dma_addr,
+ device_event_dma_addr, ring_size_in_bytes,
+ event_size_in_bytes, state, extra);
+   vring_virtqueue_init_packed(vq, vdev);
+   return 0;
+
+err_state_extra:
+   vring_free_queue(vdev, event_size_in_bytes, device, device_event_dma_addr);
+   vring_free_queue(vdev, event_size_in_bytes, driver, driver_event_dma_addr);
+   vring_free_queue(vdev, ring_size_in_bytes, ring, ring_dma_addr);
+err_ring:
+   return -ENOMEM;
+}
+
 
 /*
  * Generic functions and exported symbols.
-- 
2.31.0



[PATCH v9 15/32] virtio_ring: packed: extract the logic of vq init

2022-04-05 Thread Xuan Zhuo
Separate the logic of initializing vq, and subsequent patches will call
it separately.

The characteristic of this part of the logic is that it does not depend
on the information passed by the upper layer, and can be called
repeatedly.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 71 
 1 file changed, 39 insertions(+), 32 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 80d446fa8d16..c783eb272468 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1933,6 +1933,44 @@ static void vring_virtqueue_attach_packed(struct vring_virtqueue *vq,
vq->packed.desc_extra = extra;
 }
 
+static void vring_virtqueue_init_packed(struct vring_virtqueue *vq,
+   struct virtio_device *vdev)
+{
+   vq->vq.num_free = vq->packed.vring.num;
+   vq->we_own_ring = true;
+   vq->broken = false;
+   vq->last_used_idx = 0;
+   vq->event_triggered = false;
+   vq->num_added = 0;
+   vq->packed_ring = true;
+   vq->use_dma_api = vring_use_dma_api(vdev);
+#ifdef DEBUG
+   vq->in_use = false;
+   vq->last_add_time_valid = false;
+#endif
+
+   vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
+
+   if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
+   vq->weak_barriers = false;
+
+   vq->packed.next_avail_idx = 0;
+   vq->packed.avail_wrap_counter = 1;
+   vq->packed.used_wrap_counter = 1;
+   vq->packed.event_flags_shadow = 0;
+   vq->packed.avail_used_flags = 1 << VRING_PACKED_DESC_F_AVAIL;
+
+   /* Put everything in free lists. */
+   vq->free_head = 0;
+
+   /* No callback?  Tell other side not to bother us. */
+   if (!vq->vq.callback) {
+   vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
+   vq->packed.vring.driver->flags =
+   cpu_to_le16(vq->packed.event_flags_shadow);
+   }
+}
+
 static struct virtqueue *vring_create_virtqueue_packed(
unsigned int index,
unsigned int num,
@@ -1968,34 +2006,12 @@ static struct virtqueue *vring_create_virtqueue_packed(
vq->vq.callback = callback;
vq->vq.vdev = vdev;
vq->vq.name = name;
-   vq->vq.num_free = num;
vq->vq.index = index;
-   vq->we_own_ring = true;
vq->notify = notify;
vq->weak_barriers = weak_barriers;
-   vq->broken = false;
-   vq->last_used_idx = 0;
-   vq->event_triggered = false;
-   vq->num_added = 0;
-   vq->packed_ring = true;
-   vq->use_dma_api = vring_use_dma_api(vdev);
-#ifdef DEBUG
-   vq->in_use = false;
-   vq->last_add_time_valid = false;
-#endif
 
vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC) &&
!context;
-   vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
-
-   if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
-   vq->weak_barriers = false;
-
-   vq->packed.next_avail_idx = 0;
-   vq->packed.avail_wrap_counter = 1;
-   vq->packed.used_wrap_counter = 1;
-   vq->packed.event_flags_shadow = 0;
-   vq->packed.avail_used_flags = 1 << VRING_PACKED_DESC_F_AVAIL;
 
	err = vring_alloc_state_extra_packed(num, &state, &extra);
if (err)
@@ -2005,16 +2021,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
  ring_dma_addr, driver_event_dma_addr,
  device_event_dma_addr, ring_size_in_bytes,
  event_size_in_bytes, state, extra);
-
-   /* Put everything in free lists. */
-   vq->free_head = 0;
-
-   /* No callback?  Tell other side not to bother us. */
-   if (!callback) {
-   vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
-   vq->packed.vring.driver->flags =
-   cpu_to_le16(vq->packed.event_flags_shadow);
-   }
+   vring_virtqueue_init_packed(vq, vdev);
 
spin_lock(>vqs_list_lock);
list_add_tail(>vq.list, >vqs);
-- 
2.31.0
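One of the fields this init routine seeds, avail_used_flags = 1 << VRING_PACKED_DESC_F_AVAIL, drives the packed ring's wrap-counter convention: a descriptor is available when its AVAIL bit differs from its USED bit, and the driver flips both bits each time next_avail_idx wraps. A sketch with the flag constants taken from the virtio 1.1 packed-ring layout; the helper names are invented.

```c
#include <assert.h>
#include <stdint.h>

/* Flag bit positions per the virtio 1.1 packed-ring spec. */
#define VRING_PACKED_DESC_F_AVAIL 7
#define VRING_PACKED_DESC_F_USED  15

/* What vring_virtqueue_init_packed() seeds into avail_used_flags. */
static uint16_t initial_avail_used_flags(void)
{
	return 1 << VRING_PACKED_DESC_F_AVAIL;
}

/* On each ring wrap the driver toggles both bits at once. */
static uint16_t wrap_avail_used_flags(uint16_t flags)
{
	return flags ^ ((1 << VRING_PACKED_DESC_F_AVAIL) |
			(1 << VRING_PACKED_DESC_F_USED));
}
```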



[PATCH v9 16/32] virtio_ring: packed: introduce virtqueue_reinit_packed()

2022-04-05 Thread Xuan Zhuo
Introduce a function to initialize vq without allocating new ring,
desc_state, desc_extra.

Subsequent patches will call this function after reset vq to
reinitialize vq.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index c783eb272468..9a4f2db718bd 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1971,6 +1971,27 @@ static void vring_virtqueue_init_packed(struct vring_virtqueue *vq,
}
 }
 
+static void virtqueue_reinit_packed(struct vring_virtqueue *vq)
+{
+   struct virtio_device *vdev = vq->vq.vdev;
+   int size, i;
+
+   memset(vq->packed.vring.device, 0, vq->packed.event_size_in_bytes);
+   memset(vq->packed.vring.driver, 0, vq->packed.event_size_in_bytes);
+   memset(vq->packed.vring.desc, 0, vq->packed.ring_size_in_bytes);
+
+   size = sizeof(struct vring_desc_state_packed) * vq->packed.vring.num;
+   memset(vq->packed.desc_state, 0, size);
+
+   size = sizeof(struct vring_desc_extra) * vq->packed.vring.num;
+   memset(vq->packed.desc_extra, 0, size);
+
+   for (i = 0; i < vq->packed.vring.num - 1; i++)
+   vq->packed.desc_extra[i].next = i + 1;
+
+   vring_virtqueue_init_packed(vq, vdev);
+}
+
 static struct virtqueue *vring_create_virtqueue_packed(
unsigned int index,
unsigned int num,
-- 
2.31.0
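The core of the reinit above is zeroing desc_extra and rebuilding the free chain 0 -> 1 -> ... -> num-1. A standalone sketch of just that step; the struct here is trimmed to the `next` field (the real vring_desc_extra also tracks addr/len/flags for DMA unmapping), and the last entry's `next` is deliberately left at 0 because the chain is bounded by num_free, not by a terminator.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Trimmed stand-in for struct vring_desc_extra. */
struct desc_extra {
	uint16_t next;
};

/* Mirror of the reinit step: zero the array, relink the free list. */
static void relink_free_list(struct desc_extra *extra, uint32_t num)
{
	uint32_t i;

	memset(extra, 0, sizeof(*extra) * num);
	for (i = 0; i < num - 1; i++)
		extra[i].next = i + 1;
}
```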



[PATCH v9 13/32] virtio_ring: packed: extract the logic of alloc state and extra

2022-04-05 Thread Xuan Zhuo
Separate the logic for alloc desc_state and desc_extra, which will
be called separately by subsequent patches.

Use struct vring_packed to pass desc_state, desc_extra.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 51 ++--
 1 file changed, 37 insertions(+), 14 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index ea451ae2aaef..5b5976c5742e 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1876,6 +1876,34 @@ static int vring_alloc_queue_packed(struct virtio_device 
*vdev,
return -ENOMEM;
 }
 
+static int vring_alloc_state_extra_packed(u32 num,
+					  struct vring_desc_state_packed **desc_state,
+ struct vring_desc_extra **desc_extra)
+{
+   struct vring_desc_state_packed *state;
+   struct vring_desc_extra *extra;
+
+   state = kmalloc_array(num, sizeof(struct vring_desc_state_packed), GFP_KERNEL);
+   if (!state)
+   goto err_desc_state;
+
+   memset(state, 0, num * sizeof(struct vring_desc_state_packed));
+
+   extra = vring_alloc_desc_extra(num);
+   if (!extra)
+   goto err_desc_extra;
+
+   *desc_state = state;
+   *desc_extra = extra;
+
+   return 0;
+
+err_desc_extra:
+   kfree(state);
+err_desc_state:
+   return -ENOMEM;
+}
+
 static struct virtqueue *vring_create_virtqueue_packed(
unsigned int index,
unsigned int num,
@@ -1891,8 +1919,11 @@ static struct virtqueue *vring_create_virtqueue_packed(
dma_addr_t ring_dma_addr, driver_event_dma_addr, device_event_dma_addr;
struct vring_packed_desc_event *driver, *device;
size_t ring_size_in_bytes, event_size_in_bytes;
+   struct vring_desc_state_packed *state;
+   struct vring_desc_extra *extra;
struct vring_packed_desc *ring;
struct vring_virtqueue *vq;
+   int err;
 
	if (vring_alloc_queue_packed(vdev, num, &ring, &driver, &device,
				     &ring_dma_addr, &driver_event_dma_addr,
@@ -1949,22 +1980,16 @@ static struct virtqueue *vring_create_virtqueue_packed(
vq->packed.event_flags_shadow = 0;
vq->packed.avail_used_flags = 1 << VRING_PACKED_DESC_F_AVAIL;
 
-   vq->packed.desc_state = kmalloc_array(num,
-   sizeof(struct vring_desc_state_packed),
-   GFP_KERNEL);
-   if (!vq->packed.desc_state)
-   goto err_desc_state;
+   err = vring_alloc_state_extra_packed(num, &state, &extra);
+   if (err)
+   goto err_state_extra;
 
-   memset(vq->packed.desc_state, 0,
-   num * sizeof(struct vring_desc_state_packed));
+   vq->packed.desc_state = state;
+   vq->packed.desc_extra = extra;
 
/* Put everything in free lists. */
vq->free_head = 0;
 
-   vq->packed.desc_extra = vring_alloc_desc_extra(num);
-   if (!vq->packed.desc_extra)
-   goto err_desc_extra;
-
/* No callback?  Tell other side not to bother us. */
if (!callback) {
vq->packed.event_flags_shadow = VRING_PACKED_EVENT_FLAG_DISABLE;
@@ -1977,9 +2002,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
	spin_unlock(&vdev->vqs_list_lock);
	return &vq->vq;
 
-err_desc_extra:
-   kfree(vq->packed.desc_state);
-err_desc_state:
+err_state_extra:
kfree(vq);
 err_vq:
	vring_free_queue(vdev, event_size_in_bytes, device, device_event_dma_addr);
-- 
2.31.0
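The kmalloc_array()+memset pairing above exists because an unchecked num * size multiplication can overflow. A userspace sketch of the same pattern; calloc() covers both the overflow guard and the zeroing in one call, but the explicit check is kept here to show what kmalloc_array() guards against. The helper name is invented.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Overflow-checked, zeroed array allocation: the userspace analogue of
 * kmalloc_array() followed by memset(). */
static void *alloc_zeroed_array(size_t num, size_t size)
{
	if (size && num > SIZE_MAX / size)
		return NULL;	/* num * size would overflow */
	return calloc(num, size);
}
```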



[PATCH v9 14/32] virtio_ring: packed: extract the logic of attach vring

2022-04-05 Thread Xuan Zhuo
Separate out the logic of attaching the vring; the subsequent patch will
call it separately.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 47 +---
 1 file changed, 33 insertions(+), 14 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 5b5976c5742e..80d446fa8d16 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1904,6 +1904,35 @@ static int vring_alloc_state_extra_packed(u32 num,
return -ENOMEM;
 }
 
+static void vring_virtqueue_attach_packed(struct vring_virtqueue *vq,
+ u32 num,
+ struct vring_packed_desc *ring,
+					  struct vring_packed_desc_event *driver,
+					  struct vring_packed_desc_event *device,
+ dma_addr_t ring_dma_addr,
+ dma_addr_t driver_event_dma_addr,
+ dma_addr_t device_event_dma_addr,
+ size_t ring_size_in_bytes,
+ size_t event_size_in_bytes,
+ struct vring_desc_state_packed *state,
+ struct vring_desc_extra *extra)
+{
+   vq->packed.ring_dma_addr = ring_dma_addr;
+   vq->packed.driver_event_dma_addr = driver_event_dma_addr;
+   vq->packed.device_event_dma_addr = device_event_dma_addr;
+
+   vq->packed.ring_size_in_bytes = ring_size_in_bytes;
+   vq->packed.event_size_in_bytes = event_size_in_bytes;
+
+   vq->packed.vring.num = num;
+   vq->packed.vring.desc = ring;
+   vq->packed.vring.driver = driver;
+   vq->packed.vring.device = device;
+
+   vq->packed.desc_state = state;
+   vq->packed.desc_extra = extra;
+}
+
 static struct virtqueue *vring_create_virtqueue_packed(
unsigned int index,
unsigned int num,
@@ -1962,18 +1991,6 @@ static struct virtqueue *vring_create_virtqueue_packed(
if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
vq->weak_barriers = false;
 
-   vq->packed.ring_dma_addr = ring_dma_addr;
-   vq->packed.driver_event_dma_addr = driver_event_dma_addr;
-   vq->packed.device_event_dma_addr = device_event_dma_addr;
-
-   vq->packed.ring_size_in_bytes = ring_size_in_bytes;
-   vq->packed.event_size_in_bytes = event_size_in_bytes;
-
-   vq->packed.vring.num = num;
-   vq->packed.vring.desc = ring;
-   vq->packed.vring.driver = driver;
-   vq->packed.vring.device = device;
-
vq->packed.next_avail_idx = 0;
vq->packed.avail_wrap_counter = 1;
vq->packed.used_wrap_counter = 1;
@@ -1984,8 +2001,10 @@ static struct virtqueue *vring_create_virtqueue_packed(
if (err)
goto err_state_extra;
 
-   vq->packed.desc_state = state;
-   vq->packed.desc_extra = extra;
+   vring_virtqueue_attach_packed(vq, num, ring, driver, device,
+ ring_dma_addr, driver_event_dma_addr,
+ device_event_dma_addr, ring_size_in_bytes,
+ event_size_in_bytes, state, extra);
 
/* Put everything in free lists. */
vq->free_head = 0;
-- 
2.31.0



[PATCH v9 12/32] virtio_ring: packed: extract the logic of alloc queue

2022-04-05 Thread Xuan Zhuo
Separate out the logic for creating the packed vring queue.

For the convenience of passing parameters, add a structure
vring_packed.

This is required for the subsequent virtqueue vring reset.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 70 
 1 file changed, 56 insertions(+), 14 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 33864134a744..ea451ae2aaef 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1817,19 +1817,17 @@ static struct vring_desc_extra *vring_alloc_desc_extra(unsigned int num)
return desc_extra;
 }
 
-static struct virtqueue *vring_create_virtqueue_packed(
-   unsigned int index,
-   unsigned int num,
-   unsigned int vring_align,
-   struct virtio_device *vdev,
-   bool weak_barriers,
-   bool may_reduce_num,
-   bool context,
-   bool (*notify)(struct virtqueue *),
-   void (*callback)(struct virtqueue *),
-   const char *name)
+static int vring_alloc_queue_packed(struct virtio_device *vdev,
+   u32 num,
+   struct vring_packed_desc **_ring,
+   struct vring_packed_desc_event **_driver,
+   struct vring_packed_desc_event **_device,
+   dma_addr_t *_ring_dma_addr,
+   dma_addr_t *_driver_event_dma_addr,
+   dma_addr_t *_device_event_dma_addr,
+   size_t *_ring_size_in_bytes,
+   size_t *_event_size_in_bytes)
 {
-   struct vring_virtqueue *vq;
struct vring_packed_desc *ring;
struct vring_packed_desc_event *driver, *device;
dma_addr_t ring_dma_addr, driver_event_dma_addr, device_event_dma_addr;
@@ -1857,6 +1855,52 @@ static struct virtqueue *vring_create_virtqueue_packed(
if (!device)
goto err_device;
 
+   *_ring   = ring;
+   *_driver = driver;
+   *_device = device;
+   *_ring_dma_addr  = ring_dma_addr;
+   *_driver_event_dma_addr  = driver_event_dma_addr;
+   *_device_event_dma_addr  = device_event_dma_addr;
+   *_ring_size_in_bytes = ring_size_in_bytes;
+   *_event_size_in_bytes= event_size_in_bytes;
+
+   return 0;
+
+err_device:
+   vring_free_queue(vdev, event_size_in_bytes, driver, driver_event_dma_addr);
+
+err_driver:
+   vring_free_queue(vdev, ring_size_in_bytes, ring, ring_dma_addr);
+
+err_ring:
+   return -ENOMEM;
+}
+
+static struct virtqueue *vring_create_virtqueue_packed(
+   unsigned int index,
+   unsigned int num,
+   unsigned int vring_align,
+   struct virtio_device *vdev,
+   bool weak_barriers,
+   bool may_reduce_num,
+   bool context,
+   bool (*notify)(struct virtqueue *),
+   void (*callback)(struct virtqueue *),
+   const char *name)
+{
+   dma_addr_t ring_dma_addr, driver_event_dma_addr, device_event_dma_addr;
+   struct vring_packed_desc_event *driver, *device;
+   size_t ring_size_in_bytes, event_size_in_bytes;
+   struct vring_packed_desc *ring;
+   struct vring_virtqueue *vq;
+
+   if (vring_alloc_queue_packed(vdev, num, &ring, &driver, &device,
+				&ring_dma_addr, &driver_event_dma_addr,
+				&device_event_dma_addr,
+				&ring_size_in_bytes,
+				&event_size_in_bytes))
+   goto err_ring;
+
vq = kmalloc(sizeof(*vq), GFP_KERNEL);
if (!vq)
goto err_vq;
@@ -1939,9 +1983,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
kfree(vq);
 err_vq:
	vring_free_queue(vdev, event_size_in_bytes, device, device_event_dma_addr);
-err_device:
	vring_free_queue(vdev, event_size_in_bytes, driver, driver_event_dma_addr);
-err_driver:
vring_free_queue(vdev, ring_size_in_bytes, ring, ring_dma_addr);
 err_ring:
return NULL;
-- 
2.31.0



[PATCH v9 11/32] virtio_ring: split: introduce virtqueue_resize_split()

2022-04-05 Thread Xuan Zhuo
virtio ring split supports resize.

Only after the new vring is successfully allocated based on the new num
do we release the old vring. If an error is returned, the vring still
points to the old vring.

In the case of an error, the caller must re-initialize the virtqueue
(via virtqueue_reinit_split()) to ensure that the vring can be used.

In addition, vring_align, may_reduce_num are necessary for reallocating
vring, so they are retained for creating vq.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 47 
 1 file changed, 47 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 3dc6ace2ba7a..33864134a744 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -139,6 +139,12 @@ struct vring_virtqueue {
/* DMA address and size information */
dma_addr_t queue_dma_addr;
size_t queue_size_in_bytes;
+
+   /* The parameters for creating vrings are reserved for
+* creating new vring.
+*/
+   u32 vring_align;
+   bool may_reduce_num;
} split;
 
/* Available for packed ring */
@@ -199,6 +205,7 @@ struct vring_virtqueue {
 };
 
 static struct vring_desc_extra *vring_alloc_desc_extra(unsigned int num);
+static void vring_free(struct virtqueue *_vq);
 
 /*
  * Helpers.
@@ -1088,6 +1095,8 @@ static struct virtqueue *vring_create_virtqueue_split(
return NULL;
}
 
+   to_vvq(vq)->split.vring_align = vring_align;
+   to_vvq(vq)->split.may_reduce_num = may_reduce_num;
to_vvq(vq)->split.queue_dma_addr = dma_addr;
to_vvq(vq)->split.queue_size_in_bytes = queue_size_in_bytes;
to_vvq(vq)->we_own_ring = true;
@@ -1095,6 +1104,44 @@ static struct virtqueue *vring_create_virtqueue_split(
return vq;
 }
 
+static int virtqueue_resize_split(struct virtqueue *_vq, u32 num)
+{
+   struct vring_virtqueue *vq = to_vvq(_vq);
+   struct virtio_device *vdev = _vq->vdev;
+   struct vring_desc_state_split *state;
+   struct vring_desc_extra *extra;
+   size_t queue_size_in_bytes;
+   dma_addr_t dma_addr;
+   struct vring vring;
+   int err = -ENOMEM;
+   void *queue;
+
+   queue = vring_alloc_queue_split(vdev, &dma_addr, &num,
+   vq->split.vring_align,
+   vq->weak_barriers,
+   vq->split.may_reduce_num);
+   if (!queue)
+   return -ENOMEM;
+
+   queue_size_in_bytes = vring_size(num, vq->split.vring_align);
+
+   err = vring_alloc_state_extra_split(num, &state, &extra);
+   if (err) {
+   vring_free_queue(vdev, queue_size_in_bytes, queue, dma_addr);
+   return -ENOMEM;
+   }
+
+   vring_free(&vq->vq);
+
+   vring_init(&vring, num, queue, vq->split.vring_align);
+   vring_virtqueue_attach_split(vq, vring, state, extra);
+   vq->split.queue_dma_addr = dma_addr;
+   vq->split.queue_size_in_bytes = queue_size_in_bytes;
+
+   vring_virtqueue_init_split(vq, vdev, true);
+   return 0;
+}
+
 
 /*
  * Packed ring specific functions - *_packed().
-- 
2.31.0

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


[PATCH v9 09/32] virtio_ring: split: extract the logic of vq init

2022-04-05 Thread Xuan Zhuo
Separate out the logic for initializing a vq; subsequent patches will
call it independently.

The key property of this helper is that it does not depend on
information passed by the upper layer, so it can be called repeatedly.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 68 
 1 file changed, 38 insertions(+), 30 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 083f2992ba0d..874f878087a3 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -916,6 +916,43 @@ static void *virtqueue_detach_unused_buf_split(struct 
virtqueue *_vq)
return NULL;
 }
 
+static void vring_virtqueue_init_split(struct vring_virtqueue *vq,
+  struct virtio_device *vdev,
+  bool own_ring)
+{
+   vq->packed_ring = false;
+   vq->vq.num_free = vq->split.vring.num;
+   vq->we_own_ring = own_ring;
+   vq->broken = false;
+   vq->last_used_idx = 0;
+   vq->event_triggered = false;
+   vq->num_added = 0;
+   vq->use_dma_api = vring_use_dma_api(vdev);
+#ifdef DEBUG
+   vq->in_use = false;
+   vq->last_add_time_valid = false;
+#endif
+
+   vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
+
+   if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
+   vq->weak_barriers = false;
+
+   vq->split.avail_flags_shadow = 0;
+   vq->split.avail_idx_shadow = 0;
+
+   /* No callback?  Tell other side not to bother us. */
+   if (!vq->vq.callback) {
+   vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
+   if (!vq->event)
+   vq->split.vring.avail->flags = cpu_to_virtio16(vdev,
+   vq->split.avail_flags_shadow);
+   }
+
+   /* Put everything in free lists. */
+   vq->free_head = 0;
+}
+
 static void vring_virtqueue_attach_split(struct vring_virtqueue *vq,
 struct vring vring,
 struct vring_desc_state_split 
*desc_state,
@@ -2249,42 +2286,15 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
index,
if (!vq)
return NULL;
 
-   vq->packed_ring = false;
vq->vq.callback = callback;
vq->vq.vdev = vdev;
vq->vq.name = name;
-   vq->vq.num_free = vring.num;
vq->vq.index = index;
-   vq->we_own_ring = false;
vq->notify = notify;
vq->weak_barriers = weak_barriers;
-   vq->broken = false;
-   vq->last_used_idx = 0;
-   vq->event_triggered = false;
-   vq->num_added = 0;
-   vq->use_dma_api = vring_use_dma_api(vdev);
-#ifdef DEBUG
-   vq->in_use = false;
-   vq->last_add_time_valid = false;
-#endif
 
vq->indirect = virtio_has_feature(vdev, VIRTIO_RING_F_INDIRECT_DESC) &&
!context;
-   vq->event = virtio_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX);
-
-   if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
-   vq->weak_barriers = false;
-
-   vq->split.avail_flags_shadow = 0;
-   vq->split.avail_idx_shadow = 0;
-
-   /* No callback?  Tell other side not to bother us. */
-   if (!callback) {
-   vq->split.avail_flags_shadow |= VRING_AVAIL_F_NO_INTERRUPT;
-   if (!vq->event)
-   vq->split.vring.avail->flags = cpu_to_virtio16(vdev,
-   vq->split.avail_flags_shadow);
-   }
 
	err = vring_alloc_state_extra_split(vring.num, &state, &extra);
if (err) {
@@ -2293,9 +2303,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
index,
}
 
vring_virtqueue_attach_split(vq, vring, state, extra);
-
-   /* Put everything in free lists. */
-   vq->free_head = 0;
+   vring_virtqueue_init_split(vq, vdev, false);
 
	spin_lock(&vdev->vqs_list_lock);
	list_add_tail(&vq->vq.list, &vdev->vqs);
	spin_unlock(&vdev->vqs_list_lock);
2.31.0



[PATCH v9 10/32] virtio_ring: split: introduce virtqueue_reinit_split()

2022-04-05 Thread Xuan Zhuo
Introduce a function that reinitializes a vq without allocating a new
ring, desc_state or desc_extra.

Subsequent patches will call this function after resetting a vq to
reinitialize it.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 874f878087a3..3dc6ace2ba7a 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -953,6 +953,25 @@ static void vring_virtqueue_init_split(struct 
vring_virtqueue *vq,
vq->free_head = 0;
 }
 
+static void virtqueue_reinit_split(struct vring_virtqueue *vq)
+{
+   struct virtio_device *vdev = vq->vq.vdev;
+   int size, i;
+
+   memset(vq->split.vring.desc, 0, vq->split.queue_size_in_bytes);
+
+   size = sizeof(struct vring_desc_state_split) * vq->split.vring.num;
+   memset(vq->split.desc_state, 0, size);
+
+   size = sizeof(struct vring_desc_extra) * vq->split.vring.num;
+   memset(vq->split.desc_extra, 0, size);
+
+   for (i = 0; i < vq->split.vring.num - 1; i++)
+   vq->split.desc_extra[i].next = i + 1;
+
+   vring_virtqueue_init_split(vq, vdev, true);
+}
+
 static void vring_virtqueue_attach_split(struct vring_virtqueue *vq,
 struct vring vring,
 struct vring_desc_state_split 
*desc_state,
-- 
2.31.0



[PATCH v9 07/32] virtio_ring: split: extract the logic of alloc state and extra

2022-04-05 Thread Xuan Zhuo
Separate out the logic for creating desc_state and desc_extra;
subsequent patches will call it independently.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 53 ++--
 1 file changed, 38 insertions(+), 15 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 72d5ae063fa0..6de67439cb57 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -198,6 +198,7 @@ struct vring_virtqueue {
 #endif
 };
 
+static struct vring_desc_extra *vring_alloc_desc_extra(unsigned int num);
 
 /*
  * Helpers.
@@ -915,6 +916,33 @@ static void *virtqueue_detach_unused_buf_split(struct 
virtqueue *_vq)
return NULL;
 }
 
+static int vring_alloc_state_extra_split(u32 num,
+struct vring_desc_state_split 
**desc_state,
+struct vring_desc_extra **desc_extra)
+{
+   struct vring_desc_state_split *state;
+   struct vring_desc_extra *extra;
+
+   state = kmalloc_array(num, sizeof(struct vring_desc_state_split), 
GFP_KERNEL);
+   if (!state)
+   goto err_state;
+
+   extra = vring_alloc_desc_extra(num);
+   if (!extra)
+   goto err_extra;
+
+   memset(state, 0, num * sizeof(struct vring_desc_state_split));
+
+   *desc_state = state;
+   *desc_extra = extra;
+   return 0;
+
+err_extra:
+   kfree(state);
+err_state:
+   return -ENOMEM;
+}
+
 static void *vring_alloc_queue_split(struct virtio_device *vdev,
 dma_addr_t *dma_addr,
 u32 *n,
@@ -2196,7 +2224,10 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
index,
void (*callback)(struct virtqueue *),
const char *name)
 {
+   struct vring_desc_state_split *state;
+   struct vring_desc_extra *extra;
struct vring_virtqueue *vq;
+   int err;
 
if (virtio_has_feature(vdev, VIRTIO_F_RING_PACKED))
return NULL;
@@ -2246,30 +2277,22 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
index,
vq->split.avail_flags_shadow);
}
 
-   vq->split.desc_state = kmalloc_array(vring.num,
-   sizeof(struct vring_desc_state_split), GFP_KERNEL);
-   if (!vq->split.desc_state)
-   goto err_state;
+   err = vring_alloc_state_extra_split(vring.num, &state, &extra);
+   if (err) {
+   kfree(vq);
+   return NULL;
+   }
 
-   vq->split.desc_extra = vring_alloc_desc_extra(vring.num);
-   if (!vq->split.desc_extra)
-   goto err_extra;
+   vq->split.desc_state = state;
+   vq->split.desc_extra = extra;
 
/* Put everything in free lists. */
vq->free_head = 0;
-   memset(vq->split.desc_state, 0, vring.num *
-   sizeof(struct vring_desc_state_split));
 
	spin_lock(&vdev->vqs_list_lock);
	list_add_tail(&vq->vq.list, &vdev->vqs);
	spin_unlock(&vdev->vqs_list_lock);
	return &vq->vq;
-
-err_extra:
-   kfree(vq->split.desc_state);
-err_state:
-   kfree(vq);
-   return NULL;
 }
 EXPORT_SYMBOL_GPL(__vring_new_virtqueue);
 
-- 
2.31.0



[PATCH v9 08/32] virtio_ring: split: extract the logic of attach vring

2022-04-05 Thread Xuan Zhuo
Separate out the logic for attaching a vring; subsequent patches will
call it separately.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 20 ++--
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 6de67439cb57..083f2992ba0d 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -916,6 +916,19 @@ static void *virtqueue_detach_unused_buf_split(struct 
virtqueue *_vq)
return NULL;
 }
 
+static void vring_virtqueue_attach_split(struct vring_virtqueue *vq,
+struct vring vring,
+struct vring_desc_state_split 
*desc_state,
+struct vring_desc_extra *desc_extra)
+{
+   vq->split.vring = vring;
+   vq->split.queue_dma_addr = 0;
+   vq->split.queue_size_in_bytes = 0;
+
+   vq->split.desc_state = desc_state;
+   vq->split.desc_extra = desc_extra;
+}
+
 static int vring_alloc_state_extra_split(u32 num,
 struct vring_desc_state_split 
**desc_state,
 struct vring_desc_extra **desc_extra)
@@ -2262,10 +2275,6 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
index,
if (virtio_has_feature(vdev, VIRTIO_F_ORDER_PLATFORM))
vq->weak_barriers = false;
 
-   vq->split.queue_dma_addr = 0;
-   vq->split.queue_size_in_bytes = 0;
-
-   vq->split.vring = vring;
vq->split.avail_flags_shadow = 0;
vq->split.avail_idx_shadow = 0;
 
@@ -2283,8 +2292,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
index,
return NULL;
}
 
-   vq->split.desc_state = state;
-   vq->split.desc_extra = extra;
+   vring_virtqueue_attach_split(vq, vring, state, extra);
 
/* Put everything in free lists. */
vq->free_head = 0;
-- 
2.31.0



[PATCH v9 06/32] virtio_ring: split: extract the logic of alloc queue

2022-04-05 Thread Xuan Zhuo
Separate out the split ring's logic for creating the vring queue.

This is required by the subsequent virtqueue vring reset support.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 53 
 1 file changed, 36 insertions(+), 17 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 33fddfb907a6..72d5ae063fa0 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -915,23 +915,15 @@ static void *virtqueue_detach_unused_buf_split(struct 
virtqueue *_vq)
return NULL;
 }
 
-static struct virtqueue *vring_create_virtqueue_split(
-   unsigned int index,
-   unsigned int num,
-   unsigned int vring_align,
-   struct virtio_device *vdev,
-   bool weak_barriers,
-   bool may_reduce_num,
-   bool context,
-   bool (*notify)(struct virtqueue *),
-   void (*callback)(struct virtqueue *),
-   const char *name)
+static void *vring_alloc_queue_split(struct virtio_device *vdev,
+dma_addr_t *dma_addr,
+u32 *n,
+unsigned int vring_align,
+bool weak_barriers,
+bool may_reduce_num)
 {
-   struct virtqueue *vq;
void *queue = NULL;
-   dma_addr_t dma_addr;
-   size_t queue_size_in_bytes;
-   struct vring vring;
+   u32 num = *n;
 
/* We assume num is a power of 2. */
if (num & (num - 1)) {
@@ -942,7 +934,7 @@ static struct virtqueue *vring_create_virtqueue_split(
/* TODO: allocate each queue chunk individually */
for (; num && vring_size(num, vring_align) > PAGE_SIZE; num /= 2) {
queue = vring_alloc_queue(vdev, vring_size(num, vring_align),
- &dma_addr,
+ dma_addr,
  GFP_KERNEL|__GFP_NOWARN|__GFP_ZERO);
if (queue)
break;
@@ -956,11 +948,38 @@ static struct virtqueue *vring_create_virtqueue_split(
if (!queue) {
/* Try to get a single page. You are my only hope! */
queue = vring_alloc_queue(vdev, vring_size(num, vring_align),
- &dma_addr, GFP_KERNEL|__GFP_ZERO);
+ dma_addr, GFP_KERNEL|__GFP_ZERO);
}
if (!queue)
return NULL;
 
+   *n = num;
+   return queue;
+}
+
+static struct virtqueue *vring_create_virtqueue_split(
+   unsigned int index,
+   unsigned int num,
+   unsigned int vring_align,
+   struct virtio_device *vdev,
+   bool weak_barriers,
+   bool may_reduce_num,
+   bool context,
+   bool (*notify)(struct virtqueue *),
+   void (*callback)(struct virtqueue *),
+   const char *name)
+{
+   size_t queue_size_in_bytes;
+   struct virtqueue *vq;
+   dma_addr_t dma_addr;
+   struct vring vring;
+   void *queue;
+
+   queue = vring_alloc_queue_split(vdev, &dma_addr, &num, vring_align,
+   weak_barriers, may_reduce_num);
+   if (!queue)
+   return NULL;
+
queue_size_in_bytes = vring_size(num, vring_align);
	vring_init(&vring, num, queue, vring_align);
 
-- 
2.31.0



[PATCH v9 05/32] virtio_ring: extract the logic of freeing vring

2022-04-05 Thread Xuan Zhuo
Introduce vring_free() to free the vring of a vq.

Subsequent patches will use vring_free() on its own.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 18 +-
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index cb6010750a94..33fddfb907a6 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2301,14 +2301,10 @@ struct virtqueue *vring_new_virtqueue(unsigned int 
index,
 }
 EXPORT_SYMBOL_GPL(vring_new_virtqueue);
 
-void vring_del_virtqueue(struct virtqueue *_vq)
+static void vring_free(struct virtqueue *_vq)
 {
struct vring_virtqueue *vq = to_vvq(_vq);
 
-   spin_lock(&vq->vq.vdev->vqs_list_lock);
-   list_del(&_vq->list);
-   spin_unlock(&vq->vq.vdev->vqs_list_lock);
-
if (vq->we_own_ring) {
if (vq->packed_ring) {
vring_free_queue(vq->vq.vdev,
@@ -2339,6 +2335,18 @@ void vring_del_virtqueue(struct virtqueue *_vq)
kfree(vq->split.desc_state);
kfree(vq->split.desc_extra);
}
+}
+
+void vring_del_virtqueue(struct virtqueue *_vq)
+{
+   struct vring_virtqueue *vq = to_vvq(_vq);
+
+   spin_lock(&vq->vq.vdev->vqs_list_lock);
+   list_del(&_vq->list);
+   spin_unlock(&vq->vq.vdev->vqs_list_lock);
+
+   vring_free(_vq);
+
kfree(vq);
 }
 EXPORT_SYMBOL_GPL(vring_del_virtqueue);
-- 
2.31.0



[PATCH v9 04/32] virtio_ring: remove the arg vq of vring_alloc_desc_extra()

2022-04-05 Thread Xuan Zhuo
The vq parameter of vring_alloc_desc_extra() is unused, so this patch
removes it.

This lets subsequent patches call the function without passing a
useless argument.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index f1807f6b06a5..cb6010750a94 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -1636,8 +1636,7 @@ static void *virtqueue_detach_unused_buf_packed(struct 
virtqueue *_vq)
return NULL;
 }
 
-static struct vring_desc_extra *vring_alloc_desc_extra(struct vring_virtqueue 
*vq,
-  unsigned int num)
+static struct vring_desc_extra *vring_alloc_desc_extra(unsigned int num)
 {
struct vring_desc_extra *desc_extra;
unsigned int i;
@@ -1755,7 +1754,7 @@ static struct virtqueue *vring_create_virtqueue_packed(
/* Put everything in free lists. */
vq->free_head = 0;
 
-   vq->packed.desc_extra = vring_alloc_desc_extra(vq, num);
+   vq->packed.desc_extra = vring_alloc_desc_extra(num);
if (!vq->packed.desc_extra)
goto err_desc_extra;
 
@@ -2233,7 +2232,7 @@ struct virtqueue *__vring_new_virtqueue(unsigned int 
index,
if (!vq->split.desc_state)
goto err_state;
 
-   vq->split.desc_extra = vring_alloc_desc_extra(vq, vring.num);
+   vq->split.desc_extra = vring_alloc_desc_extra(vring.num);
if (!vq->split.desc_extra)
goto err_extra;
 
-- 
2.31.0



[PATCH v9 03/32] virtio_ring: update the document of the virtqueue_detach_unused_buf for queue reset

2022-04-05 Thread Xuan Zhuo
Update the documentation of virtqueue_detach_unused_buf() to note that
it may also be called during queue reset.

Signed-off-by: Xuan Zhuo 
---
 drivers/virtio/virtio_ring.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index b87130c8f312..f1807f6b06a5 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2127,8 +2127,8 @@ EXPORT_SYMBOL_GPL(virtqueue_enable_cb_delayed);
  * @_vq: the struct virtqueue we're talking about.
  *
  * Returns NULL or the "data" token handed to virtqueue_add_*().
- * This is not valid on an active queue; it is useful only for device
- * shutdown.
+ * This is not valid on an active queue; it is useful for device
+ * shutdown or queue reset.
  */
 void *virtqueue_detach_unused_buf(struct virtqueue *_vq)
 {
-- 
2.31.0



[PATCH v9 01/32] virtio: add helper virtqueue_get_vring_max_size()

2022-04-05 Thread Xuan Zhuo
Record the maximum queue size supported by the device.

virtio-net can then display the maximum (hardware-supported) ring size
via "ethtool -g eth0".

When a subsequent patch implements vring reset, this value can be used
to check whether the ring size requested by the driver is legal.

Signed-off-by: Xuan Zhuo 
---
 arch/um/drivers/virtio_uml.c |  1 +
 drivers/platform/mellanox/mlxbf-tmfifo.c |  2 ++
 drivers/remoteproc/remoteproc_virtio.c   |  2 ++
 drivers/s390/virtio/virtio_ccw.c |  3 +++
 drivers/virtio/virtio_mmio.c |  2 ++
 drivers/virtio/virtio_pci_legacy.c   |  2 ++
 drivers/virtio/virtio_pci_modern.c   |  2 ++
 drivers/virtio/virtio_ring.c | 14 ++
 drivers/virtio/virtio_vdpa.c |  2 ++
 include/linux/virtio.h   |  2 ++
 10 files changed, 32 insertions(+)

diff --git a/arch/um/drivers/virtio_uml.c b/arch/um/drivers/virtio_uml.c
index ba562d68dc04..904993d15a85 100644
--- a/arch/um/drivers/virtio_uml.c
+++ b/arch/um/drivers/virtio_uml.c
@@ -945,6 +945,7 @@ static struct virtqueue *vu_setup_vq(struct virtio_device 
*vdev,
goto error_create;
}
vq->priv = info;
+   vq->num_max = num;
num = virtqueue_get_vring_size(vq);
 
if (vu_dev->protocol_features &
diff --git a/drivers/platform/mellanox/mlxbf-tmfifo.c 
b/drivers/platform/mellanox/mlxbf-tmfifo.c
index 38800e86ed8a..1ae3c56b66b0 100644
--- a/drivers/platform/mellanox/mlxbf-tmfifo.c
+++ b/drivers/platform/mellanox/mlxbf-tmfifo.c
@@ -959,6 +959,8 @@ static int mlxbf_tmfifo_virtio_find_vqs(struct 
virtio_device *vdev,
goto error;
}
 
+   vq->num_max = vring->num;
+
vqs[i] = vq;
vring->vq = vq;
vq->priv = vring;
diff --git a/drivers/remoteproc/remoteproc_virtio.c 
b/drivers/remoteproc/remoteproc_virtio.c
index 70ab496d0431..7611755d0ae2 100644
--- a/drivers/remoteproc/remoteproc_virtio.c
+++ b/drivers/remoteproc/remoteproc_virtio.c
@@ -125,6 +125,8 @@ static struct virtqueue *rp_find_vq(struct virtio_device 
*vdev,
return ERR_PTR(-ENOMEM);
}
 
+   vq->num_max = len;
+
rvring->vq = vq;
vq->priv = rvring;
 
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index d35e7a3f7067..468da60b56c5 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -529,6 +529,9 @@ static struct virtqueue *virtio_ccw_setup_vq(struct 
virtio_device *vdev,
err = -ENOMEM;
goto out_err;
}
+
+   vq->num_max = info->num;
+
/* it may have been reduced */
info->num = virtqueue_get_vring_size(vq);
 
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 56128b9c46eb..a41abc8051b9 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -390,6 +390,8 @@ static struct virtqueue *vm_setup_vq(struct virtio_device 
*vdev, unsigned index,
goto error_new_virtqueue;
}
 
+   vq->num_max = num;
+
/* Activate the queue */
writel(virtqueue_get_vring_size(vq), vm_dev->base + 
VIRTIO_MMIO_QUEUE_NUM);
if (vm_dev->version == 1) {
diff --git a/drivers/virtio/virtio_pci_legacy.c 
b/drivers/virtio/virtio_pci_legacy.c
index 34141b9abe27..b68934fe6b5d 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -135,6 +135,8 @@ static struct virtqueue *setup_vq(struct virtio_pci_device 
*vp_dev,
if (!vq)
return ERR_PTR(-ENOMEM);
 
+   vq->num_max = num;
+
q_pfn = virtqueue_get_desc_addr(vq) >> VIRTIO_PCI_QUEUE_ADDR_SHIFT;
if (q_pfn >> 32) {
	dev_err(&vp_dev->pci_dev->dev,
diff --git a/drivers/virtio/virtio_pci_modern.c 
b/drivers/virtio/virtio_pci_modern.c
index 5455bc041fb6..86d301f272b8 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -218,6 +218,8 @@ static struct virtqueue *setup_vq(struct virtio_pci_device 
*vp_dev,
if (!vq)
return ERR_PTR(-ENOMEM);
 
+   vq->num_max = num;
+
/* activate the queue */
vp_modern_set_queue_size(mdev, index, virtqueue_get_vring_size(vq));
vp_modern_queue_address(mdev, index, virtqueue_get_desc_addr(vq),
diff --git a/drivers/virtio/virtio_ring.c b/drivers/virtio/virtio_ring.c
index 962f1477b1fa..b87130c8f312 100644
--- a/drivers/virtio/virtio_ring.c
+++ b/drivers/virtio/virtio_ring.c
@@ -2371,6 +2371,20 @@ void vring_transport_features(struct virtio_device *vdev)
 }
 EXPORT_SYMBOL_GPL(vring_transport_features);
 
+/**
+ * virtqueue_get_vring_max_size - return the max size of the virtqueue's vring
+ * @_vq: the struct virtqueue containing the vring of interest.
+ *
+ * Returns the max size of the vring.
+ *
+ * Unlike other operations, this need not be serialized.
+ */
+unsigned int 

[PATCH v9 02/32] virtio: struct virtio_config_ops add callbacks for queue_reset

2022-04-05 Thread Xuan Zhuo
Performing reset on a queue is divided into four steps:

 1. transport: notify the device to reset the queue
 2. vring: recycle the buffer submitted
 3. vring: reset/resize the vring (may re-alloc)
 4. transport: mmap vring to device, and enable the queue

In order to support queue reset, add two callbacks (reset_vq,
enable_reset_vq) to struct virtio_config_ops to implement steps 1 and 4.

Signed-off-by: Xuan Zhuo 
---
 include/linux/virtio_config.h | 12 
 1 file changed, 12 insertions(+)

diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 4d107ad31149..d4adcd0e1c57 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -74,6 +74,16 @@ struct virtio_shm_region {
  * @set_vq_affinity: set the affinity for a virtqueue (optional).
  * @get_vq_affinity: get the affinity for a virtqueue (optional).
  * @get_shm_region: get a shared memory region based on the index.
+ * @reset_vq: reset a queue individually (optional).
+ * vq: the virtqueue
+ * Returns 0 on success or error status
+ * reset_vq will guarantee that the callbacks are disabled and 
synchronized.
+ * Except for the callback, the caller should guarantee that the vring is
+ * not accessed by any functions of virtqueue.
+ * @enable_reset_vq: enable a reset queue
+ * vq: the virtqueue
+ * Returns 0 on success or error status
+ * If reset_vq is set, then enable_reset_vq must also be set.
  */
 typedef void vq_callback_t(struct virtqueue *);
 struct virtio_config_ops {
@@ -100,6 +110,8 @@ struct virtio_config_ops {
int index);
bool (*get_shm_region)(struct virtio_device *vdev,
   struct virtio_shm_region *region, u8 id);
+   int (*reset_vq)(struct virtqueue *vq);
+   int (*enable_reset_vq)(struct virtqueue *vq);
 };
 
 /* If driver didn't advertise the feature, it will never appear. */
-- 
2.31.0



[PATCH v9 00/32] virtio pci support VIRTIO_F_RING_RESET (refactor vring)

2022-04-05 Thread Xuan Zhuo
The virtio spec already supports the virtio queue reset function. This patch
set adds that function to the kernel. The relevant virtio spec information is
here:

https://github.com/oasis-tcs/virtio-spec/issues/124

As for MMIO support for queue reset, I plan to add it after this series is
merged.

This patch set implements the refactoring of vring. Finally, the
virtqueue_resize() interface is provided, based on the reset function of the
transport layer.

Test environment:
Host: 4.19.91
Qemu: QEMU emulator version 6.2.50 (with vq reset support)
Test Cmd:  ethtool -G eth1 rx $1 tx $2; ethtool -g eth1

The default is split mode; to test packed mode, modify Qemu virtio-net to add
the PACKED feature.

Qemu code:

https://github.com/fengidri/qemu/compare/89f3bfa3265554d1d591ee4d7f1197b6e3397e84...master

To simplify review of this patch set, support for reusing the old buffers
after resize will be introduced in subsequent patch sets.

Please review. Thanks.

v9:
  1. Provide a virtqueue_resize() interface directly
  2. A patch set including vring resize, virtio pci reset, virtio-net resize
  3. No more separate structs

v8:
  1. Provide a virtqueue_reset() interface directly
  2. Split the two patch sets, this is the first part
  3. Add independent allocation helper for allocating state, extra

v7:
  1. fix #6 subject typo
  2. fix #6 ring_size_in_bytes is uninitialized
  3. check by: make W=12

v6:
  1. virtio_pci: use synchronize_irq(irq) to sync the irq callbacks
  2. Introduce virtqueue_reset_vring() to implement the reset of vring during
 the reset process. May use the old vring if num of the vq not change.
  3. find_vqs() support sizes to special the max size of each vq

v5:
  1. add virtio-net support set_ringparam

v4:
  1. just the code of virtio, without virtio-net
  2. Performing reset on a queue is divided into these steps:
1. reset_vq: reset one vq
2. recycle the buffer from vq by virtqueue_detach_unused_buf()
3. release the ring of the vq by vring_release_virtqueue()
4. enable_reset_vq: re-enable the reset queue
  3. Simplify the parameters of enable_reset_vq()
  4. add container structures for virtio_pci_common_cfg

v3:
  1. keep vq, irq unreleased

Xuan Zhuo (32):
  virtio: add helper virtqueue_get_vring_max_size()
  virtio: struct virtio_config_ops add callbacks for queue_reset
  virtio_ring: update the document of the virtqueue_detach_unused_buf
for queue reset
  virtio_ring: remove the arg vq of vring_alloc_desc_extra()
  virtio_ring: extract the logic of freeing vring
  virtio_ring: split: extract the logic of alloc queue
  virtio_ring: split: extract the logic of alloc state and extra
  virtio_ring: split: extract the logic of attach vring
  virtio_ring: split: extract the logic of vq init
  virtio_ring: split: introduce virtqueue_reinit_split()
  virtio_ring: split: introduce virtqueue_resize_split()
  virtio_ring: packed: extract the logic of alloc queue
  virtio_ring: packed: extract the logic of alloc state and extra
  virtio_ring: packed: extract the logic of attach vring
  virtio_ring: packed: extract the logic of vq init
  virtio_ring: packed: introduce virtqueue_reinit_packed()
  virtio_ring: packed: introduce virtqueue_resize_packed()
  virtio_ring: introduce virtqueue_resize()
  virtio_pci: struct virtio_pci_common_cfg add queue_notify_data
  virtio: queue_reset: add VIRTIO_F_RING_RESET
  virtio_pci: queue_reset: update struct virtio_pci_common_cfg and
option functions
  virtio_pci: queue_reset: extract the logic of active vq for modern pci
  virtio_pci: queue_reset: support VIRTIO_F_RING_RESET
  virtio: find_vqs() add arg sizes
  virtio_pci: support the arg sizes of find_vqs()
  virtio_mmio: support the arg sizes of find_vqs()
  virtio: add helper virtio_find_vqs_ctx_size()
  virtio_net: set the default max ring size by find_vqs()
  virtio_net: get ringparam by virtqueue_get_vring_max_size()
  virtio_net: split free_unused_bufs()
  virtio_net: support rx/tx queue resize
  virtio_net: support set_ringparam

 arch/um/drivers/virtio_uml.c |   3 +-
 drivers/net/virtio_net.c | 219 +++-
 drivers/platform/mellanox/mlxbf-tmfifo.c |   3 +
 drivers/remoteproc/remoteproc_virtio.c   |   3 +
 drivers/s390/virtio/virtio_ccw.c |   4 +
 drivers/virtio/virtio_mmio.c |  11 +-
 drivers/virtio/virtio_pci_common.c   |  28 +-
 drivers/virtio/virtio_pci_common.h   |   3 +-
 drivers/virtio/virtio_pci_legacy.c   |   8 +-
 drivers/virtio/virtio_pci_modern.c   | 149 +-
 drivers/virtio/virtio_pci_modern_dev.c   |  36 ++
 drivers/virtio/virtio_ring.c | 626 ++-
 drivers/virtio/virtio_vdpa.c |   3 +
 include/linux/virtio.h   |   6 +
 include/linux/virtio_config.h|  38 +-
 include/linux/virtio_pci_modern.h|   2 +
 include/uapi/linux/virtio_config.h   |   7 +-
 

Re: [PATCH RESEND V2 3/3] vdpa/mlx5: Use consistent RQT size

2022-04-05 Thread Jason Wang


On 2022/4/4 7:24 PM, Michael S. Tsirkin wrote:

On Mon, Apr 04, 2022 at 11:07:36AM +, Eli Cohen wrote:

From: Michael S. Tsirkin 
Sent: Monday, April 4, 2022 1:35 PM
To: Jason Wang 
Cc: Eli Cohen ; hdan...@sina.com; 
virtualization@lists.linux-foundation.org; linux-ker...@vger.kernel.org
Subject: Re: [PATCH RESEND V2 3/3] vdpa/mlx5: Use consistent RQT size

On Tue, Mar 29, 2022 at 12:21:09PM +0800, Jason Wang wrote:

From: Eli Cohen 

The current code evaluates RQT size based on the configured number of
virtqueues. This can raise an issue in the following scenario:

Assume MQ was negotiated.
1. mlx5_vdpa_set_map() gets called.
2. handle_ctrl_mq() is called setting cur_num_vqs to some value, lower
than the configured max VQs.
3. A second set_map gets called, but now a smaller number of VQs is used
to evaluate the size of the RQT.
4. handle_ctrl_mq() is called with a value larger than what the RQT can
hold. This will emit errors and the driver state is compromised.

To fix this, we use a new field in struct mlx5_vdpa_net to hold the
required number of entries in the RQT. This value is evaluated in
mlx5_vdpa_set_driver_features() where we have the negotiated features
all set up.

In addtion

addition?

Do you need me to send another version?

It's a bit easier that way but I can handle it manually too.



Let me send a new version with this fixed.





If so, let's wait for Jason's reply.

Right.


to that, we take into consideration the max capability of RQT
entries early when the device is added so we don't need to take consider
it when creating the RQT.

Last, we remove the use of mlx5_vdpa_max_qps() which just returns the
max_vas / 2 and make the code clearer.

Fixes: 52893733f2c5 ("vdpa/mlx5: Add multiqueue support")
Signed-off-by: Eli Cohen 

Jason I don't have your ack or S.O.B on this one.



My bad, for some reason, I miss that.

Will fix.

Thanks






---
  drivers/vdpa/mlx5/net/mlx5_vnet.c | 61 +++
  1 file changed, 21 insertions(+), 40 deletions(-)

diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c 
b/drivers/vdpa/mlx5/net/mlx5_vnet.c
index 53b8c1a68f90..61bec1ed0bc9 100644
--- a/drivers/vdpa/mlx5/net/mlx5_vnet.c
+++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c
@@ -161,6 +161,7 @@ struct mlx5_vdpa_net {
struct mlx5_flow_handle *rx_rule_mcast;
bool setup;
u32 cur_num_vqs;
+   u32 rqt_size;
struct notifier_block nb;
struct vdpa_callback config_cb;
struct mlx5_vdpa_wq_ent cvq_ent;
@@ -204,17 +205,12 @@ static __virtio16 cpu_to_mlx5vdpa16(struct mlx5_vdpa_dev 
*mvdev, u16 val)
return __cpu_to_virtio16(mlx5_vdpa_is_little_endian(mvdev), val);
  }

-static inline u32 mlx5_vdpa_max_qps(int max_vqs)
-{
-   return max_vqs / 2;
-}
-
  static u16 ctrl_vq_idx(struct mlx5_vdpa_dev *mvdev)
  {
if (!(mvdev->actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
return 2;

-   return 2 * mlx5_vdpa_max_qps(mvdev->max_vqs);
+   return mvdev->max_vqs;
  }

  static bool is_ctrl_vq_idx(struct mlx5_vdpa_dev *mvdev, u16 idx)
@@ -1236,25 +1232,13 @@ static void teardown_vq(struct mlx5_vdpa_net *ndev, 
struct mlx5_vdpa_virtqueue *
  static int create_rqt(struct mlx5_vdpa_net *ndev)
  {
__be32 *list;
-   int max_rqt;
void *rqtc;
int inlen;
void *in;
int i, j;
int err;
-   int num;
-
-   if (!(ndev->mvdev.actual_features & BIT_ULL(VIRTIO_NET_F_MQ)))
-   num = 1;
-   else
-   num = ndev->cur_num_vqs / 2;

-   max_rqt = min_t(int, roundup_pow_of_two(num),
-   1 << MLX5_CAP_GEN(ndev->mvdev.mdev, log_max_rqt_size));
-   if (max_rqt < 1)
-   return -EOPNOTSUPP;
-
-   inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + max_rqt * 
MLX5_ST_SZ_BYTES(rq_num);
+   inlen = MLX5_ST_SZ_BYTES(create_rqt_in) + ndev->rqt_size * 
MLX5_ST_SZ_BYTES(rq_num);
in = kzalloc(inlen, GFP_KERNEL);
if (!in)
return -ENOMEM;
@@ -1263,12 +1247,12 @@ static int create_rqt(struct mlx5_vdpa_net *ndev)
rqtc = MLX5_ADDR_OF(create_rqt_in, in, rqt_context);

MLX5_SET(rqtc, rqtc, list_q_type, MLX5_RQTC_LIST_Q_TYPE_VIRTIO_NET_Q);
-   MLX5_SET(rqtc, rqtc, rqt_max_size, max_rqt);
+   MLX5_SET(rqtc, rqtc, rqt_max_size, ndev->rqt_size);
list = MLX5_ADDR_OF(rqtc, rqtc, rq_num[0]);
-   for (i = 0, j = 0; i < max_rqt; i++, j += 2)
-   list[i] = cpu_to_be32(ndev->vqs[j % (2 * num)].virtq_id);
+   for (i = 0, j = 0; i < ndev->rqt_size; i++, j += 2)
+   list[i] = cpu_to_be32(ndev->vqs[j % 
ndev->cur_num_vqs].virtq_id);

-   MLX5_SET(rqtc, rqtc, rqt_actual_size, max_rqt);
+   MLX5_SET(rqtc, rqtc, rqt_actual_size, ndev->rqt_size);
	err = mlx5_vdpa_create_rqt(&ndev->mvdev, in, inlen, &ndev->res.rqtn);
kfree(in);
if (err)
@@ -1282,19 +1266,13 @@ static int create_rqt(struct mlx5_vdpa_net *ndev)
  static int modify_rqt(struct 
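For illustration, the sizing rule from the commit message can be reduced to a standalone sketch. This is a userspace toy, not the driver: `rqt_size_for()` and `max_vq_pairs` are invented names standing in for the value read at device-add time, and only the VIRTIO_NET_F_MQ bit value (22) is taken from the virtio spec.

```c
#include <assert.h>
#include <stdint.h>

#define VIRTIO_NET_F_MQ (1ULL << 22) /* virtio-net multiqueue feature bit */

/*
 * Toy model of the sizing rule: once the driver features are
 * negotiated, the RQT entry count no longer depends on the current
 * number of active VQs -- only on whether MQ was accepted and on the
 * maximum capability recorded when the device was added.
 */
static uint32_t rqt_size_for(uint64_t negotiated, uint16_t max_vq_pairs)
{
	if (negotiated & VIRTIO_NET_F_MQ)
		return max_vq_pairs;
	return 1; /* single queue pair without MQ */
}
```

With MQ negotiated the table is sized from the device-add-time maximum; without it a single entry suffices, which is why create_rqt() above no longer needs to recalculate anything.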

RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()

2022-04-05 Thread Tian, Kevin
> From: Tian, Kevin
> Sent: Wednesday, April 6, 2022 7:32 AM
> 
> > From: Jason Gunthorpe 
> > Sent: Wednesday, April 6, 2022 6:58 AM
> >
> > On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> > > >
> > > > +static bool intel_iommu_enforce_cache_coherency(struct
> > iommu_domain *domain)
> > > > +{
> > > > +   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > > > +
> > > > +   if (!dmar_domain->iommu_snooping)
> > > > +   return false;
> > > > +   dmar_domain->enforce_no_snoop = true;
> > > > +   return true;
> > > > +}
> > >
> > > Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> > > support it, ie. reserved register bit set in pte faults?
> >
> > The way the Intel driver is setup that is not possible. Currently it
> > does:
> >
> >  static bool intel_iommu_capable(enum iommu_cap cap)
> >  {
> > if (cap == IOMMU_CAP_CACHE_COHERENCY)
> > return domain_update_iommu_snooping(NULL);
> >
> > Which is a global property unrelated to any device.
> >
> > Thus either all devices and all domains support iommu_snooping, or
> > none do.
> >
> > It is unclear because for some reason the driver recalculates this
> > almost constant value on every device attach..
> 
> The reason is simply because iommu capability is a global flag 

...and the intel iommu driver supports hotplugged iommus. But in reality this
recalculation is almost a no-op, because the only iommu that doesn't
support force snoop is the one for the Intel GPU, which is built into the platform.

> 
> when the cap is removed by this series I don't think we need keep that
> global recalculation thing.
> 
> Thanks
> Kevin
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

RE: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()

2022-04-05 Thread Tian, Kevin
> From: Jason Gunthorpe 
> Sent: Wednesday, April 6, 2022 6:58 AM
> 
> On Tue, Apr 05, 2022 at 01:50:36PM -0600, Alex Williamson wrote:
> > >
> > > +static bool intel_iommu_enforce_cache_coherency(struct
> iommu_domain *domain)
> > > +{
> > > + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> > > +
> > > + if (!dmar_domain->iommu_snooping)
> > > + return false;
> > > + dmar_domain->enforce_no_snoop = true;
> > > + return true;
> > > +}
> >
> > Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
> > support it, ie. reserved register bit set in pte faults?
> 
> The way the Intel driver is setup that is not possible. Currently it
> does:
> 
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
>   if (cap == IOMMU_CAP_CACHE_COHERENCY)
>   return domain_update_iommu_snooping(NULL);
> 
> Which is a global property unrelated to any device.
> 
> Thus either all devices and all domains support iommu_snooping, or
> none do.
> 
> It is unclear because for some reason the driver recalculates this
> almost constant value on every device attach..

The reason is simply because iommu capability is a global flag 

when the cap is removed by this series I don't think we need keep that
global recalculation thing.

Thanks
Kevin

Re: [PATCH 3/5] iommu: Introduce the domain op enforce_cache_coherency()

2022-04-05 Thread Alex Williamson
On Tue,  5 Apr 2022 13:16:02 -0300
Jason Gunthorpe  wrote:

> This new mechanism will replace using IOMMU_CAP_CACHE_COHERENCY and
> IOMMU_CACHE to control the no-snoop blocking behavior of the IOMMU.
> 
> Currently only Intel and AMD IOMMUs are known to support this
> feature. They both implement it as an IOPTE bit, that when set, will cause
> PCIe TLPs to that IOVA with the no-snoop bit set to be treated as though
> the no-snoop bit was clear.
> 
> The new API is triggered by calling enforce_cache_coherency() before
> mapping any IOVA to the domain which globally switches on no-snoop
> blocking. This allows other implementations that might block no-snoop
> globally and outside the IOPTE - AMD also documents such an HW capability.
> 
> Leave AMD out of sync with Intel and have it block no-snoop even for
> in-kernel users. This can be trivially resolved in a follow up patch.
> 
> Only VFIO will call this new API.
> 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/iommu/amd/iommu.c   |  7 +++
>  drivers/iommu/intel/iommu.c | 14 +-
>  include/linux/intel-iommu.h |  1 +
>  include/linux/iommu.h   |  4 
>  4 files changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index a1ada7bff44e61..e500b487eb3429 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -2271,6 +2271,12 @@ static int amd_iommu_def_domain_type(struct device 
> *dev)
>   return 0;
>  }
>  
> +static bool amd_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> + /* IOMMU_PTE_FC is always set */
> + return true;
> +}
> +
>  const struct iommu_ops amd_iommu_ops = {
>   .capable = amd_iommu_capable,
>   .domain_alloc = amd_iommu_domain_alloc,
> @@ -2293,6 +2299,7 @@ const struct iommu_ops amd_iommu_ops = {
>   .flush_iotlb_all = amd_iommu_flush_iotlb_all,
>   .iotlb_sync = amd_iommu_iotlb_sync,
>   .free   = amd_iommu_domain_free,
> + .enforce_cache_coherency = amd_iommu_enforce_cache_coherency,
>   }
>  };
>  
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index df5c62ecf942b8..f08611a6cc4799 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -4422,7 +4422,8 @@ static int intel_iommu_map(struct iommu_domain *domain,
>   prot |= DMA_PTE_READ;
>   if (iommu_prot & IOMMU_WRITE)
>   prot |= DMA_PTE_WRITE;
> - if ((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping)
> + if (((iommu_prot & IOMMU_CACHE) && dmar_domain->iommu_snooping) ||
> + dmar_domain->enforce_no_snoop)
>   prot |= DMA_PTE_SNP;
>  
>   max_addr = iova + size;
> @@ -4545,6 +4546,16 @@ static phys_addr_t intel_iommu_iova_to_phys(struct 
> iommu_domain *domain,
>   return phys;
>  }
>  
> +static bool intel_iommu_enforce_cache_coherency(struct iommu_domain *domain)
> +{
> + struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> +
> + if (!dmar_domain->iommu_snooping)
> + return false;
> + dmar_domain->enforce_no_snoop = true;
> + return true;
> +}

Don't we have issues if we try to set DMA_PTE_SNP on DMARs that don't
support it, ie. reserved register bit set in pte faults?  It seems
really inconsistent here that I could make a domain that supports
iommu_snooping, set enforce_no_snoop = true, then add another DMAR to
the domain that may not support iommu_snooping, I'd get false on the
subsequent enforcement test, but the dmar_domain is still trying to use
DMA_PTE_SNP.

There's also a disconnect, maybe just in the naming or documentation,
but if I call enforce_cache_coherency for a domain, that seems like the
domain should retain those semantics regardless of how it's modified,
ie. "enforced".  For example, if I tried to perform the above operation,
I should get a failure attaching the device that brings in the less
capable DMAR because the domain has been set to enforce this feature.

If the API is that I need to re-enforce_cache_coherency on every
modification of the domain, shouldn't dmar_domain->enforce_no_snoop
also return to a default value on domain changes?

Maybe this should be something like set_no_snoop_squashing with the
above semantics, it needs to be re-applied whenever the domain:device
composition changes?  Thanks,

Alex

> +
>  static bool intel_iommu_capable(enum iommu_cap cap)
>  {
>   if (cap == IOMMU_CAP_CACHE_COHERENCY)
> @@ -4898,6 +4909,7 @@ const struct iommu_ops intel_iommu_ops = {
>   .iotlb_sync = intel_iommu_tlb_sync,
>   .iova_to_phys   = intel_iommu_iova_to_phys,
>   .free   = intel_iommu_domain_free,
> + .enforce_cache_coherency = intel_iommu_enforce_cache_coherency,
>   }
>  };
>  
> diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
> index 2f9891cb3d0014..1f930c0c225d94 
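The Intel hook quoted in the patch is small enough to model directly. The sketch below is a standalone toy (plain C structs, no kernel types, `toy_*` names are invented) of the semantics under discussion: enforcement fails if the domain lacks snoop control, and succeeds by latching a flag that later mappings consult.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_domain {
	bool iommu_snooping;   /* capability of the attached hardware */
	bool enforce_no_snoop; /* latched by the new domain op */
};

/* Mirrors intel_iommu_enforce_cache_coherency() from the patch. */
static bool toy_enforce_cache_coherency(struct toy_domain *d)
{
	if (!d->iommu_snooping)
		return false;
	d->enforce_no_snoop = true;
	return true;
}

/*
 * Mirrors the intel_iommu_map() change: SNP is set when either the
 * caller asked for IOMMU_CACHE on a snooping domain, or enforcement
 * was switched on earlier.
 */
static bool pte_gets_snp(const struct toy_domain *d, bool iommu_cache)
{
	return (iommu_cache && d->iommu_snooping) || d->enforce_no_snoop;
}
```

Alex's concern maps onto this model: nothing here clears `enforce_no_snoop` or re-checks `iommu_snooping` when the domain's device composition changes.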

Re: [PATCH 2/5] vfio: Require that devices support DMA cache coherence

2022-04-05 Thread Alex Williamson
On Tue,  5 Apr 2022 13:16:01 -0300
Jason Gunthorpe  wrote:

> dev_is_dma_coherent() is the control to determine if IOMMU_CACHE can be
> supported.
> 
> IOMMU_CACHE means that normal DMAs do not require any additional coherency
> mechanism and is the basic uAPI that VFIO exposes to userspace. For
> instance VFIO applications like DPDK will not work if additional coherency
> operations are required.
> 
> Therefore check dev_is_dma_coherent() before allowing a device to join a
> domain. This will block device/platform/iommu combinations from using VFIO
> that do not support cache coherent DMA.
> 
> Signed-off-by: Jason Gunthorpe 
> ---
>  drivers/vfio/vfio.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index a4555014bd1e72..2a3aa3e742d943 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -32,6 +32,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include "vfio.h"
>  
>  #define DRIVER_VERSION   "0.3"
> @@ -1348,6 +1349,11 @@ static int vfio_group_get_device_fd(struct vfio_group 
> *group, char *buf)
>   if (IS_ERR(device))
>   return PTR_ERR(device);
>  
> + if (group->type == VFIO_IOMMU && !dev_is_dma_coherent(device->dev)) {
> + ret = -ENODEV;
> + goto err_device_put;
> + }
> +

Failing at the point where the user is trying to gain access to the
device seems a little late in the process and opaque, wouldn't we
rather have vfio bus drivers fail to probe such devices?  I'd expect
this to occur in the vfio_register_group_dev() path.  Thanks,

Alex

>   if (!try_module_get(device->dev->driver->owner)) {
>   ret = -ENODEV;
>   goto err_device_put;



Re: [PATCH] drm/virtio: Add execbuf flag to request no fence-event

2022-04-05 Thread Chia-I Wu
On Tue, Apr 5, 2022 at 10:38 AM Rob Clark  wrote:
>
> From: Rob Clark 
>
> It would have been cleaner to have a flag to *request* the fence event.
> But that ship has sailed.  So add a flag so that userspace which doesn't
> care about the events can opt-out.
>
> Signed-off-by: Rob Clark 
Reviewed-by: Chia-I Wu 

Might want to wait for Gurchetan to chime in as he added the mechanism.

> ---
>  drivers/gpu/drm/virtio/virtgpu_ioctl.c | 8 +---
>  include/uapi/drm/virtgpu_drm.h | 2 ++
>  2 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
> b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> index 3a8078f2ee27..09f1aa263f91 100644
> --- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> +++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
> @@ -225,9 +225,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
> *dev, void *data,
> goto out_unresv;
> }
>
> -   ret = virtio_gpu_fence_event_create(dev, file, out_fence, ring_idx);
> -   if (ret)
> -   goto out_unresv;
> +   if (!(exbuf->flags & VIRTGPU_EXECBUF_NO_EVENT)) {
> +   ret = virtio_gpu_fence_event_create(dev, file, out_fence, 
> ring_idx);
> +   if (ret)
> +   goto out_unresv;
> +   }
>
> if (out_fence_fd >= 0) {
> sync_file = sync_file_create(&out_fence->f);
> diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h
> index 0512fde5e697..d06cac3407cc 100644
> --- a/include/uapi/drm/virtgpu_drm.h
> +++ b/include/uapi/drm/virtgpu_drm.h
> @@ -52,10 +52,12 @@ extern "C" {
>  #define VIRTGPU_EXECBUF_FENCE_FD_IN0x01
>  #define VIRTGPU_EXECBUF_FENCE_FD_OUT   0x02
>  #define VIRTGPU_EXECBUF_RING_IDX   0x04
> +#define VIRTGPU_EXECBUF_NO_EVENT   0x08
>  #define VIRTGPU_EXECBUF_FLAGS  (\
> VIRTGPU_EXECBUF_FENCE_FD_IN |\
> VIRTGPU_EXECBUF_FENCE_FD_OUT |\
> VIRTGPU_EXECBUF_RING_IDX |\
> +   VIRTGPU_EXECBUF_NO_EVENT |\
> 0)
>
>  struct drm_virtgpu_map {
> --
> 2.35.1
>


Re: [GIT PULL] virtio: fixes, cleanups

2022-04-05 Thread pr-tracker-bot
The pull request you sent on Mon, 4 Apr 2022 06:31:28 -0400:

> https://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost.git tags/for_linus

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/3e732ebf7316ac83e8562db7e64cc68aec390a18

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/prtracker.html


[PATCH] drm/virtio: Add execbuf flag to request no fence-event

2022-04-05 Thread Rob Clark
From: Rob Clark 

It would have been cleaner to have a flag to *request* the fence event.
But that ship has sailed.  So add a flag so that userspace which doesn't
care about the events can opt-out.

Signed-off-by: Rob Clark 
---
 drivers/gpu/drm/virtio/virtgpu_ioctl.c | 8 +---
 include/uapi/drm/virtgpu_drm.h | 2 ++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/virtio/virtgpu_ioctl.c 
b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
index 3a8078f2ee27..09f1aa263f91 100644
--- a/drivers/gpu/drm/virtio/virtgpu_ioctl.c
+++ b/drivers/gpu/drm/virtio/virtgpu_ioctl.c
@@ -225,9 +225,11 @@ static int virtio_gpu_execbuffer_ioctl(struct drm_device 
*dev, void *data,
goto out_unresv;
}
 
-   ret = virtio_gpu_fence_event_create(dev, file, out_fence, ring_idx);
-   if (ret)
-   goto out_unresv;
+   if (!(exbuf->flags & VIRTGPU_EXECBUF_NO_EVENT)) {
+   ret = virtio_gpu_fence_event_create(dev, file, out_fence, 
ring_idx);
+   if (ret)
+   goto out_unresv;
+   }
 
if (out_fence_fd >= 0) {
sync_file = sync_file_create(&out_fence->f);
diff --git a/include/uapi/drm/virtgpu_drm.h b/include/uapi/drm/virtgpu_drm.h
index 0512fde5e697..d06cac3407cc 100644
--- a/include/uapi/drm/virtgpu_drm.h
+++ b/include/uapi/drm/virtgpu_drm.h
@@ -52,10 +52,12 @@ extern "C" {
 #define VIRTGPU_EXECBUF_FENCE_FD_IN0x01
 #define VIRTGPU_EXECBUF_FENCE_FD_OUT   0x02
 #define VIRTGPU_EXECBUF_RING_IDX   0x04
+#define VIRTGPU_EXECBUF_NO_EVENT   0x08
 #define VIRTGPU_EXECBUF_FLAGS  (\
VIRTGPU_EXECBUF_FENCE_FD_IN |\
VIRTGPU_EXECBUF_FENCE_FD_OUT |\
VIRTGPU_EXECBUF_RING_IDX |\
+   VIRTGPU_EXECBUF_NO_EVENT |\
0)
 
 struct drm_virtgpu_map {
-- 
2.35.1



Re: [PATCH v5 0/2] virtio-blk: support polling I/O and mq_ops->queue_rqs()

2022-04-05 Thread Stefan Hajnoczi
On Wed, Apr 06, 2022 at 12:09:22AM +0900, Suwan Kim wrote:
> This patch series adds support for polling I/O and mq_ops->queue_rqs()
> to virtio-blk driver.
> 
> Changes
> 
> v4 -> v5
> - patch1 : virtblk_poll
> - Replace "req_done" with "found" in virtblk_poll()
> - Split for loop into two distinct for loop in init_vq()
>   that sets callback function for each default/poll queues
> - Replace "if (i == HCTX_TYPE_DEFAULT)" with "i != HCTX_TYPE_POLL"
>   in virtblk_map_queues()
> - Replace "virtblk_unmap_data(req, vbr);" with
>   "virtblk_unmap_data(req, blk_mq_rq_to_pdu(req);"
>   in virtblk_complete_batch()
> 
> - patch2 : virtio_queue_rqs
> - Instead of using vbr.sg_num field, use vbr->sg_table.nents.
>   So, remove sg_num field in struct virtblk_req
> - Drop the unnecessary argument of virtblk_add_req() because it
>   doesn't need "data_sg" and "have_data". It can be derived from "vbr"
>   argument.
> - Add Reviewed-by tag from Stefan
> 
> v3 -> v4
> - patch1 : virtblk_poll
> - Add print the number of default/read/poll queues in init_vq()
> - Add blk_mq_start_stopped_hw_queues() to virtblk_poll()
>   virtblk_poll()
>   ...
>   if (req_done)
>
> blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
>   ...
> 
> - patch2 : virtio_queue_rqs
> - Modify virtio_queue_rqs() to hold lock only once when it adds
>   requests to virtqueue just before virtqueue notify.
>   It will guarantee that virtio_queue_rqs() will not use
>   previous req again.
> 
> v2 -> v3
> - Fix warning by kernel test robot
>   
> static int virtblk_poll()
> ...
> if (!blk_mq_add_to_batch(req, iob, virtblk_result(vbr),
>-> vbr->status,
> 
> v1 -> v2
> - To receive the number of poll queues from user,
>   use module parameter instead of QEMU uapi change.
> 
> - Add the comment about virtblk_map_queues().
> 
> - Add support for mq_ops->queue_rqs() to implement submit side
>   batch.
> 
> Suwan Kim (2):
>   virtio-blk: support polling I/O
>   virtio-blk: support mq_ops->queue_rqs()
> 
>  drivers/block/virtio_blk.c | 229 +
>  1 file changed, 206 insertions(+), 23 deletions(-)
> 
> -- 
> 2.26.3
> 

Reviewed-by: Stefan Hajnoczi 



Re: [PATCH 8/8] virtio_ring.h: do not include from exported header

2022-04-05 Thread Michael S. Tsirkin
On Tue, Apr 05, 2022 at 08:29:36AM +0200, Arnd Bergmann wrote:
> On Tue, Apr 5, 2022 at 7:35 AM Christoph Hellwig  wrote:
> >
> > On Mon, Apr 04, 2022 at 10:04:02AM +0200, Arnd Bergmann wrote:
> > > The header is shared between kernel and other projects using virtio, such 
> > > as
> > > qemu and any boot loaders booting from virtio devices. It's not 
> > > technically a
> > > /kernel/ ABI, but it is an ABI and for practical reasons the kernel 
> > > version is
> > > maintained as the master copy if I understand it correctly.
> >
> > Besides that fact that as you correctly states these are not a UAPI at
> > all, qemu and bootloades are not specific to Linux and can't require a
> > specific kernel version.  So the same thing we do for file system
> > formats or network protocols applies here:  just copy the damn header.
> > And as stated above any reasonably portable userspace needs to have a
> > copy anyway.
> 
> I think the users all have their own copies, at least the ones I could
> find on codesearch.debian.org.

kvmtool does not seem to have its own copy, just grep vring_init.

> However, there are 27 virtio_*.h
> files in include/uapi/linux that probably should stay together for
> the purpose of defining the virtio protocol, and some others might
> be uapi relevant.
> 
> I see that at least include/uapi/linux/vhost.h has ioctl() definitions
> in it, and includes the virtio_ring.h header indirectly.
> 
> Adding the virtio maintainers to Cc to see if they can provide
> more background on this.
> 
> > If it is just as a "master copy" it can live in drivers/virtio/, just
> > like we do for other formats.
> 
> It has to be in include/linux/ at least because it's used by a number
> of drivers outside of drivers/virtio/.
> 
> Arnd



Re: [PATCH 8/8] virtio_ring.h: do not include from exported header

2022-04-05 Thread Michael S. Tsirkin
On Tue, Apr 05, 2022 at 12:01:35AM -0700, Christoph Hellwig wrote:
> On Tue, Apr 05, 2022 at 08:29:36AM +0200, Arnd Bergmann wrote:
> > I think the users all have their own copies, at least the ones I could
> > find on codesearch.debian.org. However, there are 27 virtio_*.h
> > files in include/uapi/linux that probably should stay together for
> > the purpose of defining the virtio protocol, and some others might
> > be uapi relevant.
> > 
> > I see that at least include/uapi/linux/vhost.h has ioctl() definitions
> > in it, and includes the virtio_ring.h header indirectly.
> 
> Uhh.  We had a similar mess (but at a smaller scale) in nvme, where
> the uapi nvme.h contained both the UAPI and the protocol definition.
> We took a hard break to only have a nvme_ioctl.h in the uapi header
> and linux/nvme.h for the protocol.  This did break a bit of userspace
> compilation (but not running obviously) at the time, but really made
> the headers much easier to main.  Some userspace keeps on copying
> nvme.h with the protocol definitions.

So far we are quite happy with the status quo, I don't see any issues
maintaining the headers. And yes, through vhost and vringh they are part
of UAPI.

Yes users have their own copies but they synch with the kernel.

That's generally. Specifically the vring_init thing is a legacy thingy
used by kvmtool and maybe others, and it inits the ring in the way that
vring/virtio expect.  Has been there since day 1 and we are careful not
to add more stuff like that, so I don't see a lot of gain from incurring
this pain for users.

-- 
MST



Re: [PATCH v4 2/2] virtio-blk: support mq_ops->queue_rqs()

2022-04-05 Thread Stefan Hajnoczi
On Tue, Apr 05, 2022 at 02:31:22PM +0900, Suwan Kim wrote:
> This patch supports mq_ops->queue_rqs() hook. It has an advantage of
> batch submission to virtio-blk driver. It also helps polling I/O because
> polling uses batched completion of block layer. Batch submission in
> queue_rqs() can boost polling performance.
> 
> In queue_rqs(), it iterates plug->mq_list, collects requests that
> belong to same HW queue until it encounters a request from other
> HW queue or sees the end of the list.
> Then, virtio-blk adds requests into virtqueue and kicks virtqueue
> to submit requests.
> 
> If there is an error, it inserts error request to requeue_list and
> passes it to ordinary block layer path.
> 
> For verification, I did fio test.
> (io_uring, randread, direct=1, bs=4K, iodepth=64 numjobs=N)
> I set 4 vcpu and 2 virtio-blk queues for VM and run fio test 5 times.
> It shows about 2% improvement.
> 
>  |   numjobs=2   |   numjobs=4
>   ---
> fio without queue_rqs()  |   291K IOPS   |   238K IOPS
>   ---
> fio with queue_rqs() |   295K IOPS   |   243K IOPS
> 
> For polling I/O performance, I also did fio test as below.
> (io_uring, hipri, randread, direct=1, bs=512, iodepth=64 numjobs=4)
> I set 4 vcpu and 2 poll queues for VM.
> It shows about 2% improvement in polling I/O.
> 
>   |   IOPS   |  avg latency
>   ---
> fio poll without queue_rqs()  |   424K   |   613.05 usec
>   ---
> fio poll with queue_rqs() |   435K   |   601.01 usec
> 
> Signed-off-by: Suwan Kim 
> ---
>  drivers/block/virtio_blk.c | 110 +
>  1 file changed, 99 insertions(+), 11 deletions(-)

Reviewed-by: Stefan Hajnoczi 



Re: [PATCH v4 2/2] virtio-blk: support mq_ops->queue_rqs()

2022-04-05 Thread Christoph Hellwig
On Tue, Apr 05, 2022 at 02:31:22PM +0900, Suwan Kim wrote:
> This patch supports mq_ops->queue_rqs() hook. It has an advantage of
> batch submission to virtio-blk driver. It also helps polling I/O because
> polling uses batched completion of block layer. Batch submission in
> queue_rqs() can boost polling performance.
> 
> In queue_rqs(), it iterates plug->mq_list, collects requests that
> belong to same HW queue until it encounters a request from other
> HW queue or sees the end of the list.
> Then, virtio-blk adds requests into virtqueue and kicks virtqueue
> to submit requests.
> 
> If there is an error, it inserts error request to requeue_list and
> passes it to ordinary block layer path.
> 
> For verification, I did fio test.
> (io_uring, randread, direct=1, bs=4K, iodepth=64 numjobs=N)
> I set 4 vcpu and 2 virtio-blk queues for VM and run fio test 5 times.
> It shows about 2% improvement.
> 
>  |   numjobs=2   |   numjobs=4
>   ---
> fio without queue_rqs()  |   291K IOPS   |   238K IOPS
>   ---
> fio with queue_rqs() |   295K IOPS   |   243K IOPS
> 
> For polling I/O performance, I also did fio test as below.
> (io_uring, hipri, randread, direct=1, bs=512, iodepth=64 numjobs=4)
> I set 4 vcpu and 2 poll queues for VM.
> It shows about 2% improvement in polling I/O.
> 
>   |   IOPS   |  avg latency
>   ---
> fio poll without queue_rqs()  |   424K   |   613.05 usec
>   ---
> fio poll with queue_rqs() |   435K   |   601.01 usec
> 
> Signed-off-by: Suwan Kim 
> ---
>  drivers/block/virtio_blk.c | 110 +
>  1 file changed, 99 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 712579dcd3cc..a091034bc551 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -92,6 +92,7 @@ struct virtio_blk {
>  struct virtblk_req {
>   struct virtio_blk_outhdr out_hdr;
>   u8 status;
> + int sg_num;
>   struct sg_table sg_table;
>   struct scatterlist sg[];
>  };
> @@ -311,18 +312,13 @@ static void virtio_commit_rqs(struct blk_mq_hw_ctx 
> *hctx)
>   virtqueue_notify(vq->vq);
>  }
>  
> -static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx,
> -const struct blk_mq_queue_data *bd)
> +static blk_status_t virtblk_prep_rq(struct blk_mq_hw_ctx *hctx,
> + struct virtio_blk *vblk,
> + struct request *req,
> + struct virtblk_req *vbr)
>  {
> - struct virtio_blk *vblk = hctx->queue->queuedata;
> - struct request *req = bd->rq;
> - struct virtblk_req *vbr = blk_mq_rq_to_pdu(req);
> - unsigned long flags;
> - int num;
> - int qid = hctx->queue_num;
> - bool notify = false;
>   blk_status_t status;
> - int err;
> + int num;
>  
>   status = virtblk_setup_cmd(vblk->vdev, req, vbr);
>   if (unlikely(status))
> @@ -335,9 +331,30 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx 
> *hctx,
>   virtblk_cleanup_cmd(req);
>   return BLK_STS_RESOURCE;
>   }
> + vbr->sg_num = num;

This can go into the nents field of vbr->sg_table.

> + int err;
> +
> + status = virtblk_prep_rq(hctx, vblk, req, vbr);
> + if (unlikely(status))
> + return status;
>  
>   spin_lock_irqsave(&vblk->vqs[qid].lock, flags);
> - err = virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg_table.sgl, num);
> + err = virtblk_add_req(vblk->vqs[qid].vq, vbr,
> + vbr->sg_table.sgl, vbr->sg_num);

And while we're at it - virtblk_add_req can lose the data_sg and
have_data arguments as they can be derived from vbr.


Re: [PATCH v4 1/2] virtio-blk: support polling I/O

2022-04-05 Thread Christoph Hellwig
On Tue, Apr 05, 2022 at 02:31:21PM +0900, Suwan Kim wrote:
> This patch supports polling I/O via virtio-blk driver. Polling
> feature is enabled by module parameter "num_poll_queues" and it
> sets dedicated polling queues for virtio-blk. This patch improves
> the polling I/O throughput and latency.
> 
> The virtio-blk driver does not have a poll function and a poll
> queue and it has been operating in interrupt driven method even if
> the polling function is called in the upper layer.
> 
> virtio-blk polling is implemented upon 'batched completion' of block
> layer. virtblk_poll() queues completed request to io_comp_batch->req_list
> and later, virtblk_complete_batch() calls unmap function and ends
> the requests in batch.
> 
> virtio-blk reads the number of poll queues from module parameter
> "num_poll_queues". If VM sets queue parameter as below,
> ("num-queues=N" [QEMU property], "num_poll_queues=M" [module parameter])
> It allocates N virtqueues to virtio_blk->vqs[N] and it uses [0..(N-M-1)]
> as default queues and [(N-M)..(N-1)] as poll queues. Unlike the default
> queues, the poll queues have no callback function.
> 
> Regarding HW-SW queue mapping, the default queue mapping uses the
> existing method that considers the MSI irq vector. But the poll queue
> doesn't have an irq, so it uses the regular blk-mq cpu mapping.
> 
> For verifying the improvement, I did Fio polling I/O performance test
> with io_uring engine with the options below.
> (io_uring, hipri, randread, direct=1, bs=512, iodepth=64 numjobs=N)
> I set 4 vcpu and 4 virtio-blk queues - 2 default queues and 2 poll
> queues for VM.
> 
> As a result, IOPS and average latency improved about 10%.
> 
> Test result:
> 
> - Fio io_uring poll without virtio-blk poll support
>   -- numjobs=1 : IOPS = 339K, avg latency = 188.33us
>   -- numjobs=2 : IOPS = 367K, avg latency = 347.33us
>   -- numjobs=4 : IOPS = 383K, avg latency = 682.06us
> 
> - Fio io_uring poll with virtio-blk poll support
>   -- numjobs=1 : IOPS = 385K, avg latency = 165.94us
>   -- numjobs=2 : IOPS = 408K, avg latency = 313.28us
>   -- numjobs=4 : IOPS = 424K, avg latency = 613.05us
> 
> Signed-off-by: Suwan Kim 
> ---
>  drivers/block/virtio_blk.c | 112 +++--
>  1 file changed, 108 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
> index 8c415be86732..712579dcd3cc 100644
> --- a/drivers/block/virtio_blk.c
> +++ b/drivers/block/virtio_blk.c
> @@ -37,6 +37,10 @@ MODULE_PARM_DESC(num_request_queues,
>"0 for no limit. "
>"Values > nr_cpu_ids truncated to nr_cpu_ids.");
>  
> +static unsigned int poll_queues;
> +module_param(poll_queues, uint, 0644);
> +MODULE_PARM_DESC(poll_queues, "The number of dedicated virtqueues for 
> polling I/O");
> +
>  static int major;
>  static DEFINE_IDA(vd_index_ida);
>  
> @@ -81,6 +85,7 @@ struct virtio_blk {
>  
>   /* num of vqs */
>   int num_vqs;
> + int io_queues[HCTX_MAX_TYPES];
>   struct virtio_blk_vq *vqs;
>  };
>  
> @@ -548,6 +553,7 @@ static int init_vq(struct virtio_blk *vblk)
>   const char **names;
>   struct virtqueue **vqs;
>   unsigned short num_vqs;
> + unsigned int num_poll_vqs;
>   struct virtio_device *vdev = vblk->vdev;
>   struct irq_affinity desc = { 0, };
>  
> @@ -556,6 +562,7 @@ static int init_vq(struct virtio_blk *vblk)
>  &num_vqs);
>   if (err)
>   num_vqs = 1;
> +
>   if (!err && !num_vqs) {
>   dev_err(>dev, "MQ advertised but zero queues reported\n");
>   return -EINVAL;
> @@ -565,6 +572,18 @@ static int init_vq(struct virtio_blk *vblk)
>   min_not_zero(num_request_queues, nr_cpu_ids),
>   num_vqs);
>  
> + num_poll_vqs = min_t(unsigned int, poll_queues, num_vqs - 1);
> +
> + memset(vblk->io_queues, 0, sizeof(int) * HCTX_MAX_TYPES);
> + vblk->io_queues[HCTX_TYPE_DEFAULT] = num_vqs - num_poll_vqs;
> + vblk->io_queues[HCTX_TYPE_READ] = 0;
> + vblk->io_queues[HCTX_TYPE_POLL] = num_poll_vqs;
> +
> + dev_info(&vdev->dev, "%d/%d/%d default/read/poll queues\n",
> + vblk->io_queues[HCTX_TYPE_DEFAULT],
> + vblk->io_queues[HCTX_TYPE_READ],
> + vblk->io_queues[HCTX_TYPE_POLL]);
> +
>   vblk->vqs = kmalloc_array(num_vqs, sizeof(*vblk->vqs), GFP_KERNEL);
>   if (!vblk->vqs)
>   return -ENOMEM;
> @@ -578,8 +597,13 @@ static int init_vq(struct virtio_blk *vblk)
>   }
>  
>   for (i = 0; i < num_vqs; i++) {
> + if (i < num_vqs - num_poll_vqs) {
> + callbacks[i] = virtblk_done;
> + snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req.%d", i);
> + } else {
> + callbacks[i] = NULL;
> + snprintf(vblk->vqs[i].name, VQ_NAME_LEN, "req_poll.%d", i);

Re: [PATCH v3 0/4] Introduce akcipher service for virtio-crypto

2022-04-05 Thread Cornelia Huck
On Tue, Apr 05 2022, "Michael S. Tsirkin"  wrote:

> On Mon, Apr 04, 2022 at 05:39:24PM +0200, Cornelia Huck wrote:
>> On Mon, Mar 07 2022, "Michael S. Tsirkin"  wrote:
>> 
>> > On Mon, Mar 07, 2022 at 10:42:30AM +0800, zhenwei pi wrote:
>> >> Hi, Michael & Lei
>> >> 
>> >> The full patchset has been reviewed by Gonglei, thanks to Gonglei.
>> >> Should I modify the virtio crypto specification(use "__le32 
>> >> akcipher_algo;"
>> >> instead of "__le32 reserve;" only, see v1->v2 change), and start a new 
>> >> issue
>> >> for a revoting procedure?
>> >
> You can but note it probably will be deferred to 1.3. OK with you?
>> >
>> >> Also cc Cornelia Huck.
>> 
>> [Apologies, I'm horribly behind on my email backlog, and on virtio
>> things in general :(]
>> 
>> The akcipher update had been deferred for 1.2, so I think it will be 1.3
>> material. However, I just noticed while browsing the fine lwn.net merge
>> window summary that this seems to have been merged already. That
>> situation is less than ideal, although I don't expect any really bad
>> problems, given that there had not been any negative feedback for the
>> spec proposal that I remember.
>
> Let's open a 1.3 branch? What do you think?

Yes, that's probably best, before things start piling up.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v4 1/2] virtio-blk: support polling I/O

2022-04-05 Thread Stefan Hajnoczi
On Tue, Apr 05, 2022 at 02:31:21PM +0900, Suwan Kim wrote:
> +static int virtblk_poll(struct blk_mq_hw_ctx *hctx, struct io_comp_batch *iob)
> +{
> + struct virtio_blk *vblk = hctx->queue->queuedata;
> + struct virtio_blk_vq *vq = hctx->driver_data;
> + struct virtblk_req *vbr;
> + bool req_done = false;
> + unsigned long flags;
> + unsigned int len;
> + int found = 0;
> +
> + spin_lock_irqsave(&vq->lock, flags);
> +
> + while ((vbr = virtqueue_get_buf(vq->vq, &len)) != NULL) {
> + struct request *req = blk_mq_rq_from_pdu(vbr);
>  
> - return blk_mq_virtio_map_queues(&set->map[HCTX_TYPE_DEFAULT],
> - vblk->vdev, 0);
> + found++;
> + if (!blk_mq_add_to_batch(req, iob, vbr->status,
> + virtblk_complete_batch))
> + blk_mq_complete_request(req);
> + req_done = true;
> + }
> +
> + if (req_done)

Minor nit: req_done can be replaced with found > 0.

> + blk_mq_start_stopped_hw_queues(vblk->disk->queue, true);
> +
> + spin_unlock_irqrestore(&vq->lock, flags);
> +
> + return found;
> +}



Re: [PATCH 8/8] virtio_ring.h: do not include from exported header

2022-04-05 Thread Christoph Hellwig
On Tue, Apr 05, 2022 at 08:29:36AM +0200, Arnd Bergmann wrote:
> I think the users all have their own copies, at least the ones I could
> find on codesearch.debian.org. However, there are 27 virtio_*.h
> files in include/uapi/linux that probably should stay together for
> the purpose of defining the virtio protocol, and some others might
> be uapi relevant.
> 
> I see that at least include/uapi/linux/vhost.h has ioctl() definitions
> in it, and includes the virtio_ring.h header indirectly.

Uhh.  We had a similar mess (but at a smaller scale) in nvme, where
the uapi nvme.h contained both the UAPI and the protocol definition.
We took a hard break to only have a nvme_ioctl.h in the uapi header
and linux/nvme.h for the protocol.  This did break a bit of userspace
compilation (but not running obviously) at the time, but really made
the headers much easier to maintain.  Some userspace keeps on copying
nvme.h with the protocol definitions.


Re: [PATCH 8/8] virtio_ring.h: do not include from exported header

2022-04-05 Thread Arnd Bergmann
On Tue, Apr 5, 2022 at 7:35 AM Christoph Hellwig  wrote:
>
> On Mon, Apr 04, 2022 at 10:04:02AM +0200, Arnd Bergmann wrote:
> > The header is shared between kernel and other projects using virtio, such as
> > qemu and any boot loaders booting from virtio devices. It's not technically 
> > a
> > /kernel/ ABI, but it is an ABI and for practical reasons the kernel version 
> > is
> > maintained as the master copy if I understand it correctly.
>
> Besides the fact that, as you correctly state, these are not a UAPI at
> all, qemu and bootloaders are not specific to Linux and can't require a
> specific kernel version.  So the same thing we do for file system
> formats or network protocols applies here:  just copy the damn header.
> And as stated above any reasonably portable userspace needs to have a
> copy anyway.

I think the users all have their own copies, at least the ones I could
find on codesearch.debian.org. However, there are 27 virtio_*.h
files in include/uapi/linux that probably should stay together for
the purpose of defining the virtio protocol, and some others might
be uapi relevant.

I see that at least include/uapi/linux/vhost.h has ioctl() definitions
in it, and includes the virtio_ring.h header indirectly.

Adding the virtio maintainers to Cc to see if they can provide
more background on this.

> If it is just as a "master copy" it can live in drivers/virtio/, just
> like we do for other formats.

It has to be in include/linux/ at least because it's used by a number
of drivers outside of drivers/virtio/.

Arnd