Re: [PATCH RFC v6 02/11] vhost: use batched get_vq_desc version

2020-06-14 Thread Jason Wang


On 2020/6/11 5:06 PM, Michael S. Tsirkin wrote:

On Thu, Jun 11, 2020 at 11:02:57AM +0800, Jason Wang wrote:

On 2020/6/10 7:05 PM, Michael S. Tsirkin wrote:

+EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
/* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
{
+   unfetch_descs(vq);
vq->last_avail_idx -= n;

So unfetch_descs() has decreased last_avail_idx.
Can we fix this by letting unfetch_descs() return the number and then we can
do:

int d = unfetch_descs(vq);
vq->last_avail_idx -= (n > d) ? n - d: 0;

Thanks

That's intentional I think - we need both.


Yes, but:



Unfetch_descs drops the descriptors in the cache that were
*not returned to the caller* through get_vq_desc.

vhost_discard_vq_desc drops the ones that were returned through get_vq_desc.

Did I miss anything?

We could count some descriptors twice. Consider the case where, e.g., we only
cache one descriptor:

fetch_descs()
     fetch_buf()
         last_avail_idx++;

Then we want to discard it:
vhost_discard_avail_buf(1)
     unfetch_descs()
         last_avail_idx--;
     last_avail_idx -= 1;

Thanks


I don't think that happens. vhost_discard_avail_buf(1) is only called
after vhost_get_avail_buf. vhost_get_avail_buf increments
first_desc.  unfetch_descs only counts from first_desc to ndescs.

If I'm wrong, could you show values of first_desc and ndescs in this
scenario?



You're right, fetch_descs() can't be called directly from the
device code.


Thanks
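
For readers following the exchange, below is a minimal user-space model of the
bookkeeping described above. It is an illustration only, not the kernel code:
the struct and helpers are simplified stand-ins, it assumes one descriptor per
buffer (the real unfetch_descs() walks descriptor chains), and only the field
names (last_avail_idx, first_desc, ndescs) are taken from the patch.

#include <stdio.h>

struct vq_model {
	int last_avail_idx;	/* avail ring position, advanced when descriptors are fetched */
	int first_desc;		/* next cached descriptor to hand to the caller */
	int ndescs;		/* number of descriptors currently cached */
};

/* Stand-in for fetch_descs()/fetch_buf(): cache one buffer, advance the ring. */
static void fetch_one(struct vq_model *vq)
{
	vq->ndescs++;
	vq->last_avail_idx++;
}

/* Stand-in for vhost_get_avail_buf(): hand one cached buffer to the caller. */
static void get_one(struct vq_model *vq)
{
	vq->first_desc++;
}

/* Stand-in for unfetch_descs(): rewind only the cached-but-not-returned range. */
static void unfetch_descs(struct vq_model *vq)
{
	vq->last_avail_idx -= vq->ndescs - vq->first_desc;
	vq->ndescs = vq->first_desc = 0;
}

/* Stand-in for vhost_discard_avail_buf(n): also give back n returned buffers. */
static void discard(struct vq_model *vq, int n)
{
	unfetch_descs(vq);
	vq->last_avail_idx -= n;
}

int main(void)
{
	struct vq_model vq = { 0, 0, 0 };

	fetch_one(&vq);		/* cache one descriptor: last_avail_idx = 1 */
	get_one(&vq);		/* caller takes it: first_desc == ndescs == 1 */
	discard(&vq, 1);	/* unfetch rewinds nothing, discard rewinds 1 */

	/* Prints 0, not -1: no entry is counted twice because the device
	 * only ever discards buffers it obtained via get_one(). */
	printf("last_avail_idx = %d\n", vq.last_avail_idx);
	return 0;
}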







Re: [PATCH RFC v6 02/11] vhost: use batched get_vq_desc version

2020-06-11 Thread Michael S. Tsirkin
On Thu, Jun 11, 2020 at 11:02:57AM +0800, Jason Wang wrote:
> 
> On 2020/6/10 7:05 PM, Michael S. Tsirkin wrote:
> > > > +EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
> > > >/* Reverse the effect of vhost_get_vq_desc. Useful for error 
> > > > handling. */
> > > >void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
> > > >{
> > > > +   unfetch_descs(vq);
> > > > vq->last_avail_idx -= n;
> > > So unfetch_descs() has decreased last_avail_idx.
> > > Can we fix this by letting unfetch_descs() return the number and then we 
> > > can
> > > do:
> > > 
> > > int d = unfetch_descs(vq);
> > > vq->last_avail_idx -= (n > d) ? n - d: 0;
> > > 
> > > Thanks
> > That's intentional I think - we need both.
> 
> 
> Yes, but:
> 
> 
> > 
> > Unfetch_descs drops the descriptors in the cache that were
> > *not returned to the caller* through get_vq_desc.
> > 
> > vhost_discard_vq_desc drops the ones that were returned through get_vq_desc.
> > 
> > Did I miss anything?
> 
> We could count some descriptors twice. Consider the case where, e.g., we only
> cache one descriptor:
> 
> fetch_descs()
>     fetch_buf()
>         last_avail_idx++;
> 
> Then we want to discard it:
> vhost_discard_avail_buf(1)
>     unfetch_descs()
>         last_avail_idx--;
>     last_avail_idx -= 1;
> 
> Thanks


I don't think that happens. vhost_discard_avail_buf(1) is only called
after vhost_get_avail_buf. vhost_get_avail_buf increments
first_desc.  unfetch_descs only counts from first_desc to ndescs.

If I'm wrong, could you show values of first_desc and ndescs in this
scenario?

-- 
MST


Re: [PATCH RFC v6 02/11] vhost: use batched get_vq_desc version

2020-06-10 Thread Jason Wang


On 2020/6/10 7:05 PM, Michael S. Tsirkin wrote:

+EXPORT_SYMBOL_GPL(vhost_get_vq_desc);
   /* Reverse the effect of vhost_get_vq_desc. Useful for error handling. */
   void vhost_discard_vq_desc(struct vhost_virtqueue *vq, int n)
   {
+   unfetch_descs(vq);
vq->last_avail_idx -= n;

So unfetch_descs() has decreased last_avail_idx.
Can we fix this by letting unfetch_descs() return the number and then we can
do:

int d = unfetch_descs(vq);
vq->last_avail_idx -= (n > d) ? n - d: 0;

Thanks

That's intentional I think - we need both.



Yes, but:




Unfetch_descs drops the descriptors in the cache that were
*not returned to the caller* through get_vq_desc.

vhost_discard_vq_desc drops the ones that were returned through get_vq_desc.

Did I miss anything?


We could count some descriptors twice. Consider the case where, e.g., we only
cache one descriptor:


fetch_descs()
    fetch_buf()
        last_avail_idx++;

Then we want to discard it:
vhost_discard_avail_buf(1)
    unfetch_descs()
        last_avail_idx--;
    last_avail_idx -= 1;

Thanks
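
For comparison, here is a sketch of the variant proposed above, in which
unfetch_descs() reports how many avail-ring entries it rewound so the discard
path only subtracts the remainder. This is hypothetical code, not part of the
posted series; it reuses the toy vq_model struct from the sketch earlier in
this thread view and keeps the one-descriptor-per-buffer simplification.

/* Hypothetical: like unfetch_descs(), but return the number of entries rewound. */
static int unfetch_descs_counting(struct vq_model *vq)
{
	int d = vq->ndescs - vq->first_desc;

	vq->last_avail_idx -= d;
	vq->ndescs = vq->first_desc = 0;
	return d;
}

/* Hypothetical discard that avoids double-subtracting what was just rewound. */
static void discard_proposed(struct vq_model *vq, int n)
{
	int d = unfetch_descs_counting(vq);

	vq->last_avail_idx -= (n > d) ? n - d : 0;
}

Note that in the call order Michael describes, the two subtractions cover
disjoint descriptors (cached-but-unreturned vs. already returned), so clamping
to (n > d) ? n - d : 0 would rewind too little whenever unreturned descriptors
are still cached; the double count Jason worried about cannot occur in the
first place.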


Re: [PATCH RFC v6 02/11] vhost: use batched get_vq_desc version

2020-06-10 Thread Michael S. Tsirkin
On Wed, Jun 10, 2020 at 11:14:49AM +0800, Jason Wang wrote:
> 
> On 2020/6/8 8:52 PM, Michael S. Tsirkin wrote:
> > As testing shows no performance change, switch to that now.
> > 
> > Signed-off-by: Michael S. Tsirkin 
> > Signed-off-by: Eugenio Pérez 
> > Link: https://lore.kernel.org/r/20200401183118.8334-3-epere...@redhat.com
> > Signed-off-by: Michael S. Tsirkin 
> > ---
> >   drivers/vhost/test.c  |   2 +-
> >   drivers/vhost/vhost.c | 318 --
> >   drivers/vhost/vhost.h |   7 +-
> >   3 files changed, 65 insertions(+), 262 deletions(-)
> > 
> > diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
> > index 0466921f4772..7d69778aaa26 100644
> > --- a/drivers/vhost/test.c
> > +++ b/drivers/vhost/test.c
> > @@ -119,7 +119,7 @@ static int vhost_test_open(struct inode *inode, struct 
> > file *f)
> > dev = &n->dev;
> > vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
> > n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
> > -   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV,
> > +   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV + 64,
> >VHOST_TEST_PKT_WEIGHT, VHOST_TEST_WEIGHT, true, NULL);
> > f->private_data = n;
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index 180b7b58c76b..41d6b132c234 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -304,6 +304,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
> >   {
> > vq->num = 1;
> > vq->ndescs = 0;
> > +   vq->first_desc = 0;
> > vq->desc = NULL;
> > vq->avail = NULL;
> > vq->used = NULL;
> > @@ -372,6 +373,11 @@ static int vhost_worker(void *data)
> > return 0;
> >   }
> > +static int vhost_vq_num_batch_descs(struct vhost_virtqueue *vq)
> > +{
> > +   return vq->max_descs - UIO_MAXIOV;
> > +}
> > +
> >   static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
> >   {
> > kfree(vq->descs);
> > @@ -394,6 +400,9 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev 
> > *dev)
> > for (i = 0; i < dev->nvqs; ++i) {
> > vq = dev->vqs[i];
> > vq->max_descs = dev->iov_limit;
> > +   if (vhost_vq_num_batch_descs(vq) < 0) {
> > +   return -EINVAL;
> > +   }
> > vq->descs = kmalloc_array(vq->max_descs,
> >   sizeof(*vq->descs),
> >   GFP_KERNEL);
> > @@ -1610,6 +1619,7 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned 
> > int ioctl, void __user *arg
> > vq->last_avail_idx = s.num;
> > /* Forget the cached index value. */
> > vq->avail_idx = vq->last_avail_idx;
> > +   vq->ndescs = vq->first_desc = 0;
> > break;
> > case VHOST_GET_VRING_BASE:
> > s.index = idx;
> > @@ -2078,253 +2088,6 @@ static unsigned next_desc(struct vhost_virtqueue 
> > *vq, struct vring_desc *desc)
> > return next;
> >   }
> > -static int get_indirect(struct vhost_virtqueue *vq,
> > -   struct iovec iov[], unsigned int iov_size,
> > -   unsigned int *out_num, unsigned int *in_num,
> > -   struct vhost_log *log, unsigned int *log_num,
> > -   struct vring_desc *indirect)
> > -{
> > -   struct vring_desc desc;
> > -   unsigned int i = 0, count, found = 0;
> > -   u32 len = vhost32_to_cpu(vq, indirect->len);
> > -   struct iov_iter from;
> > -   int ret, access;
> > -
> > -   /* Sanity check */
> > -   if (unlikely(len % sizeof desc)) {
> > -   vq_err(vq, "Invalid length in indirect descriptor: "
> > -  "len 0x%llx not multiple of 0x%zx\n",
> > -  (unsigned long long)len,
> > -  sizeof desc);
> > -   return -EINVAL;
> > -   }
> > -
> > -   ret = translate_desc(vq, vhost64_to_cpu(vq, indirect->addr), len, 
> > vq->indirect,
> > -UIO_MAXIOV, VHOST_ACCESS_RO);
> > -   if (unlikely(ret < 0)) {
> > -   if (ret != -EAGAIN)
> > -   vq_err(vq, "Translation failure %d in indirect.\n", 
> > ret);
> > -   return ret;
> > -   }
> > -   iov_iter_init(&from, READ, vq->indirect, ret, len);
> > -
> > -   /* We will use the result as an address to read from, so most
> > -* architectures only need a compiler barrier here. */
> > -   read_barrier_depends();
> > -
> > -   count = len / sizeof desc;
> > -   /* Buffers are chained via a 16 bit next field, so
> > -* we can have at most 2^16 of these. */
> > -   if (unlikely(count > USHRT_MAX + 1)) {
> > -   vq_err(vq, "Indirect buffer length too big: %d\n",
> > -  indirect->len);
> > -   return -E2BIG;
> > -   }
> > -
> > -   do {
> > -   unsigned iov_count = *in_num + *out_num;
> > -   if (unlikely(++found > count)) {
> > -   vq_err(vq, "Loop detected: last one at %u "
> > -  "indirect size %u\n",
> > -  

Re: [PATCH RFC v6 02/11] vhost: use batched get_vq_desc version

2020-06-09 Thread Jason Wang


On 2020/6/8 8:52 PM, Michael S. Tsirkin wrote:

As testing shows no performance change, switch to that now.

Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Eugenio Pérez 
Link: https://lore.kernel.org/r/20200401183118.8334-3-epere...@redhat.com
Signed-off-by: Michael S. Tsirkin 
---
  drivers/vhost/test.c  |   2 +-
  drivers/vhost/vhost.c | 318 --
  drivers/vhost/vhost.h |   7 +-
  3 files changed, 65 insertions(+), 262 deletions(-)

diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 0466921f4772..7d69778aaa26 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -119,7 +119,7 @@ static int vhost_test_open(struct inode *inode, struct file 
*f)
dev = &n->dev;
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
-   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV,
+   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV + 64,
   VHOST_TEST_PKT_WEIGHT, VHOST_TEST_WEIGHT, true, NULL);
  
  	f->private_data = n;

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 180b7b58c76b..41d6b132c234 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -304,6 +304,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
  {
vq->num = 1;
vq->ndescs = 0;
+   vq->first_desc = 0;
vq->desc = NULL;
vq->avail = NULL;
vq->used = NULL;
@@ -372,6 +373,11 @@ static int vhost_worker(void *data)
return 0;
  }
  
+static int vhost_vq_num_batch_descs(struct vhost_virtqueue *vq)

+{
+   return vq->max_descs - UIO_MAXIOV;
+}
+
  static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
  {
kfree(vq->descs);
@@ -394,6 +400,9 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
for (i = 0; i < dev->nvqs; ++i) {
vq = dev->vqs[i];
vq->max_descs = dev->iov_limit;
+   if (vhost_vq_num_batch_descs(vq) < 0) {
+   return -EINVAL;
+   }
vq->descs = kmalloc_array(vq->max_descs,
  sizeof(*vq->descs),
  GFP_KERNEL);
@@ -1610,6 +1619,7 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int 
ioctl, void __user *arg
vq->last_avail_idx = s.num;
/* Forget the cached index value. */
vq->avail_idx = vq->last_avail_idx;
+   vq->ndescs = vq->first_desc = 0;
break;
case VHOST_GET_VRING_BASE:
s.index = idx;
@@ -2078,253 +2088,6 @@ static unsigned next_desc(struct vhost_virtqueue *vq, 
struct vring_desc *desc)
return next;
  }
  
-static int get_indirect(struct vhost_virtqueue *vq,

-   struct iovec iov[], unsigned int iov_size,
-   unsigned int *out_num, unsigned int *in_num,
-   struct vhost_log *log, unsigned int *log_num,
-   struct vring_desc *indirect)
-{
-   struct vring_desc desc;
-   unsigned int i = 0, count, found = 0;
-   u32 len = vhost32_to_cpu(vq, indirect->len);
-   struct iov_iter from;
-   int ret, access;
-
-   /* Sanity check */
-   if (unlikely(len % sizeof desc)) {
-   vq_err(vq, "Invalid length in indirect descriptor: "
-  "len 0x%llx not multiple of 0x%zx\n",
-  (unsigned long long)len,
-  sizeof desc);
-   return -EINVAL;
-   }
-
-   ret = translate_desc(vq, vhost64_to_cpu(vq, indirect->addr), len, 
vq->indirect,
-UIO_MAXIOV, VHOST_ACCESS_RO);
-   if (unlikely(ret < 0)) {
-   if (ret != -EAGAIN)
-   vq_err(vq, "Translation failure %d in indirect.\n", 
ret);
-   return ret;
-   }
-   iov_iter_init(&from, READ, vq->indirect, ret, len);
-
-   /* We will use the result as an address to read from, so most
-* architectures only need a compiler barrier here. */
-   read_barrier_depends();
-
-   count = len / sizeof desc;
-   /* Buffers are chained via a 16 bit next field, so
-* we can have at most 2^16 of these. */
-   if (unlikely(count > USHRT_MAX + 1)) {
-   vq_err(vq, "Indirect buffer length too big: %d\n",
-  indirect->len);
-   return -E2BIG;
-   }
-
-   do {
-   unsigned iov_count = *in_num + *out_num;
-   if (unlikely(++found > count)) {
-   vq_err(vq, "Loop detected: last one at %u "
-  "indirect size %u\n",
-  i, count);
-   return -EINVAL;
-   }
-   if (unlikely(!copy_from_iter_full(&desc, sizeof(desc), &from))) 
{
-   vq_err(vq, "Failed indirect descriptor: idx %d, %zx\n",
- 

[PATCH RFC v6 02/11] vhost: use batched get_vq_desc version

2020-06-08 Thread Michael S. Tsirkin
As testing shows no performance change, switch to that now.

Signed-off-by: Michael S. Tsirkin 
Signed-off-by: Eugenio Pérez 
Link: https://lore.kernel.org/r/20200401183118.8334-3-epere...@redhat.com
Signed-off-by: Michael S. Tsirkin 
---
 drivers/vhost/test.c  |   2 +-
 drivers/vhost/vhost.c | 318 --
 drivers/vhost/vhost.h |   7 +-
 3 files changed, 65 insertions(+), 262 deletions(-)

diff --git a/drivers/vhost/test.c b/drivers/vhost/test.c
index 0466921f4772..7d69778aaa26 100644
--- a/drivers/vhost/test.c
+++ b/drivers/vhost/test.c
@@ -119,7 +119,7 @@ static int vhost_test_open(struct inode *inode, struct file 
*f)
dev = &n->dev;
vqs[VHOST_TEST_VQ] = &n->vqs[VHOST_TEST_VQ];
n->vqs[VHOST_TEST_VQ].handle_kick = handle_vq_kick;
-   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV,
+   vhost_dev_init(dev, vqs, VHOST_TEST_VQ_MAX, UIO_MAXIOV + 64,
   VHOST_TEST_PKT_WEIGHT, VHOST_TEST_WEIGHT, true, NULL);
 
f->private_data = n;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 180b7b58c76b..41d6b132c234 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -304,6 +304,7 @@ static void vhost_vq_reset(struct vhost_dev *dev,
 {
vq->num = 1;
vq->ndescs = 0;
+   vq->first_desc = 0;
vq->desc = NULL;
vq->avail = NULL;
vq->used = NULL;
@@ -372,6 +373,11 @@ static int vhost_worker(void *data)
return 0;
 }
 
+static int vhost_vq_num_batch_descs(struct vhost_virtqueue *vq)
+{
+   return vq->max_descs - UIO_MAXIOV;
+}
+
 static void vhost_vq_free_iovecs(struct vhost_virtqueue *vq)
 {
kfree(vq->descs);
@@ -394,6 +400,9 @@ static long vhost_dev_alloc_iovecs(struct vhost_dev *dev)
for (i = 0; i < dev->nvqs; ++i) {
vq = dev->vqs[i];
vq->max_descs = dev->iov_limit;
+   if (vhost_vq_num_batch_descs(vq) < 0) {
+   return -EINVAL;
+   }
vq->descs = kmalloc_array(vq->max_descs,
  sizeof(*vq->descs),
  GFP_KERNEL);
@@ -1610,6 +1619,7 @@ long vhost_vring_ioctl(struct vhost_dev *d, unsigned int 
ioctl, void __user *arg
vq->last_avail_idx = s.num;
/* Forget the cached index value. */
vq->avail_idx = vq->last_avail_idx;
+   vq->ndescs = vq->first_desc = 0;
break;
case VHOST_GET_VRING_BASE:
s.index = idx;
@@ -2078,253 +2088,6 @@ static unsigned next_desc(struct vhost_virtqueue *vq, 
struct vring_desc *desc)
return next;
 }
 
-static int get_indirect(struct vhost_virtqueue *vq,
-   struct iovec iov[], unsigned int iov_size,
-   unsigned int *out_num, unsigned int *in_num,
-   struct vhost_log *log, unsigned int *log_num,
-   struct vring_desc *indirect)
-{
-   struct vring_desc desc;
-   unsigned int i = 0, count, found = 0;
-   u32 len = vhost32_to_cpu(vq, indirect->len);
-   struct iov_iter from;
-   int ret, access;
-
-   /* Sanity check */
-   if (unlikely(len % sizeof desc)) {
-   vq_err(vq, "Invalid length in indirect descriptor: "
-  "len 0x%llx not multiple of 0x%zx\n",
-  (unsigned long long)len,
-  sizeof desc);
-   return -EINVAL;
-   }
-
-   ret = translate_desc(vq, vhost64_to_cpu(vq, indirect->addr), len, 
vq->indirect,
-UIO_MAXIOV, VHOST_ACCESS_RO);
-   if (unlikely(ret < 0)) {
-   if (ret != -EAGAIN)
-   vq_err(vq, "Translation failure %d in indirect.\n", 
ret);
-   return ret;
-   }
-   iov_iter_init(&from, READ, vq->indirect, ret, len);
-
-   /* We will use the result as an address to read from, so most
-* architectures only need a compiler barrier here. */
-   read_barrier_depends();
-
-   count = len / sizeof desc;
-   /* Buffers are chained via a 16 bit next field, so
-* we can have at most 2^16 of these. */
-   if (unlikely(count > USHRT_MAX + 1)) {
-   vq_err(vq, "Indirect buffer length too big: %d\n",
-  indirect->len);
-   return -E2BIG;
-   }
-
-   do {
-   unsigned iov_count = *in_num + *out_num;
-   if (unlikely(++found > count)) {
-   vq_err(vq, "Loop detected: last one at %u "
-  "indirect size %u\n",
-  i, count);
-   return -EINVAL;
-   }
-   if (unlikely(!copy_from_iter_full(&desc, sizeof(desc), &from))) 
{
-   vq_err(vq, "Failed indirect descriptor: idx %d, %zx\n",
-  i, (size_t)vhost64_to_cpu(vq,