Re: [PATCH v12] i2c: virtio: add a virtio i2c frontend driver

2021-07-04 Thread Jie Deng



On 2021/7/5 10:40, Viresh Kumar wrote:

I think we missed the first deadline for 5.14-rc1 as Wolfram should be out of
office now. Anyway lets make sure we fix all the pending bits before he is back
and see if we can still pull it off by rc2.

On 02-07-21, 16:46, Jie Deng wrote:

diff --git a/drivers/i2c/busses/i2c-virtio.c b/drivers/i2c/busses/i2c-virtio.c
+static int virtio_i2c_send_reqs(struct virtqueue *vq,

It would be better to rename this to virtio_i2c_prepare_reqs instead, as this
doesn't send anything.



That's a better name. I'm OK with that.





+   struct virtio_i2c_req *reqs,
+   struct i2c_msg *msgs, int nr)
+{
+   struct scatterlist *sgs[3], out_hdr, msg_buf, in_hdr;
+   int i, outcnt, incnt, err = 0;
+
+   for (i = 0; i < nr; i++) {
+   /*
+* Only 7-bit mode supported for this moment. For the address 
format,
+* Please check the Virtio I2C Specification.
+*/
+   reqs[i].out_hdr.addr = cpu_to_le16(msgs[i].addr << 1);
+
+   if (i != nr - 1)
+   reqs[i].out_hdr.flags = 
cpu_to_le32(VIRTIO_I2C_FLAGS_FAIL_NEXT);
+
+   outcnt = incnt = 0;
+   sg_init_one(_hdr, [i].out_hdr, 
sizeof(reqs[i].out_hdr));
+   sgs[outcnt++] = _hdr;
+
+   if (msgs[i].len) {
+   reqs[i].buf = i2c_get_dma_safe_msg_buf([i], 1);
+   if (!reqs[i].buf)
+   break;
+
+   sg_init_one(_buf, reqs[i].buf, msgs[i].len);
+
+   if (msgs[i].flags & I2C_M_RD)
+   sgs[outcnt + incnt++] = _buf;
+   else
+   sgs[outcnt++] = _buf;
+   }
+
+   sg_init_one(_hdr, [i].in_hdr, sizeof(reqs[i].in_hdr));
+   sgs[outcnt + incnt++] = _hdr;
+
+   err = virtqueue_add_sgs(vq, sgs, outcnt, incnt, [i], 
GFP_KERNEL);
+   if (err < 0) {
+   i2c_put_dma_safe_msg_buf(reqs[i].buf, [i], false);
+   break;
+   }
+   }
+
+   return i;
+}

The right way of doing this is is making this function return - Error on failure
and 0 on success. There is no point returning number of successful additions
here.



We need the number for virtio_i2c_complete_reqs to do cleanup. We don't 
have to


do cleanup "num" times every time. Just do it as needed.




Moreover, on failures this needs to clean up (free the dmabufs) itself, just
like you did i2c_put_dma_safe_msg_buf() at the end. The caller shouldn't be
required to handle the error cases by freeing up resources.



This function will return the number of requests being successfully 
prepared and make sure


resources of the failed request being freed. And 
virtio_i2c_complete_reqs will free the


resources of those successful request.



+static int virtio_i2c_complete_reqs(struct virtqueue *vq,
+   struct virtio_i2c_req *reqs,
+   struct i2c_msg *msgs, int nr,
+   bool fail)
+{
+   struct virtio_i2c_req *req;
+   bool failed = fail;
+   unsigned int len;
+   int i, j = 0;
+
+   for (i = 0; i < nr; i++) {
+   /* Detach the ith request from the vq */
+   req = virtqueue_get_buf(vq, );
+
+   /*
+* Condition (req && req == [i]) should always meet since
+* we have total nr requests in the vq.
+*/
+   if (!failed && (WARN_ON(!(req && req == [i])) ||
+   (req->in_hdr.status != VIRTIO_I2C_MSG_OK)))

What about writing this as:

if (!failed && (WARN_ON(req != [i]) ||
(req->in_hdr.status != VIRTIO_I2C_MSG_OK)))

We don't need to check req here since if req is NULL, we will not do req->in_hdr
at all.



It's right here just because the [i] will never be NULL in our 
case. But if you see


"virtio_i2c_complete_reqs" as an independent function, you need to check the

req. From the perspective of the callee, you can't ask the caller always 
give you


the non-NULL parameters. And some tools may give warnings.



+   failed = true;
+
+   i2c_put_dma_safe_msg_buf(reqs[i].buf, [i], !failed);
+   if (!failed)
+   ++j;
+   }
+
+   return (fail ? -ETIMEDOUT : j);
+}
+

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v12] i2c: virtio: add a virtio i2c frontend driver

2021-07-04 Thread Jie Deng



On 2021/7/5 10:43, Viresh Kumar wrote:

On 02-07-21, 12:58, Andy Shevchenko wrote:

On Fri, Jul 02, 2021 at 04:46:47PM +0800, Jie Deng wrote:

+static int virtio_i2c_complete_reqs(struct virtqueue *vq,
+   struct virtio_i2c_req *reqs,
+   struct i2c_msg *msgs, int nr,
+   bool fail)
+{
+   struct virtio_i2c_req *req;
+   bool failed = fail;

Jie, you can actually get rid of this variable too. Jut rename fail to failed
and everything shall work as you want.


Sure.

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH V2 5/6] virtio: add one field into virtio_device for recording if device uses managed irq

2021-07-04 Thread Ming Lei
On Fri, Jul 02, 2021 at 11:55:40AM -0400, Michael S. Tsirkin wrote:
> On Fri, Jul 02, 2021 at 11:05:54PM +0800, Ming Lei wrote:
> > blk-mq needs to know if the device uses managed irq, so add one field
> > to virtio_device for recording if device uses managed irq.
> > 
> > If the driver use managed irq, this flag has to be set so it can be
> > passed to blk-mq.
> > 
> > Cc: "Michael S. Tsirkin" 
> > Cc: Jason Wang 
> > Cc: virtualization@lists.linux-foundation.org
> > Signed-off-by: Ming Lei 
> 
> 
> The API seems somewhat confusing. virtio does not request
> a managed irq as such, does it? I think it's a decision taken
> by the irq core.

vp_request_msix_vectors():

if (desc) {
flags |= PCI_IRQ_AFFINITY;
desc->pre_vectors++; /* virtio config vector */
vdev->use_managed_irq = true;
}

err = pci_alloc_irq_vectors_affinity(vp_dev->pci_dev, nvectors,
 nvectors, flags, desc);

Managed irq is used once PCI_IRQ_AFFINITY is passed to
pci_alloc_irq_vectors_affinity().

> 
> Any way to query the irq to find out if it's managed?

So far the managed info isn't exported via /proc/irq, but if one irq is
managed, its smp_affinity & smp_affinity_list attributes can't be
changed successfully.

If irq's debugfs is enabled, this info can be retrieved.


Thanks,
Ming

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v8 10/10] Documentation: Add documentation for VDUSE

2021-07-04 Thread Jason Wang


在 2021/7/4 下午5:49, Yongji Xie 写道:

OK, I get you now. Since the VIRTIO specification says "Device
configuration space is generally used for rarely-changing or
initialization-time parameters". I assume the VDUSE_DEV_SET_CONFIG
ioctl should not be called frequently.

The spec uses MUST and other terms to define the precise requirements.
Here the language (especially the word "generally") is weaker and means
there may be exceptions.

Another type of access that doesn't work with the VDUSE_DEV_SET_CONFIG
approach is reads that have side-effects. For example, imagine a field
containing an error code if the device encounters a problem unrelated to
a specific virtqueue request. Reading from this field resets the error
code to 0, saving the driver an extra configuration space write access
and possibly race conditions. It isn't possible to implement those
semantics suing VDUSE_DEV_SET_CONFIG. It's another corner case, but it
makes me think that the interface does not allow full VIRTIO semantics.



Note that though you're correct, my understanding is that config space 
is not suitable for this kind of error propagating. And it would be very 
hard to implement such kind of semantic in some transports.  Virtqueue 
should be much better. As Yong Ji quoted, the config space is used for 
"rarely-changing or intialization-time parameters".




Agreed. I will use VDUSE_DEV_GET_CONFIG in the next version. And to
handle the message failure, I'm going to add a return value to
virtio_config_ops.get() and virtio_cread_* API so that the error can
be propagated to the virtio device driver. Then the virtio-blk device
driver can be modified to handle that.

Jason and Stefan, what do you think of this way?



I'd like to stick to the current assumption thich get_config won't fail. 
That is to say,


1) maintain a config in the kernel, make sure the config space read can 
always succeed

2) introduce an ioctl for the vduse usersapce to update the config space.
3) we can synchronize with the vduse userspace during set_config

Does this work?

Thanks






___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH] vhost-vdpa: mark vhost device invalid to reflect vdpa device unregistration

2021-07-04 Thread Michael S. Tsirkin
On Mon, Jul 05, 2021 at 02:22:04AM +0530, gautam.da...@xilinx.com wrote:
> From: Gautam Dawar 
> 
> As mentioned in Bug 213179, any malicious user-space application can render
> a module registering a vDPA device to hang forever. This will typically
> surface when vdpa_device_unregister() is called from the function
> responsible for module unload, leading rmmod commands to not return.
> 
> This patch unblocks the caller module by continuing with the clean-up
> but after marking the vhost device as unavailable. For future requests
> from user-space application, the vhost device availability is checked
> first and if it has gone unavailable, such requests are denied.
> 
> Signed-off-by: Gautam Dawar 


I don't seem mappings handled below. Did I miss it?

> ---
>  drivers/vhost/vdpa.c | 45 ++--
>  1 file changed, 39 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/vhost/vdpa.c b/drivers/vhost/vdpa.c
> index e4b7d26649d8..623bc7f0c0ca 100644
> --- a/drivers/vhost/vdpa.c
> +++ b/drivers/vhost/vdpa.c
> @@ -47,6 +47,7 @@ struct vhost_vdpa {
>   int minor;
>   struct eventfd_ctx *config_ctx;
>   int in_batch;
> + int dev_invalid;
>   struct vdpa_iova_range range;
>  };
>  
> @@ -61,6 +62,11 @@ static void handle_vq_kick(struct vhost_work *work)
>   struct vhost_vdpa *v = container_of(vq->dev, struct vhost_vdpa, vdev);
>   const struct vdpa_config_ops *ops = v->vdpa->config;
>  
> + if (v->dev_invalid) {
> + dev_info(>dev,
> +  "%s: vhost_vdpa device unavailable\n", __func__);
> + return;
> + }
>   ops->kick_vq(v->vdpa, vq - v->vqs);
>  }
>  
> @@ -120,6 +126,11 @@ static void vhost_vdpa_reset(struct vhost_vdpa *v)
>  {
>   struct vdpa_device *vdpa = v->vdpa;
>  
> + if (v->dev_invalid) {
> + dev_info(>dev,
> +  "%s: vhost_vdpa device unavailable\n", __func__);
> + return;
> + }
>   vdpa_reset(vdpa);
>   v->in_batch = 0;
>  }
> @@ -367,6 +378,11 @@ static long vhost_vdpa_vring_ioctl(struct vhost_vdpa *v, 
> unsigned int cmd,
>   u32 idx;
>   long r;
>  
> + if (v->dev_invalid) {
> + dev_info(>dev,
> +  "%s: vhost_vdpa device unavailable\n", __func__);
> + return -ENODEV;
> + }
>   r = get_user(idx, (u32 __user *)argp);
>   if (r < 0)
>   return r;
> @@ -450,6 +466,11 @@ static long vhost_vdpa_unlocked_ioctl(struct file *filep,
>   return 0;
>   }
>  
> + if (v->dev_invalid) {
> + dev_info(>dev,
> +  "%s: vhost_vdpa device unavailable\n", __func__);
> + return -ENODEV;
> + }
>   mutex_lock(>mutex);
>  
>   switch (cmd) {
> @@ -745,8 +766,13 @@ static int vhost_vdpa_process_iotlb_msg(struct vhost_dev 
> *dev,
>   const struct vdpa_config_ops *ops = vdpa->config;
>   int r = 0;
>  
> - mutex_lock(>mutex);
> + if (v->dev_invalid) {
> + dev_info(>dev,
> +  "%s: vhost_vdpa device unavailable\n", __func__);
> + return -ENODEV;
> + }
>  
> + mutex_lock(>mutex);
>   r = vhost_dev_check_owner(dev);
>   if (r)
>   goto unlock;
> @@ -949,6 +975,11 @@ static vm_fault_t vhost_vdpa_fault(struct vm_fault *vmf)
>   struct vm_area_struct *vma = vmf->vma;
>   u16 index = vma->vm_pgoff;
>  
> + if (v->dev_invalid) {
> + dev_info(>dev,
> +  "%s: vhost_vdpa device unavailable\n", __func__);
> + return VM_FAULT_NOPAGE;
> + }
>   notify = ops->get_vq_notification(vdpa, index);
>  
>   vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
> @@ -1091,11 +1122,13 @@ static void vhost_vdpa_remove(struct vdpa_device 
> *vdpa)
>   opened = atomic_cmpxchg(>opened, 0, 1);
>   if (!opened)
>   break;
> - wait_for_completion_timeout(>completion,
> - msecs_to_jiffies(1000));
> - dev_warn_once(>dev,
> -   "%s waiting for /dev/%s to be closed\n",
> -   __func__, dev_name(>dev));
> + if (!wait_for_completion_timeout(>completion,
> + msecs_to_jiffies(1000))) {
> + dev_warn(>dev,
> +  "%s /dev/%s in use, continue..\n",
> +  __func__, dev_name(>dev));
> + break;
> + }

When you have an arbitrary timeout you know something's not entirely
robust ...

>   } while (1);
>  
>   put_device(>dev);
> + v->dev_invalid = true;
> -- 
> 2.30.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH] vhost-vdpa: mark vhost device invalid to reflect vdpa device unregistration

2021-07-04 Thread Jason Wang


在 2021/7/5 上午4:52, gautam.da...@xilinx.com 写道:

vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
@@ -1091,11 +1122,13 @@ static void vhost_vdpa_remove(struct vdpa_device *vdpa)
opened = atomic_cmpxchg(>opened, 0, 1);
if (!opened)
break;
-   wait_for_completion_timeout(>completion,
-   msecs_to_jiffies(1000));
-   dev_warn_once(>dev,
- "%s waiting for/dev/%s to be closed\n",
- __func__, dev_name(>dev));
+   if (!wait_for_completion_timeout(>completion,
+   msecs_to_jiffies(1000))) {
+   dev_warn(>dev,
+"%s/dev/%s in use, continue..\n",
+__func__, dev_name(>dev));
+   break;
+   }
} while (1);
  
  	put_device(>dev);

+   v->dev_invalid = true;



Besides the mapping handling mentioned by Michael. I think this can lead 
use-after-free. put_device may release the memory.


Another fundamental issue, vDPA is the parent of vhost-vDPA device. I'm 
not sure the device core can allow the parent to go away first.


Thanks


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH V2 5/6] virtio: add one field into virtio_device for recording if device uses managed irq

2021-07-04 Thread Jason Wang


在 2021/7/2 下午11:05, Ming Lei 写道:

blk-mq needs to know if the device uses managed irq, so add one field
to virtio_device for recording if device uses managed irq.

If the driver use managed irq, this flag has to be set so it can be
passed to blk-mq.

Cc: "Michael S. Tsirkin"
Cc: Jason Wang
Cc:virtualization@lists.linux-foundation.org
Signed-off-by: Ming Lei
---
  drivers/block/virtio_blk.c | 2 ++
  drivers/scsi/virtio_scsi.c | 1 +
  drivers/virtio/virtio_pci_common.c | 1 +
  include/linux/virtio.h | 1 +
  4 files changed, 5 insertions(+)

diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index e4bd3b1fc3c2..33b9c80ac475 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -764,6 +764,8 @@ static int virtblk_probe(struct virtio_device *vdev)
vblk->tag_set.queue_depth = queue_depth;
vblk->tag_set.numa_node = NUMA_NO_NODE;
vblk->tag_set.flags = BLK_MQ_F_SHOULD_MERGE;
+   if (vdev->use_managed_irq)
+   vblk->tag_set.flags |= BLK_MQ_F_MANAGED_IRQ;



I'm not familiar with blk mq.

But the name is kind of confusing, I guess 
"BLK_MQ_F_AFFINITY_MANAGED_IRQ" is better? (Consider we had 
"IRQD_AFFINITY_MANAGED")


This helps me to differ this from the devres (device managed IRQ) at least.

Thanks


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH v12] i2c: virtio: add a virtio i2c frontend driver

2021-07-04 Thread Viresh Kumar
On 02-07-21, 12:58, Andy Shevchenko wrote:
> On Fri, Jul 02, 2021 at 04:46:47PM +0800, Jie Deng wrote:
> > +static int virtio_i2c_complete_reqs(struct virtqueue *vq,
> > +   struct virtio_i2c_req *reqs,
> > +   struct i2c_msg *msgs, int nr,
> > +   bool fail)
> > +{
> > +   struct virtio_i2c_req *req;
> > +   bool failed = fail;

Jie, you can actually get rid of this variable too. Jut rename fail to failed
and everything shall work as you want.

> > +   unsigned int len;
> > +   int i, j = 0;
> > +
> > +   for (i = 0; i < nr; i++) {
> > +   /* Detach the ith request from the vq */
> > +   req = virtqueue_get_buf(vq, );
> > +
> > +   /*
> > +* Condition (req && req == [i]) should always meet since
> > +* we have total nr requests in the vq.
> > +*/
> > +   if (!failed && (WARN_ON(!(req && req == [i])) ||
> > +   (req->in_hdr.status != VIRTIO_I2C_MSG_OK)))
> > +   failed = true;
> 
> ...and after failed is true, we are continuing the loop, why?

Actually this function can be called with fail set to true. We proceed as we
need to call i2c_put_dma_safe_msg_buf() for all buffers we allocated earlier.

> > +   i2c_put_dma_safe_msg_buf(reqs[i].buf, [i], !failed);
> > +   if (!failed)
> 
> > +   ++j;
> 
> Besides better to read j++ the j itself can be renamed to something more
> verbose.
> 
> > +   }
> 
> > +   return (fail ? -ETIMEDOUT : j);
> 
> Redundant parentheses.
> 
> > +}

-- 
viresh
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v12] i2c: virtio: add a virtio i2c frontend driver

2021-07-04 Thread Viresh Kumar
I think we missed the first deadline for 5.14-rc1 as Wolfram should be out of
office now. Anyway lets make sure we fix all the pending bits before he is back
and see if we can still pull it off by rc2.

On 02-07-21, 16:46, Jie Deng wrote:
> diff --git a/drivers/i2c/busses/i2c-virtio.c b/drivers/i2c/busses/i2c-virtio.c
> +static int virtio_i2c_send_reqs(struct virtqueue *vq,

It would be better to rename this to virtio_i2c_prepare_reqs instead, as this
doesn't send anything.

> + struct virtio_i2c_req *reqs,
> + struct i2c_msg *msgs, int nr)
> +{
> + struct scatterlist *sgs[3], out_hdr, msg_buf, in_hdr;
> + int i, outcnt, incnt, err = 0;
> +
> + for (i = 0; i < nr; i++) {
> + /*
> +  * Only 7-bit mode supported for this moment. For the address 
> format,
> +  * Please check the Virtio I2C Specification.
> +  */
> + reqs[i].out_hdr.addr = cpu_to_le16(msgs[i].addr << 1);
> +
> + if (i != nr - 1)
> + reqs[i].out_hdr.flags = 
> cpu_to_le32(VIRTIO_I2C_FLAGS_FAIL_NEXT);
> +
> + outcnt = incnt = 0;
> + sg_init_one(_hdr, [i].out_hdr, 
> sizeof(reqs[i].out_hdr));
> + sgs[outcnt++] = _hdr;
> +
> + if (msgs[i].len) {
> + reqs[i].buf = i2c_get_dma_safe_msg_buf([i], 1);
> + if (!reqs[i].buf)
> + break;
> +
> + sg_init_one(_buf, reqs[i].buf, msgs[i].len);
> +
> + if (msgs[i].flags & I2C_M_RD)
> + sgs[outcnt + incnt++] = _buf;
> + else
> + sgs[outcnt++] = _buf;
> + }
> +
> + sg_init_one(_hdr, [i].in_hdr, sizeof(reqs[i].in_hdr));
> + sgs[outcnt + incnt++] = _hdr;
> +
> + err = virtqueue_add_sgs(vq, sgs, outcnt, incnt, [i], 
> GFP_KERNEL);
> + if (err < 0) {
> + i2c_put_dma_safe_msg_buf(reqs[i].buf, [i], false);
> + break;
> + }
> + }
> +
> + return i;
> +}

The right way of doing this is is making this function return - Error on failure
and 0 on success. There is no point returning number of successful additions
here.

Moreover, on failures this needs to clean up (free the dmabufs) itself, just
like you did i2c_put_dma_safe_msg_buf() at the end. The caller shouldn't be
required to handle the error cases by freeing up resources.

> +static int virtio_i2c_complete_reqs(struct virtqueue *vq,
> + struct virtio_i2c_req *reqs,
> + struct i2c_msg *msgs, int nr,
> + bool fail)
> +{
> + struct virtio_i2c_req *req;
> + bool failed = fail;
> + unsigned int len;
> + int i, j = 0;
> +
> + for (i = 0; i < nr; i++) {
> + /* Detach the ith request from the vq */
> + req = virtqueue_get_buf(vq, );
> +
> + /*
> +  * Condition (req && req == [i]) should always meet since
> +  * we have total nr requests in the vq.
> +  */
> + if (!failed && (WARN_ON(!(req && req == [i])) ||
> + (req->in_hdr.status != VIRTIO_I2C_MSG_OK)))

What about writing this as:

if (!failed && (WARN_ON(req != [i]) ||
(req->in_hdr.status != VIRTIO_I2C_MSG_OK)))

We don't need to check req here since if req is NULL, we will not do req->in_hdr
at all.

> + failed = true;
> +
> + i2c_put_dma_safe_msg_buf(reqs[i].buf, [i], !failed);
> + if (!failed)
> + ++j;
> + }
> +
> + return (fail ? -ETIMEDOUT : j);
> +}
> +

-- 
viresh
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v12] i2c: virtio: add a virtio i2c frontend driver

2021-07-04 Thread Jie Deng



On 2021/7/2 17:58, Andy Shevchenko wrote:

On Fri, Jul 02, 2021 at 04:46:47PM +0800, Jie Deng wrote:

Add an I2C bus driver for virtio para-virtualization.

The controller can be emulated by the backend driver in
any device model software by following the virtio protocol.

The device specification can be found on
https://lists.oasis-open.org/archives/virtio-comment/202101/msg8.html.

By following the specification, people may implement different
backend drivers to emulate different controllers according to
their needs.

...


+static int virtio_i2c_complete_reqs(struct virtqueue *vq,
+   struct virtio_i2c_req *reqs,
+   struct i2c_msg *msgs, int nr,
+   bool fail)
+{
+   struct virtio_i2c_req *req;
+   bool failed = fail;
+   unsigned int len;
+   int i, j = 0;
+
+   for (i = 0; i < nr; i++) {
+   /* Detach the ith request from the vq */
+   req = virtqueue_get_buf(vq, );
+
+   /*
+* Condition (req && req == [i]) should always meet since
+* we have total nr requests in the vq.
+*/
+   if (!failed && (WARN_ON(!(req && req == [i])) ||
+   (req->in_hdr.status != VIRTIO_I2C_MSG_OK)))
+   failed = true;

...and after failed is true, we are continuing the loop, why?



Still need to continue to do some cleanup.





+   i2c_put_dma_safe_msg_buf(reqs[i].buf, [i], !failed);
+   if (!failed)
+   ++j;

Besides better to read j++ the j itself can be renamed to something more
verbose.



Will change it to j++.



+   }
+   return (fail ? -ETIMEDOUT : j);

Redundant parentheses.



Will remove the parentheses.





+}

...


+   ret = virtio_i2c_send_reqs(vq, reqs, msgs, num);
+   if (ret != num) {
+   virtio_i2c_complete_reqs(vq, reqs, msgs, ret, true);

Below you check the returned code, here is not.



I will do some optimization here.





+   ret = 0;
+   goto err_free;
+   }
+
+   reinit_completion(>completion);
+   virtqueue_kick(vq);
+
+   time_left = wait_for_completion_timeout(>completion, adap->timeout);
+   if (!time_left)
+   dev_err(>dev, "virtio i2c backend timeout.\n");
+
+   ret = virtio_i2c_complete_reqs(vq, reqs, msgs, num, !time_left);
+
+err_free:
+   kfree(reqs);
+   return ret;
+++ b/include/uapi/linux/virtio_i2c.h
+#include 
+
+/* The bit 0 of the @virtio_i2c_out_hdr.@flags, used to group the requests */
+#define VIRTIO_I2C_FLAGS_FAIL_NEXT BIT(0)

It's _BITUL() or so from linux/const.h.
https://elixir.bootlin.com/linux/latest/source/include/uapi/linux/const.h#L28
You may not use internal definitions in UAPI headers.



Let's use this _BITUL() from linux/const.h instead. Thank you !


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v2 2/2] drivers: gpio: add virtio-gpio guest driver

2021-07-04 Thread Viresh Kumar
On 03-07-21, 04:05, Michael S. Tsirkin wrote:
> On Wed, May 26, 2021 at 09:02:06AM +0530, Viresh Kumar wrote:
> > On 25-05-21, 14:59, Enrico Weigelt, metux IT consult wrote:
> > > On 24.05.21 13:27, Viresh Kumar wrote:
> > > 
> > > Hi,
> > > 
> > > 
> > > > We (Linaro's Project Stratos
> > > > https://linaro.atlassian.net/wiki/spaces/STR/overview)
> > > >   are interested in this stuff. I was trying to look at the last status
> > > > of all this. Few
> > > > questions for you:
> > > > 
> > > > - Was the spec ever posted to virtio-dev list ? I thought that's the
> > > > very first step before
> > > > we merge the code.
> > > 
> > > I had posted some spec quite some time ago, but it wasn't in the form
> > > of patches against the .tex documentation files yet. It's been laying
> > > aside for quite a while, since I've been busy w/ other things.
> > 
> > Will you be fine if I take that up and restart the thread ?
> 
> It's been a while - why not right?

Yeah, we went past that and here is the last version I posted.

https://lists.oasis-open.org/archives/virtio-comment/202106/msg00033.html

which I took back later on to let Enrico do it, as he wanted to.

And here is the last version from Enrico:

https://lists.oasis-open.org/archives/virtio-comment/202106/msg00048.html

Lets see how this goes in coming days.

-- 
viresh
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [PATCH v12] i2c: virtio: add a virtio i2c frontend driver

2021-07-04 Thread Viresh Kumar
On 05-07-21, 11:45, Jie Deng wrote:
> On 2021/7/5 10:40, Viresh Kumar wrote:
> > On 02-07-21, 16:46, Jie Deng wrote:
> > The right way of doing this is is making this function return - Error on 
> > failure
> > and 0 on success. There is no point returning number of successful additions
> > here.
> 
> 
> We need the number for virtio_i2c_complete_reqs to do cleanup. We don't have
> to
> 
> do cleanup "num" times every time. Just do it as needed.

If you do full cleanup here, then you won't required that at the caller site.

> > Moreover, on failures this needs to clean up (free the dmabufs) itself, just
> > like you did i2c_put_dma_safe_msg_buf() at the end. The caller shouldn't be
> > required to handle the error cases by freeing up resources.
> 
> 
> This function will return the number of requests being successfully prepared
> and make sure
> 
> resources of the failed request being freed. And virtio_i2c_complete_reqs
> will free the
> 
> resources of those successful request.

It just looks cleaner to give such responsibility to each and every function,
i.e. if they fail, they should clean stuff up instead of the caller. That's the
normal philosophy you will find across kernel in most of the cases.
 
> > > +static int virtio_i2c_complete_reqs(struct virtqueue *vq,
> > > + struct virtio_i2c_req *reqs,
> > > + struct i2c_msg *msgs, int nr,
> > > + bool fail)
> > > +{
> > > + struct virtio_i2c_req *req;
> > > + bool failed = fail;
> > > + unsigned int len;
> > > + int i, j = 0;
> > > +
> > > + for (i = 0; i < nr; i++) {
> > > + /* Detach the ith request from the vq */
> > > + req = virtqueue_get_buf(vq, );
> > > +
> > > + /*
> > > +  * Condition (req && req == [i]) should always meet since
> > > +  * we have total nr requests in the vq.
> > > +  */
> > > + if (!failed && (WARN_ON(!(req && req == [i])) ||
> > > + (req->in_hdr.status != VIRTIO_I2C_MSG_OK)))
> > What about writing this as:
> > 
> > if (!failed && (WARN_ON(req != [i]) ||
> > (req->in_hdr.status != VIRTIO_I2C_MSG_OK)))
> > 
> > We don't need to check req here since if req is NULL, we will not do 
> > req->in_hdr
> > at all.
> 
> 
> It's right here just because the [i] will never be NULL in our case.
> But if you see
> 
> "virtio_i2c_complete_reqs" as an independent function, you need to check the
> 
> req. From the perspective of the callee, you can't ask the caller always
> give you
> 
> the non-NULL parameters.

We need to keep this driver optimized in its current form. If you see your own
argument here, then why don't you test vq or msgs for a valid pointer ? And even
reqs.

If we know for certain that this will never happen, then it should be optimized.
But if you see a case where reqs[i] can be NULL here, then it would be fine.

> And some tools may give warnings.

I don't see why a tool will raise an error here and if it does, then the tool is
buggy and not the driver. And we don't need to take care of that.

-- 
viresh
___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH] vhost-vdpa: mark vhost device invalid to reflect vdpa device unregistration

2021-07-04 Thread Jason Wang


在 2021/7/5 上午11:48, Jason Wang 写道:


在 2021/7/5 上午4:52, gautam.da...@xilinx.com 写道:

  vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
@@ -1091,11 +1122,13 @@ static void vhost_vdpa_remove(struct 
vdpa_device *vdpa)

  opened = atomic_cmpxchg(>opened, 0, 1);
  if (!opened)
  break;
-    wait_for_completion_timeout(>completion,
-    msecs_to_jiffies(1000));
-    dev_warn_once(>dev,
-  "%s waiting for/dev/%s to be closed\n",
-  __func__, dev_name(>dev));
+    if (!wait_for_completion_timeout(>completion,
+    msecs_to_jiffies(1000))) {
+    dev_warn(>dev,
+ "%s/dev/%s in use, continue..\n",
+ __func__, dev_name(>dev));
+    break;
+    }
  } while (1);
    put_device(>dev);
+    v->dev_invalid = true;



Besides the mapping handling mentioned by Michael. I think this can 
lead use-after-free. put_device may release the memory.


Another fundamental issue, vDPA is the parent of vhost-vDPA device. 
I'm not sure the device core can allow the parent to go away first.



Or this probably means you need couple the fd loosely with the 
vhost-vDPA device.


Thanks




Thanks




___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [PATCH linux-next v3 2/6] vdpa: Introduce query of device config layout

2021-07-04 Thread Jason Wang


在 2021/7/2 下午2:04, Parav Pandit 写道:



From: Jason Wang 
Sent: Thursday, July 1, 2021 1:13 PM


Actually it depends on what attributes is required for building the config.

We can simply reuse the existing virtio_net_config, if most of the fields are
required.

struct virtio_net_config_set {
          __virtio64 features;
          union {
              struct virtio_net_config;
              __virtio64 reserved[64];
          }
};

If only few of the is required, we can just pick them and use another
structure.

The point is we define structure based on current fields. Tomorrow a new RSS or 
rx scaling scheme appears, and structure size might need change.
And it demands us to go back to length based typecasting code.
and to avoid some length check we pick some arbitrary size reserved words.
And I do not know what network research group will come up for new rss 
algorithm and needed plumbing.



Yes, but as discussed, we may suffer the similar issue at the device 
level. E.g we need a command to let PF to "build" the config for a VF or SF.






Actually, I think just pass the whole config with the device_features during
device creation is a good choice that can simplify a lot of things.

Yes. I totally agree to this.


We can define what is needed and ignore the others in the virtio spec.
Then there's no need to worry about any other things. vDPA core can just do
santiy test like checking size vs features.

Yes, we are trying to have code that avoids such sanity checks based on 
structure size, length etc fields. :-)




Instead of doing them as individual netlink attributes, its lumped together

in a struct of arbitrary length. :-)


I think not? We want to have a fixed length of the structure which never
grow.


I am not sure defining that future now is right choice, at least for me.


So the different is:

1) using netlink dedicated fields

if (nl_attrs[VDPA_ATTR_DEV_NET_CFG_MACADDR])

2) using netlink as transport

if (features & VIRTIO_NET_F_MAC)



I notice several fields of the vduse device is setup via ioctl, which I think

should be setup via this vdpa device add interface.

Also we can always wrap above nl_attr code in a helper API so that drivers

to not hand-code it.


Then it would be still more like 2) above (wrap netlink back to
something like virtio_net_config)?



We may meet similar issue when provision VF/SF instance at the

hardware

level. So I think we may need something in the virtio spec in the near

future.

Given the device config is not spelled out in the virtio spec, may be we can 
wait for it to define virtio management interface.



Yes.





I don't object but it needs to be done in virtio uAPI instead of
netlink, since it's the device ABI.

Device config can surely be part of the virtio uAPI.
We need not have put that in UAPI.
More below.


This is the reverse of netlink which offers to not reserve any arbitrary size

structure.


It's not arbitrary but with fixed length.

Its fixed, but decided arbitrarily large in anticipation that we likely need to 
grow.
And sometimes that fall short when next research comes up with more creative 
thoughts.



How about something like TLVs in the virtio spec then?





It may only work for netlink (with some duplication with the existing
virtio uAPI). If we can solve it at general virtio layer, it would be
better. Otherwise we need to invent them again in the virtio spec.


Virtio spec will likely define what should be config fields to program and its 
layout.
Kernel can always fill up the format that virtio spec demands.



Yes, I wonder if you have the interest to work on the spec to support this.





I think even for the current mlx5e vDPA it would be better, otherwise we
may have:

vDPA tool -> [netlink specific vDPA attributes(1)] -> vDPA core -> [vDPA
core specific VDPA attributes(2)] -> mlx5e_vDPA -> [mlx5e specific vDPA
attributes(3)] -> mlx5e_core

We need to use a single and unified virtio structure in all the (1), (2)
and (3).

This is where I differ.
Its only vdpa tool -> vdpa core -> vendor_driver

Vdpa tool -> vdpa core = netlink attribute
Vdpa core -> vendor driver = struct_foo. (internal inside the linux kernel)

If tomorrow virtio spec defines struct_foo to be something else, kernel can 
always upgrade to struct_bar without upgrading UAPI netlink attributes.



That's fine. Note that actually have an extra level if vendor_driver is 
virtio-pci vDPA driver (vp_vdpa).


Then we have

vdpa tool -> vdpa core -> vp_vdpa -> virtio-pci device

So we still need invent commands to configure/build VF/SF config space 
between vp_vdpa and virtio-pci device. And I think we may suffer the 
similar issue as we met here (vdpa tool -> vdpa core).




Netlink attributes addition will be needed only when struct_foo has newer 
fields.
This will be still forward/backward compatible.

An exact example of this is drivers/net/vxlan.c
vxlan_nl2conf().
A vxlan device needs VNI, src ip, dst ip, tos, and more.
Instead of putting all in single 

Re: [PATCH 2/3] vDPA/ifcvf: implement management netlink framework for ifcvf

2021-07-04 Thread Jason Wang


在 2021/6/30 下午4:21, Zhu Lingshan 写道:

This commit implments the management netlink framework for ifcvf,
including register and add / remove a device

It works with iprouter2:
[root@localhost lszhu]# vdpa mgmtdev show -jp
{
 "mgmtdev": {
 "pci/:01:00.5": {
 "supported_classes": [ "net" ]
 },
 "pci/:01:00.6": {
 "supported_classes": [ "net" ]
 }
 }
}

[root@localhost lszhu]# vdpa dev add mgmtdev pci/:01:00.5 name vdpa0
[root@localhost lszhu]# vdpa dev add mgmtdev pci/:01:00.6 name vdpa1

Signed-off-by: Zhu Lingshan 
---
  drivers/vdpa/ifcvf/ifcvf_base.h |   6 ++
  drivers/vdpa/ifcvf/ifcvf_main.c | 156 
  2 files changed, 126 insertions(+), 36 deletions(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_base.h b/drivers/vdpa/ifcvf/ifcvf_base.h
index ded1b1b5fb13..e5251fcbb200 100644
--- a/drivers/vdpa/ifcvf/ifcvf_base.h
+++ b/drivers/vdpa/ifcvf/ifcvf_base.h
@@ -104,6 +104,12 @@ struct ifcvf_lm_cfg {
struct ifcvf_vring_lm_cfg vring_lm_cfg[IFCVF_MAX_QUEUE_PAIRS];
  };
  
+struct ifcvf_vdpa_mgmt_dev {

+   struct vdpa_mgmt_dev mdev;
+   struct ifcvf_adapter *adapter;
+   struct pci_dev *pdev;
+};
+
  int ifcvf_init_hw(struct ifcvf_hw *hw, struct pci_dev *dev);
  int ifcvf_start_hw(struct ifcvf_hw *hw);
  void ifcvf_stop_hw(struct ifcvf_hw *hw);
diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 5f70ab1283a0..7c2f64ca2163 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -442,6 +442,16 @@ static const struct vdpa_config_ops ifc_vdpa_ops = {
.set_config_cb  = ifcvf_vdpa_set_config_cb,
  };
  
+static struct virtio_device_id id_table_net[] = {

+   {VIRTIO_ID_NET, VIRTIO_DEV_ANY_ID},
+   {0},
+};
+
+static struct virtio_device_id id_table_blk[] = {
+   {VIRTIO_ID_BLOCK, VIRTIO_DEV_ANY_ID},
+   {0},
+};
+
  static u32 get_dev_type(struct pci_dev *pdev)
  {
u32 dev_type;
@@ -462,48 +472,30 @@ static u32 get_dev_type(struct pci_dev *pdev)
return dev_type;
  }
  
-static int ifcvf_probe(struct pci_dev *pdev, const struct pci_device_id *id)

+static int ifcvf_vdpa_dev_add(struct vdpa_mgmt_dev *mdev, const char *name)
  {
-   struct device *dev = >dev;
+   struct ifcvf_vdpa_mgmt_dev *ifcvf_mgmt_dev;
struct ifcvf_adapter *adapter;
+   struct pci_dev *pdev;
struct ifcvf_hw *vf;
+   struct device *dev;
int ret, i;
  
-	ret = pcim_enable_device(pdev);

-   if (ret) {
-   IFCVF_ERR(pdev, "Failed to enable device\n");
-   return ret;
-   }
-
-   ret = pcim_iomap_regions(pdev, BIT(0) | BIT(2) | BIT(4),
-IFCVF_DRIVER_NAME);
-   if (ret) {
-   IFCVF_ERR(pdev, "Failed to request MMIO region\n");
-   return ret;
-   }
-
-   ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
-   if (ret) {
-   IFCVF_ERR(pdev, "No usable DMA configuration\n");
-   return ret;
-   }
-
-   ret = devm_add_action_or_reset(dev, ifcvf_free_irq_vectors, pdev);
-   if (ret) {
-   IFCVF_ERR(pdev,
- "Failed for adding devres for freeing irq vectors\n");
-   return ret;
-   }
+   ifcvf_mgmt_dev = container_of(mdev, struct ifcvf_vdpa_mgmt_dev, mdev);
+   if (ifcvf_mgmt_dev->adapter)
+   return -EOPNOTSUPP;
  
+	pdev = ifcvf_mgmt_dev->pdev;

+   dev = >dev;
adapter = vdpa_alloc_device(struct ifcvf_adapter, vdpa,
-   dev, _vdpa_ops, NULL);
-   if (adapter == NULL) {
+   dev, _vdpa_ops, name);
+   if (!adapter) {
IFCVF_ERR(pdev, "Failed to allocate vDPA structure");
return -ENOMEM;
}
  
-	pci_set_master(pdev);

-   pci_set_drvdata(pdev, adapter);
+   ifcvf_mgmt_dev->adapter = adapter;
+   pci_set_drvdata(pdev, ifcvf_mgmt_dev);
  
  	vf = >vf;

vf->dev_type = get_dev_type(pdev);
@@ -515,7 +507,7 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct 
pci_device_id *id)
ret = ifcvf_init_hw(vf, pdev);
if (ret) {
IFCVF_ERR(pdev, "Failed to init IFCVF hw\n");
-   goto err;
+   return ret;
}
  
  	for (i = 0; i < IFCVF_MAX_QUEUE_PAIRS * 2; i++)

@@ -523,9 +515,94 @@ static int ifcvf_probe(struct pci_dev *pdev, const struct 
pci_device_id *id)
  
  	vf->hw_features = ifcvf_get_hw_features(vf);
  
-	ret = vdpa_register_device(>vdpa, IFCVF_MAX_QUEUE_PAIRS * 2);

+   adapter->vdpa.mdev = _mgmt_dev->mdev;
+   ret = _vdpa_register_device(>vdpa, IFCVF_MAX_QUEUE_PAIRS * 2);
if (ret) {
-   IFCVF_ERR(pdev, "Failed to register ifcvf to vdpa bus");
+   IFCVF_ERR(pdev, "Failed to register to vDPA bus");
+   return ret;
+   }
+
+

Re: [PATCH 3/3] vDPA/ifcvf: set_status() should get a adapter from the mgmt dev

2021-07-04 Thread Jason Wang


在 2021/6/30 下午4:21, Zhu Lingshan 写道:

ifcvf_vdpa_set_status() should get a adapter from the
management device

Signed-off-by: Zhu Lingshan 
---
  drivers/vdpa/ifcvf/ifcvf_main.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/vdpa/ifcvf/ifcvf_main.c b/drivers/vdpa/ifcvf/ifcvf_main.c
index 7c2f64ca2163..28c71eef1d2b 100644
--- a/drivers/vdpa/ifcvf/ifcvf_main.c
+++ b/drivers/vdpa/ifcvf/ifcvf_main.c
@@ -212,13 +212,15 @@ static u8 ifcvf_vdpa_get_status(struct vdpa_device 
*vdpa_dev)
  
  static void ifcvf_vdpa_set_status(struct vdpa_device *vdpa_dev, u8 status)

  {
+   struct ifcvf_vdpa_mgmt_dev *ifcvf_mgmt_dev;
struct ifcvf_adapter *adapter;
struct ifcvf_hw *vf;
u8 status_old;
int ret;
  
  	vf  = vdpa_to_vf(vdpa_dev);

-   adapter = dev_get_drvdata(vdpa_dev->dev.parent);



If this is a fix for patch 2, you need to squash this into that one.

Any reason that vdpa_to_adapter() can't work?

And I see:

+struct ifcvf_vdpa_mgmt_dev {
+   struct vdpa_mgmt_dev mdev;
+   struct ifcvf_adapter *adapter;
+   struct pci_dev *pdev;
+};

What's the reason for having a adapter pointer here?

Thanks



+   ifcvf_mgmt_dev = dev_get_drvdata(vdpa_dev->dev.parent);
+   adapter = ifcvf_mgmt_dev->adapter;
status_old = ifcvf_get_status(vf);
  
  	if (status_old == status)


___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

Re: [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic

2021-07-04 Thread Michael S. Tsirkin
On Sun, Jul 04, 2021 at 12:23:03PM +0300, Arseny Krasnov wrote:
> 
> On 04.07.2021 11:30, Michael S. Tsirkin wrote:
> > On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
> >>This patchset modifies receive logic for SOCK_SEQPACKET.
> >> Difference between current implementation and this version is that
> >> now reader is woken up when there is at least one RW packet in rx
> >> queue of socket and data is copied to user's buffer, while merged
> >> approach wake up user only when whole message is received and kept
> >> in queue. New implementation has several advantages:
> >>  1) There is no limit for message length. Merged approach requires
> >> that length must be smaller than 'peer_buf_alloc', otherwise
> >> transmission will stuck.
> >>  2) There is no need to keep whole message in queue, thus no
> >> 'kmalloc()' memory will be wasted until EOR is received.
> >>
> >> Also new approach has some feature: as fragments of message
> >> are copied until EOR is received, it is possible that part of
> >> message will be already in user's buffer, while rest of message
> >> still not received. And if user will be interrupted by signal or
> >> timeout with part of message in buffer, it will exit receive loop,
> >> leaving rest of message in queue. To solve this problem special
> >> callback was added to transport: it is called when user was forced
> >> to leave exit loop and tells transport to drop any packet until
> >> EOR met.
> > Sorry about commenting late in the game.  I'm a bit lost
> >
> >
> > SOCK_SEQPACKET
> > Provides sequenced, reliable, bidirectional, connection-mode transmission 
> > paths for records. A record can be sent using one or more output operations 
> > and received using one or more input operations, but a single operation 
> > never transfers part of more than one record. Record boundaries are visible 
> > to the receiver via the MSG_EOR flag.
> >
> > it's supposed to be reliable - how is it legal to drop packets?
> 
> Sorry, seems i need to rephrase description. "Packet" here means fragment of 
> record(message) at transport
> 
> layer. As this is SEQPACKET mode, receiver could get only whole message or 
> error, so if only several fragments
> 
> of message was copied (if signal received for example) we can't return it to 
> user - it breaks SEQPACKET sense. I think,
> 
> in this case we can drop rest of record's fragments legally.
> 
> 
> Thank You

Would not that violate the reliable property? IIUC it's only ok to
return an error if socket gets closed. Just like e.g. TCP ...



> >
> >
> >> When EOR is found, this mode is disabled and normal packet
> >> processing started. Note, that when 'drop until EOR' mode is on,
> >> incoming packets still inserted in queue, reader will be woken up,
> >> tries to copy data, but nothing will be copied until EOR found.
> >> It was possible to drain such unneeded packets it rx work without
> >> kicking user, but implemented way is simplest. Anyway, i think
> >> such cases are rare.
> >
> >> New test also added - it tries to copy to invalid user's
> >> buffer.
> >>
> >> Arseny Krasnov (16):
> >>  af_vsock/virtio/vsock: change seqpacket receive logic
> >>  af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
> >>  virtio/vsock: remove 'msg_count' based logic
> >>  af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
> >>  virtio/vsock: remove record size limit for SEQPACKET
> >>  vsock_test: SEQPACKET read to broken buffer
> >>
> >>  drivers/vhost/vsock.c   |   2 +-
> >>  include/linux/virtio_vsock.h|   7 +-
> >>  include/net/af_vsock.h  |   4 +-
> >>  net/vmw_vsock/af_vsock.c|  44 
> >>  net/vmw_vsock/virtio_transport.c|   2 +-
> >>  net/vmw_vsock/virtio_transport_common.c | 103 ---
> >>  net/vmw_vsock/vsock_loopback.c  |   2 +-
> >>  tools/testing/vsock/vsock_test.c| 120 ++
> >>  8 files changed, 193 insertions(+), 91 deletions(-)
> >>
> >>  v1 -> v2:
> >>  Patches reordered and reorganized.
> >>
> >> Signed-off-by: Arseny Krasnov 
> >> ---
> >>  cv.txt | 0
> >>  1 file changed, 0 insertions(+), 0 deletions(-)
> >>  create mode 100644 cv.txt
> >>
> >> diff --git a/cv.txt b/cv.txt
> >> new file mode 100644
> >> index ..e69de29bb2d1
> >> -- 
> >> 2.25.1
> >

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization


Re: [RFC PATCH v2 0/6] Improve SOCK_SEQPACKET receive logic

2021-07-04 Thread Michael S. Tsirkin
On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
>   This patchset modifies receive logic for SOCK_SEQPACKET.
> Difference between current implementation and this version is that
> now reader is woken up when there is at least one RW packet in rx
> queue of socket and data is copied to user's buffer, while merged
> approach wake up user only when whole message is received and kept
> in queue. New implementation has several advantages:
>  1) There is no limit for message length. Merged approach requires
> that length must be smaller than 'peer_buf_alloc', otherwise
> transmission will stuck.
>  2) There is no need to keep whole message in queue, thus no
> 'kmalloc()' memory will be wasted until EOR is received.
> 
> Also new approach has some feature: as fragments of message
> are copied until EOR is received, it is possible that part of
> message will be already in user's buffer, while rest of message
> still not received. And if user will be interrupted by signal or
> timeout with part of message in buffer, it will exit receive loop,
> leaving rest of message in queue. To solve this problem special
> callback was added to transport: it is called when user was forced
> to leave exit loop and tells transport to drop any packet until
> EOR met.

Sorry about commenting late in the game.  I'm a bit lost


SOCK_SEQPACKET
Provides sequenced, reliable, bidirectional, connection-mode transmission paths 
for records. A record can be sent using one or more output operations and 
received using one or more input operations, but a single operation never 
transfers part of more than one record. Record boundaries are visible to the 
receiver via the MSG_EOR flag.

it's supposed to be reliable - how is it legal to drop packets?


> When EOR is found, this mode is disabled and normal packet
> processing started. Note, that when 'drop until EOR' mode is on,
> incoming packets still inserted in queue, reader will be woken up,
> tries to copy data, but nothing will be copied until EOR found.
> It was possible to drain such unneeded packets it rx work without
> kicking user, but implemented way is simplest. Anyway, i think
> such cases are rare.


> New test also added - it tries to copy to invalid user's
> buffer.
> 
> Arseny Krasnov (16):
>  af_vsock/virtio/vsock: change seqpacket receive logic
>  af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
>  virtio/vsock: remove 'msg_count' based logic
>  af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
>  virtio/vsock: remove record size limit for SEQPACKET
>  vsock_test: SEQPACKET read to broken buffer
> 
>  drivers/vhost/vsock.c   |   2 +-
>  include/linux/virtio_vsock.h|   7 +-
>  include/net/af_vsock.h  |   4 +-
>  net/vmw_vsock/af_vsock.c|  44 
>  net/vmw_vsock/virtio_transport.c|   2 +-
>  net/vmw_vsock/virtio_transport_common.c | 103 ---
>  net/vmw_vsock/vsock_loopback.c  |   2 +-
>  tools/testing/vsock/vsock_test.c| 120 ++
>  8 files changed, 193 insertions(+), 91 deletions(-)
> 
>  v1 -> v2:
>  Patches reordered and reorganized.
> 
> Signed-off-by: Arseny Krasnov 
> ---
>  cv.txt | 0
>  1 file changed, 0 insertions(+), 0 deletions(-)
>  create mode 100644 cv.txt
> 
> diff --git a/cv.txt b/cv.txt
> new file mode 100644
> index ..e69de29bb2d1
> -- 
> 2.25.1

___
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization