Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Wed, Jul 7, 2021 at 4:53 PM Stefan Hajnoczi wrote: > > On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote: > > +static bool vduse_validate_config(struct vduse_dev_config *config) > > +{ > > The name field needs to be NUL terminated? > I think so. > > + case VDUSE_CREATE_DEV: { > > + struct vduse_dev_config config; > > + unsigned long size = offsetof(struct vduse_dev_config, > > config); > > + void *buf; > > + > > + ret = -EFAULT; > > + if (copy_from_user(, argp, size)) > > + break; > > + > > + ret = -EINVAL; > > + if (vduse_validate_config() == false) > > + break; > > + > > + buf = vmemdup_user(argp + size, config.config_size); > > + if (IS_ERR(buf)) { > > + ret = PTR_ERR(buf); > > + break; > > + } > > + ret = vduse_create_dev(, buf, control->api_version); > > + break; > > + } > > + case VDUSE_DESTROY_DEV: { > > + char name[VDUSE_NAME_MAX]; > > + > > + ret = -EFAULT; > > + if (copy_from_user(name, argp, VDUSE_NAME_MAX)) > > + break; > > Is this missing a NUL terminator? Oh, yes. Looks like I need to set '\0' to name[VDUSE_VDUSE_NAME_MAX - 1] here. Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote: > +static bool vduse_validate_config(struct vduse_dev_config *config) > +{ The name field needs to be NUL terminated? > + case VDUSE_CREATE_DEV: { > + struct vduse_dev_config config; > + unsigned long size = offsetof(struct vduse_dev_config, config); > + void *buf; > + > + ret = -EFAULT; > + if (copy_from_user(, argp, size)) > + break; > + > + ret = -EINVAL; > + if (vduse_validate_config() == false) > + break; > + > + buf = vmemdup_user(argp + size, config.config_size); > + if (IS_ERR(buf)) { > + ret = PTR_ERR(buf); > + break; > + } > + ret = vduse_create_dev(, buf, control->api_version); > + break; > + } > + case VDUSE_DESTROY_DEV: { > + char name[VDUSE_NAME_MAX]; > + > + ret = -EFAULT; > + if (copy_from_user(name, argp, VDUSE_NAME_MAX)) > + break; Is this missing a NUL terminator? signature.asc Description: PGP signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/7/1 下午6:26, Yongji Xie 写道: On Thu, Jul 1, 2021 at 3:55 PM Jason Wang wrote: 在 2021/7/1 下午2:50, Yongji Xie 写道: On Wed, Jun 30, 2021 at 5:51 PM Stefan Hajnoczi wrote: On Tue, Jun 29, 2021 at 10:59:51AM +0800, Yongji Xie wrote: On Mon, Jun 28, 2021 at 9:02 PM Stefan Hajnoczi wrote: On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote: +/* ioctls */ + +struct vduse_dev_config { + char name[VDUSE_NAME_MAX]; /* vduse device name */ + __u32 vendor_id; /* virtio vendor id */ + __u32 device_id; /* virtio device id */ + __u64 features; /* device features */ + __u64 bounce_size; /* bounce buffer size for iommu */ + __u16 vq_size_max; /* the max size of virtqueue */ The VIRTIO specification allows per-virtqueue sizes. A device can have two virtqueues, where the first one allows up to 1024 descriptors and the second one allows only 128 descriptors, for example. Good point! But it looks like virtio-vdpa/virtio-pci doesn't support that now. All virtqueues have the same maximum size. I see struct vpda_config_ops only supports a per-device max vq size: u16 (*get_vq_num_max)(struct vdpa_device *vdev); virtio-pci supports per-virtqueue sizes because the struct virtio_pci_common_cfg->queue_size register is per-queue (controlled by queue_select). Oh, yes. I miss queue_select. I guess this is a question for Jason: will vdpa will keep this limitation? If yes, then VDUSE can stick to it too without running into problems in the future. I think it's better to extend the get_vq_num_max() per virtqueue. Currently, vDPA assumes the parent to have a global max size. This seems to work on most of the parents but not vp-vDPA (which could be backed by QEMU, in that case cvq's size is smaller). Fortunately, we haven't enabled had cvq support in the userspace now. I can post the fixes. OK. If so, it looks like we need to support the per-vq configuration. I wonder if it's better to use something like: VDUSE_CREATE_DEVICE -> VDUSE_SETUP_VQ -> VDUSE_SETUP_VQ -> ... -> VDUSE_ENABLE_DEVICE to do initialization rather than only use VDUSE_CREATE_DEVICE. This should be fine. Thanks Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Thu, Jul 1, 2021 at 3:55 PM Jason Wang wrote: > > > 在 2021/7/1 下午2:50, Yongji Xie 写道: > > On Wed, Jun 30, 2021 at 5:51 PM Stefan Hajnoczi wrote: > >> On Tue, Jun 29, 2021 at 10:59:51AM +0800, Yongji Xie wrote: > >>> On Mon, Jun 28, 2021 at 9:02 PM Stefan Hajnoczi > >>> wrote: > On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote: > > +/* ioctls */ > > + > > +struct vduse_dev_config { > > + char name[VDUSE_NAME_MAX]; /* vduse device name */ > > + __u32 vendor_id; /* virtio vendor id */ > > + __u32 device_id; /* virtio device id */ > > + __u64 features; /* device features */ > > + __u64 bounce_size; /* bounce buffer size for iommu */ > > + __u16 vq_size_max; /* the max size of virtqueue */ > The VIRTIO specification allows per-virtqueue sizes. A device can have > two virtqueues, where the first one allows up to 1024 descriptors and > the second one allows only 128 descriptors, for example. > > >>> Good point! But it looks like virtio-vdpa/virtio-pci doesn't support > >>> that now. All virtqueues have the same maximum size. > >> I see struct vpda_config_ops only supports a per-device max vq size: > >> u16 (*get_vq_num_max)(struct vdpa_device *vdev); > >> > >> virtio-pci supports per-virtqueue sizes because the struct > >> virtio_pci_common_cfg->queue_size register is per-queue (controlled by > >> queue_select). > >> > > Oh, yes. I miss queue_select. > > > >> I guess this is a question for Jason: will vdpa will keep this limitation? > >> If yes, then VDUSE can stick to it too without running into problems in > >> the future. > > > I think it's better to extend the get_vq_num_max() per virtqueue. > > Currently, vDPA assumes the parent to have a global max size. This seems > to work on most of the parents but not vp-vDPA (which could be backed by > QEMU, in that case cvq's size is smaller). > > Fortunately, we haven't enabled had cvq support in the userspace now. > > I can post the fixes. > OK. If so, it looks like we need to support the per-vq configuration. I wonder if it's better to use something like: VDUSE_CREATE_DEVICE -> VDUSE_SETUP_VQ -> VDUSE_SETUP_VQ -> ... -> VDUSE_ENABLE_DEVICE to do initialization rather than only use VDUSE_CREATE_DEVICE. Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/7/1 下午2:50, Yongji Xie 写道: On Wed, Jun 30, 2021 at 5:51 PM Stefan Hajnoczi wrote: On Tue, Jun 29, 2021 at 10:59:51AM +0800, Yongji Xie wrote: On Mon, Jun 28, 2021 at 9:02 PM Stefan Hajnoczi wrote: On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote: +/* ioctls */ + +struct vduse_dev_config { + char name[VDUSE_NAME_MAX]; /* vduse device name */ + __u32 vendor_id; /* virtio vendor id */ + __u32 device_id; /* virtio device id */ + __u64 features; /* device features */ + __u64 bounce_size; /* bounce buffer size for iommu */ + __u16 vq_size_max; /* the max size of virtqueue */ The VIRTIO specification allows per-virtqueue sizes. A device can have two virtqueues, where the first one allows up to 1024 descriptors and the second one allows only 128 descriptors, for example. Good point! But it looks like virtio-vdpa/virtio-pci doesn't support that now. All virtqueues have the same maximum size. I see struct vpda_config_ops only supports a per-device max vq size: u16 (*get_vq_num_max)(struct vdpa_device *vdev); virtio-pci supports per-virtqueue sizes because the struct virtio_pci_common_cfg->queue_size register is per-queue (controlled by queue_select). Oh, yes. I miss queue_select. I guess this is a question for Jason: will vdpa will keep this limitation? If yes, then VDUSE can stick to it too without running into problems in the future. I think it's better to extend the get_vq_num_max() per virtqueue. Currently, vDPA assumes the parent to have a global max size. This seems to work on most of the parents but not vp-vDPA (which could be backed by QEMU, in that case cvq's size is smaller). Fortunately, we haven't enabled had cvq support in the userspace now. I can post the fixes. + __u16 padding; /* padding */ + __u32 vq_num; /* the number of virtqueues */ + __u32 vq_align; /* the allocation alignment of virtqueue's metadata */ I'm not sure what this is? This will be used by vring_create_virtqueue() too. If there is no official definition for the meaning of this value then "/* same as vring_create_virtqueue()'s vring_align parameter */" would be clearer. That way the reader knows what to research in order to understand how this field works. OK. I don't remember but maybe it was used to support vrings when the host/guest have non-4KB page sizes. I wonder if anyone has an official definition for this value? Not sure. Maybe we might need some alignment which is less than PAGE_SIZE sometimes. So I see CCW always use 4096, but I'm not sure whether or not it's smaller than PAGE_SIZE. Thanks Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Wed, Jun 30, 2021 at 5:51 PM Stefan Hajnoczi wrote: > > On Tue, Jun 29, 2021 at 10:59:51AM +0800, Yongji Xie wrote: > > On Mon, Jun 28, 2021 at 9:02 PM Stefan Hajnoczi wrote: > > > > > > On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote: > > > > +/* ioctls */ > > > > + > > > > +struct vduse_dev_config { > > > > + char name[VDUSE_NAME_MAX]; /* vduse device name */ > > > > + __u32 vendor_id; /* virtio vendor id */ > > > > + __u32 device_id; /* virtio device id */ > > > > + __u64 features; /* device features */ > > > > + __u64 bounce_size; /* bounce buffer size for iommu */ > > > > + __u16 vq_size_max; /* the max size of virtqueue */ > > > > > > The VIRTIO specification allows per-virtqueue sizes. A device can have > > > two virtqueues, where the first one allows up to 1024 descriptors and > > > the second one allows only 128 descriptors, for example. > > > > > > > Good point! But it looks like virtio-vdpa/virtio-pci doesn't support > > that now. All virtqueues have the same maximum size. > > I see struct vpda_config_ops only supports a per-device max vq size: > u16 (*get_vq_num_max)(struct vdpa_device *vdev); > > virtio-pci supports per-virtqueue sizes because the struct > virtio_pci_common_cfg->queue_size register is per-queue (controlled by > queue_select). > Oh, yes. I miss queue_select. > I guess this is a question for Jason: will vdpa will keep this limitation? > If yes, then VDUSE can stick to it too without running into problems in > the future. > > > > > + __u16 padding; /* padding */ > > > > + __u32 vq_num; /* the number of virtqueues */ > > > > + __u32 vq_align; /* the allocation alignment of virtqueue's > > > > metadata */ > > > > > > I'm not sure what this is? > > > > > > > This will be used by vring_create_virtqueue() too. > > If there is no official definition for the meaning of this value then > "/* same as vring_create_virtqueue()'s vring_align parameter */" would > be clearer. That way the reader knows what to research in order to > understand how this field works. > OK. > I don't remember but maybe it was used to support vrings when the > host/guest have non-4KB page sizes. I wonder if anyone has an official > definition for this value? Not sure. Maybe we might need some alignment which is less than PAGE_SIZE sometimes. Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Tue, Jun 29, 2021 at 10:59:51AM +0800, Yongji Xie wrote: > On Mon, Jun 28, 2021 at 9:02 PM Stefan Hajnoczi wrote: > > > > On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote: > > > +/* ioctls */ > > > + > > > +struct vduse_dev_config { > > > + char name[VDUSE_NAME_MAX]; /* vduse device name */ > > > + __u32 vendor_id; /* virtio vendor id */ > > > + __u32 device_id; /* virtio device id */ > > > + __u64 features; /* device features */ > > > + __u64 bounce_size; /* bounce buffer size for iommu */ > > > + __u16 vq_size_max; /* the max size of virtqueue */ > > > > The VIRTIO specification allows per-virtqueue sizes. A device can have > > two virtqueues, where the first one allows up to 1024 descriptors and > > the second one allows only 128 descriptors, for example. > > > > Good point! But it looks like virtio-vdpa/virtio-pci doesn't support > that now. All virtqueues have the same maximum size. I see struct vpda_config_ops only supports a per-device max vq size: u16 (*get_vq_num_max)(struct vdpa_device *vdev); virtio-pci supports per-virtqueue sizes because the struct virtio_pci_common_cfg->queue_size register is per-queue (controlled by queue_select). I guess this is a question for Jason: will vdpa will keep this limitation? If yes, then VDUSE can stick to it too without running into problems in the future. > > > + __u16 padding; /* padding */ > > > + __u32 vq_num; /* the number of virtqueues */ > > > + __u32 vq_align; /* the allocation alignment of virtqueue's metadata > > > */ > > > > I'm not sure what this is? > > > > This will be used by vring_create_virtqueue() too. If there is no official definition for the meaning of this value then "/* same as vring_create_virtqueue()'s vring_align parameter */" would be clearer. That way the reader knows what to research in order to understand how this field works. I don't remember but maybe it was used to support vrings when the host/guest have non-4KB page sizes. I wonder if anyone has an official definition for this value? signature.asc Description: PGP signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/6/29 上午11:56, Yongji Xie 写道: On Tue, Jun 29, 2021 at 11:29 AM Jason Wang wrote: 在 2021/6/29 上午10:26, Yongji Xie 写道: On Mon, Jun 28, 2021 at 12:40 PM Jason Wang wrote: 在 2021/6/25 下午12:19, Yongji Xie 写道: 2b) for set_status(): simply relay the message to userspace, reply is no needed. Userspace will use a command to update the status when the datapath is stop. The the status could be fetched via get_stats(). 2b looks more spec complaint. Looks good to me. And I think we can use the reply of the message to update the status instead of introducing a new command. Just notice this part in virtio_finalize_features(): virtio_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK); status = dev->config->get_status(dev); if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) { So we no reply doesn't work for FEATURES_OK. So my understanding is: 1) We must not use noreply for set_status() 2) We can use noreply for get_status(), but it requires a new ioctl to update the status. So it looks to me we need synchronize for both get_status() and set_status(). We should not send messages to userspace in the FEATURES_OK case. So the synchronization is not necessary. As discussed previously, there could be a device that mandates some features (VIRTIO_F_RING_PACKED). So it can choose to not accept FEATURES_OK is packed virtqueue is not negotiated. In this case we need to relay the message to userspace. OK, I see. If so, I prefer to only use noreply for set_status(). We do not set the status bit if the message is failed. In this way, we don't need to change lots of virtio core codes to handle the failure of set_status()/get_status(). It should work. Thanks Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Tue, Jun 29, 2021 at 11:29 AM Jason Wang wrote: > > > 在 2021/6/29 上午10:26, Yongji Xie 写道: > > On Mon, Jun 28, 2021 at 12:40 PM Jason Wang wrote: > >> > >> 在 2021/6/25 下午12:19, Yongji Xie 写道: > 2b) for set_status(): simply relay the message to userspace, reply is no > needed. Userspace will use a command to update the status when the > datapath is stop. The the status could be fetched via get_stats(). > > 2b looks more spec complaint. > > >>> Looks good to me. And I think we can use the reply of the message to > >>> update the status instead of introducing a new command. > >>> > >> Just notice this part in virtio_finalize_features(): > >> > >> virtio_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK); > >> status = dev->config->get_status(dev); > >> if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) { > >> > >> So we no reply doesn't work for FEATURES_OK. > >> > >> So my understanding is: > >> > >> 1) We must not use noreply for set_status() > >> 2) We can use noreply for get_status(), but it requires a new ioctl to > >> update the status. > >> > >> So it looks to me we need synchronize for both get_status() and > >> set_status(). > >> > > We should not send messages to userspace in the FEATURES_OK case. So > > the synchronization is not necessary. > > > As discussed previously, there could be a device that mandates some > features (VIRTIO_F_RING_PACKED). So it can choose to not accept > FEATURES_OK is packed virtqueue is not negotiated. > > In this case we need to relay the message to userspace. > OK, I see. If so, I prefer to only use noreply for set_status(). We do not set the status bit if the message is failed. In this way, we don't need to change lots of virtio core codes to handle the failure of set_status()/get_status(). Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/6/29 上午10:26, Yongji Xie 写道: On Mon, Jun 28, 2021 at 12:40 PM Jason Wang wrote: 在 2021/6/25 下午12:19, Yongji Xie 写道: 2b) for set_status(): simply relay the message to userspace, reply is no needed. Userspace will use a command to update the status when the datapath is stop. The the status could be fetched via get_stats(). 2b looks more spec complaint. Looks good to me. And I think we can use the reply of the message to update the status instead of introducing a new command. Just notice this part in virtio_finalize_features(): virtio_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK); status = dev->config->get_status(dev); if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) { So we no reply doesn't work for FEATURES_OK. So my understanding is: 1) We must not use noreply for set_status() 2) We can use noreply for get_status(), but it requires a new ioctl to update the status. So it looks to me we need synchronize for both get_status() and set_status(). We should not send messages to userspace in the FEATURES_OK case. So the synchronization is not necessary. As discussed previously, there could be a device that mandates some features (VIRTIO_F_RING_PACKED). So it can choose to not accept FEATURES_OK is packed virtqueue is not negotiated. In this case we need to relay the message to userspace. Thanks Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Mon, Jun 28, 2021 at 9:02 PM Stefan Hajnoczi wrote: > > On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote: > > diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h > > new file mode 100644 > > index ..f21b2e51b5c8 > > --- /dev/null > > +++ b/include/uapi/linux/vduse.h > > @@ -0,0 +1,143 @@ > > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ > > +#ifndef _UAPI_VDUSE_H_ > > +#define _UAPI_VDUSE_H_ > > + > > +#include > > + > > +#define VDUSE_API_VERSION0 > > + > > +#define VDUSE_NAME_MAX 256 > > + > > +/* the control messages definition for read/write */ > > + > > +enum vduse_req_type { > > + /* Get the state for virtqueue from userspace */ > > + VDUSE_GET_VQ_STATE, > > + /* Notify userspace to start the dataplane, no reply */ > > + VDUSE_START_DATAPLANE, > > + /* Notify userspace to stop the dataplane, no reply */ > > + VDUSE_STOP_DATAPLANE, > > + /* Notify userspace to update the memory mapping in device IOTLB */ > > + VDUSE_UPDATE_IOTLB, > > +}; > > + > > +struct vduse_vq_state { > > + __u32 index; /* virtqueue index */ > > + __u32 avail_idx; /* virtqueue state (last_avail_idx) */ > > +}; > > + > > +struct vduse_iova_range { > > + __u64 start; /* start of the IOVA range */ > > + __u64 last; /* end of the IOVA range */ > > Please clarify whether this describes a closed range [start, last] or an > open range [start, last). > OK. > > +}; > > + > > +struct vduse_dev_request { > > + __u32 type; /* request type */ > > + __u32 request_id; /* request id */ > > +#define VDUSE_REQ_FLAGS_NO_REPLY (1 << 0) /* No need to reply */ > > + __u32 flags; /* request flags */ > > + __u32 reserved; /* for future use */ > > + union { > > + struct vduse_vq_state vq_state; /* virtqueue state */ > > + struct vduse_iova_range iova; /* iova range for updating */ > > + __u32 padding[16]; /* padding */ > > + }; > > +}; > > + > > +struct vduse_dev_response { > > + __u32 request_id; /* corresponding request id */ > > +#define VDUSE_REQ_RESULT_OK 0x00 > > +#define VDUSE_REQ_RESULT_FAILED 0x01 > > + __u32 result; /* the result of request */ > > + __u32 reserved[2]; /* for future use */ > > + union { > > + struct vduse_vq_state vq_state; /* virtqueue state */ > > + __u32 padding[16]; /* padding */ > > + }; > > +}; > > + > > +/* ioctls */ > > + > > +struct vduse_dev_config { > > + char name[VDUSE_NAME_MAX]; /* vduse device name */ > > + __u32 vendor_id; /* virtio vendor id */ > > + __u32 device_id; /* virtio device id */ > > + __u64 features; /* device features */ > > + __u64 bounce_size; /* bounce buffer size for iommu */ > > + __u16 vq_size_max; /* the max size of virtqueue */ > > The VIRTIO specification allows per-virtqueue sizes. A device can have > two virtqueues, where the first one allows up to 1024 descriptors and > the second one allows only 128 descriptors, for example. > Good point! But it looks like virtio-vdpa/virtio-pci doesn't support that now. All virtqueues have the same maximum size. > This constant seems to impose the constraint that all virtqueues have > the same maximum size. Is this really necessary? > This will be used by vring_create_virtqueue(). We need to specify the maximum queue size supported by the device. > > + __u16 padding; /* padding */ > > + __u32 vq_num; /* the number of virtqueues */ > > + __u32 vq_align; /* the allocation alignment of virtqueue's metadata */ > > I'm not sure what this is? > This will be used by vring_create_virtqueue() too. > > + __u32 config_size; /* the size of the configuration space */ > > + __u32 reserved[15]; /* for future use */ > > + __u8 config[0]; /* the buffer of the configuration space */ > > +}; > > + > > +struct vduse_iotlb_entry { > > + __u64 offset; /* the mmap offset on fd */ > > + __u64 start; /* start of the IOVA range */ > > + __u64 last; /* last of the IOVA range */ > > Same here, please specify whether this is an open range or a closed > range. > Sure. > > +#define VDUSE_ACCESS_RO 0x1 > > +#define VDUSE_ACCESS_WO 0x2 > > +#define VDUSE_ACCESS_RW 0x3 > > + __u8 perm; /* access permission of this range */ > > +}; > > + > > +struct vduse_config_update { > > + __u32 offset; /* offset from the beginning of configuration space */ > > + __u32 length; /* the length to write to configuration space */ > > + __u8 buffer[0]; /* buffer used to write from */ > > +}; > > + > > +struct vduse_vq_info { > > + __u32 index; /* virtqueue index */ > > + __u32 avail_idx; /* virtqueue state (last_avail_idx) */ > > + __u64 desc_addr; /* address of desc area */ > > + __u64 driver_addr; /* address of driver area */ > > + __u64 device_addr; /* address of device area */ > > + __u32 num; /* the size of virtqueue */ > > + __u8 ready; /* ready status of
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Mon, Jun 28, 2021 at 12:40 PM Jason Wang wrote: > > > 在 2021/6/25 下午12:19, Yongji Xie 写道: > >> 2b) for set_status(): simply relay the message to userspace, reply is no > >> needed. Userspace will use a command to update the status when the > >> datapath is stop. The the status could be fetched via get_stats(). > >> > >> 2b looks more spec complaint. > >> > > Looks good to me. And I think we can use the reply of the message to > > update the status instead of introducing a new command. > > > > Just notice this part in virtio_finalize_features(): > > virtio_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK); > status = dev->config->get_status(dev); > if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) { > > So we no reply doesn't work for FEATURES_OK. > > So my understanding is: > > 1) We must not use noreply for set_status() > 2) We can use noreply for get_status(), but it requires a new ioctl to > update the status. > > So it looks to me we need synchronize for both get_status() and > set_status(). > We should not send messages to userspace in the FEATURES_OK case. So the synchronization is not necessary. Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Tue, Jun 15, 2021 at 10:13:30PM +0800, Xie Yongji wrote: > diff --git a/include/uapi/linux/vduse.h b/include/uapi/linux/vduse.h > new file mode 100644 > index ..f21b2e51b5c8 > --- /dev/null > +++ b/include/uapi/linux/vduse.h > @@ -0,0 +1,143 @@ > +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ > +#ifndef _UAPI_VDUSE_H_ > +#define _UAPI_VDUSE_H_ > + > +#include > + > +#define VDUSE_API_VERSION0 > + > +#define VDUSE_NAME_MAX 256 > + > +/* the control messages definition for read/write */ > + > +enum vduse_req_type { > + /* Get the state for virtqueue from userspace */ > + VDUSE_GET_VQ_STATE, > + /* Notify userspace to start the dataplane, no reply */ > + VDUSE_START_DATAPLANE, > + /* Notify userspace to stop the dataplane, no reply */ > + VDUSE_STOP_DATAPLANE, > + /* Notify userspace to update the memory mapping in device IOTLB */ > + VDUSE_UPDATE_IOTLB, > +}; > + > +struct vduse_vq_state { > + __u32 index; /* virtqueue index */ > + __u32 avail_idx; /* virtqueue state (last_avail_idx) */ > +}; > + > +struct vduse_iova_range { > + __u64 start; /* start of the IOVA range */ > + __u64 last; /* end of the IOVA range */ Please clarify whether this describes a closed range [start, last] or an open range [start, last). > +}; > + > +struct vduse_dev_request { > + __u32 type; /* request type */ > + __u32 request_id; /* request id */ > +#define VDUSE_REQ_FLAGS_NO_REPLY (1 << 0) /* No need to reply */ > + __u32 flags; /* request flags */ > + __u32 reserved; /* for future use */ > + union { > + struct vduse_vq_state vq_state; /* virtqueue state */ > + struct vduse_iova_range iova; /* iova range for updating */ > + __u32 padding[16]; /* padding */ > + }; > +}; > + > +struct vduse_dev_response { > + __u32 request_id; /* corresponding request id */ > +#define VDUSE_REQ_RESULT_OK 0x00 > +#define VDUSE_REQ_RESULT_FAILED 0x01 > + __u32 result; /* the result of request */ > + __u32 reserved[2]; /* for future use */ > + union { > + struct vduse_vq_state vq_state; /* virtqueue state */ > + __u32 padding[16]; /* padding */ > + }; > +}; > + > +/* ioctls */ > + > +struct vduse_dev_config { > + char name[VDUSE_NAME_MAX]; /* vduse device name */ > + __u32 vendor_id; /* virtio vendor id */ > + __u32 device_id; /* virtio device id */ > + __u64 features; /* device features */ > + __u64 bounce_size; /* bounce buffer size for iommu */ > + __u16 vq_size_max; /* the max size of virtqueue */ The VIRTIO specification allows per-virtqueue sizes. A device can have two virtqueues, where the first one allows up to 1024 descriptors and the second one allows only 128 descriptors, for example. This constant seems to impose the constraint that all virtqueues have the same maximum size. Is this really necessary? > + __u16 padding; /* padding */ > + __u32 vq_num; /* the number of virtqueues */ > + __u32 vq_align; /* the allocation alignment of virtqueue's metadata */ I'm not sure what this is? > + __u32 config_size; /* the size of the configuration space */ > + __u32 reserved[15]; /* for future use */ > + __u8 config[0]; /* the buffer of the configuration space */ > +}; > + > +struct vduse_iotlb_entry { > + __u64 offset; /* the mmap offset on fd */ > + __u64 start; /* start of the IOVA range */ > + __u64 last; /* last of the IOVA range */ Same here, please specify whether this is an open range or a closed range. > +#define VDUSE_ACCESS_RO 0x1 > +#define VDUSE_ACCESS_WO 0x2 > +#define VDUSE_ACCESS_RW 0x3 > + __u8 perm; /* access permission of this range */ > +}; > + > +struct vduse_config_update { > + __u32 offset; /* offset from the beginning of configuration space */ > + __u32 length; /* the length to write to configuration space */ > + __u8 buffer[0]; /* buffer used to write from */ > +}; > + > +struct vduse_vq_info { > + __u32 index; /* virtqueue index */ > + __u32 avail_idx; /* virtqueue state (last_avail_idx) */ > + __u64 desc_addr; /* address of desc area */ > + __u64 driver_addr; /* address of driver area */ > + __u64 device_addr; /* address of device area */ > + __u32 num; /* the size of virtqueue */ > + __u8 ready; /* ready status of virtqueue */ > +}; > + > +struct vduse_vq_eventfd { > + __u32 index; /* virtqueue index */ > +#define VDUSE_EVENTFD_DEASSIGN -1 > + int fd; /* eventfd, -1 means de-assigning the eventfd */ > +}; > + > +#define VDUSE_BASE 0x81 > + > +/* Get the version of VDUSE API. This is used for future extension */ > +#define VDUSE_GET_API_VERSION_IOR(VDUSE_BASE, 0x00, __u64) > + > +/* Set the version of VDUSE API. */ > +#define VDUSE_SET_API_VERSION_IOW(VDUSE_BASE, 0x01, __u64) > + > +/* Create a vduse device which is represented by a char device > (/dev/vduse/) */ > +#define
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/6/25 下午12:19, Yongji Xie 写道: 2b) for set_status(): simply relay the message to userspace, reply is no needed. Userspace will use a command to update the status when the datapath is stop. The the status could be fetched via get_stats(). 2b looks more spec complaint. Looks good to me. And I think we can use the reply of the message to update the status instead of introducing a new command. Just notice this part in virtio_finalize_features(): virtio_add_status(dev, VIRTIO_CONFIG_S_FEATURES_OK); status = dev->config->get_status(dev); if (!(status & VIRTIO_CONFIG_S_FEATURES_OK)) { So we no reply doesn't work for FEATURES_OK. So my understanding is: 1) We must not use noreply for set_status() 2) We can use noreply for get_status(), but it requires a new ioctl to update the status. So it looks to me we need synchronize for both get_status() and set_status(). Thanks ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Fri, Jun 25, 2021 at 11:09 AM Jason Wang wrote: > > > 在 2021/6/24 下午5:16, Yongji Xie 写道: > > On Thu, Jun 24, 2021 at 4:14 PM Jason Wang wrote: > >> > >> 在 2021/6/24 下午12:46, Yongji Xie 写道: > So we need to deal with both FEATURES_OK and reset, but probably not > DRIVER_OK. > > >>> OK, I see. Thanks for the explanation. One more question is how about > >>> clearing the corresponding status bit in get_status() rather than > >>> making set_status() fail. Since the spec recommends this way for > >>> validation which is done in virtio_dev_remove() and > >>> virtio_finalize_features(). > >>> > >>> Thanks, > >>> Yongji > >>> > >> I think you can. Or it would be even better that we just don't set the > >> bit during set_status(). > >> > > Yes, that's what I mean. > > > >> I just realize that in vdpa_reset() we had: > >> > >> static inline void vdpa_reset(struct vdpa_device *vdev) > >> { > >> const struct vdpa_config_ops *ops = vdev->config; > >> > >> vdev->features_valid = false; > >> ops->set_status(vdev, 0); > >> } > >> > >> We probably need to add the synchronization here. E.g re-read with a > >> timeout. > >> > > Looks like the timeout is already in set_status(). > > > Do you mean the VDUSE's implementation? > Yes. > > > Do we really need a > > duplicated one here? > > > 1) this is the timeout at the vDPA layer instead of the VDUSE layer. OK, I get it. > 2) it really depends on what's the meaning of the timeout for set_status > of VDUSE. > > Do we want: > > 2a) for set_status(): relay the message to userspace and wait for the > userspace to quiescence the datapath > > or > > 2b) for set_status(): simply relay the message to userspace, reply is no > needed. Userspace will use a command to update the status when the > datapath is stop. The the status could be fetched via get_stats(). > > 2b looks more spec complaint. > Looks good to me. And I think we can use the reply of the message to update the status instead of introducing a new command. > > And how to handle failure? Adding a return value > > to virtio_config_ops->reset() and passing the error to the upper > > layer? > > > Something like this. > OK. Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/6/24 下午5:16, Yongji Xie 写道: On Thu, Jun 24, 2021 at 4:14 PM Jason Wang wrote: 在 2021/6/24 下午12:46, Yongji Xie 写道: So we need to deal with both FEATURES_OK and reset, but probably not DRIVER_OK. OK, I see. Thanks for the explanation. One more question is how about clearing the corresponding status bit in get_status() rather than making set_status() fail. Since the spec recommends this way for validation which is done in virtio_dev_remove() and virtio_finalize_features(). Thanks, Yongji I think you can. Or it would be even better that we just don't set the bit during set_status(). Yes, that's what I mean. I just realize that in vdpa_reset() we had: static inline void vdpa_reset(struct vdpa_device *vdev) { const struct vdpa_config_ops *ops = vdev->config; vdev->features_valid = false; ops->set_status(vdev, 0); } We probably need to add the synchronization here. E.g re-read with a timeout. Looks like the timeout is already in set_status(). Do you mean the VDUSE's implementation? Do we really need a duplicated one here? 1) this is the timeout at the vDPA layer instead of the VDUSE layer. 2) it really depends on what's the meaning of the timeout for set_status of VDUSE. Do we want: 2a) for set_status(): relay the message to userspace and wait for the userspace to quiescence the datapath or 2b) for set_status(): simply relay the message to userspace, reply is no needed. Userspace will use a command to update the status when the datapath is stop. The the status could be fetched via get_stats(). 2b looks more spec complaint. And how to handle failure? Adding a return value to virtio_config_ops->reset() and passing the error to the upper layer? Something like this. Thanks Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Thu, Jun 24, 2021 at 4:14 PM Jason Wang wrote: > > > 在 2021/6/24 下午12:46, Yongji Xie 写道: > >> So we need to deal with both FEATURES_OK and reset, but probably not > >> DRIVER_OK. > >> > > OK, I see. Thanks for the explanation. One more question is how about > > clearing the corresponding status bit in get_status() rather than > > making set_status() fail. Since the spec recommends this way for > > validation which is done in virtio_dev_remove() and > > virtio_finalize_features(). > > > > Thanks, > > Yongji > > > > I think you can. Or it would be even better that we just don't set the > bit during set_status(). > Yes, that's what I mean. > I just realize that in vdpa_reset() we had: > > static inline void vdpa_reset(struct vdpa_device *vdev) > { > const struct vdpa_config_ops *ops = vdev->config; > > vdev->features_valid = false; > ops->set_status(vdev, 0); > } > > We probably need to add the synchronization here. E.g re-read with a > timeout. > Looks like the timeout is already in set_status(). Do we really need a duplicated one here? And how to handle failure? Adding a return value to virtio_config_ops->reset() and passing the error to the upper layer? Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/6/24 下午12:46, Yongji Xie 写道: So we need to deal with both FEATURES_OK and reset, but probably not DRIVER_OK. OK, I see. Thanks for the explanation. One more question is how about clearing the corresponding status bit in get_status() rather than making set_status() fail. Since the spec recommends this way for validation which is done in virtio_dev_remove() and virtio_finalize_features(). Thanks, Yongji I think you can. Or it would be even better that we just don't set the bit during set_status(). I just realize that in vdpa_reset() we had: static inline void vdpa_reset(struct vdpa_device *vdev) { const struct vdpa_config_ops *ops = vdev->config; vdev->features_valid = false; ops->set_status(vdev, 0); } We probably need to add the synchronization here. E.g re-read with a timeout. Thanks ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Thu, Jun 24, 2021 at 11:35 AM Jason Wang wrote: > > > 在 2021/6/23 下午1:50, Yongji Xie 写道: > > On Wed, Jun 23, 2021 at 11:31 AM Jason Wang wrote: > >> > >> 在 2021/6/22 下午4:14, Yongji Xie 写道: > >>> On Tue, Jun 22, 2021 at 3:50 PM Jason Wang wrote: > 在 2021/6/22 下午3:22, Yongji Xie 写道: > >> We need fix a way to propagate the error to the userspace. > >> > >> E.g if we want to stop the deivce, we will delay the status reset until > >> we get respose from the userspace? > >> > > I didn't get how to delay the status reset. And should it be a DoS > > that we want to fix if the userspace doesn't give a response forever? > You're right. So let's make set_status() can fail first, then propagate > its failure via VHOST_VDPA_SET_STATUS. > > >>> OK. So we only need to propagate the failure in the vhost-vdpa case, > >>> right? > >> > >> I think not, we need to deal with the reset for virtio as well: > >> > >> E.g in register_virtio_devices(), we have: > >> > >> /* We always start by resetting the device, in case a previous > >>* driver messed it up. This also tests that code path a > >> little. */ > >> dev->config->reset(dev); > >> > >> We probably need to make reset can fail and then fail the > >> register_virtio_device() as well. > >> > > OK, looks like virtio_add_status() and virtio_device_ready()[1] should > > be also modified if we need to propagate the failure in the > > virtio-vdpa case. Or do we only need to care about the reset case? > > > > [1] > > https://lore.kernel.org/lkml/20210517093428.670-1-xieyon...@bytedance.com/ > > > My understanding is DRIVER_OK is not something that needs to be validated: > > " > > DRIVER_OK (4) > Indicates that the driver is set up and ready to drive the device. > > " > > Since the spec doesn't require to re-read the and check if DRIVER_OK is > set in 3.1.1 Driver Requirements: Device Initialization. > > It's more about "telling the device that driver is ready." > > But we don have some status bit that requires the synchronization with > the device. > > 1) FEATURES_OK, spec requires to re-read the status bit to check whether > or it it was set by the device: > > " > > Re-read device status to ensure the FEATURES_OK bit is still set: > otherwise, the device does not support our subset of features and the > device is unusable. > > " > > This is useful for some device which can only support a subset of the > features. E.g a device that can only work for packed virtqueue. This > means the current design of set_features won't work, we need either: > > 1a) relay the set_features request to userspace > > or > > 1b) introduce a mandated_device_features during device creation and > validate the driver features during the set_features(), and don't set > FEATURES_OK if they don't match. > > > 2) Some transports (PCI) requires to re-read the status to ensure the > synchronization. > > " > > After writing 0 to device_status, the driver MUST wait for a read of > device_status to return 0 before reinitializing the device. > > " > > So we need to deal with both FEATURES_OK and reset, but probably not > DRIVER_OK. > OK, I see. Thanks for the explanation. One more question is how about clearing the corresponding status bit in get_status() rather than making set_status() fail. Since the spec recommends this way for validation which is done in virtio_dev_remove() and virtio_finalize_features(). Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/6/23 下午1:50, Yongji Xie 写道: On Wed, Jun 23, 2021 at 11:31 AM Jason Wang wrote: 在 2021/6/22 下午4:14, Yongji Xie 写道: On Tue, Jun 22, 2021 at 3:50 PM Jason Wang wrote: 在 2021/6/22 下午3:22, Yongji Xie 写道: We need fix a way to propagate the error to the userspace. E.g if we want to stop the deivce, we will delay the status reset until we get respose from the userspace? I didn't get how to delay the status reset. And should it be a DoS that we want to fix if the userspace doesn't give a response forever? You're right. So let's make set_status() can fail first, then propagate its failure via VHOST_VDPA_SET_STATUS. OK. So we only need to propagate the failure in the vhost-vdpa case, right? I think not, we need to deal with the reset for virtio as well: E.g in register_virtio_devices(), we have: /* We always start by resetting the device, in case a previous * driver messed it up. This also tests that code path a little. */ dev->config->reset(dev); We probably need to make reset can fail and then fail the register_virtio_device() as well. OK, looks like virtio_add_status() and virtio_device_ready()[1] should be also modified if we need to propagate the failure in the virtio-vdpa case. Or do we only need to care about the reset case? [1] https://lore.kernel.org/lkml/20210517093428.670-1-xieyon...@bytedance.com/ My understanding is DRIVER_OK is not something that needs to be validated: " DRIVER_OK (4) Indicates that the driver is set up and ready to drive the device. " Since the spec doesn't require to re-read the and check if DRIVER_OK is set in 3.1.1 Driver Requirements: Device Initialization. It's more about "telling the device that driver is ready." But we don have some status bit that requires the synchronization with the device. 1) FEATURES_OK, spec requires to re-read the status bit to check whether or it it was set by the device: " Re-read device status to ensure the FEATURES_OK bit is still set: otherwise, the device does not support our subset of features and the device is unusable. " This is useful for some device which can only support a subset of the features. E.g a device that can only work for packed virtqueue. This means the current design of set_features won't work, we need either: 1a) relay the set_features request to userspace or 1b) introduce a mandated_device_features during device creation and validate the driver features during the set_features(), and don't set FEATURES_OK if they don't match. 2) Some transports (PCI) requires to re-read the status to ensure the synchronization. " After writing 0 to device_status, the driver MUST wait for a read of device_status to return 0 before reinitializing the device. " So we need to deal with both FEATURES_OK and reset, but probably not DRIVER_OK. Thanks Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Wed, Jun 23, 2021 at 11:31 AM Jason Wang wrote: > > > 在 2021/6/22 下午4:14, Yongji Xie 写道: > > On Tue, Jun 22, 2021 at 3:50 PM Jason Wang wrote: > >> > >> 在 2021/6/22 下午3:22, Yongji Xie 写道: > We need fix a way to propagate the error to the userspace. > > E.g if we want to stop the deivce, we will delay the status reset until > we get respose from the userspace? > > >>> I didn't get how to delay the status reset. And should it be a DoS > >>> that we want to fix if the userspace doesn't give a response forever? > >> > >> You're right. So let's make set_status() can fail first, then propagate > >> its failure via VHOST_VDPA_SET_STATUS. > >> > > OK. So we only need to propagate the failure in the vhost-vdpa case, right? > > > I think not, we need to deal with the reset for virtio as well: > > E.g in register_virtio_devices(), we have: > > /* We always start by resetting the device, in case a previous > * driver messed it up. This also tests that code path a > little. */ >dev->config->reset(dev); > > We probably need to make reset can fail and then fail the > register_virtio_device() as well. > OK, looks like virtio_add_status() and virtio_device_ready()[1] should be also modified if we need to propagate the failure in the virtio-vdpa case. Or do we only need to care about the reset case? [1] https://lore.kernel.org/lkml/20210517093428.670-1-xieyon...@bytedance.com/ Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/6/22 下午4:14, Yongji Xie 写道: On Tue, Jun 22, 2021 at 3:50 PM Jason Wang wrote: 在 2021/6/22 下午3:22, Yongji Xie 写道: We need fix a way to propagate the error to the userspace. E.g if we want to stop the deivce, we will delay the status reset until we get respose from the userspace? I didn't get how to delay the status reset. And should it be a DoS that we want to fix if the userspace doesn't give a response forever? You're right. So let's make set_status() can fail first, then propagate its failure via VHOST_VDPA_SET_STATUS. OK. So we only need to propagate the failure in the vhost-vdpa case, right? I think not, we need to deal with the reset for virtio as well: E.g in register_virtio_devices(), we have: /* We always start by resetting the device, in case a previous * driver messed it up. This also tests that code path a little. */ dev->config->reset(dev); We probably need to make reset can fail and then fail the register_virtio_device() as well. Thanks Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Tue, Jun 22, 2021 at 3:50 PM Jason Wang wrote: > > > 在 2021/6/22 下午3:22, Yongji Xie 写道: > >> We need fix a way to propagate the error to the userspace. > >> > >> E.g if we want to stop the deivce, we will delay the status reset until > >> we get respose from the userspace? > >> > > I didn't get how to delay the status reset. And should it be a DoS > > that we want to fix if the userspace doesn't give a response forever? > > > You're right. So let's make set_status() can fail first, then propagate > its failure via VHOST_VDPA_SET_STATUS. > OK. So we only need to propagate the failure in the vhost-vdpa case, right? Thanks, Yongji ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/6/22 下午3:22, Yongji Xie 写道: We need fix a way to propagate the error to the userspace. E.g if we want to stop the deivce, we will delay the status reset until we get respose from the userspace? I didn't get how to delay the status reset. And should it be a DoS that we want to fix if the userspace doesn't give a response forever? You're right. So let's make set_status() can fail first, then propagate its failure via VHOST_VDPA_SET_STATUS. + } +} + +static size_t vduse_vdpa_get_config_size(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->config_size; +} + +static void vduse_vdpa_get_config(struct vdpa_device *vdpa, unsigned int offset, + void *buf, unsigned int len) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + memcpy(buf, dev->config + offset, len); +} + +static void vduse_vdpa_set_config(struct vdpa_device *vdpa, unsigned int offset, + const void *buf, unsigned int len) +{ + /* Now we only support read-only configuration space */ +} + +static u32 vduse_vdpa_get_generation(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + return dev->generation; +} + +static int vduse_vdpa_set_map(struct vdpa_device *vdpa, + struct vhost_iotlb *iotlb) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + int ret; + + ret = vduse_domain_set_map(dev->domain, iotlb); + if (ret) + return ret; + + ret = vduse_dev_update_iotlb(dev, 0ULL, ULLONG_MAX); + if (ret) { + vduse_domain_clear_map(dev->domain, iotlb); + return ret; + } + + return 0; +} + +static void vduse_vdpa_free(struct vdpa_device *vdpa) +{ + struct vduse_dev *dev = vdpa_to_vduse(vdpa); + + dev->vdev = NULL; +} + +static const struct vdpa_config_ops vduse_vdpa_config_ops = { + .set_vq_address = vduse_vdpa_set_vq_address, + .kick_vq= vduse_vdpa_kick_vq, + .set_vq_cb = vduse_vdpa_set_vq_cb, + .set_vq_num = vduse_vdpa_set_vq_num, + .set_vq_ready = vduse_vdpa_set_vq_ready, + .get_vq_ready = vduse_vdpa_get_vq_ready, + .set_vq_state = vduse_vdpa_set_vq_state, + .get_vq_state = vduse_vdpa_get_vq_state, + .get_vq_align = vduse_vdpa_get_vq_align, + .get_features = vduse_vdpa_get_features, + .set_features = vduse_vdpa_set_features, + .set_config_cb = vduse_vdpa_set_config_cb, + .get_vq_num_max = vduse_vdpa_get_vq_num_max, + .get_device_id = vduse_vdpa_get_device_id, + .get_vendor_id = vduse_vdpa_get_vendor_id, + .get_status = vduse_vdpa_get_status, + .set_status = vduse_vdpa_set_status, + .get_config_size= vduse_vdpa_get_config_size, + .get_config = vduse_vdpa_get_config, + .set_config = vduse_vdpa_set_config, + .get_generation = vduse_vdpa_get_generation, + .set_map= vduse_vdpa_set_map, + .free = vduse_vdpa_free, +}; + +static dma_addr_t vduse_dev_map_page(struct device *dev, struct page *page, + unsigned long offset, size_t size, + enum dma_data_direction dir, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + + return vduse_domain_map_page(domain, page, offset, size, dir, attrs); +} + +static void vduse_dev_unmap_page(struct device *dev, dma_addr_t dma_addr, + size_t size, enum dma_data_direction dir, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + + return vduse_domain_unmap_page(domain, dma_addr, size, dir, attrs); +} + +static void *vduse_dev_alloc_coherent(struct device *dev, size_t size, + dma_addr_t *dma_addr, gfp_t flag, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain; + unsigned long iova; + void *addr; + + *dma_addr = DMA_MAPPING_ERROR; + addr = vduse_domain_alloc_coherent(domain, size, + (dma_addr_t *), flag, attrs); + if (!addr) + return NULL; + + *dma_addr = (dma_addr_t)iova; + + return addr; +} + +static void vduse_dev_free_coherent(struct device *dev, size_t size, + void *vaddr, dma_addr_t dma_addr, + unsigned long attrs) +{ + struct vduse_dev *vdev = dev_to_vduse(dev); + struct vduse_iova_domain *domain = vdev->domain;
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
.On Tue, Jun 22, 2021 at 1:07 PM Jason Wang wrote: > > > 在 2021/6/21 下午6:41, Yongji Xie 写道: > > On Mon, Jun 21, 2021 at 5:14 PM Jason Wang wrote: > >> > >> 在 2021/6/15 下午10:13, Xie Yongji 写道: > >>> This VDUSE driver enables implementing vDPA devices in userspace. > >>> The vDPA device's control path is handled in kernel and the data > >>> path is handled in userspace. > >>> > >>> A message mechnism is used by VDUSE driver to forward some control > >>> messages such as starting/stopping datapath to userspace. Userspace > >>> can use read()/write() to receive/reply those control messages. > >>> > >>> And some ioctls are introduced to help userspace to implement the > >>> data path. VDUSE_IOTLB_GET_FD ioctl can be used to get the file > >>> descriptors referring to vDPA device's iova regions. Then userspace > >>> can use mmap() to access those iova regions. VDUSE_DEV_GET_FEATURES > >>> and VDUSE_VQ_GET_INFO ioctls are used to get the negotiated features > >>> and metadata of virtqueues. VDUSE_INJECT_VQ_IRQ and VDUSE_VQ_SETUP_KICKFD > >>> ioctls can be used to inject interrupt and setup the kickfd for > >>> virtqueues. VDUSE_DEV_UPDATE_CONFIG ioctl is used to update the > >>> configuration space and inject a config interrupt. > >>> > >>> Signed-off-by: Xie Yongji > >>> --- > >>>Documentation/userspace-api/ioctl/ioctl-number.rst |1 + > >>>drivers/vdpa/Kconfig | 10 + > >>>drivers/vdpa/Makefile |1 + > >>>drivers/vdpa/vdpa_user/Makefile|5 + > >>>drivers/vdpa/vdpa_user/vduse_dev.c | 1453 > >>> > >>>include/uapi/linux/vduse.h | 143 ++ > >>>6 files changed, 1613 insertions(+) > >>>create mode 100644 drivers/vdpa/vdpa_user/Makefile > >>>create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c > >>>create mode 100644 include/uapi/linux/vduse.h > >>> > >>> diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst > >>> b/Documentation/userspace-api/ioctl/ioctl-number.rst > >>> index 9bfc2b510c64..acd95e9dcfe7 100644 > >>> --- a/Documentation/userspace-api/ioctl/ioctl-number.rst > >>> +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst > >>> @@ -300,6 +300,7 @@ Code Seq#Include File > >>>Comments > >>>'z' 10-4F drivers/s390/crypto/zcrypt_api.h > >>> conflict! > >>>'|' 00-7F linux/media.h > >>>0x80 00-1F linux/fb.h > >>> +0x81 00-1F linux/vduse.h > >>>0x89 00-06 arch/x86/include/asm/sockios.h > >>>0x89 0B-DF linux/sockios.h > >>>0x89 E0-EF linux/sockios.h > >>> SIOCPROTOPRIVATE range > >>> diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig > >>> index a503c1b2bfd9..6e23bce6433a 100644 > >>> --- a/drivers/vdpa/Kconfig > >>> +++ b/drivers/vdpa/Kconfig > >>> @@ -33,6 +33,16 @@ config VDPA_SIM_BLOCK > >>> vDPA block device simulator which terminates IO request in a > >>> memory buffer. > >>> > >>> +config VDPA_USER > >>> + tristate "VDUSE (vDPA Device in Userspace) support" > >>> + depends on EVENTFD && MMU && HAS_DMA > >>> + select DMA_OPS > >>> + select VHOST_IOTLB > >>> + select IOMMU_IOVA > >>> + help > >>> + With VDUSE it is possible to emulate a vDPA Device > >>> + in a userspace program. > >>> + > >>>config IFCVF > >>>tristate "Intel IFC VF vDPA driver" > >>>depends on PCI_MSI > >>> diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile > >>> index 67fe7f3d6943..f02ebed33f19 100644 > >>> --- a/drivers/vdpa/Makefile > >>> +++ b/drivers/vdpa/Makefile > >>> @@ -1,6 +1,7 @@ > >>># SPDX-License-Identifier: GPL-2.0 > >>>obj-$(CONFIG_VDPA) += vdpa.o > >>>obj-$(CONFIG_VDPA_SIM) += vdpa_sim/ > >>> +obj-$(CONFIG_VDPA_USER) += vdpa_user/ > >>>obj-$(CONFIG_IFCVF)+= ifcvf/ > >>>obj-$(CONFIG_MLX5_VDPA) += mlx5/ > >>>obj-$(CONFIG_VP_VDPA)+= virtio_pci/ > >>> diff --git a/drivers/vdpa/vdpa_user/Makefile > >>> b/drivers/vdpa/vdpa_user/Makefile > >>> new file mode 100644 > >>> index ..260e0b26af99 > >>> --- /dev/null > >>> +++ b/drivers/vdpa/vdpa_user/Makefile > >>> @@ -0,0 +1,5 @@ > >>> +# SPDX-License-Identifier: GPL-2.0 > >>> + > >>> +vduse-y := vduse_dev.o iova_domain.o > >>> + > >>> +obj-$(CONFIG_VDPA_USER) += vduse.o > >>> diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c > >>> b/drivers/vdpa/vdpa_user/vduse_dev.c > >>> new file mode 100644 > >>> index ..5271cbd15e28 > >>> --- /dev/null > >>> +++ b/drivers/vdpa/vdpa_user/vduse_dev.c > >>> @@ -0,0 +1,1453 @@ > >>> +// SPDX-License-Identifier: GPL-2.0-only > >>> +/* > >>> + * VDUSE: vDPA Device in Userspace > >>> + * > >>> + * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All > >>> rights reserved. > >>> + * > >>> + * Author: Xie Yongji > >>> + * > >>> + */ > >>> + > >>>
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/6/21 下午6:41, Yongji Xie 写道: On Mon, Jun 21, 2021 at 5:14 PM Jason Wang wrote: 在 2021/6/15 下午10:13, Xie Yongji 写道: This VDUSE driver enables implementing vDPA devices in userspace. The vDPA device's control path is handled in kernel and the data path is handled in userspace. A message mechnism is used by VDUSE driver to forward some control messages such as starting/stopping datapath to userspace. Userspace can use read()/write() to receive/reply those control messages. And some ioctls are introduced to help userspace to implement the data path. VDUSE_IOTLB_GET_FD ioctl can be used to get the file descriptors referring to vDPA device's iova regions. Then userspace can use mmap() to access those iova regions. VDUSE_DEV_GET_FEATURES and VDUSE_VQ_GET_INFO ioctls are used to get the negotiated features and metadata of virtqueues. VDUSE_INJECT_VQ_IRQ and VDUSE_VQ_SETUP_KICKFD ioctls can be used to inject interrupt and setup the kickfd for virtqueues. VDUSE_DEV_UPDATE_CONFIG ioctl is used to update the configuration space and inject a config interrupt. Signed-off-by: Xie Yongji --- Documentation/userspace-api/ioctl/ioctl-number.rst |1 + drivers/vdpa/Kconfig | 10 + drivers/vdpa/Makefile |1 + drivers/vdpa/vdpa_user/Makefile|5 + drivers/vdpa/vdpa_user/vduse_dev.c | 1453 include/uapi/linux/vduse.h | 143 ++ 6 files changed, 1613 insertions(+) create mode 100644 drivers/vdpa/vdpa_user/Makefile create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c create mode 100644 include/uapi/linux/vduse.h diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst index 9bfc2b510c64..acd95e9dcfe7 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -300,6 +300,7 @@ Code Seq#Include File Comments 'z' 10-4F drivers/s390/crypto/zcrypt_api.h conflict! '|' 00-7F linux/media.h 0x80 00-1F linux/fb.h +0x81 00-1F linux/vduse.h 0x89 00-06 arch/x86/include/asm/sockios.h 0x89 0B-DF linux/sockios.h 0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig index a503c1b2bfd9..6e23bce6433a 100644 --- a/drivers/vdpa/Kconfig +++ b/drivers/vdpa/Kconfig @@ -33,6 +33,16 @@ config VDPA_SIM_BLOCK vDPA block device simulator which terminates IO request in a memory buffer. +config VDPA_USER + tristate "VDUSE (vDPA Device in Userspace) support" + depends on EVENTFD && MMU && HAS_DMA + select DMA_OPS + select VHOST_IOTLB + select IOMMU_IOVA + help + With VDUSE it is possible to emulate a vDPA Device + in a userspace program. + config IFCVF tristate "Intel IFC VF vDPA driver" depends on PCI_MSI diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile index 67fe7f3d6943..f02ebed33f19 100644 --- a/drivers/vdpa/Makefile +++ b/drivers/vdpa/Makefile @@ -1,6 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_VDPA) += vdpa.o obj-$(CONFIG_VDPA_SIM) += vdpa_sim/ +obj-$(CONFIG_VDPA_USER) += vdpa_user/ obj-$(CONFIG_IFCVF)+= ifcvf/ obj-$(CONFIG_MLX5_VDPA) += mlx5/ obj-$(CONFIG_VP_VDPA)+= virtio_pci/ diff --git a/drivers/vdpa/vdpa_user/Makefile b/drivers/vdpa/vdpa_user/Makefile new file mode 100644 index ..260e0b26af99 --- /dev/null +++ b/drivers/vdpa/vdpa_user/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 + +vduse-y := vduse_dev.o iova_domain.o + +obj-$(CONFIG_VDPA_USER) += vduse.o diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c new file mode 100644 index ..5271cbd15e28 --- /dev/null +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -0,0 +1,1453 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * VDUSE: vDPA Device in Userspace + * + * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All rights reserved. + * + * Author: Xie Yongji + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "iova_domain.h" + +#define DRV_AUTHOR "Yongji Xie " +#define DRV_DESC "vDPA Device in Userspace" +#define DRV_LICENSE "GPL v2" + +#define VDUSE_DEV_MAX (1U << MINORBITS) +#define VDUSE_MAX_BOUNCE_SIZE (64 * 1024 * 1024) +#define VDUSE_IOVA_SIZE (128 * 1024 * 1024) +#define VDUSE_REQUEST_TIMEOUT 30 + +struct vduse_virtqueue { + u16 index; + u32 num; + u32 avail_idx; + u64 desc_addr; + u64 driver_addr; + u64 device_addr; + bool ready; + bool kicked; + spinlock_t
Re: Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
On Mon, Jun 21, 2021 at 5:14 PM Jason Wang wrote: > > > 在 2021/6/15 下午10:13, Xie Yongji 写道: > > This VDUSE driver enables implementing vDPA devices in userspace. > > The vDPA device's control path is handled in kernel and the data > > path is handled in userspace. > > > > A message mechnism is used by VDUSE driver to forward some control > > messages such as starting/stopping datapath to userspace. Userspace > > can use read()/write() to receive/reply those control messages. > > > > And some ioctls are introduced to help userspace to implement the > > data path. VDUSE_IOTLB_GET_FD ioctl can be used to get the file > > descriptors referring to vDPA device's iova regions. Then userspace > > can use mmap() to access those iova regions. VDUSE_DEV_GET_FEATURES > > and VDUSE_VQ_GET_INFO ioctls are used to get the negotiated features > > and metadata of virtqueues. VDUSE_INJECT_VQ_IRQ and VDUSE_VQ_SETUP_KICKFD > > ioctls can be used to inject interrupt and setup the kickfd for > > virtqueues. VDUSE_DEV_UPDATE_CONFIG ioctl is used to update the > > configuration space and inject a config interrupt. > > > > Signed-off-by: Xie Yongji > > --- > > Documentation/userspace-api/ioctl/ioctl-number.rst |1 + > > drivers/vdpa/Kconfig | 10 + > > drivers/vdpa/Makefile |1 + > > drivers/vdpa/vdpa_user/Makefile|5 + > > drivers/vdpa/vdpa_user/vduse_dev.c | 1453 > > > > include/uapi/linux/vduse.h | 143 ++ > > 6 files changed, 1613 insertions(+) > > create mode 100644 drivers/vdpa/vdpa_user/Makefile > > create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c > > create mode 100644 include/uapi/linux/vduse.h > > > > diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst > > b/Documentation/userspace-api/ioctl/ioctl-number.rst > > index 9bfc2b510c64..acd95e9dcfe7 100644 > > --- a/Documentation/userspace-api/ioctl/ioctl-number.rst > > +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst > > @@ -300,6 +300,7 @@ Code Seq#Include File > > Comments > > 'z' 10-4F drivers/s390/crypto/zcrypt_api.h > > conflict! > > '|' 00-7F linux/media.h > > 0x80 00-1F linux/fb.h > > +0x81 00-1F linux/vduse.h > > 0x89 00-06 arch/x86/include/asm/sockios.h > > 0x89 0B-DF linux/sockios.h > > 0x89 E0-EF linux/sockios.h > > SIOCPROTOPRIVATE range > > diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig > > index a503c1b2bfd9..6e23bce6433a 100644 > > --- a/drivers/vdpa/Kconfig > > +++ b/drivers/vdpa/Kconfig > > @@ -33,6 +33,16 @@ config VDPA_SIM_BLOCK > > vDPA block device simulator which terminates IO request in a > > memory buffer. > > > > +config VDPA_USER > > + tristate "VDUSE (vDPA Device in Userspace) support" > > + depends on EVENTFD && MMU && HAS_DMA > > + select DMA_OPS > > + select VHOST_IOTLB > > + select IOMMU_IOVA > > + help > > + With VDUSE it is possible to emulate a vDPA Device > > + in a userspace program. > > + > > config IFCVF > > tristate "Intel IFC VF vDPA driver" > > depends on PCI_MSI > > diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile > > index 67fe7f3d6943..f02ebed33f19 100644 > > --- a/drivers/vdpa/Makefile > > +++ b/drivers/vdpa/Makefile > > @@ -1,6 +1,7 @@ > > # SPDX-License-Identifier: GPL-2.0 > > obj-$(CONFIG_VDPA) += vdpa.o > > obj-$(CONFIG_VDPA_SIM) += vdpa_sim/ > > +obj-$(CONFIG_VDPA_USER) += vdpa_user/ > > obj-$(CONFIG_IFCVF)+= ifcvf/ > > obj-$(CONFIG_MLX5_VDPA) += mlx5/ > > obj-$(CONFIG_VP_VDPA)+= virtio_pci/ > > diff --git a/drivers/vdpa/vdpa_user/Makefile > > b/drivers/vdpa/vdpa_user/Makefile > > new file mode 100644 > > index ..260e0b26af99 > > --- /dev/null > > +++ b/drivers/vdpa/vdpa_user/Makefile > > @@ -0,0 +1,5 @@ > > +# SPDX-License-Identifier: GPL-2.0 > > + > > +vduse-y := vduse_dev.o iova_domain.o > > + > > +obj-$(CONFIG_VDPA_USER) += vduse.o > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c > > b/drivers/vdpa/vdpa_user/vduse_dev.c > > new file mode 100644 > > index ..5271cbd15e28 > > --- /dev/null > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c > > @@ -0,0 +1,1453 @@ > > +// SPDX-License-Identifier: GPL-2.0-only > > +/* > > + * VDUSE: vDPA Device in Userspace > > + * > > + * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All > > rights reserved. > > + * > > + * Author: Xie Yongji > > + * > > + */ > > + > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > +#include > > + > > +#include "iova_domain.h" > > + > > +#define DRV_AUTHOR
Re: [PATCH v8 09/10] vduse: Introduce VDUSE - vDPA Device in Userspace
在 2021/6/15 下午10:13, Xie Yongji 写道: This VDUSE driver enables implementing vDPA devices in userspace. The vDPA device's control path is handled in kernel and the data path is handled in userspace. A message mechnism is used by VDUSE driver to forward some control messages such as starting/stopping datapath to userspace. Userspace can use read()/write() to receive/reply those control messages. And some ioctls are introduced to help userspace to implement the data path. VDUSE_IOTLB_GET_FD ioctl can be used to get the file descriptors referring to vDPA device's iova regions. Then userspace can use mmap() to access those iova regions. VDUSE_DEV_GET_FEATURES and VDUSE_VQ_GET_INFO ioctls are used to get the negotiated features and metadata of virtqueues. VDUSE_INJECT_VQ_IRQ and VDUSE_VQ_SETUP_KICKFD ioctls can be used to inject interrupt and setup the kickfd for virtqueues. VDUSE_DEV_UPDATE_CONFIG ioctl is used to update the configuration space and inject a config interrupt. Signed-off-by: Xie Yongji --- Documentation/userspace-api/ioctl/ioctl-number.rst |1 + drivers/vdpa/Kconfig | 10 + drivers/vdpa/Makefile |1 + drivers/vdpa/vdpa_user/Makefile|5 + drivers/vdpa/vdpa_user/vduse_dev.c | 1453 include/uapi/linux/vduse.h | 143 ++ 6 files changed, 1613 insertions(+) create mode 100644 drivers/vdpa/vdpa_user/Makefile create mode 100644 drivers/vdpa/vdpa_user/vduse_dev.c create mode 100644 include/uapi/linux/vduse.h diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst index 9bfc2b510c64..acd95e9dcfe7 100644 --- a/Documentation/userspace-api/ioctl/ioctl-number.rst +++ b/Documentation/userspace-api/ioctl/ioctl-number.rst @@ -300,6 +300,7 @@ Code Seq#Include File Comments 'z' 10-4F drivers/s390/crypto/zcrypt_api.hconflict! '|' 00-7F linux/media.h 0x80 00-1F linux/fb.h +0x81 00-1F linux/vduse.h 0x89 00-06 arch/x86/include/asm/sockios.h 0x89 0B-DF linux/sockios.h 0x89 E0-EF linux/sockios.h SIOCPROTOPRIVATE range diff --git a/drivers/vdpa/Kconfig b/drivers/vdpa/Kconfig index a503c1b2bfd9..6e23bce6433a 100644 --- a/drivers/vdpa/Kconfig +++ b/drivers/vdpa/Kconfig @@ -33,6 +33,16 @@ config VDPA_SIM_BLOCK vDPA block device simulator which terminates IO request in a memory buffer. +config VDPA_USER + tristate "VDUSE (vDPA Device in Userspace) support" + depends on EVENTFD && MMU && HAS_DMA + select DMA_OPS + select VHOST_IOTLB + select IOMMU_IOVA + help + With VDUSE it is possible to emulate a vDPA Device + in a userspace program. + config IFCVF tristate "Intel IFC VF vDPA driver" depends on PCI_MSI diff --git a/drivers/vdpa/Makefile b/drivers/vdpa/Makefile index 67fe7f3d6943..f02ebed33f19 100644 --- a/drivers/vdpa/Makefile +++ b/drivers/vdpa/Makefile @@ -1,6 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 obj-$(CONFIG_VDPA) += vdpa.o obj-$(CONFIG_VDPA_SIM) += vdpa_sim/ +obj-$(CONFIG_VDPA_USER) += vdpa_user/ obj-$(CONFIG_IFCVF)+= ifcvf/ obj-$(CONFIG_MLX5_VDPA) += mlx5/ obj-$(CONFIG_VP_VDPA)+= virtio_pci/ diff --git a/drivers/vdpa/vdpa_user/Makefile b/drivers/vdpa/vdpa_user/Makefile new file mode 100644 index ..260e0b26af99 --- /dev/null +++ b/drivers/vdpa/vdpa_user/Makefile @@ -0,0 +1,5 @@ +# SPDX-License-Identifier: GPL-2.0 + +vduse-y := vduse_dev.o iova_domain.o + +obj-$(CONFIG_VDPA_USER) += vduse.o diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c b/drivers/vdpa/vdpa_user/vduse_dev.c new file mode 100644 index ..5271cbd15e28 --- /dev/null +++ b/drivers/vdpa/vdpa_user/vduse_dev.c @@ -0,0 +1,1453 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * VDUSE: vDPA Device in Userspace + * + * Copyright (C) 2020-2021 Bytedance Inc. and/or its affiliates. All rights reserved. + * + * Author: Xie Yongji + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "iova_domain.h" + +#define DRV_AUTHOR "Yongji Xie " +#define DRV_DESC "vDPA Device in Userspace" +#define DRV_LICENSE "GPL v2" + +#define VDUSE_DEV_MAX (1U << MINORBITS) +#define VDUSE_MAX_BOUNCE_SIZE (64 * 1024 * 1024) +#define VDUSE_IOVA_SIZE (128 * 1024 * 1024) +#define VDUSE_REQUEST_TIMEOUT 30 + +struct vduse_virtqueue { + u16 index; + u32 num; + u32 avail_idx; + u64 desc_addr; + u64 driver_addr; + u64 device_addr; + bool ready; + bool kicked; + spinlock_t kick_lock; + spinlock_t irq_lock; + struct eventfd_ctx *kickfd; +