Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On 2019/9/20 上午10:16, Tiwei Bie wrote: On Fri, Sep 20, 2019 at 09:30:58AM +0800, Jason Wang wrote: On 2019/9/19 下午11:45, Tiwei Bie wrote: On Thu, Sep 19, 2019 at 09:08:11PM +0800, Jason Wang wrote: On 2019/9/18 下午10:32, Michael S. Tsirkin wrote: So I have some questions: 1) Compared to method 2, what's the advantage of creating a new vhost char device? I guess it's for keep the API compatibility? One benefit is that we can avoid doing vhost ioctls on VFIO device fd. Yes, but any benefit from doing this? It does seem a bit more modular, but it's certainly not a big deal. Ok, if we go this way, it could be as simple as provide some callback to vhost, then vhost can just forward the ioctl through parent_ops. 2) For method 2, is there any easy way for user/admin to distinguish e.g ordinary vfio-mdev for vhost from ordinary vfio-mdev? I think device-api could be a choice. Ok. I saw you introduce ops matching helper but it's not friendly to management. The ops matching helper is just to check whether a given vfio-device is based on a mdev device. 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that assumes the parameter comes from userspace, it prevents support kernel virtio drivers. 4) So comes the idea of method 3, since it register a new vhost-mdev driver, we can use device specific ops instead of VFIO ones, then we can have a common API between vDPA parent and vhost-mdev/virtio-mdev drivers. As the above draft shows, this requires introducing a new VFIO device driver. I think Alex's opinion matters here. Just to clarify, a new type of mdev driver but provides dummy vfio_device_ops for VFIO to make container DMA ioctl work. I see. Thanks! IIUC, you mean we can provide a very tiny VFIO device driver in drivers/vhost/mdev.c, e.g.: static int vfio_vhost_mdev_open(void *device_data) { if (!try_module_get(THIS_MODULE)) return -ENODEV; return 0; } static void vfio_vhost_mdev_release(void *device_data) { module_put(THIS_MODULE); } static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { .name = "vfio-vhost-mdev", .open = vfio_vhost_mdev_open, .release= vfio_vhost_mdev_release, }; static int vhost_mdev_probe(struct device *dev) { struct mdev_device *mdev = to_mdev_device(dev); ... Check the mdev device_id proposed in ... ... https://lkml.org/lkml/2019/9/12/151 ... To clarify, this should be done through the id_table fields in vhost_mdev_driver, and it should claim it supports virtio-mdev device only: static struct mdev_class_id id_table[] = { { MDEV_ID_VIRTIO }, { 0 }, }; static struct mdev_driver vhost_mdev_driver = { ... .id_table = id_table, } In this way, both of virtio-mdev and vhost-mdev will try to take this device. We may want a way to let vhost-mdev take this device only when users explicitly ask it to do it. Or maybe we can have a different MDEV_ID for vhost-mdev but share the device ops with virtio-mdev. I think it's similar to virtio-pci vs vfio-pci. User can choose to switch the driver through bind/unbind. return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); And in vfio_vhost_mdev_ops, all its need is to just implement vhost-net ioctl and translate them to virtio-mdev transport (e.g device_ops I proposed or ioctls other whatever other method) API. I see, so my previous understanding is basically correct: https://lkml.org/lkml/2019/9/17/332 I.e. we won't have a separate vhost fd and we will do all vhost ioctls on the VFIO device fd backed by this new VFIO driver. Yes. Thanks And it could have a dummy ops implementation for the other device_ops. } static void vhost_mdev_remove(struct device *dev) { vfio_del_group_dev(dev); } static struct mdev_driver vhost_mdev_driver = { .name = "vhost_mdev", .probe = vhost_mdev_probe, .remove = vhost_mdev_remove, }; So we can bind above mdev driver to the virtio-mdev compatible mdev devices when we want to use vhost-mdev. After binding above driver to the mdev device, we can setup IOMMU via VFIO and get VFIO device fd of this mdev device, and pass it to vhost fd (/dev/vhost-mdev) with a SET_BACKEND ioctl. Then what vhost-mdev char device did is just forwarding ioctl back to this vfio device fd which seems a overkill. It's simpler that just do ioctl on the device ops directly. Yes. Thanks, Tiwei Thanks Thanks, Tiwei Thanks Yes, it is. Thanks
Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On Fri, Sep 20, 2019 at 09:30:58AM +0800, Jason Wang wrote: > On 2019/9/19 下午11:45, Tiwei Bie wrote: > > On Thu, Sep 19, 2019 at 09:08:11PM +0800, Jason Wang wrote: > > > On 2019/9/18 下午10:32, Michael S. Tsirkin wrote: > > > > > > > So I have some questions: > > > > > > > > > > > > > > 1) Compared to method 2, what's the advantage of creating a new > > > > > > > vhost char > > > > > > > device? I guess it's for keep the API compatibility? > > > > > > One benefit is that we can avoid doing vhost ioctls on > > > > > > VFIO device fd. > > > > > Yes, but any benefit from doing this? > > > > It does seem a bit more modular, but it's certainly not a big deal. > > > Ok, if we go this way, it could be as simple as provide some callback to > > > vhost, then vhost can just forward the ioctl through parent_ops. > > > > > > > > > > 2) For method 2, is there any easy way for user/admin to > > > > > > > distinguish e.g > > > > > > > ordinary vfio-mdev for vhost from ordinary vfio-mdev? > > > > > > I think device-api could be a choice. > > > > > Ok. > > > > > > > > > > > > > > > > > I saw you introduce > > > > > > > ops matching helper but it's not friendly to management. > > > > > > The ops matching helper is just to check whether a given > > > > > > vfio-device is based on a mdev device. > > > > > > > > > > > > > 3) A drawback of 1) and 2) is that it must follow vfio_device_ops > > > > > > > that > > > > > > > assumes the parameter comes from userspace, it prevents support > > > > > > > kernel > > > > > > > virtio drivers. > > > > > > > > > > > > > > 4) So comes the idea of method 3, since it register a new > > > > > > > vhost-mdev driver, > > > > > > > we can use device specific ops instead of VFIO ones, then we can > > > > > > > have a > > > > > > > common API between vDPA parent and vhost-mdev/virtio-mdev drivers. > > > > > > As the above draft shows, this requires introducing a new > > > > > > VFIO device driver. I think Alex's opinion matters here. > > > Just to clarify, a new type of mdev driver but provides dummy > > > vfio_device_ops for VFIO to make container DMA ioctl work. > > I see. Thanks! IIUC, you mean we can provide a very tiny > > VFIO device driver in drivers/vhost/mdev.c, e.g.: > > > > static int vfio_vhost_mdev_open(void *device_data) > > { > > if (!try_module_get(THIS_MODULE)) > > return -ENODEV; > > return 0; > > } > > > > static void vfio_vhost_mdev_release(void *device_data) > > { > > module_put(THIS_MODULE); > > } > > > > static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { > > .name = "vfio-vhost-mdev", > > .open = vfio_vhost_mdev_open, > > .release= vfio_vhost_mdev_release, > > }; > > > > static int vhost_mdev_probe(struct device *dev) > > { > > struct mdev_device *mdev = to_mdev_device(dev); > > > > ... Check the mdev device_id proposed in ... > > ... https://lkml.org/lkml/2019/9/12/151 ... > > > To clarify, this should be done through the id_table fields in > vhost_mdev_driver, and it should claim it supports virtio-mdev device only: > > > static struct mdev_class_id id_table[] = { > { MDEV_ID_VIRTIO }, > { 0 }, > }; > > > static struct mdev_driver vhost_mdev_driver = { > ... > .id_table = id_table, > } In this way, both of virtio-mdev and vhost-mdev will try to take this device. We may want a way to let vhost-mdev take this device only when users explicitly ask it to do it. Or maybe we can have a different MDEV_ID for vhost-mdev but share the device ops with virtio-mdev. > > > > > > return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); > > > And in vfio_vhost_mdev_ops, all its need is to just implement vhost-net > ioctl and translate them to virtio-mdev transport (e.g device_ops I proposed > or ioctls other whatever other method) API. I see, so my previous understanding is basically correct: https://lkml.org/lkml/2019/9/17/332 I.e. we won't have a separate vhost fd and we will do all vhost ioctls on the VFIO device fd backed by this new VFIO driver. > And it could have a dummy ops > implementation for the other device_ops. > > > > } > > > > static void vhost_mdev_remove(struct device *dev) > > { > > vfio_del_group_dev(dev); > > } > > > > static struct mdev_driver vhost_mdev_driver = { > > .name = "vhost_mdev", > > .probe = vhost_mdev_probe, > > .remove = vhost_mdev_remove, > > }; > > > > So we can bind above mdev driver to the virtio-mdev compatible > > mdev devices when we want to use vhost-mdev. > > > > After binding above driver to the mdev device, we can setup IOMMU > > via VFIO and get VFIO device fd of this mdev device, and pass it > > to vhost fd (/dev/vhost-mdev) with a SET_BACKEND ioctl. > > > Then what vhost-mdev char device did is just forwarding ioctl back to this > vfio device fd which seems a overkill. It's simpler that just do ioctl on > the device ops directly. Yes. Thanks, Tiwei > > Thanks >
Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On 2019/9/19 下午11:45, Tiwei Bie wrote: On Thu, Sep 19, 2019 at 09:08:11PM +0800, Jason Wang wrote: On 2019/9/18 下午10:32, Michael S. Tsirkin wrote: So I have some questions: 1) Compared to method 2, what's the advantage of creating a new vhost char device? I guess it's for keep the API compatibility? One benefit is that we can avoid doing vhost ioctls on VFIO device fd. Yes, but any benefit from doing this? It does seem a bit more modular, but it's certainly not a big deal. Ok, if we go this way, it could be as simple as provide some callback to vhost, then vhost can just forward the ioctl through parent_ops. 2) For method 2, is there any easy way for user/admin to distinguish e.g ordinary vfio-mdev for vhost from ordinary vfio-mdev? I think device-api could be a choice. Ok. I saw you introduce ops matching helper but it's not friendly to management. The ops matching helper is just to check whether a given vfio-device is based on a mdev device. 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that assumes the parameter comes from userspace, it prevents support kernel virtio drivers. 4) So comes the idea of method 3, since it register a new vhost-mdev driver, we can use device specific ops instead of VFIO ones, then we can have a common API between vDPA parent and vhost-mdev/virtio-mdev drivers. As the above draft shows, this requires introducing a new VFIO device driver. I think Alex's opinion matters here. Just to clarify, a new type of mdev driver but provides dummy vfio_device_ops for VFIO to make container DMA ioctl work. I see. Thanks! IIUC, you mean we can provide a very tiny VFIO device driver in drivers/vhost/mdev.c, e.g.: static int vfio_vhost_mdev_open(void *device_data) { if (!try_module_get(THIS_MODULE)) return -ENODEV; return 0; } static void vfio_vhost_mdev_release(void *device_data) { module_put(THIS_MODULE); } static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { .name = "vfio-vhost-mdev", .open = vfio_vhost_mdev_open, .release= vfio_vhost_mdev_release, }; static int vhost_mdev_probe(struct device *dev) { struct mdev_device *mdev = to_mdev_device(dev); ... Check the mdev device_id proposed in ... ... https://lkml.org/lkml/2019/9/12/151 ... To clarify, this should be done through the id_table fields in vhost_mdev_driver, and it should claim it supports virtio-mdev device only: static struct mdev_class_id id_table[] = { { MDEV_ID_VIRTIO }, { 0 }, }; static struct mdev_driver vhost_mdev_driver = { ... .id_table = id_table, } return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); And in vfio_vhost_mdev_ops, all its need is to just implement vhost-net ioctl and translate them to virtio-mdev transport (e.g device_ops I proposed or ioctls other whatever other method) API. And it could have a dummy ops implementation for the other device_ops. } static void vhost_mdev_remove(struct device *dev) { vfio_del_group_dev(dev); } static struct mdev_driver vhost_mdev_driver = { .name = "vhost_mdev", .probe = vhost_mdev_probe, .remove = vhost_mdev_remove, }; So we can bind above mdev driver to the virtio-mdev compatible mdev devices when we want to use vhost-mdev. After binding above driver to the mdev device, we can setup IOMMU via VFIO and get VFIO device fd of this mdev device, and pass it to vhost fd (/dev/vhost-mdev) with a SET_BACKEND ioctl. Then what vhost-mdev char device did is just forwarding ioctl back to this vfio device fd which seems a overkill. It's simpler that just do ioctl on the device ops directly. Thanks Thanks, Tiwei Thanks Yes, it is. Thanks
Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On 2019/9/19 下午11:45, Tiwei Bie wrote: On Thu, Sep 19, 2019 at 09:08:11PM +0800, Jason Wang wrote: On 2019/9/18 下午10:32, Michael S. Tsirkin wrote: So I have some questions: 1) Compared to method 2, what's the advantage of creating a new vhost char device? I guess it's for keep the API compatibility? One benefit is that we can avoid doing vhost ioctls on VFIO device fd. Yes, but any benefit from doing this? It does seem a bit more modular, but it's certainly not a big deal. Ok, if we go this way, it could be as simple as provide some callback to vhost, then vhost can just forward the ioctl through parent_ops. 2) For method 2, is there any easy way for user/admin to distinguish e.g ordinary vfio-mdev for vhost from ordinary vfio-mdev? I think device-api could be a choice. Ok. I saw you introduce ops matching helper but it's not friendly to management. The ops matching helper is just to check whether a given vfio-device is based on a mdev device. 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that assumes the parameter comes from userspace, it prevents support kernel virtio drivers. 4) So comes the idea of method 3, since it register a new vhost-mdev driver, we can use device specific ops instead of VFIO ones, then we can have a common API between vDPA parent and vhost-mdev/virtio-mdev drivers. As the above draft shows, this requires introducing a new VFIO device driver. I think Alex's opinion matters here. Just to clarify, a new type of mdev driver but provides dummy vfio_device_ops for VFIO to make container DMA ioctl work. I see. Thanks! IIUC, you mean we can provide a very tiny VFIO device driver in drivers/vhost/mdev.c, e.g.: static int vfio_vhost_mdev_open(void *device_data) { if (!try_module_get(THIS_MODULE)) return -ENODEV; return 0; } static void vfio_vhost_mdev_release(void *device_data) { module_put(THIS_MODULE); } static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { .name = "vfio-vhost-mdev", .open = vfio_vhost_mdev_open, .release= vfio_vhost_mdev_release, }; static int vhost_mdev_probe(struct device *dev) { struct mdev_device *mdev = to_mdev_device(dev); ... Check the mdev device_id proposed in ... ... https://lkml.org/lkml/2019/9/12/151 ... return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); } static void vhost_mdev_remove(struct device *dev) { vfio_del_group_dev(dev); } static struct mdev_driver vhost_mdev_driver = { .name = "vhost_mdev", .probe = vhost_mdev_probe, .remove = vhost_mdev_remove, }; So we can bind above mdev driver to the virtio-mdev compatible mdev devices when we want to use vhost-mdev. After binding above driver to the mdev device, we can setup IOMMU via VFIO and get VFIO device fd of this mdev device, and pass it to vhost fd (/dev/vhost-mdev) with a SET_BACKEND ioctl. Thanks, Tiwei Yes, something like this. Thanks Thanks Yes, it is. Thanks
Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On Thu, Sep 19, 2019 at 09:08:11PM +0800, Jason Wang wrote: > On 2019/9/18 下午10:32, Michael S. Tsirkin wrote: > > > > > So I have some questions: > > > > > > > > > > 1) Compared to method 2, what's the advantage of creating a new vhost > > > > > char > > > > > device? I guess it's for keep the API compatibility? > > > > One benefit is that we can avoid doing vhost ioctls on > > > > VFIO device fd. > > > Yes, but any benefit from doing this? > > It does seem a bit more modular, but it's certainly not a big deal. > > Ok, if we go this way, it could be as simple as provide some callback to > vhost, then vhost can just forward the ioctl through parent_ops. > > > > > > > > 2) For method 2, is there any easy way for user/admin to distinguish > > > > > e.g > > > > > ordinary vfio-mdev for vhost from ordinary vfio-mdev? > > > > I think device-api could be a choice. > > > Ok. > > > > > > > > > > > I saw you introduce > > > > > ops matching helper but it's not friendly to management. > > > > The ops matching helper is just to check whether a given > > > > vfio-device is based on a mdev device. > > > > > > > > > 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that > > > > > assumes the parameter comes from userspace, it prevents support kernel > > > > > virtio drivers. > > > > > > > > > > 4) So comes the idea of method 3, since it register a new vhost-mdev > > > > > driver, > > > > > we can use device specific ops instead of VFIO ones, then we can have > > > > > a > > > > > common API between vDPA parent and vhost-mdev/virtio-mdev drivers. > > > > As the above draft shows, this requires introducing a new > > > > VFIO device driver. I think Alex's opinion matters here. > > Just to clarify, a new type of mdev driver but provides dummy > vfio_device_ops for VFIO to make container DMA ioctl work. I see. Thanks! IIUC, you mean we can provide a very tiny VFIO device driver in drivers/vhost/mdev.c, e.g.: static int vfio_vhost_mdev_open(void *device_data) { if (!try_module_get(THIS_MODULE)) return -ENODEV; return 0; } static void vfio_vhost_mdev_release(void *device_data) { module_put(THIS_MODULE); } static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { .name = "vfio-vhost-mdev", .open = vfio_vhost_mdev_open, .release= vfio_vhost_mdev_release, }; static int vhost_mdev_probe(struct device *dev) { struct mdev_device *mdev = to_mdev_device(dev); ... Check the mdev device_id proposed in ... ... https://lkml.org/lkml/2019/9/12/151 ... return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); } static void vhost_mdev_remove(struct device *dev) { vfio_del_group_dev(dev); } static struct mdev_driver vhost_mdev_driver = { .name = "vhost_mdev", .probe = vhost_mdev_probe, .remove = vhost_mdev_remove, }; So we can bind above mdev driver to the virtio-mdev compatible mdev devices when we want to use vhost-mdev. After binding above driver to the mdev device, we can setup IOMMU via VFIO and get VFIO device fd of this mdev device, and pass it to vhost fd (/dev/vhost-mdev) with a SET_BACKEND ioctl. Thanks, Tiwei > > Thanks > > > > > Yes, it is. > > > > > > Thanks > > > > > >
Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On 2019/9/18 下午10:32, Michael S. Tsirkin wrote: So I have some questions: 1) Compared to method 2, what's the advantage of creating a new vhost char device? I guess it's for keep the API compatibility? One benefit is that we can avoid doing vhost ioctls on VFIO device fd. Yes, but any benefit from doing this? It does seem a bit more modular, but it's certainly not a big deal. Ok, if we go this way, it could be as simple as provide some callback to vhost, then vhost can just forward the ioctl through parent_ops. 2) For method 2, is there any easy way for user/admin to distinguish e.g ordinary vfio-mdev for vhost from ordinary vfio-mdev? I think device-api could be a choice. Ok. I saw you introduce ops matching helper but it's not friendly to management. The ops matching helper is just to check whether a given vfio-device is based on a mdev device. 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that assumes the parameter comes from userspace, it prevents support kernel virtio drivers. 4) So comes the idea of method 3, since it register a new vhost-mdev driver, we can use device specific ops instead of VFIO ones, then we can have a common API between vDPA parent and vhost-mdev/virtio-mdev drivers. As the above draft shows, this requires introducing a new VFIO device driver. I think Alex's opinion matters here. Just to clarify, a new type of mdev driver but provides dummy vfio_device_ops for VFIO to make container DMA ioctl work. Thanks Yes, it is. Thanks
Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On Wed, Sep 18, 2019 at 01:51:21PM +0800, Jason Wang wrote: > > On 2019/9/17 下午6:58, Tiwei Bie wrote: > > On Tue, Sep 17, 2019 at 11:32:03AM +0800, Jason Wang wrote: > > > On 2019/9/17 上午9:02, Tiwei Bie wrote: > > > > This RFC is to demonstrate below ideas, > > > > > > > > a) Build vhost-mdev on top of the same abstraction defined in > > > > the virtio-mdev series [1]; > > > > > > > > b) Introduce /dev/vhost-mdev to do vhost ioctls and support > > > > setting mdev device as backend; > > > > > > > > Now the userspace API looks like this: > > > > > > > > - Userspace generates a compatible mdev device; > > > > > > > > - Userspace opens this mdev device with VFIO API (including > > > > doing IOMMU programming for this mdev device with VFIO's > > > > container/group based interface); > > > > > > > > - Userspace opens /dev/vhost-mdev and gets vhost fd; > > > > > > > > - Userspace uses vhost ioctls to setup vhost (userspace should > > > > do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device > > > > fd first before doing other vhost ioctls); > > > > > > > > Only compile test has been done for this series for now. > > > > > > Have a hard thought on the architecture: > > Thanks a lot! Do appreciate it! > > > > > 1) Create a vhost char device and pass vfio mdev device fd to it as a > > > backend and translate vhost-mdev ioctl to virtio mdev transport (e.g > > > read/write). DMA was done through the VFIO DMA mapping on the container > > > that > > > is attached. > > Yeah, that's what we are doing in this series. > > > > > We have two more choices: > > > > > > 2) Use vfio-mdev but do not create vhost-mdev device, instead, just > > > implement vhost ioctl on vfio_device_ops, and translate them into > > > virtio-mdev transport or just pass ioctl to parent. > > Yeah. Instead of introducing /dev/vhost-mdev char device, do > > vhost ioctls on VFIO device fd directly. That's what we did > > in RFC v3. > > > > > 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still > > > try to add dev to vfio group and talk to parent with device specific ops > > If my understanding is correct, this means we need to introduce > > a new VFIO device driver to replace the existing vfio-mdev driver > > in our case. Below is a quick draft just to show my understanding: > > > > #include > > #include > > #include > > #include > > #include > > #include > > #include > > > > #include "mdev_private.h" > > > > /* XXX: we need a proper way to include below vhost header. */ > > #include "../../vhost/vhost.h" > > > > static int vfio_vhost_mdev_open(void *device_data) > > { > > if (!try_module_get(THIS_MODULE)) > > return -ENODEV; > > > > /* ... */ > > vhost_dev_init(...); > > > > return 0; > > } > > > > static void vfio_vhost_mdev_release(void *device_data) > > { > > /* ... */ > > module_put(THIS_MODULE); > > } > > > > static long vfio_vhost_mdev_unlocked_ioctl(void *device_data, > >unsigned int cmd, unsigned long arg) > > { > > struct mdev_device *mdev = device_data; > > struct mdev_parent *parent = mdev->parent; > > > > /* > > * Use vhost ioctls. > > * > > * We will have a different parent_ops design. > > * And potentially, we can share the same parent_ops > > * with virtio_mdev. > > */ > > switch (cmd) { > > case VHOST_GET_FEATURES: > > parent->ops->get_features(mdev, ...); > > break; > > /* ... */ > > } > > > > return 0; > > } > > > > static ssize_t vfio_vhost_mdev_read(void *device_data, char __user *buf, > > size_t count, loff_t *ppos) > > { > > /* ... */ > > return 0; > > } > > > > static ssize_t vfio_vhost_mdev_write(void *device_data, const char __user > > *buf, > > size_t count, loff_t *ppos) > > { > > /* ... */ > > return 0; > > } > > > > static int vfio_vhost_mdev_mmap(void *device_data, struct vm_area_struct > > *vma) > > { > > /* ... */ > > return 0; > > } > > > > static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { > > .name = "vfio-vhost-mdev", > > .open = vfio_vhost_mdev_open, > > .release= vfio_vhost_mdev_release, > > .ioctl = vfio_vhost_mdev_unlocked_ioctl, > > .read = vfio_vhost_mdev_read, > > .write = vfio_vhost_mdev_write, > > .mmap = vfio_vhost_mdev_mmap, > > }; > > > > static int vfio_vhost_mdev_probe(struct device *dev) > > { > > struct mdev_device *mdev = to_mdev_device(dev); > > > > /* ... */ > > return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); > > } > > > > static void vfio_vhost_mdev_remove(struct device *dev) > > { > > /* ... */ > > vfio_del_group_dev(dev); > > } > > > > static struct mdev_driver vfio_vhost_mdev_driver = { > > .name = "vfio_v
Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On 2019/9/17 下午6:58, Tiwei Bie wrote: On Tue, Sep 17, 2019 at 11:32:03AM +0800, Jason Wang wrote: On 2019/9/17 上午9:02, Tiwei Bie wrote: This RFC is to demonstrate below ideas, a) Build vhost-mdev on top of the same abstraction defined in the virtio-mdev series [1]; b) Introduce /dev/vhost-mdev to do vhost ioctls and support setting mdev device as backend; Now the userspace API looks like this: - Userspace generates a compatible mdev device; - Userspace opens this mdev device with VFIO API (including doing IOMMU programming for this mdev device with VFIO's container/group based interface); - Userspace opens /dev/vhost-mdev and gets vhost fd; - Userspace uses vhost ioctls to setup vhost (userspace should do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device fd first before doing other vhost ioctls); Only compile test has been done for this series for now. Have a hard thought on the architecture: Thanks a lot! Do appreciate it! 1) Create a vhost char device and pass vfio mdev device fd to it as a backend and translate vhost-mdev ioctl to virtio mdev transport (e.g read/write). DMA was done through the VFIO DMA mapping on the container that is attached. Yeah, that's what we are doing in this series. We have two more choices: 2) Use vfio-mdev but do not create vhost-mdev device, instead, just implement vhost ioctl on vfio_device_ops, and translate them into virtio-mdev transport or just pass ioctl to parent. Yeah. Instead of introducing /dev/vhost-mdev char device, do vhost ioctls on VFIO device fd directly. That's what we did in RFC v3. 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still try to add dev to vfio group and talk to parent with device specific ops If my understanding is correct, this means we need to introduce a new VFIO device driver to replace the existing vfio-mdev driver in our case. Below is a quick draft just to show my understanding: #include #include #include #include #include #include #include #include "mdev_private.h" /* XXX: we need a proper way to include below vhost header. */ #include "../../vhost/vhost.h" static int vfio_vhost_mdev_open(void *device_data) { if (!try_module_get(THIS_MODULE)) return -ENODEV; /* ... */ vhost_dev_init(...); return 0; } static void vfio_vhost_mdev_release(void *device_data) { /* ... */ module_put(THIS_MODULE); } static long vfio_vhost_mdev_unlocked_ioctl(void *device_data, unsigned int cmd, unsigned long arg) { struct mdev_device *mdev = device_data; struct mdev_parent *parent = mdev->parent; /* * Use vhost ioctls. * * We will have a different parent_ops design. * And potentially, we can share the same parent_ops * with virtio_mdev. */ switch (cmd) { case VHOST_GET_FEATURES: parent->ops->get_features(mdev, ...); break; /* ... */ } return 0; } static ssize_t vfio_vhost_mdev_read(void *device_data, char __user *buf, size_t count, loff_t *ppos) { /* ... */ return 0; } static ssize_t vfio_vhost_mdev_write(void *device_data, const char __user *buf, size_t count, loff_t *ppos) { /* ... */ return 0; } static int vfio_vhost_mdev_mmap(void *device_data, struct vm_area_struct *vma) { /* ... */ return 0; } static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { .name = "vfio-vhost-mdev", .open = vfio_vhost_mdev_open, .release= vfio_vhost_mdev_release, .ioctl = vfio_vhost_mdev_unlocked_ioctl, .read = vfio_vhost_mdev_read, .write = vfio_vhost_mdev_write, .mmap = vfio_vhost_mdev_mmap, }; static int vfio_vhost_mdev_probe(struct device *dev) { struct mdev_device *mdev = to_mdev_device(dev); /* ... */ return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); } static void vfio_vhost_mdev_remove(struct device *dev) { /* ... */ vfio_del_group_dev(dev); } static struct mdev_driver vfio_vhost_mdev_driver = { .name = "vfio_vhost_mdev", .probe = vfio_vhost_mdev_probe, .remove = vfio_vhost_mdev_remove, }; static int __init vfio_vhost_mdev_init(void) { return mdev_register_driver(&vfio_vhost_mdev_driver, THIS_MODULE); } module_init(vfio_vhost_mdev_init) static void __exit vfio_vhost_mdev_exit(void) { mdev_unregister_driver(&vfio_vhost_mdev_driver); } module_exit(vfio_vhost_mdev_exit) Yes, something like this basically. So I have some questions: 1) Compared to method 2, what's the advantage of creating a new vhost char device? I guess it's for keep the API compatibility? One ben
Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On Tue, Sep 17, 2019 at 11:32:03AM +0800, Jason Wang wrote: > On 2019/9/17 上午9:02, Tiwei Bie wrote: > > This RFC is to demonstrate below ideas, > > > > a) Build vhost-mdev on top of the same abstraction defined in > > the virtio-mdev series [1]; > > > > b) Introduce /dev/vhost-mdev to do vhost ioctls and support > > setting mdev device as backend; > > > > Now the userspace API looks like this: > > > > - Userspace generates a compatible mdev device; > > > > - Userspace opens this mdev device with VFIO API (including > >doing IOMMU programming for this mdev device with VFIO's > >container/group based interface); > > > > - Userspace opens /dev/vhost-mdev and gets vhost fd; > > > > - Userspace uses vhost ioctls to setup vhost (userspace should > >do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device > >fd first before doing other vhost ioctls); > > > > Only compile test has been done for this series for now. > > > Have a hard thought on the architecture: Thanks a lot! Do appreciate it! > > 1) Create a vhost char device and pass vfio mdev device fd to it as a > backend and translate vhost-mdev ioctl to virtio mdev transport (e.g > read/write). DMA was done through the VFIO DMA mapping on the container that > is attached. Yeah, that's what we are doing in this series. > > We have two more choices: > > 2) Use vfio-mdev but do not create vhost-mdev device, instead, just > implement vhost ioctl on vfio_device_ops, and translate them into > virtio-mdev transport or just pass ioctl to parent. Yeah. Instead of introducing /dev/vhost-mdev char device, do vhost ioctls on VFIO device fd directly. That's what we did in RFC v3. > > 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still > try to add dev to vfio group and talk to parent with device specific ops If my understanding is correct, this means we need to introduce a new VFIO device driver to replace the existing vfio-mdev driver in our case. Below is a quick draft just to show my understanding: #include #include #include #include #include #include #include #include "mdev_private.h" /* XXX: we need a proper way to include below vhost header. */ #include "../../vhost/vhost.h" static int vfio_vhost_mdev_open(void *device_data) { if (!try_module_get(THIS_MODULE)) return -ENODEV; /* ... */ vhost_dev_init(...); return 0; } static void vfio_vhost_mdev_release(void *device_data) { /* ... */ module_put(THIS_MODULE); } static long vfio_vhost_mdev_unlocked_ioctl(void *device_data, unsigned int cmd, unsigned long arg) { struct mdev_device *mdev = device_data; struct mdev_parent *parent = mdev->parent; /* * Use vhost ioctls. * * We will have a different parent_ops design. * And potentially, we can share the same parent_ops * with virtio_mdev. */ switch (cmd) { case VHOST_GET_FEATURES: parent->ops->get_features(mdev, ...); break; /* ... */ } return 0; } static ssize_t vfio_vhost_mdev_read(void *device_data, char __user *buf, size_t count, loff_t *ppos) { /* ... */ return 0; } static ssize_t vfio_vhost_mdev_write(void *device_data, const char __user *buf, size_t count, loff_t *ppos) { /* ... */ return 0; } static int vfio_vhost_mdev_mmap(void *device_data, struct vm_area_struct *vma) { /* ... */ return 0; } static const struct vfio_device_ops vfio_vhost_mdev_dev_ops = { .name = "vfio-vhost-mdev", .open = vfio_vhost_mdev_open, .release= vfio_vhost_mdev_release, .ioctl = vfio_vhost_mdev_unlocked_ioctl, .read = vfio_vhost_mdev_read, .write = vfio_vhost_mdev_write, .mmap = vfio_vhost_mdev_mmap, }; static int vfio_vhost_mdev_probe(struct device *dev) { struct mdev_device *mdev = to_mdev_device(dev); /* ... */ return vfio_add_group_dev(dev, &vfio_vhost_mdev_dev_ops, mdev); } static void vfio_vhost_mdev_remove(struct device *dev) { /* ... */ vfio_del_group_dev(dev); } static struct mdev_driver vfio_vhost_mdev_driver = { .name = "vfio_vhost_mdev", .probe = vfio_vhost_mdev_probe, .remove = vfio_vhost_mdev_remove, }; static int __init vfio_vhost_mdev_init(void) { return mdev_register_driver(&vfio_vhost_mdev_driver, THIS_MODULE); } module_init(vfio_vhost_mdev_init) static void __exit vfio_vhost_mdev_exit(void) { mdev_unregister_driver(&vfio_vhost_mdev_driver); } module_exit(vfio_vhost_mdev_exit) > > So I have some questions: > > 1) Compared to method 2, what's the advantage of creating a new vhost char > device? I gue
Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On 2019/9/17 上午9:02, Tiwei Bie wrote: This RFC is to demonstrate below ideas, a) Build vhost-mdev on top of the same abstraction defined in the virtio-mdev series [1]; b) Introduce /dev/vhost-mdev to do vhost ioctls and support setting mdev device as backend; Now the userspace API looks like this: - Userspace generates a compatible mdev device; - Userspace opens this mdev device with VFIO API (including doing IOMMU programming for this mdev device with VFIO's container/group based interface); - Userspace opens /dev/vhost-mdev and gets vhost fd; - Userspace uses vhost ioctls to setup vhost (userspace should do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device fd first before doing other vhost ioctls); Only compile test has been done for this series for now. Have a hard thought on the architecture: 1) Create a vhost char device and pass vfio mdev device fd to it as a backend and translate vhost-mdev ioctl to virtio mdev transport (e.g read/write). DMA was done through the VFIO DMA mapping on the container that is attached. We have two more choices: 2) Use vfio-mdev but do not create vhost-mdev device, instead, just implement vhost ioctl on vfio_device_ops, and translate them into virtio-mdev transport or just pass ioctl to parent. 3) Don't use vfio-mdev, create a new vhost-mdev driver, during probe still try to add dev to vfio group and talk to parent with device specific ops So I have some questions: 1) Compared to method 2, what's the advantage of creating a new vhost char device? I guess it's for keep the API compatibility? 2) For method 2, is there any easy way for user/admin to distinguish e.g ordinary vfio-mdev for vhost from ordinary vfio-mdev? I saw you introduce ops matching helper but it's not friendly to management. 3) A drawback of 1) and 2) is that it must follow vfio_device_ops that assumes the parameter comes from userspace, it prevents support kernel virtio drivers. 4) So comes the idea of method 3, since it register a new vhost-mdev driver, we can use device specific ops instead of VFIO ones, then we can have a common API between vDPA parent and vhost-mdev/virtio-mdev drivers. What's your thoughts? Thanks RFCv3: https://patchwork.kernel.org/patch/7785/ [1] https://lkml.org/lkml/2019/9/10/135 Tiwei Bie (3): vfio: support getting vfio device from device fd vfio: support checking vfio driver by device ops vhost: introduce mdev based hardware backend drivers/vfio/mdev/vfio_mdev.c| 3 +- drivers/vfio/vfio.c | 32 +++ drivers/vhost/Kconfig| 9 + drivers/vhost/Makefile | 3 + drivers/vhost/mdev.c | 462 +++ drivers/vhost/vhost.c| 39 ++- drivers/vhost/vhost.h| 6 + include/linux/vfio.h | 11 + include/uapi/linux/vhost.h | 10 + include/uapi/linux/vhost_types.h | 5 + 10 files changed, 573 insertions(+), 7 deletions(-) create mode 100644 drivers/vhost/mdev.c
Re: [RFC v4 0/3] vhost: introduce mdev based hardware backend
On 2019/9/17 上午9:02, Tiwei Bie wrote: This RFC is to demonstrate below ideas, a) Build vhost-mdev on top of the same abstraction defined in the virtio-mdev series [1]; b) Introduce /dev/vhost-mdev to do vhost ioctls and support setting mdev device as backend; Now the userspace API looks like this: - Userspace generates a compatible mdev device; - Userspace opens this mdev device with VFIO API (including doing IOMMU programming for this mdev device with VFIO's container/group based interface); - Userspace opens /dev/vhost-mdev and gets vhost fd; - Userspace uses vhost ioctls to setup vhost (userspace should do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device fd first before doing other vhost ioctls); Only compile test has been done for this series for now. RFCv3: https://patchwork.kernel.org/patch/7785/ [1] https://lkml.org/lkml/2019/9/10/135 Thanks a lot for the patches. Per Michael request, the API in [1] might need some tweak, I want to introduce some device specific parent_ops instead of vfio specific one. This RFC has been posted at https://lkml.org/lkml/2019/9/12/151. Tiwei Bie (3): vfio: support getting vfio device from device fd vfio: support checking vfio driver by device ops vhost: introduce mdev based hardware backend drivers/vfio/mdev/vfio_mdev.c| 3 +- drivers/vfio/vfio.c | 32 +++ drivers/vhost/Kconfig| 9 + drivers/vhost/Makefile | 3 + drivers/vhost/mdev.c | 462 +++ drivers/vhost/vhost.c| 39 ++- drivers/vhost/vhost.h| 6 + include/linux/vfio.h | 11 + include/uapi/linux/vhost.h | 10 + include/uapi/linux/vhost_types.h | 5 + 10 files changed, 573 insertions(+), 7 deletions(-) create mode 100644 drivers/vhost/mdev.c
[RFC v4 0/3] vhost: introduce mdev based hardware backend
This RFC is to demonstrate below ideas, a) Build vhost-mdev on top of the same abstraction defined in the virtio-mdev series [1]; b) Introduce /dev/vhost-mdev to do vhost ioctls and support setting mdev device as backend; Now the userspace API looks like this: - Userspace generates a compatible mdev device; - Userspace opens this mdev device with VFIO API (including doing IOMMU programming for this mdev device with VFIO's container/group based interface); - Userspace opens /dev/vhost-mdev and gets vhost fd; - Userspace uses vhost ioctls to setup vhost (userspace should do VHOST_MDEV_SET_BACKEND ioctl with VFIO group fd and device fd first before doing other vhost ioctls); Only compile test has been done for this series for now. RFCv3: https://patchwork.kernel.org/patch/7785/ [1] https://lkml.org/lkml/2019/9/10/135 Tiwei Bie (3): vfio: support getting vfio device from device fd vfio: support checking vfio driver by device ops vhost: introduce mdev based hardware backend drivers/vfio/mdev/vfio_mdev.c| 3 +- drivers/vfio/vfio.c | 32 +++ drivers/vhost/Kconfig| 9 + drivers/vhost/Makefile | 3 + drivers/vhost/mdev.c | 462 +++ drivers/vhost/vhost.c| 39 ++- drivers/vhost/vhost.h| 6 + include/linux/vfio.h | 11 + include/uapi/linux/vhost.h | 10 + include/uapi/linux/vhost_types.h | 5 + 10 files changed, 573 insertions(+), 7 deletions(-) create mode 100644 drivers/vhost/mdev.c -- 2.17.1