Re: [RFT][PATCH v2 9/9] vfio: Replace phys_pfn with pages for vfio_pin_pages()
Reviewed-by: Kirti Wankhede On 7/6/2022 11:57 AM, Nicolin Chen wrote: Most of the callers of vfio_pin_pages() want "struct page *" and the low-level mm code to pin pages returns a list of "struct page *" too. So there's no gain in converting "struct page *" to PFN in between. Replace the output parameter "phys_pfn" list with a "pages" list, to simplify callers. This also allows us to replace the vfio_iommu_type1 implementation with a more efficient one. For now, also update vfio_iommu_type1 to fit this new parameter too. Signed-off-by: Nicolin Chen --- .../driver-api/vfio-mediated-device.rst | 2 +- drivers/gpu/drm/i915/gvt/kvmgt.c | 19 ++- drivers/s390/cio/vfio_ccw_cp.c| 19 +-- drivers/s390/crypto/vfio_ap_ops.c | 6 +++--- drivers/vfio/vfio.c | 8 drivers/vfio/vfio.h | 2 +- drivers/vfio/vfio_iommu_type1.c | 19 +++ include/linux/vfio.h | 2 +- 8 files changed, 36 insertions(+), 41 deletions(-) diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst index ea32a0f13ddb..ba5fefcdae1a 100644 --- a/Documentation/driver-api/vfio-mediated-device.rst +++ b/Documentation/driver-api/vfio-mediated-device.rst @@ -263,7 +263,7 @@ The following APIs are provided for translating user pfn to host pfn in a VFIO driver:: int vfio_pin_pages(struct vfio_device *device, dma_addr_t iova, - int npage, int prot, unsigned long *phys_pfn); + int npage, int prot, struct page **pages); void vfio_unpin_pages(struct vfio_device *device, dma_addr_t iova, int npage); diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index ea6041fa48ac..3a49471dcc16 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -239,7 +239,7 @@ static void gvt_unpin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, unsigned long size, struct page **page) { - unsigned long base_pfn = 0; + struct page *base_page = NULL; int total_pages; int npage; int ret; @@ -251,26 +251,19 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, */ for (npage = 0; npage < total_pages; npage++) { dma_addr_t cur_iova = (gfn + npage) << PAGE_SHIFT; - unsigned long pfn; + struct page *cur_page; ret = vfio_pin_pages(>vfio_device, cur_iova, 1, -IOMMU_READ | IOMMU_WRITE, ); +IOMMU_READ | IOMMU_WRITE, _page); if (ret != 1) { gvt_vgpu_err("vfio_pin_pages failed for iova %pad, ret %d\n", _iova, ret); goto err; } - if (!pfn_valid(pfn)) { - gvt_vgpu_err("pfn 0x%lx is not mem backed\n", pfn); - npage++; - ret = -EFAULT; - goto err; - } - if (npage == 0) - base_pfn = pfn; - else if (base_pfn + npage != pfn) { + base_page = cur_page; + else if (base_page + npage != cur_page) { gvt_vgpu_err("The pages are not continuous\n"); ret = -EINVAL; npage++; @@ -278,7 +271,7 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, } } - *page = pfn_to_page(base_pfn); + *page = base_page; return 0; err: gvt_unpin_guest_page(vgpu, gfn, npage * PAGE_SIZE); diff --git a/drivers/s390/cio/vfio_ccw_cp.c b/drivers/s390/cio/vfio_ccw_cp.c index cd4ec4f6d6ff..8963f452f963 100644 --- a/drivers/s390/cio/vfio_ccw_cp.c +++ b/drivers/s390/cio/vfio_ccw_cp.c @@ -22,8 +22,8 @@ struct page_array { /* Array that stores pages need to pin. */ dma_addr_t *pa_iova; - /* Array that receives PFNs of the pages pinned. */ - unsigned long *pa_pfn; + /* Array that receives the pinned pages. */ + struct page **pa_page; /* Number of pages pinned from @pa_iova. */ int pa_nr; }; @@ -68,19 +68,19 @@ static int page_array_alloc(struct page_array *pa, u64 iova, unsigned int len) return -EINVAL; pa->pa_iova = kcalloc(pa->pa_nr, - sizeof(*pa->pa_
Re: [RFT][PATCH v2 4/9] vfio: Pass in starting IOVA to vfio_pin/unpin_pages API
Reviewed by: Kirti Wankhede On 7/6/2022 11:57 AM, Nicolin Chen wrote: The vfio_pin/unpin_pages() so far accepted arrays of PFNs of user IOVA. Among all three callers, there was only one caller possibly passing in a non-contiguous PFN list, which is now ensured to have contiguous PFN inputs too. Pass in the starting address with "iova" alone to simplify things, so callers no longer need to maintain a PFN list or to pin/unpin one page at a time. This also allows VFIO to use more efficient implementations of pin/unpin_pages. For now, also update vfio_iommu_type1 to fit this new parameter too, while keeping its input intact (being user_iova) since we don't want to spend too much effort swapping its parameters and local variables at that level. Signed-off-by: Nicolin Chen --- .../driver-api/vfio-mediated-device.rst | 4 +-- drivers/gpu/drm/i915/gvt/kvmgt.c | 24 ++--- drivers/s390/cio/vfio_ccw_cp.c| 4 +-- drivers/s390/crypto/vfio_ap_ops.c | 9 +++ drivers/vfio/vfio.c | 27 +-- drivers/vfio/vfio.h | 4 +-- drivers/vfio/vfio_iommu_type1.c | 17 ++-- include/linux/vfio.h | 5 ++-- 8 files changed, 40 insertions(+), 54 deletions(-) diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst index b0fdf76b339a..ea32a0f13ddb 100644 --- a/Documentation/driver-api/vfio-mediated-device.rst +++ b/Documentation/driver-api/vfio-mediated-device.rst @@ -262,10 +262,10 @@ Translation APIs for Mediated Devices The following APIs are provided for translating user pfn to host pfn in a VFIO driver:: - int vfio_pin_pages(struct vfio_device *device, unsigned long *user_pfn, + int vfio_pin_pages(struct vfio_device *device, dma_addr_t iova, int npage, int prot, unsigned long *phys_pfn); - void vfio_unpin_pages(struct vfio_device *device, unsigned long *user_pfn, + void vfio_unpin_pages(struct vfio_device *device, dma_addr_t iova, int npage); These functions call back into the back-end IOMMU module by using the pin_pages diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 8c67c9aba82d..ea6041fa48ac 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -231,16 +231,8 @@ static void intel_gvt_cleanup_vgpu_type_groups(struct intel_gvt *gvt) static void gvt_unpin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, unsigned long size) { - int total_pages; - int npage; - - total_pages = roundup(size, PAGE_SIZE) / PAGE_SIZE; - - for (npage = 0; npage < total_pages; npage++) { - unsigned long cur_gfn = gfn + npage; - - vfio_unpin_pages(>vfio_device, _gfn, 1); - } + vfio_unpin_pages(>vfio_device, gfn << PAGE_SHIFT, +roundup(size, PAGE_SIZE) / PAGE_SIZE); } /* Pin a normal or compound guest page for dma. */ @@ -258,14 +250,14 @@ static int gvt_pin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, * on stack to hold pfns. */ for (npage = 0; npage < total_pages; npage++) { - unsigned long cur_gfn = gfn + npage; + dma_addr_t cur_iova = (gfn + npage) << PAGE_SHIFT; unsigned long pfn; - ret = vfio_pin_pages(>vfio_device, _gfn, 1, + ret = vfio_pin_pages(>vfio_device, cur_iova, 1, IOMMU_READ | IOMMU_WRITE, ); if (ret != 1) { - gvt_vgpu_err("vfio_pin_pages failed for gfn 0x%lx, ret %d\n", -cur_gfn, ret); + gvt_vgpu_err("vfio_pin_pages failed for iova %pad, ret %d\n", +_iova, ret); goto err; } @@ -309,7 +301,7 @@ static int gvt_dma_map_page(struct intel_vgpu *vgpu, unsigned long gfn, if (dma_mapping_error(dev, *dma_addr)) { gvt_vgpu_err("DMA mapping failed for pfn 0x%lx, ret %d\n", page_to_pfn(page), ret); - gvt_unpin_guest_page(vgpu, gfn, size); + gvt_unpin_guest_page(vgpu, gfn << PAGE_SHIFT, size); return -ENOMEM; } @@ -322,7 +314,7 @@ static void gvt_dma_unmap_page(struct intel_vgpu *vgpu, unsigned long gfn, struct device *dev = vgpu->gvt->gt->i915->drm.dev; dma_unmap_page(dev, dma_addr, size, DMA_BIDIRECTIONAL); - gvt_unpin_guest_page(vgpu, gfn, size); + gvt_unpin_guest_page(vgpu, gfn << PAGE_SHIFT, size); } static struct gvt_dma *__gvt_cache_find_dma_addr(struct intel_v
Re: [RFT][PATCH v2 1/9] vfio: Make vfio_unpin_pages() return void
Reviewed-by: Kirti Wankhede On 7/6/2022 11:57 AM, Nicolin Chen wrote: There's only one caller that checks its return value with a WARN_ON_ONCE, while all other callers do not check return value at all. So simplify the API to return void by embedding similar WARN_ON_ONCEs. Suggested-by: Christoph Hellwig Signed-off-by: Nicolin Chen --- .../driver-api/vfio-mediated-device.rst | 2 +- drivers/gpu/drm/i915/gvt/kvmgt.c | 5 +--- drivers/vfio/vfio.c | 24 --- drivers/vfio/vfio.h | 2 +- drivers/vfio/vfio_iommu_type1.c | 16 ++--- include/linux/vfio.h | 4 ++-- 6 files changed, 23 insertions(+), 30 deletions(-) diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst index 1c57815619fd..b0fdf76b339a 100644 --- a/Documentation/driver-api/vfio-mediated-device.rst +++ b/Documentation/driver-api/vfio-mediated-device.rst @@ -265,7 +265,7 @@ driver:: int vfio_pin_pages(struct vfio_device *device, unsigned long *user_pfn, int npage, int prot, unsigned long *phys_pfn); - int vfio_unpin_pages(struct vfio_device *device, unsigned long *user_pfn, + void vfio_unpin_pages(struct vfio_device *device, unsigned long *user_pfn, int npage); These functions call back into the back-end IOMMU module by using the pin_pages diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index e2f6c56ab342..8c67c9aba82d 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -231,18 +231,15 @@ static void intel_gvt_cleanup_vgpu_type_groups(struct intel_gvt *gvt) static void gvt_unpin_guest_page(struct intel_vgpu *vgpu, unsigned long gfn, unsigned long size) { - struct drm_i915_private *i915 = vgpu->gvt->gt->i915; int total_pages; int npage; - int ret; total_pages = roundup(size, PAGE_SIZE) / PAGE_SIZE; for (npage = 0; npage < total_pages; npage++) { unsigned long cur_gfn = gfn + npage; - ret = vfio_unpin_pages(>vfio_device, _gfn, 1); - drm_WARN_ON(>drm, ret != 1); + vfio_unpin_pages(>vfio_device, _gfn, 1); } } diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c index 61e71c1154be..01f45ec70a3d 100644 --- a/drivers/vfio/vfio.c +++ b/drivers/vfio/vfio.c @@ -1959,31 +1959,27 @@ EXPORT_SYMBOL(vfio_pin_pages); * PFNs should not be greater than VFIO_PIN_PAGES_MAX_ENTRIES. * @npage [in] : count of elements in user_pfn array. This count should not * be greater than VFIO_PIN_PAGES_MAX_ENTRIES. - * Return error or number of pages unpinned. */ -int vfio_unpin_pages(struct vfio_device *device, unsigned long *user_pfn, -int npage) +void vfio_unpin_pages(struct vfio_device *device, unsigned long *user_pfn, + int npage) { struct vfio_container *container; struct vfio_iommu_driver *driver; - int ret; - if (!user_pfn || !npage || !vfio_assert_device_open(device)) - return -EINVAL; + if (WARN_ON_ONCE(!user_pfn || !npage || !vfio_assert_device_open(device))) + return; - if (npage > VFIO_PIN_PAGES_MAX_ENTRIES) - return -E2BIG; + if (WARN_ON_ONCE(npage > VFIO_PIN_PAGES_MAX_ENTRIES)) + return; /* group->container cannot change while a vfio device is open */ container = device->group->container; driver = container->iommu_driver; - if (likely(driver && driver->ops->unpin_pages)) - ret = driver->ops->unpin_pages(container->iommu_data, user_pfn, - npage); - else - ret = -ENOTTY; - return ret; + if (WARN_ON_ONCE(unlikely(!driver || !driver->ops->unpin_pages))) + return; + + driver->ops->unpin_pages(container->iommu_data, user_pfn, npage); } EXPORT_SYMBOL(vfio_unpin_pages); diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h index a67130221151..bef4edf58138 100644 --- a/drivers/vfio/vfio.h +++ b/drivers/vfio/vfio.h @@ -53,7 +53,7 @@ struct vfio_iommu_driver_ops { unsigned long *user_pfn, int npage, int prot, unsigned long *phys_pfn); - int (*unpin_pages)(void *iommu_data, + void(*unpin_pages)(void *iommu_data, unsigned long *user_pfn, int npage); int (*register_notifier)(void *iommu_data, unsigned long *events, diff --git a/drivers
Re: [PATCH 34/34] vfio/mdev: Remove mdev drvdata
On 4/11/2022 7:44 PM, Christoph Hellwig wrote: From: Jason Gunthorpe This is no longer used, remove it. All usages were moved over to either use container_of() from a vfio_device or to use dev_drvdata() directly on the mdev. Signed-off-by: Jason Gunthorpe Signed-off-by: Christoph Hellwig --- include/linux/mdev.h | 9 - 1 file changed, 9 deletions(-) Reviewed-by: Kirti Wankhede
Re: [PATCH 33/34] vfio/mdev: Use the driver core to create the 'remove' file
On 4/11/2022 7:44 PM, Christoph Hellwig wrote: From: Jason Gunthorpe The device creator is supposed to use the dev.groups value to add sysfs files before device_add is called, not call sysfs_create_files() after device_add() returns. This creates a race with uevent delivery where the extra attribute will not be visible. This was being done because the groups had been co-opted by the mdev driver, now that prior patches have moved the driver's groups to the struct device_driver the dev.group is properly free for use here. Signed-off-by: Jason Gunthorpe Signed-off-by: Christoph Hellwig --- drivers/vfio/mdev/mdev_core.c| 1 + drivers/vfio/mdev/mdev_private.h | 2 ++ drivers/vfio/mdev/mdev_sysfs.c | 19 ++- 3 files changed, 13 insertions(+), 9 deletions(-) Reviewed-by: Kirti Wankhede
Re: [PATCH 32/34] vfio/mdev: Remove mdev_parent_ops
On 4/11/2022 7:44 PM, Christoph Hellwig wrote: From: Jason Gunthorpe The last useful member in this struct is the supported_type_groups, move it to the mdev_driver and delete mdev_parent_ops. Replace it with mdev_driver as an argument to mdev_register_device() Signed-off-by: Jason Gunthorpe Signed-off-by: Christoph Hellwig --- .../driver-api/vfio-mediated-device.rst | 24 -- drivers/gpu/drm/i915/gvt/kvmgt.c | 7 +- drivers/s390/cio/vfio_ccw_ops.c | 7 +- drivers/s390/crypto/vfio_ap_ops.c | 9 ++- drivers/vfio/mdev/mdev_core.c | 13 -- drivers/vfio/mdev/mdev_private.h | 2 +- drivers/vfio/mdev/mdev_sysfs.c| 6 ++--- include/linux/mdev.h | 25 +++ samples/vfio-mdev/mbochs.c| 9 ++- samples/vfio-mdev/mdpy.c | 9 ++- samples/vfio-mdev/mtty.c | 9 ++- 11 files changed, 28 insertions(+), 92 deletions(-) Reviewed-by: Kirti Wankhede
Re: [PATCH 31/34] vfio/mdev: Remove mdev_parent_ops dev_attr_groups
On 4/11/2022 7:44 PM, Christoph Hellwig wrote: From: Jason Gunthorpe This is only used by one sample to print a fixed string that is pointless. In general, having a device driver attach sysfs attributes to the parent is horrific. This should never happen, and always leads to some kind of liftime bug as it become very difficult for the sysfs attribute to go back to any data owned by the device driver. Remove the general mechanism to create this abuse. Signed-off-by: Jason Gunthorpe Signed-off-by: Christoph Hellwig --- drivers/vfio/mdev/mdev_sysfs.c | 12 ++-- include/linux/mdev.h | 2 -- samples/vfio-mdev/mtty.c | 30 +- 3 files changed, 3 insertions(+), 41 deletions(-) Update Documentation/driver-api/vfio-mediated-device.rst where mtty sample code explained. Tree command output should be updated, i.e remove below 2 lines. |-- mtty_dev | `-- sample_mtty_dev Rest looks good to me. Thanks, Kirti
Re: [PATCH 30/34] vfio/mdev: Remove vfio_mdev.c
On 4/11/2022 7:43 PM, Christoph Hellwig wrote: From: Jason Gunthorpe Now that all mdev drivers directly create their own mdev_device driver and directly register with the vfio core's vfio_device_ops this is all dead code. Delete vfio_mdev.c and the mdev_parent_ops members that are connected to it. Signed-off-by: Jason Gunthorpe Signed-off-by: Christoph Hellwig Reviewed-by: Kirti Wankhede
Re: [PATCH 00/13] Provide core infrastructure for managing open/release
On 7/15/2021 5:50 AM, Jason Gunthorpe wrote: Prologue: This is the first series of three to send the "mlx5_vfio_pci" driver that has been discussed on the list for a while now. - Reorganize reflck to support splitting vfio_pci - Split vfio_pci into vfio_pci/vfio_pci_core and provide infrastructure for non-generic VFIO PCI drivers - The new driver mlx5_vfio_pci that is a full implementation of suspend/resume functionality for mlx5 devices. A preview of all the patches can be seen here: https://github.com/jgunthorpe/linux/commits/mlx5_vfio_pci === This is in support of Max's series to split vfio-pci. For that to work the reflck concept embedded in vfio-pci needs to be sharable across all of the new VFIO PCI drivers which motivated re-examining how this is implemented. Another significant issue is how the VFIO PCI core includes code like: if (pci_dev_driver(pdev) != _pci_driver) Which is not scalable if there are going to be multiple different driver types. This series takes the approach of moving the "reflck" mechanism into the core code as a "device set". Each vfio_device driver can specify how vfio_devices are grouped into the set using a key and the set comes along with a set-global mutex. The core code manages creating per-device set memory and associating it with each vfio_device. In turn this allows the core code to provide an open/close_device() operation that is called only for the first/last FD, and is called under the global device set lock. Review of all the drivers show that they are either already open coding the first/last semantic or are buggy and missing it. All drivers are migrated/fixed to the new open/close_device ops and the unused per-FD open()/release() ops are deleted. Why can't open()/release() ops be reused instead of adding open_device()/close_device(). Thanks, Kirti
Re: [PATCH 08/10] vfio/mtty: Convert to use vfio_register_group_dev()
-static int mtty_reset(struct mdev_device *mdev) +static int mtty_reset(struct mdev_state *mdev_stte) Nit pick: s/mdev_stte/mdev_state +static const struct vfio_device_ops mtty_dev_ops = { + .name = "vfio-mdev", I think name should be different that 'vfio-mdev', probably 'vfio-mdev-mtty' or 'vfio-mtty' Rest looks fine to me. Reviewed-by: Kirti Wankhede
Re: [PATCH 07/10] vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind
On 6/15/2021 7:05 PM, Christoph Hellwig wrote: From: Jason Gunthorpe This allows a mdev driver to opt out of using vfio_mdev.c, instead the driver will provide a 'struct mdev_driver' and register directly with the driver core. Much of mdev_parent_ops becomes unused in this mode: - create()/remove() are done via the mdev_driver probe()/remove() - mdev_attr_groups becomes mdev_driver driver.dev_groups - Wrapper function callbacks are replaced with the same ones from struct vfio_device_ops Signed-off-by: Jason Gunthorpe Signed-off-by: Christoph Hellwig --- drivers/vfio/mdev/mdev_core.c | 30 ++ drivers/vfio/mdev/mdev_driver.c | 10 ++ include/linux/mdev.h| 2 ++ 3 files changed, 34 insertions(+), 8 deletions(-) Reviewed-by: Kirti Wankhede
Re: [PATCH 06/10] vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE
On 6/15/2021 7:05 PM, Christoph Hellwig wrote: From: Jason Gunthorpe For some reason the vfio_mdev shim mdev_driver has its own module and kconfig. As the next patch requires access to it from mdev.ko merge the two modules together and remove VFIO_MDEV_DEVICE. A later patch deletes this driver entirely. Signed-off-by: Jason Gunthorpe Signed-off-by: Christoph Hellwig Reviewed-by: Cornelia Huck Reviewed-by: Kirti Wankhede
Re: [PATCH 02/10] driver core: Better distinguish probe errors in really_probe
On 6/15/2021 7:05 PM, Christoph Hellwig wrote: really_probe tries to special case errors from ->probe, but due to all other initialization added to the function over time now a lot of internal errors hit that code path as well. Untangle that by adding a new probe_err local variable and apply the special casing only to that. Signed-off-by: Christoph Hellwig Reviewed-by: Kirti Wankhede
Re: [PATCH 02/10] driver core: Better distinguish probe errors in really_probe
On 6/14/2021 8:38 PM, Christoph Hellwig wrote: really_probe tries to special case errors from ->probe, but due to all other initialization added to the function over time now a lot of internal errors hit that code path as well. Untangle that by adding a new probe_err local variable and apply the special casing only to that. Signed-off-by: Christoph Hellwig --- drivers/base/dd.c | 72 +++ 1 file changed, 41 insertions(+), 31 deletions(-) diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 7477d3322b3a..999bc737a8f0 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -513,12 +513,42 @@ static ssize_t state_synced_show(struct device *dev, } static DEVICE_ATTR_RO(state_synced); + +static int call_driver_probe(struct device *dev, struct device_driver *drv) +{ + int ret = 0; + + if (dev->bus->probe) + ret = dev->bus->probe(dev); + else if (drv->probe) + ret = drv->probe(dev); + + switch (ret) { + case -EPROBE_DEFER: + /* Driver requested deferred probing */ + dev_dbg(dev, "Driver %s requests probe deferral\n", drv->name); + break; + case -ENODEV: + case -ENXIO: + pr_debug("%s: probe of %s rejects match %d\n", +drv->name, dev_name(dev), ret); + break; + default: + /* driver matched but the probe failed */ + pr_warn("%s: probe of %s failed with error %d\n", + drv->name, dev_name(dev), ret); There should be case 0, that is, success case before default case as below: + case 0: + /* Driver returned success */ + break; Otherwise even in case of success, above warning would mislead that probe has failed. Thanks, Kirti + break; + } + + return ret; +} + static int really_probe(struct device *dev, struct device_driver *drv) { - int ret = -EPROBE_DEFER; int local_trigger_count = atomic_read(_trigger_count); bool test_remove = IS_ENABLED(CONFIG_DEBUG_TEST_DRIVER_REMOVE) && !drv->suppress_bind_attrs; + int ret = -EPROBE_DEFER, probe_ret = 0; if (defer_all_probes) { /* @@ -572,15 +602,15 @@ static int really_probe(struct device *dev, struct device_driver *drv) goto probe_failed; } - if (dev->bus->probe) { - ret = dev->bus->probe(dev); - if (ret) - goto probe_failed; - } else if (drv->probe) { - ret = drv->probe(dev); - if (ret) - goto probe_failed; - } + probe_ret = call_driver_probe(dev, drv); + if (probe_ret) { + /* +* Ignore errors returned by ->probe so that the next driver can +* try its luck. + */ + ret = 0; + goto probe_failed; + } if (device_add_groups(dev, drv->dev_groups)) { dev_err(dev, "device_add_groups() failed\n"); @@ -650,28 +680,8 @@ static int really_probe(struct device *dev, struct device_driver *drv) dev->pm_domain->dismiss(dev); pm_runtime_reinit(dev); dev_pm_set_driver_flags(dev, 0); - - switch (ret) { - case -EPROBE_DEFER: - /* Driver requested deferred probing */ - dev_dbg(dev, "Driver %s requests probe deferral\n", drv->name); + if (probe_ret == -EPROBE_DEFER) driver_deferred_probe_add_trigger(dev, local_trigger_count); - break; - case -ENODEV: - case -ENXIO: - pr_debug("%s: probe of %s rejects match %d\n", -drv->name, dev_name(dev), ret); - break; - default: - /* driver matched but the probe failed */ - pr_warn("%s: probe of %s failed with error %d\n", - drv->name, dev_name(dev), ret); - } - /* -* Ignore errors returned by ->probe so that the next driver can try -* its luck. -*/ - ret = 0; done: atomic_dec(_count); wake_up_all(_waitqueue);
Re: [PATCH 00/10] Allow mdev drivers to directly create the vfio_device
Jason, I couldn't find patch 1,2,4 and 5 of these series. Can you please keep k...@vger.kernel.org cc for all patches? Also it will be helpful if you can add version prefix, eg. 'v3' for this series, in subject line. Thanks, Kirti On 6/8/2021 6:25 AM, Jason Gunthorpe wrote: This is a "v3" of the previous posted full conversion: https://lore.kernel.org/r/0-v2-7667f42c9bad+935-vfio3_...@nvidia.com Without the trailing patches that are running into complications: - The CCW conversion has some complicated remarks - AP is waiting for some locking stuff to get worked out - No feedback on GT - The license change topic for removing vfio_mdev.c Getting the baseline functionality merged will allow Intel's IDXD mdev driver to advance. It has already been RFC posted in the new format: https://lore.kernel.org/kvm/162164243591.261970.3439987543338120797.st...@djiang5-desk3.ch.intel.com/ This series includes base infrastructure and the sample conversions. The remaining four issues can be sorted out one by one. The major change in v3 is to enhance the driver core support for binding based on the request from Christoph Hellwig and Dan Williams. Based on some light analysis this looks broadly useful: https://lore.kernel.org/kvm/20210428233856.gy1370...@nvidia.com/ The mdev bus's core part for managing the lifecycle of devices is mostly as one would expect for a driver core bus subsystem. However instead of having a normal 'struct device_driver' and binding the actual mdev drivers through the standard driver core mechanisms it open codes this with the struct mdev_parent_ops and provides a single driver that shims between the VFIO core's struct vfio_device and the actual device driver. Instead, allow mdev drivers implement an actual struct mdev_driver and directly call vfio_register_group_dev() in the probe() function for the mdev. Arrange to bind the created mdev_device to the mdev_driver that is provided by the end driver. The actual execution flow doesn't change much, eg what was parent_ops->create is now device_driver->probe and it is called at almost the exact same time - except under the normal control of the driver core. Ultimately converting all the drivers unlocks a fair number of additional VFIO simplifications and cleanups. v3: - Use device_driver_attach() from the driver core - 5 new patches to make device_driver_attach() exported and usable for this - Remove trailing patches for now v2: https://lore.kernel.org/r/0-v2-7667f42c9bad+935-vfio3_...@nvidia.com - Keep && m in samples kconfig - Restore accidently squashed removeal of vfio_mdev.c - Remove indirections to call bus_register()/bus_unregister() - Reflow long doc lines v1: https://lore.kernel.org/r/0-v1-d88406ed308e+418-vfio3_...@nvidia.com Jason Gunthorpe (10): driver core: Do not continue searching for drivers if deferred probe is used driver core: Pull required checks into driver_probe_device() driver core: Flow the return code from ->probe() through to sysfs bind driver core: Don't return EPROBE_DEFER to userspace during sysfs bind driver core: Export device_driver_attach() vfio/mdev: Remove CONFIG_VFIO_MDEV_DEVICE vfio/mdev: Allow the mdev_parent_ops to specify the device driver to bind vfio/mtty: Convert to use vfio_register_group_dev() vfio/mdpy: Convert to use vfio_register_group_dev() vfio/mbochs: Convert to use vfio_register_group_dev() Documentation/s390/vfio-ap.rst | 1 - arch/s390/Kconfig| 2 +- drivers/base/base.h | 1 - drivers/base/bus.c | 6 +- drivers/base/dd.c| 116 --- drivers/gpu/drm/i915/Kconfig | 2 +- drivers/vfio/mdev/Kconfig| 7 -- drivers/vfio/mdev/Makefile | 3 +- drivers/vfio/mdev/mdev_core.c| 46 ++-- drivers/vfio/mdev/mdev_driver.c | 10 ++ drivers/vfio/mdev/mdev_private.h | 2 + drivers/vfio/mdev/vfio_mdev.c| 24 +--- include/linux/device.h | 2 + include/linux/mdev.h | 2 + samples/Kconfig | 6 +- samples/vfio-mdev/mbochs.c | 163 +++ samples/vfio-mdev/mdpy.c | 159 ++ samples/vfio-mdev/mtty.c | 185 ++- 18 files changed, 397 insertions(+), 340 deletions(-)
Re: [PATCH v3 11/12] samples: vfio-mdev: constify fb ops
On 12/9/2019 7:31 PM, Jani Nikula wrote: On Tue, 03 Dec 2019, Jani Nikula wrote: Now that the fbops member of struct fb_info is const, we can start making the ops const as well. v2: fix typo (Christophe de Dinechin) Cc: Kirti Wankhede Cc: k...@vger.kernel.org Reviewed-by: Daniel Vetter Signed-off-by: Jani Nikula Kirti, may I have your ack to merge this through drm-misc please? BR, Jani. --- samples/vfio-mdev/mdpy-fb.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/samples/vfio-mdev/mdpy-fb.c b/samples/vfio-mdev/mdpy-fb.c index 2719bb259653..21dbf63d6e41 100644 --- a/samples/vfio-mdev/mdpy-fb.c +++ b/samples/vfio-mdev/mdpy-fb.c @@ -86,7 +86,7 @@ static void mdpy_fb_destroy(struct fb_info *info) iounmap(info->screen_base); } -static struct fb_ops mdpy_fb_ops = { +static const struct fb_ops mdpy_fb_ops = { .owner = THIS_MODULE, .fb_destroy = mdpy_fb_destroy, .fb_setcolreg = mdpy_fb_setcolreg, Acked-by : Kirti Wankhede ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH V11 2/6] modpost: add support for mdev class id
On 11/7/2019 8:41 PM, Jason Wang wrote: Add support to parse mdev class id table. Reviewed-by: Parav Pandit Reviewed-by: Cornelia Huck Signed-off-by: Jason Wang Reviewed-by: Kirti Wankhede Thanks, Kirti --- drivers/vfio/mdev/vfio_mdev.c | 2 ++ scripts/mod/devicetable-offsets.c | 3 +++ scripts/mod/file2alias.c | 11 +++ 3 files changed, 16 insertions(+) diff --git a/drivers/vfio/mdev/vfio_mdev.c b/drivers/vfio/mdev/vfio_mdev.c index 38431e9ef7f5..a6641cd8b5a3 100644 --- a/drivers/vfio/mdev/vfio_mdev.c +++ b/drivers/vfio/mdev/vfio_mdev.c @@ -125,6 +125,8 @@ static const struct mdev_class_id vfio_id_table[] = { { 0 }, }; +MODULE_DEVICE_TABLE(mdev, vfio_id_table); + static struct mdev_driver vfio_mdev_driver = { .name = "vfio_mdev", .probe = vfio_mdev_probe, diff --git a/scripts/mod/devicetable-offsets.c b/scripts/mod/devicetable-offsets.c index 054405b90ba4..6cbb1062488a 100644 --- a/scripts/mod/devicetable-offsets.c +++ b/scripts/mod/devicetable-offsets.c @@ -231,5 +231,8 @@ int main(void) DEVID(wmi_device_id); DEVID_FIELD(wmi_device_id, guid_string); + DEVID(mdev_class_id); + DEVID_FIELD(mdev_class_id, id); + return 0; } diff --git a/scripts/mod/file2alias.c b/scripts/mod/file2alias.c index c91eba751804..45f1c22f49be 100644 --- a/scripts/mod/file2alias.c +++ b/scripts/mod/file2alias.c @@ -1335,6 +1335,16 @@ static int do_wmi_entry(const char *filename, void *symval, char *alias) return 1; } +/* looks like: "mdev:cN" */ +static int do_mdev_entry(const char *filename, void *symval, char *alias) +{ + DEF_FIELD(symval, mdev_class_id, id); + + sprintf(alias, "mdev:c%02X", id); + add_wildcard(alias); + return 1; +} + /* Does namelen bytes of name exactly match the symbol? */ static bool sym_is(const char *name, unsigned namelen, const char *symbol) { @@ -1407,6 +1417,7 @@ static const struct devtable devtable[] = { {"typec", SIZE_typec_device_id, do_typec_entry}, {"tee", SIZE_tee_client_device_id, do_tee_entry}, {"wmi", SIZE_wmi_device_id, do_wmi_entry}, + {"mdev", SIZE_mdev_class_id, do_mdev_entry}, }; /* Create MODULE_ALIAS() statements. ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH V11 3/6] mdev: introduce device specific ops
On 11/7/2019 8:41 PM, Jason Wang wrote: Currently, except for the create and remove, the rest of mdev_parent_ops is designed for vfio-mdev driver only and may not help for kernel mdev driver. With the help of class id, this patch introduces device specific callbacks inside mdev_device structure. This allows different set of callback to be used by vfio-mdev and virtio-mdev. Reviewed-by: Parav Pandit Signed-off-by: Jason Wang Reviewed-by: Kirti Wankhede Thanks, Kirti --- .../driver-api/vfio-mediated-device.rst | 35 + MAINTAINERS | 1 + drivers/gpu/drm/i915/gvt/kvmgt.c | 18 --- drivers/s390/cio/vfio_ccw_ops.c | 18 --- drivers/s390/crypto/vfio_ap_ops.c | 14 +++-- drivers/vfio/mdev/mdev_core.c | 24 - drivers/vfio/mdev/mdev_private.h | 5 ++ drivers/vfio/mdev/vfio_mdev.c | 37 ++--- include/linux/mdev.h | 43 --- include/linux/mdev_vfio_ops.h | 52 +++ samples/vfio-mdev/mbochs.c| 20 --- samples/vfio-mdev/mdpy.c | 20 --- samples/vfio-mdev/mtty.c | 18 --- 13 files changed, 206 insertions(+), 99 deletions(-) create mode 100644 include/linux/mdev_vfio_ops.h diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst index 6709413bee29..04d56884c357 100644 --- a/Documentation/driver-api/vfio-mediated-device.rst +++ b/Documentation/driver-api/vfio-mediated-device.rst @@ -152,15 +152,6 @@ callbacks per mdev parent device, per mdev type, or any other categorization. Vendor drivers are expected to be fully asynchronous in this respect or provide their own internal resource protection.) -The callbacks in the mdev_parent_ops structure are as follows: - -* open: open callback of mediated device -* close: close callback of mediated device -* ioctl: ioctl callback of mediated device -* read : read emulation callback -* write: write emulation callback -* mmap: mmap emulation callback - A driver should use the mdev_parent_ops structure in the function call to register itself with the mdev core driver:: @@ -172,10 +163,34 @@ that a driver should use to unregister itself with the mdev core driver:: extern void mdev_unregister_device(struct device *dev); -It is also required to specify the class_id in create() callback through:: +As multiple types of mediated devices may be supported, class id needs +to be specified in the create() callback. This could be done +explicitly for the device that interacts with the mdev device directly +through:: int mdev_set_class(struct mdev_device *mdev, u16 id); +For the device that uses the mdev bus for its operation, the class +should provide helper function to set class id and device specific +ops. E.g for vfio-mdev devices, the function to be called is:: + + int mdev_set_vfio_ops(struct mdev_device *mdev, + const struct mdev_vfio_device_ops *vfio_ops); + +The class id (set by this function to MDEV_CLASS_ID_VFIO) is used to +match a device with an mdev driver via its id table. The device +specific callbacks (specified in *vfio_ops) are obtainable via +mdev_get_vfio_ops() (for use by the mdev bus driver). A vfio-mdev +device (class id MDEV_CLASS_ID_VFIO) uses the following +device-specific ops: + +* open: open callback of vfio mediated device +* close: close callback of vfio mediated device +* ioctl: ioctl callback of vfio mediated device +* read : read emulation callback +* write: write emulation callback +* mmap: mmap emulation callback + Mediated Device Management Interface Through sysfs == diff --git a/MAINTAINERS b/MAINTAINERS index cba1095547fd..f661d13344d6 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -17121,6 +17121,7 @@ S: Maintained F:Documentation/driver-api/vfio-mediated-device.rst F:drivers/vfio/mdev/ F:include/linux/mdev.h +F: include/linux/mdev_vfio_ops.h F:samples/vfio-mdev/ VFIO PLATFORM DRIVER diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 6420f0dbd31b..662f3a672372 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include @@ -643,6 +644,8 @@ static void kvmgt_put_vfio_device(void *vgpu) vfio_device_put(((struct intel_vgpu *)vgpu)->vdev.vfio_device); } +static const struct mdev_vfio_device_ops intel_vfio_vgpu_dev_ops; + static int intel_vgpu_create(struct kobject *kobj, struct mdev_device *mdev) { struct intel_vgpu *vgpu = NULL; @@ -678,7 +681,7 @@ static int intel_vgpu_create(struct kobject *kobj, struct mdev_device *m
Re: [PATCH V11 1/6] mdev: class id support
On 11/7/2019 8:41 PM, Jason Wang wrote: Mdev bus only supports vfio driver right now, so it doesn't implement match method. But in the future, we may add drivers other than vfio, the first driver could be virtio-mdev. This means we need to add device class id support in bus match method to pair the mdev device and mdev driver correctly. So this patch adds id_table to mdev_driver and class_id for mdev device with the match method for mdev bus. Reviewed-by: Parav Pandit Reviewed-by: Cornelia Huck Signed-off-by: Jason Wang Reviewed-by: Kirti Wankhede Thanks, Kirti --- .../driver-api/vfio-mediated-device.rst | 5 drivers/gpu/drm/i915/gvt/kvmgt.c | 1 + drivers/s390/cio/vfio_ccw_ops.c | 1 + drivers/s390/crypto/vfio_ap_ops.c | 1 + drivers/vfio/mdev/mdev_core.c | 17 + drivers/vfio/mdev/mdev_driver.c | 25 +++ drivers/vfio/mdev/mdev_private.h | 1 + drivers/vfio/mdev/vfio_mdev.c | 6 + include/linux/mdev.h | 8 ++ include/linux/mod_devicetable.h | 8 ++ samples/vfio-mdev/mbochs.c| 1 + samples/vfio-mdev/mdpy.c | 1 + samples/vfio-mdev/mtty.c | 1 + 13 files changed, 76 insertions(+) diff --git a/Documentation/driver-api/vfio-mediated-device.rst b/Documentation/driver-api/vfio-mediated-device.rst index 25eb7d5b834b..6709413bee29 100644 --- a/Documentation/driver-api/vfio-mediated-device.rst +++ b/Documentation/driver-api/vfio-mediated-device.rst @@ -102,12 +102,14 @@ structure to represent a mediated device's driver:: * @probe: called when new device created * @remove: called when device removed * @driver: device driver structure + * @id_table: the ids serviced by this driver */ struct mdev_driver { const char *name; int (*probe) (struct device *dev); void (*remove) (struct device *dev); struct device_driverdriver; +const struct mdev_class_id *id_table; }; A mediated bus driver for mdev should use this structure in the function calls @@ -170,6 +172,9 @@ that a driver should use to unregister itself with the mdev core driver:: extern void mdev_unregister_device(struct device *dev); +It is also required to specify the class_id in create() callback through:: + + int mdev_set_class(struct mdev_device *mdev, u16 id); Mediated Device Management Interface Through sysfs == diff --git a/drivers/gpu/drm/i915/gvt/kvmgt.c b/drivers/gpu/drm/i915/gvt/kvmgt.c index 343d79c1cb7e..6420f0dbd31b 100644 --- a/drivers/gpu/drm/i915/gvt/kvmgt.c +++ b/drivers/gpu/drm/i915/gvt/kvmgt.c @@ -678,6 +678,7 @@ static int intel_vgpu_create(struct kobject *kobj, struct mdev_device *mdev) dev_name(mdev_dev(mdev))); ret = 0; + mdev_set_class(mdev, MDEV_CLASS_ID_VFIO); out: return ret; } diff --git a/drivers/s390/cio/vfio_ccw_ops.c b/drivers/s390/cio/vfio_ccw_ops.c index f0d71ab77c50..cf2c013ae32f 100644 --- a/drivers/s390/cio/vfio_ccw_ops.c +++ b/drivers/s390/cio/vfio_ccw_ops.c @@ -129,6 +129,7 @@ static int vfio_ccw_mdev_create(struct kobject *kobj, struct mdev_device *mdev) private->sch->schid.ssid, private->sch->schid.sch_no); + mdev_set_class(mdev, MDEV_CLASS_ID_VFIO); return 0; } diff --git a/drivers/s390/crypto/vfio_ap_ops.c b/drivers/s390/crypto/vfio_ap_ops.c index 5c0f53c6dde7..07c31070afeb 100644 --- a/drivers/s390/crypto/vfio_ap_ops.c +++ b/drivers/s390/crypto/vfio_ap_ops.c @@ -343,6 +343,7 @@ static int vfio_ap_mdev_create(struct kobject *kobj, struct mdev_device *mdev) list_add(_mdev->node, _dev->mdev_list); mutex_unlock(_dev->lock); + mdev_set_class(mdev, MDEV_CLASS_ID_VFIO); return 0; } diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c index b558d4cfd082..7bfa2e46e829 100644 --- a/drivers/vfio/mdev/mdev_core.c +++ b/drivers/vfio/mdev/mdev_core.c @@ -45,6 +45,17 @@ void mdev_set_drvdata(struct mdev_device *mdev, void *data) } EXPORT_SYMBOL(mdev_set_drvdata); +/* + * Specify the class for the mdev device, this must be called during + * create() callback. + */ +void mdev_set_class(struct mdev_device *mdev, u16 id) +{ + WARN_ON(mdev->class_id); + mdev->class_id = id; +} +EXPORT_SYMBOL(mdev_set_class); + struct device *mdev_dev(struct mdev_device *mdev) { return >dev; @@ -324,6 +335,12 @@ int mdev_device_create(struct kobject *kobj, if (ret) goto ops_create_fail; + if (!mdev->class_id) { + ret = -EINVAL; + dev_warn(dev, "mdev vendor dr
Re: [PATCH V11 4/6] mdev: introduce virtio device and its device ops
On 11/7/2019 8:41 PM, Jason Wang wrote: This patch implements basic support for mdev driver that supports virtio transport for kernel virtio driver. Reviewed-by: Cornelia Huck Signed-off-by: Jason Wang I'm not expert on virtio part, my ack is from mdev perspective. Reviewed-by: Kirti Wankhede Thanks, Kirti --- MAINTAINERS | 1 + drivers/vfio/mdev/mdev_core.c| 21 + drivers/vfio/mdev/mdev_private.h | 2 + include/linux/mdev.h | 6 ++ include/linux/mdev_virtio_ops.h | 147 +++ 5 files changed, 177 insertions(+) create mode 100644 include/linux/mdev_virtio_ops.h diff --git a/MAINTAINERS b/MAINTAINERS index f661d13344d6..4997957443df 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -17248,6 +17248,7 @@ F: include/linux/virtio*.h F:include/uapi/linux/virtio_*.h F:drivers/crypto/virtio/ F:mm/balloon_compaction.c +F: include/linux/mdev_virtio_ops.h VIRTIO BLOCK AND SCSI DRIVERS M:"Michael S. Tsirkin" diff --git a/drivers/vfio/mdev/mdev_core.c b/drivers/vfio/mdev/mdev_core.c index 4e70f19ac145..c58253404ed5 100644 --- a/drivers/vfio/mdev/mdev_core.c +++ b/drivers/vfio/mdev/mdev_core.c @@ -78,6 +78,27 @@ const struct mdev_vfio_device_ops *mdev_get_vfio_ops(struct mdev_device *mdev) } EXPORT_SYMBOL(mdev_get_vfio_ops); +/* + * Specify the virtio device ops for the mdev device, this + * must be called during create() callback for virtio mdev device. + */ +void mdev_set_virtio_ops(struct mdev_device *mdev, +const struct mdev_virtio_device_ops *virtio_ops) +{ + mdev_set_class(mdev, MDEV_CLASS_ID_VIRTIO); + mdev->virtio_ops = virtio_ops; +} +EXPORT_SYMBOL(mdev_set_virtio_ops); + +/* Get the virtio device ops for the mdev device. */ +const struct mdev_virtio_device_ops * +mdev_get_virtio_ops(struct mdev_device *mdev) +{ + WARN_ON(mdev->class_id != MDEV_CLASS_ID_VIRTIO); + return mdev->virtio_ops; +} +EXPORT_SYMBOL(mdev_get_virtio_ops); + struct device *mdev_dev(struct mdev_device *mdev) { return >dev; diff --git a/drivers/vfio/mdev/mdev_private.h b/drivers/vfio/mdev/mdev_private.h index 411227373625..2c74dd032409 100644 --- a/drivers/vfio/mdev/mdev_private.h +++ b/drivers/vfio/mdev/mdev_private.h @@ -11,6 +11,7 @@ #define MDEV_PRIVATE_H #include +#include int mdev_bus_register(void); void mdev_bus_unregister(void); @@ -38,6 +39,7 @@ struct mdev_device { u16 class_id; union { const struct mdev_vfio_device_ops *vfio_ops; + const struct mdev_virtio_device_ops *virtio_ops; }; }; diff --git a/include/linux/mdev.h b/include/linux/mdev.h index 9e37506d1987..f3d75a60c2b5 100644 --- a/include/linux/mdev.h +++ b/include/linux/mdev.h @@ -17,6 +17,7 @@ struct mdev_device; struct mdev_vfio_device_ops; +struct mdev_virtio_device_ops; /* * Called by the parent device driver to set the device which represents @@ -112,6 +113,10 @@ void mdev_set_class(struct mdev_device *mdev, u16 id); void mdev_set_vfio_ops(struct mdev_device *mdev, const struct mdev_vfio_device_ops *vfio_ops); const struct mdev_vfio_device_ops *mdev_get_vfio_ops(struct mdev_device *mdev); +void mdev_set_virtio_ops(struct mdev_device *mdev, +const struct mdev_virtio_device_ops *virtio_ops); +const struct mdev_virtio_device_ops * +mdev_get_virtio_ops(struct mdev_device *mdev); extern struct bus_type mdev_bus_type; @@ -127,6 +132,7 @@ struct mdev_device *mdev_from_dev(struct device *dev); enum { MDEV_CLASS_ID_VFIO = 1, + MDEV_CLASS_ID_VIRTIO = 2, /* New entries must be added here */ }; diff --git a/include/linux/mdev_virtio_ops.h b/include/linux/mdev_virtio_ops.h new file mode 100644 index ..8951331c6629 --- /dev/null +++ b/include/linux/mdev_virtio_ops.h @@ -0,0 +1,147 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +/* + * Virtio mediated device driver + * + * Copyright 2019, Red Hat Corp. + * Author: Jason Wang + */ +#ifndef MDEV_VIRTIO_OPS_H +#define MDEV_VIRTIO_OPS_H + +#include +#include +#include + +#define VIRTIO_MDEV_DEVICE_API_STRING "virtio-mdev" + +struct virtio_mdev_callback { + irqreturn_t (*callback)(void *data); + void *private; +}; + +/** + * struct mdev_virtio_device_ops - Structure to be registered for each + * mdev device to register the device for virtio/vhost drivers. + * + * The callbacks are mandatory unless explicitly mentioned. + * + * @set_vq_address:Set the address of virtqueue + * @mdev: mediated device + * @idx: virtqueue index + * @desc_area: address of desc area + * @driver_area: address of driver area + * @dev
Re: [PATCH v13 5/7] vfio: ABI for mdev display dma-buf operation
On 8/8/2017 11:37 PM, Alex Williamson wrote: > On Tue, 8 Aug 2017 14:18:07 +0530 > Kirti Wankhede <kwankh...@nvidia.com> wrote: > >> On 8/7/2017 11:13 PM, Alex Williamson wrote: >>> On Mon, 7 Aug 2017 08:11:43 + >>> "Zhang, Tina" <tina.zh...@intel.com> wrote: >>> >>>> After going through the previous discussions, here are some summaries may >>>> be related to the current discussion: >>>> 1. How does user mode figure the device capabilities between region and >>>> dma-buf? >>>> VFIO_DEVICE_GET_REGION_INFO could tell if the mdev supports region case. >>>> Otherwise, the mdev supports dma-buf. >>> >>> Why do we need to make this assumption? >> >> Right, we should not make such assumption. Vendor driver might not >> support both or disable console vnc ( for example, for performance >> testing console VNC need to be disabled) >> >>> What happens when dma-buf is >>> superseded? What happens if a device supports both dma-buf and >>> regions? We have a flags field in vfio_device_gfx_plane_info, doesn't >>> it make sense to use it to identify which field, between region_index >>> and fd, is valid? We could even put region_index and fd into a union >>> with the flag bits indicating how to interpret the union, but I'm not >>> sure everyone was onboard with this idea. Seems like a waste of 4 >>> bytes not to do that though. >>> >> >> Agree. >> >>> Thinking further, is the user ever in a situation where they query the >>> graphics plane info and can handle either a dma-buf or a region? It >>> seems more likely that the user needs to know early on which is >>> supported and would then require that they continue to see compatible >>> plane information... Should the user then be able to specify whether >>> they want a dma-buf or a region? Perhaps these flag bits are actually >>> input and the return should be -errno if the driver cannot produce >>> something compatible. >>> >>> Maybe we'd therefore define 3 flag bits: PROBE, DMABUF, REGION. In >>> this initial implementation, DMABUF or REGION would always be set by >>> the user to request that type of interface. Additionally, the QUERY >>> bit could be set to probe compatibility, thus if PROBE and REGION are >>> set, the vendor driver would return success only if it supports the >>> region based interface. If PROBE and DMABUF are set, the vendor driver >>> returns success only if the dma-buf based interface is supported. The >>> value of the remainder of the structure is undefined for PROBE. >>> Additionally setting both DMABUF and REGION is invalid. Undefined >>> flags bits must be validated as zero by the drivers for future use >>> (thus if we later define DMABUFv2, an older driver should >>> automatically return -errno when probed or requested). >>> >> >> Are you saying all of this to be part of VFIO_DEVICE_QUERY_GFX_PLANE ioctl? >> >> Let me summarize what we need, taking QEMU as reference: >> 1. From vfio_initfn(), for REGION case, get region info: >> vfio_get_dev_region_info(.., VFIO_REGION_SUBTYPE_CONSOLE_REGION, >> >console_opregion) >> >> If above return success, setup console REGION and mmap. >> I don't know what is required for DMABUF at this moment. >> >> If console VNC is supported either DMABUF or REGION, initialize console >> and register its callback operations: >> >> static const GraphicHwOps vfio_console_ops = { >> .gfx_update = vfio_console_update_display, >> }; >> >> vdev->console = graphic_console_init(DEVICE(vdev), 0, _console_ops, >> vdev); > > I might structure it that vfio_initfn() would call > ioctl(VFIO_DEVICE_QUERY_GFX_PLANE) with the PROBE bit set with either > DMABUF or REGION set as the interface type in the flags field. If > REGION is the probed interface and success is returned, then QEMU might > go look for regions of the appropriate type, however the > vfio_device_gfx_plane_info structure is canonical source for the region > index, so QEMU would probably be wise to use that and only use the > region type for consistency testing. > >> 2. On above console registration, vfio_console_update_display() gets >> called for each refresh cycle of console. In that: >> - call ioctl(VFIO_DEVICE_QUERY_GFX_PLANE) >> - if (queried size > 0), update QEMU console surface (for REGION case >> read mmaped region, for DMABUF read surface using fd) > >
Re: [PATCH v13 5/7] vfio: ABI for mdev display dma-buf operation
On 8/7/2017 11:13 PM, Alex Williamson wrote: > On Mon, 7 Aug 2017 08:11:43 + > "Zhang, Tina"wrote: > >> After going through the previous discussions, here are some summaries may be >> related to the current discussion: >> 1. How does user mode figure the device capabilities between region and >> dma-buf? >> VFIO_DEVICE_GET_REGION_INFO could tell if the mdev supports region case. >> Otherwise, the mdev supports dma-buf. > > Why do we need to make this assumption? Right, we should not make such assumption. Vendor driver might not support both or disable console vnc ( for example, for performance testing console VNC need to be disabled) > What happens when dma-buf is > superseded? What happens if a device supports both dma-buf and > regions? We have a flags field in vfio_device_gfx_plane_info, doesn't > it make sense to use it to identify which field, between region_index > and fd, is valid? We could even put region_index and fd into a union > with the flag bits indicating how to interpret the union, but I'm not > sure everyone was onboard with this idea. Seems like a waste of 4 > bytes not to do that though. > Agree. > Thinking further, is the user ever in a situation where they query the > graphics plane info and can handle either a dma-buf or a region? It > seems more likely that the user needs to know early on which is > supported and would then require that they continue to see compatible > plane information... Should the user then be able to specify whether > they want a dma-buf or a region? Perhaps these flag bits are actually > input and the return should be -errno if the driver cannot produce > something compatible. > > Maybe we'd therefore define 3 flag bits: PROBE, DMABUF, REGION. In > this initial implementation, DMABUF or REGION would always be set by > the user to request that type of interface. Additionally, the QUERY > bit could be set to probe compatibility, thus if PROBE and REGION are > set, the vendor driver would return success only if it supports the > region based interface. If PROBE and DMABUF are set, the vendor driver > returns success only if the dma-buf based interface is supported. The > value of the remainder of the structure is undefined for PROBE. > Additionally setting both DMABUF and REGION is invalid. Undefined > flags bits must be validated as zero by the drivers for future use > (thus if we later define DMABUFv2, an older driver should > automatically return -errno when probed or requested). > Are you saying all of this to be part of VFIO_DEVICE_QUERY_GFX_PLANE ioctl? Let me summarize what we need, taking QEMU as reference: 1. From vfio_initfn(), for REGION case, get region info: vfio_get_dev_region_info(.., VFIO_REGION_SUBTYPE_CONSOLE_REGION, >console_opregion) If above return success, setup console REGION and mmap. I don't know what is required for DMABUF at this moment. If console VNC is supported either DMABUF or REGION, initialize console and register its callback operations: static const GraphicHwOps vfio_console_ops = { .gfx_update = vfio_console_update_display, }; vdev->console = graphic_console_init(DEVICE(vdev), 0, _console_ops, vdev); 2. On above console registration, vfio_console_update_display() gets called for each refresh cycle of console. In that: - call ioctl(VFIO_DEVICE_QUERY_GFX_PLANE) - if (queried size > 0), update QEMU console surface (for REGION case read mmaped region, for DMABUF read surface using fd) Alex, in your proposal above, my understanding is ioctl(VFIO_DEVICE_QUERY_GFX_PLANE) with PROBE flag should be called in step #1. In step #2, ioctl(VFIO_DEVICE_QUERY_GFX_PLANE) will be without PROBE flag, but either DMABUF or REGION flag based on what is returned as supported by vendor driver in step #1. Is that correct? > It seems like this handles all the cases, the user can ask what's > supported and specifies the interface they want on every call. The > user therefore can also choose between region_index and fd and we can > make that a union. > >> 2. For dma-buf, how to differentiate unsupported vs not initialized? >> For dma-buf, when the mdev doesn't support some arguments, -EINVAL will be >> returned. And -errno will return when meeting other failures, like -ENOMEM. >> If the mdev is not initialized, there won't be any returned err. Just zero >> all the fields in structure vfio_device_gfx_plane_info. > > So we're relying on special values again :-\ For which fields is zero > not a valid value? I prefer the probe interface above unless there are > better ideas. > PROBE will be called during vdev initialization and after that vfio_console_update_display() gets called at every console refresh cycle. But until driver in guest is loaded, mmaped buffer/ DMABUF will not be populated with correct surface data. In that case for QUERY, vendor driver should return (atleast) size=0 which means there is nothing to copy. Once guest driver is loaded and surface is created by guest
Re: [PATCH v13 5/7] vfio: ABI for mdev display dma-buf operation
On 8/2/2017 2:48 AM, Alex Williamson wrote: > On Tue, 25 Jul 2017 17:28:18 +0800 > Tina Zhangwrote: > >> Add VFIO_DEVICE_QUERY_GFX_PLANE ioctl command to let user mode query and >> get the plan and its related information. >> >> The dma-buf's life cycle is handled by user mode and tracked by kernel. >> The returned fd in struct vfio_device_query_gfx_plane can be a new >> fd or an old fd of a re-exported dma-buf. Host User mode can check the >> value of fd and to see if it needs to create new resource according to >> the new fd or just use the existed resource related to the old fd. >> >> Signed-off-by: Tina Zhang >> --- >> include/uapi/linux/vfio.h | 28 >> 1 file changed, 28 insertions(+) >> >> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h >> index ae46105..827a230 100644 >> --- a/include/uapi/linux/vfio.h >> +++ b/include/uapi/linux/vfio.h >> @@ -502,6 +502,34 @@ struct vfio_pci_hot_reset { >> >> #define VFIO_DEVICE_PCI_HOT_RESET _IO(VFIO_TYPE, VFIO_BASE + 13) >> >> +/** >> + * VFIO_DEVICE_QUERY_GFX_PLANE - _IOW(VFIO_TYPE, VFIO_BASE + 14, struct >> vfio_device_query_gfx_plane) >> + * >> + * Set the drm_plane_type and retrieve information about the gfx plane. >> + * >> + * Return: 0 on success, -errno on failure. >> + */ >> +struct vfio_device_gfx_plane_info { >> +__u32 argsz; >> +__u32 flags; >> +/* in */ >> +__u32 drm_plane_type; /* type of plane: DRM_PLANE_TYPE_* */ >> +/* out */ >> +__u32 drm_format; /* drm format of plane */ >> +__u64 drm_format_mod; /* tiled mode */ >> +__u32 width;/* width of plane */ >> +__u32 height; /* height of plane */ >> +__u32 stride; /* stride of plane */ >> +__u32 size; /* size of plane in bytes, align on page*/ >> +__u32 x_pos;/* horizontal position of cursor plane, upper left >> corner in pixels */ >> +__u32 y_pos;/* vertical position of cursor plane, upper left corner >> in lines*/ >> +__u32 region_index; >> +__s32 fd; /* dma-buf fd */ > > How do I know which of these is valid, region_index or fd? Can I ask > for one vs the other? What are the errno values to differentiate > unsupported vs not initialized? Is there a "probe" flag that I can use > to determine what the device supports without allocating those > resources yet? > > Kirti, does this otherwise meet your needs? > I wanted to test my change with this interface, but I couldn't, was busy with other stuff. This structure looks good to me for region type support. > What are the errno values to differentiate > unsupported vs not initialized? In previous version discussion, we decided that vendor driver should set all members to 0 in such cases. Mainly, if size is 0 then there is nothing to copy. Thanks, Kirti > Thanks, > Alex > ___ dri-devel mailing list dri-devel@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/dri-devel