Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Wed, Sep 29, 2021 at 09:22:30AM -0300, Jason Gunthorpe wrote: > On Wed, Sep 29, 2021 at 12:46:14PM +1000, da...@gibson.dropbear.id.au wrote: > > On Tue, Sep 21, 2021 at 10:00:14PM -0300, Jason Gunthorpe wrote: > > > On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote: > > > > > From: Jason Gunthorpe > > > > > Sent: Wednesday, September 22, 2021 12:01 AM > > > > > > > > > > > One open about how to organize the device nodes under > > > > > /dev/vfio/devices/. > > > > > > This RFC adopts a simple policy by keeping a flat layout with mixed > > > > > devname > > > > > > from all kinds of devices. The prerequisite of this model is that > > > > > > devnames > > > > > > from different bus types are unique formats: > > > > > > > > > > This isn't reliable, the devname should just be vfio0, vfio1, etc > > > > > > > > > > The userspace can learn the correct major/minor by inspecting the > > > > > sysfs. > > > > > > > > > > This whole concept should disappear into the prior patch that adds the > > > > > struct device in the first place, and I think most of the code here > > > > > can be deleted once the struct device is used properly. > > > > > > > > > > > > > Can you help elaborate above flow? This is one area where we need > > > > more guidance. > > > > > > > > When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F", > > > > how does Qemu identify which vifo0/1/... is associated with the > > > > specified > > > > :BB:DD.F? > > > > > > When done properly in the kernel the file: > > > > > > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev > > > > > > Will contain the major:minor of the VFIO device. > > > > > > Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat > > > that the major:minor matches. > > > > > > in the above pattern "pci" and ":BB:DD.FF" are the arguments passed > > > to qemu. > > > > I thought part of the appeal of the device centric model was less > > grovelling around in sysfs for information. Using type/address > > directly in /dev seems simpler than having to dig around matching > > things here. > > I would say more regular grovelling. Starting from a sysfs device > directory and querying the VFIO cdev associated with it is much more > normal than what happens today, which also includes passing sysfs > information into an ioctl :\ Hm.. ok. Clearly I'm unfamiliar with the things that do that. Other than current VFIO, the only model I've really seen is where you just point your program at a device node. > > Note that this doesn't have to be done in kernel: you could have the > > kernel just call them /dev/vfio/devices/vfio0, ... but add udev rules > > that create symlinks from say /dev/vfio/pci/:BB:SS.F - > > > ../devices/vfioXX based on the sysfs information. > > This is the right approach if people want to do this, but I'm not sure > it is worth it given backwards compat requires the sysfs path as > input. You mean for userspace that needs to be able to go back to the old VFIO interface as well? It seems silly to force this sysfs mucking about on new programs that depend on the new interface. > We may as well stick with sysfs as the command line interface > for userspace tools. > And I certainly don't want to see userspace tools trying to reverse a > sysfs path into a /dev/ symlink name when they can directly and > reliably learn the correct cdev from the sysfspath. Um.. sure.. but they can get the correct cdev from the sysfspath no matter how we name the cdevs. -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Wed, Sep 29, 2021 at 12:46:14PM +1000, da...@gibson.dropbear.id.au wrote: > On Tue, Sep 21, 2021 at 10:00:14PM -0300, Jason Gunthorpe wrote: > > On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote: > > > > From: Jason Gunthorpe > > > > Sent: Wednesday, September 22, 2021 12:01 AM > > > > > > > > > One open about how to organize the device nodes under > > > > /dev/vfio/devices/. > > > > > This RFC adopts a simple policy by keeping a flat layout with mixed > > > > devname > > > > > from all kinds of devices. The prerequisite of this model is that > > > > > devnames > > > > > from different bus types are unique formats: > > > > > > > > This isn't reliable, the devname should just be vfio0, vfio1, etc > > > > > > > > The userspace can learn the correct major/minor by inspecting the > > > > sysfs. > > > > > > > > This whole concept should disappear into the prior patch that adds the > > > > struct device in the first place, and I think most of the code here > > > > can be deleted once the struct device is used properly. > > > > > > > > > > Can you help elaborate above flow? This is one area where we need > > > more guidance. > > > > > > When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F", > > > how does Qemu identify which vifo0/1/... is associated with the specified > > > :BB:DD.F? > > > > When done properly in the kernel the file: > > > > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev > > > > Will contain the major:minor of the VFIO device. > > > > Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat > > that the major:minor matches. > > > > in the above pattern "pci" and ":BB:DD.FF" are the arguments passed > > to qemu. > > I thought part of the appeal of the device centric model was less > grovelling around in sysfs for information. Using type/address > directly in /dev seems simpler than having to dig around matching > things here. I would say more regular grovelling. Starting from a sysfs device directory and querying the VFIO cdev associated with it is much more normal than what happens today, which also includes passing sysfs information into an ioctl :\ > Note that this doesn't have to be done in kernel: you could have the > kernel just call them /dev/vfio/devices/vfio0, ... but add udev rules > that create symlinks from say /dev/vfio/pci/:BB:SS.F - > > ../devices/vfioXX based on the sysfs information. This is the right approach if people want to do this, but I'm not sure it is worth it given backwards compat requires the sysfs path as input. We may as well stick with sysfs as the command line interface for userspace tools. And I certainly don't want to see userspace tools trying to reverse a sysfs path into a /dev/ symlink name when they can directly and reliably learn the correct cdev from the sysfspath. Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Wed, Sep 29, 2021 at 09:08:25AM +0200, Cornelia Huck wrote: > On Wed, Sep 29 2021, "Tian, Kevin" wrote: > > >> From: David Gibson > >> Sent: Wednesday, September 29, 2021 10:44 AM > >> > >> > One alternative option is to arrange device nodes in sub-directories > >> > based > >> > on the device type. But doing so also adds one trouble to userspace. The > >> > current vfio uAPI is designed to have the user query device type via > >> > VFIO_DEVICE_GET_INFO after opening the device. With this option the user > >> > instead needs to figure out the device type before opening the device, to > >> > identify the sub-directory. > >> > >> Wouldn't this be up to the operator / configuration, rather than the > >> actual software though? I would assume that typically the VFIO > >> program would be pointed at a specific vfio device node file to use, > >> e.g. > >>my-vfio-prog -d /dev/vfio/pci/:0a:03.1 > >> > >> Or more generally, if you're expecting userspace to know a name in a > >> uniqu pattern, they can equally well know a "type/name" pair. > >> > > > > You are correct. Currently: > > > > -device, vfio-pci,host=:BB:DD.F > > -device, vfio-pci,sysfdev=/sys/bus/pci/devices/ :BB:DD.F > > -device, vfio-platform,sysdev=/sys/bus/platform/devices/PNP0103:00 > > > > above is definitely type/name information to find the related node. > > > > Actually even for Jason's proposal we still need such information to > > identify the sysfs path. > > > > Then I feel type-based sub-directory does work. Adding another link > > to sysfs sounds unnecessary now. But I'm not sure whether we still > > want to create /dev/vfio/devices/vfio0 thing and related udev rule > > thing that you pointed out in another mail. > > Still reading through this whole thread, but type-based subdirectories > also make the most sense to me. I don't really see userspace wanting to > grab just any device and then figure out whether it is the device it was > looking for, but rather immediately go to a specific device or at least > a device of a specific type. Even so the kernel should not be creating this, that is a job for udev and some symlinks > Sequentially-numbered devices tend to become really unwieldy in my > experience if you are working on a system with loads of devices. If the user experiance is always to refer to the sysfs node as Kevin shows above then the user never sees the integer. It is very much like how the group number works already, programs always start at the sysfs, do the readlink thing on iommu_group and then get the group number to go to /dev/vfio/X So it is already the case that every piece of software can construct a sysfs path to the device, we are just changing from readlink(iommu_group) to readdir(vfio/vfio_device_XX) Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Wed, Sep 29 2021, "Tian, Kevin" wrote: >> From: David Gibson >> Sent: Wednesday, September 29, 2021 10:44 AM >> >> > One alternative option is to arrange device nodes in sub-directories based >> > on the device type. But doing so also adds one trouble to userspace. The >> > current vfio uAPI is designed to have the user query device type via >> > VFIO_DEVICE_GET_INFO after opening the device. With this option the user >> > instead needs to figure out the device type before opening the device, to >> > identify the sub-directory. >> >> Wouldn't this be up to the operator / configuration, rather than the >> actual software though? I would assume that typically the VFIO >> program would be pointed at a specific vfio device node file to use, >> e.g. >> my-vfio-prog -d /dev/vfio/pci/:0a:03.1 >> >> Or more generally, if you're expecting userspace to know a name in a >> uniqu pattern, they can equally well know a "type/name" pair. >> > > You are correct. Currently: > > -device, vfio-pci,host=:BB:DD.F > -device, vfio-pci,sysfdev=/sys/bus/pci/devices/ :BB:DD.F > -device, vfio-platform,sysdev=/sys/bus/platform/devices/PNP0103:00 > > above is definitely type/name information to find the related node. > > Actually even for Jason's proposal we still need such information to > identify the sysfs path. > > Then I feel type-based sub-directory does work. Adding another link > to sysfs sounds unnecessary now. But I'm not sure whether we still > want to create /dev/vfio/devices/vfio0 thing and related udev rule > thing that you pointed out in another mail. Still reading through this whole thread, but type-based subdirectories also make the most sense to me. I don't really see userspace wanting to grab just any device and then figure out whether it is the device it was looking for, but rather immediately go to a specific device or at least a device of a specific type. Sequentially-numbered devices tend to become really unwieldy in my experience if you are working on a system with loads of devices. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: David Gibson > Sent: Wednesday, September 29, 2021 10:44 AM > > > One alternative option is to arrange device nodes in sub-directories based > > on the device type. But doing so also adds one trouble to userspace. The > > current vfio uAPI is designed to have the user query device type via > > VFIO_DEVICE_GET_INFO after opening the device. With this option the user > > instead needs to figure out the device type before opening the device, to > > identify the sub-directory. > > Wouldn't this be up to the operator / configuration, rather than the > actual software though? I would assume that typically the VFIO > program would be pointed at a specific vfio device node file to use, > e.g. > my-vfio-prog -d /dev/vfio/pci/:0a:03.1 > > Or more generally, if you're expecting userspace to know a name in a > uniqu pattern, they can equally well know a "type/name" pair. > You are correct. Currently: -device, vfio-pci,host=:BB:DD.F -device, vfio-pci,sysfdev=/sys/bus/pci/devices/ :BB:DD.F -device, vfio-platform,sysdev=/sys/bus/platform/devices/PNP0103:00 above is definitely type/name information to find the related node. Actually even for Jason's proposal we still need such information to identify the sysfs path. Then I feel type-based sub-directory does work. Adding another link to sysfs sounds unnecessary now. But I'm not sure whether we still want to create /dev/vfio/devices/vfio0 thing and related udev rule thing that you pointed out in another mail. Jason? Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: David Gibson > Sent: Wednesday, September 29, 2021 10:44 AM > > > > > One open about how to organize the device nodes under > /dev/vfio/devices/. > > This RFC adopts a simple policy by keeping a flat layout with mixed > devname > > from all kinds of devices. The prerequisite of this model is that devnames > > from different bus types are unique formats: > > > > /dev/vfio/devices/:00:14.2 (pci) > > /dev/vfio/devices/PNP0103:00 (platform) > > /dev/vfio/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001 (mdev) > > Oof. I really don't think this is a good idea. Ensuring that a > format is "unique" in the sense that it can't collide with any of the > other formats, for *every* value of the parameters on both sides is > actually pretty complicated in general. > > I think per-type sub-directories would be helpful here, Jason's > suggestion of just sequential numbers would work as well. we'll follow Jason's suggestion in next version. > > + /* > > +* Refcounting can't start until the driver call register. Don’t > > +* start twice when the device is exposed in both group and > nongroup > > +* interfaces. > > +*/ > > + if (!refcount_read(&device->refcount)) > > Is there a possible race here with something getting in and > incrementing the refcount between the read and set? this will not be required in next version, which will always create both group and nongroup interfaces for every device (then let driver providing .bind_iommufd() callback for whether nongroup interface is functional). It will be centrally processed within existing vfio_[un]register_group_dev(), thus above race is not a concern any more. > > > + refcount_set(&device->refcount, 1); > > > > mutex_lock(&group->device_lock); > > list_add(&device->group_next, &group->device_list); > > @@ -804,7 +810,78 @@ int vfio_register_group_dev(struct vfio_device > *device) > > > > return 0; > > } > > -EXPORT_SYMBOL_GPL(vfio_register_group_dev); > > + > > +static int __vfio_register_nongroup_dev(struct vfio_device *device) > > +{ > > + struct vfio_device *existing_device; > > + struct device *dev; > > + int ret = 0, minor; > > + > > + mutex_lock(&vfio.device_lock); > > + list_for_each_entry(existing_device, &vfio.device_list, vfio_next) { > > + if (existing_device == device) { > > + ret = -EBUSY; > > + goto out_unlock; > > This indicates a bug in the caller, doesn't it? Should it be a BUG or > WARN instead? this call is initiated by userspace. Per Jason's suggestion we don't even need to check it then no lock is required. Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote: > With /dev/vfio/devices introduced, now a vfio device driver has three > options to expose its device to userspace: > > a) only legacy group interface, for devices which haven't been moved to > iommufd (e.g. platform devices, sw mdev, etc.); > > b) both legacy group interface and new device-centric interface, for > devices which supports iommufd but also wants to keep backward > compatibility (e.g. pci devices in this RFC); > > c) only new device-centric interface, for new devices which don't carry > backward compatibility burden (e.g. hw mdev/subdev with pasid); > > This patch introduces vfio_[un]register_device() helpers for the device > drivers to specify the device exposure policy to vfio core. Hence the > existing vfio_[un]register_group_dev() become the wrapper of the new > helper functions. The new device-centric interface is described as > 'nongroup' to differentiate from existing 'group' stuff. > > TBD: this patch needs to rebase on top of below series from Christoph in > next version. > > "cleanup vfio iommu_group creation" > > Legacy userspace continues to follow the legacy group interface. > > Newer userspace can first try the new device-centric interface if the > device is present under /dev/vfio/devices. Otherwise fall back to the > group interface. > > One open about how to organize the device nodes under /dev/vfio/devices/. > This RFC adopts a simple policy by keeping a flat layout with mixed devname > from all kinds of devices. The prerequisite of this model is that devnames > from different bus types are unique formats: > > /dev/vfio/devices/:00:14.2 (pci) > /dev/vfio/devices/PNP0103:00 (platform) > /dev/vfio/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001 (mdev) Oof. I really don't think this is a good idea. Ensuring that a format is "unique" in the sense that it can't collide with any of the other formats, for *every* value of the parameters on both sides is actually pretty complicated in general. I think per-type sub-directories would be helpful here, Jason's suggestion of just sequential numbers would work as well. > One alternative option is to arrange device nodes in sub-directories based > on the device type. But doing so also adds one trouble to userspace. The > current vfio uAPI is designed to have the user query device type via > VFIO_DEVICE_GET_INFO after opening the device. With this option the user > instead needs to figure out the device type before opening the device, to > identify the sub-directory. Wouldn't this be up to the operator / configuration, rather than the actual software though? I would assume that typically the VFIO program would be pointed at a specific vfio device node file to use, e.g. my-vfio-prog -d /dev/vfio/pci/:0a:03.1 Or more generally, if you're expecting userspace to know a name in a uniqu pattern, they can equally well know a "type/name" pair. > Another tricky thing is that "pdev. vs. mdev" > and "pci vs. platform vs. ccw,..." are orthogonal categorizations. Need > more thoughts on whether both or just one category should be used to define > the sub-directories. > > Signed-off-by: Liu Yi L > --- > drivers/vfio/vfio.c | 137 +++ > include/linux/vfio.h | 9 +++ > 2 files changed, 134 insertions(+), 12 deletions(-) > > diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c > index 84436d7abedd..1e87b25962f1 100644 > --- a/drivers/vfio/vfio.c > +++ b/drivers/vfio/vfio.c > @@ -51,6 +51,7 @@ static struct vfio { > struct cdev device_cdev; > dev_t device_devt; > struct mutexdevice_lock; > + struct list_headdevice_list; > struct idr device_idr; > } vfio; > > @@ -757,7 +758,7 @@ void vfio_init_group_dev(struct vfio_device *device, > struct device *dev, > } > EXPORT_SYMBOL_GPL(vfio_init_group_dev); > > -int vfio_register_group_dev(struct vfio_device *device) > +static int __vfio_register_group_dev(struct vfio_device *device) > { > struct vfio_device *existing_device; > struct iommu_group *iommu_group; > @@ -794,8 +795,13 @@ int vfio_register_group_dev(struct vfio_device *device) > /* Our reference on group is moved to the device */ > device->group = group; > > - /* Refcounting can't start until the driver calls register */ > - refcount_set(&device->refcount, 1); > + /* > + * Refcounting can't start until the driver call register. Don’t > + * start twice when the device is exposed in both group and nongroup > + * interfaces. > + */ > + if (!refcount_read(&device->refcount)) Is there a possible race here with something getting in and incrementing the refcount between the read and set? > + refcount_set(&device->refcount, 1); > > mutex_lock(&group->device_lock); >
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Tue, Sep 21, 2021 at 10:00:14PM -0300, Jason Gunthorpe wrote: > On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote: > > > From: Jason Gunthorpe > > > Sent: Wednesday, September 22, 2021 12:01 AM > > > > > > > One open about how to organize the device nodes under > > > /dev/vfio/devices/. > > > > This RFC adopts a simple policy by keeping a flat layout with mixed > > > devname > > > > from all kinds of devices. The prerequisite of this model is that > > > > devnames > > > > from different bus types are unique formats: > > > > > > This isn't reliable, the devname should just be vfio0, vfio1, etc > > > > > > The userspace can learn the correct major/minor by inspecting the > > > sysfs. > > > > > > This whole concept should disappear into the prior patch that adds the > > > struct device in the first place, and I think most of the code here > > > can be deleted once the struct device is used properly. > > > > > > > Can you help elaborate above flow? This is one area where we need > > more guidance. > > > > When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F", > > how does Qemu identify which vifo0/1/... is associated with the specified > > :BB:DD.F? > > When done properly in the kernel the file: > > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev > > Will contain the major:minor of the VFIO device. > > Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat > that the major:minor matches. > > in the above pattern "pci" and ":BB:DD.FF" are the arguments passed > to qemu. I thought part of the appeal of the device centric model was less grovelling around in sysfs for information. Using type/address directly in /dev seems simpler than having to dig around matching things here. Note that this doesn't have to be done in kernel: you could have the kernel just call them /dev/vfio/devices/vfio0, ... but add udev rules that create symlinks from say /dev/vfio/pci/:BB:SS.F - > ../devices/vfioXX based on the sysfs information. > > You can look at this for some general over engineered code to handle > opening from a sysfs handle like above: > > https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c > > Jason > -- David Gibson| I'll have my music baroque, and my code david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_ | _way_ _around_! http://www.ozlabs.org/~dgibson signature.asc Description: PGP signature ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Thu, Sep 23, 2021 at 09:25:27AM +0200, Eric Auger wrote: > Hi, > > On 9/22/21 3:00 AM, Jason Gunthorpe wrote: > > On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote: > >>> From: Jason Gunthorpe > >>> Sent: Wednesday, September 22, 2021 12:01 AM > >>> > One open about how to organize the device nodes under > >>> /dev/vfio/devices/. > This RFC adopts a simple policy by keeping a flat layout with mixed > >>> devname > from all kinds of devices. The prerequisite of this model is that > devnames > from different bus types are unique formats: > >>> This isn't reliable, the devname should just be vfio0, vfio1, etc > >>> > >>> The userspace can learn the correct major/minor by inspecting the > >>> sysfs. > >>> > >>> This whole concept should disappear into the prior patch that adds the > >>> struct device in the first place, and I think most of the code here > >>> can be deleted once the struct device is used properly. > >>> > >> Can you help elaborate above flow? This is one area where we need > >> more guidance. > >> > >> When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F", > >> how does Qemu identify which vifo0/1/... is associated with the specified > >> :BB:DD.F? > > When done properly in the kernel the file: > > > > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev > > > > Will contain the major:minor of the VFIO device. > > > > Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat > > that the major:minor matches. > > > > in the above pattern "pci" and ":BB:DD.FF" are the arguments passed > > to qemu. > I guess this would be the same for platform devices, for instance > /sys/bus/platform/devices/AMDI8001:01/vfio/vfioX/dev, right? Yes, it is the general driver core pattern for creating cdevs below a parent device Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
Hi, On 9/22/21 3:00 AM, Jason Gunthorpe wrote: > On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote: >>> From: Jason Gunthorpe >>> Sent: Wednesday, September 22, 2021 12:01 AM >>> One open about how to organize the device nodes under >>> /dev/vfio/devices/. This RFC adopts a simple policy by keeping a flat layout with mixed >>> devname from all kinds of devices. The prerequisite of this model is that devnames from different bus types are unique formats: >>> This isn't reliable, the devname should just be vfio0, vfio1, etc >>> >>> The userspace can learn the correct major/minor by inspecting the >>> sysfs. >>> >>> This whole concept should disappear into the prior patch that adds the >>> struct device in the first place, and I think most of the code here >>> can be deleted once the struct device is used properly. >>> >> Can you help elaborate above flow? This is one area where we need >> more guidance. >> >> When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F", >> how does Qemu identify which vifo0/1/... is associated with the specified >> :BB:DD.F? > When done properly in the kernel the file: > > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev > > Will contain the major:minor of the VFIO device. > > Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat > that the major:minor matches. > > in the above pattern "pci" and ":BB:DD.FF" are the arguments passed > to qemu. I guess this would be the same for platform devices, for instance /sys/bus/platform/devices/AMDI8001:01/vfio/vfioX/dev, right? Thanks Eric > > You can look at this for some general over engineered code to handle > opening from a sysfs handle like above: > > https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c > > Jason > ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: Jason Gunthorpe > Sent: Thursday, September 23, 2021 7:52 AM > > On Wed, Sep 22, 2021 at 11:45:33PM +, Tian, Kevin wrote: > > > From: Alex Williamson > > > btw I realized another related piece regarding to the new layout that > > Jason suggested, which have sys device node include a link to the vfio > > devnode: > > > > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev > > > > This for sure requires specific vfio driver support to get the link > > established. > > It doesn't. Just set the parent device of the vfio_device's struct > device to the physical struct device that vfio is already tracking - > ie the struct device providing the IOMMU. The driver core takes care > of everything else. > Thanks for correction. So it's still the same try-and-fail model for both devices which support iommufd and which do not. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Wed, Sep 22, 2021 at 02:10:36PM -0600, Alex Williamson wrote: > But why would we create vfio device interface files at all if they > can't work? I'm not really on board with creating a try-and-fail > interface for a mechanism that cannot work for a given device. The > existence of the device interface should indicate that it's supported. I'm a little worried about adding a struct device to vfio_device and then making it optional.. That is a really weird situation. I suppose you could create the sysfs presence in the struct device but not create a cdev. However, if we ever want to use the device fd for something else, like querying the device driver capabilities or mode, (ie clean the driver_api thing wrongly placed in mdev sysfs for instance), we are blocked as the uAPI will be cdev == must support iommufd.. Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Wed, Sep 22, 2021 at 11:45:33PM +, Tian, Kevin wrote: > > From: Alex Williamson > btw I realized another related piece regarding to the new layout that > Jason suggested, which have sys device node include a link to the vfio > devnode: > > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev > > This for sure requires specific vfio driver support to get the link > established. It doesn't. Just set the parent device of the vfio_device's struct device to the physical struct device that vfio is already tracking - ie the struct device providing the IOMMU. The driver core takes care of everything else. Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: Alex Williamson > Sent: Thursday, September 23, 2021 6:45 AM > > On Wed, 22 Sep 2021 22:34:42 + > "Tian, Kevin" wrote: > > > > From: Alex Williamson > > > Sent: Thursday, September 23, 2021 4:11 AM > > > > > > On Wed, 22 Sep 2021 09:22:52 -0300 > > > Jason Gunthorpe wrote: > > > > > > > On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote: > > > > > > > > > > Providing an ioctl to bind to a normal VFIO container or group might > > > > > > allow a reasonable fallback in userspace.. > > > > > > > > > > I didn't get this point though. An error in binding already allows the > > > > > user to fall back to the group path. Why do we need introduce > another > > > > > ioctl to explicitly bind to container via the nongroup interface? > > > > > > > > New userspace still needs a fallback path if it hits the 'try and > > > > fail'. Keeping the device FD open and just using a different ioctl to > > > > bind to a container/group FD, which new userspace can then obtain as > a > > > > fallback, might be OK. > > > > > > > > Hard to see without going through the qemu parts, so maybe just keep > > > > it in mind > > > > > > If we assume that the container/group/device interface is essentially > > > deprecated once we have iommufd, it doesn't make a lot of sense to me > > > to tack on a container/device interface just so userspace can avoid > > > reverting to the fully legacy interface. > > > > > > But why would we create vfio device interface files at all if they > > > can't work? I'm not really on board with creating a try-and-fail > > > interface for a mechanism that cannot work for a given device. The > > > existence of the device interface should indicate that it's supported. > > > Thanks, > > > > > > > Now it's a try-and-fail model even for devices which support iommufd. > > Per Jason's suggestion, a device is always opened with a parked fops > > which supports only bind. Binding serves as the contract for handling > > exclusive ownership on a device and switching to normal fops if > > succeed. So the user has to try-and-fail in case multiple threads attempt > > to open a same device. Device which doesn't support iommufd is not > > different, except binding request 100% fails (due to missing .bind_iommufd > > in kernel driver). > > That's a rather important difference. I don't really see how that's > comparable to the mutually exclusive nature of the legacy vs device I didn't get the 'comparable' part. Can you elaborate? > interface. We're not going to present a vfio device interface for SW > mdevs that can't participate in iommufd, right? Thanks, > Did you see any problem if exposing sw mdev now? Following above explanation the try-and-fail model should still work... btw I realized another related piece regarding to the new layout that Jason suggested, which have sys device node include a link to the vfio devnode: /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev This for sure requires specific vfio driver support to get the link established. if we only do it for vfio-pci in the start, then for other devices which don't support iommufd there is no way for the user to identify the corresponding vfio devnode even it's still exposed. Then try-and-fail model may not even been reached for those devices. Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Wed, 22 Sep 2021 22:34:42 + "Tian, Kevin" wrote: > > From: Alex Williamson > > Sent: Thursday, September 23, 2021 4:11 AM > > > > On Wed, 22 Sep 2021 09:22:52 -0300 > > Jason Gunthorpe wrote: > > > > > On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote: > > > > > > > > Providing an ioctl to bind to a normal VFIO container or group might > > > > > allow a reasonable fallback in userspace.. > > > > > > > > I didn't get this point though. An error in binding already allows the > > > > user to fall back to the group path. Why do we need introduce another > > > > ioctl to explicitly bind to container via the nongroup interface? > > > > > > New userspace still needs a fallback path if it hits the 'try and > > > fail'. Keeping the device FD open and just using a different ioctl to > > > bind to a container/group FD, which new userspace can then obtain as a > > > fallback, might be OK. > > > > > > Hard to see without going through the qemu parts, so maybe just keep > > > it in mind > > > > If we assume that the container/group/device interface is essentially > > deprecated once we have iommufd, it doesn't make a lot of sense to me > > to tack on a container/device interface just so userspace can avoid > > reverting to the fully legacy interface. > > > > But why would we create vfio device interface files at all if they > > can't work? I'm not really on board with creating a try-and-fail > > interface for a mechanism that cannot work for a given device. The > > existence of the device interface should indicate that it's supported. > > Thanks, > > > > Now it's a try-and-fail model even for devices which support iommufd. > Per Jason's suggestion, a device is always opened with a parked fops > which supports only bind. Binding serves as the contract for handling > exclusive ownership on a device and switching to normal fops if > succeed. So the user has to try-and-fail in case multiple threads attempt > to open a same device. Device which doesn't support iommufd is not > different, except binding request 100% fails (due to missing .bind_iommufd > in kernel driver). That's a rather important difference. I don't really see how that's comparable to the mutually exclusive nature of the legacy vs device interface. We're not going to present a vfio device interface for SW mdevs that can't participate in iommufd, right? Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: Alex Williamson > Sent: Thursday, September 23, 2021 4:11 AM > > On Wed, 22 Sep 2021 09:22:52 -0300 > Jason Gunthorpe wrote: > > > On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote: > > > > > > Providing an ioctl to bind to a normal VFIO container or group might > > > > allow a reasonable fallback in userspace.. > > > > > > I didn't get this point though. An error in binding already allows the > > > user to fall back to the group path. Why do we need introduce another > > > ioctl to explicitly bind to container via the nongroup interface? > > > > New userspace still needs a fallback path if it hits the 'try and > > fail'. Keeping the device FD open and just using a different ioctl to > > bind to a container/group FD, which new userspace can then obtain as a > > fallback, might be OK. > > > > Hard to see without going through the qemu parts, so maybe just keep > > it in mind > > If we assume that the container/group/device interface is essentially > deprecated once we have iommufd, it doesn't make a lot of sense to me > to tack on a container/device interface just so userspace can avoid > reverting to the fully legacy interface. > > But why would we create vfio device interface files at all if they > can't work? I'm not really on board with creating a try-and-fail > interface for a mechanism that cannot work for a given device. The > existence of the device interface should indicate that it's supported. > Thanks, > Now it's a try-and-fail model even for devices which support iommufd. Per Jason's suggestion, a device is always opened with a parked fops which supports only bind. Binding serves as the contract for handling exclusive ownership on a device and switching to normal fops if succeed. So the user has to try-and-fail in case multiple threads attempt to open a same device. Device which doesn't support iommufd is not different, except binding request 100% fails (due to missing .bind_iommufd in kernel driver). Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Wed, 22 Sep 2021 09:22:52 -0300 Jason Gunthorpe wrote: > On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote: > > > > Providing an ioctl to bind to a normal VFIO container or group might > > > allow a reasonable fallback in userspace.. > > > > I didn't get this point though. An error in binding already allows the > > user to fall back to the group path. Why do we need introduce another > > ioctl to explicitly bind to container via the nongroup interface? > > New userspace still needs a fallback path if it hits the 'try and > fail'. Keeping the device FD open and just using a different ioctl to > bind to a container/group FD, which new userspace can then obtain as a > fallback, might be OK. > > Hard to see without going through the qemu parts, so maybe just keep > it in mind If we assume that the container/group/device interface is essentially deprecated once we have iommufd, it doesn't make a lot of sense to me to tack on a container/device interface just so userspace can avoid reverting to the fully legacy interface. But why would we create vfio device interface files at all if they can't work? I'm not really on board with creating a try-and-fail interface for a mechanism that cannot work for a given device. The existence of the device interface should indicate that it's supported. Thanks, Alex ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: Jason Gunthorpe > Sent: Wednesday, September 22, 2021 8:23 PM > > On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote: > > > > Providing an ioctl to bind to a normal VFIO container or group might > > > allow a reasonable fallback in userspace.. > > > > I didn't get this point though. An error in binding already allows the > > user to fall back to the group path. Why do we need introduce another > > ioctl to explicitly bind to container via the nongroup interface? > > New userspace still needs a fallback path if it hits the 'try and > fail'. Keeping the device FD open and just using a different ioctl to > bind to a container/group FD, which new userspace can then obtain as a > fallback, might be OK. > > Hard to see without going through the qemu parts, so maybe just keep > it in mind > sure. will figure it out when working on the qemu part. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote: > > Providing an ioctl to bind to a normal VFIO container or group might > > allow a reasonable fallback in userspace.. > > I didn't get this point though. An error in binding already allows the > user to fall back to the group path. Why do we need introduce another > ioctl to explicitly bind to container via the nongroup interface? New userspace still needs a fallback path if it hits the 'try and fail'. Keeping the device FD open and just using a different ioctl to bind to a container/group FD, which new userspace can then obtain as a fallback, might be OK. Hard to see without going through the qemu parts, so maybe just keep it in mind Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: Jason Gunthorpe > Sent: Wednesday, September 22, 2021 8:54 AM > > On Tue, Sep 21, 2021 at 11:10:15PM +, Tian, Kevin wrote: > > > From: Jason Gunthorpe > > > Sent: Wednesday, September 22, 2021 12:01 AM > > > > > > On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote: > > > > With /dev/vfio/devices introduced, now a vfio device driver has three > > > > options to expose its device to userspace: > > > > > > > > a) only legacy group interface, for devices which haven't been moved > to > > > > iommufd (e.g. platform devices, sw mdev, etc.); > > > > > > > > b) both legacy group interface and new device-centric interface, for > > > > devices which supports iommufd but also wants to keep backward > > > > compatibility (e.g. pci devices in this RFC); > > > > > > > > c) only new device-centric interface, for new devices which don't carry > > > > backward compatibility burden (e.g. hw mdev/subdev with pasid); > > > > > > We shouldn't have 'b'? Where does it come from? > > > > a vfio-pci device can be opened via the existing group interface. if no b) > > it > > means legacy vfio userspace can never use vfio-pci device any more > > once the latter is moved to iommufd. > > Sorry, I think I ment a, which I guess you will say is SW mdev devices > > But even so, I think the way forward here is to still always expose > the device /dev/vfio/devices/X and some devices may not allow iommufd > usage initially. After another thought this should work. Following your comments in other places, we'll move the handling of BIND_IOMMUFD to vfio core which then invoke .bind_iommufd() from the driver. For devices which don't allow iommufd now, the callback is null thus an error is returned. This leaves the userspace in a try-and-fail mode. It first opens the device fd and iommufd, and then try to connect the two together. If failed then fallback to the legacy group interface. Then we don't need a) at all. and we can even avoid introducing new vfio_[un]register_device() at this point. Just leverage existing vfio_[un]register_group_dev() to cover b). new helpers can be introduced later when c) is supported. > > Providing an ioctl to bind to a normal VFIO container or group might > allow a reasonable fallback in userspace.. > I didn't get this point though. An error in binding already allows the user to fall back to the group path. Why do we need introduce another ioctl to explicitly bind to container via the nongroup interface? Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: Jason Gunthorpe > Sent: Wednesday, September 22, 2021 9:00 AM > > On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote: > > > From: Jason Gunthorpe > > > Sent: Wednesday, September 22, 2021 12:01 AM > > > > > > > One open about how to organize the device nodes under > > > /dev/vfio/devices/. > > > > This RFC adopts a simple policy by keeping a flat layout with mixed > > > devname > > > > from all kinds of devices. The prerequisite of this model is that > devnames > > > > from different bus types are unique formats: > > > > > > This isn't reliable, the devname should just be vfio0, vfio1, etc > > > > > > The userspace can learn the correct major/minor by inspecting the > > > sysfs. > > > > > > This whole concept should disappear into the prior patch that adds the > > > struct device in the first place, and I think most of the code here > > > can be deleted once the struct device is used properly. > > > > > > > Can you help elaborate above flow? This is one area where we need > > more guidance. > > > > When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F", > > how does Qemu identify which vifo0/1/... is associated with the specified > > :BB:DD.F? > > When done properly in the kernel the file: > > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev > > Will contain the major:minor of the VFIO device. > > Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat > that the major:minor matches. ah, that's the trick. > > in the above pattern "pci" and ":BB:DD.FF" are the arguments passed > to qemu. > > You can look at this for some general over engineered code to handle > opening from a sysfs handle like above: > > https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c > will check. Thanks for suggestion. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote: > > From: Jason Gunthorpe > > Sent: Wednesday, September 22, 2021 12:01 AM > > > > > One open about how to organize the device nodes under > > /dev/vfio/devices/. > > > This RFC adopts a simple policy by keeping a flat layout with mixed > > devname > > > from all kinds of devices. The prerequisite of this model is that devnames > > > from different bus types are unique formats: > > > > This isn't reliable, the devname should just be vfio0, vfio1, etc > > > > The userspace can learn the correct major/minor by inspecting the > > sysfs. > > > > This whole concept should disappear into the prior patch that adds the > > struct device in the first place, and I think most of the code here > > can be deleted once the struct device is used properly. > > > > Can you help elaborate above flow? This is one area where we need > more guidance. > > When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F", > how does Qemu identify which vifo0/1/... is associated with the specified > :BB:DD.F? When done properly in the kernel the file: /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev Will contain the major:minor of the VFIO device. Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat that the major:minor matches. in the above pattern "pci" and ":BB:DD.FF" are the arguments passed to qemu. You can look at this for some general over engineered code to handle opening from a sysfs handle like above: https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: Jason Gunthorpe > Sent: Wednesday, September 22, 2021 8:54 AM > > On Tue, Sep 21, 2021 at 11:10:15PM +, Tian, Kevin wrote: > > > From: Jason Gunthorpe > > > Sent: Wednesday, September 22, 2021 12:01 AM > > > > > > On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote: > > > > With /dev/vfio/devices introduced, now a vfio device driver has three > > > > options to expose its device to userspace: > > > > > > > > a) only legacy group interface, for devices which haven't been moved > to > > > > iommufd (e.g. platform devices, sw mdev, etc.); > > > > > > > > b) both legacy group interface and new device-centric interface, for > > > > devices which supports iommufd but also wants to keep backward > > > > compatibility (e.g. pci devices in this RFC); > > > > > > > > c) only new device-centric interface, for new devices which don't carry > > > > backward compatibility burden (e.g. hw mdev/subdev with pasid); > > > > > > We shouldn't have 'b'? Where does it come from? > > > > a vfio-pci device can be opened via the existing group interface. if no b) > > it > > means legacy vfio userspace can never use vfio-pci device any more > > once the latter is moved to iommufd. > > Sorry, I think I ment a, which I guess you will say is SW mdev devices We listed a) here in case we don't want to move all vfio device types to use iommufd in one breath. It's supposed to be a type valid only in this transition phase. In the end only b) and c) are valid. > > But even so, I think the way forward here is to still always expose > the device /dev/vfio/devices/X and some devices may not allow iommufd > usage initially. > > Providing an ioctl to bind to a normal VFIO container or group might > allow a reasonable fallback in userspace.. > but doesn't a new ioctl still imply breaking existing vfio userspace? Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: Jason Gunthorpe > Sent: Wednesday, September 22, 2021 12:01 AM > > > One open about how to organize the device nodes under > /dev/vfio/devices/. > > This RFC adopts a simple policy by keeping a flat layout with mixed > devname > > from all kinds of devices. The prerequisite of this model is that devnames > > from different bus types are unique formats: > > This isn't reliable, the devname should just be vfio0, vfio1, etc > > The userspace can learn the correct major/minor by inspecting the > sysfs. > > This whole concept should disappear into the prior patch that adds the > struct device in the first place, and I think most of the code here > can be deleted once the struct device is used properly. > Can you help elaborate above flow? This is one area where we need more guidance. When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F", how does Qemu identify which vifo0/1/... is associated with the specified :BB:DD.F? Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Tue, Sep 21, 2021 at 11:10:15PM +, Tian, Kevin wrote: > > From: Jason Gunthorpe > > Sent: Wednesday, September 22, 2021 12:01 AM > > > > On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote: > > > With /dev/vfio/devices introduced, now a vfio device driver has three > > > options to expose its device to userspace: > > > > > > a) only legacy group interface, for devices which haven't been moved to > > > iommufd (e.g. platform devices, sw mdev, etc.); > > > > > > b) both legacy group interface and new device-centric interface, for > > > devices which supports iommufd but also wants to keep backward > > > compatibility (e.g. pci devices in this RFC); > > > > > > c) only new device-centric interface, for new devices which don't carry > > > backward compatibility burden (e.g. hw mdev/subdev with pasid); > > > > We shouldn't have 'b'? Where does it come from? > > a vfio-pci device can be opened via the existing group interface. if no b) it > means legacy vfio userspace can never use vfio-pci device any more > once the latter is moved to iommufd. Sorry, I think I ment a, which I guess you will say is SW mdev devices But even so, I think the way forward here is to still always expose the device /dev/vfio/devices/X and some devices may not allow iommufd usage initially. Providing an ioctl to bind to a normal VFIO container or group might allow a reasonable fallback in userspace.. Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
RE: [RFC 03/20] vfio: Add vfio_[un]register_device()
> From: Jason Gunthorpe > Sent: Wednesday, September 22, 2021 12:01 AM > > On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote: > > With /dev/vfio/devices introduced, now a vfio device driver has three > > options to expose its device to userspace: > > > > a) only legacy group interface, for devices which haven't been moved to > > iommufd (e.g. platform devices, sw mdev, etc.); > > > > b) both legacy group interface and new device-centric interface, for > > devices which supports iommufd but also wants to keep backward > > compatibility (e.g. pci devices in this RFC); > > > > c) only new device-centric interface, for new devices which don't carry > > backward compatibility burden (e.g. hw mdev/subdev with pasid); > > We shouldn't have 'b'? Where does it come from? a vfio-pci device can be opened via the existing group interface. if no b) it means legacy vfio userspace can never use vfio-pci device any more once the latter is moved to iommufd. Thanks Kevin ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [RFC 03/20] vfio: Add vfio_[un]register_device()
On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote: > With /dev/vfio/devices introduced, now a vfio device driver has three > options to expose its device to userspace: > > a) only legacy group interface, for devices which haven't been moved to > iommufd (e.g. platform devices, sw mdev, etc.); > > b) both legacy group interface and new device-centric interface, for > devices which supports iommufd but also wants to keep backward > compatibility (e.g. pci devices in this RFC); > > c) only new device-centric interface, for new devices which don't carry > backward compatibility burden (e.g. hw mdev/subdev with pasid); We shouldn't have 'b'? Where does it come from? > This patch introduces vfio_[un]register_device() helpers for the device > drivers to specify the device exposure policy to vfio core. Hence the > existing vfio_[un]register_group_dev() become the wrapper of the new > helper functions. The new device-centric interface is described as > 'nongroup' to differentiate from existing 'group' stuff. Detect what the driver supports based on the ops it declares. There should be a function provided through the ops for the driver to bind to the iommufd. > One open about how to organize the device nodes under /dev/vfio/devices/. > This RFC adopts a simple policy by keeping a flat layout with mixed devname > from all kinds of devices. The prerequisite of this model is that devnames > from different bus types are unique formats: This isn't reliable, the devname should just be vfio0, vfio1, etc The userspace can learn the correct major/minor by inspecting the sysfs. This whole concept should disappear into the prior patch that adds the struct device in the first place, and I think most of the code here can be deleted once the struct device is used properly. Jason ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu