subject:"RE\: \[RFC 03\/20\] vfio\: Add vfio_\[un\]register

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-29 Thread da...@gibson.dropbear.id.au

On Wed, Sep 29, 2021 at 09:22:30AM -0300, Jason Gunthorpe wrote:
> On Wed, Sep 29, 2021 at 12:46:14PM +1000, da...@gibson.dropbear.id.au wrote:
> > On Tue, Sep 21, 2021 at 10:00:14PM -0300, Jason Gunthorpe wrote:
> > > On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote:
> > > > > From: Jason Gunthorpe 
> > > > > Sent: Wednesday, September 22, 2021 12:01 AM
> > > > > 
> > > > > >  One open about how to organize the device nodes under
> > > > > /dev/vfio/devices/.
> > > > > > This RFC adopts a simple policy by keeping a flat layout with mixed
> > > > > devname
> > > > > > from all kinds of devices. The prerequisite of this model is that 
> > > > > > devnames
> > > > > > from different bus types are unique formats:
> > > > > 
> > > > > This isn't reliable, the devname should just be vfio0, vfio1, etc
> > > > > 
> > > > > The userspace can learn the correct major/minor by inspecting the
> > > > > sysfs.
> > > > > 
> > > > > This whole concept should disappear into the prior patch that adds the
> > > > > struct device in the first place, and I think most of the code here
> > > > > can be deleted once the struct device is used properly.
> > > > > 
> > > > 
> > > > Can you help elaborate above flow? This is one area where we need
> > > > more guidance.
> > > > 
> > > > When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F",
> > > > how does Qemu identify which vifo0/1/... is associated with the 
> > > > specified 
> > > > :BB:DD.F? 
> > > 
> > > When done properly in the kernel the file:
> > > 
> > > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev
> > > 
> > > Will contain the major:minor of the VFIO device.
> > > 
> > > Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat
> > > that the major:minor matches.
> > > 
> > > in the above pattern "pci" and ":BB:DD.FF" are the arguments passed
> > > to qemu.
> > 
> > I thought part of the appeal of the device centric model was less
> > grovelling around in sysfs for information.  Using type/address
> > directly in /dev seems simpler than having to dig around matching
> > things here.
> 
> I would say more regular grovelling. Starting from a sysfs device
> directory and querying the VFIO cdev associated with it is much more
> normal than what happens today, which also includes passing sysfs
> information into an ioctl :\

Hm.. ok.  Clearly I'm unfamiliar with the things that do that.  Other
than current VFIO, the only model I've really seen is where you just
point your program at a device node.

> > Note that this doesn't have to be done in kernel: you could have the
> > kernel just call them /dev/vfio/devices/vfio0, ... but add udev rules
> > that create symlinks from say /dev/vfio/pci/:BB:SS.F - >
> > ../devices/vfioXX based on the sysfs information.
> 
> This is the right approach if people want to do this, but I'm not sure
> it is worth it given backwards compat requires the sysfs path as
> input.

You mean for userspace that needs to be able to go back to the old
VFIO interface as well?  It seems silly to force this sysfs mucking
about on new programs that depend on the new interface.

> We may as well stick with sysfs as the command line interface
> for userspace tools.

> And I certainly don't want to see userspace tools trying to reverse a
> sysfs path into a /dev/ symlink name when they can directly and
> reliably learn the correct cdev from the sysfspath.

Um.. sure.. but they can get the correct cdev from the sysfspath no
matter how we name the cdevs.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-29 Thread Jason Gunthorpe via iommu

On Wed, Sep 29, 2021 at 12:46:14PM +1000, da...@gibson.dropbear.id.au wrote:
> On Tue, Sep 21, 2021 at 10:00:14PM -0300, Jason Gunthorpe wrote:
> > On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote:
> > > > From: Jason Gunthorpe 
> > > > Sent: Wednesday, September 22, 2021 12:01 AM
> > > > 
> > > > >  One open about how to organize the device nodes under
> > > > /dev/vfio/devices/.
> > > > > This RFC adopts a simple policy by keeping a flat layout with mixed
> > > > devname
> > > > > from all kinds of devices. The prerequisite of this model is that 
> > > > > devnames
> > > > > from different bus types are unique formats:
> > > > 
> > > > This isn't reliable, the devname should just be vfio0, vfio1, etc
> > > > 
> > > > The userspace can learn the correct major/minor by inspecting the
> > > > sysfs.
> > > > 
> > > > This whole concept should disappear into the prior patch that adds the
> > > > struct device in the first place, and I think most of the code here
> > > > can be deleted once the struct device is used properly.
> > > > 
> > > 
> > > Can you help elaborate above flow? This is one area where we need
> > > more guidance.
> > > 
> > > When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F",
> > > how does Qemu identify which vifo0/1/... is associated with the specified 
> > > :BB:DD.F? 
> > 
> > When done properly in the kernel the file:
> > 
> > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev
> > 
> > Will contain the major:minor of the VFIO device.
> > 
> > Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat
> > that the major:minor matches.
> > 
> > in the above pattern "pci" and ":BB:DD.FF" are the arguments passed
> > to qemu.
> 
> I thought part of the appeal of the device centric model was less
> grovelling around in sysfs for information.  Using type/address
> directly in /dev seems simpler than having to dig around matching
> things here.

I would say more regular grovelling. Starting from a sysfs device
directory and querying the VFIO cdev associated with it is much more
normal than what happens today, which also includes passing sysfs
information into an ioctl :\

> Note that this doesn't have to be done in kernel: you could have the
> kernel just call them /dev/vfio/devices/vfio0, ... but add udev rules
> that create symlinks from say /dev/vfio/pci/:BB:SS.F - >
> ../devices/vfioXX based on the sysfs information.

This is the right approach if people want to do this, but I'm not sure
it is worth it given backwards compat requires the sysfs path as
input. We may as well stick with sysfs as the command line interface
for userspace tools.

And I certainly don't want to see userspace tools trying to reverse a
sysfs path into a /dev/ symlink name when they can directly and
reliably learn the correct cdev from the sysfspath.

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-29 Thread Jason Gunthorpe via iommu

On Wed, Sep 29, 2021 at 09:08:25AM +0200, Cornelia Huck wrote:
> On Wed, Sep 29 2021, "Tian, Kevin"  wrote:
> 
> >> From: David Gibson 
> >> Sent: Wednesday, September 29, 2021 10:44 AM
> >> 
> >> > One alternative option is to arrange device nodes in sub-directories 
> >> > based
> >> > on the device type. But doing so also adds one trouble to userspace. The
> >> > current vfio uAPI is designed to have the user query device type via
> >> > VFIO_DEVICE_GET_INFO after opening the device. With this option the user
> >> > instead needs to figure out the device type before opening the device, to
> >> > identify the sub-directory.
> >> 
> >> Wouldn't this be up to the operator / configuration, rather than the
> >> actual software though?  I would assume that typically the VFIO
> >> program would be pointed at a specific vfio device node file to use,
> >> e.g.
> >>my-vfio-prog -d /dev/vfio/pci/:0a:03.1
> >> 
> >> Or more generally, if you're expecting userspace to know a name in a
> >> uniqu pattern, they can equally well know a "type/name" pair.
> >> 
> >
> > You are correct. Currently:
> >
> > -device, vfio-pci,host=:BB:DD.F
> > -device, vfio-pci,sysfdev=/sys/bus/pci/devices/ :BB:DD.F
> > -device, vfio-platform,sysdev=/sys/bus/platform/devices/PNP0103:00
> >
> > above is definitely type/name information to find the related node. 
> >
> > Actually even for Jason's proposal we still need such information to
> > identify the sysfs path.
> >
> > Then I feel type-based sub-directory does work. Adding another link
> > to sysfs sounds unnecessary now. But I'm not sure whether we still
> > want to create /dev/vfio/devices/vfio0 thing and related udev rule
> > thing that you pointed out in another mail.
> 
> Still reading through this whole thread, but type-based subdirectories
> also make the most sense to me. I don't really see userspace wanting to
> grab just any device and then figure out whether it is the device it was
> looking for, but rather immediately go to a specific device or at least
> a device of a specific type.

Even so the kernel should not be creating this, that is a job for
udev and some symlinks

> Sequentially-numbered devices tend to become really unwieldy in my
> experience if you are working on a system with loads of devices.

If the user experiance is always to refer to the sysfs node as Kevin
shows above then the user never sees the integer.

It is very much like how the group number works already, programs
always start at the sysfs, do the readlink thing on iommu_group and
then get the group number to go to /dev/vfio/X

So it is already the case that every piece of software can construct a
sysfs path to the device, we are just changing from
readlink(iommu_group) to readdir(vfio/vfio_device_XX)

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-29 Thread Cornelia Huck

On Wed, Sep 29 2021, "Tian, Kevin"  wrote:

>> From: David Gibson 
>> Sent: Wednesday, September 29, 2021 10:44 AM
>> 
>> > One alternative option is to arrange device nodes in sub-directories based
>> > on the device type. But doing so also adds one trouble to userspace. The
>> > current vfio uAPI is designed to have the user query device type via
>> > VFIO_DEVICE_GET_INFO after opening the device. With this option the user
>> > instead needs to figure out the device type before opening the device, to
>> > identify the sub-directory.
>> 
>> Wouldn't this be up to the operator / configuration, rather than the
>> actual software though?  I would assume that typically the VFIO
>> program would be pointed at a specific vfio device node file to use,
>> e.g.
>>  my-vfio-prog -d /dev/vfio/pci/:0a:03.1
>> 
>> Or more generally, if you're expecting userspace to know a name in a
>> uniqu pattern, they can equally well know a "type/name" pair.
>> 
>
> You are correct. Currently:
>
> -device, vfio-pci,host=:BB:DD.F
> -device, vfio-pci,sysfdev=/sys/bus/pci/devices/ :BB:DD.F
> -device, vfio-platform,sysdev=/sys/bus/platform/devices/PNP0103:00
>
> above is definitely type/name information to find the related node. 
>
> Actually even for Jason's proposal we still need such information to
> identify the sysfs path.
>
> Then I feel type-based sub-directory does work. Adding another link
> to sysfs sounds unnecessary now. But I'm not sure whether we still
> want to create /dev/vfio/devices/vfio0 thing and related udev rule
> thing that you pointed out in another mail.

Still reading through this whole thread, but type-based subdirectories
also make the most sense to me. I don't really see userspace wanting to
grab just any device and then figure out whether it is the device it was
looking for, but rather immediately go to a specific device or at least
a device of a specific type.

Sequentially-numbered devices tend to become really unwieldy in my
experience if you are working on a system with loads of devices.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-28 Thread Tian, Kevin

> From: David Gibson 
> Sent: Wednesday, September 29, 2021 10:44 AM
> 
> > One alternative option is to arrange device nodes in sub-directories based
> > on the device type. But doing so also adds one trouble to userspace. The
> > current vfio uAPI is designed to have the user query device type via
> > VFIO_DEVICE_GET_INFO after opening the device. With this option the user
> > instead needs to figure out the device type before opening the device, to
> > identify the sub-directory.
> 
> Wouldn't this be up to the operator / configuration, rather than the
> actual software though?  I would assume that typically the VFIO
> program would be pointed at a specific vfio device node file to use,
> e.g.
>   my-vfio-prog -d /dev/vfio/pci/:0a:03.1
> 
> Or more generally, if you're expecting userspace to know a name in a
> uniqu pattern, they can equally well know a "type/name" pair.
> 

You are correct. Currently:

-device, vfio-pci,host=:BB:DD.F
-device, vfio-pci,sysfdev=/sys/bus/pci/devices/ :BB:DD.F
-device, vfio-platform,sysdev=/sys/bus/platform/devices/PNP0103:00

above is definitely type/name information to find the related node. 

Actually even for Jason's proposal we still need such information to
identify the sysfs path.

Then I feel type-based sub-directory does work. Adding another link
to sysfs sounds unnecessary now. But I'm not sure whether we still
want to create /dev/vfio/devices/vfio0 thing and related udev rule
thing that you pointed out in another mail.

Jason?

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-28 Thread Tian, Kevin

> From: David Gibson 
> Sent: Wednesday, September 29, 2021 10:44 AM
> 
> >
> > One open about how to organize the device nodes under
> /dev/vfio/devices/.
> > This RFC adopts a simple policy by keeping a flat layout with mixed
> devname
> > from all kinds of devices. The prerequisite of this model is that devnames
> > from different bus types are unique formats:
> >
> > /dev/vfio/devices/:00:14.2 (pci)
> > /dev/vfio/devices/PNP0103:00 (platform)
> > /dev/vfio/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001 (mdev)
> 
> Oof.  I really don't think this is a good idea.  Ensuring that a
> format is "unique" in the sense that it can't collide with any of the
> other formats, for *every* value of the parameters on both sides is
> actually pretty complicated in general.
> 
> I think per-type sub-directories would be helpful here, Jason's
> suggestion of just sequential numbers would work as well.

we'll follow Jason's suggestion in next version.

> > +   /*
> > +* Refcounting can't start until the driver call register. Don’t
> > +* start twice when the device is exposed in both group and
> nongroup
> > +* interfaces.
> > +*/
> > +   if (!refcount_read(&device->refcount))
> 
> Is there a possible race here with something getting in and
> incrementing the refcount between the read and set?

this will not be required in next version, which will always create
both group and nongroup interfaces for every device (then let
driver providing .bind_iommufd() callback for whether nongroup
interface is functional). It will be centrally processed within
existing vfio_[un]register_group_dev(), thus above race is not
a concern any more.

> 
> > +   refcount_set(&device->refcount, 1);
> >
> > mutex_lock(&group->device_lock);
> > list_add(&device->group_next, &group->device_list);
> > @@ -804,7 +810,78 @@ int vfio_register_group_dev(struct vfio_device
> *device)
> >
> > return 0;
> >  }
> > -EXPORT_SYMBOL_GPL(vfio_register_group_dev);
> > +
> > +static int __vfio_register_nongroup_dev(struct vfio_device *device)
> > +{
> > +   struct vfio_device *existing_device;
> > +   struct device *dev;
> > +   int ret = 0, minor;
> > +
> > +   mutex_lock(&vfio.device_lock);
> > +   list_for_each_entry(existing_device, &vfio.device_list, vfio_next) {
> > +   if (existing_device == device) {
> > +   ret = -EBUSY;
> > +   goto out_unlock;
> 
> This indicates a bug in the caller, doesn't it?  Should it be a BUG or
> WARN instead?

this call is initiated by userspace. Per Jason's suggestion we don't 
even need to check it then no lock is required. 

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-28 Thread David Gibson

On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote:
> With /dev/vfio/devices introduced, now a vfio device driver has three
> options to expose its device to userspace:
> 
> a)  only legacy group interface, for devices which haven't been moved to
> iommufd (e.g. platform devices, sw mdev, etc.);
> 
> b)  both legacy group interface and new device-centric interface, for
> devices which supports iommufd but also wants to keep backward
> compatibility (e.g. pci devices in this RFC);
> 
> c)  only new device-centric interface, for new devices which don't carry
> backward compatibility burden (e.g. hw mdev/subdev with pasid);
> 
> This patch introduces vfio_[un]register_device() helpers for the device
> drivers to specify the device exposure policy to vfio core. Hence the
> existing vfio_[un]register_group_dev() become the wrapper of the new
> helper functions. The new device-centric interface is described as
> 'nongroup' to differentiate from existing 'group' stuff.
> 
> TBD: this patch needs to rebase on top of below series from Christoph in
> next version.
> 
>   "cleanup vfio iommu_group creation"
> 
> Legacy userspace continues to follow the legacy group interface.
> 
> Newer userspace can first try the new device-centric interface if the
> device is present under /dev/vfio/devices. Otherwise fall back to the
> group interface.
> 
> One open about how to organize the device nodes under /dev/vfio/devices/.
> This RFC adopts a simple policy by keeping a flat layout with mixed devname
> from all kinds of devices. The prerequisite of this model is that devnames
> from different bus types are unique formats:
> 
>   /dev/vfio/devices/:00:14.2 (pci)
>   /dev/vfio/devices/PNP0103:00 (platform)
>   /dev/vfio/devices/83b8f4f2-509f-382f-3c1e-e6bfe0fa1001 (mdev)

Oof.  I really don't think this is a good idea.  Ensuring that a
format is "unique" in the sense that it can't collide with any of the
other formats, for *every* value of the parameters on both sides is
actually pretty complicated in general.

I think per-type sub-directories would be helpful here, Jason's
suggestion of just sequential numbers would work as well.

> One alternative option is to arrange device nodes in sub-directories based
> on the device type. But doing so also adds one trouble to userspace. The
> current vfio uAPI is designed to have the user query device type via
> VFIO_DEVICE_GET_INFO after opening the device. With this option the user
> instead needs to figure out the device type before opening the device, to
> identify the sub-directory.

Wouldn't this be up to the operator / configuration, rather than the
actual software though?  I would assume that typically the VFIO
program would be pointed at a specific vfio device node file to use,
e.g.
my-vfio-prog -d /dev/vfio/pci/:0a:03.1

Or more generally, if you're expecting userspace to know a name in a
uniqu pattern, they can equally well know a "type/name" pair.

> Another tricky thing is that "pdev. vs. mdev"
> and "pci vs. platform vs. ccw,..." are orthogonal categorizations. Need
> more thoughts on whether both or just one category should be used to define
> the sub-directories.
> 
> Signed-off-by: Liu Yi L 
> ---
>  drivers/vfio/vfio.c  | 137 +++
>  include/linux/vfio.h |   9 +++
>  2 files changed, 134 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/vfio/vfio.c b/drivers/vfio/vfio.c
> index 84436d7abedd..1e87b25962f1 100644
> --- a/drivers/vfio/vfio.c
> +++ b/drivers/vfio/vfio.c
> @@ -51,6 +51,7 @@ static struct vfio {
>   struct cdev device_cdev;
>   dev_t   device_devt;
>   struct mutexdevice_lock;
> + struct list_headdevice_list;
>   struct idr  device_idr;
>  } vfio;
>  
> @@ -757,7 +758,7 @@ void vfio_init_group_dev(struct vfio_device *device, 
> struct device *dev,
>  }
>  EXPORT_SYMBOL_GPL(vfio_init_group_dev);
>  
> -int vfio_register_group_dev(struct vfio_device *device)
> +static int __vfio_register_group_dev(struct vfio_device *device)
>  {
>   struct vfio_device *existing_device;
>   struct iommu_group *iommu_group;
> @@ -794,8 +795,13 @@ int vfio_register_group_dev(struct vfio_device *device)
>   /* Our reference on group is moved to the device */
>   device->group = group;
>  
> - /* Refcounting can't start until the driver calls register */
> - refcount_set(&device->refcount, 1);
> + /*
> +  * Refcounting can't start until the driver call register. Don’t
> +  * start twice when the device is exposed in both group and nongroup
> +  * interfaces.
> +  */
> + if (!refcount_read(&device->refcount))

Is there a possible race here with something getting in and
incrementing the refcount between the read and set?

> + refcount_set(&device->refcount, 1);
>  
>   mutex_lock(&group->device_lock);
>

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-28 Thread da...@gibson.dropbear.id.au

On Tue, Sep 21, 2021 at 10:00:14PM -0300, Jason Gunthorpe wrote:
> On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote:
> > > From: Jason Gunthorpe 
> > > Sent: Wednesday, September 22, 2021 12:01 AM
> > > 
> > > >  One open about how to organize the device nodes under
> > > /dev/vfio/devices/.
> > > > This RFC adopts a simple policy by keeping a flat layout with mixed
> > > devname
> > > > from all kinds of devices. The prerequisite of this model is that 
> > > > devnames
> > > > from different bus types are unique formats:
> > > 
> > > This isn't reliable, the devname should just be vfio0, vfio1, etc
> > > 
> > > The userspace can learn the correct major/minor by inspecting the
> > > sysfs.
> > > 
> > > This whole concept should disappear into the prior patch that adds the
> > > struct device in the first place, and I think most of the code here
> > > can be deleted once the struct device is used properly.
> > > 
> > 
> > Can you help elaborate above flow? This is one area where we need
> > more guidance.
> > 
> > When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F",
> > how does Qemu identify which vifo0/1/... is associated with the specified 
> > :BB:DD.F? 
> 
> When done properly in the kernel the file:
> 
> /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev
> 
> Will contain the major:minor of the VFIO device.
> 
> Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat
> that the major:minor matches.
> 
> in the above pattern "pci" and ":BB:DD.FF" are the arguments passed
> to qemu.

I thought part of the appeal of the device centric model was less
grovelling around in sysfs for information.  Using type/address
directly in /dev seems simpler than having to dig around matching
things here.

Note that this doesn't have to be done in kernel: you could have the
kernel just call them /dev/vfio/devices/vfio0, ... but add udev rules
that create symlinks from say /dev/vfio/pci/:BB:SS.F - >
../devices/vfioXX based on the sysfs information.

> 
> You can look at this for some general over engineered code to handle
> opening from a sysfs handle like above:
> 
> https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c
> 
> Jason
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-23 Thread Jason Gunthorpe via iommu

On Thu, Sep 23, 2021 at 09:25:27AM +0200, Eric Auger wrote:
> Hi,
> 
> On 9/22/21 3:00 AM, Jason Gunthorpe wrote:
> > On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote:
> >>> From: Jason Gunthorpe 
> >>> Sent: Wednesday, September 22, 2021 12:01 AM
> >>>
>   One open about how to organize the device nodes under
> >>> /dev/vfio/devices/.
>  This RFC adopts a simple policy by keeping a flat layout with mixed
> >>> devname
>  from all kinds of devices. The prerequisite of this model is that 
>  devnames
>  from different bus types are unique formats:
> >>> This isn't reliable, the devname should just be vfio0, vfio1, etc
> >>>
> >>> The userspace can learn the correct major/minor by inspecting the
> >>> sysfs.
> >>>
> >>> This whole concept should disappear into the prior patch that adds the
> >>> struct device in the first place, and I think most of the code here
> >>> can be deleted once the struct device is used properly.
> >>>
> >> Can you help elaborate above flow? This is one area where we need
> >> more guidance.
> >>
> >> When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F",
> >> how does Qemu identify which vifo0/1/... is associated with the specified 
> >> :BB:DD.F? 
> > When done properly in the kernel the file:
> >
> > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev
> >
> > Will contain the major:minor of the VFIO device.
> >
> > Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat
> > that the major:minor matches.
> >
> > in the above pattern "pci" and ":BB:DD.FF" are the arguments passed
> > to qemu.
> I guess this would be the same for platform devices, for instance
> /sys/bus/platform/devices/AMDI8001:01/vfio/vfioX/dev, right?

Yes, it is the general driver core pattern for creating cdevs below a
parent device

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-23 Thread Eric Auger

Hi,

On 9/22/21 3:00 AM, Jason Gunthorpe wrote:
> On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote:
>>> From: Jason Gunthorpe 
>>> Sent: Wednesday, September 22, 2021 12:01 AM
>>>
  One open about how to organize the device nodes under
>>> /dev/vfio/devices/.
 This RFC adopts a simple policy by keeping a flat layout with mixed
>>> devname
 from all kinds of devices. The prerequisite of this model is that devnames
 from different bus types are unique formats:
>>> This isn't reliable, the devname should just be vfio0, vfio1, etc
>>>
>>> The userspace can learn the correct major/minor by inspecting the
>>> sysfs.
>>>
>>> This whole concept should disappear into the prior patch that adds the
>>> struct device in the first place, and I think most of the code here
>>> can be deleted once the struct device is used properly.
>>>
>> Can you help elaborate above flow? This is one area where we need
>> more guidance.
>>
>> When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F",
>> how does Qemu identify which vifo0/1/... is associated with the specified 
>> :BB:DD.F? 
> When done properly in the kernel the file:
>
> /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev
>
> Will contain the major:minor of the VFIO device.
>
> Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat
> that the major:minor matches.
>
> in the above pattern "pci" and ":BB:DD.FF" are the arguments passed
> to qemu.
I guess this would be the same for platform devices, for instance
/sys/bus/platform/devices/AMDI8001:01/vfio/vfioX/dev, right?

Thanks

Eric
>
> You can look at this for some general over engineered code to handle
> opening from a sysfs handle like above:
>
> https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c
>
> Jason
>

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-22 Thread Tian, Kevin

> From: Jason Gunthorpe 
> Sent: Thursday, September 23, 2021 7:52 AM
> 
> On Wed, Sep 22, 2021 at 11:45:33PM +, Tian, Kevin wrote:
> > > From: Alex Williamson 
> 
> > btw I realized another related piece regarding to the new layout that
> > Jason suggested, which have sys device node include a link to the vfio
> > devnode:
> >
> > /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev
> >
> > This for sure requires specific vfio driver support to get the link
> > established.
> 
> It doesn't. Just set the parent device of the vfio_device's struct
> device to the physical struct device that vfio is already tracking -
> ie the struct device providing the IOMMU. The driver core takes care
> of everything else.
> 

Thanks for correction. So it's still the same try-and-fail model for both 
devices which support iommufd and which do not.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-22 Thread Jason Gunthorpe via iommu

On Wed, Sep 22, 2021 at 02:10:36PM -0600, Alex Williamson wrote:

> But why would we create vfio device interface files at all if they
> can't work?  I'm not really on board with creating a try-and-fail
> interface for a mechanism that cannot work for a given device.  The
> existence of the device interface should indicate that it's supported.

I'm a little worried about adding a struct device to vfio_device and
then making it optional.. That is a really weird situation.

I suppose you could create the sysfs presence in the struct device but
not create a cdev.

However, if we ever want to use the device fd for something else, like
querying the device driver capabilities or mode, (ie clean the
driver_api thing wrongly placed in mdev sysfs for instance), we are
blocked as the uAPI will be cdev == must support iommufd..

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-22 Thread Jason Gunthorpe via iommu

On Wed, Sep 22, 2021 at 11:45:33PM +, Tian, Kevin wrote:
> > From: Alex Williamson 

> btw I realized another related piece regarding to the new layout that
> Jason suggested, which have sys device node include a link to the vfio
> devnode:
> 
>   /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev
> 
> This for sure requires specific vfio driver support to get the link
> established.

It doesn't. Just set the parent device of the vfio_device's struct
device to the physical struct device that vfio is already tracking -
ie the struct device providing the IOMMU. The driver core takes care
of everything else.

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-22 Thread Tian, Kevin

> From: Alex Williamson 
> Sent: Thursday, September 23, 2021 6:45 AM
> 
> On Wed, 22 Sep 2021 22:34:42 +
> "Tian, Kevin"  wrote:
> 
> > > From: Alex Williamson 
> > > Sent: Thursday, September 23, 2021 4:11 AM
> > >
> > > On Wed, 22 Sep 2021 09:22:52 -0300
> > > Jason Gunthorpe  wrote:
> > >
> > > > On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote:
> > > >
> > > > > > Providing an ioctl to bind to a normal VFIO container or group might
> > > > > > allow a reasonable fallback in userspace..
> > > > >
> > > > > I didn't get this point though. An error in binding already allows the
> > > > > user to fall back to the group path. Why do we need introduce
> another
> > > > > ioctl to explicitly bind to container via the nongroup interface?
> > > >
> > > > New userspace still needs a fallback path if it hits the 'try and
> > > > fail'. Keeping the device FD open and just using a different ioctl to
> > > > bind to a container/group FD, which new userspace can then obtain as
> a
> > > > fallback, might be OK.
> > > >
> > > > Hard to see without going through the qemu parts, so maybe just keep
> > > > it in mind
> > >
> > > If we assume that the container/group/device interface is essentially
> > > deprecated once we have iommufd, it doesn't make a lot of sense to me
> > > to tack on a container/device interface just so userspace can avoid
> > > reverting to the fully legacy interface.
> > >
> > > But why would we create vfio device interface files at all if they
> > > can't work?  I'm not really on board with creating a try-and-fail
> > > interface for a mechanism that cannot work for a given device.  The
> > > existence of the device interface should indicate that it's supported.
> > > Thanks,
> > >
> >
> > Now it's a try-and-fail model even for devices which support iommufd.
> > Per Jason's suggestion, a device is always opened with a parked fops
> > which supports only bind. Binding serves as the contract for handling
> > exclusive ownership on a device and switching to normal fops if
> > succeed. So the user has to try-and-fail in case multiple threads attempt
> > to open a same device. Device which doesn't support iommufd is not
> > different, except binding request 100% fails (due to missing .bind_iommufd
> > in kernel driver).
> 
> That's a rather important difference.  I don't really see how that's
> comparable to the mutually exclusive nature of the legacy vs device

I didn't get the 'comparable' part. Can you elaborate?

> interface.  We're not going to present a vfio device interface for SW
> mdevs that can't participate in iommufd, right?  Thanks,
> 

Did you see any problem if exposing sw mdev now? Following above
explanation the try-and-fail model should still work...

btw I realized another related piece regarding to the new layout that
Jason suggested, which have sys device node include a link to the vfio
devnode:

/sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev

This for sure requires specific vfio driver support to get the link established.
if we only do it for vfio-pci in the start, then for other devices which don't
support iommufd there is no way for the user to identify the corresponding
vfio devnode even it's still exposed. Then try-and-fail model may not even
been reached for those devices.

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-22 Thread Alex Williamson

On Wed, 22 Sep 2021 22:34:42 +
"Tian, Kevin"  wrote:

> > From: Alex Williamson 
> > Sent: Thursday, September 23, 2021 4:11 AM
> > 
> > On Wed, 22 Sep 2021 09:22:52 -0300
> > Jason Gunthorpe  wrote:
> >   
> > > On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote:
> > >  
> > > > > Providing an ioctl to bind to a normal VFIO container or group might
> > > > > allow a reasonable fallback in userspace..  
> > > >
> > > > I didn't get this point though. An error in binding already allows the
> > > > user to fall back to the group path. Why do we need introduce another
> > > > ioctl to explicitly bind to container via the nongroup interface?  
> > >
> > > New userspace still needs a fallback path if it hits the 'try and
> > > fail'. Keeping the device FD open and just using a different ioctl to
> > > bind to a container/group FD, which new userspace can then obtain as a
> > > fallback, might be OK.
> > >
> > > Hard to see without going through the qemu parts, so maybe just keep
> > > it in mind  
> > 
> > If we assume that the container/group/device interface is essentially
> > deprecated once we have iommufd, it doesn't make a lot of sense to me
> > to tack on a container/device interface just so userspace can avoid
> > reverting to the fully legacy interface.
> > 
> > But why would we create vfio device interface files at all if they
> > can't work?  I'm not really on board with creating a try-and-fail
> > interface for a mechanism that cannot work for a given device.  The
> > existence of the device interface should indicate that it's supported.
> > Thanks,
> >   
> 
> Now it's a try-and-fail model even for devices which support iommufd.
> Per Jason's suggestion, a device is always opened with a parked fops
> which supports only bind. Binding serves as the contract for handling
> exclusive ownership on a device and switching to normal fops if
> succeed. So the user has to try-and-fail in case multiple threads attempt 
> to open a same device. Device which doesn't support iommufd is not
> different, except binding request 100% fails (due to missing .bind_iommufd
> in kernel driver).

That's a rather important difference.  I don't really see how that's
comparable to the mutually exclusive nature of the legacy vs device
interface.  We're not going to present a vfio device interface for SW
mdevs that can't participate in iommufd, right?  Thanks,

Alex

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-22 Thread Tian, Kevin

> From: Alex Williamson 
> Sent: Thursday, September 23, 2021 4:11 AM
> 
> On Wed, 22 Sep 2021 09:22:52 -0300
> Jason Gunthorpe  wrote:
> 
> > On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote:
> >
> > > > Providing an ioctl to bind to a normal VFIO container or group might
> > > > allow a reasonable fallback in userspace..
> > >
> > > I didn't get this point though. An error in binding already allows the
> > > user to fall back to the group path. Why do we need introduce another
> > > ioctl to explicitly bind to container via the nongroup interface?
> >
> > New userspace still needs a fallback path if it hits the 'try and
> > fail'. Keeping the device FD open and just using a different ioctl to
> > bind to a container/group FD, which new userspace can then obtain as a
> > fallback, might be OK.
> >
> > Hard to see without going through the qemu parts, so maybe just keep
> > it in mind
> 
> If we assume that the container/group/device interface is essentially
> deprecated once we have iommufd, it doesn't make a lot of sense to me
> to tack on a container/device interface just so userspace can avoid
> reverting to the fully legacy interface.
> 
> But why would we create vfio device interface files at all if they
> can't work?  I'm not really on board with creating a try-and-fail
> interface for a mechanism that cannot work for a given device.  The
> existence of the device interface should indicate that it's supported.
> Thanks,
> 

Now it's a try-and-fail model even for devices which support iommufd.
Per Jason's suggestion, a device is always opened with a parked fops
which supports only bind. Binding serves as the contract for handling
exclusive ownership on a device and switching to normal fops if
succeed. So the user has to try-and-fail in case multiple threads attempt 
to open a same device. Device which doesn't support iommufd is not
different, except binding request 100% fails (due to missing .bind_iommufd
in kernel driver).

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-22 Thread Alex Williamson

On Wed, 22 Sep 2021 09:22:52 -0300
Jason Gunthorpe  wrote:

> On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote:
> 
> > > Providing an ioctl to bind to a normal VFIO container or group might
> > > allow a reasonable fallback in userspace..  
> > 
> > I didn't get this point though. An error in binding already allows the
> > user to fall back to the group path. Why do we need introduce another
> > ioctl to explicitly bind to container via the nongroup interface?   
> 
> New userspace still needs a fallback path if it hits the 'try and
> fail'. Keeping the device FD open and just using a different ioctl to
> bind to a container/group FD, which new userspace can then obtain as a
> fallback, might be OK.
> 
> Hard to see without going through the qemu parts, so maybe just keep
> it in mind

If we assume that the container/group/device interface is essentially
deprecated once we have iommufd, it doesn't make a lot of sense to me
to tack on a container/device interface just so userspace can avoid
reverting to the fully legacy interface.

But why would we create vfio device interface files at all if they
can't work?  I'm not really on board with creating a try-and-fail
interface for a mechanism that cannot work for a given device.  The
existence of the device interface should indicate that it's supported.
Thanks,

Alex

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-22 Thread Tian, Kevin

> From: Jason Gunthorpe 
> Sent: Wednesday, September 22, 2021 8:23 PM
> 
> On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote:
> 
> > > Providing an ioctl to bind to a normal VFIO container or group might
> > > allow a reasonable fallback in userspace..
> >
> > I didn't get this point though. An error in binding already allows the
> > user to fall back to the group path. Why do we need introduce another
> > ioctl to explicitly bind to container via the nongroup interface?
> 
> New userspace still needs a fallback path if it hits the 'try and
> fail'. Keeping the device FD open and just using a different ioctl to
> bind to a container/group FD, which new userspace can then obtain as a
> fallback, might be OK.
> 
> Hard to see without going through the qemu parts, so maybe just keep
> it in mind
> 

sure. will figure it out when working on the qemu part.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-22 Thread Jason Gunthorpe via iommu

On Wed, Sep 22, 2021 at 09:23:34AM +, Tian, Kevin wrote:

> > Providing an ioctl to bind to a normal VFIO container or group might
> > allow a reasonable fallback in userspace..
> 
> I didn't get this point though. An error in binding already allows the
> user to fall back to the group path. Why do we need introduce another
> ioctl to explicitly bind to container via the nongroup interface? 

New userspace still needs a fallback path if it hits the 'try and
fail'. Keeping the device FD open and just using a different ioctl to
bind to a container/group FD, which new userspace can then obtain as a
fallback, might be OK.

Hard to see without going through the qemu parts, so maybe just keep
it in mind

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-22 Thread Tian, Kevin

> From: Jason Gunthorpe 
> Sent: Wednesday, September 22, 2021 8:54 AM
> 
> On Tue, Sep 21, 2021 at 11:10:15PM +, Tian, Kevin wrote:
> > > From: Jason Gunthorpe 
> > > Sent: Wednesday, September 22, 2021 12:01 AM
> > >
> > > On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote:
> > > > With /dev/vfio/devices introduced, now a vfio device driver has three
> > > > options to expose its device to userspace:
> > > >
> > > > a)  only legacy group interface, for devices which haven't been moved
> to
> > > > iommufd (e.g. platform devices, sw mdev, etc.);
> > > >
> > > > b)  both legacy group interface and new device-centric interface, for
> > > > devices which supports iommufd but also wants to keep backward
> > > > compatibility (e.g. pci devices in this RFC);
> > > >
> > > > c)  only new device-centric interface, for new devices which don't carry
> > > > backward compatibility burden (e.g. hw mdev/subdev with pasid);
> > >
> > > We shouldn't have 'b'? Where does it come from?
> >
> > a vfio-pci device can be opened via the existing group interface. if no b) 
> > it
> > means legacy vfio userspace can never use vfio-pci device any more
> > once the latter is moved to iommufd.
> 
> Sorry, I think I ment a, which I guess you will say is SW mdev devices
> 
> But even so, I think the way forward here is to still always expose
> the device /dev/vfio/devices/X and some devices may not allow iommufd
> usage initially.

After another thought this should work. Following your comments in
other places, we'll move the handling of BIND_IOMMUFD to vfio core
which then invoke .bind_iommufd() from the driver. For devices which
don't allow iommufd now, the callback is null thus an error is returned.

This leaves the userspace in a try-and-fail mode. It first opens the device
fd and iommufd, and then try to connect the two together. If failed then
fallback to the legacy group interface.

Then we don't need a) at all. and we can even avoid introducing new
vfio_[un]register_device() at this point. Just leverage existing 
vfio_[un]register_group_dev() to cover b). new helpers can be introduced
later when c) is supported.

> 
> Providing an ioctl to bind to a normal VFIO container or group might
> allow a reasonable fallback in userspace..
> 

I didn't get this point though. An error in binding already allows the
user to fall back to the group path. Why do we need introduce another
ioctl to explicitly bind to container via the nongroup interface? 

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-21 Thread Tian, Kevin

> From: Jason Gunthorpe 
> Sent: Wednesday, September 22, 2021 9:00 AM
> 
> On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote:
> > > From: Jason Gunthorpe 
> > > Sent: Wednesday, September 22, 2021 12:01 AM
> > >
> > > >  One open about how to organize the device nodes under
> > > /dev/vfio/devices/.
> > > > This RFC adopts a simple policy by keeping a flat layout with mixed
> > > devname
> > > > from all kinds of devices. The prerequisite of this model is that
> devnames
> > > > from different bus types are unique formats:
> > >
> > > This isn't reliable, the devname should just be vfio0, vfio1, etc
> > >
> > > The userspace can learn the correct major/minor by inspecting the
> > > sysfs.
> > >
> > > This whole concept should disappear into the prior patch that adds the
> > > struct device in the first place, and I think most of the code here
> > > can be deleted once the struct device is used properly.
> > >
> >
> > Can you help elaborate above flow? This is one area where we need
> > more guidance.
> >
> > When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F",
> > how does Qemu identify which vifo0/1/... is associated with the specified
> > :BB:DD.F?
> 
> When done properly in the kernel the file:
> 
> /sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev
> 
> Will contain the major:minor of the VFIO device.
> 
> Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat
> that the major:minor matches.

ah, that's the trick.

> 
> in the above pattern "pci" and ":BB:DD.FF" are the arguments passed
> to qemu.
> 
> You can look at this for some general over engineered code to handle
> opening from a sysfs handle like above:
> 
> https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c
> 

will check. Thanks for suggestion.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-21 Thread Jason Gunthorpe via iommu

On Wed, Sep 22, 2021 at 12:54:02AM +, Tian, Kevin wrote:
> > From: Jason Gunthorpe 
> > Sent: Wednesday, September 22, 2021 12:01 AM
> > 
> > >  One open about how to organize the device nodes under
> > /dev/vfio/devices/.
> > > This RFC adopts a simple policy by keeping a flat layout with mixed
> > devname
> > > from all kinds of devices. The prerequisite of this model is that devnames
> > > from different bus types are unique formats:
> > 
> > This isn't reliable, the devname should just be vfio0, vfio1, etc
> > 
> > The userspace can learn the correct major/minor by inspecting the
> > sysfs.
> > 
> > This whole concept should disappear into the prior patch that adds the
> > struct device in the first place, and I think most of the code here
> > can be deleted once the struct device is used properly.
> > 
> 
> Can you help elaborate above flow? This is one area where we need
> more guidance.
> 
> When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F",
> how does Qemu identify which vifo0/1/... is associated with the specified 
> :BB:DD.F? 

When done properly in the kernel the file:

/sys/bus/pci/devices/:BB:DD.F/vfio/vfioX/dev

Will contain the major:minor of the VFIO device.

Userspace then opens the /dev/vfio/devices/vfioX and checks with fstat
that the major:minor matches.

in the above pattern "pci" and ":BB:DD.FF" are the arguments passed
to qemu.

You can look at this for some general over engineered code to handle
opening from a sysfs handle like above:

https://github.com/linux-rdma/rdma-core/blob/master/util/open_cdev.c

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-21 Thread Tian, Kevin

> From: Jason Gunthorpe 
> Sent: Wednesday, September 22, 2021 8:54 AM
> 
> On Tue, Sep 21, 2021 at 11:10:15PM +, Tian, Kevin wrote:
> > > From: Jason Gunthorpe 
> > > Sent: Wednesday, September 22, 2021 12:01 AM
> > >
> > > On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote:
> > > > With /dev/vfio/devices introduced, now a vfio device driver has three
> > > > options to expose its device to userspace:
> > > >
> > > > a)  only legacy group interface, for devices which haven't been moved
> to
> > > > iommufd (e.g. platform devices, sw mdev, etc.);
> > > >
> > > > b)  both legacy group interface and new device-centric interface, for
> > > > devices which supports iommufd but also wants to keep backward
> > > > compatibility (e.g. pci devices in this RFC);
> > > >
> > > > c)  only new device-centric interface, for new devices which don't carry
> > > > backward compatibility burden (e.g. hw mdev/subdev with pasid);
> > >
> > > We shouldn't have 'b'? Where does it come from?
> >
> > a vfio-pci device can be opened via the existing group interface. if no b) 
> > it
> > means legacy vfio userspace can never use vfio-pci device any more
> > once the latter is moved to iommufd.
> 
> Sorry, I think I ment a, which I guess you will say is SW mdev devices

We listed a) here in case we don't want to move all vfio device types to
use iommufd in one breath. It's supposed to be a type valid only in this
transition phase. In the end only b) and c) are valid.

> 
> But even so, I think the way forward here is to still always expose
> the device /dev/vfio/devices/X and some devices may not allow iommufd
> usage initially.
> 
> Providing an ioctl to bind to a normal VFIO container or group might
> allow a reasonable fallback in userspace..
> 

but doesn't a new ioctl still imply breaking existing vfio userspace?

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-21 Thread Tian, Kevin

> From: Jason Gunthorpe 
> Sent: Wednesday, September 22, 2021 12:01 AM
> 
> >  One open about how to organize the device nodes under
> /dev/vfio/devices/.
> > This RFC adopts a simple policy by keeping a flat layout with mixed
> devname
> > from all kinds of devices. The prerequisite of this model is that devnames
> > from different bus types are unique formats:
> 
> This isn't reliable, the devname should just be vfio0, vfio1, etc
> 
> The userspace can learn the correct major/minor by inspecting the
> sysfs.
> 
> This whole concept should disappear into the prior patch that adds the
> struct device in the first place, and I think most of the code here
> can be deleted once the struct device is used properly.
> 

Can you help elaborate above flow? This is one area where we need
more guidance.

When Qemu accepts an option "-device vfio-pci,host=:BB:DD.F",
how does Qemu identify which vifo0/1/... is associated with the specified 
:BB:DD.F? 

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-21 Thread Jason Gunthorpe via iommu

On Tue, Sep 21, 2021 at 11:10:15PM +, Tian, Kevin wrote:
> > From: Jason Gunthorpe 
> > Sent: Wednesday, September 22, 2021 12:01 AM
> > 
> > On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote:
> > > With /dev/vfio/devices introduced, now a vfio device driver has three
> > > options to expose its device to userspace:
> > >
> > > a)  only legacy group interface, for devices which haven't been moved to
> > > iommufd (e.g. platform devices, sw mdev, etc.);
> > >
> > > b)  both legacy group interface and new device-centric interface, for
> > > devices which supports iommufd but also wants to keep backward
> > > compatibility (e.g. pci devices in this RFC);
> > >
> > > c)  only new device-centric interface, for new devices which don't carry
> > > backward compatibility burden (e.g. hw mdev/subdev with pasid);
> > 
> > We shouldn't have 'b'? Where does it come from?
> 
> a vfio-pci device can be opened via the existing group interface. if no b) it 
> means legacy vfio userspace can never use vfio-pci device any more
> once the latter is moved to iommufd.

Sorry, I think I ment a, which I guess you will say is SW mdev devices

But even so, I think the way forward here is to still always expose
the device /dev/vfio/devices/X and some devices may not allow iommufd
usage initially.

Providing an ioctl to bind to a normal VFIO container or group might
allow a reasonable fallback in userspace..

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-21 Thread Tian, Kevin

> From: Jason Gunthorpe 
> Sent: Wednesday, September 22, 2021 12:01 AM
> 
> On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote:
> > With /dev/vfio/devices introduced, now a vfio device driver has three
> > options to expose its device to userspace:
> >
> > a)  only legacy group interface, for devices which haven't been moved to
> > iommufd (e.g. platform devices, sw mdev, etc.);
> >
> > b)  both legacy group interface and new device-centric interface, for
> > devices which supports iommufd but also wants to keep backward
> > compatibility (e.g. pci devices in this RFC);
> >
> > c)  only new device-centric interface, for new devices which don't carry
> > backward compatibility burden (e.g. hw mdev/subdev with pasid);
> 
> We shouldn't have 'b'? Where does it come from?

a vfio-pci device can be opened via the existing group interface. if no b) it 
means legacy vfio userspace can never use vfio-pci device any more
once the latter is moved to iommufd.

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

2021-09-21 Thread Jason Gunthorpe via iommu

On Sun, Sep 19, 2021 at 02:38:31PM +0800, Liu Yi L wrote:
> With /dev/vfio/devices introduced, now a vfio device driver has three
> options to expose its device to userspace:
> 
> a)  only legacy group interface, for devices which haven't been moved to
> iommufd (e.g. platform devices, sw mdev, etc.);
> 
> b)  both legacy group interface and new device-centric interface, for
> devices which supports iommufd but also wants to keep backward
> compatibility (e.g. pci devices in this RFC);
> 
> c)  only new device-centric interface, for new devices which don't carry
> backward compatibility burden (e.g. hw mdev/subdev with pasid);

We shouldn't have 'b'? Where does it come from?

> This patch introduces vfio_[un]register_device() helpers for the device
> drivers to specify the device exposure policy to vfio core. Hence the
> existing vfio_[un]register_group_dev() become the wrapper of the new
> helper functions. The new device-centric interface is described as
> 'nongroup' to differentiate from existing 'group' stuff.

Detect what the driver supports based on the ops it declares. There
should be a function provided through the ops for the driver to bind
to the iommufd.

>  One open about how to organize the device nodes under /dev/vfio/devices/.
> This RFC adopts a simple policy by keeping a flat layout with mixed devname
> from all kinds of devices. The prerequisite of this model is that devnames
> from different bus types are unique formats:

This isn't reliable, the devname should just be vfio0, vfio1, etc

The userspace can learn the correct major/minor by inspecting the
sysfs.

This whole concept should disappear into the prior patch that adds the
struct device in the first place, and I think most of the code here
can be deleted once the struct device is used properly.

Jason
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

RE: [RFC 03/20] vfio: Add vfio_[un]register_device()

Re: [RFC 03/20] vfio: Add vfio_[un]register_device()

27 matches

Site Navigation

Mail list logo

Footer information