Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
7. Hot-plug It is same syntax to create a virtual device for hot-plug. >>> >>> How do groups work with hotplug? Can a device be creating into an >>> existing, running group? Can a device be removed from an existing, >>> running group? Thanks, >>> >> >> Group is for vendor driver to decide when to commit resources for a >> group of devices to support multiple devices in a domain. It will be >> upto vendor driver how it manage it. > > Then it's not sufficiently specified for libvirt to use. Thanks, > Can you clarify what if required for libvirt to use? >>> >>> Everything we're discussing above. We can't simply say that group >>> management is defined by the vendor driver, that means that libvirt >>> needs to know how to uniquely behave for each vendor driver. We want >>> to create a consistent framework where libvirt doesn't care what the >>> vendor driver is. Therefore we need to make group manipulation fully >>> specified, regardless of the vendor driver. Thanks, >>> >> >> There is no communication between different vendor drivers. So if >> libvirt creates a group with one device from one vendor and one device >> from another vendor, both vendor driver would think they have a single >> device and accordingly commit resources for that single device from its >> own driver. Ideally nothing would fail. But that is not the main aim of >> grouping, right? >> >> To make this independent of vendor driver, we had initial proposal of >> having same UUID and different instance numbers. But that was also not >> acceptable. > > We're defining the infrastructure, we get to define the rules, but we > need to define all the rules so that libvirt can actually implement to > them. We can't say "groups are this nebulous thing and each vendor > will manage them however they want", nobody wants to care that NVIDIA > did it one way and Intel did it another and we need to support both. > That's a design failure. Therefore we \must\ make the group definition > we need well specified, any hopefully generic enough that others can > use it. If they cannot then we'd need to add a new type of group. > Therefore we need to \specify\ things like for what set of mdev should > a group be created? Do I need a group for a single device? How do we > introspect a group to know which mdevs it contains? Can we dynamically > add or remove mdev devices from a group? Can we attach multiple > groups to the same user? Can we change the association of an mdev > device without removing it and recreating it? etc, etc, etc. We don't > need vendor coordination, we need a specification of exactly how this > type of group is meant to work. Thanks, > We decided to handle "multiple vGPUs per VM" through internal design changes. With that we don't need grouping in mdev framework. Hope this helps to accelerate and settle on sysfs interface soon. Thanks, Kirti -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Thu, Sep 29, 2016 at 09:03:40AM +0100, Daniel P. Berrange wrote: > On Wed, Sep 28, 2016 at 12:22:35PM -0700, Neo Jia wrote: > > On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: > > > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > > > > On Thu, 22 Sep 2016 09:41:20 +0530 > > > > Kirti Wankhedewrote: > > > > > > > > > > My concern is that a type id seems arbitrary but we're > > > > > > specifying that > > > > > > it be unique. We already have something unique, the name. So > > > > > > why try > > > > > > to make the type id unique as well? A vendor can accidentally > > > > > > create > > > > > > their vendor driver so that a given name means something very > > > > > > specific. On the other hand they need to be extremely > > > > > > deliberate to > > > > > > coordinate that a type id means a unique thing across all their > > > > > > product > > > > > > lines. > > > > > > > > > > > > > > > > Let me clarify, type id should be unique in the list of > > > > > mdev_supported_types. You can't have 2 directories in with same > > > > > name. > > > > > >>> > > > > > >>> Of course, but does that mean it's only unique to the machine I'm > > > > > >>> currently running on? Let's say I have a Tesla P100 on my system > > > > > >>> and > > > > > >>> type-id 11 is named "GRID-M60-0B". At some point in the future I > > > > > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on > > > > > >>> that > > > > > >>> new card still going to be a "GRID-M60-0B"? If not then we've > > > > > >>> based > > > > > >>> our XML on the wrong attribute. If the new device does not > > > > > >>> support > > > > > >>> "GRID-M60-0B" then we should generate an error, not simply > > > > > >>> initialize > > > > > >>> whatever type-id 11 happens to be on this new card. > > > > > >>> > > > > > >> > > > > > >> If there are 2 M60 in the system then you would find '11' type > > > > > >> directory > > > > > >> in mdev_supported_types of both M60. If you have P100, '11' type > > > > > >> would > > > > > >> not be there in its mdev_supported_types, it will have different > > > > > >> types. > > > > > >> > > > > > >> For example, if you replace M60 with P100, but XML is not updated. > > > > > >> XML > > > > > >> have type '11'. When libvirt would try to create mdev device, > > > > > >> libvirt > > > > > >> would have to find 'create' file in sysfs in following directory > > > > > >> format: > > > > > >> > > > > > >> --- mdev_supported_types > > > > > >> |-- 11 > > > > > >> | |-- create > > > > > >> > > > > > >> but now for P100, '11' directory is not there, so libvirt should > > > > > >> throw > > > > > >> error on not able to find '11' directory. > > > > > > > > > > > > This really seems like an accident waiting to happen. What happens > > > > > > when the user replaces their M60 with an Intel XYZ device that > > > > > > happens > > > > > > to expose a type 11 mdev class gpu device? How is libvirt supposed > > > > > > to > > > > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and > > > > > > removing > > > > > > yet another arbitrary requirement that we have some sort of globally > > > > > > unique type-id database make a lot of sense? The same issue applies > > > > > > for simple debug-ability, if I'm reviewing the XML for a domain and > > > > > > the > > > > > > name is the primary index for the mdev device, I know what it is. > > > > > > Seeing type-id='11' is meaningless. > > > > > > > > > > > > > > > > Let me clarify again, type '11' is a string that vendor driver would > > > > > define (see my previous reply below) it could be "11" or > > > > > "GRID-M60-0B". > > > > > If 2 vendors used same string we can't control that. right? > > > > > > > > > > > > > > > Lets remove 'id' from type id in XML if that is the concern. > > > > > Supported > > > > > types is going to be defined by vendor driver, so let vendor > > > > > driver > > > > > decide what to use for directory name and same should be used in > > > > > device > > > > > xml file, it could be '11' or "GRID M60-0B": > > > > > > > > > > > > > > > my-vgpu > > > > > pci__86_00_0 > > > > > > > > > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Wed, Sep 28, 2016 at 12:22:35PM -0700, Neo Jia wrote: > On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: > > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > > > On Thu, 22 Sep 2016 09:41:20 +0530 > > > Kirti Wankhedewrote: > > > > > > > > My concern is that a type id seems arbitrary but we're specifying > > > > > that > > > > > it be unique. We already have something unique, the name. So > > > > > why try > > > > > to make the type id unique as well? A vendor can accidentally > > > > > create > > > > > their vendor driver so that a given name means something very > > > > > specific. On the other hand they need to be extremely deliberate > > > > > to > > > > > coordinate that a type id means a unique thing across all their > > > > > product > > > > > lines. > > > > > > > > > > > > > Let me clarify, type id should be unique in the list of > > > > mdev_supported_types. You can't have 2 directories in with same > > > > name. > > > > >>> > > > > >>> Of course, but does that mean it's only unique to the machine I'm > > > > >>> currently running on? Let's say I have a Tesla P100 on my system > > > > >>> and > > > > >>> type-id 11 is named "GRID-M60-0B". At some point in the future I > > > > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on > > > > >>> that > > > > >>> new card still going to be a "GRID-M60-0B"? If not then we've based > > > > >>> our XML on the wrong attribute. If the new device does not support > > > > >>> "GRID-M60-0B" then we should generate an error, not simply > > > > >>> initialize > > > > >>> whatever type-id 11 happens to be on this new card. > > > > >>> > > > > >> > > > > >> If there are 2 M60 in the system then you would find '11' type > > > > >> directory > > > > >> in mdev_supported_types of both M60. If you have P100, '11' type > > > > >> would > > > > >> not be there in its mdev_supported_types, it will have different > > > > >> types. > > > > >> > > > > >> For example, if you replace M60 with P100, but XML is not updated. > > > > >> XML > > > > >> have type '11'. When libvirt would try to create mdev device, libvirt > > > > >> would have to find 'create' file in sysfs in following directory > > > > >> format: > > > > >> > > > > >> --- mdev_supported_types > > > > >> |-- 11 > > > > >> | |-- create > > > > >> > > > > >> but now for P100, '11' directory is not there, so libvirt should > > > > >> throw > > > > >> error on not able to find '11' directory. > > > > > > > > > > This really seems like an accident waiting to happen. What happens > > > > > when the user replaces their M60 with an Intel XYZ device that happens > > > > > to expose a type 11 mdev class gpu device? How is libvirt supposed to > > > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and removing > > > > > yet another arbitrary requirement that we have some sort of globally > > > > > unique type-id database make a lot of sense? The same issue applies > > > > > for simple debug-ability, if I'm reviewing the XML for a domain and > > > > > the > > > > > name is the primary index for the mdev device, I know what it is. > > > > > Seeing type-id='11' is meaningless. > > > > > > > > > > > > > Let me clarify again, type '11' is a string that vendor driver would > > > > define (see my previous reply below) it could be "11" or "GRID-M60-0B". > > > > If 2 vendors used same string we can't control that. right? > > > > > > > > > > > > Lets remove 'id' from type id in XML if that is the concern. > > > > Supported > > > > types is going to be defined by vendor driver, so let vendor driver > > > > decide what to use for directory name and same should be used in > > > > device > > > > xml file, it could be '11' or "GRID M60-0B": > > > > > > > > > > > > my-vgpu > > > > pci__86_00_0 > > > > > > > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Wed, 28 Sep 2016 12:59:59 -0700 Neo Jiawrote: > On Wed, Sep 28, 2016 at 07:45:38PM +, Tian, Kevin wrote: > > > From: Neo Jia [mailto:c...@nvidia.com] > > > Sent: Thursday, September 29, 2016 3:23 AM > > > > > > On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: > > > > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > > > > > On Thu, 22 Sep 2016 09:41:20 +0530 > > > > > Kirti Wankhede wrote: > > > > > > > > > > > > My concern is that a type id seems arbitrary but we're > > > > > > > specifying that > > > > > > > it be unique. We already have something unique, the name. > > > > > > > So why try > > > > > > > to make the type id unique as well? A vendor can > > > > > > > accidentally create > > > > > > > their vendor driver so that a given name means something very > > > > > > > specific. On the other hand they need to be extremely > > > > > > > deliberate to > > > > > > > coordinate that a type id means a unique thing across all > > > > > > > their product > > > > > > > lines. > > > > > > > > > > > > > > > > > > > Let me clarify, type id should be unique in the list of > > > > > > mdev_supported_types. You can't have 2 directories in with > > > > > > same name. > > > > > > >>> > > > > > > >>> Of course, but does that mean it's only unique to the machine > > > > > > >>> I'm > > > > > > >>> currently running on? Let's say I have a Tesla P100 on my > > > > > > >>> system and > > > > > > >>> type-id 11 is named "GRID-M60-0B". At some point in the future > > > > > > >>> I > > > > > > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 > > > > > > >>> on that > > > > > > >>> new card still going to be a "GRID-M60-0B"? If not then we've > > > > > > >>> based > > > > > > >>> our XML on the wrong attribute. If the new device does not > > > > > > >>> support > > > > > > >>> "GRID-M60-0B" then we should generate an error, not simply > > > > > > >>> initialize > > > > > > >>> whatever type-id 11 happens to be on this new card. > > > > > > >>> > > > > > > >> > > > > > > >> If there are 2 M60 in the system then you would find '11' type > > > > > > >> directory > > > > > > >> in mdev_supported_types of both M60. If you have P100, '11' type > > > > > > >> would > > > > > > >> not be there in its mdev_supported_types, it will have different > > > > > > >> types. > > > > > > >> > > > > > > >> For example, if you replace M60 with P100, but XML is not > > > > > > >> updated. XML > > > > > > >> have type '11'. When libvirt would try to create mdev device, > > > > > > >> libvirt > > > > > > >> would have to find 'create' file in sysfs in following directory > > > > > > >> format: > > > > > > >> > > > > > > >> --- mdev_supported_types > > > > > > >> |-- 11 > > > > > > >> | |-- create > > > > > > >> > > > > > > >> but now for P100, '11' directory is not there, so libvirt should > > > > > > >> throw > > > > > > >> error on not able to find '11' directory. > > > > > > > > > > > > > > This really seems like an accident waiting to happen. What > > > > > > > happens > > > > > > > when the user replaces their M60 with an Intel XYZ device that > > > > > > > happens > > > > > > > to expose a type 11 mdev class gpu device? How is libvirt > > > > > > > supposed to > > > > > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > > > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and > > > > > > > removing > > > > > > > yet another arbitrary requirement that we have some sort of > > > > > > > globally > > > > > > > unique type-id database make a lot of sense? The same issue > > > > > > > applies > > > > > > > for simple debug-ability, if I'm reviewing the XML for a domain > > > > > > > and the > > > > > > > name is the primary index for the mdev device, I know what it is. > > > > > > > Seeing type-id='11' is meaningless. > > > > > > > > > > > > > > > > > > > Let me clarify again, type '11' is a string that vendor driver would > > > > > > define (see my previous reply below) it could be "11" or > > > > > > "GRID-M60-0B". > > > > > > If 2 vendors used same string we can't control that. right? > > > > > > > > > > > > > > > > > > Lets remove 'id' from type id in XML if that is the concern. > > > > > > Supported > > > > > > types is going to be defined by vendor driver, so let vendor > > > > > > driver > > > > > > decide what to use for directory name and same should be used > > > > > > in device > > > > > > xml file, it could be '11' or "GRID M60-0B": > > > > > > > > > > > > > > > > > > my-vgpu > > > > > > pci__86_00_0 > > > > > > > > > > > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Wed, 28 Sep 2016 13:06:31 -0700 Neo Jiawrote: > On Wed, Sep 28, 2016 at 01:55:47PM -0600, Alex Williamson wrote: > > On Wed, 28 Sep 2016 12:22:35 -0700 > > Neo Jia wrote: > > > > > On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: > > > > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > > > > > On Thu, 22 Sep 2016 09:41:20 +0530 > > > > > Kirti Wankhede wrote: > > > > > > > > > > > > My concern is that a type id seems arbitrary but we're > > > > > > > specifying that > > > > > > > it be unique. We already have something unique, the name. > > > > > > > So why try > > > > > > > to make the type id unique as well? A vendor can > > > > > > > accidentally create > > > > > > > their vendor driver so that a given name means something very > > > > > > > specific. On the other hand they need to be extremely > > > > > > > deliberate to > > > > > > > coordinate that a type id means a unique thing across all > > > > > > > their product > > > > > > > lines. > > > > > > > > > > > > > > > > > > > Let me clarify, type id should be unique in the list of > > > > > > mdev_supported_types. You can't have 2 directories in with > > > > > > same name. > > > > > > >>> > > > > > > >>> Of course, but does that mean it's only unique to the machine > > > > > > >>> I'm > > > > > > >>> currently running on? Let's say I have a Tesla P100 on my > > > > > > >>> system and > > > > > > >>> type-id 11 is named "GRID-M60-0B". At some point in the future > > > > > > >>> I > > > > > > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 > > > > > > >>> on that > > > > > > >>> new card still going to be a "GRID-M60-0B"? If not then we've > > > > > > >>> based > > > > > > >>> our XML on the wrong attribute. If the new device does not > > > > > > >>> support > > > > > > >>> "GRID-M60-0B" then we should generate an error, not simply > > > > > > >>> initialize > > > > > > >>> whatever type-id 11 happens to be on this new card. > > > > > > >>> > > > > > > >> > > > > > > >> If there are 2 M60 in the system then you would find '11' type > > > > > > >> directory > > > > > > >> in mdev_supported_types of both M60. If you have P100, '11' type > > > > > > >> would > > > > > > >> not be there in its mdev_supported_types, it will have different > > > > > > >> types. > > > > > > >> > > > > > > >> For example, if you replace M60 with P100, but XML is not > > > > > > >> updated. XML > > > > > > >> have type '11'. When libvirt would try to create mdev device, > > > > > > >> libvirt > > > > > > >> would have to find 'create' file in sysfs in following directory > > > > > > >> format: > > > > > > >> > > > > > > >> --- mdev_supported_types > > > > > > >> |-- 11 > > > > > > >> | |-- create > > > > > > >> > > > > > > >> but now for P100, '11' directory is not there, so libvirt should > > > > > > >> throw > > > > > > >> error on not able to find '11' directory. > > > > > > > > > > > > > > This really seems like an accident waiting to happen. What > > > > > > > happens > > > > > > > when the user replaces their M60 with an Intel XYZ device that > > > > > > > happens > > > > > > > to expose a type 11 mdev class gpu device? How is libvirt > > > > > > > supposed to > > > > > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > > > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and > > > > > > > removing > > > > > > > yet another arbitrary requirement that we have some sort of > > > > > > > globally > > > > > > > unique type-id database make a lot of sense? The same issue > > > > > > > applies > > > > > > > for simple debug-ability, if I'm reviewing the XML for a domain > > > > > > > and the > > > > > > > name is the primary index for the mdev device, I know what it is. > > > > > > > Seeing type-id='11' is meaningless. > > > > > > > > > > > > > > > > > > > Let me clarify again, type '11' is a string that vendor driver would > > > > > > define (see my previous reply below) it could be "11" or > > > > > > "GRID-M60-0B". > > > > > > If 2 vendors used same string we can't control that. right? > > > > > > > > > > > > > > > > > > Lets remove 'id' from type id in XML if that is the concern. > > > > > > Supported > > > > > > types is going to be defined by vendor driver, so let vendor > > > > > > driver > > > > > > decide what to use for directory name and same should be used > > > > > > in device > > > > > > xml file, it could be '11' or "GRID M60-0B": > > > > > > > > > > > > > > > > > > my-vgpu > > > > > > pci__86_00_0 > > > > > > > > > > > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Wed, Sep 28, 2016 at 04:31:25PM -0400, Laine Stump wrote: > On 09/28/2016 03:59 PM, Neo Jia wrote: > > On Wed, Sep 28, 2016 at 07:45:38PM +, Tian, Kevin wrote: > > > > From: Neo Jia [mailto:c...@nvidia.com] > > > > Sent: Thursday, September 29, 2016 3:23 AM > > > > > > > > On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: > > > > > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > > > > > > On Thu, 22 Sep 2016 09:41:20 +0530 > > > > > > Kirti Wankhedewrote: > > > > > > > > > > > > > > > > > > My concern is that a type id seems arbitrary but we're > > > > > > > > > > > > specifying that > > > > > > > > > > > > it be unique. We already have something unique, the > > > > > > > > > > > > name. So why try > > > > > > > > > > > > to make the type id unique as well? A vendor can > > > > > > > > > > > > accidentally create > > > > > > > > > > > > their vendor driver so that a given name means > > > > > > > > > > > > something very > > > > > > > > > > > > specific. On the other hand they need to be extremely > > > > > > > > > > > > deliberate to > > > > > > > > > > > > coordinate that a type id means a unique thing across > > > > > > > > > > > > all their product > > > > > > > > > > > > lines. > > > > > > > > > > > > > > > > > > > > > > > Let me clarify, type id should be unique in the list of > > > > > > > > > > > mdev_supported_types. You can't have 2 directories in > > > > > > > > > > > with same name. > > > > > > > > > > Of course, but does that mean it's only unique to the > > > > > > > > > > machine I'm > > > > > > > > > > currently running on? Let's say I have a Tesla P100 on my > > > > > > > > > > system and > > > > > > > > > > type-id 11 is named "GRID-M60-0B". At some point in the > > > > > > > > > > future I > > > > > > > > > > replace the Tesla P100 with a Q1000 (made up). Is type-id > > > > > > > > > > 11 on that > > > > > > > > > > new card still going to be a "GRID-M60-0B"? If not then > > > > > > > > > > we've based > > > > > > > > > > our XML on the wrong attribute. If the new device does not > > > > > > > > > > support > > > > > > > > > > "GRID-M60-0B" then we should generate an error, not simply > > > > > > > > > > initialize > > > > > > > > > > whatever type-id 11 happens to be on this new card. > > > > > > > > > > > > > > > > > > > If there are 2 M60 in the system then you would find '11' > > > > > > > > > type directory > > > > > > > > > in mdev_supported_types of both M60. If you have P100, '11' > > > > > > > > > type would > > > > > > > > > not be there in its mdev_supported_types, it will have > > > > > > > > > different types. > > > > > > > > > > > > > > > > > > For example, if you replace M60 with P100, but XML is not > > > > > > > > > updated. XML > > > > > > > > > have type '11'. When libvirt would try to create mdev device, > > > > > > > > > libvirt > > > > > > > > > would have to find 'create' file in sysfs in following > > > > > > > > > directory format: > > > > > > > > > > > > > > > > > > --- mdev_supported_types > > > > > > > > > |-- 11 > > > > > > > > > | |-- create > > > > > > > > > > > > > > > > > > but now for P100, '11' directory is not there, so libvirt > > > > > > > > > should throw > > > > > > > > > error on not able to find '11' directory. > > > > > > > > This really seems like an accident waiting to happen. What > > > > > > > > happens > > > > > > > > when the user replaces their M60 with an Intel XYZ device that > > > > > > > > happens > > > > > > > > to expose a type 11 mdev class gpu device? How is libvirt > > > > > > > > supposed to > > > > > > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > > > > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and > > > > > > > > removing > > > > > > > > yet another arbitrary requirement that we have some sort of > > > > > > > > globally > > > > > > > > unique type-id database make a lot of sense? The same issue > > > > > > > > applies > > > > > > > > for simple debug-ability, if I'm reviewing the XML for a domain > > > > > > > > and the > > > > > > > > name is the primary index for the mdev device, I know what it > > > > > > > > is. > > > > > > > > Seeing type-id='11' is meaningless. > > > > > > > > > > > > > > > Let me clarify again, type '11' is a string that vendor driver > > > > > > > would > > > > > > > define (see my previous reply below) it could be "11" or > > > > > > > "GRID-M60-0B". > > > > > > > If 2 vendors used same string we can't control that. right? > > > > > > > > > > > > > > > > > > > > > > > > > Lets remove 'id' from type id in XML if that is the > > > > > > > > > > > concern. Supported > > > > > > > > > > > types is going to be defined by vendor driver, so let > > > > > > > > > > > vendor driver > > > > > > > > > > > decide what to use for directory name and same should be > > > > > > > > > > > used in device > > > > > > > > > > > xml
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On 09/28/2016 03:59 PM, Neo Jia wrote: On Wed, Sep 28, 2016 at 07:45:38PM +, Tian, Kevin wrote: From: Neo Jia [mailto:c...@nvidia.com] Sent: Thursday, September 29, 2016 3:23 AM On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: On Thu, 22 Sep 2016 09:41:20 +0530 Kirti Wankhedewrote: My concern is that a type id seems arbitrary but we're specifying that it be unique. We already have something unique, the name. So why try to make the type id unique as well? A vendor can accidentally create their vendor driver so that a given name means something very specific. On the other hand they need to be extremely deliberate to coordinate that a type id means a unique thing across all their product lines. Let me clarify, type id should be unique in the list of mdev_supported_types. You can't have 2 directories in with same name. Of course, but does that mean it's only unique to the machine I'm currently running on? Let's say I have a Tesla P100 on my system and type-id 11 is named "GRID-M60-0B". At some point in the future I replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on that new card still going to be a "GRID-M60-0B"? If not then we've based our XML on the wrong attribute. If the new device does not support "GRID-M60-0B" then we should generate an error, not simply initialize whatever type-id 11 happens to be on this new card. If there are 2 M60 in the system then you would find '11' type directory in mdev_supported_types of both M60. If you have P100, '11' type would not be there in its mdev_supported_types, it will have different types. For example, if you replace M60 with P100, but XML is not updated. XML have type '11'. When libvirt would try to create mdev device, libvirt would have to find 'create' file in sysfs in following directory format: --- mdev_supported_types |-- 11 | |-- create but now for P100, '11' directory is not there, so libvirt should throw error on not able to find '11' directory. This really seems like an accident waiting to happen. What happens when the user replaces their M60 with an Intel XYZ device that happens to expose a type 11 mdev class gpu device? How is libvirt supposed to know that the XML used to refer to a GRID-M60-0B and now it's an INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and removing yet another arbitrary requirement that we have some sort of globally unique type-id database make a lot of sense? The same issue applies for simple debug-ability, if I'm reviewing the XML for a domain and the name is the primary index for the mdev device, I know what it is. Seeing type-id='11' is meaningless. Let me clarify again, type '11' is a string that vendor driver would define (see my previous reply below) it could be "11" or "GRID-M60-0B". If 2 vendors used same string we can't control that. right? Lets remove 'id' from type id in XML if that is the concern. Supported types is going to be defined by vendor driver, so let vendor driver decide what to use for directory name and same should be used in device xml file, it could be '11' or "GRID M60-0B": my-vgpu pci__86_00_0
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Wed, Sep 28, 2016 at 01:55:47PM -0600, Alex Williamson wrote: > On Wed, 28 Sep 2016 12:22:35 -0700 > Neo Jiawrote: > > > On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: > > > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > > > > On Thu, 22 Sep 2016 09:41:20 +0530 > > > > Kirti Wankhede wrote: > > > > > > > > > > My concern is that a type id seems arbitrary but we're > > > > > > specifying that > > > > > > it be unique. We already have something unique, the name. So > > > > > > why try > > > > > > to make the type id unique as well? A vendor can accidentally > > > > > > create > > > > > > their vendor driver so that a given name means something very > > > > > > specific. On the other hand they need to be extremely > > > > > > deliberate to > > > > > > coordinate that a type id means a unique thing across all their > > > > > > product > > > > > > lines. > > > > > > > > > > > > > > > > Let me clarify, type id should be unique in the list of > > > > > mdev_supported_types. You can't have 2 directories in with same > > > > > name. > > > > > >>> > > > > > >>> Of course, but does that mean it's only unique to the machine I'm > > > > > >>> currently running on? Let's say I have a Tesla P100 on my system > > > > > >>> and > > > > > >>> type-id 11 is named "GRID-M60-0B". At some point in the future I > > > > > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on > > > > > >>> that > > > > > >>> new card still going to be a "GRID-M60-0B"? If not then we've > > > > > >>> based > > > > > >>> our XML on the wrong attribute. If the new device does not > > > > > >>> support > > > > > >>> "GRID-M60-0B" then we should generate an error, not simply > > > > > >>> initialize > > > > > >>> whatever type-id 11 happens to be on this new card. > > > > > >>> > > > > > >> > > > > > >> If there are 2 M60 in the system then you would find '11' type > > > > > >> directory > > > > > >> in mdev_supported_types of both M60. If you have P100, '11' type > > > > > >> would > > > > > >> not be there in its mdev_supported_types, it will have different > > > > > >> types. > > > > > >> > > > > > >> For example, if you replace M60 with P100, but XML is not updated. > > > > > >> XML > > > > > >> have type '11'. When libvirt would try to create mdev device, > > > > > >> libvirt > > > > > >> would have to find 'create' file in sysfs in following directory > > > > > >> format: > > > > > >> > > > > > >> --- mdev_supported_types > > > > > >> |-- 11 > > > > > >> | |-- create > > > > > >> > > > > > >> but now for P100, '11' directory is not there, so libvirt should > > > > > >> throw > > > > > >> error on not able to find '11' directory. > > > > > > > > > > > > This really seems like an accident waiting to happen. What happens > > > > > > when the user replaces their M60 with an Intel XYZ device that > > > > > > happens > > > > > > to expose a type 11 mdev class gpu device? How is libvirt supposed > > > > > > to > > > > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and > > > > > > removing > > > > > > yet another arbitrary requirement that we have some sort of globally > > > > > > unique type-id database make a lot of sense? The same issue applies > > > > > > for simple debug-ability, if I'm reviewing the XML for a domain and > > > > > > the > > > > > > name is the primary index for the mdev device, I know what it is. > > > > > > Seeing type-id='11' is meaningless. > > > > > > > > > > > > > > > > Let me clarify again, type '11' is a string that vendor driver would > > > > > define (see my previous reply below) it could be "11" or > > > > > "GRID-M60-0B". > > > > > If 2 vendors used same string we can't control that. right? > > > > > > > > > > > > > > > Lets remove 'id' from type id in XML if that is the concern. > > > > > Supported > > > > > types is going to be defined by vendor driver, so let vendor > > > > > driver > > > > > decide what to use for directory name and same should be used in > > > > > device > > > > > xml file, it could be '11' or "GRID M60-0B": > > > > > > > > > > > > > > > my-vgpu > > > > > pci__86_00_0 > > > > > > > > > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Wed, Sep 28, 2016 at 07:45:38PM +, Tian, Kevin wrote: > > From: Neo Jia [mailto:c...@nvidia.com] > > Sent: Thursday, September 29, 2016 3:23 AM > > > > On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: > > > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > > > > On Thu, 22 Sep 2016 09:41:20 +0530 > > > > Kirti Wankhedewrote: > > > > > > > > > > My concern is that a type id seems arbitrary but we're > > > > > > specifying that > > > > > > it be unique. We already have something unique, the name. So > > > > > > why try > > > > > > to make the type id unique as well? A vendor can accidentally > > > > > > create > > > > > > their vendor driver so that a given name means something very > > > > > > specific. On the other hand they need to be extremely > > > > > > deliberate to > > > > > > coordinate that a type id means a unique thing across all their > > > > > > product > > > > > > lines. > > > > > > > > > > > > > > > > Let me clarify, type id should be unique in the list of > > > > > mdev_supported_types. You can't have 2 directories in with same > > > > > name. > > > > > >>> > > > > > >>> Of course, but does that mean it's only unique to the machine I'm > > > > > >>> currently running on? Let's say I have a Tesla P100 on my system > > > > > >>> and > > > > > >>> type-id 11 is named "GRID-M60-0B". At some point in the future I > > > > > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on > > > > > >>> that > > > > > >>> new card still going to be a "GRID-M60-0B"? If not then we've > > > > > >>> based > > > > > >>> our XML on the wrong attribute. If the new device does not > > > > > >>> support > > > > > >>> "GRID-M60-0B" then we should generate an error, not simply > > > > > >>> initialize > > > > > >>> whatever type-id 11 happens to be on this new card. > > > > > >>> > > > > > >> > > > > > >> If there are 2 M60 in the system then you would find '11' type > > > > > >> directory > > > > > >> in mdev_supported_types of both M60. If you have P100, '11' type > > > > > >> would > > > > > >> not be there in its mdev_supported_types, it will have different > > > > > >> types. > > > > > >> > > > > > >> For example, if you replace M60 with P100, but XML is not updated. > > > > > >> XML > > > > > >> have type '11'. When libvirt would try to create mdev device, > > > > > >> libvirt > > > > > >> would have to find 'create' file in sysfs in following directory > > > > > >> format: > > > > > >> > > > > > >> --- mdev_supported_types > > > > > >> |-- 11 > > > > > >> | |-- create > > > > > >> > > > > > >> but now for P100, '11' directory is not there, so libvirt should > > > > > >> throw > > > > > >> error on not able to find '11' directory. > > > > > > > > > > > > This really seems like an accident waiting to happen. What happens > > > > > > when the user replaces their M60 with an Intel XYZ device that > > > > > > happens > > > > > > to expose a type 11 mdev class gpu device? How is libvirt supposed > > > > > > to > > > > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and > > > > > > removing > > > > > > yet another arbitrary requirement that we have some sort of globally > > > > > > unique type-id database make a lot of sense? The same issue applies > > > > > > for simple debug-ability, if I'm reviewing the XML for a domain and > > > > > > the > > > > > > name is the primary index for the mdev device, I know what it is. > > > > > > Seeing type-id='11' is meaningless. > > > > > > > > > > > > > > > > Let me clarify again, type '11' is a string that vendor driver would > > > > > define (see my previous reply below) it could be "11" or > > > > > "GRID-M60-0B". > > > > > If 2 vendors used same string we can't control that. right? > > > > > > > > > > > > > > > Lets remove 'id' from type id in XML if that is the concern. > > > > > Supported > > > > > types is going to be defined by vendor driver, so let vendor > > > > > driver > > > > > decide what to use for directory name and same should be used in > > > > > device > > > > > xml file, it could be '11' or "GRID M60-0B": > > > > > > > > > > > > > > > my-vgpu > > > > > pci__86_00_0 > > > > > > > > > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Wed, 28 Sep 2016 12:22:35 -0700 Neo Jiawrote: > On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: > > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > > > On Thu, 22 Sep 2016 09:41:20 +0530 > > > Kirti Wankhede wrote: > > > > > > > > My concern is that a type id seems arbitrary but we're specifying > > > > > that > > > > > it be unique. We already have something unique, the name. So > > > > > why try > > > > > to make the type id unique as well? A vendor can accidentally > > > > > create > > > > > their vendor driver so that a given name means something very > > > > > specific. On the other hand they need to be extremely deliberate > > > > > to > > > > > coordinate that a type id means a unique thing across all their > > > > > product > > > > > lines. > > > > > > > > > > > > > Let me clarify, type id should be unique in the list of > > > > mdev_supported_types. You can't have 2 directories in with same > > > > name. > > > > >>> > > > > >>> Of course, but does that mean it's only unique to the machine I'm > > > > >>> currently running on? Let's say I have a Tesla P100 on my system > > > > >>> and > > > > >>> type-id 11 is named "GRID-M60-0B". At some point in the future I > > > > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on > > > > >>> that > > > > >>> new card still going to be a "GRID-M60-0B"? If not then we've based > > > > >>> our XML on the wrong attribute. If the new device does not support > > > > >>> "GRID-M60-0B" then we should generate an error, not simply > > > > >>> initialize > > > > >>> whatever type-id 11 happens to be on this new card. > > > > >>> > > > > >> > > > > >> If there are 2 M60 in the system then you would find '11' type > > > > >> directory > > > > >> in mdev_supported_types of both M60. If you have P100, '11' type > > > > >> would > > > > >> not be there in its mdev_supported_types, it will have different > > > > >> types. > > > > >> > > > > >> For example, if you replace M60 with P100, but XML is not updated. > > > > >> XML > > > > >> have type '11'. When libvirt would try to create mdev device, libvirt > > > > >> would have to find 'create' file in sysfs in following directory > > > > >> format: > > > > >> > > > > >> --- mdev_supported_types > > > > >> |-- 11 > > > > >> | |-- create > > > > >> > > > > >> but now for P100, '11' directory is not there, so libvirt should > > > > >> throw > > > > >> error on not able to find '11' directory. > > > > > > > > > > This really seems like an accident waiting to happen. What happens > > > > > when the user replaces their M60 with an Intel XYZ device that happens > > > > > to expose a type 11 mdev class gpu device? How is libvirt supposed to > > > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and removing > > > > > yet another arbitrary requirement that we have some sort of globally > > > > > unique type-id database make a lot of sense? The same issue applies > > > > > for simple debug-ability, if I'm reviewing the XML for a domain and > > > > > the > > > > > name is the primary index for the mdev device, I know what it is. > > > > > Seeing type-id='11' is meaningless. > > > > > > > > > > > > > Let me clarify again, type '11' is a string that vendor driver would > > > > define (see my previous reply below) it could be "11" or "GRID-M60-0B". > > > > If 2 vendors used same string we can't control that. right? > > > > > > > > > > > > Lets remove 'id' from type id in XML if that is the concern. > > > > Supported > > > > types is going to be defined by vendor driver, so let vendor driver > > > > decide what to use for directory name and same should be used in > > > > device > > > > xml file, it could be '11' or "GRID M60-0B": > > > > > > > > > > > > my-vgpu > > > > pci__86_00_0 > > > > > > > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> From: Neo Jia [mailto:c...@nvidia.com] > Sent: Thursday, September 29, 2016 3:23 AM > > On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: > > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > > > On Thu, 22 Sep 2016 09:41:20 +0530 > > > Kirti Wankhedewrote: > > > > > > > > My concern is that a type id seems arbitrary but we're specifying > > > > > that > > > > > it be unique. We already have something unique, the name. So > > > > > why try > > > > > to make the type id unique as well? A vendor can accidentally > > > > > create > > > > > their vendor driver so that a given name means something very > > > > > specific. On the other hand they need to be extremely deliberate > > > > > to > > > > > coordinate that a type id means a unique thing across all their > > > > > product > > > > > lines. > > > > > > > > > > > > > Let me clarify, type id should be unique in the list of > > > > mdev_supported_types. You can't have 2 directories in with same > > > > name. > > > > >>> > > > > >>> Of course, but does that mean it's only unique to the machine I'm > > > > >>> currently running on? Let's say I have a Tesla P100 on my system > > > > >>> and > > > > >>> type-id 11 is named "GRID-M60-0B". At some point in the future I > > > > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on > > > > >>> that > > > > >>> new card still going to be a "GRID-M60-0B"? If not then we've based > > > > >>> our XML on the wrong attribute. If the new device does not support > > > > >>> "GRID-M60-0B" then we should generate an error, not simply > > > > >>> initialize > > > > >>> whatever type-id 11 happens to be on this new card. > > > > >>> > > > > >> > > > > >> If there are 2 M60 in the system then you would find '11' type > > > > >> directory > > > > >> in mdev_supported_types of both M60. If you have P100, '11' type > > > > >> would > > > > >> not be there in its mdev_supported_types, it will have different > > > > >> types. > > > > >> > > > > >> For example, if you replace M60 with P100, but XML is not updated. > > > > >> XML > > > > >> have type '11'. When libvirt would try to create mdev device, libvirt > > > > >> would have to find 'create' file in sysfs in following directory > > > > >> format: > > > > >> > > > > >> --- mdev_supported_types > > > > >> |-- 11 > > > > >> | |-- create > > > > >> > > > > >> but now for P100, '11' directory is not there, so libvirt should > > > > >> throw > > > > >> error on not able to find '11' directory. > > > > > > > > > > This really seems like an accident waiting to happen. What happens > > > > > when the user replaces their M60 with an Intel XYZ device that happens > > > > > to expose a type 11 mdev class gpu device? How is libvirt supposed to > > > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and removing > > > > > yet another arbitrary requirement that we have some sort of globally > > > > > unique type-id database make a lot of sense? The same issue applies > > > > > for simple debug-ability, if I'm reviewing the XML for a domain and > > > > > the > > > > > name is the primary index for the mdev device, I know what it is. > > > > > Seeing type-id='11' is meaningless. > > > > > > > > > > > > > Let me clarify again, type '11' is a string that vendor driver would > > > > define (see my previous reply below) it could be "11" or "GRID-M60-0B". > > > > If 2 vendors used same string we can't control that. right? > > > > > > > > > > > > Lets remove 'id' from type id in XML if that is the concern. > > > > Supported > > > > types is going to be defined by vendor driver, so let vendor driver > > > > decide what to use for directory name and same should be used in > > > > device > > > > xml file, it could be '11' or "GRID M60-0B": > > > > > > > > > > > > my-vgpu > > > > pci__86_00_0 > > > > > > > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Thu, Sep 22, 2016 at 03:26:38PM +0100, Daniel P. Berrange wrote: > On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > > On Thu, 22 Sep 2016 09:41:20 +0530 > > Kirti Wankhedewrote: > > > > > > My concern is that a type id seems arbitrary but we're specifying > > > > that > > > > it be unique. We already have something unique, the name. So why > > > > try > > > > to make the type id unique as well? A vendor can accidentally > > > > create > > > > their vendor driver so that a given name means something very > > > > specific. On the other hand they need to be extremely deliberate to > > > > coordinate that a type id means a unique thing across all their > > > > product > > > > lines. > > > > > > > > > > Let me clarify, type id should be unique in the list of > > > mdev_supported_types. You can't have 2 directories in with same > > > name. > > > >>> > > > >>> Of course, but does that mean it's only unique to the machine I'm > > > >>> currently running on? Let's say I have a Tesla P100 on my system and > > > >>> type-id 11 is named "GRID-M60-0B". At some point in the future I > > > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on that > > > >>> new card still going to be a "GRID-M60-0B"? If not then we've based > > > >>> our XML on the wrong attribute. If the new device does not support > > > >>> "GRID-M60-0B" then we should generate an error, not simply initialize > > > >>> whatever type-id 11 happens to be on this new card. > > > >>> > > > >> > > > >> If there are 2 M60 in the system then you would find '11' type > > > >> directory > > > >> in mdev_supported_types of both M60. If you have P100, '11' type would > > > >> not be there in its mdev_supported_types, it will have different types. > > > >> > > > >> For example, if you replace M60 with P100, but XML is not updated. XML > > > >> have type '11'. When libvirt would try to create mdev device, libvirt > > > >> would have to find 'create' file in sysfs in following directory > > > >> format: > > > >> > > > >> --- mdev_supported_types > > > >> |-- 11 > > > >> | |-- create > > > >> > > > >> but now for P100, '11' directory is not there, so libvirt should throw > > > >> error on not able to find '11' directory. > > > > > > > > This really seems like an accident waiting to happen. What happens > > > > when the user replaces their M60 with an Intel XYZ device that happens > > > > to expose a type 11 mdev class gpu device? How is libvirt supposed to > > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and removing > > > > yet another arbitrary requirement that we have some sort of globally > > > > unique type-id database make a lot of sense? The same issue applies > > > > for simple debug-ability, if I'm reviewing the XML for a domain and the > > > > name is the primary index for the mdev device, I know what it is. > > > > Seeing type-id='11' is meaningless. > > > > > > > > > > Let me clarify again, type '11' is a string that vendor driver would > > > define (see my previous reply below) it could be "11" or "GRID-M60-0B". > > > If 2 vendors used same string we can't control that. right? > > > > > > > > > Lets remove 'id' from type id in XML if that is the concern. > > > Supported > > > types is going to be defined by vendor driver, so let vendor driver > > > decide what to use for directory name and same should be used in > > > device > > > xml file, it could be '11' or "GRID M60-0B": > > > > > > > > > my-vgpu > > > pci__86_00_0 > > > > > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On 9/23/2016 12:55 AM, Tian, Kevin wrote: >> From: Kirti Wankhede [mailto:kwankh...@nvidia.com] >> Sent: Wednesday, September 21, 2016 12:23 AM >>> > I have > a hard time believing that a given vendor can even allocate unique type > ids for their own devices. Unique type id across vendors is not > practical. So which attribute are we actually using to identify which > type of mdev device we need and why would we ever specify additional > attributes like fb_length? Doesn't the vendor guarantee that "GRID > M60-0B" has a fixed setup of those attributes? > Specifying attributes here is not our requirement. Yes we have fixed set of attributes for "GRID-M60-0B" and on. We are defining the attributed here for "display" class for all other vendor of gpu can use. > > Hi, Kirti, > > We decide to go with above type-based interface for KVMGT, with fixed setup > of attributes too. If both Intel and NVIDIA solutions use such fixed manner, > can we go with a proposal which doesn't include 'class' extension for now? > Later if there is a concrete usage requiring such class-specific attribute > setting, > then it's easy to extend following discussion in this thread. I'm thinking > how we > can converge the discussion here into something simple enough (and extensible) > to accelerate upstreaming of both Intel/NVIDIA solutions... > Hi Kevin, We have fixed set of attributes which are GPU/graphics specific, like framebuffer_length, resolution, number of heads, max supported instances. If you are going with fixed set of attributes, how are the vGPU types on KVMGT looks like from attributes point of view? attributes are graphics specific attributes? Thanks, Kirti -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > Sent: Wednesday, September 21, 2016 12:23 AM > > > >>> I have > >>> a hard time believing that a given vendor can even allocate unique type > >>> ids for their own devices. Unique type id across vendors is not > >>> practical. So which attribute are we actually using to identify which > >>> type of mdev device we need and why would we ever specify additional > >>> attributes like fb_length? Doesn't the vendor guarantee that "GRID > >>> M60-0B" has a fixed setup of those attributes? > >>> > >> > >> Specifying attributes here is not our requirement. Yes we have fixed set > >> of attributes for "GRID-M60-0B" and on. > >> We are defining the attributed here for "display" class for all other > >> vendor of gpu can use. > >> Hi, Kirti, We decide to go with above type-based interface for KVMGT, with fixed setup of attributes too. If both Intel and NVIDIA solutions use such fixed manner, can we go with a proposal which doesn't include 'class' extension for now? Later if there is a concrete usage requiring such class-specific attribute setting, then it's easy to extend following discussion in this thread. I'm thinking how we can converge the discussion here into something simple enough (and extensible) to accelerate upstreaming of both Intel/NVIDIA solutions... Thanks Kevin -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Thu, Sep 22, 2016 at 08:19:21AM -0600, Alex Williamson wrote: > On Thu, 22 Sep 2016 09:41:20 +0530 > Kirti Wankhedewrote: > > > > My concern is that a type id seems arbitrary but we're specifying that > > > it be unique. We already have something unique, the name. So why try > > > to make the type id unique as well? A vendor can accidentally create > > > their vendor driver so that a given name means something very > > > specific. On the other hand they need to be extremely deliberate to > > > coordinate that a type id means a unique thing across all their > > > product > > > lines. > > > > > > > Let me clarify, type id should be unique in the list of > > mdev_supported_types. You can't have 2 directories in with same name. > > > > >>> > > >>> Of course, but does that mean it's only unique to the machine I'm > > >>> currently running on? Let's say I have a Tesla P100 on my system and > > >>> type-id 11 is named "GRID-M60-0B". At some point in the future I > > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on that > > >>> new card still going to be a "GRID-M60-0B"? If not then we've based > > >>> our XML on the wrong attribute. If the new device does not support > > >>> "GRID-M60-0B" then we should generate an error, not simply initialize > > >>> whatever type-id 11 happens to be on this new card. > > >>> > > >> > > >> If there are 2 M60 in the system then you would find '11' type directory > > >> in mdev_supported_types of both M60. If you have P100, '11' type would > > >> not be there in its mdev_supported_types, it will have different types. > > >> > > >> For example, if you replace M60 with P100, but XML is not updated. XML > > >> have type '11'. When libvirt would try to create mdev device, libvirt > > >> would have to find 'create' file in sysfs in following directory format: > > >> > > >> --- mdev_supported_types > > >> |-- 11 > > >> | |-- create > > >> > > >> but now for P100, '11' directory is not there, so libvirt should throw > > >> error on not able to find '11' directory. > > > > > > This really seems like an accident waiting to happen. What happens > > > when the user replaces their M60 with an Intel XYZ device that happens > > > to expose a type 11 mdev class gpu device? How is libvirt supposed to > > > know that the XML used to refer to a GRID-M60-0B and now it's an > > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and removing > > > yet another arbitrary requirement that we have some sort of globally > > > unique type-id database make a lot of sense? The same issue applies > > > for simple debug-ability, if I'm reviewing the XML for a domain and the > > > name is the primary index for the mdev device, I know what it is. > > > Seeing type-id='11' is meaningless. > > > > > > > Let me clarify again, type '11' is a string that vendor driver would > > define (see my previous reply below) it could be "11" or "GRID-M60-0B". > > If 2 vendors used same string we can't control that. right? > > > > > > Lets remove 'id' from type id in XML if that is the concern. Supported > > types is going to be defined by vendor driver, so let vendor driver > > decide what to use for directory name and same should be used in device > > xml file, it could be '11' or "GRID M60-0B": > > > > > > my-vgpu > > pci__86_00_0 > > > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Thu, 22 Sep 2016 09:41:20 +0530 Kirti Wankhedewrote: > > My concern is that a type id seems arbitrary but we're specifying that > > it be unique. We already have something unique, the name. So why try > > to make the type id unique as well? A vendor can accidentally create > > their vendor driver so that a given name means something very > > specific. On the other hand they need to be extremely deliberate to > > coordinate that a type id means a unique thing across all their product > > lines. > > > > Let me clarify, type id should be unique in the list of > mdev_supported_types. You can't have 2 directories in with same name. > >>> > >>> Of course, but does that mean it's only unique to the machine I'm > >>> currently running on? Let's say I have a Tesla P100 on my system and > >>> type-id 11 is named "GRID-M60-0B". At some point in the future I > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on that > >>> new card still going to be a "GRID-M60-0B"? If not then we've based > >>> our XML on the wrong attribute. If the new device does not support > >>> "GRID-M60-0B" then we should generate an error, not simply initialize > >>> whatever type-id 11 happens to be on this new card. > >>> > >> > >> If there are 2 M60 in the system then you would find '11' type directory > >> in mdev_supported_types of both M60. If you have P100, '11' type would > >> not be there in its mdev_supported_types, it will have different types. > >> > >> For example, if you replace M60 with P100, but XML is not updated. XML > >> have type '11'. When libvirt would try to create mdev device, libvirt > >> would have to find 'create' file in sysfs in following directory format: > >> > >> --- mdev_supported_types > >> |-- 11 > >> | |-- create > >> > >> but now for P100, '11' directory is not there, so libvirt should throw > >> error on not able to find '11' directory. > > > > This really seems like an accident waiting to happen. What happens > > when the user replaces their M60 with an Intel XYZ device that happens > > to expose a type 11 mdev class gpu device? How is libvirt supposed to > > know that the XML used to refer to a GRID-M60-0B and now it's an > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and removing > > yet another arbitrary requirement that we have some sort of globally > > unique type-id database make a lot of sense? The same issue applies > > for simple debug-ability, if I'm reviewing the XML for a domain and the > > name is the primary index for the mdev device, I know what it is. > > Seeing type-id='11' is meaningless. > > > > Let me clarify again, type '11' is a string that vendor driver would > define (see my previous reply below) it could be "11" or "GRID-M60-0B". > If 2 vendors used same string we can't control that. right? > > > Lets remove 'id' from type id in XML if that is the concern. Supported > types is going to be defined by vendor driver, so let vendor driver > decide what to use for directory name and same should be used in device > xml file, it could be '11' or "GRID M60-0B": > > > my-vgpu > pci__86_00_0 > >
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> My concern is that a type id seems arbitrary but we're specifying that > it be unique. We already have something unique, the name. So why try > to make the type id unique as well? A vendor can accidentally create > their vendor driver so that a given name means something very > specific. On the other hand they need to be extremely deliberate to > coordinate that a type id means a unique thing across all their product > lines. > Let me clarify, type id should be unique in the list of mdev_supported_types. You can't have 2 directories in with same name. >>> >>> Of course, but does that mean it's only unique to the machine I'm >>> currently running on? Let's say I have a Tesla P100 on my system and >>> type-id 11 is named "GRID-M60-0B". At some point in the future I >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on that >>> new card still going to be a "GRID-M60-0B"? If not then we've based >>> our XML on the wrong attribute. If the new device does not support >>> "GRID-M60-0B" then we should generate an error, not simply initialize >>> whatever type-id 11 happens to be on this new card. >>> >> >> If there are 2 M60 in the system then you would find '11' type directory >> in mdev_supported_types of both M60. If you have P100, '11' type would >> not be there in its mdev_supported_types, it will have different types. >> >> For example, if you replace M60 with P100, but XML is not updated. XML >> have type '11'. When libvirt would try to create mdev device, libvirt >> would have to find 'create' file in sysfs in following directory format: >> >> --- mdev_supported_types >> |-- 11 >> | |-- create >> >> but now for P100, '11' directory is not there, so libvirt should throw >> error on not able to find '11' directory. > > This really seems like an accident waiting to happen. What happens > when the user replaces their M60 with an Intel XYZ device that happens > to expose a type 11 mdev class gpu device? How is libvirt supposed to > know that the XML used to refer to a GRID-M60-0B and now it's an > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and removing > yet another arbitrary requirement that we have some sort of globally > unique type-id database make a lot of sense? The same issue applies > for simple debug-ability, if I'm reviewing the XML for a domain and the > name is the primary index for the mdev device, I know what it is. > Seeing type-id='11' is meaningless. > Let me clarify again, type '11' is a string that vendor driver would define (see my previous reply below) it could be "11" or "GRID-M60-0B". If 2 vendors used same string we can't control that. right? Lets remove 'id' from type id in XML if that is the concern. Supported types is going to be defined by vendor driver, so let vendor driver decide what to use for directory name and same should be used in device xml file, it could be '11' or "GRID M60-0B": my-vgpu pci__86_00_0
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> From: Alex Williamson [mailto:alex.william...@redhat.com] > Sent: Thursday, September 22, 2016 11:02 AM > > On Thu, 22 Sep 2016 02:33:01 + > "Tian, Kevin"wrote: > > > > From: Alex Williamson [mailto:alex.william...@redhat.com] > > > Sent: Wednesday, September 21, 2016 12:37 PM > > > > > > On Wed, 21 Sep 2016 03:56:21 + > > > "Tian, Kevin" wrote: > > > > > > > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > > > > Sent: Tuesday, September 20, 2016 10:22 PM > > > > > >> > > > > > >> 'max_instance': > > > > > >> Read-Only file. Mandatory. > > > > > >> Returns integer. Returns maximum mdev device could be created > > > > > >> at the moment when this file is read. This count would be updated > > > > > >> by > > > > > >> vendor driver. Before creating mdev device of this type, check if > > > > > >> max_instance is > 0. > > > > > > > > > > > > Didn't we discuss this being being something like > > > > > > "available_instances" > > > > > > with a "devices" sub-directory linking current devices of that type? > > > > > > > > > > > > > > > > I'm ok with 'available_instances' as name of this file. > > > > > Yes, mdev device directory will have links to devices of that type. I > > > > > think that is part of mdev core module discussion. I'm trying to list > > > > > sysfs files for libvirt interface here to focus more on libvirt > > > > > interface. Hence didn't add that. I'll make sure to add all such > > > > > details > > > > > in future. > > > > > > > > > > > > > > > > > > Definitely you need provide those details since they also contain > > > > libvirt interface, e.g. 'destroy' which we discussed should be in > > > > mdev directory. > > > > > > $ find /sys/devices -name remove | wc -l > > > 24 > > > $ find /sys/devices -name destroy | wc -l > > > 0 > > > > > > 'remove' is the sysfs deconstructor we should use. > > > > make sense. > > > > > > > > > Another example is class-specific attributes such > > > > as 'resolution', etc. which you listed under 'display class'. All those > > > > attributes should be moved to mdev directory. Only class ID is > > > > required under each type. > > > > > > Can you elaborate on what you're proposing here? If we don't have > > > attributes like 'resolution' under the type-id then we can't describe > > > to the user the features of the mdev before we create it. I don't > > > think anybody wants a 'create it and find out' type of interface. > > > Thanks, > > > > > > > I think such way would be racy. What about two clients creating mdev > > devices simultaneously? How to guarantee their configuration of class > > specific attributes not mutual-impacted since before creation any such > > configuration would be global under that type? > > A simple mutex in the sysfs store attribute to serialize, return write > error if insufficient resources to create additional devices... Easy. > Let's not exaggerate the issue. > > > My feeling is that we'd better keep create simple - just write a UUID > > to "type-id/create". Any class-specific attribute, if we really want to > > support, should be able to be individually configured with required > > resource allocated incrementally on top of basic resources allocated > > at create stage. Then libvirt can set those attributes between create > > and open a mdev device. If this direction is accepted, then naturally > > such attributes should be put under mdev directory. Possibly we > > don't need a class description under type-id. libvirt just checks directly > > whether any known class represented in each mdev directory (vendor > > driver will expose it on demand), and then operate attributes under > > that class. > > It seems like this just moves the hypothetical race to the point where > we're adjusting individual attributes for the mdev device, but now the > problem is worse because there's no simple locking or serialization > that guarantees we can get all the attributes we want. We might get > half the attributes for one device and half the attributes for another > and not be able to complete initialization of either. I think we also > lose a lot of our ability to expose pre-defined configurations to the > user and the vendor's test matrix explodes. We also can't provide any > sort of reasonable estimate of how many devices can be created because > we don't know the resource requirements of each device. I don't like > this idea at all, sorry. Thanks, > Yes, it's a valid concern. Alex, curious about your opinion of supporting some device-agnostic optional attributes. One example is to set QoS parameters (priority, weight, etc.), which may be enabled by vendor driver to allow dynamic adjustment even after a mdev device is created. Do we still put a QoS class under type-id, meaning must-be-configured before creation (not a bad limitation since from SLA p.o.v all QoS attributes should be negotiated down in advance)? Would there be any concern if a type-id
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Thu, 22 Sep 2016 02:33:01 + "Tian, Kevin"wrote: > > From: Alex Williamson [mailto:alex.william...@redhat.com] > > Sent: Wednesday, September 21, 2016 12:37 PM > > > > On Wed, 21 Sep 2016 03:56:21 + > > "Tian, Kevin" wrote: > > > > > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > > > Sent: Tuesday, September 20, 2016 10:22 PM > > > > >> > > > > >> 'max_instance': > > > > >> Read-Only file. Mandatory. > > > > >> Returns integer. Returns maximum mdev device could be created > > > > >> at the moment when this file is read. This count would be updated by > > > > >> vendor driver. Before creating mdev device of this type, check if > > > > >> max_instance is > 0. > > > > > > > > > > Didn't we discuss this being being something like > > > > > "available_instances" > > > > > with a "devices" sub-directory linking current devices of that type? > > > > > > > > > > > > > I'm ok with 'available_instances' as name of this file. > > > > Yes, mdev device directory will have links to devices of that type. I > > > > think that is part of mdev core module discussion. I'm trying to list > > > > sysfs files for libvirt interface here to focus more on libvirt > > > > interface. Hence didn't add that. I'll make sure to add all such details > > > > in future. > > > > > > > > > > > > > > Definitely you need provide those details since they also contain > > > libvirt interface, e.g. 'destroy' which we discussed should be in > > > mdev directory. > > > > $ find /sys/devices -name remove | wc -l > > 24 > > $ find /sys/devices -name destroy | wc -l > > 0 > > > > 'remove' is the sysfs deconstructor we should use. > > make sense. > > > > > > Another example is class-specific attributes such > > > as 'resolution', etc. which you listed under 'display class'. All those > > > attributes should be moved to mdev directory. Only class ID is > > > required under each type. > > > > Can you elaborate on what you're proposing here? If we don't have > > attributes like 'resolution' under the type-id then we can't describe > > to the user the features of the mdev before we create it. I don't > > think anybody wants a 'create it and find out' type of interface. > > Thanks, > > > > I think such way would be racy. What about two clients creating mdev > devices simultaneously? How to guarantee their configuration of class > specific attributes not mutual-impacted since before creation any such > configuration would be global under that type? A simple mutex in the sysfs store attribute to serialize, return write error if insufficient resources to create additional devices... Easy. Let's not exaggerate the issue. > My feeling is that we'd better keep create simple - just write a UUID > to "type-id/create". Any class-specific attribute, if we really want to > support, should be able to be individually configured with required > resource allocated incrementally on top of basic resources allocated > at create stage. Then libvirt can set those attributes between create > and open a mdev device. If this direction is accepted, then naturally > such attributes should be put under mdev directory. Possibly we > don't need a class description under type-id. libvirt just checks directly > whether any known class represented in each mdev directory (vendor > driver will expose it on demand), and then operate attributes under > that class. It seems like this just moves the hypothetical race to the point where we're adjusting individual attributes for the mdev device, but now the problem is worse because there's no simple locking or serialization that guarantees we can get all the attributes we want. We might get half the attributes for one device and half the attributes for another and not be able to complete initialization of either. I think we also lose a lot of our ability to expose pre-defined configurations to the user and the vendor's test matrix explodes. We also can't provide any sort of reasonable estimate of how many devices can be created because we don't know the resource requirements of each device. I don't like this idea at all, sorry. Thanks, Alex -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> From: Tian, Kevin > Sent: Thursday, September 22, 2016 10:33 AM > > > > > > Another example is class-specific attributes such > > > as 'resolution', etc. which you listed under 'display class'. All those > > > attributes should be moved to mdev directory. Only class ID is > > > required under each type. > > > > Can you elaborate on what you're proposing here? If we don't have > > attributes like 'resolution' under the type-id then we can't describe > > to the user the features of the mdev before we create it. I don't > > think anybody wants a 'create it and find out' type of interface. > > Thanks, > > > > I think such way would be racy. What about two clients creating mdev > devices simultaneously? How to guarantee their configuration of class > specific attributes not mutual-impacted since before creation any such > configuration would be global under that type? > > My feeling is that we'd better keep create simple - just write a UUID > to "type-id/create". Any class-specific attribute, if we really want to > support, should be able to be individually configured with required > resource allocated incrementally on top of basic resources allocated > at create stage. Then libvirt can set those attributes between create > and open a mdev device. If this direction is accepted, then naturally > such attributes should be put under mdev directory. Possibly we > don't need a class description under type-id. libvirt just checks directly > whether any known class represented in each mdev directory (vendor > driver will expose it on demand), and then operate attributes under > that class. > Have a better understanding of your concern now. User needs to know and set a list of attributes before requesting to create a new mdev device. From this angle all attributes should be enumerated per type-id. But as I explained above, writing multiple sysfs nodes to create a mdev device is racy. We either need a way to guarantee transactional operations on those nodes, or having type-id directory to only expose supported attributes while programming those attributes has to be through per-mdev directory after mdev device is created... Thanks Kevin -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> From: Alex Williamson [mailto:alex.william...@redhat.com] > Sent: Wednesday, September 21, 2016 12:43 PM > > On Wed, 21 Sep 2016 04:10:53 + > "Tian, Kevin"wrote: > > > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > > Sent: Wednesday, September 21, 2016 12:23 AM > > > > > > > > > On 9/20/2016 8:13 PM, Alex Williamson wrote: > > > > On Tue, 20 Sep 2016 19:51:58 +0530 > > > > Kirti Wankhede wrote: > > > > > > > >> On 9/20/2016 3:06 AM, Alex Williamson wrote: > > > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > > > >>> Kirti Wankhede wrote: > > > >>> > > > Hi libvirt experts, > > > > > > > > > 'create': > > > Write-only file. Mandatory. > > > Accepts string to create mediated device. > > > > > > 'name': > > > Read-Only file. Mandatory. > > > Returns string, the name of that type id. > > > > > > 'fb_length': > > > Read-only file. Mandatory. > > > Returns {K,M,G}, size of framebuffer. > > > >>> > > > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > > > >>> just one user of mediated devices. > > > >>> > > > >> > > > >> As per our discussion in BOF, we would define class and its attributes. > > > >> As I mentioned above these are attributes of "display" class. > > > > > > > > Just as I didn't know here, how does libvirt know the class of a given > > > > type id? > > > > > > > > > > Yes, we need some way to identify the class as Daniel mentioned in his > > > suggestion. Add a file 'class' in mdev_supported_types that would return > > > the class. > > > > > > > > > > 'display' class or 'gpu' class? display represents only part of GPU > > functionalities. I assume you want to provide an unique class ID > > for each type, instead of allowing multiple classes each identifying > > a subset of controllable GPU resources (e.g. 'display', 'render', etc.)... > > Not sure where you're going with this, we're not creating a programming > interface for the device, that exists through the vfio API. I believe > the intention here is simply a user level classification for the > purpose of being able to interpret the attributes provided in a logical > way. I'm not against that purpose. My main intention is that 'display' is not an accurate classification here. If we allow classification in 'display' level, it implies more finer-grained classifications may be further defined upon which vendor specific GPU resources is allowed to be configured, which might be an overkill. Here we just need a general classification like 'gpu' to cover all possible vendor-agnostic attributes beyond display. > > > for unique class ID, I once considered whether it could be directly > > inherited from class of parent device, since typically a vendor driver > > creates mdev devices using the same type as physical device. But later > > I realized one potential value of keep a class field for mdev types, > > e.g. if we want to extend mdev to FPGA which could be programmed > > as different classes of functionalities. :-) > > And how do we generically determine the class of a parent device? We > cannot assume the parent device is PCI. Thanks, > Yes, this is also a good point. Thanks Kevin -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> From: Alex Williamson [mailto:alex.william...@redhat.com] > Sent: Wednesday, September 21, 2016 12:37 PM > > On Wed, 21 Sep 2016 03:56:21 + > "Tian, Kevin"wrote: > > > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > > Sent: Tuesday, September 20, 2016 10:22 PM > > > >> > > > >> 'max_instance': > > > >> Read-Only file. Mandatory. > > > >> Returns integer. Returns maximum mdev device could be created > > > >> at the moment when this file is read. This count would be updated by > > > >> vendor driver. Before creating mdev device of this type, check if > > > >> max_instance is > 0. > > > > > > > > Didn't we discuss this being being something like "available_instances" > > > > with a "devices" sub-directory linking current devices of that type? > > > > > > > > > > I'm ok with 'available_instances' as name of this file. > > > Yes, mdev device directory will have links to devices of that type. I > > > think that is part of mdev core module discussion. I'm trying to list > > > sysfs files for libvirt interface here to focus more on libvirt > > > interface. Hence didn't add that. I'll make sure to add all such details > > > in future. > > > > > > > > > > Definitely you need provide those details since they also contain > > libvirt interface, e.g. 'destroy' which we discussed should be in > > mdev directory. > > $ find /sys/devices -name remove | wc -l > 24 > $ find /sys/devices -name destroy | wc -l > 0 > > 'remove' is the sysfs deconstructor we should use. make sense. > > > Another example is class-specific attributes such > > as 'resolution', etc. which you listed under 'display class'. All those > > attributes should be moved to mdev directory. Only class ID is > > required under each type. > > Can you elaborate on what you're proposing here? If we don't have > attributes like 'resolution' under the type-id then we can't describe > to the user the features of the mdev before we create it. I don't > think anybody wants a 'create it and find out' type of interface. > Thanks, > I think such way would be racy. What about two clients creating mdev devices simultaneously? How to guarantee their configuration of class specific attributes not mutual-impacted since before creation any such configuration would be global under that type? My feeling is that we'd better keep create simple - just write a UUID to "type-id/create". Any class-specific attribute, if we really want to support, should be able to be individually configured with required resource allocated incrementally on top of basic resources allocated at create stage. Then libvirt can set those attributes between create and open a mdev device. If this direction is accepted, then naturally such attributes should be put under mdev directory. Possibly we don't need a class description under type-id. libvirt just checks directly whether any known class represented in each mdev directory (vendor driver will expose it on demand), and then operate attributes under that class. Thanks Kevin -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Thu, 22 Sep 2016 00:04:14 +0530 Kirti Wankhedewrote: > On 9/20/2016 10:20 PM, Alex Williamson wrote: > > On Tue, 20 Sep 2016 21:53:16 +0530 > > Kirti Wankhede wrote: > > > >> On 9/20/2016 8:13 PM, Alex Williamson wrote: > >>> On Tue, 20 Sep 2016 19:51:58 +0530 > >>> Kirti Wankhede wrote: > >>> > On 9/20/2016 3:06 AM, Alex Williamson wrote: > > On Tue, 20 Sep 2016 02:05:52 +0530 > > Kirti Wankhede wrote: > > > >> Hi libvirt experts, > >> > >> > >> 'create': > >> Write-only file. Mandatory. > >> Accepts string to create mediated device. > >> > >> 'name': > >> Read-Only file. Mandatory. > >> Returns string, the name of that type id. > >> > >> 'fb_length': > >> Read-only file. Mandatory. > >> Returns {K,M,G}, size of framebuffer. > > > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > > just one user of mediated devices. > > > > As per our discussion in BOF, we would define class and its attributes. > As I mentioned above these are attributes of "display" class. > >>> > >>> Just as I didn't know here, how does libvirt know the class of a given > >>> type id? > >>> > >> > >> Yes, we need some way to identify the class as Daniel mentioned in his > >> suggestion. Add a file 'class' in mdev_supported_types that would return > >> the class. > >> > >> > > Can all parameters be changed dynamically? > > Parameters here would be extra parameters which are not listed above in > mandatory list. If such parameters are required to set, these parameters > should be set per mdev device before VM is booted. > > > Do parameter changes apply to existing devices or only future devices? > > > > No, it should be applied at the time when mdev device is created or > after mdev device is created but before VM is booted. It will not be > applicable to existing devices. > > > This is a poorly specified > > interface. I'd prefer this be done via module options on the vendor > > driver. > > > > Module option is not suitable here, in that case such parameters would > be applied to all mdev device created for vendor driver and that is not > what we want to. > >>> > >>> Then use attributes on the mdev device itself, this is not an > >>> acceptable interface. > >>> > >> > >> With above example, vGPU device XML would look like: > >> > >> > >> my-vgpu > >> pci__86_00_0 > >> > >> > >>1 > >>'frame_rate_limiter=0' > >> > >> > >> > >> 'type id' is mandatory. > > > > I was under the impression that the vendor supplied "name" was the one > > unique identifier. We're certainly not going to create a registrar to > > hand out type ids to each vendor, so what makes type id unique? > > Type id is unique, 'name' is the name (string) of device that would be > displayed in guest OS for that type of mdev device. > >>> > >>> We were told at the BoF that name was the unique string which would be > >>> consistent and you again indicate below that "GRID-M60-0B" has a fixed > >>> set of attributes, so why do we care what type id is associated with > >>> that name? > >>> > > I have > > a hard time believing that a given vendor can even allocate unique type > > ids for their own devices. Unique type id across vendors is not > > practical. So which attribute are we actually using to identify which > > type of mdev device we need and why would we ever specify additional > > attributes like fb_length? Doesn't the vendor guarantee that "GRID > > M60-0B" has a fixed setup of those attributes? > > > > Specifying attributes here is not our requirement. Yes we have fixed set > of attributes for "GRID-M60-0B" and on. > We are defining the attributed here for "display" class for all other > vendor of gpu can use. > > > >> 'group' is optional. It should be a unique number in the system among > >> all the groups created for mdev devices. Its usage is: > >> - not needed if single vGPU device is being assigned to a domain. > >> - only need to be set if multiple vGPUs need to be assigned to a > >> domain and vendor driver have 'requires_group' file in type id > >> directory. > >> - if type id directory include 'requires_group' and user tries to > >> assign multiple vGPUs to a domain without having field in XML, > >> it will create single vGPU. > > > > We never finished our discussion of how this gets implemented or > > whether a group applies
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On 9/20/2016 10:20 PM, Alex Williamson wrote: > On Tue, 20 Sep 2016 21:53:16 +0530 > Kirti Wankhedewrote: > >> On 9/20/2016 8:13 PM, Alex Williamson wrote: >>> On Tue, 20 Sep 2016 19:51:58 +0530 >>> Kirti Wankhede wrote: >>> On 9/20/2016 3:06 AM, Alex Williamson wrote: > On Tue, 20 Sep 2016 02:05:52 +0530 > Kirti Wankhede wrote: > >> Hi libvirt experts, >> >> >> 'create': >> Write-only file. Mandatory. >> Accepts string to create mediated device. >> >> 'name': >> Read-Only file. Mandatory. >> Returns string, the name of that type id. >> >> 'fb_length': >> Read-only file. Mandatory. >> Returns {K,M,G}, size of framebuffer. > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > just one user of mediated devices. > As per our discussion in BOF, we would define class and its attributes. As I mentioned above these are attributes of "display" class. >>> >>> Just as I didn't know here, how does libvirt know the class of a given >>> type id? >>> >> >> Yes, we need some way to identify the class as Daniel mentioned in his >> suggestion. Add a file 'class' in mdev_supported_types that would return >> the class. >> >> > Can all parameters be changed dynamically? Parameters here would be extra parameters which are not listed above in mandatory list. If such parameters are required to set, these parameters should be set per mdev device before VM is booted. > Do parameter changes apply to existing devices or only future devices? > No, it should be applied at the time when mdev device is created or after mdev device is created but before VM is booted. It will not be applicable to existing devices. > This is a poorly specified > interface. I'd prefer this be done via module options on the vendor > driver. > Module option is not suitable here, in that case such parameters would be applied to all mdev device created for vendor driver and that is not what we want to. >>> >>> Then use attributes on the mdev device itself, this is not an >>> acceptable interface. >>> >> >> With above example, vGPU device XML would look like: >> >> >> my-vgpu >> pci__86_00_0 >> >> >>1 >>'frame_rate_limiter=0' >> >> >> >> 'type id' is mandatory. > > I was under the impression that the vendor supplied "name" was the one > unique identifier. We're certainly not going to create a registrar to > hand out type ids to each vendor, so what makes type id unique? Type id is unique, 'name' is the name (string) of device that would be displayed in guest OS for that type of mdev device. >>> >>> We were told at the BoF that name was the unique string which would be >>> consistent and you again indicate below that "GRID-M60-0B" has a fixed >>> set of attributes, so why do we care what type id is associated with >>> that name? >>> > I have > a hard time believing that a given vendor can even allocate unique type > ids for their own devices. Unique type id across vendors is not > practical. So which attribute are we actually using to identify which > type of mdev device we need and why would we ever specify additional > attributes like fb_length? Doesn't the vendor guarantee that "GRID > M60-0B" has a fixed setup of those attributes? > Specifying attributes here is not our requirement. Yes we have fixed set of attributes for "GRID-M60-0B" and on. We are defining the attributed here for "display" class for all other vendor of gpu can use. >> 'group' is optional. It should be a unique number in the system among >> all the groups created for mdev devices. Its usage is: >> - not needed if single vGPU device is being assigned to a domain. >> - only need to be set if multiple vGPUs need to be assigned to a >> domain and vendor driver have 'requires_group' file in type id directory. >> - if type id directory include 'requires_group' and user tries to >> assign multiple vGPUs to a domain without having field in XML, >> it will create single vGPU. > > We never finished our discussion of how this gets implemented or > whether a group applies only to devices from the same vendor, same > device, how heterogeneous groups are handled, etc. > We agreed on above points that I mentioned here and we wish to know what libvirt folks think, right? Since there were no inputs on that thread I thought I should summarize what had been discussed and get libvirt experts thoughts on this. >>>
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Wed, 21 Sep 2016 04:10:53 + "Tian, Kevin"wrote: > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > Sent: Wednesday, September 21, 2016 12:23 AM > > > > > > On 9/20/2016 8:13 PM, Alex Williamson wrote: > > > On Tue, 20 Sep 2016 19:51:58 +0530 > > > Kirti Wankhede wrote: > > > > > >> On 9/20/2016 3:06 AM, Alex Williamson wrote: > > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > > >>> Kirti Wankhede wrote: > > >>> > > Hi libvirt experts, > > > > > > 'create': > > Write-only file. Mandatory. > > Accepts string to create mediated device. > > > > 'name': > > Read-Only file. Mandatory. > > Returns string, the name of that type id. > > > > 'fb_length': > > Read-only file. Mandatory. > > Returns {K,M,G}, size of framebuffer. > > >>> > > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > > >>> just one user of mediated devices. > > >>> > > >> > > >> As per our discussion in BOF, we would define class and its attributes. > > >> As I mentioned above these are attributes of "display" class. > > > > > > Just as I didn't know here, how does libvirt know the class of a given > > > type id? > > > > > > > Yes, we need some way to identify the class as Daniel mentioned in his > > suggestion. Add a file 'class' in mdev_supported_types that would return > > the class. > > > > > > 'display' class or 'gpu' class? display represents only part of GPU > functionalities. I assume you want to provide an unique class ID > for each type, instead of allowing multiple classes each identifying > a subset of controllable GPU resources (e.g. 'display', 'render', etc.)... Not sure where you're going with this, we're not creating a programming interface for the device, that exists through the vfio API. I believe the intention here is simply a user level classification for the purpose of being able to interpret the attributes provided in a logical way. > for unique class ID, I once considered whether it could be directly > inherited from class of parent device, since typically a vendor driver > creates mdev devices using the same type as physical device. But later > I realized one potential value of keep a class field for mdev types, > e.g. if we want to extend mdev to FPGA which could be programmed > as different classes of functionalities. :-) And how do we generically determine the class of a parent device? We cannot assume the parent device is PCI. Thanks, Alex -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Wed, 21 Sep 2016 03:56:21 + "Tian, Kevin"wrote: > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > Sent: Tuesday, September 20, 2016 10:22 PM > > >> > > >> 'max_instance': > > >> Read-Only file. Mandatory. > > >> Returns integer. Returns maximum mdev device could be created > > >> at the moment when this file is read. This count would be updated by > > >> vendor driver. Before creating mdev device of this type, check if > > >> max_instance is > 0. > > > > > > Didn't we discuss this being being something like "available_instances" > > > with a "devices" sub-directory linking current devices of that type? > > > > > > > I'm ok with 'available_instances' as name of this file. > > Yes, mdev device directory will have links to devices of that type. I > > think that is part of mdev core module discussion. I'm trying to list > > sysfs files for libvirt interface here to focus more on libvirt > > interface. Hence didn't add that. I'll make sure to add all such details > > in future. > > > > > > Definitely you need provide those details since they also contain > libvirt interface, e.g. 'destroy' which we discussed should be in > mdev directory. $ find /sys/devices -name remove | wc -l 24 $ find /sys/devices -name destroy | wc -l 0 'remove' is the sysfs deconstructor we should use. > Another example is class-specific attributes such > as 'resolution', etc. which you listed under 'display class'. All those > attributes should be moved to mdev directory. Only class ID is > required under each type. Can you elaborate on what you're proposing here? If we don't have attributes like 'resolution' under the type-id then we can't describe to the user the features of the mdev before we create it. I don't think anybody wants a 'create it and find out' type of interface. Thanks, Alex -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > Sent: Wednesday, September 21, 2016 12:23 AM > > > On 9/20/2016 8:13 PM, Alex Williamson wrote: > > On Tue, 20 Sep 2016 19:51:58 +0530 > > Kirti Wankhedewrote: > > > >> On 9/20/2016 3:06 AM, Alex Williamson wrote: > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > >>> Kirti Wankhede wrote: > >>> > Hi libvirt experts, > > > 'create': > Write-only file. Mandatory. > Accepts string to create mediated device. > > 'name': > Read-Only file. Mandatory. > Returns string, the name of that type id. > > 'fb_length': > Read-only file. Mandatory. > Returns {K,M,G}, size of framebuffer. > >>> > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > >>> just one user of mediated devices. > >>> > >> > >> As per our discussion in BOF, we would define class and its attributes. > >> As I mentioned above these are attributes of "display" class. > > > > Just as I didn't know here, how does libvirt know the class of a given > > type id? > > > > Yes, we need some way to identify the class as Daniel mentioned in his > suggestion. Add a file 'class' in mdev_supported_types that would return > the class. > > 'display' class or 'gpu' class? display represents only part of GPU functionalities. I assume you want to provide an unique class ID for each type, instead of allowing multiple classes each identifying a subset of controllable GPU resources (e.g. 'display', 'render', etc.)... for unique class ID, I once considered whether it could be directly inherited from class of parent device, since typically a vendor driver creates mdev devices using the same type as physical device. But later I realized one potential value of keep a class field for mdev types, e.g. if we want to extend mdev to FPGA which could be programmed as different classes of functionalities. :-) Thanks Kevin -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > Sent: Tuesday, September 20, 2016 10:22 PM > >> > >> 'max_instance': > >> Read-Only file. Mandatory. > >> Returns integer. Returns maximum mdev device could be created > >> at the moment when this file is read. This count would be updated by > >> vendor driver. Before creating mdev device of this type, check if > >> max_instance is > 0. > > > > Didn't we discuss this being being something like "available_instances" > > with a "devices" sub-directory linking current devices of that type? > > > > I'm ok with 'available_instances' as name of this file. > Yes, mdev device directory will have links to devices of that type. I > think that is part of mdev core module discussion. I'm trying to list > sysfs files for libvirt interface here to focus more on libvirt > interface. Hence didn't add that. I'll make sure to add all such details > in future. > > Definitely you need provide those details since they also contain libvirt interface, e.g. 'destroy' which we discussed should be in mdev directory. Another example is class-specific attributes such as 'resolution', etc. which you listed under 'display class'. All those attributes should be moved to mdev directory. Only class ID is required under each type. Thanks Kevin -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Tue, 20 Sep 2016 21:53:16 +0530 Kirti Wankhedewrote: > On 9/20/2016 8:13 PM, Alex Williamson wrote: > > On Tue, 20 Sep 2016 19:51:58 +0530 > > Kirti Wankhede wrote: > > > >> On 9/20/2016 3:06 AM, Alex Williamson wrote: > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > >>> Kirti Wankhede wrote: > >>> > Hi libvirt experts, > > > 'create': > Write-only file. Mandatory. > Accepts string to create mediated device. > > 'name': > Read-Only file. Mandatory. > Returns string, the name of that type id. > > 'fb_length': > Read-only file. Mandatory. > Returns {K,M,G}, size of framebuffer. > >>> > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > >>> just one user of mediated devices. > >>> > >> > >> As per our discussion in BOF, we would define class and its attributes. > >> As I mentioned above these are attributes of "display" class. > > > > Just as I didn't know here, how does libvirt know the class of a given > > type id? > > > > Yes, we need some way to identify the class as Daniel mentioned in his > suggestion. Add a file 'class' in mdev_supported_types that would return > the class. > > > >>> Can all parameters be changed dynamically? > >> > >> Parameters here would be extra parameters which are not listed above in > >> mandatory list. If such parameters are required to set, these parameters > >> should be set per mdev device before VM is booted. > >> > >>> Do parameter changes apply to existing devices or only future devices? > >>> > >> > >> No, it should be applied at the time when mdev device is created or > >> after mdev device is created but before VM is booted. It will not be > >> applicable to existing devices. > >> > >>> This is a poorly specified > >>> interface. I'd prefer this be done via module options on the vendor > >>> driver. > >>> > >> > >> Module option is not suitable here, in that case such parameters would > >> be applied to all mdev device created for vendor driver and that is not > >> what we want to. > > > > Then use attributes on the mdev device itself, this is not an > > acceptable interface. > > > > With above example, vGPU device XML would look like: > > > my-vgpu > pci__86_00_0 > > > 1 > 'frame_rate_limiter=0' > > > > 'type id' is mandatory. > >>> > >>> I was under the impression that the vendor supplied "name" was the one > >>> unique identifier. We're certainly not going to create a registrar to > >>> hand out type ids to each vendor, so what makes type id unique? > >> > >> Type id is unique, 'name' is the name (string) of device that would be > >> displayed in guest OS for that type of mdev device. > > > > We were told at the BoF that name was the unique string which would be > > consistent and you again indicate below that "GRID-M60-0B" has a fixed > > set of attributes, so why do we care what type id is associated with > > that name? > > > >>> I have > >>> a hard time believing that a given vendor can even allocate unique type > >>> ids for their own devices. Unique type id across vendors is not > >>> practical. So which attribute are we actually using to identify which > >>> type of mdev device we need and why would we ever specify additional > >>> attributes like fb_length? Doesn't the vendor guarantee that "GRID > >>> M60-0B" has a fixed setup of those attributes? > >>> > >> > >> Specifying attributes here is not our requirement. Yes we have fixed set > >> of attributes for "GRID-M60-0B" and on. > >> We are defining the attributed here for "display" class for all other > >> vendor of gpu can use. > >> > >> > 'group' is optional. It should be a unique number in the system among > all the groups created for mdev devices. Its usage is: > - not needed if single vGPU device is being assigned to a domain. > - only need to be set if multiple vGPUs need to be assigned to a > domain and vendor driver have 'requires_group' file in type id directory. > - if type id directory include 'requires_group' and user tries to > assign multiple vGPUs to a domain without having field in XML, > it will create single vGPU. > >>> > >>> We never finished our discussion of how this gets implemented or > >>> whether a group applies only to devices from the same vendor, same > >>> device, how heterogeneous groups are handled, etc. > >>> > >> > >> We agreed on above points that I mentioned here and we wish to know what > >> libvirt folks think, right? > >> Since there were no inputs on that thread I thought I should summarize > >> what had been discussed and get libvirt experts thoughts on this. > > > > No, that was just an idea that was never full
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On 9/20/2016 8:13 PM, Alex Williamson wrote: > On Tue, 20 Sep 2016 19:51:58 +0530 > Kirti Wankhedewrote: > >> On 9/20/2016 3:06 AM, Alex Williamson wrote: >>> On Tue, 20 Sep 2016 02:05:52 +0530 >>> Kirti Wankhede wrote: >>> Hi libvirt experts, 'create': Write-only file. Mandatory. Accepts string to create mediated device. 'name': Read-Only file. Mandatory. Returns string, the name of that type id. 'fb_length': Read-only file. Mandatory. Returns {K,M,G}, size of framebuffer. >>> >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are >>> just one user of mediated devices. >>> >> >> As per our discussion in BOF, we would define class and its attributes. >> As I mentioned above these are attributes of "display" class. > > Just as I didn't know here, how does libvirt know the class of a given > type id? > Yes, we need some way to identify the class as Daniel mentioned in his suggestion. Add a file 'class' in mdev_supported_types that would return the class. >>> Can all parameters be changed dynamically? >> >> Parameters here would be extra parameters which are not listed above in >> mandatory list. If such parameters are required to set, these parameters >> should be set per mdev device before VM is booted. >> >>> Do parameter changes apply to existing devices or only future devices? >> >> No, it should be applied at the time when mdev device is created or >> after mdev device is created but before VM is booted. It will not be >> applicable to existing devices. >> >>> This is a poorly specified >>> interface. I'd prefer this be done via module options on the vendor >>> driver. >>> >> >> Module option is not suitable here, in that case such parameters would >> be applied to all mdev device created for vendor driver and that is not >> what we want to. > > Then use attributes on the mdev device itself, this is not an > acceptable interface. > With above example, vGPU device XML would look like: my-vgpu pci__86_00_0 1 'frame_rate_limiter=0' 'type id' is mandatory. >>> >>> I was under the impression that the vendor supplied "name" was the one >>> unique identifier. We're certainly not going to create a registrar to >>> hand out type ids to each vendor, so what makes type id unique? >> >> Type id is unique, 'name' is the name (string) of device that would be >> displayed in guest OS for that type of mdev device. > > We were told at the BoF that name was the unique string which would be > consistent and you again indicate below that "GRID-M60-0B" has a fixed > set of attributes, so why do we care what type id is associated with > that name? > >>> I have >>> a hard time believing that a given vendor can even allocate unique type >>> ids for their own devices. Unique type id across vendors is not >>> practical. So which attribute are we actually using to identify which >>> type of mdev device we need and why would we ever specify additional >>> attributes like fb_length? Doesn't the vendor guarantee that "GRID >>> M60-0B" has a fixed setup of those attributes? >>> >> >> Specifying attributes here is not our requirement. Yes we have fixed set >> of attributes for "GRID-M60-0B" and on. >> We are defining the attributed here for "display" class for all other >> vendor of gpu can use. >> >> 'group' is optional. It should be a unique number in the system among all the groups created for mdev devices. Its usage is: - not needed if single vGPU device is being assigned to a domain. - only need to be set if multiple vGPUs need to be assigned to a domain and vendor driver have 'requires_group' file in type id directory. - if type id directory include 'requires_group' and user tries to assign multiple vGPUs to a domain without having field in XML, it will create single vGPU. >>> >>> We never finished our discussion of how this gets implemented or >>> whether a group applies only to devices from the same vendor, same >>> device, how heterogeneous groups are handled, etc. >>> >> >> We agreed on above points that I mentioned here and we wish to know what >> libvirt folks think, right? >> Since there were no inputs on that thread I thought I should summarize >> what had been discussed and get libvirt experts thoughts on this. > > No, that was just an idea that was never full defined. We had that > idea, the IOMMU group idea, the invalid container idea... This is > still an open question in the design. > We do want to close down on the design as soon as possible. I had tried to address open questions in earlier discussion also. Let me address your concerns you have: > whether a group applies only to devices from the same vendor, Yes. > same device,
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Tue, 20 Sep 2016 20:05:01 +0530 Kirti Wankhedewrote: > On 9/20/2016 3:55 AM, Alex Williamson wrote: > > On Mon, 19 Sep 2016 23:50:56 +0200 > > Paolo Bonzini wrote: > > > >> On 19/09/2016 23:36, Alex Williamson wrote: > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > >>> Kirti Wankhede wrote: > 'fb_length': > Read-only file. Mandatory. > Returns {K,M,G}, size of framebuffer. > >>> > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > >>> just one user of mediated devices. [...] > >>> > 'params' > Write-Only file. Optional. > String input. Libvirt would pass the string given in XML file to > this file and then create mdev device. Set empty string to clear params. > For example, set parameter 'frame_rate_limiter=0' to disable frame rate > limiter for performance benchmarking, then create device of type 11. The > device created would have that parameter set by vendor driver. > >>> > >>> Might as well just call it "magic", there's absolutely no ability to > >>> introspect what parameters are allowed or even valid here. Can all > >>> parameters be changed dynamically? Do parameter changes apply to > >>> existing devices or only future devices? This is a poorly specified > >>> interface. I'd prefer this be done via module options on the vendor > >>> driver. > >> > >> Or additional files in the mdev's sysfs directory? > > > > Sure, the vendor driver could certainly support that, it'd be vendor > > specific of course. > > > > Right, it would be vendor specific. But if this is not set through mdev > device XML, user have to create mdev device with XML first and then > manually set such parameter. Also these parameters would not retain if > mdev device is destroyed and created again. That's why if libvirt sets > these parameters by reading it from XML file, it would be simple for user. > > If we move 'params' file to mdev device, could libvirt do the following > to create mdev device: > > * Read /sys/../\:86\:00.0/11/max_instances. If it is greater than 0, > then only proceed else fail. > > * Autogenerate UUID > * Create device: > > echo "$UUID" > /sys/../\:86\:00.0/11/create > > * Set extra params if 'params' field exist in device XML and 'params' > file exist mdev device directory: > > echo "frame_rate_limiter=0" > /sys/bus/mdev/devices/$UUID/params Why does libvirt even need to participate in setting these additional parameters? The example here is for benchmarking, which doesn't express to me a strong need to be automated through libvirt. For hostdev devices libvirt supports a managed='no' option that requires the user to setup the device in advance. Something similar sounds reasonable here too. This is also where I see that a vendor module option could be used, something like: modprobe nvidia enable-no-frl-mdev You say you don't want to apply this to all mdev devices, but maybe it just makes an additional type-id available "GRID-M60-0B-NOFRL". Then you've removed libvirt from the equation, the VM simply specifies this type-id that incorporates that feature. Obviously it gets complicated if there are multiple features to enable, but we've only seen one example so far of why this parameter is needed. Thanks, Alex -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Tue, 20 Sep 2016 19:51:58 +0530 Kirti Wankhedewrote: > On 9/20/2016 3:06 AM, Alex Williamson wrote: > > On Tue, 20 Sep 2016 02:05:52 +0530 > > Kirti Wankhede wrote: > > > >> Hi libvirt experts, > >> > >> Thanks for valuable input on v1 version of RFC. > >> > >> Quick brief, VFIO based mediated device framework provides a way to > >> virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > >> and IBM's channel IO. This framework reuses VFIO APIs for all the > >> functionalities for mediated devices which are currently being used for > >> pass through devices. This framework introduces a set of new sysfs files > >> for device creation and its life cycle management. > >> > >> Here is the summary of discussion on v1: > >> 1. Discover mediated device: > >> As part of physical device initialization process, vendor driver will > >> register their physical devices, which will be used to create virtual > >> device (mediated device, aka mdev) to the mediated framework. > >> > >> Vendor driver should specify mdev_supported_types in directory format. > >> This format is class based, for example, display class directory format > >> should be as below. We need to define such set for each class of devices > >> which would be supported by mediated device framework. > >> > >> --- mdev_destroy > > > > I thought we replaced mdev_destroy with a remove attribute for each > > mdev device. > > > >> --- mdev_supported_types > >> |-- 11 > >> | |-- create > >> | |-- name > >> | |-- fb_length > >> | |-- resolution > >> | |-- heads > >> | |-- max_instances > >> | |-- params > >> | |-- requires_group > >> |-- 12 > >> | |-- create > >> | |-- name > >> | |-- fb_length > >> | |-- resolution > >> | |-- heads > >> | |-- max_instances > >> | |-- params > >> | |-- requires_group > >> |-- 13 > >> |-- create > >> |-- name > >> |-- fb_length > >> |-- resolution > >> |-- heads > >> |-- max_instances > >> |-- params > >> |-- requires_group > >> > >> > >> In the above example directory '11' represents a type id of mdev device. > >> 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > >> 'requires_group' would be Read-Only files that vendor would provide to > >> describe about that type. > >> > >> 'create': > >> Write-only file. Mandatory. > >> Accepts string to create mediated device. > >> > >> 'name': > >> Read-Only file. Mandatory. > >> Returns string, the name of that type id. > >> > >> 'fb_length': > >> Read-only file. Mandatory. > >> Returns {K,M,G}, size of framebuffer. > > > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > > just one user of mediated devices. > > > > As per our discussion in BOF, we would define class and its attributes. > As I mentioned above these are attributes of "display" class. Just as I didn't know here, how does libvirt know the class of a given type id? > >> > >> 'resolution': > >> Read-Only file. Mandatory. > >> Returns 'hres x vres' format. Maximum supported resolution. > > > > Same. > > > >> > >> 'heads': > >> Read-Only file. Mandatory. > >> Returns integer. Number of maximum heads supported. > > > > Same. > > > >> > >> 'max_instance': > >> Read-Only file. Mandatory. > >> Returns integer. Returns maximum mdev device could be created > >> at the moment when this file is read. This count would be updated by > >> vendor driver. Before creating mdev device of this type, check if > >> max_instance is > 0. > > > > Didn't we discuss this being being something like "available_instances" > > with a "devices" sub-directory linking current devices of that type? > > > > I'm ok with 'available_instances' as name of this file. > Yes, mdev device directory will have links to devices of that type. I > think that is part of mdev core module discussion. I'm trying to list > sysfs files for libvirt interface here to focus more on libvirt > interface. Hence didn't add that. I'll make sure to add all such details > in future. > > > >> 'params' > >> Write-Only file. Optional. > >> String input. Libvirt would pass the string given in XML file to > >> this file and then create mdev device. Set empty string to clear params. > >> For example, set parameter 'frame_rate_limiter=0' to disable frame rate > >> limiter for performance benchmarking, then create device of type 11. The > >> device created would have that parameter set by vendor driver. > > > > Might as well just call it "magic", there's absolutely no ability to > > introspect what parameters are allowed or even valid here. > > Libvirt don't need to introspect these parameters. These are meant for > vendor driver. Let vendor driver interpret these parameters and take > action
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On 9/20/2016 3:55 AM, Alex Williamson wrote: > On Mon, 19 Sep 2016 23:50:56 +0200 > Paolo Bonziniwrote: > >> On 19/09/2016 23:36, Alex Williamson wrote: >>> On Tue, 20 Sep 2016 02:05:52 +0530 >>> Kirti Wankhede wrote: 'fb_length': Read-only file. Mandatory. Returns {K,M,G}, size of framebuffer. >>> >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are >>> just one user of mediated devices. [...] >>> 'params' Write-Only file. Optional. String input. Libvirt would pass the string given in XML file to this file and then create mdev device. Set empty string to clear params. For example, set parameter 'frame_rate_limiter=0' to disable frame rate limiter for performance benchmarking, then create device of type 11. The device created would have that parameter set by vendor driver. >>> >>> Might as well just call it "magic", there's absolutely no ability to >>> introspect what parameters are allowed or even valid here. Can all >>> parameters be changed dynamically? Do parameter changes apply to >>> existing devices or only future devices? This is a poorly specified >>> interface. I'd prefer this be done via module options on the vendor >>> driver. >> >> Or additional files in the mdev's sysfs directory? > > Sure, the vendor driver could certainly support that, it'd be vendor > specific of course. > Right, it would be vendor specific. But if this is not set through mdev device XML, user have to create mdev device with XML first and then manually set such parameter. Also these parameters would not retain if mdev device is destroyed and created again. That's why if libvirt sets these parameters by reading it from XML file, it would be simple for user. If we move 'params' file to mdev device, could libvirt do the following to create mdev device: * Read /sys/../\:86\:00.0/11/max_instances. If it is greater than 0, then only proceed else fail. * Autogenerate UUID * Create device: echo "$UUID" > /sys/../\:86\:00.0/11/create * Set extra params if 'params' field exist in device XML and 'params' file exist mdev device directory: echo "frame_rate_limiter=0" > /sys/bus/mdev/devices/$UUID/params Kirti -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On 9/20/2016 3:06 AM, Alex Williamson wrote: > On Tue, 20 Sep 2016 02:05:52 +0530 > Kirti Wankhedewrote: > >> Hi libvirt experts, >> >> Thanks for valuable input on v1 version of RFC. >> >> Quick brief, VFIO based mediated device framework provides a way to >> virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT >> and IBM's channel IO. This framework reuses VFIO APIs for all the >> functionalities for mediated devices which are currently being used for >> pass through devices. This framework introduces a set of new sysfs files >> for device creation and its life cycle management. >> >> Here is the summary of discussion on v1: >> 1. Discover mediated device: >> As part of physical device initialization process, vendor driver will >> register their physical devices, which will be used to create virtual >> device (mediated device, aka mdev) to the mediated framework. >> >> Vendor driver should specify mdev_supported_types in directory format. >> This format is class based, for example, display class directory format >> should be as below. We need to define such set for each class of devices >> which would be supported by mediated device framework. >> >> --- mdev_destroy > > I thought we replaced mdev_destroy with a remove attribute for each > mdev device. > >> --- mdev_supported_types >> |-- 11 >> | |-- create >> | |-- name >> | |-- fb_length >> | |-- resolution >> | |-- heads >> | |-- max_instances >> | |-- params >> | |-- requires_group >> |-- 12 >> | |-- create >> | |-- name >> | |-- fb_length >> | |-- resolution >> | |-- heads >> | |-- max_instances >> | |-- params >> | |-- requires_group >> |-- 13 >> |-- create >> |-- name >> |-- fb_length >> |-- resolution >> |-- heads >> |-- max_instances >> |-- params >> |-- requires_group >> >> >> In the above example directory '11' represents a type id of mdev device. >> 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and >> 'requires_group' would be Read-Only files that vendor would provide to >> describe about that type. >> >> 'create': >> Write-only file. Mandatory. >> Accepts string to create mediated device. >> >> 'name': >> Read-Only file. Mandatory. >> Returns string, the name of that type id. >> >> 'fb_length': >> Read-only file. Mandatory. >> Returns {K,M,G}, size of framebuffer. > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > just one user of mediated devices. > As per our discussion in BOF, we would define class and its attributes. As I mentioned above these are attributes of "display" class. >> >> 'resolution': >> Read-Only file. Mandatory. >> Returns 'hres x vres' format. Maximum supported resolution. > > Same. > >> >> 'heads': >> Read-Only file. Mandatory. >> Returns integer. Number of maximum heads supported. > > Same. > >> >> 'max_instance': >> Read-Only file. Mandatory. >> Returns integer. Returns maximum mdev device could be created >> at the moment when this file is read. This count would be updated by >> vendor driver. Before creating mdev device of this type, check if >> max_instance is > 0. > > Didn't we discuss this being being something like "available_instances" > with a "devices" sub-directory linking current devices of that type? > I'm ok with 'available_instances' as name of this file. Yes, mdev device directory will have links to devices of that type. I think that is part of mdev core module discussion. I'm trying to list sysfs files for libvirt interface here to focus more on libvirt interface. Hence didn't add that. I'll make sure to add all such details in future. >> 'params' >> Write-Only file. Optional. >> String input. Libvirt would pass the string given in XML file to >> this file and then create mdev device. Set empty string to clear params. >> For example, set parameter 'frame_rate_limiter=0' to disable frame rate >> limiter for performance benchmarking, then create device of type 11. The >> device created would have that parameter set by vendor driver. > > Might as well just call it "magic", there's absolutely no ability to > introspect what parameters are allowed or even valid here. Libvirt don't need to introspect these parameters. These are meant for vendor driver. Let vendor driver interpret these parameters and take action based on that. > Can all parameters be changed dynamically? Parameters here would be extra parameters which are not listed above in mandatory list. If such parameters are required to set, these parameters should be set per mdev device before VM is booted. > Do parameter changes apply to existing devices or only future devices? No, it should be applied at the time when mdev device is created or after mdev device is created but before VM is
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > Sent: Tuesday, September 20, 2016 4:36 AM > > > Hi libvirt experts, > > Thanks for valuable input on v1 version of RFC. > > Quick brief, VFIO based mediated device framework provides a way to > virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > and IBM's channel IO. This framework reuses VFIO APIs for all the > functionalities for mediated devices which are currently being used for > pass through devices. This framework introduces a set of new sysfs files > for device creation and its life cycle management. > > Here is the summary of discussion on v1: > 1. Discover mediated device: > As part of physical device initialization process, vendor driver will > register their physical devices, which will be used to create virtual > device (mediated device, aka mdev) to the mediated framework. > > Vendor driver should specify mdev_supported_types in directory format. > This format is class based, for example, display class directory format > should be as below. We need to define such set for each class of devices > which would be supported by mediated device framework. > > --- mdev_destroy > --- mdev_supported_types > |-- 11 > | |-- create > | |-- name > | |-- fb_length > | |-- resolution > | |-- heads > | |-- max_instances > | |-- params > | |-- requires_group > |-- 12 > | |-- create > | |-- name > | |-- fb_length > | |-- resolution > | |-- heads > | |-- max_instances > | |-- params > | |-- requires_group > |-- 13 > |-- create > |-- name > |-- fb_length > |-- resolution > |-- heads > |-- max_instances > |-- params > |-- requires_group > > Per previous discussion, I'd think below layout is more general and clearer: --- mdev_supported_types |-- ID (or name) | |-- create | |-- available_instances | |-- requires_group |-- ID (or name) | |-- create | |-- available_instances | |-- requires_group ... Then under each mdev directory: |-- type (link to type directory) |-- destroy |-- online |-- reset |-- Destroy Group association may be done under mdev directory too, per Alex's earlier suggestion. Optional vendor-agnostic attributes can be extended in the future, e.g.: |-- priority |-- weight Vendor-specific or class-specific parameters (black string or explicit attributes) could be also extended per-mdev directory, if necessary to support. Thanks Kevin -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
> From: Alex Williamson [mailto:alex.william...@redhat.com] > Sent: Tuesday, September 20, 2016 5:36 AM > > > In the above example directory '11' represents a type id of mdev device. > > 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > > 'requires_group' would be Read-Only files that vendor would provide to > > describe about that type. > > > > 'create': > > Write-only file. Mandatory. > > Accepts string to create mediated device. > > > > 'name': > > Read-Only file. Mandatory. > > Returns string, the name of that type id. > > > > 'fb_length': > > Read-only file. Mandatory. > > Returns {K,M,G}, size of framebuffer. > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > just one user of mediated devices. Agree. at least it doesn't make sense for s390 passthrough usage. > > > > > 'max_instance': > > Read-Only file. Mandatory. > > Returns integer. Returns maximum mdev device could be created > > at the moment when this file is read. This count would be updated by > > vendor driver. Before creating mdev device of this type, check if > > max_instance is > 0. > > Didn't we discuss this being being something like "available_instances" > with a "devices" sub-directory linking current devices of that type? Right. "available_instances" which is dynamically changed under two scenarios: a) create a new mdev under one type decrements "available_instances" of this type by 1; b) create a new mdev under one type decrements "available_instances" under other types by type-specific number, if vendor supports creating multiple types simultaneously like in Intel solution; > > > 'params' > > Write-Only file. Optional. > > String input. Libvirt would pass the string given in XML file to > > this file and then create mdev device. Set empty string to clear params. > > For example, set parameter 'frame_rate_limiter=0' to disable frame rate > > limiter for performance benchmarking, then create device of type 11. The > > device created would have that parameter set by vendor driver. > > Might as well just call it "magic", there's absolutely no ability to > introspect what parameters are allowed or even valid here. Can all > parameters be changed dynamically? Do parameter changes apply to > existing devices or only future devices? This is a poorly specified > interface. I'd prefer this be done via module options on the vendor > driver. I guess it might be mdev specific for nvidia's case, then module option is not suitable. Then this magic string should be placed under mdev directory instead of here, otherwise there'll be race condition. > > > > > 2. Create/destroy mediated device > > > > With above example, vGPU device XML would look like: > > > > > > my-vgpu > > pci__86_00_0 > > > > > >1 > >'frame_rate_limiter=0' > > > > > > > > 'type id' is mandatory. > > I was under the impression that the vendor supplied "name" was the one > unique identifier. We're certainly not going to create a registrar to > hand out type ids to each vendor, so what makes type id unique? I have > a hard time believing that a given vendor can even allocate unique type > ids for their own devices. Unique type id across vendors is not > practical. So which attribute are we actually using to identify which > type of mdev device we need and why would we ever specify additional > attributes like fb_length? Doesn't the vendor guarantee that "GRID > M60-0B" has a fixed setup of those attributes? I'm a bit confused about type ID and type name here. Just keeping one should be enough (ether a number or descriptive name), which is per-device scope, not necessarily to cross vendors. As long as the vendor defines the ID or name unique cross its own generations, looks should be sufficient (I guess cross-vendor migration is not an interest here) > > > 'params' is optional field. User should set this field if extra > > parameters need to be set for a particular vGPU device. Libvirt don't > > need to parse these params. These are meant for vendor driver. > > ie. magic black hole. nope. A curious question here, though it's not a mandatory requirement for Intel's GPU virtualization solution now, but allow such extension might be useful in the long term. Has libvirt ever supported similar black string in other usages? I can think it's a problem for public cloud usage, where any configuration on the managed device should be controlled by orchestration layer (e.g. openstack) automatically. It's not appropriate to allow a cloud tenant specifying such parameter, so in that manner any configuration parameter should be explicitly understood from upper level stack to libvirt then vendor specific parameters would be hard to handle. On the other hand, if we consider enterprise virtualization scenario which is more proprietary, administrator may add such vendor specific parameters as a black string through libvirt, as long
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Mon, 19 Sep 2016 23:50:56 +0200 Paolo Bonziniwrote: > On 19/09/2016 23:36, Alex Williamson wrote: > > On Tue, 20 Sep 2016 02:05:52 +0530 > > Kirti Wankhede wrote: > >> 'fb_length': > >> Read-only file. Mandatory. > >> Returns {K,M,G}, size of framebuffer. > > > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > > just one user of mediated devices. [...] > > > >> 'params' > >> Write-Only file. Optional. > >> String input. Libvirt would pass the string given in XML file to > >> this file and then create mdev device. Set empty string to clear params. > >> For example, set parameter 'frame_rate_limiter=0' to disable frame rate > >> limiter for performance benchmarking, then create device of type 11. The > >> device created would have that parameter set by vendor driver. > > > > Might as well just call it "magic", there's absolutely no ability to > > introspect what parameters are allowed or even valid here. Can all > > parameters be changed dynamically? Do parameter changes apply to > > existing devices or only future devices? This is a poorly specified > > interface. I'd prefer this be done via module options on the vendor > > driver. > > Or additional files in the mdev's sysfs directory? Sure, the vendor driver could certainly support that, it'd be vendor specific of course. > >> The parent device would look like: > >> > >> > >> pci__86_00_0 > >> > >>0 > >>134 > >>0 > >>0 > > > > This is the parent device address? > > (Since this is taken from an email of mine, this is libvirt's XML > interpretation of the udev database and the contents of sysfs. Try > "virsh nodedev-dumpxml pci__00_02_0" for an example that is > unrelated to mdev). > > >> 2. Create/destroy mediated device > >> > >> With above example, vGPU device XML would look like: > >> > >> > >> my-vgpu > >> pci__86_00_0 > >> > >> > >>1 > >>'frame_rate_limiter=0' > >> > >> > >> > >> 'type id' is mandatory. > > > > I was under the impression that the vendor supplied "name" was the one > > unique identifier. We're certainly not going to create a registrar to > > hand out type ids to each vendor, so what makes type id unique? > > Why does it have to be unique? It's just parroting what libvirt said in > the block of the parent's nodedev-dumpxml. > > (Though I guess going by name could be useful as well. Perhaps both > could be allowed). Do we not want to be able to migrate a VM to another host and have the XML mean something? Off-line or on-line, mdev device may support migration. Thanks, Alex -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On 19/09/2016 23:36, Alex Williamson wrote: > On Tue, 20 Sep 2016 02:05:52 +0530 > Kirti Wankhedewrote: >> 'fb_length': >> Read-only file. Mandatory. >> Returns {K,M,G}, size of framebuffer. > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > just one user of mediated devices. [...] > >> 'params' >> Write-Only file. Optional. >> String input. Libvirt would pass the string given in XML file to >> this file and then create mdev device. Set empty string to clear params. >> For example, set parameter 'frame_rate_limiter=0' to disable frame rate >> limiter for performance benchmarking, then create device of type 11. The >> device created would have that parameter set by vendor driver. > > Might as well just call it "magic", there's absolutely no ability to > introspect what parameters are allowed or even valid here. Can all > parameters be changed dynamically? Do parameter changes apply to > existing devices or only future devices? This is a poorly specified > interface. I'd prefer this be done via module options on the vendor > driver. Or additional files in the mdev's sysfs directory? >> The parent device would look like: >> >> >> pci__86_00_0 >> >>0 >>134 >>0 >>0 > > This is the parent device address? (Since this is taken from an email of mine, this is libvirt's XML interpretation of the udev database and the contents of sysfs. Try "virsh nodedev-dumpxml pci__00_02_0" for an example that is unrelated to mdev). >> 2. Create/destroy mediated device >> >> With above example, vGPU device XML would look like: >> >> >> my-vgpu >> pci__86_00_0 >> >> >>1 >>'frame_rate_limiter=0' >> >> >> >> 'type id' is mandatory. > > I was under the impression that the vendor supplied "name" was the one > unique identifier. We're certainly not going to create a registrar to > hand out type ids to each vendor, so what makes type id unique? Why does it have to be unique? It's just parroting what libvirt said in the block of the parent's nodedev-dumpxml. (Though I guess going by name could be useful as well. Perhaps both could be allowed). >> * Clear params, if set earlier: >> >> echo "" > /sys/../\:86\:00.0/11/params > > So params only applies to the devices created while those params are > set? This is inherently racy. I agree, these extra attributes should be set on the mdev, not the type directory. Paolo -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [RFC v2] libvirt vGPU QEMU integration
On Tue, 20 Sep 2016 02:05:52 +0530 Kirti Wankhedewrote: > Hi libvirt experts, > > Thanks for valuable input on v1 version of RFC. > > Quick brief, VFIO based mediated device framework provides a way to > virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > and IBM's channel IO. This framework reuses VFIO APIs for all the > functionalities for mediated devices which are currently being used for > pass through devices. This framework introduces a set of new sysfs files > for device creation and its life cycle management. > > Here is the summary of discussion on v1: > 1. Discover mediated device: > As part of physical device initialization process, vendor driver will > register their physical devices, which will be used to create virtual > device (mediated device, aka mdev) to the mediated framework. > > Vendor driver should specify mdev_supported_types in directory format. > This format is class based, for example, display class directory format > should be as below. We need to define such set for each class of devices > which would be supported by mediated device framework. > > --- mdev_destroy I thought we replaced mdev_destroy with a remove attribute for each mdev device. > --- mdev_supported_types > |-- 11 > | |-- create > | |-- name > | |-- fb_length > | |-- resolution > | |-- heads > | |-- max_instances > | |-- params > | |-- requires_group > |-- 12 > | |-- create > | |-- name > | |-- fb_length > | |-- resolution > | |-- heads > | |-- max_instances > | |-- params > | |-- requires_group > |-- 13 > |-- create > |-- name > |-- fb_length > |-- resolution > |-- heads > |-- max_instances > |-- params > |-- requires_group > > > In the above example directory '11' represents a type id of mdev device. > 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > 'requires_group' would be Read-Only files that vendor would provide to > describe about that type. > > 'create': > Write-only file. Mandatory. > Accepts string to create mediated device. > > 'name': > Read-Only file. Mandatory. > Returns string, the name of that type id. > > 'fb_length': > Read-only file. Mandatory. > Returns {K,M,G}, size of framebuffer. This can't be mandatory, it's only relevant to vGPU devices, vGPUs are just one user of mediated devices. > > 'resolution': > Read-Only file. Mandatory. > Returns 'hres x vres' format. Maximum supported resolution. Same. > > 'heads': > Read-Only file. Mandatory. > Returns integer. Number of maximum heads supported. Same. > > 'max_instance': > Read-Only file. Mandatory. > Returns integer. Returns maximum mdev device could be created > at the moment when this file is read. This count would be updated by > vendor driver. Before creating mdev device of this type, check if > max_instance is > 0. Didn't we discuss this being being something like "available_instances" with a "devices" sub-directory linking current devices of that type? > 'params' > Write-Only file. Optional. > String input. Libvirt would pass the string given in XML file to > this file and then create mdev device. Set empty string to clear params. > For example, set parameter 'frame_rate_limiter=0' to disable frame rate > limiter for performance benchmarking, then create device of type 11. The > device created would have that parameter set by vendor driver. Might as well just call it "magic", there's absolutely no ability to introspect what parameters are allowed or even valid here. Can all parameters be changed dynamically? Do parameter changes apply to existing devices or only future devices? This is a poorly specified interface. I'd prefer this be done via module options on the vendor driver. > 'requires_group' > Read-Only file. Optional. > This should be provided by vendor driver if vendor driver need to > group mdev devices in one domain so that vendor driver can use 'first > open' to commit resources of all mdev devices associated to that domain > and 'last close' to free those. > > The parent device would look like: > > > pci__86_00_0 > >0 >134 >0 >0 This is the parent device address? (I think we'd re-use the specification for assigned devices) Is this optional? Couldn't libvirt choose to pick any parent device supplying the specified type if not supplied? > > > > >GRID M60-0B >512M >2560x1600 >2 >16 >1 > > >GRID M60 >NVIDIA What are these specifying? > > > > 2. Create/destroy mediated device > > With above example, vGPU device XML would look like: > > > my-vgpu >
[libvirt] [RFC v2] libvirt vGPU QEMU integration
Hi libvirt experts, Thanks for valuable input on v1 version of RFC. Quick brief, VFIO based mediated device framework provides a way to virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT and IBM's channel IO. This framework reuses VFIO APIs for all the functionalities for mediated devices which are currently being used for pass through devices. This framework introduces a set of new sysfs files for device creation and its life cycle management. Here is the summary of discussion on v1: 1. Discover mediated device: As part of physical device initialization process, vendor driver will register their physical devices, which will be used to create virtual device (mediated device, aka mdev) to the mediated framework. Vendor driver should specify mdev_supported_types in directory format. This format is class based, for example, display class directory format should be as below. We need to define such set for each class of devices which would be supported by mediated device framework. --- mdev_destroy --- mdev_supported_types |-- 11 | |-- create | |-- name | |-- fb_length | |-- resolution | |-- heads | |-- max_instances | |-- params | |-- requires_group |-- 12 | |-- create | |-- name | |-- fb_length | |-- resolution | |-- heads | |-- max_instances | |-- params | |-- requires_group |-- 13 |-- create |-- name |-- fb_length |-- resolution |-- heads |-- max_instances |-- params |-- requires_group In the above example directory '11' represents a type id of mdev device. 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and 'requires_group' would be Read-Only files that vendor would provide to describe about that type. 'create': Write-only file. Mandatory. Accepts string to create mediated device. 'name': Read-Only file. Mandatory. Returns string, the name of that type id. 'fb_length': Read-only file. Mandatory. Returns {K,M,G}, size of framebuffer. 'resolution': Read-Only file. Mandatory. Returns 'hres x vres' format. Maximum supported resolution. 'heads': Read-Only file. Mandatory. Returns integer. Number of maximum heads supported. 'max_instance': Read-Only file. Mandatory. Returns integer. Returns maximum mdev device could be created at the moment when this file is read. This count would be updated by vendor driver. Before creating mdev device of this type, check if max_instance is > 0. 'params' Write-Only file. Optional. String input. Libvirt would pass the string given in XML file to this file and then create mdev device. Set empty string to clear params. For example, set parameter 'frame_rate_limiter=0' to disable frame rate limiter for performance benchmarking, then create device of type 11. The device created would have that parameter set by vendor driver. 'requires_group' Read-Only file. Optional. This should be provided by vendor driver if vendor driver need to group mdev devices in one domain so that vendor driver can use 'first open' to commit resources of all mdev devices associated to that domain and 'last close' to free those. The parent device would look like: pci__86_00_0 0 134 0 0 GRID M60-0B 512M 2560x1600 2 16 1 GRID M60 NVIDIA 2. Create/destroy mediated device With above example, vGPU device XML would look like: my-vgpu pci__86_00_0 1 'frame_rate_limiter=0' 'type id' is mandatory. 'group' is optional. It should be a unique number in the system among all the groups created for mdev devices. Its usage is: - not needed if single vGPU device is being assigned to a domain. - only need to be set if multiple vGPUs need to be assigned to a domain and vendor driver have 'requires_group' file in type id directory. - if type id directory include 'requires_group' and user tries to assign multiple vGPUs to a domain without having field in XML, it will create single vGPU. 'params' is optional field. User should set this field if extra parameters need to be set for a particular vGPU device. Libvirt don't need to parse these params. These are meant for vendor driver. Libvirt need to follow the sequence to create device: * Read /sys/../\:86\:00.0/11/max_instances. If it is greater than 0, then only proceed else fail. * Set extra params if 'params' field exist in device XML and 'params' file exist in type id directory echo "frame_rate_limiter=0" > /sys/../\:86\:00.0/11/params * Autogenerate UUID * Create device: echo "$UUID:" > /sys/../\:86\:00.0/11/create where is optional. Group should be unique number among all the