Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Fri, 7 Oct 2016 10:46:22 +0530 Kirti Wankhedewrote: > Ping.. > > Pulling the questions at the top. > > >> Will libvirt report 'description' RO attribute, its output would be > >> string, so that user could be able to see the configuration of that > >> profile? > >> > > > > Daniel, > > Waiting for your input on this. > > > >> We can have 'class' as optional attribute. So Intel don't have to > >> provide 'class' attribute and they don't have to specify mandatory > >> attributes of that class. We would provide 'class' attribute and provide > >> mandatory attributes. Hi Kirti, I would proceed with your design and posting, this is not a sufficiently significant issue to delay a new version. The next posting will likely not be the last, more feedback will come. Personally I don't see an issue. We should have a small number of mandatory sysfs interfaces for mdev devices, essentially just enough to manage the life cycle of a device, even if it requires vendor specific knowledge to understand the exact composition of that device. Some sort of human readable description seems like a reasonable optional feature for a userspace tool like libvirt to take advantage of. A class definition with an available set of attributes per class would clearly enhance the ability of a userspace tool to understand the device, but does not seem strictly required for managing the device. Thanks, Alex
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
Ping.. Pulling the questions at the top. >> Will libvirt report 'description' RO attribute, its output would be >> string, so that user could be able to see the configuration of that >> profile? >> > > Daniel, > Waiting for your input on this. > >> We can have 'class' as optional attribute. So Intel don't have to >> provide 'class' attribute and they don't have to specify mandatory >> attributes of that class. We would provide 'class' attribute and provide >> mandatory attributes. > Thanks, Kirti On 10/3/2016 1:50 PM, Kirti Wankhede wrote: > > > On 9/30/2016 10:49 AM, Kirti Wankhede wrote: >> > ... > >>> Hi Daniel, >>> >>> Here you are proposing to add a class named "gpu", which will make all >>> those gpu >>> related attributes mandatory, which libvirt can allow user to better >>> parse/present a particular mdev configuration? >>> >>> I am just wondering if there is another option that we just make all >>> those >>> attributes that a mdev device can have as optional but still meaningful >>> to >>> libvirt, so libvirt can still parse / recognize them as an class "mdev". >> >> 'mdev' isn't a class - mdev is the name of the kernel module. The class >> refers to the broad capability of the device. class would be things >> like "gpu", "nic", "fpga" or other such things. The point of the class >> is to identify which other attributes will be considered mandatory. >> >> > > Thanks Daniel. This class definition makes sense to me. > > However I'm not sure whether we should define such common mandatory > attributes > of a 'gpu' class now. Intel will go with a 2's power sharing of type > definition... actual > type name to be finalized, but an example looks like below: > > [GVTG-SKL-x2]: available instances (2) > [GVTG-SKL-x4]: available instances (4) > [GVTG-SKL-x8]: available instances (8) > ... > > User can create different types of vGPUs simultaneously. A GVTG-SKL-x2 > type > vGPU will get half of the physical GPU resource, while a GVTG-SKL-x4 type > will > get a quarter. However it's unclear to me how we want to enumerate those > resources into resolution or heads. I feel it'd be more reasonable for us > to push > initial libvirt mdev support w/o vgpu specific class definition, until we > see > a clear value of doing so (at that time we then follow Daniel's guideline > to define > mandatory attributes common to all GPU vendors). Libvirt won't report arbitrary vendor define attributes. So if we are not going to define a gpu class & associated attributes, then there will be no reporting of the 'heads', 'resolution', 'fb_length' data described above. >>> >>> yes, that's my point. I think nvidia may put them into the 'description' >>> attribute >>> just for descriptive purpose for now. >> >> >> Will libvirt report 'description' RO attribute, its output would be >> string, so that user could be able to see the configuration of that >> profile? >> > > Daniel, > Waiting for your input on this. > >> We can have 'class' as optional attribute. So Intel don't have to >> provide 'class' attribute and they don't have to specify mandatory >> attributes of that class. We would provide 'class' attribute and provide >> mandatory attributes. > > > Thanks, > Kirti >
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 9/30/2016 10:49 AM, Kirti Wankhede wrote: > ... >> Hi Daniel, >> >> Here you are proposing to add a class named "gpu", which will make all >> those gpu >> related attributes mandatory, which libvirt can allow user to better >> parse/present a particular mdev configuration? >> >> I am just wondering if there is another option that we just make all >> those >> attributes that a mdev device can have as optional but still meaningful >> to >> libvirt, so libvirt can still parse / recognize them as an class "mdev". > > 'mdev' isn't a class - mdev is the name of the kernel module. The class > refers to the broad capability of the device. class would be things > like "gpu", "nic", "fpga" or other such things. The point of the class > is to identify which other attributes will be considered mandatory. > > Thanks Daniel. This class definition makes sense to me. However I'm not sure whether we should define such common mandatory attributes of a 'gpu' class now. Intel will go with a 2's power sharing of type definition... actual type name to be finalized, but an example looks like below: [GVTG-SKL-x2]: available instances (2) [GVTG-SKL-x4]: available instances (4) [GVTG-SKL-x8]: available instances (8) ... User can create different types of vGPUs simultaneously. A GVTG-SKL-x2 type vGPU will get half of the physical GPU resource, while a GVTG-SKL-x4 type will get a quarter. However it's unclear to me how we want to enumerate those resources into resolution or heads. I feel it'd be more reasonable for us to push initial libvirt mdev support w/o vgpu specific class definition, until we see a clear value of doing so (at that time we then follow Daniel's guideline to define mandatory attributes common to all GPU vendors). >>> >>> Libvirt won't report arbitrary vendor define attributes. So if we are not >>> going to define a gpu class & associated attributes, then there will be >>> no reporting of the 'heads', 'resolution', 'fb_length' data described >>> above. >>> >> >> yes, that's my point. I think nvidia may put them into the 'description' >> attribute >> just for descriptive purpose for now. > > > Will libvirt report 'description' RO attribute, its output would be > string, so that user could be able to see the configuration of that > profile? > Daniel, Waiting for your input on this. > We can have 'class' as optional attribute. So Intel don't have to > provide 'class' attribute and they don't have to specify mandatory > attributes of that class. We would provide 'class' attribute and provide > mandatory attributes. Thanks, Kirti
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 9/29/2016 8:12 PM, Tian, Kevin wrote: >> From: Daniel P. Berrange [mailto:berra...@redhat.com] >> Sent: Thursday, September 29, 2016 10:39 PM >> >> On Thu, Sep 29, 2016 at 02:35:48PM +, Tian, Kevin wrote: From: Daniel P. Berrange [mailto:berra...@redhat.com] Sent: Thursday, September 29, 2016 4:06 PM On Wed, Sep 28, 2016 at 12:48:33PM -0700, Neo Jia wrote: > On Tue, Sep 20, 2016 at 10:47:53AM +0100, Daniel P. Berrange wrote: >> On Tue, Sep 20, 2016 at 02:05:52AM +0530, Kirti Wankhede wrote: >>> >>> Hi libvirt experts, >>> >>> Thanks for valuable input on v1 version of RFC. >>> >>> Quick brief, VFIO based mediated device framework provides a way to >>> virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT >>> and IBM's channel IO. This framework reuses VFIO APIs for all the >>> functionalities for mediated devices which are currently being used for >>> pass through devices. This framework introduces a set of new sysfs files >>> for device creation and its life cycle management. >>> >>> Here is the summary of discussion on v1: >>> 1. Discover mediated device: >>> As part of physical device initialization process, vendor driver will >>> register their physical devices, which will be used to create virtual >>> device (mediated device, aka mdev) to the mediated framework. >>> >>> Vendor driver should specify mdev_supported_types in directory format. >>> This format is class based, for example, display class directory format >>> should be as below. We need to define such set for each class of devices >>> which would be supported by mediated device framework. >>> >>> --- mdev_destroy >>> --- mdev_supported_types >>> |-- 11 >>> | |-- create >>> | |-- name >>> | |-- fb_length >>> | |-- resolution >>> | |-- heads >>> | |-- max_instances >>> | |-- params >>> | |-- requires_group >>> |-- 12 >>> | |-- create >>> | |-- name >>> | |-- fb_length >>> | |-- resolution >>> | |-- heads >>> | |-- max_instances >>> | |-- params >>> | |-- requires_group >>> |-- 13 >>> |-- create >>> |-- name >>> |-- fb_length >>> |-- resolution >>> |-- heads >>> |-- max_instances >>> |-- params >>> |-- requires_group >>> >>> >>> In the above example directory '11' represents a type id of mdev device. >>> 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and >>> 'requires_group' would be Read-Only files that vendor would provide to >>> describe about that type. >>> >>> 'create': >>> Write-only file. Mandatory. >>> Accepts string to create mediated device. >>> >>> 'name': >>> Read-Only file. Mandatory. >>> Returns string, the name of that type id. >> >> Presumably this is a human-targetted title/description of >> the device. >> >>> >>> 'fb_length': >>> Read-only file. Mandatory. >>> Returns {K,M,G}, size of framebuffer. >>> >>> 'resolution': >>> Read-Only file. Mandatory. >>> Returns 'hres x vres' format. Maximum supported resolution. >>> >>> 'heads': >>> Read-Only file. Mandatory. >>> Returns integer. Number of maximum heads supported. >> >> None of these should be mandatory as that makes the mdev >> useless for non-GPU devices. >> >> I'd expect to see a 'class' or 'type' attribute in the >> directory whcih tells you what kind of mdev it is. A >> valid 'class' value would be 'gpu'. The fb_length, >> resolution, and heads parameters would only be mandatory >> when class==gpu. >> > > Hi Daniel, > > Here you are proposing to add a class named "gpu", which will make all > those gpu > related attributes mandatory, which libvirt can allow user to better > parse/present a particular mdev configuration? > > I am just wondering if there is another option that we just make all those > attributes that a mdev device can have as optional but still meaningful to > libvirt, so libvirt can still parse / recognize them as an class "mdev". 'mdev' isn't a class - mdev is the name of the kernel module. The class refers to the broad capability of the device. class would be things like "gpu", "nic", "fpga" or other such things. The point of the class is to identify which other attributes will be considered mandatory. >>> >>> Thanks Daniel. This class definition makes sense to me. >>> >>> However I'm not sure whether we should define such common mandatory >>> attributes >>> of a 'gpu' class now. Intel will go with a 2's power
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Daniel P. Berrange [mailto:berra...@redhat.com] > Sent: Thursday, September 29, 2016 10:39 PM > > On Thu, Sep 29, 2016 at 02:35:48PM +, Tian, Kevin wrote: > > > From: Daniel P. Berrange [mailto:berra...@redhat.com] > > > Sent: Thursday, September 29, 2016 4:06 PM > > > > > > On Wed, Sep 28, 2016 at 12:48:33PM -0700, Neo Jia wrote: > > > > On Tue, Sep 20, 2016 at 10:47:53AM +0100, Daniel P. Berrange wrote: > > > > > On Tue, Sep 20, 2016 at 02:05:52AM +0530, Kirti Wankhede wrote: > > > > > > > > > > > > Hi libvirt experts, > > > > > > > > > > > > Thanks for valuable input on v1 version of RFC. > > > > > > > > > > > > Quick brief, VFIO based mediated device framework provides a way to > > > > > > virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel > > > > > > KVMGT > > > > > > and IBM's channel IO. This framework reuses VFIO APIs for all the > > > > > > functionalities for mediated devices which are currently being used > > > > > > for > > > > > > pass through devices. This framework introduces a set of new sysfs > > > > > > files > > > > > > for device creation and its life cycle management. > > > > > > > > > > > > Here is the summary of discussion on v1: > > > > > > 1. Discover mediated device: > > > > > > As part of physical device initialization process, vendor driver > > > > > > will > > > > > > register their physical devices, which will be used to create > > > > > > virtual > > > > > > device (mediated device, aka mdev) to the mediated framework. > > > > > > > > > > > > Vendor driver should specify mdev_supported_types in directory > > > > > > format. > > > > > > This format is class based, for example, display class directory > > > > > > format > > > > > > should be as below. We need to define such set for each class of > > > > > > devices > > > > > > which would be supported by mediated device framework. > > > > > > > > > > > > --- mdev_destroy > > > > > > --- mdev_supported_types > > > > > > |-- 11 > > > > > > | |-- create > > > > > > | |-- name > > > > > > | |-- fb_length > > > > > > | |-- resolution > > > > > > | |-- heads > > > > > > | |-- max_instances > > > > > > | |-- params > > > > > > | |-- requires_group > > > > > > |-- 12 > > > > > > | |-- create > > > > > > | |-- name > > > > > > | |-- fb_length > > > > > > | |-- resolution > > > > > > | |-- heads > > > > > > | |-- max_instances > > > > > > | |-- params > > > > > > | |-- requires_group > > > > > > |-- 13 > > > > > > |-- create > > > > > > |-- name > > > > > > |-- fb_length > > > > > > |-- resolution > > > > > > |-- heads > > > > > > |-- max_instances > > > > > > |-- params > > > > > > |-- requires_group > > > > > > > > > > > > > > > > > > In the above example directory '11' represents a type id of mdev > > > > > > device. > > > > > > 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > > > > > > 'requires_group' would be Read-Only files that vendor would provide > > > > > > to > > > > > > describe about that type. > > > > > > > > > > > > 'create': > > > > > > Write-only file. Mandatory. > > > > > > Accepts string to create mediated device. > > > > > > > > > > > > 'name': > > > > > > Read-Only file. Mandatory. > > > > > > Returns string, the name of that type id. > > > > > > > > > > Presumably this is a human-targetted title/description of > > > > > the device. > > > > > > > > > > > > > > > > > 'fb_length': > > > > > > Read-only file. Mandatory. > > > > > > Returns {K,M,G}, size of framebuffer. > > > > > > > > > > > > 'resolution': > > > > > > Read-Only file. Mandatory. > > > > > > Returns 'hres x vres' format. Maximum supported resolution. > > > > > > > > > > > > 'heads': > > > > > > Read-Only file. Mandatory. > > > > > > Returns integer. Number of maximum heads supported. > > > > > > > > > > None of these should be mandatory as that makes the mdev > > > > > useless for non-GPU devices. > > > > > > > > > > I'd expect to see a 'class' or 'type' attribute in the > > > > > directory whcih tells you what kind of mdev it is. A > > > > > valid 'class' value would be 'gpu'. The fb_length, > > > > > resolution, and heads parameters would only be mandatory > > > > > when class==gpu. > > > > > > > > > > > > > Hi Daniel, > > > > > > > > Here you are proposing to add a class named "gpu", which will make all > > > > those gpu > > > > related attributes mandatory, which libvirt can allow user to better > > > > parse/present a particular mdev configuration? > > > > > > > > I am just wondering if there is another option that we just make all > > > > those > > > > attributes that a mdev device can have as optional but still meaningful > > > > to > > > > libvirt, so libvirt can still parse / recognize them as an class "mdev". > > > > > > 'mdev' isn't a class - mdev is
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Thu, Sep 29, 2016 at 02:35:48PM +, Tian, Kevin wrote: > > From: Daniel P. Berrange [mailto:berra...@redhat.com] > > Sent: Thursday, September 29, 2016 4:06 PM > > > > On Wed, Sep 28, 2016 at 12:48:33PM -0700, Neo Jia wrote: > > > On Tue, Sep 20, 2016 at 10:47:53AM +0100, Daniel P. Berrange wrote: > > > > On Tue, Sep 20, 2016 at 02:05:52AM +0530, Kirti Wankhede wrote: > > > > > > > > > > Hi libvirt experts, > > > > > > > > > > Thanks for valuable input on v1 version of RFC. > > > > > > > > > > Quick brief, VFIO based mediated device framework provides a way to > > > > > virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > > > > > and IBM's channel IO. This framework reuses VFIO APIs for all the > > > > > functionalities for mediated devices which are currently being used > > > > > for > > > > > pass through devices. This framework introduces a set of new sysfs > > > > > files > > > > > for device creation and its life cycle management. > > > > > > > > > > Here is the summary of discussion on v1: > > > > > 1. Discover mediated device: > > > > > As part of physical device initialization process, vendor driver will > > > > > register their physical devices, which will be used to create virtual > > > > > device (mediated device, aka mdev) to the mediated framework. > > > > > > > > > > Vendor driver should specify mdev_supported_types in directory format. > > > > > This format is class based, for example, display class directory > > > > > format > > > > > should be as below. We need to define such set for each class of > > > > > devices > > > > > which would be supported by mediated device framework. > > > > > > > > > > --- mdev_destroy > > > > > --- mdev_supported_types > > > > > |-- 11 > > > > > | |-- create > > > > > | |-- name > > > > > | |-- fb_length > > > > > | |-- resolution > > > > > | |-- heads > > > > > | |-- max_instances > > > > > | |-- params > > > > > | |-- requires_group > > > > > |-- 12 > > > > > | |-- create > > > > > | |-- name > > > > > | |-- fb_length > > > > > | |-- resolution > > > > > | |-- heads > > > > > | |-- max_instances > > > > > | |-- params > > > > > | |-- requires_group > > > > > |-- 13 > > > > > |-- create > > > > > |-- name > > > > > |-- fb_length > > > > > |-- resolution > > > > > |-- heads > > > > > |-- max_instances > > > > > |-- params > > > > > |-- requires_group > > > > > > > > > > > > > > > In the above example directory '11' represents a type id of mdev > > > > > device. > > > > > 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > > > > > 'requires_group' would be Read-Only files that vendor would provide to > > > > > describe about that type. > > > > > > > > > > 'create': > > > > > Write-only file. Mandatory. > > > > > Accepts string to create mediated device. > > > > > > > > > > 'name': > > > > > Read-Only file. Mandatory. > > > > > Returns string, the name of that type id. > > > > > > > > Presumably this is a human-targetted title/description of > > > > the device. > > > > > > > > > > > > > > 'fb_length': > > > > > Read-only file. Mandatory. > > > > > Returns {K,M,G}, size of framebuffer. > > > > > > > > > > 'resolution': > > > > > Read-Only file. Mandatory. > > > > > Returns 'hres x vres' format. Maximum supported resolution. > > > > > > > > > > 'heads': > > > > > Read-Only file. Mandatory. > > > > > Returns integer. Number of maximum heads supported. > > > > > > > > None of these should be mandatory as that makes the mdev > > > > useless for non-GPU devices. > > > > > > > > I'd expect to see a 'class' or 'type' attribute in the > > > > directory whcih tells you what kind of mdev it is. A > > > > valid 'class' value would be 'gpu'. The fb_length, > > > > resolution, and heads parameters would only be mandatory > > > > when class==gpu. > > > > > > > > > > Hi Daniel, > > > > > > Here you are proposing to add a class named "gpu", which will make all > > > those gpu > > > related attributes mandatory, which libvirt can allow user to better > > > parse/present a particular mdev configuration? > > > > > > I am just wondering if there is another option that we just make all those > > > attributes that a mdev device can have as optional but still meaningful to > > > libvirt, so libvirt can still parse / recognize them as an class "mdev". > > > > 'mdev' isn't a class - mdev is the name of the kernel module. The class > > refers to the broad capability of the device. class would be things > > like "gpu", "nic", "fpga" or other such things. The point of the class > > is to identify which other attributes will be considered mandatory. > > > > > > Thanks Daniel. This class definition makes sense to me. > > However I'm not sure whether we should define such common mandatory attributes > of
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Daniel P. Berrange [mailto:berra...@redhat.com] > Sent: Thursday, September 29, 2016 4:06 PM > > On Wed, Sep 28, 2016 at 12:48:33PM -0700, Neo Jia wrote: > > On Tue, Sep 20, 2016 at 10:47:53AM +0100, Daniel P. Berrange wrote: > > > On Tue, Sep 20, 2016 at 02:05:52AM +0530, Kirti Wankhede wrote: > > > > > > > > Hi libvirt experts, > > > > > > > > Thanks for valuable input on v1 version of RFC. > > > > > > > > Quick brief, VFIO based mediated device framework provides a way to > > > > virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > > > > and IBM's channel IO. This framework reuses VFIO APIs for all the > > > > functionalities for mediated devices which are currently being used for > > > > pass through devices. This framework introduces a set of new sysfs files > > > > for device creation and its life cycle management. > > > > > > > > Here is the summary of discussion on v1: > > > > 1. Discover mediated device: > > > > As part of physical device initialization process, vendor driver will > > > > register their physical devices, which will be used to create virtual > > > > device (mediated device, aka mdev) to the mediated framework. > > > > > > > > Vendor driver should specify mdev_supported_types in directory format. > > > > This format is class based, for example, display class directory format > > > > should be as below. We need to define such set for each class of devices > > > > which would be supported by mediated device framework. > > > > > > > > --- mdev_destroy > > > > --- mdev_supported_types > > > > |-- 11 > > > > | |-- create > > > > | |-- name > > > > | |-- fb_length > > > > | |-- resolution > > > > | |-- heads > > > > | |-- max_instances > > > > | |-- params > > > > | |-- requires_group > > > > |-- 12 > > > > | |-- create > > > > | |-- name > > > > | |-- fb_length > > > > | |-- resolution > > > > | |-- heads > > > > | |-- max_instances > > > > | |-- params > > > > | |-- requires_group > > > > |-- 13 > > > > |-- create > > > > |-- name > > > > |-- fb_length > > > > |-- resolution > > > > |-- heads > > > > |-- max_instances > > > > |-- params > > > > |-- requires_group > > > > > > > > > > > > In the above example directory '11' represents a type id of mdev device. > > > > 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > > > > 'requires_group' would be Read-Only files that vendor would provide to > > > > describe about that type. > > > > > > > > 'create': > > > > Write-only file. Mandatory. > > > > Accepts string to create mediated device. > > > > > > > > 'name': > > > > Read-Only file. Mandatory. > > > > Returns string, the name of that type id. > > > > > > Presumably this is a human-targetted title/description of > > > the device. > > > > > > > > > > > 'fb_length': > > > > Read-only file. Mandatory. > > > > Returns {K,M,G}, size of framebuffer. > > > > > > > > 'resolution': > > > > Read-Only file. Mandatory. > > > > Returns 'hres x vres' format. Maximum supported resolution. > > > > > > > > 'heads': > > > > Read-Only file. Mandatory. > > > > Returns integer. Number of maximum heads supported. > > > > > > None of these should be mandatory as that makes the mdev > > > useless for non-GPU devices. > > > > > > I'd expect to see a 'class' or 'type' attribute in the > > > directory whcih tells you what kind of mdev it is. A > > > valid 'class' value would be 'gpu'. The fb_length, > > > resolution, and heads parameters would only be mandatory > > > when class==gpu. > > > > > > > Hi Daniel, > > > > Here you are proposing to add a class named "gpu", which will make all > > those gpu > > related attributes mandatory, which libvirt can allow user to better > > parse/present a particular mdev configuration? > > > > I am just wondering if there is another option that we just make all those > > attributes that a mdev device can have as optional but still meaningful to > > libvirt, so libvirt can still parse / recognize them as an class "mdev". > > 'mdev' isn't a class - mdev is the name of the kernel module. The class > refers to the broad capability of the device. class would be things > like "gpu", "nic", "fpga" or other such things. The point of the class > is to identify which other attributes will be considered mandatory. > > Thanks Daniel. This class definition makes sense to me. However I'm not sure whether we should define such common mandatory attributes of a 'gpu' class now. Intel will go with a 2's power sharing of type definition... actual type name to be finalized, but an example looks like below: [GVTG-SKL-x2]: available instances (2) [GVTG-SKL-x4]: available instances (4) [GVTG-SKL-x8]: available instances (8) ... User can create different types of vGPUs simultaneously. A GVTG-SKL-x2 type vGPU will
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
7. Hot-plug It is same syntax to create a virtual device for hot-plug. >>> >>> How do groups work with hotplug? Can a device be creating into an >>> existing, running group? Can a device be removed from an existing, >>> running group? Thanks, >>> >> >> Group is for vendor driver to decide when to commit resources for a >> group of devices to support multiple devices in a domain. It will be >> upto vendor driver how it manage it. > > Then it's not sufficiently specified for libvirt to use. Thanks, > Can you clarify what if required for libvirt to use? >>> >>> Everything we're discussing above. We can't simply say that group >>> management is defined by the vendor driver, that means that libvirt >>> needs to know how to uniquely behave for each vendor driver. We want >>> to create a consistent framework where libvirt doesn't care what the >>> vendor driver is. Therefore we need to make group manipulation fully >>> specified, regardless of the vendor driver. Thanks, >>> >> >> There is no communication between different vendor drivers. So if >> libvirt creates a group with one device from one vendor and one device >> from another vendor, both vendor driver would think they have a single >> device and accordingly commit resources for that single device from its >> own driver. Ideally nothing would fail. But that is not the main aim of >> grouping, right? >> >> To make this independent of vendor driver, we had initial proposal of >> having same UUID and different instance numbers. But that was also not >> acceptable. > > We're defining the infrastructure, we get to define the rules, but we > need to define all the rules so that libvirt can actually implement to > them. We can't say "groups are this nebulous thing and each vendor > will manage them however they want", nobody wants to care that NVIDIA > did it one way and Intel did it another and we need to support both. > That's a design failure. Therefore we \must\ make the group definition > we need well specified, any hopefully generic enough that others can > use it. If they cannot then we'd need to add a new type of group. > Therefore we need to \specify\ things like for what set of mdev should > a group be created? Do I need a group for a single device? How do we > introspect a group to know which mdevs it contains? Can we dynamically > add or remove mdev devices from a group? Can we attach multiple > groups to the same user? Can we change the association of an mdev > device without removing it and recreating it? etc, etc, etc. We don't > need vendor coordination, we need a specification of exactly how this > type of group is meant to work. Thanks, > We decided to handle "multiple vGPUs per VM" through internal design changes. With that we don't need grouping in mdev framework. Hope this helps to accelerate and settle on sysfs interface soon. Thanks, Kirti
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Wed, Sep 28, 2016 at 12:48:33PM -0700, Neo Jia wrote: > On Tue, Sep 20, 2016 at 10:47:53AM +0100, Daniel P. Berrange wrote: > > On Tue, Sep 20, 2016 at 02:05:52AM +0530, Kirti Wankhede wrote: > > > > > > Hi libvirt experts, > > > > > > Thanks for valuable input on v1 version of RFC. > > > > > > Quick brief, VFIO based mediated device framework provides a way to > > > virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > > > and IBM's channel IO. This framework reuses VFIO APIs for all the > > > functionalities for mediated devices which are currently being used for > > > pass through devices. This framework introduces a set of new sysfs files > > > for device creation and its life cycle management. > > > > > > Here is the summary of discussion on v1: > > > 1. Discover mediated device: > > > As part of physical device initialization process, vendor driver will > > > register their physical devices, which will be used to create virtual > > > device (mediated device, aka mdev) to the mediated framework. > > > > > > Vendor driver should specify mdev_supported_types in directory format. > > > This format is class based, for example, display class directory format > > > should be as below. We need to define such set for each class of devices > > > which would be supported by mediated device framework. > > > > > > --- mdev_destroy > > > --- mdev_supported_types > > > |-- 11 > > > | |-- create > > > | |-- name > > > | |-- fb_length > > > | |-- resolution > > > | |-- heads > > > | |-- max_instances > > > | |-- params > > > | |-- requires_group > > > |-- 12 > > > | |-- create > > > | |-- name > > > | |-- fb_length > > > | |-- resolution > > > | |-- heads > > > | |-- max_instances > > > | |-- params > > > | |-- requires_group > > > |-- 13 > > > |-- create > > > |-- name > > > |-- fb_length > > > |-- resolution > > > |-- heads > > > |-- max_instances > > > |-- params > > > |-- requires_group > > > > > > > > > In the above example directory '11' represents a type id of mdev device. > > > 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > > > 'requires_group' would be Read-Only files that vendor would provide to > > > describe about that type. > > > > > > 'create': > > > Write-only file. Mandatory. > > > Accepts string to create mediated device. > > > > > > 'name': > > > Read-Only file. Mandatory. > > > Returns string, the name of that type id. > > > > Presumably this is a human-targetted title/description of > > the device. > > > > > > > > 'fb_length': > > > Read-only file. Mandatory. > > > Returns {K,M,G}, size of framebuffer. > > > > > > 'resolution': > > > Read-Only file. Mandatory. > > > Returns 'hres x vres' format. Maximum supported resolution. > > > > > > 'heads': > > > Read-Only file. Mandatory. > > > Returns integer. Number of maximum heads supported. > > > > None of these should be mandatory as that makes the mdev > > useless for non-GPU devices. > > > > I'd expect to see a 'class' or 'type' attribute in the > > directory whcih tells you what kind of mdev it is. A > > valid 'class' value would be 'gpu'. The fb_length, > > resolution, and heads parameters would only be mandatory > > when class==gpu. > > > > Hi Daniel, > > Here you are proposing to add a class named "gpu", which will make all those > gpu > related attributes mandatory, which libvirt can allow user to better > parse/present a particular mdev configuration? > > I am just wondering if there is another option that we just make all those > attributes that a mdev device can have as optional but still meaningful to > libvirt, so libvirt can still parse / recognize them as an class "mdev". 'mdev' isn't a class - mdev is the name of the kernel module. The class refers to the broad capability of the device. class would be things like "gpu", "nic", "fpga" or other such things. The point of the class is to identify which other attributes will be considered mandatory. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://entangle-photo.org -o-http://search.cpan.org/~danberr/ :|
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, Sep 20, 2016 at 10:47:53AM +0100, Daniel P. Berrange wrote: > On Tue, Sep 20, 2016 at 02:05:52AM +0530, Kirti Wankhede wrote: > > > > Hi libvirt experts, > > > > Thanks for valuable input on v1 version of RFC. > > > > Quick brief, VFIO based mediated device framework provides a way to > > virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > > and IBM's channel IO. This framework reuses VFIO APIs for all the > > functionalities for mediated devices which are currently being used for > > pass through devices. This framework introduces a set of new sysfs files > > for device creation and its life cycle management. > > > > Here is the summary of discussion on v1: > > 1. Discover mediated device: > > As part of physical device initialization process, vendor driver will > > register their physical devices, which will be used to create virtual > > device (mediated device, aka mdev) to the mediated framework. > > > > Vendor driver should specify mdev_supported_types in directory format. > > This format is class based, for example, display class directory format > > should be as below. We need to define such set for each class of devices > > which would be supported by mediated device framework. > > > > --- mdev_destroy > > --- mdev_supported_types > > |-- 11 > > | |-- create > > | |-- name > > | |-- fb_length > > | |-- resolution > > | |-- heads > > | |-- max_instances > > | |-- params > > | |-- requires_group > > |-- 12 > > | |-- create > > | |-- name > > | |-- fb_length > > | |-- resolution > > | |-- heads > > | |-- max_instances > > | |-- params > > | |-- requires_group > > |-- 13 > > |-- create > > |-- name > > |-- fb_length > > |-- resolution > > |-- heads > > |-- max_instances > > |-- params > > |-- requires_group > > > > > > In the above example directory '11' represents a type id of mdev device. > > 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > > 'requires_group' would be Read-Only files that vendor would provide to > > describe about that type. > > > > 'create': > > Write-only file. Mandatory. > > Accepts string to create mediated device. > > > > 'name': > > Read-Only file. Mandatory. > > Returns string, the name of that type id. > > Presumably this is a human-targetted title/description of > the device. > > > > > 'fb_length': > > Read-only file. Mandatory. > > Returns {K,M,G}, size of framebuffer. > > > > 'resolution': > > Read-Only file. Mandatory. > > Returns 'hres x vres' format. Maximum supported resolution. > > > > 'heads': > > Read-Only file. Mandatory. > > Returns integer. Number of maximum heads supported. > > None of these should be mandatory as that makes the mdev > useless for non-GPU devices. > > I'd expect to see a 'class' or 'type' attribute in the > directory whcih tells you what kind of mdev it is. A > valid 'class' value would be 'gpu'. The fb_length, > resolution, and heads parameters would only be mandatory > when class==gpu. > Hi Daniel, Here you are proposing to add a class named "gpu", which will make all those gpu related attributes mandatory, which libvirt can allow user to better parse/present a particular mdev configuration? I am just wondering if there is another option that we just make all those attributes that a mdev device can have as optional but still meaningful to libvirt, so libvirt can still parse / recognize them as an class "mdev". In general, I am just trying to understand the requirement from libvirt and see how we can fit in this requirement for both Intel and NVIDIA since Intel is also moving to the type-based interface although they don't have "class" concept yet. Thanks, Neo > > 'max_instance': > > Read-Only file. Mandatory. > > Returns integer. Returns maximum mdev device could be created > > at the moment when this file is read. This count would be updated by > > vendor driver. Before creating mdev device of this type, check if > > max_instance is > 0. > > > > 'params' > > Write-Only file. Optional. > > String input. Libvirt would pass the string given in XML file to > > this file and then create mdev device. Set empty string to clear params. > > For example, set parameter 'frame_rate_limiter=0' to disable frame rate > > limiter for performance benchmarking, then create device of type 11. The > > device created would have that parameter set by vendor driver. > > Nope, libvirt will explicitly *NEVER* allow arbitrary opaque > passthrough of vendor specific data in this way. > > > The parent device would look like: > > > > > > pci__86_00_0 > > > >0 > >134 > >0 > >0 > > > > > > > > > >GRID M60-0B > >512M > >
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 9/23/2016 12:55 AM, Tian, Kevin wrote: >> From: Kirti Wankhede [mailto:kwankh...@nvidia.com] >> Sent: Wednesday, September 21, 2016 12:23 AM >>> > I have > a hard time believing that a given vendor can even allocate unique type > ids for their own devices. Unique type id across vendors is not > practical. So which attribute are we actually using to identify which > type of mdev device we need and why would we ever specify additional > attributes like fb_length? Doesn't the vendor guarantee that "GRID > M60-0B" has a fixed setup of those attributes? > Specifying attributes here is not our requirement. Yes we have fixed set of attributes for "GRID-M60-0B" and on. We are defining the attributed here for "display" class for all other vendor of gpu can use. > > Hi, Kirti, > > We decide to go with above type-based interface for KVMGT, with fixed setup > of attributes too. If both Intel and NVIDIA solutions use such fixed manner, > can we go with a proposal which doesn't include 'class' extension for now? > Later if there is a concrete usage requiring such class-specific attribute > setting, > then it's easy to extend following discussion in this thread. I'm thinking > how we > can converge the discussion here into something simple enough (and extensible) > to accelerate upstreaming of both Intel/NVIDIA solutions... > Hi Kevin, We have fixed set of attributes which are GPU/graphics specific, like framebuffer_length, resolution, number of heads, max supported instances. If you are going with fixed set of attributes, how are the vGPU types on KVMGT looks like from attributes point of view? attributes are graphics specific attributes? Thanks, Kirti
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > Sent: Wednesday, September 21, 2016 12:23 AM > > > >>> I have > >>> a hard time believing that a given vendor can even allocate unique type > >>> ids for their own devices. Unique type id across vendors is not > >>> practical. So which attribute are we actually using to identify which > >>> type of mdev device we need and why would we ever specify additional > >>> attributes like fb_length? Doesn't the vendor guarantee that "GRID > >>> M60-0B" has a fixed setup of those attributes? > >>> > >> > >> Specifying attributes here is not our requirement. Yes we have fixed set > >> of attributes for "GRID-M60-0B" and on. > >> We are defining the attributed here for "display" class for all other > >> vendor of gpu can use. > >> Hi, Kirti, We decide to go with above type-based interface for KVMGT, with fixed setup of attributes too. If both Intel and NVIDIA solutions use such fixed manner, can we go with a proposal which doesn't include 'class' extension for now? Later if there is a concrete usage requiring such class-specific attribute setting, then it's easy to extend following discussion in this thread. I'm thinking how we can converge the discussion here into something simple enough (and extensible) to accelerate upstreaming of both Intel/NVIDIA solutions... Thanks Kevin
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Thu, 22 Sep 2016 09:41:20 +0530 Kirti Wankhedewrote: > > My concern is that a type id seems arbitrary but we're specifying that > > it be unique. We already have something unique, the name. So why try > > to make the type id unique as well? A vendor can accidentally create > > their vendor driver so that a given name means something very > > specific. On the other hand they need to be extremely deliberate to > > coordinate that a type id means a unique thing across all their product > > lines. > > > > Let me clarify, type id should be unique in the list of > mdev_supported_types. You can't have 2 directories in with same name. > >>> > >>> Of course, but does that mean it's only unique to the machine I'm > >>> currently running on? Let's say I have a Tesla P100 on my system and > >>> type-id 11 is named "GRID-M60-0B". At some point in the future I > >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on that > >>> new card still going to be a "GRID-M60-0B"? If not then we've based > >>> our XML on the wrong attribute. If the new device does not support > >>> "GRID-M60-0B" then we should generate an error, not simply initialize > >>> whatever type-id 11 happens to be on this new card. > >>> > >> > >> If there are 2 M60 in the system then you would find '11' type directory > >> in mdev_supported_types of both M60. If you have P100, '11' type would > >> not be there in its mdev_supported_types, it will have different types. > >> > >> For example, if you replace M60 with P100, but XML is not updated. XML > >> have type '11'. When libvirt would try to create mdev device, libvirt > >> would have to find 'create' file in sysfs in following directory format: > >> > >> --- mdev_supported_types > >> |-- 11 > >> | |-- create > >> > >> but now for P100, '11' directory is not there, so libvirt should throw > >> error on not able to find '11' directory. > > > > This really seems like an accident waiting to happen. What happens > > when the user replaces their M60 with an Intel XYZ device that happens > > to expose a type 11 mdev class gpu device? How is libvirt supposed to > > know that the XML used to refer to a GRID-M60-0B and now it's an > > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and removing > > yet another arbitrary requirement that we have some sort of globally > > unique type-id database make a lot of sense? The same issue applies > > for simple debug-ability, if I'm reviewing the XML for a domain and the > > name is the primary index for the mdev device, I know what it is. > > Seeing type-id='11' is meaningless. > > > > Let me clarify again, type '11' is a string that vendor driver would > define (see my previous reply below) it could be "11" or "GRID-M60-0B". > If 2 vendors used same string we can't control that. right? > > > Lets remove 'id' from type id in XML if that is the concern. Supported > types is going to be defined by vendor driver, so let vendor driver > decide what to use for directory name and same should be used in device > xml file, it could be '11' or "GRID M60-0B": > > > my-vgpu > pci__86_00_0 > >
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> My concern is that a type id seems arbitrary but we're specifying that > it be unique. We already have something unique, the name. So why try > to make the type id unique as well? A vendor can accidentally create > their vendor driver so that a given name means something very > specific. On the other hand they need to be extremely deliberate to > coordinate that a type id means a unique thing across all their product > lines. > Let me clarify, type id should be unique in the list of mdev_supported_types. You can't have 2 directories in with same name. >>> >>> Of course, but does that mean it's only unique to the machine I'm >>> currently running on? Let's say I have a Tesla P100 on my system and >>> type-id 11 is named "GRID-M60-0B". At some point in the future I >>> replace the Tesla P100 with a Q1000 (made up). Is type-id 11 on that >>> new card still going to be a "GRID-M60-0B"? If not then we've based >>> our XML on the wrong attribute. If the new device does not support >>> "GRID-M60-0B" then we should generate an error, not simply initialize >>> whatever type-id 11 happens to be on this new card. >>> >> >> If there are 2 M60 in the system then you would find '11' type directory >> in mdev_supported_types of both M60. If you have P100, '11' type would >> not be there in its mdev_supported_types, it will have different types. >> >> For example, if you replace M60 with P100, but XML is not updated. XML >> have type '11'. When libvirt would try to create mdev device, libvirt >> would have to find 'create' file in sysfs in following directory format: >> >> --- mdev_supported_types >> |-- 11 >> | |-- create >> >> but now for P100, '11' directory is not there, so libvirt should throw >> error on not able to find '11' directory. > > This really seems like an accident waiting to happen. What happens > when the user replaces their M60 with an Intel XYZ device that happens > to expose a type 11 mdev class gpu device? How is libvirt supposed to > know that the XML used to refer to a GRID-M60-0B and now it's an > INTEL-IGD-XYZ? Doesn't basing the XML entry on the name and removing > yet another arbitrary requirement that we have some sort of globally > unique type-id database make a lot of sense? The same issue applies > for simple debug-ability, if I'm reviewing the XML for a domain and the > name is the primary index for the mdev device, I know what it is. > Seeing type-id='11' is meaningless. > Let me clarify again, type '11' is a string that vendor driver would define (see my previous reply below) it could be "11" or "GRID-M60-0B". If 2 vendors used same string we can't control that. right? Lets remove 'id' from type id in XML if that is the concern. Supported types is going to be defined by vendor driver, so let vendor driver decide what to use for directory name and same should be used in device xml file, it could be '11' or "GRID M60-0B": my-vgpu pci__86_00_0
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Alex Williamson [mailto:alex.william...@redhat.com] > Sent: Thursday, September 22, 2016 11:02 AM > > On Thu, 22 Sep 2016 02:33:01 + > "Tian, Kevin"wrote: > > > > From: Alex Williamson [mailto:alex.william...@redhat.com] > > > Sent: Wednesday, September 21, 2016 12:37 PM > > > > > > On Wed, 21 Sep 2016 03:56:21 + > > > "Tian, Kevin" wrote: > > > > > > > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > > > > Sent: Tuesday, September 20, 2016 10:22 PM > > > > > >> > > > > > >> 'max_instance': > > > > > >> Read-Only file. Mandatory. > > > > > >> Returns integer. Returns maximum mdev device could be created > > > > > >> at the moment when this file is read. This count would be updated > > > > > >> by > > > > > >> vendor driver. Before creating mdev device of this type, check if > > > > > >> max_instance is > 0. > > > > > > > > > > > > Didn't we discuss this being being something like > > > > > > "available_instances" > > > > > > with a "devices" sub-directory linking current devices of that type? > > > > > > > > > > > > > > > > I'm ok with 'available_instances' as name of this file. > > > > > Yes, mdev device directory will have links to devices of that type. I > > > > > think that is part of mdev core module discussion. I'm trying to list > > > > > sysfs files for libvirt interface here to focus more on libvirt > > > > > interface. Hence didn't add that. I'll make sure to add all such > > > > > details > > > > > in future. > > > > > > > > > > > > > > > > > > Definitely you need provide those details since they also contain > > > > libvirt interface, e.g. 'destroy' which we discussed should be in > > > > mdev directory. > > > > > > $ find /sys/devices -name remove | wc -l > > > 24 > > > $ find /sys/devices -name destroy | wc -l > > > 0 > > > > > > 'remove' is the sysfs deconstructor we should use. > > > > make sense. > > > > > > > > > Another example is class-specific attributes such > > > > as 'resolution', etc. which you listed under 'display class'. All those > > > > attributes should be moved to mdev directory. Only class ID is > > > > required under each type. > > > > > > Can you elaborate on what you're proposing here? If we don't have > > > attributes like 'resolution' under the type-id then we can't describe > > > to the user the features of the mdev before we create it. I don't > > > think anybody wants a 'create it and find out' type of interface. > > > Thanks, > > > > > > > I think such way would be racy. What about two clients creating mdev > > devices simultaneously? How to guarantee their configuration of class > > specific attributes not mutual-impacted since before creation any such > > configuration would be global under that type? > > A simple mutex in the sysfs store attribute to serialize, return write > error if insufficient resources to create additional devices... Easy. > Let's not exaggerate the issue. > > > My feeling is that we'd better keep create simple - just write a UUID > > to "type-id/create". Any class-specific attribute, if we really want to > > support, should be able to be individually configured with required > > resource allocated incrementally on top of basic resources allocated > > at create stage. Then libvirt can set those attributes between create > > and open a mdev device. If this direction is accepted, then naturally > > such attributes should be put under mdev directory. Possibly we > > don't need a class description under type-id. libvirt just checks directly > > whether any known class represented in each mdev directory (vendor > > driver will expose it on demand), and then operate attributes under > > that class. > > It seems like this just moves the hypothetical race to the point where > we're adjusting individual attributes for the mdev device, but now the > problem is worse because there's no simple locking or serialization > that guarantees we can get all the attributes we want. We might get > half the attributes for one device and half the attributes for another > and not be able to complete initialization of either. I think we also > lose a lot of our ability to expose pre-defined configurations to the > user and the vendor's test matrix explodes. We also can't provide any > sort of reasonable estimate of how many devices can be created because > we don't know the resource requirements of each device. I don't like > this idea at all, sorry. Thanks, > Yes, it's a valid concern. Alex, curious about your opinion of supporting some device-agnostic optional attributes. One example is to set QoS parameters (priority, weight, etc.), which may be enabled by vendor driver to allow dynamic adjustment even after a mdev device is created. Do we still put a QoS class under type-id, meaning must-be-configured before creation (not a bad limitation since from SLA p.o.v all QoS attributes should be negotiated down in advance)? Would there be any concern if a type-id
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Thu, 22 Sep 2016 02:33:01 + "Tian, Kevin"wrote: > > From: Alex Williamson [mailto:alex.william...@redhat.com] > > Sent: Wednesday, September 21, 2016 12:37 PM > > > > On Wed, 21 Sep 2016 03:56:21 + > > "Tian, Kevin" wrote: > > > > > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > > > Sent: Tuesday, September 20, 2016 10:22 PM > > > > >> > > > > >> 'max_instance': > > > > >> Read-Only file. Mandatory. > > > > >> Returns integer. Returns maximum mdev device could be created > > > > >> at the moment when this file is read. This count would be updated by > > > > >> vendor driver. Before creating mdev device of this type, check if > > > > >> max_instance is > 0. > > > > > > > > > > Didn't we discuss this being being something like > > > > > "available_instances" > > > > > with a "devices" sub-directory linking current devices of that type? > > > > > > > > > > > > > I'm ok with 'available_instances' as name of this file. > > > > Yes, mdev device directory will have links to devices of that type. I > > > > think that is part of mdev core module discussion. I'm trying to list > > > > sysfs files for libvirt interface here to focus more on libvirt > > > > interface. Hence didn't add that. I'll make sure to add all such details > > > > in future. > > > > > > > > > > > > > > Definitely you need provide those details since they also contain > > > libvirt interface, e.g. 'destroy' which we discussed should be in > > > mdev directory. > > > > $ find /sys/devices -name remove | wc -l > > 24 > > $ find /sys/devices -name destroy | wc -l > > 0 > > > > 'remove' is the sysfs deconstructor we should use. > > make sense. > > > > > > Another example is class-specific attributes such > > > as 'resolution', etc. which you listed under 'display class'. All those > > > attributes should be moved to mdev directory. Only class ID is > > > required under each type. > > > > Can you elaborate on what you're proposing here? If we don't have > > attributes like 'resolution' under the type-id then we can't describe > > to the user the features of the mdev before we create it. I don't > > think anybody wants a 'create it and find out' type of interface. > > Thanks, > > > > I think such way would be racy. What about two clients creating mdev > devices simultaneously? How to guarantee their configuration of class > specific attributes not mutual-impacted since before creation any such > configuration would be global under that type? A simple mutex in the sysfs store attribute to serialize, return write error if insufficient resources to create additional devices... Easy. Let's not exaggerate the issue. > My feeling is that we'd better keep create simple - just write a UUID > to "type-id/create". Any class-specific attribute, if we really want to > support, should be able to be individually configured with required > resource allocated incrementally on top of basic resources allocated > at create stage. Then libvirt can set those attributes between create > and open a mdev device. If this direction is accepted, then naturally > such attributes should be put under mdev directory. Possibly we > don't need a class description under type-id. libvirt just checks directly > whether any known class represented in each mdev directory (vendor > driver will expose it on demand), and then operate attributes under > that class. It seems like this just moves the hypothetical race to the point where we're adjusting individual attributes for the mdev device, but now the problem is worse because there's no simple locking or serialization that guarantees we can get all the attributes we want. We might get half the attributes for one device and half the attributes for another and not be able to complete initialization of either. I think we also lose a lot of our ability to expose pre-defined configurations to the user and the vendor's test matrix explodes. We also can't provide any sort of reasonable estimate of how many devices can be created because we don't know the resource requirements of each device. I don't like this idea at all, sorry. Thanks, Alex
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Tian, Kevin > Sent: Thursday, September 22, 2016 10:33 AM > > > > > > Another example is class-specific attributes such > > > as 'resolution', etc. which you listed under 'display class'. All those > > > attributes should be moved to mdev directory. Only class ID is > > > required under each type. > > > > Can you elaborate on what you're proposing here? If we don't have > > attributes like 'resolution' under the type-id then we can't describe > > to the user the features of the mdev before we create it. I don't > > think anybody wants a 'create it and find out' type of interface. > > Thanks, > > > > I think such way would be racy. What about two clients creating mdev > devices simultaneously? How to guarantee their configuration of class > specific attributes not mutual-impacted since before creation any such > configuration would be global under that type? > > My feeling is that we'd better keep create simple - just write a UUID > to "type-id/create". Any class-specific attribute, if we really want to > support, should be able to be individually configured with required > resource allocated incrementally on top of basic resources allocated > at create stage. Then libvirt can set those attributes between create > and open a mdev device. If this direction is accepted, then naturally > such attributes should be put under mdev directory. Possibly we > don't need a class description under type-id. libvirt just checks directly > whether any known class represented in each mdev directory (vendor > driver will expose it on demand), and then operate attributes under > that class. > Have a better understanding of your concern now. User needs to know and set a list of attributes before requesting to create a new mdev device. From this angle all attributes should be enumerated per type-id. But as I explained above, writing multiple sysfs nodes to create a mdev device is racy. We either need a way to guarantee transactional operations on those nodes, or having type-id directory to only expose supported attributes while programming those attributes has to be through per-mdev directory after mdev device is created... Thanks Kevin
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Alex Williamson [mailto:alex.william...@redhat.com] > Sent: Wednesday, September 21, 2016 12:43 PM > > On Wed, 21 Sep 2016 04:10:53 + > "Tian, Kevin"wrote: > > > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > > Sent: Wednesday, September 21, 2016 12:23 AM > > > > > > > > > On 9/20/2016 8:13 PM, Alex Williamson wrote: > > > > On Tue, 20 Sep 2016 19:51:58 +0530 > > > > Kirti Wankhede wrote: > > > > > > > >> On 9/20/2016 3:06 AM, Alex Williamson wrote: > > > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > > > >>> Kirti Wankhede wrote: > > > >>> > > > Hi libvirt experts, > > > > > > > > > 'create': > > > Write-only file. Mandatory. > > > Accepts string to create mediated device. > > > > > > 'name': > > > Read-Only file. Mandatory. > > > Returns string, the name of that type id. > > > > > > 'fb_length': > > > Read-only file. Mandatory. > > > Returns {K,M,G}, size of framebuffer. > > > >>> > > > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > > > >>> just one user of mediated devices. > > > >>> > > > >> > > > >> As per our discussion in BOF, we would define class and its attributes. > > > >> As I mentioned above these are attributes of "display" class. > > > > > > > > Just as I didn't know here, how does libvirt know the class of a given > > > > type id? > > > > > > > > > > Yes, we need some way to identify the class as Daniel mentioned in his > > > suggestion. Add a file 'class' in mdev_supported_types that would return > > > the class. > > > > > > > > > > 'display' class or 'gpu' class? display represents only part of GPU > > functionalities. I assume you want to provide an unique class ID > > for each type, instead of allowing multiple classes each identifying > > a subset of controllable GPU resources (e.g. 'display', 'render', etc.)... > > Not sure where you're going with this, we're not creating a programming > interface for the device, that exists through the vfio API. I believe > the intention here is simply a user level classification for the > purpose of being able to interpret the attributes provided in a logical > way. I'm not against that purpose. My main intention is that 'display' is not an accurate classification here. If we allow classification in 'display' level, it implies more finer-grained classifications may be further defined upon which vendor specific GPU resources is allowed to be configured, which might be an overkill. Here we just need a general classification like 'gpu' to cover all possible vendor-agnostic attributes beyond display. > > > for unique class ID, I once considered whether it could be directly > > inherited from class of parent device, since typically a vendor driver > > creates mdev devices using the same type as physical device. But later > > I realized one potential value of keep a class field for mdev types, > > e.g. if we want to extend mdev to FPGA which could be programmed > > as different classes of functionalities. :-) > > And how do we generically determine the class of a parent device? We > cannot assume the parent device is PCI. Thanks, > Yes, this is also a good point. Thanks Kevin
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Alex Williamson [mailto:alex.william...@redhat.com] > Sent: Wednesday, September 21, 2016 12:37 PM > > On Wed, 21 Sep 2016 03:56:21 + > "Tian, Kevin"wrote: > > > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > > Sent: Tuesday, September 20, 2016 10:22 PM > > > >> > > > >> 'max_instance': > > > >> Read-Only file. Mandatory. > > > >> Returns integer. Returns maximum mdev device could be created > > > >> at the moment when this file is read. This count would be updated by > > > >> vendor driver. Before creating mdev device of this type, check if > > > >> max_instance is > 0. > > > > > > > > Didn't we discuss this being being something like "available_instances" > > > > with a "devices" sub-directory linking current devices of that type? > > > > > > > > > > I'm ok with 'available_instances' as name of this file. > > > Yes, mdev device directory will have links to devices of that type. I > > > think that is part of mdev core module discussion. I'm trying to list > > > sysfs files for libvirt interface here to focus more on libvirt > > > interface. Hence didn't add that. I'll make sure to add all such details > > > in future. > > > > > > > > > > Definitely you need provide those details since they also contain > > libvirt interface, e.g. 'destroy' which we discussed should be in > > mdev directory. > > $ find /sys/devices -name remove | wc -l > 24 > $ find /sys/devices -name destroy | wc -l > 0 > > 'remove' is the sysfs deconstructor we should use. make sense. > > > Another example is class-specific attributes such > > as 'resolution', etc. which you listed under 'display class'. All those > > attributes should be moved to mdev directory. Only class ID is > > required under each type. > > Can you elaborate on what you're proposing here? If we don't have > attributes like 'resolution' under the type-id then we can't describe > to the user the features of the mdev before we create it. I don't > think anybody wants a 'create it and find out' type of interface. > Thanks, > I think such way would be racy. What about two clients creating mdev devices simultaneously? How to guarantee their configuration of class specific attributes not mutual-impacted since before creation any such configuration would be global under that type? My feeling is that we'd better keep create simple - just write a UUID to "type-id/create". Any class-specific attribute, if we really want to support, should be able to be individually configured with required resource allocated incrementally on top of basic resources allocated at create stage. Then libvirt can set those attributes between create and open a mdev device. If this direction is accepted, then naturally such attributes should be put under mdev directory. Possibly we don't need a class description under type-id. libvirt just checks directly whether any known class represented in each mdev directory (vendor driver will expose it on demand), and then operate attributes under that class. Thanks Kevin
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Thu, 22 Sep 2016 00:04:14 +0530 Kirti Wankhedewrote: > On 9/20/2016 10:20 PM, Alex Williamson wrote: > > On Tue, 20 Sep 2016 21:53:16 +0530 > > Kirti Wankhede wrote: > > > >> On 9/20/2016 8:13 PM, Alex Williamson wrote: > >>> On Tue, 20 Sep 2016 19:51:58 +0530 > >>> Kirti Wankhede wrote: > >>> > On 9/20/2016 3:06 AM, Alex Williamson wrote: > > On Tue, 20 Sep 2016 02:05:52 +0530 > > Kirti Wankhede wrote: > > > >> Hi libvirt experts, > >> > >> > >> 'create': > >> Write-only file. Mandatory. > >> Accepts string to create mediated device. > >> > >> 'name': > >> Read-Only file. Mandatory. > >> Returns string, the name of that type id. > >> > >> 'fb_length': > >> Read-only file. Mandatory. > >> Returns {K,M,G}, size of framebuffer. > > > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > > just one user of mediated devices. > > > > As per our discussion in BOF, we would define class and its attributes. > As I mentioned above these are attributes of "display" class. > >>> > >>> Just as I didn't know here, how does libvirt know the class of a given > >>> type id? > >>> > >> > >> Yes, we need some way to identify the class as Daniel mentioned in his > >> suggestion. Add a file 'class' in mdev_supported_types that would return > >> the class. > >> > >> > > Can all parameters be changed dynamically? > > Parameters here would be extra parameters which are not listed above in > mandatory list. If such parameters are required to set, these parameters > should be set per mdev device before VM is booted. > > > Do parameter changes apply to existing devices or only future devices? > > > > No, it should be applied at the time when mdev device is created or > after mdev device is created but before VM is booted. It will not be > applicable to existing devices. > > > This is a poorly specified > > interface. I'd prefer this be done via module options on the vendor > > driver. > > > > Module option is not suitable here, in that case such parameters would > be applied to all mdev device created for vendor driver and that is not > what we want to. > >>> > >>> Then use attributes on the mdev device itself, this is not an > >>> acceptable interface. > >>> > >> > >> With above example, vGPU device XML would look like: > >> > >> > >> my-vgpu > >> pci__86_00_0 > >> > >> > >>1 > >>'frame_rate_limiter=0' > >> > >> > >> > >> 'type id' is mandatory. > > > > I was under the impression that the vendor supplied "name" was the one > > unique identifier. We're certainly not going to create a registrar to > > hand out type ids to each vendor, so what makes type id unique? > > Type id is unique, 'name' is the name (string) of device that would be > displayed in guest OS for that type of mdev device. > >>> > >>> We were told at the BoF that name was the unique string which would be > >>> consistent and you again indicate below that "GRID-M60-0B" has a fixed > >>> set of attributes, so why do we care what type id is associated with > >>> that name? > >>> > > I have > > a hard time believing that a given vendor can even allocate unique type > > ids for their own devices. Unique type id across vendors is not > > practical. So which attribute are we actually using to identify which > > type of mdev device we need and why would we ever specify additional > > attributes like fb_length? Doesn't the vendor guarantee that "GRID > > M60-0B" has a fixed setup of those attributes? > > > > Specifying attributes here is not our requirement. Yes we have fixed set > of attributes for "GRID-M60-0B" and on. > We are defining the attributed here for "display" class for all other > vendor of gpu can use. > > > >> 'group' is optional. It should be a unique number in the system among > >> all the groups created for mdev devices. Its usage is: > >> - not needed if single vGPU device is being assigned to a domain. > >> - only need to be set if multiple vGPUs need to be assigned to a > >> domain and vendor driver have 'requires_group' file in type id > >> directory. > >> - if type id directory include 'requires_group' and user tries to > >> assign multiple vGPUs to a domain without having field in XML, > >> it will create single vGPU. > > > > We never finished our discussion of how this gets implemented or > > whether a group applies
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 9/20/2016 10:20 PM, Alex Williamson wrote: > On Tue, 20 Sep 2016 21:53:16 +0530 > Kirti Wankhedewrote: > >> On 9/20/2016 8:13 PM, Alex Williamson wrote: >>> On Tue, 20 Sep 2016 19:51:58 +0530 >>> Kirti Wankhede wrote: >>> On 9/20/2016 3:06 AM, Alex Williamson wrote: > On Tue, 20 Sep 2016 02:05:52 +0530 > Kirti Wankhede wrote: > >> Hi libvirt experts, >> >> >> 'create': >> Write-only file. Mandatory. >> Accepts string to create mediated device. >> >> 'name': >> Read-Only file. Mandatory. >> Returns string, the name of that type id. >> >> 'fb_length': >> Read-only file. Mandatory. >> Returns {K,M,G}, size of framebuffer. > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > just one user of mediated devices. > As per our discussion in BOF, we would define class and its attributes. As I mentioned above these are attributes of "display" class. >>> >>> Just as I didn't know here, how does libvirt know the class of a given >>> type id? >>> >> >> Yes, we need some way to identify the class as Daniel mentioned in his >> suggestion. Add a file 'class' in mdev_supported_types that would return >> the class. >> >> > Can all parameters be changed dynamically? Parameters here would be extra parameters which are not listed above in mandatory list. If such parameters are required to set, these parameters should be set per mdev device before VM is booted. > Do parameter changes apply to existing devices or only future devices? > No, it should be applied at the time when mdev device is created or after mdev device is created but before VM is booted. It will not be applicable to existing devices. > This is a poorly specified > interface. I'd prefer this be done via module options on the vendor > driver. > Module option is not suitable here, in that case such parameters would be applied to all mdev device created for vendor driver and that is not what we want to. >>> >>> Then use attributes on the mdev device itself, this is not an >>> acceptable interface. >>> >> >> With above example, vGPU device XML would look like: >> >> >> my-vgpu >> pci__86_00_0 >> >> >>1 >>'frame_rate_limiter=0' >> >> >> >> 'type id' is mandatory. > > I was under the impression that the vendor supplied "name" was the one > unique identifier. We're certainly not going to create a registrar to > hand out type ids to each vendor, so what makes type id unique? Type id is unique, 'name' is the name (string) of device that would be displayed in guest OS for that type of mdev device. >>> >>> We were told at the BoF that name was the unique string which would be >>> consistent and you again indicate below that "GRID-M60-0B" has a fixed >>> set of attributes, so why do we care what type id is associated with >>> that name? >>> > I have > a hard time believing that a given vendor can even allocate unique type > ids for their own devices. Unique type id across vendors is not > practical. So which attribute are we actually using to identify which > type of mdev device we need and why would we ever specify additional > attributes like fb_length? Doesn't the vendor guarantee that "GRID > M60-0B" has a fixed setup of those attributes? > Specifying attributes here is not our requirement. Yes we have fixed set of attributes for "GRID-M60-0B" and on. We are defining the attributed here for "display" class for all other vendor of gpu can use. >> 'group' is optional. It should be a unique number in the system among >> all the groups created for mdev devices. Its usage is: >> - not needed if single vGPU device is being assigned to a domain. >> - only need to be set if multiple vGPUs need to be assigned to a >> domain and vendor driver have 'requires_group' file in type id directory. >> - if type id directory include 'requires_group' and user tries to >> assign multiple vGPUs to a domain without having field in XML, >> it will create single vGPU. > > We never finished our discussion of how this gets implemented or > whether a group applies only to devices from the same vendor, same > device, how heterogeneous groups are handled, etc. > We agreed on above points that I mentioned here and we wish to know what libvirt folks think, right? Since there were no inputs on that thread I thought I should summarize what had been discussed and get libvirt experts thoughts on this. >>>
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, Sep 20, 2016 at 07:21:07PM +0200, Paolo Bonzini wrote: > > > On 20/09/2016 17:14, Daniel P. Berrange wrote: > > Any VM which > > uses the separate namespace is tainted, which means if theres a bug > > report we'll require the reported to remove whatever config caused > > the tainting and then reproduce the problem. > > > > If the vendor specific mdev parameters are things that apps will often > > need to be able to set, then allowing it via a separate namespace is > > just providing a backdoor that everyone needs to use, defeating the > > point of using a separate namespace. > > No, they don't need to be often set, and tainting the domain is a good idea. Basically think of the XML namespace support as "a mechanism for accessing underlying platform features before they are exposed in the main XML, that happens to use the vendor specific data format" *not* as "a mechanism for exposing vendor specific functionality" Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Wed, 21 Sep 2016 04:10:53 + "Tian, Kevin"wrote: > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > Sent: Wednesday, September 21, 2016 12:23 AM > > > > > > On 9/20/2016 8:13 PM, Alex Williamson wrote: > > > On Tue, 20 Sep 2016 19:51:58 +0530 > > > Kirti Wankhede wrote: > > > > > >> On 9/20/2016 3:06 AM, Alex Williamson wrote: > > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > > >>> Kirti Wankhede wrote: > > >>> > > Hi libvirt experts, > > > > > > 'create': > > Write-only file. Mandatory. > > Accepts string to create mediated device. > > > > 'name': > > Read-Only file. Mandatory. > > Returns string, the name of that type id. > > > > 'fb_length': > > Read-only file. Mandatory. > > Returns {K,M,G}, size of framebuffer. > > >>> > > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > > >>> just one user of mediated devices. > > >>> > > >> > > >> As per our discussion in BOF, we would define class and its attributes. > > >> As I mentioned above these are attributes of "display" class. > > > > > > Just as I didn't know here, how does libvirt know the class of a given > > > type id? > > > > > > > Yes, we need some way to identify the class as Daniel mentioned in his > > suggestion. Add a file 'class' in mdev_supported_types that would return > > the class. > > > > > > 'display' class or 'gpu' class? display represents only part of GPU > functionalities. I assume you want to provide an unique class ID > for each type, instead of allowing multiple classes each identifying > a subset of controllable GPU resources (e.g. 'display', 'render', etc.)... Not sure where you're going with this, we're not creating a programming interface for the device, that exists through the vfio API. I believe the intention here is simply a user level classification for the purpose of being able to interpret the attributes provided in a logical way. > for unique class ID, I once considered whether it could be directly > inherited from class of parent device, since typically a vendor driver > creates mdev devices using the same type as physical device. But later > I realized one potential value of keep a class field for mdev types, > e.g. if we want to extend mdev to FPGA which could be programmed > as different classes of functionalities. :-) And how do we generically determine the class of a parent device? We cannot assume the parent device is PCI. Thanks, Alex
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Wed, 21 Sep 2016 03:56:21 + "Tian, Kevin"wrote: > > From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > > Sent: Tuesday, September 20, 2016 10:22 PM > > >> > > >> 'max_instance': > > >> Read-Only file. Mandatory. > > >> Returns integer. Returns maximum mdev device could be created > > >> at the moment when this file is read. This count would be updated by > > >> vendor driver. Before creating mdev device of this type, check if > > >> max_instance is > 0. > > > > > > Didn't we discuss this being being something like "available_instances" > > > with a "devices" sub-directory linking current devices of that type? > > > > > > > I'm ok with 'available_instances' as name of this file. > > Yes, mdev device directory will have links to devices of that type. I > > think that is part of mdev core module discussion. I'm trying to list > > sysfs files for libvirt interface here to focus more on libvirt > > interface. Hence didn't add that. I'll make sure to add all such details > > in future. > > > > > > Definitely you need provide those details since they also contain > libvirt interface, e.g. 'destroy' which we discussed should be in > mdev directory. $ find /sys/devices -name remove | wc -l 24 $ find /sys/devices -name destroy | wc -l 0 'remove' is the sysfs deconstructor we should use. > Another example is class-specific attributes such > as 'resolution', etc. which you listed under 'display class'. All those > attributes should be moved to mdev directory. Only class ID is > required under each type. Can you elaborate on what you're proposing here? If we don't have attributes like 'resolution' under the type-id then we can't describe to the user the features of the mdev before we create it. I don't think anybody wants a 'create it and find out' type of interface. Thanks, Alex
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > Sent: Wednesday, September 21, 2016 12:23 AM > > > On 9/20/2016 8:13 PM, Alex Williamson wrote: > > On Tue, 20 Sep 2016 19:51:58 +0530 > > Kirti Wankhedewrote: > > > >> On 9/20/2016 3:06 AM, Alex Williamson wrote: > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > >>> Kirti Wankhede wrote: > >>> > Hi libvirt experts, > > > 'create': > Write-only file. Mandatory. > Accepts string to create mediated device. > > 'name': > Read-Only file. Mandatory. > Returns string, the name of that type id. > > 'fb_length': > Read-only file. Mandatory. > Returns {K,M,G}, size of framebuffer. > >>> > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > >>> just one user of mediated devices. > >>> > >> > >> As per our discussion in BOF, we would define class and its attributes. > >> As I mentioned above these are attributes of "display" class. > > > > Just as I didn't know here, how does libvirt know the class of a given > > type id? > > > > Yes, we need some way to identify the class as Daniel mentioned in his > suggestion. Add a file 'class' in mdev_supported_types that would return > the class. > > 'display' class or 'gpu' class? display represents only part of GPU functionalities. I assume you want to provide an unique class ID for each type, instead of allowing multiple classes each identifying a subset of controllable GPU resources (e.g. 'display', 'render', etc.)... for unique class ID, I once considered whether it could be directly inherited from class of parent device, since typically a vendor driver creates mdev devices using the same type as physical device. But later I realized one potential value of keep a class field for mdev types, e.g. if we want to extend mdev to FPGA which could be programmed as different classes of functionalities. :-) Thanks Kevin
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > Sent: Tuesday, September 20, 2016 10:22 PM > >> > >> 'max_instance': > >> Read-Only file. Mandatory. > >> Returns integer. Returns maximum mdev device could be created > >> at the moment when this file is read. This count would be updated by > >> vendor driver. Before creating mdev device of this type, check if > >> max_instance is > 0. > > > > Didn't we discuss this being being something like "available_instances" > > with a "devices" sub-directory linking current devices of that type? > > > > I'm ok with 'available_instances' as name of this file. > Yes, mdev device directory will have links to devices of that type. I > think that is part of mdev core module discussion. I'm trying to list > sysfs files for libvirt interface here to focus more on libvirt > interface. Hence didn't add that. I'll make sure to add all such details > in future. > > Definitely you need provide those details since they also contain libvirt interface, e.g. 'destroy' which we discussed should be in mdev directory. Another example is class-specific attributes such as 'resolution', etc. which you listed under 'display class'. All those attributes should be moved to mdev directory. Only class ID is required under each type. Thanks Kevin
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 20/09/2016 17:14, Daniel P. Berrange wrote: > Any VM which > uses the separate namespace is tainted, which means if theres a bug > report we'll require the reported to remove whatever config caused > the tainting and then reproduce the problem. > > If the vendor specific mdev parameters are things that apps will often > need to be able to set, then allowing it via a separate namespace is > just providing a backdoor that everyone needs to use, defeating the > point of using a separate namespace. No, they don't need to be often set, and tainting the domain is a good idea. Paolo
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, 20 Sep 2016 21:53:16 +0530 Kirti Wankhedewrote: > On 9/20/2016 8:13 PM, Alex Williamson wrote: > > On Tue, 20 Sep 2016 19:51:58 +0530 > > Kirti Wankhede wrote: > > > >> On 9/20/2016 3:06 AM, Alex Williamson wrote: > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > >>> Kirti Wankhede wrote: > >>> > Hi libvirt experts, > > > 'create': > Write-only file. Mandatory. > Accepts string to create mediated device. > > 'name': > Read-Only file. Mandatory. > Returns string, the name of that type id. > > 'fb_length': > Read-only file. Mandatory. > Returns {K,M,G}, size of framebuffer. > >>> > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > >>> just one user of mediated devices. > >>> > >> > >> As per our discussion in BOF, we would define class and its attributes. > >> As I mentioned above these are attributes of "display" class. > > > > Just as I didn't know here, how does libvirt know the class of a given > > type id? > > > > Yes, we need some way to identify the class as Daniel mentioned in his > suggestion. Add a file 'class' in mdev_supported_types that would return > the class. > > > >>> Can all parameters be changed dynamically? > >> > >> Parameters here would be extra parameters which are not listed above in > >> mandatory list. If such parameters are required to set, these parameters > >> should be set per mdev device before VM is booted. > >> > >>> Do parameter changes apply to existing devices or only future devices? > >>> > >> > >> No, it should be applied at the time when mdev device is created or > >> after mdev device is created but before VM is booted. It will not be > >> applicable to existing devices. > >> > >>> This is a poorly specified > >>> interface. I'd prefer this be done via module options on the vendor > >>> driver. > >>> > >> > >> Module option is not suitable here, in that case such parameters would > >> be applied to all mdev device created for vendor driver and that is not > >> what we want to. > > > > Then use attributes on the mdev device itself, this is not an > > acceptable interface. > > > > With above example, vGPU device XML would look like: > > > my-vgpu > pci__86_00_0 > > > 1 > 'frame_rate_limiter=0' > > > > 'type id' is mandatory. > >>> > >>> I was under the impression that the vendor supplied "name" was the one > >>> unique identifier. We're certainly not going to create a registrar to > >>> hand out type ids to each vendor, so what makes type id unique? > >> > >> Type id is unique, 'name' is the name (string) of device that would be > >> displayed in guest OS for that type of mdev device. > > > > We were told at the BoF that name was the unique string which would be > > consistent and you again indicate below that "GRID-M60-0B" has a fixed > > set of attributes, so why do we care what type id is associated with > > that name? > > > >>> I have > >>> a hard time believing that a given vendor can even allocate unique type > >>> ids for their own devices. Unique type id across vendors is not > >>> practical. So which attribute are we actually using to identify which > >>> type of mdev device we need and why would we ever specify additional > >>> attributes like fb_length? Doesn't the vendor guarantee that "GRID > >>> M60-0B" has a fixed setup of those attributes? > >>> > >> > >> Specifying attributes here is not our requirement. Yes we have fixed set > >> of attributes for "GRID-M60-0B" and on. > >> We are defining the attributed here for "display" class for all other > >> vendor of gpu can use. > >> > >> > 'group' is optional. It should be a unique number in the system among > all the groups created for mdev devices. Its usage is: > - not needed if single vGPU device is being assigned to a domain. > - only need to be set if multiple vGPUs need to be assigned to a > domain and vendor driver have 'requires_group' file in type id directory. > - if type id directory include 'requires_group' and user tries to > assign multiple vGPUs to a domain without having field in XML, > it will create single vGPU. > >>> > >>> We never finished our discussion of how this gets implemented or > >>> whether a group applies only to devices from the same vendor, same > >>> device, how heterogeneous groups are handled, etc. > >>> > >> > >> We agreed on above points that I mentioned here and we wish to know what > >> libvirt folks think, right? > >> Since there were no inputs on that thread I thought I should summarize > >> what had been discussed and get libvirt experts thoughts on this. > > > > No, that was just an idea that was never full
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, Sep 20, 2016 at 10:01:18PM +0530, Kirti Wankhede wrote: > > > On 9/20/2016 8:44 PM, Daniel P. Berrange wrote: > > On Tue, Sep 20, 2016 at 05:05:43PM +0200, Paolo Bonzini wrote: > >> > >> > >> On 20/09/2016 16:58, Daniel P. Berrange wrote: > > As I've said in my earlier reply - libvirt will *NOT* support passing > > arbitrary vendor specific parameters as a blob via the XML. Everything > > that appears in the XML must be *fully* specified and explicitly > > represented in the XML as a distinct attribute or element. > > Are generic key/value attributes (e.g. a element) acceptable? > >>> > >>> Only if libvirt has a known list of valid attribute key names upfront. > >>> We don't want to just blindly expose arbitary vendor specific keys exposed > >>> by the kernel. Libvirt's job is to ensure the XML representation is vendor > >>> portable > >> > > In key/value attributes (taking example from proposed xml file) > > 2560x1600 > > 'Key' (i.e. 'resolution') should be known upfront, not the value, right? NB, resolution is a good example of why we would *not* use the generic $value format in the XML. Instead we'd represent it as as applications should not have to further parse data after extracting it from an XML attribute/element. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, Sep 20, 2016 at 10:12:20PM +0530, Kirti Wankhede wrote: > > > On 9/20/2016 10:06 PM, Daniel P. Berrange wrote: > > On Tue, Sep 20, 2016 at 10:01:18PM +0530, Kirti Wankhede wrote: > >> > >> > >> On 9/20/2016 8:44 PM, Daniel P. Berrange wrote: > >>> On Tue, Sep 20, 2016 at 05:05:43PM +0200, Paolo Bonzini wrote: > > > On 20/09/2016 16:58, Daniel P. Berrange wrote: > >>> As I've said in my earlier reply - libvirt will *NOT* support passing > >>> arbitrary vendor specific parameters as a blob via the XML. Everything > >>> that appears in the XML must be *fully* specified and explicitly > >>> represented in the XML as a distinct attribute or element. > >> > >> Are generic key/value attributes (e.g. a element) > >> acceptable? > > > > Only if libvirt has a known list of valid attribute key names upfront. > > We don't want to just blindly expose arbitary vendor specific keys > > exposed > > by the kernel. Libvirt's job is to ensure the XML representation is > > vendor > > portable > > >> > >> In key/value attributes (taking example from proposed xml file) > >> > >> 2560x1600 > >> > >> 'Key' (i.e. 'resolution') should be known upfront, not the value, right? > > > > Yes, the actual value is not important - only its structured. > > ie, libvirt would check that it is in the format '$WIDTHx$HEIGHT' > > and reject it if not. > > > > In this particular example, libvirt checks if its integer? or value > could be 2560x1600 or 4096x4096 both are valid? > > Does libvirt accept string value? Err, as I said, we'd validate that its in the format '$WIDTHx$HEIGHT' Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 9/20/2016 10:06 PM, Daniel P. Berrange wrote: > On Tue, Sep 20, 2016 at 10:01:18PM +0530, Kirti Wankhede wrote: >> >> >> On 9/20/2016 8:44 PM, Daniel P. Berrange wrote: >>> On Tue, Sep 20, 2016 at 05:05:43PM +0200, Paolo Bonzini wrote: On 20/09/2016 16:58, Daniel P. Berrange wrote: >>> As I've said in my earlier reply - libvirt will *NOT* support passing >>> arbitrary vendor specific parameters as a blob via the XML. Everything >>> that appears in the XML must be *fully* specified and explicitly >>> represented in the XML as a distinct attribute or element. >> >> Are generic key/value attributes (e.g. a element) acceptable? > > Only if libvirt has a known list of valid attribute key names upfront. > We don't want to just blindly expose arbitary vendor specific keys exposed > by the kernel. Libvirt's job is to ensure the XML representation is vendor > portable >> >> In key/value attributes (taking example from proposed xml file) >> >> 2560x1600 >> >> 'Key' (i.e. 'resolution') should be known upfront, not the value, right? > > Yes, the actual value is not important - only its structured. > ie, libvirt would check that it is in the format '$WIDTHx$HEIGHT' > and reject it if not. > In this particular example, libvirt checks if its integer? or value could be 2560x1600 or 4096x4096 both are valid? Does libvirt accept string value? Kirti
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, Sep 20, 2016 at 10:01:18PM +0530, Kirti Wankhede wrote: > > > On 9/20/2016 8:44 PM, Daniel P. Berrange wrote: > > On Tue, Sep 20, 2016 at 05:05:43PM +0200, Paolo Bonzini wrote: > >> > >> > >> On 20/09/2016 16:58, Daniel P. Berrange wrote: > > As I've said in my earlier reply - libvirt will *NOT* support passing > > arbitrary vendor specific parameters as a blob via the XML. Everything > > that appears in the XML must be *fully* specified and explicitly > > represented in the XML as a distinct attribute or element. > > Are generic key/value attributes (e.g. a element) acceptable? > >>> > >>> Only if libvirt has a known list of valid attribute key names upfront. > >>> We don't want to just blindly expose arbitary vendor specific keys exposed > >>> by the kernel. Libvirt's job is to ensure the XML representation is vendor > >>> portable > >> > > In key/value attributes (taking example from proposed xml file) > > 2560x1600 > > 'Key' (i.e. 'resolution') should be known upfront, not the value, right? Yes, the actual value is not important - only its structured. ie, libvirt would check that it is in the format '$WIDTHx$HEIGHT' and reject it if not. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 9/20/2016 8:44 PM, Daniel P. Berrange wrote: > On Tue, Sep 20, 2016 at 05:05:43PM +0200, Paolo Bonzini wrote: >> >> >> On 20/09/2016 16:58, Daniel P. Berrange wrote: > As I've said in my earlier reply - libvirt will *NOT* support passing > arbitrary vendor specific parameters as a blob via the XML. Everything > that appears in the XML must be *fully* specified and explicitly > represented in the XML as a distinct attribute or element. Are generic key/value attributes (e.g. a element) acceptable? >>> >>> Only if libvirt has a known list of valid attribute key names upfront. >>> We don't want to just blindly expose arbitary vendor specific keys exposed >>> by the kernel. Libvirt's job is to ensure the XML representation is vendor >>> portable >> In key/value attributes (taking example from proposed xml file) 2560x1600 'Key' (i.e. 'resolution') should be known upfront, not the value, right? Kirti >> Ok, then I guess vendor-specific mdev parameters are out completely. Or >> could they be added under a separate namespace, like QEMU passthrough? > > The QEMU XML namespace that allows passthrough is intended either as a > short term hack to let people experiment with new QEMU features, before > they get explicitly represented in the main XML namespace, or in a few > cases, for things we never want to support officially. Any VM which > uses the separate namespace is tainted, which means if theres a bug > report we'll require the reported to remove whatever config caused > the tainting and then reproduce the problem. > > If the vendor specific mdev parameters are things that apps will often > need to be able to set, then allowing it via a separate namespace is > just providing a backdoor that everyone needs to use, defeating the > point of using a separate namespace. > > We don't want to knowingly design a system which defacto requires > the app to use the separate namespace most/all of the time. > > > Regards, > Daniel >
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 9/20/2016 8:13 PM, Alex Williamson wrote: > On Tue, 20 Sep 2016 19:51:58 +0530 > Kirti Wankhedewrote: > >> On 9/20/2016 3:06 AM, Alex Williamson wrote: >>> On Tue, 20 Sep 2016 02:05:52 +0530 >>> Kirti Wankhede wrote: >>> Hi libvirt experts, 'create': Write-only file. Mandatory. Accepts string to create mediated device. 'name': Read-Only file. Mandatory. Returns string, the name of that type id. 'fb_length': Read-only file. Mandatory. Returns {K,M,G}, size of framebuffer. >>> >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are >>> just one user of mediated devices. >>> >> >> As per our discussion in BOF, we would define class and its attributes. >> As I mentioned above these are attributes of "display" class. > > Just as I didn't know here, how does libvirt know the class of a given > type id? > Yes, we need some way to identify the class as Daniel mentioned in his suggestion. Add a file 'class' in mdev_supported_types that would return the class. >>> Can all parameters be changed dynamically? >> >> Parameters here would be extra parameters which are not listed above in >> mandatory list. If such parameters are required to set, these parameters >> should be set per mdev device before VM is booted. >> >>> Do parameter changes apply to existing devices or only future devices? >> >> No, it should be applied at the time when mdev device is created or >> after mdev device is created but before VM is booted. It will not be >> applicable to existing devices. >> >>> This is a poorly specified >>> interface. I'd prefer this be done via module options on the vendor >>> driver. >>> >> >> Module option is not suitable here, in that case such parameters would >> be applied to all mdev device created for vendor driver and that is not >> what we want to. > > Then use attributes on the mdev device itself, this is not an > acceptable interface. > With above example, vGPU device XML would look like: my-vgpu pci__86_00_0 1 'frame_rate_limiter=0' 'type id' is mandatory. >>> >>> I was under the impression that the vendor supplied "name" was the one >>> unique identifier. We're certainly not going to create a registrar to >>> hand out type ids to each vendor, so what makes type id unique? >> >> Type id is unique, 'name' is the name (string) of device that would be >> displayed in guest OS for that type of mdev device. > > We were told at the BoF that name was the unique string which would be > consistent and you again indicate below that "GRID-M60-0B" has a fixed > set of attributes, so why do we care what type id is associated with > that name? > >>> I have >>> a hard time believing that a given vendor can even allocate unique type >>> ids for their own devices. Unique type id across vendors is not >>> practical. So which attribute are we actually using to identify which >>> type of mdev device we need and why would we ever specify additional >>> attributes like fb_length? Doesn't the vendor guarantee that "GRID >>> M60-0B" has a fixed setup of those attributes? >>> >> >> Specifying attributes here is not our requirement. Yes we have fixed set >> of attributes for "GRID-M60-0B" and on. >> We are defining the attributed here for "display" class for all other >> vendor of gpu can use. >> >> 'group' is optional. It should be a unique number in the system among all the groups created for mdev devices. Its usage is: - not needed if single vGPU device is being assigned to a domain. - only need to be set if multiple vGPUs need to be assigned to a domain and vendor driver have 'requires_group' file in type id directory. - if type id directory include 'requires_group' and user tries to assign multiple vGPUs to a domain without having field in XML, it will create single vGPU. >>> >>> We never finished our discussion of how this gets implemented or >>> whether a group applies only to devices from the same vendor, same >>> device, how heterogeneous groups are handled, etc. >>> >> >> We agreed on above points that I mentioned here and we wish to know what >> libvirt folks think, right? >> Since there were no inputs on that thread I thought I should summarize >> what had been discussed and get libvirt experts thoughts on this. > > No, that was just an idea that was never full defined. We had that > idea, the IOMMU group idea, the invalid container idea... This is > still an open question in the design. > We do want to close down on the design as soon as possible. I had tried to address open questions in earlier discussion also. Let me address your concerns you have: > whether a group applies only to devices from the same vendor, Yes. > same device,
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, Sep 20, 2016 at 08:05:01PM +0530, Kirti Wankhede wrote: > > > On 9/20/2016 3:55 AM, Alex Williamson wrote: > > On Mon, 19 Sep 2016 23:50:56 +0200 > > Paolo Bonziniwrote: > > > >> On 19/09/2016 23:36, Alex Williamson wrote: > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > >>> Kirti Wankhede wrote: > 'fb_length': > Read-only file. Mandatory. > Returns {K,M,G}, size of framebuffer. > >>> > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > >>> just one user of mediated devices. [...] > >>> > 'params' > Write-Only file. Optional. > String input. Libvirt would pass the string given in XML file to > this file and then create mdev device. Set empty string to clear params. > For example, set parameter 'frame_rate_limiter=0' to disable frame rate > limiter for performance benchmarking, then create device of type 11. The > device created would have that parameter set by vendor driver. > >>> > >>> Might as well just call it "magic", there's absolutely no ability to > >>> introspect what parameters are allowed or even valid here. Can all > >>> parameters be changed dynamically? Do parameter changes apply to > >>> existing devices or only future devices? This is a poorly specified > >>> interface. I'd prefer this be done via module options on the vendor > >>> driver. > >> > >> Or additional files in the mdev's sysfs directory? > > > > Sure, the vendor driver could certainly support that, it'd be vendor > > specific of course. > > > > Right, it would be vendor specific. But if this is not set through mdev > device XML, user have to create mdev device with XML first and then > manually set such parameter. Also these parameters would not retain if > mdev device is destroyed and created again. That's why if libvirt sets > these parameters by reading it from XML file, it would be simple for user. > > If we move 'params' file to mdev device, could libvirt do the following > to create mdev device: > > * Read /sys/../\:86\:00.0/11/max_instances. If it is greater than 0, > then only proceed else fail. > > * Autogenerate UUID > * Create device: > > echo "$UUID" > /sys/../\:86\:00.0/11/create > > * Set extra params if 'params' field exist in device XML and 'params' > file exist mdev device directory: > > echo "frame_rate_limiter=0" > /sys/bus/mdev/devices/$UUID/params As I've said in my earlier reply - libvirt will *NOT* support passing arbitrary vendor specific parameters as a blob via the XML. Everything that appears in the XML must be *fully* specified and explicitly represented in the XML as a distinct attribute or element. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, Sep 20, 2016 at 04:49:09PM +0200, Paolo Bonzini wrote: > > > On 20/09/2016 16:41, Daniel P. Berrange wrote: > > As I've said in my earlier reply - libvirt will *NOT* support passing > > arbitrary vendor specific parameters as a blob via the XML. Everything > > that appears in the XML must be *fully* specified and explicitly > > represented in the XML as a distinct attribute or element. > > Are generic key/value attributes (e.g. a element) acceptable? Only if libvirt has a known list of valid attribute key names upfront. We don't want to just blindly expose arbitary vendor specific keys exposed by the kernel. Libvirt's job is to ensure the XML representation is vendor portable and stable across software versions. If we just blindly expose the strings reported by the host, then the XML will change if the vendor arbitararily renames an attribute in a software update, or if two vendors have the same concept there's no guaranteed name between them. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, Sep 20, 2016 at 05:05:43PM +0200, Paolo Bonzini wrote: > > > On 20/09/2016 16:58, Daniel P. Berrange wrote: > > > > As I've said in my earlier reply - libvirt will *NOT* support passing > > > > arbitrary vendor specific parameters as a blob via the XML. Everything > > > > that appears in the XML must be *fully* specified and explicitly > > > > represented in the XML as a distinct attribute or element. > > > > > > Are generic key/value attributes (e.g. a element) acceptable? > > > > Only if libvirt has a known list of valid attribute key names upfront. > > We don't want to just blindly expose arbitary vendor specific keys exposed > > by the kernel. Libvirt's job is to ensure the XML representation is vendor > > portable > > Ok, then I guess vendor-specific mdev parameters are out completely. Or > could they be added under a separate namespace, like QEMU passthrough? The QEMU XML namespace that allows passthrough is intended either as a short term hack to let people experiment with new QEMU features, before they get explicitly represented in the main XML namespace, or in a few cases, for things we never want to support officially. Any VM which uses the separate namespace is tainted, which means if theres a bug report we'll require the reported to remove whatever config caused the tainting and then reproduce the problem. If the vendor specific mdev parameters are things that apps will often need to be able to set, then allowing it via a separate namespace is just providing a backdoor that everyone needs to use, defeating the point of using a separate namespace. We don't want to knowingly design a system which defacto requires the app to use the separate namespace most/all of the time. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 20/09/2016 16:41, Daniel P. Berrange wrote: > As I've said in my earlier reply - libvirt will *NOT* support passing > arbitrary vendor specific parameters as a blob via the XML. Everything > that appears in the XML must be *fully* specified and explicitly > represented in the XML as a distinct attribute or element. Are generic key/value attributes (e.g. a element) acceptable? Paolo
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 20/09/2016 16:58, Daniel P. Berrange wrote: > > > As I've said in my earlier reply - libvirt will *NOT* support passing > > > arbitrary vendor specific parameters as a blob via the XML. Everything > > > that appears in the XML must be *fully* specified and explicitly > > > represented in the XML as a distinct attribute or element. > > > > Are generic key/value attributes (e.g. a element) acceptable? > > Only if libvirt has a known list of valid attribute key names upfront. > We don't want to just blindly expose arbitary vendor specific keys exposed > by the kernel. Libvirt's job is to ensure the XML representation is vendor > portable Ok, then I guess vendor-specific mdev parameters are out completely. Or could they be added under a separate namespace, like QEMU passthrough? Paolo
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 9/20/2016 3:06 AM, Alex Williamson wrote: > On Tue, 20 Sep 2016 02:05:52 +0530 > Kirti Wankhedewrote: > >> Hi libvirt experts, >> >> Thanks for valuable input on v1 version of RFC. >> >> Quick brief, VFIO based mediated device framework provides a way to >> virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT >> and IBM's channel IO. This framework reuses VFIO APIs for all the >> functionalities for mediated devices which are currently being used for >> pass through devices. This framework introduces a set of new sysfs files >> for device creation and its life cycle management. >> >> Here is the summary of discussion on v1: >> 1. Discover mediated device: >> As part of physical device initialization process, vendor driver will >> register their physical devices, which will be used to create virtual >> device (mediated device, aka mdev) to the mediated framework. >> >> Vendor driver should specify mdev_supported_types in directory format. >> This format is class based, for example, display class directory format >> should be as below. We need to define such set for each class of devices >> which would be supported by mediated device framework. >> >> --- mdev_destroy > > I thought we replaced mdev_destroy with a remove attribute for each > mdev device. > >> --- mdev_supported_types >> |-- 11 >> | |-- create >> | |-- name >> | |-- fb_length >> | |-- resolution >> | |-- heads >> | |-- max_instances >> | |-- params >> | |-- requires_group >> |-- 12 >> | |-- create >> | |-- name >> | |-- fb_length >> | |-- resolution >> | |-- heads >> | |-- max_instances >> | |-- params >> | |-- requires_group >> |-- 13 >> |-- create >> |-- name >> |-- fb_length >> |-- resolution >> |-- heads >> |-- max_instances >> |-- params >> |-- requires_group >> >> >> In the above example directory '11' represents a type id of mdev device. >> 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and >> 'requires_group' would be Read-Only files that vendor would provide to >> describe about that type. >> >> 'create': >> Write-only file. Mandatory. >> Accepts string to create mediated device. >> >> 'name': >> Read-Only file. Mandatory. >> Returns string, the name of that type id. >> >> 'fb_length': >> Read-only file. Mandatory. >> Returns {K,M,G}, size of framebuffer. > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > just one user of mediated devices. > As per our discussion in BOF, we would define class and its attributes. As I mentioned above these are attributes of "display" class. >> >> 'resolution': >> Read-Only file. Mandatory. >> Returns 'hres x vres' format. Maximum supported resolution. > > Same. > >> >> 'heads': >> Read-Only file. Mandatory. >> Returns integer. Number of maximum heads supported. > > Same. > >> >> 'max_instance': >> Read-Only file. Mandatory. >> Returns integer. Returns maximum mdev device could be created >> at the moment when this file is read. This count would be updated by >> vendor driver. Before creating mdev device of this type, check if >> max_instance is > 0. > > Didn't we discuss this being being something like "available_instances" > with a "devices" sub-directory linking current devices of that type? > I'm ok with 'available_instances' as name of this file. Yes, mdev device directory will have links to devices of that type. I think that is part of mdev core module discussion. I'm trying to list sysfs files for libvirt interface here to focus more on libvirt interface. Hence didn't add that. I'll make sure to add all such details in future. >> 'params' >> Write-Only file. Optional. >> String input. Libvirt would pass the string given in XML file to >> this file and then create mdev device. Set empty string to clear params. >> For example, set parameter 'frame_rate_limiter=0' to disable frame rate >> limiter for performance benchmarking, then create device of type 11. The >> device created would have that parameter set by vendor driver. > > Might as well just call it "magic", there's absolutely no ability to > introspect what parameters are allowed or even valid here. Libvirt don't need to introspect these parameters. These are meant for vendor driver. Let vendor driver interpret these parameters and take action based on that. > Can all parameters be changed dynamically? Parameters here would be extra parameters which are not listed above in mandatory list. If such parameters are required to set, these parameters should be set per mdev device before VM is booted. > Do parameter changes apply to existing devices or only future devices? No, it should be applied at the time when mdev device is created or after mdev device is created but before VM is
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 9/20/2016 3:55 AM, Alex Williamson wrote: > On Mon, 19 Sep 2016 23:50:56 +0200 > Paolo Bonziniwrote: > >> On 19/09/2016 23:36, Alex Williamson wrote: >>> On Tue, 20 Sep 2016 02:05:52 +0530 >>> Kirti Wankhede wrote: 'fb_length': Read-only file. Mandatory. Returns {K,M,G}, size of framebuffer. >>> >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are >>> just one user of mediated devices. [...] >>> 'params' Write-Only file. Optional. String input. Libvirt would pass the string given in XML file to this file and then create mdev device. Set empty string to clear params. For example, set parameter 'frame_rate_limiter=0' to disable frame rate limiter for performance benchmarking, then create device of type 11. The device created would have that parameter set by vendor driver. >>> >>> Might as well just call it "magic", there's absolutely no ability to >>> introspect what parameters are allowed or even valid here. Can all >>> parameters be changed dynamically? Do parameter changes apply to >>> existing devices or only future devices? This is a poorly specified >>> interface. I'd prefer this be done via module options on the vendor >>> driver. >> >> Or additional files in the mdev's sysfs directory? > > Sure, the vendor driver could certainly support that, it'd be vendor > specific of course. > Right, it would be vendor specific. But if this is not set through mdev device XML, user have to create mdev device with XML first and then manually set such parameter. Also these parameters would not retain if mdev device is destroyed and created again. That's why if libvirt sets these parameters by reading it from XML file, it would be simple for user. If we move 'params' file to mdev device, could libvirt do the following to create mdev device: * Read /sys/../\:86\:00.0/11/max_instances. If it is greater than 0, then only proceed else fail. * Autogenerate UUID * Create device: echo "$UUID" > /sys/../\:86\:00.0/11/create * Set extra params if 'params' field exist in device XML and 'params' file exist mdev device directory: echo "frame_rate_limiter=0" > /sys/bus/mdev/devices/$UUID/params Kirti
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, 20 Sep 2016 20:05:01 +0530 Kirti Wankhedewrote: > On 9/20/2016 3:55 AM, Alex Williamson wrote: > > On Mon, 19 Sep 2016 23:50:56 +0200 > > Paolo Bonzini wrote: > > > >> On 19/09/2016 23:36, Alex Williamson wrote: > >>> On Tue, 20 Sep 2016 02:05:52 +0530 > >>> Kirti Wankhede wrote: > 'fb_length': > Read-only file. Mandatory. > Returns {K,M,G}, size of framebuffer. > >>> > >>> This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > >>> just one user of mediated devices. [...] > >>> > 'params' > Write-Only file. Optional. > String input. Libvirt would pass the string given in XML file to > this file and then create mdev device. Set empty string to clear params. > For example, set parameter 'frame_rate_limiter=0' to disable frame rate > limiter for performance benchmarking, then create device of type 11. The > device created would have that parameter set by vendor driver. > >>> > >>> Might as well just call it "magic", there's absolutely no ability to > >>> introspect what parameters are allowed or even valid here. Can all > >>> parameters be changed dynamically? Do parameter changes apply to > >>> existing devices or only future devices? This is a poorly specified > >>> interface. I'd prefer this be done via module options on the vendor > >>> driver. > >> > >> Or additional files in the mdev's sysfs directory? > > > > Sure, the vendor driver could certainly support that, it'd be vendor > > specific of course. > > > > Right, it would be vendor specific. But if this is not set through mdev > device XML, user have to create mdev device with XML first and then > manually set such parameter. Also these parameters would not retain if > mdev device is destroyed and created again. That's why if libvirt sets > these parameters by reading it from XML file, it would be simple for user. > > If we move 'params' file to mdev device, could libvirt do the following > to create mdev device: > > * Read /sys/../\:86\:00.0/11/max_instances. If it is greater than 0, > then only proceed else fail. > > * Autogenerate UUID > * Create device: > > echo "$UUID" > /sys/../\:86\:00.0/11/create > > * Set extra params if 'params' field exist in device XML and 'params' > file exist mdev device directory: > > echo "frame_rate_limiter=0" > /sys/bus/mdev/devices/$UUID/params Why does libvirt even need to participate in setting these additional parameters? The example here is for benchmarking, which doesn't express to me a strong need to be automated through libvirt. For hostdev devices libvirt supports a managed='no' option that requires the user to setup the device in advance. Something similar sounds reasonable here too. This is also where I see that a vendor module option could be used, something like: modprobe nvidia enable-no-frl-mdev You say you don't want to apply this to all mdev devices, but maybe it just makes an additional type-id available "GRID-M60-0B-NOFRL". Then you've removed libvirt from the equation, the VM simply specifies this type-id that incorporates that feature. Obviously it gets complicated if there are multiple features to enable, but we've only seen one example so far of why this parameter is needed. Thanks, Alex
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, 20 Sep 2016 19:51:58 +0530 Kirti Wankhedewrote: > On 9/20/2016 3:06 AM, Alex Williamson wrote: > > On Tue, 20 Sep 2016 02:05:52 +0530 > > Kirti Wankhede wrote: > > > >> Hi libvirt experts, > >> > >> Thanks for valuable input on v1 version of RFC. > >> > >> Quick brief, VFIO based mediated device framework provides a way to > >> virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > >> and IBM's channel IO. This framework reuses VFIO APIs for all the > >> functionalities for mediated devices which are currently being used for > >> pass through devices. This framework introduces a set of new sysfs files > >> for device creation and its life cycle management. > >> > >> Here is the summary of discussion on v1: > >> 1. Discover mediated device: > >> As part of physical device initialization process, vendor driver will > >> register their physical devices, which will be used to create virtual > >> device (mediated device, aka mdev) to the mediated framework. > >> > >> Vendor driver should specify mdev_supported_types in directory format. > >> This format is class based, for example, display class directory format > >> should be as below. We need to define such set for each class of devices > >> which would be supported by mediated device framework. > >> > >> --- mdev_destroy > > > > I thought we replaced mdev_destroy with a remove attribute for each > > mdev device. > > > >> --- mdev_supported_types > >> |-- 11 > >> | |-- create > >> | |-- name > >> | |-- fb_length > >> | |-- resolution > >> | |-- heads > >> | |-- max_instances > >> | |-- params > >> | |-- requires_group > >> |-- 12 > >> | |-- create > >> | |-- name > >> | |-- fb_length > >> | |-- resolution > >> | |-- heads > >> | |-- max_instances > >> | |-- params > >> | |-- requires_group > >> |-- 13 > >> |-- create > >> |-- name > >> |-- fb_length > >> |-- resolution > >> |-- heads > >> |-- max_instances > >> |-- params > >> |-- requires_group > >> > >> > >> In the above example directory '11' represents a type id of mdev device. > >> 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > >> 'requires_group' would be Read-Only files that vendor would provide to > >> describe about that type. > >> > >> 'create': > >> Write-only file. Mandatory. > >> Accepts string to create mediated device. > >> > >> 'name': > >> Read-Only file. Mandatory. > >> Returns string, the name of that type id. > >> > >> 'fb_length': > >> Read-only file. Mandatory. > >> Returns {K,M,G}, size of framebuffer. > > > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > > just one user of mediated devices. > > > > As per our discussion in BOF, we would define class and its attributes. > As I mentioned above these are attributes of "display" class. Just as I didn't know here, how does libvirt know the class of a given type id? > >> > >> 'resolution': > >> Read-Only file. Mandatory. > >> Returns 'hres x vres' format. Maximum supported resolution. > > > > Same. > > > >> > >> 'heads': > >> Read-Only file. Mandatory. > >> Returns integer. Number of maximum heads supported. > > > > Same. > > > >> > >> 'max_instance': > >> Read-Only file. Mandatory. > >> Returns integer. Returns maximum mdev device could be created > >> at the moment when this file is read. This count would be updated by > >> vendor driver. Before creating mdev device of this type, check if > >> max_instance is > 0. > > > > Didn't we discuss this being being something like "available_instances" > > with a "devices" sub-directory linking current devices of that type? > > > > I'm ok with 'available_instances' as name of this file. > Yes, mdev device directory will have links to devices of that type. I > think that is part of mdev core module discussion. I'm trying to list > sysfs files for libvirt interface here to focus more on libvirt > interface. Hence didn't add that. I'll make sure to add all such details > in future. > > > >> 'params' > >> Write-Only file. Optional. > >> String input. Libvirt would pass the string given in XML file to > >> this file and then create mdev device. Set empty string to clear params. > >> For example, set parameter 'frame_rate_limiter=0' to disable frame rate > >> limiter for performance benchmarking, then create device of type 11. The > >> device created would have that parameter set by vendor driver. > > > > Might as well just call it "magic", there's absolutely no ability to > > introspect what parameters are allowed or even valid here. > > Libvirt don't need to introspect these parameters. These are meant for > vendor driver. Let vendor driver interpret these parameters and take > action
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, Sep 20, 2016 at 02:05:52AM +0530, Kirti Wankhede wrote: > > Hi libvirt experts, > > Thanks for valuable input on v1 version of RFC. > > Quick brief, VFIO based mediated device framework provides a way to > virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > and IBM's channel IO. This framework reuses VFIO APIs for all the > functionalities for mediated devices which are currently being used for > pass through devices. This framework introduces a set of new sysfs files > for device creation and its life cycle management. > > Here is the summary of discussion on v1: > 1. Discover mediated device: > As part of physical device initialization process, vendor driver will > register their physical devices, which will be used to create virtual > device (mediated device, aka mdev) to the mediated framework. > > Vendor driver should specify mdev_supported_types in directory format. > This format is class based, for example, display class directory format > should be as below. We need to define such set for each class of devices > which would be supported by mediated device framework. > > --- mdev_destroy > --- mdev_supported_types > |-- 11 > | |-- create > | |-- name > | |-- fb_length > | |-- resolution > | |-- heads > | |-- max_instances > | |-- params > | |-- requires_group > |-- 12 > | |-- create > | |-- name > | |-- fb_length > | |-- resolution > | |-- heads > | |-- max_instances > | |-- params > | |-- requires_group > |-- 13 > |-- create > |-- name > |-- fb_length > |-- resolution > |-- heads > |-- max_instances > |-- params > |-- requires_group > > > In the above example directory '11' represents a type id of mdev device. > 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > 'requires_group' would be Read-Only files that vendor would provide to > describe about that type. > > 'create': > Write-only file. Mandatory. > Accepts string to create mediated device. > > 'name': > Read-Only file. Mandatory. > Returns string, the name of that type id. Presumably this is a human-targetted title/description of the device. > > 'fb_length': > Read-only file. Mandatory. > Returns {K,M,G}, size of framebuffer. > > 'resolution': > Read-Only file. Mandatory. > Returns 'hres x vres' format. Maximum supported resolution. > > 'heads': > Read-Only file. Mandatory. > Returns integer. Number of maximum heads supported. None of these should be mandatory as that makes the mdev useless for non-GPU devices. I'd expect to see a 'class' or 'type' attribute in the directory whcih tells you what kind of mdev it is. A valid 'class' value would be 'gpu'. The fb_length, resolution, and heads parameters would only be mandatory when class==gpu. > 'max_instance': > Read-Only file. Mandatory. > Returns integer. Returns maximum mdev device could be created > at the moment when this file is read. This count would be updated by > vendor driver. Before creating mdev device of this type, check if > max_instance is > 0. > > 'params' > Write-Only file. Optional. > String input. Libvirt would pass the string given in XML file to > this file and then create mdev device. Set empty string to clear params. > For example, set parameter 'frame_rate_limiter=0' to disable frame rate > limiter for performance benchmarking, then create device of type 11. The > device created would have that parameter set by vendor driver. Nope, libvirt will explicitly *NEVER* allow arbitrary opaque passthrough of vendor specific data in this way. > The parent device would look like: > > > pci__86_00_0 > >0 >134 >0 >0 > > > > >GRID M60-0B >512M >2560x1600 >2 >16 >1 > There would need to be a element, eg gpu We would then have further elements based on the class. eg GRID M60-0B 512M 2560x1600 2 16 1 > >GRID M60 >NVIDIA > > > > 2. Create/destroy mediated device > > With above example, vGPU device XML would look like: > > > my-vgpu > pci__86_00_0 > > >1 >'frame_rate_limiter=0' No, we will not support in this manner in libvirt. The entire purpose of libvirt is to represent data in a vendor agnostic manner and not do abitrary passthrough of vendor specific data. Simply saying this field is optional does not get around that either. > > > > 'type id' is mandatory. > 'group' is optional. It should be a unique number in the system among > all the groups created for mdev devices. Its usage is: >
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Kirti Wankhede [mailto:kwankh...@nvidia.com] > Sent: Tuesday, September 20, 2016 4:36 AM > > > Hi libvirt experts, > > Thanks for valuable input on v1 version of RFC. > > Quick brief, VFIO based mediated device framework provides a way to > virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > and IBM's channel IO. This framework reuses VFIO APIs for all the > functionalities for mediated devices which are currently being used for > pass through devices. This framework introduces a set of new sysfs files > for device creation and its life cycle management. > > Here is the summary of discussion on v1: > 1. Discover mediated device: > As part of physical device initialization process, vendor driver will > register their physical devices, which will be used to create virtual > device (mediated device, aka mdev) to the mediated framework. > > Vendor driver should specify mdev_supported_types in directory format. > This format is class based, for example, display class directory format > should be as below. We need to define such set for each class of devices > which would be supported by mediated device framework. > > --- mdev_destroy > --- mdev_supported_types > |-- 11 > | |-- create > | |-- name > | |-- fb_length > | |-- resolution > | |-- heads > | |-- max_instances > | |-- params > | |-- requires_group > |-- 12 > | |-- create > | |-- name > | |-- fb_length > | |-- resolution > | |-- heads > | |-- max_instances > | |-- params > | |-- requires_group > |-- 13 > |-- create > |-- name > |-- fb_length > |-- resolution > |-- heads > |-- max_instances > |-- params > |-- requires_group > > Per previous discussion, I'd think below layout is more general and clearer: --- mdev_supported_types |-- ID (or name) | |-- create | |-- available_instances | |-- requires_group |-- ID (or name) | |-- create | |-- available_instances | |-- requires_group ... Then under each mdev directory: |-- type (link to type directory) |-- destroy |-- online |-- reset |-- Destroy Group association may be done under mdev directory too, per Alex's earlier suggestion. Optional vendor-agnostic attributes can be extended in the future, e.g.: |-- priority |-- weight Vendor-specific or class-specific parameters (black string or explicit attributes) could be also extended per-mdev directory, if necessary to support. Thanks Kevin
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
> From: Alex Williamson [mailto:alex.william...@redhat.com] > Sent: Tuesday, September 20, 2016 5:36 AM > > > In the above example directory '11' represents a type id of mdev device. > > 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > > 'requires_group' would be Read-Only files that vendor would provide to > > describe about that type. > > > > 'create': > > Write-only file. Mandatory. > > Accepts string to create mediated device. > > > > 'name': > > Read-Only file. Mandatory. > > Returns string, the name of that type id. > > > > 'fb_length': > > Read-only file. Mandatory. > > Returns {K,M,G}, size of framebuffer. > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > just one user of mediated devices. Agree. at least it doesn't make sense for s390 passthrough usage. > > > > > 'max_instance': > > Read-Only file. Mandatory. > > Returns integer. Returns maximum mdev device could be created > > at the moment when this file is read. This count would be updated by > > vendor driver. Before creating mdev device of this type, check if > > max_instance is > 0. > > Didn't we discuss this being being something like "available_instances" > with a "devices" sub-directory linking current devices of that type? Right. "available_instances" which is dynamically changed under two scenarios: a) create a new mdev under one type decrements "available_instances" of this type by 1; b) create a new mdev under one type decrements "available_instances" under other types by type-specific number, if vendor supports creating multiple types simultaneously like in Intel solution; > > > 'params' > > Write-Only file. Optional. > > String input. Libvirt would pass the string given in XML file to > > this file and then create mdev device. Set empty string to clear params. > > For example, set parameter 'frame_rate_limiter=0' to disable frame rate > > limiter for performance benchmarking, then create device of type 11. The > > device created would have that parameter set by vendor driver. > > Might as well just call it "magic", there's absolutely no ability to > introspect what parameters are allowed or even valid here. Can all > parameters be changed dynamically? Do parameter changes apply to > existing devices or only future devices? This is a poorly specified > interface. I'd prefer this be done via module options on the vendor > driver. I guess it might be mdev specific for nvidia's case, then module option is not suitable. Then this magic string should be placed under mdev directory instead of here, otherwise there'll be race condition. > > > > > 2. Create/destroy mediated device > > > > With above example, vGPU device XML would look like: > > > > > > my-vgpu > > pci__86_00_0 > > > > > >1 > >'frame_rate_limiter=0' > > > > > > > > 'type id' is mandatory. > > I was under the impression that the vendor supplied "name" was the one > unique identifier. We're certainly not going to create a registrar to > hand out type ids to each vendor, so what makes type id unique? I have > a hard time believing that a given vendor can even allocate unique type > ids for their own devices. Unique type id across vendors is not > practical. So which attribute are we actually using to identify which > type of mdev device we need and why would we ever specify additional > attributes like fb_length? Doesn't the vendor guarantee that "GRID > M60-0B" has a fixed setup of those attributes? I'm a bit confused about type ID and type name here. Just keeping one should be enough (ether a number or descriptive name), which is per-device scope, not necessarily to cross vendors. As long as the vendor defines the ID or name unique cross its own generations, looks should be sufficient (I guess cross-vendor migration is not an interest here) > > > 'params' is optional field. User should set this field if extra > > parameters need to be set for a particular vGPU device. Libvirt don't > > need to parse these params. These are meant for vendor driver. > > ie. magic black hole. nope. A curious question here, though it's not a mandatory requirement for Intel's GPU virtualization solution now, but allow such extension might be useful in the long term. Has libvirt ever supported similar black string in other usages? I can think it's a problem for public cloud usage, where any configuration on the managed device should be controlled by orchestration layer (e.g. openstack) automatically. It's not appropriate to allow a cloud tenant specifying such parameter, so in that manner any configuration parameter should be explicitly understood from upper level stack to libvirt then vendor specific parameters would be hard to handle. On the other hand, if we consider enterprise virtualization scenario which is more proprietary, administrator may add such vendor specific parameters as a black string through libvirt, as long
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Mon, 19 Sep 2016 23:50:56 +0200 Paolo Bonziniwrote: > On 19/09/2016 23:36, Alex Williamson wrote: > > On Tue, 20 Sep 2016 02:05:52 +0530 > > Kirti Wankhede wrote: > >> 'fb_length': > >> Read-only file. Mandatory. > >> Returns {K,M,G}, size of framebuffer. > > > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > > just one user of mediated devices. [...] > > > >> 'params' > >> Write-Only file. Optional. > >> String input. Libvirt would pass the string given in XML file to > >> this file and then create mdev device. Set empty string to clear params. > >> For example, set parameter 'frame_rate_limiter=0' to disable frame rate > >> limiter for performance benchmarking, then create device of type 11. The > >> device created would have that parameter set by vendor driver. > > > > Might as well just call it "magic", there's absolutely no ability to > > introspect what parameters are allowed or even valid here. Can all > > parameters be changed dynamically? Do parameter changes apply to > > existing devices or only future devices? This is a poorly specified > > interface. I'd prefer this be done via module options on the vendor > > driver. > > Or additional files in the mdev's sysfs directory? Sure, the vendor driver could certainly support that, it'd be vendor specific of course. > >> The parent device would look like: > >> > >> > >> pci__86_00_0 > >> > >>0 > >>134 > >>0 > >>0 > > > > This is the parent device address? > > (Since this is taken from an email of mine, this is libvirt's XML > interpretation of the udev database and the contents of sysfs. Try > "virsh nodedev-dumpxml pci__00_02_0" for an example that is > unrelated to mdev). > > >> 2. Create/destroy mediated device > >> > >> With above example, vGPU device XML would look like: > >> > >> > >> my-vgpu > >> pci__86_00_0 > >> > >> > >>1 > >>'frame_rate_limiter=0' > >> > >> > >> > >> 'type id' is mandatory. > > > > I was under the impression that the vendor supplied "name" was the one > > unique identifier. We're certainly not going to create a registrar to > > hand out type ids to each vendor, so what makes type id unique? > > Why does it have to be unique? It's just parroting what libvirt said in > the block of the parent's nodedev-dumpxml. > > (Though I guess going by name could be useful as well. Perhaps both > could be allowed). Do we not want to be able to migrate a VM to another host and have the XML mean something? Off-line or on-line, mdev device may support migration. Thanks, Alex
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On 19/09/2016 23:36, Alex Williamson wrote: > On Tue, 20 Sep 2016 02:05:52 +0530 > Kirti Wankhedewrote: >> 'fb_length': >> Read-only file. Mandatory. >> Returns {K,M,G}, size of framebuffer. > > This can't be mandatory, it's only relevant to vGPU devices, vGPUs are > just one user of mediated devices. [...] > >> 'params' >> Write-Only file. Optional. >> String input. Libvirt would pass the string given in XML file to >> this file and then create mdev device. Set empty string to clear params. >> For example, set parameter 'frame_rate_limiter=0' to disable frame rate >> limiter for performance benchmarking, then create device of type 11. The >> device created would have that parameter set by vendor driver. > > Might as well just call it "magic", there's absolutely no ability to > introspect what parameters are allowed or even valid here. Can all > parameters be changed dynamically? Do parameter changes apply to > existing devices or only future devices? This is a poorly specified > interface. I'd prefer this be done via module options on the vendor > driver. Or additional files in the mdev's sysfs directory? >> The parent device would look like: >> >> >> pci__86_00_0 >> >>0 >>134 >>0 >>0 > > This is the parent device address? (Since this is taken from an email of mine, this is libvirt's XML interpretation of the udev database and the contents of sysfs. Try "virsh nodedev-dumpxml pci__00_02_0" for an example that is unrelated to mdev). >> 2. Create/destroy mediated device >> >> With above example, vGPU device XML would look like: >> >> >> my-vgpu >> pci__86_00_0 >> >> >>1 >>'frame_rate_limiter=0' >> >> >> >> 'type id' is mandatory. > > I was under the impression that the vendor supplied "name" was the one > unique identifier. We're certainly not going to create a registrar to > hand out type ids to each vendor, so what makes type id unique? Why does it have to be unique? It's just parroting what libvirt said in the block of the parent's nodedev-dumpxml. (Though I guess going by name could be useful as well. Perhaps both could be allowed). >> * Clear params, if set earlier: >> >> echo "" > /sys/../\:86\:00.0/11/params > > So params only applies to the devices created while those params are > set? This is inherently racy. I agree, these extra attributes should be set on the mdev, not the type directory. Paolo
Re: [Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
On Tue, 20 Sep 2016 02:05:52 +0530 Kirti Wankhedewrote: > Hi libvirt experts, > > Thanks for valuable input on v1 version of RFC. > > Quick brief, VFIO based mediated device framework provides a way to > virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT > and IBM's channel IO. This framework reuses VFIO APIs for all the > functionalities for mediated devices which are currently being used for > pass through devices. This framework introduces a set of new sysfs files > for device creation and its life cycle management. > > Here is the summary of discussion on v1: > 1. Discover mediated device: > As part of physical device initialization process, vendor driver will > register their physical devices, which will be used to create virtual > device (mediated device, aka mdev) to the mediated framework. > > Vendor driver should specify mdev_supported_types in directory format. > This format is class based, for example, display class directory format > should be as below. We need to define such set for each class of devices > which would be supported by mediated device framework. > > --- mdev_destroy I thought we replaced mdev_destroy with a remove attribute for each mdev device. > --- mdev_supported_types > |-- 11 > | |-- create > | |-- name > | |-- fb_length > | |-- resolution > | |-- heads > | |-- max_instances > | |-- params > | |-- requires_group > |-- 12 > | |-- create > | |-- name > | |-- fb_length > | |-- resolution > | |-- heads > | |-- max_instances > | |-- params > | |-- requires_group > |-- 13 > |-- create > |-- name > |-- fb_length > |-- resolution > |-- heads > |-- max_instances > |-- params > |-- requires_group > > > In the above example directory '11' represents a type id of mdev device. > 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and > 'requires_group' would be Read-Only files that vendor would provide to > describe about that type. > > 'create': > Write-only file. Mandatory. > Accepts string to create mediated device. > > 'name': > Read-Only file. Mandatory. > Returns string, the name of that type id. > > 'fb_length': > Read-only file. Mandatory. > Returns {K,M,G}, size of framebuffer. This can't be mandatory, it's only relevant to vGPU devices, vGPUs are just one user of mediated devices. > > 'resolution': > Read-Only file. Mandatory. > Returns 'hres x vres' format. Maximum supported resolution. Same. > > 'heads': > Read-Only file. Mandatory. > Returns integer. Number of maximum heads supported. Same. > > 'max_instance': > Read-Only file. Mandatory. > Returns integer. Returns maximum mdev device could be created > at the moment when this file is read. This count would be updated by > vendor driver. Before creating mdev device of this type, check if > max_instance is > 0. Didn't we discuss this being being something like "available_instances" with a "devices" sub-directory linking current devices of that type? > 'params' > Write-Only file. Optional. > String input. Libvirt would pass the string given in XML file to > this file and then create mdev device. Set empty string to clear params. > For example, set parameter 'frame_rate_limiter=0' to disable frame rate > limiter for performance benchmarking, then create device of type 11. The > device created would have that parameter set by vendor driver. Might as well just call it "magic", there's absolutely no ability to introspect what parameters are allowed or even valid here. Can all parameters be changed dynamically? Do parameter changes apply to existing devices or only future devices? This is a poorly specified interface. I'd prefer this be done via module options on the vendor driver. > 'requires_group' > Read-Only file. Optional. > This should be provided by vendor driver if vendor driver need to > group mdev devices in one domain so that vendor driver can use 'first > open' to commit resources of all mdev devices associated to that domain > and 'last close' to free those. > > The parent device would look like: > > > pci__86_00_0 > >0 >134 >0 >0 This is the parent device address? (I think we'd re-use the specification for assigned devices) Is this optional? Couldn't libvirt choose to pick any parent device supplying the specified type if not supplied? > > > > >GRID M60-0B >512M >2560x1600 >2 >16 >1 > > >GRID M60 >NVIDIA What are these specifying? > > > > 2. Create/destroy mediated device > > With above example, vGPU device XML would look like: > > > my-vgpu >
[Qemu-devel] [RFC v2] libvirt vGPU QEMU integration
Hi libvirt experts, Thanks for valuable input on v1 version of RFC. Quick brief, VFIO based mediated device framework provides a way to virtualize their devices without SR-IOV, like NVIDIA vGPU, Intel KVMGT and IBM's channel IO. This framework reuses VFIO APIs for all the functionalities for mediated devices which are currently being used for pass through devices. This framework introduces a set of new sysfs files for device creation and its life cycle management. Here is the summary of discussion on v1: 1. Discover mediated device: As part of physical device initialization process, vendor driver will register their physical devices, which will be used to create virtual device (mediated device, aka mdev) to the mediated framework. Vendor driver should specify mdev_supported_types in directory format. This format is class based, for example, display class directory format should be as below. We need to define such set for each class of devices which would be supported by mediated device framework. --- mdev_destroy --- mdev_supported_types |-- 11 | |-- create | |-- name | |-- fb_length | |-- resolution | |-- heads | |-- max_instances | |-- params | |-- requires_group |-- 12 | |-- create | |-- name | |-- fb_length | |-- resolution | |-- heads | |-- max_instances | |-- params | |-- requires_group |-- 13 |-- create |-- name |-- fb_length |-- resolution |-- heads |-- max_instances |-- params |-- requires_group In the above example directory '11' represents a type id of mdev device. 'name', 'fb_length', 'resolution', 'heads', 'max_instance' and 'requires_group' would be Read-Only files that vendor would provide to describe about that type. 'create': Write-only file. Mandatory. Accepts string to create mediated device. 'name': Read-Only file. Mandatory. Returns string, the name of that type id. 'fb_length': Read-only file. Mandatory. Returns {K,M,G}, size of framebuffer. 'resolution': Read-Only file. Mandatory. Returns 'hres x vres' format. Maximum supported resolution. 'heads': Read-Only file. Mandatory. Returns integer. Number of maximum heads supported. 'max_instance': Read-Only file. Mandatory. Returns integer. Returns maximum mdev device could be created at the moment when this file is read. This count would be updated by vendor driver. Before creating mdev device of this type, check if max_instance is > 0. 'params' Write-Only file. Optional. String input. Libvirt would pass the string given in XML file to this file and then create mdev device. Set empty string to clear params. For example, set parameter 'frame_rate_limiter=0' to disable frame rate limiter for performance benchmarking, then create device of type 11. The device created would have that parameter set by vendor driver. 'requires_group' Read-Only file. Optional. This should be provided by vendor driver if vendor driver need to group mdev devices in one domain so that vendor driver can use 'first open' to commit resources of all mdev devices associated to that domain and 'last close' to free those. The parent device would look like: pci__86_00_0 0 134 0 0 GRID M60-0B 512M 2560x1600 2 16 1 GRID M60 NVIDIA 2. Create/destroy mediated device With above example, vGPU device XML would look like: my-vgpu pci__86_00_0 1 'frame_rate_limiter=0' 'type id' is mandatory. 'group' is optional. It should be a unique number in the system among all the groups created for mdev devices. Its usage is: - not needed if single vGPU device is being assigned to a domain. - only need to be set if multiple vGPUs need to be assigned to a domain and vendor driver have 'requires_group' file in type id directory. - if type id directory include 'requires_group' and user tries to assign multiple vGPUs to a domain without having field in XML, it will create single vGPU. 'params' is optional field. User should set this field if extra parameters need to be set for a particular vGPU device. Libvirt don't need to parse these params. These are meant for vendor driver. Libvirt need to follow the sequence to create device: * Read /sys/../\:86\:00.0/11/max_instances. If it is greater than 0, then only proceed else fail. * Set extra params if 'params' field exist in device XML and 'params' file exist in type id directory echo "frame_rate_limiter=0" > /sys/../\:86\:00.0/11/params * Autogenerate UUID * Create device: echo "$UUID:" > /sys/../\:86\:00.0/11/create where is optional. Group should be unique number among all the