Re: [RFC]Add new mdev interface for QoS

2017-08-08 Thread Gao, Ping A

On 2017/8/8 14:42, Kirti Wankhede wrote:
>
> On 8/7/2017 1:11 PM, Gao, Ping A wrote:
>> On 2017/8/4 5:11, Alex Williamson wrote:
>>> On Thu, 3 Aug 2017 20:26:14 +0800
>>> "Gao, Ping A"  wrote:
>>>
 On 2017/8/3 0:58, Alex Williamson wrote:
> On Wed, 2 Aug 2017 21:16:28 +0530
> Kirti Wankhede  wrote:
>  
>> On 8/2/2017 6:29 PM, Gao, Ping A wrote:  
>>> On 2017/8/2 18:19, Kirti Wankhede wrote:
 On 8/2/2017 3:56 AM, Alex Williamson wrote:
> On Tue, 1 Aug 2017 13:54:27 +0800
> "Gao, Ping A"  wrote:
>
>> On 2017/7/28 0:00, Gao, Ping A wrote:
>>> On 2017/7/27 0:43, Alex Williamson wrote:  
 [cc +libvir-list]

 On Wed, 26 Jul 2017 21:16:59 +0800
 "Gao, Ping A"  wrote:
  
> The vfio-mdev provide the capability to let different guest share 
> the
> same physical device through mediate sharing, as result it bring a
> requirement about how to control the device sharing, we need a QoS
> related interface for mdev to manage virtual device resources.
>
> E.g. In practical use, vGPUs assigned to different guests almost always have
> different performance requirements, some guests may need higher 
> priority
> for real time usage, some other may need more portion of the GPU
> resource to get higher 3D performance, corresponding we can 
> define some
> interfaces like weight/cap for overall budget control, priority 
> for
> single submission control.
>
> So I suggest to add some common attributes which are vendor 
> agnostic in
> mdev core sysfs for QoS purpose.  
 I think what you're asking for is just some standardization of a 
 QoS
 attribute_group which a vendor can optionally include within the
 existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
 transparently enable this, but it really only provides the 
 standard,
 all of the support code is left for the vendor.  I'm fine with 
 that,
 but of course the trouble with any sort of standardization is arriving
 at an agreed upon standard.  Are there QoS knobs that are generic
 across any mdev device type?  Are there others that are more 
 specific
 to vGPU?  Are there existing examples of this that we can steal 
 their
 specification?  
>>> Yes, you are right, standardization QoS knobs are exactly what I 
>>> wanted.
>>> Only when it become a part of the mdev framework and libvirt, then 
>>> QoS
>>> such critical feature can be leveraged by cloud usage. HW vendor 
>>> only
>>> need to focus on the implementation of the corresponding QoS 
>>> algorithm
>>> in their back-end driver.
>>>
>>> Vfio-mdev framework provide the capability to share the device that 
>>> lack
>>> of HW virtualization support to guests, no matter the device type,
>>> mediated sharing actually is a time sharing multiplex method, from 
>>> this
>>> point of view, QoS can be take as a generic way about how to 
>>> control the
>>> time assignment for virtual mdev device that occupy HW. As result 
>>> we can
>>> define QoS knob generic across any device type by this way. Even if 
>>> HW
>>> has build in with some kind of QoS support, I think it's not a 
>>> problem
>>> for back-end driver to convert mdev standard QoS definition to their
>>> specification to reach the same performance expectation. Seems 
>>> there are
>>> no examples for us to follow, we need define it from scratch.
>>>
>>> I propose universal QoS control interfaces like below:
>>>
>>> Cap: The cap limits the maximum percentage of time a mdev device 
>>> can own
>>> physical device. e.g. cap=60, means mdev device cannot take over 
>>> 60% of
>>> total physical resource.
>>>
>>> Weight: The weight define proportional control of the mdev device
>>> resource between guests, it’s orthogonal with Cap, to target load
>>> balancing. E.g. if guest 1 should take double mdev device resource
>>> compare with guest 2, need set weight ratio to 2:1.
>>>
>>> Priority: The guest who has higher priority will get execution 
>>> first,
>>> target to some real time usage and speeding interactive response.
>>>
>>> Above QoS interfaces cover both overall budget control and single
>>> submission

Re: [RFC]Add new mdev interface for QoS

2017-08-07 Thread Kirti Wankhede


On 8/7/2017 1:11 PM, Gao, Ping A wrote:
> 
> On 2017/8/4 5:11, Alex Williamson wrote:
>> On Thu, 3 Aug 2017 20:26:14 +0800
>> "Gao, Ping A"  wrote:
>>
>>> On 2017/8/3 0:58, Alex Williamson wrote:
 On Wed, 2 Aug 2017 21:16:28 +0530
 Kirti Wankhede  wrote:
  
> On 8/2/2017 6:29 PM, Gao, Ping A wrote:  
>> On 2017/8/2 18:19, Kirti Wankhede wrote:
>>> On 8/2/2017 3:56 AM, Alex Williamson wrote:
 On Tue, 1 Aug 2017 13:54:27 +0800
 "Gao, Ping A"  wrote:

> On 2017/7/28 0:00, Gao, Ping A wrote:
>> On 2017/7/27 0:43, Alex Williamson wrote:  
>>> [cc +libvir-list]
>>>
>>> On Wed, 26 Jul 2017 21:16:59 +0800
>>> "Gao, Ping A"  wrote:
>>>  
 The vfio-mdev provide the capability to let different guest share 
 the
 same physical device through mediate sharing, as result it bring a
 requirement about how to control the device sharing, we need a QoS
 related interface for mdev to manage virtual device resources.

 E.g. In practical use, vGPUs assigned to different guests almost always have
 different performance requirements, some guests may need higher 
 priority
 for real time usage, some other may need more portion of the GPU
 resource to get higher 3D performance, corresponding we can define 
 some
 interfaces like weight/cap for overall budget control, priority for
 single submission control.

 So I suggest to add some common attributes which are vendor 
 agnostic in
 mdev core sysfs for QoS purpose.  
>>> I think what you're asking for is just some standardization of a QoS
>>> attribute_group which a vendor can optionally include within the
>>> existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
>>> transparently enable this, but it really only provides the standard,
>>> all of the support code is left for the vendor.  I'm fine with that,
>>> but of course the trouble with any sort of standardization is arriving
>>> at an agreed upon standard.  Are there QoS knobs that are generic
>>> across any mdev device type?  Are there others that are more 
>>> specific
>>> to vGPU?  Are there existing examples of this that we can steal 
>>> their
>>> specification?  
>> Yes, you are right, standardization QoS knobs are exactly what I 
>> wanted.
>> Only when it become a part of the mdev framework and libvirt, then 
>> QoS
>> such critical feature can be leveraged by cloud usage. HW vendor only
>> need to focus on the implementation of the corresponding QoS 
>> algorithm
>> in their back-end driver.
>>
>> Vfio-mdev framework provide the capability to share the device that 
>> lack
>> of HW virtualization support to guests, no matter the device type,
>> mediated sharing actually is a time sharing multiplex method, from 
>> this
>> point of view, QoS can be take as a generic way about how to control 
>> the
>> time assignment for virtual mdev device that occupy HW. As result we 
>> can
>> define QoS knob generic across any device type by this way. Even if 
>> HW
>> has build in with some kind of QoS support, I think it's not a 
>> problem
>> for back-end driver to convert mdev standard QoS definition to their
>> specification to reach the same performance expectation. Seems there 
>> are
>> no examples for us to follow, we need define it from scratch.
>>
>> I propose universal QoS control interfaces like below:
>>
>> Cap: The cap limits the maximum percentage of time a mdev device can 
>> own
>> physical device. e.g. cap=60, means mdev device cannot take over 60% 
>> of
>> total physical resource.
>>
>> Weight: The weight define proportional control of the mdev device
>> resource between guests, it’s orthogonal with Cap, to target load
>> balancing. E.g. if guest 1 should take double mdev device resource
>> compare with guest 2, need set weight ratio to 2:1.
>>
>> Priority: The guest who has higher priority will get execution first,
>> target to some real time usage and speeding interactive response.
>>
>> Above QoS interfaces cover both overall budget control and single
>> submission control. I will send out a detailed design later once we are aligned.
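
One way to read the Cap and Weight definitions quoted above is as a weighted split of device time in which no mdev device may exceed its cap. The following is a minimal sketch under that assumption; the function name `allocate_shares`, the tuple layout and the redistribution rule are illustrative and do not come from this thread or from any vendor driver. Weights are assumed to be positive.

```python
def allocate_shares(devices):
    """Split 100% of physical device time between mdev devices.

    devices maps a name to (weight, cap_percent). Time is divided in
    proportion to weight; a device whose weighted portion would reach its
    cap is pinned at the cap and the leftover is re-split among the rest.
    """
    shares = {}
    active = dict(devices)
    remaining = 100.0
    while active:
        total_weight = sum(w for w, _ in active.values())
        # Weighted portion of the remaining time for each uncapped device.
        offers = {n: remaining * w / total_weight
                  for n, (w, _) in active.items()}
        capped = [n for n, offer in offers.items() if offer >= active[n][1]]
        if not capped:
            shares.update(offers)
            break
        for n in capped:
            shares[n] = float(active[n][1])
            remaining -= shares[n]
            del active[n]
    return shares


# Weight ratio 2:1 with guest1 capped at 60 (the cap=60 example above).
print(allocate_shares({"guest1": (2, 60), "guest2": (1, 100)}))
```

With a 2:1 weight ratio and no effective caps the split is two thirds to one third; with cap=60 on the heavier guest, that guest is held at 60% and the other guest receives the rest.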
> Hi Alex,
> Any comments about the interface mentioned above?
 Not really.

 Kirti, are there a

Re: [RFC]Add new mdev interface for QoS

2017-08-07 Thread Gao, Ping A

On 2017/8/4 5:11, Alex Williamson wrote:
> On Thu, 3 Aug 2017 20:26:14 +0800
> "Gao, Ping A"  wrote:
>
>> On 2017/8/3 0:58, Alex Williamson wrote:
>>> On Wed, 2 Aug 2017 21:16:28 +0530
>>> Kirti Wankhede  wrote:
>>>  
 On 8/2/2017 6:29 PM, Gao, Ping A wrote:  
> On 2017/8/2 18:19, Kirti Wankhede wrote:
>> On 8/2/2017 3:56 AM, Alex Williamson wrote:
>>> On Tue, 1 Aug 2017 13:54:27 +0800
>>> "Gao, Ping A"  wrote:
>>>
 On 2017/7/28 0:00, Gao, Ping A wrote:
> On 2017/7/27 0:43, Alex Williamson wrote:  
>> [cc +libvir-list]
>>
>> On Wed, 26 Jul 2017 21:16:59 +0800
>> "Gao, Ping A"  wrote:
>>  
>>> The vfio-mdev provide the capability to let different guest share 
>>> the
>>> same physical device through mediate sharing, as result it bring a
>>> requirement about how to control the device sharing, we need a QoS
>>> related interface for mdev to manage virtual device resources.
>>>
>>> E.g. In practical use, vGPUs assigned to different guests almost always have
>>> different performance requirements, some guests may need higher 
>>> priority
>>> for real time usage, some other may need more portion of the GPU
>>> resource to get higher 3D performance, corresponding we can define 
>>> some
>>> interfaces like weight/cap for overall budget control, priority for
>>> single submission control.
>>>
>>> So I suggest to add some common attributes which are vendor 
>>> agnostic in
>>> mdev core sysfs for QoS purpose.  
>> I think what you're asking for is just some standardization of a QoS
>> attribute_group which a vendor can optionally include within the
>> existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
>> transparently enable this, but it really only provides the standard,
>> all of the support code is left for the vendor.  I'm fine with that,
>> but of course the trouble with any sort of standardization is arriving
>> at an agreed upon standard.  Are there QoS knobs that are generic
>> across any mdev device type?  Are there others that are more specific
>> to vGPU?  Are there existing examples of this that we can steal their
>> specification?  
> Yes, you are right, standardization QoS knobs are exactly what I 
> wanted.
> Only when it become a part of the mdev framework and libvirt, then QoS
> such critical feature can be leveraged by cloud usage. HW vendor only
> need to focus on the implementation of the corresponding QoS algorithm
> in their back-end driver.
>
> Vfio-mdev framework provide the capability to share the device that 
> lack
> of HW virtualization support to guests, no matter the device type,
> mediated sharing actually is a time sharing multiplex method, from 
> this
> point of view, QoS can be take as a generic way about how to control 
> the
> time assignment for virtual mdev device that occupy HW. As result we 
> can
> define QoS knob generic across any device type by this way. Even if HW
> has build in with some kind of QoS support, I think it's not a problem
> for back-end driver to convert mdev standard QoS definition to their
> specification to reach the same performance expectation. Seems there 
> are
> no examples for us to follow, we need define it from scratch.
>
> I propose universal QoS control interfaces like below:
>
> Cap: The cap limits the maximum percentage of time a mdev device can 
> own
> physical device. e.g. cap=60, means mdev device cannot take over 60% 
> of
> total physical resource.
>
> Weight: The weight define proportional control of the mdev device
> resource between guests, it’s orthogonal with Cap, to target load
> balancing. E.g. if guest 1 should take double mdev device resource
> compare with guest 2, need set weight ratio to 2:1.
>
> Priority: The guest who has higher priority will get execution first,
> target to some real time usage and speeding interactive response.
>
> Above QoS interfaces cover both overall budget control and single
> submission control. I will send out a detailed design later once we are aligned.
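
The Priority knob quoted above ("higher priority will get execution first") can be pictured as a priority queue over pending submissions. This is only a sketch: the class `SubmissionQueue`, the convention that a larger number means higher priority, and FIFO ordering within one priority level are all assumptions made for illustration, not part of the proposal.

```python
import heapq
import itertools


class SubmissionQueue:
    """Pending workloads ordered by mdev priority, FIFO within a priority."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, mdev, priority, workload):
        # heapq is a min-heap, so negate priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._seq), mdev, workload))

    def next_submission(self):
        if not self._heap:
            return None
        _, _, mdev, workload = heapq.heappop(self._heap)
        return mdev, workload


queue = SubmissionQueue()
queue.submit("vgpu-a", 1, "batch-0")
queue.submit("vgpu-b", 5, "frame-0")
print(queue.next_submission())  # the priority-5 submission is picked first
```

A strict scheme like this can starve low-priority devices, which is one reason the proposal pairs it with Cap and Weight for overall budget control.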
 Hi Alex,
 Any comments about the interface mentioned above?
>>> Not really.
>>>
>>> Kirti, are there any QoS knobs that would be interesting
>>> for NVIDIA devices?
>>>
>> We have different types of vGPU for different QoS factors.
>>
>> When mdev devices are created, its resources are allocated irrespective

Re: [RFC]Add new mdev interface for QoS

2017-08-03 Thread Alex Williamson
On Thu, 3 Aug 2017 20:26:14 +0800
"Gao, Ping A"  wrote:

> On 2017/8/3 0:58, Alex Williamson wrote:
> > On Wed, 2 Aug 2017 21:16:28 +0530
> > Kirti Wankhede  wrote:
> >  
> >> On 8/2/2017 6:29 PM, Gao, Ping A wrote:  
> >>> On 2017/8/2 18:19, Kirti Wankhede wrote:
>  On 8/2/2017 3:56 AM, Alex Williamson wrote:
> > On Tue, 1 Aug 2017 13:54:27 +0800
> > "Gao, Ping A"  wrote:
> >
> >> On 2017/7/28 0:00, Gao, Ping A wrote:
> >>> On 2017/7/27 0:43, Alex Williamson wrote:  
>  [cc +libvir-list]
> 
>  On Wed, 26 Jul 2017 21:16:59 +0800
>  "Gao, Ping A"  wrote:
>   
> > The vfio-mdev provide the capability to let different guest share 
> > the
> > same physical device through mediate sharing, as result it bring a
> > requirement about how to control the device sharing, we need a QoS
> > related interface for mdev to manage virtual device resources.
> >
> > E.g. In practical use, vGPUs assigned to different guests almost always have
> > different performance requirements, some guests may need higher 
> > priority
> > for real time usage, some other may need more portion of the GPU
> > resource to get higher 3D performance, corresponding we can define 
> > some
> > interfaces like weight/cap for overall budget control, priority for
> > single submission control.
> >
> > So I suggest to add some common attributes which are vendor 
> > agnostic in
> > mdev core sysfs for QoS purpose.  
>  I think what you're asking for is just some standardization of a QoS
>  attribute_group which a vendor can optionally include within the
>  existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
>  transparently enable this, but it really only provides the standard,
>  all of the support code is left for the vendor.  I'm fine with that,
>  but of course the trouble with any sort of standardization is arriving
>  at an agreed upon standard.  Are there QoS knobs that are generic
>  across any mdev device type?  Are there others that are more specific
>  to vGPU?  Are there existing examples of this that we can steal their
>  specification?  
> >>> Yes, you are right, standardization QoS knobs are exactly what I 
> >>> wanted.
> >>> Only when it become a part of the mdev framework and libvirt, then QoS
> >>> such critical feature can be leveraged by cloud usage. HW vendor only
> >>> need to focus on the implementation of the corresponding QoS algorithm
> >>> in their back-end driver.
> >>>
> >>> Vfio-mdev framework provide the capability to share the device that 
> >>> lack
> >>> of HW virtualization support to guests, no matter the device type,
> >>> mediated sharing actually is a time sharing multiplex method, from 
> >>> this
> >>> point of view, QoS can be take as a generic way about how to control 
> >>> the
> >>> time assignment for virtual mdev device that occupy HW. As result we 
> >>> can
> >>> define QoS knob generic across any device type by this way. Even if HW
> >>> has build in with some kind of QoS support, I think it's not a problem
> >>> for back-end driver to convert mdev standard QoS definition to their
> >>> specification to reach the same performance expectation. Seems there 
> >>> are
> >>> no examples for us to follow, we need define it from scratch.
> >>>
> >>> I propose universal QoS control interfaces like below:
> >>>
> >>> Cap: The cap limits the maximum percentage of time a mdev device can 
> >>> own
> >>> physical device. e.g. cap=60, means mdev device cannot take over 60% 
> >>> of
> >>> total physical resource.
> >>>
> >>> Weight: The weight define proportional control of the mdev device
> >>> resource between guests, it’s orthogonal with Cap, to target load
> >>> balancing. E.g. if guest 1 should take double mdev device resource
> >>> compare with guest 2, need set weight ratio to 2:1.
> >>>
> >>> Priority: The guest who has higher priority will get execution first,
> >>> target to some real time usage and speeding interactive response.
> >>>
> >>> Above QoS interfaces cover both overall budget control and single
> >>> submission control. I will send out a detailed design later once we are aligned.
> >> Hi Alex,
> >> Any comments about the interface mentioned above?
> > Not really.
> >
> > Kirti, are there any QoS knobs that would be interesting
> > for NVIDIA devices?
> >
>  We have different types of vGPU for different QoS factors.
> 
>  When mdev devices are created, its resources are allocated irrespective
>  of which VM/userspace app is going to us

Re: [RFC]Add new mdev interface for QoS

2017-08-03 Thread Gao, Ping A

On 2017/8/3 0:58, Alex Williamson wrote:
> On Wed, 2 Aug 2017 21:16:28 +0530
> Kirti Wankhede  wrote:
>
>> On 8/2/2017 6:29 PM, Gao, Ping A wrote:
>>> On 2017/8/2 18:19, Kirti Wankhede wrote:  
 On 8/2/2017 3:56 AM, Alex Williamson wrote:  
> On Tue, 1 Aug 2017 13:54:27 +0800
> "Gao, Ping A"  wrote:
>  
>> On 2017/7/28 0:00, Gao, Ping A wrote:  
>>> On 2017/7/27 0:43, Alex Williamson wrote:
 [cc +libvir-list]

 On Wed, 26 Jul 2017 21:16:59 +0800
 "Gao, Ping A"  wrote:

> The vfio-mdev provide the capability to let different guest share the
> same physical device through mediate sharing, as result it bring a
> requirement about how to control the device sharing, we need a QoS
> related interface for mdev to manage virtual device resources.
>
> E.g. In practical use, vGPUs assigned to different guests almost always have
> different performance requirements, some guests may need higher 
> priority
> for real time usage, some other may need more portion of the GPU
> resource to get higher 3D performance, corresponding we can define 
> some
> interfaces like weight/cap for overall budget control, priority for
> single submission control.
>
> So I suggest to add some common attributes which are vendor agnostic 
> in
> mdev core sysfs for QoS purpose.
 I think what you're asking for is just some standardization of a QoS
 attribute_group which a vendor can optionally include within the
 existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
 transparently enable this, but it really only provides the standard,
 all of the support code is left for the vendor.  I'm fine with that,
 but of course the trouble with any sort of standardization is arriving
 at an agreed upon standard.  Are there QoS knobs that are generic
 across any mdev device type?  Are there others that are more specific
 to vGPU?  Are there existing examples of this that we can steal their
 specification?
>>> Yes, you are right, standardization QoS knobs are exactly what I wanted.
>>> Only when it become a part of the mdev framework and libvirt, then QoS
>>> such critical feature can be leveraged by cloud usage. HW vendor only
>>> need to focus on the implementation of the corresponding QoS algorithm
>>> in their back-end driver.
>>>
>>> Vfio-mdev framework provide the capability to share the device that lack
>>> of HW virtualization support to guests, no matter the device type,
>>> mediated sharing actually is a time sharing multiplex method, from this
>>> point of view, QoS can be take as a generic way about how to control the
>>> time assignment for virtual mdev device that occupy HW. As result we can
>>> define QoS knob generic across any device type by this way. Even if HW
>>> has build in with some kind of QoS support, I think it's not a problem
>>> for back-end driver to convert mdev standard QoS definition to their
>>> specification to reach the same performance expectation. Seems there are
>>> no examples for us to follow, we need define it from scratch.
>>>
>>> I propose universal QoS control interfaces like below:
>>>
>>> Cap: The cap limits the maximum percentage of time a mdev device can own
>>> physical device. e.g. cap=60, means mdev device cannot take over 60% of
>>> total physical resource.
>>>
>>> Weight: The weight define proportional control of the mdev device
>>> resource between guests, it’s orthogonal with Cap, to target load
>>> balancing. E.g. if guest 1 should take double mdev device resource
>>> compare with guest 2, need set weight ratio to 2:1.
>>>
>>> Priority: The guest who has higher priority will get execution first,
>>> target to some real time usage and speeding interactive response.
>>>
>>> Above QoS interfaces cover both overall budget control and single
>>> submission control. I will send out a detailed design later once we are aligned.
>> Hi Alex,
>> Any comments about the interface mentioned above?  
> Not really.
>
> Kirti, are there any QoS knobs that would be interesting
> for NVIDIA devices?
>  
 We have different types of vGPU for different QoS factors.

 When mdev devices are created, its resources are allocated irrespective
 of which VM/userspace app is going to use that mdev device. Any
 parameter we add here should be tied to particular mdev device and not
 to the guest/app that are going to use it. 'Cap' and 'Priority' are
 along that line. All mdev device might not need/use these parameters,
 these can be made optional interfaces.  
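
The point above, that QoS parameters belong to a particular mdev device and may be optional, suggests that management software should treat a missing attribute as "not supported". The sketch below assumes a hypothetical layout in which each mdev device directory has a `qos/` subdirectory with `cap`, `weight` and `priority` files; no such layout is defined in this thread, and a temporary directory stands in for sysfs.

```python
from pathlib import Path
import tempfile

# Hypothetical knob names, taken from the proposal's wording.
QOS_KNOBS = ("cap", "weight", "priority")


def read_qos(mdev_dir):
    """Return {knob: int or None}; None means the vendor did not expose it."""
    qos_dir = Path(mdev_dir) / "qos"
    values = {}
    for knob in QOS_KNOBS:
        attr = qos_dir / knob
        values[knob] = int(attr.read_text().strip()) if attr.is_file() else None
    return values


# Demo against a temporary directory standing in for an mdev device node
# whose vendor driver implements cap and priority but not weight.
with tempfile.TemporaryDirectory() as root:
    mdev = Path(root) / "example-mdev"
    (mdev / "qos").mkdir(parents=True)
    (mdev / "qos" / "cap").write_text("60\n")
    (mdev / "qos" / "priority").write_text("2\n")
    demo = read_qos(mdev)

print(demo)
```

Keeping every knob optional lets a vendor driver expose only the controls its hardware or scheduler can honor.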
>>> We also define some QoS parameters in Intel vGPU types, but it only
>>> pro

Re: [RFC]Add new mdev interface for QoS

2017-08-02 Thread Alex Williamson
On Wed, 2 Aug 2017 21:16:28 +0530
Kirti Wankhede  wrote:

> On 8/2/2017 6:29 PM, Gao, Ping A wrote:
> > 
> > On 2017/8/2 18:19, Kirti Wankhede wrote:  
> >>
> >> On 8/2/2017 3:56 AM, Alex Williamson wrote:  
> >>> On Tue, 1 Aug 2017 13:54:27 +0800
> >>> "Gao, Ping A"  wrote:
> >>>  
>  On 2017/7/28 0:00, Gao, Ping A wrote:  
> > On 2017/7/27 0:43, Alex Williamson wrote:
> >> [cc +libvir-list]
> >>
> >> On Wed, 26 Jul 2017 21:16:59 +0800
> >> "Gao, Ping A"  wrote:
> >>
> >>> The vfio-mdev provide the capability to let different guest share the
> >>> same physical device through mediate sharing, as result it bring a
> >>> requirement about how to control the device sharing, we need a QoS
> >>> related interface for mdev to manage virtual device resources.
> >>>
> >>> E.g. In practical use, vGPUs assigned to different guests almost always have
> >>> different performance requirements, some guests may need higher 
> >>> priority
> >>> for real time usage, some other may need more portion of the GPU
> >>> resource to get higher 3D performance, corresponding we can define 
> >>> some
> >>> interfaces like weight/cap for overall budget control, priority for
> >>> single submission control.
> >>>
> >>> So I suggest to add some common attributes which are vendor agnostic 
> >>> in
> >>> mdev core sysfs for QoS purpose.
> >> I think what you're asking for is just some standardization of a QoS
> >> attribute_group which a vendor can optionally include within the
> >> existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
> >> transparently enable this, but it really only provides the standard,
> >> all of the support code is left for the vendor.  I'm fine with that,
> >> but of course the trouble with any sort of standardization is arriving
> >> at an agreed upon standard.  Are there QoS knobs that are generic
> >> across any mdev device type?  Are there others that are more specific
> >> to vGPU?  Are there existing examples of this that we can steal their
> >> specification?
> > Yes, you are right, standardization QoS knobs are exactly what I wanted.
> > Only when it become a part of the mdev framework and libvirt, then QoS
> > such critical feature can be leveraged by cloud usage. HW vendor only
> > need to focus on the implementation of the corresponding QoS algorithm
> > in their back-end driver.
> >
> > Vfio-mdev framework provide the capability to share the device that lack
> > of HW virtualization support to guests, no matter the device type,
> > mediated sharing actually is a time sharing multiplex method, from this
> > point of view, QoS can be take as a generic way about how to control the
> > time assignment for virtual mdev device that occupy HW. As result we can
> > define QoS knob generic across any device type by this way. Even if HW
> > has build in with some kind of QoS support, I think it's not a problem
> > for back-end driver to convert mdev standard QoS definition to their
> > specification to reach the same performance expectation. Seems there are
> > no examples for us to follow, we need define it from scratch.
> >
> > I propose universal QoS control interfaces like below:
> >
> > Cap: The cap limits the maximum percentage of time a mdev device can own
> > physical device. e.g. cap=60, means mdev device cannot take over 60% of
> > total physical resource.
> >
> > Weight: The weight define proportional control of the mdev device
> > resource between guests, it’s orthogonal with Cap, to target load
> > balancing. E.g. if guest 1 should take double mdev device resource
> > compare with guest 2, need set weight ratio to 2:1.
> >
> > Priority: The guest who has higher priority will get execution first,
> > target to some real time usage and speeding interactive response.
> >
> > Above QoS interfaces cover both overall budget control and single
> > submission control. I will send out a detailed design later once we are aligned.
>  Hi Alex,
>  Any comments about the interface mentioned above?  
> >>> Not really.
> >>>
> >>> Kirti, are there any QoS knobs that would be interesting
> >>> for NVIDIA devices?
> >>>  
> >> We have different types of vGPU for different QoS factors.
> >>
> >> When mdev devices are created, its resources are allocated irrespective
> >> of which VM/userspace app is going to use that mdev device. Any
> >> parameter we add here should be tied to particular mdev device and not
> >> to the guest/app that are going to use it. 'Cap' and 'Priority' are
> >> along that line. All mdev device might not need/use these parameters,
> >> these can be made optional interfaces.  
> > 
> > We also define some QoS parameters in Intel vGPU types, but it only
> > provided a default fool-style way. W

Re: [RFC]Add new mdev interface for QoS

2017-08-02 Thread Kirti Wankhede


On 8/2/2017 6:29 PM, Gao, Ping A wrote:
> 
> On 2017/8/2 18:19, Kirti Wankhede wrote:
>>
>> On 8/2/2017 3:56 AM, Alex Williamson wrote:
>>> On Tue, 1 Aug 2017 13:54:27 +0800
>>> "Gao, Ping A"  wrote:
>>>
 On 2017/7/28 0:00, Gao, Ping A wrote:
> On 2017/7/27 0:43, Alex Williamson wrote:  
>> [cc +libvir-list]
>>
>> On Wed, 26 Jul 2017 21:16:59 +0800
>> "Gao, Ping A"  wrote:
>>  
>>> The vfio-mdev provide the capability to let different guest share the
>>> same physical device through mediate sharing, as result it bring a
>>> requirement about how to control the device sharing, we need a QoS
>>> related interface for mdev to manage virtual device resources.
>>>
>>> E.g. In practical use, vGPUs assigned to different guests almost always have
>>> different performance requirements, some guests may need higher priority
>>> for real time usage, some other may need more portion of the GPU
>>> resource to get higher 3D performance, corresponding we can define some
>>> interfaces like weight/cap for overall budget control, priority for
>>> single submission control.
>>>
>>> So I suggest to add some common attributes which are vendor agnostic in
>>> mdev core sysfs for QoS purpose.  
>> I think what you're asking for is just some standardization of a QoS
>> attribute_group which a vendor can optionally include within the
>> existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
>> transparently enable this, but it really only provides the standard,
>> all of the support code is left for the vendor.  I'm fine with that,
>> but of course the trouble with any sort of standardization is arriving
>> at an agreed upon standard.  Are there QoS knobs that are generic
>> across any mdev device type?  Are there others that are more specific
>> to vGPU?  Are there existing examples of this that we can steal their
>> specification?  
> Yes, you are right, standardization QoS knobs are exactly what I wanted.
> Only when it become a part of the mdev framework and libvirt, then QoS
> such critical feature can be leveraged by cloud usage. HW vendor only
> need to focus on the implementation of the corresponding QoS algorithm
> in their back-end driver.
>
> Vfio-mdev framework provide the capability to share the device that lack
> of HW virtualization support to guests, no matter the device type,
> mediated sharing actually is a time sharing multiplex method, from this
> point of view, QoS can be take as a generic way about how to control the
> time assignment for virtual mdev device that occupy HW. As result we can
> define QoS knob generic across any device type by this way. Even if HW
> has build in with some kind of QoS support, I think it's not a problem
> for back-end driver to convert mdev standard QoS definition to their
> specification to reach the same performance expectation. Seems there are
> no examples for us to follow, we need define it from scratch.
>
> I propose universal QoS control interfaces like below:
>
> Cap: The cap limits the maximum percentage of time a mdev device can own
> physical device. e.g. cap=60, means mdev device cannot take over 60% of
> total physical resource.
>
> Weight: The weight define proportional control of the mdev device
> resource between guests, it’s orthogonal with Cap, to target load
> balancing. E.g. if guest 1 should take double mdev device resource
> compare with guest 2, need set weight ratio to 2:1.
>
> Priority: The guest who has higher priority will get execution first,
> target to some real time usage and speeding interactive response.
>
> Above QoS interfaces cover both overall budget control and single
> submission control. I will send out a detailed design later once we are aligned.
>  
 Hi Alex,
 Any comments about the interface mentioned above?
>>> Not really.
>>>
>>> Kirti, are there any QoS knobs that would be interesting
>>> for NVIDIA devices?
>>>
>> We have different types of vGPU for different QoS factors.
>>
>> When mdev devices are created, its resources are allocated irrespective
>> of which VM/userspace app is going to use that mdev device. Any
>> parameter we add here should be tied to particular mdev device and not
>> to the guest/app that are going to use it. 'Cap' and 'Priority' are
>> along that line. All mdev device might not need/use these parameters,
>> these can be made optional interfaces.
> 
> We also define some QoS parameters in Intel vGPU types, but it only
> provided a default fool-style way. We still need a flexible approach
> that gives users the ability to change QoS parameters freely and
> dynamically according to their requirements, not restricted to the current
> limited and static vGPU types.
> 
>> In the above proposal, I'm not sure how 'Weight' would work for mdev
>> devices on 

Re: [RFC]Add new mdev interface for QoS

2017-08-02 Thread Gao, Ping A

On 2017/8/2 18:19, Kirti Wankhede wrote:
>
> On 8/2/2017 3:56 AM, Alex Williamson wrote:
>> On Tue, 1 Aug 2017 13:54:27 +0800
>> "Gao, Ping A"  wrote:
>>
>>> On 2017/7/28 0:00, Gao, Ping A wrote:
 On 2017/7/27 0:43, Alex Williamson wrote:  
> [cc +libvir-list]
>
> On Wed, 26 Jul 2017 21:16:59 +0800
> "Gao, Ping A"  wrote:
>  
>> The vfio-mdev provide the capability to let different guest share the
>> same physical device through mediate sharing, as result it bring a
>> requirement about how to control the device sharing, we need a QoS
>> related interface for mdev to manage virtual device resources.
>>
>> E.g. In practical use, vGPUs assigned to different guests almost always have
>> different performance requirements, some guests may need higher priority
>> for real time usage, some other may need more portion of the GPU
>> resource to get higher 3D performance, corresponding we can define some
>> interfaces like weight/cap for overall budget control, priority for
>> single submission control.
>>
>> So I suggest to add some common attributes which are vendor agnostic in
>> mdev core sysfs for QoS purpose.  
> I think what you're asking for is just some standardization of a QoS
> attribute_group which a vendor can optionally include within the
> existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
> transparently enable this, but it really only provides the standard,
> all of the support code is left for the vendor.  I'm fine with that,
> but of course the trouble with and sort of standardization is arriving
> at an agreed upon standard.  Are there QoS knobs that are generic
> across any mdev device type?  Are there others that are more specific
> to vGPU?  Are there existing examples of this that we can steal their
> specification?  
 Yes, you are right, standardization QoS knobs are exactly what I wanted.
 Only when it become a part of the mdev framework and libvirt, then QoS
 such critical feature can be leveraged by cloud usage. HW vendor only
 need to focus on the implementation of the corresponding QoS algorithm
 in their back-end driver.

 Vfio-mdev framework provide the capability to share the device that lack
 of HW virtualization support to guests, no matter the device type,
 mediated sharing actually is a time sharing multiplex method, from this
 point of view, QoS can be take as a generic way about how to control the
 time assignment for virtual mdev device that occupy HW. As result we can
 define QoS knob generic across any device type by this way. Even if HW
 has build in with some kind of QoS support, I think it's not a problem
 for back-end driver to convert mdev standard QoS definition to their
 specification to reach the same performance expectation. Seems there are
 no examples for us to follow, we need define it from scratch.

 I proposal universal QoS control interfaces like below:

 Cap: The cap limits the maximum percentage of time a mdev device can own
 physical device. e.g. cap=60, means mdev device cannot take over 60% of
 total physical resource.

 Weight: The weight define proportional control of the mdev device
 resource between guests, it’s orthogonal with Cap, to target load
 balancing. E.g. if guest 1 should take double mdev device resource
 compare with guest 2, need set weight ratio to 2:1.

 Priority: The guest who has higher priority will get execution first,
 target to some real time usage and speeding interactive response.

 Above QoS interfaces cover both overall budget control and single
 submission control. I will sent out detail design later once get aligned.  
>>> Hi Alex,
>>> Any comments about the interface mentioned above?
>> Not really.
>>
>> Kirti, are there any QoS knobs that would be interesting
>> for NVIDIA devices?
>>
> We have different types of vGPU for different QoS factors.
>
> When mdev devices are created, its resources are allocated irrespective
> of which VM/userspace app is going to use that mdev device. Any
> parameter we add here should be tied to particular mdev device and not
> to the guest/app that are going to use it. 'Cap' and 'Priority' are
> along that line. All mdev device might not need/use these parameters,
> these can be made optional interfaces.

We also define some QoS parameters in the Intel vGPU types, but they only
provide a fixed, foolproof-style default. We still need a flexible approach
that gives users the ability to change QoS parameters freely and
dynamically according to their requirements, not one restricted to the
current limited and static vGPU types.

> In the above proposal, I'm not sure how 'Weight' would work for mdev
> devices on same physical device.
>
> In the above example, "if guest 1 should take double mdev device
> resource compare with guest 2" but what if guest 2 never boo

Re: [RFC]Add new mdev interface for QoS

2017-08-02 Thread Kirti Wankhede


On 8/2/2017 3:56 AM, Alex Williamson wrote:
> On Tue, 1 Aug 2017 13:54:27 +0800
> "Gao, Ping A"  wrote:
> 
>> On 2017/7/28 0:00, Gao, Ping A wrote:
>>> On 2017/7/27 0:43, Alex Williamson wrote:  
>>>> [cc +libvir-list]
>>>>
>>>> On Wed, 26 Jul 2017 21:16:59 +0800
>>>> "Gao, Ping A"  wrote:
>>>>
>>>>> The vfio-mdev provide the capability to let different guest share the
>>>>> same physical device through mediate sharing, as result it bring a
>>>>> requirement about how to control the device sharing, we need a QoS
>>>>> related interface for mdev to management virtual device resource.
>>>>>
>>>>> E.g. In practical use, vGPUs assigned to different quests almost has
>>>>> different performance requirements, some guests may need higher priority
>>>>> for real time usage, some other may need more portion of the GPU
>>>>> resource to get higher 3D performance, corresponding we can define some
>>>>> interfaces like weight/cap for overall budget control, priority for
>>>>> single submission control.
>>>>>
>>>>> So I suggest to add some common attributes which are vendor agnostic in
>>>>> mdev core sysfs for QoS purpose.
>>>> I think what you're asking for is just some standardization of a QoS
>>>> attribute_group which a vendor can optionally include within the
>>>> existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
>>>> transparently enable this, but it really only provides the standard,
>>>> all of the support code is left for the vendor.  I'm fine with that,
>>>> but of course the trouble with and sort of standardization is arriving
>>>> at an agreed upon standard.  Are there QoS knobs that are generic
>>>> across any mdev device type?  Are there others that are more specific
>>>> to vGPU?  Are there existing examples of this that we can steal their
>>>> specification?
>>> Yes, you are right, standardization QoS knobs are exactly what I wanted.
>>> Only when it become a part of the mdev framework and libvirt, then QoS
>>> such critical feature can be leveraged by cloud usage. HW vendor only
>>> need to focus on the implementation of the corresponding QoS algorithm
>>> in their back-end driver.
>>>
>>> Vfio-mdev framework provide the capability to share the device that lack
>>> of HW virtualization support to guests, no matter the device type,
>>> mediated sharing actually is a time sharing multiplex method, from this
>>> point of view, QoS can be take as a generic way about how to control the
>>> time assignment for virtual mdev device that occupy HW. As result we can
>>> define QoS knob generic across any device type by this way. Even if HW
>>> has build in with some kind of QoS support, I think it's not a problem
>>> for back-end driver to convert mdev standard QoS definition to their
>>> specification to reach the same performance expectation. Seems there are
>>> no examples for us to follow, we need define it from scratch.
>>>
>>> I proposal universal QoS control interfaces like below:
>>>
>>> Cap: The cap limits the maximum percentage of time a mdev device can own
>>> physical device. e.g. cap=60, means mdev device cannot take over 60% of
>>> total physical resource.
>>>
>>> Weight: The weight define proportional control of the mdev device
>>> resource between guests, it’s orthogonal with Cap, to target load
>>> balancing. E.g. if guest 1 should take double mdev device resource
>>> compare with guest 2, need set weight ratio to 2:1.
>>>
>>> Priority: The guest who has higher priority will get execution first,
>>> target to some real time usage and speeding interactive response.
>>>
>>> Above QoS interfaces cover both overall budget control and single
>>> submission control. I will sent out detail design later once get aligned.  
>>
>> Hi Alex,
>> Any comments about the interface mentioned above?
> 
> Not really.
> 
> Kirti, are there any QoS knobs that would be interesting
> for NVIDIA devices?
> 

We have different types of vGPU for different QoS factors.

When an mdev device is created, its resources are allocated irrespective
of which VM/userspace app is going to use that mdev device. Any
parameter we add here should be tied to a particular mdev device and not
to the guest/app that is going to use it. 'Cap' and 'Priority' are
along that line. Not all mdev devices might need/use these parameters,
so these can be made optional interfaces.

In the above proposal, I'm not sure how 'Weight' would work for mdev
devices on the same physical device.

In the above example, "if guest 1 should take double mdev device
resource compare with guest 2", what if guest 2 is never booted? How
will you calculate resources?

If libvirt or another toolstack decides to do smart allocation based on
type name without taking the physical host device as input, guest 1 and
guest 2 might get mdev devices created on different physical devices.
Would the weight matter then?
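
To make the two questions above concrete, here is a minimal Python sketch of
one possible interpretation; it is not something agreed in this thread. It
assumes that weights are only compared among mdev devices that are currently
active and that sit on the same parent (physical) device. The field names
'parent', 'weight' and 'active' are made up for illustration.

```python
from collections import defaultdict

def shares_by_parent(mdevs):
    """Return {parent: {mdev name: fraction of that parent's time}}.

    Assumption (not from the thread): only active mdev devices compete,
    and weights are normalized per parent device.
    """
    groups = defaultdict(dict)
    for m in mdevs:
        if m["active"]:
            groups[m["parent"]][m["name"]] = m["weight"]
    return {
        parent: {name: w / sum(ws.values()) for name, w in ws.items()}
        for parent, ws in groups.items()
    }

# guest 2 never booted: guest 1 is the only competitor on gpu0
print(shares_by_parent([
    {"name": "guest1", "parent": "gpu0", "weight": 2, "active": True},
    {"name": "guest2", "parent": "gpu0", "weight": 1, "active": False},
]))
```

Under this reading, a weight has no effect when the two mdev devices land on
different physical devices, since each one is then normalized alone.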

Thanks,
Kirti


> Implementing libvirt support at the same time might be an interesting
> exercise if we don't have a second user in the kernel to validate
> 

RE: [RFC]Add new mdev interface for QoS

2017-08-01 Thread Tian, Kevin
> From: Alex Williamson [mailto:alex.william...@redhat.com]
> Sent: Wednesday, August 2, 2017 6:26 AM
> 
> On Tue, 1 Aug 2017 13:54:27 +0800
> "Gao, Ping A"  wrote:
> 
> > On 2017/7/28 0:00, Gao, Ping A wrote:
> > > On 2017/7/27 0:43, Alex Williamson wrote:
> > >> [cc +libvir-list]
> > >>
> > >> On Wed, 26 Jul 2017 21:16:59 +0800
> > >> "Gao, Ping A"  wrote:
> > >>
> > >>> The vfio-mdev provide the capability to let different guest share the
> > >>> same physical device through mediate sharing, as result it bring a
> > >>> requirement about how to control the device sharing, we need a QoS
> > >>> related interface for mdev to management virtual device resource.
> > >>>
> > >>> E.g. In practical use, vGPUs assigned to different quests almost has
> > >>> different performance requirements, some guests may need higher
> priority
> > >>> for real time usage, some other may need more portion of the GPU
> > >>> resource to get higher 3D performance, corresponding we can define
> some
> > >>> interfaces like weight/cap for overall budget control, priority for
> > >>> single submission control.
> > >>>
> > >>> So I suggest to add some common attributes which are vendor agnostic
> in
> > >>> mdev core sysfs for QoS purpose.
> > >> I think what you're asking for is just some standardization of a QoS
> > >> attribute_group which a vendor can optionally include within the
> > >> existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
> > >> transparently enable this, but it really only provides the standard,
> > >> all of the support code is left for the vendor.  I'm fine with that,
> > >> but of course the trouble with and sort of standardization is arriving
> > >> at an agreed upon standard.  Are there QoS knobs that are generic
> > >> across any mdev device type?  Are there others that are more specific
> > >> to vGPU?  Are there existing examples of this that we can steal their
> > >> specification?
> > > Yes, you are right, standardization QoS knobs are exactly what I wanted.
> > > Only when it become a part of the mdev framework and libvirt, then QoS
> > > such critical feature can be leveraged by cloud usage. HW vendor only
> > > need to focus on the implementation of the corresponding QoS algorithm
> > > in their back-end driver.
> > >
> > > Vfio-mdev framework provide the capability to share the device that lack
> > > of HW virtualization support to guests, no matter the device type,
> > > mediated sharing actually is a time sharing multiplex method, from this
> > > point of view, QoS can be take as a generic way about how to control the
> > > time assignment for virtual mdev device that occupy HW. As result we can
> > > define QoS knob generic across any device type by this way. Even if HW
> > > has build in with some kind of QoS support, I think it's not a problem
> > > for back-end driver to convert mdev standard QoS definition to their
> > > specification to reach the same performance expectation. Seems there
> are
> > > no examples for us to follow, we need define it from scratch.
> > >
> > > I proposal universal QoS control interfaces like below:
> > >
> > > Cap: The cap limits the maximum percentage of time a mdev device can
> own
> > > physical device. e.g. cap=60, means mdev device cannot take over 60% of
> > > total physical resource.
> > >
> > > Weight: The weight define proportional control of the mdev device
> > > resource between guests, it’s orthogonal with Cap, to target load
> > > balancing. E.g. if guest 1 should take double mdev device resource
> > > compare with guest 2, need set weight ratio to 2:1.
> > >
> > > Priority: The guest who has higher priority will get execution first,
> > > target to some real time usage and speeding interactive response.
> > >
> > > Above QoS interfaces cover both overall budget control and single
> > > submission control. I will sent out detail design later once get aligned.
> >
> > Hi Alex,
> > Any comments about the interface mentioned above?
> 
> Not really.
> 
> Kirti, are there any QoS knobs that would be interesting
> for NVIDIA devices?
> 
> Implementing libvirt support at the same time might be an interesting
> exercise if we don't have a second user in the kernel to validate
> against.  We could at least have two communities reviewing the feature
> then.  Thanks,
> 

We planned to introduce new vdev types to indirectly validate
some features (e.g. weight and cap) in our device model; however,
that will not exercise the to-be-proposed sysfs interface.
Yes, we can check/extend libvirt simultaneously to get a
whole picture of all required changes in the stack...

Thanks
Kevin


Re: [RFC]Add new mdev interface for QoS

2017-08-01 Thread Alex Williamson
On Tue, 1 Aug 2017 13:54:27 +0800
"Gao, Ping A"  wrote:

> On 2017/7/28 0:00, Gao, Ping A wrote:
> > On 2017/7/27 0:43, Alex Williamson wrote:  
> >> [cc +libvir-list]
> >>
> >> On Wed, 26 Jul 2017 21:16:59 +0800
> >> "Gao, Ping A"  wrote:
> >>  
> >>> The vfio-mdev provide the capability to let different guest share the
> >>> same physical device through mediate sharing, as result it bring a
> >>> requirement about how to control the device sharing, we need a QoS
> >>> related interface for mdev to management virtual device resource.
> >>>
> >>> E.g. In practical use, vGPUs assigned to different quests almost has
> >>> different performance requirements, some guests may need higher priority
> >>> for real time usage, some other may need more portion of the GPU
> >>> resource to get higher 3D performance, corresponding we can define some
> >>> interfaces like weight/cap for overall budget control, priority for
> >>> single submission control.
> >>>
> >>> So I suggest to add some common attributes which are vendor agnostic in
> >>> mdev core sysfs for QoS purpose.  
> >> I think what you're asking for is just some standardization of a QoS
> >> attribute_group which a vendor can optionally include within the
> >> existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
> >> transparently enable this, but it really only provides the standard,
> >> all of the support code is left for the vendor.  I'm fine with that,
> >> but of course the trouble with and sort of standardization is arriving
> >> at an agreed upon standard.  Are there QoS knobs that are generic
> >> across any mdev device type?  Are there others that are more specific
> >> to vGPU?  Are there existing examples of this that we can steal their
> >> specification?  
> > Yes, you are right, standardization QoS knobs are exactly what I wanted.
> > Only when it become a part of the mdev framework and libvirt, then QoS
> > such critical feature can be leveraged by cloud usage. HW vendor only
> > need to focus on the implementation of the corresponding QoS algorithm
> > in their back-end driver.
> >
> > Vfio-mdev framework provide the capability to share the device that lack
> > of HW virtualization support to guests, no matter the device type,
> > mediated sharing actually is a time sharing multiplex method, from this
> > point of view, QoS can be take as a generic way about how to control the
> > time assignment for virtual mdev device that occupy HW. As result we can
> > define QoS knob generic across any device type by this way. Even if HW
> > has build in with some kind of QoS support, I think it's not a problem
> > for back-end driver to convert mdev standard QoS definition to their
> > specification to reach the same performance expectation. Seems there are
> > no examples for us to follow, we need define it from scratch.
> >
> > I proposal universal QoS control interfaces like below:
> >
> > Cap: The cap limits the maximum percentage of time a mdev device can own
> > physical device. e.g. cap=60, means mdev device cannot take over 60% of
> > total physical resource.
> >
> > Weight: The weight define proportional control of the mdev device
> > resource between guests, it’s orthogonal with Cap, to target load
> > balancing. E.g. if guest 1 should take double mdev device resource
> > compare with guest 2, need set weight ratio to 2:1.
> >
> > Priority: The guest who has higher priority will get execution first,
> > target to some real time usage and speeding interactive response.
> >
> > Above QoS interfaces cover both overall budget control and single
> > submission control. I will sent out detail design later once get aligned.  
> 
> Hi Alex,
> Any comments about the interface mentioned above?

Not really.

Kirti, are there any QoS knobs that would be interesting
for NVIDIA devices?

Implementing libvirt support at the same time might be an interesting
exercise if we don't have a second user in the kernel to validate
against.  We could at least have two communities reviewing the feature
then.  Thanks,

Alex


Re: [RFC]Add new mdev interface for QoS

2017-07-31 Thread Gao, Ping A

On 2017/7/28 0:00, Gao, Ping A wrote:
> On 2017/7/27 0:43, Alex Williamson wrote:
>> [cc +libvir-list]
>>
>> On Wed, 26 Jul 2017 21:16:59 +0800
>> "Gao, Ping A"  wrote:
>>
>>> The vfio-mdev provide the capability to let different guest share the
>>> same physical device through mediate sharing, as result it bring a
>>> requirement about how to control the device sharing, we need a QoS
>>> related interface for mdev to management virtual device resource.
>>>
>>> E.g. In practical use, vGPUs assigned to different quests almost has
>>> different performance requirements, some guests may need higher priority
>>> for real time usage, some other may need more portion of the GPU
>>> resource to get higher 3D performance, corresponding we can define some
>>> interfaces like weight/cap for overall budget control, priority for
>>> single submission control.
>>>
>>> So I suggest to add some common attributes which are vendor agnostic in
>>> mdev core sysfs for QoS purpose.
>> I think what you're asking for is just some standardization of a QoS
>> attribute_group which a vendor can optionally include within the
>> existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
>> transparently enable this, but it really only provides the standard,
>> all of the support code is left for the vendor.  I'm fine with that,
>> but of course the trouble with and sort of standardization is arriving
>> at an agreed upon standard.  Are there QoS knobs that are generic
>> across any mdev device type?  Are there others that are more specific
>> to vGPU?  Are there existing examples of this that we can steal their
>> specification?
> Yes, you are right, standardization QoS knobs are exactly what I wanted.
> Only when it become a part of the mdev framework and libvirt, then QoS
> such critical feature can be leveraged by cloud usage. HW vendor only
> need to focus on the implementation of the corresponding QoS algorithm
> in their back-end driver.
>
> Vfio-mdev framework provide the capability to share the device that lack
> of HW virtualization support to guests, no matter the device type,
> mediated sharing actually is a time sharing multiplex method, from this
> point of view, QoS can be take as a generic way about how to control the
> time assignment for virtual mdev device that occupy HW. As result we can
> define QoS knob generic across any device type by this way. Even if HW
> has build in with some kind of QoS support, I think it's not a problem
> for back-end driver to convert mdev standard QoS definition to their
> specification to reach the same performance expectation. Seems there are
> no examples for us to follow, we need define it from scratch.
>
> I proposal universal QoS control interfaces like below:
>
> Cap: The cap limits the maximum percentage of time a mdev device can own
> physical device. e.g. cap=60, means mdev device cannot take over 60% of
> total physical resource.
>
> Weight: The weight define proportional control of the mdev device
> resource between guests, it’s orthogonal with Cap, to target load
> balancing. E.g. if guest 1 should take double mdev device resource
> compare with guest 2, need set weight ratio to 2:1.
>
> Priority: The guest who has higher priority will get execution first,
> target to some real time usage and speeding interactive response.
>
> Above QoS interfaces cover both overall budget control and single
> submission control. I will sent out detail design later once get aligned.

Hi Alex,
Any comments about the interface mentioned above?

>> Also, mdev devices are not necessarily the exclusive users of the
>> hardware, we can have a native user such as a local X client.  They're
>> not an mdev user, so we can't support them via the mdev_attr_group.
>> Does there need to be a per mdev parent QoS attribute_group standard
>> for somehow defining the QoS of all the child mdev devices, or perhaps
>> representing the remaining host QoS attributes?
> That's really an open, if we don't take host workload into consideration
> for cloud usage, it's not a problem any more, however such assumption is
> not reasonable. Any way if we take mdev devices as clients of host
> driver, and host driver provide the capability to divide out a portion
> HW resource to mdev devices, then it's only need to take care about the
> resource that host assigned for mdev devices. Follow this way QoS for
> mdev focus on the relationship between mdev devices no need to take care
> the host workload.
>
> -Ping
>
>> Ultimately libvirt and upper level management tools would be the
>> consumer of these control knobs, so let's immediately get libvirt
>> involved in the discussion.  Thanks,
>>
>> Alex



Re: [RFC]Add new mdev interface for QoS

2017-07-27 Thread Gao, Ping A

On 2017/7/27 0:43, Alex Williamson wrote:
> [cc +libvir-list]
>
> On Wed, 26 Jul 2017 21:16:59 +0800
> "Gao, Ping A"  wrote:
>
>> The vfio-mdev provide the capability to let different guest share the
>> same physical device through mediate sharing, as result it bring a
>> requirement about how to control the device sharing, we need a QoS
>> related interface for mdev to management virtual device resource.
>>
>> E.g. In practical use, vGPUs assigned to different quests almost has
>> different performance requirements, some guests may need higher priority
>> for real time usage, some other may need more portion of the GPU
>> resource to get higher 3D performance, corresponding we can define some
>> interfaces like weight/cap for overall budget control, priority for
>> single submission control.
>>
>> So I suggest to add some common attributes which are vendor agnostic in
>> mdev core sysfs for QoS purpose.
> I think what you're asking for is just some standardization of a QoS
> attribute_group which a vendor can optionally include within the
> existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
> transparently enable this, but it really only provides the standard,
> all of the support code is left for the vendor.  I'm fine with that,
> but of course the trouble with and sort of standardization is arriving
> at an agreed upon standard.  Are there QoS knobs that are generic
> across any mdev device type?  Are there others that are more specific
> to vGPU?  Are there existing examples of this that we can steal their
> specification?

Yes, you are right, standardized QoS knobs are exactly what I want.
Only when they become part of the mdev framework and libvirt can a
critical feature like QoS be leveraged for cloud usage. HW vendors then
only need to focus on implementing the corresponding QoS algorithm
in their back-end drivers.

The vfio-mdev framework provides the capability to share a device that
lacks HW virtualization support among guests. No matter the device type,
mediated sharing is actually a time-sharing multiplexing method. From this
point of view, QoS can be treated as a generic way to control how much
time each virtual mdev device occupies the HW. As a result, we can
define QoS knobs that are generic across any device type. Even if the HW
has some kind of built-in QoS support, I think it's not a problem
for the back-end driver to convert the mdev standard QoS definition to its
own specification to reach the same performance expectation. There seem
to be no examples for us to follow, so we need to define it from scratch.

I propose universal QoS control interfaces like below:

Cap: The cap limits the maximum percentage of time an mdev device can own
the physical device. E.g. cap=60 means the mdev device cannot take more
than 60% of the total physical resource.

Weight: The weight defines proportional control of the mdev device
resource between guests; it's orthogonal to Cap and targets load
balancing. E.g. if guest 1 should take double the mdev device resource
compared with guest 2, set the weight ratio to 2:1.

Priority: The guest with the higher priority gets execution first;
this targets real-time usage and faster interactive response.

The above QoS interfaces cover both overall budget control and single
submission control. I will send out a detailed design later once we are
aligned.
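
As a rough illustration of how Cap and Weight could interact for overall
budget control, here is a minimal Python sketch. It is only one possible
policy under stated assumptions, not the detailed design: time is split in
proportion to weight, a device that would exceed its cap is pinned at the
cap, and the leftover is redistributed by weight. The function name and the
dictionary layout are hypothetical, and Priority is not modeled.

```python
def allocate_shares(devices):
    """Return {name: percent of physical device time} for each mdev.

    devices maps a name to {"weight": int, "cap": percent}.
    Assumed policy: proportional split by weight, capped devices keep
    their cap, and the remainder is re-split among the others.
    """
    shares = {}
    remaining = 100.0
    pending = dict(devices)
    while pending:
        total_weight = sum(d["weight"] for d in pending.values())
        # Devices whose proportional share would exceed their cap.
        capped = {
            name: d for name, d in pending.items()
            if remaining * d["weight"] / total_weight > d["cap"]
        }
        if not capped:
            for name, d in pending.items():
                shares[name] = remaining * d["weight"] / total_weight
            break
        for name, d in capped.items():
            shares[name] = float(d["cap"])
            remaining -= d["cap"]
            del pending[name]
    return shares

# Weight ratio 2:1, but guest1 is capped at 60%.
print(allocate_shares({
    "guest1": {"weight": 2, "cap": 60},
    "guest2": {"weight": 1, "cap": 100},
}))
# prints {'guest1': 60.0, 'guest2': 40.0}
```

Without the cap, the same weights would give guest1 two thirds of the time
and guest2 one third, which is the 2:1 case described above.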

> Also, mdev devices are not necessarily the exclusive users of the
> hardware, we can have a native user such as a local X client.  They're
> not an mdev user, so we can't support them via the mdev_attr_group.
> Does there need to be a per mdev parent QoS attribute_group standard
> for somehow defining the QoS of all the child mdev devices, or perhaps
> representing the remaining host QoS attributes?

That's really an open question. If we didn't take host workload into
consideration for cloud usage, it wouldn't be a problem any more; however,
such an assumption is not reasonable. Anyway, if we treat mdev devices as
clients of the host driver, and the host driver provides the capability to
divide out a portion of the HW resources to mdev devices, then we only need
to take care of the resources that the host assigned to mdev devices. In
this way, QoS for mdev focuses on the relationship between mdev devices,
with no need to take care of the host workload.
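
A minimal Python sketch of that two-level idea, under the assumption that
the host driver simply reserves a fixed percentage for native workloads and
hands the rest to the mdev devices by weight. The function name and the
fixed reservation are hypothetical and only meant as illustration.

```python
def mdev_budget(host_reserved_percent, weights):
    """Split what the host driver leaves over among mdev devices by weight.

    host_reserved_percent: share kept for native host users (assumed fixed).
    weights: {mdev name: weight}.
    """
    pool = 100 - host_reserved_percent
    total = sum(weights.values())
    return {name: pool * w / total for name, w in weights.items()}

# Host keeps 20%; the remaining 80% is split 3:1.
print(mdev_budget(20, {"mdev_a": 3, "mdev_b": 1}))
# prints {'mdev_a': 60.0, 'mdev_b': 20.0}
```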

-Ping

> Ultimately libvirt and upper level management tools would be the
> consumer of these control knobs, so let's immediately get libvirt
> involved in the discussion.  Thanks,
>
> Alex



Re: [RFC]Add new mdev interface for QoS

2017-07-26 Thread Alex Williamson
[cc +libvir-list]

On Wed, 26 Jul 2017 21:16:59 +0800
"Gao, Ping A"  wrote:

> The vfio-mdev provide the capability to let different guest share the
> same physical device through mediate sharing, as result it bring a
> requirement about how to control the device sharing, we need a QoS
> related interface for mdev to management virtual device resource.
> 
> E.g. In practical use, vGPUs assigned to different quests almost has
> different performance requirements, some guests may need higher priority
> for real time usage, some other may need more portion of the GPU
> resource to get higher 3D performance, corresponding we can define some
> interfaces like weight/cap for overall budget control, priority for
> single submission control.
> 
> So I suggest to add some common attributes which are vendor agnostic in
> mdev core sysfs for QoS purpose.

I think what you're asking for is just some standardization of a QoS
attribute_group which a vendor can optionally include within the
existing mdev_parent_ops.mdev_attr_groups.  The mdev core will
transparently enable this, but it really only provides the standard,
all of the support code is left for the vendor.  I'm fine with that,
but of course the trouble with any sort of standardization is arriving
at an agreed upon standard.  Are there QoS knobs that are generic
across any mdev device type?  Are there others that are more specific
to vGPU?  Are there existing examples of this whose specification we
can steal?
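
To illustrate what "optional" could mean for a consumer such as a management
tool, here is a small Python sketch. The attribute names (cap, weight,
priority) and the qos/ subdirectory are hypothetical, since no layout has
been agreed; a temporary directory stands in for a per-mdev sysfs
directory. A missing file is treated as "the vendor driver does not support
this knob".

```python
import os
import tempfile

QOS_ATTRS = ("cap", "weight", "priority")  # hypothetical attribute names

def read_qos(mdev_dir):
    """Return {attr: int or None}; None means the attribute is not exposed."""
    values = {}
    for attr in QOS_ATTRS:
        path = os.path.join(mdev_dir, "qos", attr)
        try:
            with open(path) as f:
                values[attr] = int(f.read().strip())
        except FileNotFoundError:
            values[attr] = None
    return values

# Demo: a fake device directory whose vendor driver only exposes 'cap'.
with tempfile.TemporaryDirectory() as d:
    os.mkdir(os.path.join(d, "qos"))
    with open(os.path.join(d, "qos", "cap"), "w") as f:
        f.write("60\n")
    print(read_qos(d))
# prints {'cap': 60, 'weight': None, 'priority': None}
```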

Also, mdev devices are not necessarily the exclusive users of the
hardware, we can have a native user such as a local X client.  They're
not an mdev user, so we can't support them via the mdev_attr_group.
Does there need to be a per mdev parent QoS attribute_group standard
for somehow defining the QoS of all the child mdev devices, or perhaps
representing the remaining host QoS attributes?

Ultimately libvirt and upper level management tools would be the
consumer of these control knobs, so let's immediately get libvirt
involved in the discussion.  Thanks,

Alex