Sahid,

Just to share some background: XenServer doesn't expose vGPUs as mdev or PCI devices. About a year ago I proposed a spec to create fake PCI devices so that we could reuse the existing PCI mechanism for vGPUs, but that was not a good design and met strong objections. After that we switched to resource providers, following the advice from the core team.
Regards,
Jianghua

-----Original Message-----
From: Sahid Orentino Ferdjaoui [mailto:[email protected]]
Sent: Monday, September 25, 2017 11:01 PM
To: OpenStack Development Mailing List (not for usage questions) <[email protected]>
Subject: Re: [openstack-dev] vGPUs support for Nova

On Mon, Sep 25, 2017 at 09:29:25AM -0500, Matt Riedemann wrote:
> On 9/25/2017 5:40 AM, Jay Pipes wrote:
> > On 09/25/2017 05:39 AM, Sahid Orentino Ferdjaoui wrote:
> > > There is a desire to expose the vGPU resources on top of Resource
> > > Providers, which is probably the path we should be going in the long
> > > term. I was not there for the last PTG and you probably already made a
> > > decision about moving in that direction anyway. My personal feeling is
> > > that it is premature.
> > >
> > > The nested Resource Provider work is not yet feature-complete and
> > > requires more reviewer attention. If we continue in the direction of
> > > Resource Providers, it will need at least two more releases to expose
> > > the vGPU feature, and that without NUMA support, and with the feeling
> > > of pushing something which is not stable/production-ready.
> > >
> > > It seems safer to first have the Resource Provider work
> > > finalized/stabilized and production-ready. Then, on top of something
> > > stable, we could start to migrate our current virt-specific features
> > > like NUMA, CPU pinning, huge pages and finally PCI devices.
> > >
> > > I'm talking about PCI devices in general because I think we should
> > > implement vGPUs on top of our /pci framework, which is
> > > production-ready and provides NUMA support.
> > >
> > > The hardware vendors are building their drivers using mdev; the /pci
> > > framework currently understands only SR-IOV, but at a quick glance it
> > > does not seem complicated to make it support mdev.
> > >
> > > In the /pci framework we will have to:
> > >
> > > * Update the PciDevice object fields to accept a NULL value for
> > >   'address' and add a new field 'uuid'
> > > * Update PciRequest to handle a new tag like 'vgpu_types'
> > > * Update PciDeviceStats to also maintain pools of vGPUs
> > >
> > > The operators will have to create alias(es) and configure flavors.
> > > Basically, most of the logic is already implemented and the method
> > > 'consume_request' is going to select the right vGPUs according to the
> > > request.
> > >
> > > In /virt we will have to:
> > >
> > > * Update the field 'pci_passthrough_devices' to also include GPU
> > >   devices
> > > * Update attach/detach PCI device to handle vGPUs
> > >
> > > We have a few people interested in working on it, so we could
> > > certainly make this feature available for Queens.
> > >
> > > I can take the lead on updating/implementing the PCI and libvirt
> > > driver parts, and I'm sure Jianghua Wang will be happy to take the
> > > lead on the XenServer virt part.
> > >
> > > And I trust Jay, Stephen and Sylvain to follow the developments.
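To make the proposed /pci changes above a bit more concrete, here is a minimal, hypothetical sketch of pools of vGPUs keyed by a 'vgpu_type' tag with a 'consume_request'-style selection. The class, fields and sample values are illustrative assumptions, not the actual PciDeviceStats/PciRequest code:

# Illustrative sketch only: simplified stand-ins for the PciDeviceStats /
# PciRequest behaviour described above, keyed on a hypothetical
# 'vgpu_type' tag. Names, fields and sample data are assumptions.
from collections import defaultdict


class VGpuPoolStats(object):
    """Maintain pools of available vGPUs, grouped by vGPU type."""

    def __init__(self):
        # e.g. {'nvidia-35': [dev, dev], 'nvidia-36': [dev]}
        self.pools = defaultdict(list)

    def add_device(self, dev):
        # 'dev' is assumed to carry the vGPU type reported by the virt
        # driver (an mdev type on libvirt, a vGPU type on XenServer).
        self.pools[dev['vgpu_type']].append(dev)

    def consume_request(self, request):
        """Pick devices matching the request's type tag and count."""
        pool = self.pools.get(request['spec'].get('vgpu_type'), [])
        if len(pool) < request['count']:
            return None  # not enough vGPUs of that type on this host
        return [pool.pop() for _ in range(request['count'])]


# A flavor-driven request for two vGPUs of a given type.
stats = VGpuPoolStats()
stats.add_device({'uuid': 'dev-1', 'vgpu_type': 'nvidia-35'})
stats.add_device({'uuid': 'dev-2', 'vgpu_type': 'nvidia-35'})
print(stats.consume_request({'count': 2, 'spec': {'vgpu_type': 'nvidia-35'}}))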
> > I understand the desire to get something into Nova to support vGPUs,
> > and I understand that the existing /pci modules represent the
> > fastest/cheapest way to get there.
> >
> > I won't block you from making any of the above changes, Sahid. I'll
> > even do my best to review them. However, I will be primarily focusing
> > this cycle on getting the nested resource providers work
> > feature-complete for (at least) SR-IOV PF/VF devices.
> >
> > The decision of whether to allow an approach that adds more to the
> > existing /pci module is ultimately Matt's.
> >
> > Best,
> > -jay
>
> The nested resource providers work is not merged or production-ready
> because we haven't made it a priority. We've certainly talked about it,
> and Jay has had patches proposed for several releases now, though.
>
> Building vGPU support into the existing framework, which only a couple
> of people understand (certainly not me), might be a short-term gain but
> is just more technical debt we have to pay off later, and it delays any
> focus on nested resource providers for the wider team.
>
> At the Queens PTG it was abundantly clear that many features depend on
> nested resource providers, including several networking-related features
> like bandwidth-based scheduling.
>
> The priorities for placement/scheduler in Queens are:
>
> 1. Dan Smith's migration allocations cleanup.
> 2. Alternative hosts for reschedules with cells v2.
> 3. Nested resource providers.
>
> All of these are in progress and need review.
>
> I personally don't think we should abandon the plan to implement vGPU
> support with nested resource providers without first seeing any code
> changes for it as a proof of concept. It also sounds like we have a
> pretty simple, staggered plan for rolling out vGPU support, so it's not
> very detailed to start: the virt driver reports vGPU inventory and we
> decorate the details later with traits (which Alex Xu is working on and
> needs review).
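As a rough illustration of "the virt driver reports vGPU inventory and we decorate the details later with traits", here is a small, hypothetical sketch of a per-host inventory record for a VGPU resource class. The field names follow the general shape of placement inventories, but the helper name, trait name and numbers are assumptions, not Nova's actual driver interface:

# Illustrative sketch only: the kind of inventory a virt driver might
# report to placement for its vGPUs, plus traits added later to describe
# the details. Helper and trait names are assumptions.

def report_vgpu_inventory(num_vgpus):
    """Return an inventory record for the VGPU resource class."""
    return {
        'VGPU': {
            'total': num_vgpus,       # vGPUs the hypervisor can carve out
            'min_unit': 1,
            'max_unit': num_vgpus,    # the most one instance may request
            'step_size': 1,
            'reserved': 0,
            'allocation_ratio': 1.0,  # vGPUs cannot be oversubscribed
        },
    }


# Traits could later "decorate the details", e.g. which vGPU type the
# host offers; the trait name below is made up for illustration.
VGPU_TRAITS = ['CUSTOM_VGPU_TYPE_NVIDIA_35']

print(report_vgpu_inventory(16))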
> Sahid, you could certainly implement a separate proof of concept and
> make that available; if the nested resource providers-based change hits
> major issues, or goes on far too long and carries too much risk, then we
> at least have a contingency plan. But I don't expect that to get review
> priority, and you'd have to accept that it might not get merged since we
> want to use nested resource providers.

That seems fair. I understand your desire to make the Resource Provider
implementation a priority, and I'm with you. In general, my preference is
not to stop progress on virt features just because we have a new "product"
ongoing.

> Either way we are going to need solid functional testing, and that
> functional testing should be written against the API as much as possible
> so that it works regardless of the backend implementation of the
> feature. One of the big things we failed at in Pike was not doing enough
> functional testing of move operations with claims in the scheduler
> earlier in the cycle. That all came in late and we're still fixing bugs
> as a result.

That's very true, and most of the time we are asking our users to be
beta-testers; that is one more reason why my preference is for a real
deprecation phase.

> If we can get started early on the functional testing for vGPUs, then
> work both implementations in parallel, we should be able to retain the
> functional tests and determine which implementation we ultimately need
> to go with, probably sometime in the second milestone.
>
> --
>
> Thanks,
>
> Matt
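To illustrate the kind of backend-agnostic, API-level functional testing discussed above, here is a small, self-contained sketch. The FakeCloud class is purely a hypothetical stand-in for real functional-test fixtures; the point is that the tests only exercise "create a vGPU flavor, boot a server, check the result", so they would hold whichever backend (/pci or resource providers) ends up claiming the vGPU:

# Rough, self-contained sketch of API-level functional tests for vGPU boot.
# FakeCloud is a hypothetical stand-in for real deployment fixtures; how the
# flavor actually requests a vGPU (a PCI alias vs. a resources extra spec)
# is an implementation detail deliberately hidden from the tests.
import unittest


class FakeCloud(object):
    """Minimal stand-in for a deployment whose host exposes a few vGPUs."""

    def __init__(self, vgpus_available):
        self.vgpus_available = vgpus_available

    def create_vgpu_flavor(self, vgpu_count):
        return {'vgpus': vgpu_count}

    def boot_server(self, flavor):
        # A real test would poll the API until the server reaches a
        # terminal state; here we just model success vs. failure.
        if flavor['vgpus'] <= self.vgpus_available:
            self.vgpus_available -= flavor['vgpus']
            return {'status': 'ACTIVE'}
        return {'status': 'ERROR'}


class TestVGPUBoot(unittest.TestCase):

    def test_boot_with_vgpu(self):
        cloud = FakeCloud(vgpus_available=1)
        server = cloud.boot_server(cloud.create_vgpu_flavor(1))
        self.assertEqual('ACTIVE', server['status'])

    def test_boot_fails_without_capacity(self):
        cloud = FakeCloud(vgpus_available=1)
        server = cloud.boot_server(cloud.create_vgpu_flavor(2))
        self.assertEqual('ERROR', server['status'])


if __name__ == '__main__':
    unittest.main()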
