Re: [openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal
Sundar-

> On an unrelated note, thanks for the pointer to the GPU spec
> (https://review.openstack.org/#/c/579359/10/doc/source/specs/rocky/device-passthrough.rst).
> I will review that.

Thanks. Please note that this is for nova-powervm, PowerVM's
*out-of-tree* compute driver. We hope to bring this into the in-tree
driver eventually (unless we skip straight to the cyborg model :) but
it should give a good idea of some of the requirements and use cases
we're looking to support.

> Fair enough. We had discussed that too. The Cyborg drivers can also
> invoke REST APIs etc. for Power.

Ack.

> Agreed. So, we could say:
> - The plugins do the instance half. They are hypervisor-specific and
>   platform-specific. (The term 'platform' subsumes both the
>   architecture (Power, x86) and the server/system type.) They are
>   invoked by os-acc.
> - The drivers do the device half, device discovery/enumeration and
>   anything not explicitly assigned to plugins. They contain
>   device-specific and platform-specific code. They are invoked by
>   Cyborg agent and os-acc.

Sounds good.

> Are you ok with the workflow in
> https://docs.google.com/drawings/d/1cX06edia_Pr7P5nOB08VsSMsgznyrz4Yy2u8nb596sU/edit?usp=sharing
> ?

Yes (but see below).

>> You mean for getVAN()?
>
> Yes -- BTW, I renamed it as prepareVANs() or prepareVAN(), because it
> is not just a query as the name getVAN implies, but has side effects.

Ack.

>> Because AFAIK, os_vif.plug(list_of_vif_objects,
>> InstanceInfo) is *not* how nova uses os-vif for plugging.
>
> Yes, the os-acc will invoke the plug() once per VAN. IIUC, Nova calls
> Neutron once per instance for all networks, as seen in this code
> sequence in nova/nova/compute/manager.py:
>
>   _build_and_run_instance() --> _build_resources() -->
>   _build_networks_for_instance() --> _allocate_network()
>
> The _allocate_network() actually takes a list of requested_networks,
> and handles all networks for an instance [1].
> Chasing this further down:
>
>   _allocate_network --> _allocate_network_async()
>   --> self.network_api.allocate_for_instance()
>   == nova/network/rpcapi.py::allocate_for_instance()
>
> So, even the RPC out of Nova seems to take a list of networks [2].

Yes, but by the time we get to os_vif.plug(), we're doing one VIF at a
time. That corresponds to what you've got in your flow diagram, so as
long as that's accurate, I'm fine with it.

That said, we could discuss os_acc.plug() taking a list of VANs and
threading out the calls to the plugin's plug() method (which takes one
VAN at a time). I think we've talked a bit about this before: the pros
and cons of having the threading managed by os-acc or by the plugin.
We could have the same discussion for prepareVANs() too.

> [1]
> https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1529
> [2]
> https://github.com/openstack/nova/blob/master/nova/network/rpcapi.py#L163

>> Thanks,
>> Eric

> Regards,
> Sundar

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
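The "os_acc.plug() takes a list and threads out per-VAN plugin calls"
option discussed above could look something like the following. This is
a minimal sketch under assumed names (plug_vans, plugin.plug,
plugin.unplug); none of these are settled os-acc API:

```python
# Illustrative sketch: os-acc accepts a list of VANs and threads out
# per-VAN calls to the plugin's plug(), which handles one VAN at a
# time. All names here are assumptions, not settled os-acc API.
from concurrent.futures import ThreadPoolExecutor


def plug_vans(plugin, instance_info, vans):
    """Plug all VANs for one instance in parallel; unplug on failure."""
    if not vans:
        return []
    with ThreadPoolExecutor(max_workers=len(vans)) as pool:
        futures = [(van, pool.submit(plugin.plug, van, instance_info))
                   for van in vans]
    # Leaving the 'with' block waits for every future, so all are done.
    errors = [f.exception() for _, f in futures if f.exception()]
    if errors:
        # Roll back the VANs that did plug successfully, then re-raise
        # the first failure to the caller.
        for van, fut in futures:
            if fut.exception() is None:
                plugin.unplug(van, instance_info)
        raise errors[0]
    return [fut.result() for _, fut in futures]
```

The alternative being weighed in the thread is for os-acc to call
plugin.plug() serially and let the plugin thread internally; the
rollback logic would be the same either way.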
Re: [openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal
Hi Eric,
    Please see my responses inline. On an unrelated note, thanks for
the pointer to the GPU spec
(https://review.openstack.org/#/c/579359/10/doc/source/specs/rocky/device-passthrough.rst).
I will review that.

On 7/31/2018 10:42 AM, Eric Fried wrote:
> Sundar-
>
>> * Cyborg drivers deal with device-specific aspects, including
>>   discovery/enumeration of devices and handling the Device Half of
>>   the attach (preparing devices/accelerators for attach to an
>>   instance, post-attach cleanup (if any) after successful attach,
>>   releasing device/accelerator resources on instance termination or
>>   failed attach, etc.)
>> * os-acc plugins deal with hypervisor/system/architecture-specific
>>   aspects, including handling the Instance Half of the attach (e.g.
>>   for libvirt with PCI, preparing the XML snippet to be included in
>>   the domain XML).
>
> This sounds well and good, but discovery/enumeration will also be
> hypervisor/system/architecture-specific. So...

Fair enough. We had discussed that too. The Cyborg drivers can also
invoke REST APIs etc. for Power.

>> Thus, the drivers and plugins are expected to be complementary. For
>> example, for 2 devices of types T1 and T2, there shall be 2 separate
>> Cyborg drivers. Further, we would have separate plugins for, say,
>> x86+KVM systems and Power systems. We could then have four different
>> deployments -- T1 on x86+KVM, T2 on x86+KVM, T1 on Power, T2 on
>> Power -- by suitable combinations of the drivers and plugins.
>
> ...the discovery/enumeration code for T1 on x86+KVM (lsdev? lspci?
> walking the /dev file system?) will be totally different from the
> discovery/enumeration code for T1 on Power
> (pypowervm.wrappers.ManagedSystem.get(adapter)).
>
> I don't mind saying "drivers do the device side; plugins do the
> instance side" but I don't see getting around the fact that both
> "sides" will need to have platform-specific code

Agreed. So, we could say:
- The plugins do the instance half. They are hypervisor-specific and
  platform-specific.
  (The term 'platform' subsumes both the architecture (Power, x86) and
  the server/system type.) They are invoked by os-acc.
- The drivers do the device half, device discovery/enumeration and
  anything not explicitly assigned to plugins. They contain
  device-specific and platform-specific code. They are invoked by
  Cyborg agent and os-acc.

Are you ok with the workflow in
https://docs.google.com/drawings/d/1cX06edia_Pr7P5nOB08VsSMsgznyrz4Yy2u8nb596sU/edit?usp=sharing
?

>> One secondary detail to note is that Nova compute calls os-acc per
>> instance for all accelerators for that instance, not once for each
>> accelerator.
>
> You mean for getVAN()?

Yes -- BTW, I renamed it as prepareVANs() or prepareVAN(), because it
is not just a query as the name getVAN implies, but has side effects.

> Because AFAIK, os_vif.plug(list_of_vif_objects, InstanceInfo) is
> *not* how nova uses os-vif for plugging.

Yes, the os-acc will invoke the plug() once per VAN. IIUC, Nova calls
Neutron once per instance for all networks, as seen in this code
sequence in nova/nova/compute/manager.py:

  _build_and_run_instance() --> _build_resources() -->
  _build_networks_for_instance() --> _allocate_network()

The _allocate_network() actually takes a list of requested_networks,
and handles all networks for an instance [1]. Chasing this further
down:

  _allocate_network --> _allocate_network_async()
  --> self.network_api.allocate_for_instance()
  == nova/network/rpcapi.py::allocate_for_instance()

So, even the RPC out of Nova seems to take a list of networks [2].

[1]
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1529
[2]
https://github.com/openstack/nova/blob/master/nova/network/rpcapi.py#L163

> Thanks,
> Eric

Regards,
Sundar
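To make the per-instance shape concrete: os-acc would take the full
list of accelerator requests for one instance (mirroring how Nova's
_allocate_network() takes a list of requested_networks) and prepare one
VAN per request, undoing partial work on failure. The names below
(prepare_vans, driver.prepare_van, driver.unprepare_van) are
illustrative assumptions echoing the prepareVAN() naming in the thread,
not settled API:

```python
# Illustrative sketch only: an os-acc entry point called once per
# instance with all accelerator requests for that instance. The driver
# method names are assumptions based on the prepareVAN() naming above.
def prepare_vans(driver, instance_info, accel_requests):
    """Prepare one VAN per requested accelerator; all-or-nothing."""
    vans = []
    try:
        for req in accel_requests:
            # Device Half: the driver may configure or program the
            # device, then returns a VAN handle (PCI VF, DRC index...).
            vans.append(driver.prepare_van(instance_info, req))
    except Exception:
        # A later accelerator failed: release the earlier ones so the
        # caller (Nova compute) sees all-or-nothing semantics.
        for van in vans:
            driver.unprepare_van(instance_info, van)
        raise
    return vans
```

This is also the rollback argument for doing the looping in os-acc
rather than in Nova compute.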
Re: [openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal
Sundar-

> * Cyborg drivers deal with device-specific aspects, including
>   discovery/enumeration of devices and handling the Device Half of
>   the attach (preparing devices/accelerators for attach to an
>   instance, post-attach cleanup (if any) after successful attach,
>   releasing device/accelerator resources on instance termination or
>   failed attach, etc.)
> * os-acc plugins deal with hypervisor/system/architecture-specific
>   aspects, including handling the Instance Half of the attach (e.g.
>   for libvirt with PCI, preparing the XML snippet to be included in
>   the domain XML).

This sounds well and good, but discovery/enumeration will also be
hypervisor/system/architecture-specific. So...

> Thus, the drivers and plugins are expected to be complementary. For
> example, for 2 devices of types T1 and T2, there shall be 2 separate
> Cyborg drivers. Further, we would have separate plugins for, say,
> x86+KVM systems and Power systems. We could then have four different
> deployments -- T1 on x86+KVM, T2 on x86+KVM, T1 on Power, T2 on
> Power -- by suitable combinations of the drivers and plugins.

...the discovery/enumeration code for T1 on x86+KVM (lsdev? lspci?
walking the /dev file system?) will be totally different from the
discovery/enumeration code for T1 on Power
(pypowervm.wrappers.ManagedSystem.get(adapter)).

I don't mind saying "drivers do the device side; plugins do the
instance side" but I don't see getting around the fact that both
"sides" will need to have platform-specific code.

> One secondary detail to note is that Nova compute calls os-acc per
> instance for all accelerators for that instance, not once for each
> accelerator.

You mean for getVAN()? Because AFAIK, os_vif.plug(list_of_vif_objects,
InstanceInfo) is *not* how nova uses os-vif for plugging.

Thanks,
Eric
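For readers following the thread: the VAN being discussed is proposed
(in the original posting further down) as the accelerator analogue of a
VIF. A minimal sketch of what such a handle object might carry; every
field name here is an assumption for illustration, not the spec's
definition:

```python
# Purely illustrative sketch of a VAN (Virtual Accelerator Nexus)
# handle, the accelerator analogue of a VIF. All field names are
# assumptions; the os-acc spec review will define the real object.
import dataclasses


@dataclasses.dataclass
class VAN:
    van_id: str        # unique identifier for this attachment
    device_type: str   # e.g. 'GPU', 'FPGA'
    attach_handle: str  # PCI VF address, mdev UUID, Power DRC index...
    attach_type: str   # tells the os-acc plugin how to attach it
    details: dict = dataclasses.field(default_factory=dict)


# Example: a PCI-attached FPGA function.
van = VAN(van_id='1', device_type='FPGA',
          attach_handle='0000:81:00.1', attach_type='pci')
```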
[openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal
Hi Eric and all,
    With recent discussions [1], we have convergence on how Power and
other architectures can use Cyborg. Before I update the spec [2], I am
setting down some key aspects of the updates, so that we are all
aligned.

The accelerator - instance attachment has two parts:

* The connection between the accelerator and a host-visible attach
  handle, such as a PCI function or a mediated device UUID. We call
  this the Device Half of the attach.
* The connection between the attach handle and the instance. We name
  this the Instance Half of the attach.

I propose two different extensibility mechanisms:

* Cyborg drivers deal with device-specific aspects, including
  discovery/enumeration of devices and handling the Device Half of the
  attach (preparing devices/accelerators for attach to an instance,
  post-attach cleanup (if any) after successful attach, releasing
  device/accelerator resources on instance termination or failed
  attach, etc.)
* os-acc plugins deal with hypervisor/system/architecture-specific
  aspects, including handling the Instance Half of the attach (e.g.
  for libvirt with PCI, preparing the XML snippet to be included in
  the domain XML).

When invoked by Nova compute to attach accelerator(s) to an instance,
os-acc would call the Cyborg driver to prepare a VAN (Virtual
Accelerator Nexus, which is a handle object for attaching an
accelerator to an instance, similar to VIFs for networking). Such
preparation may involve configuring the device in some way, including
programming for FPGAs. This sets up a VAN object with the necessary
data for the attach (e.g. PCI VF, Power DRC index, etc.). Then the
os-acc would call a plugin to do the needful for that hypervisor,
using that VAN. Finally, the os-acc may call the Cyborg driver again
to do any post-attach cleanup, if needed. A more detailed workflow is
here:
https://docs.google.com/drawings/d/1cX06edia_Pr7P5nOB08VsSMsgznyrz4Yy2u8nb596sU/edit?usp=sharing

Thus, the drivers and plugins are expected to be complementary.
For example, for 2 devices of types T1 and T2, there shall be 2
separate Cyborg drivers. Further, we would have separate plugins for,
say, x86+KVM systems and Power systems. We could then have four
different deployments -- T1 on x86+KVM, T2 on x86+KVM, T1 on Power,
T2 on Power -- by suitable combinations of the drivers and plugins.

It is possible that there may be scenarios where the separation of
roles between the plugins and the drivers is not so clear-cut. That
can be addressed by allowing the plugins to call into Cyborg drivers
in the future, and/or by other mechanisms.

One secondary detail to note is that Nova compute calls os-acc per
instance for all accelerators for that instance, not once for each
accelerator. There are two reasons for that:

* I think this is how Nova deals with os-vif [3].
* If some accelerators got allocated/configured, and the next
  accelerator configuration fails, a rollback needs to be done. This
  is better done in os-acc than in Nova compute.

Cyborg drivers are invoked both by the Cyborg agent (for
discovery/enumeration) and by os-acc (for instance attach). Both shall
use Stevedore to locate and load the drivers. A single Python module
may implement both sets of interfaces, like this:

   +--------------+          +--------+
   | Nova Compute |          | Cyborg |
   +------+-------+          | Agent  |
          |                  +---+----+
   +------v-----+                |
   |   os-acc   |                |
   +------+-----+                |
          |                      |
          |    Cyborg driver     |
   +------v-----------+----------v------+
   | UN/PLUG          | DISCOVER        |
   | ACCELERATORS     | ACCELERATORS    |
   | FROM INSTANCES   |                 |
   |                  |                 |
   | * can_handle()   | * get_devices() |
   | * prepareVAN()   |                 |
   | * postplug()     |                 |
   | * unprepareVAN() |                 |
   +------------------+-----------------+

If there are no objections to the above, I will update the spec [2].
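As a sketch of what "a single Python module implementing both sets of
interfaces" could look like: the skeleton below takes only the method
names from the diagram; the class name and return shapes are invented
for illustration. In practice the class would be registered under a
setuptools entry point and located with stevedore (e.g. via
stevedore.driver.DriverManager), per the proposal:

```python
# Skeleton of a single Cyborg driver module exposing both interface
# sets from the diagram: discovery (called by the Cyborg agent) and
# the Device Half of attach (called via os-acc). The class name and
# return shapes are illustrative assumptions.
class FakeGPUDriver:
    # --- Discovery half: invoked by the Cyborg agent ---
    def get_devices(self):
        """Enumerate the accelerator devices this driver manages."""
        return [{'type': 'FAKE_GPU', 'address': '0000:81:00.0'}]

    # --- Device Half of attach: invoked through os-acc ---
    def can_handle(self, accel_request):
        """Say whether this driver can serve the request."""
        return accel_request.get('device_type') == 'FAKE_GPU'

    def prepareVAN(self, instance_info, accel_request):
        """Configure/program the device; return a VAN-like handle."""
        return {'attach_handle': '0000:81:00.0', 'attach_type': 'pci'}

    def postplug(self, van, instance_info):
        """Post-attach cleanup, if any."""

    def unprepareVAN(self, van, instance_info):
        """Release device resources on detach or failed attach."""
```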
[1]
http://eavesdrop.openstack.org/irclogs/%23openstack-cyborg/%23openstack-cyborg.2018-07-30.log.html#t2018-07-30T16:25:41-2
[2] https://review.openstack.org/#/c/577438/
[3]
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1529

Regards,
Sundar