Re: [openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal

2018-08-01 Thread Eric Fried
Sundar-

> On an unrelated note, thanks for the
> pointer to the GPU spec
> (https://review.openstack.org/#/c/579359/10/doc/source/specs/rocky/device-passthrough.rst).
> I will review that.

Thanks. Please note that this is for nova-powervm, PowerVM's
*out-of-tree* compute driver. We hope to bring this into the in-tree
driver eventually (unless we skip straight to the cyborg model :) but it
should give a good idea of some of the requirements and use cases we're
looking to support.

> Fair enough. We had discussed that too. The Cyborg drivers can also
> invoke REST APIs etc. for Power.

Ack.

> Agreed. So, we could say:
> - The plugins do the instance half. They are hypervisor-specific and
> platform-specific. (The term 'platform' subsumes both the architecture
> (Power, x86) and the server/system type.) They are invoked by os-acc.
> - The drivers do the device half, device discovery/enumeration and
> anything not explicitly assigned to plugins. They contain
> device-specific and platform-specific code. They are invoked by Cyborg
> agent and os-acc.

Sounds good.

> Are you ok with the workflow in
> https://docs.google.com/drawings/d/1cX06edia_Pr7P5nOB08VsSMsgznyrz4Yy2u8nb596sU/edit?usp=sharing
> ?

Yes (but see below).

>> You mean for getVAN()?
> Yes -- BTW, I renamed it as prepareVANs() or prepareVAN(), because it is
> not just a query as the name getVAN implies, but has side effects.

Ack.

>> Because AFAIK, os_vif.plug(list_of_vif_objects,
>> InstanceInfo) is *not* how nova uses os-vif for plugging.
> 
> Yes, the os-acc will invoke the plug() once per VAN. IIUC, Nova calls
> Neutron once per instance for all networks, as seen in this code
> sequence in nova/nova/compute/manager.py:
> 
> _build_and_run_instance() --> _build_resources() -->
> 
>     _build_networks_for_instance() --> _allocate_network()
> 
> The _allocate_network() actually takes a list of requested_networks, and
> handles all networks for an instance [1].
> 
> Chasing this further down:
> 
> _allocate_network --> _allocate_network_async()
> 
> --> self.network_api.allocate_for_instance()
> 
>  == nova/network/rpcapi.py::allocate_for_instance()
> 
> So, even the RPC out of Nova seems to take a list of networks [2].

Yes yes, but by the time we get to os_vif.plug(), we're doing one VIF at
a time. That corresponds to what you've got in your flow diagram, so as
long as that's accurate, I'm fine with it.

That said, we could discuss os_acc.plug taking a list of VANs and
threading out the calls to the plugin's plug() method (which takes one
at a time). I think we've talked a bit about this before: the pros and
cons of having the threading managed by os-acc or by the plugin. We
could have the same discussion for prepareVANs() too.
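
Something like this, just to illustrate the split (names made up, not a
real API proposal):

    from concurrent import futures

    def plug(plugin, instance_info, vans):
        # os-acc fans out; the plugin still takes one VAN at a time.
        with futures.ThreadPoolExecutor(max_workers=len(vans) or 1) as pool:
            fs = [pool.submit(plugin.plug, instance_info, van)
                  for van in vans]
            # result() re-raises any plugin failure, so os-acc (not the
            # plugin) owns the error-handling/rollback policy.
            return [f.result() for f in fs]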

> [1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1529
> [2] https://github.com/openstack/nova/blob/master/nova/network/rpcapi.py#L163
>> Thanks,
>> Eric
> Regards,
> Sundar


Re: [openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal

2018-08-01 Thread Nadathur, Sundar

Hi Eric,
    Please see my responses inline. On an unrelated note, thanks for 
the pointer to the GPU spec 
(https://review.openstack.org/#/c/579359/10/doc/source/specs/rocky/device-passthrough.rst). 
I will review that.


On 7/31/2018 10:42 AM, Eric Fried wrote:

> Sundar-
>
>>   * Cyborg drivers deal with device-specific aspects, including
>> discovery/enumeration of devices and handling the Device Half of the
>> attach (preparing devices/accelerators for attach to an instance,
>> post-attach cleanup (if any) after successful attach, releasing
>> device/accelerator resources on instance termination or failed
>> attach, etc.)
>>   * os-acc plugins deal with hypervisor/system/architecture-specific
>> aspects, including handling the Instance Half of the attach (e.g.
>> for libvirt with PCI, preparing the XML snippet to be included in
>> the domain XML).
>
> This sounds well and good, but discovery/enumeration will also be
> hypervisor/system/architecture-specific. So...
Fair enough. We had discussed that too. The Cyborg drivers can also 
invoke REST APIs etc. for Power.

>> Thus, the drivers and plugins are expected to be complementary. For
>> example, for 2 devices of types T1 and T2, there shall be 2 separate
>> Cyborg drivers. Further, we would have separate plugins for, say,
>> x86+KVM systems and Power systems. We could then have four different
>> deployments -- T1 on x86+KVM, T2 on x86+KVM, T1 on Power, T2 on Power --
>> by suitable combinations of the drivers and plugins.
>
> ...the discovery/enumeration code for T1 on x86+KVM (lsdev? lspci?
> walking the /dev file system?) will be totally different from the
> discovery/enumeration code for T1 on Power
> (pypowervm.wrappers.ManagedSystem.get(adapter)).
>
> I don't mind saying "drivers do the device side; plugins do the instance
> side" but I don't see getting around the fact that both "sides" will
> need to have platform-specific code.

Agreed. So, we could say:
- The plugins do the instance half. They are hypervisor-specific and 
platform-specific. (The term 'platform' subsumes both the architecture 
(Power, x86) and the server/system type.) They are invoked by os-acc.
- The drivers do the device half, device discovery/enumeration and 
anything not explicitly assigned to plugins. They contain 
device-specific and platform-specific code. They are invoked by Cyborg 
agent and os-acc.


Are you ok with the workflow in 
https://docs.google.com/drawings/d/1cX06edia_Pr7P5nOB08VsSMsgznyrz4Yy2u8nb596sU/edit?usp=sharing 
?

>> One secondary detail to note is that Nova compute calls os-acc per
>> instance for all accelerators for that instance, not once for each
>> accelerator.
>
> You mean for getVAN()?
Yes -- BTW, I renamed it as prepareVANs() or prepareVAN(), because it is 
not just a query as the name getVAN implies, but has side effects.

> Because AFAIK, os_vif.plug(list_of_vif_objects,
> InstanceInfo) is *not* how nova uses os-vif for plugging.


Yes, the os-acc will invoke the plug() once per VAN. IIUC, Nova calls 
Neutron once per instance for all networks, as seen in this code 
sequence in nova/nova/compute/manager.py:


_build_and_run_instance() --> _build_resources() -->

    _build_networks_for_instance() --> _allocate_network()

The _allocate_network() actually takes a list of requested_networks, and 
handles all networks for an instance [1].


Chasing this further down:

_allocate_network --> _allocate_network_async()

--> self.network_api.allocate_for_instance()

 == nova/network/rpcapi.py::allocate_for_instance()

So, even the RPC out of Nova seems to take a list of networks [2].

[1] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1529
[2] https://github.com/openstack/nova/blob/master/nova/network/rpcapi.py#L163

> Thanks,
> Eric

Regards,
Sundar



Re: [openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal

2018-07-31 Thread Eric Fried
Sundar-

>   * Cyborg drivers deal with device-specific aspects, including
> discovery/enumeration of devices and handling the Device Half of the
> attach (preparing devices/accelerators for attach to an instance,
> post-attach cleanup (if any) after successful attach, releasing
> device/accelerator resources on instance termination or failed
> attach, etc.)
>   * os-acc plugins deal with hypervisor/system/architecture-specific
> aspects, including handling the Instance Half of the attach (e.g.
> for libvirt with PCI, preparing the XML snippet to be included in
> the domain XML).

This sounds well and good, but discovery/enumeration will also be
hypervisor/system/architecture-specific. So...

> Thus, the drivers and plugins are expected to be complementary. For
> example, for 2 devices of types T1 and T2, there shall be 2 separate
> Cyborg drivers. Further, we would have separate plugins for, say,
> x86+KVM systems and Power systems. We could then have four different
> deployments -- T1 on x86+KVM, T2 on x86+KVM, T1 on Power, T2 on Power --
> by suitable combinations of the drivers and plugins.

...the discovery/enumeration code for T1 on x86+KVM (lsdev? lspci?
walking the /dev file system?) will be totally different from the
discovery/enumeration code for T1 on Power
(pypowervm.wrappers.ManagedSystem.get(adapter)).

I don't mind saying "drivers do the device side; plugins do the instance
side" but I don't see getting around the fact that both "sides" will
need to have platform-specific code.

> One secondary detail to note is that Nova compute calls os-acc per
> instance for all accelerators for that instance, not once for each
> accelerator.

You mean for getVAN()? Because AFAIK, os_vif.plug(list_of_vif_objects,
InstanceInfo) is *not* how nova uses os-vif for plugging.

Thanks,
Eric


[openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal

2018-07-30 Thread Nadathur, Sundar

Hi Eric and all,
    With recent discussions [1], we have convergence on how Power and 
other architectures can use Cyborg. Before I update the spec [2], I am 
setting down some key aspects of the updates, so that we are all aligned.


The accelerator-instance attachment has two parts:

 * The connection between the accelerator and a host-visible attach
   handle, such as a PCI function or a mediated device UUID. We call
   this the Device Half of the attach.
 * The connection between the attach handle and the instance. We name
   this the Instance Half of the attach.
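
For concreteness, the host-visible attach handle could be modeled like
this (hypothetical fields, drawn only from the examples in this thread):

    import dataclasses
    from typing import Optional

    @dataclasses.dataclass
    class AttachHandle:
        """Host-visible handle the Device Half hands to the Instance Half.
        Exactly one of these would be set per attach."""
        pci_address: Optional[str] = None  # PCI function, e.g. '0000:06:10.1'
        mdev_uuid: Optional[str] = None    # mediated device UUID
        drc_index: Optional[int] = None    # Power DRC index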

I propose two different extensibility mechanisms:

 * Cyborg drivers deal with device-specific aspects, including
   discovery/enumeration of devices and handling the Device Half of the
   attach (preparing devices/accelerators for attach to an instance,
   post-attach cleanup (if any) after successful attach, releasing
   device/accelerator resources on instance termination or failed
   attach, etc.)
 * os-acc plugins deal with hypervisor/system/architecture-specific
   aspects, including handling the Instance Half of the attach (e.g.
   for libvirt with PCI, preparing the XML snippet to be included in
   the domain XML).
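
To make the split concrete, here is a rough sketch of the two
interfaces. This is hypothetical code: the driver method names follow
the diagram further below, and the plugin shape is modeled on os-vif.

    import abc

    class CyborgDriver(abc.ABC):
        """Device Half: device-specific; invoked by the Cyborg agent
        (discovery) and by os-acc (attach/detach)."""

        @abc.abstractmethod
        def get_devices(self):
            """Discover/enumerate the devices this driver handles."""

        @abc.abstractmethod
        def can_handle(self, van):
            """Report whether this driver manages the VAN's device."""

        @abc.abstractmethod
        def prepareVAN(self, instance_info, allocation):
            """Configure the device (possibly programming an FPGA) and
            return a VAN carrying the attach data (PCI VF, Power DRC
            index, etc.)."""

        @abc.abstractmethod
        def postplug(self, instance_info, van):
            """Post-attach cleanup, if any."""

        @abc.abstractmethod
        def unprepareVAN(self, van):
            """Release device resources on detach or failed attach."""

    class OsAccPlugin(abc.ABC):
        """Instance Half: hypervisor- and platform-specific; invoked
        by os-acc."""

        @abc.abstractmethod
        def plug(self, instance_info, van):
            """Attach one prepared VAN to the instance (e.g. for
            libvirt with PCI, produce the XML snippet that goes into
            the domain XML)."""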

When invoked by Nova compute to attach accelerator(s) to an instance, 
os-acc would call the Cyborg driver to prepare a VAN (Virtual 
Accelerator Nexus, which is a handle object for attaching an accelerator 
to an instance, similar to VIFs for networking). Such preparation may 
involve configuring the device in some way, including programming for 
FPGAs. This sets up a VAN object with the necessary data for the attach 
(e.g. PCI VF, Power DRC index, etc.). Then the os-acc would call a 
plugin to do the needful for that hypervisor, using that VAN. Finally 
the os-acc may call the Cyborg driver again to do any post-attach 
cleanup, if needed.
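
Per accelerator, the ordering would be (hypothetical glue code inside
os-acc, names illustrative):

    def attach_one(driver, plugin, instance_info, alloc):
        # Device Half: configure/program the device, fill in the
        # attach data (PCI VF, Power DRC index, etc.).
        van = driver.prepareVAN(instance_info, alloc)
        # Instance Half: hypervisor-specific attach.
        plugin.plug(instance_info, van)
        # Optional post-attach cleanup on the device side.
        driver.postplug(instance_info, van)
        return van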


A more detailed workflow is here: 
https://docs.google.com/drawings/d/1cX06edia_Pr7P5nOB08VsSMsgznyrz4Yy2u8nb596sU/edit?usp=sharing 



Thus, the drivers and plugins are expected to be complementary. For 
example, for 2 devices of types T1 and T2, there shall be 2 separate 
Cyborg drivers. Further, we would have separate plugins for, say, 
x86+KVM systems and Power systems. We could then have four different 
deployments -- T1 on x86+KVM, T2 on x86+KVM, T1 on Power, T2 on Power -- 
by suitable combinations of the drivers and plugins.


There may be scenarios where the separation of roles between the plugins
and the drivers is not so clear-cut. That can be addressed by allowing
the plugins to call into Cyborg drivers in the future and/or by other
mechanisms.


One secondary detail to note is that Nova compute calls os-acc per 
instance for all accelerators for that instance, not once for each 
accelerator. There are two reasons for that:


 * I think this is how Nova deals with os-vif [3].
 * If some accelerators have already been allocated/configured and the
   next accelerator's configuration fails, a rollback needs to be done.
   This is better done in os-acc than in Nova compute (see the sketch
   below).
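
Roughly (again hypothetical code, only to show where the rollback logic
would live):

    def attach_all(plugin, instance_info, allocations):
        # os-acc owns the per-instance loop, so it can unwind cleanly.
        attached = []
        try:
            for alloc in allocations:
                driver = _driver_for(alloc)  # hypothetical driver lookup
                van = driver.prepareVAN(instance_info, alloc)
                attached.append((driver, van))
                plugin.plug(instance_info, van)
                driver.postplug(instance_info, van)
        except Exception:
            # Roll back whatever already succeeded, then re-raise so
            # Nova compute sees a single clean failure.
            for driver, van in reversed(attached):
                driver.unprepareVAN(van)
            raise
        return [van for _, van in attached]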

Cyborg drivers are invoked both by the Cyborg agent (for 
discovery/enumeration) and by os-acc (for instance attach). Both shall 
use Stevedore to locate and load the drivers. A single Python module may 
implement both sets of interfaces, like this:


+--------------+          +--------+
| Nova Compute |          | Cyborg |
+------+-------+          | Agent  |
       |                  +---+----+
+------v-------+              |
|    os-acc    |              |
+------+-------+              |
       |                      |
       |    Cyborg driver     |
+------v---------------+------v----------+
| UN/PLUG ACCELERATORS | DISCOVER        |
| FROM INSTANCES       | ACCELERATORS    |
|                      |                 |
| * can_handle()       | * get_devices() |
| * prepareVAN()       |                 |
| * postplug()         |                 |
| * unprepareVAN()     |                 |
+----------------------+-----------------+
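
On the Stevedore side, I expect something along these lines (the
entry-point namespace and names here are invented for illustration):

    # In a driver package's setup.cfg (hypothetical namespace):
    #   [entry_points]
    #   cyborg.accelerator.drivers =
    #       fpga_t1 = t1_driver.driver:T1Driver

    from stevedore import driver, extension

    # Cyborg agent side: load every registered driver for discovery.
    mgr = extension.ExtensionManager(
        namespace='cyborg.accelerator.drivers',  # hypothetical
        invoke_on_load=True)
    for ext in mgr:
        devices = ext.obj.get_devices()

    # os-acc side: load one specific driver by name for the attach path.
    drv = driver.DriverManager(
        namespace='cyborg.accelerator.drivers',  # hypothetical
        name='fpga_t1',
        invoke_on_load=True).driver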

If there are no objections to the above, I will update the spec [2].

[1] http://eavesdrop.openstack.org/irclogs/%23openstack-cyborg/%23openstack-cyborg.2018-07-30.log.html#t2018-07-30T16:25:41-2
[2] https://review.openstack.org/#/c/577438/
[3] https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1529


Regards,
Sundar