Re: [openstack-dev] [Ironic] Node groups and multi-node operations

Joshua Harlow Sun, 26 Jan 2014 11:11:49 -0800

Doesn't nova already have logic for creating N virtual machines (similar to a 
group) in the same request? I thought it did (maybe it doesn't anymore in the 
v3 API). Creating N bare metal machines seems like it would fit that API?
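
For illustration, a rough sketch of what such a multi-instance request body
might look like. The min_count / max_count fields are the ones I recall from
the v2 multiple-create extension; the helper name and the image/flavor values
are made up, so treat this as an assumption to check against the API docs.

```python
import json


def build_multi_create_body(name, image_ref, flavor_ref, count):
    """Build a server-create body asking for `count` instances in one request.

    Hypothetical helper: only the shape of the body is the point here.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "server": {
            "name": name,
            "imageRef": image_ref,
            "flavorRef": flavor_ref,
            # Asking for exactly `count` instances: min and max are equal.
            "min_count": count,
            "max_count": count,
        }
    }


# Placeholder identifiers, not real image or flavor IDs.
body = build_multi_create_body("bm-node", "IMAGE_UUID", "FLAVOR_ID", 4)
print(json.dumps(body, indent=2))
```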


Sent from my really tiny device...

> On Jan 22, 2014, at 4:50 PM, "Devananda van der Veen" 
> <[email protected]> wrote:
> 
> So, a conversation came up again today around whether or not Ironic will, in 
> the future, support operations on groups of nodes. Some folks have expressed 
> a desire for Ironic to expose operations on groups of nodes; others want 
> Ironic to host the hardware-grouping data so that eg. Heat and Tuskar can 
> make more intelligent group-aware decisions or represent the groups in a UI. 
> Neither of these has an implementation in Ironic today... and we still need 
> to implement a host of other things before we start on this. FWIW, this 
> discussion is meant to stimulate thinking ahead to things we might address in 
> Juno, and aligning development along the way.
> 
> There's also some refactoring / code cleanup which is going on and worth 
> mentioning because it touches the part of the code which this discussion 
> impacts. For our developers, here is additional context:
> * our TaskManager class supports locking >1 node atomically, but both the 
> driver API and our REST API only support operating on one node at a time. 
> AFAIK, nowhere in the code do we actually pass a group of nodes.
> * for historical reasons, our driver API requires both a TaskManager and a 
> Node object be passed to all methods. However, the TaskManager object 
> contains a reference to the Node(s) which it has acquired, so the node 
> parameter is redundant.
> * we've discussed cleaning this up, but I'd like to avoid refactoring the 
> same interfaces again when we go to add group-awareness.
> 
> 
> I'll try to summarize the different axes of concern around which the 
> discussion of node groups seems to converge...
> 
> 1: physical vs. logical grouping
> - Some hardware is logically, but not strictly physically, grouped. Eg, 1U 
> servers in the same rack. There is some grouping, such as failure domain, but 
> operations on discrete nodes are discrete. This grouping should be modeled 
> somewhere, and sometimes a user may wish to perform an operation on that 
> group. Is a higher layer (tuskar, heat, etc) sufficient? I think so.
> - Some hardware _is_ physically grouped. Eg, high-density cartridges which 
> share firmware state or a single management end-point, but are otherwise 
> discrete computing devices. This grouping must be modeled somewhere, and 
> certain operations cannot be performed on one member without affecting all 
> members. Things will break if each node is treated independently.
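
One way to picture the difference between the two kinds of grouping, as a
sketch only. The Node fields (rack, enclosure) and the affected_by helper are
hypothetical names chosen for this example, not an existing Ironic model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    uuid: str
    rack: Optional[str] = None       # logical grouping, e.g. a failure domain
    enclosure: Optional[str] = None  # physical grouping: shared firmware/endpoint


def affected_by(nodes, target_uuid, touches_shared_state):
    """Return the UUIDs affected by an operation aimed at one node.

    Logical grouping never widens the blast radius; physical grouping does,
    but only for operations that touch the shared state.
    """
    by_uuid = {n.uuid: n for n in nodes}
    target = by_uuid[target_uuid]
    if not touches_shared_state or target.enclosure is None:
        return {target.uuid}
    return {n.uuid for n in nodes if n.enclosure == target.enclosure}


nodes = [
    Node("a", rack="r1"),                  # 1U servers: same rack, independent
    Node("b", rack="r1"),
    Node("c", rack="r2", enclosure="e1"),  # cartridges sharing one enclosure
    Node("d", rack="r2", enclosure="e1"),
]

print(affected_by(nodes, "a", touches_shared_state=True))   # only "a"
print(affected_by(nodes, "c", touches_shared_state=True))   # "c" and "d"
print(affected_by(nodes, "c", touches_shared_state=False))  # only "c"
```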
> 
> 2: performance optimization
> - Some operations may be optimized if there is an awareness of concurrent 
> identical operations. Eg, deploy the same image to lots of nodes using 
> multicast or bittorrent. If Heat were to inform Ironic that this deploy is 
> part of a group, the optimization would be deterministic. If Heat does not 
> inform Ironic of this grouping, but Ironic infers it (eg, from timing of 
> requests for similar actions) then optimization is possible but 
> non-deterministic, and may be much harder to reason about or debug.
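
To make the deterministic vs. non-deterministic distinction concrete, here is
a small sketch. The DeployRequest type, the group_id hint, and the timing
window are all assumptions invented for the example; nothing like this exists
in Ironic or Heat today.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeployRequest:
    node: str
    image: str
    arrived_at: float
    group_id: Optional[str] = None  # hypothetical hint from the caller


def batch_by_hint(requests):
    """Batch on the caller's explicit hint; arrival time plays no role."""
    batches = defaultdict(list)
    singles = []
    for r in requests:
        if r.group_id is None:
            singles.append([r.node])
        else:
            batches[(r.group_id, r.image)].append(r.node)
    return sorted(sorted(b) for b in batches.values()) + sorted(singles)


def batch_by_timing(requests, window):
    """Infer batches: same image, arriving within `window` s of the batch start."""
    out = []
    open_batches = {}  # image -> (start_time, list of nodes)
    for r in sorted(requests, key=lambda r: r.arrived_at):
        start, nodes = open_batches.get(r.image, (None, None))
        if start is None or r.arrived_at - start > window:
            nodes = []
            open_batches[r.image] = (r.arrived_at, nodes)
            out.append(nodes)
        nodes.append(r.node)
    return out


def make(times):
    names = ["n1", "n2", "n3"]
    return [DeployRequest(n, "img", t, group_id="stack-1")
            for n, t in zip(names, times)]


fast, slow = make([0.0, 0.2, 0.4]), make([0.0, 0.5, 3.0])
print(batch_by_hint(fast), batch_by_hint(slow))            # same batches
print(batch_by_timing(fast, 1.0), batch_by_timing(slow, 1.0))  # differ
```

The hinted version yields one batch of three in both runs; the inferred
version splits the slow run in two, which is the kind of behavior that would
be hard to reason about or debug.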
> 
> 3: APIs
> - Higher layers of OpenStack (eg, Heat) are expected to orchestrate discrete 
> resource units into a larger group operation. This is where the grouping 
> happens today, but already results in inefficiencies when performing 
> identical operations at scale. Ironic may be able to get around this by 
> coalescing adjacent requests for the same operation, but this would be 
> non-deterministic.
> - Moving group-awareness or group-operations into the lower layers (eg, 
> Ironic) looks like it will require non-trivial changes to Heat and Nova, and, 
> in my opinion, violates a layer-constraint that I would like to maintain. On 
> the other hand, we could avoid the challenges around coalescing. This might 
> be necessary to support physically-grouped hardware anyway.
> 
> 
> If Ironic coalesces requests, it could be done in either the ConductorManager 
> layer or in the drivers themselves. The difference would be whether our 
> internal driver API accepts one node or a set of nodes for each operation. 
> It'll also impact our locking model. Both of these are implementation details 
> that wouldn't affect other projects, but would affect our driver developers.
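
The two placements could look roughly like this. Everything here
(conductor_deploy, deploy, deploy_many, the action tuples) is a hypothetical
illustration of the choice, not the current ConductorManager or driver API.

```python
class SingleNodeDriver:
    """Driver API takes one node; any coalescing must happen above it."""

    def deploy(self, node):
        return [("unicast", (node,))]


class GroupAwareDriver:
    """Driver API takes a set of nodes and can optimize across them."""

    def deploy_many(self, nodes):
        return [("multicast", tuple(sorted(nodes)))]


def conductor_deploy(driver, nodes):
    """Hypothetical conductor entry point for a set of nodes.

    If the driver accepts a set, hand the whole set over; otherwise fan
    out to one call per node.
    """
    if hasattr(driver, "deploy_many"):
        return driver.deploy_many(nodes)
    actions = []
    for node in sorted(nodes):
        actions.extend(driver.deploy(node))
    return actions


nodes = {"n2", "n1", "n3"}
print(conductor_deploy(SingleNodeDriver(), nodes))
print(conductor_deploy(GroupAwareDriver(), nodes))
```

In the first case the conductor would have to hold the locks for all nodes
while it loops; in the second the driver receives the set and the lock scope
becomes part of the driver contract.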
> 
> Also, until Ironic models physically-grouped hardware relationships in some 
> internal way, we're going to have difficulty supporting that class of 
> hardware. Is that OK? What is the impact of not supporting such hardware? It 
> seems, at least today, to be pretty minimal.
> 
> 
> Discussion is welcome.
> 
> -Devananda
> _______________________________________________
> OpenStack-dev mailing list
> [email protected]
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

