Re: [openstack-dev] [Heat] Comments on Steve Baker's Proposal on HOT Software Config
Mike Spreitzer <mspre...@us.ibm.com> wrote on 10/30/2013 03:56 PM:
> Lakshminaraya Renganarayana <lren...@us.ibm.com> wrote on 10/30/2013 03:35:32 PM:
>> Zane Bitter <zbit...@redhat.com> wrote on 10/29/2013 08:46:21 AM:
>> ...
>> In this method (i.e. option (2) above) shouldn't we be building the
>> dependency graph in Heat rather than running through them sequentially
>> as specified by the user? In that case, we should use a dictionary,
>> not a list:
>>
>>   app_server:
>>     type: OS::Nova::Server
>>     properties:
>>       components:
>>         install_user_profile:
>>           definition: InstallWasProfile
>>           params: user_id
>>         install_admin_profile:
>>           definition: InstallWasProfile
>>           params: admin_id
>>
>> I missed this implication of using a list! You are right, it should be
>> a dictionary and Heat would be building the dependence graph.
>
> Using a dictionary instead of a list can work, but I think we might be
> going overboard here. Do we expect the component invocations on a given
> VM instance to run concurrently? I think that has been dissed before.
> Chef users are happy to let a role be a list of recipes, not a DAG. A
> list is simple; is there an actual problem with it?

Yes, there was some agreement on component invocations on a given VM instance being run sequentially. However, the issue here is slightly different. If, as a design principle, Heat analyzes dependences between component invocations, then it should do that irrespective of whether the component invocations are on the same VM or on different VMs. Given this, a list of component invocations would imply an ordering that is in addition to the ordering induced by dependences, whereas a dictionary would not impose any additional ordering.

Thanks,
LN

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
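If Heat builds the dependence graph from a dictionary of component invocations, the induced order is just a topological sort of that graph. A minimal sketch in Python (the `depends_on` key is an assumed representation of a declared dependence, not syntax from the proposal):

```python
from graphlib import TopologicalSorter

# Hypothetical parsed "components" dictionary; the "depends_on"
# key is an assumed way of expressing an explicit dependence.
components = {
    "install_user_profile": {"definition": "InstallWasProfile",
                             "depends_on": ["install_base"]},
    "install_admin_profile": {"definition": "InstallWasProfile",
                              "depends_on": ["install_base"]},
    "install_base": {"definition": "InstallWas", "depends_on": []},
}

# Build the dependence graph and let the engine pick any valid order;
# a dictionary imposes no ordering beyond the declared dependences.
graph = {name: spec["depends_on"] for name, spec in components.items()}
order = list(TopologicalSorter(graph).static_order())
print(order)  # install_base comes first; sibling order is unconstrained
```

Note that any order satisfying the declared dependences is acceptable, which is exactly the point LN makes about a dictionary versus a list.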
Re: [openstack-dev] [Heat] Comments on Steve Baker's Proposal on HOT Software Config
Zane, thanks very much for the detailed feedback. I have added my comments inline.

Zane Bitter <zbit...@redhat.com> wrote on 10/29/2013 08:46:21 AM:
> ...
> As brief feedback on these suggestions:
>
> E1: Probably +1 for inputs, but tentative -1 for attributes. I'm not
> sure we can check anything useful with that other than internal
> consistency of the template.

The explicit specification of the attributes (or outputs) is aimed primarily at providing a name for the output so that it can be referred to in component invocations, and also by software config providers (Chef or Puppet, etc.) to access it. The use of the attributes for checking or validation is only a secondary aim, and this could just be a consistency check with name matching. If we had more information about these attributes, such as types and constraints (similar to the inputs of a template), we could validate more. But this is not the primary goal.

> I'd like to see some more detail about how inputs/outputs would be
> exposed in the configuration management systems - or, more
> specifically, how the user can extend this to arbitrary configuration
> management systems.

The way inputs/outputs are exposed in a CM system would depend on its conventions. In our use with Chef, we expose these inputs and outputs as Chef node attributes, i.e., via the node[][] hash. I could imagine a similar scheme for Puppet. For a shell type of CM provider, the inputs/outputs can be exposed as shell environment variables. To avoid name conflicts, these inputs/outputs can be prefixed by a namespace, say Heat.

> E2: +1 for Opt1, -1 for Opt2 (mixing namespaces is bad), -1 for Opt3

Agreed -- we also prefer Opt1!

> E3: Sounds like a real issue (also, the solution for E2 has to take
> this into account too); not sure about the implementation.

Yes, agreed -- E2 has to understand which output from which invocation of a component is being used or referred to. With invocation_ids it should be possible to uniquely identify the component invocation.

> In this method (i.e. option (2) above) shouldn't we be building the
> dependency graph in Heat rather than running through them sequentially
> as specified by the user? In that case, we should use a dictionary,
> not a list:
>
>   app_server:
>     type: OS::Nova::Server
>     properties:
>       components:
>         install_user_profile:
>           definition: InstallWasProfile
>           params: user_id
>         install_admin_profile:
>           definition: InstallWasProfile
>           params: admin_id

I missed this implication of using a list! You are right, it should be a dictionary and Heat would be building the dependence graph.

> E5: +1 but a question on where this is specified. In the component
> definition itself, or in the particular invocation of it on a server?
> Seems like it would have to be the latter.

Good point. I think it could be specified in both places. It definitely could be specified on a particular invocation. I am not fully sure about this, but one can also imagine cases where a cross-component dependency is true for all invocations of a component, and hence might be specified as part of the component definition, as a dependency between two component types.

Thanks,
LN
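The shell-provider convention mentioned above (inputs/outputs exposed as namespace-prefixed environment variables) could look like the sketch below; the `HEAT_` prefix and the flat input dictionary are illustrative assumptions, not part of the proposal:

```python
def inputs_to_env(inputs, prefix="HEAT_"):
    """Map template inputs to namespaced environment variables.

    Prefixing avoids collisions with variables the instance already
    defines; the HEAT_ prefix here is an assumption for illustration.
    """
    return {prefix + name.upper(): str(value)
            for name, value in inputs.items()}

env = inputs_to_env({"user_id": "was_user", "port": 8080})
print(env)  # {'HEAT_USER_ID': 'was_user', 'HEAT_PORT': '8080'}
```

A shell-based component would then read `$HEAT_USER_ID` and friends, with no Heat-specific tooling required on the instance.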
[openstack-dev] Comments on Steve Baker's Proposal on HOT Software Config
Hello,

A few of us at IBM studied Steve Baker's proposal on HOT Software Configuration. Overall the proposed constructs and syntax are great -- we really like the clean syntax and concise specification of components. We would like to propose a few minor extensions that help with better expression of dependencies among components and resources, and in turn enable cross-vm coordination. We have captured our thoughts on the following Wiki page:

https://wiki.openstack.org/wiki/Heat/Blueprints/hot-software-config-ibm-response

We would like to discuss these further ... please post your comments and suggestions.

Thank you,
LN
_
Lakshminarayanan Renganarayana
Research Staff Member
IBM T.J. Watson Research Center
http://researcher.ibm.com/person/us-lrengan
[openstack-dev] [Heat] Comments on Steve Baker's Proposal on HOT Software Config
Sorry, re-posting this with [Heat] in the subject line, because many of us have filters based on [Heat] in the subject line.

Hello,

A few of us at IBM studied Steve Baker's proposal on HOT Software Configuration. Overall the proposed constructs and syntax are great -- we really like the clean syntax and concise specification of components. We would like to propose a few minor extensions that help with better expression of dependencies among components and resources, and in turn enable cross-vm coordination. We have captured our thoughts on the following Wiki page:

https://wiki.openstack.org/wiki/Heat/Blueprints/hot-software-config-ibm-response

We would like to discuss these further ... please post your comments and suggestions.

Thank you,
LN
_
Lakshminarayanan Renganarayana
Research Staff Member
IBM T.J. Watson Research Center
Re: [openstack-dev] [Heat] HOT Software configuration proposal
Zane Bitter <zbit...@redhat.com> wrote on 10/22/2013 09:24:28 AM:
> On 22/10/13 09:15, Thomas Spatzier wrote:
>> BTW, the convention of properties being input and attributes being
>> output, i.e. that subtle distinction between properties and
>> attributes, is not really intuitive, at least not to me as a
>> non-native speaker, because I used to use both words as synonyms.
>
> As a native speaker, I can confidently state that it's not intuitive
> to anyone ;) We unfortunately inherited these names from the
> Properties section and the Fn::GetAtt function in cfn templates. It's
> even worse than that, because there's a whole category of... uh...
> things (DependsOn, DeletionPolicy, &c.) that don't even have a name -
> I always have to resist the urge to call them 'attributes' too.

At least for the components construct being proposed (by Steve Baker), shall we adopt a more explicit convention and require component definitions to explicitly name their inputs and outputs?

Thanks,
LN
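An explicit inputs/outputs convention would also make the consistency check mechanical: every output the template refers to must be declared by some component definition. A rough sketch of such a check, with both data shapes invented purely for illustration:

```python
def check_references(definitions, references):
    """Return references to outputs that no component declares.

    `definitions` maps component name -> declared output names;
    `references` is a list of (component, output) pairs used elsewhere
    in the template. Both shapes are assumptions for illustration.
    """
    missing = []
    for comp, output in references:
        if output not in definitions.get(comp, ()):
            missing.append((comp, output))
    return missing

defs = {"InstallWasProfile": ["profile_path"]}
refs = [("InstallWasProfile", "profile_path"),
        ("InstallWasProfile", "admin_port")]
print(check_references(defs, refs))  # [('InstallWasProfile', 'admin_port')]
```

This is exactly the "internal consistency of the template" check discussed under E1: cheap name matching, with richer validation possible only if types and constraints are also declared.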
Re: [openstack-dev] [Heat] A prototype for cross-vm synchronization and communication
Hi Steven,

Steven Hardy <sha...@redhat.com> wrote on 10/21/2013 11:27:43 AM:
> On Fri, Oct 18, 2013 at 02:45:01PM -0400, Lakshminaraya Renganarayana wrote:
> <snip>
>> The prototype is implemented in Python and Ruby is used for chef
>> interception.
>
> Where can we find the code?

What part of the code are you interested in? The Python pre-processor part or the Ruby chef interceptor part? I need to get clearance from IBM to post it on Git. I am guessing it might be easy to get clearance for the pre-processor code and a bit harder for the chef interceptor code. BTW, will you be attending the OpenStack summit in Hong Kong? I am planning to, and I can show you a demo of this pre-processor there (if the IBM clearance takes too long).

Thanks,
LN
Re: [openstack-dev] [Heat] A prototype for cross-vm synchronization and communication
Hi Stan,

Thanks for the comments. As you have observed, the prototype that I have built is tied to Chef. I just wanted to describe it here for reference, not as a proposal for the general implementation. What I would like to work on is a more general solution that is agnostic to (or works with any) underlying CM tool (such as chef, puppet, SaltStack, Murano, etc.).

Regarding identifying reads/writes: I was thinking that we could come up with a general syntax + semantics for explicitly defining the reads/writes of Heat components. I think we can extend Steve Baker's recent proposal to include the inputs/outputs in software component definitions. Your experience with the Unified Agent would be valuable for this. I would be happy to collaborate with you!

Thanks,
LN

Stan Lagun <sla...@mirantis.com> wrote on 10/21/2013 10:03:58 AM:
> Hi Lakshminarayanan,
>
> Seems like a solid plan. I'm probably wrong here, but isn't this too
> tied to chef? I believe the solution should equally be suitable for
> chef, puppet, SaltStack, Murano, or maybe all I need is just a plain
> bash script execution. It may be difficult to intercept script reads
> the way it is possible with chef's node[][]. In Murano we have a
> generic agent that can integrate all such deployment platforms using a
> common syntax. The agent specification can be found here:
> https://wiki.openstack.org/wiki/Murano/UnifiedAgent
> and it can be helpful, or at least can be a source of design ideas.
>
> I'm very positive on adopting such a solution for Heat. There would be
> a significant amount of work to abstract all underlying technologies
> (chef, ZooKeeper, etc.) so that they become pluggable and replaceable
> without introducing hard-coded dependencies for Heat, and bringing
> everything to production quality level. We could collaborate on
> bringing such a solution to Heat if it would be accepted by Heat's
> core team and community.
>
> On Fri, Oct 18, 2013 at 10:45 PM, Lakshminaraya Renganarayana
> <lren...@us.ibm.com> wrote:
> <snip -- original prototype post quoted in full>
Re: [openstack-dev] [Heat] A prototype for cross-vm synchronization and communication
Thomas Spatzier <thomas.spatz...@de.ibm.com> wrote on 10/21/2013 08:29:47 AM:
> you mentioned an example in your original post, but I did not find it.
> Can you add the example?

Hi Thomas,

Here is the example I used earlier. Consider a two-VM app, with VMs vmA and vmB, and a set of software components (ai's and bi's) to be installed on them:

  vmA = base-vmA + a1 + a2 + a3
  vmB = base-vmB + b1 + b2 + b3

Let us say that software component b1 of vmB requires a config value produced by software component a1 of vmA. How do we declaratively model this dependence? Clearly, modeling a dependence between just base-vmA and base-vmB is not enough. However, defining a dependence between the whole of vmA and vmB is too coarse. It would be ideal to be able to define a dependence at the granularity of software components, i.e., vmB.b1 depends on vmA.a1. Of course, it would also be good to capture what value is passed between vmB.b1 and vmA.a1, so that the communication can be facilitated by the orchestration engine.

Thanks,
LN

Lakshminaraya Renganarayana <lren...@us.ibm.com> wrote on 18.10.2013 20:57:43:
> Just wanted to add a couple of clarifications:
> 1. The cross-vm dependences are captured via the reads/writes of
> attributes in resources and in software components (described in
> metadata sections).
> 2. These dependences are then realized via blocking reads and writes
> to ZooKeeper, which realizes the cross-vm synchronization and
> communication of values between the resources.
>
> <snip -- original prototype post quoted in full>
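The vmB.b1-depends-on-vmA.a1 example above can be simulated with a blocking read on a shared data space. In this toy sketch, threads stand in for the two VMs and an in-process dictionary stands in for ZooKeeper, which plays this role in the prototype:

```python
import threading

class DataSpace:
    """Toy global data space: non-blocking writes, blocking reads."""
    def __init__(self):
        self._data = {}
        self._cond = threading.Condition()

    def write(self, key, value):
        # Non-blocking: any writer succeeds immediately.
        with self._cond:
            self._data[key] = value
            self._cond.notify_all()

    def read(self, key, timeout=5):
        # Blocking: the reader waits until the value is available.
        with self._cond:
            self._cond.wait_for(lambda: key in self._data, timeout)
            return self._data[key]

space = DataSpace()
result = []

def b1():
    # Component b1 on vmB blocks until a1 on vmA publishes its value.
    result.append(space.read("vmA.a1.config_value"))

t = threading.Thread(target=b1)
t.start()
space.write("vmA.a1.config_value", "db_host=10.0.0.5")  # a1 on vmA
t.join()
print(result)  # ['db_host=10.0.0.5']
```

The key names ("vmA.a1.config_value") follow the fully qualified naming convention proposed elsewhere in this thread; the class itself is an illustration, not prototype code.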
[openstack-dev] [Heat] A prototype for cross-vm synchronization and communication
Hi,

In the last OpenStack Heat meeting there was good interest in proposals for cross-vm synchronization and communication, and I had mentioned the prototype I have built. I had also promised that I would post an outline of the prototype ... here it is. I might have missed some details; please feel free to ask / comment and I would be happy to explain more.

---
Goal of the prototype: Enable cross-vm synchronization and communication using a high-level declarative description (no wait-conditions). Use chef as the CM tool.

Design rationale / choices of the prototype (note that these were made just for the prototype, and I am not proposing them to be the choices for Heat/HOT):

D1: No new construct in Heat template
    => use metadata sections
D2: No extensions to the core Heat engine
    => use a pre-processor that will produce a Heat template that the
       standard Heat engine can consume
D3: Do not require chef recipes to be modified
    => use a convention of accessing inputs/outputs from chef node[][]
    => use ruby meta-programming to intercept reads/writes to node[][]
       and forward values
D4: Use a standard distributed coordinator (don't reinvent)
    => use ZooKeeper as a coordinator and as a global data space for
       communication

Overall, the flow is the following:

1. The user specifies a Heat template with details about software config
   and dependences in the metadata section of resources (see step S1
   below).
2. A pre-processor consumes this augmented Heat template and produces
   another Heat template with user-data sections containing cloud-init
   scripts, and also sets up a ZooKeeper instance with enough information
   to coordinate between the resources at runtime to realize the
   dependences and synchronization (see step S2).
3. The generated Heat template is fed into the standard Heat engine to
   deploy. After the VMs are created, the cloud-init script kicks in. The
   cloud-init script installs chef-solo and then starts the execution of
   the roles specified in the metadata section. During this execution of
   the recipes the coordination is realized (see steps S2 and S3 below).

Implementation scheme:

S1. Use the metadata section of each resource to describe (see attached
    example):
    - a list of roles
    - inputs to and outputs from each role, and their mapping to resource
      attributes (any attribute)
    - convention: these inputs/outputs will be passed through chef node
      attributes, node[][]

S2. Dependence analysis and cloud-init script generation.
    Dependence analysis:
    - resolve every reference that can be statically resolved using
      Heat's functions (this step just uses Heat's current dependence
      analysis -- thanks to Zane Bitter for helping me understand this)
    - flag all unresolved references as values resolved at run-time and
      communicated via the coordinator
    Use cloud-init in user-data sections:
    - automatically generate a script that bootstraps chef and runs the
      roles/recipes in the order specified in the metadata section
    - generate dependence info for ZooKeeper to coordinate at runtime

S3. Coordinate synchronization and communication at run-time:
    - intercept reads and writes to node[][]
    - if it is a remote read, get the value from ZooKeeper -- execution
      will block till the value is available
    - if a write is for a value required by a remote resource, write the
      value to ZooKeeper

The prototype is implemented in Python, and Ruby is used for the chef interception. There are alternatives for many of the choices I have made for the prototype:
- ZooKeeper can be replaced with any other service that provides a data
  space and distributed coordination
- chef can be replaced by any other CM tool (a little bit of design /
  convention is needed for other CM tools, because of the interception
  used in the prototype to catch reads/writes to node[][])
- the whole dependence analysis can be integrated into Heat's dependence
  analyzer
- the component construct proposed recently (by Steve Baker) for HOT/Heat
  can be used to specify much of what is specified using the metadata
  sections in this prototype

I am interested in using my experience with this prototype to contribute to HOT/Heat's cross-vm synchronization and communication design and code. I look forward to your comments.

Thanks,
LN
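The dependence analysis in step S2 amounts to partitioning references into statically resolvable ones and ones that must be routed through the coordinator at run time. A toy sketch of that split, with the reference representation invented purely for illustration:

```python
def analyze(references, static_values):
    """Partition references into statically resolved values and
    run-time dependences to be handled by the coordinator.

    `references` maps a consuming attribute name to the source
    attribute it reads; `static_values` holds what Heat can resolve
    at template time (e.g. via Ref/GetAtt). Both structures are
    assumptions for illustration, not the prototype's real format.
    """
    resolved, runtime = {}, []
    for attr, source in references.items():
        if source in static_values:
            resolved[attr] = static_values[source]   # resolved statically
        else:
            runtime.append((attr, source))           # route via ZooKeeper
    return resolved, runtime

resolved, runtime = analyze(
    {"db_ip": "vmA.PrivateIp", "app_token": "vmA.a1.token"},
    {"vmA.PrivateIp": "10.0.0.5"},
)
print(resolved)  # {'db_ip': '10.0.0.5'}
print(runtime)   # [('app_token', 'vmA.a1.token')]
```

In the prototype, the `runtime` set is what gets written into the dependence info handed to ZooKeeper for run-time coordination.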
Re: [openstack-dev] [Heat] A prototype for cross-vm synchronization and communication
Just wanted to add a couple of clarifications:

1. The cross-vm dependences are captured via the reads/writes of attributes in resources and in software components (described in metadata sections).
2. These dependences are then realized via blocking reads and writes to ZooKeeper, which realizes the cross-vm synchronization and communication of values between the resources.

Thanks,
LN

Lakshminaraya Renganarayana <lren...@us.ibm.com> wrote on 10/18/2013 02:45:01 PM:
> Subject: [openstack-dev] [Heat] A prototype for cross-vm
> synchronization and communication
>
> <snip -- original prototype post quoted in full>
Re: [openstack-dev] [Heat] HOT Software configuration proposal
Clint Byrum <cl...@fewbar.com> wrote on 10/16/2013 03:02:13 PM:
> Excerpts from Zane Bitter's message of 2013-10-16 06:16:33 -0700:
>> For me the crucial question is, how do we define the interface for
>> synchronising and passing data from and to arbitrary applications
>> running under an arbitrary configuration management system? Compared
>> to this, defining the actual format in which software applications
>> are specified in HOT seems like a Simple Matter of Bikeshedding ;)
>
> Agreed. This is one area where juju excels (making cross-node message
> passing simple). So perhaps we should take a look at what works from
> the juju model and copy it.

Actually, this exact point -- how do we define the interface for synchronising and passing data from and to arbitrary applications running under an arbitrary configuration management system? -- is what I was addressing in my message/proposal a couple of days back on the mailing list :-) Glad to see it echoed again. I am proposing that Heat should have a higher-level abstraction (than the current wait-conditions/signals) for synchronization and data exchange. I do not mind it being message passing as in juju. Based on our experience, I am proposing a ZooKeeper-style global data space with blocking reads and non-blocking writes.

Thanks,
LN
Re: [openstack-dev] [Heat] HOT Software orchestration proposal for workflows
Hi Angus, Thanks for detailed reply. I have a few comments that I have written below in the context. Angus Salkeld asalk...@redhat.com wrote on 10/13/2013 06:40:01 PM: - INPUTS: all the attributes that are consumed/used/read by that resource (currently, we have Ref, GetAttrs that can give this implicitly) - OUTPUTS: all the attributes that are produced/written by that resource (I do not know if this write-set is currently well-defined for a resource. I think some of them are implicitly defined by Heat on particular resource types.) - Global name-space and data-space : all the values produced and consumed (INPUTS/OUTPUTS) are described using a names that are fully qualified (XXX.stack_name.resource_name.property_name). The data values associated with these names are stored in a global data-space. Reads are blocking, i.e., reading a value will block the execution resource/thread until the value is available. Writes are non-blocking, i.e., any thread can write a value and the write will succeed immediately. I don't believe this would give us any new behaviour. I believe that in today's Heat, wait-conditions and signals are the only mechanism for synchronization during software configuration. The proposed mechanism would provide a higher level synchronization based on blocking-reads. For example, if one is using Chef for software configuration, then the recipes can use the proposed mechanism to wait for the all the node[][] attributes they require before starting the recipe execution. And, Heat can actually analyze and reason about deadlock properties of such a synchronization. On the other hand, if the recipe were using wait-conditions how would Heat reason about deadlock properties of it? The ability to define resources at arbitrary levels of granularity together with the explicit specification of INPUTS/OUTPUTS allows us to reap the benefits G1 and G2 outlined above. 
> > Note that the ability to reason about the inputs/outputs of each
> > resource and the induced dependencies will also allow Heat to detect
> > dead-locks via dependence cycles (benefit G3). This is already done
> > today in Heat for Refs, GetAttr on base-resources, but the proposal
> > is to extend the same to arbitrary attributes for any resource.
>
> How are TemplateResources and NestedStacks any different? To my
> knowledge this is already the case.
>
> > The blocking-read and non-blocking writes further structure the
> > specification to avoid deadlocks and race conditions (benefit G3).
>
> Have you experienced deadlocks with heat? I have never seen this...

Heat as it is today does not tackle the problem of synchronization during software configuration, and hence the problems I see cannot be attributed to Heat; they can only be attributed to the scripts/recipes that do the software configuration. However, if we envision Heat providing some support for software configuration, I can easily imagine cases where it is impossible for Heat to analyze/reason with wait-conditions, hence leading to deadlocks. Wait-conditions and signals are equal to timed semaphores in their power and expressivity, and these are known for their problems with deadlocks.

> To me what is missing to better support complex software
> configuration is:
> - better integrating with existing configuration tools (puppet, chef,
>   salt, ansible, etc.) (resource types)

One question is whether, in this integration, the synchronization is left completely to the configuration tools or Heat is involved in it. If it is left to the configuration tools, say Chef, then the question is how the iterative, convergence-style execution of Chef interferes with the schedule order that Heat determines for a template. On the other hand, if Heat provides the mechanism for synchronization, then the question is whether wait-conditions and signals are the right abstractions for it. What are your thoughts on this?
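To make the deadlock-analysis point concrete, here is a small illustrative sketch (plain Python, not Heat code; the resource names, attribute names, and dictionary structure are all made up for this example) of how an engine could flag a potential deadlock as a cycle in the dependency graph induced by declared INPUTS/OUTPUTS:

```python
def find_dependency_cycle(consumes, produces):
    """Detect a potential deadlock in a blocking-read model.

    consumes/produces map each resource to the fully qualified attribute
    names it reads/writes. A resource R depends on S if R consumes an
    attribute that S produces; a cycle in that graph is a set of blocking
    reads that can never all be satisfied. Returns one cycle, or None.
    """
    producer_of = {attr: res
                   for res, attrs in produces.items() for attr in attrs}
    deps = {res: {producer_of[a] for a in attrs if a in producer_of}
            for res, attrs in consumes.items()}

    WHITE, GREY, BLACK = 0, 1, 2        # unvisited / on stack / finished
    colour = {res: WHITE for res in deps}

    def visit(res, path):
        colour[res] = GREY
        for nxt in deps.get(res, ()):
            if colour.get(nxt) == GREY:
                return path + [nxt]      # back-edge: cycle found
            if colour.get(nxt, BLACK) == WHITE:
                found = visit(nxt, path + [nxt])
                if found:
                    return found
        colour[res] = BLACK
        return None

    for res in deps:
        if colour[res] == WHITE:
            cycle = visit(res, [res])
            if cycle:
                return cycle
    return None
```

For instance, if component a1 blocks on an attribute only b1 writes while b1 blocks on an attribute only a1 writes, the function reports the cycle; with wait-conditions embedded in opaque scripts, no such static check is possible.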
Thanks,
LN
Re: [openstack-dev] [Heat] HOT Software orchestration proposal for workflows
Excellent discussion on various issues around orchestration and coordination -- thanks to you all, in particular to Clint, Angus, Stan, Thomas, Joshua, Zane, Steve ...

After reading the discussions, I am finding the following themes emerging (please feel free to correct/add):

1. Most of the building blocks needed for effective coordination and orchestration are already in Heat/HOT.
2. Heat would like to view software configuration as a resource (type) with related providers + plugins.
3. There is scope for communication/synchronization mechanisms that would complement wait-conditions and signals.

I would like to propose a simple abstraction that would complement the current wait-conditions and signals. My proposal is based on our experience with supporting such an abstraction in our DSL and also in an extension of Heat. In a nutshell, this abstraction is a global data space (visible across resources and stacks) from which resources can read, and to which they can write, their inputs/outputs, PLUS the semantics that reads block until the read values are available while writes are non-blocking. We used ZooKeeper to implement this global data space and the blocking-read/non-blocking-write semantics, but these could be implemented using several other mechanisms, and I believe the techniques currently used by Heat for its metadata service could be used here.

I would like to make clear that I am not proposing a replacement for wait-conditions and signals. I am hoping that wait-conditions and signals would be used by power users (concurrent/distributed programming experts) and the proposed abstraction would be used by folks (like me) who do not want to reason about concurrency and related problems. Also, the proposed global data space with blocking reads and non-blocking writes is not a new idea (google tuple-spaces, Linda), and it has been proven in other domains, such as coordination languages, to improve the level of abstraction and productivity.
The benefits of the proposed abstraction are:

G1. Support a finer granularity of dependences.
G2. Allow Heat to reason about/analyze these dependences so that it can order resource creation/management.
G3. Avoid classic synchronization problems such as deadlocks and race conditions.
G4. *Conjecture*: Capture most of the coordination use cases (including those required for software configuration/orchestration).

Here is a more detailed description. Let us say that we can use either pre-defined or custom resource types to define resources at arbitrary levels of granularity. This can be easily supported and, I guess, is already possible in the current version of Heat/HOT. Given this, the proposed abstraction has two parts: (1) an interface-style specification of a resource's inputs and outputs, and (2) a global name/data space. The interface specification would capture:

- INPUTS: all the attributes that are consumed/used/read by that resource (currently, we have Ref, GetAttrs that can give this implicitly).
- OUTPUTS: all the attributes that are produced/written by that resource (I do not know if this write-set is currently well-defined for a resource. I think some of them are implicitly defined by Heat on particular resource types.)
- Global name-space and data-space: all the values produced and consumed (INPUTS/OUTPUTS) are described using names that are fully qualified (XXX.stack_name.resource_name.property_name). The data values associated with these names are stored in a global data-space. Reads are blocking, i.e., reading a value will block the executing resource/thread until the value is available. Writes are non-blocking, i.e., any thread can write a value and the write will succeed immediately.

The ability to define resources at arbitrary levels of granularity, together with the explicit specification of INPUTS/OUTPUTS, allows us to reap the benefits G1 and G2 outlined above.
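For illustration, the blocking-read/non-blocking-write semantics described above could be sketched as follows (a toy in-process Python model only; a real implementation would sit on ZooKeeper or the Heat metadata service, and all names here are invented):

```python
import threading

class DataSpace:
    """Toy global data space: blocking reads, non-blocking writes.

    Names are fully qualified, e.g. "stack1.vmA.a1.db_port". A read with
    no timeout blocks indefinitely until some writer publishes the name.
    """

    def __init__(self):
        self._values = {}
        self._cond = threading.Condition()

    def write(self, name, value):
        # Non-blocking: publish the value and wake any blocked readers.
        with self._cond:
            self._values[name] = value
            self._cond.notify_all()

    def read(self, name, timeout=None):
        # Blocking: wait until some writer has produced the value.
        with self._cond:
            self._cond.wait_for(lambda: name in self._values, timeout)
            return self._values[name]
```

A consumer resource would call `read("stack1.vmA.a1.db_port")` and simply suspend until the producer's `write` lands; neither side needs an explicit wait-condition or signal.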
Note that the ability to reason about the inputs/outputs of each resource and the induced dependencies will also allow Heat to detect deadlocks via dependence cycles (benefit G3). This is already done today in Heat for Refs and GetAttr on base resources, but the proposal is to extend the same to arbitrary attributes of any resource. The blocking reads and non-blocking writes further structure the specification to avoid deadlocks and race conditions (benefit G3). As for G4, the conjecture, I can only offer as evidence our experience using our DSL with the proposed abstraction to deploy a few reasonably large applications :-)

I would like to know your comments and suggestions. Also, if there is interest, I can write a blueprint/proposal with more details and use cases.

Thanks,
LN

Clint Byrum cl...@fewbar.com wrote on 10/11/2013 12:40:19 PM:

> From: Clint Byrum cl...@fewbar.com
> To: openstack-dev openstack-dev@lists.openstack.org
> Date: 10/11/2013 12:43 PM
> Subject: Re: [openstack-dev] [Heat] HOT Software orchestration proposal
> for workflows
>
> Excerpts from Stan Lagun's message of 2013-10-11 07:22:37
Re: [openstack-dev] [Heat] HOT Software orchestration proposal for workflows
Clint Byrum cl...@fewbar.com wrote on 10/11/2013 12:40:19 PM:

> From: Clint Byrum cl...@fewbar.com
> To: openstack-dev openstack-dev@lists.openstack.org
> Date: 10/11/2013 12:43 PM
> Subject: Re: [openstack-dev] [Heat] HOT Software orchestration proposal
> for workflows
>
> > 3. Ability to return arbitrary (JSON-compatible) data structure from
> > config application and use attributes of that structure as an input
> > for other configs
>
> Note that I'd like to see more use cases specified for this ability.
> The random string generator that Steve Baker has put up should handle
> most cases where you just need passwords. Generated key sharing might
> best be deferred to something like Barbican which does a lot more than
> Heat to try and keep your secrets safe.

I have seen a deployment scenario that needed more than a random string generator. It was during the deployment of a system with clustered application servers, i.e., a cluster of application-server nodes plus a cluster-manager node. The deployment progresses with all the VMs (cluster manager and cluster nodes) starting concurrently. The cluster nodes then wait for the cluster manager to send them the data (XML) they need to configure themselves. The cluster manager, after reading its own config file, generates config data for each cluster node and sends it to them.

Thanks,
LN
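The cluster scenario described in that message can be mimicked in a few lines (a toy Python sketch, not Heat code; the node names are invented and queues stand in for whatever channel, such as the metadata service or ZooKeeper, would actually carry the data):

```python
import queue
import threading

def cluster_demo():
    """Nodes boot concurrently, then each blocks until the cluster
    manager pushes its generated per-node config data."""
    channels = {n: queue.Queue() for n in ("node1", "node2")}
    configured = {}

    def node(name):
        # Blocking read: the node cannot configure itself until the
        # manager has produced its config data.
        configured[name] = channels[name].get()

    def manager():
        # Non-blocking writes: generate per-node config and publish it.
        for name in channels:
            channels[name].put("<config for='%s'/>" % name)

    workers = [threading.Thread(target=node, args=(n,)) for n in channels]
    for w in workers:
        w.start()
    manager()
    for w in workers:
        w.join()
    return configured
```

The point of the sketch is that the coordination is expressed as plain reads and writes of named data, with no explicit wait-condition resource per node.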
Re: [openstack-dev] [Heat] HOT Software orchestration proposal for workflows
Steven Hardy sha...@redhat.com wrote on 10/09/2013 05:24:38 AM:

> So as has already been mentioned, Heat defines an internal workflow,
> based on the declarative model defined in the template. The model
> should define dependencies, and Heat should convert those dependencies
> into a workflow internally. IMO if the user also needs to describe a
> workflow explicitly in the template, then we've probably failed to
> provide the right template interfaces for describing dependencies.

I agree with Steven here: models should define the dependencies, and Heat should realize/enforce them. An important design issue is the granularity at which dependencies are defined and enforced. I am aware of the wait-condition and signal constructs in Heat, but I find them a bit low-level, as they are prone to the classic deadlock and race-condition problems. I would like to have higher-level constructs that support the finer-granularity dependences needed for software orchestration. Reading through the various discussions on this topic in this mailing list, I see that many would like to have such higher-level constructs for coordination.

In our experience with software orchestration using our own DSL, and also with some extensions to Heat, we found the granularity of VMs or resources to be too coarse for defining dependencies for software orchestration. For example, consider a two-VM app, with VMs vmA and vmB, and a set of software components (ai's and bi's) to be installed on them:

vmA = base-vmA + a1 + a2 + a3
vmB = base-vmB + b1 + b2 + b3

Let us say that software component b1 of vmB requires a config value produced by software component a1 of vmA. How do we declaratively model this dependence? Clearly, modeling a dependence between just base-vmA and base-vmB is not enough. However, defining a dependence between the whole of vmA and vmB is too coarse. It would be ideal to be able to define a dependence at the granularity of software components, i.e., vmB.b1 depends on vmA.a1.
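The scheduling benefit of component-level dependences can be sketched as follows (illustrative Python only, not Heat's scheduler; the depends_on relation and the sequential ordering of components on the same VM are assumptions taken from the example above):

```python
def parallel_waves(depends_on):
    """Group deployment steps into waves that can run concurrently.

    depends_on maps each step to the steps it must wait for; every step
    in a wave has all its prerequisites in earlier waves.
    """
    waves, done = [], set()
    pending = set(depends_on)
    while pending:
        ready = {s for s in pending if set(depends_on[s]) <= done}
        if not ready:
            raise ValueError("cyclic dependency among %s" % sorted(pending))
        waves.append(sorted(ready))
        done |= ready
        pending -= ready
    return waves

# The two-VM example: b1 needs a1's output; components on the same VM
# run in sequence after their base VM boots.
example = {
    "base-vmA": [], "base-vmB": [],
    "vmA.a1": ["base-vmA"], "vmA.a2": ["vmA.a1"], "vmA.a3": ["vmA.a2"],
    "vmB.b1": ["base-vmB", "vmA.a1"],
    "vmB.b2": ["vmB.b1"], "vmB.b3": ["vmB.b2"],
}
```

Running this on the example puts both base VMs in the first wave and lets vmB.b1 proceed as soon as vmA.a1 is done, rather than after all of vmA.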
Of course, it would also be good to capture what value is passed between vmB.b1 and vmA.a1, so that the communication can be facilitated by the orchestration engine. We found that such finer-granularity modeling of dependencies provides two valuable benefits:

1. Faster total (resources + software setup) deployment time. For the example described above, a coarse-granularity dependence enforcer would start the deployment of base-vmB only after vmA + a1 + a2 + a3 had completed, but a fine-granularity dependence enforcer would start base-vmA and base-vmB concurrently, then suspend the execution of vmB.b1 until vmA.a1 is complete, and then let the rest of the deployment proceed concurrently, resulting in faster completion.

2. More flexible dependencies. For example, mutual dependencies between resources can be satisfied when orchestrated at a finer granularity. Using the example described above, fine granularity would allow both "vmB.b1 depends_on vmA.a1" and "vmA.a3 depends_on vmB.b2", but a coarse-granularity model would flag this as a cyclic dependence.

There are two aspects that need support:

1. Heat/HOT template-level constructs to support the declarative expression of such fine-granularity dependencies and of the values communicated/passed for the dependence.
2. Support from the Heat engine/analyzer for the runtime ordering and coordination between resources, and also for the communication of the values.

What are your thoughts?
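The cyclic-dependence point in item 2 can be illustrated by projecting component-level edges down to the VM level (a toy Python sketch; the naming convention "vmA.a1" belongs to VM "vmA" is an assumption taken from the example above):

```python
def collapse_to_vms(component_deps):
    """Project component-level dependency edges down to VM-level edges.

    component_deps maps a component like "vmA.a1" to the components it
    depends on; the VM is the part of the name before the first dot.
    """
    def vm_of(name):
        return name.split(".")[0]

    vm_edges = set()
    for comp, deps in component_deps.items():
        for dep in deps:
            a, b = vm_of(comp), vm_of(dep)
            if a != b:                  # intra-VM edges are irrelevant
                vm_edges.add((a, b))
    return vm_edges

# Mutual dependencies from the example: acyclic at component level.
fine = {"vmB.b1": ["vmA.a1"], "vmA.a3": ["vmB.b2"]}
```

The component-level graph here is acyclic (b1 waits on a1, a3 waits on b2), yet the collapsed VM-level graph contains both (vmA, vmB) and (vmB, vmA), i.e., exactly the spurious cycle a coarse-granularity model would reject.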
Re: [openstack-dev] [Heat] HOT Software orchestration proposal for workflows
Georgy Okrokvertskhov gokrokvertsk...@mirantis.com wrote on 10/09/2013 03:37:01 PM:

> From: Georgy Okrokvertskhov gokrokvertsk...@mirantis.com
> To: OpenStack Development Mailing List openstack-dev@lists.openstack.org
> Date: 10/09/2013 03:41 PM
> Subject: Re: [openstack-dev] [Heat] HOT Software orchestration proposal
> for workflows
>
> Thank you for bringing your use case and your thoughts here. That is
> exactly what we tried to achieve in the Murano project. There are
> important aspects you highlighted. Sometimes the resource model is too
> high-level to describe a deployment process. If you start to use a more
> granular approach with defined deployment steps, you will end up with a
> workflow approach, where you have fine control of the deployment
> process but the description will be quite complex.

IMHO, workflow approaches tend to be heavyweight. So I am hoping for more lightweight data-flow constructs and mechanisms that can help with the coordination scenarios I have outlined. Data-flow constructs and mechanisms have had a lot of success in other domains, and I am wondering why we (the Heat community) can't leverage the related theory and tools!

Thanks,
LN
Re: [openstack-dev] [Heat] HOT Software orchestration proposal for workflows
Stan Lagun sla...@mirantis.com wrote on 10/09/2013 04:07:33 PM:

> It seems to me that something is missing in our discussion. If
> something depends on something else, there must be a definition of that
> something. It is clear that it is not the case that one instance
> depends on another, but that one application depends on another
> application. But there is no such thing as an application (service,
> whatever) in HOT templates; only low-level resources. And resources
> cannot even be grouped into some application scope, because a typical
> HOT template has resources that are shared between several applications
> (network, security groups, etc.). It is also possible to have several
> applications sharing a single VM instance. That brings us to the
> conclusion that applications and resources cannot be mixed in the same
> template at the same level of abstraction.

Good point on the levels of abstraction.

> Now suppose we did somehow establish the dependency between two
> applications. But this dependency is out of the scope of a particular
> HOT template. That's because a HOT template says what the user wishes
> to install, whereas a dependency between applications is an attribute
> of the applications themselves, not of the particular deployment. For
> example, WordPress requires a database. It always does. It does not
> require it within this particular template, but as a universal rule. In
> Murano we call it data vs. metadata separation. If there is metadata
> that says WordPress requires a DB, then not only do you not have to
> repeat it in each template, you cannot even ask the system to deploy
> WordPress without a DB.

I think the kind of dependency you have outlined above is more a matter of the software-component requirements of an application. Such semantic dependencies are important, but they are probably outside the scope of Heat.
The kind of dependencies I referred to are of the nature of data flow between software components: for example, a Tomcat application server needs (and hence depends on) the DB's username/password to set up its configuration. How do we model such a data-flow dependence, and how do we facilitate the communication of such values from the DB to the Tomcat component? IMHO, such questions are related to Heat.

> So the question is: maybe we need to think about applications/services
> and their metadata before going into workflow orchestration? Otherwise
> the whole orchestration would be reinvented time and time again with
> each new HOT template. What are your thoughts on this?

I find your separation of metadata vs. data useful. In my opinion, the kind of metadata you are trying to capture would be best modeled by a DSL that sits on top of HOT/Heat.

Thanks,
LN

On Wed, Oct 9, 2013 at 11:37 PM, Georgy Okrokvertskhov gokrokvertsk...@mirantis.com wrote:

> Hi Lakshminaraya,
>
> Thank you for bringing your use case and your thoughts here. That is
> exactly what we tried to achieve in the Murano project. There are
> important aspects you highlighted. Sometimes the resource model is too
> high-level to describe a deployment process. If you start to use a more
> granular approach with defined deployment steps, you will end up with a
> workflow approach, where you have fine control of the deployment
> process but the description will be quite complex.
>
> I think the HOT approach is to provide a simple way to describe your
> deployment, which consists of solid bricks (resources). If you are
> using standard resources, you can easily create a simple HOT template
> for your deployment. If you need some custom resource, you basically
> have two options: create a new resource class and hide all the
> complexity inside the code, or use some workflow language to describe
> all the steps required. The first approach is currently supported by
> Heat.
> We have experience creating new custom resources for orchestrating
> deployment to a specific IT infrastructure with specific hardware and
> software. Right now we are trying to figure out the possibility of
> adding workflows to HOT. It looks like adding a workflow language
> directly might harm HOT's simplicity by overloading the DSL syntax and
> structures. I actually see the value in Steve's idea of having a
> specific resource, or resource set, to call workflow execution on an
> external engine. In this case the HOT template will still be pretty
> simple, as all the workflow details will be hidden, but it will remain
> manageable without writing code.
>
> Thanks,
> Gosha
>
> On Wed, Oct 9, 2013 at 11:31 AM, Lakshminaraya Renganarayana
> lren...@us.ibm.com wrote:
>
> > Steven Hardy sha...@redhat.com wrote on 10/09/2013 05:24:38 AM:
> >
> > > So as has already been mentioned, Heat defines an internal
> > > workflow, based on the declarative model defined in the template.
> > > The model should define dependencies, and Heat should convert those
> > > dependencies into a workflow internally. IMO if the user also needs
> > > to describe a workflow explicitly in the template, then we've
> > > probably failed to provide the right template interfaces for
> > > describing dependencies.
> >
> > I agree with Steven here, models should define