Hey Steve, it has taken me about a month to get enough time to go
through this. Thanks for doing it, comments in-line.

Excerpts from Steve Baker's message of 2013-12-13 15:46:48 -0800:
> I've been working on a POC in heat for resources which perform software
> configuration, with the aim of implementing this spec
> https://wiki.openstack.org/wiki/Heat/Blueprints/hot-software-config-spec
> 
> The code to date is here:
> https://review.openstack.org/#/q/topic:bp/hot-software-config,n,z
> 
> What would be helpful now is reviews which give the architectural
> approach enough of a blessing to justify fleshing this POC out into a
> ready to merge changeset.
> 
> Currently it is possible to:
> - create templates containing OS::Heat::SoftwareConfig and
> OS::Heat::SoftwareDeployment resources
> - deploy configs to OS::Nova::Server, where the deployment resource
> remains in an IN_PROGRESS state until it is signalled with the output values
> - write configs which execute shell scripts and report back with output
> values that other resources can access.
> 
> What follows is an overview of the architecture and implementation to
> help with your reviews.
> 
> REST API
> ========
> Like many heat resources, OS::Heat::SoftwareConfig and
> OS::Heat::SoftwareDeployment are backed by "real" resources that are
> invoked via a REST API. However in this case, the API that is called is
> heat itself.
> 
> The REST API for these resources really just acts as structured storage
> for configs and deployments, and the entities are managed via the REST
> paths /{tenant_id}/software_configs and /{tenant_id}/software_deployments:
> https://review.openstack.org/#/c/58878/
> RPC layer of REST API:
> https://review.openstack.org/#/c/58877/
> DB layer of REST API:
> https://review.openstack.org/#/c/58876
> heatclient lib access to REST API:
> https://review.openstack.org/#/c/58885
> 
> This data could be stored in a less structured datastore like swift, but
> this API has a couple of important implementation details which I think
> justify its existence:
> - SoftwareConfig resources are immutable once created. There is no
> update API to modify an existing config. This gives confidence that a
> config can have a long lifecycle without changing, and certainty about
> exactly what is deployed on a server with a given config.
> - Fetching all the deployments and configs for a given server is an
> operation done repeatedly throughout the lifecycle of the stack, so it
> is optimized to be done in a single operation. This is done by calling
> the deployments index API,
> /{tenant_id}/software_deployments?server_id=<server_id>. The resulting
> list of deployments includes their associated config data[1].
>
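If I'm reading the reviews right, that means one GET gives the full
picture for a server. I'd assume the response looks something like this
(the field names here are my guesses, I haven't cross-checked them
against the DB layer):

    software_deployments:
    - id: <deployment uuid>
      server_id: <server uuid>
      action: DEPLOY
      status: WAITING
      input_values:
        foo: fooooo
      config:             # the associated config data comes back inline
        id: <config uuid>
        group: Heat::Shell
        config:
          script: ...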

I'm curious if we can use the existing Metadata for this. That would
be attractive to me, as that would keep the software deployments data
available to CFN-API based tools. For instance ohai can read the CFN
API already, but not the Heat native in-instance API (since such a thing
basically does not exist).

I'm not suggesting they should both work at the same time. But I can't
see a time when they'd need to, so it seems logical enough to me to just
have the deployments bit populate the resource metadata, albeit with a
bit more structure. I understand that means updating resource metadata
after booting the server, which may be the very reason it's simpler to
keep them separate.

> OS::Heat::SoftwareConfig resource
> =================================
> OS::Heat::SoftwareConfig can be used directly in a template, but it may
> end up being used more frequently in a resource provider template which
> provides a resource aimed at a particular configuration management tool.
> http://docs-draft.openstack.org/79/58879/7/check/gate-heat-docs/911a250/doc/build/html/template_guide/openstack.html#OS::Heat::SoftwareConfig
> The contents of the config property will depend on the CM tool being
> used, but at least one value in the config map will be the actual script
> that the CM tool invokes.  An inputs and outputs schema is also defined
> here. The group property is used when the deployments data is actually
> delivered to the server (more on that later).
> 

Can you elaborate on "the actual script that the CM tool invokes" ?

I have an extremely strong desire to never embed script code in an
orchestration template, as it quickly becomes a quagmire of formatting
problems and poorly defined interfaces, IMO. Also, my whole reason for
doing golden-image-based deployment is to have all code live in the
image and stay immutable through the deployed servers' lifetime.

So I want to make sure I don't _have_ to do the script for some reason.
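
For my own clarity, here is roughly how I picture a config resource
based on the docs draft. The property names and the Heat::Shell group
are my guesses, and the inline script is exactly the kind of thing I'd
rather replace with a reference to code already baked into the image:

    config:
      type: OS::Heat::SoftwareConfig
      properties:
        group: Heat::Shell
        inputs:
        - name: foo
        - name: bar
        outputs:
        - name: result
        config:
          # per your description, at least one value in this map is the
          # actual script the CM tool invokes
          script: |
            #!/bin/sh
            echo "foo is $foo, bar is $bar"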

> Since a config is immutable, any changes to an OS::Heat::SoftwareConfig
> on stack update result in replacement.
> 
> OS::Heat::SoftwareDeployment resource
> =====================================
> OS::Heat::SoftwareDeployment joins an OS::Heat::SoftwareConfig resource
> with an OS::Nova::Server resource. It allows server-specific input values
> to be specified that map to the OS::Heat::SoftwareConfig inputs schema.
> Output values that are signaled to the deployment resource are exposed
> as resource attributes, using the names specified in the outputs schema.
> The OS::Heat::SoftwareDeployment resource remains in an IN_PROGRESS
> state until it receives a signal (containing any outputs) from the server.
> http://docs-draft.openstack.org/79/58879/7/check/gate-heat-docs/911a250/doc/build/html/template_guide/openstack.html#OS::Heat::SoftwareDeployment
> 
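If I follow, the pairing then looks something like this (again, the
property and attribute names are my guesses from the docs draft):

    resources:
      deployment:
        type: OS::Heat::SoftwareDeployment
        properties:
          config: {get_resource: config}
          server: {get_resource: server}
          input_values:
            foo: fooooo
            bar: baaaaa

    outputs:
      result:
        value: {get_attr: [deployment, result]}
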
> A deployment has its own actions and statuses that are specific to what
> a deployment does, and OS::Heat::SoftwareDeployment maps these to heat
> resource statuses and actions:
> actions:
> DEPLOY -> CREATE
> REDEPLOY -> UPDATE
> UNDEPLOY -> DELETE
> 
> status (these could use some bikeshedding):
> WAITING -> IN_PROGRESS
> RECEIVED -> COMPLETE
> FAILED -> FAILED
> 
> In the config outputs schema there is a special flag, error_output.
> If the signal response contains a value for any output flagged as
> error_output then the deployment resource is put into the FAILED state.
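
So presumably the in-instance tooling signals back something like this
(the shape and names here are invented by me for illustration):

    result: configure succeeded
    # had the outputs schema flagged, say, deploy_error with
    # error_output, signalling any value for it would put the
    # deployment into FAILED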
> 
> The SoftwareDeployment class subclasses SignalResponder which means that
> a SoftwareDeployment creates an associated user and ec2 keypair. Since
> the SoftwareDeployment needs to use the resource_id for the deployment
> resource uuid, the user_id needs to be stored in resource data instead.
> This non-wip change enables that:
> https://review.openstack.org/#/c/61902/
> 
> During create, the deployment REST API is polled until status goes from
> WAITING to RECEIVED. When handle_signal is called, the deployment is
> updated via the REST API to set the status to RECEIVED (or FAILED),
> along with any output values that were received.
> 
> One alarming consequence of having a deployments API is that any tenant
> user can create a deployment for any heat-created nova server and that
> software will be deployed to that server, which is, um, powerful.
> 

Seems like there is a whole policy can of worms there that all OpenStack
services probably need to address. AFAIK, any user can just delete any
other same-tenant user's servers, right? I'm sure somebody with Keystone
knowledge (I'm looking at you, Mr. Hardy) would be
able to tell us if such things exist now and whether or not we could
make a stronger bond between user and stack.

> There will need to be a deployment policy (probably an OS::Nova::Server
> property) which limits the scope of what deployments are allowed on that
> server. This could default to deployments in the same stack, but could
> still allow deployments from anywhere.
> 
> OS::Nova::Server support
> ========================
> https://review.openstack.org/#/c/58880
> A new user_data_format=SOFTWARE_CONFIG is currently used to denote that
> this server is configured via software config deployments. Like
> user_data_format=HEAT_CFNTOOLS, nova_utils.build_userdata is used to
> build the cloud-init parts required to support software config. However,
> like user_data_format=RAW, anything specified in user_data will be parsed
> as cloud-init data. If user_data is multi-part data then the parts will
> be appended to the parts created in nova_utils.build_userdata.
> 
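For the record, here is how I read the server side of it. Only the
user_data_format value comes from your change; the rest is sketched by
me:

    server:
      type: OS::Nova::Server
      properties:
        image: my-golden-image    # placeholder
        flavor: m1.small
        user_data_format: SOFTWARE_CONFIG
        # optional extra cloud-init data, appended to the generated parts
        user_data: {get_resource: boot_config}
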
> The agent used currently is os-collect-config. This is typically
> configured to poll for metadata from a particular heat resource via the
> CFN API using the configured ec2 keypair. In the current implementation
> the resource which is polled is the OS::Nova::Server itself, since this
> is the only resource known to exist at server boot time (deployment
> resources depend on server resources, so have not been created yet). The
> ec2 keypair comes from a user created implicitly with the server
> (similar to SignalResponder resources). This means the template author
> doesn't need to include User/AccessKey/AccessPolicy resources in their
> templates just to enable os-collect-config metadata polling.
> 

\o/

> Until now, polling the metadata for a resource just returns the metadata
> which has been stored in the stack resource database. This
> implementation changes metadata polling to actually query the
> deployments API to return the latest deployments data. This means
> deployment state can be stored in one place, and there is no need to
> keep various metadata stores updated with any changed state.
> 

I do like that very much. :)

> An actual template
> ==================
> http://paste.openstack.org/show/54988/
> This template contains:
> - a config resource
> - 2 deployments which deploy that config with 2 different sets of inputs
> - stack outputs which output the results of the deployments
> - a server resource
> - an os-refresh-config script delivered via cloud-config[2] which
> executes config scripts with deployment inputs and signals outputs to
> the provided webhook.

You have demonstrated exactly what I need here. One config definition, two
deployments of it on the same server with different parameters.  Huzzah!

I'm a little confused about how the config.script gets access to $bar
and $foo though?

> 
> /opt/stack/os-config-refresh/configure.d/55-heat-config-bash is a hook
> specific to performing configuration via shell scripts, and it only acts
> on software config which has group=Heat::Shell. Each configuration
> management tool will have its own hook, and will act on its own group
> namespace. Each configuration management tool will also have its own way
> of passing inputs and outputs. The hook's job is to invoke the CM tool
> with the given inputs and script, then extract the outputs and signal heat.
> 
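Tying this back to my $foo/$bar question above: I assume the items the
hook sees look something like the below (shape guessed by me), and that
the bash hook turns the inputs into environment variables before running
the script, which would explain how the script sees them:

    # hypothetical deployment item as delivered to the server
    - name: deployment_1
      group: Heat::Shell    # 55-heat-config-bash only acts on this group
      inputs:
        foo: fooooo
        bar: baaaaa
      config:
        script: ...
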
> The server needs to have the CM tool and the hook already installed,
> either by building a golden image or by using cloud-config during boot.
> 
> Next steps
> ==========
> There is a lot left to do and I'd like to spread the development load.
> What happens next entirely depends on feedback to this POC, but here is
> my ideal scenario:
> - any feedback which causes churn on many of the current changes I will
> address
> - a volunteer is found to take the REST API/RPC/DB/heatclient changes
> and make them ready to merge

Using this is a mid-term goal for me, but the Tuskar devs would probably
love to drop my old terrible merge.py and start working on translating
the TripleO templates to using this. So you will find a mid-term
volunteer in me, but you might find a sooner volunteer in them. :)

> - we continue to discuss and refine the resources, the changes to
> OS::Nova::Server, and the example shell hook
> - volunteers write hooks for different CM tools, Chef and Puppet hooks
> will need to be attempted soon to validate this approach.
> 
> Vaguely related changes include:
> - Some solution for specifying cloud-init config, either the intrinsic
> functions or cloud-init heat resources
> - Some heatclient file inclusion mechanism - writing that python hook in
> a heat yaml template was a bit painful ;)

This has been an open question for a long time. Absent golden images, I
think you just need a simple way to express where to fetch the hooks from.
That could be hidden behind the SoftwareConfig resource so that the URL
has a default where the cloud operator has put tools. Users could then
just override that URL when they want to write their own hooks.
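
i.e. something like this, just to make the idea concrete (the property
is entirely invented by me and not in the POC):

    config:
      type: OS::Heat::SoftwareConfig
      properties:
        group: Heat::Shell
        # invented: where the in-instance tooling fetches the hook from,
        # defaulting to a location populated by the cloud operator
        hooks_url: http://example.com/my-hooks/heat-config-bash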

> 
> Trying for yourself
> ===================
> - Using diskimage-builder, create an Ubuntu image with the
> os-apply-config, os-refresh-config and os-collect-config elements from
> tripleo-image-elements
> - Create a local heat branch containing
> https://review.openstack.org/#/q/topic:bp/cloud-init-resource,n,z and
> https://review.openstack.org/#/q/topic:bp/hot-software-config,n,z
> - launch the above template with your created image
> 

At the time you sent this I think I got this working.

Very excited to help move this forward as much as I can.
