On Mon, Jun 18, 2018 at 1:51 PM, Dmitry Tantsur <dtant...@redhat.com> wrote:
> On 06/13/2018 03:17 PM, James Slagle wrote:
>> On Wed, Jun 13, 2018 at 6:49 AM, Dmitry Tantsur <dtant...@redhat.com> wrote:
>>> Slightly hijacking the thread to provide a status update on one of the
>>> items :)
>>
>> Thanks for jumping in.
>>
>>> The immediate plan right now is to wait for metalsmith 0.4.0 to hit the
>>> repositories, then start experimenting. I need to find a way to:
>>> 1. make creating Nova instances a no-op
>>> 2. collect the required information from the created stack (I need
>>>    networks, ports, hostnames, initial SSH keys, capabilities, images)
>>> 3. update the config-download code to optionally include the role [2]
>>> I'm not entirely sure where to start, so any hints are welcome.
>>
>> Here are a couple of possibilities.
>>
>> We could reuse the OS::TripleO::{{role.name}}Server mappings that we
>> already have in place for pre-provisioned nodes (deployed-server).
>> This could be mapped to a template that exposes some Ansible tasks as
>> outputs that drive metalsmith to do the deployment. When
>> config-download runs, it would execute these Ansible tasks to
>> provision the nodes with Ironic. This has the advantage of maintaining
>> compatibility with our existing Heat parameter interfaces. It removes
>> Nova from the deployment, so that from the undercloud perspective you'd
>> roughly have:
>>
>> Mistral -> Heat -> config-download -> Ironic (driven via ansible/metalsmith)
>
> One thing that came to my mind while planning this work is that I'd prefer
> all nodes to be processed in one step. This will help avoid some issues
> that we have now. For example, the following does not work reliably:
>
> compute-0: just any profile:compute
> compute-1: precise node=abcd
> control-0: any node
>
> This has two issues that will pop up randomly:
> 1. compute-0 can pick node abcd, designated for compute-1
> 2. control-0 can pick a compute node, failing either compute-0 or compute-1
>
> This problem is hard to fix if all deployment requests are processed
> separately, but it is quite trivial if the decision is made based on the
> whole deployment plan. I'm going to work on a bulk scheduler like that in
> metalsmith.
>
>> A further (or completely different) iteration might look like:
>>
>> Step 1: Mistral -> Ironic (driven via ansible/metalsmith)
>> Step 2: Heat -> config-download
>
> Will Step 1 still use the provided environment to figure out the count of
> nodes for each role, their images, capabilities and (optionally) precise
> node scheduling?
> I'm a bit worried about the last bit: IIRC we rely on Heat's %index%
> variable currently. We can, of course, ask people to replace it with
> something more explicit on upgrade.
>
>> Step 2 would use the pre-provisioned node (deployed-server) feature
>> already existing in TripleO and treat the nodes just provisioned by
>> Ironic as pre-provisioned from the Heat stack perspective. Step 1 and
>> Step 2 would also probably be driven by a higher-level Mistral
>> workflow. This has the advantage of minimal impact on
>> tripleo-heat-templates, and also removes Heat from the baremetal
>> provisioning step. However, we'd likely need some Python compatibility
>> libraries that could translate Heat parameter values such as
>> HostnameMap to Ansible vars for some basic backwards compatibility.
>
> Overall, I like this option better. It will allow an operator to isolate
> the bare metal provisioning step from everything else.
>
>>> [1] https://github.com/openstack/metalsmith
>>> [2] https://metalsmith.readthedocs.io/en/latest/user/ansible.html
>>>
>>>> Obviously we have things to consider here, such as backwards
>>>> compatibility and upgrades, but overall I think this would be a great
>>>> simplification to our overall deployment workflow.
>>>
>>> Yeah, this is tricky.
>>> Can we make Heat "forget" about Nova instances? Maybe by re-defining
>>> them to OS::Heat::None?
>>
>> Not exactly, as Heat would delete the previous versions of the
>> resources. We'd need some special migrations, or we could support the
>> existing method forever for upgrades and only deprecate it for new
>> deployments.
>
> Do I get it right that if we redefine OS::TripleO::{{role.name}}Server to
> be OS::Heat::None, Heat will delete the old {{role.name}}Server instances
> on the next update? This is sad.
>
> I'd prefer not to keep Nova support forever; it is going to be hard to
> maintain and cover in CI. Should we extend Heat to support "forgetting"
> resources? I think it may have a use case outside of TripleO.
This is already supported, it's just not the default:

https://docs.openstack.org/heat/latest/template_guide/hot_spec.html#resources-section

You can use, e.g., deletion_policy: retain to skip the deletion of the
underlying Heat-managed resource.

Steve

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
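[Editor's illustration of the deletion_policy suggestion above — a minimal HOT sketch; the resource name and properties are hypothetical, not taken from tripleo-heat-templates:]

```yaml
# Hypothetical first update: mark the server resource with
# deletion_policy: retain, so a later removal of the resource leaves
# the underlying Nova instance in place instead of deleting it.
resources:
  Controller0Server:          # illustrative resource name
    type: OS::TripleO::ControllerServer
    deletion_policy: retain   # Heat "forgets" the instance on deletion
    properties:
      image: overcloud-full   # illustrative properties
      flavor: control

# Hypothetical second update: an environment file remaps the role's
# server resource to a no-op; the retained instance is left untouched:
#
#   resource_registry:
#     OS::TripleO::ControllerServer: OS::Heat::None
```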