On Sun, Apr 2, 2017 at 6:01 AM, Dan Prince <dpri...@redhat.com> wrote:
> On Fri, 2017-03-31 at 17:21 -0600, Alex Schultz wrote:
>> Hey folks,
>>
>> I wanted to raise awareness of the concept of idempotence[0] and how
>> it affects deployment(s). In the puppet world, we consider this very
>> important because puppet is all about ensuring a desired state
>> (i.e. a system with config files + services). That being said, I feel
>> that it is important for any deployment tool to be aware of this.
>> When the same code is applied to the system repeatedly (as would be
>> the case in a puppet master deployment), the subsequent runs should
>> result in no changes if there is no need. If you take a configured
>> system and rerun the same deployment code, you don't want your
>> services restarting when the end state is supposed to be the same.
>> In the case of TripleO, we should be able to deploy an overcloud and
>> rerun the deployment process, and that rerun should result in no
>> configuration changes and 0 services being restarted during the
>> process. The second run should essentially be a noop.
>>
>> We have recently uncovered various bugs[1][2][3][4] that have
>> introduced service disruption due to a lack of idempotency causing
>> service restarts. So when reviewing or developing new code, the
>> important thing is to think about what happens if I run this bit of
>> code twice. There are a few common items that come up around
>> idempotency. Things like execs in puppet-tripleo should be
>> refreshonly or use unless/onlyif to prevent running again if
>> unnecessary. Additionally, in the TripleO configuration it's
>> important to understand in which step a service is configured and
>> if it possibly would get deconfigured in another step. For example,
>> we configure apache and some wsgi services in step 3. But we
>> currently configure some additional wsgi openstack services in
>> step 4, which is resulting in excessive httpd restarts and possible
>> service unavailability[5] when updates are applied.
>>
>> Another important place to understand this concept is in upgrades,
>> where we currently allow for ansible tasks to be used. These should
>> result in an idempotent action when puppet is subsequently run, which
>> means that the two bits of code essentially need to result in the
>> same configuration. For example, in the nova-api upgrades for Newton
>> to Ocata we needed to run the same commands[6] that would later be
>> run by puppet to prevent clashing configurations and possible
>> idempotency problems.
>>
>> Idempotency issues can cause service disruptions, longer deployment
>> times for end users, or even possible misconfigurations. I think it
>> might be beneficial to add an idempotency periodic job that is
>> basically a double run of the deployment process to ensure no service
>> or configuration changes on the second run. Thoughts? Ideally one in
>> the gate would be awesome but I think it would take too long to be
>> feasible with all the other jobs we currently run.
>
> How would we verify that services aren't getting changed/restarted
> even? Checking process runtimes perhaps or something?
>
From a deployment standpoint, we can check the steps when you run a
deployment twice to ensure that there are no changes in the output from
the puppet steps. So at a minimum we could deploy, run an update, and
analyze the logs from the update to ensure there were no changes. In
the past I've done this[0] by capturing the last run summary from
puppet and checking to make sure nothing was changed.

> If you used the multinode jobs or perhaps the new undercloud_deploy
> installer (single node) it might be feasible to add this into the gate.
> I would avoid adding this to the OVB queue as it is already too full
> and we can probably gain the coverage we need without that type of
> testing.
>

I wouldn't necessarily start with this as a gating action, as I think a
basic periodic job might be sufficient.

Thanks,
-Alex

[0] https://review.openstack.org/#/c/279271/9/fuelweb_test/helpers/astute_log_parser.py@212

> Dan
>
>>
>> Thanks,
>> -Alex
>>
>> [0] http://binford2k.com/content/2015/10/idempotence-not-just-big-scary-word
>> [1] https://bugs.launchpad.net/tripleo/+bug/1664650
>> [2] https://bugs.launchpad.net/puppet-nova/+bug/1665443
>> [3] https://bugs.launchpad.net/tripleo/+bug/1665405
>> [4] https://bugs.launchpad.net/tripleo/+bug/1665426
>> [5] https://review.openstack.org/#/c/434016/
>> [6] https://review.openstack.org/#/c/405241/
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
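
For anyone wanting to try the last-run-summary check described in the
reply above, here is a minimal sketch in Python. It is not the code
behind the linked review. It assumes the summary (puppet writes it as
last_run_summary.yaml in its state directory) has already been parsed
into a dict, for example with a YAML library, and that the counters
live under "resources", "changes" and "events" as in the puppet
versions I have looked at. The names WATCHED, nonzero_counters and
is_noop_run are made up for this example.

```python
from typing import Dict, List, Mapping

# Counters that should all be zero after an idempotent second run.
# Assumption: these section/key names match the last run summary layout
# of the puppet version in use; check a real summary file before relying
# on them.
WATCHED: Dict[str, List[str]] = {
    "resources": ["changed", "failed", "failed_to_restart",
                  "out_of_sync", "restarted"],
    "changes": ["total"],
    "events": ["failure"],
}


def nonzero_counters(summary: Mapping[str, Mapping[str, int]]) -> Dict[str, int]:
    """Return each watched counter that is non-zero, keyed 'section.name'."""
    # A summary without a "resources" section usually means the run never
    # applied a catalog, so refuse to call that a clean run.
    if "resources" not in summary:
        raise ValueError("summary has no 'resources' section")
    found: Dict[str, int] = {}
    for section, names in WATCHED.items():
        values = summary.get(section, {})
        for name in names:
            count = values.get(name, 0)
            if count:
                found[f"{section}.{name}"] = count
    return found


def is_noop_run(summary: Mapping[str, Mapping[str, int]]) -> bool:
    """True when none of the watched counters recorded any activity."""
    return not nonzero_counters(summary)
```

A periodic job could deploy, rerun the deployment, load the summary
from each node after the second run, and fail if is_noop_run() returns
False, printing nonzero_counters() so the log shows which counters
moved.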