Re: [Openstack-operators] [openstack-dev] [goals][upgrade-checkers] Week R-26 Update
On 10/15/18 3:27 AM, Jean-Philippe Evrard wrote:

On Fri, 2018-10-12 at 17:05 -0500, Matt Riedemann wrote:

The big update this week is that version 0.1.0 of oslo.upgradecheck was released. The documentation, along with usage examples, can be found here [1]. A big thanks to Ben Nemec for getting that done, since a few projects were waiting for it.

In other updates, some changes were proposed in other projects [2].

And finally, Lance Bragstad and I had a discussion this week [3] about the validity of upgrade checks looking for deleted configuration options. The main scenario I'm thinking about here is FFU, where someone is going from Mitaka to Pike. Let's say a config option was deprecated in Newton and then removed in Ocata. As the operator rolls through from Mitaka to Pike, they might have missed the deprecation signal in Newton and the removal in Ocata. Does that mean we should have upgrade checks that look at the configuration for deleted options, or for options whose deprecated alias has been removed? My thought is that if things will not work once they get to the target release and restart the service code, which would definitely impact the upgrade, then checking for those scenarios is probably OK. If, on the other hand, the removed options were just tied to functionality that was removed and are otherwise not causing any harm, then I don't think we need a check for that. It was noted that oslo.config has a new validation tool [4], so that would take care of some of this same work if run during upgrades. So whether or not an upgrade check should look for config option removal ultimately depends on the severity of what happens if the manual intervention to handle that removed option is not performed. That's pretty broad, but these upgrade checks aren't really set in stone in terms of what is applied to them. I'd like to get input from others on this, especially operators, and whether they would find these types of checks useful.
[1] https://docs.openstack.org/oslo.upgradecheck/latest/
[2] https://storyboard.openstack.org/#!/story/2003657
[3] http://eavesdrop.openstack.org/irclogs/%23openstack-dev/%23openstack-dev.2018-10-10.log.html#t2018-10-10T15:17:17
[4] http://lists.openstack.org/pipermail/openstack-dev/2018-October/135688.html

Hey,

Nice topic, thanks Matt!

TL;DR: I would rather fail explicitly on all removals and warn on all deprecations. My concern is that, by being more surgical, we'd have to decide what's "not causing any harm" (and I think deployers/users are best placed to determine what's not causing them any harm). Also, it's probably more work to classify based on "severity". The quick win here (for upgrade checks) is not about being smart, but about being an exhaustive, standardized-across-projects, and _always used_ source of truth for upgrades, complemented by release notes.

Long answer: At some point in the past, I was working full time on upgrades using OpenStack-Ansible. Our process was the following:

1) Read all the projects' release notes to find upgrade documentation.
2) With said release notes, adapt our deploy tools to handle the upgrade, and/or write extra documentation and release notes for our deployers.
3) Try the upgrade manually, fail because some release note was missing x or y. Find the root cause and retry from step 2 until success.

Here is where I see upgrade checkers improving things:

1) No need for deployment projects to parse all release notes for configuration changes: the upgrade check tooling would directly output the things that need to change for scenario x or y included in the deployment project. No need to iterate either.
2) Test real deployer use cases. The deployers using OpenStack-Ansible have ultimate flexibility without our code changes, which means they may exercise different code paths than our gating does.
Including these checks in all upgrades, always requiring them to pass, and making them explicit about the changes is tremendously helpful for deployers:

- If config deprecations are handled as warnings as part of the same process, we can output said warnings to generate a list of action items for the deployers. We would use only one tool as the source of truth for the action items (and still continue the upgrade).
- If config removals are handled as errors, the upgrade will fail, which is IMO normal, as the deployer would not have acted on their action items.

Note that deprecated config opts should already be generating warnings in the logs. It is also possible now to use fatal-deprecations with config opts: https://github.com/openstack/oslo.config/commit/5f8b0e0185dafeb68cf04590948b9c9f7d727051 I'm not sure that's exactly what you're talking about, but those might be useful to get us at least part of the way there.

In OSA, we could probably implement a deployer override (a variable). It would allow the deployers an explicit bypass of an upgrade failure: "I know I am doing this!". It would be useful for doing multiple serial upgrades. In that case, deployers could then share their "recipes" for handling upgrade-failure bypasses for certain multi-upgrade (jump) scenarios. After a while, we could think of feeding those back into the upgrade checkers.

3) I like the approach of having oslo-config-validator.
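The warn-on-deprecation / fail-on-removal policy above, plus the OSA-style deployer override, could be wired together roughly as follows. This is a sketch with made-up check functions and status strings, not oslo.upgradecheck's real interface:

```python
# Minimal sketch of the policy discussed above: deprecations are warnings
# (reported, upgrade continues), removals are failures (upgrade stops),
# and a deployer-set override can force the upgrade on anyway.
# Each check is a callable returning (status, message); statuses here
# ("ok" / "warning" / "failure") are illustrative, not a real API.

def run_checks(checks, force=False):
    """Run upgrade checks; return a process exit code (0 = proceed)."""
    failed = False
    for check in checks:
        status, message = check()
        if status == "warning":
            print(f"WARNING: {message}")   # action item, but not fatal
        elif status == "failure":
            print(f"FAILURE: {message}")   # e.g. an unhandled config removal
            failed = True
    if failed and force:
        # The "I know I am doing this!" escape hatch for serial upgrades.
        print("Deployer override set; continuing despite failures.")
        return 0
    return 1 if failed else 0
```

A deployment tool would call this once per upgrade step and abort on a non-zero return, with the `force` flag exposed as the deployer-facing override variable.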
[Openstack-operators] Forum Schedule - Seeking Community Review
Hi -

The Forum schedule is now up (https://www.openstack.org/summit/berlin-2018/summit-schedule/#track=262). If you see a glaring content conflict within the Forum itself, please let me know. You can also view the full schedule in the attached PDF if that makes life easier... NOTE: BoFs and WGs are still not all up on the schedule. No need to let us know :)

Cheers,
Jimmy

___
OpenStack-operators mailing list
OpenStack-operators@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
Re: [Openstack-operators] [openstack-dev] [SIGS] Ops Tools SIG
On 12/10/2018 14:21, Sean McGinnis wrote:

On Fri, Oct 12, 2018 at 11:25:20AM +0200, Martin Magr wrote:

Greetings guys,

On Thu, Oct 11, 2018 at 4:19 PM, Miguel Angel Ajo Pelayo <majop...@redhat.com> wrote:

Adding the mailing lists back to your reply, thank you :) I guess that +melvin.hills...@huawei.com can help us a little bit with organizing the SIG, but I guess the first thing would be collecting a list of tools which could be published under the umbrella of the SIG, starting with the ones already in Osops. Publishing documentation for those tools, and the catalog, under docs.openstack.org is possibly the next step (or a parallel step).

On Wed, Oct 10, 2018 at 4:43 PM Rob McAllister wrote:

Hi Miguel, I would love to join this. What do I need to do?

Sent from my iPhone

On Oct 9, 2018, at 03:17, Miguel Angel Ajo Pelayo wrote:

Hello,

Yesterday, during the Oslo meeting, we discussed [6] the possibility of creating a new Special Interest Group [1][2] to provide a home and release means for operator-related tools [3][4][5]. All of those tools have Python dependencies related to OpenStack, such as python-openstackclient or python-pbr.

Which is exactly the reason why we moved osops-tools-monitoring-oschecks packaging away from the OpsTools SIG to the Cloud SIG. AFAIR we had some issues with having the OpsTools SIG depend on the OpenStack SIG. I believe the Cloud SIG is the proper home for tools like [3][4][5], as they are related to OpenStack anyway. The OpsTools SIG contains general tools like fluentd, sensu, and collectd.

Hope this helps,
Martin

Hey Martin,

I'm not sure I understand the issue with these tools having dependencies on other packages, and the relationship to SIG ownership. Is your concern (or the history of a concern you are pointing out) that the tools would have a more difficult time getting updates to dependencies if they are owned by a different group?

Thanks!
Sean

Hello, the mentioned SIGs (opstools/cloud) are in CentOS scope and concern repository dependencies.
That shouldn't bother us here now. There is already a SIG under the CentOS project providing tools for operators [7], but also documentation and integration bits. Also, there is some overlap with other groups and SIGs, such as Barometer [8]. Since there is already some duplication, I wonder whether it makes sense to have a single group for this purpose. If that hasn't been clear yet, I'd be absolutely interested in joining/helping this effort.

Matthias

[7] https://wiki.centos.org/SpecialInterestGroup/OpsTools
[8] https://wiki.opnfv.org/collector/pages.action?key=fastpath

--
Matthias Runge
Red Hat GmbH, http://www.de.redhat.com/, Registered seat: Grasbrunn, Commercial register: Amtsgericht Muenchen, HRB 153243, Managing Directors: Charles Cachera, Michael Cunningham, Michael O'Neill, Eric Shander