Re: [openstack-dev] [nova] [placement] resource providers update 18-07
On 02/24/2018 02:17 AM, Matt Riedemann wrote:
> On 2/16/2018 7:54 AM, Chris Dent wrote:
>> Before I get to the meat of this week's report, I'd like to request some feedback from readers on how to improve the report. Over its lifetime it has grown and it has now reached the point that while it tries to give the impression of being complete, it never actually is, and is a fair chunk of work to get that way. So perhaps there is a way to make it a bit more focused and thus a bit more actionable. If there are parts you can live without or parts you can't live without, please let me know.
>>
>> One idea I've had is to do some kind of automation to make it what amounts to a dashboard, but I'm not super inclined to do that because the human curation has been useful for me. If it's not useful for anyone else, however, then that's something to consider.
>
> -1 on a dashboard unless it's just something like a placement-specific review dashboard, but you'd have to star or somehow label placement-specific patches. I appreciate the human thought/comments on the various changes for context.

As do I. Thank you, Chris, for doing this week after week. It may not seem like it, but these emails are immensely useful for me.

Best,
-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] [placement] resource providers update 18-07
On 2/16/2018 7:54 AM, Chris Dent wrote:
> Before I get to the meat of this week's report, I'd like to request some feedback from readers on how to improve the report. Over its lifetime it has grown and it has now reached the point that while it tries to give the impression of being complete, it never actually is, and is a fair chunk of work to get that way. So perhaps there is a way to make it a bit more focused and thus a bit more actionable. If there are parts you can live without or parts you can't live without, please let me know.
>
> One idea I've had is to do some kind of automation to make it what amounts to a dashboard, but I'm not super inclined to do that because the human curation has been useful for me. If it's not useful for anyone else, however, then that's something to consider.

-1 on a dashboard unless it's just something like a placement-specific review dashboard, but you'd have to star or somehow label placement-specific patches. I appreciate the human thought/comments on the various changes for context.

I don't think I'd remove anything. One thing to maybe add is work on the osc-placement plugin:

https://review.openstack.org/#/q/status:open+project:openstack/osc-placement+branch:master+topic:bp/placement-osc-plugin-rocky

We did a grind through a bunch of those in Queens and made good progress on providing a minimal useful set of CLIs in osc-placement 1.0.0, so I'd like to see that continue, especially as deployments are upgrading to the point of needing to interact with placement from an ops perspective.

--
Thanks,
Matt
[openstack-dev] [nova] [placement] resource providers update 18-07
Resource provider update 18-07. This will be the last one before the PTG and there won't be one during the PTG, so the next one will be 18-10 or later.

Before I get to the meat of this week's report, I'd like to request some feedback from readers on how to improve the report. Over its lifetime it has grown and it has now reached the point that while it tries to give the impression of being complete, it never actually is, and is a fair chunk of work to get that way. So perhaps there is a way to make it a bit more focused and thus a bit more actionable. If there are parts you can live without or parts you can't live without, please let me know.

One idea I've had is to do some kind of automation to make it what amounts to a dashboard, but I'm not super inclined to do that because the human curation has been useful for me. If it's not useful for anyone else, however, then that's something to consider.

If, at the PTG, we decide to start making incremental progress on extracting placement to its own thing, I'll probably add a section to this report covering that work. I've been doing a lot of spikes to see where some of the issues are and experiment with solutions. Those need feedback to decide if the direction has promise or creates problems.

Okay, with that out of the way.

# Most Important

RC2 was cut last night. Bug triage and fixing is important.

There have been a lot of interesting specs started recently. Part of this is the result of various parties moving their deployments forward (not just to queens) and real issues with placement (and friends) being exposed. See the specs section for some links to ones that are pending.
A few have already merged but for the sake of visibility:

* Add placement-req-filter spec
  https://review.openstack.org/#/c/544585/
* Support member_of param for allocation candidates
  https://review.openstack.org/#/c/544694/

PTG planning screams along on etherpads, agenda and retrospective:

* https://etherpad.openstack.org/p/nova-ptg-rocky
* https://etherpad.openstack.org/p/nova-queens-retrospective

# Bugs

* Placement related bugs without owners: https://goo.gl/TgiPXb
* In progress placement bugs: https://goo.gl/vzGGDQ

# Specs

* Support traits in Glance
  https://review.openstack.org/#/c/541507/4
* Update ProviderTree
  https://review.openstack.org/#/c/540111/
* Support aggregate affinity filter/weighers
  https://review.openstack.org/#/c/529135/
  (Note that this is not placement aggregates and is not a placement-oriented solution but is something many of the same people are into.)
* Report CPU features to placement
  https://review.openstack.org/#/c/497733/
* Account for host agg allocation ratio in placement
  https://review.openstack.org/#/c/544683/
* mirror nova host aggregates to placement API
  https://review.openstack.org/#/c/545057/
* Network bandwidth resource provider
  https://review.openstack.org/#/c/502306/

# Main Themes

We're between themes at the moment so I'll just put everything into other today:

# Other

* Nested resource providers
  https://review.openstack.org/#/q/status:open+topic:bp/nested-resource-providers
* Update references to OSC in old rp specs
  https://review.openstack.org/#/c/539038/
* [Placement] Invalid query parameter could lead to HTTP 500
  https://review.openstack.org/#/c/539408/
* [placement] use simple FaultWrapper
  https://review.openstack.org/#/c/533752/
* WIP: Move resource provider objects
  https://review.openstack.org/#/c/540049/
* Do not normalize allocation ratios
  https://review.openstack.org/#/c/532924/
* Sending global request ids from nova to placement
  https://review.openstack.org/#/q/topic:bug/1734625
* Add functional test for two-cell scheduler behaviors
  https://review.openstack.org/#/c/452006/
  (This is old and maybe out of date, but something we might like to resurrect)
* Make API history doc consistent
  https://review.openstack.org/#/c/477478/
* WIP: General policy sample file for placement
  https://review.openstack.org/#/c/524425/
* Support relay RP for allocation candidates
  https://review.openstack.org/#/c/533437/
  Bug fix for sharing with multiple providers
* Convert driver supported capabilities to compute node provider traits
  https://review.openstack.org/#/c/538498/
* Update resources once in update available resources
  https://review.openstack.org/#/c/520024/
  (This ought, when it works, to help address some redundancy concerns with nova making too many requests to placement)
* Support aggregate affinity filters/weighers
  https://review.openstack.org/#/q/topic:bp/aggregate-affinity
  A rocky targeted improvement to affinity handling
* Improved functional test coverage for placement
  https://review.openstack.org/#/q/topic:bp/placement-test-enhancement
* Functional tests for traits api
  https://review.openstack.org/#/c/524094/
* WIP: SchedulerReportClient.set_aggregates_for_provider
[openstack-dev] [nova] [placement] resource providers update 18-06
Resource provider 18-06 is here.

# Most Important

RC1 was cut last night, so we shouldn't be merging any new features now, just bug fixes. Which, of course, means finding and fixing bugs is the thing to do. In the gaps where that's not happening, planning for Rocky is a useful thing to be doing.

The PTG is coming up at the end of this month. If you have topics for discussion that are not already on the etherpad, add them:

https://etherpad.openstack.org/p/nova-ptg-rocky

A variety of specs, and discussions related to such things, are in progress and listed below. If I've forgotten something, let me know, as usual.

I wrote a thing describing some of my efforts to break placement:

https://anticdent.org/placement-scale-fun.html

Placement itself was fine, but I was able to break other stuff. If you have an environment where you are able to do that kind of concrete experimentation, it will help to make the release better.

# What's Changed

RC1 happened. Some more "sending global request id" changes merged.

A release note was created to describe the behavior change in AggregateCoreFilter (and friends):

https://review.openstack.org/#/c/541018/

# Help Wanted

Testing, Testing, Testing.

There are a fair few unstarted bugs related to placement that could do with some attention. Here's a handy URL: https://goo.gl/TgiPXb

# Specs

* Support traits in Glance
  https://review.openstack.org/#/c/541507/4
* Add generation support in aggregate association
  https://review.openstack.org/#/c/540447/
* Update ProviderTree
  https://review.openstack.org/#/c/540111/
* Support aggregate affinity filter/weighers
  https://review.openstack.org/#/c/529135/
  (Note that this is not placement aggregates and is not a placement-oriented solution but is something many of the same people are into.)
* Granular Resource Request Syntax (Rocky)
  https://review.openstack.org/#/c/540179/
* Report CPU features to placement
  https://review.openstack.org/#/c/497733/

# Main Themes

We've not yet identified the new themes, other than to know that Nested remains a big deal. Presumably at the PTG we will define and then narrow the themes.

## Nested Resource Providers

Work continues at

https://review.openstack.org/#/q/status:open+topic:bp/nested-resource-providers

By which I mean that there's lots of active work and discussion on the patches on this topic. It's the locus of activity.

# Other

Many of these things are bug fixes or doc tuneups, and thus potentially relevant for Queens.

* Update references to OSC in old rp specs
  https://review.openstack.org/#/c/539038/
* [Placement] Invalid query parameter could lead to HTTP 500
  https://review.openstack.org/#/c/539408/
* [placement] use simple FaultWrapper
  https://review.openstack.org/#/c/533752/
* Ensure resource classes correctly
  https://review.openstack.org/#/c/539738/
* Avoid inventory DELETE API (no conflict detection)
  https://review.openstack.org/#/c/539712/
* Fix nits in allocation candidate limit handling
  https://review.openstack.org/#/c/536784/
* WIP: Move resource provider objects
  https://review.openstack.org/#/c/540049/
* Do not normalize allocation ratios
  https://review.openstack.org/#/c/532924/
* Sending global request ids from nova to placement
  https://review.openstack.org/#/q/topic:bug/1734625
* Update resources once in update available resources
  https://review.openstack.org/#/c/520024/
  (This ought, when it works, to help address some redundancy concerns with nova making too many requests to placement)
* Support aggregate affinity filters/weighers
  https://review.openstack.org/#/q/topic:bp/aggregate-affinity
  A rocky targeted improvement to affinity handling
* Move placement body samples in docs to own dir
  https://review.openstack.org/#/c/529998/
* Improved functional test coverage for placement
  https://review.openstack.org/#/q/topic:bp/placement-test-enhancement
* Functional tests for traits api
  https://review.openstack.org/#/c/524094/
* annotate loadapp() (for placement wsgi app) as public
  https://review.openstack.org/#/c/526691/
* Remove microversion fallback code from report client
  https://review.openstack.org/#/c/528794/
* WIP: SchedulerReportClient.set_aggregates_for_provider
  https://review.openstack.org/#/c/532995/
  This is for rocky as it depends on changing the api for aggregates handling on the placement side to accept and provide a generation
* Add functional test for two-cell scheduler behaviors
  https://review.openstack.org/#/c/452006/
  (This is old and maybe out of date, but something we might like to resurrect)
* Make API history doc consistent
  https://review.openstack.org/#/c/477478/
* WIP: General policy sample file for placement
  https://review.openstack.org/#/c/524425/
* Support relay RP for allocation candidates
  https://review.openstack.org/#/c/533437/
  Bug fix for sharing with multiple
[openstack-dev] [nova] [placement] resource providers update 18-05
Here's resource provider and placement update 18-05. 18-04 was skipped on account of illness.

# Most Important

Feature freeze has come and gone, RC1 is next week. This means that finding bugs and, where relevant, reporting them with a tag of 'queens-rc-potential' is top priority.

The PTG is coming up at the end of this month. If you have topics for discussion that are not already on the etherpad, add them:

https://etherpad.openstack.org/p/nova-ptg-rocky

I wrote a blog post to gather some thinking (and links) about preparing to extract placement from nova (or at least ease the path when it does eventually happen):

https://anticdent.org/placement-extraction.html

It's probably time to start writing specs for some of the things we know will be a big deal with placement in Rocky. Eric has started with a spec that covers the ProviderTree work. Much of that work is already done, but never had a spec in the first place:

https://review.openstack.org/#/c/540111/

I'm on the hook to create a spec for enabling generation handling when associating aggregates. If there are others, getting them started before the PTG can help to make the time at the PTG more effective.

# What's Changed

A limit is now passed to /allocation_candidates to ensure that we don't cause out of memory errors in big empty clouds.

Traits expressed as 'required' in flavor extra specs are passed in requests to placement and /allocation_candidates accepts the required parameter.

More, but not yet all, requests from nova to placement include the global request id.

Some, but not all, of the ProviderTree functionality has merged.

The full stack of Alternate Hosts is now merged.

The ironic driver now manages traits.

At least some support for VGPU merged. Not clear what this means for end users.

# Help Wanted

Testing, Testing, Testing.

There are a fair few unstarted bugs related to placement that could do with some attention. Here's a handy URL: https://goo.gl/TgiPXb

# Main Themes

We've not yet identified the new themes, other than to know that Nested remains a big deal.

## Nested Resource Providers

The work to get nested providers represented in the /allocation_candidates did not complete before feature freeze. It remains in progress at

https://review.openstack.org/#/q/status:open+topic:bp/nested-resource-providers

There's been a lot of discussion in IRC about the sometimes differing goals on how people want NRP to work. One example is at:

http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2018-01-29.log.html#t2018-01-29T15:01:24

There's an email thread related to that discussion:

http://lists.openstack.org/pipermail/openstack-dev/2018-January/126651.html

I think we'll be doing ourselves a favor if we can work to satisfy concrete use cases and then generalize from that.

The related provider tree work is now under its own topic:

https://review.openstack.org/#/q/topic:bp/update-provider-tree

# Other

Plenty of these are bugs or fairly trivial and/or non-feature fixes.
* doc: mark the max microversions for queens
  https://review.openstack.org/#/c/539978/
* [Placement] Invalid query parameter could lead to HTTP 500
  https://review.openstack.org/#/c/539408/
* [placement] use simple FaultWrapper
  https://review.openstack.org/#/c/533752/
* Ensure resource classes correctly
  https://review.openstack.org/#/c/539738/
* Avoid inventory DELETE API (no conflict detection)
  https://review.openstack.org/#/c/539712/
* Do not normalize allocation ratios
  https://review.openstack.org/#/c/532924/
* Sending global request ids from nova to placement
  https://review.openstack.org/#/q/topic:bug/1734625
* VGPU support
  https://review.openstack.org/#/q/topic:bp/add-support-for-vgpu
* Update resources once in update available resources
  https://review.openstack.org/#/c/520024/
  (This ought, when it works, to help address some performance concerns with nova making too many requests to placement)
* spec: treat devices as generic resources
  https://review.openstack.org/#/c/497978/
  This is a WIP and will need to move to Rocky
* Support aggregate affinity filters/weighers
  https://review.openstack.org/#/q/topic:bp/aggregate-affinity
  A rocky targeted improvement to affinity handling
* Move placement body samples in docs to own dir
  https://review.openstack.org/#/c/529998/
* Improved functional test coverage for placement
  https://review.openstack.org/#/q/topic:bp/placement-test-enhancement
* Functional tests for traits api
  https://review.openstack.org/#/c/524094/
* annotate loadapp() (for placement wsgi app) as public
  https://review.openstack.org/#/c/526691/
* Remove microversion fallback code from report client
  https://review.openstack.org/#/c/528794/
* WIP: SchedulerReportClient.set_aggregates_for_provider
  https://review.openstack.org/#/c/532995/
  This is likely for rocky as it depends on changing the api for agg
Re: [openstack-dev] [nova] [placement] resource providers update 18-03
> Earlier in the week I did some exercising by humans and was confused
> by the state of traits handling on /allocation_candidates (it could be
> the current state is the expected state but the code didn't make that
> clear) so I made a bug on it to make sure that confusion didn't get forgotten:
>
> https://bugs.launchpad.net/nova/+bug/1743860

I can help with the confusion. The current state is indeed expected (at least by me). There were some WIPs early in the cycle to get just the ?required= part of traits in place, BUT the granular resource requests effort was a superset of that. Granular was mostly finished even at that time, but the final piece of the puzzle relies on code that's in progress right now (NRP in allocation candidates) so has been on hold.

While I hope it's still possible to tie all that off in Q, we're now getting to a point where it's prudent to hedge our bets and make sure we at least support traits on the single (un-numbered) request group.

TL;DR: Yes, let's move forward with Alex's patch:

> (Looks like Alex is working on the correct fix at
>
> https://review.openstack.org/#/c/535642/

...but also make sure we get lots of review focus on Jay's NRP-in-alloc-cands series to give Granular a fighting chance.
[openstack-dev] [nova] [placement] resource providers update 18-03
Here's resource provider and placement update 18-03. I'm travelling so this version may be a bit abridged.

# Most Important

This remains mostly the same, getting alternate hosts all the way in and finishing up nested resource provider support (as ProviderTree on the nova side and support for nested in /allocation_candidates on the placement side). Both of these will likely need some time to be rigorously run through their paces before the end of the cycle, so the sooner stuff merges the sooner we can start getting the whole suite exercised by humans.

Earlier in the week I did some exercising by humans and was confused by the state of traits handling on /allocation_candidates (it could be the current state is the expected state but the code didn't make that clear) so I made a bug on it to make sure that confusion didn't get forgotten:

https://bugs.launchpad.net/nova/+bug/1743860

I highlight this not because I think that problem is especially "most important", but because it is a type of problem that I think we'll see a fair bit of over the next small number of weeks as we close out Queens and head for Rocky.

(Looks like Alex is working on the correct fix at

https://review.openstack.org/#/c/535642/

Based on that it seems most of the confusion here is mine, but the fact that it was hard to tell what is up, or what the plan is, is something we probably need to get better at.)

The Rocky PTG prep etherpad is in flight at

https://etherpad.openstack.org/p/nova-ptg-rocky

please add things you think need to be talked about at the PTG.

There's an email thread in progress that is probably pretty important to understand, if you're working on placement related things:

http://lists.openstack.org/pipermail/openstack-dev/2018-January/126283.html

The behavior of the Aggregate*Filters has gone awry in the face of placement satisfying allocation_ratio concerns before those filters ever see proposed hosts. There are some ideas on how to improve the situation in the thread, but it appears there are still some open questions.

# What's Changed

An issue with foreign key constraints and deleting a resource provider whose root is itself has been resolved and the change merged:

https://review.openstack.org/#/c/529519/

Anybody (or thing) that was experimenting with deleting resource providers with a database with some integrity would have encountered this problem.

A proposal to create a Resource Management SIG has merged. There was some email discussion about it:

http://lists.openstack.org/pipermail/openstack-dev/2018-January/126039.html

# Help Wanted

There are a fair few unstarted bugs related to placement that could do with some attention. Here's a handy URL: https://goo.gl/TgiPXb

# Main Themes

## Nested Providers

The nested provider work is proceeding along two main courses: getting the ProviderTree on the nova side gathering and syncing all the necessary information, and enabling nested provider searching when requesting /allocation_candidates. Both of these are within the same topic:

https://review.openstack.org/#/q/topic:bp/nested-resource-providers

One of the challenges this week was working out a reasonable way to have a read-only and thread-safe duplicate of a ProviderTree so that tree A and tree B can have what amounts to a diff done on them. This is being figured out on

https://review.openstack.org/#/c/533244/

## Alternate Hosts

The last piece of the puzzle, changing the RPC interface, is pending:

https://review.openstack.org/#/q/topic:bp/return-alternate-hosts

Related to this, exploration has started on limiting the number of responses that the scheduler will get when requesting hosts (some of which will become alternates):

https://review.openstack.org/#/c/531517/

# Other

* Support traits in allocation candidates
  https://review.openstack.org/#/c/535642/
* Extract instance allocation removal code
  https://review.openstack.org/#/c/513041/
* Sending global request ids from nova to placement
  https://review.openstack.org/#/q/topic:bug/1734625
* VGPU support
  https://review.openstack.org/#/q/topic:bp/add-support-for-vgpu
* Use traits with ironic
  https://review.openstack.org/#/q/topic:bp/ironic-driver-traits
* Move api schemas to own dir
  https://review.openstack.org/#/c/528629/
  Just one of these left
* request limit /allocation_candidate WIP
  https://review.openstack.org/#/c/531517/
* Update resources once in update available resources
  https://review.openstack.org/#/c/520024/
  (This ought, when it works, to help address some performance concerns with nova making too many requests to placement)
* spec: treat devices as generic resources
  https://review.openstack.org/#/c/497978/
  This is a WIP and will need to move to Rocky
* log options at DEBUG when starting wsgi app
  https://review.openstack.org/#/c/519462/
* Support aggregate affinity filters/weighers
  https://review.openstack.org/#/q/topic:bp/aggregate-affinity
  A rocky
[openstack-dev] [nova] [placement] resource providers update 18-02
Resource provider and placement 18-02. Getting a bit more warmed up here, so should be more stuff from more places.

# Most Important

Completing alternate hosts and exposing the basic nested resource providers functionality is what matters. We've reached that stage in the cycle where at least some interesting ideas, inspired by current work, need to be pushed off to Rocky.

Speaking of Rocky, the etherpad for PTG topics is underway at

https://etherpad.openstack.org/p/nova-ptg-rocky

In typical fashion there's plenty of stuff on there related to placement already, but there's likely plenty more to talk about. If you have something, even if it is tentative, add it. The list will get more structured closer to the PTG.

As we approach the end of the cycle finding and fixing bugs ought to become the focus.

# What's Changed

Eric gave a nice summary of this week's scheduler meeting in yesterday's Nova team meeting. It's worth reading:

http://eavesdrop.openstack.org/meetings/nova/2018/nova.2018-01-11-14.01.log.html#l-74

# Help Wanted

There are a fair few unstarted bugs related to placement that could do with some attention. Here's a handy URL: https://goo.gl/TgiPXb

# Main Themes

## Nested Providers

The nested provider work is proceeding along two main courses: getting the ProviderTree on the nova side gathering and syncing all the necessary information, and enabling nested provider searching when requesting /allocation_candidates. Both of these are within the same topic:

https://review.openstack.org/#/q/topic:bp/nested-resource-providers

We've identified the need to handle conflict responses (409) in a more generic fashion in the ProviderTree. The new plan is, when a conflict is caused by mismatched generations, reset and reload the entire tree rather than attempting to resync at a granular level.
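The "reset and reload the entire tree on a generation conflict" plan can be sketched roughly as follows. This is only an illustration under stated assumptions: `FakePlacement`, `CachedTree` and `GenerationConflict` are hypothetical names invented for the example, the "service" is an in-memory dict, and none of this is the actual ProviderTree or report client code.

```python
class GenerationConflict(Exception):
    """Stands in for an HTTP 409 caused by a stale provider generation."""


class FakePlacement:
    """Hypothetical in-memory stand-in for the placement service."""

    def __init__(self):
        # provider uuid -> {"generation": int, "inventory": dict}
        self.providers = {}

    def get_tree(self):
        # Hand back copies so callers cannot mutate the "server" state.
        return {uuid: {"generation": p["generation"],
                       "inventory": dict(p["inventory"])}
                for uuid, p in self.providers.items()}

    def put_inventory(self, uuid, generation, inventory):
        current = self.providers[uuid]
        if generation != current["generation"]:
            raise GenerationConflict(uuid)
        current["inventory"] = dict(inventory)
        current["generation"] += 1


class CachedTree:
    """Hypothetical client-side cache of the provider tree."""

    def __init__(self, placement):
        self.placement = placement
        self.reloads = 0
        self.reload()

    def reload(self):
        # Throw away the whole cache instead of patching single providers.
        self.cache = self.placement.get_tree()
        self.reloads += 1

    def set_inventory(self, uuid, inventory, retries=1):
        for attempt in range(retries + 1):
            generation = self.cache[uuid]["generation"]
            try:
                self.placement.put_inventory(uuid, generation, inventory)
            except GenerationConflict:
                if attempt == retries:
                    raise
                self.reload()
                continue
            # Success: keep the cache in step with the service.
            self.cache[uuid] = {"generation": generation + 1,
                                "inventory": dict(inventory)}
            return


placement = FakePlacement()
placement.providers["cn1"] = {"generation": 0, "inventory": {}}
tree = CachedTree(placement)

# Another writer updates the provider, so the cached generation goes stale.
placement.put_inventory("cn1", 0, {"VCPU": 4})

# The first attempt conflicts, the tree is reloaded, the retry succeeds.
tree.set_inventory("cn1", {"VCPU": 8})
print(placement.providers["cn1"], tree.reloads)
```

The trade-off in this sketch is simplicity over efficiency: a full reload costs an extra read of every provider, but it avoids having to reason about which parts of the cache a conflict might have invalidated.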
## Alternate Hosts

The last piece of the puzzle, changing the RPC interface, is pending:

https://review.openstack.org/#/q/topic:bp/return-alternate-hosts

Some issues with resizes and interaction with the CachingScheduler have been addressed.

Related to this, exploration has started on limiting the number of responses that the scheduler will request when requesting hosts (some of which will become alternates):

https://review.openstack.org/#/c/531517/

## Misc Traits, Shared, Etc Cleanups

There's a stack of code that fixes up a lot of things related to traits, sharing providers, test additions and fixes to those tests. At the moment the changes are in a bug topic:

https://review.openstack.org/#/q/topic:bug/1702420

# Other

* Extract instance allocation removal code
  https://review.openstack.org/#/c/513041/
* Sending global request ids from nova to placement
  https://review.openstack.org/#/q/topic:bug/1734625
* VGPU support
  https://review.openstack.org/#/q/topic:bp/add-support-for-vgpu
* Use traits with ironic
  https://review.openstack.org/#/q/topic:bp/ironic-driver-traits
* Move api schemas to own dir
  https://review.openstack.org/#/c/528629/
* request limit /allocation_candidate WIP
  https://review.openstack.org/#/c/531517/
* Update resources once in update available resources
  https://review.openstack.org/#/c/520024/
  (This ought, when it works, to help address some performance concerns with nova making too many requests to placement)
* Fix resource provider delete
  https://review.openstack.org/#/c/529519/
* spec: treat devices as generic resources
  https://review.openstack.org/#/c/497978/
  This is a WIP and will need to move to Rocky
* log options at DEBUG when starting wsgi app
  https://review.openstack.org/#/c/519462/
* Support aggregate affinity filters/weighers
  https://review.openstack.org/#/q/topic:bp/aggregate-affinity
  A rocky targeted improvement to affinity handling
* Move placement body samples in docs to own dir
  https://review.openstack.org/#/c/529998/
* Improved functional test coverage for placement
  https://review.openstack.org/#/q/topic:bp/placement-test-enhancement
* Functional tests for traits api
  https://review.openstack.org/#/c/524094/
* Functional test improvements for resource class
  https://review.openstack.org/#/c/524506/
* annotate loadapp() (for placement wsgi app) as public
  https://review.openstack.org/#/c/526691/
* Remove microversion fallback code from report client
  https://review.openstack.org/#/c/528794/
* Document lack of side-effects in AllocationList.create_all()
  https://review.openstack.org/#/c/530997/
* Fix documentation nits in set_and_clear_allocations
  https://review.openstack.org/#/c/531001/
* WIP: SchedulerReportClient.set_aggregates_for_provider
  https://review.openstack.org/#/c/532995/
  This is likely for rocky as it depends on changing the api for aggregates handling on the placement side to accept and provide a generation
* Naming update cn to rp (for clarity)
  https://review.openstack.org/#/c/529786/
* Add functional test for two-cell scheduler behaviors
  https://re
[openstack-dev] [nova] [placement] resource providers update 18-01
First resource provider and placement update for 2018. This year I'll be labelling the report with %y-%W to distinguish from last year, so this is 18-01. The engine of activity is still warming up for the new year, so much of this is pre-existing stuff.

# Most Important

Matt posted a message with some words about getting to the end of Queens smoothly. In general, getting to the end of Queens smoothly is what's most important. The message:

http://lists.openstack.org/pipermail/openstack-dev/2018-January/125953.html

In there are some bits related to placement and resource providers but he also identified some gaps related to understanding what's up with nested resource providers. Eric provided a response with some of that info:

http://lists.openstack.org/pipermail/openstack-dev/2018-January/125977.html

Related to that, there's some open discussion on a review about whether or not the ProviderTree system (mentioned in Eric's mail) is going to track shared providers. See:

https://review.openstack.org/#/c/526539/

# What's Changed

The / of placement no longer requires auth (which helps support automated version discovery).

Placement JSON schemas are now in their own directory rather than in the handler files.

The report client now uses POST /allocations to set and/or clear allocations for multiple consumer uuids in one request (meaning we no longer need the migration allocations theme, below).

'limit' on /allocation_candidates has been approved and should merge today. We should probably have the discussion on if/how to use it from nova-scheduler. I'll put it on the agenda for the next scheduler meeting.
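To make the 'limit' discussion a little more concrete, here is a minimal sketch of what a limited GET /allocation_candidates request URL could look like. The helper name, the base URL and the resource amounts are assumptions made up for the example; this is not how nova's report client builds the request.

```python
from urllib.parse import urlencode


def allocation_candidates_url(base_url, resources, limit=None):
    """Build a GET /allocation_candidates URL (hypothetical helper).

    resources maps resource class names to requested amounts.
    """
    # The resources parameter is a comma-separated list of CLASS:AMOUNT pairs.
    query = {"resources": ",".join(
        f"{name}:{amount}" for name, amount in sorted(resources.items()))}
    if limit is not None:
        query["limit"] = limit
    # Keep ':' and ',' readable instead of percent-encoding them.
    return f"{base_url}/allocation_candidates?{urlencode(query, safe=':,')}"


# The base URL below is a placeholder, not a real deployment.
url = allocation_candidates_url(
    "http://placement.example.com", {"VCPU": 1, "MEMORY_MB": 512}, limit=10)
print(url)
```

Whether nova-scheduler should pass a fixed limit, a configurable one, or none at all is exactly the open question noted above; the sketch only shows where the parameter would go.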
# Main Themes ## Nested Providers Mentioned above, the nested-resource-providers stack has grown a long tail of changes for managing nested providers rooted on a compute node: https://review.openstack.org/#/q/topic:bp/nested-resource-providers ## Alternate Hosts Having the scheduler request and use alternate hosts is real close: https://review.openstack.org/#/q/topic:bp/return-alternate-hosts but has hit a snag with resizes and some stuff with the CachingScheduler, such as https://review.openstack.org/#/c/531211/ Alternate hosts is something we want to bring to resolution as soon as possible so it gets as much exposure as possible. ## Misc Traits, Shared, Etc Cleanups There's a stack of code that fixes up a lot of things related to traits, sharing providers, test additions and fixes to those tests. At the moment the changes are in a bug topic: https://review.openstack.org/#/q/topic:bug/1702420 # Other * https://review.openstack.org/#/c/519462/ Log options at debug when starting API services under wsgi (Make any sense to split this into placement and nova versions? One seems easier than the other) * https://review.openstack.org/#/q/I0c4ca6a81f213277fe7219cb905a805712f81e36 Proper error handling by _ensure_resource_provider (This is already approved for master, but there are backports.) 
* https://review.openstack.org/#/q/topic:bp/placement-osc-plugin
  Build the placement osc plugin

* https://review.openstack.org/#/q/topic:bp/request-traits-in-nova
  request traits in nova

* https://review.openstack.org/#/c/513041/
  Extract instance allocation removal code

* https://review.openstack.org/#/c/493865/
  cover migration cases with functional tests

* https://review.openstack.org/#/c/527541/
  Add nova-status check for ironic flavor migration

* https://review.openstack.org/#/q/topic:bp/add-support-for-vgpu
  Add support for VGPU

* https://review.openstack.org/#/q/topic:placement_schema_separation
  Put the json schema in their own directory (one left)

* https://review.openstack.org/#/q/topic:bug/1734625
  global request id passed from nova to placement in requests (makes logging life much easier)

* https://review.openstack.org/#/c/529998/
  Move body examples to an isolated directory

* https://review.openstack.org/#/c/524506/
  Add functional tests for resource class API

* https://review.openstack.org/#/c/524094/
  Add functional tests for traits API

# End

There's probably more but as I'm not fully up to review speed, I've not seen everything yet. Next week will likely be more complete. Between now and then there will probably also be some conversations on priorities and the state of things so that we can pick what's going to fall off the radar as we race to the end of Queens.

-- 
Chris Dent (⊙_⊙') https://anticdent.org/
freenode: cdent tw: @anticdent
Re: [openstack-dev] [nova] placement/resource providers update 18
On 4/7/2017 8:35 AM, Chris Dent wrote:

There was a nova-specs sprint this week, so a lot of eyes were on specs but there continues to be regular progress on resource providers, the placement API and related work in the scheduler and resource tracker. If you're doing work that I haven't noticed and reported in here that you think should be, please follow up with some links.

# What Matters Most

The addition of traits to the placement API is very close, one patch remains. Linked below. That means that the top of the priority stack is the spec for claims via the scheduler. Also linked below.

# What's Changed

There's a new spec for including user and project information in allocations. This is a start towards allowing placement info to be used for the counting required for quotas. There's a spec and a followup review to fix some issues with it:

http://specs.openstack.org/openstack/nova-specs/specs/pike/approved/placement-project-user.html
https://review.openstack.org/#/c/454352/

# Help Wanted

Areas where volunteers are needed.

* General attention to bugs tagged placement:
  https://bugs.launchpad.net/nova/+bugs?field.tag=placement

* Helping to create api documentation for placement (see the Docs section below).

* Helping to create and evaluate functional tests of the resource tracker and the ways in which it and nova-scheduler use the reporting client. For some info see
  https://etherpad.openstack.org/p/nova-placement-functional
  and talk to edleafe.

* Performance testing. If you have access to some nodes, some basic benchmarking and profiling would be very useful. See the performance section below. Is there room on OSIC for this kind of thing?
# Main Themes

## Traits

The work to implement the traits API in placement is happening at

https://review.openstack.org/#/q/status:open+topic:bp/resource-provider-traits

There's one patch left to get the API in place and a patch for a new command to sync the os-traits library into the database:

https://review.openstack.org/#/c/450125/

There is a stack of changes to the os-traits library to add more traits and also automate creating symbols associated with the trait strings:

https://review.openstack.org/#/c/448282/4

## Ironic/Custom Resource Classes

There's a blueprint for "custom resource classes in flavors" that describes the stuff that will actually make use of custom resource classes:

https://blueprints.launchpad.net/nova/+spec/custom-resource-classes-in-flavors

The spec has merged, but the implementation has not yet started.

Over in Ironic some functional and integration tests have started:

https://review.openstack.org/#/c/443628/

## Claims in the Scheduler

Progress has been made on the spec for claims in the scheduler:

https://review.openstack.org/#/c/437424/

Some differences of opinion on what's possible now and what the API should expose have been resolved, but now we need to resolve some questions on how (or even if) to most effectively deal with reconciling allocations that used to happen in the resource tracker and will now happen in the scheduler. Eyes and brains required.

Thinking about this stuff has also revealed some places where it's possible for allocations to become wrong or orphaned:

https://bugs.launchpad.net/nova/+bug/1679750
https://bugs.launchpad.net/nova/+bug/1661312

## Shared Resource Providers

https://blueprints.launchpad.net/nova/+spec/shared-resources-pike

Progress on this will continue once traits and claims have moved forward.
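As a rough illustration of the custom resource classes discussed under "Ironic/Custom Resource Classes" above: custom classes are named with a `CUSTOM_` prefix followed by upper case letters, digits and underscores. The sketch below turns a free-form name into that shape and builds the sort of `resources:` flavor extra spec the blueprint has in mind. The function names and the example name "baremetal-gold" are hypothetical, and the extra spec key format is my assumption about where the work is heading, not something the implementation (which hasn't started) confirms.

```python
import re

CUSTOM_PREFIX = "CUSTOM_"


def normalize_custom_resource_class(name):
    """Turn a free-form name into a CUSTOM_* resource class name.

    Upper-cases the name and replaces anything that is not A-Z, 0-9
    or underscore with an underscore, then adds the CUSTOM_ prefix.
    """
    normalized = re.sub(r"[^A-Z0-9_]", "_", name.upper())
    return CUSTOM_PREFIX + normalized


def flavor_extra_spec(name, amount=1):
    """Build a hypothetical flavor extra spec requesting `amount` units."""
    key = "resources:" + normalize_custom_resource_class(name)
    return {key: str(amount)}


print(normalize_custom_resource_class("baremetal-gold"))
print(flavor_extra_spec("baremetal-gold"))
```

For a whole-node resource such as an Ironic node the amount would normally be 1, since a node is either consumed or not.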
## Nested Resource Providers

The spec for this has been updated with what was learned at the PTG and moved to pike and merged:

http://specs.openstack.org/openstack/nova-specs/specs/pike/approved/nested-resource-providers.html

## Docs

https://review.openstack.org/#/q/topic:cd/placement-api-ref

Several reviews are in progress for documenting the placement API. This is likely going to take quite a few iterations as we work out the patterns and tooling. But it's great to see the progress and when looking at the draft rendered docs it makes placement feel like a real thing™.

Find me (cdent) or Andrey (avolkov) if you want to help out or have other questions.

## Performance

We're aware that there are some redundancies in the resource tracker that we'd like to clean up

http://lists.openstack.org/pipermail/openstack-dev/2017-January/110953.html

but it's also the case that we've done no performance testing on the placement service itself.

We ought to do some testing to make sure there aren't unexpected performance drains. Is this something where we could get time on the OSIC hardware?

# Other Code/Specs

* https://review.openstack.org/#/c/418393/
  A spec for improving the level of detail and structure in placement error responses so that it is easier to distinguish between different types of, for example, 409 responses. This hasn't seen any attention since March 17, and as a result didn't
[openstack-dev] [nova] placement/resource providers update 18
There was a nova-specs sprint this week, so a lot of eyes were on specs but there continues to be regular progress on resource providers, the placement API and related work in the scheduler and resource tracker. If you're doing work that I haven't noticed and reported in here that you think should be, please follow up with some links.

# What Matters Most

The addition of traits to the placement API is very close, one patch remains. Linked below. That means that the top of the priority stack is the spec for claims via the scheduler. Also linked below.

# What's Changed

There's a new spec for including user and project information in allocations. This is a start towards allowing placement info to be used for the counting required for quotas. There's a spec and a followup review to fix some issues with it:

http://specs.openstack.org/openstack/nova-specs/specs/pike/approved/placement-project-user.html
https://review.openstack.org/#/c/454352/

# Help Wanted

Areas where volunteers are needed.

* General attention to bugs tagged placement:
  https://bugs.launchpad.net/nova/+bugs?field.tag=placement

* Helping to create api documentation for placement (see the Docs section below).

* Helping to create and evaluate functional tests of the resource tracker and the ways in which it and nova-scheduler use the reporting client. For some info see
  https://etherpad.openstack.org/p/nova-placement-functional
  and talk to edleafe.

* Performance testing. If you have access to some nodes, some basic benchmarking and profiling would be very useful. See the performance section below. Is there room on OSIC for this kind of thing?
# Main Themes

## Traits

The work to implement the traits API in placement is happening at

https://review.openstack.org/#/q/status:open+topic:bp/resource-provider-traits

There's one patch left to get the API in place and a patch for a new command to sync the os-traits library into the database:

https://review.openstack.org/#/c/450125/

There is a stack of changes to the os-traits library to add more traits and also automate creating symbols associated with the trait strings:

https://review.openstack.org/#/c/448282/4

## Ironic/Custom Resource Classes

There's a blueprint for "custom resource classes in flavors" that describes the stuff that will actually make use of custom resource classes:

https://blueprints.launchpad.net/nova/+spec/custom-resource-classes-in-flavors

The spec has merged, but the implementation has not yet started.

Over in Ironic some functional and integration tests have started:

https://review.openstack.org/#/c/443628/

## Claims in the Scheduler

Progress has been made on the spec for claims in the scheduler:

https://review.openstack.org/#/c/437424/

Some differences of opinion on what's possible now and what the API should expose have been resolved, but now we need to resolve some questions on how (or even if) to most effectively deal with reconciling allocations that used to happen in the resource tracker and will now happen in the scheduler. Eyes and brains required.

Thinking about this stuff has also revealed some places where it's possible for allocations to become wrong or orphaned:

https://bugs.launchpad.net/nova/+bug/1679750
https://bugs.launchpad.net/nova/+bug/1661312

## Shared Resource Providers

https://blueprints.launchpad.net/nova/+spec/shared-resources-pike

Progress on this will continue once traits and claims have moved forward.
## Nested Resource Providers

The spec for this has been updated with what was learned at the PTG and moved to pike and merged:

http://specs.openstack.org/openstack/nova-specs/specs/pike/approved/nested-resource-providers.html

## Docs

https://review.openstack.org/#/q/topic:cd/placement-api-ref

Several reviews are in progress for documenting the placement API. This is likely going to take quite a few iterations as we work out the patterns and tooling. But it's great to see the progress and when looking at the draft rendered docs it makes placement feel like a real thing™.

Find me (cdent) or Andrey (avolkov) if you want to help out or have other questions.

## Performance

We're aware that there are some redundancies in the resource tracker that we'd like to clean up

http://lists.openstack.org/pipermail/openstack-dev/2017-January/110953.html

but it's also the case that we've done no performance testing on the placement service itself.

We ought to do some testing to make sure there aren't unexpected performance drains. Is this something where we could get time on the OSIC hardware?

# Other Code/Specs

* https://review.openstack.org/#/c/418393/
  A spec for improving the level of detail and structure in placement error responses so that it is easier to distinguish between different types of, for example, 409 responses. This hasn't seen any attention since March 17, and as a result didn't get any attention duri