Re: [openstack-dev] [placement] Anchor/Relay Providers
On 04/16/2018 06:23 PM, Eric Fried wrote:
>> I still don't see a use in returning the root providers in the
>> allocation requests -- since there is nothing consuming resources from
>> those providers.
>>
>> And we already return the root_provider_uuid for all providers involved
>> in allocation requests within the provider_summaries section.
>>
>> So, I can kind of see where we might want to change *this* line of the
>> nova scheduler:
>>
>> https://github.com/openstack/nova/blob/stable/pike/nova/scheduler/filter_scheduler.py#L349
>>
>> from this:
>>
>>     compute_uuids = list(provider_summaries.keys())
>>
>> to this:
>>
>>     compute_uuids = set([
>>         ps['root_provider_uuid'] for ps in provider_summaries.values()
>>     ])
>
> If we're granting that it's possible to get all your resources from
> sharing providers, the above doesn't help you to know which of your
> compute_uuids belongs to which of those sharing-only allocation
> requests.
>
> I'm fine deferring this part until we have a use case for sharing-only
> allocation requests that aren't prompted by an "attach-*" case where we
> already know the target host/consumer. But I'd like to point out that
> there's nothing in the API that prevents us from getting such results.

And I'd like to point out that I originally made the GET
/allocation_candidates API not return allocation requests when there
were only sharing providers. Because... well, there's just no viable use
case for it.

-jay

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
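[Editorial note: the suggested scheduler change above can be sketched in a few
lines. This is a hedged illustration, not nova code; the dict shape mirrors the
provider_summaries section of a GET /allocation_candidates response, and the
UUID strings are made up.]

```python
# Sketch: collapse every provider in provider_summaries down to its root
# provider UUID. Dict shape and UUIDs are illustrative assumptions.

def root_compute_uuids(provider_summaries):
    """Return the set of root provider UUIDs behind all summarized providers."""
    return {ps["root_provider_uuid"] for ps in provider_summaries.values()}

provider_summaries = {
    # A compute node is its own root...
    "cn1-uuid": {"root_provider_uuid": "cn1-uuid"},
    # ...and its nested children (e.g. a NUMA cell RP) share that root.
    "cn1-numa0-uuid": {"root_provider_uuid": "cn1-uuid"},
}

print(root_compute_uuids(provider_summaries))  # {'cn1-uuid'}
```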
Re: [openstack-dev] [placement] Anchor/Relay Providers
> I still don't see a use in returning the root providers in the
> allocation requests -- since there is nothing consuming resources from
> those providers.
>
> And we already return the root_provider_uuid for all providers involved
> in allocation requests within the provider_summaries section.
>
> So, I can kind of see where we might want to change *this* line of the
> nova scheduler:
>
> https://github.com/openstack/nova/blob/stable/pike/nova/scheduler/filter_scheduler.py#L349
>
> from this:
>
>     compute_uuids = list(provider_summaries.keys())
>
> to this:
>
>     compute_uuids = set([
>         ps['root_provider_uuid'] for ps in provider_summaries.values()
>     ])

If we're granting that it's possible to get all your resources from
sharing providers, the above doesn't help you to know which of your
compute_uuids belongs to which of those sharing-only allocation
requests.

I'm fine deferring this part until we have a use case for sharing-only
allocation requests that aren't prompted by an "attach-*" case where we
already know the target host/consumer. But I'd like to point out that
there's nothing in the API that prevents us from getting such results.

-efried
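[Editorial note: the gap efried describes can be made concrete. For a
sharing-only allocation request, collapsing via root_provider_uuid yields the
sharing providers' own roots (each sharing provider is the root of its own
one-node tree), so the compute host anchoring the candidate never appears.
A hedged sketch; payload shapes and UUIDs are assumptions, not placement code.]

```python
# Sketch: try to recover the anchor of one allocation request from
# provider_summaries alone. Shapes and UUIDs are illustrative assumptions.

def anchors_for_request(allocation_request, provider_summaries):
    """Roots of every provider named in one allocation request."""
    return {
        provider_summaries[rp]["root_provider_uuid"]
        for rp in allocation_request["allocations"]
    }

provider_summaries = {
    "ss1-uuid": {"root_provider_uuid": "ss1-uuid"},  # sharing DISK_GB provider
    "ss2-uuid": {"root_provider_uuid": "ss2-uuid"},  # sharing IP provider
}
sharing_only = {"allocations": {
    "ss1-uuid": {"resources": {"DISK_GB": 1024}},
    "ss2-uuid": {"resources": {"IPV4_ADDRESS": 1}},
}}

# The compute host (CN1) is nowhere in the result -- it cannot be recovered.
print(anchors_for_request(sharing_only, provider_summaries))
```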
Re: [openstack-dev] [placement] Anchor/Relay Providers
On 04/16/2018 04:16 PM, Eric Fried wrote:
> I was presenting an example using VM-ish resource classes, because I
> can write them down and everybody knows what I'm talking about without
> me having to explain what they are. But remember we want placement to
> be usable outside of Nova as well.
>
> But also, I thought we had situations where the VCPU and MEMORY_MB were
> themselves provided by sharing providers, associated with a compute
> host RP that may itself be devoid of inventory. (This may even be a
> viable way to model VMWare's clustery things today.)

I still don't see a use in returning the root providers in the
allocation requests -- since there is nothing consuming resources from
those providers.

And we already return the root_provider_uuid for all providers involved
in allocation requests within the provider_summaries section.

So, I can kind of see where we might want to change *this* line of the
nova scheduler:

https://github.com/openstack/nova/blob/stable/pike/nova/scheduler/filter_scheduler.py#L349

from this:

    compute_uuids = list(provider_summaries.keys())

to this:

    compute_uuids = set([
        ps['root_provider_uuid'] for ps in provider_summaries.values()
    ])

But other than that, I don't see a reason to change the response from
GET /allocation_candidates at this time.

Best,
-jay

> On 04/16/2018 01:58 PM, Jay Pipes wrote:
>> Sorry it took so long to respond. Comments inline.
>>
>> On 03/30/2018 08:34 PM, Eric Fried wrote:
>>> Folks who care about placement (but especially Jay and Tetsuro)-
>>>
>>> I was reviewing [1] and was at first very unsatisfied that we were not
>>> returning the anchor providers in the results. But as I started digging
>>> into what it would take to fix it, I realized it's going to be
>>> nontrivial. I wanted to dump my thoughts before the weekend.
>>>
>>> It should be legal to have a configuration like:
>>>
>>> #        CN1 (VCPU, MEMORY_MB)
>>> #        /  \
>>> #   /agg1    \agg2
>>> #      /      \
>>> #    SS1       SS2
>>> # (DISK_GB)   (IPV4_ADDRESS)
>>>
>>> And make a request for DISK_GB,IPV4_ADDRESS;
>>> And have it return a candidate including SS1 and SS2.
>>>
>>> The CN1 resource provider acts as an "anchor" or "relay": a provider
>>> that doesn't provide any of the requested resource, but connects to one
>>> or more sharing providers that do so.
>>
>> To be honest, such a request just doesn't make much sense to me.
>>
>> Think about what that is requesting. I want some DISK_GB resources and
>> an IP address. For what? What is going to be *using* those resources?
>>
>> Ah... a virtual machine. In other words, something that would *also* be
>> requesting some CPU and memory resources as well.
>>
>> So, the request is just fatally flawed, IMHO. It doesn't represent a use
>> case from the real world.
>>
>> I don't believe we should be changing placement (either the REST API or
>> the implementation of allocation candidate retrieval) for use cases that
>> don't represent real-world requests.
>>
>> Best,
>> -jay
Re: [openstack-dev] [placement] Anchor/Relay Providers
I was presenting an example using VM-ish resource classes, because I can
write them down and everybody knows what I'm talking about without me
having to explain what they are. But remember we want placement to be
usable outside of Nova as well.

But also, I thought we had situations where the VCPU and MEMORY_MB were
themselves provided by sharing providers, associated with a compute host
RP that may itself be devoid of inventory. (This may even be a viable
way to model VMWare's clustery things today.)

-efried

On 04/16/2018 01:58 PM, Jay Pipes wrote:
> Sorry it took so long to respond. Comments inline.
>
> On 03/30/2018 08:34 PM, Eric Fried wrote:
>> Folks who care about placement (but especially Jay and Tetsuro)-
>>
>> I was reviewing [1] and was at first very unsatisfied that we were not
>> returning the anchor providers in the results. But as I started digging
>> into what it would take to fix it, I realized it's going to be
>> nontrivial. I wanted to dump my thoughts before the weekend.
>>
>> It should be legal to have a configuration like:
>>
>> #        CN1 (VCPU, MEMORY_MB)
>> #        /  \
>> #   /agg1    \agg2
>> #      /      \
>> #    SS1       SS2
>> # (DISK_GB)   (IPV4_ADDRESS)
>>
>> And make a request for DISK_GB,IPV4_ADDRESS;
>> And have it return a candidate including SS1 and SS2.
>>
>> The CN1 resource provider acts as an "anchor" or "relay": a provider
>> that doesn't provide any of the requested resource, but connects to one
>> or more sharing providers that do so.
>
> To be honest, such a request just doesn't make much sense to me.
>
> Think about what that is requesting. I want some DISK_GB resources and
> an IP address. For what? What is going to be *using* those resources?
>
> Ah... a virtual machine. In other words, something that would *also* be
> requesting some CPU and memory resources as well.
>
> So, the request is just fatally flawed, IMHO. It doesn't represent a use
> case from the real world.
>
> I don't believe we should be changing placement (either the REST API or
> the implementation of allocation candidate retrieval) for use cases that
> don't represent real-world requests.
>
> Best,
> -jay
Re: [openstack-dev] [placement] Anchor/Relay Providers
Sorry it took so long to respond. Comments inline.

On 03/30/2018 08:34 PM, Eric Fried wrote:
> Folks who care about placement (but especially Jay and Tetsuro)-
>
> I was reviewing [1] and was at first very unsatisfied that we were not
> returning the anchor providers in the results. But as I started digging
> into what it would take to fix it, I realized it's going to be
> nontrivial. I wanted to dump my thoughts before the weekend.
>
> It should be legal to have a configuration like:
>
> #        CN1 (VCPU, MEMORY_MB)
> #        /  \
> #   /agg1    \agg2
> #      /      \
> #    SS1       SS2
> # (DISK_GB)   (IPV4_ADDRESS)
>
> And make a request for DISK_GB,IPV4_ADDRESS;
> And have it return a candidate including SS1 and SS2.
>
> The CN1 resource provider acts as an "anchor" or "relay": a provider
> that doesn't provide any of the requested resource, but connects to one
> or more sharing providers that do so.

To be honest, such a request just doesn't make much sense to me.

Think about what that is requesting. I want some DISK_GB resources and
an IP address. For what? What is going to be *using* those resources?

Ah... a virtual machine. In other words, something that would *also* be
requesting some CPU and memory resources as well.

So, the request is just fatally flawed, IMHO. It doesn't represent a use
case from the real world.

I don't believe we should be changing placement (either the REST API or
the implementation of allocation candidate retrieval) for use cases that
don't represent real-world requests.

Best,
-jay
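[Editorial note: the topology under debate can be modeled in a few lines of
Python. This is a toy illustration of the anchor/relay idea under assumed data
structures, not placement's actual SQL implementation.]

```python
# Toy model of the CN1/SS1/SS2 topology from the thread: CN1 shares agg1 with
# SS1 (DISK_GB) and agg2 with SS2 (IPV4_ADDRESS). All names are illustrative.

AGGREGATES = {"CN1": {"agg1", "agg2"}, "SS1": {"agg1"}, "SS2": {"agg2"}}
SHARING = {"SS1", "SS2"}  # providers marked as sharing their inventory
INVENTORY = {
    "CN1": {"VCPU", "MEMORY_MB"},
    "SS1": {"DISK_GB"},
    "SS2": {"IPV4_ADDRESS"},
}

def providers_for(resource_class):
    """Providers that have inventory in the given resource class."""
    return {rp for rp, inv in INVENTORY.items() if resource_class in inv}

def anchors_for(sharing_providers):
    """Non-sharing providers reaching every sharing provider via an aggregate."""
    return {
        rp for rp in AGGREGATES
        if rp not in SHARING
        and all(AGGREGATES[rp] & AGGREGATES[sp] for sp in sharing_providers)
    }

# A request for DISK_GB,IPV4_ADDRESS is served entirely by sharing providers...
candidate = providers_for("DISK_GB") | providers_for("IPV4_ADDRESS")
# ...and the anchor is recoverable only via the aggregate associations.
print(anchors_for(candidate))  # {'CN1'}
```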
Re: [openstack-dev] [placement] Anchor/Relay Providers
/me responds to self

Good progress has been made here. Tetsuro solved the piece where
provider summaries were only showing resources that had been requested -
with [8] they show usage information for *all* their resources.

In order to make use of both [1] and [8], I had to shuffle them into the
same series - I put [8] first - and then balance my (heretofore) WIP [7]
on top. So we now have a lovely 5-part series starting at [9].

Regarding the (heretofore) WIP [7], I cleaned it up and made it ready.

QUESTION: Do we need a microversion for [8] and/or [1] and/or [7]? Each
changes the response payload content of GET /allocation_candidates, so
yes; but that content was arguably broken before, so no. Please comment
on the patches accordingly.

-efried

[1] https://review.openstack.org/#/c/533437/
[2] https://bugs.launchpad.net/nova/+bug/1732731
[3] https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@3308
[4] https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@3062
[5] https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@2658
[6] https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@3303
[7] https://review.openstack.org/#/c/558014/
[8] https://review.openstack.org/#/c/558045/
[9] https://review.openstack.org/#/c/558044/

On 03/30/2018 07:34 PM, Eric Fried wrote:
> Folks who care about placement (but especially Jay and Tetsuro)-
>
> I was reviewing [1] and was at first very unsatisfied that we were not
> returning the anchor providers in the results. But as I started digging
> into what it would take to fix it, I realized it's going to be
> nontrivial. I wanted to dump my thoughts before the weekend.
>
> It should be legal to have a configuration like:
>
> #        CN1 (VCPU, MEMORY_MB)
> #        /  \
> #   /agg1    \agg2
> #      /      \
> #    SS1       SS2
> # (DISK_GB)   (IPV4_ADDRESS)
>
> And make a request for DISK_GB,IPV4_ADDRESS;
> And have it return a candidate including SS1 and SS2.
>
> The CN1 resource provider acts as an "anchor" or "relay": a provider
> that doesn't provide any of the requested resource, but connects to one
> or more sharing providers that do so.
>
> This scenario doesn't work today (see bug [2]). Tetsuro has a partial
> fix [1].
>
> However, whereas that fix will return you an allocation_request
> containing SS1 and SS2, neither the allocation_request nor the
> provider_summary mentions CN1.
>
> That's bad. Consider use cases like Nova's, where we have to land that
> allocation_request on a host: we have no good way of figuring out who
> that host is.
>
> Starting from the API, the response payload should look like:
>
> {
>     "allocation_requests": [
>         {"allocations": {
>             # This is missing ==>
>             CN1_UUID: {"resources": {}},
>             # <==
>             SS1_UUID: {"resources": {"DISK_GB": 1024}},
>             SS2_UUID: {"resources": {"IPV4_ADDRESS": 1}}
>         }}
>     ],
>     "provider_summaries": {
>         # This is missing ==>
>         CN1_UUID: {"resources": {
>             "VCPU": {"used": 123, "capacity": 456}
>         }},
>         # <==
>         SS1_UUID: {"resources": {
>             "DISK_GB": {"used": 2048, "capacity": 1048576}
>         }},
>         SS2_UUID: {"resources": {
>             "IPV4_ADDRESS": {"used": 4, "capacity": 32}
>         }}
>     },
> }
>
> Here's why it's not working currently:
>
> => CN1_UUID isn't in `summaries` [3]
> => because _build_provider_summaries [4] doesn't return it
> => because it's not in usages because _get_usages_by_provider_and_rc [5]
>    only finds providers providing resource in that RC
> => and since CN1 isn't providing resource in any requested RC, it ain't
>    included.
>
> But we have the anchor provider's (internal) ID; it's the ns_rp_id we're
> iterating on in this loop [6]. So let's just use that to get the summary
> and add it to the mix, right? Things that make that difficult:
>
> => We have no convenient helper that builds a summary object without
>    specifying a resource class (which is a separate problem, because it
>    means resources we didn't request don't show up in the provider
>    summaries either - they should).
> => We internally build these gizmos inside out - an AllocationRequest
>    contains a list of AllocationRequestResource, which contains a provider
>    UUID, resource class, and amount. The latter two are required - but
>    would be n/a for our anchor RP.
>
> I played around with this and came up with something that gets us most
> of the way there [7]. It's quick and dirty: there are functional holes
> (like returning "N/A" as a resource class; and traits are missing) and
> places where things could be made more efficient. But it's a start.
>
> -efried
>
> [1]
[openstack-dev] [placement] Anchor/Relay Providers
Folks who care about placement (but especially Jay and Tetsuro)-

I was reviewing [1] and was at first very unsatisfied that we were not
returning the anchor providers in the results. But as I started digging
into what it would take to fix it, I realized it's going to be
nontrivial. I wanted to dump my thoughts before the weekend.

It should be legal to have a configuration like:

#        CN1 (VCPU, MEMORY_MB)
#        /  \
#   /agg1    \agg2
#      /      \
#    SS1       SS2
# (DISK_GB)   (IPV4_ADDRESS)

And make a request for DISK_GB,IPV4_ADDRESS;
And have it return a candidate including SS1 and SS2.

The CN1 resource provider acts as an "anchor" or "relay": a provider
that doesn't provide any of the requested resource, but connects to one
or more sharing providers that do so.

This scenario doesn't work today (see bug [2]). Tetsuro has a partial
fix [1].

However, whereas that fix will return you an allocation_request
containing SS1 and SS2, neither the allocation_request nor the
provider_summary mentions CN1.

That's bad. Consider use cases like Nova's, where we have to land that
allocation_request on a host: we have no good way of figuring out who
that host is.

Starting from the API, the response payload should look like:

{
    "allocation_requests": [
        {"allocations": {
            # This is missing ==>
            CN1_UUID: {"resources": {}},
            # <==
            SS1_UUID: {"resources": {"DISK_GB": 1024}},
            SS2_UUID: {"resources": {"IPV4_ADDRESS": 1}}
        }}
    ],
    "provider_summaries": {
        # This is missing ==>
        CN1_UUID: {"resources": {
            "VCPU": {"used": 123, "capacity": 456}
        }},
        # <==
        SS1_UUID: {"resources": {
            "DISK_GB": {"used": 2048, "capacity": 1048576}
        }},
        SS2_UUID: {"resources": {
            "IPV4_ADDRESS": {"used": 4, "capacity": 32}
        }}
    },
}

Here's why it's not working currently:

=> CN1_UUID isn't in `summaries` [3]
=> because _build_provider_summaries [4] doesn't return it
=> because it's not in usages because _get_usages_by_provider_and_rc [5]
   only finds providers providing resource in that RC
=> and since CN1 isn't providing resource in any requested RC, it ain't
   included.

But we have the anchor provider's (internal) ID; it's the ns_rp_id we're
iterating on in this loop [6]. So let's just use that to get the summary
and add it to the mix, right? Things that make that difficult:

=> We have no convenient helper that builds a summary object without
   specifying a resource class (which is a separate problem, because it
   means resources we didn't request don't show up in the provider
   summaries either - they should).
=> We internally build these gizmos inside out - an AllocationRequest
   contains a list of AllocationRequestResource, which contains a provider
   UUID, resource class, and amount. The latter two are required - but
   would be n/a for our anchor RP.

I played around with this and came up with something that gets us most
of the way there [7]. It's quick and dirty: there are functional holes
(like returning "N/A" as a resource class; and traits are missing) and
places where things could be made more efficient. But it's a start.

-efried

[1] https://review.openstack.org/#/c/533437/
[2] https://bugs.launchpad.net/nova/+bug/1732731
[3] https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@3308
[4] https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@3062
[5] https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@2658
[6] https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@3303
[7] https://review.openstack.org/#/c/558014/
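[Editorial note: the response payload Eric proposes obeys a simple invariant -
every provider named in an allocation request appears in provider_summaries,
and the anchor shows up in the request with an empty "resources" dict. A hedged
sketch of that check; the dict below reproduces the email's example with the
UUID placeholders as strings, and is not placement code.]

```python
# Sketch: consistency check over the proposed GET /allocation_candidates
# payload shape. Placeholder UUIDs are written as plain strings.

def check_candidate(payload):
    """Assert every allocated-to provider is summarized."""
    summaries = payload["provider_summaries"]
    for req in payload["allocation_requests"]:
        for rp_uuid in req["allocations"]:
            assert rp_uuid in summaries, "every provider must be summarized"

payload = {
    "allocation_requests": [
        {"allocations": {
            "CN1_UUID": {"resources": {}},  # the anchor: consumes nothing
            "SS1_UUID": {"resources": {"DISK_GB": 1024}},
            "SS2_UUID": {"resources": {"IPV4_ADDRESS": 1}},
        }},
    ],
    "provider_summaries": {
        "CN1_UUID": {"resources": {"VCPU": {"used": 123, "capacity": 456}}},
        "SS1_UUID": {"resources": {"DISK_GB": {"used": 2048, "capacity": 1048576}}},
        "SS2_UUID": {"resources": {"IPV4_ADDRESS": {"used": 4, "capacity": 32}}},
    },
}
check_candidate(payload)

# Anchors are recoverable as the providers allocated zero resources:
anchors = {rp for req in payload["allocation_requests"]
           for rp, alloc in req["allocations"].items() if not alloc["resources"]}
print(anchors)  # {'CN1_UUID'}
```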