[openstack-dev] [nova][placement] No n-sch meeting next week

2018-11-07 Thread Eric Fried
...due to summit.

-efried



Re: [openstack-dev] [nova][placement] Placement requests and caching in the resource tracker

2018-11-06 Thread Eric Fried
I do intend to respond to all the excellent discussion on this thread,
but right now I just want to offer an update on the code:

I've split the effort apart into multiple changes starting at [1]. A few
of these are ready for review.

One opinion was that a specless blueprint would be appropriate. If
there's consensus on this, I'll spin one up.

[1] https://review.openstack.org/#/c/615606/

On 11/5/18 03:16, Belmiro Moreira wrote:
> Thanks Eric for the patch.
> This will help keep placement calls under control.
> 
> Belmiro
> 
> 
> On Sun, Nov 4, 2018 at 1:01 PM Jay Pipes <jaypi...@gmail.com> wrote:
> 
>     On 11/02/2018 03:22 PM, Eric Fried wrote:
> > All-
> >
> > Based on a (long) discussion yesterday [1] I have put up a patch [2]
> > whereby you can set [compute]resource_provider_association_refresh to
> > zero and the resource tracker will never* refresh the report client's
> > provider cache. Philosophically, we're removing the "healing"
> aspect of
> > the resource tracker's periodic and trusting that placement won't
> > diverge from whatever's in our cache. (If it does, it's because the op
> > hit the CLI, in which case they should SIGHUP - see below.)
> >
> > *except:
> > - When we initially create the compute node record and bootstrap its
> > resource provider.
> > - When the virt driver's update_provider_tree makes a change,
> > update_from_provider_tree reflects them in the cache as well as
> pushing
> > them back to placement.
> > - If update_from_provider_tree fails, the cache is cleared and gets
> > rebuilt on the next periodic.
> > - If you send SIGHUP to the compute process, the cache is cleared.
> >
> > This should dramatically reduce the number of calls to placement from
> > the compute service. Like, to nearly zero, unless something is
> actually
> > changing.
> >
> > Can I get some initial feedback as to whether this is worth
> polishing up
> > into something real? (It will probably need a bp/spec if so.)
> >
> > [1]
> >
> 
> http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2018-11-01.log.html#t2018-11-01T17:32:03
> > [2] https://review.openstack.org/#/c/614886/
> >
> > ==
> > Background
> > ==
> > In the Queens release, our friends at CERN noticed a serious spike in
> > the number of requests to placement from compute nodes, even in a
> > stable-state cloud. Given that we were in the process of adding a
> ton of
> > infrastructure to support sharing and nested providers, this was not
> > unexpected. Roughly, what was previously:
> >
> >   @periodic_task:
> >       GET /resource_providers/$compute_uuid
> >       GET /resource_providers/$compute_uuid/inventories
> >
> > became more like:
> >
> >   @periodic_task:
> >       # In Queens/Rocky, this would still just return the compute RP
> >       GET /resource_providers?in_tree=$compute_uuid
> >       # In Queens/Rocky, this would return nothing
> >       GET /resource_providers?member_of=...&required=MISC_SHARES...
> >       for each provider returned above:  # i.e. just one in Q/R
> >           GET /resource_providers/$compute_uuid/inventories
> >           GET /resource_providers/$compute_uuid/traits
> >           GET /resource_providers/$compute_uuid/aggregates
> >
> > In a cloud the size of CERN's, the load wasn't acceptable. But at the
> > time, CERN worked around the problem by disabling refreshing entirely.
> > (The fact that this seems to have worked for them is an
> encouraging sign
> > for the proposed code change.)
> >
> > We're not actually making use of most of that information, but it sets
> > the stage for things that we're working on in Stein and beyond, like
> > multiple VGPU types, bandwidth resource providers, accelerators, NUMA,
> > etc., so removing/reducing the amount of information we look at isn't
> > really an option strategically.
> 
> I support your idea of getting rid of the periodic refresh of the cache
> in the scheduler report client. Much of that was added in order to
> emulate the original way the resource tracker worked.
> 
> Most of the behaviour in the original resource tracker (and some of the
> code still in there for dealing with (surprise!) PCI passthrough
> 

[openstack-dev] [nova][placement] Placement requests and caching in the resource tracker

2018-11-02 Thread Eric Fried
All-

Based on a (long) discussion yesterday [1] I have put up a patch [2]
whereby you can set [compute]resource_provider_association_refresh to
zero and the resource tracker will never* refresh the report client's
provider cache. Philosophically, we're removing the "healing" aspect of
the resource tracker's periodic and trusting that placement won't
diverge from whatever's in our cache. (If it does, it's because the op
hit the CLI, in which case they should SIGHUP - see below.)

*except:
- When we initially create the compute node record and bootstrap its
resource provider.
- When the virt driver's update_provider_tree makes a change,
update_from_provider_tree reflects them in the cache as well as pushing
them back to placement.
- If update_from_provider_tree fails, the cache is cleared and gets
rebuilt on the next periodic.
- If you send SIGHUP to the compute process, the cache is cleared.

This should dramatically reduce the number of calls to placement from
the compute service. Like, to nearly zero, unless something is actually
changing.
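
For operators, the knob under discussion would look something like this in
nova.conf (a sketch; the option name comes from the patch, the semantics are
as described above):

  [compute]
  # 0 (proposed) = never refresh the report client's provider cache.
  # Send SIGHUP to the nova-compute process to force the cache to be cleared.
  resource_provider_association_refresh = 0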

Can I get some initial feedback as to whether this is worth polishing up
into something real? (It will probably need a bp/spec if so.)

[1]
http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2018-11-01.log.html#t2018-11-01T17:32:03
[2] https://review.openstack.org/#/c/614886/

==
Background
==
In the Queens release, our friends at CERN noticed a serious spike in
the number of requests to placement from compute nodes, even in a
stable-state cloud. Given that we were in the process of adding a ton of
infrastructure to support sharing and nested providers, this was not
unexpected. Roughly, what was previously:

  @periodic_task:
      GET /resource_providers/$compute_uuid
      GET /resource_providers/$compute_uuid/inventories

became more like:

  @periodic_task:
      # In Queens/Rocky, this would still just return the compute RP
      GET /resource_providers?in_tree=$compute_uuid
      # In Queens/Rocky, this would return nothing
      GET /resource_providers?member_of=...&required=MISC_SHARES...
      for each provider returned above:  # i.e. just one in Q/R
          GET /resource_providers/$compute_uuid/inventories
          GET /resource_providers/$compute_uuid/traits
          GET /resource_providers/$compute_uuid/aggregates

In a cloud the size of CERN's, the load wasn't acceptable. But at the
time, CERN worked around the problem by disabling refreshing entirely.
(The fact that this seems to have worked for them is an encouraging sign
for the proposed code change.)

We're not actually making use of most of that information, but it sets
the stage for things that we're working on in Stein and beyond, like
multiple VGPU types, bandwidth resource providers, accelerators, NUMA,
etc., so removing/reducing the amount of information we look at isn't
really an option strategically.



Re: [openstack-dev] [nova][limits] Does ANYONE at all use the quota class functionality in Nova?

2018-10-24 Thread Eric Fried
Forwarding to openstack-operators per Jay.

On 10/24/18 10:10, Jay Pipes wrote:
> Nova's API has the ability to create "quota classes", which are
> basically limits for a set of resource types. There is something called
> the "default quota class" which corresponds to the limits in the
> CONF.quota section. Quota classes are basically templates of limits to
> be applied if the calling project doesn't have any stored
> project-specific limits.
> 
> Has anyone ever created a quota class that is different from "default"?
> 
> I'd like to propose deprecating this API and getting rid of this
> functionality since it conflicts with the new Keystone /limits endpoint,
> is highly coupled with RAX's turnstile middleware and I can't seem to
> find anyone who has ever used it. Deprecating this API and functionality
> would make the transition to a saner quota management system much easier
> and straightforward.
> 
> Also, I'm apparently blocked now from the operators ML so could someone
> please forward this there?
> 
> Thanks,
> -jay
> 
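
For reference, a minimal sketch of the API under discussion (the
os-quota-class-sets endpoint; the class name and values below are
illustrative):

  # Show the default quota class (mirrors the CONF.quota defaults):
  GET /os-quota-class-sets/default

  # Create or update a non-default class (the thing Jay is asking whether
  # anyone actually does):
  PUT /os-quota-class-sets/gold
  {"quota_class_set": {"instances": 50, "cores": 200, "ram": 512000}}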



Re: [openstack-dev] [Openstack-sigs] [all] Naming the T release of OpenStack

2018-10-18 Thread Eric Fried
Sorry, I'm opposed to this idea.

I admit I don't understand the political framework, nor have I read the
governing documents beyond [1], but that document makes it clear that
this is supposed to be a community-wide vote.  Is it really legal for
the TC (or whoever has merge rights on [2]) to merge a patch that gives
that same body the power to take the decision out of the hands of the
community? So it's really an oligarchy that gives its constituency the
illusion of democracy until something comes up that it feels like not
having a vote on? The fact that it's something relatively "unimportant"
(this time) is not a comfort.

Not that I think the TC would necessarily move forward with [2] in the
face of substantial opposition from non-TC "cores" or whatever.

I will vote enthusiastically for "Train". But a vote it should be.

-efried

[1] https://governance.openstack.org/tc/reference/release-naming.html
[2] https://review.openstack.org/#/c/611511/

On 10/18/2018 10:52 AM, arkady.kanev...@dell.com wrote:
> +1 for the poll.
> 
> Let’s follow well established process.
> 
> If we want to add Train as one of the options for the name I am OK with it.
> 
>  
> 
> *From:* Jonathan Mills 
> *Sent:* Thursday, October 18, 2018 10:49 AM
> *To:* openstack-s...@lists.openstack.org
> *Subject:* Re: [Openstack-sigs] [all] Naming the T release of OpenStack
> 
>  
> 
> 
> +1 for just having a poll
> 
>  
> 
> On Thu, Oct 18, 2018 at 11:39 AM David Medberry wrote:
> 
> I'm fine with Train but I'm also fine with just adding it to the
> list and voting on it. It will win.
> 
>  
> 
> Also, for those not familiar with the debian/ubuntu command "sl",
> now is the time to become so.
> 
>  
> 
> apt install sl
> 
> sl -Flea #ftw
> 
>  
> 
> On Thu, Oct 18, 2018 at 12:35 AM Tony Breeds
> <t...@bakeyournoodle.com> wrote:
> 
> Hello all,
>     As per [1] the nomination period for names for the T release has
> now closed (actually 3 days ago, sorry).  The nominated names and any
> qualifying remarks can be seen at [2].
> 
> Proposed Names
>  * Tarryall
>  * Teakettle
>  * Teller
>  * Telluride
>  * Thomas
>  * Thornton
>  * Tiger
>  * Tincup
>  * Timnath
>  * Timber
>  * Tiny Town
>  * Torreys
>  * Trail
>  * Trinidad
>  * Treasure
>  * Troublesome
>  * Trussville
>  * Turret
>  * Tyrone
> 
> Proposed Names that do not meet the criteria
>  * Train
> 
> However I'd like to suggest we skip the CIVS poll and select
> 'Train' as
> the release name by TC resolution [3].  My thinking for this is:
> 
>  * It's fun and celebrates a humorous moment in our community
>  * As a developer I've heard the T release called Train for quite
>    some time, and it was used often at the PTG [4].
>  * As the *next* PTG is also in Colorado we can still choose a
>    geographic based name for U[5]
>  * If train causes a problem for trademark reasons then we can
> always
>    run the poll
> 
> I'll leave [3] marked -W for a week for discussion to happen before the
> TC can consider / vote on it.
> 
> Yours Tony.
> 
> [1]
> 
> http://lists.openstack.org/pipermail/openstack-dev/2018-September/134995.html
> [2] https://wiki.openstack.org/wiki/Release_Naming/T_Proposals
> [3]
> 
> https://review.openstack.org/#/q/I0d8d3f24af0ee8578712878a3d6617aad1e55e53
> [4] https://twitter.com/vkmc/status/1040321043959754752
> [5]
> https://en.wikipedia.org/wiki/List_of_places_in_Colorado:_T–Z
> 
> 
> 


Re: [openstack-dev] [oslo][taskflow] Thoughts on moving taskflow out of openstack/oslo

2018-10-10 Thread Eric Fried


On 10/10/2018 12:41 PM, Greg Hill wrote:
> I've been out of the openstack loop for a few years, so I hope this
> reaches the right folks.
> 
> Josh Harlow (original author of taskflow and related libraries) and I
> have been discussing the option of moving taskflow out of the openstack
> umbrella recently. This move would likely also include the futurist and
> automaton libraries that are primarily used by taskflow. The idea would
> be to just host them on github and use the regular Github features for
> Issues, PRs, wiki, etc, in the hopes that this would spur more
> development. Taskflow hasn't had any substantial contributions in
> several years and it doesn't really seem that the current openstack devs
> have a vested interest in moving it forward. I would like to move it
> forward, but I don't have an interest in being bound by the openstack
> workflow (this is why the project stagnated as core reviewers were
> pulled on to other projects and couldn't keep up with the review
> backlog, so contributions ground to a halt).
> 
> I guess I'm putting it forward to the larger community. Does anyone have
> any objections to us doing this? Are there any non-obvious
> technicalities that might make such a transition difficult? Who would
> need to be made aware so they could adjust their own workflows?

The PowerVM nova virt driver uses taskflow (and we love it, btw). So we
do need to be kept apprised of any movement in this area, and will need
to be able to continue tracking it as a requirement.
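
(For anyone unfamiliar with the library, here is a minimal taskflow sketch.
It is illustrative only, not actual PowerVM driver code; the task and store
values are made up.)

  # Minimal taskflow usage sketch (illustrative; not PowerVM driver code).
  from taskflow import engines, task
  from taskflow.patterns import linear_flow

  class AttachDisk(task.Task):
      # Hypothetical task; shows the execute/revert contract taskflow gives us.
      def execute(self, instance_id):
          print('attaching disk to %s' % instance_id)

      def revert(self, instance_id, **kwargs):
          print('rolling back disk attach for %s' % instance_id)

  # Tasks are composed into flows; the engine runs them and reverts on failure.
  flow = linear_flow.Flow('spawn').add(AttachDisk())
  engines.run(flow, store={'instance_id': 'abc-123'})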

If it does move, I assume the maintainers will still be available and
accessible. Josh has been helpful a number of times in the past.

Other than that, I have no opinion on whether such a move is good or
bad, right or wrong, or what it should look like.

-efried

> 
> Or would it be preferable to just fork and rename the project so
> openstack can continue to use the current taskflow version without worry
> of us breaking features?
> 
> Greg
> 
> 
> 
> 



Re: [openstack-dev] [nova] Supporting force live-migrate and force evacuate with nested allocations

2018-10-09 Thread Eric Fried


On 10/09/2018 02:20 PM, Jay Pipes wrote:
> On 10/09/2018 11:04 AM, Balázs Gibizer wrote:
>> If you do the force flag removal in a new microversion that also means
>> (at least to me) that you should not change the behavior of the force
>> flag in the old microversions.
> 
> Agreed.
> 
> Keep the old, buggy and unsafe behaviour for the old microversion and in
> a new microversion remove the --force flag entirely and always call GET
> /a_c, followed by a claim_resources() on the destination host.
> 
> For the old microversion behaviour, continue to do the "blind copy" of
> allocations from the source compute node provider to the destination
> compute node provider.

TBC, for nested/sharing source, we should consolidate all the resources
into a single allocation against the destination's root provider?
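
(To illustrate the question with a sketch of the payloads involved, where the
UUID placeholders and resource classes are purely illustrative: a nested
allocation on the source might look like

  {"allocations": {
      "<source_compute_rp>": {"resources": {"VCPU": 4, "MEMORY_MB": 8192}},
      "<source_child_rp>": {"resources": {"NET_BW_EGR_KILOBIT_PER_SEC": 1000}}}}

whereas the consolidated blind copy against the destination's root provider
would be

  {"allocations": {
      "<dest_compute_rp>": {"resources": {"VCPU": 4, "MEMORY_MB": 8192,
                                          "NET_BW_EGR_KILOBIT_PER_SEC": 1000}}}}

and the original split can't be recovered from the latter.)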

> That "blind copy" will still fail if there isn't
> capacity for the new allocations on the destination host anyway, because
> the blind copy is just issuing a POST /allocations, and that code path
> still checks capacity on the target resource providers.

What happens when the migration fails, either because of that POST
/allocations, or afterwards? Do we still have the old allocation around
to restore? Cause we can't re-figure it from the now-monolithic
destination allocation.

> There isn't a
> code path in the placement API that allows a provider's inventory
> capacity to be exceeded by new allocations.
> 
> Best,
> -jay
> 



Re: [openstack-dev] [nova] Supporting force live-migrate and force evacuate with nested allocations

2018-10-09 Thread Eric Fried
IIUC, the primary thing the force flag was intended to do - allow an
instance to land on the requested destination even if that means
oversubscription of the host's resources - doesn't happen anymore since
we started making the destination claim in placement.

IOW, since pike, you don't actually see a difference in behavior by
using the force flag or not. (If you do, it's more likely a bug than
what you were expecting.)

So there's no reason to keep it around. We can remove it in a new
microversion (or not); but even in the current microversion we need not
continue making convoluted attempts to observe it.

What that means is that we should simplify everything down to ignore the
force flag and always call GET /a_c. Problem solved - for nested and/or
sharing, NUMA or not, root resources or no, on the source and/or
destination.
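
Concretely, that flow is just (a sketch; resource amounts illustrative):

  GET /allocation_candidates?resources=VCPU:2,MEMORY_MB:4096,DISK_GB:20
  # pick a candidate rooted at the requested destination (if one was given),
  # then claim it for the migration's consumer, which is what
  # claim_resources() does under the covers:
  PUT /allocations/{consumer_uuid}
  {"allocations": {...allocation request from the chosen candidate...}, ...}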

-efried

On 10/09/2018 04:40 AM, Balázs Gibizer wrote:
> Hi,
> 
> Setup
> -
> 
> nested allocation: an allocation that contains resources from one or 
> more nested RPs. (If you have a better term for this, please suggest it.)
> 
> If an instance has a nested allocation it means that the compute it
> allocates from has a nested RP tree. BUT if a compute has a nested RP
> tree it does not automatically mean that the instance allocating from
> that compute has a nested allocation (e.g. bandwidth inventory will be
> on nested RPs but not every instance will require bandwidth).
> 
> Afaiu, as soon as we have NUMA modelling in place the most trivial 
> servers will have nested allocations as CPU and MEMORY inventory will
> be moved to the nested NUMA RPs. But NUMA is still in the future.
> 
> Sidenote: there is an edge case reported by bauzas when an instance 
> allocates _only_ from nested RPs. This was discussed on last Friday and 
> it resulted in a new patch[0] but I would like to keep that discussion 
> separate from this if possible.
> 
> Sidenote: the current problem is somewhat related not just to nested RPs
> but to sharing RPs as well. However I'm not aiming to implement sharing
> support in Nova right now so I also try to keep the sharing discussion
> separate if possible.
> 
> There was already some discussion at Monday's scheduler meeting but
> I could not attend.
> http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-10-08-14.00.log.html#l-20
> 
> 
> The meat
> 
> 
> Both live-migrate[1] and evacuate[2] have an optional force flag on the
> nova REST API. The documentation says: "Force  by not 
> verifying the provided destination host by the scheduler."
> 
> Nova implements this statement by not calling the scheduler if 
> force=True BUT still tries to manage allocations in placement.
> 
> To have an allocation on the destination host Nova blindly copies the
> instance allocation from the source host to the destination host during 
> these operations. Nova can do that as 1) the whole allocation is 
> against a single RP (the compute RP) and 2) Nova knows both the source 
> compute RP and the destination compute RP.
> 
> However as soon as we bring nested allocations into the picture that 
> blind copy will not be feasible. Possible cases
> 0) The instance has a non-nested allocation on the source and would need
> a non-nested allocation on the destination. This works with the blind
> copy today.
> 1) The instance has a nested allocation on the source and would need a 
> nested allocation on the destination as well.
> 2) The instance has a non-nested allocation on the source and would 
> need a nested allocation on the destination.
> 3) The instance has a nested allocation on the source and would need a 
> non nested allocation on the destination.
> 
> Nova cannot generate nested allocations easily without reimplementing 
> some of the placement allocation candidate (a_c) code. However I don't 
> like the idea of duplicating some of the a_c code in Nova.
> 
> Nova cannot detect what kind of allocation (nested or non-nested) an 
> instance would need on the destination without calling placement a_c. 
> So knowing when to call placement is a chicken and egg problem.
> 
> Possible solutions:
> A) fail fast
> 
> 0) Nova can detect that the source allocation is non-nested and try
> the blind copy and it will succeed.
> 1) Nova can detect that the source allocation is nested and fail the
> operation.
> 2) Nova only sees a non-nested source allocation. Even if the dest RP
> tree is nested it does not mean that the allocation will be nested. We
> cannot fail fast. Nova can try the blind copy and allocate every
> resource from the root RP of the destination. If the instance requires
> a nested allocation instead, the claim will fail in placement. So nova can
> fail the operation a bit later than in 1).
> 3) Nova can detect that the source allocation is nested and fail the
> operation. However an enhanced blind copy that tries to allocate
> everything from the root RP on the destination would have worked.
> 
> B) Guess 

Re: [openstack-dev] [nova] Rocky RC time regression analysis

2018-10-08 Thread Eric Fried
Mel-

I don't have much of anything useful to add here, but wanted to say
thanks for this thorough analysis. It must have taken a lot of time and
work.

Musings inline.

On 10/05/2018 06:59 PM, melanie witt wrote:
> Hey everyone,
> 
> During our Rocky retrospective discussion at the PTG [1], we talked
> about the spec freeze deadline (milestone 2, historically it had been
> milestone 1) and whether or not it was related to the hectic
> late-breaking regression RC time we had last cycle. I had an action item
> to go through the list of RC time bugs [2] and dig into each one,
> examining: when the patch that introduced the bug landed vs when the bug
> was reported, why it wasn't caught sooner, and report back so we can
> take a look together and determine whether they were related to the spec
> freeze deadline.
> 
> I used this etherpad to make notes [3], which I will [mostly] copy-paste
> here. These are all after RC1 and I'll paste them in chronological order
> of when the bug was reported.
> 
> Milestone 1 (r-1) was 2018-04-19.
> Spec freeze was at milestone 2 (r-2), which was 2018-06-07.
> Feature freeze (FF) was on 2018-07-26.
> RC1 was on 2018-08-09.
> 
> 1) Broken live migration bandwidth minimum => maximum based on neutron
> event https://bugs.launchpad.net/nova/+bug/1786346
> 
> - Bug was reported on 2018-08-09, the day of RC1
> - The patch that caused the regression landed on 2018-03-30
> https://review.openstack.org/497457
> - Unrelated to a blueprint, the regression was part of a bug fix
> - Was found because prometheanfire was doing live migrations and noticed
> they seemed to be stuck at 1MiB/s for linuxbridge VMs
> - The bug was due to a race, so the gate didn't hit it
> - Comment on the regression bug from dansmith: "The few hacked up gate
> jobs we used to test this feature at merge time likely didn't notice the
> race because the migrations finished before the potential timeout and/or
> are on systems so loaded that the neutron event came late enough for us
> to win the race repeatedly."
> 
> 2) Docs for the zvm driver missing
> 
> - All zvm driver code changes were merged by 2018-07-17, but the
> documentation was overlooked; this was noticed near RC time
> - Blueprint was approved on 2018-02-12
> 
> 3) Volume status remains "detaching" after a failure to detach a volume
> due to DeviceDetachFailed https://bugs.launchpad.net/nova/+bug/1786318
> 
> - Bug was reported on 2018-08-09, the day of RC1
> - The change that introduced the regression landed on 2018-02-21
> https://review.openstack.org/546423
> - Unrelated to a blueprint, the regression was part of a bug fix
> - Question: why wasn't this caught earlier?
> - Answer: Unit tests were not asserting the call to the roll_detaching
> volume API. Coverage has since been added along with the bug fix
> https://review.openstack.org/590439
> 
> 4) OVB overcloud deploy fails on nova placement errors
> https://bugs.launchpad.net/nova/+bug/1787910
> 
> - Bug was reported on 2018-08-20
> - Change that caused the regression landed on 2018-07-26, FF day
> https://review.openstack.org/517921
> - Blueprint was approved on 2018-05-16
> - Was found because of a failure in the
> legacy-periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master
> CI job. The ironic-inspector CI upstream also failed because of this, as
> noted by dtantsur.
> - Question: why did it take nearly a month for the failure to be
> noticed? Is there any way we can cover this in our
> ironic-tempest-dsvm-ipa-wholedisk-bios-agent_ipmitool-tinyipa job?
> 
> 5) when live migration fails due to a internal error rollback is not
> handled correctly https://bugs.launchpad.net/nova/+bug/1788014
> 
> - Bug was reported on 2018-08-20
> - The change that caused the regression landed on 2018-07-26, FF day
> https://review.openstack.org/434870
> - Unrelated to a blueprint, the regression was part of a bug fix
> - Was found because sean-k-mooney was doing live migrations and found
> that when a LM failed because of a QEMU internal error, the VM remained
> ACTIVE but the VM no longer had network connectivity.
> - Question: why wasn't this caught earlier?
> - Answer: We would need a live migration job scenario that intentionally
> initiates and fails a live migration, then verify network connectivity
> after the rollback occurs.
> - Question: can we add something like that?
> 
> 6) nova-manage db online_data_migrations hangs on instances with no host
> set https://bugs.launchpad.net/nova/+bug/1788115
> 
> - Bug was reported on 2018-08-21
> - The patch that introduced the bug landed on 2018-05-30
> https://review.openstack.org/567878
> - Unrelated to a blueprint, the regression was part of a bug fix
> - Question: why wasn't this caught earlier?
> - Answer: To hit the bug, you had to have had instances with no host set
> (that failed to schedule) in your database during an upgrade. This does
> not happen during the grenade job
> - Question: could we add anything to the grenade job that would leave
> some 

Re: [openstack-dev] [placement] update 18-40

2018-10-05 Thread Eric Fried
> * What should we do about nova calling the placement db, like in
>  
> [nova-manage](https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L416)

This should be purely a placement-side migration, nah?

>   and
>  
> [nova-status](https://github.com/openstack/nova/blob/master/nova/cmd/status.py#L254).

For others' reference, Chris and I have been discussing this [1] in the
spec review that was prompted by the above. As of the last episode: a)
we're not convinced this status check is worth having in the first
place; but if it is, b) the algorithm being used currently is pretty
weak, and will soon be actually bogus; and c) there's a suggestion for a
"better" (if not particularly efficient) alternative that uses the API
instead of going directly to the db.
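
(For context, the API-flavored alternative in (c) is roughly: instead of
counting rows in the placement database, ask the API which providers actually
have compute inventory, e.g.

  GET /resource_providers?resources=VCPU:1,MEMORY_MB:1

and compare that count against the number of compute nodes nova knows about.
A sketch only; the exact resource filter is part of what's being debated in
the spec.)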

-efried

[1]
https://review.openstack.org/#/c/600016/3/specs/stein/approved/list-rps-having.rst@49



Re: [openstack-dev] [neutron] Please opt-in for neutron-lib patches

2018-10-03 Thread Eric Fried
Hi Boden.

Love this initiative.

We would like networking-powervm to be included, and have proposed [5],
but are wondering why we weren't picked up in [6]. Your email [1] says

"If your project isn't in [3][4],
but you think it should be; it may be missing a recent neutron-lib
version in your requirements.txt."

What's "recent"? I see the latest (per the requirements project) is
1.19.0 and we have 1.18.0. Should we bump?
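
(If bumping is the answer, the networking-powervm change would presumably be
a one-line requirements.txt tweak, something like

  -neutron-lib>=1.18.0
  +neutron-lib>=1.19.0

modulo whatever lower-constraints bookkeeping goes with it.)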

Thanks,
efried

[5] https://review.openstack.org/#/c/607625/
[6] https://etherpad.openstack.org/p/neutron-sibling-setup

On 10/03/2018 10:43 AM, Boden Russell wrote:
> Just a friendly reminder that networking projects now need to opt-in for
> neutron-lib consumption patches [1].
> 
> Starting next week (September 8) I'd like to start basing consumption
> patches on those projects that have opted-in. If there are exceptions
> please let me know so we can track them accordingly.
> 
> Thanks
> 
> [1]
> http://lists.openstack.org/pipermail/openstack-dev/2018-September/135063.html
> 
> 



Re: [openstack-dev] [placement] The "intended purpose" of traits

2018-10-02 Thread Eric Fried


On 09/28/2018 07:23 PM, Mohammed Naser wrote:
> On Fri, Sep 28, 2018 at 7:17 PM Chris Dent  wrote:
>>
>> On Fri, 28 Sep 2018, melanie witt wrote:
>>
>>> I'm concerned about a lot of repetition here and maintenance headache for
>>> operators. That's where the thoughts about whether we should provide
>>> something like a key-value construct to API callers where they can instead
>>> say:
>>>
>>> * OWNER=CINDER
>>> * RAID=10
>>> * NUMA_CELL=0
>>>
>>> for each resource provider.
>>>
>>> If I'm off base with my example, please let me know. I'm not a placement
>>> expert.
>>>
>>> Anyway, I hope that gives an idea of what I'm thinking about in this
>>> discussion. I agree we need to pick a direction and go with it. I'm just
>>> trying to look out for the experience operators are going to be using this
>>> and maintaining it in their deployments.
>>
>> Despite saying "let's never do this" with regard to having formal
>> support for key/values in placement, if we did choose to do it (if
>> that's what we chose, I'd live with it), when would we do it? We
>> have a very long backlog of features that are not yet done. I
>> believe (I hope obviously) that we will be able to accelerate
>> placement's velocity with it being extracted, but that won't be
>> enough to suddenly be able to quickly do all the things we have
>> on the plate.
>>
>> Are we going to make people wait for some unknown amount of time,
>> in the meantime? While there is a grammar that could do some of
>> these things?
>>
>> Unless additional resources come on the scene I don't think it is
>> either feasible or reasonable for us to consider doing any model
>> extending at this time (irrespective of the merit of the idea).
>>
>> In some kind of weird belief way I'd really prefer we keep the
>> grammar placement exposes simple, because my experience with HTTP
>> APIs strongly suggests that's very important, and that experience is
>> effectively why I am here, but I have no interest in being a
>> fundamentalist about it. We should argue about it strongly to make
>> sure we get the right result, but it's not a huge deal either way.
> 
> Is there a spec up for this should anyone want to implement it?

By "this" are you referring to a placement key/value primitive?

There is not a spec or blueprint that I'm aware of. And I think the
reason is the strong and immediate resistance to the very idea any time
it is mentioned. Who would want to write a spec that's almost certain to
be vetoed?

> 
>> --
>> Chris Dent   ٩◔̯◔۶   https://anticdent.org/
>> freenode: cdent tw: @anticdent
> 
> 
> 



Re: [openstack-dev] [Openstack-operators] [ironic] [nova] [tripleo] Deprecation of Nova's integration with Ironic Capabilities and ComputeCapabilitiesFilter

2018-10-02 Thread Eric Fried


On 10/02/2018 11:09 AM, Jim Rollenhagen wrote:
> On Tue, Oct 2, 2018 at 11:40 AM Eric Fried  wrote:
> 
> > What Eric is proposing (and Julia and I seem to be in favor of), is
> > nearly the same as your proposal. The single difference is that these
> > config templates or deploy templates or whatever could *also* require
> > certain traits, and the scheduler would use that information to pick a
> > node. While this does put some scheduling information into the config
> > template, it also means that we can remove some of the flavor
> explosion
> > *and* mostly separate scheduling from configuration.
> >
> > So, you'd have a list of traits on a flavor:
> >
> > required=HW_CPU_X86_VMX,HW_NIC_ACCEL_IPSEC
> >
> > And you would also have a list of traits in the deploy template:
> >
> > {"traits": {"required": ["STORAGE_HARDWARE_RAID"]}, "config":
> }
> >
> > This allows for making flavors that are reasonably flexible
> (instead of
> > two flavors that do VMX and IPSEC acceleration, one of which does
> RAID).
> > It also allows users to specify a desired configuration without also
> > needing to know how to correctly choose a flavor that can handle that
> > configuration.
> >
> > I think it makes a lot of sense, doesn't impose more work on
> users, and
> > can reduce the number of flavors operators need to manage.
> >
> > Does that make sense?
> 
> This is in fact exactly what Jay proposed. And both Julia and I are in
> favor of it as an ideal long-term solution. Where Julia and I deviated
> from Jay's point of view was in our desire to use "the hack" in the
> short term so we can satisfy the majority of use cases right away
> without having to wait for that ideal solution to materialize.
> 
> 
> Ah, good point, I had missed that initially. Thanks. Let's do that.
> 
> So if we all agree Jay's proposal is the right thing to do, is there any
> reason to start working on a short-term hack instead of putting those
> efforts into the better solution? I don't see why we couldn't get that
> done in one cycle, if we're all in agreement on it.

It takes more than agreement, though. It takes resources. I may have
misunderstood a major theme of the PTG, but I think the Nova team is
pretty overextended already. Even assuming authorship by wicked smaaht
folks such as yourself, the spec and code reviews will require a
nontrivial investment from Nova cores. The result would likely be
de-/re-prioritization of things we just got done agreeing to work on. If
that's The Right Thing, so be it. But we can't just say we're going to
move forward with something of this magnitude without sacrificing
something else.

(Note that the above opinion is based on the assumption that the hacky
way will require *much* less spec/code/review bandwidth to accomplish.
If that's not true, then I totally agree with you that we should spend
our time working on the right solution.)

> 
> // jim
> 
> 
> 



Re: [openstack-dev] [Openstack-operators] [ironic] [nova] [tripleo] Deprecation of Nova's integration with Ironic Capabilities and ComputeCapabilitiesFilter

2018-10-02 Thread Eric Fried
> What Eric is proposing (and Julia and I seem to be in favor of), is
> nearly the same as your proposal. The single difference is that these
> config templates or deploy templates or whatever could *also* require
> certain traits, and the scheduler would use that information to pick a
> node. While this does put some scheduling information into the config
> template, it also means that we can remove some of the flavor explosion
> *and* mostly separate scheduling from configuration.
> 
> So, you'd have a list of traits on a flavor:
> 
> required=HW_CPU_X86_VMX,HW_NIC_ACCEL_IPSEC
> 
> And you would also have a list of traits in the deploy template:
> 
> {"traits": {"required": ["STORAGE_HARDWARE_RAID"]}, "config": }
> 
> This allows for making flavors that are reasonably flexible (instead of
> two flavors that do VMX and IPSEC acceleration, one of which does RAID).
> It also allows users to specify a desired configuration without also
> needing to know how to correctly choose a flavor that can handle that
> configuration.
> 
> I think it makes a lot of sense, doesn't impose more work on users, and
> can reduce the number of flavors operators need to manage.
> 
> Does that make sense?

This is in fact exactly what Jay proposed. And both Julia and I are in
favor of it as an ideal long-term solution. Where Julia and I deviated
from Jay's point of view was in our desire to use "the hack" in the
short term so we can satisfy the majority of use cases right away
without having to wait for that ideal solution to materialize.
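
(For concreteness: the flavor-side trait list quoted above,
required=HW_CPU_X86_VMX,HW_NIC_ACCEL_IPSEC, is what nova already expresses
today via flavor extra specs, along the lines of:

  openstack flavor set METAL_12CPU_128G \
    --property trait:HW_CPU_X86_VMX=required \
    --property trait:HW_NIC_ACCEL_IPSEC=required

The deploy-template side is the new piece under discussion.)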

> 
> // jim
> 
> 
> Best,
> -jay
> 
> 



Re: [openstack-dev] [Openstack-operators] [ironic] [nova] [tripleo] Deprecation of Nova's integration with Ironic Capabilities and ComputeCapabilitiesFilter

2018-10-01 Thread Eric Fried

> So say the user requests a node that supports UEFI because their image
> needs UEFI. Which workflow would you want here?
> 
> 1) The operator (or ironic?) has already configured the node to boot in
> UEFI mode. Only pre-configured nodes advertise the "supports UEFI" trait.
> 
> 2) Any node that supports UEFI mode advertises the trait. Ironic ensures
> that UEFI mode is enabled before provisioning the machine.
> 
> I imagine doing #2 by passing the traits which were specifically
> requested by the user, from Nova to Ironic, so that Ironic can do the
> right thing for the user.
> 
> Your proposal suggests that the user request the "supports UEFI" trait,
> and *also* pass some glance UUID which the user understands will make
> sure the node actually boots in UEFI mode. Something like:
> 
> openstack server create --flavor METAL_12CPU_128G --trait SUPPORTS_UEFI
> --config-data $TURN_ON_UEFI_UUID
> 
> Note that I pass --trait because I hope that will one day be supported
> and we can slow down the flavor explosion.

IMO --trait would be making things worse (but see below). I think UEFI
with Jay's model would be more like:

  openstack server create --flavor METAL_12CPU_128G --config-data $UEFI

where the UEFI profile would be pretty trivial, consisting of
placement.traits.required = ["BOOT_MODE_UEFI"] and object.boot_mode =
"uefi".

I agree that this seems kind of heavy, and that it would be nice to be
able to say "boot mode is UEFI" just once. OTOH I get Jay's point that
we need to separate the placement decision from the instance configuration.

That said, what if it was:

 openstack config-profile create --name BOOT_MODE_UEFI --json -
 {
   "type": "boot_mode_scheme",
   "version": 123,
   "object": {
     "boot_mode": "uefi"
   },
   "placement": {
     "traits": {
       "required": [
         "BOOT_MODE_UEFI"
       ]
     }
   }
 }
 ^D

And now you could in fact say

 openstack server create --flavor foo --config-profile BOOT_MODE_UEFI

using the profile name, which happens to be the same as the trait name
because you made it so. Does that satisfy the yen for saying it once? (I
mean, despite the fact that you first had to say it three times to get
it set up.)



I do want to zoom out a bit and point out that we're talking about
implementing a new framework of substantial size and impact when the
original proposal - using the trait for both - would just work out of
the box today with no changes in either API. Is it really worth it?



By the way, with Jim's --trait suggestion, this:

> ...dozens of flavors that look like this:
> - 12CPU_128G_RAID10_DRIVE_LAYOUT_X
> - 12CPU_128G_RAID5_DRIVE_LAYOUT_X
> - 12CPU_128G_RAID01_DRIVE_LAYOUT_X
> - 12CPU_128G_RAID10_DRIVE_LAYOUT_Y
> - 12CPU_128G_RAID5_DRIVE_LAYOUT_Y
> - 12CPU_128G_RAID01_DRIVE_LAYOUT_Y

...could actually become:

 openstack server create --flavor 12CPU_128G --trait $WHICH_RAID --trait
$WHICH_LAYOUT

No flavor explosion.

(Maybe if we called it something other than --trait, like maybe
--config-option, it would let us pretend we're not really overloading a
trait to do config - it's just a coincidence that the config option has
the same name as the trait it causes to be required.)

-efried
.



Re: [openstack-dev] [placement] The "intended purpose" of traits

2018-10-01 Thread Eric Fried


On 09/29/2018 10:40 AM, Jay Pipes wrote:
> On 09/28/2018 04:36 PM, Eric Fried wrote:
>> So here it is. Two of the top influencers in placement, one saying we
>> shouldn't overload traits, the other saying we shouldn't add a primitive
>> that would obviate the need for that. Historically, this kind of
>> disagreement seems to result in an impasse: neither thing happens and
>> those who would benefit are forced to find a workaround or punt.
>> Frankly, I don't particularly care which way we go; I just want to be
>> able to do the things.
> 
> I don't think that's a fair statement. You absolutely *do* care which
> way we go. You want to encode multiple bits of information into a trait
> string -- such as "PCI_ADDRESS_01_AB_23_CD" -- and leave it up to the
> caller to have to understand that this trait string has multiple bits of
> information encoded in it (the fact that it's a PCI device and that the
> PCI device is at 01_AB_23_CD).

It was an oversimplification to say I don't care. I would like, ideally,
long-term, to see a true key/value primitive, because I think it's much
more powerful and less hacky. But am sympathetic to what Chris brought
up about full plate and timeline. So while we're waiting for that to fit
into the schedule, I wouldn't mind the ability to use encoded traits to
some extent to satisfy the use cases.

> You don't see a problem encoding these variants inside a string. Chris
> doesn't either.

Yeah, I see the problem, and I don't love the idea - as I say, I would
prefer a true key/value primitive. But I would rather use encoded traits
as a temporary measure to get stuff done than a) work around things with
a mess of extra specs and/or b) wait, potentially until the heat death
of the universe if we remain deadlocked on whether a key/value primitive
should happen.

> 
> I *do* see a problem with it, based on my experience in Nova where this
> kind of thing leads to ugly, unmaintainable, and incomprehensible code
> as I have pointed to in previous responses.
> 
> Furthermore, your point isn't that "you just want to be able to do the
> things". Your point (and the point of others, from Cyborg and Ironic) is
> that you want to be able to use placement to pass various bits of
> information to an instance, and placement wasn't designed for that
> purpose. Nova was.
>
> So, instead of working out a solution with the Nova team for passing
> configuration data about an instance, the proposed solution is instead
> to hack/encode multiple bits of information into a trait string. This
> proposed solution is seen as a way around having to work out a more
> appropriate solution that has Nova pass that configuration data (as is
> appropriate, since nova is the project that manages instances) to the
> virt driver or generic device manager (i.e. Cyborg) before the instance
> spawns.

I agree that we should not overload placement as a mechanism to pass
configuration information ("set up RAID5 on my storage, please") to the
driver. So let's put that aside. (Looking forward to that spec.)

I still want to use something like "Is capable of RAID5" and/or "Has
RAID5 already configured" as part of a scheduling and placement
decision. Being able to have the GET /a_c response filtered down to
providers with those, ahem, traits is the exact purpose of that operation.
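
In API terms that's nothing exotic, just a required trait on the allocation
candidates request (trait name illustrative; a custom trait would need the
CUSTOM_ prefix):

  GET /allocation_candidates?resources=DISK_GB:100&required=CUSTOM_RAID5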

While we're in the neighborhood, we agreed in Denver to use a trait to
indicate which service "owns" a provider [1], so we can eventually
coordinate a smooth handoff of e.g. a device provider from nova to
cyborg. This is certainly not a capability (but it is a trait), and it
can certainly be construed as a key/value (owning_service=cyborg). Are
we rescinding that decision?

[1] https://review.openstack.org/#/c/602160/
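
(A sketch of what that tagging looks like with today's API, where the trait
name is hypothetical and the generation value illustrative:

  PUT /traits/CUSTOM_OWNED_BY_CYBORG

  PUT /resource_providers/{rp_uuid}/traits
  {"resource_provider_generation": 7,
   "traits": ["CUSTOM_OWNED_BY_CYBORG", "..."]}

i.e. it rides on the existing trait mechanism; no new primitive required.)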

> I'm working on a spec that will describe a way for the user to instruct
> Nova to pass configuration data to the virt driver (or device manager)
> before instance spawn. This will have nothing to do with placement or
> traits, since this configuration data is not modeling scheduling and
> placement decisions.
> 
> I hope to have that spec done by Monday so we can discuss on the spec.
> 
> Best,
> -jay
> 



Re: [openstack-dev] [placement] The "intended purpose" of traits

2018-10-01 Thread Eric Fried
Dan-

On 10/01/2018 10:06 AM, Dan Smith wrote:
> I was out when much of this conversation happened, so I'm going to
> summarize my opinion here.
> 
>> So from a code perspective _placement_ is completely agnostic to
>> whether a trait is "PCI_ADDRESS_01_AB_23_CD", "STORAGE_DISK_SSD", or
>> "JAY_LIKES_CRUNCHIE_BARS".
>>
>> However, things which are using traits (e.g., nova, ironic) need to
>> make their own decisions about how the value of traits are
>> interpreted. I don't have a strong position on that except to say
>> that _if_ we end up in a position of there being lots of traits
>> willy nilly, people who have chosen to do that need to know that the
>> contract presented by traits right now (present or not present, no
>> value comprehension) is fixed.
> 
> I agree with what Chris holds sacred here, which is that placement
> shouldn't ever care about what the trait names are or what they mean to
> someone else. That also extends to me hoping we never implement a
> generic key=value store on resource providers in placement.
> 
>>> I *do* see a problem with it, based on my experience in Nova where
>>> this kind of thing leads to ugly, unmaintainable, and
>>> incomprehensible code as I have pointed to in previous responses.
> 
> I definitely agree with what Jay holds sacred here, which is that
> abusing the data model to encode key=value information into single trait
> strings is bad (which is what you're doing with something like
> PCI_ADDRESS_01_AB_23_CD).
> 
> I don't want placement (the code) to try to put any technical
> restrictions on the meaning of trait names, in an attempt to try to
> prevent the above abuse. I agree that means people _can_ abuse it if
> they wish, which I think is Chris' point. However, I think it _is_
> important for the placement team (the people) to care about how
> consumers (nova, etc) use traits, and thus provide guidance on that is
> necessary. Not everyone will follow that guidance, but we should provide
> it. Projects with history-revering developers on both sides of the fence
> can help this effort if they lead by example.
> 
> If everyone goes off and implements their way around the perceived
> restriction of not being able to ask placement for RAID_LEVEL>=5, we're
> going to have a much larger mess than the steaming pile of extra specs
> in nova that we're trying to avoid.

Sorry, I'm having a hard time understanding where you're landing here.

It sounds like you might be saying, "I would rather not see encoded
trait names OR a new key/value primitive; but if the alternative is
ending up with 'a much larger mess', I would accept..." ...which?

Or is it, "We should not implement a key/value primitive, nor should we
implement restrictions on trait names; but we should continue to
discourage (ab)use of trait names by steering placement consumers to..."
...do what?

The restriction is real, not perceived. Without key/value (either
encoded or explicit), how should we steer placement consumers to satisfy
e.g., "Give me disk from a provider with RAID5"?

(Put aside the ability to do comparisons other than straight equality -
so limiting the discussion to RAID_LEVEL=5, ignoring RAID_LEVEL>=5. Also
limiting the discussion to what we want out of GET /a_c - so this
excludes, "And then go configure RAID5 on my storage.")

> 
> --Dan
> 
> 



Re: [openstack-dev] [placement] The "intended purpose" of traits

2018-09-28 Thread Eric Fried


On 09/28/2018 09:41 AM, Balázs Gibizer wrote:
> 
> 
> On Fri, Sep 28, 2018 at 3:25 PM, Eric Fried  wrote:
>> It's time somebody said this.
>>
>> Every time we turn a corner or look under a rug, we find another use
>> case for provider traits in placement. But every time we have to have
>> the argument about whether that use case satisfies the original
>> "intended purpose" of traits.
>>
>> That's only reason I've ever been able to glean: that it (whatever "it"
>> is) wasn't what the architects had in mind when they came up with the
>> idea of traits. We're not even talking about anything that would require
>> changes to the placement API. Just, "Oh, that's not a *capability* -
>> shut it down."
>>
>> Bubble wrap was originally intended as a textured wallpaper and a
>> greenhouse insulator. Can we accept the fact that traits have (many,
>> many) uses beyond marking capabilities, and quit with the arbitrary
>> restrictions?
> 
> How far are we willing to go? Is an arbitrary (key: value) pair
> encoded in a trait name like key_`str(value)` (e.g. CURRENT_TEMPERATURE:
> 85 encoded as CUSTOM_TEMPERATURE_85) something we would be OK to see in
> placement?

Great question. Perhaps TEMPERATURE_DANGEROUSLY_HIGH is okay, but
TEMPERATURE_ is not. This thread isn't about setting
these parameters; it's about getting us to a point where we can discuss
a question just like this one without running up against:

"That's a hard no, because you shouldn't encode key/value pairs in traits."

"Oh, why's that?"

"Because that's not what we intended when we created traits."

"But it would work, and the alternatives are way harder."

"-1"

"But..."

"-1"

> 
> Cheers,
> gibi
> 
>>
> 
> 



Re: [openstack-dev] [placement] The "intended purpose" of traits

2018-09-28 Thread Eric Fried


On 09/28/2018 12:19 PM, Chris Dent wrote:
> On Fri, 28 Sep 2018, Jay Pipes wrote:
> 
>> On 09/28/2018 09:25 AM, Eric Fried wrote:
>>> It's time somebody said this.
> 
> Yes, a useful topic, I think.
> 
>>> Every time we turn a corner or look under a rug, we find another use
>>> case for provider traits in placement. But every time we have to have
>>> the argument about whether that use case satisfies the original
>>> "intended purpose" of traits.
>>>
>>> That's only reason I've ever been able to glean: that it (whatever "it"
>>> is) wasn't what the architects had in mind when they came up with the
>>> idea of traits.
>>
>> Don't pussyfoot around things. It's me you're talking about, Eric. You
>> could just ask me instead of passive-aggressively posting to the list
>> like this.
> 
> It's not just you. Ed and I have also expressed some fairly strong
> statement about how traits are "supposed" to be used and I would
> guess that from Eric's perspective all three of us (amongst others)
> have some form of architectural influence. Since it takes a village
> and all that.

Correct. I certainly wasn't talking about Jay specifically. I also
wanted people other than placement cores/architects to participate in
the discussion (thanks Julia and Zane).

>> They aren't arbitrary. They are there for a reason: a trait is a
>> boolean capability. It describes something that either a provider is
>> capable of supporting or it isn't.
> 
> This is somewhat (maybe even only slightly) different from what I
> think the definition of a trait is, and that nuance may be relevant.
> 
> I describe a trait as a "quality that a resource provider has" (the
> car is blue). This contrasts with a resource class which is a
> "quantity that a resource provider has" (the car has 4 doors).

Yes, this. I don't want us to go off in the weeds about the reason or
relevance of the choice of name, but "trait" is a superset of
"capability" and easily encompasses "BLUE" or "PHYSNET_PUBLIC" or
"OWNED_BY_NEUTRON" or "XYZ_BITSTREAM" or "PCI_ADDRESS_01_AB_23_CD" or
"RAID5".

> Our implementation is pretty much exactly that ^. We allow
> clients to ask "give me things that have qualities x, y, z, not
>> qualities a, b, c, and quantities of G of 5 and H of 7".
> 
> Add in aggregates and we have exactly what you say:
> 
>> * Does the provider have *capacity* for the requested resources?
>> * Does the provider have the required (or forbidden) *capabilities*?
>> * Does the provider belong to some group?
> 
> The nuance of difference is that your description of *capabilities*
> seems more narrow than my description of *qualities* (aka
> characteristics). You've got something fairly specific in mind, as a
> way of constraining the profusion of noise that has happened with
> how various kinds of information about resources of all sorts is
> managed in OpenStack, as you describe in your message.
> 
> I do not think it should be placement's job to control that noise.
> It should be placement's job to provide a very strict contract about
> what you can do with a trait:
> 
> * create it, if necessary
> * assign it to one or more resource providers
> * ask for providers that either have it
> * ... or do not have it
> 
> That's all. Placement _code_ should _never_ be aware of the value of
> a trait (except for the magical MISC_SHARES...). It should never
> become possible to regex on traits or do comparisons
> (required= 
>> If we want to add further constraints to the placement allocation
>> candidates request that ask things like:
>>
>> * Does the provider have version 1.22.61821 of BIOS firmware from
>> Marvell installed on it?
> 
> That's a quality of the provider in a moment.
> 
>> * Does the provider support an FPGA that has had an OVS program
>> flashed to it in the last 20 days?
> 
> If you squint, so is this.
> 
>> * Does the provider belong to physical network "corpnet" and also
>> support creation of virtual NICs of type either "DIRECT" or "NORMAL"?
> 
> And these.
> 
> But at least some of them are dynamic rather than some kind of
> platonic ideal associated with the resource provider.
> 
> I don't think placement should be concerned about temporal aspects
> of traits. If we can't write a web service that can handle setting
> lots of traits every second of every day, we should go home. If
> clients of placement want to set weird traits, more power to them.
> 
> However, if clients of placement (such as no

[openstack-dev] [placement] The "intended purpose" of traits

2018-09-28 Thread Eric Fried
It's time somebody said this.

Every time we turn a corner or look under a rug, we find another use
case for provider traits in placement. But every time we have to have
the argument about whether that use case satisfies the original
"intended purpose" of traits.

That's the only reason I've ever been able to glean: that it (whatever "it"
is) wasn't what the architects had in mind when they came up with the
idea of traits. We're not even talking about anything that would require
changes to the placement API. Just, "Oh, that's not a *capability* -
shut it down."

Bubble wrap was originally intended as a textured wallpaper and a
greenhouse insulator. Can we accept the fact that traits have (many,
many) uses beyond marking capabilities, and quit with the arbitrary
restrictions?

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Stein PTG summary

2018-09-27 Thread Eric Fried


On 09/27/2018 07:37 AM, Matt Riedemann wrote:
> On 9/27/2018 5:23 AM, Sylvain Bauza wrote:
>>
>>
>> On Thu, Sep 27, 2018 at 2:46 AM Matt Riedemann wrote:
>>
>>     On 9/26/2018 5:30 PM, Sylvain Bauza wrote:
>>  > So, during this day, we also discussed about NUMA affinity and we
>>     said
>>  > that we could possibly use nested resource providers for NUMA
>>     cells in
>>  > Stein, but given we don't have yet a specific Placement API
>>     query, NUMA
>>  > affinity should still be using the NUMATopologyFilter.
>>  > That said, when looking about how to use this filter for vGPUs,
>>     it looks
>>  > to me that I'd need to provide a new version for the NUMACell
>>     object and
>>  > modify the virt.hardware module. Are we also accepting this
>>     (given it's
>>  > a temporary question), or should we need to wait for the
>>     Placement API
>>  > support ?
>>  >
>>  > Folks, what are you thoughts ?
>>
>>     I'm pretty sure we've said several times already that modeling
>> NUMA in
>>     Placement is not something for which we're holding up the extraction.
>>
>>
>> It's not an extraction question. Just about knowing whether the Nova
>> folks would accept us to modify some o.vo object and module just for a
>> temporary time until Placement API has some new query parameter.
>> Whether Placement is extracted or not isn't really the problem, it's
>> more about the time it will take for this query parameter ("numbered
>> request groups to be in the same subtree") to be implemented in the
>> Placement API.
>> The real problem we have with vGPUs is that if we don't have NUMA
>> affinity, the performance would be around 10% less for vGPUs (if the
>> pGPU isn't on the same NUMA cell as the pCPU). Not sure large
>> operators would accept that :(
>>
>> -Sylvain
> 
> I don't know how close we are to having whatever we need for modeling
> NUMA in the placement API, but I'll go out on a limb and assume we're
> not close.

True story. We've been talking about ways to do this since (at least)
the Queens PTG, but haven't even landed on a decent design, let alone
talked about getting it specced, prioritized, and implemented. Since
full NRP support was going to be a prerequisite in any case, and our
Stein plate is full, Train is the earliest we could reasonably expect to
get the placement support going, let alone the nova side. So yeah...

> Given that, if we have to do something within nova for NUMA
> affinity for vGPUs for the NUMATopologyFilter, then I'd be OK with that
> since it's short term like you said (although our "short term"
> workarounds tend to last for many releases). Anyone that cares about
> NUMA today already has to enable the scheduler filter anyway.
> 

+1 to this ^

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Nominating Tetsuro Nakamura for placement-core

2018-09-19 Thread Eric Fried
+1

On 09/19/2018 10:25 AM, Chris Dent wrote:
> 
> 
> I'd like to nominate Tetsuro Nakamura for membership in the
> placement-core team. Throughout placement's development Tetsuro has
> provided quality reviews; done the hard work of creating rigorous
> functional tests, making them fail, and fixing them; and implemented
> some of the complex functionality required at the persistence layer.
> He's aware of and respects the overarching goals of placement and has
> demonstrated pragmatism when balancing those goals against the
> requirements of nova, blazar and other projects.
> 
> Please follow up with a +1/-1 to express your preference. No need to
> be an existing placement core, everyone with an interest is welcome.
> 
> Thanks.
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] About microversion setting to enable nested resource provider

2018-09-13 Thread Eric Fried
There's a patch series in progress for this:

https://review.openstack.org/#/q/topic:use-nested-allocation-candidates

It needs some TLC. I'm sure gibi and tetsuro would welcome some help...
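
(In the meantime, if you just want to see nested providers come back by hand,
it's only a matter of the microversion header on the request, e.g.:

    GET /allocation_candidates?resources=VCPU:1,MEMORY_MB:512
    OpenStack-API-Version: placement 1.29

i.e. it's the client that has to ask for 1.29+, which is what that series
wires into nova's report client.)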

efried

On 09/13/2018 08:31 AM, Naichuan Sun wrote:
> Thank you very much, Jay.
> Is there somewhere I could set the microversion (some configuration file?),
> or do I just modify the source code to set it?
> 
> BR.
> Naichuan Sun
> 
> -Original Message-
> From: Jay Pipes [mailto:jaypi...@gmail.com] 
> Sent: Thursday, September 13, 2018 9:19 PM
> To: Naichuan Sun ; OpenStack Development Mailing 
> List (not for usage questions) 
> Cc: melanie witt ; efr...@us.ibm.com; Sylvain Bauza 
> 
> Subject: Re: About microversion setting to enable nested resource provider
> 
> On 09/13/2018 06:39 AM, Naichuan Sun wrote:
>> Hi, guys,
>>
>> Looks like n-rp is disabled by default because of the microversion check for 1.29:
>> https://github.com/openstack/nova/blob/master/nova/api/openstack/place
>> ment/handlers/allocation_candidate.py#L252
>>
>> Anyone know how to set the microversion to enable n-rp in placement?
> 
> It is the client which must send the 1.29+ placement API microversion header 
> to indicate to the placement API server that the client wants to receive 
> nested provider information in the allocation candidates response.
> 
> Currently, nova-scheduler calls the scheduler reportclient's
> get_allocation_candidates() method:
> 
> https://github.com/openstack/nova/blob/0ba34a818414823eda5e693dc2127a534410b5df/nova/scheduler/manager.py#L138
> 
> The scheduler reportclient's get_allocation_candidates() method currently 
> passes the 1.25 placement API microversion header:
> 
> https://github.com/openstack/nova/blob/0ba34a818414823eda5e693dc2127a534410b5df/nova/scheduler/client/report.py#L353
> 
> https://github.com/openstack/nova/blob/0ba34a818414823eda5e693dc2127a534410b5df/nova/scheduler/client/report.py#L53
> 
> In order to get the nested information returned in the allocation candidates 
> response, that would need to be upped to 1.29.
> 
> Best,
> -jay
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][placement] No NovaScheduler meeting during PTG

2018-09-07 Thread Eric Fried
Our regularly scheduled Monday nova-scheduler meeting will not take
place next Monday, Sept 10th. We'll resume the following week.

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Nominating Chris Dent for placement-core

2018-09-07 Thread Eric Fried
After a week with only positive responses, it is my pleasure to add
Chris to the placement-core team.

Welcome home, Chris.

On 08/31/2018 10:45 AM, Eric Fried wrote:
> The openstack/placement project [1] and its core team [2] have been
> established in gerrit.
> 
> I hereby nominate Chris Dent for membership in the placement-core team.
> He has been instrumental in the design, implementation, and stewardship
> of the placement API since its inception and has shown clear and
> consistent leadership.
> 
> As we are effectively bootstrapping placement-core at this time, it
> would seem appropriate to consider +1/-1 responses from heavy placement
> contributors as well as existing cores (currently nova-core).
> 
> [1] https://review.openstack.org/#/admin/projects/openstack/placement
> [2] https://review.openstack.org/#/admin/groups/1936,members
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tempest][CI][nova compute] Skipping non-compute-driver tests

2018-09-06 Thread Eric Fried
Jichen-

That patch is not ever intended to merge; hope that was clear from the
start :) It was just a proving ground to demonstrate which tests still
pass when there's effectively no compute driver in play.

We haven't taken any action on this from our end, though we have done a
little brainstorming about how we would tool our CI to skip those tests
most (but not all) of the time. Happy to share our experiences with you
if/as we move forward with that.

Regarding the tempest-level automation, I certainly had z in mind when
I was thinking about it. If you have the time and inclination to help
look into it, that would be most welcome.

Thanks,
efried

On 09/06/2018 12:33 AM, Chen CH Ji wrote:
> I see the patch is still in ongoing status; do you have a follow-up
> plan/discussion for that? We are maintaining 2 CIs (z/VM and KVM on z),
> so skipping non-compute-related cases would be good for third-party CI. Thanks.
> 
> Best Regards!
> 
> Kevin (Chen) Ji 纪 晨
> 
> Engineer, zVM Development, CSTL
> Notes: Chen CH Ji/China/IBM@IBMCN Internet: jiche...@cn.ibm.com
> Phone: +86-10-82451493
> Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian
> District, Beijing 100193, PRC
> 
> From: Eric Fried 
> To: "OpenStack Development Mailing List (not for usage questions)"
> 
> Date: 09/04/2018 09:35 PM
> Subject: [openstack-dev] [tempest][CI][nova compute] Skipping
> non-compute-driver tests
> 
> 
> 
> 
> 
> Folks-
> 
> The other day, I posted an experimental patch [1] with an effectively
> empty ComputeDriver (just enough to make n-cpu actually start) to see
> how much of our CI would pass. The theory being that any tests that
> still pass are tests that don't touch our compute driver, and are
> therefore not useful to run in our CI environment. Because anything that
> doesn't touch our code should already be well covered by generic
> dsvm-tempest CIs. The results [2] show that 707 tests still pass.
> 
> So I'm wondering whether there might be a way to mark tests as being
> "compute driver-specific" such that we could switch off all the other
> ones [3] via a one-line conf setting. Because surely this has potential
> to save a lot of CI resource not just for us but for other driver
> vendors, in tree and out.
> 
> Thanks,
> efried
> 
> [1] https://review.openstack.org/#/c/599066/
> [2]
> http://184.172.12.213/66/599066/5/check/nova-powervm-out-of-tree-pvm/a1b42d5/powervm_os_ci.html.gz
> [3] I get that there's still value in running all those tests. But it
> could be done like once every 10 or 50 or 100 runs instead of every time.
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> 
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [placement] extraction (technical) update

2018-09-04 Thread Eric Fried
> 030 is okay as long as nothing goes wrong. If something does, it
> raises exceptions which would currently fail as the exceptions are
> not there. See below for more about exceptions.

Maybe I'm misunderstanding what these migration thingies are supposed to
be doing, but 030 [1] seems like it's totally not applicable to
placement and should be removed. The placement database doesn't (and
shouldn't) have 'flavors', 'cell_mappings', or 'host_mappings' tables in
the first place.

What am I missing?

> * Presumably we can trim the placement DB migrations to just stuff
>   that is relevant to placement 

Yah, I would hope so. What possible reason could there be to do otherwise?

-efried

[1]
https://github.com/openstack/placement/blob/2f1267c8785138c8f2c9495bd97e6c2a96c7c336/placement/db/sqlalchemy/api_migrations/migrate_repo/versions/030_require_cell_setup.py

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tempest][CI][nova compute] Skipping non-compute-driver tests

2018-09-04 Thread Eric Fried
Folks-

The other day, I posted an experimental patch [1] with an effectively
empty ComputeDriver (just enough to make n-cpu actually start) to see
how much of our CI would pass. The theory being that any tests that
still pass are tests that don't touch our compute driver, and are
therefore not useful to run in our CI environment. Because anything that
doesn't touch our code should already be well covered by generic
dsvm-tempest CIs. The results [2] show that 707 tests still pass.

So I'm wondering whether there might be a way to mark tests as being
"compute driver-specific" such that we could switch off all the other
ones [3] via a one-line conf setting. Because surely this has potential
to save a lot of CI resource not just for us but for other driver
vendors, in tree and out.
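
As a straw man, and purely to make the idea concrete (the decorator name and
the environment-variable knob below are made up, not an existing tempest
facility), the marking could be as dumb as:

    import os
    import unittest

    # Hypothetical switch the CI job would flip; not a real tempest option.
    DRIVER_TESTS_ONLY = os.environ.get('CI_COMPUTE_DRIVER_TESTS_ONLY') == '1'

    def compute_driver_specific(obj):
        # Hypothetical marker for tests (or whole classes) that exercise
        # the compute driver under test.
        obj._touches_compute_driver = True
        return obj

    class BaseTestCase(unittest.TestCase):
        def setUp(self):
            super(BaseTestCase, self).setUp()
            method = getattr(self, self._testMethodName)
            marked = (getattr(method, '_touches_compute_driver', False) or
                      getattr(self, '_touches_compute_driver', False))
            if DRIVER_TESTS_ONLY and not marked:
                self.skipTest('does not touch the compute driver')

Then the "one-line conf setting" is just flipping that switch in the CI job,
and everything unmarked gets skipped.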

Thanks,
efried

[1] https://review.openstack.org/#/c/599066/
[2]
http://184.172.12.213/66/599066/5/check/nova-powervm-out-of-tree-pvm/a1b42d5/powervm_os_ci.html.gz
[3] I get that there's still value in running all those tests. But it
could be done like once every 10 or 50 or 100 runs instead of every time.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Nominating Chris Dent for placement-core

2018-08-31 Thread Eric Fried
The openstack/placement project [1] and its core team [2] have been
established in gerrit.

I hereby nominate Chris Dent for membership in the placement-core team.
He has been instrumental in the design, implementation, and stewardship
of the placement API since its inception and has shown clear and
consistent leadership.

As we are effectively bootstrapping placement-core at this time, it
would seem appropriate to consider +1/-1 responses from heavy placement
contributors as well as existing cores (currently nova-core).

[1] https://review.openstack.org/#/admin/projects/openstack/placement
[2] https://review.openstack.org/#/admin/groups/1936,members

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][placement] Freezing placement for extraction

2018-08-30 Thread Eric Fried
Greetings.

The captains of placement extraction have declared readiness to begin
the process of seeding the new repository (once [1] has finished
merging). As such, we are freezing development in the affected portions
of the openstack/nova repository until this process is completed. We're
relying on our active placement reviewers noticing any patches that
touch these "affected portions" and, if that reviewer is not a nova
core, bringing them to the attention of one, so we can put a -2 on it.

Once the extraction is complete [2], any such frozen patches should be
abandoned and reproposed to the openstack/placement repository.

Since there will be an interval during which placement code will exist
in both repositories, but before $world has cut over to using
openstack/placement, it is possible that some crucial fix will still
need to be merged into the openstack/nova side. In this case, the fix
must be proposed to *both* repositories, and the justification for its
existence in openstack/nova made clear.

For more details on the technical aspects of the extraction process,
refer to this thread [3].

For information on the procedural/governance process we will be
following, see [4].

Please let us know if you have any questions or concerns, either via
this thread or in #openstack-placement.

[1] https://review.openstack.org/#/c/597220/
[2] meaning that we've merged the initial glut of patches necessary to
repath everything and get tests passing
[3]
http://lists.openstack.org/pipermail/openstack-dev/2018-August/133781.html
[4] https://docs.openstack.org/infra/manual/creators.html

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [placement] XenServer CI failed frequently because of placement update

2018-08-28 Thread Eric Fried
Naichuan-

Are you running with [1]? If you are, the placement logs (at debug
level) should be giving you some useful info. If you're not... perhaps
you could pull that in :) Note that it refactors the
_get_provider_ids_matching method completely, so it's possible your
problem will magically go away when you do.

[1] https://review.openstack.org/#/c/590041/

On 08/28/2018 07:54 AM, Jay Pipes wrote:
> On 08/28/2018 04:17 AM, Naichuan Sun wrote:
>> Hi, experts,
>>
>> XenServer CI failed frequently with an error "No valid host was found.
>> " for more than a week. I think it is cause by placement update.
> 
> Hi Naichuan,
> 
> Can you give us a link to the logs a patchset's Citrix XenServer CI that
> has failed? Also, a timestamp for the failure you refer to would be
> useful so we can correlate across service logs.
> 
> Thanks,
> -jay
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] UUID sentinel needs a home

2018-08-27 Thread Eric Fried
Thanks Doug. I restored [4] and moved the code to the fixture module. Enjoy.

-efried

On 08/27/2018 10:59 AM, Doug Hellmann wrote:
> Excerpts from Eric Fried's message of 2018-08-22 09:13:25 -0500:
>> For some time, nova has been using uuidsentinel [1] which conveniently
>> allows you to get a random UUID in a single LOC with a readable name
>> that's the same every time you reference it within that process (but not
>> across processes). Example usage: [2].
>>
>> We would like other projects (notably the soon-to-be-split-out placement
>> project) to be able to use uuidsentinel without duplicating the code. So
>> we would like to stuff it in an oslo lib.
>>
>> The question is whether it should live in oslotest [3] or in
>> oslo_utils.uuidutils [4]. The proposed patches are (almost) the same.
>> The issues we've thought of so far:
>>
>> - If this thing is used only for test, oslotest makes sense. We haven't
>> thought of a non-test use, but somebody surely will.
>> - Conversely, if we put it in oslo_utils, we're kinda saying we support
>> it for non-test too. (This is why the oslo_utils version does some extra
>> work for thread safety and collision avoidance.)
>> - In oslotest, awkwardness is necessary to avoid circular importing:
>> uuidsentinel uses oslo_utils.uuidutils, which requires oslotest. In
>> oslo_utils.uuidutils, everything is right there.
>> - It's a... UUID util. If I didn't know anything and I was looking for a
>> UUID util like uuidsentinel, I would look in a module called uuidutils
>> first.
>>
>> We hereby solicit your opinions, either by further discussion here or as
>> votes on the respective patches.
>>
>> Thanks,
>> efried
>>
>> [1]
>> https://github.com/openstack/nova/blob/17b69575bc240ca1dd8b7a681de846d90f3b642c/nova/tests/uuidsentinel.py
>> [2]
>> https://github.com/openstack/nova/blob/17b69575bc240ca1dd8b7a681de846d90f3b642c/nova/tests/functional/api/openstack/placement/db/test_resource_provider.py#L109-L115
>> [3] https://review.openstack.org/594068
>> [4] https://review.openstack.org/594179
>>
> 
> We discussed this during the Oslo team meeting today, and have settled
> on the idea of placing Eric's version of the code (with the thread-safe
> fix and the module-level global) in oslo_utils.fixture to allow it to
> easily reuse the oslo_utils.uuidutils module and still be clearly marked
> as test code.
> 
> Doug
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [placement] extraction (technical) update

2018-08-27 Thread Eric Fried
Thanks Matt, you summed it up nicely.

Just one thing to point out...

> Option 1 would clearly be a drain on at least 2 nova cores to go through
> the changes. I think Eric is on board for reviewing options 1 or 2 in
> either case, but he prefers option 2. Since I'm throwing a wrench in the
> works, I also need to stand up and review the changes if we go with
> option 1 or 2. Jay said he'd review them but consider these reviews
> lower priority. I expect we could get some help from some other nova
> cores though, maybe not on all changes, but at least some (thinking
> gibi, alex_xu, sfinucan).

The placement-core team should be seeded and should be the ones on the
hook for the reviews. Since we've agreed in the other thread to make
placement-core a superset of nova-core, what you've said above is still
applicable, but incomplete: I would expect there to be at least one or
two additional non-nova-core placement cores willing to do these
reviews. (Assuming Ed and/or Chris to be on that team, I would of course
expect them to refrain from approving, regardless of who does the gerrit
work, since they've both been developing the changes in github.)

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] UUID sentinel needs a home

2018-08-24 Thread Eric Fried
So...

Restore the PS of the oslo_utils version that exposed the global [1]?

Or use the forced-singleton pattern from nova [2] to put it in its own
importable module, e.g. oslo_utils.uuidutils.uuidsentinel?

(FTR, "import only modules" is a thing for me too, but I've noticed it
doesn't seem to be a hard and fast rule in OpenStack; and in this case
it seemed most important to emulate the existing syntax+behavior for
consumers.)

-efried

[1] https://review.openstack.org/#/c/594179/2/oslo_utils/uuidutils.py
[2]
https://github.com/openstack/nova/blob/a421bd2a8c3b549c603df7860e6357738e79c7c3/nova/tests/uuidsentinel.py#L30

On 08/23/2018 11:23 PM, Doug Hellmann wrote:
> 
> 
>> On Aug 23, 2018, at 4:01 PM, Ben Nemec  wrote:
>>
>>
>>
>>> On 08/23/2018 12:25 PM, Doug Hellmann wrote:
>>> Excerpts from Eric Fried's message of 2018-08-23 09:51:21 -0500:
 Do you mean an actual fixture, that would be used like:

    class MyTestCase(testtools.TestCase):
        def setUp(self):
            self.uuids = self.useFixture(oslofx.UUIDSentinelFixture()).uuids

        def test_foo(self):
            do_a_thing_with(self.uuids.foo)

 ?

 That's... okay I guess, but the refactoring necessary to cut over to it
 will now entail adding 'self.' to every reference. Is there any way
 around that?
>>> That is what I had envisioned, yes.  In the absence of a global,
>>> which we do not want, what other API would you propose?
>>
>> If we put it in oslotest instead, would the global still be a problem? 
>> Especially since mock has already established a pattern for this 
>> functionality?
> 
> I guess all of the people who complained so loudly about the global in 
> oslo.config are gone?
> 
> If we don’t care about the global then we could just put the code from Eric’s 
> threadsafe version in oslo.utils somewhere. 
> 
> Doug
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] UUID sentinel needs a home

2018-08-23 Thread Eric Fried
The compromise, using the patch as currently written [1], would entail
adding one line at the top of each test file:

    uuids = uuidsentinel.UUIDSentinels()

...as seen (more or less) at [2]. The subtle difference being that this
`uuids` wouldn't share a namespace across the whole process, only within
that file. Given current usage, that shouldn't cause a problem, but it's
a change.
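
For anyone who hasn't looked at the patch, the semantics are roughly as
follows (a from-scratch sketch to illustrate, not the code under review):

    import uuid

    class UUIDSentinels(object):
        def __init__(self):
            self._sentinels = {}

        def __getattr__(self, name):
            if name.startswith('_'):
                raise AttributeError(name)
            # Same attribute name => same UUID, but only within this
            # instance, i.e. within the file that instantiated it.
            if name not in self._sentinels:
                self._sentinels[name] = str(uuid.uuid4())
            return self._sentinels[name]

So within a file, uuids.foo == uuids.foo and uuids.foo != uuids.bar, same as
today; the only behavioral difference is that two files no longer share the
namespace.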

-efried

[1] https://review.openstack.org/#/c/594068/9
[2]
https://review.openstack.org/#/c/594068/9/oslotest/tests/unit/test_uuidsentinel.py@22

On 08/23/2018 12:41 PM, Jay Pipes wrote:
> On 08/23/2018 01:25 PM, Doug Hellmann wrote:
>> Excerpts from Eric Fried's message of 2018-08-23 09:51:21 -0500:
>>> Do you mean an actual fixture, that would be used like:
>>>
>>>     class MyTestCase(testtools.TestCase):
>>>         def setUp(self):
>>>             self.uuids = self.useFixture(oslofx.UUIDSentinelFixture()).uuids
>>>
>>>         def test_foo(self):
>>>             do_a_thing_with(self.uuids.foo)
>>>
>>> ?
>>>
>>> That's... okay I guess, but the refactoring necessary to cut over to it
>>> will now entail adding 'self.' to every reference. Is there any way
>>> around that?
>>
>> That is what I had envisioned, yes.  In the absence of a global,
>> which we do not want, what other API would you propose?
> 
> As dansmith mentioned, the niceness and simplicity of being able to do:
> 
>  import nova.tests.uuidsentinel as uuids
> 
>  ..
> 
>  def test_something(self):
>      my_uuid = uuids.instance1
> 
> is remarkably powerful and is something I would want to keep.
> 
> Best,
> -jay
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [oslo] UUID sentinel needs a home

2018-08-23 Thread Eric Fried
Do you mean an actual fixture, that would be used like:

    class MyTestCase(testtools.TestCase):
        def setUp(self):
            self.uuids = self.useFixture(oslofx.UUIDSentinelFixture()).uuids

        def test_foo(self):
            do_a_thing_with(self.uuids.foo)

?

That's... okay I guess, but the refactoring necessary to cut over to it
will now entail adding 'self.' to every reference. Is there any way
around that?

efried

On 08/23/2018 07:40 AM, Jay Pipes wrote:
> On 08/23/2018 08:06 AM, Doug Hellmann wrote:
>> Excerpts from Davanum Srinivas (dims)'s message of 2018-08-23 06:46:38
>> -0400:
>>> Where exactly Eric? I can't seem to find the import:
>>>
>>> http://codesearch.openstack.org/?q=(from%7Cimport).*oslotest=nope==oslo.utils
>>>
>>>
>>> -- dims
>>
>> oslo.utils depends on oslotest via test-requirements.txt and oslotest is
>> used within the test modules in oslo.utils.
>>
>> As I've said on both reviews, I think we do not want a global
>> singleton instance of this sentinal class. We do want a formal test
>> fixture.  Either library can export a test fixture and olso.utils
>> already has oslo_utils.fixture.TimeFixture so there's precedent to
>> adding it there, so I have a slight preference for just doing that.
>>
>> That said, oslo_utils.uuidutils.generate_uuid() is simply returning
>> str(uuid.uuid4()). We have it wrapped up as a function so we can
>> mock it out in other tests, but we hardly need to rely on that if
>> we're making a test fixture for oslotest.
>>
>> My vote is to add a new fixture class to oslo_utils.fixture.
> 
> OK, thanks for the helpful explanation, Doug. Works for me.
> 
> -jay
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [oslo] UUID sentinel needs a home

2018-08-22 Thread Eric Fried
For some time, nova has been using uuidsentinel [1] which conveniently
allows you to get a random UUID in a single LOC with a readable name
that's the same every time you reference it within that process (but not
across processes). Example usage: [2].

We would like other projects (notably the soon-to-be-split-out placement
project) to be able to use uuidsentinel without duplicating the code. So
we would like to stuff it in an oslo lib.

The question is whether it should live in oslotest [3] or in
oslo_utils.uuidutils [4]. The proposed patches are (almost) the same.
The issues we've thought of so far:

- If this thing is used only for test, oslotest makes sense. We haven't
thought of a non-test use, but somebody surely will.
- Conversely, if we put it in oslo_utils, we're kinda saying we support
it for non-test too. (This is why the oslo_utils version does some extra
work for thread safety and collision avoidance.)
- In oslotest, awkwardness is necessary to avoid circular importing:
uuidsentinel uses oslo_utils.uuidutils, which requires oslotest. In
oslo_utils.uuidutils, everything is right there.
- It's a... UUID util. If I didn't know anything and I was looking for a
UUID util like uuidsentinel, I would look in a module called uuidutils
first.

We hereby solicit your opinions, either by further discussion here or as
votes on the respective patches.

Thanks,
efried

[1]
https://github.com/openstack/nova/blob/17b69575bc240ca1dd8b7a681de846d90f3b642c/nova/tests/uuidsentinel.py
[2]
https://github.com/openstack/nova/blob/17b69575bc240ca1dd8b7a681de846d90f3b642c/nova/tests/functional/api/openstack/placement/db/test_resource_provider.py#L109-L115
[3] https://review.openstack.org/594068
[4] https://review.openstack.org/594179

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] how nova should behave when placement returns consumer generation conflict

2018-08-22 Thread Eric Fried
b) sounds the most sane in both cases. I don't like the idea of "your
move operation failed and you have no recourse but to delete your
instance". And automatic retry sounds lovely, but potentially hairy to
implement (and we would need to account for the retries-failed scenario
anyway) so at least initially we should leave that out.

On 08/22/2018 07:55 AM, Balázs Gibizer wrote:
> 
> 
> On Fri, Aug 17, 2018 at 5:40 PM, Eric Fried  wrote:
>> gibi-
>>
>>>>  - On migration, when we transfer the allocations in either
>>>> direction, a
>>>>  conflict means someone managed to resize (or otherwise change
>>>>  allocations?) since the last time we pulled data. Given the global
>>>> lock
>>>>  in the report client, this should have been tough to do. If it does
>>>>  happen, I would think any retry would need to be done all the way back
>>>>  at the claim, which I imagine is higher up than we should go. So
>>>> again,
>>>>  I think we should fail the migration and make the user retry.
>>>
>>>  Do we want to fail the whole migration or just the migration step (e.g.
>>>  confirm, revert)?
>>>  The later means that failure during confirm or revert would put the
>>>  instance back to VERIFY_RESIZE. While the former would mean that in
>>> case
>>>  of conflict at confirm we try an automatic revert. But for a
>>> conflict at
>>>  revert we can only put the instance to ERROR state.
>>
>> This again should be "impossible" to come across. What would the
>> behavior be if we hit, say, ValueError in this spot?
> 
> I might not totally follow you. I see two options to choose from for the
> revert case:
> 
> a) Allocation manipulation error during revert of a migration causes
> that instance goes to ERROR. -> end user cannot retry the revert the
> instance needs to be deleted.
> 
> b) Allocation manipulation error during revert of a migration causes
> that the instance goes back to VERIFY_RESIZE state. -> end user can
> retry the revert via the API.
> 
> I see three options to choose from for the confirm case:
> 
> a) Allocation manipulation error during confirm of a migration causes
> that instance goes to ERROR. -> end user cannot retry the confirm the
> instance needs to be deleted.
> 
> b) Allocation manipulation error during confirm of a migration causes
> that the instance goes back to VERIFY_RESIZE state. -> end user can
> retry the confirm via the API.
> 
> c) Allocation manipulation error during confirm of a migration causes
> that nova automatically tries to revert the migration. (For failure
> during this revert the same options available as for the generic revert
> case, see above)
> 
> We also need to consider live migration. It is similar in a sense that
> it also use move_allocations. But it is different as the end user
> doesn't explicitly confirm or revert a live migration.
> 
> I'm looking for opinions about which option we should take in each cases.
> 
> gibi
> 
>>
>> -efried
>>
>> __
>>
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] [nova] [placement] placement below or beside compute after extraction?

2018-08-21 Thread Eric Fried
> The reshaper code
> is still going through code review, then next we have the integration to
> do.

To clarify: we're doing the integration in concert with the API side.
Right now the API side patches [1][2] are in series underneath the nova
side [3].

In a placement-in-its-own-repo world, the only difference would have
been that these would be separate series with a Depends-On linking them,
and would require a placement release. (In fact, with a couple of
additional "placement cores", the API side could have been completed
faster and we might have landed the whole in Rocky.)

In a placement-under-separate-governance world, I contend there would
have been *zero* additional difference. Speculating on who the
"placement team" would be, the exact same people would have been present
at the hangouts and participating in the spec and code reviews.

[1] https://review.openstack.org/#/c/576927/
[2] https://review.openstack.org/#/c/585033/
[3] https://review.openstack.org/#/c/584598/ and up

> I think going through this
> integration would be best done *before* extraction to a new repo.

Agree. That could happen this week with some focused reviewing.

> I am OK with the idea of doing the extraction first, if that is
> what most people want to do.

Sweet. To close on this part of the discussion, is there anyone who
still objects to doing at least the repository-and-code part of the
extraction now?

> Affinity modeling and shared storage support are compute features
> OpenStack operators and users need. Operators need affinity modeling in
> placement to achieve parity for affinity scheduling with
> multiple cells. That means affinity scheduling in Nova with multiple
> cells is susceptible to races and does *not* work as well as the
> previous single cell support.

Sorry, I'm confused - are we talking about NUMA cells or cellsv2 cells?
If the latter, what additional placement-side support is needed to
support it?

> Shared storage support is something
> operators have badly needed for years now and was envisioned to be
> solved with placement.

Again, I'm pretty sure the placement side work for this is done, or very
close to it; the remaining work is on the nova side.

But regardless, let's assume both of the above require significant
placement work in close coordination with nova for specs, design,
implementation, etc. How would separating governance have a negative
impact on that? As for reshaper, it would be all the same people in the
room. As Doug says:

> What do you think those folks are more interested in working on than the
> things you listed as needing to be done to support the nova use cases?
>
> What can they do to reassure you that they will work on the items
> nova needs, regardless of the governance structure?

More...

> If operators need things for compute, that are well-known
> and that placement was created to solve, how will placement have a
> shared interest in solving compute problems, if it is not part of the
> compute project?

You answered your own question. If operators need a thing that involves
placement and nova, placement and nova have a shared interest in making
it happen. s/placement|nova/$openstack_project/. It's what we're about...

> separate goals and priorities

...because those priorities should largely overlap and be aligned with
OpenStack's goals and priorities, right?

> Who are candidates to be members of a review team for the placement
> repository after the code is moved out of openstack/nova?
>
> How many of them are also members of the nova-core team?

This brings us to another part of the discussion I think we can close on
right now. I don't think I've heard any objections to: "The initial
placement-core team should be a superset of the nova-core team." Do we
have a consensus on that?

(Deferring discussion of who the additional members ought to be. That
probably needs its own thread and/or a different audience.)

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] [nova] [placement] placement below or beside compute after extraction?

2018-08-20 Thread Eric Fried
This is great information, thanks Hongbin.

If I'm understanding correctly, it sounds like Zun ultimately wants to
be a peer of nova in terms of placement consumption. Using the resource
information reported by nova, neutron, etc., you wish to be able to
discover viable targets for a container deployment (GET
/allocation_candidates) and claim resources to schedule to them (PUT
/allocations/{uuid}). And you want to do it while Nova is doing the same
for VMs, in the same cloud. Do I have that right?

> * Is placement stable enough so that it won't break us often?

Yes.

> * If there is a breaking change in placement and we contribute a fix,
> how fast the fix will be merged?
> * If there is a feature request from our side and we contribute patches
> to placement, will the patches be accepted?

I believe this to be one of the main issues in the decision about
independent governance. If placement remains under nova, it is more
likely that fixes and features impacting the nova team would receive
higher priority than those impacting zun.

-efried

> I express the Zun's point of view.
> 
> Zun has a scheduler to schedule containers to nodes based on the
> demanded and available compute resources (i.e. cpu, memory). Right now,
> Zun's scheduler is independent of Nova so VMs and containers have to be
> separated into two set of resource pools. One of the most demanding
> features from our users (e.g. requested from Chinese UnionPay via
> OpenStack Financial WG) is to have VMs and containers share the same set
> of resource pool to maximize utilization. To satisfy this requirement,
> Zun needs to know the current resource allocation that are made by
> external services (i.e. Nova) so that we can take those information into
> account when scheduling the containers. Adopting placement is a
> straightforward and feasible approach to address that.
> 
> As a summary, below are high-level requirements from Zun's perspective:
> * Have VMs and containers multiplex into a pool of compute nodes.
> * Make optimal scheduling decisions for containers based on information
> (i.e. VM allocations) query from placement.
> * Report container allocations to placement and hope external schedulers
> can make optimal decisions.
> 
> We haven't figured out the technical details yet. However, to look
> forward, if Zun team decides to adopt placement, I would have the
> following concerns:
> * Is placement stable enough so that it won't break us often?
> * If there is a breaking change in placement and we contribute a fix,
> how fast the fix will be merged?
> * If there is a feature request from our side and we contribute patches
> to placement, will the patches be accepted?
> 
> Regardless of whether placement is extracted or not, above are the
> concerns that I mostly care about.
> 
> Best regards,
> Hongbin
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] [nova] [placement] placement below or beside compute after extraction?

2018-08-18 Thread Eric Fried
> So my hope is that (in no particular order) Jay Pipes, Eric Fried,
> Takashi Natsume, Tetsuro Nakamura, Matt Riedemann, Andrey Volkov,
> Alex Xu, Balazs Gibizer, Ed Leafe, and any other contributor to
> placement whom I'm forgetting [1] would express their preference on
> what they'd like to see happen.

Extract now, as a fully-independent project, under governance right out
of the gate.

A year ago we might have developed a feature where one patch would
straddle placement and nova. Six months ago we were developing features
where those patches were separate but in the same series. Today that's
becoming less and less the case: nrp, sharing providers, consumer
generations, and other things mentioned have had their placement side
completed and their nova side - if started at all - done completely
independently. The reshaper series is an exception - but looking back on
its development, Depends-On would have worked just as well.

Agree with the notion that nova needs to catch up with placement
features, and would therefore actually *benefit* from a placement
"feature freeze".

Agree the nova project is overloaded and would benefit from having
broader core reviewer coverage over placement code.  The list Chris
gives above includes more than one non-nova core who should be made
placement cores as soon as that's a thing.

The fact that other projects are in various stages of adopting/using
placement in various capacities is a great motive to extract. But IMO
the above would be sufficient reason without that.

Plus other things that other people have said.

Do it. Do it completely. Do it now.

-efried
.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] how nova should behave when placement returns consumer generation conflict

2018-08-17 Thread Eric Fried
gibi-

>> - On migration, when we transfer the allocations in either direction, a
>> conflict means someone managed to resize (or otherwise change
>> allocations?) since the last time we pulled data. Given the global lock
>> in the report client, this should have been tough to do. If it does
>> happen, I would think any retry would need to be done all the way back
>> at the claim, which I imagine is higher up than we should go. So again,
>> I think we should fail the migration and make the user retry.
> 
> Do we want to fail the whole migration or just the migration step (e.g.
> confirm, revert)?
> The later means that failure during confirm or revert would put the
> instance back to VERIFY_RESIZE. While the former would mean that in case
> of conflict at confirm we try an automatic revert. But for a conflict at
> revert we can only put the instance to ERROR state.

This again should be "impossible" to come across. What would the
behavior be if we hit, say, ValueError in this spot?

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] how nova should behave when placement returns consumer generation conflict

2018-08-16 Thread Eric Fried
Thanks for this, gibi.

TL;DR: a).

I didn't look, but I'm pretty sure we're not caching allocations in the
report client. Today, nobody outside of nova (specifically the resource
tracker via the report client) is supposed to be mucking with instance
allocations, right? And given the global lock in the resource tracker,
it should be pretty difficult to race e.g. a resize and a delete in any
meaningful way. So short term, IMO it is reasonable to treat any
generation conflict as an error. No retries. Possible wrinkle on delete,
where it should be a failure unless forced.

Long term, I also can't come up with any scenario where it would be
appropriate to do a narrowly-focused GET+merge/replace+retry. But
implementing the above short-term plan shouldn't prevent us from adding
retries for individual scenarios later if we do uncover places where it
makes sense.

Here's some stream-of-consciousness that led me to the above opinions:

- On spawn, we send the allocation with a consumer gen of None because
we expect the consumer not to exist. If it exists, that should be a hard
fail. (Hopefully the only way this happens is a true UUID conflict.)

- On migration, when we create the migration UUID, ditto above ^

- On migration, when we transfer the allocations in either direction, a
conflict means someone managed to resize (or otherwise change
allocations?) since the last time we pulled data. Given the global lock
in the report client, this should have been tough to do. If it does
happen, I would think any retry would need to be done all the way back
at the claim, which I imagine is higher up than we should go. So again,
I think we should fail the migration and make the user retry.

- On destroy, a conflict again means someone managed a resize despite
the global lock. If I'm deleting an instance and something about it
changes, I would think I want the opportunity to reevaluate my decision
to delete it. That said, I would definitely want a way to force it (in
which case we can just use the DELETE call explicitly). But neither case
should be a retry, and certainly there is no destroy scenario where I
would want a "merging" of allocations to happen.

Thanks,
efried


On 08/16/2018 06:43 AM, Balázs Gibizer wrote:
> reformatted for readabiliy, sorry:
> 
> Hi,
> 
> tl;dr: To properly use consumer generations (placement 1.28) in Nova we
> need to decide how to handle consumer generation conflicts from Nova's
> perspective:
> a) Nova reads the current consumer_generation before the allocation
>   update operation and use that generation in the allocation update
>   operation.  If the allocation is changed between the read and the
>   update then nova fails the server lifecycle operation and let the
>   end user retry it.
> b) Like a) but in case of conflict nova blindly retries the
>   read-and-update operation pair couple of times and if only fails
>   the life cycle operation if run out of retries.
> c) Nova stores its own view of the allocation. When a consumer's
>   allocation needs to be modified then nova reads the current state
>   of the consumer from placement. Then nova combines the two
>   allocations to generate the new expected consumer state. In case
>   of generation conflict nova retries the read-combine-update
>   operation triplet.
> 
> Which way we should go now?
> 
> What should be or long term goal?
> 
> 
> Details:
> 
> There are plenty of affected lifecycle operations. See the patch series
> starting at [1].
> 
> For example:
> 
> The current patch[1] that handles the delete server case implements
> option b).  It simply reads the current consumer generation from
> placement and uses that to send a PUT /allocations/{instance_uuid} with
> "allocations": {} in its body.
> 
> Here implementing option c) would mean that during server delete nova
> needs:
> 1) to compile its own view of the resource need of the server
>   (currently based on the flavor but in the future based on the
>   attached port's resource requests as well)
> 2) then read the current allocation of the server from placement
> 3) then subtract the server resource needs from the current allocation
>   and send the resulting allocation back in the update to placement
> 
> In the simple case this subtraction would result in an empty allocation
> sent to placement. Also in this simple case c) has the same effect as
> b) currently implemented in [1].
> 
> However, if somebody outside of nova modifies the allocation of this
> consumer in a way that nova does not know about (a changed resource
> need), then b) and c) will result in different placement states after
> server delete.
> 
> I only know of one example: the change of a neutron port's resource
> request while the port is attached. (Note, it is out of scope in the
> first step of the bandwidth implementation.) In this specific example
> option c) can work if nova re-reads the port's resource request during
> delete when it recalculates its own view of the server's resource needs. But
> I don't know if 

Re: [openstack-dev] [nova] How to debug no valid host failures with placement

2018-08-14 Thread Eric Fried
Folks-

The patch mentioned below [1] has undergone several rounds of review
and collaborative revision, and we'd really like to get your feedback on
it. From the commit message:

Here are some examples of the debug output:

- A request for three resources with no aggregate or trait filters:

 found 7 providers with available 5 VCPU
 found 9 providers with available 1024 MEMORY_MB
   5 after filtering by previous result
 found 8 providers with available 1500 DISK_GB
   2 after filtering by previous result

- The same request, but with a required trait that nobody has, shorts
  out quickly:

 found 0 providers after applying required traits filter
({'HW_CPU_X86_AVX2': 65})

- A request for one resource with aggregates and forbidden (but no
  required) traits:

 found 2 providers after applying aggregates filter
([['3ed8fb2f-4793-46ee-a55b-fdf42cb392ca']])
 found 1 providers after applying forbidden traits filter
({u'CUSTOM_TWO': 201, u'CUSTOM_THREE': 202})
 found 3 providers with available 4 VCPU
   1 after applying initial aggregate and trait filters

Thanks,
efried

[1] https://review.openstack.org/#/c/590041


> I've created a patch that (hopefully) will address some of the
> difficulty that folks have had in diagnosing which parts of a request
> caused all providers to be filtered out from the return of GET
> /allocation_candidates:
> 
> https://review.openstack.org/#/c/590041
> 
> This patch changes two primary things:
> 
> 1) Query-splitting
> 
> The patch splits the existing monster SQL query that was being used for
> querying for all providers that matched all requested resources,
> required traits, forbidden traits and required aggregate associations
> into doing multiple queries, one for each requested resource. While this
> does increase the number of database queries executed for each call to
> GET /allocation_candidates, the changes allow better visibility into
> what parts of the request cause an exhaustion of matching providers.
> We've benchmarked the new patch and have shown the performance impact of
> doing 3 queries versus 1 (when there is a request for 3 resources --
> VCPU, RAM and disk) is minimal (a few extra milliseconds for execution
> against a DB with 1K providers having inventory of all three resource
> classes).
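
(Roughly, the shape of the split is something like the sketch below -- names
like query_providers_with_capacity() are stand-ins, not the actual placement
code:)

    import logging

    LOG = logging.getLogger(__name__)

    def query_providers_with_capacity(rc, amount):
        # Stand-in for the real per-resource-class SELECT against the
        # inventory/allocation tables; returns matching provider ids.
        return []

    def find_matching_providers(requested):
        """requested: dict of resource class name -> requested amount."""
        matching = None
        for rc, amount in requested.items():
            providers = set(query_providers_with_capacity(rc, amount))
            LOG.debug("found %d providers with available %d %s",
                      len(providers), amount, rc)
            if matching is None:
                matching = providers
            else:
                matching &= providers
                LOG.debug("  %d after filtering by previous result",
                          len(matching))
        return matching or set()
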
> 
> 2) Diagnostic logging output
> 
> The patch adds debug log output within each loop iteration, so there is
> now logging output that shows how many matching providers were found for
> each resource class involved in the request. The output looks like this
> in the logs:
> 
> [req-2d30faa8-4190-4490-a91e-610045530140] inside VCPU request loop.
> before applying trait and aggregate filters, found 12 matching providers
> [req-2d30faa8-4190-4490-a91e-610045530140] found 12 providers with
> capacity for the requested 1 VCPU.
> [req-2d30faa8-4190-4490-a91e-610045530140] inside MEMORY_MB request
> loop. before applying trait and aggregate filters, found 9 matching
> providers [req-2d30faa8-4190-4490-a91e-610045530140] found 9 providers
> with capacity for the requested 64 MEMORY_MB. before loop iteration we
> had 12 matches. [req-2d30faa8-4190-4490-a91e-610045530140]
> RequestGroup(use_same_provider=False, resources={MEMORY_MB:64, VCPU:1},
> traits=[], aggregates=[]) (suffix '') returned 9 matches
> 
> If a request includes required traits, forbidden traits or required
> aggregate associations, there are additional log messages showing how
> many matching providers were found after applying the trait or aggregate
> filtering set operation (in other words, the log output shows the impact
> of the trait filter or aggregate filter in much the same way that the
> existing FilterScheduler logging shows the "before and after" impact
> that a particular filter had on a request process).
> 
> Have a look at the patch in question and please feel free to add your
> feedback and comments on ways this can be improved to meet your needs.
> 
> Best,
> -jay
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder] Reminder about the weekly Cinder meeting ...

2018-08-13 Thread Eric Fried
Are you talking about the nastygram from "Sigyn" saying:

"Your actions in # tripped automated anti-spam measures
(nicks/hilight spam), but were ignored based on your time in channel;
stop now, or automated action will still be taken. If you have any
questions, please don't hesitate to contact a member of staff"

I'm getting this too, and (despite the implication to the contrary) it
sometimes cuts off my messages in an unpredictable spot.

I'm contacting "a member of staff" to see if there's any way to get
"whitelisted" for big messages. In the meantime, the only solution I'm
aware of is to chop your pasteypaste up into smaller chunks, and wait a
couple seconds between pastes.

-efried

On 08/13/2018 04:06 PM, Ben Nemec wrote:
> 
> 
> On 08/08/2018 12:04 PM, Jay S Bryant wrote:
>> Team,
>>
>> A reminder that we have our weekly Cinder meeting on Wednesdays at
>> 16:00 UTC.  I bring this up as I can no longer send the courtesy pings
>> without being kicked from IRC.  So, if you wish to join the meeting
>> please add a reminder to your calendar of choice.
> 
> Do you have any idea why you're being kicked?  I'm wondering how to
> avoid getting into this situation with the Oslo pings.
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] How to debug no valid host failures with placement

2018-08-03 Thread Eric Fried
> I'm of two minds here.
> 
> On the one hand, you have the case where the end user has accidentally
> requested some combination of things that isn't normally available, and
> they need to be able to ask the provider what they did wrong.  I agree
> that this case is not really an exception, those resources were never
> available in the first place.
> 
> On the other hand, suppose the customer issues a valid request and it
> works, and then issues the same request again and it fails, leading to a
> violation of that customers SLA.  In this case I would suggest that it
> could be considered an exception since the system is not delivering the
> service that it was intended to deliver.

While the case can be made for this being an exception from *nova* (I'm
not getting into that), it is not an exception from the point of view of
*placement*. You asked a service "list the ways I can do X". The first
time, there were three ways. The second time, zero.

It would be like saying:

 # This is the "placement" part; matches_request() is just a placeholder
 results = [x for x in l if matches_request(x)]

 # It is up to the placement *consumer* (e.g. nova) to do this, or not
 if len(results) == 0:
     raise Something()

The hard point, which I'm not disputing, is that the end user needs a
way to understand *why* len(results) == 0.

efried
.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] How to debug no valid host failures with placement

2018-08-02 Thread Eric Fried
> And we could do the same kind of approach with the non-granular request
> groups by reducing the single large SQL statement that is used for all
> resources and all traits (and all agg associations) into separate SELECT
> statements.
> 
> It could be slightly less performance-optimized but more readable and
> easier to output debug logs like those above.

Okay, but first we should define the actual problem(s) we're trying to
solve, as Chris says, so we can assert that it's worth the (possible)
perf hit and (definite) dev resources, not to mention the potential for
injecting bugs.

That said, it might be worth doing what you suggest purely for the sake
of being able to read and understand the code...

efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] How to debug no valid host failures with placement

2018-08-02 Thread Eric Fried
I should have made it clear that this is a tiny incremental improvement,
to a code path that almost nobody is even going to see until Stein. In
no way was it intended to close this topic.

Thanks,
efried

On 08/02/2018 12:40 PM, Eric Fried wrote:
> Jay et al-
> 
>> And what I'm referring to is doing a single query per "related
>> resource/trait placement request group" -- which is pretty much what
>> we're heading towards anyway.
>>
>> If we had a request for:
>>
>> GET /allocation_candidates?
>>  resources0=VCPU:1&
>>  required0=HW_CPU_X86_AVX2,!HW_CPU_X86_VMX&
>>  resources1=MEMORY_MB:1024
>>
>> and logged something like this:
>>
>> DEBUG: [placement request ID XXX] request group 1 of 2 for 1 PCPU,
>> requiring HW_CPU_X86_AVX2, forbidding HW_CPU_X86_VMX, returned 10 matches
>>
>> DEBUG: [placement request ID XXX] request group 2 of 2 for 1024
>> MEMORY_MB returned 3 matches
>>
>> that would at least go a step towards being more friendly for debugging
>> a particular request's results.
> 
> Well, that's easy [1] (but I'm sure you knew that when you suggested
> it). Produces logs like [2].
> 
> This won't be backportable, I'm afraid.
> 
> [1] https://review.openstack.org/#/c/588350/
> [2] http://paste.openstack.org/raw/727165/
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] How to debug no valid host failures with placement

2018-08-02 Thread Eric Fried
Jay et al-

> And what I'm referring to is doing a single query per "related
> resource/trait placement request group" -- which is pretty much what
> we're heading towards anyway.
> 
> If we had a request for:
> 
> GET /allocation_candidates?
>  resources0=VCPU:1&
>  required0=HW_CPU_X86_AVX2,!HW_CPU_X86_VMX&
>  resources1=MEMORY_MB:1024
> 
> and logged something like this:
> 
> DEBUG: [placement request ID XXX] request group 1 of 2 for 1 PCPU,
> requiring HW_CPU_X86_AVX2, forbidding HW_CPU_X86_VMX, returned 10 matches
> 
> DEBUG: [placement request ID XXX] request group 2 of 2 for 1024
> MEMORY_MB returned 3 matches
> 
> that would at least go a step towards being more friendly for debugging
> a particular request's results.

Well, that's easy [1] (but I'm sure you knew that when you suggested
it). Produces logs like [2].

This won't be backportable, I'm afraid.

[1] https://review.openstack.org/#/c/588350/
[2] http://paste.openstack.org/raw/727165/

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal

2018-08-01 Thread Eric Fried
Sundar-

> On an unrelated note, thanks for the
> pointer to the GPU spec
> (https://review.openstack.org/#/c/579359/10/doc/source/specs/rocky/device-passthrough.rst).
> I will review that.

Thanks. Please note that this is for nova-powervm, PowerVM's
*out-of-tree* compute driver. We hope to bring this into the in-tree
driver eventually (unless we skip straight to the cyborg model :) but it
should give a good idea of some of the requirements and use cases we're
looking to support.

> Fair enough. We had discussed that too. The Cyborg drivers can also
> invoke REST APIs etc. for Power.

Ack.

> Agreed. So, we could say:
> - The plugins do the instance half. They are hypervisor-specific and
> platform-specific. (The term 'platform' subsumes both the architecture
> (Power, x86) and the server/system type.) They are invoked by os-acc.
> - The drivers do the device half, device discovery/enumeration and
> anything not explicitly assigned to plugins. They contain
> device-specific and platform-specific code. They are invoked by Cyborg
> agent and os-acc.

Sounds good.

> Are you ok with the workflow in
> https://docs.google.com/drawings/d/1cX06edia_Pr7P5nOB08VsSMsgznyrz4Yy2u8nb596sU/edit?usp=sharing
> ?

Yes (but see below).

>> You mean for getVAN()?
> Yes -- BTW, I renamed it as prepareVANs() or prepareVAN(), because it is
> not just a query as the name getVAN implies, but has side effects.

Ack.

>> Because AFAIK, os_vif.plug(list_of_vif_objects,
>> InstanceInfo) is *not* how nova uses os-vif for plugging.
> 
> Yes, the os-acc will invoke the plug() once per VAN. IIUC, Nova calls
> Neutron once per instance for all networks, as seen in this code
> sequence in nova/nova/compute/manager.py:
> 
> _build_and_run_instance() --> _build_resources() -->
> 
>     _build_networks_for_instance() --> _allocate_network()
> 
> The _allocate_network() actually takes a list of requested_networks, and
> handles all networks for an instance [1].
> 
> Chasing this further down:
> 
> _allocate_network --> _allocate_network_async()
> 
> --> self.network_api.allocate_for_instance()
> 
>  == nova/network/rpcapi.py::allocate_for_instance()
> 
> So, even the RPC out of Nova seems to take a list of networks [2].

Yes yes, but by the time we get to os_vif.plug(), we're doing one VIF at
a time. That corresponds to what you've got in your flow diagram, so as
long as that's accurate, I'm fine with it.

That said, we could discuss os_acc.plug taking a list of VANs and
threading out the calls to the plugin's plug() method (which takes one
at a time). I think we've talked a bit about this before: the pros and
cons of having the threading managed by os-acc or by the plugin. We
could have the same discussion for prepareVANs() too.
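
For example, the os-acc side of that threading could look something like the
following sketch (the plugin.plug() signature and the VAN objects here are
assumptions, not the agreed interface):

    import concurrent.futures

    def plug_all(plugin, instance_info, vans):
        """Fan plugin.plug() out across the VANs for one instance."""
        with concurrent.futures.ThreadPoolExecutor() as pool:
            futures = [pool.submit(plugin.plug, van, instance_info)
                       for van in vans]
            for fut in concurrent.futures.as_completed(futures):
                fut.result()  # surface any per-VAN plug failure
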

> [1]
> https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L1529
> [2]
> https://github.com/openstack/nova/blob/master/nova/network/rpcapi.py#L163
>> Thanks,
>> Eric
> Regards,
> Sundar
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Cyborg] Updates to os-acc proposal

2018-07-31 Thread Eric Fried
Sundar-

>   * Cyborg drivers deal with device-specific aspects, including
> discovery/enumeration of devices and handling the Device Half of the
> attach (preparing devices/accelerators for attach to an instance,
> post-attach cleanup (if any) after successful attach, releasing
> device/accelerator resources on instance termination or failed
> attach, etc.)
>   * os-acc plugins deal with hypervisor/system/architecture-specific
> aspects, including handling the Instance Half of the attach (e.g.
> for libvirt with PCI, preparing the XML snippet to be included in
> the domain XML).

This sounds well and good, but discovery/enumeration will also be
hypervisor/system/architecture-specific. So...

> Thus, the drivers and plugins are expected to be complementary. For
> example, for 2 devices of types T1 and T2, there shall be 2 separate
> Cyborg drivers. Further, we would have separate plugins for, say,
> x86+KVM systems and Power systems. We could then have four different
> deployments -- T1 on x86+KVM, T2 on x86+KVM, T1 on Power, T2 on Power --
> by suitable combinations of the drivers and plugins.

...the discovery/enumeration code for T1 on x86+KVM (lsdev? lspci?
walking the /dev file system?) will be totally different from the
discovery/enumeration code for T1 on Power
(pypowervm.wrappers.ManagedSystem.get(adapter)).

I don't mind saying "drivers do the device side; plugins do the instance
side" but I don't see getting around the fact that both "sides" will
need to have platform-specific code.

> One secondary detail to note is that Nova compute calls os-acc per
> instance for all accelerators for that instance, not once for each
> accelerator.

You mean for getVAN()? Because AFAIK, os_vif.plug(list_of_vif_objects,
InstanceInfo) is *not* how nova uses os-vif for plugging.

Thanks,
Eric
.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fwd: [TIP] tox release 3.1.1

2018-07-13 Thread Eric Fried
Ben-

On 07/13/2018 10:12 AM, Ben Nemec wrote:
>
>
> On 07/12/2018 04:29 PM, Eric Fried wrote:
>> Here it is for nova.
>>
>> https://review.openstack.org/#/c/582392/
>>
>>>> also don't love that immediately bumping the lower bound for tox is
>>>> going to be kind of disruptive to a lot of people.
>>
>> By "kind of disruptive," do you mean:
>>
>>   $ tox -e blah
>>   ERROR: MinVersionError: tox version is 1.6, required is at least 3.1.1
>>   $ sudo pip install --upgrade tox
>>   
>>   $ tox -e blah
>>   
>
> Repeat for every developer on every project that gets updated.  And if
> you installed tox from a distro package then it might not be that
> simple since pip installing over distro packages can get weird.

Not every project; I only install tox once on my system and it works for
all projects, nah? Am I missing something?

Stephen commented similarly that we should wait for distros to pick up
the package. WFM, nothing urgent about this.

>
> No, it's not a huge deal, but then neither is the repetition in
> tox.ini so I'd just as soon leave it be for now.  But I'm not going to
> -1 any patches either.
>
>>
>> ?
>>
>> Thanks,
>> efried
>>
>> On 07/09/2018 03:58 PM, Doug Hellmann wrote:
>>> Excerpts from Ben Nemec's message of 2018-07-09 15:42:02 -0500:
>>>>
>>>> On 07/09/2018 11:16 AM, Eric Fried wrote:
>>>>> Doug-
>>>>>
>>>>>  How long til we can start relying on the new behavior in the
>>>>> gate?  I
>>>>> gots me some basepython to purge...
>>>>
>>>> I want to point out that most projects require a rather old version of
>>>> tox, so chances are most people are not staying up to date with the
>>>> very
>>>> latest version.  I don't love the repetition in tox.ini right now,
>>>> but I
>>>> also don't love that immediately bumping the lower bound for tox is
>>>> going to be kind of disruptive to a lot of people.
>>>>
>>>> 1:
>>>> http://codesearch.openstack.org/?q=minversion=nope=tox.ini=
>>>
>>> Good point. Any patches to clean up the repetition should probably
>>> go ahead and update that minimum version setting, too.
>>>
>>> Doug
>>>
>>> __
>>>
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe:
>>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>> __
>>
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fwd: [TIP] tox release 3.1.1

2018-07-12 Thread Eric Fried
Here it is for nova.

https://review.openstack.org/#/c/582392/

>> also don't love that immediately bumping the lower bound for tox is
>> going to be kind of disruptive to a lot of people.

By "kind of disruptive," do you mean:

 $ tox -e blah
 ERROR: MinVersionError: tox version is 1.6, required is at least 3.1.1
 $ sudo pip install --upgrade tox
 
 $ tox -e blah
 

?

Thanks,
efried

On 07/09/2018 03:58 PM, Doug Hellmann wrote:
> Excerpts from Ben Nemec's message of 2018-07-09 15:42:02 -0500:
>>
>> On 07/09/2018 11:16 AM, Eric Fried wrote:
>>> Doug-
>>>
>>> How long til we can start relying on the new behavior in the gate?  I
>>> gots me some basepython to purge...
>>
>> I want to point out that most projects require a rather old version of 
>> tox, so chances are most people are not staying up to date with the very 
>> latest version.  I don't love the repetition in tox.ini right now, but I 
>> also don't love that immediately bumping the lower bound for tox is 
>> going to be kind of disruptive to a lot of people.
>>
>> 1: http://codesearch.openstack.org/?q=minversion=nope=tox.ini=
> 
> Good point. Any patches to clean up the repetition should probably
> go ahead and update that minimum version setting, too.
> 
> Doug
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [stestr?][tox?][infra?] Unexpected success isn't a failure

2018-07-09 Thread Eric Fried
In gabbi, there's a way [1] to mark a test as an expected failure, which
makes it show up in your stestr run thusly:

{0}
nova.tests.functional.api.openstack.placement.test_placement_api.allocations-1.28_put_that_allocation_to_new_consumer.test_request
[0.710821s] ... ok

==
Totals
==
Ran: 1 tests in 9. sec.
 - Passed: 0
 - Skipped: 0
 - Expected Fail: 1
 - Unexpected Success: 0
 - Failed: 0

If I go fix the thing causing the heretofore-expected failure, but
forget to take out the `xfail: True`, it does this:

{0}
nova.tests.functional.api.openstack.placement.test_placement_api.allocations-1.28_put_that_allocation_to_new_consumer.test_request
[0.710517s] ... FAILED
{0}
nova.tests.functional.api.openstack.placement.test_placement_api.allocations-1.28_put_that_allocation_to_new_consumer.test_request
[0.00s] ... ok

==
Failed 1 tests - output below:
==

nova.tests.functional.api.openstack.placement.test_placement_api.allocations-1.28_put_that_allocation_to_new_consumer.test_request
--


==
Totals
==
Ran: 2 tests in 9. sec.
 - Passed: 1
 - Skipped: 0
 - Expected Fail: 0
 - Unexpected Success: 1
 - Failed: 0

BUT it does not cause the run to fail. For example, see the
nova-tox-functional results for [2] (specifically PS4): the test appears
twice in the middle of the run [3] and prints failure output [4] but the
job passes [5].

So I'm writing this email because I have no idea if this is expected
behavior or a bug (I'm hoping the latter, cause it's whack, yo); and if
a bug, I have no idea whose bug it should be. Help?
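
For anyone unfamiliar with the mechanism, here's a minimal plain-unittest
illustration (not nova code) of a test that was marked as an expected
failure turning into an unexpected success once the covered bug is fixed:

    import unittest

    class ExpectedFailureDemo(unittest.TestCase):
        @unittest.expectedFailure
        def test_formerly_broken(self):
            # Imagine this assertion used to fail, so the test was marked as
            # an expected failure.  Now that it passes, the leftover
            # decorator makes the result an "unexpected success".
            self.assertEqual(2, 1 + 1)

    if __name__ == '__main__':
        unittest.main()
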

Thanks,
efried

[1] https://gabbi.readthedocs.io/en/latest/format.html?highlight=xfail
[2] https://review.openstack.org/#/c/579921/4
[3]
http://logs.openstack.org/21/579921/4/check/nova-tox-functional/5fb6ee9/job-output.txt.gz#_2018-07-09_17_22_11_846366
[4]
http://logs.openstack.org/21/579921/4/check/nova-tox-functional/5fb6ee9/job-output.txt.gz#_2018-07-09_17_31_07_229271
[5]
http://logs.openstack.org/21/579921/4/check/nova-tox-functional/5fb6ee9/testr_results.html.gz

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Fwd: [TIP] tox release 3.1.1

2018-07-09 Thread Eric Fried
Doug-

How long til we can start relying on the new behavior in the gate?  I
gots me some basepython to purge...

-efried

On 07/09/2018 11:03 AM, Doug Hellmann wrote:
> Heads-up, there is a new tox release out. 3.1 includes some behavior
> changes in the way basepython behaves (thanks, Stephen Finucan!), as
> well as other bug fixes.
> 
> If you start seeing odd job failures, check your tox version.
> 
> Doug
> 
> --- Begin forwarded message from toxdevorg ---
> From: toxdevorg 
> To: testing-in-python , tox-dev 
> 
> Date: Mon, 09 Jul 2018 08:45:15 -0700
> Subject: [TIP] tox release 3.1.1
> 
> The tox team is proud to announce the 3.1.1 bug fix release!
> 
> tox aims to automate and standardize testing in Python. It is part of
> a larger vision of easing the packaging, testing and release process
> of Python software.
> 
> For details about the fix(es),please check the CHANGELOG:
> https://pypi.org/project/tox/3.1.1/#changelog
> 
> We thank all present and past contributors to tox. Have a look at
> https://github.com/tox-dev/tox/blob/master/CONTRIBUTORS to see who
> contributed.
> 
> Happy toxing,
> the tox-dev team
> 
> --- End forwarded message ---
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [keystone] Keystone Team Update - Week of 18 June 2018

2018-06-22 Thread Eric Fried
Also:

keystoneauth1 3.9.0 was released.  Its new feature is the ability to set
raise_exc on the Adapter object so you don't have to do it per request.
Here's a patch that makes use of the feature:
https://review.openstack.org/#/c/577437/
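
For anyone who hasn't tried it yet, a minimal usage sketch (the credentials
and endpoint are obviously made up):

    from keystoneauth1 import adapter, session
    from keystoneauth1.identity import v3

    auth = v3.Password(auth_url='http://keystone:5000/v3',
                       username='admin', password='secret',
                       project_name='admin',
                       user_domain_id='default',
                       project_domain_id='default')
    sess = session.Session(auth=auth)
    placement = adapter.Adapter(session=sess, service_type='placement',
                                raise_exc=True)
    # HTTP errors now raise without passing raise_exc on every request:
    resp = placement.get('/resource_providers')
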

-efried

On 06/22/2018 06:53 AM, Colleen Murphy wrote:
> # Keystone Team Update - Week of 18 June 2018
> 
> ## News
> 
> ### Default Roles Fallout
> 
> Our change to automatically create the 'reader' and 'member' roles during 
> bootstrap[1] caused some problems in the CI of other projects[2]. One problem 
> was that projects were manually creating a 'Member' role, and with the 
> database backend being case-insensitve, there would be a conflict with the 
> 'member' role that keystone is now creating. The immediate fix is to ensure 
> the clients in CI are checking for the 'member' role rather than the 'Member' 
> role before trying to create either role, but in the longer term, clients 
> would benefit from decoupling the API case sensitivity from the configuration 
> of the database backend[3].
> 
> Another problem was a bug related to implied roles in trusts[4]. If a role 
> implies another, but a trust is created with both roles explicitly, the token 
> will contain duplicate role names, which breaks the usage of trusts and hit 
> Sahara. This issue would have existed before, but was only discovered now 
> that we have implied roles by default.
> 
> [1] https://review.openstack.org/572243
> [2] 
> http://eavesdrop.openstack.org/meetings/keystone/2018/keystone.2018-06-19-16.00.log.html#l-24
> [3] 
> http://eavesdrop.openstack.org/meetings/keystone/2018/keystone.2018-06-19-16.00.log.html#l-175
> [4] https://bugs.launchpad.net/keystone/+bug/1778109
> 
> ### Limits Schema Restructuring
> 
> Morgan discovered some problems with the database schemas[5] for registered 
> limits and project limits and proposed that we can improve performance and 
> reduce data duplication by doing some restructuring and adding some indexes. 
> The migration path to the new schema is tricky[6] and we're still trying to 
> come up with a strategy that avoids triggers[7].
> 
> [5] 
> http://eavesdrop.openstack.org/meetings/keystone/2018/keystone.2018-06-19-16.00.log.html#l-184
> [6] 
> http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2018-06-19.log.html#t2018-06-19T21:04:05
> [7] https://etherpad.openstack.org/p/keystone-unified-limit-migration-notepad
> 
> ### No-nitpicking Culture
> 
> Following the community discussion on fostering a healthier culture by 
> avoiding needlessly nitpicking changes[8], the keystone team had a thoughtful 
> discussion on what constitutes nitpicking and how we should be voting on 
> changes[9]. Context is always important, and considering who the author is, 
> how significant the imperfection is, and how much effort it will take the 
> author to correct it should to be considered when deciding whether to ask 
> them to change something about their patch versus proposing your own fix in a 
> followup. I've always been proud of keystone's no-nitpicking culture and 
> it's encouraging to see continuous introspection.
> 
> [8] https://governance.openstack.org/tc/reference/principles.html
> [9] 
> http://eavesdrop.openstack.org/irclogs/%23openstack-keystone/%23openstack-keystone.2018-06-19.log.html#t2018-06-19T21:18:01
> 
> ## Recently Merged Changes
> 
> Search query: https://bit.ly/2IACk3F
> 
> We merged 16 changes this week, including client support for limits and a 
> major bugfix for implied roles.
> 
> ## Changes that need Attention
> 
> Search query: https://bit.ly/2wv7QLK
> 
> There are 57 changes that are passing CI, not in merge conflict, have no 
> negative reviews and aren't proposed by bots, so their authors are waiting 
> for any feedback.
> 
> ## Bugs
> 
> This week we opened 5 new bugs and closed 4.
> 
> Bugs opened (5) 
> Bug #1777671 (keystone:Medium) opened by Morgan Fainberg 
> https://bugs.launchpad.net/keystone/+bug/1777671 
> Bug #1777892 (keystone:Medium) opened by Lance Bragstad 
> https://bugs.launchpad.net/keystone/+bug/1777892 
> Bug #1777893 (keystone:Medium) opened by Lance Bragstad 
> https://bugs.launchpad.net/keystone/+bug/1777893 
> Bug #1778023 (keystone:Undecided) opened by kirandevraaj 
> https://bugs.launchpad.net/keystone/+bug/1778023 
> Bug #1778109 (keystone:Undecided) opened by Jeremy Freudberg 
> https://bugs.launchpad.net/keystone/+bug/1778109 
> 
> Bugs closed (2) 
> Bug #1758460 (keystone:Low) https://bugs.launchpad.net/keystone/+bug/1758460 
> Bug #1774654 (keystone:Undecided) 
> https://bugs.launchpad.net/keystone/+bug/1774654 
> 
> Bugs fixed (2) 
> Bug #1754184 (keystone:Medium) fixed by wangxiyuan 
> https://bugs.launchpad.net/keystone/+bug/1754184 
> Bug #1774229 (keystone:Medium) fixed by Lance Bragstad 
> https://bugs.launchpad.net/keystone/+bug/1774229
> 
> ## Milestone Outlook
> 
> https://releases.openstack.org/rocky/schedule.html
> 
> This week is our feature proposal freeze 

[openstack-dev] [nova][oot drivers] Putting a contract out on ComputeDriver.get_traits()

2018-06-19 Thread Eric Fried
All (but especially out-of-tree compute driver maintainers)-

ComputeDriver.get_traits() was introduced mere months ago [1] for
initial implementation by Ironic [2] mainly because the whole
update_provider_tree framework [3] wasn't fully baked yet.  Now that
update_provider_tree is a thing, I'm starting work to cut Ironic over to
using it [4].  Since, as of this writing, Ironic still has the only
in-tree implementation of get_traits [5], I'm planning to whack the
ComputeDriver interface [6] and its one callout in the resource tracker
[7] at the same time.

If you maintain an out-of-tree driver and this is going to break you
unbearably, scream now.  However, be warned that I will probably just
ask you to cut over to update_provider_tree.
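
For reference, the shape of the replacement is roughly the below -- a sketch
only, with an example trait, not a drop-in implementation:

    import os_traits
    from nova.virt import driver

    class MyDriver(driver.ComputeDriver):
        def update_provider_tree(self, provider_tree, nodename,
                                 allocations=None):
            # Traits previously returned by get_traits() get set on the
            # compute node's provider here instead.
            provider_tree.update_traits(nodename,
                                        [os_traits.HW_CPU_X86_AVX2])
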

Thanks,
efried

[1] https://review.openstack.org/#/c/532290/
[2]
https://review.openstack.org/#/q/topic:bp/ironic-driver-traits+(status:open+OR+status:merged)
[3]
http://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/update-provider-tree.html
[4] https://review.openstack.org/#/c/576588/
[5]
https://github.com/openstack/nova/blob/0876b091db6f6f0d6795d5907d3d8314706729a7/nova/virt/ironic/driver.py#L737
[6]
https://github.com/openstack/nova/blob/ecaadf6d6d3c94706fdd1fb24676e3bd2370f9f7/nova/virt/driver.py#L886-L895
[7]
https://github.com/openstack/nova/blob/ecaadf6d6d3c94706fdd1fb24676e3bd2370f9f7/nova/compute/resource_tracker.py#L915-L926

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [requirements][nova] weird error on 'Validating lower constraints of test-requirements.txt'

2018-06-15 Thread Eric Fried
Doug-

> The lower constraints tests only look at files in the same repo.
> The minimum versions of dependencies set in requirements.txt,
> test-requirements.txt, etc. need to match the values in
> lower-constraints.txt.
> 
> In this case, the more detailed error message is a few lines above the
> error quoted by Chen CH Ji. The details say "Requirement for package
> retrying has no lower bound" which means that there is a line in
> requirements.txt indicating a dependency on "retrying" but without
> specifying a minimum version. That is the problem.

The patch didn't change the retrying constraint in requirements.txt [1];
why isn't this same failure affecting every other patch in nova?

[1] https://review.openstack.org/#/c/523387/51/requirements.txt@65

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [placement] placement update 18-24

2018-06-15 Thread Eric Fried
Thank you as always for doing this, Chris.

> Some of the older items in this list are not getting much attention.
> That's a shame. The list is ordered (oldest first) the way it is on
> purpose.
> 
> * 
>   Purge comp_node and res_prvdr records during deletion of
>   cells/hosts

This is still on its first patch set, in merge conflict, with no action
for about 3mo.  Is it still being worked?

> * 
>   placement: Make API history doc more consistent

Reviewed.

> * 
>   Handle agg generation conflict in report client

Rebased.  This previously had three +1s.

> * 
>   Add unit test for non-placement resize

Reviewed.

> * 
>   cover migration cases with functional tests

Reviewed.

> * 
>   Bug fixes for sharing resource providers

Two patches under this topic.

https://review.openstack.org/533437 is abandoned

https://review.openstack.org/#/c/519601/ reviewed (again) & rebased

> * 
>   Move refresh time from report client to prov tree

This patch is still alive only as a marker on my TODO list - I need to
replace it with something completely different as noted by Jay & myself
at the bottom.

> * 
>   PCPU resource class

Reviewed & rebased.  This also made me notice an unused thing, which
I've proposed to remove via https://review.openstack.org/575847

> * 
>   rework how we pass candidate request information

This represents a toe in the waters of "we ought to go back and majorly
refactor a lot of the placement code - especially
nova/api/openstack/placement/objects/resource_provider.py - to make it
more readable and maintainable."  This particular patch is in merge
conflict (pretty majorly, if I'm not mistaken) and probably needs to
wait until the dust settles around nrp-in-alloc-candidates to be
resurrected.

> * 
>   add root parent NULL online migration

Reviewed.  (In merge conflict, and needs tests.)

> * 
>   add resource_requests field to RequestSpec

Active series currently starts at https://review.openstack.org/#/c/570018/

I've been reviewing these; need to catch up on the latest.

> * 
>   replace deprecated accept.best_match

Heading to the gate.

> * 
>   Enforce placement minimum in nova.cmd.status

Heading to the gate.

> * 
>   normalize_name helper (in os-traits)

Needs second core review, please.

> * 
>   Fix nits in nested provider allocation candidates(2)

Heading to the gate.

> * 
>   Convert driver supported capabilities to compute node provider
>   traits

Merge conflict and a bevy of -1s, needs TLC from the author.

> * 
>   Use placement.inventory.inuse in report client

Rebased.

> * 
>   ironic: Report resources as reserved when needed

Needs merge conflict resolved.

> * 
>   Test for multiple limit/group_policy qparams

Another marker for my TODO list.  Added -W.

> # End
> 
> Yow. That was long. Thanks for reading. Review some code please.

++

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder] [placement] cinder + placement forum session etherpad

2018-06-15 Thread Eric Fried
We just merged an initial pass at direct access to the placement service
[1].  See the test_direct suite for simple usage examples.

Note that this was written primarily to satisfy the FFU use case in
blueprint reshape-provider-tree [2] and therefore likely won't have
everything cinder needs.  So play around with it, but please do not put
it anywhere near production until we've had some more collab.  Find us
in #openstack-placement.

-efried

[1] https://review.openstack.org/572576
[2] https://review.openstack.org/572583

On 06/04/2018 07:57 AM, Jay S Bryant wrote:
> 
> 
> On 6/1/2018 7:28 PM, Chris Dent wrote:
>> On Wed, 9 May 2018, Chris Dent wrote:
>>
>>> I've started an etherpad for the forum session in Vancouver devoted
>>> to discussing the possibility of tracking and allocation resources
>>> in Cinder using the Placement service. This is not a done deal.
>>> Instead the session is to discuss if it could work and how to make
>>> it happen if it seems like a good idea.
>>>
>>> The etherpad is at
>>>
>>>    https://etherpad.openstack.org/p/YVR-cinder-placement
>>
>> The session went well. Some of the members of the cinder team who
>> might have had more questions had not been able to be at summit so
>> we were unable to get their input.
>>
>> We clarified some of the things that cinder wants to be able to
>> accomplish (run multiple schedulers in active-active and avoid race
>> conditions) and the fact that this is what placement is built for.
>> We also made it clear that placement itself can be highly available
>> (and scalable) because of its nature as a dead-simple web app over a
>> database.
>>
>> The next steps are for the cinder team to talk amongst themselves
>> and socialize the capabilities of placement (with the help of
>> placement people) and see if it will be suitable. It is unlikely
>> there will be much visible progress in this area before Stein.
> Chris,
> 
> Thanks for this update.  I have it on the agenda for the Cinder team to
> discuss this further.  We ran out of time in last week's meeting but
> will hopefully get some time to discuss it this week.  We will keep you
> updated as to how things progress on our end and pull in the placement
> guys as necessary. 
> 
> Jay
>>
>> See the etherpad for a bit more detail.
>>
>>
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] increasing the number of allowed volumes attached per instance > 26

2018-06-11 Thread Eric Fried
I thought we were leaning toward the option where nova itself doesn't
impose a limit, but lets the virt driver decide.

I would really like NOT to see logic like this in any nova code:

> if kvm|qemu:
> return 256
> elif POWER:
> return 4000
> elif:
> ...

On 06/11/2018 10:06 AM, Kashyap Chamarthy wrote:
> On Mon, Jun 11, 2018 at 11:55:29AM +0200, Sahid Orentino Ferdjaoui wrote:
>> On Fri, Jun 08, 2018 at 11:35:45AM +0200, Kashyap Chamarthy wrote:
>>> On Thu, Jun 07, 2018 at 01:07:48PM -0500, Matt Riedemann wrote:
> 
> [...]
> 
>>>> The 26 volumes thing is a libvirt driver restriction.
>>>
>>> The original limitation of 26 disks was because at that time there was
>>> no 'virtio-scsi'.  
>>>
>>> (With 'virtio-scsi', each of its controller allows upto 256 targets, and
>>> each target can use any LUN (Logical Unit Number) from 0 to 16383
>>> (inclusive).  Therefore, the maxium allowable disks on a single
>>> 'virtio-scsi' controller is 256 * 16384 == 4194304.)  Source[1].
>>
>> Not totally true for Nova. Nova handles one virtio-scsi controller per
>> guest and plugs all the volumes into one target, so in theory that would
>> be 16384 LUNs (only).
> 
> Yeah, I could've been clearer that I was only talking the maximum
> allowable disks regardless of how Nova handles it.
> 
>> But you made a good point the 26 volumes thing is not a libvirt driver
>> restriction. For example the QEMU SCSI native implementation handles
>> 256 disks.
>>
>> About the virtio-blk limitation I made the same finding but Tsuyoshi
>> Nagata shared an interesting point saying that virtio-blk is not longer
>> limited by the number of PCI slot available. That in recent kernel and
>> QEMU version [0].
>>
>> I could join what you are suggesting at the bottom and fix the limit
>> to 256 disks.
> 
> Right, that's for KVM-based hypervisors.  
> 
> Eric Fried on IRC said the other day that for IBM POWER hypervisor they
> have tested (not with OpenStack) upto 4000 disks.  But I am yet to see
> any more concrete details from POWER hypervisor users on this thread.
> 
> If people can't seem to reach an agreement on the limits, we may have to
> settle with conditionals:
> 
> if kvm|qemu:
> return 256
> elif POWER:
> return 4000
> elif:
> ...
> 
> Before that we need concrete data that it is a _reasonble_ limit for
> POWER hypervisor (and possibly others).
> 
>> [0] 
>> https://review.openstack.org/#/c/567472/16/nova/virt/libvirt/blockinfo.py@162
> 
> [...]
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers

2018-06-08 Thread Eric Fried
There is now a blueprint [1] and draft spec [2].  Reviews welcomed.

[1] https://blueprints.launchpad.net/nova/+spec/reshape-provider-tree
[2] https://review.openstack.org/#/c/572583/

On 06/04/2018 06:00 PM, Eric Fried wrote:
> There has been much discussion.  We've gotten to a point of an initial
> proposal and are ready for more (hopefully smaller, hopefully
> conclusive) discussion.
> 
> To that end, there will be a HANGOUT tomorrow (TUESDAY, JUNE 5TH) at
> 1500 UTC.  Be in #openstack-placement to get the link to join.
> 
> The strawpeople outlined below and discussed in the referenced etherpad
> have been consolidated/distilled into a new etherpad [1] around which
> the hangout discussion will be centered.
> 
> [1] https://etherpad.openstack.org/p/placement-making-the-(up)grade
> 
> Thanks,
> efried
> 
> On 06/01/2018 01:12 PM, Jay Pipes wrote:
>> On 05/31/2018 02:26 PM, Eric Fried wrote:
>>>> 1. Make everything perform the pivot on compute node start (which can be
>>>>     re-used by a CLI tool for the offline case)
>>>> 2. Make everything default to non-nested inventory at first, and provide
>>>>     a way to migrate a compute node and its instances one at a time (in
>>>>     place) to roll through.
>>>
>>> I agree that it sure would be nice to do ^ rather than requiring the
>>> "slide puzzle" thing.
>>>
>>> But how would this be accomplished, in light of the current "separation
>>> of responsibilities" drawn at the virt driver interface, whereby the
>>> virt driver isn't supposed to talk to placement directly, or know
>>> anything about allocations?
>> FWIW, I don't have a problem with the virt driver "knowing about
>> allocations". What I have a problem with is the virt driver *claiming
>> resources for an instance*.
>>
>> That's what the whole placement claims resources things was all about,
>> and I'm not interested in stepping back to the days of long racy claim
>> operations by having the compute nodes be responsible for claiming
>> resources.
>>
>> That said, once the consumer generation microversion lands [1], it
>> should be possible to *safely* modify an allocation set for a consumer
>> (instance) and move allocation records for an instance from one provider
>> to another.
>>
>> [1] https://review.openstack.org/#/c/565604/
>>
>>> Here's a first pass:
>>>
>>> The virt driver, via the return value from update_provider_tree, tells
>>> the resource tracker that "inventory of resource class A on provider B
>>> have moved to provider C" for all applicable AxBxC.  E.g.
>>>
>>> [ { 'from_resource_provider': ,
>>>  'moved_resources': [VGPU: 4],
>>>  'to_resource_provider': 
>>>    },
>>>    { 'from_resource_provider': ,
>>>  'moved_resources': [VGPU: 4],
>>>  'to_resource_provider': 
>>>    },
>>>    { 'from_resource_provider': ,
>>>  'moved_resources': [
>>>  SRIOV_NET_VF: 2,
>>>  NET_BANDWIDTH_EGRESS_KILOBITS_PER_SECOND: 1000,
>>>  NET_BANDWIDTH_INGRESS_KILOBITS_PER_SECOND: 1000,
>>>  ],
>>>  'to_resource_provider': 
>>>    }
>>> ]
>>>
>>> As today, the resource tracker takes the updated provider tree and
>>> invokes [1] the report client method update_from_provider_tree [2] to
>>> flush the changes to placement.  But now update_from_provider_tree also
>>> accepts the return value from update_provider_tree and, for each "move":
>>>
>>> - Creates provider C (as described in the provider_tree) if it doesn't
>>> already exist.
>>> - Creates/updates provider C's inventory as described in the
>>> provider_tree (without yet updating provider B's inventory).  This ought
>>> to create the inventory of resource class A on provider C.
>>
>> Unfortunately, right here you'll introduce a race condition. As soon as
>> this operation completes, the scheduler will have the ability to throw
>> new instances on provider C and consume the inventory from it that you
>> intend to give to the existing instance that is consuming from provider B.
>>
>>> - Discovers allocations of rc A on rp B and POSTs to move them to rp C*.
>>
>> For each consumer of resources on rp B, right?
>>
>>> - Updates provider B's inventory.
>>
>> Again, this is problematic because the scheduler will have already begun
>> to place new instances on B's inventory, w

Re: [openstack-dev] [Cyborg] [Nova] Backup plan without nested RPs

2018-06-05 Thread Eric Fried
Alex-

Allocations for an instance are pulled down by the compute manager and
passed into the virt driver's spawn method since [1].  An allocation
comprises a consumer, provider, resource class, and amount.  Once we can
schedule to trees, the allocations pulled down by the compute manager
will span the tree as appropriate.  So in that sense, yes, nova-compute
knows which amounts of which resource classes come from which providers.

However, if you're asking about the situation where we have two
different allocations of the same resource class coming from two
separate providers: Yes, we can still tell which RCxAMOUNT is associated
with which provider; but No, we still have no inherent way to correlate
a specific one of those allocations with the part of the *request* it
came from.  If just the provider UUID isn't enough for the virt driver
to figure out what to do, it may have to figure it out by looking at the
flavor (and/or image metadata), inspecting the traits on the providers
associated with the allocations, etc.  (The theory here is that, if the
virt driver can't tell the difference at that point, then it actually
doesn't matter.)
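
For illustration, the allocations blob handed to the driver looks roughly
like this (the UUIDs and amounts are made up):

    allocations = {
        # the compute node provider
        '9cf9f776-5d79-4e67-82bd-2f9d9f0b19c4': {
            'resources': {'VCPU': 2, 'MEMORY_MB': 2048, 'DISK_GB': 20},
        },
        # a child provider, e.g. the PF supplying a VF
        'e8a8e9e0-17c0-4e0e-9a6d-8ae4b9b6c8b1': {
            'resources': {'SRIOV_NET_VF': 1},
        },
    }
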

[1] https://review.openstack.org/#/c/511879/

On 06/05/2018 09:05 AM, Alex Xu wrote:
> Maybe I missed something. Is there anyway the nova-compute can know the
> resources are allocated from which child resource provider? For example,
> the host has two PFs. The request is asking one VF, then the
> nova-compute needs to know the VF is allocated from which PF (resource
> provider). As my understand, currently we only return a list of
> alternative resource provider to the nova-compute, those alternative is
> root resource provider.
> 
> 2018-06-05 21:29 GMT+08:00 Jay Pipes  >:
> 
> On 06/05/2018 08:50 AM, Stephen Finucane wrote:
> 
> I thought nested resource providers were already supported by
> placement? To the best of my knowledge, what is /not/ supported
> is virt drivers using these to report NUMA topologies but I
> doubt that affects you. The placement guys will need to weigh in
> on this as I could be missing something but it sounds like you
> can start using this functionality right now.
> 
> 
> To be clear, this is what placement and nova *currently* support
> with regards to nested resource providers:
> 
> 1) When creating a resource provider in placement, you can specify a
> parent_provider_uuid and thus create trees of providers. This was
> placement API microversion 1.14. Also included in this microversion
> was support for displaying the parent and root provider UUID for
> resource providers.
> 
> 2) The nova "scheduler report client" (terrible name, it's mostly
> just the placement client at this point) understands how to call
> placement API 1.14 and create resource providers with a parent provider.
> 
> 3) The nova scheduler report client uses a ProviderTree object [1]
> to cache information about the hierarchy of providers that it knows
> about. For nova-compute workers managing hypervisors, that means the
> ProviderTree object contained in the report client is rooted in a
> resource provider that represents the compute node itself (the
> hypervisor). For nova-compute workers managing baremetal, that means
> the ProviderTree object contains many root providers, each
> representing an Ironic baremetal node.
> 
> 4) The placement API's GET /allocation_candidates endpoint now
> understands the concept of granular request groups [2]. Granular
> request groups are only relevant when a user wants to specify that
> child providers in a provider tree should be used to satisfy part of
> an overall scheduling request. However, this support is yet
> incomplete -- see #5 below.
> 
> The following parts of the nested resource providers modeling are
> *NOT* yet complete, however:
> 
> 5) GET /allocation_candidates does not currently return *results*
> when granular request groups are specified. So, while the placement
> service understands the *request* for granular groups, it doesn't
> yet have the ability to constrain the returned candidates
> appropriately. Tetsuro is actively working on this functionality in
> this patch series:
> 
> 
> https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/nested-resource-providers-allocation-candidates
> 
> 
> 
> 6) The virt drivers need to implement the update_provider_tree()
> interface [3] and construct the tree of resource providers along
> with appropriate inventory records for each child provider in the
> tree. Both libvirt and XenAPI virt drivers have patch series up that
> begin to take advantage of the 

Re: [openstack-dev] [Cyborg] [Nova] Backup plan without nested RPs

2018-06-05 Thread Eric Fried
To summarize: cyborg could model things nested-wise, but there would be
no way to schedule them yet.

Couple of clarifications inline.

On 06/05/2018 08:29 AM, Jay Pipes wrote:
> On 06/05/2018 08:50 AM, Stephen Finucane wrote:
>> I thought nested resource providers were already supported by
>> placement? To the best of my knowledge, what is /not/ supported is
>> virt drivers using these to report NUMA topologies but I doubt that
>> affects you. The placement guys will need to weigh in on this as I
>> could be missing something but it sounds like you can start using this
>> functionality right now.
> 
> To be clear, this is what placement and nova *currently* support with
> regards to nested resource providers:
> 
> 1) When creating a resource provider in placement, you can specify a
> parent_provider_uuid and thus create trees of providers. This was
> placement API microversion 1.14. Also included in this microversion was
> support for displaying the parent and root provider UUID for resource
> providers.
> 
> 2) The nova "scheduler report client" (terrible name, it's mostly just
> the placement client at this point) understands how to call placement
> API 1.14 and create resource providers with a parent provider.
> 
> 3) The nova scheduler report client uses a ProviderTree object [1] to
> cache information about the hierarchy of providers that it knows about.
> For nova-compute workers managing hypervisors, that means the
> ProviderTree object contained in the report client is rooted in a
> resource provider that represents the compute node itself (the
> hypervisor). For nova-compute workers managing baremetal, that means the
> ProviderTree object contains many root providers, each representing an
> Ironic baremetal node.
> 
> 4) The placement API's GET /allocation_candidates endpoint now
> understands the concept of granular request groups [2]. Granular request
> groups are only relevant when a user wants to specify that child
> providers in a provider tree should be used to satisfy part of an
> overall scheduling request. However, this support is yet incomplete --
> see #5 below.

Granular request groups are also usable/useful when sharing providers
are in play. That functionality is complete on both the placement side
and the report client side (see below).

> The following parts of the nested resource providers modeling are *NOT*
> yet complete, however:
> 
> 5) GET /allocation_candidates does not currently return *results* when
> granular request groups are specified. So, while the placement service
> understands the *request* for granular groups, it doesn't yet have the
> ability to constrain the returned candidates appropriately. Tetsuro is
> actively working on this functionality in this patch series:
> 
> https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/nested-resource-providers-allocation-candidates
> 
> 
> 6) The virt drivers need to implement the update_provider_tree()
> interface [3] and construct the tree of resource providers along with
> appropriate inventory records for each child provider in the tree. Both
> libvirt and XenAPI virt drivers have patch series up that begin to take
> advantage of the nested provider modeling. However, a number of concerns
> [4] about in-place nova-compute upgrades when moving from a single
> resource provider to a nested provider tree model were raised, and we
> have begun brainstorming how to handle the migration of existing data in
> the single-provider model to the nested provider model. [5] We are
> blocking any reviews on patch series that modify the local provider
> modeling until these migration concerns are fully resolved.
> 
> 7) The scheduler does not currently pass granular request groups to
> placement.

The code is in place to do this [6] - so the scheduler *will* pass
granular request groups to placement if your flavor specifies them.  As
noted above, such flavors will be limited to exploiting sharing
providers until Tetsuro's series merges.  But no further code work is
required on the scheduler side.

[6] https://review.openstack.org/#/c/515811/
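
For reference, a flavor exercising granular groups carries extra specs along
these lines (the specific resources and traits are made up; see the granular
spec for the full syntax):

    extra_specs = {
        'resources1:VCPU': '1',
        'trait1:HW_CPU_X86_AVX2': 'required',
        'resources2:SRIOV_NET_VF': '1',
        'trait2:CUSTOM_PHYSNET_PUBLIC': 'required',
    }
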

> Once #5 and #6 are resolved, and once the migration/upgrade
> path is resolved, clearly we will need to have the scheduler start
> making requests to placement that represent the granular request groups
> and have the scheduler pass the resulting allocation candidates to its
> filters and weighers.
> 
> Hope this helps highlight where we currently are and the work still left
> to do (in Rocky) on nested resource providers.
> 
> Best,
> -jay
> 
> 
> [1]
> https://github.com/openstack/nova/blob/master/nova/compute/provider_tree.py
> 
> [2]
> https://specs.openstack.org/openstack/nova-specs/specs/queens/approved/granular-resource-requests.html
> 
> 
> [3]
> https://github.com/openstack/nova/blob/f902e0d5d87fb05207e4a7aca73d185775d43df2/nova/virt/driver.py#L833
> 
> 
> [4] http://lists.openstack.org/pipermail/openstack-dev/2018-May/130783.html
> 
> [5] 

Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers

2018-06-04 Thread Eric Fried
There has been much discussion.  We've gotten to a point of an initial
proposal and are ready for more (hopefully smaller, hopefully
conclusive) discussion.

To that end, there will be a HANGOUT tomorrow (TUESDAY, JUNE 5TH) at
1500 UTC.  Be in #openstack-placement to get the link to join.

The strawpeople outlined below and discussed in the referenced etherpad
have been consolidated/distilled into a new etherpad [1] around which
the hangout discussion will be centered.

[1] https://etherpad.openstack.org/p/placement-making-the-(up)grade

Thanks,
efried

On 06/01/2018 01:12 PM, Jay Pipes wrote:
> On 05/31/2018 02:26 PM, Eric Fried wrote:
>>> 1. Make everything perform the pivot on compute node start (which can be
>>>     re-used by a CLI tool for the offline case)
>>> 2. Make everything default to non-nested inventory at first, and provide
>>>     a way to migrate a compute node and its instances one at a time (in
>>>     place) to roll through.
>>
>> I agree that it sure would be nice to do ^ rather than requiring the
>> "slide puzzle" thing.
>>
>> But how would this be accomplished, in light of the current "separation
>> of responsibilities" drawn at the virt driver interface, whereby the
>> virt driver isn't supposed to talk to placement directly, or know
>> anything about allocations?
> FWIW, I don't have a problem with the virt driver "knowing about
> allocations". What I have a problem with is the virt driver *claiming
> resources for an instance*.
> 
> That's what the whole placement claims resources things was all about,
> and I'm not interested in stepping back to the days of long racy claim
> operations by having the compute nodes be responsible for claiming
> resources.
> 
> That said, once the consumer generation microversion lands [1], it
> should be possible to *safely* modify an allocation set for a consumer
> (instance) and move allocation records for an instance from one provider
> to another.
> 
> [1] https://review.openstack.org/#/c/565604/
> 
>> Here's a first pass:
>>
>> The virt driver, via the return value from update_provider_tree, tells
>> the resource tracker that "inventory of resource class A on provider B
>> have moved to provider C" for all applicable AxBxC.  E.g.
>>
>> [ { 'from_resource_provider': <cn_rp_uuid>,
>>     'moved_resources': [VGPU: 4],
>>     'to_resource_provider': <gpu_rp1_uuid>
>>   },
>>   { 'from_resource_provider': <cn_rp_uuid>,
>>     'moved_resources': [VGPU: 4],
>>     'to_resource_provider': <gpu_rp2_uuid>
>>   },
>>   { 'from_resource_provider': <cn_rp_uuid>,
>>     'moved_resources': [
>>         SRIOV_NET_VF: 2,
>>         NET_BANDWIDTH_EGRESS_KILOBITS_PER_SECOND: 1000,
>>         NET_BANDWIDTH_INGRESS_KILOBITS_PER_SECOND: 1000,
>>     ],
>>     'to_resource_provider': <gpu_rp2_uuid>
>>   }
>> ]
>>
>> As today, the resource tracker takes the updated provider tree and
>> invokes [1] the report client method update_from_provider_tree [2] to
>> flush the changes to placement.  But now update_from_provider_tree also
>> accepts the return value from update_provider_tree and, for each "move":
>>
>> - Creates provider C (as described in the provider_tree) if it doesn't
>> already exist.
>> - Creates/updates provider C's inventory as described in the
>> provider_tree (without yet updating provider B's inventory).  This ought
>> to create the inventory of resource class A on provider C.
> 
> Unfortunately, right here you'll introduce a race condition. As soon as
> this operation completes, the scheduler will have the ability to throw
> new instances on provider C and consume the inventory from it that you
> intend to give to the existing instance that is consuming from provider B.
> 
>> - Discovers allocations of rc A on rp B and POSTs to move them to rp C*.
> 
> For each consumer of resources on rp B, right?
> 
>> - Updates provider B's inventory.
> 
> Again, this is problematic because the scheduler will have already begun
> to place new instances on B's inventory, which could very well result in
> incorrect resource accounting on the node.
> 
> We basically need to have one giant new REST API call that accepts the
> list of "move instructions" and performs all of the instructions in a
> single transaction. :(
> 
>> (*There's a hole here: if we're splitting a glommed-together inventory
>> across multiple new child providers, as the VGPUs in the example, we
>> don't know which allocations to put where.  The virt driver should know
>> which instances own which specific inventory units, and would be able to
>> report that info within the data structure.  That's getting kinda close
>> to the virt driver mucking with allocations, but maybe it fits well
>> enough into this model to be acceptable?)
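
To make Jay's "one giant new REST API call" suggestion concrete, such a call
might accept a payload shaped something like the sketch below.  This is a
purely hypothetical illustration - not an existing placement endpoint at the
time of this thread - but it shows the idea of replacing inventories and
moving allocations in a single transaction:

    POST /resource_provider_reshape    (hypothetical endpoint)
    {
        "inventories": {
            "<cn_rp_uuid>": { ...replacement inventory, minus moved classes... },
            "<gpu_rp1_uuid>": { ...new child provider inventory... }
        },
        "allocations": {
            "<instance_uuid>": {
                "allocations": {
                    "<gpu_rp1_uuid>": {"resources": {"VGPU": 4}}
                }
            }
        }
    }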

Re: [openstack-dev] [Cyborg] [Nova] Backup plan without nested RPs

2018-06-04 Thread Eric Fried
Sundar-

We've been discussing the upgrade path on another thread [1] and are
working toward a solution [2][3] that would not require downtime or
special scripts (other than whatever's normally required for an upgrade).

We still hope to have all of that ready for Rocky, but if you're
concerned about timing, this work should make it a viable option for you
to start out modeling everything in the compute RP as you say, and then
move it over later.

Thanks,
Eric

[1] http://lists.openstack.org/pipermail/openstack-dev/2018-May/130783.html
[2] http://lists.openstack.org/pipermail/openstack-dev/2018-June/131045.html
[3] https://etherpad.openstack.org/p/placement-migrate-operations

On 06/04/2018 12:49 PM, Nadathur, Sundar wrote:
> Hi,
>  Cyborg needs to create RCs and traits for accelerators. The
> original plan was to do that with nested RPs. To avoid rushing the Nova
> developers, I had proposed that Cyborg could start by applying the
> traits to the compute node RP, and accept the resulting caveats for
> Rocky, till we get nested RP support. That proposal did not find many
> takers, and Cyborg has essentially been in waiting mode.
> 
> Since it is June already, and there is a risk of not delivering anything
> meaningful in Rocky, I am reviving my older proposal, which is
> summarized as below:
> 
>   * Cyborg shall create the RCs and traits as per spec
> (https://review.openstack.org/#/c/554717/), both in Rocky and
> beyond. Only the RPs will change post-Rocky.
>   * In Rocky:
>   o Cyborg will not create nested RPs. It shall apply the device
> traits to the compute node RP.
>   o Cyborg will document the resulting caveat, i.e., all devices in
> the same host should have the same traits. In particular, we
> cannot have a GPU and a FPGA, or 2 FPGAs of different types, in
> the same host.
>   o Cyborg will document that upgrades to post-Rocky releases will
> require operator intervention (as described below).
>   *  For upgrade to post-Rocky world with nested RPs:
>   o The operator needs to stop all running instances that use an
> accelerator.
>   o The operator needs to run a script that removes the Cyborg
> traits and the inventory for Cyborg RCs from compute node RPs.
>   o The operator can then perform the upgrade. The new Cyborg
> agent/driver(s) shall create nested RPs and publish
> inventory/traits as specified.
> 
> IMHO, it is acceptable for Cyborg to do this because it is new and we
> can set expectations for the (lack of) upgrade plan. The alternative is
> that potentially no meaningful use cases get addressed in Rocky for Cyborg.
> 
> Please LMK what you think.
> 
> Regards,
> Sundar
> 
> 
> 



Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers

2018-06-01 Thread Eric Fried
Sylvain-

On 05/31/2018 02:41 PM, Sylvain Bauza wrote:
> 
> 
> On Thu, May 31, 2018 at 8:26 PM, Eric Fried <openst...@fried.cc> wrote:
> 
> > 1. Make everything perform the pivot on compute node start (which can be
> >    re-used by a CLI tool for the offline case)
> > 2. Make everything default to non-nested inventory at first, and provide
> >    a way to migrate a compute node and its instances one at a time (in
> >    place) to roll through.
> 
> I agree that it sure would be nice to do ^ rather than requiring the
> "slide puzzle" thing.
> 
> But how would this be accomplished, in light of the current "separation
> of responsibilities" drawn at the virt driver interface, whereby the
> virt driver isn't supposed to talk to placement directly, or know
> anything about allocations?  Here's a first pass:
> 
> 
> 
> What we usually do is to implement either at the compute service level
> or at the virt driver level some init_host() method that will reconcile
> what you want.
> For example, we could just imagine a non-virt specific method (and I
> like that because it's non-virt specific) - ie. called by compute's
> init_host() - that would look up the compute root RP inventories, see
> whether one or more inventories tied to specific resource classes have
> to be moved from the root RP and attached to a child RP.
> The only subtlety that would require a virt-specific update would be
> the name of the child RP (as both Xen and libvirt plan to use the child
> RP name as the vGPU type identifier), but that's an implementation detail
> that a possible virt driver update invoked by the resource tracker could
> reconcile.

The question was rhetorical; my suggestion (below) was an attempt at
designing exactly what you've described.  Let me know if I can
explain/clarify it further.  I'm looking for feedback as to whether it's
a viable approach.

> The virt driver, via the return value from update_provider_tree, tells
> the resource tracker that "inventory of resource class A on provider B
> have moved to provider C" for all applicable AxBxC.  E.g.
> 
> [ { 'from_resource_provider': <cn_rp_uuid>,
>     'moved_resources': [VGPU: 4],
>     'to_resource_provider': <gpu_rp1_uuid>
>   },
>   { 'from_resource_provider': <cn_rp_uuid>,
>     'moved_resources': [VGPU: 4],
>     'to_resource_provider': <gpu_rp2_uuid>
>   },
>   { 'from_resource_provider': <cn_rp_uuid>,
>     'moved_resources': [
>         SRIOV_NET_VF: 2,
>         NET_BANDWIDTH_EGRESS_KILOBITS_PER_SECOND: 1000,
>         NET_BANDWIDTH_INGRESS_KILOBITS_PER_SECOND: 1000,
>     ],
>     'to_resource_provider': <gpu_rp2_uuid>
>   }
> ]
> 
> As today, the resource tracker takes the updated provider tree and
> invokes [1] the report client method update_from_provider_tree [2] to
> flush the changes to placement.  But now update_from_provider_tree also
> accepts the return value from update_provider_tree and, for each "move":
> 
> - Creates provider C (as described in the provider_tree) if it doesn't
> already exist.
> - Creates/updates provider C's inventory as described in the
> provider_tree (without yet updating provider B's inventory).  This ought
> to create the inventory of resource class A on provider C.
> - Discovers allocations of rc A on rp B and POSTs to move them to rp C*.
> - Updates provider B's inventory.
> 
> (*There's a hole here: if we're splitting a glommed-together inventory
> across multiple new child providers, as the VGPUs in the example, we
> don't know which allocations to put where.  The virt driver should know
> which instances own which specific inventory units, and would be able to
> report that info within the data structure.  That's getting kinda close
> to the virt driver mucking with allocations, but maybe it fits well
> enough into this model to be acceptable?)
> 
> Note that the return value from update_provider_tree is optional, and
> only used when the virt driver is indicating a "move" of this ilk.  If
> it's None/[] then the RT/update_from_provider_tree flow is the same as
> it is today.
> 
> If we can do it this way, we don't need a migration tool.  In fact, we
> don't even need to restrict provider tree "reshaping" to release
> boundaries.  As long as the virt driver understands its own data model
> migrations and reports them properly via update_provider_tree, it can
> shuffle its tree around whenever it wants.
> 
> Thoughts?
> 
> -efried
> 
> [1]
> 
> https://github.com/openstack/nova

Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers

2018-05-31 Thread Eric Fried
Chris-

>> virt driver isn't supposed to talk to placement directly, or know
>> anything about allocations?
> 
> For sake of discussion, how much (if any) easier would it be if we
> got rid of this restriction?

At this point, having implemented the update_[from_]provider_tree flow
as we have, it would probably make things harder.  We still have to do
the same steps, but any bits we wanted to let the virt driver handle
would need some kind of weird callback dance.

But even if we scrapped update_[from_]provider_tree and redesigned from
first principles, virt drivers would have a lot of duplication of the
logic that currently resides in update_from_provider_tree.

So even though the restriction seems to make things awkward, having been
embroiled in this code as I have, I'm actually seeing how it keeps
things as clean and easy to reason about as can be expected for
something that's inherently as complicated as this.

>> the resource tracker that "inventory of resource class A on provider B
>> have moved to provider C" for all applicable AxBxC.  E.g.
> 
> traits too?

The traits are part of the updated provider tree itself.  The existing
logic in update_from_provider_tree handles shuffling those around.  I
don't think the RT needs to be told about any specific trait movement in
order to reason about moving allocations.  Do you see something I'm
missing there?

> The fact that we are using what amounts to a DSL to pass
> some additional instruction back from the virt driver feels squiffy

Yeah, I don't disagree.  The provider_tree object, and updating it via
update_provider_tree, is kind of a DSL already.  The list-of-dicts
format is just a strawman; we could make it an object or whatever (not
that that would make it less DSL-ish).

Perhaps an OVO :P

-efried


Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers

2018-05-31 Thread Eric Fried
Rats, typo correction below.

On 05/31/2018 01:26 PM, Eric Fried wrote:
>> 1. Make everything perform the pivot on compute node start (which can be
>>re-used by a CLI tool for the offline case)
>> 2. Make everything default to non-nested inventory at first, and provide
>>a way to migrate a compute node and its instances one at a time (in
>>place) to roll through.
> 
> I agree that it sure would be nice to do ^ rather than requiring the
> "slide puzzle" thing.
> 
> But how would this be accomplished, in light of the current "separation
> of responsibilities" drawn at the virt driver interface, whereby the
> virt driver isn't supposed to talk to placement directly, or know
> anything about allocations?  Here's a first pass:
> 
> The virt driver, via the return value from update_provider_tree, tells
> the resource tracker that "inventory of resource class A on provider B
> have moved to provider C" for all applicable AxBxC.  E.g.
> 
> [ { 'from_resource_provider': <cn_rp_uuid>,
>     'moved_resources': [VGPU: 4],
>     'to_resource_provider': <gpu_rp1_uuid>
>   },
>   { 'from_resource_provider': <cn_rp_uuid>,
>     'moved_resources': [VGPU: 4],
>     'to_resource_provider': <gpu_rp2_uuid>
>   },
>   { 'from_resource_provider': <cn_rp_uuid>,
>     'moved_resources': [
>         SRIOV_NET_VF: 2,
>         NET_BANDWIDTH_EGRESS_KILOBITS_PER_SECOND: 1000,
>         NET_BANDWIDTH_INGRESS_KILOBITS_PER_SECOND: 1000,
>     ],
>     'to_resource_provider': <gpu_rp2_uuid>
---
s/gpu_rp2_uuid/sriovnic_rp_uuid/ or similar.

>   }
> ]
> 
> As today, the resource tracker takes the updated provider tree and
> invokes [1] the report client method update_from_provider_tree [2] to
> flush the changes to placement.  But now update_from_provider_tree also
> accepts the return value from update_provider_tree and, for each "move":
> 
> - Creates provider C (as described in the provider_tree) if it doesn't
> already exist.
> - Creates/updates provider C's inventory as described in the
> provider_tree (without yet updating provider B's inventory).  This ought
> to create the inventory of resource class A on provider C.
> - Discovers allocations of rc A on rp B and POSTs to move them to rp C*.
> - Updates provider B's inventory.
> 
> (*There's a hole here: if we're splitting a glommed-together inventory
> across multiple new child providers, as the VGPUs in the example, we
> don't know which allocations to put where.  The virt driver should know
> which instances own which specific inventory units, and would be able to
> report that info within the data structure.  That's getting kinda close
> to the virt driver mucking with allocations, but maybe it fits well
> enough into this model to be acceptable?)
> 
> Note that the return value from update_provider_tree is optional, and
> only used when the virt driver is indicating a "move" of this ilk.  If
> it's None/[] then the RT/update_from_provider_tree flow is the same as
> it is today.
> 
> If we can do it this way, we don't need a migration tool.  In fact, we
> don't even need to restrict provider tree "reshaping" to release
> boundaries.  As long as the virt driver understands its own data model
> migrations and reports them properly via update_provider_tree, it can
> shuffle its tree around whenever it wants.
> 
> Thoughts?
> 
> -efried
> 
> [1]
> https://github.com/openstack/nova/blob/8753c9a38667f984d385b4783c3c2fc34d7e8e1b/nova/compute/resource_tracker.py#L890
> [2]
> https://github.com/openstack/nova/blob/8753c9a38667f984d385b4783c3c2fc34d7e8e1b/nova/scheduler/client/report.py#L1341
> 
> 



Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers

2018-05-31 Thread Eric Fried
> 1. Make everything perform the pivot on compute node start (which can be
>re-used by a CLI tool for the offline case)
> 2. Make everything default to non-nested inventory at first, and provide
>a way to migrate a compute node and its instances one at a time (in
>place) to roll through.

I agree that it sure would be nice to do ^ rather than requiring the
"slide puzzle" thing.

But how would this be accomplished, in light of the current "separation
of responsibilities" drawn at the virt driver interface, whereby the
virt driver isn't supposed to talk to placement directly, or know
anything about allocations?  Here's a first pass:

The virt driver, via the return value from update_provider_tree, tells
the resource tracker that "inventory of resource class A on provider B
have moved to provider C" for all applicable AxBxC.  E.g.

[ { 'from_resource_provider': <cn_rp_uuid>,
    'moved_resources': [VGPU: 4],
    'to_resource_provider': <gpu_rp1_uuid>
  },
  { 'from_resource_provider': <cn_rp_uuid>,
    'moved_resources': [VGPU: 4],
    'to_resource_provider': <gpu_rp2_uuid>
  },
  { 'from_resource_provider': <cn_rp_uuid>,
    'moved_resources': [
        SRIOV_NET_VF: 2,
        NET_BANDWIDTH_EGRESS_KILOBITS_PER_SECOND: 1000,
        NET_BANDWIDTH_INGRESS_KILOBITS_PER_SECOND: 1000,
    ],
    'to_resource_provider': <gpu_rp2_uuid>
  }
]

As today, the resource tracker takes the updated provider tree and
invokes [1] the report client method update_from_provider_tree [2] to
flush the changes to placement.  But now update_from_provider_tree also
accepts the return value from update_provider_tree and, for each "move":

- Creates provider C (as described in the provider_tree) if it doesn't
already exist.
- Creates/updates provider C's inventory as described in the
provider_tree (without yet updating provider B's inventory).  This ought
to create the inventory of resource class A on provider C.
- Discovers allocations of rc A on rp B and POSTs to move them to rp C*.
- Updates provider B's inventory.

(*There's a hole here: if we're splitting a glommed-together inventory
across multiple new child providers, as the VGPUs in the example, we
don't know which allocations to put where.  The virt driver should know
which instances own which specific inventory units, and would be able to
report that info within the data structure.  That's getting kinda close
to the virt driver mucking with allocations, but maybe it fits well
enough into this model to be acceptable?)
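
For concreteness, the resource-tracker side of the flow described above
might look something like the following sketch.  The helper method names
(_ensure_provider, _set_inventory, _move_allocations) are hypothetical -
they are not the current report client API - the point is only the ordering
of the steps:

    def apply_moves(reportclient, context, provider_tree, moves):
        # 'moves' is the (optional) list-of-dicts structure returned by
        # update_provider_tree, as described above.
        for move in moves:
            src = move['from_resource_provider']
            dst = move['to_resource_provider']
            # Create provider C if it doesn't already exist.
            reportclient._ensure_provider(context, dst)
            # Create/update provider C's inventory as described in the
            # provider_tree (provider B's inventory untouched so far).
            reportclient._set_inventory(
                context, dst, provider_tree.data(dst).inventory)
            # Discover allocations of the moved resource classes on B and
            # move them to C.
            reportclient._move_allocations(
                context, src, dst, move['moved_resources'])
            # Finally, update provider B's (shrunken) inventory.
            reportclient._set_inventory(
                context, src, provider_tree.data(src).inventory)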

Note that the return value from update_provider_tree is optional, and
only used when the virt driver is indicating a "move" of this ilk.  If
it's None/[] then the RT/update_from_provider_tree flow is the same as
it is today.

If we can do it this way, we don't need a migration tool.  In fact, we
don't even need to restrict provider tree "reshaping" to release
boundaries.  As long as the virt driver understands its own data model
migrations and reports them properly via update_provider_tree, it can
shuffle its tree around whenever it wants.

Thoughts?

-efried

[1]
https://github.com/openstack/nova/blob/8753c9a38667f984d385b4783c3c2fc34d7e8e1b/nova/compute/resource_tracker.py#L890
[2]
https://github.com/openstack/nova/blob/8753c9a38667f984d385b4783c3c2fc34d7e8e1b/nova/scheduler/client/report.py#L1341



Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers

2018-05-31 Thread Eric Fried
This seems reasonable, but...

On 05/31/2018 04:34 AM, Balázs Gibizer wrote:
> 
> 
> On Thu, May 31, 2018 at 11:10 AM, Sylvain Bauza  wrote:
>>>
>>
>> After considering the whole approach, discussing with a couple of
>> folks over IRC, here is what I feel is the best approach for a seamless
>> upgrade :
>>  - VGPU inventory will be kept on root RP (for the first type) in
>> Queens so that a compute service upgrade won't impact the DB
>>  - during Queens, operators can run a DB online migration script (like
-^^
Did you mean Rocky?

>> the ones we currently have in
>> https://github.com/openstack/nova/blob/c2f42b0/nova/cmd/manage.py#L375) that
>> will create a new resource provider for the first type and move the
>> inventory and allocations to it.
>>  - it's the responsibility of the virt driver code to check whether a
>> child RP with its name being the first type name already exists to
>> know whether to update the inventory against the root RP or the child RP.
>>
>> Does it work for folks ?
> 
> +1 works for me
> gibi
> 
>> PS : we already have the plumbing in place in nova-manage and we're
>> still managing full Nova resources. I know we plan to move Placement
>> out of the nova tree, but for the Rocky timeframe, I feel we can
>> consider nova-manage as the best and quickest approach for the data
>> upgrade.
>>
>> -Sylvain
>>
>>
> 
> 



Re: [openstack-dev] [Cyborg] [Nova] Cyborg traits

2018-05-31 Thread Eric Fried
Yup.  I'm sure reviewers will bikeshed the names, but the review is the
appropriate place for that to happen.

A couple of test changes will also be required.  You can have a look at
[1] as an example to follow.

-efried

[1] https://review.openstack.org/#/c/511180/

On 05/31/2018 01:02 AM, Nadathur, Sundar wrote:
> On 5/30/2018 1:18 PM, Eric Fried wrote:
>> This all sounds fully reasonable to me.  One thing, though...
>>
>>>>    * There is a resource class per device category e.g.
>>>>  CUSTOM_ACCELERATOR_GPU, CUSTOM_ACCELERATOR_FPGA.
>> Let's propose standard resource classes for these ASAP.
>>
>> https://github.com/openstack/nova/blob/d741f624c81baf89fc8b6b94a2bc20eb5355a818/nova/rc_fields.py
>>
>>
>> -efried
> Makes sense, Eric. The obvious names would be ACCELERATOR_GPU and
> ACCELERATOR_FPGA. Do we just submit a patch to rc_fields.py?
> 
> Thanks,
> Sundar
> 



Re: [openstack-dev] [Cyborg] [Nova] Cyborg traits

2018-05-30 Thread Eric Fried
This all sounds fully reasonable to me.  One thing, though...

>>   * There is a resource class per device category e.g.
>> CUSTOM_ACCELERATOR_GPU, CUSTOM_ACCELERATOR_FPGA.

Let's propose standard resource classes for these ASAP.

https://github.com/openstack/nova/blob/d741f624c81baf89fc8b6b94a2bc20eb5355a818/nova/rc_fields.py
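
The actual change would be small - roughly of this shape (a simplified
sketch, not the literal contents of rc_fields.py):

    # nova/rc_fields.py (simplified sketch)
    from oslo_versionedobjects import fields


    class ResourceClass(fields.StringField):
        """Classes of resources provided to consumers."""

        VCPU = 'VCPU'
        MEMORY_MB = 'MEMORY_MB'
        # ... existing standard classes ...
        ACCELERATOR_GPU = 'ACCELERATOR_GPU'    # proposed
        ACCELERATOR_FPGA = 'ACCELERATOR_FPGA'  # proposed

        STANDARD = (VCPU, MEMORY_MB,  # ...,
                    ACCELERATOR_GPU, ACCELERATOR_FPGA)

Any tests that enumerate the standard classes would need a matching update.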

-efried


Re: [openstack-dev] [nova] Extra feature of vCPU allocation on demands

2018-05-07 Thread Eric Fried
I will be interested to watch this develop.  In PowerVM we already have
shared vs. dedicated processors [1] along with concepts like capped vs.
uncapped, min/max proc units, weights, etc.  But obviously it's all
heavily customized to be PowerVM-specific.  If these concepts made their
way into mainstream Nova, we could hopefully adapt to use them and
remove some tech debt.

[1]
https://github.com/openstack/nova/blob/master/nova/virt/powervm/vm.py#L372-L401

On 05/07/2018 04:55 AM, 倪蔚辰 wrote:
> Hi, all
> 
> I would like to propose a blueprint (not proposed yet), which is related
> to openstack nova. I hope to have some comments by explaining my idea
> through this e-mail. Please contact me if anyone has any comment.
> 
>  
> 
> Background
> 
> Under current OpenStack, the vCPUs assigned to a VM can be configured as
> dedicated or shared. In some scenarios, such as deploying a Radio Access
> Network VNF, the VM is required to have dedicated vCPUs to ensure
> performance. However, in that case, each VM also has a vCPU that does
> Guest OS housekeeping. Usually this vCPU does not need high performance
> and does not consume a high percentage of a dedicated vCPU, so some
> dedicated vCPU capacity is wasted.
> 
>  
> 
> Proposed feature
> 
> I hope to add an extra feature to flavor extra specs that specifies how
> many dedicated vCPUs and how many shared vCPUs the VM needs. When the VM
> requires vCPUs, OpenStack allocates them accordingly. In the background
> scenario above, this idea can save many dedicated vCPUs that would
> otherwise be spent on Guest OS housekeeping. That scenario is only one
> use case for the feature; it potentially allows users more flexible VM
> designs that save CPU resources.
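
To make the proposal concrete, such a flavor might carry extra specs along
these lines.  The keys shown here are invented purely for illustration -
they do not exist in Nova today:

    hw:cpu_policy=mixed          # hypothetical new policy value
    hw:cpu_dedicated_count=7     # hypothetical key
    hw:cpu_shared_count=1        # hypothetical key

i.e. seven vCPUs pinned to dedicated host CPUs plus one floating vCPU for
guest OS housekeeping.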
> 
>  
> 
> Thanks.
> 
>  
> 
> Weichen
> 
> e-mail: niweic...@chinamobile.com
> 
>  
> 
> 
> 
> 



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-05-03 Thread Eric Fried
>> verify with placement
>> whether the image traits requested are 1) supported by the compute
>> host the instance is residing on and 2) coincide with the
>> already-existing allocations.

Note that #2 is a subset of #1.  The only potential advantage of
including #1 is efficiency: We can do #1 in one API call and bail early
if it fails; but if it passes, we have to do #2 anyway, which is
multiple steps.  So would we rather save one step in the "good path" or
potentially N-1 steps in the failure case?  IMO the cost of the
additional dev/test to implement #1 is higher than that of the potential
extra API calls.  (TL;DR: just implement #2.)

-efried



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-24 Thread Eric Fried
Alex-

On 04/24/2018 09:21 AM, Alex Xu wrote:
> 
> 
> 2018-04-24 20:53 GMT+08:00 Eric Fried <openst...@fried.cc>:
> 
> > The problem isn't just checking the traits in the nested resource
> > provider. We also need to ensure the trait in the exactly same child
> > resource provider.
> 
> No, we can't get "granular" with image traits.  We accepted this as a
> limitation for the spawn aspect of this spec [1], for all the same
> reasons [2].  And by the time we've spawned the instance, we've lost the
> information about which granular request groups (from the flavor) were
> satisfied by which resources - retrofitting that information from a new
> image would be even harder.  So we need to accept the same limitation
> for rebuild.
> 
> [1] "Due to the difficulty of attempting to reconcile granular request
> groups between an image and a flavor, only the (un-numbered) trait group
> is supported. The traits listed there are merged with those of the
> un-numbered request group from the flavor."
> 
> (http://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/glance-image-traits.html#proposed-change)
> [2]
> https://review.openstack.org/#/c/554305/2/specs/rocky/approved/glance-image-traits.rst@86
> 
> 
> Why would we return an RP which has a specific trait but from which we
> won't consume any resources?
> If the case is that we request two VFs, and this two VFs have different
> required traits, then that should be granular request.

We don't care about RPs we're not consuming resources from.  Forget
rebuild - if the image used for the original spawn request has traits
pertaining to VFs, we folded those traits into the un-numbered request
group, which means the VF resources would have needed to be in the
un-numbered request group in the flavor as well.  That was the
limitation discussed at [2]: trying to correlate granular groups from an
image to granular groups in a flavor would require nontrivial invention
beyond what we're willing to do at this point.  So we're limited at
spawn time to VFs (or whatever) where we can't tell which trait belongs
to which.  The best we can do is ensure that the end result of the
un-numbered request group will collectively satisfy all the traits from
the image.  And this same limitation exists, for the same reasons, on
rebuild.  It even goes a bit further, because if there are *other* VFs
(or whatever) that came from numbered groups in the original request, we
have no way to know that; so if *those* guys have traits required by the
new image, we'll still pass.  Which is almost certainly okay.

-efried



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-24 Thread Eric Fried
> The problem isn't just checking the traits in the nested resource
> provider. We also need to ensure the trait is in exactly the same child
> resource provider.

No, we can't get "granular" with image traits.  We accepted this as a
limitation for the spawn aspect of this spec [1], for all the same
reasons [2].  And by the time we've spawned the instance, we've lost the
information about which granular request groups (from the flavor) were
satisfied by which resources - retrofitting that information from a new
image would be even harder.  So we need to accept the same limitation
for rebuild.

[1] "Due to the difficulty of attempting to reconcile granular request
groups between an image and a flavor, only the (un-numbered) trait group
is supported. The traits listed there are merged with those of the
un-numbered request group from the flavor."
(http://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/glance-image-traits.html#proposed-change)
[2]
https://review.openstack.org/#/c/554305/2/specs/rocky/approved/glance-image-traits.rst@86



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-23 Thread Eric Fried
> for the GET
> /resource_providers?in_tree=<rp_uuid>&required=<image traits>, nested
> resource providers and allocations pose a problem; see #3 above.

This *would* work as a quick up-front check as Jay described (if you get
no results from this, you know that at least one of your image traits
doesn't exist anywhere in the tree) except that it doesn't take sharing
providers into account :(



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-23 Thread Eric Fried
Following the discussion on IRC, here's what I think you need to do:

- Assuming the set of traits from your new image is called image_traits...
- Use GET /allocations/{instance_uuid} and pull out the set of all RP
UUIDs.  Let's call this instance_rp_uuids.
- Use the SchedulerReportClient.get_provider_tree_and_ensure_root method
[1] to populate and return the ProviderTree for the host.  (If we're
uncomfortable about the `ensure_root` bit, we can factor that away.)
Call this ptree.
- Collect all the traits in the RPs you've got allocated to your instance:

 traits_in_instance_rps = set()
 for rp_uuid in instance_rp_uuids:
     traits_in_instance_rps.update(ptree.data(rp_uuid).traits)

- See if any of your image traits are *not* in those RPs.

 missing_traits = image_traits - traits_in_instance_rps

- If there were any, it's a no go.

 if missing_traits:
     FAIL(_("The following traits were in the image but not in the "
            "instance's RPs: %s") % ', '.join(missing_traits))

[1]
https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py#L986

On 04/23/2018 03:47 PM, Matt Riedemann wrote:
> On 4/23/2018 3:26 PM, Eric Fried wrote:
>> No, the question you're really asking in this case is, "Do the resource
>> providers in this tree contain (or not contain) these traits?"  Which to
>> me, translates directly to:
>>
>>   GET /resource_providers?in_tree=$rp_uuid&required={$TRAIT|!$TRAIT, ...}
>>
>> ...which we already support.  The answer is a list of providers. Compare
>> that to the providers from which resources are already allocated, and
>> Bob's your uncle.
> 
> OK and that will include filtering the required traits on nested
> providers in that tree rather than just against the root provider? If
> so, then yeah that sounds like an improvement on option 2 or 3 in my
> original email and resolves the issue without having to call (or change)
> "GET /allocation_candidates". I still think it should happen from within
> ImagePropertiesFilter, but that's an implementation detail.
> 



Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild

2018-04-23 Thread Eric Fried
Semantically, GET /allocation_candidates where we don't actually want to
allocate anything (i.e. we don't want to use the returned candidates) is
goofy, and talking about what the result would look like when there's no
`resources` is going to spider into some weird questions.

Like what does the response payload look like?  In the "good" scenario,
you would be expecting an allocation_request like:

"allocations": {
$rp_uuid: {
"resources": {
# Nada
}
},
}

...which is something we discussed recently [1] in relation to "anchor"
providers, and killed.

No, the question you're really asking in this case is, "Do the resource
providers in this tree contain (or not contain) these traits?"  Which to
me, translates directly to:

 GET /resource_providers?in_tree=$rp_uuid&required={$TRAIT|!$TRAIT, ...}

...which we already support.  The answer is a list of providers. Compare
that to the providers from which resources are already allocated, and
Bob's your uncle.

(I do find it messy/weird that the required/forbidden traits in the
image meta are supposed to apply *anywhere* in the provider tree.  But I
get that that's probably going to make the most sense.)

[1]
http://lists.openstack.org/pipermail/openstack-dev/2018-April/129408.html
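
Sketched in code, the comparison described above could look roughly like
the following.  The endpoint/token setup is assumed, and the check runs one
query per image trait purely to keep the set logic obvious:

    import requests

    PLACEMENT = 'https://placement.example.com'           # assumed endpoint
    HEADERS = {'X-Auth-Token': '<token>',                  # assumed auth
               'OpenStack-API-Version': 'placement 1.22'}

    def image_traits_ok(root_rp_uuid, instance_uuid, image_traits):
        # Providers the instance currently has allocations against.
        allocs = requests.get(
            '%s/allocations/%s' % (PLACEMENT, instance_uuid),
            headers=HEADERS).json()['allocations']
        allocated_rps = set(allocs)
        # Each required image trait must be present on at least one of them.
        for trait in image_traits:
            resp = requests.get(
                '%s/resource_providers' % PLACEMENT,
                params={'in_tree': root_rp_uuid, 'required': trait},
                headers=HEADERS)
            with_trait = {rp['uuid']
                          for rp in resp.json()['resource_providers']}
            if not with_trait & allocated_rps:
                return False
        return True

(Note this still doesn't account for sharing providers.)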

On 04/23/2018 02:48 PM, Matt Riedemann wrote:
> We seem to be at a bit of an impasse in this spec amendment [1] so I
> want to try and summarize the alternative solutions as I see them.
> 
> The overall goal of the blueprint is to allow defining traits via image
> properties, like flavor extra specs. Those image-defined traits are used
> to filter hosts during scheduling of the instance. During server create,
> that filtering happens during the normal "GET /allocation_candidates"
> call to placement.
> 
> The problem is during rebuild with a new image that specifies new
> required traits. A rebuild is not a move operation, but we run through
> the scheduler filters to make sure the new image (if one is specified),
> is valid for the host on which the instance is currently running.
> 
> We don't currently call "GET /allocation_candidates" during rebuild
> because that could inadvertently filter out the host we know we need
> [2]. Also, since flavors don't change for rebuild, we haven't had a need
> for getting allocation candidates during rebuild since we're not
> allocating new resources (pretend bug 1763766 [3] does not exist for now).
> 
> Now that we know the problem, here are some of the solutions that have
> been discussed in the spec amendment, again, only for rebuild with a new
> image that has new traits:
> 
> 1. Fail in the API saying you can't rebuild with a new image with new
> required traits.
> 
> Pros:
> 
> - Simple way to keep the new image off a host that doesn't support it.
> - Similar solution to volume-backed rebuild with a new image.
> 
> Cons:
> 
> - Confusing user experience since they might be able to rebuild with
> some new images but not others with no clear explanation about the
> difference.
> 
> 2. Have the ImagePropertiesFilter call "GET
> /resource_providers/{rp_uuid}/traits" and compare the compute node root
> provider traits against the new image's required traits.
> 
> Pros:
> 
> - Avoids having to call "GET /allocation_candidates" during rebuild.
> - Simple way to compare the required image traits against the compute
> node provider traits.
> 
> Cons:
> 
> - Does not account for nested providers so the scheduler could reject
> the image due to its required traits which actually apply to a nested
> provider in the tree. This is somewhat related to bug 1763766.
> 
> 3. Slight variation on #2 except build a set of all traits from all
> providers in the same tree.
> 
> Pros:
> 
> - Handles the nested provider traits issue from #2.
> 
> Cons:
> 
> - Duplicates filtering in ImagePropertiesFilter that could otherwise
> happen in "GET /allocation_candidates".
> 
> 4. Add a microversion to change "GET /allocation_candidates" to make two
> changes:
> 
> a) Add an "in_tree" filter like in "GET /resource_providers". This would
> be needed to limit the scope of what gets returned since we know we only
> want to check against one specific host (the current host for the
> instance).
> 
> b) Make "resources" optional since on a rebuild we don't want to
> allocate new resources (again, notwithstanding bug 1763766).
> 
> Pros:
> 
> - We can call "GET /allocation_candidates?in_tree=<current compute node
> UUID>&required=<new image required traits>" and if nothing is returned,
> we know the new image's required traits don't work with the current node.
> - The filtering is baked into "GET /allocation_candidates" and not
> client-side in ImagePropertiesFilter.
> 
> Cons:
> 
> - Changes to the "GET /allocation_candidates" API which is going to be
> more complicated and more up-front work, but I don't have a good idea of
> how hard this would be to add since we already have the same "in_tree"
> logic in "GET 

Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

2018-04-19 Thread Eric Fried
Thanks to everyone who contributed to this discussion.  With just a
teeny bit more bikeshedding on the exact syntax [1], we landed on:

group_policy={none|isolate}

I have proposed this delta to the granular spec [2].

-efried

[1]
http://p.anticdent.org/logs/openstack-placement?dated=2018-04-19%2013:48:39.213790#a1c
[2] https://review.openstack.org/#/c/562687/
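
For illustration, with that syntax a request wanting two VFs that must come
from two different providers would look like:

    GET /allocation_candidates
        ?resources1=SRIOV_NET_VF:1&required1=CUSTOM_PHYSNET_A
        &resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_B
        &group_policy=isolate

whereas group_policy=none would also allow both groups to be satisfied by a
single provider that happens to have both traits.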

On 04/19/2018 07:38 AM, Balázs Gibizer wrote:
> 
> 
> On Thu, Apr 19, 2018 at 2:27 PM, Eric Fried <openst...@fried.cc> wrote:
>> gibi-
>>
>>>  Can the proximity param specify relationship between the un-numbered
>>> and
>>>  the numbered groups as well or only between numbered groups?
>>>  Besides that I'm +1 about proxyimity={isolate|any}
>>
>> Remembering that the resources in the un-numbered group can be spread
>> around the tree and sharing providers...
>>
>> If applying "isolate" to the un-numbered group means that each resource
>> you specify therein must be satisfied by a different provider, then you
>> should have just put those resources into numbered groups.
>>
>> If "isolate" means that *none* of the numbered groups will land on *any*
>> of the providers satisfying the un-numbered group... that could be hard
>> to reason about, and I don't know if it's useful.
>>
>> So thus far I've been thinking about all of these semantics only in
>> terms of the numbered groups (although Jay's `can_split` was
>> specifically aimed at the un-numbered group).
> 
> Thanks for the explanation. Now it make sense to me to limit the
> proximity param to the numbered groups.
> 
>>
>> That being the case (is that a bikeshed on the horizon?) perhaps
>> `granular_policy={isolate|any}` is a more appropriate name than
>> `proximity`.
> 
> The policy term is more general than proximity therefore the
> granular_policy=any query fragment isn't descriptive enough any more.
> 
> 
> gibi
> 
>>
>> -efried
>>
> 
> 



Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

2018-04-19 Thread Eric Fried
Sylvain-

> What's the default behaviour if we aren't providing the proximity qparam
> ? Isolate or any ?

What we've been talking about, per mriedem's suggestion, is that the
qparam is required when you specify any numbered request groups.  There
is no default.  If you don't provide the qparam, 400.

(Edge case: the qparam is meaningless if you only provide *one* numbered
request group - assuming it has no bearing on the un-numbered group.  In
that case omitting it might be acceptable... or 400 for consistency.)

-efried



Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

2018-04-19 Thread Eric Fried
Chris-

Thanks for this perspective.  I totally agree.

> * the common behavior should require the least syntax.

To that point, I had been assuming "any fit" was going to be more common
than "explicit anti-affinity".  But I think this is where we are having
trouble agreeing.  So since, as you point out, we're in the weeds to
begin with when talking about nested, IMO mriedem's suggestion (no
default, require behavior to be specified) is a reasonable compromise.

> it'll be okay. Let's not maintain this painful illusion that we're
> writing stone tablets.

This.  I, for one, was being totally guilty of that.

-efried



Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

2018-04-19 Thread Eric Fried
gibi-

> Can the proximity param specify relationship between the un-numbered and
> the numbered groups as well or only between numbered groups?
> Besides that I'm +1 about proxyimity={isolate|any}

Remembering that the resources in the un-numbered group can be spread
around the tree and sharing providers...

If applying "isolate" to the un-numbered group means that each resource
you specify therein must be satisfied by a different provider, then you
should have just put those resources into numbered groups.

If "isolate" means that *none* of the numbered groups will land on *any*
of the providers satisfying the un-numbered group... that could be hard
to reason about, and I don't know if it's useful.

So thus far I've been thinking about all of these semantics only in
terms of the numbered groups (although Jay's `can_split` was
specifically aimed at the un-numbered group).

That being the case (is that a bikeshed on the horizon?) perhaps
`granular_policy={isolate|any}` is a more appropriate name than `proximity`.

-efried



Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

2018-04-18 Thread Eric Fried
> I have a feeling we're just going to go back and forth on this, as we
> have for weeks now, and not reach any conclusion that is satisfactory to
> everyone. And we'll delay, yet again, getting functionality into this
> release that serves 90% of use cases because we are obsessing over the
> 0.01% of use cases that may pop up later.

So I vote that, for the Rocky iteration of the granular spec, we add a
single `proximity={isolate|any}` qparam, required when any numbered
request groups are specified.  I believe this allows us to satisfy the
two NUMA use cases we care most about: "forced sharding" and "any fit".
And as you demonstrated, it leaves the way open for finer-grained and
more powerful semantics to be added in the future.

-efried



Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

2018-04-18 Thread Eric Fried
Sorry, addressing gaffe, bringing this back on-list...

On 04/18/2018 04:36 PM, Ed Leafe wrote:
> On Apr 18, 2018, at 4:11 PM, Eric Fried <openst...@fried.cc> wrote:
>>> That makes a lot of sense. Since we are already suffixing the query param 
>>> “resources” to indicate granular, why not add a clarifying term to that 
>>> suffix? E.g., “resources1=“ -> “resources1d” (for ‘different’). The exact 
>>> string we use can be bike shedded, but requiring it be specified sounds 
>>> pretty sane to me.
>>  I'm not understanding what you mean here.  The issue at hand is how
>> numbered groups interact with *each other*.  If I said
>> resources1s=...&resources2d=..., what am I saying about whether the
>> resources in group 1 can or can't land in the same RP as those of group 2?
> OK, sorry. What I meant by the ‘d’ was that that group’s resources must be 
> from a different provider than any other group’s resources (anti-affinity). 
> So in your example, you don’t care if group1 is from the same provider, but 
> you do with group2, so that’s kind of a contradictory set-up (unless you had 
> other groups).
>
> Instead, if the example were changed to 
> resources1s=...&resources2d=...&resources3s=..., then groups 1 and 3 could be 
> allocated from the same provider.
>
> -- Ed Leafe

This is a cool idea.  It doesn't allow the same level of granularity as
being able to list explicit group numbers to be [anti-]affinitized with
specific other groups - but I'm not sure we need that.  I would have to
think through the use cases with this in mind.

-efried



Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

2018-04-18 Thread Eric Fried
> Cool. So let's not use a GET for this and instead change it to a POST
> with a request body that can more cleanly describe what the user is
> requesting, which is something we talked about a long time ago.

I kinda doubt we could agree on a format for this in the Rocky
timeframe.  But for the sake of curiosity, I'd like to see some strawman
proposals for what that request body would look like.  Here's a couple
off the top:

{
  "anti-affinity": [
  {
  "resources": { $RESOURCE_CLASS: amount, ... },
  "required": [ $TRAIT, ... ],
  "forbidden": [ $TRAIT, ... ],
  },
  ...
  ],
  "affinity": [
  ...
  ],
  "any fit": [
  ...
  ],
}

Or maybe:

{
  $ARBITRARY_USER_SPECIFIED_KEY_DESCRIBING_THE_GROUP: {
  "resources": { $RESOURCE_CLASS: amount, ... },
  "required": [ $TRAIT, ... ],
  "forbidden": [ $TRAIT, ... ],
  },
  ...
  "affinity_spec": {
  "isolate": [ $ARBITRARY_KEY, ... ],
  "any": [ $ARBITRARY_KEY, ... ],
  "common_subtree_by_trait": {
  "groups": [ $ARBITRARY_KEY, ... ],
  "traits": [ $TRAIT, ... ],
  },
  }
}

(I think we also now need to fold multiple `member_of` in there somehow.
 And `limit` - does that stay in the querystring?  Etc.)

-efried



Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

2018-04-18 Thread Eric Fried
Chris-

Going to accumulate a couple of your emails and answer them.  I could
have answered them separately (anti-affinity).  But in this case I felt
it appropriate to provide responses in a single note (best fit).

> I'm a bit conflicted.  On the one hand...

> On the other hand,

Right; we're in agreement that we need to handle both.

> I'm half tempted to side with mriedem and say that there is no default
> and it must be explicit, but I'm concerned that this would make the
> requests a lot larger if you have to specify it for every resource. 
and
> The request might get unwieldy if we have to specify affinity/anti-
> affinity for each resource.  Maybe you could specify the default for
> the request and then optionally override it for each resource?

Yes, good call.  I'm favoring this as a first pass.  See my other response.

> In either viewpoint, is there a way to represent "I want two resource
> groups, with resource X in each group coming from different resource
> providers (anti-affinity) and resource Y from the same resource provider
> (affinity)?

As proposed, yes.  Though if we go with the above (one flag to specify
request-wide behavior) then there wouldn't be that ability beyond
putting things in the un-numbered vs. numbered groups.  So I guess my
question is: do we have a use case *right now* that requires supporting
"isolate for some, unrestricted for others"?

> I'm not current on the placement implementation details, but would
> this level of flexibility cause complexity problems in the code?

Oh, implementing this is complex af.  Here's what it takes *just* to
satisfy the "any fit" version:

https://review.openstack.org/#/c/517757/10/nova/api/openstack/placement/objects/resource_provider.py@3599

I've made some progress implementing "proximity=isolate:X,Y,..." in my
sandbox, and that's even hairier.  Doing "proximity=isolate"
(request-wide policy) would be a little easier.

-efried



Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

2018-04-18 Thread Eric Fried
I can't tell if you're being facetious, but this seems sane, albeit
complex.  It's also extensible as we come up with new and wacky affinity
semantics we want to support.

I can't say I'm sold on requiring `proximity` qparams that cover every
granular group - that seems like a pretty onerous burden to put on the
user right out of the gate.  That said, the idea of not having a default
is quite appealing.  Perhaps as a first pass we can require a single
?proximity={isolate|any} and build on it to support group numbers (etc.)
in the future.

One other thing inline below, not related to the immediate subject.

On 04/18/2018 12:40 PM, Jay Pipes wrote:
> On 04/18/2018 11:58 AM, Matt Riedemann wrote:
>> On 4/18/2018 9:06 AM, Jay Pipes wrote:
>>> "By default, should resources/traits submitted in different numbered
>>> request groups be supplied by separate resource providers?"
>>
>> Without knowing all of the hairy use cases, I'm trying to channel my
>> inner sdague and some of the similar types of discussions we've had to
>> changes in the compute API, and a lot of the time we've agreed that we
>> shouldn't assume a default in certain cases.
>>
>> So for this case, if I'm requesting numbered request groups, why
>> doesn't the API just require that I pass a query parameter telling it
>> how I'd like those requests to be handled, either via affinity or
>> anti-affinity
> So, you're thinking maybe something like this?
> 
> 1) Get me two dedicated CPUs. One of those dedicated CPUs must have AVX2
> capabilities. They must be on different child providers (different NUMA
> cells that are providing those dedicated CPUs).
> 
> GET /allocation_candidates?
>  resources1=PCPU:1&required1=HW_CPU_X86_AVX2
>  &resources2=PCPU:1
>  &proximity=isolate:1,2
> 
> 2) Get me four dedicated CPUs. Two of those dedicated CPUs must have
> AVX2 capabilities. Two of the dedicated CPUs must have the SSE 4.2
> capability. They may come from the same provider (NUMA cell) or
> different providers.
> 
> GET /allocation_candidates?
>  resources1=PCPU:2&required1=HW_CPU_X86_AVX2
>  &resources2=PCPU:2&required2=HW_CPU_X86_SSE42
>  &proximity=any:1,2
> 
> 3) Get me 2 dedicated CPUs and 2 SR-IOV VFs. The VFs must be provided by
> separate physical function providers which have different traits marking
> separate physical networks. The dedicated CPUs must come from the same
> provider tree in which the physical function providers reside.
> 
> GET /allocation_candidates?
>  resources1=PCPU:2
>  &resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
>  &resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
>  &proximity=isolate:2,3
>  &proximity=same_tree:1,2,3
> 
> 3) Get me 2 dedicated CPUs and 2 SR-IOV VFs. The VFs must be provided by
> separate physical function providers which have different traits marking
> separate physical networks. The dedicated CPUs must come from the same
> provider *subtree* in which the second group of VF resources are sourced.
> 
> GET /allocation_candidates?
>  resources1=PCPU:2
>  &resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
>  &resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
>  &proximity=isolate:2,3
>  &proximity=same_subtree:1,3

The 'same_subtree' concept requires a way to identify how far up the
common ancestor can be.  Otherwise, *everything* is in the same subtree.
 You could arbitrarily say "one step down from the root", but that's not
very flexible.  Allowing the user to specify a *number* of steps down
from the root is getting closer, but it requires the user to have an
understanding of the provider tree's exact structure, which is not ideal.

The idea I've been toying with here is "common ancestor by trait".  For
example, you would tag your NUMA node providers with trait NUMA_ROOT,
and then your request would include:

  ...
  &proximity=common_ancestor_by_trait:NUMA_ROOT:1,3

> 
> 4) Get me 4 SR-IOV VFs. 2 VFs should be sourced from a provider that is
> decorated with the CUSTOM_PHYSNET_A trait. 2 VFs should be sourced from
> a provider that is decorated with the CUSTOM_PHYSNET_B trait. For HA
> purposes, none of the VFs should be sourced from the same provider.
> However, the VFs for each physical network should be within the same
> subtree (NUMA cell) as each other.
> 
> GET /allocation_candidates?
> 
>  resources1=SRIOV_NET_VF:1&required1=CUSTOM_PHYSNET_A
> &resources2=SRIOV_NET_VF:1&required2=CUSTOM_PHYSNET_A
> &resources3=SRIOV_NET_VF:1&required3=CUSTOM_PHYSNET_B
> &resources4=SRIOV_NET_VF:1&required4=CUSTOM_PHYSNET_B
> &proximity=isolate:1,2,3,4
> &proximity=same_subtree:1,2
> &proximity=same_subtree:3,4
> 
> We can go even deeper if you'd like, since NFV means "never-ending
> feature velocity". Just let me know.
> 
> -jay
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [placement][nova] Decision time on granular request groups for like resources

2018-04-18 Thread Eric Fried
Thanks for describing the proposals clearly and concisely, Jay.

My preamble would have been that we need to support two use cases:

- "explicit anti-affinity": make sure certain parts of my request land
on *different* providers;
- "any fit": make sure my instance lands *somewhere*.

Both proposals address both use cases, but in different ways.

> "By default, should resources/traits submitted in different numbered
> request groups be supplied by separate resource providers?"

I agree this question needs to be answered, but that won't necessarily
inform which path we choose.  Viewpoint B [3] is set up to go either
way: either we're unrestricted by default and use a queryparam to force
separation; or we're split by default and use a queryparam to allow the
unrestricted behavior.
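
To illustrate with the ?proximity spelling discussed elsewhere in this
thread (the exact queryparam name and syntax are still unsettled), the
same two-group request under each default might look like:

  # unrestricted by default; separation must be requested explicitly:
  GET /allocation_candidates?resources1=VCPU:1&resources2=VCPU:1
      &proximity=isolate:1,2

  # separate by default; collapsing must be allowed explicitly:
  GET /allocation_candidates?resources1=VCPU:1&resources2=VCPU:1
      &proximity=any:1,2

Both behaviors are expressible either way; the open question is only what
you get when you don't say anything.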

Otherwise I agree with everything Jay said.

-efried

On 04/18/2018 09:06 AM, Jay Pipes wrote:
> Stackers,
> 
> Eric Fried and I are currently at an impasse regarding a decision that
> will have far-reaching (and end-user facing) impacts to the placement
> API and how nova interacts with the placement service from the nova
> scheduler.
> 
> We need to make a decision regarding the following question:
> 
> "By default, should resources/traits submitted in different numbered
> request groups be supplied by separate resource providers?"
> 
> There are two competing proposals right now (both being amendments to
> the original granular request groups spec [1]) which outline two
> different viewpoints.
> 
> Viewpoint A [2], from me, is that like resources listed in different
> granular request groups should mean that those resources will be sourced
> from *different* resource providers.
> 
> In other words, if I issue the following request:
> 
> GET /allocation_candidates?resources1=VCPU:1&resources2=VCPU:1
> 
> Then I am assured of getting allocation candidates that contain 2
> distinct resource providers consuming 1 VCPU from each provider.
> 
> Viewpoint B [3], from Eric, is that like resources listed in different
> granular request groups should not necessarily mean that those resources
> will be sourced from different resource providers. They *could* be
> sourced from different providers, or they could be sourced from the same
> provider.
> 
> Both proposals include ways to specify whether certain resources or
> whole request groups can be forced to be sources from either a single
> provider or from different providers.
> 
> In Viewpoint A, the proposal is to have a can_split=RESOURCE1,RESOURCE2
> query parameter that would indicate which resource classes in the
> unnumbered request group that may be split across multiple providers
> (remember that viewpoint A considers different request groups to
> explicitly mean different providers, so it doesn't make sense to have a
> can_split query parameter for numbered request groups).
> 
> In Viewpoint B, the proposal is to have a separate_providers=1,2 query
> parameter that would indicate that the identified request groups should
> be sourced from separate providers. Request groups that are not listed
> in the separate_providers query parameter are not guaranteed to be
> sourced from different providers.
> 
> I know this is a complex subject, but I thought it was worthwhile trying
> to explain the two proposals in as clear terms as I could muster.
> 
> I'm, quite frankly, a bit on the fence about the whole thing and would
> just like to have a clear path forward so that we can start landing the
> 12+ patches that are queued up waiting for a decision on this.
> 
> Thoughts and opinions welcome.
> 
> Thanks,
> -jay
> 
> 
> [1]
> http://specs.openstack.org/openstack/nova-specs/specs/rocky/approved/granular-resource-requests.html
> 
> 
> [2] https://review.openstack.org/#/c/560974/
> 
> [3] https://review.openstack.org/#/c/561717/
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [placement] Anchor/Relay Providers

2018-04-16 Thread Eric Fried
> I still don't see a use in returning the root providers in the
> allocation requests -- since there is nothing consuming resources from
> those providers.
> 
> And we already return the root_provider_uuid for all providers involved
> in allocation requests within the provider_summaries section.
> 
> So, I can kind of see where we might want to change *this* line of the
> nova scheduler:
> 
> https://github.com/openstack/nova/blob/stable/pike/nova/scheduler/filter_scheduler.py#L349
> 
> 
> from this:
> 
>  compute_uuids = list(provider_summaries.keys())
> 
> to this:
> 
>  compute_uuids = set([
>  ps['root_provider_uuid'] for ps in provider_summaries
>  ])

If we're granting that it's possible to get all your resources from
sharing providers, the above doesn't help you to know which of your
compute_uuids belongs to which of those sharing-only allocation requests.

I'm fine deferring this part until we have a use case for sharing-only
allocation requests that aren't prompted by an "attach-*" case where we
already know the target host/consumer.  But I'd like to point out that
there's nothing in the API that prevents us from getting such results.

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [placement] Anchor/Relay Providers

2018-04-16 Thread Eric Fried
I was presenting an example using VM-ish resource classes, because I can
write them down and everybody knows what I'm talking about without me
having to explain what they are.  But remember we want placement to be
usable outside of Nova as well.

But also, I thought we had situations where the VCPU and MEMORY_MB were
themselves provided by sharing providers, associated with a compute host
RP that may be itself devoid of inventory.  (This may even be a viable
way to model VMWare's clustery things today.)

-efried

On 04/16/2018 01:58 PM, Jay Pipes wrote:
> Sorry it took so long to respond. Comments inline.
> 
> On 03/30/2018 08:34 PM, Eric Fried wrote:
>> Folks who care about placement (but especially Jay and Tetsuro)-
>>
>> I was reviewing [1] and was at first very unsatisfied that we were not
>> returning the anchor providers in the results.  But as I started digging
>> into what it would take to fix it, I realized it's going to be
>> nontrivial.  I wanted to dump my thoughts before the weekend.
>>
>> 
>> It should be legal to have a configuration like:
>>
>>  #    CN1 (VCPU, MEMORY_MB)
>>  #    /  \
>>  #   /agg1    \agg2
>>  #  /  \
>>  # SS1    SS2
>>  #  (DISK_GB)  (IPV4_ADDRESS)
>>
>> And make a request for DISK_GB,IPV4_ADDRESS;
>> And have it return a candidate including SS1 and SS2.
>>
>> The CN1 resource provider acts as an "anchor" or "relay": a provider
>> that doesn't provide any of the requested resource, but connects to one
>> or more sharing providers that do so.
> 
> To be honest, such a request just doesn't make much sense to me.
> 
> Think about what that is requesting. I want some DISK_GB resources and
> an IP address. For what? What is going to be *using* those resources?
> 
> Ah... a virtual machine. In other words, something that would *also* be
> requesting some CPU and memory resources as well.
> 
> So, the request is just fatally flawed, IMHO. It doesn't represent a use
> case from the real world.
> 
> I don't believe we should be changing placement (either the REST API or
> the implementation of allocation candidate retrieval) for use cases that
> don't represent real-world requests.
> 
> Best,
> -jay
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Deployers] Optional, platform specific, dependancies in requirements.txt

2018-04-12 Thread Eric Fried
> Is avoiding three lines of code really worth making future cleanup
> harder? Is a three line change really blocking "an approved blueprint
> with ready code"?

Nope.  What's blocking is deciding that that's the right thing to do.
Which we clearly don't have consensus on, based on what's happening in
this thread.

> global ironic
> if ironic is None:
> ironic = importutils.import_module('ironicclient')

I have a pretty strong dislike for this mechanism.  For one thing, I'm
frustrated when I can't use hotkeys to jump to an ironicclient method
because my IDE doesn't recognize that dynamic import.  I have to go look
up the symbol some other way (and hope I'm getting the right one).  To
me (with my bias as a dev rather than a deployer) that's way worse than
having the 704KB python-ironicclient installed on my machine even though
I've never spawned an ironic VM in my life.

It should also be noted that python-ironicclient is in
test-requirements.txt.

Not that my personal preference ought to dictate or even influence what
we decide to do here.  But dynamic import is not the obviously correct
choice.
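
For anyone who hasn't seen the two styles side by side, the trade-off is
roughly this (a minimal sketch, not nova's actual driver code; the module
and helper names are illustrative):

    from oslo_utils import importutils

    # Eager, top-level import: pip has to install the client library for
    # every deployment, but IDEs and static analysis can follow the
    # symbols:
    #
    #     import ironicclient.client
    #
    # Lazy import at first use: the library can stay out of
    # requirements.txt, at the cost of runtime-only resolution:

    ironicclient = None

    def _load_client_module():
        """Import the client module on first use and cache it."""
        global ironicclient
        if ironicclient is None:
            ironicclient = importutils.import_module('ironicclient')
        return ironicclient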

-efried

On 04/12/2018 03:28 PM, Michael Still wrote:
> I don't understand why you think the alternative is so hard. Here's how
> ironic does it:
> 
>         global ironic
> 
>         if ironic is None:
> 
>             ironic = importutils.import_module('ironicclient')
> 
> 
> Is avoiding three lines of code really worth making future cleanup
> harder? Is a three line change really blocking "an approved blueprint
> with ready code"?
> 
> Michael
> 
> 
> 
> On Thu, Apr 12, 2018 at 10:42 PM, Eric Fried <openst...@fried.cc
> <mailto:openst...@fried.cc>> wrote:
> 
> +1
> 
> This sounds reasonable to me.  I'm glad the issue was raised, but IMO it
> shouldn't derail progress on an approved blueprint with ready code.
> 
> Jichen, would you please go ahead and file that blueprint template (no
> need to write a spec yet) and link it in a review comment on the bottom
> zvm patch so we have a paper trail?  I'm thinking something like
> "Consistent platform-specific and optional requirements" -- that leaves
> us open to decide *how* we're going to "handle" them.
> 
> Thanks,
> efried
> 
> On 04/12/2018 04:13 AM, Chen CH Ji wrote:
> > Thanks for Michael for raising this question and detailed information
> > from Clark
> >
> > As indicated in the mail, xen, vmware etc might already have this kind
> > of requirements (and I guess might be more than that) ,
> > can we accept z/VM requirements first by following other existing ones
> > then next I can create a BP later to indicate this kind
> > of change request by referring to Clark's comments and submit patches to
> > handle it ? Thanks
> >
> > Best Regards!
> >
> > Kevin (Chen) Ji 纪 晨
> >
> > Engineer, zVM Development, CSTL
> > Notes: Chen CH Ji/China/IBM@IBMCN Internet: jiche...@cn.ibm.com 
> <mailto:jiche...@cn.ibm.com>
> > Phone: +86-10-82451493
> > Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian
> > District, Beijing 100193, PRC
> >
> > Inactive hide details for Matt Riedemann ---04/12/2018 08:46:25 AM---On
> > 4/11/2018 5:09 PM, Michael Still wrote: >Matt Riedemann ---04/12/2018
> > 08:46:25 AM---On 4/11/2018 5:09 PM, Michael Still wrote: >
> >
> > From: Matt Riedemann <mriede...@gmail.com <mailto:mriede...@gmail.com>>
> > To: openstack-dev@lists.openstack.org
> <mailto:openstack-dev@lists.openstack.org>
> > Date: 04/12/2018 08:46 AM
> > Subject: Re: [openstack-dev] [Nova][Deployers] Optional, platform
> > specific, dependancies in requirements.txt
> >
> >
> 
> >
> >
> >
> > On 4/11/2018 5:09 PM, Michael Still wrote:
> >>
> >>
> >
> 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__review.openstack.org_-23_c_523387=DwIGaQ=jf_iaSHvJObTbx-siA1ZOg=8sI5aZT88Uetyy_XsOddbPjIiLSGM-sFnua3lLy2Xr0=212PUwLYOBlJZ3BiZNuJIFkRfqXoBPJDcWYCDk7vCHg=CNosrTHnAR21zOI52fnDRfTqu2zPiAn2oW9f67Qijo4=
> 
> <https://urldefense.proofpoint.com/v2/url?u=https-3A__review.openstack.org_-23_c_523387=DwIGaQ=jf_iaSHvJObTbx-siA1ZOg=8sI5aZT88Uetyy_XsOddbPjIiLSGM-sFnua3lLy2Xr0=212PUwLYOBlJZ3BiZNuJIFkRfqXoBPJDcWYCDk7vCHg=CNosrTHnAR21zOI52fnDRfTqu2zPiAn2oW9f67Qijo4=>
>  proposes
> > adding a z/VM specific

Re: [openstack-dev] [Nova][Deployers] Optional, platform specific, dependancies in requirements.txt

2018-04-12 Thread Eric Fried
+1

This sounds reasonable to me.  I'm glad the issue was raised, but IMO it
shouldn't derail progress on an approved blueprint with ready code.

Jichen, would you please go ahead and file that blueprint template (no
need to write a spec yet) and link it in a review comment on the bottom
zvm patch so we have a paper trail?  I'm thinking something like
"Consistent platform-specific and optional requirements" -- that leaves
us open to decide *how* we're going to "handle" them.

Thanks,
efried

On 04/12/2018 04:13 AM, Chen CH Ji wrote:
> Thanks for Michael for raising this question and detailed information
> from Clark
> 
> As indicated in the mail, xen, vmware etc might already have this kind
> of requirements (and I guess might be more than that) ,
> can we accept z/VM requirements first by following other existing ones
> then next I can create a BP later to indicate this kind
> of change request by referring to Clark's comments and submit patches to
> handle it ? Thanks
> 
> Best Regards!
> 
> Kevin (Chen) Ji 纪 晨
> 
> Engineer, zVM Development, CSTL
> Notes: Chen CH Ji/China/IBM@IBMCN Internet: jiche...@cn.ibm.com
> Phone: +86-10-82451493
> Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian
> District, Beijing 100193, PRC
> 
> Inactive hide details for Matt Riedemann ---04/12/2018 08:46:25 AM---On
> 4/11/2018 5:09 PM, Michael Still wrote: >Matt Riedemann ---04/12/2018
> 08:46:25 AM---On 4/11/2018 5:09 PM, Michael Still wrote: >
> 
> From: Matt Riedemann 
> To: openstack-dev@lists.openstack.org
> Date: 04/12/2018 08:46 AM
> Subject: Re: [openstack-dev] [Nova][Deployers] Optional, platform
> specific, dependancies in requirements.txt
> 
> 
> 
> 
> 
> On 4/11/2018 5:09 PM, Michael Still wrote:
>>
>>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__review.openstack.org_-23_c_523387=DwIGaQ=jf_iaSHvJObTbx-siA1ZOg=8sI5aZT88Uetyy_XsOddbPjIiLSGM-sFnua3lLy2Xr0=212PUwLYOBlJZ3BiZNuJIFkRfqXoBPJDcWYCDk7vCHg=CNosrTHnAR21zOI52fnDRfTqu2zPiAn2oW9f67Qijo4=
>  proposes
> adding a z/VM specific
>> dependancy to nova's requirements.txt. When I objected the counter
>> argument is that we have examples of windows specific dependancies
>> (os-win) and powervm specific dependancies in that file already.
>>
>> I think perhaps all three are a mistake and should be removed.
>>
>> My recollection is that for drivers like ironic which may not be
>> deployed by everyone, we have the dependancy documented, and then loaded
>> at runtime by the driver itself instead of adding it to
>> requirements.txt. This is to stop pip for auto-installing the dependancy
>> for anyone who wants to run nova. I had assumed this was at the request
>> of the deployer community.
>>
>> So what do we do with z/VM? Do we clean this up? Or do we now allow
>> dependancies that are only useful to a very small number of deployments
>> into requirements.txt?
> 
> As Eric pointed out in the review, this came up when pypowervm was added:
> 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__review.openstack.org_-23_c_438119_5_requirements.txt=DwIGaQ=jf_iaSHvJObTbx-siA1ZOg=8sI5aZT88Uetyy_XsOddbPjIiLSGM-sFnua3lLy2Xr0=212PUwLYOBlJZ3BiZNuJIFkRfqXoBPJDcWYCDk7vCHg=iyKxF-CcGAFmnQs8B7d5u2zwEiJqq8ivETmrgB77PEg=
> 
> And you're asking the same questions I did in there, which was, should
> it go into test-requirements.txt like oslo.vmware and
> python-ironicclient, or should it go under [extras], or go into
> requirements.txt like os-win (we also have the xenapi library now too).
> 
> I don't really think all of these optional packages should be in
> requirements.txt, but we should just be consistent with whatever we do,
> be that test-requirements.txt or [extras]. I remember caring more about
> this back in my rpm packaging days when we actually tracked what was in
> requirements.txt to base what needed to go into the rpm spec, unlike
> Fedora rpm specs which just zero out requirements.txt and depend on
> their own knowledge of what needs to be installed (which is sometimes
> lacking or lagging master).
> 
> I also seem to remember that [extras] was less than user-friendly for
> some reason, but maybe that was just because of how our CI jobs are
> setup? Or I'm just making that up. I know it's pretty simple to install
> the stuff from extras for tox runs, it's just an extra set of
> dependencies to list in the tox.ini.
> 
> Having said all this, I don't have the energy to help push for
> consistency myself, but will happily watch you from the sidelines.
> 
> -- 
> 
> Thanks,
> 
> Matt
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> 

Re: [openstack-dev] [nova] Changes to ComputeVirtAPI.wait_for_instance_event

2018-04-11 Thread Eric Fried
Jichen was able to use this information immediately, to great benefit
[1].  (If those paying attention could have a quick look at that to make
sure he used it right, it would be appreciated; I'm not an expert here.)

[1]
https://review.openstack.org/#/c/527658/31..32/nova/virt/zvm/guest.py@192
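
For anyone wanting to apply the same pattern, the usage boils down to
roughly the following (a sketch only, based on how other drivers use
ComputeVirtAPI.wait_for_instance_event; the function name, deadline value
and callback behavior are illustrative, not a prescription):

    from oslo_log import log as logging

    LOG = logging.getLogger(__name__)

    def wait_for_vifs_plugged(virtapi, instance, network_info, plug_vifs,
                              deadline=300):
        """Plug VIFs and block until neutron reports network-vif-plugged."""
        events = [('network-vif-plugged', vif['id'])
                  for vif in network_info if not vif.get('active', True)]

        def _error_cb(event_name, inst):
            LOG.warning('Timed out or errored waiting for event %s',
                        event_name)

        with virtapi.wait_for_instance_event(instance, events,
                                             deadline=deadline,
                                             error_callback=_error_cb):
            plug_vifs(instance, network_info)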

On 04/10/2018 09:06 PM, Chen CH Ji wrote:
> Thanks for your info ,really helpful
> 
> Best Regards!
> 
> Kevin (Chen) Ji 纪 晨
> 
> Engineer, zVM Development, CSTL
> Notes: Chen CH Ji/China/IBM@IBMCN Internet: jiche...@cn.ibm.com
> Phone: +86-10-82451493
> Address: 3/F Ring Building, ZhongGuanCun Software Park, Haidian
> District, Beijing 100193, PRC
> 
> Inactive hide details for Andreas Scheuring ---04/10/2018 10:19:21
> PM---Yes, that’s how it works! ---Andreas Scheuring ---04/10/2018
> 10:19:21 PM---Yes, that’s how it works! ---
> 
> From: Andreas Scheuring 
> To: "OpenStack Development Mailing List (not for usage questions)"
> 
> Date: 04/10/2018 10:19 PM
> Subject: Re: [openstack-dev] [nova] Changes
> to ComputeVirtAPI.wait_for_instance_event
> 
> 
> 
> 
> 
> Yes, that’s how it works!
> 
> ---
> Andreas Scheuring (andreas_s)
> 
> 
> 
> On 10. Apr 2018, at 16:05, Matt Riedemann <mriedemos@gmail.com> wrote:
> 
> On 4/9/2018 9:57 PM, Chen CH Ji wrote:
> 
> Could you please help to share whether this kind of event is
> sent by neutron-server or neutron agent ? I searched neutron code
> from [1][2] this means the agent itself need tell neutron server
> the device(VIF) is up then neutron server will send notification
> to nova through REST API and in turn consumed by compute node?
> 
> [1] https://github.com/openstack/neutron/tree/master/neutron/notify_port_active_direct
> 
> 
> 
> [2] https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/rpc.py#L264
> 
> 
> 
> 
> I believe the neutron agent is the one that is getting (or polling) the
> information from the underlying network backend when VIFs are plugged or
> unplugged from a host, then route that information via RPC to the
> neutron server which then sends an os-server-external-events request to
> the compute REST API, which then routes the event information down to
> the nova-compute host where the instance is currently running.
> 
> -- 
> 
> Thanks,
> 
> Matt
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> https://urldefense.proofpoint.com/v2/url?u=http-3A__lists.openstack.org_cgi-2Dbin_mailman_listinfo_openstack-2Ddev=DwIGaQ=jf_iaSHvJObTbx-siA1ZOg=8sI5aZT88Uetyy_XsOddbPjIiLSGM-sFnua3lLy2Xr0=tIntFpZ0ffp-_h5CsqN1I9tv64hW2xugxBXaxDn7Z_I=z2jOgMD7B3XFoNsUHTtIO6hWKYXH-Dm4L4P0-u-oSSw=
> 
> 
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] [placement] placement update 18-14

2018-04-06 Thread Eric Fried
>> it's really on nested allocation candidates.
> 
> Yup. And that series is deadlocked on a disagreement about whether
> granular request groups should be "separate by default" (meaning: if you
> request multiple groups of resources, the expectation is that they will
> be served by distinct resource providers) or "unrestricted by default"
> (meaning: if you request multiple groups of resources, those resources
> may or may not be serviced by distinct resource providers).

This is really a granular thing, not a nested thing.  I was holding up
the nrp-in-alloc-cands spec [1] for other reasons, but I've stopped
doing that now.  We should be able to proceed with the nrp work.  I'm
working on the granular code, wherein I can hopefully isolate the
separate-vs-unrestricted decision such that we can go either way once
that issue is resolved.
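
By "isolate the decision" I mean something of this shape (a hypothetical
sketch, not the actual patch), so that flipping the default later is a
one-line change rather than a rework:

    # Hypothetical: keep the default policy in exactly one place.
    SEPARATE_BY_DEFAULT = True  # flip whenever the spec debate settles

    def groups_must_be_isolated(requested_policy=None):
        """Return True if numbered groups must use distinct providers."""
        if requested_policy is not None:
            return requested_policy == 'isolate'
        return SEPARATE_BY_DEFAULT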

[1] https://review.openstack.org/#/c/556873/

>> Some negotiation happened with regard to when/if the fixes for
>> shared providers is going to happen. I'm not sure how that resolved,
>> if someone can follow up with that, that would be most excellent.

This is the subject of another thread [2] that's still "dangling".  We
discussed it in the sched meeting this week [3] and concluded [4] that
we shouldn't do it in Rocky.  BUT tetsuro later pointed out that part of
the series in question [5] is still needed to satisfy NRP-in-alloc-cands
(return the whole tree's providers in provider_summaries - even the ones
that aren't providing resource to the request).  That patch changes
behavior, so needs a microversion (mostly done already in that patch),
so needs a spec.  We haven't yet resolved whether this is truly needed,
so haven't assigned a body to the spec work.  I believe Jay is still
planning [6] to parse and respond to the ML thread.  After he clones
himself.

[2]
http://lists.openstack.org/pipermail/openstack-dev/2018-March/128944.html
[3]
http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-04-02-14.00.log.html#l-91
[4]
http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-04-02-14.00.log.html#l-137
[5] https://review.openstack.org/#/c/558045/
[6]
http://eavesdrop.openstack.org/meetings/nova_scheduler/2018/nova_scheduler.2018-04-02-14.00.log.html#l-104

>> * Shared providers status?
>>    (I really think we need to make this go. It was one of the
>>    original value propositions of placement: being able to accurate
>>    manage shared disk.)
> 
> Agreed, but you know NUMA. And CPU pinning. And vGPUs. And FPGAs.
> And physnet network bandwidth scheduling. And... well, you get the idea.

Right.  I will say that Tetsuro has been doing an excellent job slinging
code for this, though.  So the bottleneck is really reviewer bandwidth
(already an issue for the work we *are* trying to fit in Rocky).

If it's still on the table by Stein, we ought to consider making it a
high priority.  (Our Rocky punchlist seems to be favoring "urgent" over
"important" to some extent.)

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Proposing Eric Fried for nova-core

2018-04-03 Thread Eric Fried
Thank you Melanie for the complimentary nomination, to the cores for
welcoming me into the fold, and especially to all (cores and non, Nova
and otherwise) who have mentored me along the way thus far.  I hope to
live up to your example and continue to pay it forward.

-efried

On 04/03/2018 02:20 PM, melanie witt wrote:
> On Mon, 26 Mar 2018 19:00:06 -0700, Melanie Witt wrote:
>> Howdy everyone,
>>
>> I'd like to propose that we add Eric Fried to the nova-core team.
>>
>> Eric has been instrumental to the placement effort with his work on
>> nested resource providers and has been actively contributing to many
>> other areas of openstack [0] like project-config, gerritbot,
>> keystoneauth, devstack, os-loganalyze, and so on.
>>
>> He's an active reviewer in nova [1] and elsewhere in openstack and
>> reviews in-depth, asking questions and catching issues in patches and
>> working with authors to help get code into merge-ready state. These are
>> qualities I look for in a potential core reviewer.
>>
>> In addition to all that, Eric is an active participant in the project in
>> general, helping people with questions in the #openstack-nova IRC
>> channel, contributing to design discussions, helping to write up
>> outcomes of discussions, reporting bugs, fixing bugs, and writing tests.
>> His contributions help to maintain and increase the health of our
>> project.
>>
>> To the existing core team members, please respond with your comments,
>> +1s, or objections within one week.
>>
>> Cheers,
>> -melanie
>>
>> [0] https://review.openstack.org/#/q/owner:efried
>> [1] http://stackalytics.com/report/contribution/nova/90
> 
> Thanks to everyone who responded with their feedback. It's been one week
> and we have had more than enough +1s, so I've added Eric to the team.
> 
> Welcome Eric!
> 
> Best,
> -melanie
> 
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [barbican][nova-powervm][pyghmi][solum][trove] Switching to cryptography from pycrypto

2018-03-31 Thread Eric Fried
Mr. Fire-

> nova-powervm: no open reviews
>   - in test-requirements, but not actually used?
>   - made https://review.openstack.org/558091 for it

Thanks for that.  It passed all our tests; we should merge it early next
week.

-efried

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][oslo] what to do with problematic mocking in nova unit tests

2018-03-31 Thread Eric Fried
Hi Doug, I made this [2] for you.  I tested it locally with oslo.config
master, and whereas I started off with a slightly different set of
errors than you show at [1], they were in the same suites.  Since I
didn't want to tox the world locally, I went ahead and added a
Depends-On from [3].  Let's see how it plays out.

>> [1]
http://logs.openstack.org/12/557012/1/check/cross-nova-py27/37b2a7c/job-output.txt.gz#_2018-03-27_21_41_09_883881
[2] https://review.openstack.org/#/c/558084/
[3] https://review.openstack.org/#/c/557012/
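
For readers catching up, the failure mode Doug describes below boils down
to something like this (test and helper names are made up, and
unittest.mock is used just to keep the example self-contained):

    import os
    import unittest
    from unittest import mock

    def code_under_test():
        """Hypothetical application code that checks a path."""
        return os.path.exists('/does/not/matter')

    def library_call_that_touches_the_fs():
        """Stand-in for e.g. oslo.config inspecting source files."""
        return os.path.exists(__file__)

    class BroadVsScopedMock(unittest.TestCase):
        # Problematic: the mock is live for the whole test method, so the
        # library call sees it too, and the test breaks as soon as the
        # library starts touching the filesystem in a new release.
        @mock.patch('os.path.exists', return_value=False)
        def test_broad_patch(self, mock_exists):
            self.assertFalse(code_under_test())
            self.assertFalse(library_call_that_touches_the_fs())  # surprise

        # Safer: scope the patch tightly around the code being isolated.
        def test_scoped_patch(self):
            with mock.patch('os.path.exists', return_value=False):
                self.assertFalse(code_under_test())
            self.assertTrue(library_call_that_touches_the_fs())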

-efried

On 03/30/2018 06:35 AM, Doug Hellmann wrote:
> Anyone?
> 
>> On Mar 28, 2018, at 1:26 PM, Doug Hellmann  wrote:
>>
>> In the course of preparing the next release of oslo.config, Ben noticed
>> that nova's unit tests fail with oslo.config master [1].
>>
>> The underlying issue is that the tests mock things that oslo.config
>> is now calling as part of determining where options are being set
>> in code. This isn't an API change in oslo.config, and it is all
>> transparent for normal uses of the library. But the mocks replace
>> os.path.exists() and open() for the entire duration of a test
>> function (not just for the isolated application code being tested),
>> and so the library behavior change surfaces as a test error.
>>
>> I'm not really in a position to go through and clean up the use of
>> mocks in those (and other?) tests myself, and I would like to not
>> have to revert the feature work in oslo.config, especially since
>> we did it for the placement API stuff for the nova team.
>>
>> I'm looking for ideas about what to do.
>>
>> Doug
>>
>> [1] 
>> http://logs.openstack.org/12/557012/1/check/cross-nova-py27/37b2a7c/job-output.txt.gz#_2018-03-27_21_41_09_883881
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [placement] Anchor/Relay Providers

2018-03-31 Thread Eric Fried
/me responds to self

Good progress has been made here.

Tetsuro solved the piece where provider summaries were only showing
resources that had been requested - with [8] they show usage information
for *all* their resources.

In order to make use of both [1] and [8], I had to shuffle them into the
same series - I put [8] first - and then balance my (heretofore) WIP [7]
on the top.  So we now have a lovely 5-part series starting at [9].

Regarding the (heretofore) WIP [7], I cleaned it up and made it ready.

QUESTION: Do we need microversions for [8] and/or [1] and/or [7]?
Each changes the response payload content of GET /allocation_candidates,
so yes; but that content was arguably broken before, so no.  Please
comment on the patches accordingly.

-efried

> [1] https://review.openstack.org/#/c/533437/
> [2] https://bugs.launchpad.net/nova/+bug/1732731
> [3]
https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@3308
> [4]
https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@3062
> [5]
https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@2658
> [6]
https://review.openstack.org/#/c/533437/6/nova/api/openstack/placement/objects/resource_provider.py@3303
> [7] https://review.openstack.org/#/c/558014/
[8] https://review.openstack.org/#/c/558045/
[9] https://review.openstack.org/#/c/558044/

On 03/30/2018 07:34 PM, Eric Fried wrote:
> Folks who care about placement (but especially Jay and Tetsuro)-
> 
> I was reviewing [1] and was at first very unsatisfied that we were not
> returning the anchor providers in the results.  But as I started digging
> into what it would take to fix it, I realized it's going to be
> nontrivial.  I wanted to dump my thoughts before the weekend.
> 
> 
> It should be legal to have a configuration like:
> 
> #    CN1 (VCPU, MEMORY_MB)
> #    /  \
> #   /agg1    \agg2
> #  /  \
> # SS1    SS2
> #  (DISK_GB)  (IPV4_ADDRESS)
> 
> And make a request for DISK_GB,IPV4_ADDRESS;
> And have it return a candidate including SS1 and SS2.
> 
> The CN1 resource provider acts as an "anchor" or "relay": a provider
> that doesn't provide any of the requested resource, but connects to one
> or more sharing providers that do so.
> 
> This scenario doesn't work today (see bug [2]).  Tetsuro has a partial
> fix [1].
> 
> However, whereas that fix will return you an allocation_request
> containing SS1 and SS2, neither the allocation_request nor the
> provider_summary mentions CN1.
> 
> That's bad.  Consider use cases like Nova's, where we have to land that
> allocation_request on a host: we have no good way of figuring out who
> that host is.
> 
> 
> Starting from the API, the response payload should look like:
> 
> {
>     "allocation_requests": [
>         {"allocations": {
>             # This is missing ==>
>             CN1_UUID: {"resources": {}},
>             # <==
>             SS1_UUID: {"resources": {"DISK_GB": 1024}},
>             SS2_UUID: {"resources": {"IPV4_ADDRESS": 1}}
>         }}
>     ],
>     "provider_summaries": {
>         # This is missing ==>
>         CN1_UUID: {"resources": {
>             "VCPU": {"used": 123, "capacity": 456}
>         }},
>         # <==
>         SS1_UUID: {"resources": {
>             "DISK_GB": {"used": 2048, "capacity": 1048576}
>         }},
>         SS2_UUID: {"resources": {
>             "IPV4_ADDRESS": {"used": 4, "capacity": 32}
>         }}
>     },
> }
> 
> Here's why it's not working currently:
> 
> => CN1_UUID isn't in `summaries` [3]
> => because _build_provider_summaries [4] doesn't return it
> => because it's not in usages because _get_usages_by_provider_and_rc [5]
> only finds providers providing resource in that RC
> => and since CN1 isn't providing resource in any requested RC, it ain't
> included.
> 
> But we have the anchor provider's (internal) ID; it's the ns_rp_id we're
> iterating on in this loop [6].  So let's just use that to get the
> summary and add it to the mix, right?  Things that make that difficult:
> 
> => We have no convenient helper that builds a summary object without
> specifying a resource class (which is a separate problem, because it
> means resources we didn't request don't show up in the provider
> summaries either - they should).
> => We internally build these gizmos inside out - 
