Re: [openstack-dev] [nova] nova cellsv2 and DBs / down cells / quotas
> I guess our architecture is pretty unique in a way, but I wonder if
> other people are also a little scared about the whole "all DB servers
> need to be up to serve API requests" thing?

When we started down this path, we acknowledged that this would create a different access pattern, which would require ops to treat the cell databases differently. The input we were getting at the time was that the benefits outweighed the costs here, and that we'd work on caching to deal with performance issues if/when that became necessary.

> I've been thinking of some hybrid cellsv1/v2 thing where we'd still
> have the top-level api cell DB, but the API would only ever read from
> it. Nova-api would only write to the compute cell DBs. Then keep the
> nova-cells processes just doing instance_update_at_top to keep the
> nova-cell-api db up to date.

I'm definitely not in favor of doing more replication in python to address this. What was there in cellsv1 was lossy, even for the subset of things it actually supported (which didn't cover all nova features at the time and hasn't kept pace with features added since, obviously).

About a year ago, I proposed that we add another "read only mirror" field to the cell mapping, which nova would use if and only if the primary cell database wasn't reachable, and only for read operations. The ops, if they wanted to use this, would configure plain old one-way mysql replication of the cell databases to a highly-available server (probably wherever the api_db is), and nova could use that as a read-only cache for things like listing instances and calculating quotas. The reaction was (very surprisingly to me) negative to this option. It seems very low-effort, high-gain, and proper re-use of existing technologies to me, without us having to replicate a replication engine (hah) in python.

So, I'm curious: does that sound more palatable to you?
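To make the proposal concrete, here is a minimal sketch of the fallback logic being described. All names (`CellMapping` fields, `get_connection_for_read`, the probe callable) are hypothetical illustrations, not nova's actual API:

```python
# Sketch: per-cell read fallback. If the primary cell DB is down, serve
# read-only operations from an operator-configured one-way mysql replica.
# These names are made up for illustration; this is not nova code.

class CellMapping:
    def __init__(self, database_connection, read_only_mirror=None):
        self.database_connection = database_connection
        # Optional replica, kept in sync by plain mysql replication,
        # probably living wherever the api_db does.
        self.read_only_mirror = read_only_mirror

def get_connection_for_read(cell, is_up):
    """Pick a DB connection for a read-only operation.

    is_up is a callable that probes whether a connection target is
    reachable (stubbed here; a real check would attempt a connection).
    """
    if is_up(cell.database_connection):
        return cell.database_connection
    if cell.read_only_mirror:
        # Primary unreachable: serve possibly slightly-stale reads
        # (instance lists, quota counts) from the mirror.
        return cell.read_only_mirror
    raise RuntimeError('Cell database unreachable and no mirror configured')
```

Writes would never consult the mirror; the point is only to keep things like instance listing and quota calculation working while a cell is down.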
--Dan

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Metadata API cross joining "instance_metadata" and "instance_system_metadata"
> I tested a code change that essentially reverts
> https://review.openstack.org/#/c/276861/1/nova/api/metadata/base.py
>
> In other words, with this change metadata tables are not fetched by
> default in API requests. If I understand correctly, metadata is
> fetched in separate queries as the instance object is created.
> Everything seems to work just fine, and I've considerably reduced the
> amount of data fetched from the database, as well as reduced the
> average response time of API requests.
>
> Given how simple it is and the results I'm getting, I don't see any
> reason not to patch my clusters with this change.
>
> Do you guys see any other impact this change could have? Anything that
> it could potentially break?

This is probably fine as a bandage fix, but it's not the right one for upstream, IMHO. By doing what you did, you cause two RPC round-trips to fetch the instance and then the metadata every single time the metadata API is hit (not including the cache). By converting the DB load to do the two-step, we still hit the DB twice, but only one RPC round-trip, which will be much more efficient, especially at load/scale.

--Dan
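A toy cost model of the trade-off being argued here, with purely illustrative function names (this is not nova code): both approaches query the database twice, but doing the two-step inside the DB layer keeps the metadata API at one conductor RPC round-trip instead of two.

```python
# Illustrative only: count RPC round-trips for the two approaches.

def lazy_load_in_api(rpc_call):
    # The reverted-join approach: the API-side object lazy-loads
    # metadata, costing a second RPC round-trip per request.
    instance = rpc_call('instance_get')
    instance['metadata'] = rpc_call('instance_metadata_get')
    return instance

def two_step_in_db_layer(rpc_call):
    # The upstream-preferred approach: one RPC; the remote side runs
    # two SQL queries and returns the instance with metadata filled in.
    return rpc_call('instance_get_with_metadata')

rpc_log = []

def fake_rpc(method):
    # Stand-in for a conductor RPC call; just record that it happened.
    rpc_log.append(method)
    return {'metadata': {}}

lazy_load_in_api(fake_rpc)       # records two RPC calls
two_step_in_db_layer(fake_rpc)   # records one RPC call
```

At scale, the per-request difference between one and two round-trips dominates, which is why the fix belongs in the DB load rather than in the API object layer.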
Re: [openstack-dev] [nova] Metadata API cross joining "instance_metadata" and "instance_system_metadata"
> Do you guys see an easy fix here?
>
> Should I open a bug report?

Definitely open a bug. IMHO, we should just make the single-instance load work like the multi ones, where we load the metadata separately if requested. We might be able to get away without sysmeta these days, but we needed it for the flavor details back when the join was added. But user metadata is controllable by the user and definitely of interest in that code, so just dropping sysmeta from the explicit required_attrs isn't enough, IMHO.

--Dan
Re: [openstack-dev] [nova] Metadata API cross joining "instance_metadata" and "instance_system_metadata"
> We haven't been doing this (intentionally) for quite some time, as we
> query and fill metadata linearly:
>
> https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L2244
>
> and have since 2013 (Havana):
>
> https://review.openstack.org/#/c/26136/
>
> So unless there has been a regression that is leaking those columns
> back into the join list, I'm not sure why the query you show would be
> generated.

Ah, Matt Riedemann just pointed out on IRC that we're not doing it on single-instance fetch, which is what you'd be hitting in this path. We use that approach in a lot of places where the rows would also be multiplied by the number of instances, but not in the single case. So, that makes sense now.

--Dan
Re: [openstack-dev] [nova] Metadata API cross joining "instance_metadata" and "instance_system_metadata"
> Of course this is only a problem when instances have a lot of metadata
> records. An instance with 50 records in "instance_metadata" and 50
> records in "instance_system_metadata" will fetch 50 x 50 = 2,500 rows
> from the database. It's not difficult to see how this can escalate
> quickly. This can be a particularly significant problem in an HA
> scenario with multiple API nodes pulling data from multiple database
> nodes.

We haven't been doing this (intentionally) for quite some time, as we query and fill metadata linearly:

https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L2244

and have since 2013 (Havana):

https://review.openstack.org/#/c/26136/

So unless there has been a regression that is leaking those columns back into the join list, I'm not sure why the query you show would be generated. Just to be clear, you don't have any modifications to the code anywhere, do you?

--Dan
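The 50 x 50 multiplication quoted above is easy to reproduce with an in-memory sqlite database (toy two-column schema, not nova's actual tables):

```python
# Demonstrate the cross-join row multiplication: joining two metadata
# tables against the same instance returns rows = len(A) * len(B).
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE instance_metadata (instance_id INT, name TEXT);
    CREATE TABLE instance_system_metadata (instance_id INT, name TEXT);
""")
conn.executemany('INSERT INTO instance_metadata VALUES (1, ?)',
                 [('meta%d' % i,) for i in range(50)])
conn.executemany('INSERT INTO instance_system_metadata VALUES (1, ?)',
                 [('sysmeta%d' % i,) for i in range(50)])

# Both tables keyed only by instance_id, so every metadata row pairs
# with every system-metadata row for that instance:
rows = conn.execute("""
    SELECT m.name, s.name
    FROM instance_metadata m
    JOIN instance_system_metadata s USING (instance_id)
    WHERE m.instance_id = 1
""").fetchall()
print(len(rows))  # 2500, i.e. 50 x 50
```

Querying each table separately and stitching the results together in python returns 100 rows total instead of 2,500, which is exactly what the linear fill approach does.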
Re: [openstack-dev] [Openstack-operators] [nova] Supporting force live-migrate and force evacuate with nested allocations
>> I disagree on this. I'd rather just do a simple check for >1
>> provider in the allocations on the source and if True, fail hard.
>>
>> The reverse (going from a non-nested source to a nested destination)
>> will hard fail anyway on the destination because the POST
>> /allocations won't work due to capacity exceeded (or failure to have
>> any inventory at all for certain resource classes on the
>> destination's root compute node).
>
> I agree with Jay here. If we know the source has allocations on >1
> provider, just fail fast. Why even walk the tree and try to claim
> those against the destination? The nested providers aren't going to
> be the same UUIDs on the destination, *and* trying to squash all of
> the source nested allocations into the single destination root
> provider and hoping it works is super hacky, and I don't think we
> should attempt that. Just fail if being forced and nested allocations
> exist on the source.

Same, yeah.

--Dan
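The fail-fast check being agreed on here is simple enough to sketch. The function name and the allocation dict shape are hypothetical stand-ins, not actual nova code; placement allocations are keyed by resource provider UUID, which is all the check needs:

```python
# Sketch: refuse a forced live-migrate/evacuate when the source
# instance holds allocations against more than one resource provider
# (i.e. nested providers are in play). Hypothetical names throughout.

def check_forced_move_allowed(source_allocations):
    """source_allocations: dict keyed by provider UUID, placement-style.

    Raises ValueError rather than attempting to squash nested
    allocations into the destination's root provider.
    """
    if len(source_allocations) > 1:
        raise ValueError(
            'Forced move not supported: source has allocations against '
            '%d providers (nested allocations present)'
            % len(source_allocations))

# A flat, single-provider source passes:
check_forced_move_allowed({'rp-root': {'resources': {'VCPU': 2}}})
```

Anything more clever (walking the tree, remapping provider UUIDs onto the destination) is exactly what the thread argues against attempting.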
Re: [openstack-dev] [placement] The "intended purpose" of traits
>> I still want to use something like "Is capable of RAID5" and/or "Has
>> RAID5 already configured" as part of a scheduling and placement
>> decision. Being able to have the GET /a_c response filtered down to
>> providers with those, ahem, traits is the exact purpose of that
>> operation.
>
> And yep, I have zero problem with this either, as I've noted. This is
> precisely what placement and traits were designed for.

Same.

>> While we're in the neighborhood, we agreed in Denver to use a trait to
>> indicate which service "owns" a provider [1], so we can eventually
>> coordinate a smooth handoff of e.g. a device provider from nova to
>> cyborg. This is certainly not a capability (but it is a trait), and it
>> can certainly be construed as a key/value (owning_service=cyborg). Are
>> we rescinding that decision?
>
> Unfortunately I have zero recollection of a conversation about using
> traits for indicating who "owns" a provider. :(

I definitely do.

> I don't think I would support such a thing -- rather, I would support
> adding an attribute to the provider model itself for an owning service
> or such thing.
>
> That's a great example of where the attribute has specific conceptual
> meaning to placement (the concept of ownership) and should definitely
> not be tucked away, encoded into a trait string.

No, as I recall, it means nothing to placement - it means something to the consumers. A gentleperson's agreement for identifying who owns what if we're going to, say, remove things that might be stale from placement when updating the provider tree.

--Dan
Re: [openstack-dev] [placement] The "intended purpose" of traits
> It sounds like you might be saying, "I would rather not see encoded
> trait names OR a new key/value primitive; but if the alternative is
> ending up with 'a much larger mess', I would accept..." ...which?
>
> Or is it, "We should not implement a key/value primitive, nor should we
> implement restrictions on trait names; but we should continue to
> discourage (ab)use of trait names by steering placement consumers
> to..." ...do what?

The second one.

> The restriction is real, not perceived. Without key/value (either
> encoded or explicit), how should we steer placement consumers to
> satisfy, e.g., "Give me disk from a provider with RAID5"?

Sure, I'm not doubting the need to find providers with certain abilities. What I'm saying (and I assume Jay is as well) is that finding things with more domain-specific attributes is the job of the domain controller (i.e. nova). Placement's strength, IMHO, is the unified and extremely simple data model and consistency guarantees that it provides. It takes a lot of the work of searching and atomic accounting of enumerable and qualitative things out of the scheduler of the consumer. IMHO, it doesn't (i.e. won't ever) and shouldn't replace all the things that nova's scheduler needs to do.

I think it's useful to draw the line in front of a full-blown key=value store and DSL grammar for querying everything with all the operations anyone could ever need. Unifying the simpler and more common bits into placement and keeping the domain-specific consideration and advanced filtering of the results in nova/ironic/etc is the right separation of responsibilities, IMHO. RAID level is, of course, an overly simplistic example to use, which makes the problem seem small, but we know more complicated examples exist.

--Dan
Re: [openstack-dev] [placement] The "intended purpose" of traits
I was out when much of this conversation happened, so I'm going to summarize my opinion here.

> So from a code perspective _placement_ is completely agnostic to
> whether a trait is "PCI_ADDRESS_01_AB_23_CD", "STORAGE_DISK_SSD", or
> "JAY_LIKES_CRUNCHIE_BARS".
>
> However, things which are using traits (e.g., nova, ironic) need to
> make their own decisions about how the value of traits are
> interpreted. I don't have a strong position on that except to say
> that _if_ we end up in a position of there being lots of traits
> willy-nilly, people who have chosen to do that need to know that the
> contract presented by traits right now (present or not present, no
> value comprehension) is fixed.

I agree with what Chris holds sacred here, which is that placement shouldn't ever care about what the trait names are or what they mean to someone else. That also extends to me hoping we never implement a generic key=value store on resource providers in placement.

>> I *do* see a problem with it, based on my experience in Nova where
>> this kind of thing leads to ugly, unmaintainable, and
>> incomprehensible code as I have pointed to in previous responses.

I definitely agree with what Jay holds sacred here, which is that abusing the data model to encode key=value information into single trait strings is bad (which is what you're doing with something like PCI_ADDRESS_01_AB_23_CD).

I don't want placement (the code) to try to put any technical restrictions on the meaning of trait names, in an attempt to prevent the above abuse. I agree that means people _can_ abuse it if they wish, which I think is Chris' point. However, I think it _is_ important for the placement team (the people) to care about how consumers (nova, etc) use traits, and thus provide guidance on that as necessary. Not everyone will follow that guidance, but we should provide it. Projects with history-revering developers on both sides of the fence can help this effort if they lead by example.

If everyone goes off and implements their way around the perceived restriction of not being able to ask placement for RAID_LEVEL>=5, we're going to have a much larger mess than the steaming pile of extra specs in nova that we're trying to avoid.

--Dan
Re: [openstack-dev] Open letter/request to TC candidates (and existing elected officials)
> I'm just a bit worried to limit that role to the elected TC members. If
> we say "it's the role of the TC to do cross-project PM in OpenStack",
> then we artificially limit the number of people who would sign up to do
> that kind of work. You mention Ildiko and Lance: they did that line of
> work without being elected.

Why would saying that we _expect_ the TC members to do that work limit such activities only to those that are on the TC? I would expect the TC to take on the less-fun or often-neglected efforts that we all know are needed but don't have an obvious champion or sponsor. I think we expect some amount of widely-focused technical or project leadership from TC members, and certainly that expectation doesn't prevent others from leading efforts (even in the areas of proposing TC resolutions, etc), right?

--Dan
Re: [openstack-dev] [upgrade] request for pre-upgrade check for db purge
> How do people feel about this? It seems pretty straight-forward to
> me. If people are generally in favor of this, then the question is
> what would be sane defaults - or should we not assume a default and
> force operators to opt into this?

I dunno, adding something to nova.conf that is only used for nova-status like that seems kinda weird to me. It's just a warning/informational sort of thing, so it just doesn't seem worth the complication to me. Moving it to an age thing set at one year seems okay, and better than making the absolute limit more configurable.

Any reason why this wouldn't just be a command line flag to status if people want it to behave in a specific way from a specific tool?

--Dan
Re: [openstack-dev] [nova][placement][upgrade][qa] Some upgrade-specific news on extraction
> The other obvious thing is the database. The placement repo code as-is
> today still has the check for whether or not it should use the
> placement database but falls back to using the nova_api database
> [5]. So technically you could point the extracted placement at the
> same nova_api database and it should work. However, at some point
> deployers will clearly need to copy the placement-related tables out
> of the nova_api DB to a new placement DB and make sure the
> 'migrate_version' table is dropped so that placement DB schema
> versions can reset to 1.

I think it's wrong to act like placement and nova-api schemas are the same. One is a clone of the other at a point in time, and technically it will work today. However, the placement db sync tool won't do the right thing, and I think we run the major risk of operators not fully grokking what is going on here, seeing that pointing placement at nova-api "works", and moving on. Later, when we add the next placement db migration (which could technically happen in stein), they will either screw their nova-api schema, or mess up their versioning, or be unable to apply the placement change.

> With respect to grenade and making this work in our own upgrade CI
> testing, we have I think two options (which might not be mutually
> exclusive):
>
> 1. Make placement support using nova.conf if placement.conf isn't
> found for Stein with lots of big warnings that it's going away in
> T. Then Rocky nova.conf with the nova_api database configuration just
> continues to work for placement in Stein. I don't think we then have
> any grenade changes to make, at least in Stein for upgrading *from*
> Rocky. Assuming fresh devstack installs in Stein use placement.conf
> and a placement-specific database, then upgrades from Stein to T
> should also be OK with respect to grenade, but likely punts the
> cut-over issue for all other deployment projects (because we don't CI
> with grenade doing Rocky->Stein->T, or FFU in other words).

As I have said above and in the review, I really think this is the wrong approach. At the current point in time, the placement schema is a clone of the nova-api schema, and technically they will work. At the first point that placement evolves its schema, that will no longer be a workable solution, unless we also evolve nova-api's database in lockstep.

> 2. If placement doesn't support nova.conf in Stein, then grenade will
> require an (exceptional) [6] from-rocky upgrade script which will (a)
> write out placement.conf fresh and (b) run a DB migration script,
> likely housed in the placement repo, to create the placement database
> and copy the placement-specific tables out of the nova_api
> database. Any script like this is likely needed regardless of what we
> do in grenade because deployers will need to eventually do this once
> placement would drop support for using nova.conf (if we went with
> option 1).

Yep, and I'm asserting that we should write that script, make grenade do that step, and confirm that it works. I think operators should do that step during the stein upgrade because that's where the fork/split of history and schema is happening. I'll volunteer to do the grenade side at least.

Maybe it would help to call out specifically that, IMHO, this can not and should not follow the typical config deprecation process. It's not a simple case of just making sure we "find" the nova-api database in the various configs. The problem is that _after_ the split, they are _not_ the same thing and should not be considered as the same. Thus, I think to avoid major disaster and major time sink for operators later, we need to impose the minor effort now to make sure that they don't take the process of deploying a new service lightly.

Jay's original relatively small concern was that deploying a new placement service and failing to properly configure it would result in a placement running with the default, empty, sqlite database. That's a valid concern, and I think all we need to do is make sure we fail in that case, explaining the situation.

We just had a hangout on the topic and I think we've come around to the consensus that just removing the default-to-empty-sqlite behavior is the right thing to do. Placement won't magically find nova.conf if it exists and jump into its database, and it also won't do the silly thing of starting up with an empty database if the very important config step is missed in the process of deploying placement itself. Operators will have to deploy the new package and do the database surgery (which we will provide instructions and a script for) as part of that process, but there's really no other sane alternative without changing the current agreed-to plan regarding the split.

Is everyone okay with the above summary of the outcome?

--Dan
Re: [openstack-dev] [nova] [placement] extraction (technical) update
> I think there was a period in time where the nova_api database was
> created where entries would try to get pulled out from the original
> nova database and then checking nova_api if it doesn't exist
> afterwards (or vice versa). One of the cases that this was done to
> deal with was for things like instance types or flavours.
>
> I don't know the exact details but I know that older instance types
> exist in the nova db and the newer ones are sitting in nova_api.
> Something along those lines?

Yep, we've moved entire databases before in nova with minimal disruption to the users. Not just flavors, but several pieces of data came out of the "main" database and into the api database transparently. It's doable, but with placement being split to a separate project/repo/whatever, there's not really any option for being graceful about it in this case.

> At this point, I'm thinking turn off placement, setup the new one, do
> the migration of the placement-specific tables (this can be a
> straightforward documented task OR it would be awesome if it was a
> placement command (something along the lines of `placement-manage db
> import_from_nova`) which would import all the right things.
>
> The idea of having a command would be *extremely* useful for
> deployment tools in automating the process, and it also allows the
> placement team to selectively decide what they want to onboard.

Well, it's pretty cut-and-dried, as all the tables in nova-api are either for nova or placement, so there's not much confusion about what belongs. I'm not sure that doing this import in python is really the most efficient way. I agree a placement-manage command would be ideal from an "easy button" point of view, but I think a couple lines of bash that call mysqldump are likely to vastly outperform us doing it natively in python. We could script exec()s of those commands from python, but... I think I'd rather just see that as a shell script that people can easily alter/test on their own.
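To make the mysqldump idea concrete, here is a rough sketch of composing the dump command in python. The table list below is illustrative only, an assumption on my part about which placement-related tables live in nova_api at the time of the split; the real thing would probably just be the couple of lines of bash described above:

```python
# Sketch: build a mysqldump invocation that copies only the
# placement-related tables out of nova_api. The table names here are
# an illustrative guess, not an authoritative list.

PLACEMENT_TABLES = [
    'resource_providers',
    'inventories',
    'allocations',
    'resource_classes',
    'traits',
    'resource_provider_traits',
    'resource_provider_aggregates',
    'consumers',
]

def dump_command(source_db='nova_api'):
    # Dumping named tables only; the sqlalchemy-migrate versions table
    # is deliberately excluded so the new placement DB can renumber its
    # migrations from 1. The output would be piped into the new
    # placement database with the mysql client.
    return ['mysqldump', source_db] + PLACEMENT_TABLES

print(' '.join(dump_command()))
```

Whether this lives as a `placement-manage` command, a standalone script, or just documented shell steps, the key point is the same: copy only the placement tables, not the nova-api history.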
Just curious, but in your case would the service catalog entry change at all? If you stand up the new placement in the exact same spot, it shouldn't, but I imagine some people will have the catalog entry change slightly (even if just because of a VIP or port change). Am I remembering correctly that the catalog can get cached in various places such that much of nova would need a restart to notice?

--Dan
Re: [openstack-dev] [nova] [placement] extraction (technical) update
>> Yes, we should definitely trim the placement DB migrations to only
>> things relevant to placement. And we can use this opportunity to get
>> rid of cruft too and squash all of the placement migrations together
>> to start at migration 1 for the placement repo. If anyone can think
>> of a problem with doing that, please shout it out.

I agree, FWIW.

> Umm, nova-manage db sync creates entries in a sqlalchemy-migrate
> versions table, something like that, to track per database what the
> latest migration sync version has been.
>
> Based on that, and the fact I thought our DB extraction policy was to
> mostly tell operators to copy the nova_api database and throw it
> elsewhere in a placement database, then the migrate versions table is
> going to be saying you're at 061 and you can't start new migrations
> from 1 at that point, unless you wipe out that versions table after
> you copy the API DB.

They can do this, sure. However, either we'll need migrations to delete all the nova-api-related tables, or they will need to trim them manually. If we do the former, then everyone who ever installs placement from scratch will go through the early history of nova-api only to have that removed. Or we trim those off the front, but we have to keep the collapsing migrations until we compact again, etc.

The thing I'm more worried about is operators being surprised by this change (since it's happening suddenly in the middle of a release), noticing some split, and then realizing that if they just point the placement db connection at nova_api, everything seems to work. That's going to go really bad when things start to diverge.

> I could be wrong, but just copying the database, squashing/trimming
> the migration scripts and resetting the version to 1, and assuming
> things are going to be hunky dory doesn't sound like it will work to
> me.

Why not? I think the safest/cleanest thing to do here is renumber placement-related migrations from 1, and provide a script or procedure to dump just the placement-related tables from the nova_api database to the new one (not including the sqlalchemy-migrate versions table).

--Dan
Re: [openstack-dev] [nova][placement] Freezing placement for extraction
> If we're going to do the extraction in Stein, which we said we'd do in
> Dublin, we need to start that as early as possible to iron out any
> deployment bugs in the switch. We can't wait until the 2nd or 3rd
> milestone; it would be too risky.

I agree that the current extraction plan is highly risky and that if it's going to happen, we need plenty of time to clean up the mess.

I imagine what Sylvain is getting at here is that if we followed the process of other splits like nova-volume, we'd be doing this differently. In that case, we'd freeze late in the cycle when freezing is appropriate anyway. We'd split out placement such that the nova-integrated one and the separate one are equivalent, and do the work to get it working on its own. In the next cycle, new changes go to the split placement only. Operators are able to upgrade to stein without deploying a new stein service first, and can switch to the split placement at their leisure, separate from the release upgrade process.

To be honest, I'm not sure how we got to the point of considering it acceptable to be splitting out a piece of nova in a single cycle such that operators have to deploy a new thing in order to upgrade. But alas, as has been said, this is politically more important than ... everything else.

--Dan
Re: [openstack-dev] [nova] [placement] extraction (technical) update
> Grenade already has its own "resources db", right? So we can shove
> things in there before we upgrade and then verify they are still there
> after the upgrade?

Yep, I'm working on something right now. We create an instance that survives the upgrade and validate it on the other side. I'll just do some basic inventory and allocation validation that we'll trip over if we somehow don't migrate that data from nova to placement.

--Dan
Re: [openstack-dev] [nova] [placement] extraction (technical) update
> Grenade uses devstack, so once we have devstack on master installing
> (and configuring) placement from the new repo and disable installing
> and configuring it from the nova repo, that's the majority of the
> change, I'd think.
>
> Grenade will likely need a from-rocky script to move any config that
> is necessary, but as you already noted below, if the new repo can live
> with an existing nova.conf, then we might not need to do anything in
> grenade, since placement from the new repo (in stein) could then run
> with nova.conf created for placement from the nova repo (in rocky).

The from-rocky script will also need to extract data from the nova-api database for the placement tables and put it into the new placement database (as real operators will have to do). It'll need to do this after the split code has been installed and the schema has been sync'd. Without this, the pre-upgrade resources won't have allocations known by the split placement service. I do not think we should cheat by just pointing the split placement at nova's database.

Also, ISTR you added some allocation/inventory checking to devstack via hook, maybe after the tempest job ran? We might want to add some stuff to grenade to verify the pre/post resource allocations before we start this move so we can make sure they're still good after we roll. I'll see if I can hack something up to that effect.

--Dan
Re: [openstack-dev] [nova] [placement] extraction (technical) update
>> 2. We have a stack of changes to zuul jobs that show nova working but
>> deploying placement in devstack from the new repo instead of nova's
>> repo. This includes the grenade job, ensuring that upgrade works.
>
> I'm guessing there would need to be changes to Devstack itself, outside
> of the zuul jobs?

I think we'll need changes to devstack itself, as well as grenade, as well as zuul jobs, I'd assume. Otherwise, this sequence of steps is what I've been anticipating.

--Dan
Re: [openstack-dev] [oslo] UUID sentinel needs a home
> The compromise, using the patch as currently written [1], would entail
> adding one line at the top of each test file:
>
>     uuids = uuidsentinel.UUIDSentinels()
>
> ...as seen (more or less) at [2]. The subtle difference being that this
> `uuids` wouldn't share a namespace across the whole process, only within
> that file. Given current usage, that shouldn't cause a problem, but it's
> a change.

...and it doesn't work like mock.sentinel does, which is part of the value. I really think we should put this wherever it needs to be so that it can continue to be as useful as it is today. Even if that means just copying it into another project -- it's not that complicated of a thing.

--Dan
Re: [openstack-dev] [oslo] UUID sentinel needs a home
> Do you mean an actual fixture, that would be used like: > > class MyTestCase(testtools.TestCase): > def setUp(self): > self.uuids = self.useFixture(oslofx.UUIDSentinelFixture()).uuids > > def test_foo(self): > do_a_thing_with(self.uuids.foo) > > ? > > That's... okay I guess, but the refactoring necessary to cut over to it > will now entail adding 'self.' to every reference. Is there any way > around that? I don't think it's okay. It makes it a lot more work to use it, where merely importing it (exactly like mock.sentinel) is a large factor in how incredibly convenient it is. --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron] Cross-cell cold migration
> I think Nova should never have to rely on Cinder's hosts/backends > information to do migrations or any other operation. > > In this case even if Nova had that info, it wouldn't be the solution. > Cinder would reject migrations if there's an incompatibility on the > Volume Type (AZ, Referenced backend, capabilities...) I think I'm missing a bunch of cinder knowledge required to fully grok this situation and probably need to do some reading. Is there some reason that a volume type can't exist in multiple backends or something? I guess I think of volume type as flavor, and the same definition in two places would be interchangeable -- is that not the case? > I don't know anything about Nova cells, so I don't know the specifics of > how we could do the mapping between them and Cinder backends, but > considering the limited range of possibilities in Cinder I would say we > only have Volume Types and AZs to work a solution. I think the only mapping we need is affinity or distance. The point of needing to migrate the volume would purely be because moving cells likely means you moved physically farther away from where you were, potentially with different storage connections and networking. It doesn't *have* to mean that, but I think in reality it would. So the question I think Matt is looking to answer here is "how do we move an instance from a DC in building A to building C and make sure the volume gets moved to some storage local in the new building so we're not just transiting back to the original home for no reason?" Does that explanation help or are you saying that's fundamentally hard to do/orchestrate? Fundamentally, the cells thing doesn't even need to be part of the discussion, as the same rules would apply if we're just doing a normal migration but need to make sure that storage remains affined to compute. 
> I don't know how the Nova Placement works, but it could hold an > equivalency mapping of volume types to cells as in: >
> Cell#1        Cell#2
>
> VolTypeA <--> VolTypeD
> VolTypeB <--> VolTypeE
> VolTypeC <--> VolTypeF
>
> Then it could do volume retypes (allowing migration) and that would > properly move the volumes from one backend to another. The only way I can think that we could do this in placement would be if volume types were resource providers and we assigned them traits that had special meaning to nova indicating equivalence. Several of the words in that sentence are likely to freak out placement people, myself included :) So is the concern just that we need to know what volume types in one backend map to those in another so that when we do the migration we know what to ask for? Is "they are the same name" not enough? Going back to the flavor analogy, you could kinda compare two flavor definitions and have a good idea if they're equivalent or not... --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
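[The "compare two flavor definitions" idea could look something like the sketch below. The field names are made up for illustration -- this is not cinder's or nova's actual schema -- but the shape of the comparison is the point: ignore identity/location fields, compare everything else.]

```python
def equivalent(type_a, type_b, ignore=('id', 'name', 'backend')):
    """Roughly compare two volume-type/flavor-like definitions.

    Two definitions are treated as interchangeable if all fields
    other than the identity and location ones match.
    """
    keys = (set(type_a) | set(type_b)) - set(ignore)
    return all(type_a.get(k) == type_b.get(k) for k in keys)
```

Under that rule, a "gold" type on backend A and a "gold" type on backend B with the same capabilities would count as equivalent without placement needing an explicit mapping table.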
Re: [openstack-dev] [all] [nova] [placement] placement below or beside compute after extraction?
>> So my hope is that (in no particular order) Jay Pipes, Eric Fried, >> Takashi Natsume, Tetsuro Nakamura, Matt Riedemann, Andrey Volkov, >> Alex Xu, Balazs Gibizer, Ed Leafe, and any other contributor to >> placement whom I'm forgetting [1] would express their preference on >> what they'd like to see happen. I apparently don't qualify for a vote, so I'll just reply to Jay's comments here. > I am not opposed to extracting the placement service into its own > repo. I also do not view it as a priority that should take precedence > over the completion of other items, including the reshaper effort and > the integration of placement calls into Nova (nested providers, > sharing providers, etc). > > The remaining items are Nova-centric. We need Nova-focused > contributors to make placement more useful to Nova, and I fail to see > how extracting the placement service will meet that goal. In fact, one > might argue, as Melanie implies, that extracting placement outside of > the Compute project would increase the velocity of the placement > project *at the expense of* getting things done in the Nova project. Yep, this. I know it's a Nova-centric view, but unlike any other project, we have taken the risk of putting placement in our critical path. That has yielded several fire drills right before releases, as well as complicated backports to fix things that we have broken in the process, etc. We've got a list of things that are half-finished or promised-but-not-started, and those are my priority over most everything else. > We've shown we can get many things done in placement. We've shown we > can evolve the API fairly quickly. The velocity of the placement > project isn't the problem. The problem is the lag between features > being written into placement (sometimes too hastily IMHO) and actually > *using* those features in Nova. Right, and the reshaper effort is a really good example of what I'm concerned about. 
Nova has been getting ready for NRPs for several cycles now, and just before crunch time in Rocky, we realize there's a huge missing piece of the puzzle on the placement side. That's not the first time that has happened and I'm sure it won't be the last. > As for the argument about other projects being able (or being more > willing to) use placement, I think that's not actually true. The > projects that might want to ditch their own custom resource tracking > and management code (Cyborg, Neutron, Cinder, Ironic) have either > already done so or would require minimal changes to do that. There are > no projects other than Ironic that I'm aware of that are interested in > using the allocation candidates functionality (and the allocation > claim process that entails) for the rough scheduling functionality > that provides. I'm not sure placement being extracted would change > that. My point about this is that "reporting" and "consuming" placement are different things. Neutron reports, we'd like Cinder to report. Ironic reports, but indirectly. Cyborg would report. Those reporting activities are to help projects that "consume" placement make better decisions, but I think it's entirely likely that Nova will be the only one that ever does that. --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] [nova] [placement] placement below or beside compute after extraction?
> The subject of using placement in Cinder has come up, and since then I've had > a > few conversations with people in and outside of that team. I really think > until > placement is its own project outside of the nova team, there will be > resistance > from some to adopt it. I know politics will be involved in this, but this is a really terrible reason to do a thing, IMHO. After the most recent meeting we had with the Cinder people on placement adoption, I'm about as convinced as ever that Cinder won't (and won't need to) _consume_ placement any time soon. I hope it will _report_ to placement so Nova can make better decisions, just like Neutron does now, but I think that's the extent we're likely to see if we're honest. What other projects are _likely_ to _consume_ placement even if they don't know they'd want to? What projects already want to use it but refuse to because it has Nova smeared all over it? We talked about this a lot in the early justification for placement, but the demand for that hasn't really materialized, IMHO; maybe it's just me. > This reluctance on having it part of Nova may be real or just perceived, but > with it within Nova it will likely be an uphill battle for some time > convincing > other projects that it is a nicely separated common service that they can use. Splitting it out to another repository within the compute umbrella (what do we call it these days?) satisfies the _technical_ concern of not being able to use placement without installing the rest of the nova code and dependency tree. Artificially creating more "perceived" distance sounds really political to me, so let's be sure we're upfront about the reasoning for doing that if so :) --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] A multi-cell instance-list performance test
> We have tried out the patch: > https://review.openstack.org/#/c/592698/ > we also applied https://review.openstack.org/#/c/592285/ > > it turns out that we are able to halve the overall time consumption, we > did try with different sort keys and dirs, the results are similar, we > didn't try out paging yet: Excellent! Let's continue discussion of the batching approach in that review. There are some other things to try. Thanks! --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] A multi-cell instance-list performance test
> yes, the DB query was in serial, after some investigation, it seems that we > are unable to perform eventlet.monkey_patch in uWSGI mode, so > Yikun made this fix: > > https://review.openstack.org/#/c/592285/ Cool, good catch :) > > After making this change, we test again, and we got this kind of data: >
>                       total     collect   sort     view
> before monkey_patch   13.5745   11.7012   1.1511   0.5966
> after monkey_patch    12.8367   10.5471   1.5642   0.6041
>
> The performance improved a little, and from the log we can see: Since these all took ~1s when done in series, but now take ~10s in parallel, I think you must be hitting some performance bottleneck in either case, which is why the overall time barely changes. Some ideas:
1. In the real world, I think you really need to have 10x database servers or at least a DB server with plenty of cores loading from a very fast (or separate) disk in order to really ensure you're getting full parallelism of the DB work. However, because these queries all took ~1s in your serialized case, I expect this is not your problem.
2. What does the network look like between the api machine and the DB?
3. What do the memory and CPU usage of the api process look like while this is happening?
Related to #3, even though we issue the requests to the DB in parallel, we still process the result of those calls in series in a single python thread on the API. That means all the work of reading the data from the socket, constructing the SQLA objects, turning those into nova objects, etc, all happens serially. It could be that the DB query is really a small part of the overall time and our serialized python handling of the result is the slow part. If you see the api process pegging a single core at 100% for ten seconds, I think that's likely what is happening. > so, now the queries are in parallel, but the whole thing still seems > serial. In your table, you show the time for "1 cell, 1000 instances" as ~3s and "10 cells, 1000 instances" as 10s. 
The problem with comparing those directly is that in the latter, you're actually pulling 10,000 records over the network, into memory, processing them, and then just returning the first 1000 from the sort. A closer comparison would be the "10 cells, 100 instances" with "1 cell, 1000 instances". In both of those cases, you pull 1000 instances total from the db, into memory, and return 1000 from the sort. In that case, the multi-cell situation is faster (~2.3s vs. ~3.1s). You could also compare the "10 cells, 1000 instances" case to "1 cell, 10,000 instances" just to confirm at the larger scale that it's better or at least the same. We _have_ to pull $limit instances from each cell, in case (according to the sort key) the first $limit instances are all in one cell. We _could_ try to batch the results from each cell to avoid loading so many that we don't need, but we punted this as an optimization to be done later. I'm not sure it's really worth the complexity at this point, but it's something we could investigate. --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
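[The "pull $limit from each cell, then sort and return the first $limit" behavior described above can be sketched like this. It's a deliberate simplification -- the real code handles arbitrary sort keys/directions and marker pagination -- but it shows both why each cell must contribute a full batch and where the merge cost lives.]

```python
import heapq


def cross_cell_list(per_cell_results, limit, sort_key):
    """Merge per-cell result lists (each already sorted by sort_key)
    and return the first `limit` records overall.

    Each cell must contribute up to `limit` records because, in the
    worst case, the first `limit` instances by sort key all live in
    a single cell.
    """
    batches = [rows[:limit] for rows in per_cell_results]
    # heapq.merge lazily merges the already-sorted batches, so we
    # only materialize the first `limit` rows of the combined order.
    merged = heapq.merge(*batches, key=lambda row: row[sort_key])
    return [row for _, row in zip(range(limit), merged)]
```

This also makes the benchmark comparison concrete: with 10 cells and limit=1000, you transfer and construct 10,000 rows to return 1000, whereas 10 cells with limit=100 transfers only 1000, which is why those two cases are the fair comparison.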
Re: [openstack-dev] [nova] increasing the number of allowed volumes attached per instance > 26
> I thought we were leaning toward the option where nova itself doesn't > impose a limit, but lets the virt driver decide. > > I would really like NOT to see logic like this in any nova code: > >> if kvm|qemu: >> return 256 >> elif POWER: >> return 4000 >> elif: >> ... It's insanity to try to find a limit that will work for everyone. PowerVM supports a billion, libvirt/kvm has some practical and theoretical limits, both of which are higher than what is actually sane. It depends on your virt driver, and how you're attaching your volumes, maybe how tightly you pack your instances, probably how many threads you give to an instance, how big your compute nodes are, and definitely what your workload is. That's a really big matrix, and even if we decide on something, IBM will come out of the woodwork with some other hypervisor that has been around since the Nixon era that uses BCD-encoded volume numbers and thus can only support 10. It's going to depend, and a user isn't going to be able to reasonably probe it using any of our existing APIs. If it's going to depend on all the above factors, I see no reason not to put a conf value in so that operators can pick a reasonably sane limit. Otherwise, the limit we pick will be wrong for everyone. Plus... if we do a conf option we can put this to rest and stop talking about it, which I for one am *really* looking forward to :) --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] review runways check-in and feedback
> While I have tried to review a few of the runway-slotted efforts, I > have gotten burned out on a number of them. Other runway-slotted > efforts, I simply don't care enough about or once I've seen some of > the code, simply can't bring myself to review it (sorry, just being > honest). I have the same feeling, although I have reviewed a lot of things I wouldn't have otherwise as a result of them being in the runway. I spent a bunch of time early on with the image signing stuff, which I think was worthwhile, although at this point I'm a bit worn out on it. That's not the fault of runways though. > Is your concern that placement stuff is getting unfair attention since > many of the patch series aren't in the runways? Or is your concern > that you'd like to see *more* core reviews on placement stuff outside > of the usual placement-y core reviewers (you, me, Alex, Eric, Gibi and > Dan)? I think placement has been getting a bit of a free ride, with constant review and insulation from the runway process. However, I don't think that we can stop progress on that effort while we circle around, and the subteam/group of people that focus on placement already has a lot of supporting cores. So, it's cheating a little bit, but we always said that we're not going to tell cores *not* to review something unless it is in a runway, and pragmatically I think it's probably the right thing to do for placement. >> Having said that, it's clear from the list of things in the runways >> etherpad that there are some lower priority efforts that have been >> completed probably because they leveraged runways (there are a few >> xenapi blueprints for example, and the powervm driver changes). > > Wasn't that kind of the point of the runways, though? To enable "lower > priority" efforts to have a chance at getting reviews? Or are you just > stating here the apparent success of that effort? It was, and I think it has worked well for that for several things. 
The image signing stuff got more review in its first runway slot than it has in years I think. Overall, I don't think we're worse off with runways than we were before it. I think that some things that will get attention regardless are still progressing. I think that some things that are far off on the fringe are still getting ignored. I think that for the huge bulk of things in the middle of those two, runways has helped focus review on specific efforts and thus increased the throughput there. For a first attempt, I'd call that a success. I think maybe a little more monitoring of the review rate of things in the runways and some gentle prodding of people to look at ones that are burning time and not seeing much review would maybe improve things a bit. --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] increasing the number of allowed volumes attached per instance > 26
> Some ideas that have been discussed so far include: FYI, these are already in my order of preference. > A) Selecting a new, higher maximum that still yields reasonable > performance on a single compute host (64 or 128, for example). Pros: > helps prevent the potential for poor performance on a compute host > from attaching too many volumes. Cons: doesn't let anyone opt-in to a > higher maximum if their environment can handle it. I prefer this because I think it can be done per virt driver, for whatever actually makes sense there. If powervm can handle 500 volumes in a meaningful way on one instance, then that's cool. I think libvirt's limit should likely be 64ish. > B) Creating a config option to let operators choose how many volumes > allowed to attach to a single instance. Pros: lets operators opt-in to > a maximum that works in their environment. Cons: it's not discoverable > for those calling the API. This is a fine compromise, IMHO, as it lets operators tune it per compute node based on the virt driver and the hardware. If one compute is using nothing but iSCSI over a single 10g link, then they may need to clamp that down to something more sane. Like the per virt driver restriction above, it's not discoverable via the API, but if it varies based on compute node and other factors in a single deployment, then making it discoverable isn't going to be very easy anyway. > C) Create a configurable API limit for maximum number of volumes to > attach to a single instance that is either a quota or similar to a > quota. Pros: lets operators opt-in to a maximum that works in their > environment. Cons: it's yet another quota? Do we have any other quota limits that are per-instance like this would be? If not, then this would likely be weird, but if so, then this would also be an option, IMHO. However, it's too much work for what is really not a hugely important problem, IMHO, and both of the above are lighter-weight ways to solve this and move on. 
--Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
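[Options A and B above could even compose: a per-virt-driver default that a config option overrides. A minimal sketch of that fallback logic -- every name and value here is made up for illustration, not nova's actual config or driver API:]

```python
# Hypothetical per-driver defaults; the numbers are illustrative only,
# echoing the discussion above (libvirt ~64, powervm much higher).
DRIVER_DEFAULT_MAX_VOLUMES = {
    'libvirt': 64,
    'powervm': 500,
}


def max_volumes_per_instance(driver_name, conf_value=None):
    """An operator-set config value wins; otherwise fall back to the
    virt driver's default, and finally to the historical limit of 26."""
    if conf_value is not None:
        return conf_value
    return DRIVER_DEFAULT_MAX_VOLUMES.get(driver_name, 26)
```

That preserves the "reasonably sane default" property of option A while still letting an operator clamp a single-10g-link iSCSI compute down per node, per option B.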
Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers
> FWIW, I don't have a problem with the virt driver "knowing about > allocations". What I have a problem with is the virt driver *claiming > resources for an instance*. +1000. > That's what the whole placement-claims-resources thing was all about, > and I'm not interested in stepping back to the days of long racy claim > operations by having the compute nodes be responsible for claiming > resources. > > That said, once the consumer generation microversion lands [1], it > should be possible to *safely* modify an allocation set for a consumer > (instance) and move allocation records for an instance from one > provider to another. Agreed. I'm hesitant to have the compute nodes arguing with the scheduler even to patch things up, given the mess we just cleaned up. The thing that I think makes this okay is that one compute node cleaning/pivoting allocations for instances isn't going to be fighting anything else whilst doing it. Migrations and new instance builds, where the source/destination or scheduler/compute aren't clear on who owns the allocation, are a problem. That said, we need to make sure we can handle the case where an instance is in resize_confirm state across a boundary where we go from non-NRP to NRP. It *should* be okay for the compute to handle this by updating the instance's allocation held by the migration instead of the instance itself, if the compute determines that it is the source. --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
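[Mechanically, the "safely modify an allocation set" step amounts to rewriting the provider keys in the consumer's allocation body and PUTting it back with the unchanged consumer_generation, so a concurrent writer would conflict. The pure data transformation might look like this -- illustrative only, not nova's code, though the payload shape follows placement's GET /allocations/{consumer} response:]

```python
import copy


def move_resources(alloc, root_rp, child_rp, classes):
    """Move the given resource classes from root_rp to child_rp.

    `alloc` mimics a GET /allocations/{consumer_uuid} body; the
    returned copy is suitable for a PUT with the same
    consumer_generation, leaving other fields untouched.
    """
    new = copy.deepcopy(alloc)
    allocs = new['allocations']
    root_res = allocs[root_rp]['resources']
    child_res = allocs.setdefault(child_rp, {'resources': {}})['resources']
    for rc in classes:
        if rc in root_res:
            child_res[rc] = root_res.pop(rc)
    # Drop the root entry entirely if nothing is left on it.
    if not root_res:
        del allocs[root_rp]
    return new
```

The resize_confirm wrinkle noted above is then just a question of which consumer's allocation body you feed through this: the migration's, not the instance's, when the compute is the source.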
Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers
> Dan, you are leaving out the parts of my response where I am agreeing > with you and saying that your "Option #2" is probably the things we > should go with. No, what you said was: >> I would vote for Option #2 if it comes down to it. Implying (to me at least) that you still weren't in favor of either, but would choose that as the least offensive option :) I didn't quote it because I didn't have any response. I just wanted to address the other assertions about what is and isn't a common upgrade scenario, which I think is the important data we need to consider when making a decision here. I didn't mean to imply or hide anything with my message trimming, so sorry if it came across as such. --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers
> So, you're saying the normal process is to try upgrading the Linux > kernel and associated low-level libs, wait the requisite amount of > time that takes (can be a long time) and just hope that everything > comes back OK? That doesn't sound like any upgrade I've ever seen. I'm saying I think it's a process practiced by some to install the new kernel and libs and then reboot to activate, yeah. > No, sorry if I wasn't clear. They can live-migrate the instances off > of the to-be-upgraded compute host. They would only need to > cold-migrate instances that use the aforementioned non-movable > resources. I don't think it's reasonable to force people to have to move every instance in their cloud (live or otherwise) in order to upgrade. That means that people who currently do their upgrades in-place in one step, now have to do their upgrade in N steps, for N compute nodes. That doesn't seem reasonable to me. > If we are going to go through the hassle of writing a bunch of > transformation code in order to keep operator action as low as > possible, I would prefer to consolidate all of this code into the > nova-manage (or nova-status) tool and put some sort of > attribute/marker on each compute node record to indicate whether a > "heal" operation has occurred for that compute node. We need to know details of each compute node in order to do that. We could make the tool external and something they run per-compute node, but that still makes it N steps, even if the N steps are lighter weight. > Someone (maybe Gibi?) on this thread had mentioned having the virt > driver (in update_provider_tree) do the whole set reserved = total > thing when first attempting to create the child providers. 
That would > work to prevent the scheduler from attempting to place workloads on > those child providers, but we would still need some marker on the > compute node to indicate to the nova-manage heal_nested_providers (or > whatever) command that the compute node has had its provider tree > validated/healed, right? So that means you restart your cloud and it's basically locked up until you perform the N steps to unlock N nodes? That also seems like it's not going to make us very popular on the playground :) I need to go read Eric's tome on how to handle the communication of things from virt to compute so that this translation can be done. I'm not saying I have the answer, I'm just saying that making this the problem of the operators doesn't seem like a solution to me, and that we should figure out how we're going to do this before we go down the rabbit hole. --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers
> My feeling is that we should not attempt to "migrate" any allocations > or inventories between root or child providers within a compute node, > period. While I agree this is the simplest approach, it does put a lot of responsibility on the operators to do work to sidestep this issue, which might not even apply to them (and knowing if it does might be difficult). > The virt drivers should simply error out of update_provider_tree() if > there are ANY existing VMs on the host AND the virt driver wishes to > begin tracking resources with nested providers. > > The upgrade operation should look like this: > > 1) Upgrade placement > 2) Upgrade nova-scheduler > 3) start loop on compute nodes. for each compute node: > 3a) disable nova-compute service on node (to take it out of scheduling) > 3b) evacuate all existing VMs off of node You mean s/evacuate/cold migrate/ of course... :) > 3c) upgrade compute node (on restart, the compute node will see no > VMs running on the node and will construct the provider tree inside > update_provider_tree() with an appropriate set of child providers > and inventories on those child providers) > 3d) enable nova-compute service on node > > Which is virtually identical to the "normal" upgrade process whenever > there are significant changes to the compute node -- such as upgrading > libvirt or the kernel. Not necessarily. It's totally legit (and I expect quite common) to just reboot the host to take kernel changes, bringing back all the instances that were there when it resumes. The "normal" case of moving things around slide-puzzle-style applies to live migration (which isn't an option here). I think people that can take downtime for the instances would rather not have to move things around for no reason if the instance has to get shut off anyway. > Nested resource tracking is another such significant change and should > be dealt with in a similar way, IMHO. 
This basically says that for anyone to move to rocky, they will have to cold migrate every single instance in order to do that upgrade right? I mean, anyone with two socket machines or SRIOV NICs would end up with at least one level of nesting, correct? Forcing everyone to move everything to do an upgrade seems like a non-starter to me. We also need to consider the case where people would be FFU'ing past rocky (i.e. never running rocky computes). We've previously said that we'd provide a way to push any needed transitions with everything offline to facilitate that case, so I think we need to implement that method anyway. I kinda think we need to either: 1. Make everything perform the pivot on compute node start (which can be re-used by a CLI tool for the offline case) 2. Make everything default to non-nested inventory at first, and provide a way to migrate a compute node and its instances one at a time (in place) to roll through. We can also document "or do the cold-migration slide puzzle thing" as an alternative for people that feel that's more reasonable. I just think that forcing people to take down their data plane to work around our own data model is kinda evil and something we should be avoiding at this level of project maturity. What we're really saying is "we know how to translate A into B, but we require you to move many GBs of data over the network and take some downtime because it's easier for *us* than making it seamless." --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] z/VM introducing a new config drive format
> Can I know a use case for this 'live copy of metadata' or the 'only way > to access device tags when hot-attaching'? My thought is this is a one-time > thing on the cloud-init side, either through the metadata service or config > drive, and won't be used later -- so why do I need a live copy? If I do something like this:

nova interface-attach --tag=data-network --port-id=foo myserver

Then we update the device metadata live, which is visible immediately via the metadata service. However, in config drive, that only gets updated the next time the drive is generated (which may be a long time away). For more information on device metadata, see: https://specs.openstack.org/openstack/nova-specs/specs/mitaka/approved/virt-device-role-tagging.html Further, some of the drivers support setting the admin password securely via metadata, which similarly requires the instance pulling updated information out, which wouldn't be available in the config drive. For reference: https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L1985-L1993 --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
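[From the guest side, consuming those tags after a hot-attach is just re-fetching meta_data.json and filtering the devices list. The JSON shape below follows the device-role-tagging spec linked above; the actual HTTP fetch from the metadata service (169.254.169.254) is elided, and the sample values are invented:]

```python
def find_devices_by_tag(meta_data, tag):
    """Return devices carrying `tag` from a meta_data.json-style dict.

    With the metadata service this reflects hot-attached devices
    immediately; a config drive only shows what existed when the
    drive was last generated.
    """
    return [d for d in meta_data.get('devices', [])
            if tag in d.get('tags', [])]


# Sample payload a guest might see after the interface-attach above.
sample = {
    'devices': [
        {'type': 'nic', 'bus': 'pci', 'address': '0000:00:02.0',
         'mac': 'fa:16:3e:00:00:01', 'tags': ['data-network']},
    ],
}
```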
Re: [openstack-dev] [StarlingX] StarlingX code followup discussions
> For example, I look at your nova fork and it has a "don't allow this > call during an upgrade" decorator on many API calls. Why wasn't that > done upstream? It doesn't seem overly controversial, so it would be > useful to understand the reasoning for that change. Interesting. We have internal accounting for service versions and can make a determination of if we're in an upgrade scenario (and do block operations until the upgrade is over). Unless this decorator you're looking at checks some non-upstream is-during-upgrade flag, this would be an easy thing to close the gap on. --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
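[The decorator pattern being described is easy to sketch against an is-during-upgrade predicate. The predicate itself -- e.g. "minimum service version across the deployment is older than the current code" -- is the part nova already tracks internally; everything named below is hypothetical, not the StarlingX or nova implementation:]

```python
import functools


def block_during_upgrade(is_upgrading):
    """Reject the decorated call while a rolling upgrade is in progress.

    `is_upgrading` is a callable standing in for a real check, such as
    comparing the minimum recorded service version to the current one.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if is_upgrading():
                raise RuntimeError(
                    '%s is not allowed during an upgrade' % func.__name__)
            return func(*args, **kwargs)
        return wrapper
    return decorator
```

If the fork's decorator is checking something like this against upstream's existing service-version accounting, closing the gap upstream would mostly be a matter of agreeing on which API calls to guard.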
Re: [openstack-dev] [nova] nova-manage cell_v2 map_instances uses invalid UUID as marker in the db
> Takashi Natsume writes: > >> In some compute REST APIs, it returns the 'marker' parameter >> in their pagination. >> Then users can specify the 'marker' parameter in the next request. I read this as you saying there was some way that the in-band marker mapping could be leaked to the user via the REST API. However, if you meant to just offer up the REST API's pagination as an example that we could follow in the nova-manage CLI, requiring users to provide the marker each time, then ignore this part: > How is this possible? The only way we would get the marker is if we > either (a) listed the mappings by project_id, using > INSTANCE_MAPPING_MARKER as the query value, or (b) listed all the > mappings and somehow returned those to the user. > > I don't think (a) is a thing, and I'm not seeing how (b) could be > either. If you know of a place, please write a functional test for it > and we can get it resolved. In my proposed patch, I added a filter to > ensure that this doesn't show up in the get_by_cell_id() query, but > again, I'm not sure how this would ever be exposed to a user. > > https://review.openstack.org/#/c/567669/1/nova/objects/instance_mapping.py@173 As I said in my reply to gibi, I don't think making the user keep track of the marker is a very nice UX for a management CLI, nor is it as convenient for something like puppet to run as it has to parse the (grossly verbose) output each time to extract that marker. --Dan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] nova-manage cell_v2 map_instances uses invalid UUID as marker in the db
Takashi Natsume writes: > In some compute REST APIs, it returns the 'marker' parameter > in their pagination. > Then users can specify the 'marker' parameter in the next request. How is this possible? The only way we would get the marker is if we either (a) listed the mappings by project_id, using INSTANCE_MAPPING_MARKER as the query value, or (b) listed all the mappings and somehow returned those to the user. I don't think (a) is a thing, and I'm not seeing how (b) could be either. If you know of a place, please write a functional test for it and we can get it resolved. In my proposed patch, I added a filter to ensure that this doesn't show up in the get_by_cell_id() query, but again, I'm not sure how this would ever be exposed to a user. https://review.openstack.org/#/c/567669/1/nova/objects/instance_mapping.py@173 --Dan
Re: [openstack-dev] [nova] nova-manage cell_v2 map_instances uses invalid UUID as marker in the db
> The oslo UUIDField emits a warning if the string used as a field value > does not pass the validation of the uuid.UUID(str(value)) call > [3]. All the offending places are fixed in nova except the nova-manage > cell_v2 map_instances call [1][2]. That call uses markers in the DB > that are not valid UUIDs. No, that call uses markers in the DB that don't fit the canonical string representation of a UUID that the oslo library is looking for. There are many ways to serialize a UUID: https://en.wikipedia.org/wiki/Universally_unique_identifier#Format The 8-4-4-4-12 format is one of them (and the most popular). Changing the dashes to spaces does not make it not a UUID, it makes it not the same _string_, and it's done (for better or worse) in the aforementioned code to skirt the database's UUID-ignorant _string_ uniqueness constraint. > If we could fix this last offender then we could merge the patch [4] > that changes the this warning to an exception in the nova tests to > avoid such future rule violations. > > However I'm not sure it is easy to fix. Replacing > 'INSTANCE_MIGRATION_MARKER' at [1] with > '----' might work The project_id field on the object is not a UUIDField, nor is it 36 characters in the database schema. It can't be, because project ids are not guaranteed to be UUIDs. > but I don't know what to do with instance_uuid.replace(' ', '-') [2] > to make it a valid uuid. Also I think that if there is an unfinished > mapping in the deployment and then the marker is changed in the code > that leads to inconsistencies. IMHO, it would be bad to do anything that breaks people in the middle of a mapping procedure. While I understand the desire to have fewer spurious warnings in the test runs, I feel like doing anything to impact the UX or performance of runtime code to make the unit test output cleaner is a bad idea. > I'm open to any suggestions. We already store values in this field that are not 8-4-4-4-12, and the oslo field warning is just a warning. 
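The value-vs-string distinction can be shown directly. This is an illustrative sketch (the example UUID is made up, not an actual marker value), not nova code:

```python
import uuid

canonical = "8ff1ab1a-d2ee-4b60-ad31-0659d6dbaf7b"  # example value only
marker = canonical.replace("-", " ")  # the space-for-dash trick described above

# The marker no longer matches the canonical 8-4-4-4-12 string...
assert marker != canonical
# ...but reversing the substitution yields the exact same UUID value,
# which is why the marker is still "a UUID", just serialized differently.
assert uuid.UUID(marker.replace(" ", "-")) == uuid.UUID(canonical)
```

The string uniqueness constraint in the database only sees the two different strings, which is the whole point of the trick.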
If people feel like we need to do something, I propose we just do this: https://review.openstack.org/#/c/567669/ It is one of those "we normally wouldn't do this with object schemas, but we know this is okay" sort of situations. Personally, I'd just make the offending tests shut up about the warning and move on, but I'm also okay with the above solution if people prefer. --Dan
Re: [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild
> I'm late to this thread but I finally went through the replies and my > thought is, we should do a pre-flight check to verify with placement > whether the image traits requested are 1) supported by the compute > host the instance is residing on and 2) coincide with the > already-existing allocations. Instead of making an assumption based on > "last image" vs "new image" and artificially limiting a rebuild that > should be valid to go ahead. I can imagine scenarios where a user is > trying to do a rebuild that their cloud admin says should be perfectly > valid on their hypervisor, but it's getting rejected because old image > traits != new image traits. It seems like unnecessary user and admin > pain. Yeah, I think we have to do this. > It doesn't seem correct to reject the request if the current compute > host can fulfill it, and if I understood correctly, we have placement > APIs we can call from the conductor to verify the image traits > requested for the rebuild can be fulfilled. Is there a reason not to > do that? Well, it's a little icky in that it makes a random part of conductor a bit like the scheduler in its understanding of and interaction with placement. I don't love it, but I think it's what we have to do. Trying to do the trait math with what was used before, or conservatively rejecting the request and being potentially wrong about that, is not reasonable, IMHO. --Dan
Re: [openstack-dev] [Nova] z/VM introducing a new config drive format
> According to the requirements and comments, we have now enabled the CI runs with > run_validation = True. And according to [1] below, for example, [2] > needs the ssh validation to pass the test. > > There were also a couple of comments asking for some enhancements to the CI logs, > such as the format and legacy incorrect log links, etc. The newest > log samples can be found at [3] (take n-cpu as an example; those logs > end with _white.html). > > Also, the blueprint [4] requested by the previous discussion is posted here > again for reference. > > Please let us know whether the procedural -2 can be removed in order to > proceed. Thanks for your help. The CI log format issues look fixed to me and validation is turned on for the stuff supported, which is what was keeping it out of the runway. I still plan to leave the -2 on there until the next few patches have agreement, just so we don't land an empty shell driver before we are sure we're going to land spawn/destroy, etc. That's pretty normal procedure and I'll be around to remove it when appropriate. --Dan
Re: [openstack-dev] [Nova] z/VM introducing a new config drive format
> Having briefly read the cloud-init snippet which was linked earlier in > this thread, the requirement seems to be that the guest exposes the > device as /dev/srX or /dev/cdX. So I guess in order to make this work: > > * You need to tell z/VM to expose the virtual disk as an optical disk > * The z/VM kernel needs to call optical disks /dev/srX or /dev/cdX According to the docs, it doesn't need to be. You can indicate the configdrive via filesystem label which makes sense given we support vfat for it as well. http://cloudinit.readthedocs.io/en/latest/topics/datasources/configdrive.html#version-2 --Dan
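To illustrate label-based detection: the cloud-init docs linked above describe finding the drive by the filesystem label "config-2" (e.g. via `blkid -t LABEL=config-2 -o device`), so the device name itself is irrelevant. A minimal sketch of parsing that output (helper name hypothetical):

```python
def pick_config_drive(blkid_output: str):
    """Return the first device path from `blkid -t LABEL=config-2 -o device`
    output (one device per line), or None if no config drive was found."""
    devices = blkid_output.split()
    return devices[0] if devices else None
```

With this approach, whether z/VM exposes the disk as /dev/srX, /dev/cdX, or anything else doesn't matter, as long as the label survives.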
Re: [openstack-dev] [nova] Concern about trusted certificates API change
> Maybe it wasn't clear but I'm not advocating that we block the change > until volume-backed instances are supported with trusted certs. I'm > suggesting we add a policy rule which allows deployers to at least > disable it via policy if it's not supported for their cloud. That's fine with me, and provides an out for another issue I pointed out on the code review. Basically, the operator has no way to disable this feature. If they haven't set this up properly and have no desire to, a user reading the API spec and passing trusted certs will not be able to boot an instance and not really understand why. > I agree. I'm the one that noticed the issue and pointed out in the > code review that we should explicitly fail the request if we can't > honor it. I agree for the moment for sure, but it would obviously be nice not to open another gap we're not going to close. There's no reason this can't be supported for volume-backed instances, it just requires some help from cinder. I would think that it'd be nice if we could declare the "can't do this for reasons" response as a valid one regardless of the cause so we don't need another microversion for the future where volume-backed instances can do this. > Again, I'm not advocating that we block until boot from volume is > supported. However, we have a lot of technical debt for "good > functionality" added over the years that failed to consider > volume-backed instances, like rebuild, rescue, backup, etc and it's > painful to deal with that after the fact, as can be seen from the > various specs proposed for adding that support to those APIs. Totes agree. --Dan
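The explicitly-fail-the-request behavior discussed above can be sketched as a small pre-flight check. This is a hypothetical helper, not the actual nova code; the policy flag and error text are assumptions:

```python
def validate_trusted_certs(trusted_certs, volume_backed, enabled_by_policy):
    """Fail fast instead of booting an instance whose cert check can't be honored."""
    if not trusted_certs:
        return  # nothing requested, nothing to validate
    if not enabled_by_policy:
        raise ValueError("trusted image certificates are disabled by policy")
    if volume_backed:
        raise ValueError("trusted image certificates are not supported "
                         "for volume-backed instances")
```

Declaring this rejection a valid response regardless of cause, as suggested above, would mean the same error path serves both the policy and the volume-backed case without a new microversion later.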
Re: [openstack-dev] [Nova] z/VM introducing a new config drive format
> Thanks for the concern and I fully understand it. The major reason is that > cloud-init doesn't have a hook or plugin before it starts to read the > config drive (ISO disk). z/VM is an old hypervisor and there is no way to do > something like libvirt does to define an ISO-format disk in an xml definition; > instead, it can define disks in the definition of the virtual machine and > let the VM decide their format. > > So we need a way to tell cloud-init where to find the ISO file before > cloud-init starts, but without AE we can't handle that... some updates on > the spec here for further information: > https://review.openstack.org/#/c/562154/ The ISO format does not come from telling libvirt something about it. The host creates and formats the image, adds the data, and then attaches it to the instance. The latter part is the only step that involves configuring libvirt to attach the image to the instance. The rest is just stuff done by nova-compute (and the virt driver) on the linux system it's running on. That's the same arrangement as your driver, AFAICT. You're asking the hypervisor (or something running on it) to grab the image from glance, pre-filled with data. This is no different, except that the configdrive image comes from the system running the compute service. I don't see how it's any different in actual hypervisor mechanics, and thus feel like there _has_ to be a way to do this without the AE magic agent. I agree with Mikal that needing more agent behavior than cloud-init does a disservice to the users. I feel like we get a lot of "but no, my hypervisor is special!" reasoning when people go to add a driver to nova. So far, I think they're a lot more similar than people think. Ironic is the weirdest one we have (IMHO and no offense to the ironic folks) and it can support configdrive properly. 
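To make the host-side flow concrete: the compute host writes the metadata files to a directory, shells out to an ISO builder, and only then attaches the finished image. A rough sketch of constructing that command; the tool and flag set are assumptions based on common ISO builders, not necessarily exactly what nova passes:

```python
def configdrive_cmd(metadata_dir: str, out_path: str):
    # genisoimage (or mkisofs) runs on the compute host; the hypervisor only
    # ever sees the finished image. The "config-2" volume label is what
    # cloud-init's configdrive datasource looks for.
    return ["genisoimage", "-o", out_path, "-J", "-r", "-quiet",
            "-V", "config-2", metadata_dir]
```

Nothing in this step requires the hypervisor to understand ISO, which is the point being made above.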
--Dan
Re: [openstack-dev] [Nova] z/VM introducing a new config drive format
> I propose that we remove the z/VM driver blueprint from the runway at > this time and place it back into the queue while work on the driver > continues. At a minimum, we need to see z/VM CI running with > [validation]run_validation = True in tempest.conf before we add the > z/VM driver blueprint back into a runway in the future. Agreed. I also want to see the CI reporting cleaned up so that it's readable and consistent. Yesterday I pointed out some issues with the fact that the actual config files being used are not the ones being uploaded. There are also duplicate (but not actually identical) logs from all services being uploaded, including things like a full compute log from starting with the libvirt driver. I'm also pretty troubled by the total lack of support for the metadata service. I know it's technically optional on our matrix, but it's a pretty important feature for a lot of scenarios, and it's also a dependency for other features that we'd like to have wider support for (like attached device metadata). Going back to the spec, I see very little detail on some of the things raised here, and very (very) little review back when it was first approved. I'd also like to see more detail added to the spec about all of these things, especially around required special changes like this extra AE agent. --Dan
Re: [openstack-dev] [Nova][Deployers] Optional, platform specific, dependencies in requirements.txt
>> global ironic >> if ironic is None: >> ironic = importutils.import_module('ironicclient') I believe ironic was an early example of a client library we hot-loaded, and I believe at the time we said this was a pattern we were going to follow. Personally, I think this makes plenty of sense and I think that even moving things like the python-libvirt load out to something like this to avoid hyperv people having to nuke it from requirements makes sense. > I have a pretty strong dislike for this mechanism. For one thing, I'm > frustrated when I can't use hotkeys to jump to an ironicclient method > because my IDE doesn't recognize that dynamic import. I have to go look > up the symbol some other way (and hope I'm getting the right one). To > me (with my bias as a dev rather than a deployer) that's way worse than > having the 704KB python-ironicclient installed on my machine even though This seems like a terrible reason to make everyone install ironicclient (or the z/vm client) on their systems at runtime. --Dan
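The hot-load pattern quoted above can be sketched with plain importlib rather than oslo's importutils; here the stdlib `json` module stands in for `ironicclient` so the sketch is runnable anywhere:

```python
import importlib

_client = None  # module-level cache, mirroring the `global ironic` pattern

def get_client_module(name="json"):  # "json" stands in for "ironicclient"
    # Import lazily and only once, so deployments that never use this
    # driver never need the client library installed at all.
    global _client
    if _client is None:
        _client = importlib.import_module(name)
    return _client
```

The deployer-facing win is exactly the one described above: the import cost is only paid by people who actually use the driver.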
Re: [openstack-dev] [Nova] z/VM introducing a new config drive format
> For the run_validation=False issue, you are right: because the z/VM driver > only supports config drive and doesn't support the metadata service, we made > a bad assumption and took the wrong action of disabling the whole ssh check. > Actually, according to [1], we should only disable > CONF.compute_feature_enabled.metadata_service but keep both > self.run_ssh and CONF.compute_feature_enabled.config_drive as True in > order to make the config drive test validation take effect; our CI will > handle that. Why don't you support the metadata service? That's a pretty fundamental mechanism for nova and openstack. It's the only way you can get a live copy of metadata, and it's the only way you can get access to device tags when you hot-attach something. Personally, I think that it's something that needs to work. > For the tgz/iso9660 question below, this is because we got wrong info > from low-layer component folks back in 2012. After discussing with > some experts again, we actually can create iso9660 in the driver layer > and pass it down to the spawned virtual machine, and during the startup > process the VM itself will mount the iso file and consume it, because > from the linux perspective either tgz or iso9660 doesn't matter; we only > need some files in order to transfer the information from the openstack > compute node to the spawned VM. So our action is to change the format > from tgz to iso9660 and keep consistent with other drivers. The "iso file" will not be inside the guest, but rather passed to the guest as a block device, right? > For the config drive working mechanism question, according to [2] z/VM > is a Type 1 hypervisor while Qemu/KVM are most likely Type 2 > hypervisors. There is no file system in the z/VM hypervisor (I omit too > much detail here), so we can't do something like a linux operating > system does and keep a file such as a qcow2 image in the host operating system. I'm not sure what the type-1-ness has to do with this. 
The hypervisor doesn't need to support any specific filesystem for this to work. Many drivers we have in the tree are type-1 (xen, vmware, hyperv, powervm) and you can argue that KVM is type-1-ish. They support configdrive. > What we do is use a special file pool to store the config drive, and > during the VM init process we read that file from a special device and > attach it to the VM in iso9660 format; then cloud-init will handle the > follow-up. The cloud-init handling process is identical to other platforms. This and the previous mention of this sort of behavior has me concerned. Are you describing some sort of process that runs when the instance is starting to initialize its environment, or something that runs *inside* the instance and thus functionality that has to exist in the *image* to work? --Dan
Re: [openstack-dev] [Nova] z/VM introducing a new config drive format
> https://review.openstack.org/#/c/527658 is a z/VM patch which > introduces their support for config drive. They do this by attaching a > tarball to the instance, having pretended in the nova code that it is > an iso9660. This worries me. > > In the past we've been concerned about adding new filesystem formats > for config drives, and the long term support implications of that -- > the filesystem formats for config drive that we use today were > carefully selected as being universally supported by our guest > operating systems. > > The previous example we've had of these issues is the parallels > driver, which had similar "my hypervisor doesn't support these > filesystem format" concerns. We worked around those concerns IIRC, and > certainly virt.configdrive still only supports iso9660 and vfat. Yeah, IIRC, the difference with the parallels driver was that it ends up mounted in the container automagically for the guest by the..uh..man behind the curtain. However, z/VM being much more VM-y I imagine that the guest is just expected to grab that blob and do something with it to extract it on local disk at runtime or something. That concerns me too. In the past I've likened adding filesystem (or format, in this case) options to configdrive as a guest ABI change. I think the stability of what we present to guests is second only to our external API in terms of importance. I know z/VM is "weird" or "different", but I wouldn't want a more conventional hypervisor exposing the configdrive as a tarball, so I don't really think it's a precedent we should set. Both vfat and iso9660 are easily supportable by most everything on the planet so I don't think it's an unreasonable bar. --Dan
Re: [openstack-dev] [nova] [cyborg] Race condition in the Cyborg/Nova flow
> ==> Fully dynamic: You can program one region with one function, and > then still program a different region with a different function, etc. Note that this is also the case if you don't have virtualized multi-slot devices. Like, if you had one that only has one region. Consuming it consumes the one and only inventory. > ==> Single program: Once you program the card with a function, *all* its > virtual slots are *only* capable of that function until the card is > reprogrammed. And while any slot is in use, you can't reprogram. This > is Sundar's FPGA use case. It is also Sylvain's VGPU use case. > > The "fully dynamic" case is straightforward (in the sense of being what > placement was architected to handle). > * Model the PF/region as a resource provider. > * The RP has inventory of some generic resource class (e.g. "VGPU", > "SRIOV_NET_VF", "FPGA_FUNCTION"). Allocations consume that inventory, > plain and simple. > * As a region gets programmed dynamically, it's acceptable for the thing > doing the programming to set a trait indicating that that function is in > play. (Sundar, this is the thing I originally said would get > resistance; but we've agreed it's okay. No blood was shed :) > * Requests *may* use preferred traits to help them land on a card that > already has their function flashed on it. (Prerequisite: preferred > traits, which can be implemented in placement. Candidates with the most > preferred traits get sorted highest.) Yup. > The "single program" case needs to be handled more like what Alex > describes below. TL;DR: We do *not* support dynamic programming, > traiting, or inventorying at instance boot time - it all has to be done > "up front". > * The PFs can be initially modeled as "empty" resource providers. Or > maybe not at all. Either way, *they can not be deployed* in this state. > * An operator or admin (via a CLI, config file, agent like blazar or > cyborg, etc.) preprograms the PF to have the specific desired > function/configuration. 
> * This may be cyborg/blazar pre-programming devices to maintain an > available set of each function > * This may be in response to a user requesting some function, which > causes a new image to be laid down on a device so it will be available > for scheduling > * This may be a human doing it at cloud-build time > * This results in the resource provider being (created and) set up with > the inventory and traits appropriate to that function. > * Now deploys can happen, using required traits representing the desired > function. ...and it could be in response to something noticing that a recent nova boot failed to find any candidates with a particular function, which provisions that thing so it can be retried. This is kindof the "spot instances" approach -- that same workflow would work here as well, although I expect most people would fit into the above cases. --Dan
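Expressed as placement-style payloads, the "fully dynamic" modeling above might look like the following. All resource class, trait, and provider names here are illustrative, and "preferred_traits" is the not-yet-implemented prerequisite mentioned above, not an existing placement feature:

```python
# One resource provider per programmable region, with generic inventory.
region_provider = {
    "name": "fpga0_region0",
    "inventories": {"FPGA_FUNCTION": {"total": 4}},
}

# Whatever programs the region records the flashed function as a trait.
flashed = {"traits": ["CUSTOM_FUNCTION_GZIP"]}

# A request that would prefer landing on an already-flashed card; the
# "preferred_traits" key is hypothetical (candidates with more preferred
# traits would sort higher rather than being required).
request = {
    "resources": {"FPGA_FUNCTION": 1},
    "preferred_traits": ["CUSTOM_FUNCTION_GZIP"],
}
```

The "single program" case differs only in that the inventory and trait are set up front by an operator or agent, before any deploys are allowed.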
Re: [openstack-dev] [nova] Proposing Eric Fried for nova-core
> To the existing core team members, please respond with your comments, > +1s, or objections within one week. +1. --Dan
Re: [openstack-dev] [nova] Does Cell v2 support for multi-cell deployment in Pike?
> Does Cell v2 support for multi-cell deployment in pike? Is there any > good document about the deployment? In the release notes for Pike (https://docs.openstack.org/releasenotes/nova/pike.html), under the 16.0.0 Prelude, is this: Nova now supports a Cells v2 multi-cell deployment. The default deployment is a single cell. There are known limitations with multiple cells. Refer to the Cells v2 Layout page for more information about deploying multiple cells. There are some links to documentation in that paragraph which should be helpful. --Dan
Re: [openstack-dev] [nova] Rocky spec review day
>> And I, for one, wouldn't be offended if we could "officially start >> development" (i.e. focus on patches, start runways, etc.) before the >> mystical but arbitrary spec freeze date. Yeah, I agree. I see runways as an attempt to add pressure to the earlier part of the cycle, where we're ignoring things that have been ready but aren't super high priority because "we have plenty of time." The later part of the cycle is when we start having to make hard decisions on things to de-focus, and where focus on the important core changes goes up naturally anyway. Personally, I think we're already kinda late in the cycle to be getting going on this, as I would have hoped to exit the PTG with a plan to start operating in the new process immediately. Maybe I'm in the minority there, but I think that if we start this process late in the middle of a cycle, we'll probably need to adjust the prioritization of things in the queue more strictly, and remember that when retrospecting on the process for next cycle. > Sure, but given we have a lot of specs to review, TBH it'll be > possible for me to look at implementation patches only close to the > 1st milestone. I'm not sure I get this. We can't not review code while we review specs for weeks on end. We've already approved 75% of the blueprints (in number) that we completed in queens. One of the intended outcomes of this effort was to complete a higher percentage of what we approved, so we're not lying to contributors and so we have more focused review of things so they actually get completed instead of half-landed. To that end, I would kind of expect that we need to constantly be throttling (or maybe re-starting) spec review/approval rates to keep the queue full enough so we don't run dry, but without just ending up with a thousand approved things that we'll never get to. Anyway, just MHO. Obviously this will be an experiment and we won't get it right the first time. 
--Dan
Re: [openstack-dev] [nova] New image backend: StorPool
> Can you be more specific about what is limiting you when you use > volume-backed instances? Presumably it's because you're taking a trip over iscsi instead of using the native attachment mechanism for the technology that you're using? If so, that's a valid argument, but it's hard to see the tradeoff working in favor of adding all these drivers to nova as well. If cinder doesn't support backend-specific connectors, maybe that's something we could work on? People keep saying that "cinder is where I put my storage, that's how I want to back my instances" when it comes to justifying BFV, and that argument is starting to resonate with me more and more. --Dan
Re: [openstack-dev] [nova] about rebuild instance booted from volume
> Deleting all snapshots would seem dangerous though... > > 1. I want to reset my instance to how it was before > 2. I'll just do a snapshot in case I need any data in the future > 3. rebuild > 4. oops Yep, for sure. I think if there are snapshots, we have to refuse to do the thing. My comment was about the "does nova have authority to destroy the root volume during a rebuild" and I think it does, if delete_on_termination=True, and if there are no snapshots. --Dan
Re: [openstack-dev] [nova] about rebuild instance booted from volume
> Rather than overload delete_on_termination, could another flag like > delete_on_rebuild be added? Isn't delete_on_termination already the field we want? To me, that field means "nova owns this". If that is true, then we should be able to re-image the volume (in-place is ideal, IMHO) and if not, we just fail. Is that reasonable? --Dan
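The rule being converged on across this thread fits in a tiny predicate. This is a hypothetical helper for illustration, not nova code:

```python
def may_reimage_root_volume(delete_on_termination: bool, snapshot_count: int) -> bool:
    # Nova "owns" the root volume only when it is marked for deletion on
    # termination, and re-imaging must be refused while snapshots exist
    # (so a "snapshot first, then rebuild" user doesn't lose data).
    return delete_on_termination and snapshot_count == 0
```

If the predicate is false, the rebuild request just fails rather than falling back to anything destructive.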
Re: [openstack-dev] [Openstack-operators] AggregateMultiTenancyIsolation with multiple (many) projects
> 2. Dan Smith mentioned another idea such that we could index the > aggregate metadata keys like filter_tenant_id0, filter_tenant_id1, > ... filter_tenant_idN and then combine those so you have one host > aggregate filter_tenant_id* key per tenant. Yep, and that's what I've done in my request_filter implementation: https://review.openstack.org/#/c/545002/9/nova/scheduler/request_filter.py Basically it allows any suffix to 'filter_tenant_id' to be processed as a potentially-matching key. Note that I'm hoping we can deprecate/remove the post filter and replace it with this much more efficient version. --Dan
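The suffix matching can be sketched like this (illustrative only; see the linked review for the real implementation, and the one-tenant-per-key layout is the scheme described above):

```python
def aggregate_allows_tenant(metadata: dict, tenant_id: str) -> bool:
    # Any metadata key starting with 'filter_tenant_id' (filter_tenant_id,
    # filter_tenant_id0, filter_tenant_id1, ...) may name an allowed tenant.
    keys = [k for k in metadata if k.startswith("filter_tenant_id")]
    if not keys:
        return True  # this aggregate does not isolate tenants at all
    return any(tenant_id == metadata[k] for k in keys)
```

Because each tenant gets its own suffixed key, the scheme sidesteps the metadata value length limit that motivated the thread.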
Re: [openstack-dev] [all] Switching to longer development cycles
Ed Leafe writes: > I think you're missing the reality that intermediate releases have > about zero uptake in the real world. We have had milestone releases of > Nova for years, but I challenge you to find me one non-trivial > deployment that uses one of them. To my knowledge, based on user > surveys, it is only the major 6-month named releases that are > deployed, and even then, some time after their release. > > Integrated releases make sense for deployers. What does it mean if > Nova has some new stuff, but it requires a new release from Cinder in > order to use it, and Cinder hasn't yet released the necessary updates? > Talking about releasing projects on a monthly-tagged basis just dumps > the problem of determining what works with the rest of the codebase > onto the deployers. Similarly, right now we have easy and uniform points at which we have to make upgrade and compatibility guarantees. Presumably in such a new world order, a project would not be allowed to drop compatibility in an intermediate release, which means we're all being forced into a longer support envelope for versioned APIs, config files, etc. If we did do more of what I assume Doug is suggesting, which is just tag monthly and let the projects decide what to do with upgrades, then we end up with a massively more complex problem (for our own CI, as well as for operators) of mapping out where compatibility begins and ends per-project, instead of at least all aiming for the same point in the timeline. --Dan
Re: [openstack-dev] [all] Switching to longer development cycles
> In my experience, the longer a patch (or worse, patch series) sits > around, the staler it gets. Others are merging changes, so the > long-lived patch series has to be constantly rebased. This is definitely true. > The 20% developer would be spending a greater proportion of her time > figuring out how to solve the rebase conflicts instead of just > focusing on her code. Agreed. The first reaction I had to this proposal was pretty much what you state here: that now the 20% person has a 365-day window in which they have to keep their head in the game, instead of a 180-day one. Assuming doubling the length of the cycle has no impact on the _importance_ of the thing the 20% person is working on, relative to project priorities, then the longer cycle just means they have to continuously rebase for a longer period of time. --Dan
Re: [openstack-dev] [Nova] Privsep transition state of play
> I hope everyone travelling to the Sydney Summit is enjoying jet lag
> just as much as I normally do. Revenge is sweet! My big advice is that
> caffeine is your friend, and to not lick any of the wildlife.

I wasn't planning on licking any of it, but thanks for the warning.

> As of just now, all rootwrap usage has been removed from the libvirt
> driver, if you assume that the outstanding patches from the blueprint
> are merged. I think that's a pretty cool milestone. That said, I feel
> that https://review.openstack.org/#/c/517516/ needs a short talk to
> make sure that people don't think the implementation approach I've
> taken is confusing -- basically not all methods in nova/privsep are
> now escalated, as we only sometimes escalate our privs for a call. The
> review makes it clearer than I can in an email.

I commented, agreeing with gibi. Make the exceptional cases exceptionally named; assume non-exceptional names are escalated by default.

> We could stop now for Queens if we wanted -- we originally said we'd
> land things early to let them stabilise. That said, we haven't
> actually caused any stability problems so far -- just a few out of
> tree drivers having to play catchup. So we could also go all in and
> get this thing done fully in Queens.

I agree we should steam ahead. I don't really want to hang the fate of the privsep transition on the removal of cellsv1 and nova-network, so personally I'm not opposed to privsepping those bits if you're willing. I also agree that the lack of breakage thus far should give us more confidence that we're safe to continue applying these changes later in the cycle. Just MHO.

--Dan
Re: [openstack-dev] [nova] A way to delete a record in 'host_mappings' table
> But the record in the 'host_mappings' table of the api database is not
> deleted (I tried it with nova master
> 8ca24bf1ff80f39b14726aca22b5cf52603ea5a0). The cell cannot be deleted
> if records for the cell remain in the 'host_mappings' table. (An error
> occurs with the message "There are existing hosts mapped to cell with
> uuid ...".)
>
> Are there any ways (CLI, API) to delete the host record in the
> 'host_mappings' table? I couldn't find it.

Hmm, yeah, I bet this is a gap. Can you file a bug for this? I think making the cell delete check for instances=0 in the cell and then deleting the host mappings along with the cell would be a good idea. We could also add a command to clean up orphaned host records, although hopefully that's an exceptional situation.

--Dan
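To make the proposed behavior concrete, here is a minimal sketch, not actual nova code: `delete_cell` and the dict-based stand-ins for the `instances`, `host_mappings`, and `cell_mappings` tables are invented for illustration. It models "refuse to delete a cell that still has instances; otherwise remove its host mappings along with the cell itself":

```python
# Illustrative sketch only -- not the real nova-manage implementation.
# The lists of dicts stand in for the actual database tables.

def delete_cell(cell_uuid, instances, host_mappings, cell_mappings):
    """Delete a cell mapping, cleaning up its host mappings.

    Refuses if any instances are still mapped to the cell, mirroring
    the "check for instances=0" idea from the thread.
    """
    if any(i['cell_uuid'] == cell_uuid for i in instances):
        raise RuntimeError(
            'Cell %s still has instances; not deleting' % cell_uuid)
    # With zero instances, orphaned host mappings are safe to remove
    # together with the cell record itself.
    host_mappings[:] = [h for h in host_mappings
                        if h['cell_uuid'] != cell_uuid]
    cell_mappings[:] = [c for c in cell_mappings
                        if c['uuid'] != cell_uuid]
```

The point of the sketch is the ordering: the instance check guards the destructive step, so an operator can never strand instance records by deleting their cell out from under them.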
Re: [openstack-dev] [all] Update on Zuul v3 Migration - and what to do about issues
> Any update on where we stand on issues now? Because every single patch
> I tried to land yesterday was killed by POST_FAILURE in various ways.
> Including some really small stuff -
> https://review.openstack.org/#/c/324720/

Yeah, Nova has only landed eight patches since Thursday. Most of those are test-only patches that run a subset of jobs, plus a couple that landed in the wee hours when overall system load was low.

> Do we have a defined point on the calendar for getting the false
> negatives back below the noise threshold, otherwise a rollback is
> implemented so that some of these issues can be addressed in parallel
> without holding up community development?

On Friday I was supportive of the decision to keep steaming forward instead of rolling back. Today, I'm a bit more concerned about the light at the end of the tunnel. The infra folks have been hitting this hard for a long time, and for that I'm very appreciative. I too hope that we're going to revisit mitigation strategies as we approach the weekiversary of being stuck.

--Dan
Re: [openstack-dev] vGPUs support for Nova - Implementation
>> I also think there is value in exposing vGPU in a generic way,
>> irrespective of the underlying implementation (whether it is DEMU,
>> mdev, SR-IOV or whatever approach Hyper-V/VMWare use).
>
> That is a big ask. To start with, all GPUs are not created equal, and
> various vGPU functionality as designed by the GPU vendors is not
> consistent, never mind the quirks added between different hypervisor
> implementations. So I feel like trying to expose this in a generic
> manner is, at least, asking for problems, and more likely bound for
> failure.

I feel the opposite. IMHO, Nova's role in life is not to expose all the quirks of the underlying platform, but rather to provide a useful abstraction on top of those things. In spite of them.

> Nova already exposes plenty of hypervisor-specific functionality (or
> functionality only implemented for one hypervisor), and that's fine.

And those bits of functionality are some of the most problematic we have. Among other reasons, they make it difficult for us to expose Thing 2.0 when we've encoded Thing 1.0 into our API so rigidly. This happens even within one virt driver, where Thing 2.0 is significantly different from Thing 1.0.

The vGPU stuff seems well-suited to the generic modeling work that we've spent the last few years on, and is a perfect example of an area where we can avoid piling more debt onto a not-abstract-enough "model" and instead move forward with the new one. That's certainly my preference, and I think it's actually less work than the debt-ridden way.

--Dan
[openstack-dev] [nova][stable] attn: No approvals for stable/newton right now
Hi all,

Due to a zuulv3 bug, we're running an old nova-network test job on master and, as you would expect, failing hard. As a workaround in the meantime, we're[0] going to disable that job entirely so that it runs nowhere. This makes it not run on master (good) but also not run on stable/newton (not so good).

So, please don't approve anything new for stable/newton until we turn this job back on. That will happen when this patch lands:

https://review.openstack.org/#/c/508638

Thanks!

--Dan

[0]: Note that this is all magic and dedication from the infra people; all I did was stand around and applaud. I'm including myself in the "we" here because I like to feel included by standing next to smart people, not because I did any work.
Re: [openstack-dev] vGPUs support for Nova - Implementation
> The concepts of PCI and SR-IOV are, of course, generic

They are, although the PowerVM guys have already pointed out that they don't even refer to virtual devices by PCI address, and thus anything based on that subsystem isn't going to help them.

> but I think out of principle we should avoid a hypervisor-specific
> integration for vGPU (indeed Citrix has been clear from the beginning
> that the vGPU integration we are proposing is intentionally hypervisor
> agnostic)
>
> I also think there is value in exposing vGPU in a generic way,
> irrespective of the underlying implementation (whether it is DEMU,
> mdev, SR-IOV or whatever approach Hyper-V/VMWare use).

I very much agree, of course.

--Dan
Re: [openstack-dev] vGPUs support for Nova - Implementation
>> In this series of patches we are generalizing the PCI framework to
>> handle MDEV devices. We argue it's a lot of patches, but most of them
>> are small, and the logic behind them is basically to make it
>> understand two new fields, MDEV_PF and MDEV_VF.
>
> That's not really "generalizing the PCI framework to handle MDEV
> devices" :) More like it's just changing the /pci module to understand
> a different device management API, but ok.

Yeah, the series is adding more fields to our PCI structure to allow for more variations in the kinds of things we lump into those tables. This is my primary complaint with this approach, and has been since the topic first came up. I really want to avoid building any more dependency on the existing pci-passthrough mechanisms and focus any new effort on using resource providers for this. The existing pci-passthrough code is almost universally hated, poorly understood and tested, and something we should not be further building upon.

>> In this series of patches we make the libvirt driver support, as
>> usual, returning resources and attaching devices returned by the pci
>> manager. This part can be reused for Resource Provider.
>
> Perhaps, but the idea behind the resource providers framework is to
> treat devices as generic things. Placement doesn't need to know about
> the particular device attachment status.
>
> I quickly went through the patches and left a few comments. The base
> work of pulling some of this out of libvirt is there, but it's all
> focused on the act of populating pci structures from the vgpu
> information we get from libvirt. That code could be made to instead
> populate a resource inventory, but that's about the most of the set
> that looks applicable to the placement-based approach.
>
> As mentioned in IRC and the previous ML discussion, my focus is on the
> nested resource providers work and reviews, along with the other two
> top-priority scheduler items (move operations and alternate hosts).
> I'll do my best to look at your patch series, but please note it's
> lower priority than a number of other items.

FWIW, I'm not really planning to spend any time reviewing it until/unless it is retooled to generate an inventory from the virt driver. With the two patches that report vgpus, and then create guests with them when asked, converted to resource providers, I think that would be enough to have basic vgpu support immediately. No DB migrations, model changes, etc. required. After that, helping to get the nested-rps and traits work landed gets us the ability to expose attributes of different types of those vgpus and opens up a lot of possibilities. IMHO, that's work I'm interested in reviewing.

> One thing that would be very useful, Sahid, is if you could get with
> Eric Fried (efried) on IRC and discuss with him the "generic device
> management" system that was discussed at the PTG. It's likely that the
> /pci module is going to be overhauled in Rocky, and it would be good
> to have the mdev device management API requirements included in that
> discussion.

Definitely this.

--Dan
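As a rough illustration of what "generate an inventory from the virt driver" could mean, here is a sketch, not the actual libvirt driver code: the field names follow the placement inventory format, but the `get_inventory` helper and its vgpu count are invented for this example.

```python
# Hedged sketch: a driver-side helper that reports a VGPU inventory to
# placement instead of stuffing devices into the pci tables. Field names
# match the placement inventory record format; everything else is made up.

def get_inventory(total_vgpus):
    """Return a placement-style inventory dict for this compute node."""
    if total_vgpus <= 0:
        return {}  # no vgpu-capable devices found on this host
    return {
        'VGPU': {
            'total': total_vgpus,
            'min_unit': 1,
            'max_unit': total_vgpus,   # one guest could take them all
            'step_size': 1,
            'allocation_ratio': 1.0,   # vgpus are not overcommitted
            'reserved': 0,
        },
    }
```

The appeal of this shape is exactly what the email argues: placement only sees generic resource classes and counts, so no DB migrations or new model fields are needed on the nova side.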
Re: [openstack-dev] [skip-level-upgrades][fast-forward-upgrades] PTG summary
> - Modify the `supports-upgrades`[3] and
>   `supports-accessible-upgrades`[4] tags
>
> I have yet to look into the formal process around making changes to
> these tags but I will aim to make a start ASAP.

We've previously tried to avoid changing assert tag definitions, because we then have to re-review all of the projects that already have the tags to ensure they meet the new criteria. It might be easier to add a new tag for assert:supports-fast-forward-upgrades with the criteria that are unique to this use case.

We already have a confusing array of upgrade tags, so I would really rather not add more that overlap in complicated ways. Most of the change here is clarification of things I think most people assume, so I don't think the validation effort will be a lot of work.

--Dan
Re: [openstack-dev] [nova] Proposing Balazs Gibizer for nova-core
> So to the existing core team members, please respond with a yay/nay,
> and after about a week or so we should have a decision (knowing a few
> cores are on vacation right now).

+1, on the condition that gibi stops finding so many bugs in the stuff I worked on. It's embarrassing.

--Dan
Re: [openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects
> So, I see your point here, but my concern here is that if we *modify*
> an existing schema migration that has already been tested to properly
> apply a schema change for MySQL/InnoDB and PostgreSQL with code that
> is specific to NDB, we introduce the potential for bugs where users
> report that the same migration works sometimes but fails other times.

This ^. The same goes for really any sort of conditional in a migration where you could end up with a different schema. I know that is Mike's point (to not have that happen), but I think the difficulty is proving and guaranteeing (now and going forward) that they're identical. Modifying a migration in the past is like a late-breaking conditional.

> I would much prefer to *add* a brand new schema migration that handles
> conversion of the entire InnoDB schema at a certain point to an
> NDB-compatible one *after* that point. That way, we isolate the NDB
> changes to one specific schema migration -- and can point users to
> that one specific migration in case bugs arise. This is the reason
> that every release we add a number of "placeholder" schema migration
> numbered files to handle situations such as these.

Yes.

--Dan
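The placeholder-plus-new-migration idea above can be sketched in nova-style sqlalchemy-migrate terms. This is an illustration of the shape, not real migration code: the function names, the `tables` argument, and returning the DDL as a list (instead of executing it) are all inventions for this example.

```python
# Sketch of the "add a brand new migration" approach: placeholders are
# reserved no-ops, and all NDB-specific conversion lives in exactly one
# later migration that is a no-op everywhere else.

def upgrade_placeholder(migrate_engine):
    # Reserved at release time so a backport can fill it in later
    # without renumbering anything.
    pass

def upgrade_ndb_conversion(migrate_engine, tables):
    """Return the DDL this migration would run; empty for non-MySQL.

    Isolating the conversion here means InnoDB/PostgreSQL deployments
    see identical (no-op) behavior, and any NDB bug report points at
    one auditable migration.
    """
    if getattr(migrate_engine, 'name', '') != 'mysql':
        return []
    return ['ALTER TABLE %s ENGINE=NDBCLUSTER' % t for t in tables]
```

The design point being illustrated: the conditional lives in a *new* migration at a fixed point in history, so previously applied migrations remain byte-for-byte what was already tested.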
Re: [openstack-dev] [Openstack-operators] [nova] Does anyone rely on PUT /os-services/disable for non-compute services?
> Are we allowed to cheat and say auto-disabling non-nova-compute
> services on startup is a bug and just fix it that way for #2? :)
> Because (1) it doesn't make sense, as far as we know, and (2) it
> forces the operator to have to use the API to enable them later just
> to fix their nova service-list output.

Yes, definitely.

--Dan
Re: [openstack-dev] [nova] Does anyone rely on PUT /os-services/disable for non-compute services?
> So it seems our options are:
>
> 1. Allow PUT /os-services/{service_uuid} on any type of service, even
>    if it doesn't make sense for non-nova-compute services.
> 2. Change the behavior of [1] to only disable new "nova-compute"
>    services.

Please, #2. Please.

--Dan
Re: [openstack-dev] [nova][scheduler][placement] Allocating Complex Resources
>> b) a compute node could very well have both local disk and shared
>> disk. how would the placement API know which one to pick? This is a
>> sorting/weighing decision and thus is something the scheduler is
>> responsible for.
>
> I remember having this discussion, and we concluded that a compute
> node could either have local or shared resources, but not both. There
> would be a trait to indicate shared disk. Has this changed?

I've always thought we discussed that one of the benefits of this approach was that it _could_ have both. Maybe we said "initially we won't implement stuff so it can have both", but I think the plan has been that we'd be able to support it.

>>> * We already have the information the filter scheduler needs now by
>>>   some other means, right? What are the reasons we don't want to
>>>   use that anymore?
>>
>> The filter scheduler has most of the information, yes. What it
>> doesn't have is the *identifier* (UUID) for things like SRIOV PFs or
>> NUMA cells that the Placement API will use to distinguish between
>> things. In other words, the filter scheduler currently does things
>> like unpack a NUMATopology object into memory and determine a NUMA
>> cell to place an instance to. However, it has no concept that that
>> NUMA cell is (or will soon be, once nested-resource-providers is
>> done) a resource provider in the placement API. Same for SRIOV PFs.
>> Same for VGPUs. Same for FPGAs, etc. That's why we need to return
>> information to the scheduler from the placement API that will allow
>> the scheduler to understand "hey, this NUMA cell on compute node X
>> is resource provider $UUID".

Why shouldn't the scheduler know those relationships? You were the one (well, one of them :P) who specifically wanted to teach the nova scheduler to be in the business of arranging and making claims (allocations) against placement before returning. Why should some parts of the scheduler know about resource providers, but not others? And how would the scheduler be able to make the proper decisions (which require knowledge of hierarchical relationships) without that knowledge? I'm sure I'm missing something obvious, so please correct me.

IMHO, the scheduler should eventually evolve into a thing that mostly deals in the currency of placement, translating those concepts into nova concepts where needed, to avoid placement having to know anything about them. In other words, I would expect to be able to explain the purpose of the scheduler as "applies nova-specific logic to the generic resources that placement says are _valid_, with the goal of determining which one is _best_".

--Dan
Re: [openstack-dev] [nova][scheduler][placement] Allocating Complex Resources
>> My current feeling is that we got ourselves into our existing mess
>> of ugly, convoluted code when we tried to add these complex
>> relationships into the resource tracker and the scheduler. We set
>> out to create the placement engine to bring some sanity back to how
>> we think about things we need to virtualize.
>
> Sorry, I completely disagree with your assessment of why the placement
> engine exists. We didn't create it to bring some sanity back to how we
> think about things we need to virtualize. We created it to add
> consistency and structure to the representation of resources in the
> system.
>
> I don't believe that exposing this structured representation of
> resources is a bad thing or that it is leaking "implementation
> details" out of the placement API. It's not an implementation detail
> that a resource provider is a child of another, or that a different
> resource provider is supplying some resource to a group of other
> providers. That's simply an accurate representation of the underlying
> data structures.

This ^. With the proposal Jay has up, placement is merely exposing some of its own data structures to a client that has declared what it wants. The client has made a request for resources, and placement is returning some allocations that would be valid. None of them are nova-specific at all -- they're all data structures that you would pass to and/or retrieve from placement already.

>> I don't know the answer. I'm hoping that we can have a discussion
>> that might uncover a clear approach, or, at the very least, one that
>> is less murky than the others.
>
> I really like Dan's idea of returning a list of HTTP request bodies
> for POST /allocations/{consumer_uuid} calls along with a list of
> provider information that the scheduler can use in its
> sorting/weighing algorithms.
>
> We've put this straw-man proposal here:
>
> https://review.openstack.org/#/c/471927/
>
> I'm hoping to keep the conversation going there.

This is the clearest option we have, in my opinion. It simplifies what the scheduler has to do, it simplifies what conductor has to do during a retry, and it minimizes the amount of work that something else like cinder would have to do to use placement to schedule resources. Without this, cinder/neutron/whatever has to know about things like aggregates and hierarchical relationships between providers in order to make *any* sane decision about selecting resources. If placement returns valid options with that stuff figured out, then those services can look at the bits they care about and make a decision.

I'd really like us to use the existing straw-man spec as a place to iterate on what that API would look like, assuming we're going to go that route, and work on actual code in both placement and the scheduler to use it. I'm hoping that doing so will help clarify whether this is the right approach, and whether there are other gotchas that we don't yet have on our radar.

We're rapidly running out of runway for Pike here, and I feel like we've got to get moving on this or we're going to have to punt. Since several other things depend on this work, we need to consider the impact on a lot of our Pike commitments if we're not able to get something merged.

--Dan
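To show why returning ready-made request bodies simplifies clients, here is a rough illustration of the straw-man idea. The uuids, resource amounts, and the `pick` helper are invented; the dict shape follows the POST /allocations/{consumer_uuid} request body format of that era, but treat it as an assumption rather than a spec.

```python
# Hedged sketch: placement hands back candidate allocation request
# bodies; the scheduler weighs them and POSTs the winner verbatim.
# Note the client needs no knowledge of aggregates or provider trees --
# the shared-storage case is already spelled out for it.

CANDIDATES = [
    {   # local-disk compute node: one provider supplies everything
        'allocations': [
            {'resource_provider': {'uuid': 'compute-node-1'},
             'resources': {'VCPU': 2, 'MEMORY_MB': 4096, 'DISK_GB': 40}},
        ],
    },
    {   # compute node whose disk comes from a shared-storage provider
        'allocations': [
            {'resource_provider': {'uuid': 'compute-node-2'},
             'resources': {'VCPU': 2, 'MEMORY_MB': 4096}},
            {'resource_provider': {'uuid': 'shared-storage-pool'},
             'resources': {'DISK_GB': 40}},
        ],
    },
]

def pick(candidates, weigh):
    """Nova-specific logic reduced to sorting opaque, valid candidates."""
    return sorted(candidates, key=weigh)[0]
```

This is the "placement says what is _valid_, the scheduler decides what is _best_" division of labor in miniature: cinder or neutron could consume the same candidate list with their own `weigh` function.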
Re: [openstack-dev] [upgrades][skip-level][leapfrog] - RFC - Skipping releases when upgrading
> I haven't looked at what Keystone is doing, but to the degree they are
> using triggers, those triggers would only impact new data operations
> as they continue to run into the schema that is straddling between two
> versions (e.g. old column/table still exists, data should be synced to
> new column/table). If they are actually running a stored procedure to
> migrate existing data (which would be surprising to me...) then I'd
> assume that invokes just like any other "ALTER TABLE" instruction in
> their migrations. If those operations themselves rely on the triggers,
> that's fine.

I haven't looked closely either, but I thought the point _was_ to transform data. If it is, and you run through a bunch of migrations that end at a spot which expects data to have been migrated while running at step 3, triggers dropped at step 7, and the schema compacted at step 11, then just blowing through them all at once could be a problem. It'd work for a greenfield install, no problem, because there was nothing to migrate, but real people would trip over it.

> But a keystone person to chime in would be much better than me just
> making stuff up.

Yeah, same :)

--Dan
Re: [openstack-dev] [upgrades][skip-level][leapfrog] - RFC - Skipping releases when upgrading
> As most of the upgrade issues center around database migrations, we
> discussed some of the potential pitfalls at length. One approach was
> to roll up all DB migrations into a single repository and run all
> upgrades for a given project in one step. Another was to simply have
> multiple python virtual environments and just run in-line migrations
> from a version-specific venv (this is what the OSA tooling does). Does
> one way work better than the other? Any thoughts on how this could be
> better?

IMHO, and speaking from a Nova perspective, I think that maintaining a separate repo of migrations is a bad idea. We occasionally have to fix a migration to handle a case where someone is stuck and can't move past a certain revision due to some situation that was not originally understood. If you have a separate copy of our migrations, you won't get those fixes. Nova hasn't compacted migrations in a while anyway, so there's not a whole lot of value there, I think.

The other thing to consider is that our _schema_ migrations often require _data_ migrations to complete before moving on. That means you really have to move to some milestone version of the schema, then move/transform data, and then move to the next milestone. Since we manage those according to releases, the release boundaries are the milestones most likely to be successful if you're stepping through things.

I do think that the idea of being able to generate a small utility container (using the broad sense of the word) from each release, and using those to step through N, N+1, N+2 to arrive at N+3, makes the most sense. Nova has offline tooling to push our data migrations (even though the command is intended to be runnable online). The concern I would have is over how to push Keystone's migrations mechanically, since I believe they moved forward with their proposal to do data migrations in stored procedures with triggers. Presumably there is a need for something similar to nova's online-data-migrations command which will trip all the triggers and provide a green light for moving on?

In the end, projects support N->N+1 today, so if you're just stepping through actual one-version gaps, you should be able to do as many of those as you want and still be running "supported" transitions. There's a lot of value in that, IMHO.

--Dan
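The schema-then-data stepping described above can be sketched as a plan generator. This is purely illustrative: the `/opt/upgrade/<release>/bin` per-release venv layout is hypothetical, and the plan is returned as strings rather than executed, but the two `nova-manage` subcommands are the real ones used for schema sync and online data migrations.

```python
# Sketch of the "utility container per release" idea: for each release
# boundary, sync the schema to that milestone, then finish its data
# migrations before advancing. Nothing here touches a real deployment;
# it only builds the command sequence an operator (or tool) would run.

def upgrade_plan(releases):
    plan = []
    for release in releases:
        venv = '/opt/upgrade/%s/bin' % release  # hypothetical layout
        # Schema milestone for this release...
        plan.append('%s/nova-manage db sync' % venv)
        # ...and its data migrations must complete before the *next*
        # schema step, per the milestone ordering described above.
        plan.append('%s/nova-manage db online_data_migrations' % venv)
    return plan
```

Each N->N+1 pair in the plan is an individually "supported" transition, which is the property the email argues makes this approach safe.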
Re: [openstack-dev] [Nova] [Cells] Stupid question: Cells v2 & AZs
> Thanks for answering the base question. So, if AZs are implemented
> with haggs, then really, they are truly disjoint from cells (i.e.,
> not a subset of a cell and not a superset of a cell, just unrelated).
> Does that philosophy agree with what you are stating?

Correct: aggregates are at the top level, and they can span cells if you so desire (or not, if you don't configure any that do). The aggregate stuff doesn't know anything about cells -- it only knows about hosts, so it's really independent.

--Dan
[openstack-dev] [nova] Boston Forum session recap - cellsv2
The etherpad for this session is here [1]. The goal of the session was to get some questions answered that the developers had for operators around the topic of cellsv2.

The bulk of the time was spent discussing ways to limit instance scheduling retries in a cellsv2 world, where placement eliminates resource-reservation races. Reschedules would be upcalls from the cell, which we are trying to avoid. While placement should eliminate 95% (or more) of reschedules by pre-claiming resources before booting, there will still be cases where we may want to reschedule due to unexpected transient failures. How many of those remain, and whether rescheduling for them is really useful, is in question. The compromise that seemed popular in the room was to grab more than one host at the time of scheduling, claim for the first, but pass the rest to the cell. If the cell needs to reschedule, the cell conductor would try one of the alternates that came as part of the original boot request, instead of asking the scheduler again.

During this discussion, an operator raised the concern that without reschedules, a single compute that fails to boot 100% of the time becomes a magnet for all future builds: it looks like an excellent target to the scheduler, but fails everything sent to it. If we don't reschedule, that situation could be very problematic. An idea came out of this: the compute should monitor itself and disable itself if a certain number of _consecutive_ build failures crosses a threshold. That would mitigate or eliminate the "fail magnet" behavior and further reduce the need for retries. A patch has been proposed for this, and so far it enjoys wide support [2].

We also discussed the transition to counting quotas, and what that means for operators. The room seemed in favor of this, and discussion was brief.

Finally, I made the call for people with reasonably-sized pre-prod environments to begin testing cellsv2 to help prove it out and find the gremlins. CERN and NeCTAR specifically volunteered for this effort.

[1] https://etherpad.openstack.org/p/BOS-forum-cellsv2-developer-community-coordination
[2] https://review.openstack.org/#/c/463597/

--Dan
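The consecutive-failure self-disable idea from the session could look roughly like this. This is a sketch of the concept only, not the actual proposed patch [2]; the class name and default threshold are invented.

```python
# Hedged sketch of the "fail magnet" mitigation: count *consecutive*
# build failures and stop accepting builds past a threshold. Any success
# resets the streak, so an occasionally flaky host is not disabled.

class BuildFailureMonitor:
    def __init__(self, threshold=10):
        self.threshold = threshold
        self.consecutive_failures = 0

    def build_succeeded(self):
        self.consecutive_failures = 0  # a success resets the streak

    def build_failed(self):
        self.consecutive_failures += 1

    @property
    def should_disable(self):
        # Once tripped, the compute stops looking like an attractive,
        # empty scheduling target -- which is the whole point.
        return self.consecutive_failures >= self.threshold
```

Counting consecutive rather than total failures is what keeps this from punishing a host for ordinary transient errors while still catching the 100%-failure case described above.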
Re: [openstack-dev] [tc] [all] OpenStack moving both too fast and too slow at the same time
> +1. Ocata's cells v2 stuff added a lot of extra required complexity
> with no perceivable benefit to end users. If there was a long-term
> stable version, then putting it in the non-LTS release would have
> been ok. In the absence of LTS, I would have recommended the cells v2
> stuff be done in a branch instead, and merged all together when it
> provided something (Pike, I think).

That's how cellsv1 was developed, and that turned out spectacularly well.

--Dan
Re: [openstack-dev] [nova] [placement] experimenting with extracting placement
Interestingly, we just had a meeting about cells and the scheduler which had quite a bit of overlap with this topic.

> That said, as mentioned in the previous email, the priorities for Pike
> (and likely Queens) will continue to be, in order: traits, ironic,
> shared resource pools, and nested providers.

Given that the CachingScheduler is still a thing until we get claims in the scheduler, and given that the CachingScheduler doesn't use placement like the FilterScheduler does, I think we need to prioritize the claims part of the above list. Based on the discussion several of us just had, the priority list actually needs to be this:

1. Traits
2. Ironic
3. Claims in the scheduler
4. Shared resources
5. Nested resources

Claims in the scheduler is not likely to be a thing for Pike, but it should be something we do as much prep for as possible and land early in Queens. Personally, I think getting to the point of claiming in the scheduler will be easier if we have placement in-tree, and anything we break in that process will be easier to backport if both are in the same tree. However, I'd say that after that goal is met, splitting placement out should be good to go.

--Dan
Re: [openstack-dev] [nova] scaling rabbitmq with cells v2 requires manual database update
> The problem is there's no way to update an existing cell's
> transport_url via nova-manage.

There is: https://review.openstack.org/#/c/431582/

> It appears the only way to get around this is manually deleting the
> old cell1 record from the db.

No, don't do that :)

> I'd like to hear more opinions on this but it really seems like this
> should be a priority to fix prior to the Ocata final release.

Already done!

--Dan
[openstack-dev] [nova] Cells meeting on Feb-15 is canceled
Hi all,

In an epic collision of cosmic coincidences, four of the primary cells meeting attendees have a conflict tomorrow. Since there won't really be anyone around to run (or attend) the meeting, we'll have to cancel again. Next week we will be at the PTG, so any meeting will be done there. So, expect the next cells meeting to be on March 1.

Thanks!

--Dan
Re: [openstack-dev] [nova] FYI: cells v1 job is blocked
> We have a fix here:

Actual link to fix is left as an exercise for the reader?

https://review.openstack.org/#/c/433707

--Dan
[openstack-dev] [nova] Cells meeting canceled
Hi all,

Today's cells meeting is canceled. We're still working on getting ocata out the door, a bunch of normal participants are out today, and not much has transpired for pike just yet.

--Dan
Re: [openstack-dev] [nova] [placement] [operators] Optional resource asking or not?
> Update on that agreement: I made the necessary modification in the
> proposal [1] for not verifying the filters. We now send a request to the
> Placement API by introspecting the flavor, and we get a list of potential
> destinations.

Thanks!

> When I began doing that modification, I knew there was a functional test
> about server groups that needed modifications to match our agreement. I
> consequently made that change in a separate patch [2] as a
> prerequisite for [1].
>
> I then spotted a problem that we didn't identify when discussing this:
> when checking a destination, the legacy filters for CPU, RAM and disk
> don't verify the maximum capacity of the host; they only multiply the
> total size by the allocation ratio, so our proposal works for them.
> Now, when using the placement service, it fails because somewhere in the
> DB call needed for returning the destinations, we also verify a specific
> field named max_unit [3].
>
> Consequently, the proposal we agreed on does not provide feature parity
> between Newton and Ocata. If you follow our instructions, you will still
> get different results from a placement perspective between what was in
> Newton and what will be in Ocata.

To summarize some discussion on IRC:

The max_unit field limits the maximum size of any single allocation and is not scaled by the allocation_ratio (for good reason). Right now, computes report a max_unit equal to their total for CPU and RAM resources. So the different behavior here is that placement will not choose hosts where the instance would single-handedly overcommit the entire host. Multiple instances still could, per the rules of the allocation_ratio.

The consensus seems to be that this is entirely sane behavior that the previous core and ram filters weren't considering. If there's a good reason to allow computes to report that they're willing to take a larger-than-100% single allocation, then we can make that change later, but the justification seems lacking at the moment.
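The two-part check described above can be sketched as follows. This is a minimal illustrative stand-in, not placement's actual DB query; the function name, parameters, and example numbers are all hypothetical:

```python
def can_allocate(requested, total, reserved, allocation_ratio, max_unit):
    """Mimic placement's two separate checks on a single allocation.

    1. max_unit caps any *single* allocation and is NOT scaled by the
       allocation ratio, so one instance can never overcommit the host
       by itself.
    2. Usable capacity, (total - reserved) * allocation_ratio, governs
       the *sum* of allocations, so multiple instances still can.
    """
    if requested > max_unit:
        # A single allocation larger than max_unit is always rejected.
        return False
    capacity = (total - reserved) * allocation_ratio
    return requested <= capacity


# A host with 4 VCPUs reporting max_unit == total: a single 8-VCPU
# instance is rejected even though 4 * 16.0 = 64 total capacity...
print(can_allocate(8, total=4, reserved=0, allocation_ratio=16.0, max_unit=4))
# ...while a 4-VCPU instance fits.
print(can_allocate(4, total=4, reserved=0, allocation_ratio=16.0, max_unit=4))
```

This is why raising the allocation ratio alone (as in the devstack experiment elsewhere in this thread) cannot make an oversized single instance schedulable: max_unit is checked first and is not multiplied.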
> Technically speaking, the functional test is a canary bird, telling you
> that you get NoValidHosts while it was working previously.

My opinion, which is shared by several other people, is that this test is broken. It's trying to overcommit the host with a single instance, and in fact, it's doing it unintentionally for some resources that just weren't checked before the move to placement. Changing the test to properly reflect the resources on the host should be the path forward, and Sylvain is working on that now.

The other concern that was raised was that since CoreFilter is not necessarily enabled on all clouds, cpu_allocation_ratio is not being honored on those systems today. Moving to placement with Ocata will cause that value to be used, which may be incorrect for certain overly-committed clouds that had previously ignored it. However, I think we need not be too concerned, as the defaults for these values are 16x overcommit for CPU and 1.5x overcommit for RAM. Those are probably at the upper limit of sane for most environments, but also large enough not to cause any sort of immediate panic while people realize (if they didn't read the release notes) that they may want to tweak them.

--Dan
Re: [openstack-dev] [nova] [placement] [operators] Optional resource asking or not?
> No. Have administrators set the allocation ratios for the resources they
> do not care about exceeding capacity to a very high number.
>
> If someone previously removed a filter, that doesn't mean that the
> resources were not consumed on a host. It merely means the admin was
> willing to accept a high amount of oversubscription. That's what the
> allocation_ratio is for.
>
> The flavor should continue to have a consumed disk/vcpu/ram amount,
> because the VM *does actually consume those resources*. If the operator
> doesn't care about oversubscribing one or more of those resources, they
> should set the allocation ratios of those inventories to a high value.
>
> No more adding configuration options for this kind of thing (or in this
> case, looking at an old configuration option and parsing it to see if a
> certain filter is listed in the list of enabled filters).
>
> We have a proper system of modeling these data-driven decisions now, so
> my opinion is we should use it and ask operators to use the placement
> REST API for what it was intended.

I agree with the above. I think it's extremely counter-intuitive to set a bunch of over-subscription values only to have them ignored because a scheduler filter isn't configured.

If we ignore some of the resources at scheduling time, the compute nodes will start reporting values that will make the resources appear to be negative to anything looking at the data. Before a somewhat-recent change of mine, the oversubscribed computes would have *failed* to report negative resources at all, which was a problem for a reconfigure event. I think the scheduler purposefully forcing computes into the red is a mistake.

Further, new users that don't know our sins of the past will wonder why the nice system they see in front of them isn't doing the right thing. Existing users can reconfigure allocation ratio values before they upgrade. We can also add something to our upgrade status tool to warn them.
--Dan
[openstack-dev] [nova] No cells meeting next week (Jan 18)
Hi all,

There will be no cells meeting next week, Jan 18 2017. I'll be in the wilderness and nobody else was brave enough to run it in my absence. Yeah, something like that.

--Dan
Re: [openstack-dev] [oslo][nova]Accessing nullable, not set versioned object field
>> NotImplementedError: Cannot load 'nullable_string' in the base class
>>
>> Is this the correct behavior?
>
> Yes, that's the expected behaviour.

Yes.

>> Then what is the expected behavior if the field is also defaulted to
>> None?
>>
>>     fields = {
>>         'nullable_string': fields.StringField(nullable=True,
>>                                               default=None),
>>     }
>>
>> The actual behavior is still the same exception above. Is it the
>> correct behavior?
>
> Yes. So, what the default=None does is describe the behaviour of the
> field when obj_set_defaults() is called. It does *not* describe what is
> returned if the field *value* is accessed before being populated.
>
> What you're looking for is the obj_attr_is_set() method:
>
>     if MyObject.obj_attr_is_set('nullable_string'):
>         print my_obj.nullable_string

I think you meant s/MyObject/my_obj/ above. However, in modern times, it's better to use:

    if 'nullable_string' in my_obj:

On a per-object basis, it may also be reasonable to define obj_load_attr() to provide the default for a field if it's not set and an attempt is made to load it.

> In addition to the obj_attr_is_set() method, use the obj_set_defaults()
> method to manually set all fields that have a default=XXX value to XXX
> if those fields have not yet been manually set:

There's another wrinkle here. The default=XXX stuff was actually introduced before we had obj_set_defaults(), and for a very different reason. That reason was confusing and obscure, and mostly supportive of the act of converting nova from dicts to objects. If you look in fields, there is an obscure handling of default, where if you _set_ a field to None that has a default and is not nullable, it will gain the default value. It's confusing and I wish we had never done it, but... it's part of the contract now and I'd have to do a lot of digging to see if we can remove it (we probably can from Nova, but...).
Your use above is similar to this, so I just wanted to point it out in case you came across it and it led you to thinking your original example would work.

--Dan
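The set-vs-unset distinction discussed above can be illustrated with a tiny stand-in class. This mimics the relevant oslo.versionedobjects semantics without depending on the library; `MiniObject` and its internals are purely illustrative, not the real implementation:

```python
class MiniObject:
    """Toy model of o.vo field semantics: declared != set."""

    FIELDS = {'nullable_string': {'nullable': True, 'default': None}}

    def __init__(self):
        self._changed = {}  # fields that have actually been set

    def __setattr__(self, name, value):
        if name in self.FIELDS:
            self._changed[name] = value
        else:
            super().__setattr__(name, value)

    def __getattr__(self, name):
        # Only reached when normal lookup fails: unset fields raise,
        # mirroring "Cannot load ... in the base class".
        changed = self.__dict__.get('_changed', {})
        if name in changed:
            return changed[name]
        if name in self.FIELDS:
            raise NotImplementedError(
                "Cannot load %r in the base class" % name)
        raise AttributeError(name)

    def __contains__(self, name):
        # The modern "'field' in obj" check.
        return name in self._changed

    def obj_set_defaults(self):
        # default=None matters here, not on attribute access.
        for name, spec in self.FIELDS.items():
            if name not in self._changed:
                self._changed[name] = spec['default']


obj = MiniObject()
print('nullable_string' in obj)   # False: declared but never set
try:
    obj.nullable_string           # access before set raises
except NotImplementedError as e:
    print(e)
obj.obj_set_defaults()
print('nullable_string' in obj)   # True: default applied explicitly
print(obj.nullable_string)        # None
```

The point of the toy: `default=None` does nothing on access; only `obj_set_defaults()` (or an explicit assignment) moves a field into the "set" state that the `in` check observes.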
[openstack-dev] [nova] No cells meetings until 2017
Hi all,

Given the upcoming holidays, there will not be nova cells meetings for the remainder of the year. That puts the next one at January 4, 2017.

--Dan
Re: [openstack-dev] [Nova] Stepping back
> It has been a true pleasure working with you all these past few years
> and I'm thankful to have had the opportunity. As I've told people many
> times when they ask me what it's like to work on an open source project
> like this: working on proprietary software exposes you to smart people
> but you're limited to the small set of people within an organization;
> working on a project like this exposed me to smart people from many
> companies and many parts of the world. I have learned a lot working with
> you all. Thanks.

Andrew, thanks so much for all your contributions to the Nova community over the years. I have so enjoyed working with you, learning from you, and making nova better alongside you. I'm really sad to see you go, but purely from a selfish point of view. I know you will go on to make other software and communities better, and I wish you the best of luck.

I'll leave you with a snippet from my #openstack-nova archives, from a Friday in late 2013, where I think you had just figured out something that I broke in those early days of objectification. I stand by my statement:

    Oct 25 11:05:13 bearhands: my official opinion is that you should hang on to lascii, just FYI

--Dan
[openstack-dev] [nova] 23-Nov Cells meeting cancelled
Hi all,

Since this week's cells meeting falls on food-coma-day-eve, we're canceling it. Anyone that wants to help move things along could review the following patches:

https://review.openstack.org/#/q/topic:bp/cells-sched-staging+project:openstack/nova+status:open
https://review.openstack.org/#/q/topic:cell-databases-fixture

Thanks!

--Dan
Re: [openstack-dev] [nova] Thanks to those reviewing specs this week
> I just wanted to say thanks to everyone reviewing specs this week. I've
> seen a lot of non-core newer people to the specs review process chipping
> in and helping to review a lot of the specs we're trying to get approved
> for Ocata. It can be hard to grind through several specs reviews in a
> day so I appreciate all of the help from everyone here, and it helps
> later when reviewing the code if you were familiar with the spec as it
> was being written and reviewed.

I second this sentiment and was just thinking the same as I was looking at a few specs this morning. The amount of non-core, non-specs-core involvement in spec review has been noticeably higher this cycle. It's definitely nice to start reviewing a spec that is visibly more complete, and then see that it has been through multiple rounds of review already to hammer out the details.

--Dan
Re: [openstack-dev] [nova] Follow up on BCN review cadence discussions
> I do imagine, however, that most folks who have been working
> on nova for long enough have a list of domain experts in their heads
> already. Would actually putting that on paper really hurt?

You mean like this?

https://wiki.openstack.org/wiki/Nova#Developer_Contacts

Those are pretty much the people I look to have sign off on a thing I'm not completely familiar with before approving something. I'm sure it could use some updating, of course. This is linked from the MAINTAINERS file in our tree, by the way.

--Dan
[openstack-dev] [nova] No cellsv2 meeting today
Hi all,

A bunch of the usual participants cannot attend the CellsV2 meeting today, and the ones that can just discussed it last week face-to-face in Barcelona. So, I'm going to declare it canceled for today for lack of critical mass.

--Dan
Re: [openstack-dev] [nova] VIF plugin issue _get_neutron_events
> Basically the issue is seen in the following three lines of nova-compute
> log. For that port, even though it received the vif plugging event two
> minutes before, it waits for it, blocks, and times out.
>
> Is there a race condition between the code that gets the events to
> wait for and the code where it registers for this callback?
>
> Any comments?

This shouldn't be possible, because the point at which we call plug_vifs() is when we should trigger neutron to fire the vif-plugged event, and that is after where we have already set up to receive those events. In other words, your log lines should never be able to be in the order you showed. If it's happening in that order (especially two minutes early), I would be highly suspicious of some modifications or something else weird going on.

It'd be much better to handle this in the context of a bug, especially providing the versions and components (i.e. neutron driver), etc. Can you open one and provide all the usual details? I'll be happy to look.

--Dan
Re: [openstack-dev] [nova] Draft Ocata design summit schedule is up
> Is there a particular reason we're only retrospecting on placement?

I think we need to have a concrete topic that applied to Newton and will apply to Ocata in order to be productive. I think there will be specific things we can change in Ocata that will have an actual impact on major work for the cycle.

> I suspect we can map many of the ideas and experiences from a
> retrospective devoted to placement to more general concerns, but I'd
> hate for people who had no involvement in it but are concerned about
> Nova to feel excluded.
>
> [1] https://etherpad.openstack.org/p/nova-newton-retrospective

As has been demonstrated in that etherpad, if we tried to retrospect every aspect of Newton, we'd need a week and wouldn't have time for the other sessions we need in order to plan for Ocata. Picking two major ongoing topics seems like the best way to frame a useful discussion to me.

--Dan
Re: [openstack-dev] [nova] Future of turbo-hipster CI
> Having said that, I think Dan Smith came across a fairly large
> production DB dataset recently which he was using for testing some
> archive changes. Maybe Dan will become our new Johannes, but grumpier of
> course. :)

That's quite an insult to Johannes :)

While working on the db archiving thing recently, I was thinking about how great it would be to get t-h to run this process on one of its large/real datasets. Then I started to wonder when the last time was that I actually saw it comment.

I feel like these days Nova, by policy, isn't doing any database migrations that can really take a long time, for a variety of reasons (i.e. expand-only schema migrations, no data migrations). That means the original thing t-h set out to prevent is not really much of a risk anymore. I surely think it's valuable, but I understand if the benefit does not outweigh the cost at this point.

--Dan
Re: [openstack-dev] [nova] ops meetup feedback
> The current DB online data upgrade model feels *very opaque* to
> ops. They didn't realize the current model Nova was using, and didn't
> feel like it was documented anywhere.
>
> ACTION: document the DB data lifecycle better for operators

This is on me, so I'll take it. I've just thrown together something that I think will help a little bit:

https://review.openstack.org/373361

Which, instead of a blank screen and a return code, gives you something like this:

+---------------------------+--------------+-----------+
| Migration                 | Total Needed | Completed |
+---------------------------+--------------+-----------+
| migrate_aggregates        | 5            | 4         |
| migrate_instance_keypairs | 6            | 6         |
+---------------------------+--------------+-----------+

I'll also see about writing up some docs about the expected workflow here. Presumably that needs to go in some fancy docs and not into the devref, right? Can anyone point me to where that should go?

--Dan
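The "Total Needed / Completed" numbers in a table like the one above come from each online migration reporting how many rows it found in the old format and how many it converted. A rough sketch of that found/done pattern (the function body, the dict-based "context", and the batching loop are illustrative stand-ins, not Nova's actual code):

```python
def migrate_aggregates(context, max_count):
    """One online data migration, returning (found, done).

    found = rows still in the old format when the call started;
    done  = rows actually migrated by this call (capped at max_count).
    Repeated small batches let operators migrate data gradually while
    the services keep running; (0, 0) means nothing is left to do.
    """
    # Illustrative stand-in for a DB query selecting unmigrated rows.
    unmigrated = context.setdefault('aggregates_old', [1, 2, 3, 4, 5])
    found = len(unmigrated)
    done = 0
    while unmigrated and done < max_count:
        unmigrated.pop()  # "migrate" one row in place
        done += 1
    return found, done


# Simulate: 5 rows need migration; migrate them 2 at a time.
ctxt = {}
print(migrate_aggregates(ctxt, 2))  # (5, 2)
print(migrate_aggregates(ctxt, 2))  # (3, 2)
print(migrate_aggregates(ctxt, 2))  # (1, 1)
print(migrate_aggregates(ctxt, 2))  # (0, 0)
```

The status display is then just a matter of running every migration function in a counting-only mode and tabulating the found/done pairs per migration name.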
Re: [openstack-dev] [nova] Next steps for resource providers work
> We know:
>
> * It pretty much does what we intend it to do: allocations are added
>   and deleted on server create and delete.
> * On manipulations like a resize, the allocations are not updated
>   immediately; there is a delay until the heal periodic job does its
>   thing.

We know one more thing: for some reason, we're overrunning the vcpu capacity in a normal tempest run. It doesn't seem to affect any other resources, though.

We are configured to not use CoreFilter, which means the scheduler isn't worried about overcommit or honoring the cpu_allocation_ratio value. I put up a devstack change to hack it up to 4.0, thinking that would give us enough room to stop getting the errors (4 VCPUs x 4.0 = 16), but it doesn't:

https://review.openstack.org/#/c/364581/

You can see in anything that runs placement that it's unhappy:

> 2016-09-02 00:34:11.768 18251 WARNING nova.scheduler.client.report
> [req-cde54d5f-bcef-4670-8864-eaf479cc9bb9
> tempest-ServersAdminTestJSON-1247365597
> tempest-ServersAdminTestJSON-1247365597] Unable to submit allocation for
> instance abc7661d-f258-4eed-8c0c-91b17216d32c (409 409 Conflict
>
> There was a conflict when trying to complete your request.
>
> Unable to allocate inventory: Unable to create allocation for 'VCPU' on
> resource provider '5295c607-fbb8-472d-8e3d-6067b8814ef8'. The requested
> amount would exceed the capacity.

We should try to get this figured out before newton ships if possible. I don't think I see it locally, but I have a large dev machine, so I'll have to try to poke it harder.

--Dan
Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects
> So that is fine. However, correct me if I'm wrong, but you're
> proposing just that these projects migrate to also use a new service
> layer with oslo.versionedobjects, because IIUC Nova/Neutron's
> approach is dependent on that area of indirection being present.
> Otherwise, if you meant something like, "use an approach that's kind
> of like what Nova does w/ versionedobjects but without actually
> having to use versionedobjects", that still sounds like, "come up
> with a new idea".

If you don't need the RPC bits, versionedobjects is nothing more than an object facade with which to insulate your upper layers from such change. Writing your facade using versionedobjects just means inheriting from a superclass that does a bunch of stuff you don't need. So I would not say that taking the same general approach without that inheritance is "coming up with a new idea". Using triggers and magic to solve this instead of an application-level facade is a substantially different approach to the problem.

> I suppose if you're thinking more at the macro level, where "current
> approach" means "do whatever you have to on the app side", then your
> position is consistent, but I think there's still a lot of
> confusion in that area when the indirection of a versioned service
> layer is not present. It gets into the SQL nastiness I was discussing
> w/ Clint and I don't see anyone doing anything like that yet.

The indirection service is really unrelated to this discussion, IMHO. If you take RPC out of the picture, all you have left is a direct-to-the-database facade to handle the fact that the schema has expanded underneath you. As Clint (et al) have said, designing the application to expect schema expansion (and avoiding unnecessary contraction) is the key here.

--Dan
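An application-level facade of the kind described here might look something like the following minimal sketch. The `UserFacade` class, the column names, and the old/new row shapes are all hypothetical, invented for illustration; the point is only that upper layers see one shape regardless of which schema version a row came from:

```python
class UserFacade:
    """Insulate upper layers from an in-progress schema expansion.

    Hypothetical change: a single 'name' column is being expanded into
    'first_name' / 'last_name'. Old rows may still carry only 'name';
    new rows have the split columns. The facade always exposes the new
    shape and writes only the new columns.
    """

    def __init__(self, row):
        # Read path: prefer the new columns, fall back to the old one.
        if row.get('first_name') is not None:
            self.first_name = row['first_name']
            self.last_name = row.get('last_name', '')
        else:
            # Legacy row: split the old column on the fly.
            first, _, last = (row.get('name') or '').partition(' ')
            self.first_name, self.last_name = first, last

    def to_row(self):
        # Write path: new columns only. The old column is simply ignored
        # (never dropped mid-upgrade), so older code can still read it.
        return {'first_name': self.first_name, 'last_name': self.last_name}


old_row = {'name': 'Grace Hopper'}                       # pre-expansion row
new_row = {'first_name': 'Ada', 'last_name': 'Lovelace'}
print(UserFacade(old_row).to_row())  # {'first_name': 'Grace', 'last_name': 'Hopper'}
print(UserFacade(new_row).to_row())  # {'first_name': 'Ada', 'last_name': 'Lovelace'}
```

Rows can also be lazily rewritten in the new shape whenever the facade saves them, which is how data gradually migrates without a blocking migration step.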
Re: [openstack-dev] migrate_flavor_data doesn't flavor migrate meta data of VMs spawned during upgrade.
> Thanks, Dan, for your response. While I do run that before I start my
> move to Liberty, what I see is that it doesn't seem to flavor-migrate
> metadata for the VMs that are spawned after the controller upgrade from
> Juno to Kilo and before all computes are upgraded from Juno to Kilo. The
> current workaround is to delete those VMs that are spawned after the
> controller upgrade and before all computes upgrade, and then initiate
> the Liberty upgrade. Then it works fine.

I can't think of any reason why that would be, or why it would be a problem. Instances created after the controllers are upgraded should not have old-style flavor info, so they need not be touched by the migration code. Maybe filing a bug describing what you see is in order?

--Dan
Re: [openstack-dev] migrate_flavor_data doesn't flavor migrate meta data of VMs spawned during upgrade.
> While migrate_flavor_data seems to flavor-migrate the metadata of the VMs
> that were spawned before the upgrade procedure, it doesn't seem to
> flavor-migrate for the VMs that were spawned during the upgrade procedure,
> more specifically after the OpenStack controller upgrade and before the
> compute upgrade. Am I missing something here, or is it by intention?

You can run the flavor migration as often as you need, and can certainly run it after your last compute is upgraded, before you start to move to Liberty.

--Dan
Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects
>> I don't think it's all that ambitious to think we can just use
>> tried and tested schema evolution techniques that work for everyone
>> else.
>
> People have been asking me for over a year how to do this, and I have
> no easy answer. I'm glad that you do. I would like to see some
> examples of these techniques.

I'm not sure how to point you at the examples we have today, because they're not on a single line (or set of lines) in a single file. Nova has moved a lot of data around at runtime using this approach in the last year or so with good success.

> If you can show me the SQL access code that deals with the above
> change, that would help a lot.

We can't show you that, because, as you said, there isn't a way to do it... in SQL. That is in fact the point, though: don't do it in SQL.

> If the answer is, "oh well just don't do a schema change like that",
> then we're basically saying we aren't really changing our schemas
> anymore except for totally new features that otherwise aren't
> accessed by the older version of the code.

We _are_ saying "don't change schema like that", but it's not a very limiting requirement. It means you can't move things in a schema migration, but that's all. Nova changes schema all the time. In the last year or so, off the top of my head, nova has:

1. Moved instance flavors from row=value metadata storage to a JSON blob in another table
2. Moved core flavors, aggregates, keypairs and other structures from the cell database to the api database
3. Added a uuid to aggregates
4. Added a parent_addr linkage in PCI device

...all online. Those are just the ones I have in my head that have required actual data migrations. We've had dozens of schema changes that enable new features that are all just new data and don't require any of this.

> That's fine. It's not what people coming to me are saying, though.

Not sure who is coming to you or what they're saying, but.. okay :)

If keystone really wants to use triggers to do this, then that's fine.
But I think the overwhelming response from this thread (which is asking people's opinions on the matter) seems to be that they're an unnecessary complication that will impede people debugging and working on that part of the code base. We have such impediments elsewhere, but I think we generally try to avoid doing one thing a hundred different ways to keep the playing field as level as possible.

--Dan
Re: [openstack-dev] [keystone][nova][neutron][all] Rolling upgrades: database triggers and oslo.versionedobjects
>> Even in the case of projects using versioned objects, it still
>> means a SQL layer has to include functionality for both versions of
>> a particular schema change which itself is awkward.

That's not true. Nova doesn't have multiple models to straddle a particular change. We just...

> It's simple, these are the holy SQL schema commandments:
>
> Don't delete columns, ignore them.
> Don't change columns, create new ones.
> When you create a column, give it a default that makes sense.
> Do not add new foreign key constraints.

...do this ^ :)

We can drop columns once they're long-since-unused, but we still don't need duplicate models for that.

--Dan