Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On 08/07/2014 08:06 PM, Michael Still wrote:
> It seems to me that the tension here is that there are groups who would really like to use features in newer libvirts that we don't CI on in the gate. Is it naive to think that a possible solution here is to do the following:
>
> - revert the libvirt version_cap flag

I don't feel strongly either way on this. It seemed useful at the time for being able to decouple upgrading libvirt and enabling features that come with it. I'd like to let Dan get back from vacation and weigh in on it, though.

> - instead implement a third party CI with the latest available libvirt release [1]

As for the general idea of doing CI, absolutely. That was discussed earlier in the thread, though nobody has picked up the ball yet. I can work on it, though. We just need to figure out a sensible approach. We've seen several times that building and maintaining 3rd party CI is a *lot* of work. Like you said in [1], doing this in infra's CI would be ideal. I think 3rd party should be reserved for when running it in the project's infrastructure is not an option for some reason (requires proprietary hardware or software, for example).

I wonder if the job could be as simple as one with an added step in the config to install the latest libvirt from source. Dan, do you think someone could add a libvirt-current.tar.gz to http://libvirt.org/sources/ ? Using the latest release seems better than master from git. I'll mess around and see if I can spin up an experimental job.

> - document clearly in the release notes the versions of dependencies that we tested against in a given release: hypervisor versions (gate and third party), etc etc

Sure, that sounds like a good thing to document in release notes.

--
Russell Bryant

_______________________________________________
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
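[Archive note: as a rough illustration of the "added step in the config to install the latest libvirt from source" idea discussed above, the job step might look something like the sketch below. The version number, download URL, and configure flags are assumptions for illustration, not the actual job definition; it prints the commands by default rather than running them.]

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a CI job step that builds libvirt from a
# release tarball. Version, URL, and configure flags are assumed.
set -e

LIBVIRT_VERSION=${LIBVIRT_VERSION:-1.2.7}
LIBVIRT_URL=${LIBVIRT_URL:-http://libvirt.org/sources}
DRY_RUN=${DRY_RUN:-1}   # default: print commands instead of running them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_libvirt() {
    local tarball="libvirt-${LIBVIRT_VERSION}.tar.gz"
    run wget "${LIBVIRT_URL}/${tarball}"
    run tar xzf "${tarball}"
    run cd "libvirt-${LIBVIRT_VERSION}"
    run ./configure --prefix=/usr --localstatedir=/var --sysconfdir=/etc
    run make -j4
    run sudo make install
    run sudo service libvirtd restart
}

build_libvirt
```

Setting DRY_RUN=0 and running as a privileged user would perform the actual build; the env-variable defaults make it easy for a job definition to override the version or point at a mirror.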
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On 08/08/2014 01:46 AM, Luke Gorrie wrote:
> On 8 August 2014 02:06, Michael Still <mi...@stillhq.com> wrote:
>> 1: I think that ultimately should live in infra as part of check, but I'd be ok with it starting as a third party if that delivers us something faster. I'd be happy enough to donate resources to get that going if we decide to go with this plan.
>
> Can we cooperate somehow? We are already working on bringing up a third party CI covering QEMU 2.1 and Libvirt 1.2.7. The intention of this CI is to test the software configuration that we are recommending for NFV deployments (including the vhost-user feature, which appeared in those releases), and to provide CI cover for the code we are offering for Neutron. Michele Paolino is working on this and the relevant nova/devstack changes.

It sounds like what you're working on is a separate thing. You're targeting coverage for a specific set of use cases, while this is a flavor of the general CI coverage we're already doing, but with the latest (not pegged) libvirt (and maybe qemu). By all means, more testing is useful, though.

--
Russell Bryant
Re: [openstack-dev] [all] The future of the integrated release
On 08/08/2014 05:06 AM, Thierry Carrez wrote:
> Michael Still wrote:
>> [...] I think an implied side effect of the runway system is that nova-drivers would -2 blueprint reviews which were not occupying a slot. (If we start doing more -2's I think we will need to explore how to not block on someone with -2's taking a vacation. Some sort of role account perhaps.)
>
> Ideally CodeReview-2s should be kept for blocking code reviews on technical grounds, not procedural grounds. For example, it always feels weird to CodeReview-2 all feature patch reviews on Feature Freeze day -- that CodeReview-2 really doesn't have the same meaning as a traditional CodeReview-2. For those procedural blocks (feature freeze, waiting for runway room...), it might be interesting to introduce a specific score (Workflow-2 perhaps) that drivers could set. That would not prevent code review from happening; it would just clearly express that this is not ready to land for release cycle / organizational reasons. Thoughts?

That sounds much nicer than using code review -2.

--
Russell Bryant
Re: [openstack-dev] [Glance] Image upload/download bandwidth cap
On 08/08/2014 04:17 PM, Jay Pipes wrote:
> On 08/08/2014 08:49 AM, Tomoki Sekiyama wrote:
>> Hi all,
>>
>> I'm considering how I can apply an image download/upload bandwidth limit to glance for network QoS. There was a review for the bandwidth limit, however it was abandoned.
>>
>> * Download rate limiting https://review.openstack.org/#/c/21380/
>>
>> Was there any discussion at the past summit about not merging this? Or is there an alternative way to cap the bandwidth consumed by Glance? I appreciate any information about this.
>
> Hi Tomoki :)
>
> Would it be possible to integrate traffic control into the network configuration between the Glance endpoints and the nova-compute nodes over the control plane network?
>
> http://www.lartc.org/lartc.html#LARTC.RATELIMIT.SINGLE

Yep, that was my first thought as well. It seems like something that would ideally be handled outside of OpenStack itself.

--
Russell Bryant
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On 08/08/2014 09:06 AM, Russell Bryant wrote:
>> - instead implement a third party CI with the latest available libvirt release [1]
>
> As for the general idea of doing CI, absolutely. That was discussed earlier in the thread, though nobody has picked up the ball yet. I can work on it, though. We just need to figure out a sensible approach. We've seen several times that building and maintaining 3rd party CI is a *lot* of work. Like you said in [1], doing this in infra's CI would be ideal. I think 3rd party should be reserved for when running it in the project's infrastructure is not an option for some reason (requires proprietary hw or sw, for example).
>
> I wonder if the job could be as simple as one with an added step in the config to install the latest libvirt from source. Dan, do you think someone could add a libvirt-current.tar.gz to http://libvirt.org/sources/ ? Using the latest release seems better than master from git. I'll mess around and see if I can spin up an experimental job.

Here's a first stab at it:

https://review.openstack.org/113020

--
Russell Bryant
Re: [openstack-dev] [Nova][Neutron][Technical Committee] nova-network - Neutron. Throwing a wrench in the Neutron gap analysis
On 08/06/2014 01:41 PM, Jay Pipes wrote:
> On 08/06/2014 01:40 AM, Tom Fifield wrote:
>> On 06/08/14 13:30, Robert Collins wrote:
>>> On 6 August 2014 17:27, Tom Fifield <t...@openstack.org> wrote:
>>>> On 06/08/14 13:24, Robert Collins wrote:
>>>>> What happened to your DB migrations then? :)
>>>>
>>>> Sorry if I misunderstood, I thought we were talking about running VM downtime here?
>>>
>>> While DB migrations are running, things like the nova metadata service can/will misbehave -- and user code within instances will be affected. That's arguably VM downtime.
>>>
>>> OTOH you could define it more narrowly as 'VMs are not powered off' or 'VMs are not stalled for more than 2s without a time slice' etc etc -- my sense is that most users are going to be particularly concerned about things for which they have to *do something* -- e.g. VMs being powered off or rebooted -- but having no network for a short period while vifs are replugged and the overlay network re-establishes itself would be much less concerning.
>>
>> I think you've got it there, Rob -- nicely put :)
>>
>> In many cases the users I've spoken to who are looking for a live path out of nova-network on to neutron are actually completely OK with some API service downtime (the metadata service is an API service by their definition). A little 'glitch' in the network is also OK for many of them.
>>
>> Contrast that with the original proposal in this thread (snapshot VMs in the old nova-network deployment, store in Swift or something, then launch VMs from snapshots in the new Neutron deployment) -- it is completely unacceptable and is not considered a migration path for these users.
>
> Who are these users? Can we speak with them? Would they be interested in participating in the documentation and migration feature process?

Yes, I'd really like to see some participation in the development of a solution if it's an important requirement. Until then, it feels like an open question of "what do you want?" Of course the answer is a pony.

--
Russell Bryant
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
> There is work to add support for this in devstack already, which I prefer since it makes it easy for developers to get an environment which matches the build system:
>
> https://review.openstack.org/#/c/108714/

Ah, cool. Devstack is indeed a better place to put the build scripting.

So, I think we should:

1) Get the above patch working, and then merged.

2) Get an experimental job going to use the above while we work on #3.

3) Before the job can move into the check queue and potentially become voting, it needs to not rely on downloading the source on every run. IIRC, we can have nodepool build an image to use for these jobs that includes the bits already installed.

I'll switch my efforts over to helping get the above completed.

--
Russell Bryant
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On 08/09/2014 12:33 PM, Jeremy Stanley wrote:
> On 2014-08-08 09:06:29 -0400 (-0400), Russell Bryant wrote:
> [...]
>> We've seen several times that building and maintaining 3rd party CI is a *lot* of work.
>
> Building and maintaining *any* CI is a *lot* of work, not the least of which is the official OpenStack project CI (I believe Monty mentioned in #openstack-infra last night that our CI is about twice the size of Travis-CI now, though I'm not sure what metric he's comparing there).

Dang, I'd love to see those numbers. :-)

>> Like you said in [1], doing this in infra's CI would be ideal. I think 3rd party should be reserved for when running it in the project's infrastructure is not an option for some reason (requires proprietary hw or sw, for example).
>
> Add to the "not an option for some reason" list software which is not easily obtainable through typical installation channels (PyPI, Linux distro-managed package repositories for their LTS/server releases, et cetera) or which requires gyrations which destabilize or significantly complicate maintenance of the overall system, as well as reproducibility for developers. It may be possible to work around some of these concerns via access from multiple locations coupled with heavy caching, but the additional complexity of adding that in for a one-off source is hard to justify.

Understood. Some questions ... is building an image that has libvirt and qemu pre-installed from source good enough? It avoids the dependency on job runs, but moves it to image build time, so it still exists.

If the above still doesn't seem like a workable setup, then I think we should just go straight to an image with Fedora + the virt-preview repo, which kind of sounds easier, anyway.

--
Russell Bryant
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On 08/11/2014 07:58 AM, Russell Bryant wrote:
> On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
>> There is work to add support for this in devstack already, which I prefer since it makes it easy for developers to get an environment which matches the build system:
>>
>> https://review.openstack.org/#/c/108714/
>
> Ah, cool. Devstack is indeed a better place to put the build scripting.
>
> So, I think we should:
>
> 1) Get the above patch working, and then merged.
>
> 2) Get an experimental job going to use the above while we work on #3.
>
> 3) Before the job can move into the check queue and potentially become voting, it needs to not rely on downloading the source on every run. IIRC, we can have nodepool build an image to use for these jobs that includes the bits already installed.
>
> I'll switch my efforts over to helping get the above completed.

I still think the devstack patch is good, but after some more thought, I think a better long term CI job setup would just be a Fedora image with the virt-preview repo. I think I'll try that ...

--
Russell Bryant
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On 08/11/2014 08:01 AM, Daniel P. Berrange wrote:
> On Mon, Aug 11, 2014 at 07:58:41AM -0400, Russell Bryant wrote:
>> On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
>>> There is work to add support for this in devstack already, which I prefer since it makes it easy for developers to get an environment which matches the build system:
>>>
>>> https://review.openstack.org/#/c/108714/
>>
>> Ah, cool. Devstack is indeed a better place to put the build scripting.
>>
>> So, I think we should:
>>
>> 1) Get the above patch working, and then merged.
>>
>> 2) Get an experimental job going to use the above while we work on #3.
>>
>> 3) Before the job can move into the check queue and potentially become voting, it needs to not rely on downloading the source on every run.
>
> Don't we have the ability to mirror downloads locally to the build system for python? The proposed patch allows an alternate download URL to be set via an env variable, so it could point to a local mirror instead of libvirt.org / qemu.org.

There's a pypi mirror at least. I'm not sure about mirroring other things.

--
Russell Bryant
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On 08/11/2014 09:17 AM, Jeremy Stanley wrote:
> On 2014-08-11 08:04:34 -0400 (-0400), Russell Bryant wrote:
>> Dang, I'd love to see those numbers. :-)
>
> Me too. Now that I'm not travelling I'll see if I can find out what he meant by that.
>
>> Understood. Some questions ... is building an image that has libvirt and qemu pre-installed from source good enough? It avoids the dependency on job runs, but moves it to image build time, so it still exists.
>
> Moving complex stability risks to image creation time still causes us to potentially fail to update our worker images as often, which means tests randomly run on increasingly stale systems in some providers/regions until the issue is noticed, identified and addressed. That said, we do already compile some things during job runs today (in particular, library bindings which get install-time linked by some Python modules). In reality, depending on more things gathered from different places on the Internet (be it Git repository sites like GitHub/Bitbucket, or private package collections) decreases our overall stability far more than compiling things does.
>
>> If the above still doesn't seem like a workable setup, then I think we should just go straight to an image with Fedora + the virt-preview repo, which kind of sounds easier, anyway.
>
> If it's published from EPEL or whatever Fedora's equivalent is, then that's probably fine. If it's served from a separate site, then that increases the chances that we run into network issues either at image build time or job run time. Also, we would want to make sure whatever solution we settle on is well integrated within DevStack itself, so that individual developers can recreate these conditions themselves without a lot of additional work.

EPEL is a repo produced by the Fedora project for RHEL and its derivatives. The virt-preview repo is hosted on fedorapeople.org, which is where custom repos live. I'd say it's more analogous to Ubuntu's PPAs.

https://fedorapeople.org/groups/virt/virt-preview/

> One other thing to keep in mind... Fedora's lifecycle is too short for us to support outside of jobs for our master branches, so this would not be a solution beyond release time (we couldn't continue to run these jobs for Juno once released if the solution hinges on Fedora). Getting the versions we want developers and deployers to use into the Ubuntu 14.04 Cloud Archive and CentOS (RHEL) 7 EPEL, on the other hand, would be a much more viable long-term solution.

Yep, makes sense. For testing the bleeding edge, I've also got my eye on how we could do this with CentOS. There is a virt SIG in CentOS that I'm hoping will produce something similar to Fedora's virt-preview repo, but it's not there yet. I'm going to go off and discuss this with the SIG.

http://wiki.centos.org/SpecialInterestGroup/Virtualization

--
Russell Bryant
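[Archive note: for anyone wanting to try the virt-preview repo locally, enabling it on a Fedora box is just a matter of dropping a repo file in place and updating. The sketch below illustrates that; the repo file name and the exact package set are assumptions, and the script prints commands by default rather than running them.]

```shell
#!/usr/bin/env bash
# Sketch: enable the Fedora virt-preview repo and pull in the newer
# virt stack. The repo file name and package list are assumptions
# for illustration. Run with DRY_RUN=0 to actually apply.
set -e

REPO_URL=${REPO_URL:-https://fedorapeople.org/groups/virt/virt-preview/fedora-virt-preview.repo}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

enable_virt_preview() {
    # Drop the repo definition where yum will find it.
    run sudo curl -o /etc/yum.repos.d/fedora-virt-preview.repo "$REPO_URL"
    # Upgrade the virt stack to the virt-preview builds.
    run sudo yum -y update libvirt qemu-kvm
}

enable_virt_preview
```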
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On 08/12/2014 05:54 AM, Daniel P. Berrange wrote:
>> I am less concerned about the contents of this patch, and more concerned with how such a big de facto change in nova policy (we accept untested code sometimes) happened without any discussion or consensus. In your comment on the revert [2], you say the 'whether not-CI-tested features should be allowed to be merged' debate is 'clearly unresolved.' How did you get to that conclusion? This was never brought up at the mid-cycles as an unresolved topic to be discussed. In our specs template we say "Is this untestable in gate given current limitations (specific hardware / software configurations available)? If so, are there mitigation plans (3rd party testing, gate enhancements, etc)" [3]. We have been blocking untested features for some time now.
>
> Those last lines are nonsense. We have never unconditionally blocked untested features, nor do I recommend that we do so. The specs template testing section allows the contributor to *justify* why they think the feature is worth accepting despite the lack of testing. The reviewers make a judgement call on whether the justification is valid or not. This is a pragmatic approach to the problem.

That has been my interpretation and approach as well: we strongly prefer functional testing for everything, but take a pragmatic approach and evaluate proposals on a case by case basis. It's clear we need to be a bit more explicit here.

--
Russell Bryant
Re: [openstack-dev] [nova] Retrospective veto revert policy
On 08/12/2014 10:56 AM, Mark McLoughlin wrote:
> Hey
>
> (Terrible name for a policy, I know)
>
> From the version_cap saga here:
>
> https://review.openstack.org/110754
>
> I think we need a better understanding of how to approach situations like this. Here's my attempt at documenting what I think we're expecting the procedure to be:
>
> https://etherpad.openstack.org/p/nova-retrospective-veto-revert-policy
>
> If it sounds reasonably sane, I can propose its addition to the Development policies doc.

Looks reasonable to me.

--
Russell Bryant
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On 08/12/2014 03:40 PM, Kashyap Chamarthy wrote:
> On Mon, Aug 11, 2014 at 08:05:26AM -0400, Russell Bryant wrote:
>> On 08/11/2014 07:58 AM, Russell Bryant wrote:
>>> On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
>>>> There is work to add support for this in devstack already, which I prefer since it makes it easy for developers to get an environment which matches the build system:
>>>>
>>>> https://review.openstack.org/#/c/108714/
>>>
>>> Ah, cool. Devstack is indeed a better place to put the build scripting.
>>>
>>> So, I think we should:
>>>
>>> 1) Get the above patch working, and then merged.
>>>
>>> 2) Get an experimental job going to use the above while we work on #3.
>>>
>>> 3) Before the job can move into the check queue and potentially become voting, it needs to not rely on downloading the source on every run. IIRC, we can have nodepool build an image to use for these jobs that includes the bits already installed.
>>>
>>> I'll switch my efforts over to helping get the above completed.
>>
>> I still think the devstack patch is good, but after some more thought, I think a better long term CI job setup would just be a Fedora image with the virt-preview repo.
>
> So, effectively, you're trying to add a minimal Fedora image w/ the virt-preview repo (as part of some post-install kickstart script). If so, where would the image be stored? I'm asking because, previously, Sean Dague mentioned mirroring issues (which later turned out to be intermittent network issues with OpenStack infra cloud providers) with Fedora images, and floated the idea of storing an updated image on tarballs.openstack.org, like Trove[1] does. But OpenStack infra folks (fungi) raised some valid points on why not to do that. IIUC, if you intend to run tests with this CI job with this new image, there has to be a mechanism in place to ensure the cached copy (on tarballs.o.o) is updated. If I misunderstood what you said, please correct me.

Patches for this are here:

https://review.openstack.org/#/c/113349/
https://review.openstack.org/#/c/113350/

The first one is the important part: how the image is created. nodepool runs some prep scripts against the cloud's distro image and then snapshots it. That's the image stored to be used later for testing. In this case, it enables the virt-preview repo and then calls out to the regular devstack prep scripts to cache all packages needed for the test locally on the image.

If there are issues with the reliability of fedorapeople.org, it will indeed cause problems, but at least it's local to image creation and not every test run.

--
Russell Bryant
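[Archive note: the prep-script flow described above -- enable virt-preview on the base image, then pre-cache packages so job runs don't hit the network -- could be sketched roughly as below. This is a hypothetical illustration of the shape of such a nodepool prep script, not the content of the linked patches; the repo file name and the use of yum's downloadonly option are assumptions, and commands are printed rather than run by default.]

```shell
#!/usr/bin/env bash
# Hypothetical shape of a nodepool image prep script: enable the
# virt-preview repo, then pre-fetch the packages the job will need so
# the snapshot image carries them locally. Details are assumed.
set -e

DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_image() {
    # Enable the virt-preview repo on the base Fedora image.
    run sudo curl -o /etc/yum.repos.d/fedora-virt-preview.repo \
        https://fedorapeople.org/groups/virt/virt-preview/fedora-virt-preview.repo
    # Download (but don't install) the newer virt packages so they are
    # cached on the snapshot; --downloadonly may require the yum
    # downloadonly plugin on older distros.
    run sudo yum -y --downloadonly update libvirt qemu-kvm
}

prepare_image
```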
Re: [openstack-dev] [nova] Retrospective veto revert policy
On Aug 12, 2014, at 5:10 PM, Michael Still <mi...@stillhq.com> wrote:
> This looks reasonable to me, with a slight concern that I don't know what step five looks like... What if we can never reach a consensus on an issue?

In an extreme case, the PTL has the authority to make the call. In general, I would like to think we can all just put on our big boy pants and talk through contentious issues to find a resolution that everyone can live with.

--
Russell Bryant
Re: [openstack-dev] [nova][core] Expectations of core reviewers
On 08/12/2014 06:57 PM, Michael Still wrote:
> Hi. One of the action items from the nova midcycle was that I was asked to make nova's expectations of core reviewers more clear. This email is an attempt at that.

Note that we also have:

https://wiki.openstack.org/wiki/Nova/CoreTeam

so once new criteria reach consensus, they should be added there.

--
Russell Bryant
Re: [openstack-dev] [all] The future of the integrated release
On 08/12/2014 10:05 PM, Michael Still wrote:
> there are hundreds of proposed features for Juno, nearly 100 of which have been accepted. However, we're kidding ourselves if we think we can land 100 blueprints in a release cycle.

FWIW, I think this is actually a huge improvement over previous cycles. I think we had almost double that number of blueprints on the list in the past. I also don't think 100 is *completely* out of the question. We're in the 50-100 range already:

Icehouse - 67
Havana - 91
Grizzly - 66

Anyway, just wanted to share some numbers ... some improvement to prioritization within that 100 is certainly still a good thing.

--
Russell Bryant
Re: [openstack-dev] [nova][core] Expectations of core reviewers
> I'm also not a fan of mid-cycle meetups because I feel it further stratifies our contributors into two increasingly distinct camps - core vs non-core. I can see that a big benefit of a mid-cycle meetup is to be a focal point for collaboration, to forcibly break contributors out of their day-to-day work pattern to concentrate on discussing specific issues. It also obviously solves the distinct timezone problem we have with our dispersed contributor base.
>
> I think that we should be examining what we can achieve with some kind of virtual online mid-cycle meetup instead, using technology like Google Hangouts or some similar live collaboration technology, not merely an IRC discussion. Pick a 2-3 day period, schedule formal agendas / talking slots as you would with a physical summit, and so on. I feel this would be more inclusive of our community as a whole and would avoid excessive travel costs, allowing more of our community to attend the bigger design summits. It would even open the possibility of having multiple meetups during a cycle (e.g. we could arrange mini virtual events around each milestone if we wanted).

I think this is a nice concrete suggestion for an alternative, and it's worth exploring in more detail. I would much prefer something like this as a replacement for the mid-cycle stuff, saving the in-person meetings for the existing twice-per-year summits.

--
Russell Bryant
Re: [openstack-dev] [all] The future of the integrated release
On 08/13/2014 08:52 AM, Mark McLoughlin wrote:
> On Tue, 2014-08-12 at 14:26 -0400, Eoghan Glynn wrote:
>>> It seems like this is exactly what the slots give us, though. The core review team picks a number of slots indicating how much work they think they can actually do (less than the available number of blueprints), and then blueprints queue up to get a slot based on priorities and turnaround time and other criteria that try to make slot allocation fair. By having the slots, not only is the review priority communicated to the review team, it is also communicated to anyone watching the project.
>>
>> One thing I'm not seeing shine through in this discussion of slots is whether any notion of individual cores, or small subsets of the core team with aligned interests, can champion blueprints that they have a particular interest in.
>>
>> For example, it might address some pain-point they've encountered, or impact on some functional area that they themselves have worked on in the past, or line up with their thinking on some architectural point.
>>
>> But for whatever motivation, such small groups of cores currently have the freedom to self-organize in a fairly emergent way and champion individual BPs that are important to them, simply by *independently* giving those BPs review attention. Whereas under the slots initiative, presumably this power would be subsumed by the group will, as expressed by the prioritization applied to the holding pattern feeding the runways?
>>
>> I'm not saying this is good or bad, just pointing out a change that we should have our eyes open to.
>
> Yeah, I'm really nervous about that aspect.
>
> Say a contributor proposes a new feature, and a couple of core reviewers think it's important/exciting enough for them to champion it, but somehow the 'group will' is that it's not a high enough priority for this release, even if everyone agrees that it is actually cool and useful.
>
> What does imposing that 'group will' on the two core reviewers and contributor achieve? That the contributor and reviewers will happily turn their attention to some of the higher priority work? Or we lose a contributor and two reviewers because they feel disenfranchised? Probably somewhere in the middle.
>
> On the other hand, what happens if work proceeds even when it's not deemed a high priority? I don't think we can say that the contributor and two core reviewers were distracted from higher priority work, because blocking this work is probably unlikely to shift their focus in a productive way. Perhaps other reviewers are distracted because they feel the work needs more oversight than just the two core reviewers? It places more of a burden on the gate?
>
> I dunno ... the consequences of imposing group will worry me more than the consequences of allowing small groups to self-organize like this.

Yes, this is by far my #1 concern with the plan. I think perhaps some middle ground makes sense.

1) Start doing a better job of generating a priority list, and identifying the highest priority items based on group will.

2) Expect that reviewers use the priority list to influence their general review time.

3) Don't actually block other things, should small groups self-organize and decide something is important enough to them, even if not to the group as a whole.

That sort of approach still sounds like an improvement over what we have today, which is a lack of good priority communication to direct general review time.

--
Russell Bryant
Re: [openstack-dev] [nova][core] Expectations of core reviewers
On 08/13/2014 01:09 PM, Dan Smith wrote:
> Expecting cores to be at these sorts of things seems pretty reasonable to me, given the usefulness (and gravity) of the discussions we've been having so far. Companies with more cores will have to send more or make some hard decisions, but I don't want to cut back on the meetings until their value becomes unjustified.

I disagree. IMO, *expecting* people to travel, potentially across the globe, 4 times a year is an unreasonable expectation, and quite uncharacteristic of open source projects. If we can't figure out a way to have the most important conversations in a way that is inclusive of everyone, we're failing with our processes.

By all means, if a subset wants to meet up and make progress on some things, I think that's fine. I don't think anyone thinks it's not useful. However, discussions need to be summarized and taken back to the list for discussion before decisions are made. That's not the way things are trending here, and I think that's a problem.

--
Russell Bryant
Re: [openstack-dev] [nova][core] Expectations of core reviewers
On 08/13/2014 02:33 PM, Dan Smith wrote:
> On 8/13/14 11:20 AM, Mike Bayer wrote:
>> On Aug 13, 2014, at 1:44 PM, Russell Bryant <rbry...@redhat.com> wrote:
>>> I disagree. IMO, *expecting* people to travel, potentially across the globe, 4 times a year is an unreasonable expectation, and quite uncharacteristic of open source projects. If we can't figure out a way to have the most important conversations in a way that is inclusive of everyone, we're failing with our processes. By all means, if a subset wants to meet up and make progress on some things, I think that's fine. I don't think anyone thinks it's not useful.
>
> Well, it doesn't seem at all excessive to me, given the rate and volume at which we do things around here. That said, if a significant number of cores think it's not doable, then I guess that's a data point. From what you said above, it sounds like you're okay with the meetings but not the requirement for cores. I said "expect" above -- is that a reasonable thing? Expect them to be present, unless they have a reason not to be there? Reasons could be personal or preference, but hopefully not "I never come to midcycles because $reason."
>
>> It's difficult to compare OpenStack to other open source projects, in that it is on such a more massive and high velocity scale than almost any others (perhaps the Linux kernel is similar).
>
> Yeah, I have a hard time justifying anything by comparing us to other projects. I've been involved with plenty and don't think any of them are useful data points for what we should or should not do here in terms of anything related to velocity :)

I think we also need to be careful about not continuing to increase expectations because of velocity. Burnout is a real problem.

--
Russell Bryant
Re: [openstack-dev] [all] The future of the integrated release
On 08/13/2014 04:01 PM, Doug Hellmann wrote: On Aug 13, 2014, at 9:11 AM, Russell Bryant rbry...@redhat.com wrote: On 08/13/2014 08:52 AM, Mark McLoughlin wrote: On Tue, 2014-08-12 at 14:26 -0400, Eoghan Glynn wrote: It seems like this is exactly what the slots give us, though. The core review team picks a number of slots indicating how much work they think they can actually do (less than the available number of blueprints), and then blueprints queue up to get a slot based on priorities and turnaround time and other criteria that try to make slot allocation fair. By having the slots, not only is the review priority communicated to the review team, it is also communicated to anyone watching the project. One thing I'm not seeing shine through in this discussion of slots is whether any notion of individual cores, or small subsets of the core team with aligned interests, can champion blueprints that they have a particular interest in. For example it might address some pain-point they've encountered, or impact on some functional area that they themselves have worked on in the past, or line up with their thinking on some architectural point. But for whatever motivation, such small groups of cores currently have the freedom to self-organize in a fairly emergent way and champion individual BPs that are important to them, simply by *independently* giving those BPs review attention. Whereas under the slots initiative, presumably this power would be subsumed by the group will, as expressed by the prioritization applied to the holding pattern feeding the runways? I'm not saying this is good or bad, just pointing out a change that we should have our eyes open to. Yeah, I'm really nervous about that aspect. 
Say a contributor proposes a new feature, a couple of core reviewers think it's important/exciting enough for them to champion it but somehow the 'group will' is that it's not a high enough priority for this release, even if everyone agrees that it is actually cool and useful. What does imposing that 'group will' on the two core reviewers and contributor achieve? That the contributor and reviewers will happily turn their attention to some of the higher priority work? Or we lose a contributor and two reviewers because they feel disenfranchised? Probably somewhere in the middle. On the other hand, what happens if work proceeds ahead even if not deemed a high priority? I don't think we can say that the contributor and two core reviewers were distracted from higher priority work, because blocking this work is probably unlikely to shift their focus in a productive way. Perhaps other reviewers are distracted because they feel the work needs more oversight than just the two core reviewers? It places more of a burden on the gate? I dunno ... the consequences of imposing group will worry me more than the consequences of allowing small groups to self-organize like this. Yes, this is by far my #1 concern with the plan. I think perhaps some middle ground makes sense. 1) Start doing a better job of generating a priority list, and identifying the highest priority items based on group will. 2) Expect that reviewers use the priority list to influence their general review time. 3) Don't actually block other things, should small groups self-organize and decide it's important enough to them, even if not to the group as a whole. That sort of approach still sounds like an improvement over what we have today, which is a lack of good priority communication to direct general review time. -- Russell Bryant This is more formal than what we’ve been doing in Oslo, but it’s closer than a strict slot-based approach. 
We talk about review priorities in the meeting each week, and ask anyone in the meeting to suggest changes that need attention. It’s up to the individual core reviewers to act on those suggestions, though. And I think that's a very healthy approach. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][core] Expectations of core reviewers
On 08/13/2014 02:44 PM, Russell Bryant wrote: On 08/13/2014 02:33 PM, Dan Smith wrote: On 8/13/14 11:20 AM, Mike Bayer wrote: On Aug 13, 2014, at 1:44 PM, Russell Bryant rbry...@redhat.com wrote: I disagree. IMO, *expecting* people to travel, potentially across the globe, 4 times a year is an unreasonable expectation, and quite uncharacteristic of open source projects. If we can't figure out a way to have the most important conversations in a way that is inclusive of everyone, we're failing with our processes. By all means, if a subset wants to meet up and make progress on some things, I think that's fine. I don't think anyone think it's not useful. Well, it doesn't seem at all excessive to me, given the rate and volume at which we do things around here. That said, if a significant number of cores think it's not doable, then I guess that's a data point. From what you said above, it sounds like you're okay with the meetings but not the requirement for cores. I said expect above -- is that a reasonable thing? Expect them to be present, unless they have a reason not to be there? Reasons could be personal or preference, but hopefully not I never come to midcycles because $reason. It’s difficult to compare OpenStack to other open source projects, in that it is on such a more massive and high velocity scale than almost any others (perhaps the Linux kernel is similar). Yeah, I have a hard time justifying anything by comparing us to other projects. I've been involved with plenty and don't think any of them are useful data points for what we should or should not do here in terms of anything related to velocity :) I think we also need to be careful with not continuing to increase expectations because of velocity. Burnout is a real problem. Let me try to say it another way. You seemed to say that it wasn't much to ask given the rate at which things happen in OpenStack. 
I would argue that given the rate, we should not try to ask more of individuals (like this proposal) and risk burnout. Instead, we should be doing our best to be more open and inclusive to give the project the best chance to grow, as that's the best way to get more done. I think an increased travel expectation is a raised bar that will hinder team growth, not help it. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit
On 08/13/2014 06:23 PM, Mark McLoughlin wrote: On Wed, 2014-08-13 at 12:05 -0700, James E. Blair wrote: cor...@inaugust.com (James E. Blair) writes: Sean Dague s...@dague.net writes: This has all gone far enough that someone actually wrote a Grease Monkey script to purge all the 3rd Party CI content out of Jenkins UI. People are writing mail filters to dump all the notifications. Dan Berange filters all them out of his gerrit query tools. I should also mention that there is a pending change to do something similar via site-local Javascript in our Gerrit: https://review.openstack.org/#/c/95743/ I don't think it's an ideal long-term solution, but if it works, we may have some immediate relief without all having to install greasemonkey scripts. You may have noticed that this has merged, along with a further change that shows the latest results in a table format. (You may need to force-reload in your browser to see the change.) Beautiful! Thank you so much to everyone involved. +1! Love this. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][core] Expectations of core reviewers
On 08/13/2014 07:27 PM, Michael Still wrote: On Thu, Aug 14, 2014 at 3:44 AM, Russell Bryant rbry...@redhat.com wrote: On 08/13/2014 01:09 PM, Dan Smith wrote: Expecting cores to be at these sorts of things seems pretty reasonable to me, given the usefulness (and gravity) of the discussions we've been having so far. Companies with more cores will have to send more or make some hard decisions, but I don't want to cut back on the meetings until their value becomes unjustified. I disagree. IMO, *expecting* people to travel, potentially across the globe, 4 times a year is an unreasonable expectation, and quite uncharacteristic of open source projects. If we can't figure out a way to have the most important conversations in a way that is inclusive of everyone, we're failing with our processes. I am a bit confused by this stance to be honest. You yourself said when you were Icehouse PTL that you wanted cores to come to the summit. What changed? Yes, I would love for core team members to come to the design summit, which happens twice a year. I still don't *expect* it as a requirement for them to remain members of the team, and I certainly don't expect it 4 times a year. It's a matter of frequency and requirement. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][core] Expectations of core reviewers
On 08/13/2014 11:31 PM, Michael Still wrote: On Thu, Aug 14, 2014 at 1:24 PM, Jay Pipes jaypi...@gmail.com wrote: Just wanted to quickly weigh in with my thoughts on this important topic. I very much valued the face-to-face interaction that came from the mid-cycle meetup in Beaverton (it was the only one I've ever been to). That said, I do not believe it should be a requirement that cores make it to the face-to-face meetings in-person. A number of folks have brought up very valid concerns about personal/family time, travel costs and burnout. I'm not proposing they be a requirement. I am proposing that they be strongly encouraged. I'm not sure that's much different in reality. I believe that the issue raised about furthering the divide between core and non-core folks is actually the biggest reason I don't support a mandate to have cores at the face-to-face meetings, and I think we should make our best efforts to support quality virtual meetings that can be done on a more frequent basis than the face-to-face meetings that would be optional. I am all for online meetings, but we don't have a practical way to do them at the moment apart from IRC. Until someone has a concrete proposal that's been shown to work, I feel its a straw man argument. Yes, IRC is one option which we already use on a regular basis. We can also switch to voice communication for higher bandwidth when needed. We even have a conferencing server set up in OpenStack's infrastructure: https://wiki.openstack.org/wiki/Infrastructure/Conferencing In theory it even supports basic video conferencing, though I haven't tested it on this server yet. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][core] Expectations of core reviewers
On 08/13/2014 07:27 PM, Michael Still wrote: The etherpad for the meetup has extensive notes. Any summary I write will basically be those notes in prose. What are you looking for in a summary that isn't in the etherpad? There also wasn't a summary of the Icehouse midcycle produced that I can find. Whilst I am happy to do one for Juno, it's a requirement that I hadn't planned for, and is therefore taking me some time to retrofit. I think we should chalk the request for summaries up to experience and talk through how to better provide such things at future meetups. The summary from the Icehouse meetup is here: http://lists.openstack.org/pipermail/openstack-dev/2014-February/027370.html -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][core] Expectations of core reviewers
On 08/14/2014 10:04 AM, CARVER, PAUL wrote: Daniel P. Berrange [mailto:berra...@redhat.com] wrote: Depending on the usage needs, I think Google hangouts is a quite useful technology. For many-to-many session its limit of 10 participants can be an issue, but for a few-to-many broadcast it could be practical. What I find particularly appealing is the way it can live stream the session over youtube which allows for unlimited number of viewers, as well as being available offline for later catchup. I can't actually offer AT&T resources without getting some level of management approval first, but just for the sake of discussion here's some info about the telepresence system we use. -=-=-=-=-=-=-=-=-=- ATS B2B Telepresence conferences can be conducted with an external company's Telepresence room(s), which subscribe to the AT&T Telepresence Solution, or a limited number of other Telepresence service providers' networks. Currently, the number of Telepresence rooms that can participate in a B2B conference is limited to a combined total of 20 rooms (19 of which can be AT&T rooms, depending on the number of remote endpoints included). -=-=-=-=-=-=-=-=-=- We currently have B2B interconnect with over 100 companies and AT&T has telepresence rooms in many of our locations around the US and around the world. If other large OpenStack companies also have telepresence rooms that we could interconnect with I think it might be possible to get management agreement to hold a couple OpenStack meetups per year. Most of our rooms are best suited for 6 people, but I know of at least one 18 person telepresence room near me. An ideal solution would allow attendees to join as individuals from anywhere. A lot of contributors work from home. Is that sort of thing compatible with your system? -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][core] Expectations of core reviewers
The design summit sessions for Juno were a big disappointment after the meetup sessions, basically because of the time constraints. The meetups are nice since there is time to really hash over a topic and you're not rushed, whereas with the design summit sessions it felt like we'd be half way through the allotted time before we really started talking about anything of use and then shortly after that you'd be hearing 5 minutes left, and I felt like very few of the design sessions were actually useful, or things we've worked on in Juno, or at least high-priority/impact things (v3 API being an exception there, that was a useful session). I have seen what you describe, and have also been at sessions where there is active discussion for 15 minutes, all issues are resolved, and there is still a bunch of time left. The issue you cited could be addressed by accepting fewer topics and giving double or triple slots to topics that are important and expected to need a lot of discussion. The design summits are very useful for cores and newcomers alike and I would hate to see that fragmented by people deciding to not go to summits. Yes, giving more than one slot is an option and not one we've used in the nova track before that I can recall. It's usually because we're trying to pack so many things into the schedule. It's probably worth it for some topics. However, I still think things have to be strictly scheduled for the design summit, as compared to very loose for meetups. At the design summit, there are several tracks going on at once that people need to jump between, as well as keep up with talks they are giving, or even customer meetings. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [oslo][db] Nominating Mike Bayer for the oslo.db core reviewers team
On 08/15/2014 09:13 AM, Jay Pipes wrote: On 08/15/2014 04:21 AM, Roman Podoliaka wrote: Hi Oslo team, I propose that we add Mike Bayer (zzzeek) to the oslo.db core reviewers team. Mike is an author of SQLAlchemy, Alembic, Mako Templates and some other stuff we use in OpenStack. Mike has been working on OpenStack for a few months contributing a lot of good patches and code reviews to oslo.db [1]. He has also been revising the db patterns in our projects and prepared a plan how to solve some of the problems we have [2]. I think, Mike would be a good addition to the team. Uhm, yeah... +10 :) ^2 :-) -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] OS or os are not acronyms for OpenStack
On 08/15/2014 11:00 AM, Mike Spreitzer wrote: Anita Kuno ante...@anteaya.info wrote on 08/15/2014 10:38:20 AM: OpenStack is OpenStack. The use of openstack is also acceptable in our development conversations. OS or os is operating system. I am starting to see some people use OS or os to mean OpenStack. This is confusing and also incorrect[0]. ... I have seen OS for OpenStack from the start. Just look at the environment variables that the CLI reads. Yep, it's quite common and I think it's fine in the right context. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] OS or os are not acronyms for OpenStack
On 08/15/2014 01:28 PM, Mike Spreitzer wrote: Anita Kuno ante...@anteaya.info wrote on 08/15/2014 01:08:44 PM: ... I think you hit the nail on the head here, Russell, it's fine in the right context. The definition of the right context however is somewhat elusive. I have chosen (it is my own fault) to place myself in the area where the folks I deal with struggle with understanding context. The newcomers to the third party space and folks creating stackforge repos don't have the benefit of the understanding that core reviewers have (would I be accurate in saying that it is mostly nova reviewers who have responded to my initial post thus far?). I suffered from an instance of this confusion myself when I was just getting started, and have seen colleagues get confused too. I suspect this problem hits many newcomers. but surely when it comes to learning OpenStack itself, the OpenStack community, dev processes, tools, etc this has got to be extremely far down the list of barriers to entry. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Enabling silent Docker tests for Nova?
On 08/15/2014 02:45 PM, Eric Windisch wrote: I have proposed a _silent_ check for Nova for integration of the Docker driver: https://review.openstack.org/#/c/114547/ It has been established that this code cannot move back into Nova until the tests are running and have a solid history of success. That cannot happen unless we're allowed to run the tests. Running a silent check on changes to Nova is the first step in establishing that history. Joe Gordon suggests we need a spec to bring the driver back into Nova. Besides the fact that specs are closed and there is no intention of reintegrating the driver for Juno, I'm uncertain of proposing a spec without first having solid history of successful testing, especially given the historical context of this driver's relationship with Nova. If we could enable silent checks, we could help minimize API skew and branch breakages, improving driver quality and reducing maintenance while we prepare for the Kilo spec + merge windows. Furthermore, by having a history of testing, we can seek faster inclusion into Kilo. Finally, I acknowledge that we may be entering a window of significant load on the CI servers and I'm sensitive to the needs of the infrastructure team to remain both focused and to conserve precious compute resources. If this is an issue, then I'd like to plot a timeline, however rough, with the infrastructure team. CI resources aside, I think enabling it sounds fine and useful. Given resource concerns, maybe just adding it to the experimental pipeline would be sufficient? That doesn't run as often, but still gives you the chance to run on demand against nova patches. There are other things in experimental for nova as well, so there will be other people triggering runs. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
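For context on the suggestion above: jobs in the experimental pipeline do not run automatically on every patch set and do not vote; they only run when someone leaves a "check experimental" comment on the patch in Gerrit. Wiring a job into that pipeline in the Zuul layout of the era looked roughly like the following sketch (the job name below is a placeholder for illustration, not the actual Docker job definition):

```yaml
# Sketch of a Zuul layout.yaml entry (openstack-infra/config) adding a
# job to Nova's experimental pipeline. "check-tempest-dsvm-docker" is a
# hypothetical job name used only as an example.
projects:
  - name: openstack/nova
    experimental:
      - check-tempest-dsvm-docker
```

Because experimental jobs are on-demand only, this arrangement consumes CI resources just when a reviewer explicitly asks for a run, which is why it addresses the load concern raised here.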
Re: [openstack-dev] Enabling silent Docker tests for Nova?
On 08/15/2014 02:53 PM, Russell Bryant wrote: On 08/15/2014 02:45 PM, Eric Windisch wrote: I have proposed a _silent_ check for Nova for integration of the Docker driver: https://review.openstack.org/#/c/114547/ It has been established that this code cannot move back into Nova until the tests are running and have a solid history of success. That cannot happen unless we're allowed to run the tests. Running a silent check on changes to Nova is the first step in establishing that history. Joe Gordon suggests we need a spec to bring the driver back into Nova. Besides the fact that specs are closed and there is no intention of reintegrating the driver for Juno, I'm uncertain of proposing a spec without first having solid history of successful testing, especially given the historical context of this driver's relationship with Nova. If we could enable silent checks, we could help minimize API skew and branch breakages, improving driver quality and reducing maintenance while we prepare for the Kilo spec + merge windows. Furthermore, by having a history of testing, we can seek faster inclusion into Kilo. Finally, I acknowledge that we may be entering a window of significant load on the CI servers and I'm sensitive to the needs of the infrastructure team to remain both focused and to conserve precious compute resources. If this is an issue, then I'd like to plot a timeline, however rough, with the infrastructure team. CI resources aside, I think enabling it sounds fine and useful. Given resource concerns, maybe just adding it to the experimental pipeline would be sufficient? That doesn't run as often, but still gives you the chance to run on demand against nova patches. There are other things in experimental for nova as well, so there will be other people triggering runs. And I missed that it's already in experimental. Oops. Feature freeze is only a few weeks away (Sept 4). How about we just leave it in experimental until after that big push? That seems pretty reasonable. 
-- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [QA] Picking a Name for the Tempest Library
On 08/15/2014 03:26 PM, Drew Fisher wrote: What about 'teapot' (as in the idiom 'tempest in a teapot'[1]) -Drew [1] http://en.wikipedia.org/wiki/Tempest_in_a_teapot Though in this case it'd be teacup in tempest, I think? There's also a TCup project [1] that uses tempest. So, you have teapot in tempest in tcup ... and that just gets confusing. :-) [1] https://wiki.openstack.org/wiki/RefStack/TCup -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [QA] Picking a Name for the Tempest Library
On Aug 15, 2014, at 5:39 PM, Jay Pipes jaypi...@gmail.com wrote: On 08/15/2014 03:14 PM, Matthew Treinish wrote: Hi Everyone, So as part of splitting out common functionality from tempest into a library [1] we need to create a new repository. Which means we have the fun task of coming up with something to name it. I'm personally thought we should call it: - mesocyclone Which has the advantage of being a cloud/weather thing, and the name sort of fits because it's a precursor to a tornado. Also, it's an available namespace on both launchpad and pypi. But there has been expressed concern that both it is a bit on the long side (which might have 80 char line length implications) and it's unclear from the name what it does. During the last QA meeting some alternatives were also brought up: - tempest-lib / lib-tempest - tsepmet - blackstorm - calm - tempit - integration-test-lib (although I'm not entirely sure I remember which ones were serious suggestions or just jokes) So as a first step I figured that I'd bring it up on the ML to see if anyone had any other suggestions. (or maybe get a consensus around one choice) I'll take the list, check if the namespaces are available, and make a survey so that everyone can vote and hopefully we'll have a clear choice for a name from that. I suggest that tempest should be the name of the import'able library, and that the integration tests themselves should be what is pulled out of the current Tempest repository, into their own repo called openstack-integration-tests or os-integration-tests. Ooh, I like that idea! +1 -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][core] Expectations of core reviewers
On 08/18/2014 06:18 AM, Thierry Carrez wrote: Doug Hellmann wrote: On Aug 13, 2014, at 4:42 PM, Russell Bryant rbry...@redhat.com wrote: Let me try to say it another way. You seemed to say that it wasn't much to ask given the rate at which things happen in OpenStack. I would argue that given the rate, we should not try to ask more of individuals (like this proposal) and risk burnout. Instead, we should be doing our best to be more open and inclusive to give the project the best chance to grow, as that's the best way to get more done. I think an increased travel expectation is a raised bar that will hinder team growth, not help it. +1, well said. Sorry, I was away for a few days. This is a topic I have a few strong opinions on :) There is no denial that the meetup format is working well, comparatively better than the design summit format. There is also no denial that requiring 4 trips per year for a core dev is unreasonable. Where is the limit? Wouldn't we be more productive and aligned if we did one per month? No, the question is how to reach a sufficient level of focus and alignment while keeping the number of mandatory trips at 2 per year. I don't think our issue comes from not having enough F2F time. Our issue is that the design summit no longer reaches its objectives of aligning key contributors on a common plan, and we need to fix it. We established the design summit as the once-per-cycle opportunity to have face-to-face time and get alignment across the main contributors to a project. That used to be completely sufficient, but now it doesn't work as well... which resulted in alignment and team discussions moving to mid-cycle meetups instead. Why? And what could we change to have those alignment discussions at the design summit again? Why are design summits less productive than mid-cycle meetups these days? Is it because there are too many non-contributors in the design summit rooms? Is it the 40-min format? 
Is it the distractions (having talks to give somewhere else, booths to attend, parties and dinners to be at)? Is it that the beginning of the cycle is not the best moment? Once we know WHY the design summit fails its main objective, maybe we can fix it. My gut feeling is that having a restricted audience and a smaller group lets people get to the bottom of an issue and reach consensus. And that you need at least half a day or a full day of open discussion to reach such alignment. And that it's not particularly great to get such alignment in the middle of the cycle, getting it at the start is still the right way to align with the release cycle. Nothing prevents us from changing part of the design summit format (even the Paris one!), and restricting attendance to some of the sessions. And if the main issue is the distraction from the conference co-location, we might have to discuss the future of co-location again. In that 2 events per year objective, we could make the conference the optional cycle thing, and a developer-oriented specific event the mandatory one. If we manage to have alignment at the design summit, then it doesn't spell the end of the mid-cycle things. But then, ideally the extra mid-cycle gatherings should be focused on getting specific stuff done, rather than general team alignment. Think workshop/hackathon rather than private gathering. The goal of the workshop would be published in advance, and people could opt to join that. It would be totally optional. Great response ... I agree with everything you've said here. Let's figure out how to improve the design summit to better achieve team alignment. Of the things you mentioned, I think the biggest limit to alignment has been the 40 minute format. There are some topics that need more time. It may be that we just need to take more advantage of the ability to give a single topic multiple time slots to ensure enough time is available. 
As Dan discussed, there are some topics that we could stand to turn down and distribute information another way that is just as effective. I would also say that the number of things going on at one time is problematic. Not only are there several design summit sessions going on at once, but there are conference sessions and customer meetings. The rapid rate of jumping around and context switching is exhausting. It also makes it a bit harder to get critical mass for an extended period of time around a topic. In mid-cycle meetups, there is one track and no other things competing for time and attention. I don't have a good suggestion for fixing this issue with so many things competing for time and attention. I used to be a big proponent of splitting the event out completely, but I don't feel the same way anymore. In theory we could call the conference the optional event, but in practice it's going to be required for many folks anyway. I can't speak for everyone, but I suspect if you're a senior engineer at your
Re: [openstack-dev] [TripleO][Nova] Specs and approvals
On 08/19/2014 05:31 AM, Robert Collins wrote: Hey everybody - https://wiki.openstack.org/wiki/TripleO/SpecReviews seems pretty sane as we discussed at the last TripleO IRC meeting. I'd like to propose that we adopt it with the following tweak: 19:46:34 lifeless so I propose that +2 on a spec is a commitment to review it over-and-above the core review responsibilities 19:47:05 lifeless if its not important enough for a reviewer to do that thats a pretty strong signal 19:47:06 dprince lifeless: +1, I thought we already agreed to that at the meetup 19:47:17 slagle yea, sounds fine to me 19:47:20 bnemec +1 19:47:30 lifeless dprince: it wasn't clear whether it was part-of-responsibility, or additive, I'm proposing we make it clearly additive 19:47:52 lifeless and separately I think we need to make surfacing reviews-for-themes a lot better That is - +1 on a spec review is 'sure, I like it', +2 is specifically I will review this *over and above* my core commitment - the goal here is to have some very gentle choke on concurrent WIP without needing the transition to a managed pull workflow that Nova are discussing - which we didn't have much support for during the meeting. Obviously, any core can -2 for any of the usual reasons - this motion is about opening up +A to the whole Tripleo core team on specs. Reviewers, and other interested kibbitzers, please +1 / -1 as you feel fit :) +1 I really like this. In fact, I like it a lot more than the current proposal for Nova. I think the Nova team should consider this, as well. It still rate limits code reviews by making core reviewers explicitly commit to reviewing things. This is like our previous attempt at sponsoring blueprints, but the use of gerrit I think would make it more successful. It also addresses my primary concerns with the tensions between group will and small groups no longer being able to self organize and push things to completion without having to haggle through yet another process. 
-- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] [ptls] The Czar system, or how to scale PTLs
On 08/22/2014 08:33 AM, Thierry Carrez wrote: Hi everyone, We all know being a project PTL is an extremely busy job. That's because in our structure the PTL is responsible for almost everything in a project: - Release management contact - Work prioritization - Keeping bugs under control - Communicate about work being planned or done - Make sure the gate is not broken - Team logistics (run meetings, organize sprints) - ... They end up being completely drowned in those day-to-day operational duties, miss the big picture, can't help in development that much anymore, get burnt out. Since you're either the PTL or not the PTL, you're very alone and succession planning is not working that great either. There have been a number of experiments to solve that problem. John Garbutt has done an incredible job at helping successive Nova PTLs handling the release management aspect. Tracy Jones took over Nova bug management. Doug Hellmann successfully introduced the concept of Oslo liaisons to get clear point of contacts for Oslo library adoption in projects. It may be time to generalize that solution. The issue is one of responsibility: the PTL is ultimately responsible for everything in a project. If we can more formally delegate that responsibility, we can avoid getting up to the PTL for everything, we can rely on a team of people rather than just one person. Enter the Czar system: each project should have a number of liaisons / official contacts / delegates that are fully responsible to cover one aspect of the project. We need to have Bugs czars, which are responsible for getting bugs under control. We need to have Oslo czars, which serve as liaisons for the Oslo program but also as active project-local oslo advocates. We need Security czars, which the VMT can go to to progress quickly on plugging vulnerabilities. We need release management czars, to handle the communication and process with that painful OpenStack release manager. 
We need Gate czars to serve as first-line-of-contact getting gate issues fixed... You get the idea. Some people can be czars of multiple areas. PTLs can retain some czar activity if they wish. Czars can collaborate with their equivalents in other projects to share best practices. We just need a clear list of areas/duties and make sure each project has a name assigned to each. Now, why czars? Why not rely on informal activity? Well, for that system to work we'll need a lot of people to step up and sign up for more responsibility. Making them czars makes sure that effort is recognized and gives them something back. Also if we don't formally designate people, we can't really delegate and the PTL will still be directly held responsible. The Release management czar should be able to sign off release SHAs without asking the PTL. The czars and the PTL should collectively be the new project drivers. At that point, why not also get rid of the PTL? And replace him with a team of czars? If the czar system is successful, the PTL should be freed from the day-to-day operational duties and will be able to focus on the project health again. We still need someone to keep an eye on the project-wide picture and coordinate the work of the czars. We need someone to pick czars, in the event multiple candidates sign up. We also still need someone to have the final say in case of deadlocked issues. People say we don't have that many deadlocks in OpenStack for which the PTL ultimate power is needed, so we could get rid of them. I'd argue that the main reason we don't have that many deadlocks in OpenStack is precisely *because* we have a system to break them if they arise. That encourages everyone to find a lazy consensus. That part of the PTL job works. Let's fix the part that doesn't work (scaling/burnout). +1 on czars. That's what was working best for me to start scaling things in Nova, especially through my 2nd term (Icehouse).
John and Tracy were a big help as you mentioned as examples. There were others that were stepping up, too. I think it's been working well enough to formalize it. Another area worth calling out is a gate czar. Having someone who understands infra and QA quite well and is regularly on top of the status of the project in the gate is helpful and quite important. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] [ptls] The Czar system, or how to scale PTLs
On 08/22/2014 09:40 AM, Russell Bryant wrote: Another area worth calling out is a gate czar. Having someone who understands infra and QA quite well and is regularly on top of the status of the project in the gate is helpful and quite important. Oops, you said this one, too. Anyway, +1. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Server Groups - remove VM from group?
On 08/25/2014 12:56 PM, Joe Cropper wrote: That was indeed a rather long (and insightful) thread on the topic. It sounds like there are still some healthy discussions worth having on the subject -- either exploring your [potentially superseding] proposal, or minimally rounding out the existing server group API to support add existing VM [1] and remove VM -- I think these would make it a lot more usable (I'm thinking of the poor cloud administrator that makes a mistake when they boot an instance and either forgets to put it in a group or puts it in the wrong group -- it's square 1 for them)? Is this queued up as a discussion point for Paris? If so, count me in! Adding a VM is far from trivial and is why we ripped it out before merging. That implies a potential reshuffling of a bunch of existing VMs. Consider an affinity group of instances A and B and then you add running instance C to that group. What do you expect to happen? Live migrate C to the host running A and B? What if there isn't room? Reschedule all 3 to find a host and live migrate all of them? This kind of orchestration is a good bit outside of the scope of what's done inside of Nova today. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Server Groups - remove VM from group?
On 08/25/2014 01:25 PM, Joe Cropper wrote: I was thinking something simple such as only allowing the add operation to succeed IFF no policies are found to be in violation... and then nova wouldn't need to get into all the complexities you mention? Even something like this is a lot more complicated than it sounds due to the fact that several operations can be happening in parallel. I think we just need to draw a line for Nova that just doesn't include this functionality. And remove would be fairly straightforward as well since no constraints would need to be checked. Right, remove is straightforward, but seems a bit odd to have without add. I'm not sure there's much value to it. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] Design Summit reloaded
On 08/27/2014 08:51 AM, Thierry Carrez wrote: Hi everyone, I've been thinking about what changes we can bring to the Design Summit format to make it more productive. I've heard the feedback from the mid-cycle meetups and would like to apply some of those ideas for Paris, within the constraints we have (already booked space and time). Here is something we could do: Day 1. Cross-project sessions / incubated projects / other projects I think that worked well last time. 3 parallel rooms where we can address top cross-project questions, discuss the results of the various experiments we conducted during juno. Don't hesitate to schedule 2 slots for discussions, so that we have time to come to the bottom of those issues. Incubated projects (and maybe other projects, if space allows) occupy the remaining space on day 1, and could occupy pods on the other days. I would add Don't hesitate to schedule 2 slots ... to the description for days 2 and 3, as well. I think the same point applies for project-specific sessions. I don't think I've seen that used for project sessions much, but I think it would help in some cases. Day 2 and Day 3. Scheduled sessions for various programs That's our traditional scheduled space. We'll have 33% fewer slots available. So, rather than trying to cover all the scope, the idea would be to focus those sessions on specific issues which really require face-to-face discussion (which can't be solved on the ML or using spec discussion) *or* require a lot of user feedback. That way, appearing in the general schedule is very helpful. This will require us to be a lot stricter on what we accept there and what we don't -- we won't have space for courtesy sessions anymore, and traditional/unnecessary sessions (like my traditional release schedule one) should just move to the mailing-list. Day 4.
Contributors meetups On the last day, we could try to split the space so that we can conduct parallel midcycle-meetup-like contributors gatherings, with no time boundaries and an open agenda. Large projects could get a full day, smaller projects would get half a day (but could continue the discussion in a local bar). Ideally that meetup would end with some alignment on release goals, but the idea is to make the best of that time together to solve the issues you have. Friday would finish with the design summit feedback session, for those who are still around. I think this proposal makes the best use of our setup: discuss clear cross-project issues, address key specific topics which need face-to-face time and broader attendance, then try to replicate the success of midcycle meetup-like open unscheduled time to discuss whatever is hot at this point. There are still details to work out (is it possible to split the space, should we use the usual design summit CFP website to organize the scheduled time...), but I would first like to have your feedback on this format. Also if you have alternative proposals that would make a better use of our 4 days, let me know. +1 on the format. I think it sounds like a nice iteration on our setup to try some new ideas. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron][stable] How to backport database schema fixes
On 08/29/2014 06:54 AM, Salvatore Orlando wrote: If you are running a version from a stable branch, changes in DB migrations should generally be forbidden as the policy states since those migrations are not likely to be executed again. Downgrading and then upgrading again is extremely risky and I don't think anybody would ever do that. However, if one is running stable branch X-2 where X is the current development branch, backporting migration fixes could make sense for upgrading to version X-1 if the migration being fixed is in the path between X-2 and X-1. Therefore I would forbid every fix to migrations earlier than the X-2 release (there should not be any in theory but neutron has migrations back to folsom). For the path between X-2 and X-1 fixes might be ok. I think it's safe to backport to X-1. The key bit is that the migration in master and the backported version must be reentrant. They need to inspect the schema and only perform the change if it hasn't already been applied. This is a good best practice to adopt for *all* migrations to make the backport option easier. However, rather than amending an existing migration, it is always better to add a new migration - even if it's a matter of enabling a given change for a particular plugin (*). Agreed, in general. It depends on the bug. If there's an error in the migration that will prevent the original code from running properly, breaking the migration, that obviously needs to be fixed. As nova does, the best place for doing that is always immediately before release. Doing what, adding placeholders? Note that we actually add placeholders at the very *beginning* of a release cycle. The placeholders have to be put in place as the first set of migrations in a release. That way: 1) X-1 has those migration slots unused. 2) X has those slots reserved. If we did it just *before* release, you can't actually backport into those positions. They've already run as no-op.
With alembic, we do not need to add placeholders, but just adjust pointers just like you would when inserting an element in a dynamic list. Good point. (*) we are getting rid of this conditional migration logic for juno anyway Yay! -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
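The reentrant migration pattern discussed above can be sketched minimally. This is my own illustration, not Neutron or Nova code: the table and column names are made up, and sqlite3 stands in for whatever backend a real migration targets. The point is only the shape of the check: inspect the live schema first and apply the change only if it is not already present, so the original migration and a backported copy can both run safely.

```python
import sqlite3

def upgrade(conn):
    # Reentrant: inspect the schema and only perform the change if it
    # hasn't already been applied (e.g. by a backported copy of this
    # migration that ran on the stable branch).
    cols = [row[1] for row in conn.execute("PRAGMA table_info(instances)")]
    if "locked_by" in cols:
        return  # already applied; nothing to do
    conn.execute("ALTER TABLE instances ADD COLUMN locked_by TEXT")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY)")
upgrade(conn)
upgrade(conn)  # re-running is a harmless no-op instead of an ALTER failure
print([row[1] for row in conn.execute("PRAGMA table_info(instances)")])
```

Without the guard, the second `upgrade()` call would fail with "duplicate column name", which is exactly the breakage a non-reentrant backport risks.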
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On 09/04/2014 06:24 AM, Daniel P. Berrange wrote: Position statement == Over the past year I've increasingly come to the conclusion that Nova is heading for (or probably already at) a major crisis. If steps are not taken to avert this, the project is likely to lose a non-trivial amount of talent, both regular code contributors and core team members. That includes myself. This is not good for Nova's long term health and so should be of concern to anyone involved in Nova and OpenStack. For those who don't want to read the whole mail, the executive summary is that the nova-core team is an unfixable bottleneck in our development process with our current project structure. The only way I see to remove the bottleneck is to split the virt drivers out of tree and let them all have their own core teams in their area of code, leaving current nova core to focus on all the common code outside the virt driver impls. I nonetheless urge people to read the whole mail. Fantastic write-up. I can't +1 enough the problem statement, which I think you've done a nice job of framing. We've taken steps to try to improve this, but none of them have been big enough. I feel we've reached a tipping point. I think many others do too, and several proposals being discussed all seem rooted in this same core issue. When it comes to the proposed solution, I'm +1 on that too, but part of that is that it's hard for me to ignore the limitations placed on us by our current review infrastructure (gerrit). If we ignored gerrit for a moment, is rapid increase in splitting out components the ideal workflow? Would we be better off finding a way to finally just implement a model more like the Linux kernel with sub-system maintainers and pull requests to a top-level tree? Maybe. I'm not convinced that split of repos is obviously better. You make some good arguments for why splitting has other benefits.
Besides, even if we weren't going to split them and instead wanted to have separate branches, we'd have to take interface stability much more seriously. I think the work immediately needed overlaps quite a bit. In any case, let's not get completely side-tracked on the end-game workflow. I am completely on board with the idea that we have to move to a model that involves more than one team and spreading out the responsibility further than we have thus far. I don't think we can afford to wait much longer without drastic change, so let's make it happen. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
- Original Message - On 09/04/2014 11:32 AM, Vladik Romanovsky wrote: +1 I very much agree with Dan's proposal. I am concerned about difficulties we will face with merging patches that spread across various regions: manager, conductor, scheduler, etc.. However, I think, this is a small price to pay for having more focused teams. IMO, we will still have to pay it, the moment the scheduler will separate. There will be more pain the moment the scheduler separates, IMO, especially with its current design and interfaces. I absolutely agree that the scheduler split is a non-starter without stabilizing all of the relevant interfaces. I hope there's not much debate on that high level point. Of course, identifying exactly what those interfaces should be is a bit more complicated, but I hope the focus can stay there. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On 09/05/2014 10:06 AM, Jay Pipes wrote: On 09/05/2014 06:29 AM, John Garbutt wrote: Scheduler: I think we need to split out the scheduler with a similar level of urgency. We keep blocking features on the split, because we know we don't have the review bandwidth to deal with them. Right now I am talking about a compute related scheduler in the compute program, that might evolve to worry about other services at a later date. -1 Without first cleaning up the interfaces around resource tracking, claim creation and processing, and the communication interfaces between the nova-conductor, nova-scheduler, and nova-compute. I see no urgency at all in splitting out the scheduler. The cleanup of the interfaces around the resource tracker and scheduler has great priority, though, IMO. I'd just reframe things ... I'd like the work you're referring to here be treated as an obvious key pre-requisite to a split, and this cleanup is what should be treated with urgency by those with a vested interest in getting more autonomy around scheduler development. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] doubling our core review bandwidth
On 09/08/2014 05:17 AM, Steven Hardy wrote: On Mon, Sep 08, 2014 at 03:14:24PM +1200, Robert Collins wrote: I hope the subject got your attention :). This might be a side effect of my having too many cosmic rays, but it's been percolating for a bit. tl;dr I think we should drop our 'needs 2x+2 to land' rule and instead use 'needs 1x+2'. We can ease up a large chunk of pressure on our review bottleneck, with the only significant negative being that core reviewers may see less of the code going into the system - but they can always read more to stay in shape if that's an issue :) I think this may be a sensible move, but only if it's used primarily to land the less complex/risky patches more quickly. As has been mentioned already by Angus, +1 can (and IMO should) be used for any less trivial and/or risky patches, as the more-eyeballs thing is really important for big or complex patches (we are all fallible, and -core folks quite regularly either disagree, spot different types of issue, or just have better familiarity with some parts of the codebase than others). FWIW, every single week in the Heat queue, disagreements between -core reviewers result in issues getting fixed before merge, which would result in more bugs if the 1x+2 scheme was used unconditionally. I'm sure other projects are the same, but I guess this risk can be mitigated with reviewer +1 discretion. Agreed with this. I think this is a worthwhile move for simpler patches. I've already done it plenty of times for a very small category of things (like translations updates). It would be worth having someone write up a proposal that reflects this, with some examples that demonstrate patches that really need the second review vs others that don't. In the end, it has to be based on trust in a -core team member's judgement call. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] On an API proxy from baremetal to ironic
On 09/09/2014 05:24 PM, Michael Still wrote: Hi. One of the last things blocking Ironic from graduating is deciding whether or not we need a Nova API proxy for the old baremetal extension to new fangled Ironic API. The TC has asked that we discuss whether we think this functionality is actually necessary. It should be noted that we're _not_ talking about migration of deployed instances from baremetal to Ironic. That is already implemented. What we are talking about is if users post-migration should be able to expect their previous baremetal Nova API extension to continue to function, or if they should use the Ironic APIs from that point onwards. Nova had previously thought this was required, but it hasn't made it in time for Juno unless we do a FFE, and it has been suggested that perhaps its not needed at all because it is an admin extension. To be super specific, we're talking about the baremetal nodes admin extension here. This extension has the ability to: - list nodes running baremetal - show detail of one of those nodes - create a new baremetal node - delete a baremetal node Only the first two of those would be supported if we implemented a proxy. So, discuss. I'm in favor of proceeding with deprecation without requiring the API proxy. In the case of user facing APIs, the administrators in charge of upgrading the cloud do not have full control over all of the apps using the APIs. In this particular case, I would expect that the cloud administrators have *complete* control over the use of these APIs. Assuming we have one overlap release (Juno) to allow the migration to occur and given proper documentation of the migration plan and release notes stating the fact that the old APIs are going away, we should be fine. In summary, +1 to moving forward without the API proxy requirement. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Kilo Cycle Goals Exercise
On 09/03/2014 11:37 AM, Joe Gordon wrote: As you all know, there have recently been several very active discussions around how to improve assorted aspects of our development process. One idea that was brought up is to come up with a list of cycle goals/project priorities for Kilo [0]. To that end, I would like to propose an exercise as discussed in the TC meeting yesterday [1]: Have anyone interested (especially TC members) come up with a list of what they think the project wide Kilo cycle goals should be and post them on this thread by end of day Wednesday, September 10th. After which time we can begin discussing the results. The goal of this exercise is to help us see if our individual world views align with the greater community, and to get the ball rolling on a larger discussion of where as a project we should be focusing more time. In OpenStack, we have no shortage of interest and enthusiasm on all fronts, including development contributors, deployers, and cloud end users. When looking at project-wide priorities, we need to make sure our tools, processes, and resulting technology facilitate turning all of that interest into real value. We need to identify which areas have the most pain, and work to improve them. A lot of this is certainly about Kilo, but it's longer term, too. It's the way we should always be thinking about this. 1) Dev community We clearly have a lot of growing pains here. What's quite encouraging is that we also have a lot of hard work going into several different proposals to figure out ways to help. The largest projects (Nova and Neutron) are overwhelmed and approaching breaking points. We have to find ways to relieve this pressure. This may involve aggressively pursuing project splits or other code review workflow changes. I think the problems and solutions here are project-wide issues, as solutions put in place tend to rapidly spread to the rest of OpenStack.
This is an area I'm especially concerned about and eager to help look for solutions. We should evaluate all potential improvements against how well they help us scale our teams and processes to remove bottlenecks to productivity in the dev community. There are several other encouraging proposals related to easing pain in the dev community: - re-working how we do functional testing by making it more project focused - discussions like this one to discuss both priorities, but also how we turn priorities into real action (like the Nova project's discussions around using priorities in development) - evolving project leadership (the PTL position) so that we can provide more guidance around delegation in a way that is reasonably consistent across projects - continued discussion about the contents of the integrated release and how we can continue to foster growth without sacrificing quality We are always going to have problems like this, and I hope we continue to think about, discuss, and improve the way we run our projects every release cycle to come. 2) End Users A few others have done a very nice job describing end user experience problems. Monty's description of getting an instance with an IP was painful and embarrassing to read. We've got to figure out ways to provide better focus on these sorts of usability issues. They're obviously not getting the attention they deserve. There have also been lots of good points about improving our API consistency. I totally agree. I'd love to see a group of people step up and emerge as leaders in this area across all projects. I feel like that's something we're sorely missing right now. 3) Deployers OpenStack is still painful to deploy, and even more painful to manage. I'm still quite pleased that we have a deployment program working on this space. I'd actually really like to see how we can facilitate better integration and discussion between TripleO and the rest of the project teams.
I'm also very pleased with the progress we've made in Nova towards the initial support for live upgrades. We still have more work to do in Nova, but I'd also like to see more work done in other projects towards the same goal. For both deployers and the dev community, figuring out what went wrong when OpenStack breaks sucks. Others provided some good pointers to several areas we can improve that area (better logs, tooling, ...) and I hope we can make some real progress in this area in the coming cycle. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Server Groups - remove VM from group?
On Sep 10, 2014, at 2:03 PM, Joe Cropper cropper@gmail.com wrote: I agree, Chris. I think a number of folks put in a lot of really great work into the existing server groups and there has been a lot of interest on their usage, especially given that the scheduler already has some constructs in place to piggyback on them. I would like to craft up a blueprint proposal for Kilo to add two simple extensions to the existing server group APIs that I believe will make them infinitely more usable in any ‘real world’ scenario. I’ll put more details in the proposal, but in a nutshell: 1. Adding a VM to a server group Only allow it to succeed if its policy wouldn’t be violated by the addition of the VM I'm not sure that determining this at the time of the API request is possible due to the parallel and async nature of the system. I'd love to hear ideas on how you think this might be done, but I'm really not optimistic and would rather just not go down this road. 2. Removing a VM from a server group Just allow it I think this would round out the support that’s there and really allow us to capitalize on the hard work everyone’s already put into them. - Joe On Aug 26, 2014, at 6:39 PM, Chris Friesen chris.frie...@windriver.com wrote: On 08/25/2014 11:25 AM, Joe Cropper wrote: I was thinking something simple such as only allowing the add operation to succeed IFF no policies are found to be in violation... and then nova wouldn't need to get into all the complexities you mention? Personally I would be in favour of this...nothing fancy, just add it if it already meets all the criteria. This is basically just a database operation so I would hope we could make it reliable in the face of simultaneous things going on with the instance. And remove would be fairly straightforward as well since no constraints would need to be checked. Agreed. 
Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Server Groups - remove VM from group?
On 09/10/2014 06:46 PM, Joe Cropper wrote: Hmm, not sure I follow the concern, Russell. How is that any different from putting a VM into the group when it’s booted as is done today? This simply defers the ‘group insertion time’ to some time after initial the VM’s been spawned, so I’m not sure this creates anymore race conditions than what’s already there [1]. [1] Sure, the to-be-added VM could be in the midst of a migration or something, but that would be pretty simple to check make sure its task state is None or some such. The way this works at boot is already a nasty hack. It does policy checking in the scheduler, and then has to re-do some policy checking at launch time on the compute node. I'm afraid of making this any worse. In any case, it's probably better to discuss this in the context of a more detailed design proposal. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
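The "policy checking in the scheduler, then re-done at launch time" pattern Russell calls a hack can be sketched in miniature. This is my own illustration with invented names, not Nova's actual code: the scheduler validates the group policy against the state it sees when picking a host, and the compute node must re-validate at spawn time because a concurrent boot may have violated the policy in the meantime.

```python
class RetryError(Exception):
    """Raised when the request must be rescheduled."""

def schedule(hosts, members, policy):
    # First policy check, in the scheduler: pick a host that satisfies
    # the group policy given the state visible *right now*.
    for host, instances in hosts.items():
        on_host = members & instances
        if policy == "affinity" and (not members or on_host):
            return host
        if policy == "anti-affinity" and not on_host:
            return host
    raise RetryError("no host satisfies the group policy")

def spawn(host, hosts, members, policy):
    # Second policy check, on the compute node: the state may have
    # changed since scheduling, so re-validate before actually booting.
    on_host = members & hosts[host]
    if policy == "affinity" and members and not on_host:
        raise RetryError("group members moved; reschedule")
    if policy == "anti-affinity" and on_host:
        raise RetryError("a group member landed here; reschedule")
    hosts[host].add("new-instance")

# Booting into an anti-affinity group whose only member "A" is on h1:
hosts = {"h1": {"A"}, "h2": set()}
host = schedule(hosts, {"A"}, "anti-affinity")  # picks "h2"

# Race: a concurrent boot places group member "C" on h2 before we spawn,
# so only the late re-check catches the violation.
hosts["h2"].add("C")
try:
    spawn(host, hosts, {"A", "C"}, "anti-affinity")
except RetryError as exc:
    print("reschedule:", exc)
```

An "add VM to group" API would face the same race: any policy check it performs can be invalidated by an in-flight boot or migration before the membership change takes effect, which is why a simple check-then-add is less safe than it sounds.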
Re: [openstack-dev] [nova][neutron][cinder] Averting the Nova crisis by splitting out virt drivers
On 09/10/2014 10:35 PM, Armando M. wrote: Hi, I devoured this thread, so much it was interesting and full of insights. It's not news that we've been pondering about this in the Neutron project for the past and existing cycle or so. Likely, this effort is going to take more than two cycles, and would require a very focused team of people working closely together to address this (most likely the core team members plus a few other folks interested). One question I was unable to get a clear answer was: what happens to existing/new bug fixes and features? Would the codebase go in lockdown mode, i.e. not accepting anything else that isn't specifically targeting this objective? Just using NFV as an example, I can't imagine having changes supporting NFV still being reviewed and merged while this process takes place...it would be like shooting at a moving target! If we did go into lockdown mode, what happens to all the corporate-backed agendas that aim at delivering new value to OpenStack? Yes, I imagine a temporary slow-down on new feature development makes sense. However, I don't think it has to be across the board. Things should be considered case by case, like usual. For example, a feature that requires invasive changes to the virt driver interface might have a harder time during this transition, but a more straightforward feature isolated to the internals of a driver might be fine to let through. Like anything else, we have to weigh the cost/benefit. Should we relax what goes into the stable branches, i.e. considering having a Juno on steroids six months from now that includes some of the features/fixes that didn't land in time before this process kicks off? No ... maybe I misunderstand the suggestion, but I definitely would not be in favor of a Juno branch with features that haven't landed in master. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Kilo Cycle Goals Exercise
On 09/11/2014 12:52 AM, Angus Lees wrote: So easy/obvious it probably isn't even worth mentioning: Drop support for python2.6 Yeah, that's been the plan. We discussed this at the Juno summit and representatives from most (all?) distributions carrying OpenStack were there. Dropping in Kilo seemed like a reasonable time frame at the time. https://etherpad.openstack.org/p/juno-cross-project-future-of-python And obviously tweeting about it makes it official, right? https://twitter.com/russellbryant/status/466241078472228864 But seriously, we should probably put out a more official notice about this once Kilo opens up. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] nova-specs for Kilo?
On 09/11/2014 01:32 AM, Joe Cropper wrote: Hi Folks, Just wondering if the nova-specs master branch will have a ‘kilo’ directory created soon for Kilo proposals? I have a few things I’d like to submit, just looking for the proper home. There's some more info on that here: http://lists.openstack.org/pipermail/openstack-dev/2014-August/044431.html -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Server Groups - remove VM from group?
On 09/11/2014 05:01 PM, Jay Pipes wrote: On 09/11/2014 04:51 PM, Matt Riedemann wrote: On 9/10/2014 6:00 PM, Russell Bryant wrote: On 09/10/2014 06:46 PM, Joe Cropper wrote: Hmm, not sure I follow the concern, Russell. How is that any different from putting a VM into the group when it’s booted as is done today? This simply defers the ‘group insertion time’ to some time after the VM’s been initially spawned, so I’m not sure this creates any more race conditions than what’s already there [1]. [1] Sure, the to-be-added VM could be in the midst of a migration or something, but that would be pretty simple to check: make sure its task state is None or some such. The way this works at boot is already a nasty hack. It does policy checking in the scheduler, and then has to re-do some policy checking at launch time on the compute node. I'm afraid of making this any worse. In any case, it's probably better to discuss this in the context of a more detailed design proposal. This [1] is the hack you're referring to, right? [1] http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2.b3#n1297 That's the hack *I* had in the back of my mind. Yep. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
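To make the boot-time "nasty hack" concrete, here is a toy sketch of the kind of late policy re-check the compute node has to repeat after the scheduler has already checked it. The names (validate_group_policy, GroupPolicyViolation) are invented for illustration; the real check lives in nova/compute/manager.py and differs in detail.

```python
# Toy sketch of the "check the group policy again on the compute
# node" hack discussed above. All names here are illustrative,
# not Nova's actual API.

class GroupPolicyViolation(Exception):
    pass

def validate_group_policy(group_hosts, my_host, policy):
    """Re-check a server group policy at instance launch time.

    The scheduler already enforced this, but two members of the same
    anti-affinity group may have been scheduled concurrently, so the
    compute node has to check once more before spawning.
    """
    if policy == "anti-affinity" and my_host in group_hosts:
        raise GroupPolicyViolation(
            "host %s already runs a member of this group" % my_host)
    if policy == "affinity" and group_hosts and my_host not in group_hosts:
        raise GroupPolicyViolation(
            "group members run elsewhere: %s" % sorted(group_hosts))

# Passes: no other group member landed on node1 yet.
validate_group_policy({"node2"}, "node1", "anti-affinity")
```

The race Russell describes is exactly the case where the scheduler's check passed for two instances at once, and only this second check (with the group's hosts re-read at launch time) catches the collision.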
Re: [openstack-dev] [all] Design Summit planning
On 09/12/2014 07:37 AM, Thierry Carrez wrote: If you think this is wrong and think the design summit suggestion website is a better way to do it, let me know why! If some programs really can't stand the 'etherpad/IRC' approach I'll see how we can spin up a limited instance. I think this is fine, especially if it's a better reflection of reality and lets the teams work more efficiently. However, one of the benefits of the old submission system was the clarity of the process and openness to submissions from anyone. We don't want to be in a situation where non-core folks feel like they have a harder time submitting a session. Once this is settled, as long as the wiki pages [1][2] reflect the process and are publicized, it should be fine. [1] https://wiki.openstack.org/wiki/Summit [2] https://wiki.openstack.org/wiki/Summit/Planning -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron] keep old specs
On 09/15/2014 10:01 AM, Kevin Benton wrote: Some of the specs had a significant amount of detail and thought put into them. It seems like a waste to bury them in a git tree history. By having them in a place where external parties (e.g. operators) can easily find them, they could get more visibility and feedback for any future revisions. Just being able to see that a feature was previously designed out and approved can prevent a future person from wasting a bunch of time typing up a new spec for the same feature. Hardly anyone is going to search deleted specs from two cycles ago if it requires checking out a commit. Why just restrict the whole repo to being documentation of what went in? If that's all the specs are for, why don't we just wait to create them until after the code merges? FWIW, I agree with you that it makes sense to keep them in a directory that makes it clear that they were not completed. There's a ton of useful info in them. Even if they get re-proposed, it's still useful to see the difference in the proposal as it evolved between releases. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] What's holding nova development back?
On 09/15/2014 05:42 AM, Daniel P. Berrange wrote: On Sun, Sep 14, 2014 at 07:07:13AM +1000, Michael Still wrote: Just an observation from the last week or so... The biggest problem nova faces at the moment isn't code review latency. Our biggest problem is failing to fix our bugs so that the gate is reliable. The number of rechecks we've done in the last week to try and land code is truly startling. I consider both problems to be pretty much equally important. I don't think solving review latency or test reliability in isolation is enough to save Nova. We need to tackle both problems as a priority. I tried to avoid getting into my concerns about testing in my mail on review team bottlenecks since I think we should address the problems independently / in parallel. Agreed with this. I don't think we can afford to ignore either one of them. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] reopen a change / pull request for nova-pythonclient ?
On 09/17/2014 11:47 AM, Alex Leonhardt wrote: hi, how does one re-open an abandoned change / pull request? It just timed out and was then abandoned - https://review.openstack.org/#/c/57834/ please let me know I re-opened it. You should be able to update it now. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] reopen a change / pull request for nova-pythonclient ?
On 09/17/2014 11:56 AM, Daniel P. Berrange wrote: On Wed, Sep 17, 2014 at 04:47:06PM +0100, Alex Leonhardt wrote: hi, how does one re-open an abandoned change / pull request? It just timed out and was then abandoned - https://review.openstack.org/#/c/57834/ please let me know Just re-upload the change, maintaining the same Change-Id line in the commit message. Gerrit will reject it if it's still abandoned. You have to restore it first. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [heat][nova] VM restarting on host failure in convergence
On 09/17/2014 09:03 AM, Jastrzebski, Michal wrote: In short, what we'll need from nova is to have a 100% reliable host-health monitor and an equally reliable rebuild/evacuate mechanism with fencing and a scheduler. In heat we need a scalable and reliable event listener and engine to decide which action to perform in a given situation. Unfortunately, I don't think Nova can provide this alone. Nova only knows about whether or not the nova-compute daemon is currently communicating with the rest of the system. Even if the nova-compute daemon drops out, the compute node may still be running all instances just fine. We certainly don't want to impact those running workloads unless absolutely necessary. I understand that you're suggesting that we enhance Nova to be able to provide that level of knowledge and control. I actually don't think Nova should have this knowledge of its underlying infrastructure. I would put the host monitoring infrastructure (to determine if a host is down) and fencing capability as out of scope for Nova and as a part of the supporting infrastructure. Assuming those pieces can properly detect that a host is down and fence it, then all that's needed from Nova is the evacuate capability, which is already there. There may be some enhancements that could be done to it, but surely it's quite close. There's also the part where a notification needs to go out saying that the instance has failed. Something (which could be Heat in the case of this proposal) can react to that, either directly or via ceilometer, for example. There is an API today to hard reset the state of an instance to ERROR. After a host is fenced, you could use this API to mark all instances on that host as dead. I'm not sure if there's an easy way to do that for all instances on a host today. That's likely an enhancement we could make to python-novaclient, similar to the evacuate all instances on a host enhancement that was done in novaclient. 
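As a rough sketch of the flow described above (fence externally, then mark the instances dead and evacuate), here is what the missing "do it for a whole host" helper might look like on top of python-novaclient. The client method names (list, reset_state, evacuate) match novaclient of this era, but treat this as an illustration rather than tested tooling, and note it assumes fencing has already happened.

```python
# Sketch of "mark everything on a fenced host as dead, then
# evacuate it" on top of a python-novaclient-style client object.
# Illustrative only: run this *after* external monitoring has
# detected the failure and fenced the host.

def fail_and_evacuate(client, failed_host, target_host=None,
                      on_shared_storage=True):
    """Reset state to ERROR and evacuate every instance on a host."""
    servers = client.servers.list(
        search_opts={'host': failed_host, 'all_tenants': 1})
    evacuated = []
    for server in servers:
        # There is no bulk "reset all instances on host X" API today,
        # hence the loop -- this is the novaclient enhancement the
        # message above suggests.
        client.servers.reset_state(server, 'error')
        client.servers.evacuate(server, host=target_host,
                                on_shared_storage=on_shared_storage)
        evacuated.append(server.id)
    return evacuated
```

With a real deployment, `client` would be a `novaclient` client instance authenticated against the cloud; the important design point is that this logic never decides a host is down itself -- it trusts the fencing layer.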
-- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 12/09/2013 12:56 PM, Gordon Sim wrote: In the case of Nova (and others that followed Nova's messaging patterns), I firmly believe that for scaling reasons, we need to move toward it becoming the norm to use peer-to-peer messaging for most things. For example, the API and conductor services should be talking directly to compute nodes instead of through a broker. Is scale the only reason for preferring direct communication? I don't think an intermediary based solution _necessarily_ scales less effectively (providing it is distributed in nature, which for example is one of the central aims of the dispatch router in Qpid). That's not to argue that peer-to-peer shouldn't be used, just trying to understand all the factors. Scale is the primary one. If the intermediary based solution is easily distributed to handle our scaling needs, that would probably be fine, too. That just hasn't been our experience so far with both RabbitMQ and Qpid. One other pattern that can benefit from intermediated message flow is in load balancing. If the processing entities are effectively 'pulling' messages, this can more naturally balance the load according to capacity than when the producer of the workload is trying to determine the best balance. Yes, that's another factor. Today, we rely on the message broker's behavior to equally distribute messages to a set of consumers. One example is how Nova components talk to the nova-scheduler service. All instances of the nova-scheduler service are reading off a single 'scheduler' queue, so messages hit them round-robin. In the case of the zeromq driver, this logic is embedded in the client. It has to know about all consumers and handle choosing where each message goes itself. See references to the 'matchmaker' code for this. Honestly, using a more lightweight, distributed router like Dispatch sounds *much* nicer. The exception to that is cases where we use a publish-subscribe model, and a broker serves that really well. 
Notifications and notification consumers (such as Ceilometer) are the prime example. The 'fanout' RPC cast would perhaps be another? Good point. In Nova we have been working to get rid of the usage of this pattern. In the latest code the only place it's used AFAIK is in some code we expect to mark deprecated (nova-network). In terms of existing messaging drivers, you could accomplish this with a combination of both RabbitMQ or Qpid for brokered messaging and ZeroMQ for the direct messaging cases. It would require only a small amount of additional code to allow you to select a separate driver for each case. Based on my understanding, AMQP 1.0 could be used for both of these patterns. It seems ideal long term to be able to use the same protocol for everything we need. That is certainly true. AMQP 1.0 is fully symmetric so it can be used directly peer-to-peer as well as between intermediaries. In fact, apart from the establishment of the connection in the first place, a process need not see any difference in the interaction either way. We could use only ZeroMQ, as well. It doesn't have the publish-subscribe stuff we need built in necessarily. Surely that has been done multiple times by others already, though. We could build it too, if we had to. Indeed. However the benefit of choosing a protocol is that you can use solutions developed outside OpenStack or any other single project. Can you (or someone) elaborate further on what will make this solution superior to our existing options? Superior is a very bold claim to make :-) I do personally think that an AMQP 1.0 based solution would be worthwhile for the reasons above. Given a hypothetical choice between say the current qpid driver and one that could talk to different back-ends, over a standard protocol for which e.g. semantic monitoring tools could be developed and which would make reasoning about partial upgrades or migrations easier, I know which I would lean to. 
Obviously that is not the choice here, since one already exists and the other is as yet hypothetical. However, as I say I think this could be a worthwhile journey and that would justify at least taking some initial steps. Thanks for sharing some additional insight. I was already quite optimistic, but you've helped solidify that. I'm very interested in diving deeper into how Dispatch would fit into the various ways OpenStack is using messaging today. I'd like to get a better handle on how the use of Dispatch as an intermediary would scale out for a deployment that consists of 10s of thousands of compute nodes, for example. Is it roughly just that you can have a network of N Dispatch routers that route messages from point A to point B, and for notifications we would use a traditional message broker (qpidd or rabbitmq) ? -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http
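To illustrate the two distribution models discussed in this thread -- a broker round-robining a shared queue to pulling consumers, versus the zeromq driver's 'matchmaker' approach where the sending client itself knows all consumers and picks the destination -- here is a toy, pure-Python sketch (not the actual oslo/zmq driver code):

```python
# Toy models of the two message-distribution styles discussed above.
# Purely illustrative; no real transport is involved.

import itertools
from collections import deque

def broker_dispatch(queue, consumers):
    """Broker model: consumers read off one shared queue in turn,
    so messages hit them round-robin (like the 'scheduler' queue)."""
    delivered = {c: [] for c in consumers}
    for consumer in itertools.cycle(consumers):
        if not queue:
            break
        delivered[consumer].append(queue.popleft())
    return delivered

class MatchMakerClient:
    """Matchmaker model: the *sender* holds the consumer list and
    routes each message itself, with no intermediary process."""
    def __init__(self, consumers):
        self._ring = itertools.cycle(consumers)

    def cast(self, msg):
        target = next(self._ring)
        return target, msg
```

The behavior is the same either way (round-robin distribution); what differs is where the knowledge of the consumer set lives -- in the broker, or in every client -- which is exactly the trade-off a distributed router like Dispatch tries to dissolve.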
Re: [openstack-dev] [Oslo] First steps towards amqp 1.0
On 12/09/2013 05:16 PM, Gordon Sim wrote: On 12/09/2013 07:15 PM, Russell Bryant wrote: On 12/09/2013 12:56 PM, Gordon Sim wrote: In the case of Nova (and others that followed Nova's messaging patterns), I firmly believe that for scaling reasons, we need to move toward it becoming the norm to use peer-to-peer messaging for most things. For example, the API and conductor services should be talking directly to compute nodes instead of through a broker. Is scale the only reason for preferring direct communication? I don't think an intermediary based solution _necessarily_ scales less effectively (providing it is distributed in nature, which for example is one of the central aims of the dispatch router in Qpid). That's not to argue that peer-to-peer shouldn't be used, just trying to understand all the factors. Scale is the primary one. If the intermediary based solution is easily distributed to handle our scaling needs, that would probably be fine, too. That just hasn't been our experience so far with both RabbitMQ and Qpid. Understood. The Dispatch Router was indeed created from an understanding of the limitations and drawbacks of the 'federation' feature of qpidd (which was the primary mechanism for scaling beyond one broker) as well as learning lessons around the difficulties of message replication and storage. Cool. To make the current situation worse, AFAIK, we've never been able to make Qpid federation work at all for OpenStack. That may be due to the way we use Qpid, though. For RabbitMQ, I know people are at least using active-active clustering of the broker. One other pattern that can benefit from intermediated message flow is in load balancing. If the processing entities are effectively 'pulling' messages, this can more naturally balance the load according to capacity than when the producer of the workload is trying to determine the best balance. Yes, that's another factor. 
Today, we rely on the message broker's behavior to equally distribute messages to a set of consumers. Sometimes you even _want_ message distribution to be 'unequal', if the load varies by message or the capacity by consumer. E.g. If one consumer is particularly slow (or is given a particularly arduous task), it may not be optimal for it to receive the same portion of subsequent messages as other less heavily loaded or more powerful consumers. Indeed. We haven't tried to do that anywhere, but it would be an improvement for some cases. The exception to that is cases where we use a publish-subscribe model, and a broker serves that really well. Notifications and notification consumers (such as Ceilometer) are the prime example. The 'fanout' RPC cast would perhaps be another? Good point. In Nova we have been working to get rid of the usage of this pattern. In the latest code the only place it's used AFAIK is in some code we expect to mark deprecated (nova-network). Interesting. Is that because of problems in scaling the messaging solution or for other reasons? It's primarily a scaling concern. We're assuming that broadcasting messages is generally an anti-pattern for the massive scale we're aiming for. [...] I'm very interested in diving deeper into how Dispatch would fit into the various ways OpenStack is using messaging today. I'd like to get a better handle on how the use of Dispatch as an intermediary would scale out for a deployment that consists of 10s of thousands of compute nodes, for example. Is it roughly just that you can have a network of N Dispatch routers that route messages from point A to point B, and for notifications we would use a traditional message broker (qpidd or rabbitmq) ? For scaling the basic idea is that not all connections are made to the same process and therefore not all messages need to travel through a single intermediary process. 
So for N different routers, each has a portion of the total number of publishers and consumers connected to them. Though clients can communicate even if they are not connected to the same router, each router only needs to handle the messages sent by the publishers directly attached, or sent to the consumers directly attached. It never needs to see messages between publishers and consumers that are not directly attached. To address your example, the 10s of thousands of compute nodes would be spread across N routers. Assuming these were all interconnected, a message from the scheduler would only travel through at most two of these N routers (the one the scheduler was connected to and the one the receiving compute node was connected to). No process needs to be able to handle 10s of thousands of connections itself (as contrasted with full direct, non-intermediated communication, where the scheduler would need to manage connections to each of the compute nodes). This basic pattern is the same as networks of brokers, but Dispatch router has been designed from the start
Re: [openstack-dev] [Nova] New API requirements, review of GCE
On 12/10/2013 08:47 AM, Christopher Yeoh wrote: On Tue, Dec 10, 2013 at 11:36 PM, Alexandre Levine alev...@cloudscaling.com wrote: Russell, I'm a little confused about importing it into a stackforge repository. Please clarify it for me. Right now our code is a part of nova itself, adding lots of files and changing several existing ones, namely: service.py, cmd/api.py, setup.cfg, api-paste.ini and nova.conf.sample. This is only the core. Also our functional code for some marginal GCE API functionality has to use the database (to store naming translation). So for creating the stackforge repository we have 3 different options with different costs in terms of additional labor: 1. Create a copy of nova and add GCE as a part of it. - I don't feel this is a customary way for such additions, but it'll be the least costly for us. 2. Separate our code from nova while still leaving it a part of nova. The stackforge repository will contain only GCE code. It'd be installed after nova is installed, create another nova service, change api-paste.ini and nova.conf, create tables in the nova DB (or create its own DB, I'm not sure here). This will require some changes to the present code, though not many. 3. Completely separate GCE and make it a standalone service using nova via REST. The most costly option for us now. Still doable as well. In the long run I suspect that option 3 will probably be the least work for everyone. Why be dependent on changes to Nova's internal APIs unless you really have to when you have the alternative of being dependent on a much more stable REST API? #3 is preferable by far, IMO. #2 is roughly what I was suggesting. If it doesn't go into Nova, this would be a good way to manage it as an add-on. If it does go into Nova, it would be a good way to stage such a big addition, since we'll be able to see CI running the tests against it. That's beneficial for the code even if it doesn't go into Nova. 
I'm still waiting to see what kind of support there is for this. We really need clear support for it being in the tree before accepting the ongoing maintenance and review burden. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] New API requirements, review of GCE
On 12/10/2013 11:13 AM, Alexandre Levine wrote: Yes, I understand it perfectly, Christopher, and cannot agree more. It's just more work to reach this right now than to use what's present. Still, in my opinion, even in the mid-run, just till the Icehouse release, it might be less work overall. I'm going to think it over. So ... if you really do feel that way, I'm not sure it makes a lot of sense to merge it one way if there's already a plan emerging to re-do it. We'd have to go through a painful deprecation cycle of the old code where we're maintaining it in two places for a while. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] New API requirements, review of GCE
On 12/10/2013 10:47 AM, Alexandre Levine wrote: Does nova actually have an add-on infrastructure in which our GCE API can fit? I see the plugins with xen-server only and have found this link: http://docs.openstack.org/developer/nova/api/nova.openstack.common.plugin.plugin.html. Is it the thing? I'm trying to understand what you meant by this: If it doesn't go into Nova, this would be a good way to manage it as an add-on. I mean as a place to host the code with infrastructure that can help make sure it stays in sync with Nova. There's no real plugin infrastructure that this would fit in to. It would be a new service in your repo, based on Nova's service code, that loads the GCE API. Your code would be importing nova.* stuff, even though it's in a separate repo. Also if we do it as a separate service does it remove the necessity for a support commitment from the nova core team? Yes. And if it does, who would have to commit instead? It could be anyone. I presume it would be the original authors who were planning to help maintain it anyway, as well as anyone else that the project can attract that is interested in GCE. Sorry if I ask some obvious questions. No problem ... there isn't an existing example for you to follow exactly, so it's not obvious. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Nova] icehouse-2 blueprint deadline
Greetings, We have a lot of blueprints targeted to icehouse-2 [1] that are still under review. The blueprints that do not have a priority are still under review. We need a deadline for blueprints to be finalized for this milestone. I propose that deadline to be Thursday, December 19. Any blueprints that have not been approved at that point will be moved to the icehouse-3 milestone.

If you have an icehouse-2 blueprint, please check its status. Specifically, check the Definition field of the blueprint. Here is what it means:

- Approved - blueprint has been approved for this milestone
- Pending Approval - waiting on a blueprint reviewer to follow up
- Review - waiting on the blueprint submitter to provide more information. Change to Pending Approval once you feel that the information has been provided.
- Drafting - Details still being written up by the submitter, so the blueprint has not been reviewed. Update to Pending Approval when ready.
- Discussion - Blueprint approval hinges on the result of a discussion. The blueprint should contain a link to a mailing list thread discussing the blueprint. Once you feel the discussion has concluded and more review is needed, update it to Pending Approval.
- New - new blueprints that have not been triaged yet. A blueprint reviewer (member of nova-drivers) will triage it soon.

After this deadline, anything left in Review, Drafting, or Discussion state will be moved to icehouse-3. Thanks, [1] https://launchpad.net/nova/+milestone/icehouse-2 -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [qa] [Solum] [tempest] Use of pecan test framework in functional tests
On 12/10/2013 04:10 PM, Georgy Okrokvertskhov wrote: Hi, In the Solum project we are currently creating test environments for future tests. We split unit tests and functional tests in order to use the tempest framework from the beginning. The tempest framework assumes that you run your service and test API endpoints by sending HTTP requests. Solum uses the Pecan WSGI framework, which has its own test framework based on WebTest. This framework allows testing the application without sending actual HTTP traffic. It mocks the low-level stuff related to transport but keeps all the high-level WSGI parts as in a real-life application/service. There is a question to the QA/Tempest teams: what do you think about using the pecan test framework in tempest for Pecan-based applications? I don't think that makes sense. Then we're not using the code like it would be used normally (via HTTP). -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
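For anyone unfamiliar with the distinction being drawn here: a WebTest-style framework invokes the WSGI application in-process with a synthetic environ, while tempest drives the full HTTP stack (sockets, server, request parsing). A stdlib-only sketch of the in-process style -- illustrative; WebTest's actual API is richer:

```python
# Minimal illustration of why in-process WSGI testing is not the same
# as testing over HTTP. WebTest (which pecan's test framework wraps)
# does essentially the direct-call style below: it builds a fake
# environ and calls the app, never touching the network.

from io import BytesIO

def app(environ, start_response):
    """A trivial WSGI app standing in for a Pecan application."""
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello ' + environ['PATH_INFO'].encode()]

def direct_call(wsgi_app, path):
    """Call the app WebTest-style: synthetic environ, no sockets."""
    status = {}
    def start_response(code, headers):
        status['code'] = code
    environ = {'REQUEST_METHOD': 'GET', 'PATH_INFO': path,
               'wsgi.input': BytesIO(), 'SERVER_NAME': 'test',
               'SERVER_PORT': '80'}
    body = b''.join(wsgi_app(environ, start_response))
    return status['code'], body
```

Everything below `start_response` is exercised, but nothing above it is: a tempest-style test would additionally run the request through a real server (e.g. `wsgiref.simple_server`) and an HTTP client, which is Russell's point about using the code the way it is used normally.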
Re: [openstack-dev] [Horizon] Nominations to Horizon Core
On 12/10/2013 05:57 PM, Paul McMillan wrote: +1 on Tatiana Mazur, she's been doing a bunch of good work lately. I'm fine with me being removed from core provided you have someone else qualified to address security issues as they come up. My contributions have lately been reviewing and responding to security issues, vetting fixes for those, and making sure they happen in a timely fashion. Fortunately, we haven't had too many of those lately. Other than that, I've been lurking and reviewing to make sure nothing egregious gets committed. If you don't have anyone else who is a web security specialist on the core team, I'd like to stay. Since I'm also a member of the Django security team, I offer a significant chunk of knowledge about how the underlying security protections are intended to work. Security reviews aren't done on gerrit, though. They are handled in launchpad bugs. It seems you could still contribute in this way without being on the horizon-core team responsible for reviewing normal changes in gerrit. The bigger point is that you don't have to be on whatever-core to contribute productively to reviews. I think every project has people that make important review contributions, but aren't necessarily reviewing regularly enough to be whatever-core. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Horizon] Nominations to Horizon Core
On 12/11/2013 08:14 PM, Bryan D. Payne wrote: Re: Removing Paul McMillan from core I would argue that it is critical that each project have 1-2 people on core that are security experts. The VMT is an intentionally small team. They are moving to having specifically appointed security sub-teams on each project (I believe this is what I heard at the last summit). These teams would be a subset of the core devs that can handle security reviews. The idea is that these people would then be able to +1 / -1 embargoed security patches. So having someone like Paul on Horizon core would be very valuable for such things. We can involve people in security reviews without having them on the core review team. They are separate concerns. In addition, I think that gerrit is exactly where security reviews *should* be happening. Much better to catch things before they are merged, rather than as bugs after-the-fact. Would we rather have a -1 on a code review than a CVE? This has been discussed quite a bit. We can't handle security patches on gerrit right now while they are embargoed because we can't completely hide them. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Horizon] Nominations to Horizon Core
On 12/11/2013 11:08 PM, Bryan D. Payne wrote: We can involve people in security reviews without having them on the core review team. They are separate concerns. Yes, but those people can't ultimately approve the patch. So you'd need to have a security reviewer do their review, and then someone who isn't a security person be able to offer the +1/+2 based on the opinion of the security reviewer. This doesn't make any sense to me. You're involving an extra person needlessly, and creating extra work. I don't want someone who isn't regularly looking at changes going into the code to be able to do the ultimate approval of any patch. I think this is working as designed. Including the extra person in this case is a good thing. This has been discussed quite a bit. We can't handle security patches on gerrit right now while they are embargoed because we can't completely hide them. I think that you're confusing security reviews of new code changes with reviews of fixes to security problems. In this part of my email, I'm talking about the former. These are not embargoed. They are just the everyday improvements to the system. That is the best time to identify and gate on security issues. Without someone on core that can give a -2 when there's a problem, this will basically never happen. Then we'll be back to fixing a greater number of things as bugs. Anyone can offer a -1, and that will be paid attention to. If that ever doesn't happen, let's talk about it. -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] All I want for Christmas is one more +2 ...
On 12/12/2013 09:22 AM, Day, Phil wrote: Hi Cores, The “Stop, Rescue, and Delete should give guest a chance to shutdown” change https://review.openstack.org/#/c/35303/ was approved a couple of days ago, but failed to merge because the RPC version had moved on. It's rebased and sitting there with one +2 and a bunch of +1s - it would be really nice if it could land before it needs another rebase, please! Approved. FWIW, I'm fine with folks approving with a single +2 for cases where a patch was approved but needed a simple rebase. This happens pretty often. We even have a script that generates a list of patches still open that were previously approved: http://russellbryant.net/openstack-stats/nova-openapproved.txt -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Generic question: Any tips for 'keeping up' with the mailing lists?
On 12/12/2013 11:23 AM, Justin Hammond wrote: I am a developer who is currently having trouble keeping up with the mailing list due to volume, and my inability to organize it in my client. I am nearly forced to use Outlook 2011 for Mac and I have read and attempted to implement https://wiki.openstack.org/wiki/MailingListEtiquette but it is still a lot to deal with. I read once a topic or wiki page on using X-Topics but I have no idea how to set that in Outlook (google has told me that the feature was removed). I'm not sure if this is a valid place for this question, but I *am* having difficulty as a developer. Thank you to anyone who takes the time to read this. The trick is defining what "keeping up" means for you. I doubt anyone reads everything. I certainly don't. First, I filter all of openstack-dev into its own folder. I'm sure others filter more aggressively based on topic, but I don't since I know I may be interested in threads in any of the topics. Figure out what filtering works for you. I scan subjects for the threads I'd probably be most interested in. While I'm scanning, I'm first looking for topic tags, like [Nova], then I read the subject and decide whether I want to dive in and read the rest. It happens very quickly, but that's roughly my thought process. With whatever is left over: mark all as read. :-) -- Russell Bryant
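The tag-scanning step described above is easy to automate: pull the bracketed topic tags out of each subject line and match them against the projects you follow. A minimal sketch (the function name and regex are mine, not part of any OpenStack tooling):

```python
import re

def topic_tags(subject):
    """Return the [Tag] prefixes from a mailing-list subject line.

    Scanning these tags is the first step of the triage flow described
    above; this helper is purely illustrative.
    """
    return re.findall(r'\[([^\]]+)\]', subject)
```

A filter rule could then keep only messages whose tags intersect a set of interest, e.g. `{'Nova', 'Neutron'}`.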
Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?
On 12/12/2013 12:02 PM, Clint Byrum wrote: I've been chasing quite a few bugs in the TripleO automated bring-up lately that have to do with failures because either there are no valid hosts ready to have servers scheduled, or there are hosts listed and enabled, but they can't bind to the network because for whatever reason the L2 agent has not checked in with Neutron yet. This is only a problem in the first few minutes of a nova-compute host's life. But it is critical for scaling up rapidly, so it is important for me to understand how this is supposed to work. So I'm asking, is there a standard way to determine whether or not a nova-compute is definitely ready to have things scheduled on it? This can be via an API, or even by observing something on the nova-compute host itself. I just need a definitive signal that the compute host is ready. If a nova compute host has registered itself to start having instances scheduled to it, it *should* be ready. AFAIK, we're not doing any network sanity checks on startup, though. We already do some sanity checks on startup. For example, nova-compute requires that it can talk to nova-conductor. nova-compute will block on startup until nova-conductor is responding if they happened to be brought up at the same time. We could do something like this with a networking sanity check if someone could define what that check should look like. -- Russell Bryant
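The conductor check described above blocks nova-compute startup until its dependency answers, and a networking check could reuse the same pattern. A generic sketch of that block-until-ready loop (names and signatures are illustrative, not nova code):

```python
import time

def wait_until_ready(check, timeout=60, interval=2, sleep=time.sleep):
    """Block until `check` returns True or the timeout expires.

    `check` stands in for a dependency probe such as "can I reach
    nova-conductor?" or a hypothetical network-agent check. Returns
    True if the dependency became ready, False on timeout.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if check():
            return True
        sleep(interval)
    return False
```

On startup the service would call this before registering itself; the `sleep` parameter is injectable so the loop can be tested without real delays.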
Re: [openstack-dev] [Nova] New official bugtag 'Ironic' ?
On 12/12/2013 03:03 PM, Robert Collins wrote: We have official tags for most of the hypervisors, but not ironic as yet - any objections to adding one? Nope, go ahead. For reference, to add it we need to: 1) Make it an official tag in launchpad 2) Update https://wiki.openstack.org/wiki/BugTags 3) Update https://wiki.openstack.org/wiki/Nova/BugTriage -- Russell Bryant ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?
On 12/12/2013 12:35 PM, Clint Byrum wrote: Excerpts from Chris Friesen's message of 2013-12-12 09:19:42 -0800: On 12/12/2013 11:02 AM, Clint Byrum wrote: So I'm asking, is there a standard way to determine whether or not a nova-compute is definitely ready to have things scheduled on it? This can be via an API, or even by observing something on the nova-compute host itself. I just need a definitive signal that the compute host is ready. Is it not sufficient that nova service-list shows the compute service as up? I could spin waiting for at least one. Not a bad idea, actually. However, I suspect that will only handle the situations I've gotten where the scheduler returns NoValidHost. Right, it solves this case. I say that because I think if it shows there, it matches the all-hosts filter and will have things scheduled on it. With one compute host I get failures after scheduling because neutron has no network segment to bind to. That is because the L2 agent on the host has not yet registered itself with Neutron. But not this one. -- Russell Bryant
Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?
On 12/12/2013 01:36 PM, Clint Byrum wrote: Excerpts from Kyle Mestery's message of 2013-12-12 09:53:57 -0800: On Dec 12, 2013, at 11:44 AM, Jay Pipes jaypi...@gmail.com wrote: On 12/12/2013 12:36 PM, Clint Byrum wrote: Excerpts from Russell Bryant's message of 2013-12-12 09:09:04 -0800: On 12/12/2013 12:02 PM, Clint Byrum wrote: I've been chasing quite a few bugs in the TripleO automated bring-up lately that have to do with failures because either there are no valid hosts ready to have servers scheduled, or there are hosts listed and enabled, but they can't bind to the network because for whatever reason the L2 agent has not checked in with Neutron yet. This is only a problem in the first few minutes of a nova-compute host's life. But it is critical for scaling up rapidly, so it is important for me to understand how this is supposed to work. So I'm asking, is there a standard way to determine whether or not a nova-compute is definitely ready to have things scheduled on it? This can be via an API, or even by observing something on the nova-compute host itself. I just need a definitive signal that the compute host is ready. If a nova compute host has registered itself to start having instances scheduled to it, it *should* be ready. AFAIK, we're not doing any network sanity checks on startup, though. We already do some sanity checks on startup. For example, nova-compute requires that it can talk to nova-conductor. nova-compute will block on startup until nova-conductor is responding if they happened to be brought up at the same time. We could do something like this with a networking sanity check if someone could define what that check should look like. Could we ask Neutron if our compute host has an L2 agent yet? That seems like a valid sanity check. ++ This makes sense to me as well. Although, not all Neutron plugins have an L2 agent, so I think the check needs to be more generic than that. 
For example, the OpenDaylight MechanismDriver we have developed doesn't need an agent. I also believe the Nicira plugin is agent-less, perhaps there are others as well. And I should note, does this sort of integration also happen with cinder, for example, when we're dealing with storage? Any other services which have a requirement on startup around integration with nova as well? Does cinder actually have per-compute-host concerns? I admit to being a bit cinder-stupid here. No, it doesn't. Anyway, it seems to me that any service that is compute-host aware should be able to respond to the compute host whether or not it is a) aware of it, and b) ready to serve on it. For agent-less drivers that is easy, you just always return True. And for drivers with agents, you return false unless you can find an agent for the host. So something like: GET /host/%(compute-host-name) And then in the response include a ready attribute that would signal whether all networks that should work there, can work there. As a first pass, just polling until that is ready before nova-compute enables itself would solve the problems I see (and that I think users would see as a cloud provider scales out compute nodes). Longer term we would also want to aim at having notifications available for this so that nova-compute could subscribe to that notification bus and then disable itself if its agent ever goes away. I opened this bug to track the issue. I suspect there are duplicates of it already reported, but would like to start clean to make sure it is analyzed fully and then we can use those other bugs as test cases and confirmation: https://bugs.launchpad.net/nova/+bug/1260440 Sounds good. I'm happy to do this in Nova, but we'll have to get the Neutron API bit sorted out first. -- Russell Bryant
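The server-side logic behind the proposed ready attribute is simple to state. A sketch of how a hypothetical GET /host/&lt;name&gt; handler could compute it (the function, its arguments, and the registry are all illustrative; no such Neutron API existed at the time):

```python
def host_ready(hostname, agent_based, registered_agents):
    """Compute the proposed 'ready' attribute for a compute host.

    Agent-less plugins (e.g. the OpenDaylight driver mentioned above)
    always report ready; agent-based plugins report ready only once an
    L2 agent has registered for the host.
    """
    if not agent_based:
        return True
    return hostname in registered_agents
```

As described in the thread, nova-compute would poll this until True before enabling itself, and longer term subscribe to notifications so it could disable itself if the agent ever goes away.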
Re: [openstack-dev] [bugs] definition of triaged
On 12/12/2013 04:46 PM, Robert Collins wrote: Hi, I'm trying to overhaul the bug triage process for nova (initially) to make it much lighter and more effective. I'll be sending a more comprehensive mail shortly. Before you do, let's agree on what we're trying to solve. Perhaps you were going to cover that in your later message, but it wouldn't hurt discussing it now. I actually didn't think our process was that broken. It's more that I feel we need a person leading a small team that is working on it regularly. The idea with the tagging approach was to break up the triage problem into smaller work queues. I haven't kept up with the tagging part and would really like to hand that off. Then some of the work queues aren't getting triaged as regularly as they need to be. I'd like to see a small team making this a high priority with some of their time each week. With all of that said, if you think an overhaul of the process is necessary to get to the end goal of a better-triaged bug queue, then I'm happy to entertain it. -- Russell Bryant
Re: [openstack-dev] [governance] Becoming a Program, before applying for incubation
-- Russell Bryant
Re: [openstack-dev] [Nova][Docker] Environment variables
On 12/16/2013 09:27 AM, Daniel Kuffner wrote: Hi All, I have submitted a new blueprint which addresses a common pattern in the docker world: using environment variables to configure a container. docker run -e SQL_URL=postgres://user:password@/db my-app The nova docker driver doesn't support setting any environment variables. To work around this issue I used cloud-init, which works fine. But this approach has of course the drawback that a) I have to install the cloud-init service, and b) my docker container doesn't work outside of OpenStack. I propose to allow a user to set docker environment variables via nova instance metadata. The metadata key should have a prefix like ENV_ which can be used to determine all environment variables. The prefix would be removed and the remaining key and value injected. The metadata can unfortunately not be set in horizon, but can be used from the nova command line tool and from heat. Example heat template:

  myapp:
    Type: OS::Nova::Server
    Properties:
      flavor: m1.small
      image: my-app:latest
      meta-data:
        - ENV_SQL_URL: postgres://user:password@/db
        - ENV_SOMETHING_ELSE: Value

Let me know what you think about that. Blueprint: https://blueprints.launchpad.net/nova/+spec/docker-env-via-meta-data Thanks for starting the discussion. More people should do this for their blueprints. :-) One of the things we should be striving for is to provide as consistent an experience as we can across drivers. Right now, we have the metadata service and config drive, and neither of those is driver specific. In the case of config drive, whether it's used or not is exposed through the API. As you point out, the metadata service does technically work with the docker driver. I don't think we should support environment variables like this automatically. Instead, I think it would be more appropriate to add an API extension for specifying env vars. That way the behavior is more explicit and communicated through the API. 
The env vars would be passed through all of the appropriate plumbing and down to drivers that are able to support it. This is all also assuming that containers support is staying in Nova and not a new service. That discussion seems to have stalled. Is anyone still pushing on that? Any updates? -- Russell Bryant
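The ENV_ prefix convention from the blueprint boils down to one transformation over the instance metadata. A sketch of that extraction step (function name and structure are mine; only the prefix convention comes from the proposal):

```python
def env_from_metadata(metadata, prefix='ENV_'):
    """Turn prefixed instance-metadata keys into docker env vars.

    Keys starting with the prefix become environment variables with the
    prefix stripped, as the blueprint proposes; all other metadata keys
    are ignored. Illustrative sketch only, not the driver's actual code.
    """
    return {key[len(prefix):]: value
            for key, value in metadata.items()
            if key.startswith(prefix)}
```

Under the API-extension approach Russell suggests, the same dictionary would instead arrive explicitly through the boot call rather than being inferred from metadata.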
Re: [openstack-dev] [nova] time for a new major network rpc api version?
On 12/15/2013 05:12 PM, Robert Collins wrote: That said, doing anything to the network RPC API seems premature until the Neutron question is resolved. This. I've been pretty much ignoring this API since it has been frozen and almost deprecated for a long time. My plan was to revisit the status of nova-network after the release of icehouse-2. -- Russell Bryant
Re: [openstack-dev] [Nova][Docker] Environment variables
On 12/16/2013 10:12 AM, Chuck Short wrote: I have something that is pushing for it to stay in nova (at least the compute drivers). I should have a gerrit branch for people to review soon. OK. Do you have any design notes for whatever you're proposing? That would probably be easier to review and discuss. -- Russell Bryant
Re: [openstack-dev] [Nova][Docker] Environment variables
On 12/16/2013 10:18 AM, Daniel Kuffner wrote: Hi Russell, You actually propose to extend the whole nova stack to support environment variables. Would any other driver benefit from this API extension? Is that what you imagine? nova --env SQL_URL=postgres://user:password --image Yes. Regarding the discussion you mentioned, are there any public resources to read? I kind of missed it. Most likely it was before I was part of this community :) It started here back in November: http://lists.openstack.org/pipermail/openstack-dev/2013-November/019637.html and then there have been a few messages on that thread this month, too. -- Russell Bryant
Re: [openstack-dev] [Nova][Docker] Environment variables
On 12/16/2013 10:39 AM, Daniel P. Berrange wrote: On Mon, Dec 16, 2013 at 04:18:52PM +0100, Daniel Kuffner wrote: Hi Russell, You actually propose to extend the whole nova stack to support environment variables. Would any other driver benefit from this API extension? Is that what you imagine? nova --env SQL_URL=postgres://user:password --image Regarding the discussion you mentioned, are there any public resources to read? I kind of missed it. Most likely it was before I was part of this community :) With glance images we have a way to associate arbitrary metadata attributes with the image. I could see using this mechanism to associate some default set of environment variables, e.g. use an 'env_' prefix for glance image attributes. We've got a couple of cases now where we want to override these same things on a per-instance basis. Kernel command line args is one other example. Other hardware overrides like disk/net device types are another possibility. Rather than invent new extensions for each, I think we should have a way to pass arbitrary attributes along with the boot API call, that a driver would handle in much the same way as they do for glance image properties. Basically, think of it as a way to customize any image property per instance created. That's a pretty nice idea. I like it. -- Russell Bryant
Re: [openstack-dev] [nova][db] Thoughts on making instances.uuid non-nullable?
On 12/16/2013 11:45 AM, Matt Riedemann wrote: 1. Add a migration to change instances.uuid to non-nullable. Besides the obvious con of having yet another migration script, this seems the most straightforward. The instance object class already defines the uuid field as non-nullable, so it's constrained at the objects layer, just not in the DB model. Plus I don't think we'd ever have a case where instance.uuid is null, right? Seems like a lot of things would break down if that happened. With this option I can build on top of it for the DB2 migration support to add the same FKs as the other engines. Yeah, having instance.uuid nullable doesn't seem valuable to me, so this seems OK. -- Russell Bryant
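The DB-layer gap being discussed is just a missing NOT NULL constraint. A small demo of how to inspect which state a schema is in, using stdlib sqlite for brevity (table and column names are from the thread; the helper is illustrative, and real nova migrations targeted sqlalchemy-migrate, not raw PRAGMAs):

```python
import sqlite3

def uuid_nullable(conn):
    """Report whether instances.uuid still allows NULL, i.e. whether the
    proposed migration has been applied. sqlite-flavored demo only.
    """
    cols = conn.execute("PRAGMA table_info(instances)").fetchall()
    # Row layout: (cid, name, type, notnull, dflt_value, pk)
    by_name = {row[1]: row for row in cols}
    return by_name['uuid'][3] == 0  # notnull == 0 means NULL is allowed

# A schema mirroring the pre-migration model, where uuid is nullable.
conn = sqlite3.connect(':memory:')
conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, uuid CHAR(36))")
```

After the migration the same check would return False, matching the non-nullable uuid field already enforced at the objects layer.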
[openstack-dev] [Nova] Future meeting times
Greetings, The weekly Nova meeting [1] has been held on Thursdays at 2100 UTC. I've been getting some requests to offer an alternative meeting time. I'd like to try out alternating the meeting time between two different times to allow more people in our global development team to attend meetings and engage in some real-time discussion. I propose the alternate meeting time as 1400 UTC. I realize that doesn't help *everyone*, but it should be an improvement for some, especially for those in Europe. If we proceed with this, we would meet at 2100 UTC on January 2nd, 1400 UTC on January 9th, and alternate from there. Note that we will not be meeting at all on December 26th as a break for the holidays. If you can't attend either of these times, please note that the meetings are intended to be supplementary to the openstack-dev mailing list. In the meetings, we check in on status, raise awareness of important issues, and progress some discussions with real-time debate, but the most important discussions and decisions will always be brought to the openstack-dev mailing list, as well. With that said, active Nova contributors are always encouraged to attend and participate if they are able. Comments welcome, especially some acknowledgement that there are people that would attend the alternate meeting time. :-) Thanks, [1] https://wiki.openstack.org/wiki/Meetings/Nova -- Russell Bryant
Re: [openstack-dev] [nova] How do we format/version/deprecate things from notifications?
On 12/18/2013 12:44 PM, Nikola Đipanov wrote: On 12/18/2013 06:17 PM, Matt Riedemann wrote: On 12/18/2013 9:42 AM, Matt Riedemann wrote: The question came up in this patch [1], how do we deprecate and remove keys in the notification payload? In this case I need to deprecate and replace the 'instance_type' key with 'flavor' per the associated blueprint. [1] https://review.openstack.org/#/c/62430/ By the way, my thinking is it's handled like a deprecated config option, you deprecate it for a release, make sure it's documented in the release notes and then drop it in the next release. For anyone that hasn't switched over they are broken until they start consuming the new key. FWIW - I am OK with this approach - but we should at least document it. I am also thinking that we may want to make it explicit like oslo.config does it. We really need proper versioning for notifications. We've had a blueprint open for about a year, but AFAICT, nobody is actively working on it. https://blueprints.launchpad.net/nova/+spec/versioned-notifications -- Russell Bryant
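The config-option-style deprecation discussed above amounts to emitting both keys for one release. A sketch of what a payload builder would do during the overlap period (key names are from the thread; the function itself is illustrative, and real nova payloads carry many more fields):

```python
def notification_payload(flavor_name):
    """Build a payload carrying both the old key and its replacement.

    'instance_type' is emitted alongside 'flavor' for one deprecation
    cycle so consumers can migrate, then the old key is dropped.
    """
    payload = {'flavor': flavor_name}
    # Deprecated alias -- remove after one release, per the
    # config-option deprecation model described in the thread.
    payload['instance_type'] = flavor_name
    return payload
```

Proper payload versioning, as the linked blueprint proposes, would make the removal explicit instead of relying on release notes.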
Re: [openstack-dev] Process for proposing patches attached to launchpad bugs?
On 12/20/2013 09:32 AM, Dolph Mathews wrote: In the past, I've been able to get authors of bug fixes attached to Launchpad bugs to sign the CLA and submit the patch through gerrit... although, in one case it took quite a bit of time (and thankfully it wasn't a critical fix or anything). This scenario just came up again (example: [1]), so I'm asking preemptively... what if the author is unwilling or unable to sign the CLA and propose through gerrit, or it's a critical bug fix and waiting on an author to go through the CLA process is undesirable for the community? Obviously that's a bit of a fail on our part, but what's the most appropriate and expedient way to handle it? Can we propose the patch to gerrit ourselves? If so, who should appear as the --author of the commit? Who should appear as Co-Authored-By, especially when the committer helps the patch evolve further in review? Alternatively, am I going about this all wrong? Thanks! [1]: https://bugs.launchpad.net/keystone/+bug/1198171/comments/8 It's not your code, so you really can't propose it without them having signed the CLA, or propose it as your own. Ideally, have someone else who hasn't looked at the patch fix the same bug. From a quick look, it seems likely that this fix is small and straightforward enough that a clean new implementation is going to end up looking very similar. Still, I think it's the right thing to do. -- Russell Bryant
[openstack-dev] [Nova] Live upgrades and major rpc versions
Greetings, Bumping the major rpc versions allows us to drop old backwards compatibility code. However, we have to do this in such a way that doesn't break live upgrades. We've expected live upgrades for CD to work for a while, and we're also expecting to be able to support it from Havana to Icehouse. The approach for bumping major rpc versions in the past has been like this: Step 1) https://review.openstack.org/#/c/53944/ Step 2) https://review.openstack.org/#/c/54493/ The approach outlined in the commit message for step 1 discusses how this approach works with live upgrades in a CD environment. However, making changes like this in the middle of a release cycle breaks the live upgrade from the N-1 to N release. (Yes, these changes broke Havana-Icehouse live upgrades, but that has since been resolved with some other patches. This discussion is how we avoid breaking it in the future.) To support N-1 to N live upgrades, I propose that we use the same change structure, but split it over a release boundary. A practical example for the conductor service: Step 1) https://review.openstack.org/#/c/52218/ This patch adds a new revision of the conductor rpc API, 2.0. I say we merge a change like this just before the Icehouse release. The way it's written is very low risk to the release since it leaves most important existing code (1.X) untouched. Step 2) https://review.openstack.org/#/c/52219/ Once master is open for J development, merge a patch like this one as step 2. At this point, we would drop all support for 1.X. It's no longer needed because in J we're only trying to support upgrades from Icehouse, and Icehouse supported 2.0. Using this approach I think we can support live upgrades from N-1 to N while still being able to drop some backwards compatibility code each release cycle. Once we get the details worked out, I'd like to capture the process on the release checklist wiki page for Nova. https://wiki.openstack.org/wiki/Nova/ReleaseChecklist Thoughts? 
-- Russell Bryant
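The invariant behind the two-step plan can be stated compactly: a rolling upgrade works only if adjacent releases share at least one RPC major version. A sketch (release names are from the thread; the version table and helper are illustrative, not nova code):

```python
# Hypothetical table of which RPC major versions each release can speak.
SUPPORTED_MAJORS = {
    'havana': {1},
    'icehouse': {1, 2},  # step 1: add 2.0 just before release, keep 1.x
    'j': {2},            # step 2: drop 1.x once master opens for J
}

def can_live_upgrade(old, new):
    """A live upgrade is possible only when the two releases share an
    RPC major version -- exactly what splitting the bump over a release
    boundary preserves, and what a mid-cycle bump would break.
    """
    return bool(SUPPORTED_MAJORS[old] & SUPPORTED_MAJORS[new])
```

Doing both steps inside one cycle would give the N release only {2}, breaking the N-1 to N path; the split keeps the overlap in place.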
[openstack-dev] [Nova] No meeting this week
No Nova meeting this week. We will resume on Thursday, January 2, at 21:00 UTC. https://wiki.openstack.org/wiki/Meetings/Nova -- Russell Bryant
Re: [openstack-dev] [Horizon] import only module message and #noqa
On 01/03/2014 10:10 AM, Radomir Dopieralski wrote: I think that we can actually do a little bit better and remove many of the #noqa tags without forfeiting automatic checking. I submitted a patch: https://review.openstack.org/#/c/64832/ This basically adds a h302_exceptions option to tox.ini, that lets us specify which names are allowed to be imported. For example, we can do: [hacking] h302_exceptions = django.conf.settings, django.utils.translation.ugettext_lazy, django.core.urlresolvers. To have settings, _ and everything from urlresolvers importable without the need for the #noqa tag. Of course every project can add their own names there, depending what they need. Isn't that what import_exceptions is for? For example, we have this in nova: import_exceptions = nova.openstack.common.gettextutils._ -- Russell Bryant
Re: [openstack-dev] [Horizon] import only module message and #noqa
On 01/03/2014 10:35 AM, Radomir Dopieralski wrote: On 03/01/14 16:18, Russell Bryant wrote: On 01/03/2014 10:10 AM, Radomir Dopieralski wrote: I think that we can actually do a little bit better and remove many of the #noqa tags without forfeiting automatic checking. I submitted a patch: https://review.openstack.org/#/c/64832/ This basically adds a h302_exceptions option to tox.ini, that lets us specify which names are allowed to be imported. For example, we can do: [hacking] h302_exceptions = django.conf.settings, django.utils.translation.ugettext_lazy, django.core.urlresolvers. To have settings, _ and everything from urlresolvers importable without the need for the #noqa tag. Of course every project can add their own names there, depending what they need. Isn't that what import_exceptions is for? For example, we have this in nova: import_exceptions = nova.openstack.common.gettextutils._ Not exactly, as this will disable all import checks, just like #noqa. Ah, makes sense. Thanks. -- Russell Bryant
Re: [openstack-dev] [Gantt] Looking for some answers...
On 01/06/2014 02:30 PM, Vishvananda Ishaya wrote: On Jan 6, 2014, at 11:02 AM, Jay Pipes jaypi...@gmail.com wrote: Hello Stackers, I was hoping to get some answers on a few questions I had regarding the Gantt project [1]. Specifically, here are my queries: 1) Why was Nova forked to the http://github.com/openstack/gantt repository? Forking Nova just to then remove a bunch of code that doesn't relate to the scheduler code means that we bring 10K+ commits and a git history along with the new project... this seems to be the wrong origin for a project that aims to be a separate service. There's a reason that Cinder and Neutron didn't start out as a fork of Nova, after all… Authorship history is nice, but this does seem a bit excessive. The cinder strategy of a single squashed fork would have been/still be fine I’m sure. That's not exactly what was done here. It's a new repo created with the history filtered out. The history was only maintained for code kept. That seems pretty ideal to me. 2) Why is Gantt in the /openstack GitHub organization? Wouldn't the /stackforge organization be more appropriate for a project that isn't integrated? If I understand some of the backstory behind Gantt, the idea was to create a scheduler service from the existing Nova scheduler code in order to complete the work sometime in our lifetime. While I understand the drive to start with something that already exists and iterate over it, I don't understand why the project went right into the /openstack organization instead of following the /stackforge processes for housing code that bakes and gets iterated on before proposing for incubation. Some explanation would be great here. This is a split-out of existing code, so it is following the same path as cinder. The goal is to deprecate the existing nova scheduler in I. It is currently a new project under the nova program, I believe. Correct (compute program, technically). It's just a mechanical thing, not new code. 
Also, it's not an incubated or integrated project yet. It's just an official repo under the compute program. 3) Where is feature planning happening for Gantt? The Launchpad site for Gantt [2] is empty. Furthermore, there are a number of blueprints for improving the Nova scheduler, notably the no-db-scheduler blueprint [3], which even has code submitted for it and is targeted to Icehouse-2. How are improvements like this planned to be ported (if at all) to Gantt? Not sure about the launchpad site. There is a regular scheduler group meeting and as I understand it the hope will be to do the no-db-scheduler blueprint. There was quite a bit of debate on whether to do the no-db-scheduler stuff before or after the forklift and I think the consensus was to do the forklift first. The planning is just being done in nova blueprints right now. Once gantt has enough momentum, we can start using a separate launchpad project. But we haven't even finished step 1 of making the thing run yet. 4) Is the aim of Gantt to provide a RESTful HTTP API in addition to the RPC-based API that the existing Nova scheduler exposes? In the short term the plan is to just replicate the rpc api, but I think a REST api will be considered long term. Yep. -- Russell Bryant
Re: [openstack-dev] [Gantt] Looking for some answers...
On 01/06/2014 02:52 PM, Boris Pavlovic wrote: Vish, and as I understand it the hope will be to do the no-db-scheduler blueprint. There was quite a bit of debate on whether to do the no-db-scheduler stuff before or after the forklift and I think the consensus was to do the forklift first. The current Nova scheduler is so deeply bound to nova data models that it is useless for every other project. So I don't think a fork of the Nova scheduler in such a state is useful for any other project. It should be pretty easy to do this in gantt though. Right now I would probably do it against the current scheduler and then we'll port it over. I don't think we should do major work only in gantt until we're ready to deprecate the current scheduler. -- Russell Bryant
Re: [openstack-dev] [Gantt] New scheduler tree not up
On 01/06/2014 03:55 PM, Dugger, Donald D wrote: This is to let everyone know that we have created a new source tree, https://github.com/openstack/gantt.git, that contains the code for the Nova scheduler. The ultimate goal is to create a separate scheduler service that can be utilized by any part of OpenStack that needs a scheduling capability, but we're going to start out just by moving the current scheduler into a separate tree. Note that right now the new tree is not a new scheduler, it is just the current Nova scheduler code that has been moved to a new tree. Any changes we want to make to the Nova scheduler should still happen inside the Nova tree. There are a few tasks that need to be completed before the new tree can be utilized by Nova:

1) Get the tests working in the Gantt tree. Currently, we've made the tests non-voting as none of the tests work yet; fixing this is clearly a critical task.
2) Do the plumbing such that Nova makes its scheduler calls into the new Gantt tree. Given that the Gantt tree is a duplicate of the Nova code this should be a fairly safe change, but there is clearly work that needs to be done here.
3) Get the documentation working in the Gantt tree; that's currently broken.
4) Start working on creating Gantt as a separately running, callable service (expect some lively sessions at the next summit; new RESTful APIs are probably needed at a minimum).

Note that until we've completed task 2 there will be 2 separate scheduler source trees, the one in Nova and the one in Gantt. Scheduler development should not change for now; all scheduler changes should be applied to the Nova tree. I'll be monitoring the Nova tree and will port over any scheduler changes into the Gantt tree. (Hopefully we will complete task 2 before I go crazy doing that.) Further, I think all gantt feature development should be explicitly *not* allowed until the nova scheduler is deprecated and ready to be replaced by gantt. 
Other efforts (a REST API, supporting other services) will have to wait. This is to make the transition as quick as possible.

-- Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
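The "plumbing" in task 2 could take many forms. One minimal sketch is a fallback import shim that prefers the Gantt package when it is importable and otherwise uses the duplicated in-tree Nova scheduler; the module names below are illustrative assumptions, not the real layout of either tree:

```python
import importlib

def load_scheduler(candidates=("gantt.scheduler", "nova.scheduler")):
    """Return (name, module) for the first importable scheduler package.

    The default candidate names are hypothetical placeholders; the point is
    only the fallback pattern: try the Gantt tree first, and fall back to
    the in-tree Nova scheduler until task 2 is complete.
    """
    for name in candidates:
        try:
            return name, importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("no scheduler implementation found: %s" % (candidates,))
```

With a shim like this, call sites would not need to change when the Gantt tree starts shipping a working package; the shim would simply start picking it up.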
Re: [openstack-dev] [OpenStack][Nova][cold migration] Why we need confirm resize after cold migration
On 01/08/2014 04:52 AM, Jay Lau wrote:
> Greetings,
>
> I have a question related to cold migration. Nova currently supports live migration, cold migration, and resize. For live migration, we do not need to confirm after the migration finishes. For resize, we need to confirm, as we want to give the end user an opportunity to roll back.
>
> The problem is cold migration: because cold migration and resize share the same code path, once I submit a cold migration request and the migration finishes, the VM goes to the verify_resize state and I need to confirm the resize. I am a bit confused by this. Why do I need to verify a resize for a cold migration operation? Why not reset the VM to its original state directly after cold migration?

The confirm step definitely makes more sense for the resize case. I'm not sure there was a strong reason why it was also needed for cold migration. If nobody comes up with a good reason to keep it, I'm fine with removing it. It can't be changed in the v2 API, though; this would be a v3-only change.

> Also, I think we probably need to split compute.api.resize() into two APIs: one for resize and the other for cold migration.
>
> 1) The VM state can be either ACTIVE or STOPPED for a resize operation.
> 2) The VM state must be STOPPED for a cold migrate operation.

I'm not sure why we would require different states here, though. ACTIVE and STOPPED are both allowed now.

-- Russell Bryant
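The shared code path Jay describes can be summarized in a small sketch. This is a simplified toy model of the behavior under discussion, not Nova's actual implementation; the flavor_id convention (None meaning cold migration) is noted elsewhere in the thread, and the state names follow the discussion above:

```python
ACTIVE, STOPPED, VERIFY_RESIZE = "active", "stopped", "verify_resize"

def resize(vm_state, flavor_id=None):
    """Simplified model of the shared resize / cold-migrate path.

    A call with no new flavor is treated as a cold migration, yet both
    operations currently land in VERIFY_RESIZE and require an explicit
    confirm, which is the behavior being questioned for cold migration.
    """
    if vm_state not in (ACTIVE, STOPPED):
        raise ValueError("operation requires an ACTIVE or STOPPED instance")
    operation = "cold-migrate" if flavor_id is None else "resize"
    return {"operation": operation, "next_state": VERIFY_RESIZE}
```

In this model, `resize("active")` is a cold migration and `resize("active", flavor_id="2")` is a true resize, but both end up in the same verify_resize hold state.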
Re: [openstack-dev] [OpenStack][Nova][cold migration] Why we need confirm resize after cold migration
On 01/08/2014 09:53 AM, John Garbutt wrote:
> On 8 January 2014 10:02, David Xie david.script...@gmail.com wrote:
>> In nova/compute/api.py#2289, function resize, there is a parameter named flavor_id; if it is None, the request is considered a cold migration. Thus, nova should skip resize verification. However, it doesn't. Like Jay said, we should skip this step during cold migration. Does it make sense?
>
> Not sure.
>
> On Wed, Jan 8, 2014 at 5:52 PM, Jay Lau jay.lau@gmail.com wrote:
>> Greetings,
>>
>> I have a question related to cold migration. Nova currently supports live migration, cold migration, and resize. For live migration, we do not need to confirm after the migration finishes. For resize, we need to confirm, as we want to give the end user an opportunity to roll back.
>>
>> The problem is cold migration: because cold migration and resize share the same code path, once I submit a cold migration request and the migration finishes, the VM goes to the verify_resize state and I need to confirm the resize. I am a bit confused by this. Why do I need to verify a resize for a cold migration operation? Why not reset the VM to its original state directly after cold migration?
>
> I think the idea was to allow users/admins to check that everything went OK, and to only delete the original VM once they have confirmed the move went OK.
>
> I thought there was an auto_confirm setting. Maybe you want auto_confirm for cold migrate, but not auto_confirm for resize?

I suppose we could add an API parameter to auto-confirm these things. That's probably a good compromise.

>> Also, I think we probably need to split compute.api.resize() into two APIs: one for resize and the other for cold migration.
>>
>> 1) The VM state can be either ACTIVE or STOPPED for a resize operation.
>> 2) The VM state must be STOPPED for a cold migrate operation.
>
> We just stop the VM, then perform the migration. I don't think we need to require that it's stopped first. Am I missing something?

Don't think so ... I think we should leave it as is.
-- Russell Bryant
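The auto-confirm compromise floated in this thread could look roughly like the following. This is a hypothetical sketch of the proposal using a simplified toy state model; `auto_confirm` is not an existing Nova API parameter:

```python
ACTIVE, STOPPED, VERIFY_RESIZE = "active", "stopped", "verify_resize"

def resize(vm_state, flavor_id=None, auto_confirm=False):
    """Hypothetical auto_confirm flag for the shared resize/cold-migrate path.

    With auto_confirm=True the instance would skip the VERIFY_RESIZE hold
    state and return straight to its previous state, which is what the
    cold-migration callers in this thread appear to want.
    """
    if vm_state not in (ACTIVE, STOPPED):
        raise ValueError("operation requires an ACTIVE or STOPPED instance")
    operation = "cold-migrate" if flavor_id is None else "resize"
    next_state = vm_state if auto_confirm else VERIFY_RESIZE
    return {"operation": operation, "next_state": next_state}
```

Making the flag opt-in keeps today's confirm-by-default behavior for resize while letting cold migration callers skip the extra step.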
Re: [openstack-dev] [all] Organizing a Gate Blocking Bug Fix Day
On 01/09/2014 07:46 AM, Sean Dague wrote:
> I think we are all agreed that the current state of gate resets isn't good. Unfortunately, some basic functionality is really not working reliably, like being able to boot a guest to the point where you can ssh into it. These are common bugs, but they aren't easy ones. We've had a few folks digging deep on these, but we, as a community, are not keeping up with them.
>
> So I'd like to propose a Gate Blocking Bug Fix day, to be Monday, Jan 20th. On that day I'd ask all core reviewers (and anyone else) on all projects to set aside the day to *only* work on gate blocking bugs. We'd like to quiet the queues so that they don't include any other changes that day, so that only fixes related to gate blocking bugs would be in the system.
>
> This has multiple goals:
>
> #1 - fix some of the top issues
> #2 - ensure we classify (ER fingerprint) and register everything we're seeing in the gate fails
> #3 - ensure all gate bugs are triaged appropriately
>
> I'm hopeful that if we can get everyone looking at this on a single day, we can start to dislodge the log jam that exists. Specifically, I'd like to get commitments from as many PTLs as possible that they'll both directly participate in the day and encourage the rest of their project to do the same.

I'm in!

-- Russell Bryant
Re: [openstack-dev] [nova][neutron] top gate bugs: a plea for help
On 01/08/2014 05:53 PM, Joe Gordon wrote:
> Hi All,
>
> As you know, the gate has been in particularly bad shape this week (gate queue over 100!) due to a number of factors. One factor is how many major outstanding bugs we have in the gate. Below is a list of the top 4 open gate bugs.
>
> Here are some fun facts about this list:
>
> * All bugs have been open for over a month
> * All are nova bugs
> * These 4 bugs alone were hit 588 times, which averages to 42 hits per day (data is over two weeks)!
>
> If we want the gate queue to drop, and to stop continuously running 'recheck bug x', we need to fix these bugs. So I'm looking for volunteers to help debug and fix these bugs.

I created the following etherpad to help track the most important Nova gate bugs, who is actively working on them, and any patches we have in flight to help address them:

https://etherpad.openstack.org/p/nova-gate-issue-tracking

Please jump in if you can. We shouldn't wait for the gate bug day to move on these. Even if others are already looking at a bug, feel free to do the same. We need multiple sets of eyes on each of these issues.

-- Russell Bryant
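As a quick sanity check, the quoted average follows directly from the figures in Joe's mail (588 hits over two weeks of data):

```python
hits = 588   # total hits across the top 4 gate bugs (Joe's figure)
days = 14    # "data is over two weeks"
print(hits / days)  # 42.0, matching the quoted 42 hits per day
```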