Re: [Openstack-operators] [nova][ironic][scheduler][placement] IMPORTANT: Getting rid of the automated reschedule functionality
On 05/22/2017 02:45 PM, James Penick wrote:
>> I recognize that large Ironic users expressed their concerns about IPMI/BMC communication being unreliable and not wanting to have users manually retry a baremetal instance launch. But, on this particular point, I'm of the opinion that Nova should just do one thing and do it well. Nova isn't an orchestrator, nor is it intending to be a "just continually try to get me to this eventual state" system like Kubernetes.
>
> Kubernetes is a larger orchestration platform that provides autoscale. I don't expect Nova to provide autoscale, but I agree that Nova should do one thing and do it really well, and in my mind that thing is reliable provisioning of compute resources. Kubernetes does autoscale among other things. I'm not asking for Nova to provide autoscale, I -AM- asking OpenStack's compute platform to provision a discrete compute resource reliably. This means overcoming common and simple error cases. As a deployer of OpenStack I'm trying to build a cloud that wraps the chaos of infrastructure, and present a reliable facade. When my users issue a boot request, I want to see it fulfilled. I don't expect it to be a 100% guarantee across any possible failure, but I expect (and my users demand) that my "Infrastructure as a Service" API make reasonable accommodation to overcome common failures.

Right, I think this hits my major queasiness with throwing the baby out with the bathwater here. I feel like Nova's job is to give me a compute when I ask for one. Yes, like malloc, things could fail. But honestly, if Nova can recover from that scenario, it should try to. The baremetal and affinity cases are pretty good instances where Nova can catch and recover, and not just export that complexity up. It would make me sad to export that complexity to users, and instead of handling those cases internally make every SDK, app, and simple script build their own retry loop.
-Sean -- Sean Dague http://dague.net ___ OpenStack-operators mailing list OpenStack-operators@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
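The alternative Sean alludes to — every SDK, app, and script growing its own retry loop — would look something like the following sketch. The client handle and its `create_server`/`get_server`/`delete_server` calls are hypothetical, not a real SDK API; this just illustrates the boilerplate each consumer would have to duplicate:

```python
import time

class BootFailed(Exception):
    """Raised when every boot attempt ends in ERROR (illustrative)."""

def boot_with_retry(client, request, max_attempts=3, poll_interval=5):
    """Client-side retry loop every consumer would need if Nova stopped
    rescheduling internally. `client` is a hypothetical SDK handle."""
    for attempt in range(1, max_attempts + 1):
        server = client.create_server(**request)
        # Poll until the instance leaves the BUILD state.
        while server["status"] == "BUILD":
            time.sleep(poll_interval)
            server = client.get_server(server["id"])
        if server["status"] == "ACTIVE":
            return server
        # Transient failure (e.g. one Ironic node's dead BMC): clean up
        # and try again, hoping the scheduler picks a different host.
        client.delete_server(server["id"])
    raise BootFailed("instance failed after %d attempts" % max_attempts)
```

Every caller has to reinvent the polling, the cleanup of the failed instance, and the attempt budget — which is exactly the complexity a reschedule inside Nova hides.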
Re: [Openstack-operators] RFC - Global Request Ids
On 05/16/2017 12:01 PM, Sean Dague wrote:
> After the forum session on logging, we came up with what we think is an approach here for global request ids - https://review.openstack.org/#/c/464746/ - it would be great if interested operators would confirm this solves their concerns.
>
> There is also an open question. A long standing concern was "trusting" the request-id, though I don't really know how that could be exploited for anything really bad, and this puts in a system for using service users as a signal for trust.
>
> But the whole system is a lot easier, and comes together quicker, if we don't have that. Especially for public cloud users: are there any concerns that you have in letting users set the Request-Id (assuming you'll also still have a 2nd request-id that's service local and acts like request-id today)?

FYI, right now CERN and GoDaddy have expressed that they don't need strong trust validation on these ids (as long as they are validated to look like a uuid, so no injection concerns). No one has provided a rationale for the original fears around doing that. So unless I hear something in the next 24 hours we'll update the spec to drop that part.

-Sean -- Sean Dague http://dague.net
[Openstack-operators] RFC - Global Request Ids
After the forum session on logging, we came up with what we think is an approach here for global request ids - https://review.openstack.org/#/c/464746/ - it would be great if interested operators would confirm this solves their concerns.

There is also an open question. A long standing concern was "trusting" the request-id, though I don't really know how that could be exploited for anything really bad, and this puts in a system for using service users as a signal for trust. But the whole system is a lot easier, and comes together quicker, if we don't have that. Especially for public cloud users: are there any concerns that you have in letting users set the Request-Id (assuming you'll also still have a 2nd request-id that's service local and acts like request-id today)?

-Sean -- Sean Dague http://dague.net
Re: [Openstack-operators] [openstack-dev] [nova][glance] Who needs multiple api_servers?
On 04/28/2017 12:50 AM, Blair Bethwaite wrote:
> We at Nectar are in the same boat as Mike. Our use-case is a little bit more about geo-distributed operations though - our Cells are in different States around the country, so the local glance-apis are particularly important for caching popular images close to the nova-computes. We consider these glance-apis as part of the underlying cloud infra rather than user-facing, so I think we'd prefer not to see them in the service-catalog returned to users either... is there going to be a (standard) way to hide them?

In a situation like this, where Cells are geographically bounded, is there also a Region for that Cell/Glance?

-Sean -- Sean Dague http://dague.net
[Openstack-operators] [all] [quotas] Unified Limits Conceptual Spec RFC
Background: At the Atlanta PTG there was yet another attempt to get hierarchical quotas more generally addressed in OpenStack. A proposal was put forward that considered storing the limit information in Keystone (https://review.openstack.org/#/c/363765/). While there were some concerns on details that emerged out of that spec, the concept of the move to Keystone was actually really well received in that room by a wide range of parties, and it seemed to solve some interesting questions around project hierarchy validation. We were perilously close to having a path forward for a community request that's had a hard time making progress over the last couple of years. Let's keep that flame alive!

Here is the proposal for the Unified Limits in Keystone approach - https://review.openstack.org/#/c/440815/. It is intentionally a high level spec that largely lays out where the conceptual levels of control will be. It intentionally does not talk about specific quota models (there is a follow-on that is doing some of that, under the assumption that the exact model(s) supported will take a while, and that the keystone interfaces are probably not going to substantially change based on model).

We're shooting for a 2 week comment cycle here, to then decide if we can merge and move forward during this cycle or not. So please comment/question now (either in the spec or here on the mailing list). It is especially important that we get feedback from teams that have limits implementations internally, as well as any that have started on hierarchical limits/quotas (which, I believe, only Cinder has).

Thanks for your time, and I look forward to seeing comments on this.

-Sean -- Sean Dague http://dague.net
Re: [Openstack-operators] RFC - hierarchical quota models
On 03/08/2017 07:12 AM, Tim Bell wrote:
>> On 7 Mar 2017, at 11:52, Sean Dague <s...@dague.net> wrote:
>> One of the things that came out of the PTG was perhaps a new path forward on hierarchical limits that involves storing limits in keystone and doing counting in the projects. Members of the developer community are working through all that right now; that's not really what this is about. As a related issue, it seemed that every time we talk about this space, people jump into describing how they think the counting / enforcement would work. It became clear that people were overusing the word "overbooking" to the point where it didn't have a lot of meaning. https://review.openstack.org/#/c/441203/ is a reference document in which I started writing out every model I thought I heard people talk about, the rules that go with each, and the kind of algorithm needed to update limits, as well as check quota, on the ones that seem like we might move forward with. It is full of blockdiag markup, which makes the rendered HTML the best way to view it - http://docs-draft.openstack.org/03/441203/11/check/gate-keystone-specs-docs-ubuntu-xenial/c3fc2b3//doc/build/html/specs/keystone/backlog/hierarchical-quota-scenarios.html There are specific questions to the operator community here: Are there other models you believe are not represented that you think should be considered? If so, what are their rules, so I can throw them into the document. Thanks.
>
> In the interest of completeness, I’ll add one more scenario to the mix, but I would not look for this as part of the functionality of the 1st release. One item we have encountered in the past is how to reduce quota for projects. If a child project quota is to be reduced but it is running the maximum number of VMs, the parent project administrator has to wait for the child to do the deletion before they can reduce the quota.
> Being able to do this would mean that new resource creation would be blocked, but existing resources would continue to run (until the child project admin gets round to choosing priorities for deletion among the many VMs he has running). However, this does bring in significant additional complexity, so unless there is an easy way of modelling it, I’d suggest this for nested quotas v2 at the earliest.

Actually, if we unify limit definitions in keystone, that's going to be the default behavior. A change in limits *will not* validate against existing usage. That will get called out more specifically in the next version of the unified limits spec.

-Sean -- Sean Dague http://dague.net
[Openstack-operators] RFC - hierarchical quota models
One of the things that came out of the PTG was perhaps a new path forward on hierarchical limits that involves storing limits in keystone and doing counting in the projects. Members of the developer community are working through all that right now; that's not really what this is about.

As a related issue, it seemed that every time we talk about this space, people jump into describing how they think the counting / enforcement would work. It became clear that people were overusing the word "overbooking" to the point where it didn't have a lot of meaning. https://review.openstack.org/#/c/441203/ is a reference document in which I started writing out every model I thought I heard people talk about, the rules that go with each, and the kind of algorithm needed to update limits, as well as check quota, on the ones that seem like we might move forward with. It is full of blockdiag markup, which makes the rendered HTML the best way to view it - http://docs-draft.openstack.org/03/441203/11/check/gate-keystone-specs-docs-ubuntu-xenial/c3fc2b3//doc/build/html/specs/keystone/backlog/hierarchical-quota-scenarios.html

There are specific questions to the operator community here: Are there other models you believe are not represented that you think should be considered? If so, what are their rules, so I can throw them into the document?

Would love to try to model everything under consideration here. It seems like the conversations go around in circles a bit because everyone is trying to keep everything in working memory, and paging out parts. Diagrams hopefully ensure we are all talking about the same things.

-Sean -- Sean Dague http://dague.net
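One of the simpler models in that document — a strict tree, where a request must fit under the limit of the project and of every ancestor — can be sketched as follows. The data structures and the assumption that a project's usage already includes its descendants are mine, not the spec's final algorithm:

```python
def check_quota(project, requested, limits, usage, parents):
    """Strict hierarchical check: the request must fit under the limit
    at every level walking up the project tree.

    `limits` and `usage` map project name -> int, `parents` maps each
    child to its parent (None at the root). Usage for a project is
    assumed to already include all of its descendants. A sketch of one
    model only, not a complete enforcement design.
    """
    node = project
    while node is not None:
        if usage.get(node, 0) + requested > limits.get(node, float("inf")):
            return False
        node = parents.get(node)
    return True
```

Note this naturally exhibits the behavior discussed in this thread for limit reductions: if a parent's limit is dropped below current usage, any new request fails at that level, while nothing forces existing resources to be deleted.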
Re: [Openstack-operators] What would you like in Pike?
On 01/16/2017 04:02 PM, Jonathan Bryce wrote: > >> On Jan 16, 2017, at 11:03 AM, Matt Riedemann >> <mrie...@linux.vnet.ibm.com <mailto:mrie...@linux.vnet.ibm.com>> wrote: >> >> On 1/12/2017 7:30 PM, Melvin Hillsman wrote: >>> Hey everyone, >>> >>> I am hoping to get a dialogue started to gain some insight around things >>> Operators, Application Developers, and End Users would like to see >>> happen in Pike. If you had a dedicated environment, dedicated team, and >>> freedom to choose how you deployed, new features, older features, >>> enhancements, etc, and were not required to deal with customer/client >>> tickets, calls, and maintenances, could keep a good feedback loop >>> between your team and the upstream community of any project, what would >>> like to make happen or work on hoping the next release of OpenStack >>> had/included/changed/enhanced/removed…? >>> >> >> I just wanted to say thanks for starting this thread. I often wonder >> what people would like to see the Nova team prioritize because we >> don't get input from the product work group or Foundation really on >> big ticket items so we're generally left to prioritizing work on our >> own each release. > > I agree; thanks Melvin! Great input so far. > > Matt, on the input on big ticket items, I’d love to get your feedback on > what is missing or you’d like to see more of in the Foundation reports > or Product Working Group roadmaps to make them more useful for these > kinds of items. Is this thread more consumable because specific > functionality is identified over themes? Is it the way it’s scoped to a > release? We could possibly even add in a similar question (“What would > you like to see in the next release?”) to the user survey if this is > info you’ve been looking for. One of the challenges thus far on PWG input is it has often not been very grounded in reality. It often misses key details about what's hard, possible, or easy when it comes to the code and communities in question. 
The operator list feedback is typically from teams that are much more steeped in the day-to-day of running OpenStack, have sifted through chunks of the code, and have chatted more with community members. It very often starts from a place of much more common ground and understanding.

-Sean -- Sean Dague http://dague.net
Re: [Openstack-operators] How are people dealing with API rate limiting?
On 06/14/2016 11:02 AM, Matt Riedemann wrote:
> A question came up in the nova IRC channel this morning about the api_rate_limit config option in nova which was only for the v2 API.
>
> Sean Dague explained that it never really worked because it was per API server, so if you had more than one API server it was busted. There is no in-tree replacement in nova.
>
> So the open question here is, what are people doing as an alternative?

Just as a clarification: the toy implementation that used to live in Nova was even worse than described above. The counters were kept per process, so if you had > 1 API worker (the default is one worker per CPU) it would be inconsistent. This is why it was defaulted to false in Havana, and never carried forward to the new API backend.

-Sean -- Sean Dague http://dague.net
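The per-process failure mode described here is easy to demonstrate: with W workers behind one endpoint, each keeping its own counter, the effective limit becomes W times what was configured. A toy illustration (any real replacement needs a shared store or an external proxy in front of the API; this sketch only shows why local counters are busted):

```python
import time

class LocalRateLimiter:
    """The broken pattern: a request counter kept inside one process.
    With W workers behind one endpoint, the effective overall limit
    becomes W * limit, because no worker sees the others' traffic."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.count = 0
        self.started = time.monotonic()

    def allow(self):
        now = time.monotonic()
        if now - self.started > self.window:
            # New window: reset the (process-local) counter.
            self.count, self.started = 0, now
        if self.count < self.limit:
            self.count += 1
            return True
        return False

# Two "workers", each with its own counter, behind a round-robin LB:
workers = [LocalRateLimiter(limit=10), LocalRateLimiter(limit=10)]
allowed = sum(workers[i % 2].allow() for i in range(40))
# Twice the intended number of requests get through.
```

This is why the option defaulted to off: it never enforced the number operators thought they had configured.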
[Openstack-operators] Nova API extensions removal plan
Nova is getting towards its final phases of the long term arc to really standardize the API, which includes removing the API extensions facility. This has been a long arc that was started in Atlanta, and it has been talked about in a lot of channels, but some interactions this past week made us realize that some folks might not have realized this is happening.

So we've now got an overarching spec about how and why we're removing the API extensions facility from Nova, and the alternatives that exist for folks - https://specs.openstack.org/openstack/nova-specs/specs/newton/approved/api-no-more-extensions.html

This is informative for folks; please take a look if you think this will impact you.

-Sean -- Sean Dague http://dague.net
Re: [Openstack-operators] Nova 2.1 and user permissions in the policy file
On 05/27/2016 04:23 AM, Dario Vianello wrote: > >> On 25 May 2016, at 17:31, Tim Bell <tim.b...@cern.ch >> <mailto:tim.b...@cern.ch>> wrote: >> >> >> On 25/05/16 17:36, "Sean Dague" <s...@dague.net >> <mailto:s...@dague.net>> wrote: >> >>> On 05/23/2016 10:24 AM, Tim Bell wrote: >>>> >>>> >>>> Quick warning for those who are dependent on the "user_id:%(user_id)s" >>>> syntax for limiting actions by user. According to >>>> https://bugs.launchpad.net/nova/+bug/1539351, this behavior was >>>> apparently not intended according to the bug report feedback. The >>>> behavior has changed from v2 to v2.1 and the old syntax no longer works. >>>> >>>> >>>> >>>> There can be security implications also so I’d recommend those using >>>> this current v2 feature to review the bug to understand the potential >>>> impacts as clouds enable v2.1. >>> >>> The Nova team is currently lacking information about the minimum number >>> of user_id supporting policy points are needed. Because supporting >>> user_id everywhere is definitely not going to be an option. >>> >>> We really need very detailed lists of which actions are required, and >>> why. And for all server actions why "lock" action is not sufficient. And >>> we need all of that by N1, which is in a week. With that we can evaluate >>> what can be added to the API stack. Especially because this all needs >>> tests so it doesn't regress. So if we can keep it at a small number of >>> operations, it is way more likely to happen. If this grows to >>> "everything", it definitely won't. >>> >>> It would honestly be great if people affected by this could also >>> prioritize top to bottom what operations are most important. Detailed >>> use case and priority is really needed to figure out what can be done. >>> >> >> Thanks for looking into this. 
>> The current set of activities that our developers want to do for their VMs (and do not want others doing to their instances ☺) are:
>>
>> - power off/power on/restart
>> - VNC console (since this also allows the above with appropriate SysRq)
>> - delete VM
>>
>> I think in the longer term, we can work together to find a way to do this with nested projects and some kind of automatic project creation, but without nested quotas and image sharing in the hierarchy being priorities, these are not yet at functional parity compared to the current Nova v2 implementation.
>>
>> Tim
>
> Here at EMBL-EBI we are touched by this possible change in several ways:
> - to properly federate our OpenStack to the EGI FedCloud, which requires this feature. More on this coming, I hope somebody from the EGI will post here soon.
> - to support the same activities mentioned by Tim (power off/on/restart, console, VM deletion) in our about-to-come dev platform
> - to effectively support the long term of science scenario, where a single tenancy is shared by different independent researchers to do their computation.
>
> I do agree that all this can be tackled by leveraging nested projects, but especially for the FedCloud we are committed to deliver soon. Stripping this feature out of Nova 2.1 would thus be an issue for us.

Ok, but this is not the level of detail that we need to be actionable. We realistically need an ordered list of every policy rule changed, ordered by how much it impacts things. What is listed above is way too high level to understand any of that. From Tim we got a small list of actions; we can probably work with that.

And to be clear, this feature already doesn't exist in master, as it went away in the default implementation 2 releases ago. We're talking about new feature development here, which is real work. And this feature will be marked as deprecated should it go in, so other alternatives are going to need to be considered here.
-Sean -- Sean Dague http://dague.net
Re: [Openstack-operators] Nova 2.1 and user permissions in the policy file
On 05/23/2016 10:24 AM, Tim Bell wrote:
> Quick warning for those who are dependent on the "user_id:%(user_id)s" syntax for limiting actions by user. According to https://bugs.launchpad.net/nova/+bug/1539351, this behavior was apparently not intended according to the bug report feedback. The behavior has changed from v2 to v2.1 and the old syntax no longer works.
>
> There can be security implications also, so I’d recommend those using this current v2 feature to review the bug to understand the potential impacts as clouds enable v2.1.

The Nova team is currently lacking information about the minimum set of policy points that need user_id support, because supporting user_id everywhere is definitely not going to be an option. We really need very detailed lists of which actions are required, and why, and, for all server actions, why the "lock" action is not sufficient. And we need all of that by N1, which is in a week. With that we can evaluate what can be added to the API stack, especially because this all needs tests so it doesn't regress. So if we can keep it to a small number of operations, it is way more likely to happen. If this grows to "everything", it definitely won't.

It would honestly be great if people affected by this could also prioritize, top to bottom, what operations are most important. Detailed use cases and priority are really needed to figure out what can be done.

-Sean -- Sean Dague http://dague.net
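For readers following along, the semantics of the `user_id:%(user_id)s` rule under discussion are: the caller's `user_id` credential must equal the `user_id` attribute of the target (the server being acted on). A toy evaluator showing just those semantics — real deployments use oslo.policy, and this is emphatically not its implementation:

```python
def check(rule, target, creds):
    """Toy evaluation of a 'kind:%(attr)s' style policy rule: the
    credential named by `kind` must equal the value interpolated from
    the target. Illustrates the semantics only; use oslo.policy for
    real enforcement."""
    kind, _, match = rule.partition(":")
    return creds.get(kind) == (match % target)

# The target is the server's attributes; creds come from the token.
server = {"user_id": "alice-uuid", "project_id": "proj-1"}
```

So with this rule on, say, a reboot action, Alice could reboot the server above but Bob could not, even if both are members of `proj-1` — which is exactly the per-user isolation inside a shared project that this thread is about.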
Re: [Openstack-operators] Nova 2.1 and user permissions in the policy file
On 05/23/2016 11:56 AM, Tim Bell wrote:
> On 23/05/16 17:02, "Sean Dague" <s...@dague.net> wrote:
>> On 05/23/2016 10:24 AM, Tim Bell wrote:
>>> Quick warning for those who are dependent on the "user_id:%(user_id)s" syntax for limiting actions by user. According to https://bugs.launchpad.net/nova/+bug/1539351, this behavior was apparently not intended according to the bug report feedback. The behavior has changed from v2 to v2.1 and the old syntax no longer works.
>>
>> Well, the behavior changes with the backend code base. By mitaka the default backend code for both is the same. And the legacy code base is about to be removed. This feature (policy enforcement by user_id) was 100% untested, which is why it never ended up in the new API stack. Being untested, setting owner: 'user_id: %(user_id)s' might have some really unexpected results because not everything has a user_id.
>
> There are several hints given in the documentation regarding this sort of feature. Examples are such as http://docs.openstack.org/developer/oslo.policy/api.html and http://docs.openstack.org/mitaka/config-reference/policy-json-file.html#examples

Ok, follow-on question. Is the concern that within a large tenant you do not want user A to accidentally reboot user B's server? Would the "lock" construct be sufficient here for users that have servers in critical states? Are all the failure scenarios of this kind? If not, what is the detailed list of bad interactions that you are concerned about?

-Sean -- Sean Dague http://dague.net
Re: [Openstack-operators] Nova 2.1 and user permissions in the policy file
On 05/24/2016 02:22 AM, Jerome Pansanel wrote:
> Hi,
>
> On 23/05/2016 18:23, Sean Dague wrote:
>> On 05/23/2016 11:56 AM, Tim Bell wrote:
>>> On 23/05/16 17:02, "Sean Dague" <s...@dague.net> wrote:
>>>> On 05/23/2016 10:24 AM, Tim Bell wrote:
>>>>> [...] There can be security implications also so I’d recommend those using this current v2 feature to review the bug to understand the potential impacts as clouds enable v2.1.
>>>>
>>>> While I understand from the bug report what your use case is now, I'm kind of wondering what the shared resources / actions of these 150 people are in this project. Are they all in the same project for other reasons?
>>>
>>> The resource pool (i.e. quota) is shared between all of the developers. A smaller team is responsible for maintaining the image set for the project and also providing 2nd line support (such as reboot/problem diagnosis…).
>>
>> Ok, so Bob can take up all the instances and go on vacation, and it's a 2nd line support call to handle shutting them down? It definitely creates some weird situations where you can all pull from the pool, and once pulled only you can give back.
>>
>> What's the current policy patch look like? (i.e. which operations are you changing to user_id?)
>>
>>> I do not know the EMBL-EBI use case or the EGI Federated Cloud scenarios which are also mentioned in the review.
>
> The EGI Federated Cloud scenario is almost the same. We have tenants for several projects and a "catch-all" tenant for small projects (1 or 2 persons per project). Therefore, it is important to be sure that a user from one project does not interact with VMs from another one.

Ok, but the catch-all project is just to have fewer "projects" allocated in keystone, right? Are these users using shared resources? I get that there is a convenience in no one having to allocate projects...
and this just falls back to AD and people get in dynamically, but it seems like we could solve project-on-demand differently.

I think part of the challenge here is that, fundamentally, project is intended to be the unit of sharing. Because policy is so open ended, people did a bunch of things which effectively made nested projects, which work in some cases, but might have some really odd edge conditions based on what actions are enforced where. It also really breaks the construct of GET /servers being the list of servers you can do things to. Which... is a pretty fundamental contract point in Nova. And confusing that point across clouds makes it really hard for people to build software against the API that your users can just use.

-Sean -- Sean Dague http://dague.net
Re: [Openstack-operators] Nova 2.1 and user permissions in the policy file
On 05/23/2016 11:56 AM, Tim Bell wrote: > On 23/05/16 17:02, "Sean Dague" <s...@dague.net> wrote: > >> On 05/23/2016 10:24 AM, Tim Bell wrote: >>> >>> >>> Quick warning for those who are dependent on the "user_id:%(user_id)s" >>> syntax for limiting actions by user. According to >>> https://bugs.launchpad.net/nova/+bug/1539351, this behavior was >>> apparently not intended according to the bug report feedback. The >>> behavior has changed from v2 to v2.1 and the old syntax no longer works. >> >> Well, the behavior changes with the backend code base. By mitaka the >> default backend code for both is the same. And the legacy code base is >> about to be removed. >> >> This feature (policy enforcement by user_id) was 100% untested, which is >> why it never ended up in the new API stack. Being untested setting >> owner: 'user_id: %(user_id)s' might have some really unexpected results >> because not everything has a user_id. >> > > There are several hints given in the documentation regarding this sort of > feature. > > Examples are such as http://docs.openstack.org/developer/oslo.policy/api.html > and > http://docs.openstack.org/mitaka/config-reference/policy-json-file.html#examples Ok, so those are good points of documentation to bring back in line with whatever reality we feel is the one to land on. Keeping user_id support in Nova policy is going to require someone to write a lot of tests to verify it, because it's never been in the current stack, again, because it was never really implemented, because there were never any tests for this scenario anywhere. >>> There can be security implications also so I’d recommend those using >>> this current v2 feature to review the bug to understand the potential >>> impacts as clouds enable v2.1. >> >> While I understand from the bug report what your use case is now, I'm >> kind of wondering what the shared resources / actions of these 150 >> people are in this project. Are they all in the same project for other >> reasons? 
> The resource pool (i.e. quota) is shared between all of the developers. A smaller team is responsible for maintaining the image set for the project and also providing 2nd line support (such as reboot/problem diagnosis…).

Ok, so Bob can take up all the instances and go on vacation, and it's a 2nd line support call to handle shutting them down? It definitely creates some weird situations where you can all pull from the pool, and once pulled only you can give back.

What's the current policy patch look like? (i.e. which operations are you changing to user_id?)

> I do not know the EMBL-EBI use case or the EGI Federated Cloud scenarios which are also mentioned in the review.

Those would be good. I honestly think we need someone to start capturing these in a spec, because a huge part of the disconnect here was that this was a backdoor feature that no one on the development side really understood existed, that was never tested, and that no one thought was the way things were supposed to be working. And if we are bringing it back, we really need to capture the use cases a lot more clearly, so in 5 years we don't do the same thing again.

-Sean -- Sean Dague http://dague.net
Re: [Openstack-operators] Nova 2.1 and user permissions in the policy file
On 05/23/2016 10:24 AM, Tim Bell wrote:
> Quick warning for those who are dependent on the "user_id:%(user_id)s" syntax for limiting actions by user. According to https://bugs.launchpad.net/nova/+bug/1539351, this behavior was apparently not intended according to the bug report feedback. The behavior has changed from v2 to v2.1 and the old syntax no longer works.

Well, the behavior changes with the backend code base. By mitaka the default backend code for both is the same, and the legacy code base is about to be removed. This feature (policy enforcement by user_id) was 100% untested, which is why it never ended up in the new API stack. Being untested, setting owner: 'user_id: %(user_id)s' might have some really unexpected results, because not everything has a user_id.

> There can be security implications also, so I’d recommend those using this current v2 feature to review the bug to understand the potential impacts as clouds enable v2.1.

While I understand from the bug report what your use case is now, I'm kind of wondering what the shared resources / actions of these 150 people are in this project. Are they all in the same project for other reasons?

-Sean -- Sean Dague http://dague.net
[Openstack-operators] disabling deprecated APIs by config?
nova-net is now deprecated - https://review.openstack.org/#/c/310539/ - and we're in the process in Nova of doing some spring cleaning and deprecating the proxies to other services - https://review.openstack.org/#/c/312209/

At some point in the future after deprecation, the proxy code is going to stop working. Either accidentally, because we're not going to test or fix this forever (and we aren't going to track upstream API changes to the proxy targets), or intentionally, when we decide to delete it to make it easier to address core features and bugs that everyone wants addressed.

However, the world moves forward slowly. Consider the following scenario. We delete nova-net & the network proxy entirely in "Peru" (a not entirely unrealistic idea). At that release there are a bunch of people just getting around to Newton. Their deployments allow all these things to happen, which are going to 100% break when they upgrade, and people are writing more and more OpenStack software every cycle.

How do we signal to users this kind of deprecation? Can we give sites tools to help prevent new software being written against deprecated (and scheduled for deletion) APIs?

One idea was a "big red switch" in the form of a config option ``disable_deprecated_apis=True`` (defaulting to False), which would set all deprecated APIs to 404 routes. One of the nice things about this idea is that it would allow some API servers to have this set, and others not. So users could point at the "clean" API server and figure out that they will break, but the default API server would still support these deprecated APIs. Or, conversely, the default could be the clean API server, and a legacy API server endpoint could be provided for projects that really needed these deprecated things for now. Either way it would allow some site-assisted transition, and be something like the -Werror flag in gcc.
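[As a sketch of the proposal: the option below does not exist, it is the hypothetical "big red switch" being floated. On a designated "clean" API server it might look like this in nova.conf:]

```ini
[DEFAULT]
# Hypothetical option from this proposal, not a real Nova option:
# route all deprecated APIs (nova-net, the service proxies) to 404
# on this API server, while other API servers leave it False.
disable_deprecated_apis = True
```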
In the Nova case, the kinds of things ending up in this bucket are going to be interfaces that people *really* shouldn't be using any more. Many of them date back to when OpenStack was only 2 projects, and the concept of splitting out function wasn't really thought about (note: we're getting ahead of this one for the 'placement' REST API, so it won't have any of these issues). At some point this house cleaning was going to have to happen, and now seems to be the time to get it rolling.

Feedback on this idea would be welcomed. We're going to deprecate the proxy APIs regardless; however, disable_deprecated_apis is its own idea with its own consequences, and we really want feedback before pushing forward on it.

-Sean

--
Sean Dague
http://dague.net
Re: [Openstack-operators] nova snapshots should dump all RAM to hypervisor disk ?
On 04/24/2016 10:15 AM, Matt Riedemann wrote:
> To clarify, live snapshots aren't disabled by default because they don't
> work; it's because at least with libvirt 1.2.2 and QEMU 2.0 (which is
> what we test against in the gate), we'd hit a lot of failures (about 25%
> failure rate with the devstack/tempest jobs) when running live snapshot.
> So we suspect there are concurrency issues when running live snapshot on
> a compute along with other operations at the same time (the CI jobs run
> 4 tests concurrently on a single-node devstack).
>
> This might not be an issue on newer libvirt/QEMU; we'll see when we
> start testing with Ubuntu 16.04 nodes.

Correct. I tried to add a clarifying comment to the config option here (as I realized it wasn't described all that well in the docs) - https://review.openstack.org/#/c/309629/

-Sean

--
Sean Dague
http://dague.net
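[For context, the config option under discussion is, if I have the name right for your release, the workarounds flag controlling live snapshots; verify against your release's config reference before relying on it:]

```ini
[workarounds]
# Assumed option name for this thread's subject. Nova disables
# libvirt live snapshots by default because of the gate failure
# rate described above; setting this to False re-enables them on
# deployments with newer libvirt/QEMU.
disable_libvirt_livesnapshot = False
```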
Re: [Openstack-operators] Anyone else use vendordata_driver in nova.conf?
On 04/18/2016 10:13 AM, Ned Rhudy (BLOOMBERG/ 731 LEX) wrote:
> Requiring users to remember to pass specific userdata through to their
> instance at every launch, in order to replace functionality that
> currently works invisibly to them, would be a step backwards. It's an
> alternative, yes, but it's an alternative that adds burden to our users
> and is not one we would pursue.
>
> What is the rationale for desiring to remove this functionality?

The Nova team would like to remove every config option that specifies an arbitrary out-of-tree class file at an extension point. This has been the sentiment for a while, and we did a wave of deprecations at the end of Mitaka to signal it more broadly, because with an arbitrary class loader it is completely impossible to even understand who might be using it and how. These interfaces are not considered stable or contractual, so exposing them as a raw class loader is something we want to stop doing, as we're going to horribly break people at some point.

It's fine if there are multiple implementations of these things; however, those should all be upstream, and selected by a symbolic name CONF option. One of the alternatives is to propose your solution upstream.

-Sean

--
Sean Dague
http://dague.net
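[To illustrate the problem: options like vendordata_driver are effectively a raw class loader. A minimal sketch of the pattern, with a hypothetical load_driver helper rather than Nova's actual code:]

```python
import importlib


def load_driver(dotted_path):
    """Load an arbitrary class from a dotted-path string, e.g. the value
    of a *_driver config option such as "mymod.MyVendordataDriver".

    Anything importable on the host can be named here, which is why it
    is impossible for the project to know what out-of-tree
    implementations exist in the wild, or to keep them working.
    """
    module_name, _, cls_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, cls_name)


# Demonstration with a stdlib class standing in for an out-of-tree driver:
driver_cls = load_driver("collections.OrderedDict")
print(driver_cls.__name__)  # prints "OrderedDict"
```

The symbolic-name alternative mentioned above maps a small, fixed set of names to in-tree classes, so the supported surface is known and testable.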
[Openstack-operators] question on project_id formats
It has come up, in looking at ways to remove project_ids from API urls, that the format of a project_id is actually quite poorly specified in OpenStack. This makes dual-stacking (supporting URLs both with and without the project_id at the same time) a little challenging, because the matching to figure out which controller is going to get called goes a little nuts.

For anyone using upstream Keystone, they'll get a uuid (no dashes) unless they work really hard at changing it. RAX currently doesn't use upstream Keystone; they use ints as their project ids. What I'm curious about is whether anyone else is doing something different in specifying project_ids in their cloud, as this is going to impact the solution going forward here.

We're currently pondering a default enforcement of uuid form if you are using the project id in the url, with a config override to allow a different schema (as we know there is at least one cloud in the wild that is different). Comments welcomed.

-Sean

--
Sean Dague
http://dague.net
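[For concreteness: the default Keystone form referred to here is a uuid4 in hex with no dashes, i.e. a 32-character lowercase hex string. A quick sketch of matching that form — my own regex for illustration, not Nova's actual route matcher:]

```python
import re
import uuid

# Default Keystone project IDs look like uuid.uuid4().hex:
# 32 lowercase hex characters, no dashes.
PROJECT_ID_RE = re.compile(r"^[0-9a-f]{32}$")

pid = uuid.uuid4().hex
print(PROJECT_ID_RE.match(pid) is not None)      # True: Keystone-style id matches
print(PROJECT_ID_RE.match("12345") is not None)  # False: an int-style id does not
```

This is exactly where the proposed default enforcement with a config override comes in: clouds using integer project ids (as mentioned above) would need to override the accepted schema.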
[Openstack-operators] Service Catalog TNG urls
For folks that don't know, we've got an effort under way to look at some of what's happened with the service catalog, how it's organically grown, and to do some pruning and tuning to make sure it's going to support what we want to do with OpenStack for the next 5 years (wiki page to dive deeper here - https://wiki.openstack.org/wiki/ServiceCatalogTNG).

One of the early Open Questions is about urls. Today there is a completely free-form field to specify urls, and there are conventions about having publicURL, internalURL, adminURL. These are, however, only conventions. The only project that's ever really used adminURL has been Keystone, so that's something we feel we can phase out in new representations.

The real question / concern is around public vs. internal, and something we'd love feedback from people on. When this was brought up in Tokyo, the answer we got was that internalURL was important because:

* users trusted it to mean "I won't get charged for bandwidth"
* it is often http instead of https, which provides a 20% performance gain for transferring large amounts of data (i.e. glance images)

The question is: how hard would it be for sites to be configured so that internal routing is used whenever possible? Or is this a concept we need to formalize, and make user applications always make the decision about which interface they should access?

-Sean

--
Sean Dague
http://dague.net
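[A sketch of the client-side decision being debated: choosing an interface when resolving an endpoint from the catalog. The catalog structure below is simplified and made up for illustration; real clients go through keystoneauth:]

```python
# Simplified, hypothetical catalog entry for one service.
catalog_entry = {
    "type": "image",
    "endpoints": {
        "public": "https://glance.example.com:9292",
        "internal": "http://glance.internal:9292",
    },
}


def get_endpoint(entry, interface="public"):
    """Prefer the requested interface, falling back to public.

    This fallback is the per-application decision the thread is asking
    about: should users have to pick 'internal' explicitly, or should
    sites route traffic internally without the user choosing?
    """
    endpoints = entry["endpoints"]
    return endpoints.get(interface, endpoints["public"])


print(get_endpoint(catalog_entry, "internal"))  # http://glance.internal:9292
```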
[Openstack-operators] [gate] devstack / grenade branching day - minor snafu
FYI, we've got a bit of a gate snafu this morning in cutting the devstack / grenade branches for stable/liberty, as devstack-gate did not yet have support for stable/liberty in it. That patch is up at https://review.openstack.org/#/c/229363/

With any luck this all gets resolved in the next couple of hours, but things are going to be a bit rocky until that bit lands.

-Sean

--
Sean Dague
http://dague.net
Re: [Openstack-operators] [nova] Backlog Specs: a way to send requirements to the developer community
On 05/15/2015 07:20 AM, Boris Pavlovic wrote:
> John,
>
>> So, you can have all the details in the spec, or you can have only the
>> problem statement complete. It's up to you as a submitter how much
>> detail you want to provide. I would recommend adding rough ideas into
>> the alternatives section, and leaving everything else blank except the
>> problem statement.
>>
>> We are trying to use a single process for parked developer ideas and
>> operator ideas, so they are on an equal footing. The reason it's not
>> just a bug or similar is so we are able to review the idea with the
>> submitter, to ensure we have a good enough problem description that a
>> developer can pick up and run with the idea, and turn it into a more
>> complete spec targeted at a specific release. In addition, we are
>> ensuring that the problem description is in scope.
>
> Feature requests are the same process as specs. And I fully agree that
> bug reports are not a good idea for this. The only difference is the
> template. I believe you won't find end users who would like to spend
> the time to read all these steps:
> http://specs.openstack.org/openstack/nova-specs/specs/kilo/template.html
> and provide the required info. As an end user I would like to spend a
> maximum of 5 minutes providing info about:
> - use case
> - problem description
> - possible solution [optional]
>
> What about making the nova template simpler? And actually doing this
> across all projects?

So... "maximum 5 minutes" is where I think that idea goes horribly off the rails, because it sets up the *wrong* expectations of how we make progress as a community, and is only going to cause anger and frustration for all parties. Adding most features to most projects requires a reasonable conversation about how that impacts other parts of that project, other projects in OpenStack, existing deploys of OpenStack, and compatibility between OpenStack implementations in the field, especially as we try to get more serious about interoperability.
I get that everyone wants an easy button, and for magical elves to do stuff. However, I think Rally is the anomaly here, as it doesn't need to address many of these concerns for where it lives in the tools stack. An easy process for submitting that ends up with too little information or engagement to be useful is worse than none at all, because then there is just a black hole in the middle where everything is going to get lost.

We make progress as a community by building strong communication ties across different points of view, reaching out across processes, and being willing, on all sides, to invest the time to move things forward together. It's one of the reasons I think the Ops meetups have been very successful, as they provide really great forums to do that. We don't do it with a max-5-minute submit process.

-Sean

--
Sean Dague
http://dague.net
Re: [Openstack-operators] OpenStack services and ca certificate config entries
> design summit (which operators are officially part of). - jlk

-Sean

--
Sean Dague
http://dague.net
[Openstack-operators] Deprecation of in tree EC2 API in Nova for Kilo release
The following review for Kilo deprecates the EC2 API in Nova - https://review.openstack.org/#/c/150929/

There are a number of reasons for this. The EC2 API has been slowly rotting in the Nova tree: it never was highly tested, it implements a substantially older version of what AWS has, and it currently can't work with any recent releases of the boto library (due to implementing an extremely old version of auth). This has given the misunderstanding that it's a first-class supported feature in OpenStack, which it hasn't been in quite some time. Deprecating honestly communicates where we stand.

There is a new stackforge project which is getting some activity now - https://github.com/stackforge/ec2-api. The intent and hope is that it is the path forward for the portion of the community that wants this feature, and that efforts will be focused there.

Comments are welcomed, but we've attempted to get more people engaged to address these issues over the last 18 months, and never really had anyone step up. Without some real maintainers of this code in Nova (and tests somewhere in the community) it's really no longer viable.

-Sean

--
Sean Dague
http://dague.net