[openstack-dev] testing performance/latency of various components?
Is there a straightforward way to determine where the time is going when I run a command from novaclient?

For instance, if I run "nova list", that's going to run novaclient, which will send a message to nova-api, which wakes up and does some processing and sends a message to nova-conductor, which wakes up and does some processing and then calls out to the database, which wakes up and does some processing and sends the response back to nova-conductor, etc... And the messaging goes via rabbit, so there are additional messaging and wake-ups involved there.

Suppose "nova list" takes A amount of time to run...is there a standard way to determine how much time was spent in nova-api, in nova-conductor, in the database, in rabbit, how much was due to scheduler delays, etc.? Or would I be looking at needing to instrument everything to get that level of detail?

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
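(For a rough first cut, the client-side portion can be captured with a simple timing wrapper. This is only a minimal sketch with hypothetical names; it measures wall-clock time around each call, so breaking the total down per service would still require instrumentation inside nova-api, nova-conductor and friends.)

    import time
    from contextlib import contextmanager

    @contextmanager
    def timed(label, results):
        """Record wall-clock time for one phase under 'label'."""
        start = time.time()
        try:
            yield
        finally:
            results[label] = time.time() - start

    results = {}
    with timed("nova list (total)", results):
        # a real test would call novaclient here, e.g. client.servers.list()
        pass
    print(results)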
Re: [openstack-dev] [all] The future of the integrated release
On 08/20/2014 07:21 AM, Jay Pipes wrote:
> Hi Thierry, thanks for the reply. Comments inline. :)
>
> On 08/20/2014 06:32 AM, Thierry Carrez wrote:
>> If we want to follow your model, we probably would have to dissolve programs as they stand right now, and have blessed categories on one side, and teams on the other (with projects from some teams being blessed as the current solution).
>
> Why do we have to have "blessed" categories at all? I'd like to think of a day when the TC isn't picking winners or losers at all. Level the playing field and let the quality of the projects themselves determine the winner in the space. Stop the incubation and graduation madness and change the role of the TC to instead play an advisory role to upcoming (and existing!) projects on the best ways to integrate with other OpenStack projects, if integration is something that is natural for the project to work towards.

It seems to me that at some point you need to have a recommended way of doing things, otherwise it's going to be *really hard* for someone to bring up an OpenStack installation.

We already run into issues with something as basic as competing SQL databases. If every component has several competing implementations and none of them are "official" how many more interaction issues are going to trip us up?

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all] The future of the integrated release
On 08/20/2014 09:54 PM, Clint Byrum wrote:
> Excerpts from Jay Pipes's message of 2014-08-20 14:53:22 -0700:
>> On 08/20/2014 05:06 PM, Chris Friesen wrote:
>>> On 08/20/2014 07:21 AM, Jay Pipes wrote:
>>>> Hi Thierry, thanks for the reply. Comments inline. :)
>>>> On 08/20/2014 06:32 AM, Thierry Carrez wrote:
>>>>> If we want to follow your model, we probably would have to dissolve programs as they stand right now, and have blessed categories on one side, and teams on the other (with projects from some teams being blessed as the current solution).
>>>> Why do we have to have "blessed" categories at all? I'd like to think of a day when the TC isn't picking winners or losers at all. Level the playing field and let the quality of the projects themselves determine the winner in the space. Stop the incubation and graduation madness and change the role of the TC to instead play an advisory role to upcoming (and existing!) projects on the best ways to integrate with other OpenStack projects, if integration is something that is natural for the project to work towards.
>>> It seems to me that at some point you need to have a recommended way of doing things, otherwise it's going to be *really hard* for someone to bring up an OpenStack installation.
>> Why can't there be multiple recommended ways of setting up an OpenStack installation? Matter of fact, in reality, there already are multiple recommended ways of setting up an OpenStack installation, aren't there? There's multiple distributions of OpenStack, multiple ways of doing bare-metal deployment, multiple ways of deploying different message queues and DBs, multiple ways of establishing networking, multiple open and proprietary monitoring systems to choose from, etc. And I don't really see anything wrong with that.
> This is an argument for loosely coupling things, rather than tightly integrating things. You will almost always win my vote with that sort of movement, and you have here.

+1. I mostly agree, but I think we should distinguish between things that are "possible", and things that are "supported". Arguably, anything that is "supported" should be tested as part of the core infrastructure and documented in the core OpenStack documentation.

>>> We already run into issues with something as basic as competing SQL databases.
>> If the TC suddenly said "Only MySQL will be supported", that would not mean that the greater OpenStack community would be served better. It would just unnecessarily take options away from deployers.

On the other hand, if the community says explicitly "we only test with sqlite and MySQL" then that sends a signal that anyone wanting to use something else should plan on doing additional integration testing. I've stumbled over some of these issues, and it's no fun. (There's still an open bug around the fact that sqlite behaves differently than MySQL with respect to regex.)

>> IMO, OpenStack should be about choice. Choice of hypervisor, choice of DB and MQ infrastructure, choice of operating systems, choice of storage vendors, choice of networking vendors.
> Err, uh. I think OpenStack should be about users. If having 400 choices means users are just confused, then OpenStack becomes nothing and everything all at once. Choices should be part of the whole not when 1% of the market wants a choice, but when 20%+ of the market _requires_ a choice.

I agree. If there are too many choices without enough documentation as to why someone would choose one over the other, or insufficient testing such that some choices are theoretically valid but broken in practice, then it's less useful for the end users.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Server Groups - remove VM from group?
On 08/25/2014 11:25 AM, Joe Cropper wrote:
> I was thinking something simple such as only allowing the add operation to succeed IFF no policies are found to be in violation... and then nova wouldn't need to get into all the complexities you mention?

Personally I would be in favour of this...nothing fancy, just add it if it already meets all the criteria. This is basically just a database operation so I would hope we could make it reliable in the face of simultaneous things going on with the instance.

> And remove would be fairly straightforward as well since no constraints would need to be checked.

Agreed.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Is the BP approval process broken?
On 08/28/2014 01:44 PM, Jay Pipes wrote:
> On 08/27/2014 09:04 PM, Dugger, Donald D wrote:
>> I understand that reviews are a burden and very hard but it seems wrong that a BP with multiple positive reviews and no negative reviews is dropped because of what looks like indifference.
>
> I would posit that this is not actually indifference. The reason that there may not have been >1 +2 from a core team member may very well have been that the core team members did not feel that the blueprint's priority was high enough to put before other work, or that the core team members did not have the time to comment on the spec (due to them not feeling the blueprint had the priority to justify the time to do a full review).

The overall "scheduler-lib" Blueprint is marked with a "high" priority at "http://status.openstack.org/release/". Hopefully that would apply to sub-blueprints as well.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Is the BP approval process broken?
On 08/28/2014 02:25 PM, Jay Pipes wrote:
> On 08/28/2014 04:05 PM, Chris Friesen wrote:
>> On 08/28/2014 01:44 PM, Jay Pipes wrote:
>>> On 08/27/2014 09:04 PM, Dugger, Donald D wrote:
>>>> I understand that reviews are a burden and very hard but it seems wrong that a BP with multiple positive reviews and no negative reviews is dropped because of what looks like indifference.
>>>
>>> I would posit that this is not actually indifference. The reason that there may not have been >1 +2 from a core team member may very well have been that the core team members did not feel that the blueprint's priority was high enough to put before other work, or that the core team members did not have the time to comment on the spec (due to them not feeling the blueprint had the priority to justify the time to do a full review).
>>
>> The overall "scheduler-lib" Blueprint is marked with a "high" priority at "http://status.openstack.org/release/". Hopefully that would apply to sub-blueprints as well.
>
> a) There are no sub-blueprints to that scheduler-lib blueprint

I guess my terminology was wrong. The original email referred to "https://review.openstack.org/#/c/89893/" as the "crucial BP that needs to be implemented". That is part of "https://review.openstack.org/#/q/topic:bp/isolate-scheduler-db,n,z", which is listed as a Gerrit topic in the "scheduler-lib" blueprint that I pointed out.

> b) If there were sub-blueprints, that does not mean that they would necessarily take the same priority as their parent blueprint

I'm not sure how that would work. If we have a high-priority blueprint depending on work that is considered low-priority, that would seem to set up a classic priority inversion scenario.

> c) There's no reason priorities can't be revisited when necessary

Sure, but in that case it might be a good idea to make the updated priority explicit.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Is the BP approval process broken?
On 08/28/2014 03:02 PM, Jay Pipes wrote:
> I understand your frustration about the silence, but the silence from core team members may actually be a loud statement about where their priorities are.

Or it could be that they haven't looked at it, aren't aware of it, or haven't been paying attention. I think it would be better to make feedback explicit and remove any uncertainty/ambiguity.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Is the BP approval process broken?
On 08/28/2014 04:01 PM, Joe Gordon wrote:
> On Thu, Aug 28, 2014 at 2:43 PM, Alan Kavanagh <alan.kavan...@ericsson.com> wrote:
>> I share Donald's points here, I believe what would help is to clearly describe in the Wiki the process and workflow for the BP approval process and build in this process how to deal with discrepancies/disagreements and build timeframes for each stage and process of appeal etc. The current process would benefit from some fine tuning and helping to build safe guards and time limits/deadlines so folks can expect responses within a reasonable time and not be left waiting in the cold.
>
> This is a resource problem, the nova team simply does not have enough people doing enough reviews to make this possible.

All the more reason to make it obvious which reviews are not being addressed in a timely fashion. (I'm thinking something akin to the order screen at a fast food restaurant that starts blinking in red and beeping if an order hasn't been filled in a certain amount of time.)

Perhaps by making it clear that reviews are a bottleneck this will actually help to address the problem.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On 09/05/2014 03:52 AM, Daniel P. Berrange wrote:
> So my biggest fear with a model where each team had their own full Nova tree and did large pull requests, is that we'd suffer major pain during the merging of large pull requests, especially if any of the merges touched common code. It could make the pull requests take a really long time to get accepted into the primary repo. By contrast with split out git repos per virt driver code, we will only ever have 1 stage of code review for each patch. Changes to common code would go straight to main nova common repo and so get reviewed by the experts there without delay, avoiding the 2nd stage of review from merge requests.

Why treat things differently? It seems to me that even in the first scenario you could still send common code changes straight to the main nova repo. Then the pulls from the virt repo would literally only touch the virt code in the common repo.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] anyone using RabbitMQ with active/active mirrored queues?
Hi, I see that the OpenStack high availability guide is still recommending the active/standby method of configuring RabbitMQ. Has anyone tried using active/active with mirrored queues as recommended by the RabbitMQ developers? If so, what problems did you run into? Thanks, Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] expected behaviour of _report_state() on rabbitmq failover
Hi,

I'm running Havana and I'm seeing some less-than-ideal behaviour on rabbitmq failover. I'd like to figure out if this is expected behaviour or if something is going wrong.

We're running rabbitmq in active/standby mode with DRBD storage. On the controller the timeline looks like this:

07:25:38 rabbit starts going inactive on current active controller
07:25:47 rabbit fully disabled
07:25:50 rabbit starts going active on other controller
07:25:53 rabbit fully enabled

On the compute node, I've included the nova-compute logs below. We can see that the "model server went away" log doesn't come out until long after the connection was reset, and then the "Recovered model server connection" log comes out right away.

In a controlled shutdown case like this would we expect the RPC "call" response to come back successfully? If so, then why didn't it? If not, then wouldn't it make sense to fail the call immediately rather than wait for it to time out?

As it stands, it seems that waiting for the RPC call to time out blocks _report_state() from running again in report_interval seconds, which delays the service update until the RPC timeout period expires.

2014-08-18 07:25:46.091 16126 ERROR nova.openstack.common.rpc.common [-] Failed to consume message from queue: [Errno 104] Connection reset by peer
2014-08-18 07:25:46.092 16126 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.204.2:5672
2014-08-18 07:25:46.113 16126 ERROR nova.openstack.common.rpc.common [-] Failed to consume message from queue: [Errno 104] Connection reset by peer
2014-08-18 07:25:46.113 16126 TRACE nova.openstack.common.rpc.common error: [Errno 104] Connection reset by peer
2014-08-18 07:25:46.113 16126 TRACE nova.openstack.common.rpc.common
2014-08-18 07:25:46.114 16126 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.204.2:5672
2014-08-18 07:25:46.132 16126 ERROR nova.openstack.common.rpc.common [-] AMQP server on 192.168.204.2:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.
2014-08-18 07:25:46.132 16126 ERROR nova.openstack.common.rpc.common [-] AMQP server on 192.168.204.2:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 1 seconds.
2014-08-18 07:25:47.134 16126 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.204.2:5672
2014-08-18 07:25:47.140 16126 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.204.2:5672
2014-08-18 07:25:47.146 16126 ERROR nova.openstack.common.rpc.common [-] AMQP server on 192.168.204.2:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 3 seconds.
2014-08-18 07:25:47.147 16126 ERROR nova.openstack.common.rpc.common [-] AMQP server on 192.168.204.2:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 3 seconds.
2014-08-18 07:25:50.148 16126 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.204.2:5672
2014-08-18 07:25:50.154 16126 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.204.2:5672
2014-08-18 07:25:50.160 16126 ERROR nova.openstack.common.rpc.common [-] AMQP server on 192.168.204.2:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 5 seconds.
2014-08-18 07:25:50.161 16126 ERROR nova.openstack.common.rpc.common [-] AMQP server on 192.168.204.2:5672 is unreachable: [Errno 111] ECONNREFUSED. Trying again in 5 seconds.
2014-08-18 07:25:55.161 16126 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.204.2:5672
2014-08-18 07:25:55.167 16126 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.204.2:5672
2014-08-18 07:25:55.186 16126 INFO nova.openstack.common.rpc.common [-] Connected to AMQP server on 192.168.204.2:5672
2014-08-18 07:25:55.190 16126 INFO nova.openstack.common.rpc.common [-] Connected to AMQP server on 192.168.204.2:5672
2014-08-18 07:26:15.793 16126 ERROR nova.openstack.common.rpc.common [-] Failed to publish message to topic 'conductor': [Errno 104] Connection reset by peer
2014-08-18 07:26:15.793 16126 TRACE nova.openstack.common.rpc.common error: [Errno 104] Connection reset by peer
2014-08-18 07:26:15.793 16126 TRACE nova.openstack.common.rpc.common
2014-08-18 07:26:15.795 16126 INFO nova.openstack.common.rpc.common [-] Reconnecting to AMQP server on 192.168.204.2:5672
2014-08-18 07:26:15.803 16126 INFO nova.openstack.common.rpc.common [-] Connected to AMQP server on 192.168.204.2:5672
2014-08-18 07:26:15.812 16126 AUDIT nova.compute.resource_tracker [-] Auditing locally available compute resources
2014-08-18 07:26:16.101 16126 AUDIT nova.compute.resource_tracker [-] Free ram (MB): 25812, per-node: [11318, 14794], numa nodes:2
2014-08-18 07:26:16.101 16126 AUDIT nova.compute.resource_tracker [-] Free disk (GB): 36
2014-08-18 07:26:16.101 16126 AUDIT nova.compute.resource_tracker [-] Free vcpus: 289, free per-node float vcpus
Re: [openstack-dev] [nova] expected behaviour of _report_state() on rabbitmq failover
On 09/10/2014 02:13 PM, Chris Friesen wrote: As it stands, it seems that waiting for the RPC call to time out blocks _report_state() from running again in report_interval seconds, which delays the service update until the RPC timeout period expires. Just noticed something... In the case of _report_state(), does it really make sense to wait 60 seconds for RPC timeout when we're going to send a new service update in 10 seconds anyway? More generally, the RPC timeout on the service_update() call should be less than or equal to service.report_interval for the service. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
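(A sketch of that idea with hypothetical names; the real Havana-era conductor RPC API differs, so treat the per-call timeout override as an assumption. The point is simply to cap the service_update() timeout at the reporting interval so a hung call cannot starve the next report.)

    def report_state(conductor_rpcapi, service, report_interval=10, rpc_timeout=60):
        # Hypothetical signature: assume service_update() accepts a per-call
        # timeout override, as suggested above.
        timeout = min(report_interval, rpc_timeout)
        try:
            conductor_rpcapi.service_update(service, timeout=timeout)
        except Exception:
            # Timed out or lost the connection; the next periodic run retries
            # with fresh state rather than blocking for the full rpc_timeout.
            pass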
Re: [openstack-dev] [nova][neutron][cinder] Averting the Nova crisis by splitting out virt drivers
On 09/10/2014 02:44 AM, Daniel P. Berrange wrote:
> On Tue, Sep 09, 2014 at 05:14:43PM -0700, Stefano Maffulli wrote:
>> I have the impression this idea has been circling around for a while but for some reason or another (like lack of capabilities in gerrit and other reasons) we never tried to implement it. Maybe it's time to think about an implementation. We have been thinking about mentors https://wiki.openstack.org/wiki/Mentors, maybe that's a way to go? Sub-team with +1.5 scoring capabilities?
>
> I think that setting up subteams is necessary to stop us imploding but I don't think it is enough. As long as we have one repo we're forever going to have conflict & contention in deciding which features to accept, which is a big factor in problems today.

If each hypervisor team mostly only modifies their own code, why would there be conflict?

As I see it, the only causes for conflict would be in the shared code, and you'd still need to sort out the issues with the shared code even if you split out the individual drivers into separate repos.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][neutron][cinder] Averting the Nova crisis by splitting out virt drivers
On 09/10/2014 04:11 PM, Jay Pipes wrote:
> On 09/10/2014 05:55 PM, Chris Friesen wrote:
>> If each hypervisor team mostly only modifies their own code, why would there be conflict? As I see it, the only causes for conflict would be in the shared code, and you'd still need to sort out the issues with the shared code even if you split out the individual drivers into separate repos.
>
> a) Sorting out the common code is already accounted for in Dan B's original proposal -- it's a prerequisite for the split.

Fair enough.

> b) The conflict Dan is speaking of is around the current situation where we have a limited core review team bandwidth and we have to pick and choose which virt driver-specific features we will review. This leads to bad feelings and conflict.

Why does the core review team need to review virt driver-specific stuff? If we're looking at making subteams responsible for the libvirt code then it really doesn't matter where the code resides as long as everyone knows who owns it.

> c) It's the impact to the CI and testing load that I see being the biggest benefit to the split-out driver repos. Patches proposed to the XenAPI driver shouldn't have the Hyper-V CI tests run against the patch. Likewise, running libvirt unit tests in the VMWare driver repo doesn't make a whole lot of sense, and all of these tests add a not-insignificant load to the overall upstream and external CI systems. The long wait time for tests to come back means contributors get frustrated, since many reviewers tend to wait until Jenkins returns some result before they review. All of this leads to increased conflict that would be somewhat ameliorated by having separate code repos for the virt drivers.

Has anyone considered making the CI tools smarter? Maybe have a way to determine which tests to run based on the code being modified? If someone makes a change in nova/virt/libvirt there's a limited set of tests that make sense to run...there's no need to run xen/hyperv/vmware CI tests against it, for example. Similarly, there's no need to run all the nova-scheduler, neutron, server groups, etc. tests.

That way we could give a subteam real responsibility for a specific area of the code, and submissions to that area of the code would not be gated by bugs in unrelated areas of the code.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
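(As an illustration of the "run only the relevant tests" idea, a toy path-to-jobs mapping might look like the sketch below. The prefixes and job names are made up for the example, not actual infra configuration.)

    import subprocess

    # Hypothetical mapping from source subtrees to the CI jobs worth running.
    JOB_MAP = {
        "nova/virt/libvirt/": ["libvirt-unit", "libvirt-ci"],
        "nova/virt/xenapi/": ["xenapi-ci"],
        "nova/virt/hyperv/": ["hyperv-ci"],
        "nova/scheduler/": ["scheduler-unit"],
    }

    def jobs_for_change(base="origin/master"):
        changed = subprocess.check_output(
            ["git", "diff", "--name-only", base]).decode().splitlines()
        jobs = set()
        for path in changed:
            matched = [job for prefix, names in JOB_MAP.items()
                       if path.startswith(prefix) for job in names]
            # Anything outside a known subtree falls back to the full matrix.
            jobs.update(matched if matched else ["full-matrix"])
        return sorted(jobs)

    print(jobs_for_change())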
Re: [openstack-dev] [nova] Server Groups - remove VM from group?
On 09/10/2014 04:16 PM, Russell Bryant wrote:
> On Sep 10, 2014, at 2:03 PM, Joe Cropper wrote:
>> I would like to craft up a blueprint proposal for Kilo to add two simple extensions to the existing server group APIs that I believe will make them infinitely more usable in any ‘real world’ scenario. I’ll put more details in the proposal, but in a nutshell:
>> 1. Adding a VM to a server group
>> Only allow it to succeed if its policy wouldn’t be violated by the addition of the VM
>
> I'm not sure that determining this at the time of the API request is possible due to the parallel and async nature of the system. I'd love to hear ideas on how you think this might be done, but I'm really not optimistic and would rather just not go down this road.

I can see a possible race against another instance booting into the group, or another already-running instance being added to the group. I think the solution is to do the update as an atomic database transaction.

It seems like it should be possible to create a database operation that does the following in a single transaction:
- look up the hosts for the instances in the group
- check that the scheduler policy would be satisfied (at least for the basic affinity/anti-affinity policies)
- add the instance to the group

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
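(A rough sketch of that single-transaction check-and-add, using SQLAlchemy with hypothetical models standing in for nova's real schema, and ignoring the isolation-level details a production version would have to get right.)

    from sqlalchemy.orm import Session

    def add_instance_to_group(session: Session, group, instance):
        with session.begin():
            # Look up the hosts of current group members inside the transaction.
            member_hosts = {m.host for m in group.members}
            if group.policy == "anti-affinity" and instance.host in member_hosts:
                raise ValueError("addition would violate anti-affinity policy")
            if group.policy == "affinity" and member_hosts and instance.host not in member_hosts:
                raise ValueError("addition would violate affinity policy")
            group.members.append(instance)  # committed atomically with the check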
Re: [openstack-dev] anyone using RabbitMQ with active/active mirrored queues?
On 09/11/2014 12:50 AM, Jesse Pretorius wrote:
> On 10 September 2014 17:20, Chris Friesen <chris.frie...@windriver.com> wrote:
>> I see that the OpenStack high availability guide is still recommending the active/standby method of configuring RabbitMQ. Has anyone tried using active/active with mirrored queues as recommended by the RabbitMQ developers? If so, what problems did you run into?
>
> I would recommend that you ask this question on the openstack-operators list as you'll likely get more feedback.

Thanks for the suggestion, will do.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On 09/11/2014 12:02 PM, Dan Prince wrote: Maybe I'm impatient (I totally am!) but I see much of the review slowdown as a result of the feedback loop times increasing over the years. OpenStack has some really great CI and testing but I think our focus on not breaking things actually has us painted into a corner. We are losing our agility and the review process is paying the price. At this point I think splitting out the virt drivers would be more of a distraction than a help. I think the only solution to feedback loop times increasing is to scale the review process, which I think means giving more people responsibility for a smaller amount of code. I don't think it's strictly necessary to split the code out into a totally separate repo, but I do think it would make sense to have changes that are entirely contained within a virt driver be reviewed only by developers of that virt driver rather than requiring review by the project as a whole. And they should only have to pass a subset of the CI testing--that way they wouldn't be held up by gating bugs in other areas. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Server Groups - remove VM from group?
On 09/11/2014 03:01 PM, Jay Pipes wrote:
> On 09/11/2014 04:51 PM, Matt Riedemann wrote:
>> On 9/10/2014 6:00 PM, Russell Bryant wrote:
>>> On 09/10/2014 06:46 PM, Joe Cropper wrote:
>>>> Hmm, not sure I follow the concern, Russell. How is that any different from putting a VM into the group when it’s booted as is done today? This simply defers the ‘group insertion time’ to some time after the VM’s been initially spawned, so I’m not sure this creates any more race conditions than what’s already there [1].
>>>> [1] Sure, the to-be-added VM could be in the midst of a migration or something, but that would be pretty simple to check (make sure its task state is None or some such).
>>>
>>> The way this works at boot is already a nasty hack. It does policy checking in the scheduler, and then has to re-do some policy checking at launch time on the compute node. I'm afraid of making this any worse. In any case, it's probably better to discuss this in the context of a more detailed design proposal.
>>
>> This [1] is the hack you're referring to right?
>> [1] http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2.b3#n1297
>
> That's the hack *I* had in the back of my mind. I think that's the only boot hack related to server groups.

I was thinking that it should be possible to deal with the race more cleanly by recording the selected compute node in the database at the time of scheduling. As it stands, the host is implicitly encoded in the compute node to which we send the boot request and nobody else knows about it.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Server Groups - remove VM from group?
On 09/11/2014 04:22 PM, Joe Cropper wrote:
> I would be a little wary about the DB level locking for stuff like that — it’s certainly doable, but also comes at the expense of things behaving ever-so-slightly different from DBMS to DBMS. Perhaps there are multiple “logical efforts” here—i.e., adding some APIs and cleaning up existing code.

I think you could actually do it without locking. Pick a host as we do now, write it into the database, then check whether you hit a race and if so then clear that host from the database and go back to the beginning.

Basically the same algorithm that we do now, but all contained within the scheduler code.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
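(A self-contained toy version of that optimistic write-then-check loop; the dict stands in for the instance/group tables, and the names are illustrative only.)

    assignments = {}  # instance_id -> chosen host, stand-in for the DB table

    def policy_ok(member_ids, policy):
        hosts = [assignments[i] for i in member_ids if i in assignments]
        if policy == "anti-affinity":
            return len(hosts) == len(set(hosts))
        return len(set(hosts)) <= 1  # affinity

    def schedule(instance_id, member_ids, policy, candidate_hosts):
        for host in candidate_hosts:
            assignments[instance_id] = host        # write the tentative choice
            if policy_ok(member_ids, policy):      # re-check after our write
                return host
            del assignments[instance_id]           # hit a race or bad pick; retry
        raise RuntimeError("no host satisfies the group policy")

    print(schedule("vm-1", ["vm-1"], "anti-affinity", ["host-a", "host-b"]))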
Re: [openstack-dev] [nova][neutron][cinder] Averting the Nova crisis by splitting out virt drivers
On 09/12/2014 04:59 PM, Joe Gordon wrote:
> On Thu, Sep 11, 2014 at 2:18 AM, Daniel P. Berrange <berra...@redhat.com> wrote:
>> FYI, for Juno at least I really don't consider that even the libvirt driver got acceptable review times in any sense. The pain of waiting for reviews in libvirt code I've submitted this cycle is what prompted me to start this thread. All the virt drivers are suffering way more than they should be, but those without core team representation suffer
>
> Can't you replace the word 'libvirt code' with 'nova code' and this would still be true? Do you think landing virt driver code is harder than landing non virt driver code? If so do you have any numbers to back this up?
>
> If the issue here is 'landing code in nova is too painful', then we should discuss solving that more generalized issue first, and maybe we conclude that pulling out the virt drivers gets us the most bang for our buck. But unless we have that more general discussion, saying the right fix for that is to spend a large amount of time working specifically on virt driver related issues seems premature.

I agree that this is a nova issue in general, though I suspect that the virt drivers have quite separate developer communities so maybe they feel the pain more clearly. But I think the solution is the same in both cases:

1) Allow people to be responsible for a subset of the nova code (scheduler, virt, conductor, compute, or even just a single driver). They would have significant responsibility for that area of the code. This would serve several purposes--people with deep domain-specific knowledge would be able to review code that touches that domain, and it would free up the nova core team to look at the higher-level picture. For changes that cross domains, the people from the relevant domains would need to be involved.

2) Modify the gate tests such that changes that are wholly contained within a single area of code are not blocked by gate-blocking bugs in unrelated areas of the code.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] how to debug RPC timeout issues?
Hi,

I'm running Havana, and I just tried a testcase involving doing six simultaneous live-migrations. It appears that the migrations succeeded, but two of the instances got stuck with a status of "MIGRATING" because of RPC timeouts:

2014-09-16 20:35:07.376 12493 INFO nova.notifier [-] processing ERROR _post_live_migration
2014-09-16 20:35:07.390 12493 INFO nova.openstack.common.rpc.common [-] Connected to AMQP server on 192.168.204.2:5672
2014-09-16 20:35:07.396 12493 ERROR nova.openstack.common.loopingcall [-] in fixed duration looping call
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall Traceback (most recent call last):
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall File "./usr/lib64/python2.7/site-packages/nova/openstack/common/loopingcall.py", line 78, in _inner
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall File "./usr/lib64/python2.7/site-packages/nova/virt/libvirt/driver.py", line 4874, in wait_for_live_migration
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall File "./usr/lib64/python2.7/site-packages/nova/exception.py", line 90, in wrapped
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall File "./usr/lib64/python2.7/site-packages/nova/exception.py", line 73, in wrapped
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall File "./usr/lib64/python2.7/site-packages/nova/compute/manager.py", line 4558, in _post_live_migration
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall File "./usr/lib64/python2.7/site-packages/nova/compute/rpcapi.py", line 517, in post_live_migration_at_destination
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall File "./usr/lib64/python2.7/site-packages/nova/rpcclient.py", line 85, in call
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall File "./usr/lib64/python2.7/site-packages/nova/rpcclient.py", line 63, in _invoke
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall File "./usr/lib64/python2.7/site-packages/nova/openstack/common/rpc/proxy.py", line 131, in call
2014-09-16 20:35:07.396 12493 TRACE nova.openstack.common.loopingcall Timeout: Timeout while waiting on RPC response - topic: "compute.compute-0", RPC method: "post_live_migration_at_destination" info: ""

Looking at the migration destination compute node, I see it in turn getting stuck on an RPC timeout:

2014-09-16 20:35:32.216 12510 INFO nova.notifier [req-a8389c8d-7e5b-4f08-8669-ce14594e3863 24a43342a5ae4e31bd431aef2d395ebc 02bf771ab40f4ecea6fe42135c5a09bc] processing ERROR post_live_migration_at_destination
2014-09-16 20:35:32.247 12510 INFO nova.openstack.common.rpc.common [req-a8389c8d-7e5b-4f08-8669-ce14594e3863 24a43342a5ae4e31bd431aef2d395ebc 02bf771ab40f4ecea6fe42135c5a09bc] Connected to AMQP server on 192.168.204.2:5672
2014-09-16 20:35:32.248 12510 INFO nova.openstack.common.rpc.common [req-2f7bd481-9aed-4d66-97dd-3075407282dd 24a43342a5ae4e31bd431aef2d395ebc 02bf771ab40f4ecea6fe42135c5a09bc] Connected to AMQP server on 192.168.204.2:5672
2014-09-16 20:35:32.270 12510 ERROR nova.openstack.common.rpc.amqp [req-a8389c8d-7e5b-4f08-8669-ce14594e3863 24a43342a5ae4e31bd431aef2d395ebc 02bf771ab40f4ecea6fe42135c5a09bc] Exception during message handling
2014-09-16 20:35:32.270 12510 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2014-09-16 20:35:32.270 12510 TRACE nova.openstack.common.rpc.amqp File "./usr/lib64/python2.7/site-packages/nova/openstack/common/rpc/amqp.py", line 466, in _process_data
2014-09-16 20:35:32.270 12510 TRACE nova.openstack.common.rpc.amqp File "./usr/lib64/python2.7/site-packages/nova/openstack/common/rpc/dispatcher.py", line 172, in dispatch
2014-09-16 20:35:32.270 12510 TRACE nova.openstack.common.rpc.amqp File "./usr/lib64/python2.7/site-packages/nova/exception.py", line 90, in wrapped
2014-09-16 20:35:32.270 12510 TRACE nova.openstack.common.rpc.amqp File "./usr/lib64/python2.7/site-packages/nova/exception.py", line 73, in wrapped
2014-09-16 20:35:32.270 12510 TRACE nova.openstack.common.rpc.amqp File "./usr/lib64/python2.7/site-packages/nova/compute/manager.py", line 4647, in post_live_migration_at_destination
2014-09-16 20:35:32.270 12510 TRACE nova.openstack.common.rpc.amqp File "./usr/lib64/python2.7/site-packages/nova/virt/libvirt/driver.py", line 5065, in post_live_migration_at_destination
2014-09-16 20:35:32.270 12510 TRACE nova.openstack.common.rpc.amqp File "./usr/lib64/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3709, in to_xml
2014-09-16 20:35:32.270 12510 TRACE nova.openstack.common.rpc.amqp File "./usr/lib64/python2.7/site-packages/nova/virt/libvirt/driver.py", line 3465, in get_guest_config
2014-09-16 20:35:32.270 12510 TRACE nova.openstack.common.rpc.amqp File "./usr/lib64/python2.7/site-packages/nova/compute/manager.py", line 383, in insta
Re: [openstack-dev] [Openstack-operators] [nova] about resize the instance
On 11/8/2018 5:30 AM, Rambo wrote:
> When I resize the instance, the compute node reports "libvirtError: internal error: qemu unexpectedly closed the monitor: 2018-11-08T09:42:04.695681Z qemu-kvm: cannot set up guest memory 'pc.ram': Cannot allocate memory". Has anyone seen this situation? The ram_allocation_ratio is set to 3 in nova.conf and the total memory is 125G. When I use the "nova hypervisor-show server" command, the compute node's free_ram_mb is -45G. Could this be the result of excessive use of memory? Can you give me some suggestions about this? Thank you very much.

I suspect that you simply don't have any available memory on that system. What is your kernel overcommit setting on the host?

If /proc/sys/vm/overcommit_memory is set to 2, then try either changing the overcommit ratio or setting it to 1 to see if that makes a difference.

Chris

__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
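(For reference, a quick Linux-only way to inspect those settings from the host; the interpretation in the comments matches the suggestion above.)

    def read_int(path):
        with open(path) as f:
            return int(f.read().strip())

    mode = read_int("/proc/sys/vm/overcommit_memory")   # 2 = strict accounting
    ratio = read_int("/proc/sys/vm/overcommit_ratio")   # only consulted when mode == 2
    print("overcommit_memory=%d overcommit_ratio=%d" % (mode, ratio))
    if mode == 2:
        print("strict overcommit: large qemu allocations may fail; "
              "consider mode 1 or a bigger ratio/swap")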
Re: [openstack-dev] [Nova] Does Nova really need an SQL database?
On 11/18/2013 06:47 PM, Joshua Harlow wrote:
> An idea related to this, what would need to be done to make the DB have the exact state that a compute node is going through (and therefore the scheduler would not make unreliable/racey decisions, even when there are multiple schedulers). It's not like we are dealing with a system which can not know the exact state (as long as the compute nodes are connected to the network, and a network partition does not occur).

How would you synchronize the various schedulers with each other? Suppose you have multiple scheduler nodes all trying to boot multiple instances each. Even if at the start of the process each scheduler has a perfect view of the system, each scheduler would need to have a view of what every other scheduler is doing in order to not make racy decisions.

I see a few options:

1) Push scheduling down into the database itself. Implement scheduler filters as SQL queries or stored procedures.

2) Get rid of the DB for scheduling. It looks like people are working on this: https://blueprints.launchpad.net/nova/+spec/no-db-scheduler

3) Do multi-stage scheduling. Do a "tentative" schedule, then try and update the DB to reserve all the necessary resources. If that fails, someone got there ahead of you so try again with the new data.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
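(Option 3 can be expressed as a compare-and-swap style UPDATE, so a concurrent reservation makes the statement match zero rows and the scheduler knows to retry. A sketch using SQLAlchemy with hypothetical table/column names:)

    import sqlalchemy as sa

    def try_reserve(engine, hostname, ram_mb):
        stmt = sa.text(
            "UPDATE compute_nodes "
            "SET free_ram_mb = free_ram_mb - :ram "
            "WHERE hypervisor_hostname = :host AND free_ram_mb >= :ram")
        with engine.begin() as conn:
            result = conn.execute(stmt, {"ram": ram_mb, "host": hostname})
            reserved = result.rowcount == 1  # zero rows: someone got there first
        return reserved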
Re: [openstack-dev] [Nova] Does Nova really need an SQL database?
On 11/19/2013 12:35 PM, Clint Byrum wrote: Each scheduler process can own a different set of resources. If they each grab instance requests in a round-robin fashion, then they will fill their resources up in a relatively well balanced way until one scheduler's resources are exhausted. At that time it should bow out of taking new instances. If it can't fit a request in, it should kick the request out for retry on another scheduler. In this way, they only need to be in sync in that they need a way to agree on who owns which resources. A distributed hash table that gets refreshed whenever schedulers come and go would be fine for that. That has some potential, but at high occupancy you could end up refusing to schedule something because no one scheduler has sufficient resources even if the cluster as a whole does. This gets worse once you start factoring in things like heat and instance groups that will want to schedule whole sets of resources (instances, IP addresses, network links, cinder volumes, etc.) at once with constraints on where they can be placed relative to each other. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Does Nova really need an SQL database?
On 11/19/2013 12:27 PM, Joshua Harlow wrote:
> Personally I would prefer #3 from the below. #2 I think will still have to deal with consistency issues, just switching away from a DB doesn't make magical ponies and unicorns appear (in-fact it can potentially make the problem worse if its done incorrectly - and its pretty easy to get it wrong IMHO). #1 could also work, but then u hit a vertical scaling limit (works if u paid oracle for there DB or IBM for DB2 I suppose). I prefer #3 since I think it is honestly needed under all solutions.

Personally I think we need a combination of #3 (resource reservation) with something else to speed up scheduling. We have multiple filters that currently loop over all the compute nodes, gathering a bunch of data from the DB and then ignoring most of that data while doing some simple logic in python. There is really no need for the bulk of the resource information to be stored in the DB.

The compute nodes could broadcast their current state to all scheduler nodes, and the scheduler nodes could reserve resources directly from the compute nodes (triggering an update of all the other scheduler nodes).

Failing that, it should be possible to push at least some of the filtering down into the DB itself. Stuff like ramfilter or cpufilter would be trivial (and fast) as an SQL query.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
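(For example, the RAM and CPU filter checks could collapse into one query along these lines; the column names are hypothetical and the allocation ratios would come from config.)

    import sqlalchemy as sa

    FILTER_QUERY = sa.text(
        "SELECT hypervisor_hostname FROM compute_nodes "
        "WHERE memory_mb * :ram_ratio - memory_mb_used >= :req_ram "
        "  AND vcpus * :cpu_ratio - vcpus_used >= :req_vcpus")

    def candidate_hosts(engine, req_ram, req_vcpus, ram_ratio=1.5, cpu_ratio=16.0):
        with engine.connect() as conn:
            rows = conn.execute(FILTER_QUERY, {
                "ram_ratio": ram_ratio, "cpu_ratio": cpu_ratio,
                "req_ram": req_ram, "req_vcpus": req_vcpus,
            })
            return [row[0] for row in rows]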
Re: [openstack-dev] [Nova] Does Nova really need an SQL database?
On 11/19/2013 01:51 PM, Clint Byrum wrote:
> Excerpts from Chris Friesen's message of 2013-11-19 11:37:02 -0800:
>> On 11/19/2013 12:35 PM, Clint Byrum wrote:
>>> Each scheduler process can own a different set of resources. If they each grab instance requests in a round-robin fashion, then they will fill their resources up in a relatively well balanced way until one scheduler's resources are exhausted. At that time it should bow out of taking new instances. If it can't fit a request in, it should kick the request out for retry on another scheduler. In this way, they only need to be in sync in that they need a way to agree on who owns which resources. A distributed hash table that gets refreshed whenever schedulers come and go would be fine for that.
>>
>> That has some potential, but at high occupancy you could end up refusing to schedule something because no one scheduler has sufficient resources even if the cluster as a whole does.
>
> I'm not sure what you mean here. What resource spans multiple compute hosts?

Imagine the cluster is running close to full occupancy, each scheduler has room for 40 more instances. Now I come along and issue a single request to boot 50 instances. The cluster has room for that, but none of the schedulers do.

>> This gets worse once you start factoring in things like heat and instance groups that will want to schedule whole sets of resources (instances, IP addresses, network links, cinder volumes, etc.) at once with constraints on where they can be placed relative to each other.
>
> Actually that is rather simple. Such requests have to be serialized into a work-flow. So if you say "give me 2 instances in 2 different locations" then you allocate 1 instance, and then another one with 'not_in_location(1)' as a condition.

Actually, you don't want to serialize it, you want to hand the whole set of resource requests and constraints to the scheduler all at once. If you do them one at a time, then early decisions made with less-than-complete knowledge can result in later scheduling requests failing due to being unable to meet constraints, even if there are actually sufficient resources in the cluster. The "VM ensembles" document at https://docs.google.com/document/d/1bAMtkaIFn4ZSMqqsXjs_riXofuRvApa--qo4UTwsmhw/edit?pli=1 has a good example of how one-at-a-time scheduling can cause spurious failures.

And if you're handing the whole set of requests to a scheduler all at once, then you want the scheduler to have access to as many resources as possible so that it has the highest likelihood of being able to satisfy the request given the constraints.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Does Nova really need an SQL database?
On 11/20/2013 10:06 AM, Soren Hansen wrote: 2013/11/18 Mike Spreitzer : There were some concerns expressed at the summit about scheduler scalability in Nova, and a little recollection of Boris' proposal to keep the needed state in memory. I also heard one guy say that he thinks Nova does not really need a general SQL database, that a NOSQL database with a bit of denormalization and/or client-maintained secondary indices could suffice. I may have said something along those lines. Just to clarify -- since you started this post by talking about scheduler scalability -- the main motivation for using a non-SQL backend isn't scheduler scalability, it's availability and resilience. I just don't accept the failure modes that MySQL (and derivatives such as Galera) impose. Has that sort of thing been considered before? It's been talked about on and off since... well, probably since we started this project. What is the community's level of interest in exploring that? The session on adding a backend using a non-SQL datastore was pretty well attended. What about a hybrid solution? There is data that is only used by the scheduler--for performance reasons maybe it would make sense to store that information in RAM as described at https://blueprints.launchpad.net/nova/+spec/no-db-scheduler For the rest of the data, perhaps it could be persisted using some alternate backend. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Does Nova really need an SQL database?
On 11/21/2013 10:52 AM, Stephen Gran wrote:
> On 21/11/13 15:49, Chris Friesen wrote:
>> On 11/21/2013 02:58 AM, Soren Hansen wrote:
>>> 2013/11/20 Chris Friesen:
>>>> What about a hybrid solution? There is data that is only used by the scheduler--for performance reasons maybe it would make sense to store that information in RAM as described at https://blueprints.launchpad.net/nova/+spec/no-db-scheduler
>
> I suspect that a large performance gain could be had by 2 fairly simple changes:
> a) Break the scheduler in two, so that the chunk of code receiving updates from the compute nodes can't block the chunk of code scheduling instances.
> b) Use a memcache backend instead of SQL for compute resource information.
>
> My fear with keeping data local to a scheduler instance is that local state destroys scalability.

"a" and "b" are basically what is described in the blueprint above. Your fear is addressed by having the compute nodes broadcast their resource information to all scheduler instances.

As I see it, the scheduler could then make a tentative scheduling decision, attempt to reserve the resources from the compute node (which would trigger the compute node to send updated resource information to all the scheduler instances), and assuming it got the requested resources it could then proceed with bringing up the resource.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Does Nova really need an SQL database?
On 11/21/2013 02:58 AM, Soren Hansen wrote: 2013/11/20 Chris Friesen : What about a hybrid solution? There is data that is only used by the scheduler--for performance reasons maybe it would make sense to store that information in RAM as described at https://blueprints.launchpad.net/nova/+spec/no-db-scheduler For the rest of the data, perhaps it could be persisted using some alternate backend. What would that solve? The scheduler has performance issues. Currently the design is suboptimal--the compute nodes write resource information to the database, then the scheduler pulls a bunch of data out of the database, copies it over into python, and analyzes it in python to do the filtering. For large clusters this can lead to significant time spent scheduling. Based on the above, for performance reasons it would be beneficial for the scheduler to have the necessary data already available in python rather than needing to pull it out of the database. For other uses of the database people are proposing alternatives to SQL in order to get reliability. I don't have any experience with that so I have no opinion on it. But as long as the data is sitting on-disk (or even in a database process instead of in the scheduler process) it's going to slow down the scheduler. If the primary consumer of a give piece of data (free ram, free cpu, free disk, etc) is the scheduler, then I think it makes sense for the compute nodes to report it directly to the scheduler. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] future fate of nova-network?
On 11/22/2013 02:29 PM, Russell Bryant wrote: I honestly don't understand why openstack@ and openstack-operators@ are different lists. Perhaps openstack@ just needs better use of topic tagging ... Wouldn't openstack@ be the logical place for end-users to hang out, while openstack-operators@ is for the actual cloud providers? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Is there a way for the VM to identify that it is getting booted in OpenStack
On 11/26/2013 07:48 PM, Vijay Venkatachalam wrote: Hi, Is there a way for the VM to identify that it is getting booted in OpenStack? As said in the below mail, once the VM knows it is booting in OpenStack it will alter the boot sequence. What does "getting booted in OpenStack" mean? OpenStack supports multiple hypervisors, so you could have something coming up via kvm/vmware/Xen/baremetal/etc. but they're all "getting booted in OpenStack". Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [heat] Is it time for a v2 Heat API?
On 11/27/2013 11:50 AM, Zane Bitter wrote: Even better would be if we had the keystone domain (instead of the tenant id) incorporated into the endpoint in the keystone catalog and then we could use the tenant^W project *name* in the URL and users would never have to deal with UUIDs and invisible headers again - your server is always at /my-project-name/servers/my-server-name. Where you might expect it to be. That sounds way too logical... Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] curious, why wasn't nova commit 52f6981 backported to grizzly?
Hi, Just wondering why nova commit 52f6981 ("Evacuated instance disk not deleted") wasn't backported to grizzly? The symptoms of this bug are that if you evacuate a server off a compute node that uses local storage then you can never move it back to that compute node because the old files are still lying around. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova][Schduler] Volunteers wanted for a modest proposal for an external scheduler in our lifetime
On 11/28/2013 09:50 AM, Gary Kotton wrote: One option worth thinking about is to introduce a new scheduling driver to nova - this driver will interface with the external scheduler. This will let us define the scheduling API, model etc, without being in the current confines of Nova. This will also enable all of the other modules, for example Cinder to hook into it. I see a couple nice things about this proposal: 1) Going this route means that we're free to mess with the APIs to some extent since they're not really "public" yet. 2) Once we have API parity with the current schedulers all in one place then we'll be able to more easily start extracting common stuff. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] problems with rabbitmq on HA controller failure...anyone seen this?
Hi, We're currently running Grizzly (going to Havana soon) and we're running into an issue where if the active controller is ungracefully killed then nova-compute on the compute node doesn't properly connect to the new rabbitmq server on the newly-active controller node. I saw a bugfix in Folsom (https://bugs.launchpad.net/nova/+bug/718869) to retry the connection to rabbitmq if it's lost, but it doesn't seem to be properly handling this case. Interestingly, killing and restarting nova-compute on the compute node seems to work, which implies that the retry code is doing something less effective than the initial startup. Has anyone doing HA controller setups run into something similar? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] problems with rabbitmq on HA controller failure...anyone seen this?
On 11/29/2013 06:37 PM, David Koo wrote:
> On Nov 29, 02:22:17 PM (Friday), Chris Friesen wrote:
>> We're currently running Grizzly (going to Havana soon) and we're running into an issue where if the active controller is ungracefully killed then nova-compute on the compute node doesn't properly connect to the new rabbitmq server on the newly-active controller node. Interestingly, killing and restarting nova-compute on the compute node seems to work, which implies that the retry code is doing something less effective than the initial startup. Has anyone doing HA controller setups run into something similar?
>>
>> As a followup, it looks like if I wait for 9 minutes or so I see a message in the compute logs:
>>
>> 2013-11-30 00:02:14.756 1246 ERROR nova.openstack.common.rpc.common [-] Failed to consume message from queue: Socket closed
>>
>> It then reconnects to the AMQP server and everything is fine after that. However, any instances that I tried to boot during those 9 minutes stay stuck in the "BUILD" status.
>
> So the rabbitmq server and the controller are on the same node?

Yes, they are.

> My guess is that it's related to this bug 856764 (RabbitMQ connections lack heartbeat or TCP keepalives). The gist of it is that since there are no heartbeats between the MQ and nova-compute, if the MQ goes down ungracefully then nova-compute has no way of knowing. If the MQ goes down gracefully then the MQ clients are notified and so the problem doesn't arise.

Sounds about right.

> We got bitten by the same bug a while ago when our controller node got hard reset without any warning! It came down to this bug (which, unfortunately, doesn't have a fix yet). We worked around this bug by implementing our own crude fix - we wrote a simple app to periodically check if the MQ was alive (write a short message into the MQ, then read it out again). When this fails n times in a row we restart nova-compute. Very ugly, but it worked!

Sounds reasonable. I did notice a kombu heartbeat change that was submitted and then backed out again because it was buggy. I guess we're still waiting on the real fix?

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
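(A loose sketch of that kind of watchdog; the broker URL, queue name and restart command are placeholders. It simply round-trips a message through RabbitMQ and restarts nova-compute after a few consecutive failures.)

    import subprocess
    import time

    from kombu import Connection

    def mq_alive(url="amqp://guest:guest@192.168.204.2:5672//"):
        try:
            with Connection(url, connect_timeout=5) as conn:
                queue = conn.SimpleQueue("mq-watchdog")
                queue.put("ping")
                queue.get(block=True, timeout=5).ack()  # read our own message back
                queue.close()
            return True
        except Exception:
            return False

    failures = 0
    while True:
        failures = 0 if mq_alive() else failures + 1
        if failures >= 3:
            subprocess.call(["service", "nova-compute", "restart"])
            failures = 0
        time.sleep(10)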
Re: [openstack-dev] [Nova][Schduler] Volunteers wanted for a modest proposal for an external scheduler in our lifetime
On 12/02/2013 02:31 PM, Vishvananda Ishaya wrote: I'm going to reopen a can of worms, though. I think the most difficult part of the forklift will be moving stuff out of the existing databases into a new database. Do we really need to move it to a new database for the forklift? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Blueprint: standard specification of guest CPU topology
On 12/03/2013 04:08 AM, Daniel P. Berrange wrote:
> On Tue, Dec 03, 2013 at 01:47:31AM -0800, Gary Kotton wrote:
>> Hi, I think that this information should be used as part of the scheduling decision, that is hosts that are to be selected should be excluded if they do not have the necessary resources available. It will be interesting to know how this is going to fit into the new scheduler that is being discussed.
>
> The CPU topology support shouldn't have any interactions with, nor cause any failures post-scheduling. ie If the host has declared that it has sufficient resources to run a VM with the given vCPU count, then that is sufficient.

What if we want to do more than just specify a number of vCPUs? What if we want to specify that they need to all come from a single NUMA node? Or all from different NUMA nodes? Or that we want (or don't want) them to come from hyperthread siblings, or from different physical sockets?

This sort of thing is less common in the typical cloud space, but for private clouds where the overcommit ratio might be far smaller and performance is more of an issue these options might be more desirable.

Chris

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] anyone aware of networking issues with grizzly live migration of kvm instances?
Hi, We've got a grizzly setup using quantum networking and libvirt/kvm with VIR_MIGRATE_LIVE set. I was live-migrating an instance back and forth between a couple of compute nodes. It worked fine for maybe half a dozen migrations and then after a migration I could no longer ping it. It appears that packets were making it up to the guest but we never saw packets come out of the guest. Rebooting the instance seems to have restored connectivity. Anyone aware of something like this? We're planning on switching to havana when we can, so it'd be nice if this was fixed there. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?
On 12/12/2013 11:02 AM, Clint Byrum wrote: So I'm asking, is there a standard way to determine whether or not a nova-compute is definitely ready to have things scheduled on it? This can be via an API, or even by observing something on the nova-compute host itself. I just need a definitive signal that "the compute host is ready". Is it not sufficient that "nova service-list" shows the compute service as "up"? If not, then maybe we should call that a bug in nova... Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
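A rough sketch of that check with python-novaclient (admin credentials, endpoint, and host name are placeholders; assumes the os-services extension is available):

from novaclient import client as nova_client


def compute_service_up(nova, host):
    # "state" reflects the service heartbeat, "status" whether it is
    # administratively enabled; both need to be good before scheduling.
    for svc in nova.services.list(host=host, binary='nova-compute'):
        return svc.state == 'up' and svc.status == 'enabled'
    return False


nova = nova_client.Client('2', 'admin', 'secret', 'admin',
                          'http://controller:5000/v2.0')
print(compute_service_up(nova, 'compute-0'))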
[openstack-dev] why don't we deal with "claims" when live migrating an instance?
When we create a new instance via _build_instance() or _build_and_run_instance(), in both cases we call instance_claim() to reserve and test for resources. During a cold migration I see us calling prep_resize() which calls resize_claim(). How come we don't need to do something like this when we live migrate an instance? Do we track the hypervisor overhead somewhere in the instance? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Proposal for dd disk i/o performance blueprint of cinder.
On 12/26/2013 01:56 AM, cosmos cosmos wrote: Hello. My name is Rucia, from Samsung SDS. I ran into trouble with volume deleting. I am developing support for big data storage such as hadoop on lvm. Deleting a cinder lvm volume uses dd, which generates full disk I/O, and that high disk I/O affects the other hadoop instances on the same host. It also takes too much time to delete the volume. The cinder volume is 200GB (it holds hadoop master data), and when I delete it using 'dd if=/dev/zero of=$cinder-volume count=100 bs=1M' it takes about 30 minutes. While I think your ionice proposal makes some sense, I think it would be better to avoid the cost of the volume wipe on deletion in the first place. I read a proposal about using thinly-provisioned logical volumes as a way around the cost of wiping the disks, since they zero-fill on demand rather than incur the cost at deletion time. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Proposal for dd disk i/o performance blueprint of cinder.
On 01/15/2014 06:00 PM, Fox, Kevin M wrote: What about a configuration option on the volume for delete type? I can see some possible options:
* None - Don't clear on delete. It's junk data for testing and I don't want to wait.
* Zero - Return zeros from subsequent reads, either by zeroing on delete or by faking zero reads initially.
* Random - Write random data to disk.
* Multipass - Clear out the space in the most secure mode configured. Multiple passes and such.
Interesting idea, but for the "None" case I'd include the possibility that the user will be encrypting their data. Also, I don't see any point for random or multipass given that anyone who really cares about it isn't going to trust their unencrypted data to someone else anyways. Lastly, the "zero on delete" case could conceivably end up costing more since the provider could continue to bill for the volume while it is being deleted. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Proposal for dd disk i/o performance blueprint of cinder.
On 01/15/2014 06:30 PM, Jay S Bryant wrote: There is already an option that can be set in cinder.conf using 'volume_clear=none'. Is there a reason that that option is not sufficient? That option would be for the cloud operator, and it would apply to all volumes on that cinder node. My impression was that Kevin was proposing a per-volume option for the end-user. If the cloud operator charged for the time/bandwidth required to delete the volume, that would cover their costs and give some incentive for customers to mark non-private data as such and save the effort of deleting the data. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Proposal for dd disk i/o performance blueprint of cinder.
On 01/15/2014 11:25 PM, Clint Byrum wrote: Excerpts from Alan Kavanagh's message of 2014-01-15 19:11:03 -0800: Hi Paul, I posted a query to Ironic which is related to this discussion. My thinking was I want to ensure the case you note here (1) "a tenant can not read another tenant's disk". The next (2) was where in Ironic you provision a baremetal server that has an onboard disk as part of the blade provisioned to a given tenant-A. Then when tenant-A finishes his baremetal blade lease and that blade comes back into the pool and tenant-B comes along, I was asking what open source tools guarantee data destruction so that no ghost images or file retrieval is possible? Is that really a path worth going down, given that tenant-A could just drop evil firmware in any number of places, and thus all tenants afterward are owned anyway? Ooh, nice one! :) I suppose the provider could flash to known-good firmware for all firmware on the device in between leases. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] how is resource tracking supposed to work for live migration and evacuation?
Hi, I'm trying to figure out how resource tracking is intended to work for live migration and evacuation. For a while I thought that maybe we were relying on the call to ComputeManager._instance_update() in ComputeManager.post_live_migration_at_destination(). However, in ResourceTracker.update_usage() we see that on a live migration the instance that has just migrated over isn't listed in self.tracked_instances and so we don't actually update its usage. As far as I can see, the current code will just wait for the audit to run at some unknown time in the future and call update_available_resource(), which will add the newly-migrated instance to self.tracked_instances and update the resource usage. From my poking around so far the same thing holds true for evacuation as well. In either case, just waiting for the audit seems somewhat haphazard. Would it make sense to do something like ResourceTracker.instance_claim() during the migration/evacuate and properly track the resources rather than wait for the audit? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
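A very rough sketch of what that might look like on the destination compute node; instance_claim() and _get_resource_tracker() are the existing resource tracker interfaces, but this call site is hypothetical:

def _claim_for_incoming_instance(self, context, instance, node, limits=None):
    # Test for and reserve cpu/ram/disk on this host up front, rather than
    # waiting for the periodic audit to notice the newly-arrived instance.
    rt = self._get_resource_tracker(node)
    return rt.instance_claim(context, instance, limits)

# The returned claim is a context manager, so the migration/evacuate setup
# could run inside "with claim:" and the reserved resources would be
# released automatically if the operation fails part way through.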
Re: [openstack-dev] Proposal for dd disk i/o performance blueprint of cinder.
On 01/16/2014 04:22 PM, Clint Byrum wrote: Excerpts from Fox, Kevin M's message of 2014-01-16 09:29:14 -0800: Yeah, I think the evil firmware issue is separate and should be solved separately. Ideally, there should be a mode you can set the bare metal server into where firmware updates are not allowed. This is useful to more folks then just baremetal cloud admins. Something to ask the hardware vendors for. Yes, I think we call this mode "virtualization". Nice. :) Unfortunately, virtualization still has a pretty heavy price in some use-cases (heavy disk I/O, for instance). Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] "Evil" Firmware
On 01/16/2014 05:12 PM, CARVER, PAUL wrote: Jumping back to an earlier part of the discussion, it occurs to me that this has broader implications. There's some discussion going on under the heading of Neutron with regard to PCI passthrough. I imagine it's under Neutron because of a desire to provide passthrough access to NICs, but given some of the activity around GPU based computing it seems like sooner or later someone is going to try to offer multi-tenant cloud servers with the ability to do GPU based computing if they haven't already. I'd expect that the situation with PCI passthrough may be a bit different, at least in the common case. The usual scenario is to use SR-IOV to have a single physical device expose a bunch of virtual functions, and then a virtual function is passed through into a guest. The physical function is the one with the "real" PCI config space, so as long as the host controls it then there should be minimal risk from the guests since they have limited access via the virtual functions--typically mostly just message-passing to the physical function. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [ironic] Disk Eraser
On 01/17/2014 04:20 PM, Devananda van der Veen wrote: tl;dr, We should not be recycling bare metal nodes between untrusted tenants at this time. There's a broader discussion about firmware security going on, which, I think, will take a while for the hardware vendors to really address. What can the hardware vendors do? Has anyone proposed a meaningful solution for the firmware issue? Given the number of devices (NIC, GPU, storage controllers, etc.) that could potentially have firmware update capabilities it's not clear to me how this could be reliably solved. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova]Why not allow to create a vm directly with two VIF in the same network
On 01/24/2014 08:33 AM, CARVER, PAUL wrote: I agree that I’d like to see a set of use cases for this. This is the second time in as many days that I’ve heard about a desire to have such a thing but I still don’t think I understand any use cases adequately. In the physical world it makes perfect sense, LACP, MLT, Etherchannel/Portchannel, etc. In the virtual world I need to see a detailed description of one or more use cases. Shihanzhang, why don’t you start up an Etherpad or something and start putting together a list of one or more practical use cases in which the same VM would benefit from multiple virtual connections to the same network. If it really makes sense we ought to be able to clearly describe it. One obvious case is if we ever support SR-IOV NIC passthrough. Since that is essentially real hardware, all the "physical world" reasons still apply. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][neutron] PCI pass-through SRIOV
On 01/28/2014 10:55 AM, Jani, Nrupal wrote: While technically it is possible, we as a team can decide about the final recommendation. Given that VFs are going to be used for the high-performance VMs, mixing VMs with virtio & VFs may not be a good option. Initially we can use PF interface for the management traffic and/or VF configuration!! I would expect that it would be fairly common to want to dedicate a VF link for high-speed data plane and use a virtio link for control plane traffic, health checks, etc. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] transactions in openstack REST API?
Has anyone ever considered adding the concept of transaction IDs to the openstack REST API? I'm envisioning a way to handle long-running transactions more cleanly. For example: 1) A user sends a request to live-migrate an instance 2) Openstack acks the request and includes a "transaction ID" in the response. 3) The user can then poll (or maybe listen to notifications) to see whether the transaction is complete or hit an error. I view this as most useful for things that could potentially take a long time to finish--instance creation/deletion/migration/evacuation are obvious, I'm sure there are others. Also, anywhere that we use a "cast" RPC call we'd want to add that call to a list associated with that transaction in the database...that way the transaction is only complete when all the sub-jobs are complete. I've seen some discussion about using transaction IDs to locate logs corresponding to a given transaction, but nothing about the end user being able to query the status of the transaction. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
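The client side of such a scheme might look something like the sketch below; the /tasks resource, its states, and the field names are entirely hypothetical, since nothing like this exists in the current API:

import time


def wait_for_task(client, task_id, timeout=600, poll_interval=2):
    # Poll a hypothetical /tasks/{id} resource until the long-running
    # operation (and all of its sub-jobs) has finished or errored out.
    deadline = time.time() + timeout
    while time.time() < deadline:
        task = client.get('/tasks/%s' % task_id).json()
        if task['state'] in ('complete', 'error'):
            return task
        time.sleep(poll_interval)
    raise RuntimeError('task %s did not finish within %d seconds'
                       % (task_id, timeout))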
Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
On 02/03/2014 12:28 PM, Khanh-Toan Tran wrote: Another though would be the need for Instance Group API [1]. Currently users can only request multiple instances of the same flavors. These requests do not need LP to solve, just placing instances one by one is sufficient. Therefore we need this API so that users can request instances of different flavors, with some relations (constraints) among them. The advantage is that this logic and API will help us add Cinder volumes with ease (not sure how the Cinder-stackers think about it, though). I don't think that the instance group API actually helps here. (I think it's a good idea, just not directly related to this.) I think what we really want is the ability to specify an arbitrary list of instances (or other things) that you want to schedule, each of which may have different image/flavor, each of which may be part of an instance group, a specific network, have metadata which associates with a host aggregate, desire specific PCI passthrough devices, etc. An immediate user of something like this would be heat, since it would let them pass the whole stack to the scheduler in one API call. The scheduler could then take a more holistic view, possibly doing a better fitting job than if the instances are scheduled one-at-a-time. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] transactions in openstack REST API?
On 02/03/2014 01:31 PM, Andrew Laski wrote: On 02/03/14 at 01:10pm, Chris Friesen wrote: Has anyone ever considered adding the concept of transaction IDs to the openstack REST API? I'm envisioning a way to handle long-running transactions more cleanly. For example: 1) A user sends a request to live-migrate an instance 2) Openstack acks the request and includes a "transaction ID" in the response. 3) The user can then poll (or maybe listen to notifications) to see whether the transaction is complete or hit an error. I've called them tasks, but I have a proposal up at https://blueprints.launchpad.net/nova/+spec/instance-tasks-api that is very similar to this. It allows for polling, but doesn't get into notifications. But this is a first step in this direction and it can be expanded upon later. Please let me know if this covers what you've brought up, and add any feedback you may have to the blueprint. That actually looks really good. I like the idea of subtasks for things like live migration. The only real comment I have at this point is that you might want to talk to the "transaction ID" guys and maybe use your task UUID as the transaction ID that gets passed to other services acting on behalf of nova. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
On 02/10/2014 10:54 AM, Khanh-Toan Tran wrote: Heat may orchestrate the provisioning process, but eventually the instances will be passed to Nova-scheduler (Gantt) as separated commands, which is exactly the problem Solver Scheduler wants to correct. Therefore the Instance Group API is needed, wherever it is used (nova-scheduler/Gantt). I'm not sure that this follows. First, the instance groups API is totally separate since we may want to schedule a number of instances simultaneously without them being part of an instance group. Certainly in the case of using instance groups that would be one input into the scheduler, but it's an optional input. Second, there is nothing wrong with booting the instances (or instantiating other resources) as separate commands as long as we support some kind of reservation token. In that model, we would pass a bunch of information about multiple resources to the solver scheduler, have it perform scheduling *and reserve the resources*, then return some kind of resource reservation tokens back to the caller for each resource. The caller could then allocate each resource, pass in the reservation token indicating both that the resources had already been reserved as well as what the specific resource that had been reserved (the compute-host in the case of an instance, for example). Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
On 02/11/2014 03:21 AM, Khanh-Toan Tran wrote: Second, there is nothing wrong with booting the instances (or instantiating other resources) as separate commands as long as we support some kind of reservation token. I'm not sure what a reservation token would do, is it some kind of way of informing the scheduler that the resources will not be instantiated until later? Like a restaurant reservation, it would "claim" the resources for use by someone at a later date. That way nobody else can use them. That way the scheduler would be responsible for determining where the resource should be allocated from, and getting a reservation for that resource. It would not have anything to do with actually instantiating the instance/volume/etc. Let's consider the following example: A user wants to create 2 VMs, a small one with 20 GB RAM, and a big one with 40 GB RAM in a datacenter consisting of 2 hosts: one with 50 GB RAM left, and another with 30 GB RAM left, using Filter Scheduler's default RamWeigher. If we pass the demand as two commands, there is a chance that the small VM arrives first. RamWeigher will put it in the 50 GB RAM host, which will be reduced to 30 GB RAM. Then, when the big VM request arrives, there will be no space left to host it. As a result, the whole demand fails. Now if we can pass the two VMs in one command, SolverScheduler can put their constraints all together into one big LP as follows (x_uv = 1 if VM u is hosted in host v, 0 if not): Yes. So what I'm suggesting is that we schedule the two VMs as one call to the SolverScheduler. The scheduler then gets reservations for the necessary resources and returns them to the caller. This would be sort of like the existing Claim object in nova/compute/claims.py but generalized somewhat to other resources as well. The caller could then boot each instance separately (passing the appropriate reservation/claim along with the boot request). Because the caller has a reservation the core code would know it doesn't need to schedule or allocate resources, that's already been done. The advantage of this is that the scheduling and resource allocation is done separately from the instantiation. The instantiation API could remain basically as-is except for supporting an optional reservation token. That responds to your first point, too. If we don't mind that some VMs are placed and some are not (e.g. they belong to different apps), then it's OK to pass them to the scheduler without an Instance Group. However, if the VMs are together (belong to an app), then we have to put them into an Instance Group. When I think of an "Instance Group", I think of "https://blueprints.launchpad.net/nova/+spec/instance-group-api-extension". Fundamentally, "Instance Groups" describe a runtime relationship between different instances. The scheduler doesn't necessarily care about a runtime relationship, it's just trying to allocate resources efficiently. In the above example, there is no need for those two instances to necessarily be part of an Instance Group--we just want to schedule them both at the same time to give the scheduler a better chance of fitting them both. More generally, the more instances I want to start up the more beneficial it can be to pass them all to the scheduler at once in order to give the scheduler more information. Those instances could be parts of completely independent Instance Groups, or not part of an Instance Group at all...the scheduler can still do a better job if it has more information to work with.
Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][libvirt] Is there anything blocking the libvirt driver from implementing the host_maintenance_mode API?
On 02/20/2014 11:38 AM, Matt Riedemann wrote: On 2/19/2014 4:05 PM, Matt Riedemann wrote: The os-hosts OS API extension [1] showed up before I was working on the project and I see that only the VMware and XenAPI drivers implement it, but was wondering why the libvirt driver doesn't - either no one wants it, or there is some technical reason behind not implementing it for that driver? [1] http://docs.openstack.org/api/openstack-compute/2/content/PUT_os-hosts-v2_updateHost_v2__tenant_id__os-hosts__host_name__ext-os-hosts.html By the way, am I missing something when I think that this extension is already covered if you're: 1. Looking to get the node out of the scheduling loop, you can just disable it with os-services/disable? 2. Looking to evacuate instances off a failed host (or one that's in "maintenance mode"), just use the evacuate server action. In compute/api.py the API.evacuate() routine errors out if self.servicegroup_api.service_is_up(service) is true, which means that you can't evacuate from a compute node that is "disabled", you need to migrate instead. So, the alternative is basically to disable the service, then get a list of all the servers on the compute host, then kick off the migration (either cold or live) of each of the servers. Then because migration uses a "cast" instead of a "call" you need to poll all the migrations for success or late failures. Once you have no failed migrations and no servers running on the host then you're good. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
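As a rough illustration of that workflow with python-novaclient (hostnames are placeholders, error handling is omitted, and the status checks are simplified):

import time


def drain_host(nova, host):
    # Take the host out of scheduling, then push everything off of it.
    nova.services.disable(host, 'nova-compute')
    for server in nova.servers.list(search_opts={'host': host,
                                                 'all_tenants': 1}):
        server.live_migrate()  # let the scheduler pick the target host

    # live_migrate() is a cast, so poll until the host is empty or
    # something has gone wrong.
    while True:
        remaining = nova.servers.list(search_opts={'host': host,
                                                   'all_tenants': 1})
        if not remaining:
            return
        errored = [s.id for s in remaining if s.status == 'ERROR']
        if errored:
            raise RuntimeError('migration failed for %s' % errored)
        time.sleep(5)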
[openstack-dev] [nova] why doesn't _rollback_live_migration() always call rollback_live_migration_at_destination()?
I'm looking at the live migration rollback code and I'm a bit confused. When setting up a live migration we unconditionally run ComputeManager.pre_live_migration() on the destination host to do various things including setting up networks on the host. If something goes wrong with the live migration in ComputeManager._rollback_live_migration() we will only call self.compute_rpcapi.rollback_live_migration_at_destination() if we're doing block migration or volume-backed migration that isn't shared storage. However, looking at ComputeManager.rollback_live_migration_at_destination(), I also see it cleaning up networking as well as block device. What happens if we have a shared-storage instance that we try to migrate and fail and end up rolling back? Are we going to end up with messed-up networking on the destination host because we never actually cleaned it up? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Future of the Nova API
On 02/24/2014 04:01 PM, Morgan Fainberg wrote: TL;DR, “don’t break the contract”. If we are seriously making incompatible changes (and we will be regardless of the direction) the only reasonable option is a new major version. Agreed. I don't think we can possibly consider making backwards-incompatible changes without changing the version number. We could stay with V2 and make as many backwards-compatible changes as possible using a minor version. This could include things like adding support for unified terminology as long as we *also* continue to support the old terminology. The downside of this is that the code gets messy. On the other hand, if we need to make backwards incompatible changes then we need to bump the version number. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Future of the Nova API
On 02/24/2014 04:59 PM, Sean Dague wrote: So, that begs a new approach. Because I think at this point even if we did put out Nova v3, there can never be a v4. It's too much, too big, and doesn't fit in the incremental nature of the project. Does it necessarily need to be that way though? Maybe we bump the version number every time we make a non-backwards-compatible change, even if it's just removing an API call that has been deprecated for a while. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Future of the Nova API
On 02/24/2014 05:17 PM, Sean Dague wrote: On 02/24/2014 06:13 PM, Chris Friesen wrote: On 02/24/2014 04:59 PM, Sean Dague wrote: So, that begs a new approach. Because I think at this point even if we did put out Nova v3, there can never be a v4. It's too much, too big, and doesn't fit in the incremental nature of the project. Does it necessarily need to be that way though? Maybe we bump the version number every time we make a non-backwards-compatible change, even if it's just removing an API call that has been deprecated for a while. So I'm not sure how this is different than the keep v2 and use microversioning suggestion that is already in this thread. It differs in that it allows the user to determine whether the changes are forwards or backwards compatible. For instance, you might use an API version that looks like {major}.{minor}.{bugfix} with the following rules: A new bugfix release is both forwards and backwards compatible. A new minor release is backwards compatible. So code written against version x.y will work with version x.y+n. New minor releases would generally add functionality. A new major release is not necessarily backwards compatible. Code written against version x may not work with version x+1. New major releases remove or change functionality. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
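Expressed as code, the compatibility rules above boil down to something like this (a sketch, assuming the version is carried as a (major, minor, bugfix) tuple):

def client_is_compatible(client_version, server_version):
    # Client code written against client_version talking to a server
    # advertising server_version.
    c_major, c_minor, _c_bugfix = client_version
    s_major, s_minor, _s_bugfix = server_version
    if c_major != s_major:
        # A major bump may remove or change functionality.
        return False
    # Minor releases only add functionality, so the server just needs to be
    # at least as new as what the client was written against; bugfix
    # releases are compatible in both directions.
    return s_minor >= c_minor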
Re: [openstack-dev] [nova] why doesn't _rollback_live_migration() always call rollback_live_migration_at_destination()?
On 02/25/2014 05:15 AM, John Garbutt wrote: On 24 February 2014 22:14, Chris Friesen wrote: What happens if we have a shared-storage instance that we try to migrate and fail and end up rolling back? Are we going to end up with messed-up networking on the destination host because we never actually cleaned it up? I had some WIP code up to clean that up, as part of the move to conductor; it's massively confusing right now. Looks like a bug to me. I suspect the real issue is that some parts of: self.driver.rollback_live_migration_at_destination(context, instance, network_info, block_device_info) need more information about whether or not shared storage is being used. What's the timeframe on the move to conductor? I'm looking at fixing up the resource tracking over a live migration (currently we just rely on the audit fixing things up whenever it gets around to running) but to make that work properly I need to unconditionally run rollback code on the destination. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] need advice on how to supply automated testing with bugfix patch
I'm in the process of putting together a bug report and a patch for properly handling resource tracking on live migration. The change involves code that will run on the destination compute node in order to properly account for the resources that the instance to be migrated will consume. Testing it manually is really simple...start with an instance on one compute node, check the hypervisor stats on the destination node, trigger a live-migration, and immediately check the hypervisor stats again. With the current code the hypervisor doesn't update until the audit runs, with the patch it updates right away. I can see how to do a tempest testcase for this, but I don't have a good handle on how to set this up as a unit test. I *think* it should be possible to modify _test_check_can_live_migrate_destination() but it would mean setting up fake resource tracking and adding fake resources (cpu/memory/disk) to the fake instance being fake migrated and I don't have any experience with that. Anyone have any suggestions? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Future of the Nova API
On 02/26/2014 04:50 PM, Dan Smith wrote: So if we make backwards incompatible changes we really need a major version bump. Minor versions don't cut it, because the expectation is you have API stability within a major version. I disagree. If the client declares support for it, I think we can very reasonably return new stuff. If we take what we have today in v2 and call that 2, then we could make novaclient send this header: Accept: application/json;version=2 Then, we have a common method in the api code called get_client_version(req), which returns 2 if it's missing, otherwise it returns the version the client declared. When we want to return something new, we do so only if get_client_version(req) >= 3. I think that if we did that, we could support tasks in v2, properly, today. As I see it, you're not actually disagreeing with using a major version bump to deal with backwards incompatible changes. What you're really suggesting is having the client explicitly send version support in a header, and if they don't then we would assume version 2. I don't see how this is fundamentally different than a scheme where the path has "v3" in it instead of "v2". The two are essentially equivalent, either way the client indicates which version it supports. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
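A minimal sketch of that helper, assuming the version is carried in the Accept header exactly as shown above (the parsing details are made up):

def get_client_version(req, default=2):
    # Look for something like "Accept: application/json;version=4" and
    # fall back to the original v2 behaviour when nothing is declared.
    accept = req.headers.get('Accept', '')
    for part in accept.split(';'):
        part = part.strip()
        if part.startswith('version='):
            try:
                return int(part.split('=', 1)[1])
            except ValueError:
                break
    return default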
Re: [openstack-dev] [nova] Future of the Nova API
On 02/27/2014 08:43 AM, Dan Smith wrote: So I think once we start returning different response codes, or completely different structures (such as the tasks change will be), it doesn't matter if we make the change in effect by invoking /v2 prefix or /v3 prefix or we look for a header. Its a major api revision. I don't think we should pretend otherwise. I think you're missing my point. The current /v2 and /v3 versions are about 10,000 revisions apart. I'm saying we have the client declare support for a new version every time we need to add something new, not to say that they support what we currently have as /v3. So it would be something like: version=v2: Current thing version=v3: added simple task return for server create version=v4: added the event extension version=v5: added a new event for cinder to the event extension version=v6: changed 200 to 202 for three volumes calls ...etc Sure, but that's still functionally equivalent to using the /v2 prefix. So we could chuck the current /v3 code and do: /v2: Current thing /v3: invalid, not supported /v4: added simple task return for server create /v5: added the event extension /v6: added a new event for cinder to the event extension and it would be equivalent. And arguably, anything that is a pure "add" could get away with either a minor version or not touching the version at all. Only "remove" or "modify" should have the potential to break a properly-written application. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Future of the Nova API
On 02/27/2014 06:00 PM, Alex Xu wrote: Does that mean our code would look like the code below?

if client_version > 2:
    ...
elif client_version > 3:
    ...
elif client_version > 4:
    ...
elif client_version > 5:
    ...
elif client_version > 6:
    ...

And we need to test each version... That looks bad... I don't think the code would look like that. Each part of the API could look at the version separately. And each part of the API only needs to check the client version if it has made a backwards-incompatible change. So a part of the API that only made one backwards-incompatible change at version 3 would only need one check:

if client_version >= 3:
    do_newer_something()
else:
    do_something()

Maybe some other part of the API made a change at v6 (assuming global versioning). That part of the API would also only need one check:

if client_version >= 6:
    do_newer_something()
else:
    do_something()

Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] inconsistent naming? node vs host vs vs hypervisor_hostname vs OS-EXT-SRV-ATTR:host
Hi, I've been working with OpenStack for a while now but I'm still a bit fuzzy on the precise meaning of some of the terminology. It seems reasonably clear that a "node" is a computer running at least one component of an Openstack system. However, "nova service-list" talks about the "host" that a given service runs on. Shouldn't that be "node"? Normally "host" is used to distinguish from "guest", but that doesn't really make sense for a dedicated controller node. "nova show" reports "OS-EXT-SRV-ATTR:host" and "OS-EXT-SRV-ATTR:hypervisor_hostname" for an instance. What is the distinction between the two and how do they relate to OpenStack "nodes" or the "host" names in "nova service-list"? "nova hypervisor-list" uses the term "hypervisor hostname", but "nova hypervisor-stats" talks about "compute nodes". Is this distinction accurate or should they both use the hypervisor terminology? What is the distinction between hypervisor/host/node? "nova host-list" reports "host_name", but seems to include all services. Does "host_name" correspond to host, hypervisor_host, or node? And just to make things interesting, the other "nova host-*" commands only work on compute hosts, so maybe "nova host-list" should only output info for systems running nova-compute? Thanks, Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] inconsistent naming? node vs host vs vs hypervisor_hostname vs OS-EXT-SRV-ATTR:host
On 02/28/2014 11:38 AM, Jiang, Yunhong wrote: One reason of the confusion is, in some virt driver (maybe xenapi or vmwareapi), one compute service manages multiple node. Okay, so in the scenario above, is the nova-compute service running on a "node" or a "host"? (And if it's a "host", then what is the "compute node"?) What is the distinction between "OS-EXT-SRV-ATTR:host" and "OS-EXT-SRV-ATTR:hypervisor_hostname" in the above case? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Future of the Nova API
On 03/03/2014 08:14 AM, Steve Gordon wrote: I would be interested in your opinion on the impact of a V2 version release which had backwards incompatibility in only one area - and that is input validation. So only apps/SDKs which are currently misusing the API (I think the most common problem would be sending extraneous data which is currently ignored) would be adversely affected. Other cases where where the API was used correctly would be unaffected. In this kind of scenario would we need to maintain the older V2 version where there is poor input validation as long? Or would the V2 version with strong input validation be sufficient? This is a tricky one because people who have applications or SDKs that are misusing the API are unlikely to realize they are misusing it until you actually make the switch to stricter validation by default and they start getting errors back. We also don't as far as I know have data at a deep enough level on exactly how people are using the APIs to base a decision on - though perhaps some of the cloud providers running OpenStack might be able to help here. I think at best we'd be able to evaluate the common/well known SDKs to ensure they aren't guilty (and if they are, to fix it) to gain some level of confidence but suspect that breaking the API contract in this fashion would still warrant a similar period of deprecation from the point of view of distributors and cloud providers. Would there be any point in doing it incrementally? Maybe initially the command would still be accepted but it would print an error log indicating that the command contained extraneous data. That way someone could monitor the logs and see how much extraneous data is being injected, but also contact the owner of the software and notify them of the problem. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
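A sketch of that incremental step (the names and the call site are invented; the idea is just to log instead of reject while operators gather data):

import logging

LOG = logging.getLogger(__name__)


def warn_about_extra_properties(body, allowed_properties, context):
    # Accept the request as before, but record who is sending fields that
    # strict validation would eventually reject.
    extra = set(body) - set(allowed_properties)
    if extra:
        LOG.warning("Request from project %s contained extraneous "
                    "properties %s; these will be rejected once strict "
                    "input validation is enabled",
                    context.project_id, sorted(extra))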
[openstack-dev] [nova] how to regenerate doc/api_samples?
How do I regenerate the doc/api_samples tests if I change the corresponding template? The instructions in nova/tests/functional/api_samples/README.rst say to run "GENERATE_SAMPLES=True tox -epy27 nova.tests.unit.integrated", but that path doesn't exist anymore. I suspect the instructions should have been updated when the templates were moved under nova/tests/functional. Thanks, Chris __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] required libvirtd/qemu versions for numa support?
Hi, I'm interested in the recent work around NUMA support for guest instances (https://blueprints.launchpad.net/nova/+spec/virt-driver-numa-placement), but I'm having some difficulty figuring out what versions of libvirt and qemu are required. From the research that I've done it seems like qemu 2.1 might be required, but I've been unable to find a specific version listed in the nova requirements or in the openstack global requirements. Is it there and I just can't find it? If it's not specified, and yet openstack relies on it, perhaps it should be added. (Or at least documented somewhere.) Thanks, Chris __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Heat] looking to add support for server groups to heat...any comments?
I'm looking to add support for server groups to heat. I've got working code, but I thought I'd post the overall design here in case people had objections. Basically, what I propose is to add a "class NovaServerGroup" resource. Currently it would only support a "policy" property to store the scheduler policy for the server group. The scheduler policy would not support updating on the fly. The "LaunchConfiguration" and "Instance" classes would be extended with an optional "ServerGroup" property. In the "Instance" class if the "ServerGroup" property is set then the group name is added to the scheduler_hints when building the instance. The "Server" class would be extended with an optional "server_group" property. If it is set then the group name is added to the scheduler_hints when building the server. All in all, its around a hundred lines of code. Any comments? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
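For the curious, a stripped-down sketch of what such a resource plugin might look like. This is not the actual submission; the Heat plugin interfaces and the novaclient server_groups calls shown here are assumptions and have shifted between releases:

from heat.engine import constraints
from heat.engine import properties
from heat.engine import resource


class NovaServerGroup(resource.Resource):
    """A server group whose only property is the scheduler policy."""

    PROPERTIES = (POLICY,) = ('policy',)

    properties_schema = {
        POLICY: properties.Schema(
            properties.Schema.STRING,
            'Scheduler policy for the server group',
            default='affinity',
            constraints=[constraints.AllowedValues(['affinity',
                                                    'anti-affinity'])],
            update_allowed=False,
        ),
    }

    def handle_create(self):
        group = self.nova().server_groups.create(
            name=self.physical_resource_name(),
            policies=[self.properties[self.POLICY]])
        self.resource_id_set(group.id)

    def handle_delete(self):
        if self.resource_id is not None:
            self.nova().server_groups.delete(self.resource_id)


def resource_mapping():
    return {'OS::Nova::ServerGroup': NovaServerGroup}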
Re: [openstack-dev] [Heat] looking to add support for server groups to heat...any comments?
On 04/25/2014 11:01 AM, Mike Spreitzer wrote: Zane Bitter wrote on 04/25/2014 12:36:00 PM: > On 25/04/14 12:23, Chris Friesen wrote: More important is Zane's following question. > > The "Server" class would be extended with an optional "server_group" > > property. If it is set then the group name is added to the > > scheduler_hints when building the server. > > Given that we already expose the scheduler_hints directly, can you talk > about why it would be advantageous to have a separate property as well? > (e.g. syntax would be really finicky?) I was thinking it'd be more intuitive for the end-user (and more future-proof if novaclient changes), but maybe I'm wrong. In the version I have currently it looks something like this:

  cirros_server1:
    type: OS::Nova::Server
    properties:
      name: cirros1
      image: 'cirros'
      flavor: 'm1.tiny'
      server_group: { get_resource: my_heat_group }

In the nova boot command we pass the group uuid like this: --hint group=e4cf5dea-4831-49a1-867d-e263f2579dd0 If we were to make use of the scheduler hints, how would that look? Something like this? (I'm not up to speed on my YAML, so forgive me if this isn't quite right.) And how would this look if we wanted to specify other scheduler hints as well?

  cirros_server1:
    type: OS::Nova::Server
    properties:
      name: cirros1
      image: 'cirros'
      flavor: 'm1.tiny'
      scheduler_hints: {"group": { get_resource: my_heat_group }}

Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat] looking to add support for server groups to heat...any comments?
On 04/25/2014 12:00 PM, Zane Bitter wrote: On 25/04/14 13:50, Chris Friesen wrote: In the nova boot command we pass the group uuid like this: --hint group=e4cf5dea-4831-49a1-867d-e263f2579dd0 If we were to make use of the scheduler hints, how would that look? Something like this? (I'm not up to speed on my YAML, so forgive me if this isn't quite right.) And how would this look if we wanted to specify other scheduler hints as well?

  cirros_server1:
    type: OS::Nova::Server
    properties:
      name: cirros1
      image: 'cirros'
      flavor: 'm1.tiny'
      scheduler_hints: {"group": { get_resource: my_heat_group }}

Something like that (I don't think you need the quotes around "group"). Or, equivalently:

  cirros_server1:
    type: OS::Nova::Server
    properties:
      name: cirros1
      image: 'cirros'
      flavor: 'm1.tiny'
      scheduler_hints:
        group: { get_resource: my_heat_group }

Okay...assuming it works like that then that looks fine to me. If we go this route then the changes are confined to a single new file. Given that, do we need a blueprint or can I just submit the code for review once I port it to the current codebase? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat] looking to add support for server groups to heat...any comments?
On 04/26/2014 09:41 PM, Jay Lau wrote: Just noticed this email, I have already filed a blueprint related to this topic https://blueprints.launchpad.net/heat/+spec/vm-instance-group-support My idea is: can we add a new field such as "PlacementPolicy" to AutoScalingGroup? If the value is affinity, then when heat engine creates the AutoScalingGroup, it will first create a server group with the affinity policy; then when creating VM instances for the AutoScalingGroup, heat engine will pass the server group id as a scheduler hint so as to make sure all the VM instances in the AutoScalingGroup are created with the affinity policy.

  resources:
    WorkloadGroup:
      type: AWS::AutoScaling::AutoScalingGroup
      properties:
        AvailabilityZones: ["nova"]
        LaunchConfigurationName: {Ref: LaunchConfig}
        PlacementPolicy: ["affinity"]
        MaxSize: 3
        MinSize: 2

While I personally like this sort of idea from the perspective of simplifying things for heat users, I see two problems. First, my impression is that heat tries to provide a direct mapping of nova resources to heat resources. Using a property of a heat resource to trigger the creation of a nova resource would not fit that model. Second, it seems less well positioned for exposing possible server group enhancements in nova. For example, one enhancement that has been discussed is to add a server group option to make the group scheduling policy a weighting factor if it can't be satisfied as a filter. With the server group as an explicit resource there is a natural way to extend it. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Proposal: remove the server groups feature
On 04/25/2014 03:15 PM, Jay Pipes wrote: There are myriad problems with the above user experience and implementation. Let me explain them. 1. The user isn't creating a "server group" when they issue a nova server-group-create call. They are creating a policy and calling it a group. Cognitive dissonance results from this mismatch. I actually don't think this is true. From my perspective they are actually creating a group, and then when booting servers they can be added into the group. The group happens to have a policy, it is not only a policy. 2. There's no way to add an existing server to this "group". In the original API there was a way to add existing servers to the group. This didn't make it into the code that was submitted. It is however supported by the instance group db API in nova. 3. There's no way to remove members from the group In the original API there was a way to remove members from the group. This didn't make it into the code that was submitted. 4. There's no way to manually add members to the server group Isn't this the same as item 2? 5. The act of telling the scheduler to place instances near or away from some other instances has been hidden behind the server group API, which means that users doing a nova help boot will see a --group option that doesn't make much sense, as it doesn't describe the scheduling policy activity. There are many things hidden away that affect server booting...metadata matching between host aggregates and flavor extra specs, for instance. As I understand it, originally the concept of "server groups" was more broad. They supported multiple policies, arbitrary group metadata, etc. The scheduler policy was only one of the things that could be associated with a group. This is why the underlying database structure is more complicated than necessary for the current set of supported operations. What we have currently is sort of a "dumbed-down" version but now that we have the basic support we can start adding in additional functionality as desired. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Proposal: remove the server groups feature
On 04/28/2014 06:58 AM, Steve Gordon wrote: - Original Message - Create two new options to nova boot: --near-tag and --not-near-tag The first would tell the scheduler to place the new VM near other VMs having a particular "tag". The latter would tell the scheduler to place the new VM *not* near other VMs with a particular tag. Would we continue to grow this set of arguments in response to the addition of new policies, how much do we expect this to grow? The two most likely additions I can think of are "soft"/"best effort" versions of the current two, are there any other proposals/ideas out there - I know we're a creative bunch ;)? One logical extension that came up previously is a max group size, maybe expressed as a quota or something. 1. There's no need to have any "server group" object any more. Servers have a set of tags (key/value pairs in v2/v3 API) that may be used to identify a type of server. The activity of launching an instance would now have options for the user to indicate their affinity preference, which removes the cognitive dissonance that happens due to the user needing to know what a server group is (a policy, not a group). Would the user's affinity preference stay with the instance for consideration in future operations post-boot (either now or in a future extension of this functionality)? Whichever way it's implemented, we need to preserve the boot time scheduler constraints so that any time we reschedule (migration, evacuation, resize, etc.) the constraints will be re-evaluated. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Proposal: remove the server groups feature
On 04/28/2014 11:22 AM, Dan Smith wrote: 2. There's no way to add an existing server to this "group". In the original API there was a way to add existing servers to the group. This didn't make it into the code that was submitted. It is however supported by the instance group db API in nova. 3. There's no way to remove members from the group In the original API there was a way to remove members from the group. This didn't make it into the code that was submitted. Well, it didn't make it in because it was broken. If you add an instance to a group after it's running, a migration may need to take place in order to keep the semantics of the group. That means that for a while the policy will be being violated, and if we can't migrate the instance somewhere to satisfy the policy then we need to either drop it back out, or be in violation. Either some additional states (such as being queued for inclusion in a group, etc) may be required, or some additional footnotes on what it means to be in a group might have to be made. I think your comment actually applies to adding existing instances to a group. There's no good reason not to allow removing instances from a group. As for the case of addition, we could start with something simple...if adding an instance to a group would violate the group scheduling policy, then raise an exception. It was for the above reasons, IIRC, that we decided to leave that bit out since the semantics and consequences clearly hadn't been fully thought-out. Obviously they can be addressed, but I fear the result will be ... ugly. I think there's a definite possibility that leaving out those dynamic functions will look more desirable than an actual implementation. Your idea of "pending group membership" doesn't sound too ugly. That said, I would expect "adding existing instances to a group" to be something that would be done under fairly well-controlled circumstances. In that case I think it would be reasonable to push the work of managing any migrations onto whoever is trying to create a group from existing instances. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Heat] looking to add support for server groups to heat...any comments?
On 04/30/2014 03:41 PM, Mike Spreitzer wrote: Chris Friesen wrote on 04/28/2014 10:44:46 AM: > Using a property of a heat resource > to trigger the creation of a nova resource would not fit that model. For the sake of your argument, let's pretend that the new ASG blueprint has been fully implemented. That means an ASG is an ordinary virtual resource. In all likelihood the implementation will generate templates and make nested stacks. I think it is fairly natural to suppose that the generated template could include a Nova server group. > Second, it seems less well positioned for exposing possible server group > enhancements in nova. For example, one enhancement that has been > discussed is to add a server group option to make the group scheduling > policy a weighting factor if it can't be satisfied as a filter. With > the server group as an explicit resource there is a natural way to > extend it. Abstractly an autoscaling group is a sub-class of "group of servers" (ignoring the generalization of "server" in the relevant cases), so it would seem natural to me that the properties of an autoscaling group would include the properties of a server group. As the latter evolves, so would the former. OTOH, I find nothing particularly bad with doing it as you suggest. BTW, this is just the beginning. What about resources of type AWS::CloudFormation::Stack? What about Trove and Sahara? If we go with what Zane suggested (using the already-exposed scheduler_hints) then by implementing a single "server group" resource we basically get support for server groups "for free" in any resource that exposes scheduler hints. That seems to me to be an excellent reason to go that route rather than modifying all the different group-like resources that heat supports. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova][Heat] Custom Nova Flavor creation through Heat (pt.2)
On 05/05/2014 10:51 AM, Steve Gordon wrote: In addition extra specifications may denote the passthrough of additional devices, adding another dimension. This seems likely to be the case in the use case outlined in the original thread [1]. Thanks, Steve [1] http://lists.openstack.org/pipermail/openstack-dev/2013-November/018744.html Agreed. The ability to set arbitrary metadata on the flavor means that you could realistically have many different flavors all with identical virtual hardware but different metadata. As one example, the flavor metadata can be used to match against host aggregate metadata. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] [Heat] Custom Nova Flavor creation through Heat (pt.2)
On 05/05/2014 11:40 AM, Solly Ross wrote: One thing that I was discussing with @jaypipes and @dansmith over on IRC was the possibility of breaking flavors down into separate components -- i.e have a disk flavor, a CPU flavor, and a RAM flavor. This way, you still get the control of the size of your building blocks (e.g. you could restrict RAM to only 2GB, 4GB, or 16GB), but you avoid exponential flavor explosion by separating out the axes. I like this idea because it allows for greater flexibility, but I think we'd need to think carefully about how to expose it via horizon--maybe separate tabs within the overall "flavors" page? As a simplifying view you could keep the existing flavors which group all of them, while still allowing instances to specify each one separately if desired. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] [Heat] Custom Nova Flavor creation through Heat (pt.2)
On 05/05/2014 12:18 PM, Chris Friesen wrote: As a simplifying view you could keep the existing flavors which group all of them, while still allowing instances to specify each one separately if desired. Also, if we're allowing the cpu/memory/disk to be specified independently at instance boot time, we might want to allow for arbitrary metadata to be specified as well (that would be matched as per the existing flavor "extra_spec"). Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Proposal: Move CPU and memory allocation ratio out of scheduler
On 06/03/2014 07:29 AM, Jay Pipes wrote: Hi Stackers, tl;dr = Move CPU and RAM allocation ratio definition out of the Nova scheduler and into the resource tracker. Remove the calculations for overcommit out of the core_filter and ram_filter scheduler pieces. Makes sense to me. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Arbitrary "extra specs" for compute nodes?
On 06/07/2014 12:30 AM, Joe Cropper wrote: Hi Folks, I was wondering if there was any such mechanism in the compute node structure to hold arbitrary key-value pairs, similar to flavors' "extra_specs" concept? It appears there are entries for things like pci_stats, stats and recently added extra_resources -- but these all tend to have more specific usages vs. just arbitrary data that may want to be maintained about the compute node over the course of its lifetime. Unless I'm overlooking an existing construct for this, would this be something that folks would welcome a Juno blueprint for--i.e., adding an extra_specs-style column with a JSON-formatted string that could be loaded as a dict of key-value pairs? If nothing else, you could put the compute node in a host aggregate and assign metadata to it. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
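A minimal sketch of that workaround (hedged; the aggregate name, host name, and keys are invented):

from novaclient.v1_1 import client

nova = client.Client("admin", "secret", "demo", "http://controller:5000/v2.0/")

# Tag a single compute node with arbitrary key-value data by putting it in
# a one-host aggregate and attaching metadata to the aggregate.
agg = nova.aggregates.create("compute-0001-meta", None)
nova.aggregates.add_host(agg, "compute-0001")
nova.aggregates.set_metadata(agg, {"rack": "12", "last_serviced": "2014-05-01"})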
Re: [openstack-dev] [nova] Proposal: Move CPU and memory allocation ratio out of scheduler
On 06/09/2014 07:59 AM, Jay Pipes wrote: On 06/06/2014 08:07 AM, Murray, Paul (HP Cloud) wrote: Forcing an instance to a specific host is very useful for the operator - it fulfills a valid use case for monitoring and testing purposes. Pray tell, what is that valid use case? I find it useful for setting up specific testcases when trying to validate things: put *this* instance on *this* host, put *those* instances on *those* hosts, now pull the power plug on *this* host...etc. I wouldn't expect the typical OpenStack end-user to need it though. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
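For reference, the admin-only "zone:host" form of availability_zone is one way to do this today (a hedged sketch; the host, image and flavor IDs are placeholders):

from novaclient.v1_1 import client

nova = client.Client("admin", "secret", "demo", "http://controller:5000/v2.0/")

# Force an instance onto one specific host for failure-injection testing.
# The "nova:compute-0001" form sets force_hosts and skips the scheduler
# filters, which is why it is restricted to admins.
victim = nova.servers.create(
    name="pull-the-plug-on-me",
    image="IMAGE_UUID",
    flavor="FLAVOR_ID",
    availability_zone="nova:compute-0001")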
Re: [openstack-dev] Gate proposal - drop Postgresql configurations in the gate
On 06/12/2014 09:36 AM, Sean Dague wrote: This is what I mean by functional testing. If we were directly hitting a real database on a set of in tree project tests, I think you could discover issues like this. Neutron was headed down that path. But if we're talking about a devstack / tempest run, it's not really applicable. If someone can point me to a case where we've actually found this kind of bug with tempest / devstack, that would be great. I've just *never* seen it. I was the one that did most of the fixing for pg support in Nova, and have helped other projects as well, so I'm relatively familiar with the kinds of fails we can discover. The ones that Julien pointed out really aren't likely to be exposed in our current system. Which is why I think we're mostly just burning cycles on the existing approach for no gain. What about https://bugs.launchpad.net/nova/+bug/1292963 ? Would this have been caught by strict/traditional mode with mysql? (Of course in this case we didn't actually have tempest testcases for server groups yet; I'm not sure if they exist now.) Also, while we're on the topic of testing databases...I opened a bug a while back for the fact that sqlite regexp() doesn't behave like mysql/postgres. Having unit tests that don't behave like a real install seems like a bad move. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
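To illustrate the sqlite regexp() issue: sqlite ships the REGEXP operator with no implementation, so a test harness has to register its own Python function, and the resulting semantics are whatever that function does rather than what MySQL or PostgreSQL do. A small standalone sketch (not the actual nova/oslo shim):

import re
import sqlite3

conn = sqlite3.connect(":memory:")

# sqlite translates "X REGEXP Y" into a call to regexp(Y, X); without this
# registration the query below fails with "no such function: REGEXP".
conn.create_function(
    "regexp", 2,
    lambda pattern, value: value is not None
    and re.search(pattern, value) is not None)

conn.execute("CREATE TABLE instances (hostname TEXT)")
conn.execute("INSERT INTO instances VALUES ('compute-0001')")

# Python's re semantics (e.g. \d, substring matching) are not the same as
# MySQL's POSIX-style REGEXP, which is exactly the kind of divergence that
# unit tests against sqlite won't catch.
print(conn.execute(
    "SELECT hostname FROM instances WHERE hostname REGEXP 'compute-\\d+'"
).fetchall())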
[openstack-dev] mysql/mysql-python license "contamination" into openstack?
Hi, I'm looking for the community viewpoint on whether there is any chance of license contamination between mysql and nova. I realize that lawyers would need to be involved for a proper ruling, but I'm curious about the view of the developers on the list. Suppose someone creates a modified openstack and wishes to sell it to others. They want to keep their changes private. They also want to use the mysql database. The concern is this:
nova is Apache licensed
sqlalchemy is MIT licensed
mysql-python (aka mysqldb1) is GPLv2 licensed
mysql is GPLv2 licensed
The concern is that since nova/sqlalchemy/mysql-python are all essentially linked together, an argument could be made that the work as a whole is a derivative work of mysql-python, and thus all the source code must be made available to anyone using the binary. Does this argument have any merit? Has anyone tested any of the mysql DBAPIs with more permissive licenses? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] is the cpu feature check in _compare_cpu() valid
In nova/virt/libvirt/driver.py the _compare_cpu() function checks the CPU features using self._conn.compareCPU(). Is this actually valid? The kvm processes don't seem to have the "-cpu" option specified, so we should get a compatible subset of CPU features from qemu. If that's the case then the host CPU features shouldn't really matter, right? We're running Havana and hitting live-migration issues between Sandy Bridge and Ivy Bridge due to some differences in the host processor flags. Just wondering what the expected behaviour is. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
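For anyone following along, a minimal standalone sketch of the kind of check _compare_cpu() performs (the CPU XML below is hand-written for illustration, not what Nova actually serializes):

import libvirt

conn = libvirt.open("qemu:///system")

# A trimmed CPU definition; in the live-migration path this would describe
# the source CPU and be compared against the destination host.
cpu_xml = """
<cpu>
  <arch>x86_64</arch>
  <model>SandyBridge</model>
  <feature name='avx'/>
</cpu>
"""

result = conn.compareCPU(cpu_xml, 0)
if result == libvirt.VIR_CPU_COMPARE_INCOMPATIBLE:
    print("destination host CPU is missing required features")
elif result == libvirt.VIR_CPU_COMPARE_IDENTICAL:
    print("identical")
else:  # VIR_CPU_COMPARE_SUPERSET
    print("host CPU is a superset of the requested definition")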
Re: [openstack-dev] mysql/mysql-python license "contamination" into openstack?
On 06/12/2014 01:30 PM, Mike Bayer wrote: the GPL is excepted in the case of MySQL and other MySQL products released by Oracle (can you imagine such a sentence being written.), see http://www.mysql.com/about/legal/licensing/foss-exception/. Okay, good start. mysql itself is out of the picture. If MySQL-Python itself were an issue, OpenStack could switch to another MySQL library, such as MySQL Connector/Python which is now MySQL's official Python driver: http://dev.mysql.com/doc/connector-python/en/index.html It seems like mysql-python could be an issue given that it's licensed GPLv2. Has anyone tested any of the mysql DBAPIs with more permissive licenses? I just mentioned other MySQL drivers the other day; MySQL Connector/Python, OurSQL and pymysql are well tested within SQLAlchemy and these drivers generally pass all tests. There's some concern over compatibility with eventlet, however, I can't speak to that just yet. Okay, so they're not really tested with OpenStack then? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
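For what it's worth, trying an alternative driver is mostly a connection-string change as far as SQLAlchemy is concerned (a sketch; credentials and host are placeholders, the relevant DBAPI packages must be installed, and eventlet-friendliness is exactly the open question):

from sqlalchemy import create_engine

# Same models and queries, different DBAPI underneath.
engine_mysqldb = create_engine("mysql://user:pass@dbhost/nova")                    # MySQL-Python (GPLv2)
engine_connector = create_engine("mysql+mysqlconnector://user:pass@dbhost/nova")   # MySQL Connector/Python
engine_pymysql = create_engine("mysql+pymysql://user:pass@dbhost/nova")            # pure-Python PyMySQL
engine_oursql = create_engine("mysql+oursql://user:pass@dbhost/nova")              # OurSQL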
Re: [openstack-dev] Rethink how we manage projects? (was Gate proposal - drop Postgresql configurations in the gate)
On 06/16/2014 03:33 AM, Thierry Carrez wrote: David Kranz wrote: [...] There is a different way to do this. We could adopt the same methodology we have now around gating, but applied to each project on its own branch. These project branches would be integrated into master at some frequency or when some new feature in project X is needed by project Y. Projects would want to pull from the master branch often, but the push process would be less frequent and run a much larger battery of tests than we do now. So we would basically discover the cross-project bugs when we push to the "master master" branch. I think you're just delaying discovery of the most complex issues, and pushing the responsibility to resolve them onto a nonexistent set of people. Adding integration branches only makes sense if you have an integration team. We don't have one, so we'd call back on the development teams to solve the same issues... with a delay. In our specific open development setting, delaying is bad because you don't have a static set of developers that you can assume will be on call ready to help with what they have written a few months later: shorter feedback loops are key to us. On the other hand, I've had fairly trivial changes wait for a week to be merged because they failed multiple separate testcases that were totally unrelated to the change I was making. If I'm making a change that is entirely contained within nova, it seems really unfortunate that a buggy commit in neutron or cinder can block my commit from being merged. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] [bug?] live migration fails with boot-from-volume
Hi, I was just testing the current icehouse code and came across some behaviour that looked suspicious. I have two nodes, an all-in-one and a compute node. I was not using shared instance storage. I created a volume from an image and then booted an instance from the volume. Once the image was up and running I tried to do a "nova live-migration " and got the following error: cfriesen@controller:/opt/stack/nova/nova/compute$ nova live-migration fromvol ERROR: controller is not on shared storage: Live migration can not be used without shared storage. (HTTP 400) (Request-ID: req-0d8da5e4-b0ec-401d-be95-d9c4f9f7e062) Shouldn't booting from volume count as a form of shared storage? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] UTF-8 required charset/encoding for openstack database?
Hi, I'm using havana and recently we ran into an issue with heat related to character sets. In heat/db/sqlalchemy/api.py in user_creds_get() we call _decrypt() on an encrypted password stored in the database and then try to convert the result to unicode. Today we hit a case where this errored out with the following message:
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf2 in position 0: invalid continuation byte
We're using postgres and currently all the databases are using SQL_ASCII as the charset. I see that in icehouse heat will complain if you're using mysql and not using UTF-8. There doesn't seem to be any checks for other databases though. It looks like devstack creates most databases as UTF-8 but uses latin1 for nova/nova_bm/nova_cell. I assume this is because nova expects to migrate the db to UTF-8 later. Given that those migrations specify a character set only for mysql, when using postgres should we explicitly default to UTF-8 for everything? Thanks, Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
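The failure itself is easy to reproduce outside of Heat (a sketch; the leading byte is the one from our traceback):

# What user_creds_get() effectively does after _decrypt(): convert bytes to
# unicode assuming UTF-8. If the bytes are latin-1/SQL_ASCII data (or garbage
# from a decryption mismatch), the conversion fails exactly as shown above.
password = b"\xf2rest-of-decrypted-bytes"

try:
    password.decode("utf-8")
except UnicodeDecodeError as exc:
    print(exc)  # 'utf-8' codec can't decode byte 0xf2 in position 0: ...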
Re: [openstack-dev] [nova] [bug?] live migration fails with boot-from-volume
On 03/08/2014 02:23 AM, ChangBo Guo wrote: Are you using the libvirt driver? As I remember, the way to check whether compute nodes have shared storage is: create a temporary file from the source node, then check for the file from the dest node, by accessing the file system at the operating system level. And boot from volume is just a way to boot an instance; it does not imply shared storage. For non-shared storage, have you tried block migration with the --block-migration option? Using block migration does seem to work. However, it passes VIR_MIGRATE_NON_SHARED_INC to libvirt in the migration flags, which doesn't seem ideal for boot-from-volume. I assume it starts to do an incremental copy but then decides that both are identical? This raises an interesting question. Why do we even need the user to explicitly specify --block-migration? It seems like we could just test whether the instance storage is shared between the two compute nodes and set the appropriate flags automatically. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] UTF-8 required charset/encoding for openstack database?
On 03/10/2014 02:02 PM, Ben Nemec wrote: We just had a discussion about this in #openstack-oslo too. See the discussion starting at 2014-03-10T16:32:26 http://eavesdrop.openstack.org/irclogs/%23openstack-oslo/%23openstack-oslo.2014-03-10.log In that discussion dhellmann said, "I wonder if we make any assumptions elsewhere that we are using utf8 in the database" For what it's worth I came across "https://wiki.openstack.org/wiki/Encoding", which proposed a rule: "All external text that is not explicitly encoded (database storage, commandline arguments, etc.) should be presumed to be encoded as utf-8." While it seems Heat does require utf8 (or at least matching character sets) across all tables, I'm not sure the current solution is good. It seems like we may want a migration to help with this for anyone who might already have mismatched tables. There's a lot of overlap between that discussion and how to handle Postgres with this, I think. I'm lucky enough to be able to fix this now, I don't have any real deployments yet. It sounds like for the near future my best bet would be to just set the install scripts to configure postgres (which is used only for openstack) to default to utf-8. Is that a fair summation? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
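For anyone in the same position, the gist of that install-script change is to create the databases with an explicit encoding rather than inheriting SQL_ASCII from the cluster default (a sketch; the database names and connection details are assumptions):

import psycopg2

# CREATE DATABASE cannot run inside a transaction block, hence autocommit.
conn = psycopg2.connect("dbname=postgres user=postgres host=localhost")
conn.autocommit = True
cur = conn.cursor()
for db in ("nova", "glance", "keystone", "heat"):
    cur.execute(
        "CREATE DATABASE %s WITH ENCODING 'UTF8' TEMPLATE template0" % db)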
Re: [openstack-dev] [nova] a question about instance snapshot
On 03/10/2014 02:58 PM, Jay Pipes wrote: On Mon, 2014-03-10 at 16:30 -0400, Shawn Hartsock wrote: While I understand the general argument about pets versus cattle. The question is, would you be willing to poke a few holes in the strict "cattle" abstraction for the sake of pragmatism. Few shops are going to make the direct transition in one move. Poking a hole in the cattle abstraction allowing them to keep a pet VM might be very valuable to some shops making a migration. Poking holes in cattle aside, my experience with shops that prefer the pets approach is that they are either: * Not managing their servers themselves at all and just relying on some IT operations organization to manage everything for them, including all aspects of backing up their data as well as failing over and balancing servers, or, * Hiding behind rationales of "needing to be secure" or "needing 100% uptime" or "needing no customer disruption" in order to avoid any change to the status quo. This is because the incentives inside legacy IT application development and IT operations groups are typically towards not rocking the boat in order to satisfy unrealistic expectations and outdated interface agreements that are forced upon them by management chains that haven't crawled out of the waterfall project management funk of the 1980s. Adding pet-based features to Nova would, IMO, just perpetuate the above scenarios and incentives. What about the cases where it's not a "preference" but rather just the inertia of pre-existing systems and procedures? If we can get them in the door with enough support for legacy stuff, then they might be easier to convince to do things the "cloud" way in the future. If we stick with the hard-line cattle-only approach we run the risk of alienating them completely since redoing everything at once is generally not feasible. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] UTF-8 required charset/encoding for openstack database?
On 03/11/2014 05:50 PM, Clint Byrum wrote: But MySQL can't possibly know what you _meant_ when you were inserting data. So, if you _assumed_ that the database was UTF-8, and inserted UTF-8 with all of those things accidentally set for latin1, then you will have UTF-8 in your db, but MySQL will think it is latin1. So if you now try to alter the table to UTF-8, all of your high-byte strings will be double-encoded. It unfortunately takes analysis to determine what the course of action is. That is why we added the check to Heat, so that it would complain very early if your tables and/or server configuration were going to disagree with the assumptions. I find it interesting that the db migrations only specify character encodings for mysql, not any other database. At the same time, devstack seems to create the nova* databases as latin1 for historical reasons. postgres is supported under devstack, so I think this will end up causing a devstack/postgres setup to use utf-8 for most things but latin1 for the nova* databases, which seems odd. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
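A quick way to see the failure mode Clint describes (a sketch with a single character; the same thing happens to every high-byte string in the table):

# The bytes on disk are fine; the label on them is wrong. A naive
# "ALTER TABLE ... CONVERT TO utf8" therefore decodes them as latin1 first
# and re-encodes the result, i.e. double-encodes.
original = u"caf\u00e9"                   # what the application inserted
stored_bytes = original.encode("utf-8")   # what actually landed on disk

misread = stored_bytes.decode("latin-1")  # mojibake: u'caf\xc3\xa9'
double_encoded = misread.encode("utf-8")

print(double_encoded == stored_bytes)     # False: the data has been mangled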
[openstack-dev] any recommendations for live debugging of openstack services?
Are there any tools that people can recommend for live debugging of openstack services? I'm looking for a mechanism where I could take a running system that isn't behaving the way I expect and somehow poke around inside the program while it keeps running. (Sort of like tracepoints in gdb.) I've seen mention of things like twisted.manhole and eventlet.backdoor...has anyone used this sort of thing with openstack? Are there better options? Also, has anyone ever seen an implementation of watchpoints for python? By that I mean the ability to set a breakpoint if the value of a variable changes. I found "https://sourceforge.net/blog/watchpoints-in-python/" but it looks pretty hacky. Thanks, Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
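eventlet.backdoor does roughly what is described above and is easy to wire in. A minimal standalone sketch (the port is arbitrary; I believe the OpenStack services normally expose the same thing through a backdoor_port config option rather than by hand):

import eventlet
from eventlet import backdoor

# Spawn a REPL server inside the running process; "telnet 127.0.0.1 3000"
# then gives an interactive Python prompt sharing the process state, so you
# can inspect module globals, greenlets, sockets, etc. while it keeps running.
eventlet.spawn(backdoor.backdoor_server, eventlet.listen(("127.0.0.1", 3000)))

# ...the rest of the service carries on as usual...
while True:
    eventlet.sleep(1)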
[openstack-dev] [nova] [bug?] possible postgres/mysql incompatibility in InstanceGroup.get_hosts()
Hi, I'm trying to run InstanceGroup.get_hosts() on a havana installation that uses postgres. When I run the code, I get the following error:
RemoteError: Remote error: ProgrammingError (ProgrammingError) operator does not exist: timestamp without time zone ~ unknown
2014-03-14 09:58:57.193 8164 TRACE nova.compute.manager [instance: 83439206-3a88-495b-b6c7-6aea1287109f] LINE 3: uuid != instances.uuid AND (instances.deleted_at ~ 'None') ...
2014-03-14 09:58:57.193 8164 TRACE nova.compute.manager [instance: 83439206-3a88-495b-b6c7-6aea1287109f]^
2014-03-14 09:58:57.193 8164 TRACE nova.compute.manager [instance: 83439206-3a88-495b-b6c7-6aea1287109f] HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
I'm not a database expert, but after doing some digging, it seems that the problem is this line in get_hosts():
filters = {'uuid': filter_uuids, 'deleted_at': None}
It seems that current postgres doesn't allow implicit casts. If I change the line to:
filters = {'uuid': filter_uuids, 'deleted': 0}
Then it seems to work. Is this change valid? Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] question about e41fb84 "fix anti-affinity race condition on boot"
Hi, I'm curious why the specified git commit chose to fix the anti-affinity race condition by aborting the boot and triggering a reschedule. It seems to me that it would have been more elegant for the scheduler to do a database transaction that would atomically check that the chosen host was not already part of the group, and then add the instance (with the chosen host) to the group. If the check fails then the scheduler could update the group_hosts list and reschedule. This would prevent the race condition in the first place rather than detecting it later and trying to work around it. This would require setting the "host" field in the instance at the time of scheduling rather than the time of instance creation, but that seems like it should work okay. Maybe I'm missing something though... Thanks, Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
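To sketch the kind of atomic check-and-claim I mean (purely illustrative SQLAlchemy against a throwaway schema; 'group_hosts' and 'claim_host' are invented names, not Nova's schema, and the select([...]) form is the 0.9-era syntax):

from sqlalchemy import (create_engine, MetaData, Table, Column, Integer,
                        String, select, and_, exists, literal)

engine = create_engine("sqlite://")
meta = MetaData()
group_hosts = Table(
    "group_hosts", meta,
    Column("id", Integer, primary_key=True),
    Column("group_uuid", String(36)),
    Column("host", String(255)))
meta.create_all(engine)


def claim_host(group_uuid, host):
    # Insert (group, host) only if that host isn't already used by the group.
    # A False return maps to "refresh group_hosts and reschedule" instead of
    # booting and detecting the race afterwards.
    conflict = exists().where(and_(group_hosts.c.group_uuid == group_uuid,
                                   group_hosts.c.host == host))
    ins = group_hosts.insert().from_select(
        ["group_uuid", "host"],
        select([literal(group_uuid), literal(host)]).where(~conflict))
    with engine.begin() as conn:
        return conn.execute(ins).rowcount == 1


print(claim_host("group-1", "compute-0001"))  # True: host claimed
print(claim_host("group-1", "compute-0001"))  # False: would trigger a reschedule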
Re: [openstack-dev] [nova] [bug?] possible postgres/mysql incompatibility in InstanceGroup.get_hosts()
On 03/15/2014 04:29 AM, Sean Dague wrote: On 03/15/2014 02:49 AM, Chris Friesen wrote: Hi, I'm trying to run InstanceGroup.get_hosts() on a havana installation that uses postgres. When I run the code, I get the following error:
RemoteError: Remote error: ProgrammingError (ProgrammingError) operator does not exist: timestamp without time zone ~ unknown
2014-03-14 09:58:57.193 8164 TRACE nova.compute.manager [instance: 83439206-3a88-495b-b6c7-6aea1287109f] LINE 3: uuid != instances.uuid AND (instances.deleted_at ~ 'None') ...
2014-03-14 09:58:57.193 8164 TRACE nova.compute.manager [instance: 83439206-3a88-495b-b6c7-6aea1287109f]^
2014-03-14 09:58:57.193 8164 TRACE nova.compute.manager [instance: 83439206-3a88-495b-b6c7-6aea1287109f] HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
I'm not a database expert, but after doing some digging, it seems that the problem is this line in get_hosts():
filters = {'uuid': filter_uuids, 'deleted_at': None}
It seems that current postgres doesn't allow implicit casts. If I change the line to:
filters = {'uuid': filter_uuids, 'deleted': 0}
Then it seems to work. Is this change valid? Yes, postgresql is strongly typed with its data. That's a valid bug you found, fixes appreciated! Bug report is open at "https://bugs.launchpad.net/nova/+bug/1292963". Patch is up for review at "https://review.openstack.org/80808", comments welcome. Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev