Re: [openstack-dev] [Openstack] Cinder-service connectivity issues
Based on the check-in times in your post, it looks like time is out of sync between your nodes. The one reporting down is reporting a time in the future. I would install ntp and make sure the clocks are in sync.

Vish

On Mar 25, 2015, at 2:33 AM, Kamsali, RaghavendraChari (Artesyn) raghavendrachari.kams...@artesyn.com wrote:

Please find the attached log (c-api) from when I execute `cinder create 1`.

From: Kamsali, RaghavendraChari (Artesyn) [mailto:raghavendrachari.kams...@artesyn.com]
Sent: Wednesday, March 25, 2015 1:39 PM
To: Ritesh Nanda
Cc: openstack-dev@lists.openstack.org; openst...@lists.openstack.org
Subject: Re: [Openstack] Cinder-service connectivity issues

FYI,

From: Ritesh Nanda [mailto:riteshnand...@gmail.com]
Sent: Wednesday, March 25, 2015 1:09 PM
To: Kamsali, RaghavendraChari [ENGINEERING/IN]
Cc: openst...@lists.openstack.org; openstack-dev@lists.openstack.org
Subject: Re: [Openstack] Cinder-service connectivity issues

Can you run cinder-scheduler and the volume service in debug mode and paste the logs?

Regards,
Ritesh

On Wed, Mar 25, 2015 at 12:10 AM, Kamsali, RaghavendraChari (Artesyn) raghavendrachari.kams...@artesyn.com wrote:

Hi,

My setup is shown below, with three networks (management, storage, data/virtual).

[image001.png]

I am facing an issue when I bring up the setup shown above. Could anyone help me figure out whether I have configured something incorrectly or am otherwise doing something wrong?
On Controller Node (services enabled: c-sch, c-api)
Management: 192.168.21.108
Storage: 10.130.98.97

Cinder configurations:
my_ip: 10.130.98.97 (also tried 192.168.21.108)
glance_host: 10.130.98.97 (also tried 192.168.21.108)
iscsi_ip_address: 10.130.98.97 (also tried 192.168.21.108)

[image002.jpg] [image003.jpg]

On Storage Node (services enabled: c-vol)
Management: 192.168.21.107
Storage: 10.130.98.136

my_ip: 10.130.98.97 (also tried 192.168.21.108)
glance_host: 10.130.98.97 (also tried 192.168.21.108)
iscsi_ip_address: 10.130.98.97 (also tried 192.168.21.108)
lvmdriver-1.iscsi_ip_address: 10.130.98.136 (also tried 192.168.21.107)

[image004.jpg]

Thanks and Regards,
Raghavendrachari kamsali | Software Engineer II | Embedded Computing
Artesyn Embedded Technologies | 5th Floor, Capella Block, The V, Madhapur | Hyderabad, AP 500081 India

___ Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack Post to : openst...@lists.openstack.org Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack

-- With Regards
Ritesh Nanda

cinder-create-1.txt

__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
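[Editor's note: separately from the NTP issue Vish flags, a minimal cinder.conf sketch of the layout being discussed may help. This is an assumption about the intended topology, not a verified configuration; the addresses are the ones quoted in the thread, and which network should carry iSCSI traffic depends on the deployment.]

```ini
# Controller node (c-api, c-sch)
[DEFAULT]
my_ip = 10.130.98.97              ; this node's own storage-network IP
glance_host = 10.130.98.97

# Storage node (c-vol) -- note my_ip should be *this* node's own
# address; the thread shows the controller's IP reused here, which
# is likely part of the problem.
[DEFAULT]
my_ip = 10.130.98.136
iscsi_ip_address = 10.130.98.136  ; IP the iSCSI targets are exported on
```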
Re: [openstack-dev] [nova] Intended behavior for instance.host on reschedule?
I'm pretty sure it has always done this: it leaves the host set from the final scheduling attempt. I agree that this could be cleared, which would free up room for future scheduling attempts.

Vish

On Mar 3, 2015, at 12:15 AM, Joe Cropper cropper@gmail.com wrote:

Hi Folks,

I was wondering if anyone can comment on the intended behavior of how instance.host is supposed to be set during reschedule operations. For example, take this scenario:

1. Assume an environment with a single host... call it host-1.
2. Deploy a VM, but force an exception in the spawn path somewhere to simulate some hypervisor error.
3. The scheduler correctly attempts to reschedule the VM, and ultimately ends up (correctly) with a NoValidHost error because there was only 1 host.
4. However, the instance.host (e.g., [nova show vm]) is still showing 'host-1'. Is this the expected behavior?

It seems like perhaps the claim should be reverted (read: instance.host nulled out) when we take the exception path during spawn in step #2 above, but maybe I'm overlooking something? This behavior was observed on a Kilo base from a couple weeks ago, FWIW.

Thoughts/comments?

Thanks,
Joe
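[Editor's note: a toy sketch of the behavior being proposed, in heavily simplified hypothetical code, not Nova's actual scheduler path: when every reschedule attempt fails, clear instance.host rather than leaving it pointing at the last host tried.]

```python
class NoValidHost(Exception):
    """Raised when no candidate host can spawn the instance."""


def schedule_with_reschedules(instance, hosts, spawn):
    """Try to spawn on each candidate host; clear host on total failure."""
    for host in hosts:
        instance["host"] = host
        try:
            spawn(host)
            return instance
        except Exception:
            continue  # any spawn failure triggers a reschedule
    # All attempts failed: null out the host instead of leaving the
    # final attempt's host set (the behavior observed in the thread).
    instance["host"] = None
    instance["vm_state"] = "error"
    raise NoValidHost()
```

With this shape, a later `nova show` on the errored VM would report no host, and the capacity claim on host-1 would not linger.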
Re: [openstack-dev] [nova] Question about force_host skip filters
If this feature is going to be added, I suggest it gets a different name. Force host is an admin command to force an instance onto a host. If you want to make a user-facing command that respects filters, perhaps something like requested-host might work. In general, however, the names of hosts are not exposed to users, so this moves far away from cloud and into virtualization management.

Vish

On Feb 12, 2015, at 1:05 AM, Rui Chen chenrui.m...@gmail.com wrote:

Hi:

If we boot an instance with 'force_hosts', the forced host will skip all filters. That looks like intentional logic, but I don't know the reason.

I'm not sure the skipping logic is appropriate. I think we should remove it so that 'force_hosts' works with the scheduler, testing whether the forced host is suitable as soon as possible. Skipping the filters and postponing the boot failure to nova-compute is not advisable. On the other hand, more and more options have been added to flavors (NUMA, CPU pinning, PCI and so on), so forcing a suitable host is getting harder and harder.

Best Regards.
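[Editor's note: to make the distinction concrete, here is a toy sketch, hypothetical and heavily simplified rather than Nova's scheduler code, of the current 'force_hosts' behavior versus the 'requested-host' behavior discussed above.]

```python
def select_host(hosts, filters, force_host=None, requested_host=None):
    """Pick a host; 'force_host' skips filters, 'requested_host' does not."""
    if force_host is not None:
        # Current behavior: the forced host bypasses all filters.
        return force_host
    candidates = [h for h in hosts if all(f(h) for f in filters)]
    if requested_host is not None:
        # Proposed behavior: honor the request only if it passes filters,
        # so an unsuitable host fails at scheduling time, not in nova-compute.
        return requested_host if requested_host in candidates else None
    return candidates[0] if candidates else None
```

The point of the proposal is that the `requested_host` path surfaces a NUMA/pinning/PCI mismatch as a scheduling failure instead of a late spawn failure.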
Re: [openstack-dev] [cinder] Fixing stuck volumes - part II
On Feb 11, 2015, at 3:45 PM, D'Angelo, Scott scott.dang...@hp.com wrote:

At the Cinder mid-cycle it was decided that the best way to fix volumes stuck in 'attaching' or 'detaching' was NOT to fix the broken reset-state command. The doc string and help message for reset-state have been modified to warn the user that the tool only affects the Cinder DB and can cause problems. Ultimately, though, a separate 'force-detach' command would be better. I've abandoned the original BP/spec for reset-state involving the driver.

I have looked at the existing 'force-detach' function in Cinder and it seems to work... except that Nova must be involved. Nova uses the BlockDeviceMapping table to keep track of attached volumes, and if Nova is not involved, a force-detached volume will not be capable of being re-attached.

So my plan is to submit a blueprint + spec for novaclient to add a 'force-detach' command. This is technically fairly simple and only involves stripping away the checks for proper state in Nova and calling Cinder force-detach. I don't plan on asking for an exception to feature freeze, unless there is optimism from the community that this could possibly get in for L.

The existing Cinder force-detach calls terminate_connection() and detach_volume(). I assume detach_volume() is covered by the "Volume Detach" minimum feature? I see many drivers have terminate_connection(), but not all. I believe this will not be a minimum feature, but others may disagree.

If you are going to add a force-detach command to nova, I think it would be good to make it detach even if the cinder request fails. Currently if you try to detach a volume (or terminate an instance with an attached volume) while cinder is down, or while the volume node where the volume resides is down, nova refuses to continue, which is a pretty bad user experience.
Vish

thanks,
scottda
scott.dang...@hp.com
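[Editor's note: a hedged sketch of the behavior Vish suggests for the hypothetical nova 'force-detach', detaching locally even when the Cinder call fails. The names and structures here are illustrative, not actual Nova or Cinder code.]

```python
def force_detach(instance, volume, cinder_client, log):
    """Force-detach a volume; never let a Cinder failure block the detach."""
    try:
        cinder_client.force_detach(volume["id"])
    except Exception as exc:
        # Cinder (or the volume node) being down should not block the
        # Nova-side detach -- just record the failure for the operator.
        log.append("cinder force-detach failed: %s" % exc)
    # Always remove the local block-device mapping so the volume can
    # be re-attached later.
    instance["attached_volumes"].remove(volume["id"])
    return instance
```

This mirrors Scott's observation that Nova's BlockDeviceMapping is the piece that must be cleaned up for a later re-attach to work.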
Re: [openstack-dev] Deprecation of in tree EC2 API in Nova for Kilo release
On Feb 3, 2015, at 6:19 AM, Sean Dague s...@dague.net wrote:

On 02/03/2015 08:57 AM, Thierry Carrez wrote:

Jesse Pretorius wrote:

I think that perhaps something that shouldn't be lost sight of is that the users using the EC2 API are using it as-is. The only commitment that needs to be made is to maintain the functionality that's already there, rather than attempt to keep it up to scratch with newer functionality that's come into EC2. The stackforge project can perhaps be the incubator for the development of a full replacement which is more up-to-date and acts more like a translator. Once it has matured enough that users want to use it instead of the old in-tree EC2 API, then perhaps deprecation is the right option. Between now and then, I must say that I agree with Sean: perhaps the best strategy would be to make it clear somehow that the EC2 API isn't a fully tested or up-to-date API.

Right, there are several dimensions to the issue we are discussing.

- I completely agree we should communicate clearly the status of the in-tree EC2 API to our users.

- Deprecation is a mechanism by which we communicate to our users that they need to evolve their current usage of OpenStack. It should not be used lightly, and it should be a reliable announcement. In the past we deprecated things based on a promised replacement plan that never happened, and we had to un-deprecate. I would very much prefer that we never do that again, because it trains users to ignore our deprecation announcements. That is what I meant in my earlier email: we /can/ deprecate, but only when we are 99.9% sure we will follow up on it.

- The supposed 35% of our users is actually more like 44% of the user survey respondents replying yes when asked if they ever used the EC2 API in their deployment of OpenStack.
Given that it's far from being up to date or behaving fully like the current Amazon EC2 API, it's fair to say that those users are probably more interested in keeping the current OpenStack EC2 API support as-is than in a new project that would actually make it better and/or different.

All of which is fair; however, there is actually no such thing as keeping support as-is. The EC2 API is the equivalent of parts of Nova + Neutron + Cinder + Keystone + Swift, yet the whole thing is implemented in Nova. Nova, for instance, has a terrible s3 object store in tree to make some of this work (so that the EC2 API doesn't actually depend on Swift). As the projects drift apart, change their semantics, and bump their APIs, keeping the same support is real work, and that work is not getting done.

This is a bit unfair. That code path is only used for ec2_register_image, which I think is almost completely unused even by EC2 users these days. Also, it can use any s3 object store (for example, Swift with swift3 in front).

Vish

It will become different over time regardless; the real question is whether it gets different-worse or different-better.

- Given legal uncertainty about closed APIs, it might make *legal* sense to remove it from Nova, or at least mark it deprecated and freeze it until that removal can happen. Projects in Stackforge are, by definition, not OpenStack projects, and therefore do not carry the same risk.

--
Sean Dague
http://dague.net
Re: [openstack-dev] [oslo.db] PyMySQL review
On Jan 29, 2015, at 8:57 AM, Roman Podoliaka rpodoly...@mirantis.com wrote:

Jeremy,

I don't have exact numbers, so yeah, it's an assumption based on looking at the nova-api/scheduler logs with connection_debug set to 100. But that's a good point you are making: it will be interesting to see what difference enabling PyMySQL makes for tempest/rally workloads, rather than just running synthetic tests. I'm going to give it a try on my devstack installation.

FWIW, I tested this a while ago in some perf tests on nova and cinder that we run internally, and I found PyMySQL to be slower by about 10%. It appears that we were CPU-bound in Python more often than we were blocked talking to the DB. I do recall someone doing a similar test in neutron and seeing some speedup, however.

On our side we also exposed a few race conditions which made it less stable; we hit a few hard deadlocks in volume create, IIRC. I don't think switching is going to give us much benefit right away. We will need a few optimizations and bugfixes in other areas (particularly in our SQLAlchemy usage) before we derive any benefit from the switch.

Vish

Thanks,
Roman

On Thu, Jan 29, 2015 at 6:42 PM, Jeremy Stanley fu...@yuggoth.org wrote:

On 2015-01-29 18:35:20 +0200 (+0200), Roman Podoliaka wrote: [...] Otherwise, PyMySQL would be much slower than MySQL-Python for the typical SQL queries we do (e.g. ask for *a lot* of data from the DB). [...]

Is this assertion based on representative empirical testing (for example profiling devstack+tempest, or perhaps comparing rally benchmarks), or merely an assumption which still needs validating?
--
Jeremy Stanley
Re: [openstack-dev] [nova] [api] Get servers with limit and IP address filter
On Jan 28, 2015, at 7:05 AM, Steven Kaufer kau...@us.ibm.com wrote:

Vishvananda Ishaya vishvana...@gmail.com wrote on 01/27/2015 04:29:50 PM:

The network info for an instance is cached as a blob of data (neutron has the canonical version in most installs), so it isn't particularly easy to do at the database layer. You would likely need a pretty complex stored procedure to do it accurately.

Vish

Vish,

Thanks for the reply. I agree with your point about the difficulty of accurately querying the blob of data; however, IMHO, the complexity of the fix does not preclude the current behavior from being classified as a bug. With that in mind, I was wondering if anyone in the community has any thoughts on whether the current behavior is considered a bug?

Yes, it should be classified as a bug.

If so, how should it be resolved? A couple of options that I could think of:

1. Disallow the combination of using both a limit and an IP address filter by raising an error.

I think this is the simplest solution.

Vish

2. Work around the problem by removing the limit from the DB query and then manually limiting the list of servers (after manually applying the IP address filter).

3. Break up the query so that the server UUIDs that match the IP filter are retrieved first and then used as a UUID DB filter. As far as I can tell, this type of solution was originally implemented, but the network query was deemed too expensive [1]. Is there a less expensive method to determine the UUIDs (possibly querying the cached 'network_info' in the 'instance_info_caches' table)?

4. Figure out how to accurately query the blob of network info that is cached in the nova DB and apply the IP filter at the DB layer.
[1]: https://review.openstack.org/#/c/131460/

Thanks,
Steven Kaufer

On Jan 27, 2015, at 2:00 PM, Steven Kaufer kau...@us.ibm.com wrote:

Hello,

When applying an IP address filter to a paginated servers query (e.g., supplying servers/detail?ip=192.168&limit=100), the IP address filtering is only applied against the non-filtered page of servers that was retrieved from the DB; see [1].

I believe that the IP address filtering should be done before the limit is applied, returning up to limit servers that match the IP address filter. Currently, if the servers in the page of data returned from the DB do not happen to match the IP address filter (applied in the compute API), then no servers will be returned by the REST API, even if there are servers that match the IP address filter.

This seems like a bug to me; shouldn't all filtering be done at the DB layer?

[1]: https://github.com/openstack/nova/blob/master/nova/compute/api.py#L2037-L2042

Thanks,
Steven Kaufer
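[Editor's note: option 2 above can be sketched in a few lines. This is a hypothetical helper over in-memory data, not Nova's compute API; its point is only the ordering: filter by IP first, then apply the limit.]

```python
def list_servers(servers, ip_substring, limit):
    """Return up to `limit` servers whose IPs match `ip_substring`."""
    # Filter first (Nova currently limits first, then filters, which is
    # why a page can come back empty even when matches exist).
    matches = [s for s in servers
               if any(ip_substring in ip for ip in s["ips"])]
    return matches[:limit]
```

The cost trade-off discussed in the thread is that doing this without a DB-side filter means fetching every row before limiting, which is option 2's known downside.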
Re: [openstack-dev] [neutron] high dhcp lease times in neutron deployments considered harmful (or not???)
On Jan 28, 2015, at 9:36 AM, Carl Baldwin c...@ecbaldwin.net wrote:

On Wed, Jan 28, 2015 at 9:52 AM, Salvatore Orlando sorla...@nicira.com wrote:

The patch Kevin points out increased the lease to 24 hours (which I agree is as arbitrary as 2 minutes, 8 minutes, or 1 century) because it introduced use of the DHCPRELEASE message in the agent, which is supported by dnsmasq (to the best of my knowledge) and is functionally similar to FORCERENEW.

My understanding was that the dhcp release mechanism in dnsmasq does not actually unicast a FORCERENEW message to the client. Does it? I thought it just released dnsmasq's record of the lease. If I'm right, this is a huge difference. It is a big pain knowing that there are many clients out there that may not renew their leases to pick up updated DHCP options for hours and hours. I don't think there is a reliable way for the server to force a renew on the client, is there? Do clients support the FORCERENEW unicast message?

If you are using the dhcp-release script (which we got included in Ubuntu years ago for nova-network), it sends a release packet on behalf of the client so that dnsmasq can update its leases table, but it doesn't send any message to the client to tell it to update.

Vish

Carl
Re: [openstack-dev] [oslo.db] PyMySQL review
On Jan 28, 2015, at 4:03 PM, Doug Hellmann d...@doughellmann.com wrote:

On Wed, Jan 28, 2015, at 06:50 PM, Johannes Erdfelt wrote:

On Wed, Jan 28, 2015, Clint Byrum cl...@fewbar.com wrote:

As is often the case with threading, a reason to avoid using it is that libraries often aren't able or willing to assert thread safety. That said, one way to fix that is to fix the libraries we do want to use so that they are thread safe. :)

I floated this idea across some coworkers recently and they brought up a similar concern, which is concurrency in general, both within our code and in dependencies. I can't find many places in Nova (at least) that are concurrent in the sense that one object will be used by multiple threads. nova-scheduler is likely one place; nova-compute would likely be easy to fix if there are any problems. That said, I think the only way to know for sure is to try it out and see. I'm going to hack up a proof of concept and see how difficult this will be.

I hope someone who was around at the time will chime in with more detail about why green threads were deemed better than regular threads, and I look forward to seeing your analysis of a change. There is already a thread-based executor in oslo.messaging which *should* be usable in the applications when you remove eventlet.

Threading was never really considered. The initial version tried to get a working API server up as quickly as possible, and it used Tornado. This was quickly replaced with Twisted, since Tornado was really new at the time and had bugs. We then switched to eventlet when Swift joined the party, so that we didn't have multiple concurrency stacks. By the time someone came up with the idea of using different concurrency models for the API server and the backend services, we were already pretty far down the greenthread path.
Vish

Doug

JE
Re: [openstack-dev] [cinder] [nova] [scheduler] Nova node name passed to Cinder
On Jan 26, 2015, at 10:16 PM, Philipp Marek philipp.ma...@linbit.com wrote:

Hello Vish,

Nova passes ip, iqn, and hostname into initialize_connection. That should give you the info you need.

thank you, but that is on the _Nova_ side. I need to know it on the Cinder node already:

For that the cinder volume driver needs to know at ... time which Nova host will be used to access the data.

but it's not passed in there:

The arguments passed to these functions already include an attached_host value; sadly, it's currently given as None...

Therefore my question where/when that value is calculated...

initialize_connection passes that data to Cinder in the call. The connector dictionary in the call should contain the info from Nova:

https://github.com/openstack/cinder/blob/master/cinder/volume/driver.py#L1051

Regards,
Phil

--
: Ing. Philipp Marek
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
: DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
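[Editor's note: a representative shape of the connector dictionary Vish refers to. The exact fields vary by driver and release, and the values below are illustrative examples, not real hosts.]

```python
# The connector dict Nova builds and passes into Cinder's
# initialize_connection(); drivers read the attaching host's
# details from it.
connector = {
    "ip": "10.130.98.136",    # IP the host will use for the data path
    "host": "compute-01",     # Nova host name (example value)
    "initiator": "iqn.1993-08.org.debian:01:abcdef",  # iSCSI IQN
    "multipath": False,
}
```

A driver like the DRBD one Philipp describes would pull `connector["host"]` (or `connector["ip"]`) inside initialize_connection rather than relying on attached_host being set earlier.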
Re: [openstack-dev] [nova] [api] Get servers with limit and IP address filter
The network info for an instance is cached as a blob of data (neutron has the canonical version in most installs), so it isn't particularly easy to do at the database layer. You would likely need a pretty complex stored procedure to do it accurately.

Vish

On Jan 27, 2015, at 2:00 PM, Steven Kaufer kau...@us.ibm.com wrote:

Hello,

When applying an IP address filter to a paginated servers query (e.g., supplying servers/detail?ip=192.168&limit=100), the IP address filtering is only applied against the non-filtered page of servers that was retrieved from the DB; see [1].

I believe that the IP address filtering should be done before the limit is applied, returning up to limit servers that match the IP address filter. Currently, if the servers in the page of data returned from the DB do not happen to match the IP address filter (applied in the compute API), then no servers will be returned by the REST API, even if there are servers that match the IP address filter.

This seems like a bug to me; shouldn't all filtering be done at the DB layer?

[1]: https://github.com/openstack/nova/blob/master/nova/compute/api.py#L2037-L2042

Thanks,
Steven Kaufer
Re: [openstack-dev] [cinder] [nova] [scheduler] Nova node name passed to Cinder
Nova passes ip, iqn, and hostname into initialize_connection. That should give you the info you need.

Vish

On Jan 26, 2015, at 3:21 AM, Philipp Marek philipp.ma...@linbit.com wrote:

Hello everybody,

I'm currently working on providing DRBD as a block storage protocol. For that the cinder volume driver needs to know at initialize_connection, create_export, and ensure_export time which Nova host will be used to access the data.

I'd like to ask for a bit of help; can somebody tell me which part of the code decides that, and where the data flows? Is it already known at that time which node will receive the VM? The arguments passed to these functions already include an attached_host value; sadly, it's currently given as None...

Thank you for any tips, ideas, and pointers into the code - and, of course, even more so for full-blown patches on review.openstack.org ;)

Regards,
Phil

--
: Ing. Philipp Marek
: LINBIT | Your Way to High Availability
: DRBD/HA support and consulting http://www.linbit.com
: DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
Re: [openstack-dev] [nova] Setting MTU size for tap device
Adding it makes sense to me. Libvirt sets the MTU from the bridge when it creates the tap device, but if you are creating it manually you might need to set it to something else.

Vish

On Dec 17, 2014, at 10:29 PM, Ryu Ishimoto r...@midokura.com wrote:

Hi All,

I noticed that in linux_net.py, the method that creates a tap interface [1] does not let you set the MTU size. In other places, I see calls made to set the MTU of the device [2]. I'm wondering if there is any technical reason why we can't also set the MTU size when creating tap interfaces for the general case. In certain overlay solutions, this would come in handy. If there isn't any, I would love to submit a patch to accomplish this.

Thanks in advance!

Ryu

[1] https://github.com/openstack/nova/blob/master/nova/network/linux_net.py#L1374
[2] https://github.com/openstack/nova/blob/master/nova/network/linux_net.py#L1309
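[Editor's note: the underlying operation is a single `ip link` invocation. A hypothetical helper, sketched here rather than taken from linux_net.py, might just build that command; overlay deployments would pass an MTU reduced by the encapsulation overhead.]

```python
def set_device_mtu_cmd(dev, mtu=None):
    """Build the 'ip link' command used to set a device's MTU.

    Returns None when no MTU is requested, leaving the bridge/default
    MTU in place (the behavior Vish describes for libvirt-created taps).
    """
    if mtu is None:
        return None
    return ["ip", "link", "set", dev, "mtu", str(mtu)]
```

For example, a VXLAN overlay on a 1500-byte network would typically request an MTU of 1450 for the tap device.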
Re: [openstack-dev] [OpenStack-dev][nova-net]Floating ip assigned as /32 from the start of the range
Floating IPs are always added to the host as a /32. You will need one IP on the compute host from the floating range with the /16 prefix (which it will use for NATting instances without floating IPs as well). In other words, you should manually assign an IP from 10.100.130.X/16 to each compute node and set that value as routing_source_ip=10.100.130.X (or my_ip) in nova.conf.

Vish

On Dec 19, 2014, at 7:00 AM, Eduard Matei eduard.ma...@cloudfounders.com wrote:

Hi,

I'm trying to create a VM and assign it an IP in the range 10.100.130.0/16. On the host, the IP is assigned to br100 as "inet 10.100.0.3/32 scope global br100" instead of 10.100.130.X/16, so it's not reachable from the outside.

The localrc.conf:
FLOATING_RANGE=10.100.130.0/16

Any idea what to change?

Thanks,
Eduard

--
Eduard Biceri Matei, Senior Software Developer
www.cloudfounders.com | eduard.ma...@cloudfounders.com
CloudFounders, The Private Cloud Software Company
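[Editor's note: Vish's advice as a nova.conf sketch. The "X" placeholder is kept from the thread; pick a distinct free address per compute node. This is an illustrative fragment, not a complete configuration.]

```ini
# nova.conf on each compute node
[DEFAULT]
# An IP manually assigned to this node from the floating range,
# used as the NAT source for instances without floating IPs.
routing_source_ip = 10.100.130.X
# or, equivalently:
# my_ip = 10.100.130.X
```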
Re: [openstack-dev] [qa] How to delete a VM which is in ERROR state?
There have been a few, but we were specifically hitting this one: https://www.redhat.com/archives/libvir-list/2014-March/msg00501.html

Vish

On Dec 17, 2014, at 7:03 AM, Belmiro Moreira moreira.belmiro.email.li...@gmail.com wrote:

Hi Vish,

do you have more info about the libvirt deadlocks that you observed? Maybe I'm observing the same on SLC6, where I can't even kill the libvirtd process.

Belmiro

On Tue, Dec 16, 2014 at 12:01 AM, Vishvananda Ishaya vishvana...@gmail.com wrote:

I have seen deadlocks in libvirt that could cause this. When you are in this state, check to see if you can do a `virsh list` on the node. If not, libvirt is deadlocked, and Ubuntu may need to pull in a fix/newer version.

Vish

On Dec 12, 2014, at 2:12 PM, pcrews glee...@gmail.com wrote:

On 12/09/2014 03:54 PM, Ken'ichi Ohmichi wrote:

Hi,

This case is always tested by Tempest on the gate:
https://github.com/openstack/tempest/blob/master/tempest/api/compute/servers/test_delete_server.py#L152
So I guess this problem wouldn't happen on the latest version, at least.

Thanks
Ken'ichi Ohmichi

2014-12-10 6:32 GMT+09:00 Joe Gordon joe.gord...@gmail.com:

On Sat, Dec 6, 2014 at 5:08 PM, Danny Choi (dannchoi) dannc...@cisco.com wrote:

Hi,

I have a VM which is in ERROR state:

ID: 1cb5bf96-619c-4174-baae-dd0d8c3d40c5
Name: cirros--1cb5bf96-619c-4174-baae-dd0d8c3d40c5
Status: ERROR | Task State: - | Power State: NOSTATE | Networks: (none)

I tried both the CLI ("nova delete") and Horizon ("terminate instance"). Both accepted the delete command without any error. However, the VM never got deleted. Is there a way to remove the VM?

What version of nova are you using? This is definitely a serious bug; you should be able to delete an instance in ERROR state. Can you file a bug that includes steps on how to reproduce it, along with all relevant logs?
bugs.launchpad.net/nova Thanks, Danny ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev Hi, I've encountered this in my own testing and have found that it appears to be tied to libvirt. When I hit this, reset-state as the admin user reports success (and state is set), *but* things aren't really working as advertised and subsequent attempts to do anything with the errant vm's will send them right back into 'FLAIL' / can't delete / endless DELETING mode. restarting libvirt-bin on my machine fixes this - after restart, the deleting vm's are properly wiped without any further user input to nova/horizon and all seems right in the world. using: devstack ubuntu 14.04 libvirtd (libvirt) 1.2.2 triggered via: lots of random create/reboot/resize/delete requests of varying validity and sanity. Am in the process of cleaning up my test code so as not to hurt anyone's brain with the ugly and will file a bug once done, but thought this worth sharing. Thanks, Patrick ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
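A quick way to test for the deadlock Vish describes is to check whether virsh answers at all within a bounded time. A minimal sketch (the timeout value and the handling of a missing virsh binary are my own choices, not anything stated in the thread):

```python
import subprocess

def libvirt_responsive(timeout=10, cmd=("virsh", "list", "--all")):
    """Return True if libvirtd answers, False if the call hangs past
    `timeout` seconds (the deadlock symptom), None if virsh is missing."""
    try:
        subprocess.check_output(cmd, timeout=timeout)
        return True
    except subprocess.TimeoutExpired:
        return False
    except OSError:
        return None
```

If this returns False, restarting libvirt-bin (as Patrick observed) is the usual way to unwedge the stuck deletes.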
Re: [openstack-dev] [glance] Option to skip deleting images in use?
A simple solution that wouldn’t require modification of glance would be a cron job that lists images and snapshots and marks them protected while they are in use. Vish On Dec 16, 2014, at 3:19 PM, Collins, Sean sean_colli...@cable.comcast.com wrote: On Tue, Dec 16, 2014 at 05:12:31PM EST, Chris St. Pierre wrote: No, I'm looking to prevent images that are in use from being deleted. In use and protected are disjoint sets. I have seen multiple cases of images (and snapshots) being deleted while still in use in Nova, which leads to some very, shall we say, interesting bugs and support problems. I do think that we should try and determine a way forward on this, they are indeed disjoint sets. Setting an image as protected is a proactive measure, we should try and figure out a way to keep tenants from shooting themselves in the foot if possible. -- Sean M. Collins ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
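Vish's cron-job idea boils down to: list the servers, collect the image ids they reference, and flip `protected` on any unprotected image in that set. The selection step can be isolated as a pure function; the dict keys here are illustrative, not the exact nova/glance client field names:

```python
def images_to_protect(instances, images):
    """Return ids of unprotected images that some instance still uses."""
    in_use = {inst["image_id"] for inst in instances}
    return sorted(img["id"] for img in images
                  if img["id"] in in_use and not img["protected"])
```

A cron wrapper would fetch both lists via the nova and glance clients, call image-update with protected=True for each returned id, and could symmetrically unprotect images that drop out of use.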
Re: [openstack-dev] [qa] Question about nova boot --min-count number
I suspect you are actually failing due to not having enough room in your cloud rather than not having enough quota. You will need to make instance sizes with fewer cpus/ram/disk or change your allocation ratios in the scheduler. Vish On Dec 13, 2014, at 8:43 AM, Danny Choi (dannchoi) dannc...@cisco.com wrote: Hi, According to the help text, "--min-count number" boots at least number servers (limited by quota): --min-count number Boot at least number servers (limited by quota). I used devstack to deploy OpenStack (version Kilo) in a multi-node setup: 1 Controller/Network + 2 Compute nodes. I updated the tenant demo default quotas "instances" and "cores" from '10' and '20' to '100' and '200':

localadmin@qa4:~/devstack$ nova quota-show --tenant 62fe9a8a2d58407d8aee860095f11550 --user eacb7822ccf545eab9398b332829b476
+-----------------------------+-------+
| Quota                       | Limit |
+-----------------------------+-------+
| instances                   | 100   |
| cores                       | 200   |
| ram                         | 51200 |
| floating_ips                | 10    |
| fixed_ips                   | -1    |
| metadata_items              | 128   |
| injected_files              | 5     |
| injected_file_content_bytes | 10240 |
| injected_file_path_bytes    | 255   |
| key_pairs                   | 100   |
| security_groups             | 10    |
| security_group_rules        | 20    |
| server_groups               | 10    |
| server_group_members        | 10    |
+-----------------------------+-------+

When I boot 50 VMs using "--min-count 50", only 48 VMs come up. localadmin@qa4:~/devstack$ nova boot --image cirros-0.3.2-x86_64-uec --flavor 1 --nic net-id=5b464333-bad0-4fc1-a2f0-310c47b77a17 --min-count 50 vm- There is no error in the logs, and it happens consistently. I also tried "--min-count 60" and only 48 VMs come up. In Horizon, the left pane "Admin" - "System" - "Hypervisors" shows both Compute hosts, each with 32 total VCPUs for a grand total of 64, but only 48 used. Is this normal behavior, or is there another setting to change in order to use all 64 VCPUs?
Thanks, Danny ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
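Vish's room-vs-quota point can be checked with arithmetic: the scheduler's core and RAM filters cap each host at vcpus * cpu_allocation_ratio and ram_mb * ram_allocation_ratio, whichever runs out first. A rough model (16.0 and 1.5 were the nova defaults of that era; the host numbers in the test below are assumptions, not Danny's actual hardware):

```python
def max_instances(hosts, flavor, cpu_ratio=16.0, ram_ratio=1.5):
    """How many copies of `flavor` fit, per the scheduler's core/ram filters."""
    total = 0
    for h in hosts:
        by_cpu = int(h["vcpus"] * cpu_ratio) // flavor["vcpus"]
        by_ram = int(h["ram_mb"] * ram_ratio) // flavor["ram_mb"]
        total += min(by_cpu, by_ram)   # the scarcer resource wins
    return total
```

For a 1-vcpu/512MB flavor, the RAM side is the binding constraint long before the 100-instance quota: if each host exposes roughly 8 GB of RAM, 2 * (8192 * 1.5) / 512 = 48, which would match the ceiling Danny sees.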
Re: [openstack-dev] [Nova][Neutron] out-of-tree plugin for Mech driver/L2 and vif_driver
On Dec 11, 2014, at 2:41 AM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Dec 11, 2014 at 09:37:31AM +0800, henry hly wrote: On Thu, Dec 11, 2014 at 3:48 AM, Ian Wells ijw.ubu...@cack.org.uk wrote: On 10 December 2014 at 01:31, Daniel P. Berrange berra...@redhat.com wrote: So the problem of Nova review bandwidth is a constant problem across all areas of the code. We need to solve this problem for the team as a whole in a much broader fashion than just for people writing VIF drivers. The VIF drivers are really small pieces of code that should be straightforward to review and get merged in any release cycle in which they are proposed. I think we need to make sure that we focus our energy on doing this and not ignoring the problem by breaking stuff off out of tree. The problem is that we effectively prevent running an out-of-tree Neutron driver (which *is* perfectly legitimate) if it uses a VIF plugging mechanism that isn't in Nova, as we can't use out-of-tree code and we won't accept in-tree VIF code for out-of-tree drivers. The question is, do we really need such flexibility for so many nova vif types? I also think that VIF_TYPE_TAP and VIF_TYPE_VHOSTUSER are good examples: nova shouldn't know too many details about the switch backend; it should only care about the VIF itself, while how the VIF is plugged into the switch belongs to the Neutron half. However, I'm not saying to move the existing vif drivers out; those open backends have been used widely. But from now on the tap and vhostuser modes should be encouraged: one common vif driver for many long-tail backends. Yes, I really think this is a key point. When we introduced the VIF type mechanism we never intended for there to be so many different VIF types created. There is a very small, finite number of possible ways to configure the libvirt guest XML, and it was intended that the VIF types pretty much mirror that. This would have given us about 8 distinct VIF types maximum.
I think the reason for the larger-than-expected number of VIF types is that the drivers are being written to require some arbitrary tools to be invoked in the plug/unplug methods. It would really be better if those could be accomplished in the Neutron code rather than the Nova code, via a host agent provided by the Neutron mechanism. This would let us have a very small number of VIF types and so avoid the entire problem that this thread is bringing up. Failing that though, I could see a way to accomplish a similar thing without a Neutron-launched agent. If one of the VIF type binding parameters were the name of a script, we could run that script on plug/unplug. So we'd have a finite number of VIF types, and each new Neutron mechanism would merely have to provide a script to invoke; e.g. consider the existing midonet and iovisor VIF types as an example. Both of them use the libvirt ethernet config, but have different things running in their plug methods. If we had a mechanism for associating a plug script with a vif type, we could use a single VIF type for both. E.g. iovisor port binding info would contain vif_type=ethernet vif_plug_script=/usr/bin/neutron-iovisor-vif-plug while midonet would contain vif_type=ethernet vif_plug_script=/usr/bin/neutron-midonet-vif-plug +1 This is a great suggestion. Vish And so you see implementing a new Neutron mechanism in this way would not require *any* changes in Nova whatsoever. The work would be entirely self-contained within the scope of Neutron. It is simply a packaging task to get the vif script installed on the compute hosts, so that Nova can execute it. This is essentially providing a flexible VIF plugin system for Nova, without having to have it plug directly into the Nova codebase with the API/RPC stability constraints that implies.
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
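Daniel's proposal is easy to picture in code: Nova looks for a script name in the port binding and shells out, instead of carrying per-backend plug logic. A sketch (the dict keys mirror his `vif_type=ethernet vif_plug_script=...` example; everything else, including the argument convention passed to the script, is illustrative):

```python
import subprocess

def plug_vif(vif):
    """Run the Neutron-provided plug script if the binding names one.

    Returns True if a script handled the plug, False to fall back to
    the built-in handling for the VIF type.
    """
    script = vif.get("vif_plug_script")
    if not script:
        return False
    subprocess.check_call([script, "plug", vif["id"]])
    return True
```

The unplug side would be symmetric, and as Daniel notes, shipping the script to the compute hosts becomes purely a Neutron packaging concern.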
Re: [openstack-dev] [Neutron] UniqueConstraint for name and tenant_id in security group
On Dec 11, 2014, at 8:00 AM, Henry Gessau ges...@cisco.com wrote: On Thu, Dec 11, 2014, Mark McClain m...@mcclain.xyz wrote: On Dec 11, 2014, at 8:43 AM, Jay Pipes jaypi...@gmail.com wrote: I'm generally in favor of making name attributes opaque, utf-8 strings that are entirely user-defined and have no constraints on them. I consider the name to be just a tag that the user places on some resource. It is the resource's ID that is unique. I do realize that Nova takes a different approach to *some* resources, including the security group name. End of the day, it's probably just a personal preference whether names should be unique to a tenant/user or not. Maru had asked me my opinion on whether names should be unique and I answered my personal opinion that no, they should not be, and if Neutron needed to ensure that there was one and only one default security group for a tenant, that a way to accomplish such a thing in a race-free way, without use of SELECT FOR UPDATE, was to use the approach I put into the pastebin on the review above. I agree with Jay. We should not care about how a user names the resource. There are other ways to prevent this race and Jay’s suggestion is a good one. However we should open a bug against Horizon because the user experience there is terrible with duplicate security group names. The reason security group names are unique is that the ec2 api supports source rule specifications by tenant_id (user_id in amazon) and name, so not enforcing uniqueness means that invocation in the ec2 api will either fail or be non-deterministic in some way. Vish ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
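The race-free "ensure the default group exists" technique Jay alludes to (insert and let a UNIQUE constraint arbitrate, rather than SELECT FOR UPDATE) looks roughly like this; sqlite stands in for the real backend and the schema is simplified, so treat it as a sketch of the pattern rather than the actual pastebin code:

```python
import sqlite3

def ensure_default_group(conn, tenant_id):
    """Create the tenant's default security group if missing.

    A concurrent creator just trips the UNIQUE constraint, which we
    treat as success -- no row lock is ever taken.
    """
    try:
        with conn:
            conn.execute(
                "INSERT INTO security_groups (tenant_id, name) "
                "VALUES (?, 'default')", (tenant_id,))
    except sqlite3.IntegrityError:
        pass  # the other request won the race; the row already exists
```

Calling it any number of times, from any number of workers, leaves exactly one default row per tenant.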
Re: [openstack-dev] Reason for mem/vcpu ratio in default flavors
Probably just a historical artifact of values that we thought were reasonable for our machines at NASA. Vish On Dec 11, 2014, at 8:35 AM, David Kranz dkr...@redhat.com wrote: Perhaps this is a historical question, but I was wondering how the default OpenStack flavor size ratio of 2/1 was determined? According to http://aws.amazon.com/ec2/instance-types/, ec2 defines the flavors for General Purpose (M3) at about 3.7/1, with Compute Intensive (C3) at about 1.9/1 and Memory Intensive (R3) at about 7.6/1. -David ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron] UniqueConstraint for name and tenant_id in security group
On Dec 11, 2014, at 1:04 PM, Jay Pipes jaypi...@gmail.com wrote: On 12/11/2014 04:01 PM, Vishvananda Ishaya wrote: On Dec 11, 2014, at 8:00 AM, Henry Gessau ges...@cisco.com wrote: On Thu, Dec 11, 2014, Mark McClain m...@mcclain.xyz wrote: On Dec 11, 2014, at 8:43 AM, Jay Pipes jaypi...@gmail.com wrote: I'm generally in favor of making name attributes opaque, utf-8 strings that are entirely user-defined and have no constraints on them. I consider the name to be just a tag that the user places on some resource. It is the resource's ID that is unique. I do realize that Nova takes a different approach to *some* resources, including the security group name. End of the day, it's probably just a personal preference whether names should be unique to a tenant/user or not. Maru had asked me my opinion on whether names should be unique and I answered my personal opinion that no, they should not be, and if Neutron needed to ensure that there was one and only one default security group for a tenant, that a way to accomplish such a thing in a race-free way, without use of SELECT FOR UPDATE, was to use the approach I put into the pastebin on the review above. I agree with Jay. We should not care about how a user names the resource. There are other ways to prevent this race and Jay’s suggestion is a good one. However we should open a bug against Horizon because the user experience there is terrible with duplicate security group names. The reason security group names are unique is that the ec2 api supports source rule specifications by tenant_id (user_id in amazon) and name, so not enforcing uniqueness means that invocation in the ec2 api will either fail or be non-deterministic in some way. So we should couple our API evolution to EC2 API then? -jay No, I was just pointing out the historical reason for uniqueness, and hopefully encouraging someone to find the best behavior for the ec2 api if we are going to keep the incompatibility there.
Also I personally feel the ux is better with unique names, but it is only a slight preference. Vish ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova] Spring cleaning nova-core
On Dec 4, 2014, at 4:05 PM, Michael Still mi...@stillhq.com wrote: One of the things that happens over time is that some of our core reviewers move on to other projects. This is a normal and healthy thing, especially as nova continues to spin out projects into other parts of OpenStack. However, it is important that our core reviewers be active, as it keeps them up to date with the current ways we approach development in Nova. I am therefore removing some no longer sufficiently active cores from the nova-core group. I’d like to thank the following people for their contributions over the years: * cbehrens: Chris Behrens * vishvananda: Vishvananda Ishaya Thank you Michael. I knew this would happen eventually. I am around and I still do reviews from time to time, so everyone feel free to ping me on irc if there are specific reviews that need my historical knowledge! Vish * dan-prince: Dan Prince * belliott: Brian Elliott * p-draigbrady: Padraig Brady I’d love to see any of these cores return if they find their available time for code reviews increases. Thanks, Michael -- Rackspace Australia ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [cinder] [qa] which core team members are diving into - http://status.openstack.org/elastic-recheck/#1373513
On Nov 25, 2014, at 7:29 AM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: On 11/25/2014 9:03 AM, Matt Riedemann wrote: On 11/25/2014 8:11 AM, Sean Dague wrote: There is currently a review stream coming into Tempest to add Cinder v2 tests in addition to the Cinder v1 tests. At the same time the currently biggest race fail in the gate related to the projects is http://status.openstack.org/elastic-recheck/#1373513 - which is cinder related. I believe these 2 facts are coupled. The number of volume tests we have in tempest is somewhat small, and as such the likelihood of them running simultaneously is also small. However the fact that as the # of tests with volumes goes up we are getting more of these race fails typically means that what's actually happening is 2 vol ops that aren't safe to run at the same time, are. This remains critical - https://bugs.launchpad.net/cinder/+bug/1373513 - with no assignee. So we really need dedicated diving on this (last bug update with any code was a month ago), otherwise we need to stop adding these tests to Tempest, and honestly start skipping the volume tests if we can't have a repeatable success. -Sean I just put up an e-r query for a newly opened bug https://bugs.launchpad.net/cinder/+bug/1396186 this morning, it looks similar to bug 1373513 but without the blocked task error in syslog. There is a three minute gap between when the volume is being deleted in c-vol logs and when we see the volume uuid logged again, at which point tempest has already timed out waiting for the delete to complete. We should at least get some patches to add diagnostic logging in these delete flows (or periodic tasks that use the same locks/low-level i/o bound commands?) to try and pinpoint these failures. I think I'm going to propose a skip patch for test_volume_boot_pattern since that just seems to be a never-ending cause of pain until these root issues get fixed.
I marked 1396186 as a duplicate of 1373513 since the e-r query for 1373513 had an OR message which was the same as 1396186. I went ahead and proposed a skip for test_volume_boot_pattern due to bug 1373513 [1] until people get on top of debugging it. I added some notes to bug 1396186; the 3 minute hang seems to be due to a vgs call taking ~1 minute and an lvs call taking ~2 minutes. I'm not sure if those are hit in the volume delete flow or in some periodic task, but if there are multiple concurrent worker processes that could be hitting those commands at the same time, can we look at off-loading one of them to a separate thread or something? Do we set up devstack to not zero volumes on delete (CINDER_SECURE_DELETE=False)? If not, the dd process could be hanging the system due to io load. This would get significantly worse with multiple deletes occurring simultaneously. Vish ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
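For reference, the knob Vish is asking about: devstack's CINDER_SECURE_DELETE maps onto cinder's volume_clear option, which controls the dd zero-fill of the LV on volume delete. A fragment (option names as of that era; treat this as a sketch, not a canonical config):

```ini
# devstack local.conf / localrc
CINDER_SECURE_DELETE=False

# roughly equivalent cinder.conf setting on a non-devstack node
[DEFAULT]
volume_clear = none   ; default is 'zero', which dd's the whole volume
```

With zeroing disabled, a delete no longer generates volume-sized sequential writes, which is the io load Vish suspects is starving the concurrent vgs/lvs calls.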
Re: [openstack-dev] [OpenStack-dev][Nova] Migration stuck - resize/migrating
Migrate/resize uses scp to copy files back and forth with the libvirt driver. This shouldn’t be necessary with shared storage, but it may still need ssh configured for the user that nova runs as on each node in order to complete the migration. It is also possible that there is a bug in the code path dealing with shared storage, although I would have expected you to see a traceback somewhere. Vish On Nov 11, 2014, at 1:10 AM, Eduard Matei eduard.ma...@cloudfounders.com wrote: Hi, I'm testing our cinder volume driver in the following setup: - 2 nodes, ubuntu, devstack juno (2014.2.1) - shared storage (common backend), our custom software solution + cinder volume on shared storage - 1 instance running on node 1, /instances directory on shared storage - kvm, libvirt (with live migration flags) Live migration of instance between nodes works perfectly. Migrate simply blocks. The instance is in status Resize/Migrate, no errors in n-cpu or n-sch, and it stays like that for over 8 hours (all night). I thought it was copying the disk, but it's a 20GB sparse file with approx. 200 MB of data, and the nodes have a 1Gbps link, so it should be a couple of seconds. Any difference between live migration and migration? As I said, we use a shared filesystem-like storage solution so the volume files and the instance files are visible on both nodes, so no data needs copying. I know it's tricky to debug since we use a custom cinder driver, but does anyone have any ideas where to start looking? Thanks, Eduard -- Eduard Biceri Matei, Senior Software Developer www.cloudfounders.com | eduard.ma...@cloudfounders.com CloudFounders, The Private Cloud Software Company
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
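The ssh prerequisite Vish mentions is easy to probe from the source node; BatchMode makes a key/agent failure fail fast instead of prompting for a password. A sketch (the "nova" user name and the injectable cmd parameter are my additions for illustration and testing):

```python
import subprocess

def ssh_ok(host, user="nova", cmd=None):
    """True if `user` can reach `host` non-interactively, which the
    scp step of a cold migration requires."""
    cmd = cmd or ["ssh", "-o", "BatchMode=yes", "-o", "ConnectTimeout=5",
                  "%s@%s" % (user, host), "true"]
    return subprocess.call(cmd) == 0
```

Running this as the nova service user against the peer compute node would quickly confirm or rule out the ssh-configuration theory before digging into the shared-storage code path.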
Re: [openstack-dev] [nova] Undead DB objects: ProviderFirewallRule and InstanceGroupPolicy?
AFAIK they are relics. Vish On Nov 13, 2014, at 7:20 AM, Matthew Booth mbo...@redhat.com wrote: There are 3 db apis relating to ProviderFirewallRule: provider_fw_rule_create, provider_fw_rule_get_all, and provider_fw_rule_destroy. Of these, only provider_fw_rule_get_all seems to be used. i.e. It seems they can be queried, but not created. InstanceGroupPolicy doesn't seem to be used anywhere at all. _validate_instance_group_policy() in compute manager seems to be doing something else. Are these undead relics in need of a final stake through the heart, or is something else going on here? Thanks, Matt -- Matthew Booth Red Hat Engineering, Virtualisation Team Phone: +442070094448 (UK) GPG ID: D33C3490 GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Nova][Neutron] Error message when Neutron network is running out of IP addresses
It looks like this has not been reported so a bug would be great. It looks like it might be as easy as adding the NoMoreFixedIps exception to the list where FixedIpLimitExceeded is caught in nova/network/manager.py Vish On Nov 18, 2014, at 8:13 AM, Edgar Magana edgar.mag...@workday.com wrote: Hello Community, When a network subnet runs out of IP addresses a request to create a VM on that network fails with the error message: No valid host was found. There are not enough hosts available. In the nova logs the error message is: NoMoreFixedIps: No fixed IP addresses available for network: Obviously, this is not the desirable behavior. Is there any work in progress to change it, or should I open a bug to properly propagate the right error message? Thanks, Edgar ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
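The shape of the fix Vish sketches: wherever nova-network turns FixedIpLimitExceeded into a user-visible error, catch NoMoreFixedIps in the same clause. A standalone illustration (the exception classes are redeclared here so the snippet runs on its own; the real ones live in nova.exception, and the wrapper function is invented for illustration):

```python
class FixedIpLimitExceeded(Exception):
    pass

class NoMoreFixedIps(Exception):
    pass

def allocate_with_clear_error(allocate, network_id):
    """Map both 'quota hit' and 'subnet exhausted' to one clear error,
    instead of letting the scheduler report 'No valid host was found'."""
    try:
        return allocate(network_id)
    except (FixedIpLimitExceeded, NoMoreFixedIps) as exc:
        raise RuntimeError("No fixed IP addresses available for network "
                           "%s: %s" % (network_id, exc))
```

The point is simply that the two failures are indistinguishable from the user's perspective (no address could be allocated), so they should surface the same way.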
Re: [openstack-dev] [nova] pci pass through turing complete config options?
On Nov 6, 2014, at 7:56 PM, Ian Wienand iwien...@redhat.com wrote: On 10/29/2014 12:42 AM, Doug Hellmann wrote: Another way to do this, which has been used in some other projects, is to define one option for a list of “names” of things, and use those names to make groups with each field I've proposed that in [1]. I look forward to some -1's :) OTOH, oslo.config is not the only way we have to support configuration. This looks like a good example of settings that are more complex than what oslo.config is meant to handle, and that might be better served in a separate file with the location of that file specified in an oslo.config option. My personal opinion is that yet-another-config-file in possibly yet-another-format is just a few lines of code, but has a pretty high cost for packagers, testers, admins, etc. So I feel like that's probably a last resort. In most discussions I’ve had with deployers, they prefer multiple files, as it is easier to create a new file via puppet or chef when a feature is turned on than to add a bunch of new sections in the middle of an existing file. Vish -i [1] https://review.openstack.org/133138 ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
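Doug's "list of names, then one group per name" pattern, shown with plain configparser so it runs standalone (the section and option names here are invented for illustration and are not nova's actual pci options, nor oslo.config's API):

```python
import configparser

SAMPLE = """
[pci]
aliases = gpu, nic

[pci_alias:gpu]
vendor_id = 10de
product_id = 1eb8

[pci_alias:nic]
vendor_id = 8086
product_id = 10fb
"""

def load_aliases(text):
    """Read the name list, then one section of fields per name."""
    cp = configparser.ConfigParser()
    cp.read_string(text)
    names = [n.strip() for n in cp.get("pci", "aliases").split(",")]
    return {n: dict(cp.items("pci_alias:%s" % n)) for n in names}
```

This keeps each device's fields grouped and typed per section, rather than encoding arbitrarily nested structure into a single flat option value, which is the "turing complete config options" problem the subject line is complaining about.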
Re: [openstack-dev] TC election by the numbers
On Oct 30, 2014, at 10:41 AM, Zane Bitter zbit...@redhat.com wrote: On 30/10/14 06:22, Eoghan Glynn wrote: IIRC, there is no method for removing foundation members. So there are likely a number of people listed who have moved on to other activities and are no longer involved with OpenStack. I'd actually be quite interested to see the turnout numbers with voters who missed the last two elections prior to this one filtered out. Well, the base electorate for the TC are active contributors with patches landed to official projects within the past year, so these are devs getting their code merged but not interested in voting. This is somewhat different from (though potentially related to) the dead-weight foundation membership on the rolls for board elections. Also, foundation members who have not voted in two board elections are being removed from the membership now, from what I understand (we just needed to get to the point where we had two years' worth of board elections in the first place). Thanks, I lost my mind here and confused the board with the TC. So then my next question is, of those who did not vote, how many are from under-represented companies? A higher percentage there might point to disenfranchisement. Different but related question (might be hard to calculate though): If we remove people who have only ever landed one patch from the electorate, what do the turnout numbers look like? 2? 5? Do we have the ability to dig in slightly and find a natural definition or characterization amongst our currently voting electorate that might help us understand who the people are who do vote, and what it is about those people that might make them be or feel different or more enfranchised? I've personally been thinking that the one-patch rule is, while tractable, potentially strange for turnout - especially when one patch also gets you a free summit pass... but I have no data to say what actually defines "active" in "active technical contributor".
Again, the ballots are anonymized so we've no way of doing that analysis. The best we could do IIUC would be to analyze the electoral roll, bucketizing by number of patches landed, to see if there's a significant long tail of potential voters with very few patches. Just looking at stackalytics' numbers for Juno: out of 1556 committers, 1071 have committed more than one patch and 485 only a single patch. That's a third! Here's the trend over the past four cycles, with a moving average in the last column, as the eligible voters are derived from the preceding two cycles:

Release  | Committers | Single-patch | 2-cycle MA
Juno     | 1556       | 485 (31.2%)  | 29.8%
Icehouse | 1273       | 362 (28.4%)  | 28.0%
Havana   | 1005       | 278 (27.6%)  | 28.8%
Folsom   | 401        | 120 (29.9%)  | 27.9%

Correction, I skipped a cycle in that table:

Release  | Committers | Single-patch | 2-cycle MA
Juno     | 1556       | 485 (31.2%)  | 29.8%
Icehouse | 1273       | 362 (28.4%)  | 28.0%
Havana   | 1005       | 278 (27.6%)  | 28.0%
Grizzly  | 630        | 179 (28.4%)  | 29.2%
Folsom   | 401        | 120 (29.9%)  | 27.9%

Doesn't alter the trend though, still quite flat with some jitter and a small uptick. The low (and dropping) level of turnout is worrying, particularly in light of that analysis showing the proportion of drive-by contributors is relatively static, but it is always going to be hard to discern the motives of people who didn't vote from the single bit of data we have on them. There is, however, another metric that we can pull from the actual voting data: the number of candidates actually ranked by each voter:

Candidates ranked   Frequency
 0                    8  (2%)
 1                   17  (3%)
 2                   24  (5%)
 3                   20  (4%)
 4                   31  (6%)
 5                   36  (7%)
 6                   68 (13%)
 7                   39  (8%)
 8                   17  (3%)
 9                    9  (2%)
10                   21  (4%)
11                    -   (-)
12                  216 (43%)

(Note that it isn't possible to rank exactly n-1 candidates.) So even amongst the group of people who were engaged enough to vote, fewer than half ranked all of the candidates. A couple of hypotheses spring to mind: 1) People don't understand the voting system.
Under Condorcet, there is no such thing as tactical voting by an individual. So to the extent that these figures might reflect deliberate 'tactical' voting, it means people don't understand Condorcet. The size of the spike at 6 (the number of positions available - the same spike appeared at 7 in the previous election) strongly suggests that lack of understanding of the voting system is at least part of the story. The good news is that this problem is eminently addressable. 2) People aren't familiar with the candidates This is the one that worries me - it looks a lot like most
Re: [openstack-dev] [Neutron] Killing connection after security group rule deletion
If you exec conntrack inside the namespace with ip netns exec does it still show both connections? Vish

On Oct 23, 2014, at 3:22 AM, Elena Ezhova eezh...@mirantis.com wrote: Hi! I am working on a bug "ping still working once connected even after related security group rule is deleted" (https://bugs.launchpad.net/neutron/+bug/1335375). The gist of the problem is the following: when we delete a security group rule the corresponding rule in iptables is also deleted, but the connection that was allowed by that rule is not destroyed. The reason for such behavior is that in iptables we have the following structure of a chain that filters input packets for an interface of an instance:

Chain neutron-openvswi-i830fa99f-3 (1 references)
 pkts bytes target     prot opt in  out  source     destination
    0     0 DROP       all  --  *   *    0.0.0.0/0  0.0.0.0/0   state INVALID /* Drop packets that are not associated with a state. */
    0     0 RETURN     all  --  *   *    0.0.0.0/0  0.0.0.0/0   state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */
    0     0 RETURN     udp  --  *   *    10.0.0.3   0.0.0.0/0   udp spt:67 dpt:68
    0     0 RETURN     all  --  *   *    0.0.0.0/0  0.0.0.0/0   match-set IPv43a0d3610-8b38-43f2-8 src
    0     0 RETURN     tcp  --  *   *    0.0.0.0/0  0.0.0.0/0   tcp dpt:22    <- rule that allows ssh on port 22
    1    84 RETURN     icmp --  *   *    0.0.0.0/0  0.0.0.0/0
    0     0 neutron-openvswi-sg-fallback  all  --  *  *  0.0.0.0/0  0.0.0.0/0  /* Send unmatched traffic to the fallback chain. */

So, if we delete the rule that allows tcp on port 22, connections that are already established won't be closed, because all of their packets satisfy the rule:

    0     0 RETURN     all  --  *   *    0.0.0.0/0  0.0.0.0/0   state RELATED,ESTABLISHED /* Direct packets associated with a known session to the RETURN chain. */

I seek advice on the way how to deal with the problem. There are a couple of ideas how to do it (more or less realistic): Kill the connection using conntrack. The problem here is that it is sometimes impossible to tell which connection should be killed.
For example there may be two instances running in different namespaces that have the same ip addresses. As the compute node doesn't know anything about namespaces, it cannot distinguish between the two seemingly identical connections:

$ sudo conntrack -L | grep 10.0.0.5
tcp 6 431954 ESTABLISHED src=10.0.0.3 dst=10.0.0.5 sport=60723 dport=22 src=10.0.0.5 dst=10.0.0.3 sport=22 dport=60723 [ASSURED] mark=0 use=1
tcp 6 431976 ESTABLISHED src=10.0.0.3 dst=10.0.0.5 sport=60729 dport=22 src=10.0.0.5 dst=10.0.0.3 sport=22 dport=60729 [ASSURED] mark=0 use=1

I wonder whether there is any way to search for a connection by destination MAC? Another idea: delete the iptables rule that directs packets associated with a known session to the RETURN chain. It will force all packets to go through the full chain each time, and this will definitely make the connection close. But this will strongly affect performance. A timeout could be added after which this rule is restored, but it is uncertain how long it should be. Please share your thoughts on how it would be better to handle it. Thanks in advance, Elena

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
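For the conntrack idea, one possible direction is the one Vish hints at above: run conntrack inside the instance's namespace, where the duplicate addresses are no longer ambiguous. A minimal, untested sketch that only builds the command line; the helper name is invented and this is not existing Neutron code:

```python
def build_conntrack_delete_cmd(namespace, dest_ip, dest_port=None):
    """Build an argv for deleting conntrack entries toward dest_ip.

    Wrapping conntrack in 'ip netns exec <namespace>' scopes the
    deletion to that namespace's connection table, so two instances
    with the same fixed IP in different namespaces are
    distinguishable.  Executing this requires root and the
    conntrack-tools package.
    """
    cmd = ['ip', 'netns', 'exec', namespace, 'conntrack', '-D', '-d', dest_ip]
    if dest_port is not None:
        # Narrow the kill to a single service, e.g. the ssh rule above.
        cmd += ['-p', 'tcp', '--dport', str(dest_port)]
    return cmd
```

With a hypothetical namespace name, build_conntrack_delete_cmd('qdhcp-830fa99f', '10.0.0.5', 22) would target only the established ssh sessions shown above, leaving other traffic alone.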
Re: [openstack-dev] [kolla] on Dockerfile patterns
On Oct 14, 2014, at 1:21 PM, Lars Kellogg-Stedman l...@redhat.com wrote: On Tue, Oct 14, 2014 at 04:06:22PM -0400, Jay Pipes wrote: I understand that general feeling, but system administration tasks like debugging networking issues or determining and grepping log file locations or diagnosing packaging issues for OpenStack services or performing database logfile maintenance and backups don't just go away because you're using containers, right? They don't go away, but they're not necessarily things that you would do inside your container. Any state (e.g., database tables) that has a lifetime different from that of your container should be stored outside of the container proper. In docker, this would be a volume (in a cloud environment, this would be something like EBS or a Cinder volume). Ideally, your container-optimized application logs to stdout/stderr. If you have multiple processes, they each run in a separate container. Backups take advantage of the data volumes you've associated with your container, e.g., spawning a new container using the docker --volumes-from option to access that data for backup purposes. If you really need to get inside a container for diagnostic purposes, then you use something like nsenter, nsinit, or the forthcoming "docker exec". "Something like" isn't good enough here. There must be a standard way to do this stuff or people will continue to build fat containers with all of their pet tools inside. This means containers will just be another incarnation of virtualization. Vish they very much seem to be developed from the point of view of application developers, and not so much from the point of view of operators who need to maintain and support those applications. I think it's entirely accurate to say that they are application-centric, much like services such as Heroku, OpenShift, etc.
-- Lars Kellogg-Stedman l...@redhat.com | larsks @ {freenode,twitter,github} Cloud Engineering / OpenStack | http://blog.oddbit.com/
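The --volumes-from backup pattern Lars describes can be made concrete. A sketch that only assembles the docker command line; the container and path names are made up for illustration:

```python
def build_backup_cmd(data_container, host_backup_dir, data_path):
    """argv for a throwaway container that tars up another container's
    data volume: the data is mounted via --volumes-from, and a host
    directory is mounted at /backup to receive the archive."""
    return ['docker', 'run', '--rm',
            '--volumes-from', data_container,
            '-v', '%s:/backup' % host_backup_dir,
            'busybox', 'tar', 'cf', '/backup/data.tar', data_path]
```

For example, build_backup_cmd('mysql-data', '/srv/backups', '/var/lib/mysql') yields a command that archives the MySQL data volume without ever entering the database container itself.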
Re: [openstack-dev] [kolla] on Dockerfile patterns
On Oct 14, 2014, at 1:12 PM, Clint Byrum cl...@fewbar.com wrote: Excerpts from Lars Kellogg-Stedman's message of 2014-10-14 12:50:48 -0700: On Tue, Oct 14, 2014 at 03:25:56PM -0400, Jay Pipes wrote: I think the above strategy is spot on. Unfortunately, that's not how the Docker ecosystem works. I'm not sure I agree here, but again nobody is forcing you to use this tool. operating system that the image is built for. I see you didn't respond to my point that in your openstack-containers environment, you end up with Debian *and* Fedora images, since you use the official MySQL dockerhub image. And therefore you will end up needing to know sysadmin specifics (such as how network interfaces are set up) on multiple operating system distributions. I missed that part, but ideally you don't *care* about the distribution in use. All you care about is the application. Your container environment (docker itself, or maybe a higher level abstraction) sets up networking for you, and away you go. If you have to perform system administration tasks inside your containers, my general feeling is that something is wrong. Speaking as a curmudgeon ops guy from back in the day.. the reason I choose the OS I do is precisely because it helps me _when something is wrong_. And the best way an OS can help me is to provide excellent debugging tools, and otherwise move out of the way. When something _is_ wrong and I want to attach GDB to mysqld in said container, I could build a new container with debugging tools installed, but that may lose the very system state that I'm debugging. So I need to run things inside the container like apt-get or yum to install GDB.. and at some point you start to realize that having a whole OS is actually a good thing even if it means needing to think about a few more things up front, such as which OS will I use? and what tools do I need installed in my containers? What I mean to say is, just grabbing off the shelf has unstated consequences. 
If this is how people are going to use and think about containers, I would submit they are a huge waste of time. The performance value they offer is dramatically outweighed by the flexibility and existing tooling that exists for virtual machines. As I state in my blog post[1], if we really want to get value from containers, we must convert to the single-application-per-container view. This means having standard ways of doing the above either on the host machine or in a debugging container that is as easy (or easier) than the workflow you mention. There are not good ways to do this yet, and the community hand-waves it away, saying things like "well, you could ...". "You could" isn't good enough. The result is that a lot of people that are using containers today are doing fat containers with a full OS. Vish [1] https://medium.com/@vishvananda/standard-components-not-standard-containers-c30567f23da6 Sure, Docker isn't any more limiting than using a VM or bare hardware, but if you use the official Docker images, it is more limiting, no? No more so than grabbing a virtual appliance rather than building a system yourself. In other words: sure, it's less flexible, but possibly it's faster to get started, which is especially useful if your primary goal is not to be a database administrator but to actually write an application that uses a database backend. I think there are use cases for both official and customized images. In the case of Kolla, we're deploying OpenStack, not just some new application that uses a database backend. I think the bar is a bit higher for operations than end-user applications, since it sits below the abstractions, much closer to the metal.
Re: [openstack-dev] [neutron][all] Naming convention for unused variables
On Oct 13, 2014, at 6:28 PM, Angus Lees g...@inodes.org wrote: (Context: https://review.openstack.org/#/c/117418/) I'm looking for some rough consensus on what naming conventions we want for unused variables in Neutron, and across the larger OpenStack python codebase, since there's no reason for Neutron to innovate here. As far as I can see, there are two cases:

1. The "I just don't care" variable
   Eg: _, _, filename = path.rpartition('/')
   In python this is very commonly '_', but this conflicts with the gettext builtin so we should avoid it in OpenStack. Possible candidates include:
   a. 'x'
   b. '__' (double-underscore)
   c. No convention

2. "I know it is unused, but the name still serves as documentation"
   Note this turns up as two cases: as a local, and as a function parameter.
   Eg: out, _err = execute('df', path)
   Eg: def makefile(self, _mode, _other): return self._buffer
   I deliberately chose that second example to highlight that the leading-underscore convention collides with its use for private properties. Possible candidates include:
   a. _foo (leading-underscore, note collides with private properties)
   b. unused_foo (suggested in the Google python styleguide)
   c. NOQA_foo (as suggested in c/117418)
   d. No convention (including not indicating that variables are known-unused)

I prefer a. Private properties are explicitly prefixed with self. so it doesn't seem to be a conflict to me. Vish

As with all style discussions, everyone feels irrationally attached to their favourite, but the important bit is to be consistent, to aid readability (and in this case, also to help the mechanical code checkers). Vote / Discuss / Suggest additional alternatives. -- - Gus
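To make the two cases concrete, here is a small self-contained illustration of the candidate spellings; the function and file names are invented for the example:

```python
# Case 1: the "I just don't care" variable.  '_' collides with the
# gettext builtin that OpenStack installs, so '__' (double
# underscore) is one workaround:
directory, __, filename = 'etc/neutron/neutron.conf'.rpartition('/')

# Case 2: unused, but the name still documents intent, here using
# the Google-styleguide 'unused_' prefix on a parameter:
def fs_and_use(df_line, unused_headers=None):
    """Return (filesystem, use%) from one data line of 'df' output."""
    fields = df_line.split()
    return fields[0], fields[4]
```

The 'unused_' prefix sidesteps the leading-underscore collision with private attributes that Angus points out, at the cost of being wordier.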
Re: [openstack-dev] [qa] Cannot start the VM console when VM is launched at Compute node
No this is not expected and may represent a misconfiguration or a bug. Something is returning a 404 when it shouldn't. You might get more luck running the nova command with --debug to see what specifically is 404ing. You could also see if anything is reporting NotFound in the nova-consoleauth or nova-api or nova-compute logs. Vish

On Oct 14, 2014, at 10:45 AM, Danny Choi (dannchoi) dannc...@cisco.com wrote: Hi, I used devstack to deploy multi-node OpenStack, with Controller + nova-compute + Network on one physical node (qa4), and Compute on a separate physical node (qa5). When I launch a VM which spun up on the Compute node (qa5), I cannot launch the VM console, in both CLI and Horizon.

localadmin@qa4:~/devstack$ nova hypervisor-servers q
+--------------------------------------+-----------------+---------------+---------------------+
| ID                                   | Name            | Hypervisor ID | Hypervisor Hostname |
+--------------------------------------+-----------------+---------------+---------------------+
| 48b16e7c-0a17-42f8-9439-3146f26b4cd8 | instance-000e   | 1             | qa4                 |
| 3eadf190-465b-4e90-ba49-7bc8ce7f12b9 | instance-000f   | 1             | qa4                 |
| 056d4ad2-e081-4706-b7d1-84ee281e65fc | instance-0010   | 2             | qa5                 |
+--------------------------------------+-----------------+---------------+---------------------+

localadmin@qa4:~/devstack$ nova list
+--------------------------------------+------+--------+------------+-------------+---------------------------------+
| ID                                   | Name | Status | Task State | Power State | Networks                        |
+--------------------------------------+------+--------+------------+-------------+---------------------------------+
| 3eadf190-465b-4e90-ba49-7bc8ce7f12b9 | vm1  | ACTIVE | -          | Running     | private=10.0.0.17               |
| 48b16e7c-0a17-42f8-9439-3146f26b4cd8 | vm2  | ACTIVE | -          | Running     | private=10.0.0.16, 172.29.173.4 |
| 056d4ad2-e081-4706-b7d1-84ee281e65fc | vm3  | ACTIVE | -          | Running     | private=10.0.0.18, 172.29.173.5 |
+--------------------------------------+------+--------+------------+-------------+---------------------------------+

localadmin@qa4:~/devstack$ nova get-vnc-console vm3 novnc
ERROR (CommandError): No server with a name or ID of 'vm3' exists. [ERROR]

This does not happen if the VM resides at the Controller (qa4):

localadmin@qa4:~/devstack$ nova get-vnc-console vm2 novnc
+-------+-------------------------------------------------------------------------------------+
| Type  | Url                                                                                 |
+-------+-------------------------------------------------------------------------------------+
| novnc | http://172.29.172.161:6080/vnc_auto.html?token=f556dea2-125d-49ed-bfb7-55a9a7714b2e |
+-------+-------------------------------------------------------------------------------------+

Is this expected behavior?
Thanks, Danny
Re: [openstack-dev] [qa] nova get-password does not seem to work
get-password only works if you have something in the guest generating the encrypted password and posting it to the metadata server. Cloud-init for Windows (the primary use case) will do this for you. You can do something similar for Ubuntu using this script: https://gist.github.com/vishvananda/4008762 If cirros has usermod and openssl installed it may work there as well. Note that you can pass the script in as userdata (see the comments at the end). Vish

On Oct 15, 2014, at 8:02 AM, Danny Choi (dannchoi) dannc...@cisco.com wrote: Hi, I used devstack to deploy Juno OpenStack. I spin up an instance with cirros-0.3.2-x86_64-uec. By default, username/password is cirros/cubswin:) When I execute the command "nova get-password", nothing is returned.

localadmin@qa4:/etc/nova$ nova show vm1
+--------------------------------------+----------------------------------------------------------------+
| Property                             | Value                                                          |
+--------------------------------------+----------------------------------------------------------------+
| OS-DCF:diskConfig                    | MANUAL                                                         |
| OS-EXT-AZ:availability_zone          | nova                                                           |
| OS-EXT-STS:power_state               | 1                                                              |
| OS-EXT-STS:task_state                | -                                                              |
| OS-EXT-STS:vm_state                  | active                                                         |
| OS-SRV-USG:launched_at               | 2014-10-15T14:48:04.00                                         |
| OS-SRV-USG:terminated_at             | -                                                              |
| accessIPv4                           |                                                                |
| accessIPv6                           |                                                                |
| config_drive                         |                                                                |
| created                              | 2014-10-15T14:47:56Z                                           |
| flavor                               | m1.tiny (1)                                                    |
| hostId                               | ea715752b11cf96b95f9742513a351d2d6571c4fdb76f497d64ecddb       |
| id                                   | 1a3c487e-c3a3-4783-bd0b-e3c87bf22c3f                           |
| image                                | cirros-0.3.2-x86_64-uec (1dda953b-9319-4c43-bd20-1ef75b491553) |
| key_name                             | cirros-key                                                     |
| metadata                             | {}                                                             |
| name                                 | vm1                                                            |
| os-extended-volumes:volumes_attached | []                                                             |
| private network                      | 10.0.0.11                                                      |
| progress                             | 0                                                              |
| security_groups                      | default                                                        |
| status                               | ACTIVE                                                         |
| tenant_id                            | c8daf9bd6dda40a982b074322c08da7d                               |
| updated                              | 2014-10-15T14:48:04Z                                           |
| user_id                              | 2cbbafae01404d4ebeb6e6fbacfa6546                               |
+--------------------------------------+----------------------------------------------------------------+

localadmin@qa4:/etc/nova$ nova help get-password
usage: nova get-password server [private-key]

Get password for a server.

Positional arguments:
  server       Name or ID of server.
  private-key  Private key (used locally to decrypt password) (Optional).
When specified, the command displays the clear (decrypted) VM password. When not specified, the ciphered VM password is displayed.

localadmin@qa4:/etc/nova$ nova get-password vm1
[NOTHING RETURNED]
localadmin@qa4:/etc/nova$

Am I missing something? Thanks, Danny
Re: [openstack-dev] [nova] Resource tracker
On Oct 7, 2014, at 6:21 AM, Daniel P. Berrange berra...@redhat.com wrote: On Mon, Oct 06, 2014 at 02:55:20PM -0700, Joe Gordon wrote: On Mon, Oct 6, 2014 at 6:03 AM, Gary Kotton gkot...@vmware.com wrote: Hi, At the moment the resource tracker in Nova ignores that statistics that are returned by the hypervisor and it calculates the values on its own. Not only is this highly error prone but it is also very costly – all of the resources on the host are read from the database. Not only the fact that we are doing something very costly is troubling, the fact that we are over calculating resources used by the hypervisor is also an issue. In my opinion this leads us to not fully utilize hosts at our disposal. I have a number of concerns with this approach and would like to know why we are not using the actual resource reported by the hypervisor. The reason for asking this is that I have added a patch which uses the actual hypervisor resources returned and it lead to a discussion on the particular review (https://review.openstack.org/126237). So it sounds like you have mentioned two concerns here: 1. The current method to calculate hypervisor usage is expensive in terms of database access. 2. Nova ignores that statistics that are returned by the hypervisor and uses its own calculations. To #1, maybe we can doing something better, optimize the query, cache the result etc. As for #2 nova intentionally doesn't use the hypervisor statistics for a few reasons: * Make scheduling more deterministic, make it easier to reproduce issues etc. * Things like memory ballooning and thin provisioning in general, mean that the hypervisor is not reporting how much of the resources can be allocated but rather how much are currently in use (This behavior can vary from hypervisor to hypervisor today AFAIK -- which makes things confusing). So if I don't want to over subscribe RAM, and the hypervisor is using memory ballooning, the hypervisor statistics are mostly useless. 
I am sure there are more complex schemes that we can come up with that allow us to factor in the properties of thin provisioning, but is the extra complexity worth it? That is just an example of problems with the way Nova virt drivers /currently/ report usage to the scheduler. It is easily within the realm of possibility for the virt drivers to be changed so that they report stats which take into account things like ballooning and thin provisioning so that we don't oversubscribe. Ignoring the hypervisor stats entirely and re-doing the calculations in the resource tracker code is just a crude workaround really. It is just swapping one set of problems for a new set of problems. +1 let's make the hypervisors report detailed enough information that we can do it without having to recalculate. Vish That being said I am fine with discussing in a spec the idea of adding an option to use the hypervisor reported statistics, as long as it is off by default. I'm against the idea of adding config options to switch between multiple codepaths because it is just punting the problem to the admins who are in an even worse position to decide what is best. It is saying "would you rather your cloud have bug A or bug B?" We should be fixing the data the hypervisors report so that the resource tracker doesn't have to ignore them, and give the admins something which just works and avoid having to choose between 2 differently broken options.
Regards, Daniel
-- 
|: http://berrange.com       -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org        -o- http://virt-manager.org :|
|: http://autobuild.org      -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
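To illustrate the accounting being debated above: the resource tracker computes free capacity from its own claims rather than from live hypervisor numbers. A toy version of that calculation (the function name and figures are invented for illustration):

```python
def tracker_free_ram_mb(total_mb, ram_allocation_ratio, claimed_mb):
    """Free RAM as the resource tracker sees it: physical capacity
    scaled by the overcommit ratio, minus the sum of instance claims.
    A hypervisor using ballooning may report more (or less) free
    memory at any instant, which is exactly why claim-based
    accounting is more deterministic."""
    return int(total_mb * ram_allocation_ratio) - sum(claimed_mb)
```

On a 32768 MB host with a 1.5 ratio and claims of 4096 and 8192 MB, the tracker considers 36864 MB still claimable, regardless of what the guests' balloons are doing at that moment.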
Re: [openstack-dev] [Glance] Granularity of policies
On Oct 6, 2014, at 12:35 PM, Eddie Sheffield eddie.sheffi...@rackspace.com wrote: I encountered an interesting situation with Glance policies. Basically we have a situation where users in certain roles are not allowed to make certain calls at all. In this specific case, we don't want users in those roles listing or viewing members. When listing members, these users receive a 403 (Forbidden) but when showing an individual member the users receive 404 (Not Found). So the problem is that there are a couple of situations here and we don't (can't?) distinguish the exact intent: 1) A user IS allowed to make the call but isn't allowed to see a particular member - in that case 404 makes sense because a 403 could imply the user actually is there, you just can't see them directly. 2) A user IS NOT allowed to make the call at all. In this case a 403 makes more sense because the user is forbidden at the call level. At this point I'm mainly trying to spark some conversation on this. This feels a bit inconsistent if users get 403 for a whole set of calls they are barred from but 404 for others which are sub calls of the others (e.g. listing members vs. showing a specific one.) But I don't have a specific proposal at this time - first I'm trying to find out if others feel this is a problem which should be addressed. If so I'm willing to work on a blueprint and implementation Generally you use a 404 to make sure no information is exposed about whether the user actually exists, but in the case of 2) I agree that a 403 is appropriate. It may be that 404 was used there because the same code path is taken in both cases. Vish
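The convention discussed above can be stated in a few lines. This is a sketch of the desired behaviour only, not actual Glance code:

```python
def member_show_status(call_allowed_by_policy, member_visible_to_caller):
    """Pick a status for 'show member': 403 when the caller's role is
    barred from this class of call entirely, 404 when the call is
    allowed but this particular member must stay hidden (so the
    member's existence is not leaked)."""
    if not call_allowed_by_policy:
        return 403
    if not member_visible_to_caller:
        return 404
    return 200
```

The point is that the policy check and the visibility check are separate decisions, even if today's implementation happens to funnel both through one code path.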
Re: [openstack-dev] [nova] why do we have os-attach-interfaces in the v3 API?
If you can add and delete interfaces it seems like being able to list them is useful. You can get the info you need from the networks list when you get the instance, but "nova interface-list" seems like a useful addition if we have "interface-attach" and "interface-detach", so in this case I think I would suggest that we leave the proxying in and implement it for nova-network as well. I was looking for an analogue with cinder but it looks like we have a volume-attach and volume-detach there without a volume-list server command, so if people want to kill it I'm ok with that too. Vish On Oct 2, 2014, at 2:43 PM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: On 10/2/2014 4:34 PM, Vishvananda Ishaya wrote: os-attach-interfaces is actually a forward port of: http://git.openstack.org/cgit/openstack/nova/tree/nova/api/openstack/compute/contrib/attach_interfaces.py which is a compute action that is valid for both nova-network and neutron: http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/api.py#n2991 On Oct 2, 2014, at 1:57 PM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: The os-interface (v2) and os-attach-interfaces (v3) APIs are only used for the neutron network API, you'll get a NotImplemented if trying to call the related methods with nova-network [1]. Since we aren't proxying to neutron in the v3 API (v2.1), why does os-attach-interfaces [2] exist? Was this just an oversight? If so, please allow me to delete it.
:) [1] http://git.openstack.org/cgit/openstack/nova/tree/nova/network/api.py?id=2014.2.rc1#n310 [2] http://git.openstack.org/cgit/openstack/nova/tree/nova/api/openstack/compute/plugins/v3/attach_interfaces.py?id=2014.2.rc1 -- Thanks, Matt Riedemann

OK so create/delete call the compute_api to attach/detach, but show and index are calling the network_api on port methods which are neutron only, so I guess that's what I'm talking about as far as removing. Personally I don't think it hurts anything, but I'm getting mixed signals about the stance on neutron proxying in the v2.1 API. -- Thanks, Matt Riedemann
Re: [openstack-dev] Quota management and enforcement across projects
The proposal in the past was to keep quota enforcement local, but to put the resource limits into keystone. This seems like an obvious first step to me. Then a shared library for enforcing quotas with decent performance should be next. The quota calls in nova are extremely inefficient right now and it will only get worse when we try to add hierarchical projects and quotas. Vish On Oct 3, 2014, at 7:53 AM, Duncan Thomas duncan.tho...@gmail.com wrote: Taking quota out of the service / adding remote calls for quota management is going to make things fragile - you've somehow got to deal with the cases where your quota manager is slow, goes away, hiccups, drops connections etc. You'll also need some way of reconciling actual usage against quota usage periodically, to detect problems. On 3 October 2014 15:03, Salvatore Orlando sorla...@nicira.com wrote: Hi, Quota management is currently one of those things where every openstack project does its own thing. While quotas are obviously managed in a similar way for each project, there are subtle differences which ultimately result in lack of usability. I recall that in the past there have been several calls for unifying quota management. The blueprint [1] for instance, hints at the possibility of storing quotas in keystone. On the other hand, the blazar project [2, 3] seems to aim at solving this problem for good enabling resource reservation and therefore potentially freeing openstack projects from managing and enforcing quotas. While Blazar is definetely a good thing to have, I'm not entirely sure we want to make it a required component for every deployment. Perhaps single projects should still be able to enforce quota. On the other hand, at least on paper, the idea of making Keystone THE endpoint for managing quotas, and then letting the various project enforce them, sounds promising - is there any reason for which this blueprint is stalled to the point that it seems forgotten now? 
I'm coming to the mailing list with these random questions about quota management, for two reasons: 1) despite developing and using openstack on a daily basis I'm still confused by quotas 2) I've found a race condition in neutron quotas and the fix is not trivial. So, rather than start coding right away, it might probably make more sense to ask the community if there is already a known better approach to quota management - and obviously enforcement. Thanks in advance, Salvatore [1] https://blueprints.launchpad.net/keystone/+spec/service-metadata [2] https://wiki.openstack.org/wiki/Blazar [3] https://review.openstack.org/#/q/project:stackforge/blazar,n,z -- Duncan Thomas
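The split Vish proposes at the top of this thread (limits stored centrally, enforcement kept local) reduces to a small check on the hot path. A sketch with invented names, where 'limit' would be fetched from the central store (e.g. Keystone, per the stalled blueprint) and cached, while usage stays in the service's own database:

```python
class OverQuota(Exception):
    pass

def reserve(requested, in_use, limit):
    """Local quota enforcement: no remote calls at request time.
    A negative limit means unlimited, matching the usual OpenStack
    quota convention.  Returns the new usage count on success."""
    if 0 <= limit < in_use + requested:
        raise OverQuota('requested %d, %d in use, limit %d'
                        % (requested, in_use, limit))
    return in_use + requested
```

The fragility Duncan worries about is then limited to limit refreshes: if the central store is briefly unreachable, a service can keep enforcing the last limit it saw rather than blocking requests.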
Re: [openstack-dev] [nova] evacuating boot-from-volume instances with local storage is broken?
On Oct 2, 2014, at 2:05 PM, Chris Friesen chris.frie...@windriver.com wrote: On 10/02/2014 02:24 PM, Jay Pipes wrote: On 10/02/2014 02:29 PM, Chris Friesen wrote: Hi, I'm interested in running nova evacuate on an instance that has local storage but was booted from a cinder volume. OpenStack allows live-migration of this sort of instance, so I'm assuming that we would want to allow evacuation as well... I'm getting ready to test it, but I see that there was a nova bug opened against this case back in March (https://bugs.launchpad.net/nova/+bug/1299368). It's been confirmed but hasn't even had an importance assigned yet. Anyone can sign up for the https://launchpad.net/~nova team on Launchpad and set an importance for any Nova bug. I don't think that's correct. I'm a member of the nova team but I'm not allowed to change the importance of bugs. The mouseover message for the Importance field says that it is changeable only by "a project maintainer or bug supervisor". The team is: https://launchpad.net/~nova-bugs Vish
Re: [openstack-dev] Help with EC2 Driver functionality using boto ...
It is hard to tell if this is a bug or a misconfiguration from your description. The failure likely generated some kind of error message in nova or glance. If you can track down an error message and a traceback it would be worth submitting as a bug report to the appropriate project. Vish On Oct 1, 2014, at 11:13 AM, Aparna S Parikh apa...@thoughtworks.com wrote: Hi, We are currently working on writing a driver for Amazon's EC2 using the boto libraries, and are hung up on creating a snapshot of an instance. The instance remains in 'Queued' status on Openstack instead of becoming 'Active'. The actual EC2 snapshot that gets created is in 'available' status. We are essentially calling create_image() from boto/ec2/instance.py when a snapshot of an instance is requested. Any help in figuring this out would be greatly appreciated. Thanks, Aparna
Re: [openstack-dev] [kolla] Kolla Blueprints
On Oct 1, 2014, at 2:05 PM, Fox, Kevin M kevin@pnnl.gov wrote: Has anyone figured out a way of having a floating ip like feature with docker so that you can have rabbitmq, mysql, or ceph mon's at fixed ip's and be able to migrate them around from physical host to physical host and still have them at fixed locations that you can easily put in static config files? There are[1] many[2] ways[3] to do this, but in general I don’t think they pass the “too much magic” sniff test. I think the standard docker approach of passing in the necessary ips via environment variables is probably the most user friendly option. Containers are light-weight enough to restart if the data changes. [1] https://github.com/coreos/flannel [2] https://github.com/vishvananda/wormhole [3] https://github.com/openshift/geard/blob/master/docs/linking.md Vish Maybe iptables rules? Maybe adding another bridge? Maybe just disabling the docker network stack all together and binding the service to a fixed, static address on the host? Also, I ran across: http://jperrin.github.io/centos/2014/09/25/centos-docker-and-systemd/ and it does seem to work. I was able to get openssh-server and keystone to work in the same container without needing to write custom start/stop scripts. This kind of setup would make a nova compute container much, much easier. Thanks, Kevin From: Steven Dake [sd...@redhat.com] Sent: Wednesday, October 01, 2014 8:04 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [kolla] Kolla Blueprints On 09/30/2014 09:55 AM, Chmouel Boudjnah wrote: On Tue, Sep 30, 2014 at 6:41 PM, Steven Dake sd...@redhat.com wrote: I've done a first round of prioritization. I think key things we need people to step up for are nova and rabbitmq containers. For the developers, please take a moment to pick a specific blueprint to work on. 
If you're already working on something, this should help to prevent duplicate work :) As I understand it, in the current implementations[1] the containers are configured with a mix of shell scripts using crudini and other shell commands. Is that the way to configure the containers, and is a deployment tool like Ansible (or others) planned to be used in the future? Chmouel Chmouel, I am not really sure what the best solution for configuring the containers is. It is clear to me that the current shell scripts are fragile in nature and do not handle container restart properly. The idea of using Puppet or Ansible as a CM tool has been discussed with no resolution. At the moment, I'm satisfied with a somewhat hacky solution if we can get the containers operational. Regards, -steve [1] from https://github.com/jlabocki/superhappyfunshow/
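Kevin's suggestion above of passing the necessary IPs into containers via environment variables can be sketched as follows. This is only an illustration of the pattern, not Kolla code; the variable names (RABBITMQ_HOST, MARIADB_HOST) are invented for the example:

```python
import os


def service_endpoints(environ=os.environ):
    """Resolve backing-service addresses from the container's environment,
    falling back to localhost when a variable is unset."""
    return {
        "rabbit": environ.get("RABBITMQ_HOST", "127.0.0.1"),
        "mysql": environ.get("MARIADB_HOST", "127.0.0.1"),
    }


# A container started with `docker run -e RABBITMQ_HOST=10.0.0.5 ...`
# would resolve the rabbit endpoint to 10.0.0.5 and leave mysql at the default.
print(service_endpoints({"RABBITMQ_HOST": "10.0.0.5"}))
```

Because containers are cheap to restart, a change of address is handled by restarting the container with new environment variables rather than rewriting static config files.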
Re: [openstack-dev] [infra] Nominating Sean Dague for project-config-core
Based on the amazing work that Sean does across a whole slew of repositories, can we just give him +2 rights on everything? ;) Vish P.S. But seriously, I am truly impressed with how much Sean puts into this project. On Sep 26, 2014, at 8:35 AM, James E. Blair cor...@inaugust.com wrote: I'm pleased to nominate Sean Dague to the project-config core team. The project-config repo is a constituent of the Infrastructure Program and has a core team structured to be a superset of infra-core with additional reviewers who specialize in the area. For some time, Sean has been the person we consult to make sure that changes to the CI system are testing what we think we should be testing (and just as importantly, not testing what we think we should not be testing). His knowledge of devstack, devstack-gate, tempest, and nova is immensely helpful in making sense of what we're actually trying to accomplish. Please respond with support or concerns and if the consensus is in favor, we will add him next week. Thanks, Jim ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev signature.asc Description: Message signed with OpenPGP using GPGMail ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] Thoughts on OpenStack Layers and a Big Tent model
On Sep 26, 2014, at 1:25 AM, Thierry Carrez thie...@openstack.org wrote: That said, singling out the test infrastructure (3) and the release management (2) is a bit unfair to other horizontal efforts, like Documentation, Translations, or general QA, which also suffer from a scale issue. The Docs team, in particular, couldn't really scale either and already implemented two tiers within the integrated release -- the part they directly support, and the part they help / provide tools for. You are correct, I left out some very important cross-project teams (Sorry Anne!). We call the projects that tap these shared resources today “integrated”. It seems like if we keep going down this path, we need to either: a) make “integrated” == layer1 This would be a tricky proposition because of the way it might look in the community, but it may be necessary if we are already too overloaded to handle the cross-project concerns at our current scale. b) clearly delineate “integrated” vs layer1 in some other way This would likely involve changing the meaning of integrated to mean a bit less than it does today: just best effort for projects outside of layer1. All the other projects would be able to get help and tools from those horizontal teams, but would just not be directly taken care of. This is how Docs currently work (the two tiers within integrated release), this is how Release management currently works (integrated vs. incubated), this is how the test infra would like to be able to work... Making sure we at least support the layer #1 is just a matter of setting the same basic expectations for the same basic set of projects. It sounds like you are suggesting that b) has already occurred to some degree so that we should just continue along those lines? Vish
Re: [openstack-dev] [Nova] Release criticality of bug 1365606 (get_network_info efficiency for nova-network)
To explain my rationale: I think it is totally reasonable to be conservative and wait to merge the actual fixes to the network calls[1][2] until Kilo and have them go through the stable/backports process. Unfortunately, due to our object design, if we block https://review.openstack.org/#/c/119521/ then there is no way we can backport those fixes, so we are stuck for a full 6 months with abysmal performance. This is why I’ve been pushing to get that one fix in. That said, I will happily decouple the two patches. Vish [1] https://review.openstack.org/#/c/119522/9 [2] https://review.openstack.org/#/c/119523/10 On Sep 24, 2014, at 3:51 PM, Michael Still mi...@stillhq.com wrote: Hi, so, I'd really like to see https://review.openstack.org/#/c/121663/ merged in rc1. That patch is approved right now. However, it depends on https://review.openstack.org/#/c/119521/, which is not approved. 119521 fixes a problem where we make five RPC calls per call to get_network_info, which is an obvious efficiency problem. Talking to Vish, who is the author of these patches, it sounds like the efficiency issue is a pretty big deal for users of nova-network and he'd like to see 119521 land in Juno. I think that means he's effectively arguing that the bug is release critical. On the other hand, its only a couple of days until rc1, so we're trying to be super conservative about what we land now in Juno. So... I'd like to see a bit of a conversation on what call we make here. Do we land 119521? Michael -- Rackspace Australia signature.asc Description: Message signed with OpenPGP using GPGMail ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
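For context, the performance issue under discussion (five RPC round trips per call to get_network_info) can be illustrated with a toy model. This is not Nova code and all names below are hypothetical; it only shows why collapsing per-item RPC calls into one batched call matters so much for nova-network:

```python
# Toy illustration: N small RPC round trips vs one batched call.
class FakeRpcClient:
    """Counts round trips instead of actually crossing a message bus."""

    def __init__(self):
        self.calls = 0

    def get_network(self, net_id):
        self.calls += 1          # one round trip per network
        return {"id": net_id}

    def get_networks(self, net_ids):
        self.calls += 1          # one round trip for the whole list
        return [{"id": n} for n in net_ids]


def get_network_info_slow(rpc, net_ids):
    # The shape of the problem: one call per network.
    return [rpc.get_network(n) for n in net_ids]


def get_network_info_batched(rpc, net_ids):
    # The shape of the fix: everything in one round trip.
    return rpc.get_networks(net_ids)


slow, fast = FakeRpcClient(), FakeRpcClient()
nets = ["net-%d" % i for i in range(5)]
assert get_network_info_slow(slow, nets) == get_network_info_batched(fast, nets)
print(slow.calls, fast.calls)  # 5 1
```

With real message-bus latency per round trip, the difference multiplies across every instance on every periodic task, which is why the fix was argued to be release critical.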
Re: [openstack-dev] [Nova] Release criticality of bug 1365606 (get_network_info efficiency for nova-network)
Ok new versions have reversed the order so we can take: https://review.openstack.org/#/c/121663/4 before: https://review.openstack.org/#/c/119521/10 I still strongly recommend that we take the second so we at least have the possibility of backporting the other two patches. And I also wouldn’t complain if we just took all 4 :) Vish On Sep 25, 2014, at 9:44 AM, Vishvananda Ishaya vishvana...@gmail.com wrote: To explain my rationale: I think it is totally reasonable to be conservative and wait to merge the actual fixes to the network calls[1][2] until Kilo and have them go through the stable/backports process. Unfortunately, due to our object design, if we block https://review.openstack.org/#/c/119521/ then there is no way we can backport those fixes, so we are stuck for a full 6 months with abysmal performance. This is why I’ve been pushing to get that one fix in. That said, I will happily decouple the two patches. Vish [1] https://review.openstack.org/#/c/119522/9 [2] https://review.openstack.org/#/c/119523/10 On Sep 24, 2014, at 3:51 PM, Michael Still mi...@stillhq.com wrote: Hi, so, I'd really like to see https://review.openstack.org/#/c/121663/ merged in rc1. That patch is approved right now. However, it depends on https://review.openstack.org/#/c/119521/, which is not approved. 119521 fixes a problem where we make five RPC calls per call to get_network_info, which is an obvious efficiency problem. Talking to Vish, who is the author of these patches, it sounds like the efficiency issue is a pretty big deal for users of nova-network and he'd like to see 119521 land in Juno. I think that means he's effectively arguing that the bug is release critical. On the other hand, its only a couple of days until rc1, so we're trying to be super conservative about what we land now in Juno. So... I'd like to see a bit of a conversation on what call we make here. Do we land 119521? 
Michael -- Rackspace Australia
Re: [openstack-dev] Thoughts on OpenStack Layers and a Big Tent model
On Sep 24, 2014, at 10:55 AM, Zane Bitter zbit...@redhat.com wrote: On 18/09/14 14:53, Monty Taylor wrote: Hey all, I've recently been thinking a lot about Sean's Layers stuff. So I wrote a blog post which Jim Blair and Devananda were kind enough to help me edit. http://inaugust.com/post/108 I think there are a number of unjustified assumptions behind this arrangement of things. I'm going to list some here, but I don't want anyone to interpret this as a personal criticism of Monty. The point is that we all suffer from biases - not for any questionable reasons but purely as a result of our own experiences, who we spend our time talking to and what we spend our time thinking about - and therefore we should all be extremely circumspect about trying to bake our own mental models of what OpenStack should be into the organisational structure of the project itself. I think there were some assumptions that led to the Layer1 model. Perhaps a little insight into the in-person debate[1] at OpenStack-SV might help explain where Monty was coming from. The initial thought was a radical idea (pioneered by Jay) to completely dismantle the integrated release and have all projects release independently and functionally test against their real dependencies. This gained support from various people and I still think it is a great long-term goal. The worry that Monty (and others) had was two-fold: 1. When we had no co-gating in the past, we ended up with a lot of cross-project breakage. If we jump right into this we could end up in the wild west where different projects expect different keystone versions and there is no way to deploy a functional cloud. 2. We have set expectations in our community (and especially with distributions) that we release a set of things that all work together. It is not acceptable for us to just pull the rug out from under them. These concerns show that we must (in the short term) provide some kind of integrated testing and release.
I see the layer1 model as a stepping stone towards the long term goal of having the projects release independently and depend on stable interfaces. We aren’t going to get there immediately, so having a smaller, integrated set of services representing our most common use case seems like a good first step. As our interfaces get more stable and our testing gets better it could move to a (once every X months) release that just packages the current version of the layer1 projects or even be completely managed by distributions. We need a way to move forward, but I’m hoping we can do it without a concept of “specialness” around layer1 projects. I actually see it as a limitation of these projects that we have to take this stepping stone and cannot disaggregate completely. Instead it should be seen as a necessary evil so that we don’t break our users. In addition, we should encourage other shared use cases in OpenStack both for testing (functional tests against groups of services) and for releases (shared releases of related projects). [1] Note this wasn’t a planned debate, but a spontaneous discussion that included (at various points) Monty Taylor, Jay Pipes, Joe Gordon, John Dickinson, myself, and (undoubtedly) one or two people I’m forgetting.
Re: [openstack-dev] Thoughts on OpenStack Layers and a Big Tent model
On Sep 25, 2014, at 4:01 PM, Robert Collins robe...@robertcollins.net wrote: So I guess I'm saying: Let's decouple 'what is openstack' from 'what we test together on every commit'. It seems that this discussion has actually illustrated shortcomings in our answers to 3 separate questions, and people have been throwing out ideas that attempt to solve all 3. Perhaps we need to address each one individually. The three questions are: 1. Which projects are “part of openstack”? 2. Which projects are released as a single unit? 3. Which projects are tested together? The current answers are: 1. Three levels: incubation, integration, core 2. Things that reach the integration level 3. Things that reach the integration level. Some proposed answers: 1. Lightweight incubation a la apache 2. Monty’s layer1 3. Direct dependencies and close collaborators Discussing the proposed answers (in reverse order): I think we have rough consensus around 3: that we should move towards functional testing for direct dependencies and let the projects decide when they want to co-gate. The functional co-gating should ideally be based on important use-cases. 2 is a bit murkier. In the interest of staying true to our roots the best we can probably do is to allow projects to opt-out of the coordinated release and for Thierry to specifically select which projects he is willing to coordinate. Any other project could co-release with the integrated release but wouldn’t be centrally managed by Thierry. There is also a decision about what the TC's role is in these projects. 1 has some unanswered questions, like: is there another level, “graduation”, where the TC has some kind of technical oversight? What is the criteria for it? etc. Maybe addressing these things separately will allow us to make progress. Vish
Re: [openstack-dev] [neutron][IPv6] Neighbor Discovery for HA
You are going to have to make this as a separate binary and call it via rootwrap ip netns exec. While it is possible to change network namespaces in python, you aren’t going to be able to do this consistently without root access, so it will need to be guarded by rootwrap anyway. Vish On Sep 25, 2014, at 7:00 PM, Xu Han Peng pengxu...@gmail.com wrote: Sending unsolicited NA by scapy is like this:

    from scapy.all import send, IPv6, ICMPv6ND_NA, ICMPv6NDOptDstLLAddr
    target_ll_addr = ICMPv6NDOptDstLLAddr(lladdr=mac_address)
    unsolicited_na = ICMPv6ND_NA(R=1, S=0, O=1, tgt=target)
    packet = IPv6(src=source)/unsolicited_na/target_ll_addr
    send(packet, iface=interface_name, count=10, inter=0.2)

It's not actually a python script but a python method. Any ideas? On 09/25/2014 06:20 PM, Kevin Benton wrote: Does running the python script with ip netns exec not work correctly? On Thu, Sep 25, 2014 at 2:05 AM, Xu Han Peng pengxu...@gmail.com wrote: Hi, As we talked in the last IPv6 sub-team meeting, I was able to construct and send an IPv6 unsolicited neighbor advertisement for the external gateway interface with the python tool scapy: http://www.secdev.org/projects/scapy/ http://www.idsv6.de/Downloads/IPv6PacketCreationWithScapy.pdf However, I am having trouble sending this unsolicited neighbor advertisement in a given namespace. All the current namespace operations leverage ip netns exec and a shell command. But we cannot do this with scapy since it's python code. Can anyone advise me on this? Thanks, Xu Han On 09/05/2014 05:46 PM, Xu Han Peng wrote: Carl, Seems so. I think internal router interface and external gateway port GARP are taken care of by keepalived during failover. And if HA is not enabled, _send_gratuitous_arp is called to send out GARP. I think we will need to take care of IPv6 for both cases since keepalived 1.2.0 supports IPv6. May need a separate BP. For the case where HA is enabled externally, we still need an unsolicited neighbor advertisement for gateway failover.
But for the internal router interface, since a Router Advertisement is automatically sent out by RADVD after failover, we don't need to send out a neighbor advertisement anymore. Xu Han On 09/05/2014 03:04 AM, Carl Baldwin wrote: Hi Xu Han, Since I sent my message yesterday there has been some more discussion in the review on that patch set. See [1] again. I think your assessment is likely correct. Carl [1] https://review.openstack.org/#/c/70700/37/neutron/agent/l3_ha_agent.py On Thu, Sep 4, 2014 at 3:32 AM, Xu Han Peng pengxu...@gmail.com wrote: Carl, Thanks a lot for your reply! If I understand correctly, in the VRRP case, keepalived will be responsible for sending out GARPs? By checking the code you provided, I can see all the _send_gratuitous_arp_packet calls are wrapped by an if not is_ha condition. Xu Han On 09/04/2014 06:06 AM, Carl Baldwin wrote: It should be noted that send_arp_for_ha is a configuration option that preceded the more recent in-progress work to add VRRP controlled HA to Neutron's router. The option was added, I believe, to cause the router to send (default) 3 GARPs to the external gateway if the router was removed from one network node and added to another by some external script or manual intervention. It did not send anything on the internal network ports. VRRP is a different story and the code in review [1] sends GARPs on internal and external ports. Hope this helps avoid confusion in this discussion. Carl [1] https://review.openstack.org/#/c/70700/37/neutron/agent/l3_ha_agent.py On Mon, Sep 1, 2014 at 8:52 PM, Xu Han Peng pengxu...@gmail.com wrote: Anthony, Thanks for your reply. If an HA method like VRRP is used for an IPv6 router, according to the VRRP RFC with IPv6 included, the servers should be auto-configured with the active router's LLA as the default route before the failover happens and still retain that route after the failover. In other words, there should be no need to use two LLAs for the default route of a subnet unless load balancing is required.
When the backup router become the master router, the backup router should be responsible for sending out an unsolicited ND neighbor advertisement with the associated LLA (the previous master's LLA) immediately to update the bridge learning state and sending out router advertisement with the same options with the previous master to maintain the route and bridge learning. This is shown in http://tools.ietf.org/html/rfc5798#section-4.1 and the actions backup router should take after failover is documented here: http://tools.ietf.org/html/rfc5798#section-6.4.2. The need for immediate messaging sending and periodic message sending is documented here: http://tools.ietf.org/html/rfc5798#section-2.4 Since the keepalived manager support for L3 HA is merged: https://review.openstack.org/#/c/68142/43. And keepalived release 1.2.0
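Vish's suggestion at the top of this thread, wrapping the scapy logic in a separate helper script and running it inside the router's namespace through ip netns exec under rootwrap, might look roughly like this. The helper path, namespace name, and flags below are invented for illustration; they are not Neutron's real ones:

```python
# Sketch: build the argv that rootwrap would execute. The helper script
# (which would contain the scapy send logic) runs entirely inside the
# target network namespace, so no in-process setns() is needed.
def netns_exec_cmd(namespace, helper, *args):
    """Return the command line for running `helper` inside `namespace`."""
    return ["ip", "netns", "exec", namespace, "python", helper] + list(args)


cmd = netns_exec_cmd(
    "qrouter-1234",                        # hypothetical router namespace
    "/usr/local/bin/send-unsolicited-na",  # hypothetical helper script
    "--interface", "qg-abc123",
    "--address", "fe80::1",
)
print(" ".join(cmd))
```

The point of the design is that root access is needed to enter the namespace either way, so pushing the work into a separate binary invoked via rootwrap keeps the privilege boundary in one well-audited place.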
Re: [openstack-dev] Thoughts on OpenStack Layers and a Big Tent model
On Sep 23, 2014, at 8:40 AM, Doug Hellmann d...@doughellmann.com wrote: If we are no longer incubating *programs*, which are the teams of people who we would like to ensure are involved in OpenStack governance, then how do we make that decision? From a practical standpoint, how do we make a list of eligible voters for a TC election? Today we pull a list of committers from the git history from the projects associated with “official programs”, but if we are dropping “official programs” we need some other way to build the list. Joe Gordon mentioned an interesting idea to address this (which I am probably totally butchering), which is that we make incubation more similar to the ASF Incubator. In other words, make it more lightweight with no promise of governance or infrastructure support. It is also interesting to consider that we may not need much governance for things outside of layer1. Of course, this may be dancing around the actual problem to some extent, because there are a bunch of projects that are not layer1 that are already a part of the community, and we need a solution that includes them somehow. Vish
Re: [openstack-dev] battling stale .pyc files
On Sep 15, 2014, at 4:34 AM, Lucas Alvares Gomes lucasago...@gmail.com wrote: So, although I like the fix proposed and I would +1 that idea, I'm also not very concerned if most of the people don't want that. Because as you just said we can fix it locally easily. I didn't set it to my .local but the way I do nowadays is to have a small bash function in my .bashrc to delete the pyc files from the current directory:

    function delpyc () { find . -name '*.pyc' -exec rm -rf {} \; }

So I just invoke it when needed :) fyi there is a -delete option to find which is probably a little safer than exec with rm -rf. Also it is really convenient to do this as a git alias so it happens automatically when switching branches. In ~/.gitconfig:

    [alias]
    cc = "!TOP=$(git rev-parse --show-toplevel); find $TOP -name '*.pyc' -delete; git checkout"

now you can git cc branch instead of git checkout branch. Vish
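For anyone who prefers not to shell out, a rough Python equivalent of the cleanup discussed above might look like this. It is just a sketch, not part of any OpenStack tooling; the demo runs in a throwaway temp directory so nothing real is touched:

```python
import pathlib
import tempfile


def delete_pyc(root):
    """Delete every .pyc under root (like `find root -name '*.pyc' -delete`)
    and return the deleted file names."""
    removed = []
    for path in pathlib.Path(root).rglob("*.pyc"):
        path.unlink()
        removed.append(path.name)
    return sorted(removed)


# Demo: build a tiny tree with stale .pyc files and a real source file.
demo = pathlib.Path(tempfile.mkdtemp())
(demo / "pkg").mkdir()
(demo / "a.pyc").touch()
(demo / "pkg" / "b.pyc").touch()
(demo / "keep.py").touch()

print(delete_pyc(demo))             # ['a.pyc', 'b.pyc']
print((demo / "keep.py").exists())  # True
```

Like the -delete variant of find, this removes only matching files and cannot recursively wipe a directory by accident, which is the safety concern with -exec rm -rf.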
Re: [openstack-dev] Thoughts on OpenStack Layers and a Big Tent model
On Sep 19, 2014, at 3:33 AM, Thierry Carrez thie...@openstack.org wrote: Vishvananda Ishaya wrote: Great writeup. I think there are some great concrete suggestions here. A couple more: 1. I think we need a better name for Layer #1 that actually represents what the goal of it is: Infrastructure Services? 2. We need to be open to having other Layer #1s within the community. We should allow for similar collaborations and group focus to grow up as well. Storage Services? Platform Services? Computation Services? I think that would nullify most of the benefits of Monty's proposal. If we keep on blessing themes or special groups, we'll soon be back at step 0, with projects banging on the TC door to become special, and companies not allocating resources to anything that's not special. I was assuming that these would be self-organized rather than managed by the TC. Vish
Re: [openstack-dev] Thoughts on OpenStack Layers and a Big Tent model
On Sep 19, 2014, at 10:14 AM, John Dickinson m...@not.mn wrote: On Sep 19, 2014, at 5:46 AM, John Griffith john.griff...@solidfire.com wrote: On Fri, Sep 19, 2014 at 4:33 AM, Thierry Carrez thie...@openstack.org wrote: Vishvananda Ishaya wrote: Great writeup. I think there are some great concrete suggestions here. A couple more: 1. I think we need a better name for Layer #1 that actually represents what the goal of it is: Infrastructure Services? 2. We need to be be open to having other Layer #1s within the community. We should allow for similar collaborations and group focus to grow up as well. Storage Services? Platform Services? Computation Services? I think that would nullify most of the benefits of Monty's proposal. If we keep on blessing themes or special groups, we'll soon be back at step 0, with projects banging on the TC door to become special, and companies not allocating resources to anything that's not special. -- Thierry Carrez (ttx) ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev Great stuff, mixed on point 2 raised by Vish but honestly I think that's something that could evolve over time, but I looked at that differently as in Cinder, SWIFT and some day Manilla live under a Storage Services umbrella, and ideally at some point there's some convergence there. Anyway, I don't want to start a rat-hole on that, it's kind of irrelevant right now. Bottom line is I think the direction and initial ideas in Monty's post are what a lot of us have been thinking about and looking for. I'm in!! I too am generally supportive of the concept, but I do want to think about the vishy/tts/jgriffith points above. 
Today, I'd group existing OpenStack projects into programs as follows:

Compute: nova, sahara, ironic
Storage: swift, cinder, glance, trove
Network: neutron, designate, zaqar
Deployment/management: heat, triple-o, horizon, ceilometer
Identity: keystone, barbican
Support (not user facing): infra, docs, tempest, devstack, oslo
(potentially even) stackforge: lots

There is a pretty different division of things in this breakdown than in what Monty was proposing. This divides things up by conceptual similarity, which I think is actually less useful than breaking things down by use case. I really like the grouping and testing of things which are tightly coupled. If we say launching a VM and using it is the primary use case of our community currently then things group into Monty’s layer #1. It seems fairly clear that a large section of our community is focused on this use case so this should be a primary focus of infrastructure resources. There are other use cases in our community, for example:

Object Storage: Swift (depends on keystone)
Orchestrating Multiple VMs: Heat (depends on layer1)
DBaaS: Trove (depends on heat)

These are also important use cases for parts of our community, but Swift has demonstrated that it isn’t required to be a part of an integrated release schedule, so these could be managed by smaller groups in the community. Note that these are primarily individual projects today, but I could see a future where some of these projects decide to group and do an integrated release. In the future we might see (totally making this up):

Public Cloud Application services: Trove, Zaqar
Application Deployment services: Heat, Murano
Operations services: Ceilometer, Congress

As I mentioned previously though, I don’t think we need to define these groups in advance. These groups can organize as needed.
Vish
Re: [openstack-dev] Thoughts on OpenStack Layers and a Big Tent model
Great writeup. I think there are some great concrete suggestions here. A couple more: 1. I think we need a better name for Layer #1 that actually represents what the goal of it is: Infrastructure Services? 2. We need to be open to having other Layer #1s within the community. We should allow for similar collaborations and group focus to grow up as well. Storage Services? Platform Services? Computation Services? Vish On Sep 18, 2014, at 11:53 AM, Monty Taylor mord...@inaugust.com wrote: Hey all, I've recently been thinking a lot about Sean's Layers stuff. So I wrote a blog post which Jim Blair and Devananda were kind enough to help me edit. http://inaugust.com/post/108 Enjoy. Monty
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On Sep 4, 2014, at 3:24 AM, Daniel P. Berrange berra...@redhat.com wrote: Position statement == Over the past year I've increasingly come to the conclusion that Nova is heading for (or probably already at) a major crisis. If steps are not taken to avert this, the project is likely to lose a non-trivial amount of talent, both regular code contributors and core team members. That includes myself. This is not good for Nova's long term health and so should be of concern to anyone involved in Nova and OpenStack. For those who don't want to read the whole mail, the executive summary is that the nova-core team is an unfixable bottleneck in our development process with our current project structure. The only way I see to remove the bottleneck is to split the virt drivers out of tree and let them all have their own core teams in their area of code, leaving current nova core to focus on all the common code outside the virt driver impls. I, none the less, urge people to read the whole mail. I am highly in favor of this approach (and have been for at least a year). Every time we have brought this up in the past there has been concern about the shared code, but we have to make a change. We have tried various other approaches and none of them have made a dent. +1000 Vish Background information == I see many factors coming together to form the crisis:

- Burn out of core team members from overwork
- Difficulty bringing new talent into the core team
- Long delays in getting code reviewed and merged
- Marginalization of code areas which aren't popular
- Increasing size of nova code through new drivers
- Exclusion of developers without corporate backing

Each item on its own may not seem too bad, but combined they add up to a big problem. Core team burn out -- Having been involved in Nova for several dev cycles now, it is clear that the backlog of code up for review never goes away.
Even intensive code review efforts at various points in the dev cycle make only a small impact on the backlog. This has a pretty significant impact on core team members, as their work is never done. At best, the dial is sometimes set to 10, instead of 11. Many people, myself included, have built tools to help deal with the reviews in a more efficient manner than plain gerrit allows for. These certainly help, but they can't ever solve the problem on their own - just make it slightly more bearable. And this is not even considering that core team members might have useful contributions to make in ways beyond just code review. Ultimately the workload is just too high to sustain the levels of review required, so core team members will eventually burn out (as they have done many times already). Even if one person attempts to take the initiative to heavily invest in review of certain features it is often to no avail. Unless a second dedicated core reviewer can be found to 'tag team' it is hard for one person to make a difference. The end result is that a patch is +2d and then sits idle for weeks or more until a merge conflict requires it to be reposted, at which point even that one +2 is lost. This is a pretty demotivating outcome for both the reviewers and the patch contributor. New core team talent It can't escape attention that the Nova core team does not grow in size very often. When Nova was younger and its code base was smaller, it was easier for contributors to get onto core because the base level of knowledge required was that much smaller. To get onto core today requires a major investment in learning Nova over a year or more. Even people who potentially have the latent skills may not have the time available to invest in learning the entirety of Nova. With the number of reviews proposed to Nova, the core team should probably be at least double its current size[1].
There is plenty of expertise in the project as a whole but it is typically focused into specific areas of the codebase. There is nowhere we can find 20 more people with broad knowledge of the codebase who could be promoted even over the next year, let alone today. This is ignoring that many existing members of core are relatively inactive due to burnout and so need replacing. That means we really need another 25-30 people for core. That's not going to happen. Code review delays -- The obvious result of having too much work for too few reviewers is that code contributors face major delays in getting their work reviewed and merged. From personal experience, during Juno, I've probably spent 1 week in aggregate on actual code development vs 8 weeks on waiting on code review. You have to constantly be on alert for review comments because unless you can respond quickly (and repost) while you still have the attention of the reviewer, they may not look again for days/weeks.
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On Sep 4, 2014, at 8:33 AM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Sep 04, 2014 at 01:36:04PM +, Gary Kotton wrote: Hi, I do not think that Nova is in a death spiral. I just think that the current way of working at the moment is strangling the project. I do not understand why we need to split drivers out of the core project. Why not have the ability to provide 'core review' status to people for reviewing those parts of the code? We have enough talented people in OpenStack to be able to write a driver above gerrit to enable that. The consensus view at the summit was that, having tried and failed to get useful changes into gerrit, it is not a viable option unless we undertake a permanent fork of the code base. There didn't seem to be any appetite for maintaining and developing a large java app ourselves. So people were looking to start writing a replacement for gerrit from scratch (albeit reusing the database schema). I don't think this is a viable option for us, but if we were going to do it, we would probably be better off using https://code.google.com/p/rietveld/ as a base, since it is actually written in python. Vish signature.asc Description: Message signed with OpenPGP using GPGMail ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers
On Sep 5, 2014, at 4:12 AM, Sean Dague s...@dague.net wrote: On 09/05/2014 06:40 AM, Nikola Đipanov wrote: Just some things to think about with regards to the whole idea, by no means exhaustive. So maybe the better question is: what are the top sources of technical debt in Nova that we need to address? And if we did, everyone would be more sane, and feel less burnt. Maybe the drivers are the worst debt, and jettisoning them makes them someone else's problem, so that helps some. I'm not entirely convinced right now. I think Cells represents a lot of debt right now. It doesn't fully work with the rest of Nova, and produces a ton of extra code paths special cased for the cells path. The Scheduler has a ton of debt as has been pointed out by the efforts in and around Gantt. The focus has been on the split, but realistically I'm with Jay that we should focus on the debt, and exposing a REST interface in Nova. What about the Nova objects transition? That continues to be slow because it's basically Dan (with a few other helpers from time to time). Would it be helpful if we did an all hands on deck transition of the rest of Nova for K1 and just got it done? Would be nice to have the bulk of Nova core working on one thing like this and actually be in shared context with everyone else for a while. In my mind, splitting helps with all of these things. A lot of the cleanup related work is completely delayed because the review queue starts to seem like an insurmountable hurdle. There are various cleanups needed in the drivers as well but they are not progressing due to the glacial pace we are moving at right now. Some examples: VMware spawn refactor, Hyper-V bug fixes, libvirt resize/migrate (this is still using ssh to copy data!) People need smaller areas of work. And they need a sense of pride and ownership of the things that they work on. In my mind that is the best way to ensure success.
Vish
Re: [openstack-dev] Criteria for giving a -1 in a review
On Aug 21, 2014, at 9:42 AM, Adam Young ayo...@redhat.com wrote: On 08/21/2014 12:34 PM, Dolph Mathews wrote: On Thu, Aug 21, 2014 at 11:21 AM, Daniel P. Berrange berra...@redhat.com wrote: On Thu, Aug 21, 2014 at 05:05:04PM +0100, Matthew Booth wrote: I would prefer that you didn't merge this. i.e. The project is better off without it. A bit off topic, but I've never liked this message that gets added as I think it sounds overly negative. It would be better written as This patch needs further work before it can be merged ++ This patch needs further work before it can be merged, and as a reviewer, I am either too lazy or just unwilling to checkout your patch and fix those issues myself. Heh...well, there are a couple other aspects: 1. I am unsure if my understanding is correct. I'd like to have some validation, and, if I am wrong, I'll withdraw the objections. 2. If I make the change, I can no longer +2/+A the review. If you make the change, I can approve it. I don't think this is correct. I'm totally ok with a core reviewer making a minor change to a patch AND +2ing it. This is especially true of minor things like spelling issues or code cleanliness. The only real functional difference between: 1) commenting "please change if foo==None: to if foo is None:" 2) waiting for the author to make exactly the change you suggested 3) +2ing the result and: 1) you change if foo==None: to if foo is None: for the author 2) +2 the result is that the second set is WAY faster. Of course this only applies to minor changes. If you are refactoring more significantly it is nice to get the original poster's feedback, so the first option might be better. Vish
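As a side note on the foo==None example quoted above: reviewers ask for "is None" because == can be overridden by the left-hand object, while "is" always compares identity. A minimal illustration (the class here is contrived for the example):

    class AlwaysEqual(object):
        """A contrived object whose __eq__ claims equality with anything."""
        def __eq__(self, other):
            return True

    obj = AlwaysEqual()
    print(obj == None)  # True, even though obj is clearly not None
    print(obj is None)  # False: identity comparison cannot be fooled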
Re: [openstack-dev] [oslo] Issues with POSIX semaphores and other locks in lockutils
This may be slightly off-topic but it is worth mentioning that the use of threading.Lock[1] which was included to make the locks thread safe seems to be leading to a deadlock in eventlet[2]. It seems like we have rewritten this too many times in order to fix minor pain points and are adding risk to a very important component of the system. [1] https://review.openstack.org/#/c/54581 [2] https://bugs.launchpad.net/nova/+bug/1349452 On Aug 18, 2014, at 2:05 PM, Pádraig Brady p...@draigbrady.com wrote: On 08/18/2014 03:38 PM, Julien Danjou wrote: On Thu, Aug 14 2014, Yuriy Taraday wrote: Hi Yuriy, […] Looking forward to your opinions. This looks like a good summary of the situation. I've added a solution E based on pthread, but didn't get very far about it for now. In my experience I would just go with the fcntl locks. They're auto unlocked and well supported, and importantly, supported for distributed processes. I'm not sure how problematic the lock_path config is TBH. That is adjusted automatically in certain cases where needed anyway. Pádraig.
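For context on the fcntl suggestion above: fcntl locks are tied to an open file descriptor, so the kernel releases them automatically when the holding process exits. A minimal sketch of the idea (not the actual lockutils code; the class name and path handling are illustrative):

    import fcntl

    class FcntlLock(object):
        """Sketch of an fcntl-based interprocess lock.

        The kernel drops the lock automatically if the holding process
        dies, so no stale-lock cleanup is needed.
        """
        def __init__(self, path):
            self.path = path
            self.lockfile = None

        def __enter__(self):
            self.lockfile = open(self.path, 'w')
            # Blocks until the exclusive lock is acquired.
            fcntl.lockf(self.lockfile, fcntl.LOCK_EX)
            return self

        def __exit__(self, exc_type, exc_val, exc_tb):
            fcntl.lockf(self.lockfile, fcntl.LOCK_UN)
            self.lockfile.close()

Two processes entering FcntlLock on the same path serialize on that file; the lock_path setting only matters because both sides have to agree on where the file lives.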
Re: [openstack-dev] [all] The future of the integrated release
On Aug 13, 2014, at 5:07 AM, Daniel P. Berrange berra...@redhat.com wrote: On Wed, Aug 13, 2014 at 12:55:48PM +0100, Steven Hardy wrote: On Wed, Aug 13, 2014 at 11:42:52AM +0100, Daniel P. Berrange wrote: On Thu, Aug 07, 2014 at 03:56:04AM -0700, Jay Pipes wrote: By ignoring stable branches, leaving it upto a small team to handle, I think we are giving the wrong message about what our priorities as a team are. I can't help thinking this filters through to impact the way people think about their work on master. Who is ignoring stable branches? This sounds like a project specific failing to me, as all experienced core reviewers should consider offering their services to help with stable-maint activity. I don't personally see any reason why the *entire* project core team has to do this, but a subset of them should feel compelled to participate in the stable-maint process, if they have sufficient time, interest and historical context, it's not some other team IMO. I think that stable branch review should be a key responsibility for anyone on the core team, not solely those few who volunteer for stable team. As the number of projects in openstack grows I think the idea of having a single stable team with rights to approve across any project is ultimately flawed because it doesn't scale efficiently and they don't have the same level of domain knowledge as the respective project teams. This side-thread is a bit off topic for the main discussion, but as a stable-maint with not a lot of time, I would love more help from the core teams here. That said, help is not just about approving reviews. There are three main steps in the process: 1. Bugs get marked for backport I try to stay on top of this in nova by following the feed of merged patches and marking them icehouse-backport-potential[1] when they seem like they are appropriate but I’m sure I miss some. 2.
Patches get backported This is sometimes a very time-consuming process, especially late in the cycle or for patches that are being backported 2 releases. 3. Patches get reviewed and merged The criteria for a stable backport are pretty straightforward and I think any core reviewer is capable of understanding and applying those criteria. While we have fallen behind on number 3 at times, we are much more often WAY behind on number 2. I also suspect that a whole bunch of patches get missed in some of the other projects where someone isn’t specifically trying to mark them all as they come in. Vish [1] https://bugs.launchpad.net/nova/+bugs?field.tag=icehouse-backport-potential
Re: [openstack-dev] [oslo] usage patterns for oslo.config
Hi Alistair, Modules can register their own options and there is no need to call reload_config_files. The config files are parsed and the values stored in case the option is later declared. The only time you need to reload files is if you add new config files in the new module. See the example code:

    from oslo.config import cfg

    with open("foo", "w") as f:
        f.write("[DEFAULT]\nfoo=bar")

    cfg.CONF(["--config-file", "foo"])
    try:
        print cfg.CONF.foo
    except cfg.NoSuchOptError:
        print "NO OPT"
    # OUT: 'NO OPT'

    cfg.CONF.register_opt(cfg.StrOpt("foo"))
    print cfg.CONF.foo
    # OUT: 'bar'

One thing to keep in mind is you don’t want to use config values at import time, since this tends to be before the config files have been loaded. Vish On Aug 8, 2014, at 6:40 AM, Coles, Alistair alistair.co...@hp.com wrote: I’ve been looking at the implications of applying oslo.config in Swift, and I have a question about the best pattern for registering options. Looking at how keystone uses oslo.config, the pattern seems to be to have all options declared and registered 'up-front' in a single place (keystone/common/config.py) before loading the wsgi pipeline/starting the service. Is there another usage pattern where each middleware registers its options independently ‘on-demand’ rather than maintaining them all in a single place? I read about a pattern [1] whereby modules register opts during import, but does that require there to be some point in the lifecycle where all required modules are imported *before* parsing config files? Seems like that would mean parsing the wsgi pipeline to ‘discover’ the middleware modules being used, importing all those modules, then parsing config files, then loading the wsgi pipeline? OR - is it acceptable for each middleware module to register its own options if/when it is imported during wsgi pipeline loading (CONF.register_options()) and then call CONF.reload_config_files() ?
Thanks, Alistair [1] http://docs.openstack.org/developer/oslo.config/cfg.html#global-configopts
Re: [openstack-dev] [cinder] Bug#1231298 - size parameter for volume creation
On Aug 8, 2014, at 6:55 AM, Dean Troyer dtro...@gmail.com wrote: On Fri, Aug 8, 2014 at 12:36 AM, Ganapathy, Sandhya sandhya.ganapa...@hp.com wrote: This is to discuss Bug #1231298 – https://bugs.launchpad.net/cinder/+bug/1231298 ... The conclusion reached with this bug is that we need to modify the cinder client in order to accept an optional size parameter (as cinder's API allows) and calculate the size automatically during volume creation from an image. There is also an opinion that size should not be an optional parameter during volume creation – does this mean Cinder's API should be changed in order to make size a mandatory parameter? In cinderclient I think you're stuck with size as a mandatory argument to the 'cinder create' command, as you must be backward-compatible for at least a deprecation period.[0] Your option here[1] is to use a sentinel value for size that indicates the actual volume size should be calculated and let the client do the right thing under the hood to feed the server API. Other project CLIs have used both 'auto' and '0' in situations like this. I'd suggest '0' as it is still an integer and doesn't require potentially user-error-prone string matching to work. We did this for novaclient volume attach and allowed device to be 'auto' or the argument to be omitted. I don't see a huge problem turning size into an optional parameter as long as it doesn't break older scripts. Turning it from an arg into a kwarg would definitely require deprecation. Vish FWIW, this is why OSC changed 'volume create' to make --size an option and make the volume name the positional argument. [0] The deprecation period for clients is ambiguous as the release cycle isn't timed, but we think of deprecations that way. Using integrated release cycles is handy but less than perfect to correlate to the client's semver releases. [1] Bad pun alert...or is there such a thing as a bad pun???
dt -- Dean Troyer dtro...@gmail.com
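The '0' sentinel Dean suggests could look roughly like this on the client side (a sketch only; the function and variable names are invented here, not actual cinderclient code):

    def resolve_volume_size(requested_size, image_size_bytes):
        """Treat a requested size of 0 as "derive from the source image".

        Rounds the image size up to the next whole gigabyte, since volume
        sizes are specified in GB and must hold the full image.
        """
        GB = 1024 ** 3
        if requested_size == 0:
            return max(1, -(-image_size_bytes // GB))  # ceiling division
        return requested_size

    print(resolve_volume_size(0, 5 * 1024 ** 3 + 1))  # 6
    print(resolve_volume_size(10, 5 * 1024 ** 3))     # 10: explicit size wins

Because 0 is never a valid volume size, no real request is ambiguous, which is the advantage over a string sentinel like 'auto'.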
Re: [openstack-dev] [Nova] Boot from ISO feature status
I think we should discuss adding/changing this functionality. I have had many new users assume that booting from an ISO image would give them a root drive which they could snapshot. I was hoping that the new block device mapping code would allow something like this, but unfortunately there isn’t a way to do it there either. You can boot a flavor with an ephemeral drive, but there is no command to snapshot secondary drives. Vish On Jul 25, 2014, at 4:22 PM, Maksym Lobur mlo...@mirantis.com wrote: Hi Vish! Appreciate your feedback! Are there some significant pitfalls that forced the Nova team to decide that? Currently I'm testing my local nova modifications to get real boot from ISO functionality like described in the spec. I'm fetching the ISO image from glance into a separate file under the instances/uuid/ dir, attaching it as a CDROM, and booting from it. I also create a blank root drive 'disk' which I use to install the OS to. Are there any things that require extra attention? Any pitfalls in such an approach? Best regards, Max Lobur, OpenStack Developer, Mirantis, Inc. Mobile: +38 (093) 665 14 28 Skype: max_lobur 38, Lenina ave. Kharkov, Ukraine www.mirantis.com www.mirantis.ru On Tue, Jul 22, 2014 at 8:57 AM, Vishvananda Ishaya vishvana...@gmail.com wrote: This is somewhat confusing, but long ago the decision was made that booting from an ISO image should use the ISO as a root drive. This means that it is only really useful for things like live cds. I believe you could use the new block device mapping code to create an instance that boots from an iso and has an ephemeral drive as well but I haven’t tested this. Vish On Jul 22, 2014, at 7:57 AM, Maksym Lobur mlo...@mirantis.com wrote: Hi Folks, Could someone please share his experience with the Nova Boot from ISO feature [1]. We test it on Havana + KVM, uploaded the image with DISK_FORMAT set to 'iso'. Windows deployment does not happen.
The VM has two volumes: one is config-2 (CDFS, ~400Kb, don't know what that is); and the second one is our flavor volume (80Gb). The windows ISO contents (about 500Mb) for some reason are inside the flavor volume instead of a separate CD drive. So far I found only two patches for nova: vmware [2] and Xen [3]. Does it work with KVM? Maybe some specific nova configuration is required for KVM. [1] https://wiki.openstack.org/wiki/BootFromISO [2] https://review.openstack.org/#/c/63084/ [3] https://review.openstack.org/#/c/38650/ Thanks beforehand! Max Lobur, OpenStack Developer, Mirantis, Inc. Mobile: +38 (093) 665 14 28 Skype: max_lobur 38, Lenina ave. Kharkov, Ukraine www.mirantis.com www.mirantis.ru
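The combination Vish mentions (boot from an ISO plus a separate blank disk for the install) would, hypothetically, be expressed with Nova's block-device-mapping v2 format along the following lines. This is an untested sketch, as the thread itself notes, and the image UUID is a placeholder:

    # Hypothetical block-device-mapping-v2 entries: an ISO attached as a
    # CD-ROM boot device plus a blank 80 GB disk to install the OS to.
    block_device_mapping_v2 = [
        {"source_type": "image", "destination_type": "local",
         "device_type": "cdrom", "boot_index": 0,
         "uuid": "<iso-image-uuid>"},
        {"source_type": "blank", "destination_type": "volume",
         "volume_size": 80, "boot_index": -1},
    ]

    print([bdm["source_type"] for bdm in block_device_mapping_v2])

Whether snapshotting the second disk then works is exactly the open question in this thread: there is no command to snapshot secondary drives.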
Re: [openstack-dev] [nova]resize
The resize code as written originally did the simplest possible thing. It converts and copies the whole file so that it doesn’t have to figure out how to sync backing files etc. This could definitely be improved, especially now that there is code in _create_images_and_backing that can ensure that backing files are downloaded/created if they are not there. Additionally the resize code should be using something other than ssh/rsync. I’m a fan of using glance to store the file during transfer, but others have suggested using the live migrate code or libvirt to transfer the disks. Vish On Jul 24, 2014, at 2:26 AM, fdsafdsafd jaze...@163.com wrote: No. Before L5156, we convert it from qcow2 to qcow2, which strips the backing file. I think here, we should write it like this:

    if info['type'] == 'qcow2' and info['backing_file']:
        if shared_storage:
            utils.execute('cp', from_path, img_path)
        else:
            tmp_path = from_path + '_rbase'
            # merge backing file
            utils.execute('qemu-img', 'convert', '-f', 'qcow2',
                          '-O', 'qcow2', from_path, tmp_path)
            libvirt_utils.copy_image(tmp_path, img_path, host=dest)
            utils.execute('rm', '-f', tmp_path)
    else:
        # raw or qcow2 with no backing file
        libvirt_utils.copy_image(from_path, img_path, host=dest)

At 2014-07-24 05:02:39, Tian, Shuangtai shuangtai.t...@intel.com wrote: Whether we already use like that? https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L5156 From: fdsafdsafd [mailto:jaze...@163.com] Sent: Thursday, July 24, 2014 4:30 PM To: openstack-dev@lists.openstack.org Subject: [openstack-dev] [nova]resize In resize, we convert the disk and drop the backing file; should we judge whether we are on shared storage? If we are on shared storage, for example NFS, then we can use the image in _base as the backing file, and the resize will be faster. The processing is in line 5132: https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py Thanks
Re: [openstack-dev] [nova] vhost-scsi support in Nova
On Jul 24, 2014, at 3:06 AM, Daniel P. Berrange berra...@redhat.com wrote: On Wed, Jul 23, 2014 at 10:32:44PM -0700, Nicholas A. Bellinger wrote: *) vhost-scsi doesn't support migration Since its initial merge in QEMU v1.5, vhost-scsi has had a migration blocker set. This is primarily due to requiring some external orchestration in order to set up the necessary vhost-scsi endpoints on the migration destination to match what's running on the migration source. Here are a couple of points that Stefan detailed some time ago about what's involved for properly supporting live migration with vhost-scsi: (1) vhost-scsi needs to tell QEMU when it dirties memory pages, either by DMAing to guest memory buffers or by modifying the virtio vring (which also lives in guest memory). This should be straightforward since the infrastructure is already present in vhost (it's called the log) and used by drivers/vhost/net.c. (2) The harder part is seamless target handover to the destination host. vhost-scsi needs to serialize any SCSI target state from the source machine and load it on the destination machine. We could be in the middle of emulating a SCSI command. An obvious solution is to only support active-passive or active-active HA setups where tcm already knows how to fail over. This typically requires shared storage and maybe some communication for the clustering mechanism. There are more sophisticated approaches, so this straightforward one is just an example. That said, we do intend to support live migration for vhost-scsi using iSCSI/iSER/FC shared storage. *) vhost-scsi doesn't support qcow2 Given all other cinder drivers do not use QEMU qcow2 to access storage blocks, with the exception of the Netapp and Gluster drivers, this argument is not particularly relevant here. However, this doesn't mean that vhost-scsi (and target-core itself) cannot support qcow2 images.
There is currently an effort to add a userspace backend driver for the upstream target (tcm_core_user [3]) that will allow for supporting various disk formats in userspace. The important part for vhost-scsi is that regardless of what type of target backend driver is put behind the fabric LUNs (raw block devices using IBLOCK, qcow2 images using target_core_user, etc.) the changes required in Nova and libvirt to support vhost-scsi remain the same. They do not change based on the backend driver. *) vhost-scsi is not intended for production vhost-scsi has been included in the upstream kernel since the v3.6 release, and included in QEMU since v1.5. vhost-scsi runs unmodified out of the box on a number of popular distributions including Fedora, Ubuntu, and OpenSuse. It also works as a QEMU boot device with Seabios, and even with the Windows virtio-scsi mini-port driver. There is at least one vendor who has already posted libvirt patches to support vhost-scsi, so vhost-scsi is already being pushed beyond a debugging and development tool. For instance, here are a few specific use cases where vhost-scsi is currently the only option for virtio-scsi guests: - Low (sub 100 usec) latencies for AIO reads/writes with small iodepth workloads - 1M+ small block IOPS at low CPU utilization with large iodepth workloads - End-to-end data integrity using T10 protection information (DIF) IIUC, there is also missing support for block jobs like drive-mirror which is needed by Nova. From a functionality POV, migration and drive-mirror support are the two core roadblocks to including vhost-scsi in Nova (as well as libvirt support for it of course). Realistically it doesn't sound like these are likely to be solved soon enough to give us confidence in taking this for the Juno release cycle. As I understand this work, vhost-scsi provides massive perf improvements over virtio, which makes it seem like a very valuable addition.
I’m ok with telling customers that it means that migration and snapshotting are not supported as long as the feature is protected by a flavor type or image metadata (i.e. not on by default). I know plenty of customers that would gladly trade some of the friendly management features for better i/o performance. Therefore I think it is acceptable to take it with some documentation that it is experimental. Maybe I’m unique but I deal with people pushing for better performance all the time. Vish Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
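Gating the experimental path on image metadata, as suggested above, could be as simple as checking a property before choosing the I/O path. A sketch only: the 'vhost-scsi' value is hypothetical here (Nova's real hw_scsi_model image property takes values like virtio-scsi), the point is just that the feature stays off unless explicitly requested per image:

    def wants_vhost_scsi(image_properties):
        """Return True only if the image explicitly opts in to the
        experimental I/O path; everything else keeps the default."""
        return image_properties.get("hw_scsi_model") == "vhost-scsi"

    print(wants_vhost_scsi({"hw_scsi_model": "vhost-scsi"}))  # True
    print(wants_vhost_scsi({}))                               # False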
Re: [openstack-dev] [nova] how scheduler handle messages?
Workers can consume more than one message at a time due to eventlet/greenthreads. The conf option rpc_thread_pool_size determines how many messages can theoretically be handled at once. Greenthread switching can happen any time a monkeypatched call is made. Vish On Jul 21, 2014, at 3:36 AM, fdsafdsafd jaze...@163.com wrote: Hello, recently I used rally to test boot-and-delete. I thought that one nova-scheduler would handle messages sent to it one by one, but the log output shows differences. Can someone explain how nova-scheduler handles messages? I read the code in nova.service, and found that one service will create a fanout consumer, and that all fanout messages are consumed in one thread. So I wonder, how does nova-scheduler handle messages if there are many messages cast to call the scheduler's run_instance? Thanks a lot.
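The effect of rpc_thread_pool_size can be illustrated with an ordinary bounded worker pool (stdlib threads are used here instead of eventlet greenthreads, purely as an analogy):

    from concurrent.futures import ThreadPoolExecutor
    import time

    POOL_SIZE = 4  # plays the role of rpc_thread_pool_size
    handled = []

    def handle_message(msg):
        time.sleep(0.01)  # stands in for work that yields control
        handled.append(msg)

    # The consumer keeps pulling messages and dispatching them to the
    # pool, so up to POOL_SIZE messages are in flight at once rather
    # than being processed strictly one by one.
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        for msg in range(10):
            pool.submit(handle_message, msg)

    print(sorted(handled))  # all ten messages were handled

This is why interleaved log lines from a single nova-scheduler process are expected behavior, not a bug.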
Re: [openstack-dev] Virtio-scsi settings nova-specs exception
I will also sponsor this. Vish On Jul 17, 2014, at 2:47 PM, Mike Perez thin...@gmail.com wrote: As requested from the #openstack-meeting for Nova, I'm posting my nova-spec exception proposal to the ML. Spec: https://review.openstack.org/#/c/103797/3/specs/juno/virtio-scsi-settings.rst Code: https://review.openstack.org/#/c/107650/ - Nikola Dipanov was kind to be the first core sponsor. [1] - This is an optional feature, which should make it a low risk for Nova. - The spec was posted before the spec freeze deadline. - Code change is reasonable and available now. Thank you! [1] - https://etherpad.openstack.org/p/nova-juno-spec-priorities -- Mike Perez
Re: [openstack-dev] [nova] fair standards for all hypervisor drivers
On Jul 16, 2014, at 8:28 AM, Daniel P. Berrange berra...@redhat.com wrote: On Wed, Jul 16, 2014 at 08:12:47AM -0700, Clark Boylan wrote: I am worried that we would just regress to the current process because we have tried something similar to this previously and were forced to regress to the current process. IMHO the longer we wait between updating the gate to new versions the bigger the problems we create for ourselves. eg we were switching from 0.9.8 released Dec 2011, to 1.1.1 released Jun 2013, so we were exposed to over 1 + 1/2 years worth of code churn in a single event. The fact that we only hit a couple of bugs in that is actually remarkable given the amount of feature development that had gone into libvirt in that time. If we had been tracking each intervening libvirt release I expect the majority of updates would have had no ill effect on us at all. For the couple of releases where there was a problem we would not be forced to roll back to a version years older again, we'd just drop back to the previous release at most 1 month older. This is a really good point. As someone who has to deal with packaging issues constantly, it is odd to me that libvirt is one of the few places where we depend on upstream packaging. We constantly pull in new python dependencies from pypi that are not packaged in ubuntu. If we had to wait for packaging before merging, the whole system would grind to a halt. I think we should be updating our libvirt version more frequently by installing from source or our own ppa instead of waiting for the ubuntu team to package it. Vish
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Jul 15, 2014, at 3:30 PM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part On 14/07/14 22:48, Vishvananda Ishaya wrote: On Jul 13, 2014, at 9:29 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: Signed PGP part On 12/07/14 03:17, Mike Bayer wrote: On 7/11/14, 7:26 PM, Carl Baldwin wrote: On Jul 11, 2014 5:32 PM, Vishvananda Ishaya vishvana...@gmail.com wrote: I have tried using pymysql in place of mysqldb and in real world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of Do you have some numbers? Seems to be slightly slower doesn't really stand up as an argument against the numbers that have been posted in this thread. Numbers are highly dependent on a number of other factors, but I was seeing 100 concurrent list commands against cinder going from an average of 400 ms to an average of around 600 ms with both mysql-connector and pymysql. I've made my tests on neutron only, so there is a possibility that cinder works somehow differently. But those numbers don't tell a lot in terms of considering the switch. Do you have numbers for the mysqldb case? Sorry if my commentary above was unclear. The 400ms is mysqldb. The 600ms average was the same for both the other options. It is also worth mentioning that my test of 100 concurrent creates from the same project in cinder leads to average response times over 3 seconds. Note that creates return before the request is sent to the node for processing, so this is just the api creating the db record and sticking a message on the queue. A huge part of the slowdown is in quota reservation processing which does a row lock on the project id. Again, are those 3 seconds better or worse than what we have for mysqldb? The 3 seconds is from mysqldb.
I don’t have average response times for mysql-connector due to the timeouts I mention below. Before we are sure that an eventlet friendly backend “gets rid of all deadlocks”, I will mention that trying this test against connector leads to some requests timing out at our load balancer (5 minute timeout), so we may actually be introducing deadlocks where the retry_on_deadlock operator is used. Deadlocks != timeouts. I attempt to fix eventlet-triggered db deadlocks, not all possible deadlocks that you may envision, or timeouts. That may be true, but if switching the default is trading one problem for another it isn’t necessarily the right fix. The timeout means that one or more greenthreads are never actually generating a response. I suspect an endless retry_on_deadlock between a couple of competing greenthreads which we don’t hit with mysqldb, but it could be any number of things. Consider the above anecdotal for the moment, since I can’t verify for sure that switching the sql driver didn’t introduce some other race or unrelated problem. Let me just caution that we can’t recommend replacing our mysql backend without real performance and load testing. I agree. Not saying that the tests are somehow complete, but here is what I was into for the last two days. There is a nice openstack project called Rally that is designed to allow easy benchmarks for openstack projects. They have four scenarios for neutron implemented: for networks, ports, routers, and subnets. Each scenario combines create and list commands. I've run each test with the following runner settings: times = 100, concurrency = 10, meaning each scenario is run 100 times in parallel, and there were not more than 10 parallel scenarios running. Then I've repeated the same for times = 100, concurrency = 20 (also set max_pool_size to 20 to allow sqlalchemy to utilize that level of parallelism), and times = 1000, concurrency = 100 (same note on sqlalchemy parallelism). 
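For reference, runner settings like those can be expressed in a Rally task file roughly as follows. This is a hypothetical sketch; the exact task schema depends on the Rally version in use:

```json
{
    "NeutronNetworks.create_and_list_networks": [
        {
            "runner": {
                "type": "constant",
                "times": 100,
                "concurrency": 10
            }
        }
    ]
}
```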
You can find detailed html files with nice graphs here [1]. A brief description of results is below: 1. create_and_list_networks scenario: for 10 parallel workers the performance boost is -12.5% from the original time, for 20 workers -6.3%, and for 100 workers there is a slight increase in average time spent per scenario, +9.4% (this is the only scenario that showed a slight reduction in performance; I'll try to rerun the test tomorrow to see whether some discrepancy when I executed it influenced the result). 2. create_and_list_ports scenario: for 10 parallel workers the boost is -25.8%, for 20 workers it's -9.4%, and for 100 workers it's -12.6%. 3. create_and_list_routers scenario: for 10 parallel workers the boost is -46.6% (almost half of the original time), for 20 workers it's -51.7% (more than half), and for 100 workers it's -41.5%. 4. create_and_list_subnets scenario: for 10 parallel workers the boost is -26.4%, for 20 workers
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
On Jul 13, 2014, at 9:29 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 12/07/14 03:17, Mike Bayer wrote: On 7/11/14, 7:26 PM, Carl Baldwin wrote: On Jul 11, 2014 5:32 PM, Vishvananda Ishaya vishvana...@gmail.com wrote: I have tried using pymysql in place of mysqldb and in real-world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of sqlalchemy is not the main bottleneck. Do you have some numbers? "Seems to be slightly slower" doesn't really stand up as an argument against the numbers that have been posted in this thread. Numbers are highly dependent on a number of other factors, but I was seeing 100 concurrent list commands against cinder going from an average of 400 ms to an average of around 600 ms with both mysql-connector and pymysql. It is also worth mentioning that my test of 100 concurrent creates from the same project in cinder leads to average response times over 3 seconds. Note that creates return before the request is sent to the node for processing, so this is just the api creating the db record and sticking a message on the queue. A huge part of the slowdown is in quota reservation processing, which does a row lock on the project id. Before we are sure that an eventlet friendly backend “gets rid of all deadlocks”, I will mention that trying this test against connector leads to some requests timing out at our load balancer (5 minute timeout), so we may actually be introducing deadlocks where the retry_on_deadlock operator is used. Consider the above anecdotal for the moment, since I can’t verify for sure that switching the sql driver didn’t introduce some other race or unrelated problem. Let me just caution that we can’t recommend replacing our mysql backend without real performance and load testing. 
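The failure mode suspected above — competing greenthreads retrying each other into a stall — is why deadlock-retry wrappers are usually capped. A minimal stdlib sketch follows; the decorator name mirrors the thread's retry_on_deadlock, but this is illustrative, not oslo.db's actual implementation, and the exception class is a stand-in:

```python
import functools
import time


class DBDeadlock(Exception):
    """Stand-in for the deadlock error a MySQL driver raises."""


def retry_on_deadlock(max_retries=5, base_delay=0.01):
    """Hypothetical capped-retry decorator.

    The cap is the point: an unbounded retry loop between two
    greenthreads that keep deadlocking each other never returns,
    which surfaces as a load-balancer timeout rather than an error.
    """
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    return fn(*args, **kwargs)
                except DBDeadlock:
                    attempt += 1
                    if attempt >= max_retries:
                        raise  # give up instead of spinning forever
                    time.sleep(base_delay * (2 ** attempt))  # back off
        return wrapper
    return decorator
```

With a cap, a genuinely stuck pair of transactions fails loudly after a bounded number of attempts instead of pinning the greenthread.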
sqlalchemy is not the main bottleneck across projects. Vish P.S. The performance in all cases was abysmal, so performance work definitely needs to be done, but just the guess that replacing our mysql library is going to solve all of our performance problems appears to be incorrect at first blush. The motivation is still mostly deadlock relief but more performance work should be done. I agree with you there. I'm still hopeful for some improvement from this. To identify performance that's alleviated by async you have to establish up front that IO blocking is the issue, which would entail having code that's blazing fast until you start running it against concurrent connections, at which point you can identify via profiling that IO operations are being serialized. This is a very specific issue. In contrast, to identify why some arbitrary openstack app is slow, my bet is that async is often not the big issue. Every day I look at openstack code and talk to people working on things, I see many performance issues that have nothing to do with concurrency, and as I detailed in my wiki page at https://wiki.openstack.org/wiki/OpenStack_and_SQLAlchemy there is a long road to cleaning up all the excessive queries, hundreds of unnecessary rows and columns being pulled over the network, unindexed lookups, subquery joins, and hammering of Python-intensive operations (often due to the nature of OS apps as lots and lots of tiny API calls) that can be cached. There's a clear path to tons better performance documented there and most of it is not about async - which means that successful async isn't going to solve all those issues. Of course there is a long road to decent performance, and switching a library won't magically fix all our issues. But if it will fix deadlocks, and give a 30% to 150% performance boost for different operations, and since the switch is almost smooth, this is something worth doing. 
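One class of fixes from that list — caching the Python-intensive transforms that get re-run on every tiny API call — needs no driver change at all. A minimal stdlib sketch; the helper name and data format are made up for illustration:

```python
import functools


@functools.lru_cache(maxsize=1024)
def parse_extra_specs(raw):
    # Hypothetical example of a Python-intensive transform that a
    # service might re-run for every one of many small API requests.
    # Memoizing it sidesteps the CPU cost without touching the DB layer;
    # the argument must be hashable (a string here) for lru_cache.
    return tuple(sorted(kv.split("=", 1)[0] for kv in raw.split(",")))
```

Repeated calls with the same raw string hit the cache instead of re-parsing, which is exactly the kind of win that async cannot provide.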
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron][all] switch from mysqldb to another eventlet aware mysql client
I have tried using pymysql in place of mysqldb and in real-world concurrency tests against cinder and nova it performs slower. I was inspired by the mention of mysql-connector so I just tried that option instead. Mysql-connector seems to be slightly slower as well, which leads me to believe that the blocking inside of sqlalchemy is not the main bottleneck across projects. Vish P.S. The performance in all cases was abysmal, so performance work definitely needs to be done, but just the guess that replacing our mysql library is going to solve all of our performance problems appears to be incorrect at first blush. On Jul 11, 2014, at 10:20 AM, Clark Boylan clark.boy...@gmail.com wrote: Before we get too far ahead of ourselves, mysql-connector is not hosted on pypi. Instead it is an external package link. We recently managed to remove all packages that are hosted as external package links from openstack and will not add new ones in. Before we can use mysql-connector in the gate, oracle will need to publish mysql-connector on pypi properly. That said, there is at least one other pure python alternative, PyMySQL. PyMySQL supports py3k and pypy. We should look at using PyMySQL instead if we want to start with a reasonable path to getting this in the gate. Clark On Fri, Jul 11, 2014 at 10:07 AM, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: +1 here too, Amazed with the performance gains, x2.4 seems a lot, and we'd get rid of deadlocks. - Original Message - +1 I'm pretty excited about the possibilities here. I've had this mysqldb/eventlet contention in the back of my mind for some time now. I'm glad to see some work being done in this area. Carl On Fri, Jul 11, 2014 at 7:04 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: On 09/07/14 13:17, Ihar Hrachyshka wrote: Hi all, Multiple projects are suffering from db lock timeouts due to deadlocks deep in the mysqldb library that we use to interact with mysql servers. 
In essence, the problem is due to missing eventlet support in the mysqldb module, meaning that when a db lock is encountered, the library does not yield to the next green thread, which would allow other threads to eventually release the grabbed lock; instead it just blocks the main thread, which eventually raises a timeout exception (OperationalError). The failed operation is not retried, leaving the failing request unserved. In Nova, there is a special retry mechanism for deadlocks, though I think it's more a hack than a proper fix. Neutron is one of the projects that suffer from those timeout errors a lot. Partly it's due to lack of discipline in how we do nested calls in l3_db and ml2_plugin code, but that's not something to change in the foreseeable future, so we need to find another solution that is applicable for Juno. Ideally, the solution should be applicable for Icehouse too, to allow distributors to resolve existing deadlocks without waiting for Juno. We've had several discussions and attempts to introduce a solution to the problem. Thanks to the oslo.db guys, we now have a more or less clear view of the cause of the failures and how to easily fix them. The solution is to switch mysqldb to something eventlet aware. The best candidate is probably the MySQL Connector module, which is an official MySQL client for Python and which shows some (preliminary) good results in terms of performance. I've made additional testing, creating 2000 networks in parallel (10 thread workers) for both drivers and comparing results. With mysqldb: 215.81 sec. With mysql-connector: 88.66 sec. ~2.4 times performance boost, ok? ;) I think we should switch to that library *even* if we forget about all the nasty deadlocks we experience now. I've posted a Neutron spec for the switch to the new client in Juno at [1]. 
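As a concrete illustration of how small the service-side change is, switching the driver is mostly a matter of the SQLAlchemy connection URL in the service configuration; a hypothetical neutron.conf fragment (hostnames and credentials below are placeholders):

```ini
[database]
# mysqldb: SQLAlchemy's default MySQL dialect
# connection = mysql://neutron:secret@192.0.2.10/neutron

# MySQL Connector/Python: eventlet-friendly, pure-Python driver
connection = mysql+mysqlconnector://neutron:secret@192.0.2.10/neutron
```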
Ideally, switch is just a matter of several fixes to oslo.db that would enable full support for the new driver already supported by SQLAlchemy, plus 'connection' string modified in service configuration files, plus documentation updates to refer to the new official way to configure services for MySQL. The database code won't, ideally, require any major changes, though some adaptation for the new client library may be needed. That said, Neutron does not seem to require any changes, though it was revealed that there are some alembic migration rules in Keystone or Glance that need (trivial) modifications. You can see how trivial the switch can be achieved for a service based on example for Neutron [2]. While this is a Neutron specific proposal, there is an obvious wish to switch to the new library globally throughout all the projects, to reduce devops burden, among other things. My vision is that, ideally, we switch all projects to the new library in Juno, though we still may leave several projects for K in case any issues arise, similar to the way projects switched to
Re: [openstack-dev] [nova][libvirt] Suspend via virDomainSave() rather than virDomainManagedSave()
On Jul 6, 2014, at 10:22 PM, Rafi Khardalian r...@metacloud.com wrote: Hi All -- It seems as though it would be beneficial to use virDomainSave rather than virDomainManagedSave for suspending instances. The primary benefit of doing so would be to locate the save files within the instance's dedicated directory. As it stands, suspend operations are utilizing ManagedSave, which places all save files in a single directory (/var/lib/libvirt/qemu/save by default on Ubuntu). This is the only instance-specific state data which lives outside both the instance directory and the database. Also, ManagedSave does not consider Libvirt's save_image_format directive and stores all saves as raw, rather than offering the various compression options available when DomainSave is used. ManagedSave is certainly easier but offers less control than what I think is desired in this case. Is there anything I'm missing? If not, would folks be open to this change? +1 Vish Thanks, Rafi 
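In practice the difference comes down to who picks the on-disk path. A sketch of the two layouts: the managed-save directory is the Ubuntu default quoted above, while the per-instance file name is hypothetical:

```python
import os

# Ubuntu's default managed-save location, per the message above
MANAGED_SAVE_DIR = "/var/lib/libvirt/qemu/save"


def managed_save_path(domain_name):
    # virDomainManagedSave(): libvirt chooses the location itself, so
    # every instance's state file lands in one shared directory.
    return os.path.join(MANAGED_SAVE_DIR, domain_name + ".save")


def explicit_save_path(instances_path, instance_uuid):
    # virDomainSave(path): the caller chooses the path, so the state
    # file can live inside the instance's own directory alongside its
    # disks (the "suspend.save" name here is made up for illustration).
    return os.path.join(instances_path, instance_uuid, "suspend.save")
```

Keeping all state under the instance directory also means evacuation and cleanup tooling that already operates per-directory picks the save file up for free.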
Re: [openstack-dev] [Neutron] cloud-init IPv6 support
I haven’t heard of anyone addressing this, but it seems useful. Vish On Jul 7, 2014, at 9:15 AM, Nir Yechiel nyech...@redhat.com wrote: AFAIK, the cloud-init metadata service can currently be accessed only by sending a request to http://169.254.169.254, and no IPv6 equivalent is currently implemented. Is anyone working on this, or has anyone tried to address it before? Thanks, Nir 
Re: [openstack-dev] [nova] Two questions about 'backup' API
On Jun 26, 2014, at 6:58 PM, wu jiang win...@gmail.com wrote: Hi Vish, thanks for your reply. About Q1, I mean that Nova doesn't do any extra processing for 'daily'/'weekly' compared with other backup_types like '123'/'test'. 'daily' and 'weekly' don't have a unique place in the API compared to anything else, but we gave them as examples in code comments and especially in novaclient. A few users asked me why their instances were not backed up automatically; they thought we had a timing task to do this when 'backup_type' equals 'daily'/'weekly', because we prompt them to use it.. Therefore, it's useless and apt to cause confusion in this API IMO. No need to show them in code comments or novaclient. P.S. So maybe 'backup_name'/'backup_tag' would be a better name, but we can't modify the API for compatibility.. Yes the name is confusing. Vish Thanks. WingWJ On Fri, Jun 27, 2014 at 5:20 AM, Vishvananda Ishaya vishvana...@gmail.com wrote: On Jun 26, 2014, at 5:07 AM, wu jiang win...@gmail.com wrote: Hi all, I tested the 'backup' API recently and got two questions about it: 1. Why do 'daily' and 'weekly' appear in code comments and novaclient for the 'backup_type' parameter? The 'backup_type' parameter is only a tag for this backup (image), and there isn't corresponding validation of 'backup_type' for these two types. Moreover, there is also no periodic_task for 'backup' on the compute host. (It's fair to leave that choice to third-party systems.) So why do we leave the 'daily | weekly' example in code comments and novaclient? IMO it may lead to confusion that Nova will do more for a 'daily|weekly' backup request. The tag affects the cleanup of old copies, so if you use a tag of ‘weekly’ and the rotation is 3, it will ensure you only have 3 copies that are tagged weekly. You could also have 3 copies of the daily tag as well. 2. Is it necessary to back up an instance when 'rotation' is equal to 0? 
Let's look at the related code in nova/compute/manager.py: # def backup_instance(self, context, image_id, instance, backup_type, rotation): # #self._do_snapshot_instance(context, image_id, instance, rotation) #self._rotate_backups(context, instance, backup_type, rotation) I know Nova will delete all backup images according to the 'backup_type' parameter when 'rotation' equals 0. But according to the logic above, Nova will generate one new backup in _do_snapshot_instance(), and delete it in _rotate_backups().. It's weird to snapshot a useless backup first IMO. We need to add one new branch here: if 'rotation' is equal to 0, there is no need to back up, just rotate. That makes sense I suppose. Vish So, what's your opinion? Looking forward to your suggestions. Thanks. WingWJ 
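The proposed branch is small; a sketch with the snapshot and rotate steps stubbed out as injected callables (the names loosely follow the manager code quoted above, but this is an illustration, not nova's actual signature):

```python
def backup_instance(do_snapshot, rotate_backups, backup_type, rotation):
    # Proposed flow: when rotation == 0, every copy with this tag is
    # about to be deleted anyway, so taking a fresh snapshot first is
    # pure waste -- skip straight to rotation.
    if rotation > 0:
        do_snapshot()
    rotate_backups(backup_type, rotation)
```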
Re: [openstack-dev] nova networking API and CLI are poorly documented and buggy
On Jun 14, 2014, at 9:12 AM, Mike Spreitzer mspre...@us.ibm.com wrote: I am not even sure what is the intent, but some of the behavior looks like it is clearly unintended and not useful (a more precise formulation of buggy that is not defeated by the lack of documentation). IMHO, the API and CLI documentation should explain these calls/commands in enough detail that the reader can tell the difference. And the difference should be useful in at least some networking configurations. It seems to me that in some configurations an administrative user may want THREE varieties of the network listing call/command: one that shows networks assigned to his tenant, one that also shows networks available to be assigned, and one that shows all networks. And in no configurations should a non-administrative user be blind to all categories of networks. In the API, there are the calls on /v2/{tenant_id}/os-networks and they are documented at http://docs.openstack.org/api/openstack-compute/2/content/ext-os-networks.html. There are also calls on /v2/{tenant_id}/os-tenant-networks --- but I can not find documentation for them. http://docs.openstack.org/api/openstack-compute/2/content/ext-os-networks.html does not describe the meaning of the calls in much detail. For example, about GET /v2/{tenant_id}/os-networks that doc says only Lists networks that are available to the tenant. In some networking configurations, there are two levels of availability: a network might be assigned to a tenant, or a network might be available for assignment. In other networking configurations there are NOT two levels of availability. For example, in Flat DHCP nova networking (which is the default in DevStack), a network CAN NOT be assigned to a tenant. I think it should be returning the networks which a tenant will get for their instance when they launch it. This is unfortunately a bit confusing in vlan mode if a network has not been autoassigned, but that is generally a temporary case. 
So the bug fix below would lead to the correct behavior. You might think that the "to the tenant" qualification implies filtering by the invoker's tenant. But you would be wrong in the case of an administrative user; see the model_query method in nova/db/sqlalchemy/api.py. In the CLI, we have two sets of similar-seeming commands. For example, $ nova help net-list usage: nova net-list List networks $ nova help network-list usage: nova network-list Print a list of available networks. IMO net-list / os-tenant-networks should be deprecated because it really isn’t adding any features to the original extension. Those remarks are even briefer than the one description in the API doc, omitting the qualification to the tenant. Experimentation shows that, in the case of flat DHCP nova networking, both of those commands show zero networks to a non-administrative user (and remember that networks can not be assigned to tenants in that configuration) and all the networks to an administrative user. At the API the GET calls behave the same way. The fact that a non-administrative user sees zero networks looks unintended and not useful. See https://bugs.launchpad.net/openstack-manuals/+bug/1152862 and https://bugs.launchpad.net/nova/+bug/1327406 Can anyone tell me why there are both /os-networks and /os-tenant-networks calls and what their intended semantics are? The os-networks extension (nova network-list, network-create, etc.) was originally designed to pull features from the nova-manage network commands to allow administration of networks through the api instead of directly talking to the database. The os-tenant-networks extension (nova net-list) was initially created as a replacement for the above, but it changed the semantics slightly so got turned into its own extension. Since then some work has been proposed to improve the original extension to add some functionality to os-networks and improve error handling [1]. 
The original extension not showing networks to tenants is a bug which you have already identified. [1] https://review.openstack.org/#/c/93759/ Thanks, Mike
Re: [openstack-dev] [Neutron]One security issue about floating ip
I believe this will affect nova-network as well. We probably should use something like the linux cutter utility to kill any ongoing connections after we remove the nat rule. Vish On Jun 25, 2014, at 8:18 PM, Xurong Yang ido...@gmail.com wrote: Hi folks, After we create an SSH connection to a VM via its floating ip, even though we have removed the floating ip association, we can still access the VM via that connection. Namely, SSH is not disconnected when the floating ip is no longer valid. Is there a good solution to this security issue? Thanks Xurong Yang 
Re: [openstack-dev] [Neutron]One security issue about floating ip
I missed that going in, but it appears that clean_conntrack is not done on disassociate, just during migration. It sounds like we should remove the explicit call in migrate, and just always call it from remove_floating_ip. Vish On Jun 26, 2014, at 1:48 PM, Brian Haley brian.ha...@hp.com wrote: I believe nova-network does this by using 'conntrack -D -r $fixed_ip' when the floating IP goes away (search for clean_conntrack); Neutron doesn't when it removes the floating IP. Seems like it's possible to close most of that gap in the l3-agent - when it removes the IP from its qg- interface it can do a similar operation. -Brian On 06/26/2014 03:36 PM, Vishvananda Ishaya wrote: I believe this will affect nova-network as well. We probably should use something like the linux cutter utility to kill any ongoing connections after we remove the nat rule. Vish On Jun 25, 2014, at 8:18 PM, Xurong Yang ido...@gmail.com wrote: Hi folks, After we create an SSH connection to a VM via its floating ip, even though we have removed the floating ip association, we can still access the VM via that connection. Namely, SSH is not disconnected when the floating ip is not valid. Any good solution about this security issue? Thanks Xurong Yang 
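A sketch of what such a helper could look like in the l3-agent, modeled on nova-network's 'conntrack -D -r $fixed_ip' call quoted above. The function and its injectable runner are hypothetical; the real command requires the conntrack-tools package and root privileges:

```python
import subprocess


def clean_conntrack(fixed_ip, run=subprocess.check_call):
    """Drop conntrack entries whose reply source is the fixed IP.

    Calling this from the floating-IP removal path would kill
    established flows (e.g. the lingering SSH session) once the NAT
    rule is gone. The `run` parameter exists only so the sketch can
    be exercised without root.
    """
    cmd = ["conntrack", "-D", "-r", fixed_ip]
    try:
        run(cmd)
    except subprocess.CalledProcessError:
        # conntrack exits non-zero when no entries matched; ignore
        pass
    return cmd
```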
Re: [openstack-dev] [nova] Two questions about 'backup' API
On Jun 26, 2014, at 5:07 AM, wu jiang win...@gmail.com wrote: Hi all, I tested the 'backup' API recently and got two questions about it: 1. Why do 'daily' and 'weekly' appear in code comments and novaclient for the 'backup_type' parameter? The 'backup_type' parameter is only a tag for this backup (image), and there isn't corresponding validation of 'backup_type' for these two types. Moreover, there is also no periodic_task for 'backup' on the compute host. (It's fair to leave that choice to third-party systems.) So why do we leave the 'daily | weekly' example in code comments and novaclient? IMO it may lead to confusion that Nova will do more for a 'daily|weekly' backup request. The tag affects the cleanup of old copies, so if you use a tag of ‘weekly’ and the rotation is 3, it will ensure you only have 3 copies that are tagged weekly. You could also have 3 copies of the daily tag as well. 2. Is it necessary to back up an instance when 'rotation' is equal to 0? Let's look at the related code in nova/compute/manager.py: # def backup_instance(self, context, image_id, instance, backup_type, rotation): # #self._do_snapshot_instance(context, image_id, instance, rotation) #self._rotate_backups(context, instance, backup_type, rotation) I know Nova will delete all backup images according to the 'backup_type' parameter when 'rotation' equals 0. But according to the logic above, Nova will generate one new backup in _do_snapshot_instance(), and delete it in _rotate_backups().. It's weird to snapshot a useless backup first IMO. We need to add one new branch here: if 'rotation' is equal to 0, there is no need to back up, just rotate. That makes sense I suppose. Vish So, what's your opinion? Looking forward to your suggestions. Thanks. 
WingWJ
Re: [openstack-dev] [nova] Why is there a 'None' task_state between 'SCHEDULING' 'BLOCK_DEVICE_MAPPING'?
Thanks WingWJ. It would also be great to track this in a bug. Vish On Jun 26, 2014, at 5:30 AM, wu jiang win...@gmail.com wrote: Hi Phil, Ok, I'll submit a patch to add a new task_state (like 'STARTING_BUILD') in the next two days. And the related modifications will definitely be added to the docs. Thanks for your help. :) WingWJ On Thu, Jun 26, 2014 at 6:42 PM, Day, Phil philip@hp.com wrote: What do others think – do we want a spec to add an additional task_state value that will be set in a well-defined place? Kind of feels like overkill to me in terms of the review effort that would take compared to just reviewing the code - it's not as if there are going to be lots of alternatives to consider here. From: wu jiang [mailto:win...@gmail.com] Sent: 26 June 2014 09:19 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] Why is there a 'None' task_state between 'SCHEDULING' 'BLOCK_DEVICE_MAPPING'? Hi Phil, thanks for your reply. So should I submit a patch/spec to add it now? On Wed, Jun 25, 2014 at 5:53 PM, Day, Phil philip@hp.com wrote: Looking at this a bit deeper, the comment in _start_building() says that it's doing this to “Save the host and launched_on fields and log appropriately”. But as far as I can see those don't actually get set until the claim is made against the resource tracker a bit later in the process, so this whole update might just not be needed – although I still like the idea of a state to show that the request has been taken off the queue by the compute manager. From: Day, Phil Sent: 25 June 2014 10:35 To: OpenStack Development Mailing List Subject: RE: [openstack-dev] [nova] Why is there a 'None' task_state between 'SCHEDULING' 'BLOCK_DEVICE_MAPPING'? Hi WingWJ, I agree that we shouldn't have a task state of None while an operation is in progress. 
I’m pretty sure back in the day this didn’t use to be the case and task_state stayed as Scheduling until it went to Networking (now of course networking and BDM happen in parallel, so you have to be very quick to see the Networking state). Personally I would like to see the extra granularity of knowing that a request has been started on the compute manager (and knowing that the request was started rather than still sitting on the queue makes the decision to put it into an error state when the manager is re-started more robust). Maybe a task state of “STARTING_BUILD” for this case? BTW I don’t think _start_building() is called anymore now that we’ve switched to conductor calling build_and_run_instance() – but the same task_state issue exists there as well. From: wu jiang [mailto:win...@gmail.com] Sent: 25 June 2014 08:19 To: OpenStack Development Mailing List Subject: [openstack-dev] [nova] Why is there a 'None' task_state between 'SCHEDULING' 'BLOCK_DEVICE_MAPPING'? Hi all, Recently, some of my instances were stuck in task_state 'None' during VM creation in my environment. So I checked and found there's a 'None' task_state between 'SCHEDULING' and 'BLOCK_DEVICE_MAPPING'. The related code is implemented like this: #def _start_building(): #self._instance_update(context, instance['uuid'], # vm_state=vm_states.BUILDING, # task_state=None, # expected_task_state=(task_states.SCHEDULING, # None)) So if the compute node is rebooted after that update, all building VMs on it will stay in the 'None' task_state forever. That is useless and inconvenient for locating problems. Why not add a new task_state for this step? 
WingWJ
Re: [openstack-dev] Virtual Interface creation failed
I have seen something like this before with nova-network: with a large number of requests, the rpc call timeout gets hit for allocate_network. You might need to set your rpc_response_timeout to something greater. I think it defaults to 60 seconds. Vish On Jun 25, 2014, at 6:57 AM, tfre...@redhat.com wrote: Hello, During tests of multiple RPC, I've encountered a problem creating VMs. Creation of 180 VMs succeeded, but when I tried to create 200 VMs, part of the VMs failed with a resource problem (VCPU limitation), and the other part failed with the following error: vm failed - {message: Virtual Interface creation failed, code: 500, created: 2014-06-25T10:22:35Z} | | flavor | nano (10) We can see from the Neutron server and Nova API logs that Neutron got the Nova request and responded to it, but this connection fails somewhere between Nova API and Nova Compute. Please see the exact logs: http://pastebin.test.redhat.com/217653 Tested with the latest Icehouse version on RHEL 7. Controller + Compute Node. All Nova and Neutron logs are attached. Is this a known issue? -- Thanks, Toni 
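The suggested mitigation is a one-line configuration change on the nodes making the allocate_network calls, e.g. in nova.conf (the 300-second value below is just an example, not a recommendation):

```ini
[DEFAULT]
# Default is 60 seconds; large parallel boots can exceed it while
# allocate_network RPCs queue up on the network service.
rpc_response_timeout = 300
```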
Re: [openstack-dev] [Openstack] [Nova] How to confirm I have the right database schema when checkout to another branch?
You need to remove the old .pyc files in the migrate_repo/versions directory. I have an alias in my .gitconfig to allow me to checkout a branch and delete pycs in one command:

[alias]
    cc = !TOP=$(git rev-parse --show-toplevel); find $TOP -name '*.pyc' -delete; git checkout

so i can do: git cc some-branch

Vish

On Jun 11, 2014, at 7:54 PM, 严超 yanchao...@gmail.com wrote: Hi, All: When I checkout nova to another branch, how do I confirm I have the right database schema? When I run nova-manage db sync, I get the error below:

2014-06-11 22:53:27.977 CRITICAL nova [-] KeyError: VerNum(242)
2014-06-11 22:53:27.977 TRACE nova Traceback (most recent call last):
2014-06-11 22:53:27.977 TRACE nova   File "/usr/local/bin/nova-manage", line 10, in <module>
2014-06-11 22:53:27.977 TRACE nova     sys.exit(main())
2014-06-11 22:53:27.977 TRACE nova   File "/opt/stack/nova/nova/cmd/manage.py", line 1376, in main
2014-06-11 22:53:27.977 TRACE nova     ret = fn(*fn_args, **fn_kwargs)
2014-06-11 22:53:27.977 TRACE nova   File "/opt/stack/nova/nova/cmd/manage.py", line 885, in sync
2014-06-11 22:53:27.977 TRACE nova     return migration.db_sync(version)
2014-06-11 22:53:27.977 TRACE nova   File "/opt/stack/nova/nova/db/migration.py", line 32, in db_sync
2014-06-11 22:53:27.977 TRACE nova     return IMPL.db_sync(version=version)
2014-06-11 22:53:27.977 TRACE nova   File "/opt/stack/nova/nova/db/sqlalchemy/migration.py", line 44, in db_sync
2014-06-11 22:53:27.977 TRACE nova     return versioning_api.upgrade(get_engine(), repository, version)
2014-06-11 22:53:27.977 TRACE nova   File "/usr/local/lib/python2.7/dist-packages/migrate/versioning/api.py", line 186, in upgrade
2014-06-11 22:53:27.977 TRACE nova     return _migrate(url, repository, version, upgrade=True, err=err, **opts)
2014-06-11 22:53:27.977 TRACE nova   File "<string>", line 2, in _migrate
2014-06-11 22:53:27.977 TRACE nova   File "/usr/local/lib/python2.7/dist-packages/migrate/versioning/util/__init__.py", line 160, in with_engine
2014-06-11 22:53:27.977 TRACE nova     return f(*a, **kw)
2014-06-11 22:53:27.977 TRACE nova   File "/usr/local/lib/python2.7/dist-packages/migrate/versioning/api.py", line 345, in _migrate
2014-06-11 22:53:27.977 TRACE nova     changeset = schema.changeset(version)
2014-06-11 22:53:27.977 TRACE nova   File "/usr/local/lib/python2.7/dist-packages/migrate/versioning/schema.py", line 82, in changeset
2014-06-11 22:53:27.977 TRACE nova     changeset = self.repository.changeset(database, start_ver, version)
2014-06-11 22:53:27.977 TRACE nova   File "/usr/local/lib/python2.7/dist-packages/migrate/versioning/repository.py", line 225, in changeset
2014-06-11 22:53:27.977 TRACE nova     changes = [self.version(v).script(database, op) for v in versions]
2014-06-11 22:53:27.977 TRACE nova   File "/usr/local/lib/python2.7/dist-packages/migrate/versioning/repository.py", line 189, in version
2014-06-11 22:53:27.977 TRACE nova     return self.versions.version(*p, **k)
2014-06-11 22:53:27.977 TRACE nova   File "/usr/local/lib/python2.7/dist-packages/migrate/versioning/version.py", line 173, in version
2014-06-11 22:53:27.977 TRACE nova     return self.versions[VerNum(vernum)]
2014-06-11 22:53:27.977 TRACE nova KeyError: VerNum(242)
2014-06-11 22:53:27.977 TRACE nova

Best Regards! Chao Yan -- My twitter: Andy Yan @yanchao727 My Weibo: http://weibo.com/herewearenow --
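The cleanup half of the alias can be sketched as a standalone command; here TOP defaults to the current directory rather than `git rev-parse --show-toplevel`, so the sketch runs outside a git checkout (an assumption for illustration):

```shell
# Remove stale compiled Python files below the source tree so that
# sqlalchemy-migrate does not pick up .pyc files for migration versions
# that no longer exist on the checked-out branch.
TOP=${TOP:-.}
find "$TOP" -name '*.pyc' -delete
```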
Re: [openstack-dev] [nova] fastest way to run individual tests ?
This isn’t an officially supported method, but i tend to use: python -m nova.openstack.common.lockutils nosetests <test> for example:

python -m nova.openstack.common.lockutils nosetests nova.tests.integrated.test_api_samples:CloudPipeSampleJsonTest.test_cloud_pipe_create

I think this is a little bit less flexible than testtools run but old habits. Vish On Jun 12, 2014, at 3:59 AM, Daniel P. Berrange berra...@redhat.com wrote: When in the middle of developing code for nova I'll typically not wish to run the entire Nova test suite every time I have a bit of code to verify. I'll just want to run the single test case that deals with the code I'm hacking on. I'm currently writing a 'test_hardware.py' test case for the NUMA work I'm dealing with. I can run that using 'run_tests.sh' or 'tox' by just passing the name of the test case. The test case in question takes a tiny fraction of a second to run, but the tox command below wastes 32 seconds faffing about before it runs the test itself, while run_tests.sh is not much better, wasting 22 seconds.

# tox -e py27 tests.virt.test_hardware
...snip...
real 0m32.923s
user 0m22.083s
sys  0m4.377s

# time ./run_tests.sh tests.virt.test_hardware
...snip...
real 0m22.075s
user 0m14.282s
sys  0m1.407s

This is a really severe time penalty to incur each time I want to run this tiny test (which is very frequently during dev). Does anyone have any tips on how to run individual tests in an efficient manner, i.e. something that adds no more than 1 second of penalty over the time to run the test itself? NB, assume that I've primed the virtual env with all prerequisite deps already.
Regards, Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [openstack-dev] [nova] Reducing quota below utilisation
On Jun 17, 2014, at 12:53 PM, Jan van Eldik jan.van.el...@cern.ch wrote: On 06/17/2014 08:18 PM, Tim Bell wrote: We have some projects which are dynamically creating VMs up to their quota. Under some circumstances, as cloud administrators, we would like these projects to shrink and make room for other higher priority work. We had investigated setting the project quota below the current utilisation (i.e. effectively delete only, no create). This will eventually match the desired level of VMs as the dynamic workload leads to old VMs being deleted and new ones cannot be created. However, OpenStack does not allow a quota to be set below the current usage. Just to add that nova help quota-update suggests that the --force option should do the trick:

--force  Whether force update the quota even if the already used and reserved exceeds the new quota

However, when trying to lower the quota below the current usage value, we get:

$ nova absolute-limits --te $ID | grep -i core
| totalCoresUsed | 11 |
| maxTotalCores  | 20 |
$ nova quota-update --cores 2 $ID
ERROR: Quota value 2 for cores are greater than already used and reserved 11 (HTTP 400) (Request-ID: req-c1dd6add-772c-4cd5-9a13-c33940698f93)
$ nova quota-update --cores 2 --force $ID
ERROR: Quota limit must greater than 11. (HTTP 400) (Request-ID: req-cfc58810-35af-46a3-b554-59d34c647e40)

Am I misunderstanding what --force does? That was my understanding of force as well. This looks like a bug to me. Vish BTW: I believe the first error message is wrong, and will propose a patch. cheers, Jan This seems a little restrictive … any thoughts from others ?
Tim
Re: [openstack-dev] [Nova]Passing flat_injected flag through instance metadata
We have discussed this a bunch in the past, and the right implementation here is to put the network configuration in a standard format (json?) in both the config drive and metadata. cloud-init can be modified to read from that format and write out a proper /etc/network/interfaces (or appropriate files for the guest distro). Vish On Jun 2, 2014, at 10:20 AM, ebaysf, yvempati yvemp...@ebaysf.com wrote: Hi, Thanks for getting back to me. The current flat_injected flag is set in the hypervisor nova.conf. The config drive data uses this flag to set the static network configuration. What I am trying to accomplish is to pass the flat_injected flag through the instance metadata during boot time and use it during the config drive network configuration, rather than setting the flag at the hypervisor level. Regards, Yashwanth Vempati On 6/2/14, 9:30 AM, Ben Nemec openst...@nemebean.com wrote: On 05/30/2014 05:29 PM, ebaysf, yvempati wrote: Hello all, I am new to the openstack community and I am looking for feedback. We would like to implement a feature that allows users to pass the flat_injected flag through instance metadata. We would like to enable this feature for images that support config drive. This feature helps us decrease the dependency on a dhcp server and maintain a uniform configuration across all the hypervisors running in our cloud. In order to enable this feature should I create a blueprint and later implement it, or can this feature be implemented by filing a bug? I'm not sure I understand what you're trying to do here. As I recall, when flat_injected is set the static network configuration is already included in the config drive data. I believe there have been some changes around file injection, but that shouldn't affect config drive as far as I know. If you just need that functionality and it's not working anymore then a bug might be appropriate, but if you need something else then a blueprint/spec will be needed.
-Ben
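To make the proposal above concrete, here is a hedged sketch of the guest side: a hypothetical JSON network description (the field names are invented for illustration, not a standardized format) rendered into a Debian-style /etc/network/interfaces stanza, which is roughly the job cloud-init would take on:

```python
# Sketch only: a guest-side tool reads a JSON network description from the
# config drive and renders /etc/network/interfaces. The JSON schema here is
# an assumption for illustration, not an agreed OpenStack format.
import json

def render_interfaces(network_json):
    cfg = json.loads(network_json)
    lines = []
    for iface in cfg["interfaces"]:
        lines.append("auto %s" % iface["name"])
        lines.append("iface %s inet static" % iface["name"])
        lines.append("    address %s" % iface["address"])
        lines.append("    netmask %s" % iface["netmask"])
        if "gateway" in iface:
            lines.append("    gateway %s" % iface["gateway"])
    return "\n".join(lines)

example = json.dumps({"interfaces": [
    {"name": "eth0", "address": "192.168.21.50",
     "netmask": "255.255.255.0", "gateway": "192.168.21.1"}]})
print(render_interfaces(example))
```

The same JSON could be rendered into ifcfg files or systemd-networkd units for other distros, which is the point of decoupling the format from any one guest layout.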
Re: [openstack-dev] [nova] nova default quotas
Phil, You are correct and this seems to be an error. I don’t think in the earlier ML thread[1] that anyone remembered that the quota classes were being used for default quotas. IMO we need to revert this removal as we (accidentally) removed a Havana feature with no notification to the community. I’ve reactivated a bug[2] and marked it critical. Vish [1] http://lists.openstack.org/pipermail/openstack-dev/2014-February/027574.html [2] https://bugs.launchpad.net/nova/+bug/1299517 On May 27, 2014, at 12:19 PM, Day, Phil philip@hp.com wrote: Hi Vish, I think quota classes have been removed from Nova now. Phil Sent from Samsung Mobile Original message From: Vishvananda Ishaya Date: 27/05/2014 19:24 (GMT+00:00) To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] nova default quotas Are you aware that there is already a way to do this through the cli using quota-class-update? http://docs.openstack.org/user-guide-admin/content/cli_set_quotas.html (near the bottom) Are you suggesting that we also add the ability to use just regular quota-update? I’m not sure I see the need for both. Vish On May 20, 2014, at 9:52 AM, Cazzolato, Sergio J sergio.j.cazzol...@intel.com wrote: I would like to hear your thoughts about an idea to add a way to manage the default quota values through the API. The idea is to use the current quota api, but sending 'default' instead of the tenant_id. This change would apply to quota-show and quota-update methods. This approach will help to simplify the implementation of another blueprint named per-flavor-quotas. Feedback? Suggestions?
Sergio Juan Cazzolato
Intel Software Argentina
Re: [openstack-dev] [nova] nova default quotas
I’m not sure that this is the right approach. We really have to add the old extension back for compatibility, so it might be best to simply keep that extension instead of adding a new way to do it. Vish On May 27, 2014, at 1:31 PM, Cazzolato, Sergio J sergio.j.cazzol...@intel.com wrote: I have created a blueprint to add this functionality to nova. https://review.openstack.org/#/c/94519/ -Original Message- From: Vishvananda Ishaya [mailto:vishvana...@gmail.com] Sent: Tuesday, May 27, 2014 5:11 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] nova default quotas Phil, You are correct and this seems to be an error. I don't think in the earlier ML thread[1] that anyone remembered that the quota classes were being used for default quotas. IMO we need to revert this removal as we (accidentally) removed a Havana feature with no notification to the community. I've reactivated a bug[2] and marked it critcal. Vish [1] http://lists.openstack.org/pipermail/openstack-dev/2014-February/027574.html [2] https://bugs.launchpad.net/nova/+bug/1299517 On May 27, 2014, at 12:19 PM, Day, Phil philip@hp.com wrote: Hi Vish, I think quota classes have been removed from Nova now. Phil Sent from Samsung Mobile Original message From: Vishvananda Ishaya Date:27/05/2014 19:24 (GMT+00:00) To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] nova default quotas Are you aware that there is already a way to do this through the cli using quota-class-update? http://docs.openstack.org/user-guide-admin/content/cli_set_quotas.html (near the bottom) Are you suggesting that we also add the ability to use just regular quota-update? I'm not sure i see the need for both. Vish On May 20, 2014, at 9:52 AM, Cazzolato, Sergio J sergio.j.cazzol...@intel.com wrote: I would to hear your thoughts about an idea to add a way to manage the default quota values through the API. 
The idea is to use the current quota api, but sending 'default' instead of the tenant_id. This change would apply to quota-show and quota-update methods. This approach will help to simplify the implementation of another blueprint named per-flavor-quotas. Feedback? Suggestions? Sergio Juan Cazzolato Intel Software Argentina
Re: [openstack-dev] [Cinder] Question about storage backend capacity expansion
On May 14, 2014, at 12:14 AM, Zhangleiqiang (Trump) zhangleiqi...@huawei.com wrote: Hi, all: I have a requirement in my OpenStack environment, which initially uses one LVMISCSI backend. Over time the storage has become insufficient, so I want to add an NFS backend to the existing Cinder. There is only a single cinder-volume in the environment, so I need to configure Cinder to use multi-backend, which means the initial LVMISCSI storage and the newly added NFS storage are both used as backends. However, the existing volumes on the initial LVMISCSI backend will not be handled normally after switching to multi-backend, because the host of the existing volumes will be seen as down. I know that the migrate and retype APIs aim to handle backend capacity expansion; however, neither of them can be used in this situation. I think the use case above is common in production environments. Is there an existing method that can achieve it? Currently, I manually updated the host value of the existing volumes in the database, and the existing volumes can then be handled normally. While the above use case may be common, you are explicitly changing the config of the system, and requiring a manual update of the database in this case seems reasonable to me. Vish Thanks. -- zhangleiqiang (Trump) Best Regards
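For readers facing the same migration, the manual fix being described amounts to rewriting the host column of the pre-existing volume rows to the backend-qualified form (hostname@backend_name) that multi-backend uses; the host and backend names below are illustrative assumptions:

```sql
-- Illustrative only: after enabling multi-backend, point the old LVM
-- volumes at the new backend-qualified host string so the scheduler
-- no longer considers their host down.
UPDATE volumes
   SET host = 'cinder-host@lvmdriver-1'
 WHERE host = 'cinder-host';
```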
Re: [openstack-dev] Multiple instances of Keystone
Keystone has specifically avoided including multiple process patches because they want to encourage apache + mod_wsgi as the standard way of scaling the keystone api. Vish

On May 13, 2014, at 9:34 PM, Aniruddha Singh Gautam aniruddha.gau...@aricent.com wrote: Hi, Hope you are doing well… I was trying to apply the patch for running multiple instances of Keystone. Somehow it does not work, with the following errors. I wish to debug it further, but thought that I would check with you in case you can provide some quick help. I was following http://blog.gridcentric.com/?Tag=Scalability. I did the changes on Icehouse GA.

Error:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/keystone/openstack/common/threadgroup.py", line 119, in wait
    x.wait()
  File "/usr/lib/python2.7/dist-packages/keystone/openstack/common/threadgroup.py", line 47, in wait
    return self.thread.wait()
  File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 168, in wait
    return self._exit_event.wait()
  File "/usr/lib/python2.7/dist-packages/eventlet/event.py", line 116, in wait
    return hubs.get_hub().switch()
  File "/usr/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
  File "/usr/lib/python2.7/dist-packages/eventlet/greenthread.py", line 194, in main
    result = function(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/keystone/openstack/common/service.py", line 449, in run_service
    service.start()
AttributeError: 'tuple' object has no attribute 'start'

(keystone): 2014-05-13 08:17:37,073 CRITICAL AttributeError: 'tuple' object has no attribute 'stop'

Traceback (most recent call last):
  File "/usr/bin/keystone-all", line 162, in <module>
    serve(*servers)
  File "/usr/bin/keystone-all", line 111, in serve
    launcher.wait()
  File "/usr/lib/python2.7/dist-packages/keystone/openstack/common/service.py", line 352, in wait
    self._respawn_children()
  File "/usr/lib/python2.7/dist-packages/keystone/openstack/common/service.py", line 342, in _respawn_children
    self._start_child(wrap)
  File "/usr/lib/python2.7/dist-packages/keystone/openstack/common/service.py", line 282, in _start_child
    status, signo = self._child_wait_for_exit_or_signal(launcher)
  File "/usr/lib/python2.7/dist-packages/keystone/openstack/common/service.py", line 240, in _child_wait_for_exit_or_signal
    launcher.stop()
  File "/usr/lib/python2.7/dist-packages/keystone/openstack/common/service.py", line 95, in stop
    self.services.stop()
  File "/usr/lib/python2.7/dist-packages/keystone/openstack/common/service.py", line 419, in stop
    service.stop()
AttributeError: 'tuple' object has no attribute 'stop'

In the logs I can find the new child processes; somehow they are probably stopped and then it spawns other child processes. I also noticed support for running multiple neutron servers in Icehouse GA. Any specific reason for not having the same thing for Keystone? (My knowledge of OpenStack is limited, so please bear with my dumb questions.) Best regards, Aniruddha

DISCLAIMER: This message is proprietary to Aricent and is intended solely for the use of the individual to whom it is addressed. It may contain privileged or confidential information and should not be circulated or used for any purpose other than for what it is intended. If you have received this message in error, please notify the originator immediately. If you are not the intended recipient, you are notified that you are strictly prohibited from using, copying, altering, or disclosing the contents of this message. Aricent accepts no responsibility for loss or damage arising from the use of the information transmitted by this email including damage from virus.
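For reference, the mod_wsgi route Vish mentions looks roughly like the sketch below; the port, paths, and process count are assumptions rather than a tested configuration:

```apache
Listen 5000
<VirtualHost *:5000>
    # Apache forks and supervises the worker processes, so keystone-all
    # needs no multiple-process patch of its own.
    WSGIDaemonProcess keystone-public processes=4 threads=1 user=keystone
    WSGIProcessGroup keystone-public
    WSGIScriptAlias / /var/www/cgi-bin/keystone/main
</VirtualHost>
```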
Re: [openstack-dev] [Cinder] Question about synchronized decoration usage in cinder-volume
On Apr 26, 2014, at 2:56 AM, Zhangleiqiang (Trump) zhangleiqi...@huawei.com wrote: Hi, all: I find that almost all of the @utils.synchronized decorator usages in cinder-volume (cinder.volume.manager / cinder.volume.drivers.*) pass an external=True param, such as cinder.volume.manager.VolumeManager.attach_volume:

def attach_volume(self, context, volume_id, instance_uuid, host_name, mountpoint, mode):
    """Updates db to show volume is attached."""
    @utils.synchronized(volume_id, external=True)
    def do_attach():

However, in the docstring of common.lockutils.synchronized, I find that the external param is used for the multi-worker scenario:

:param external: The external keyword argument denotes whether this lock should work across multiple processes. This means that if two different workers both run a method decorated with @synchronized('mylock', external=True), only one of them will execute at a time.

I have two questions about it. 1. As far as I know, cinder-api supports multi-worker mode and cinder-volume doesn't, right? So I wonder why the external=True param is used here? Before the multibackend support in cinder-volume it was common to run more than one cinder-volume for different backends on the same host. This would require external=True. 2. Specific to cinder.volume.manager.VolumeManager.attach_volume, all operations in the do_attach method are database related. As said in [1], operations on the database will block the main thread of a service, so another thing I want to know is why this method needs to be synchronized? Currently db operations block the main thread of the service, but hopefully this will change in the future. Vish Thanks.
[1] http://docs.openstack.org/developer/cinder/devref/threading.html#mysql-access-and-eventlet
--
zhangleiqiang (Trump) Best Regards
Re: [openstack-dev] Hierarchicical Multitenancy Discussion
This is a bit different from how I would have expected it to work. It appears that you are adding the role assignment when the project is created. IMO the role should be added to the list when the roles are checked. In other words, when getting the list of roles for a user/project, it walks up the tree to find all parent projects and creates a list of roles that includes all of the parent roles for that user that are marked inheritable. The implementation you made seems very fragile if parent projects are changed, etc. Vish On Apr 14, 2014, at 12:17 PM, Raildo Mascena rail...@gmail.com wrote: Hi all, As I had promised, here is the repository of Telles Nobrega (https://github.com/tellesnobrega/keystone/tree/multitenancy), updated now with inherited roles working with hierarchical projects. How does it work? Inherited roles operate in the following way: - A role to be inherited should be added to a domain using the API PUT localhost:35357/v3/OS-INHERIT/domains/{domain_id}/users/{user_id}/roles/{role_id}/inherited_to_projects. - A hierarchical project should be created as described above by Telles. - List the role assignments with GET localhost:35357/v3/role_assignments and check that the inherited role is already associated with this new project. What was implemented? The implementation consists of filtering roles which are associated with the parent project to be inherited (this is done by checking the assignment table) so that they are added to the child project. Also, a filter has been created to ensure that a role inherited from another domain does not interfere in the inheritance of this project. What remains to be implemented? Role inheritance has been implemented to work with domains, so the role will be inherited by all projects contained in the domain, i.e., a role that is marked to be inherited, even if it is not associated with the parent project, will be inherited by the child project.
In my opinion, a project column should be created in the assignment table to indicate where project inheritance starts; with that it would be possible to finish this feature. (This is just a suggestion; I believe there are other ways to make it work.) 2014-03-17 8:04 GMT-03:00 Telles Nobrega tellesnobr...@gmail.com: That is good news. I can have both pieces of information sent to nova really easily. I just need to add a field into the token, or more than one if needed. Right now I send ids; it could be names just as easily, and we can add a new field so we can have both pieces of information sent. I'm not sure which is the best option for us, but I would think that sending both for now would keep the compatibility and we could still use the names for display purposes. On Sun, Mar 16, 2014 at 9:18 AM, Jay Pipes jaypi...@gmail.com wrote: On Fri, 2014-03-14 at 13:43 -0700, Vishvananda Ishaya wrote: Awesome, this is exactly what I was thinking. I think this is really close to being usable on the nova side. First of all the dot.separated.form looks better imo, and I think my code should still work that way as well. The other piece that is needed is mapping ids to names for display purposes. I did something like this for a prototype of names in dns caching that should work nicely. The question simply becomes how we expose those names. I’m thinking we have to add an output field to the display of objects in the system showing the fully qualified name. We can then switch the display in novaclient to show names instead of ids. That way an admin listing all the projects in orga would see the owner as orga.projb instead of the id string. The other option would be to pass names instead of ids from keystone and store those instead. That seems simpler at first glance, but it is not backwards compatible with the current model so it will be painful for providers to switch. -1 for instead of. in addition to would have been fine, IMO.
Best, -jay

--
Telles Mota Vidal Nobrega
Bsc in Computer Science at UFCG
Software Engineer at PulsarOpenStack Project - HP/LSD-UFCG

--
Raildo Mascena
Bachelor of Computer Science. Software Engineer at Laboratory of Distributed Systems - UFCG
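The check-time approach Vish describes can be sketched in a few lines; the data structures here are invented for illustration (Keystone's real assignment tables look nothing like this):

```python
def effective_roles(user, project, parents, assignments):
    """Resolve roles at check time by walking up the project tree.

    parents: project -> parent project (None at the root).
    assignments: (user, project) -> list of (role, inheritable) pairs.
    Direct assignments always count; assignments on ancestor projects
    count only when marked inheritable.
    """
    roles = set(role for role, _ in assignments.get((user, project), []))
    node = parents.get(project)
    while node is not None:
        for role, inheritable in assignments.get((user, node), []):
            if inheritable:
                roles.add(role)
        node = parents.get(node)
    return roles
```

Because nothing is copied at project-creation time, re-parenting a project changes its effective roles on the next check, which avoids the fragility Vish points out in the copy-on-create approach.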
Re: [openstack-dev] [Cinder] Question about the magic 100M when creating zero-size volume
This was added long ago to support testing environments without a lot of disk space. Vish On Apr 29, 2014, at 12:12 AM, Zhangleiqiang (Trump) zhangleiqi...@huawei.com wrote: Hi, all: I find that in some of the cinder backend volume drivers, there is code in create_volume as follows:

# cinder.volume.drivers.lvm
def _sizestr(self, size_in_g):
    if int(size_in_g) == 0:
        return '100m'

Similar code also exists in ibm.gpfs, san.hp.hp_lefthand_cliq_proxy, san.solaris and huawei.ssh_common. I wonder why 100M is used here; from the git log I cannot find useful info. Thanks. -- zhangleiqiang (Trump) Best Regards
Re: [openstack-dev] [Openstack] Question regarding Nova in Havana
I believe the exchange should also be ‘nova’. Vish On Apr 15, 2014, at 11:31 PM, Prashant Upadhyaya prashant.upadhy...@aricent.com wrote: Hi Vish, Thanks, now one more question – when I send the request out, I send it to the exchange ‘nova’ with routing key ‘conductor’ (using RabbitMQ), and this takes the message to the Nova Conductor on the controller; I have been able to do that much. I do see that there is a ‘reply queue’ embedded in the above message, so presumably the Nova Conductor will use that queue to send back the response, is that correct? If the above is correct, what is the ‘exchange’ used by Nova Conductor to send back this response? Regards -Prashant From: Vishvananda Ishaya [mailto:vishvana...@gmail.com] Sent: Wednesday, April 16, 2014 10:11 AM To: OpenStack Development Mailing List (not for usage questions) Cc: Prashant Upadhyaya Subject: Re: [openstack-dev] [Openstack] Question regarding Nova in Havana The service reference is created in the start method of the service. This happens around line 217 in nova/service.py in the current code. You should be able to do something similar by sending a message to service_create on conductor. It will return an error if the service already exists. Note you can also use service_get_by_args in conductor to see if the service exists. Vish On Apr 15, 2014, at 9:22 PM, Swapnil S Kulkarni cools...@gmail.com wrote: Interesting discussion. Forwarding to openstack-dev. On Wed, Apr 16, 2014 at 9:34 AM, Prashant Upadhyaya prashant.upadhy...@aricent.com wrote: Hi, I am writing a Compute Node Simulator. The idea is that I would write a piece of software using C which honors the RabbitMQ interface towards the Controller, but will not actually do the real thing – everything on the Compute Node will be simulated by my simulator software.
The problem I am facing is that I have not been able to get my simulated CN listed in the output of nova-manage service list. I am on Havana, and my simulator is sending periodic ‘service_update’ and ‘compute_node_update’ RPC messages to the ‘nova’ exchange with the ‘conductor’ routing key. I can manipulate the above messages at will to fool the controller. (I observe the messages from a real CN and take cues from there to construct a fake one in my simulator.) The question is – what causes the controller to add a new Nova Compute in its database? Is it the ‘service_update’ RPC or something else? Hopefully you can help me reverse engineer the interface. Regards -Prashant
If you have received this message in error, please notify the originator immediately. If you are not the intended recipient, you are notified that you are strictly prohibited from using, copying, altering, or disclosing the contents of this message. Aricent accepts no responsibility for loss or damage arising from the use of the information transmitted by this email including damage from virus. signature.asc Description: Message signed with OpenPGP using GPGMail ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
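For anyone building a similar simulator, the shape of a conductor call can be sketched as a plain JSON payload. This is only an illustrative sketch: the envelope keys (_msg_id, _reply_q, method, args) follow the general Havana-era nova RPC style, but treat the exact field names and the service_create argument layout as assumptions rather than a wire-format reference; observe a real compute node's messages, as Prashant does, to get the authoritative format.

```python
import json
import uuid

# Hypothetical payload a simulated compute node might publish to the 'nova'
# exchange with routing key 'conductor'. Field names are assumptions based on
# the Havana-era RPC envelope, not a verified wire format.
msg_id = uuid.uuid4().hex
request = {
    "_msg_id": msg_id,              # conductor replies on a queue derived from this id
    "_reply_q": "reply_" + msg_id,
    "version": "1.0",
    "method": "service_create",     # the conductor method Vish points at
    "args": {
        "values": {
            "host": "fake-compute-1",
            "binary": "nova-compute",
            "topic": "compute",
            "report_count": 0,
        }
    },
}

wire = json.dumps(request)          # body that would be published via RabbitMQ
print(json.loads(wire)["method"])
```

The reply would then arrive on the queue named in `_reply_q`, which matches Prashant's observation that a reply queue is embedded in the message.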
[openstack-dev] TC candidacy
Hello all,

I’d like to announce my candidacy for the Technical Committee election.

I was one of the original authors of the Nova project and served as its PTL for the first two years that the position existed. I have also been on the Technical Committee since its inception, and I was recently elected to the OpenStack Board. In my day job I am the Chief Technical Officer for Nebula, a startup focused on bringing OpenStack to the enterprise.

My role in OpenStack has changed over time from being one of the top code contributors to more leadership and planning. I helped start both Cinder and Devstack. I’m on the stable-backports committee, and I’m currently working on an effort to bring Hierarchical Multitenancy to all of the OpenStack projects. I also spend a lot of my time dealing with my company’s customers, which are real operators and users of OpenStack.

I think there are two major governance issues that need to be addressed in OpenStack. We’ve started having these discussions in both the Board and the Technical Committee, but some more effort is needed to drive them home.

1. Stop the kitchen-sink approach. We are adding new projects like mad, and in order to keep the quality of the integrated release high, we have to raise the bar to become integrated. We made some changes here over the past few months.

2. Better product management. This was a topic of discussion at the last board meeting and one we hope to continue at the joint meeting in Atlanta. We have a bit of a hole across OpenStack that is filled in most organizations by a product management team. We need to be more conscious of release quality and of addressing customer issues. It isn’t exactly clear how something like this should happen in open source, but it is something we should try to address.

I hope to be able to continue to address these issues on the Technical Committee and provide some much-needed understanding from the “old days” of OpenStack.
It is often helpful to know where you came from in order to see the best direction to go next.

Thanks,
Vish Ishaya
Re: [openstack-dev] [Openstack] Question regarding Nova in Havana
The service reference is created in the start method of the service. This happens around line 217 in nova/service.py in the current code. You should be able to do something similar by sending a message to service_create on conductor. It will return an error if the service already exists. Note that you can also use service_get_by_args in conductor to check whether the service exists.

Vish

On Apr 15, 2014, at 9:22 PM, Swapnil S Kulkarni cools...@gmail.com wrote:

Interesting discussion. Forwarding to openstack-dev.

On Wed, Apr 16, 2014 at 9:34 AM, Prashant Upadhyaya prashant.upadhy...@aricent.com wrote:

Hi,

I am writing a Compute Node Simulator. The idea is that I would write a piece of software in C which honors the RabbitMQ interface towards the Controller but does not actually do the real thing; everything on the Compute Node will be simulated by my simulator software. The problem I am facing is that I have not been able to get my simulated CN listed in the output of nova-manage service list. I am on Havana, and my simulator is sending periodic ‘service_update’ and ‘compute_node_update’ RPC messages to the ‘nova’ exchange with the ‘conductor’ routing key. I can manipulate the above messages at will to fool the controller (I observe the messages from a real CN and take cues from there to construct a fake one from my simulator).

The question is: what causes the controller to add a new Nova Compute to its database? Is it the ‘service_update’ RPC or something else? Hopefully you can help me reverse engineer the interface.

Regards
-Prashant
Re: [openstack-dev] [oslo] use of the oslo namespace package
I dealt with this myself the other day and it was a huge pain. That said, changing all the packages seems like a nuclear option. Is there any way we could change Python to make it smarter about searching multiple locations for namespace packages?

Vish

On Apr 7, 2014, at 12:24 PM, Doug Hellmann doug.hellm...@dreamhost.com wrote:

Some of the production Oslo libraries are currently being installed into the oslo namespace package (oslo.config, oslo.messaging, oslo.vmware, oslo.rootwrap, and oslo.version). Over the course of the last two release cycles, we have seen an increase in the number of developers who end up with broken systems, where an oslo library (most often oslo.config) cannot be imported. This is usually caused by having one copy of a library installed normally (via a system package or via pip) and another version installed in development (a.k.a. editable) mode by devstack. The symptom is most often an error about importing oslo.config, although that is almost never the library causing the problem.

We have already worked around this issue with the non-production libraries by installing them into their own packages, without using the namespace (oslotest, oslosphinx, etc.). We have also changed the way packages are installed in nova's tox.ini, to force installation of packages into the virtualenv (since exposing the global site-packages was a common source of the problem). And very recently, Sean Dague changed devstack to install the oslo libraries not in editable mode, so that installing from source should replace any existing installed version of the same library.

However, the problems seem to persist, and so I think it's time to revisit our decision to use a namespace package. After experimenting with non-namespace packages, I wasn't able to reproduce the same import issues. I did find one case that may cause us some trouble, though.
Installing a package and then installing an editable version from source leaves both installed, and the editable version appears first in the import path. That might cause surprising issues if the source is older than the package, which happens when a devstack system isn't updated regularly and a new library is released. However, surprise due to having an old version of code should occur less frequently than, and have less of an impact than, having a completely broken set of oslo libraries.

We can avoid adding to the problem by putting each new library in its own package. We still want the Oslo name attached for libraries that are really only meant to be used by OpenStack projects, and so we need a naming convention. I'm not entirely happy with the crammed-together approach for oslotest and oslosphinx. At one point Dims and I talked about using a prefix oslo_ instead of just oslo, so we would have oslo_db, oslo_i18n, etc. That's also a bit ugly, though. Opinions?

Given the number of problems we have now (I help about 1 dev per week unbreak their system), I think we should also consider renaming the existing libraries to not use the namespace package. That isn't a trivial change, since it will mean updating every consumer as well as the packaging done by distros. If we do decide to move them, I will need someone to help put together a migration plan. Does anyone want to volunteer to work on that?

Before we make any changes, it would be good to know how bad this problem still is. Do developers still see issues on clean systems, or are all of the problems related to updating devstack boxes? Are people figuring out how to fix or work around the situation on their own? Can we make devstack more aggressive about deleting oslo libraries before re-installing them? Are there other changes we can make that would be less invasive?
Doug
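The mechanism behind the breakage Doug describes can be sketched with the stdlib's pkgutil-style namespace packages: one logical package is stitched together from several sys.path entries, each of which must carry the same `__init__.py` declaration. A stray, normally-installed copy of the package can then shadow or break the stitching. The sketch below uses a made-up package name (`oslo_demo`) purely for illustration; the real oslo libraries used setuptools' pkg_resources namespace machinery, which behaves similarly but not identically.

```python
import os
import sys
import tempfile

# Build two sys.path entries, each contributing part of one namespace package
# (standing in for, e.g., a system-installed oslo.config next to a devstack
# editable install of oslo.messaging).
root = tempfile.mkdtemp()
for entry, sub in [("site_a", "config"), ("site_b", "messaging")]:
    pkg = os.path.join(root, entry, "oslo_demo")
    os.makedirs(pkg)
    with open(os.path.join(pkg, "__init__.py"), "w") as f:
        # the pkgutil-style declaration every copy of __init__.py must carry
        f.write("from pkgutil import extend_path\n"
                "__path__ = extend_path(__path__, __name__)\n")
    with open(os.path.join(pkg, sub + ".py"), "w") as f:
        f.write("NAME = %r\n" % sub)
    sys.path.insert(0, os.path.join(root, entry))

# Both halves resolve through the single 'oslo_demo' namespace, even though
# they live in different directories; if one __init__.py lacked the
# declaration, the other half would become unimportable.
from oslo_demo import config, messaging
print(config.NAME, messaging.NAME)
```

This is exactly why a single non-cooperating copy of `oslo.config` on the path could make every other `oslo.*` import fail, and why separate top-level packages (oslotest, oslo_db, etc.) sidestep the problem.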
Re: [openstack-dev] Domains prototype in Nova
Note that there has been a lot of discussion of, and a potential path forward for, hierarchical project support in OpenStack. I personally think this makes a lot more sense than having a bunch of domain-specific calls. Please take a look at the information in the wiki here: https://wiki.openstack.org/wiki/HierarchicalMultitenancy We are going to be discussing this in detail at the summit.

Vish

On Mar 28, 2014, at 7:06 AM, Henrique Truta henriquecostatr...@gmail.com wrote:

Hi all!

I've been working on a prototype of Domains in Nova. In that prototype the user is now able to make the following API calls with a domain-scoped token:

- GET v2/domains/{domain_id}/servers: lists servers whose projects belong to the given domain
- GET v2/domains/{domain_id}/servers/{server_id}: gets details of the given server
- DELETE v2/domains/{domain_id}/servers/{server_id}: deletes the given server
- POST v2/domains/{domain_id}/servers/{server_id}/action: reboots the given server

Could you help me test these functionalities and review the code? The code can be found in my github repo (https://github.com/henriquetruta/nova) on the domains-prototype branch.

Thanks!

--
Ítalo Henrique Costa Truta
Re: [openstack-dev] [nova] [bug] nova server-group-list doesn't show any members
On Mar 27, 2014, at 4:38 PM, Chris Friesen chris.frie...@windriver.com wrote:

On 03/27/2014 04:47 PM, Chris Friesen wrote:

Interestingly, the unit test nova.tests.api.openstack.compute.contrib.test_server_groups.ServerGroupTest.test_display_members passes just fine, and it seems to be running the same sqlalchemy code. Is this a case where sqlite behaves differently from mysql?

Sorry to keep replying to myself, but this might actually hit us in other places. Down in db/sqlalchemy/api.py we end up calling query = query.filter(column_attr.op(db_regexp_op)('None')).

Sqlalchemy handles mapping None to NULL in many cases, so it appears that the problem is that we are passing None as a string here?

Vish

When using mysql, it looks like a regexp comparison of the string 'None' against a NULL field fails to match. Since sqlite doesn't have its own regexp function, we provide one in openstack/common/db/sqlalchemy/session.py. In the buggy case we end up calling it as regexp('None', None), where the types are unicode and NoneType. However, we end up converting the second arg to text type before calling reg.search() on it, so it matches.

Chris
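The sqlite-vs-mysql divergence Chris describes can be reproduced in a few lines with the stdlib sqlite3 module. The `regexp` shim below is a simplified stand-in for the helper in openstack/common/db/sqlalchemy/session.py (the real one differs in details): because it coerces its argument to text, `str(None)` becomes the literal string 'None' and a NULL column "matches" the pattern, whereas MySQL's native REGEXP never matches a NULL operand.

```python
import re
import sqlite3

def regexp(pattern, value):
    # simplified stand-in for the oslo sqlite shim: coerces value to text
    # before matching, which turns NULL into the string 'None'
    return re.search(pattern, str(value)) is not None

conn = sqlite3.connect(":memory:")
conn.create_function("regexp", 2, regexp)  # backs sqlite's REGEXP operator
conn.execute("CREATE TABLE t (name TEXT)")
conn.execute("INSERT INTO t VALUES (NULL)")

# Under this shim the NULL row counts as a match for the pattern 'None';
# under MySQL's native REGEXP it would not, which is the bug in the thread.
matched = conn.execute(
    "SELECT COUNT(*) FROM t WHERE name REGEXP 'None'").fetchone()[0]
print(matched)  # 1 with the shim; MySQL would report 0
```

This also explains why the unit tests (which run on sqlite) pass while the same query misbehaves against mysql.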
Re: [openstack-dev] [nova][cinder] Refactor ISCSIDriver to support other iSCSI transports besides TCP
This all makes sense to me. I would suggest a blueprint and a pull request as soon as Juno opens.

Vish

On Mar 25, 2014, at 8:07 AM, Shlomi Sasson shlo...@mellanox.com wrote:

Hi,

I want to share the following challenge with the community. Currently, vendors who have their own iSCSI driver and want to add an RDMA transport (iSER) cannot leverage their existing plug-in, which inherits from the iSCSI driver; they must either modify their driver or create an additional plug-in driver which inherits from iSER and copies the exact same code.

Instead, I believe a simpler approach is to add a new attribute to ISCSIDriver to support other iSCSI transports besides TCP, which will allow minimal changes to support iSER. The existing ISERDriver code would be removed; this will eliminate significant code and class duplication, and will work with all the iSCSI vendors who support both TCP and RDMA without the need to modify their plug-in drivers.

To achieve that, both Cinder and Nova require slight changes. For Cinder, I wish to add a parameter called “transport” (defaulting to iscsi) to distinguish between the transports, and use the existing “iscsi_ip_address” parameter for any transport type connection. For Nova, I wish to add a parameter called “default_rdma” (defaulting to false) to enable the initiator side. The outcome will avoid code duplication and the need to add more classes.

I am not sure what the right approach is to handle this. I already have the code; should I open a bug or a blueprint to track this issue?

Best Regards,
Shlomi
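If the proposed option landed, a backend section might look like the sketch below. Note that “transport” is the hypothetical option named in the mail, not an existing Cinder option, and the driver path and addresses are illustrative only:

```ini
[lvmdriver-1]
volume_driver = cinder.volume.drivers.lvm.LVMISCSIDriver
# proposed (hypothetical) option: select the iSCSI transport for this backend,
# 'iscsi' (TCP, the default) or 'iser' (RDMA)
transport = iser
# the existing option would be reused for any transport type
iscsi_ip_address = 10.130.98.136
```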
Re: [openstack-dev] [nova][scheduler] Availability Zones and Host aggregates..
Personally I view this as a bug. There is no reason why we shouldn’t support arbitrary grouping of zones. I know there is at least one problem with zones that overlap regarding displaying them properly: https://bugs.launchpad.net/nova/+bug/1277230 There is probably a related issue that is causing the error you see below. IMO both of these should be fixed. I also think adding a compute node to two different aggregates with AZs should be allowed. It also might be nice to support specifying multiple zones in the launch command in these models; this would allow you to limit booting to the intersection of two overlapping zones.

A few examples where these ideas would be useful:

1. You have 3 racks of servers with half of the nodes from each rack plugged into a different switch. You want to be able to specify spreading across racks or switches via an AZ. In this model you could have a zone for each switch and a zone for each rack.

2. A single cloud has 5 racks in one room in the datacenter and 5 racks in a second room. You’d like to give the user control to choose the room or choose the rack. In this model you would have one zone for each room, and smaller zones for each rack.

3. You have a small 3-rack cloud and would like to ensure that your production workloads don’t run on the same machines as your dev workloads, but you also want to use zones to spread workloads across the three racks. Similarly to 1., you could split your racks in half via dev and prod zones. Each one of these zones would overlap with a rack zone.

You can achieve similar results in these situations by making small zones (switch1-rack1, switch1-rack2, switch1-rack3, switch2-rack1, switch2-rack2, switch2-rack3), but that removes the ability to launch something with less granularity. I.e.
you can’t just specify ‘switch1’ or ‘rack1’ or ‘anywhere’. I’d like to see all of the following work:

nova boot … (boot anywhere)
nova boot --availability-zone switch1 … (boot in the switch1 zone)
nova boot --availability-zone rack1 … (boot in the rack1 zone)
nova boot --availability-zone switch1,rack1 … (boot in the intersection of switch1 and rack1)

Vish

On Mar 25, 2014, at 1:50 PM, Sangeeta Singh sin...@yahoo-inc.com wrote:

Hi,

The availability zones filter states that theoretically a compute node can be part of multiple availability zones. I have a requirement where I need to make a compute node part of 2 AZs. When I try to create host aggregates with AZs, I cannot add the node to two host aggregates that have AZs defined. However, if I create a host aggregate without associating an AZ, then I can add the compute nodes to it; after doing that I can update the host aggregate and associate an AZ. This looks like a bug. I can then see the compute node listed in the 2 AZs with the availability-zone-list command.

The problem I have is that I still cannot boot a VM on the compute node when I do not specify the AZ in the command, even though I have set the default availability zone and the default schedule zone in nova.conf. I get the error “ERROR: The requested availability zone is not available”.

What I am trying to achieve is to have two AZs that the user can select during boot, but then have a default AZ containing the HVs from both AZ1 and AZ2, so that when the user does not specify any AZ in the boot command I scatter my VMs across both AZs in a balanced way.

Any pointers?

Thanks,
Sangeeta
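The intersection semantics Vish proposes for overlapping zones can be sketched as simple set logic. The host names and zone assignments below are made up for illustration (they mirror the racks/switches example), and this is only a model of the proposed scheduling behavior, not how the nova scheduler is implemented:

```python
# Hypothetical host -> availability-zone mapping for the racks/switches
# example above; names are illustrative only.
host_zones = {
    "node1": {"rack1", "switch1"},
    "node2": {"rack1", "switch2"},
    "node3": {"rack2", "switch1"},
    "node4": {"rack2", "switch2"},
}

def candidate_hosts(requested):
    """Hosts whose zone set contains every requested zone (intersection)."""
    wanted = set(requested.split(",")) if requested else set()
    return sorted(h for h, zones in host_zones.items() if wanted <= zones)

print(candidate_hosts("switch1"))        # broad: every host on switch1
print(candidate_hosts("switch1,rack1"))  # narrow: switch1 AND rack1
print(candidate_hosts(""))               # no zone requested: boot anywhere
```

A request for "switch1" keeps the coarse granularity, while "switch1,rack1" narrows placement to the overlap of the two zones, which is exactly what the comma-separated --availability-zone syntax above would express.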