Re: [openstack-dev] [Neutron][Testr] Brand new checkout of Neutron... getting insane unit test run results
It looks like the problem is that there is a dependency on pyudev which only works properly on Linux. The neutron setup_hook does properly install pyudev on Linux (which explains why the tests run in the gate), but it would not work properly on Windows or OS X. I assume folks are trying to run the tests on non-Linux platforms? Neutron may want to do something similar to what Nova does when libvirt is not importable, https://git.openstack.org/cgit/openstack/nova/tree/nova/tests/virt/libvirt/test_libvirt.py#n77 and use a fake in order to get the tests to run properly anyway.

+1 to this suggestion. I used to have a workaround where I'd modify requirements.txt to comment out the pyudev dependency, until e1165ce removed it from the requirements file. That at least allowed me to set Neutron up and run the majority of the unit tests. -- Sean M. Collins ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
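Nova's fake-libvirt approach could translate to pyudev roughly like this. This is a hypothetical sketch, not Neutron's actual code; the fake stubs only the two attributes a unit test might plausibly touch:

```python
import types


def _load_pyudev():
    """Return the real pyudev when importable (Linux with the package
    installed), otherwise a minimal fake so the test modules can at
    least be imported on other platforms.

    The fake is purely illustrative: Context and Monitor are the only
    attributes stubbed here.
    """
    try:
        import pyudev  # only importable where pyudev is installed
        return pyudev
    except ImportError:
        fake = types.ModuleType('pyudev')

        class Context(object):
            def list_devices(self, **kwargs):
                # Pretend there are no udev devices at all.
                return iter([])

        class Monitor(object):
            @classmethod
            def from_netlink(cls, context, source='udev'):
                return cls()

        fake.Context = Context
        fake.Monitor = Monitor
        return fake


pyudev = _load_pyudev()
```

Tests importing `pyudev` through a shim like this would run everywhere, while still exercising the real library on Linux.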
Re: [openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility
Tim Bell wrote: - Changes in default behaviour: Always likely to affect existing systems in some way. Maybe we should have an additional type of review vote that comes from people who are recognised as representing large production deployments? This is my biggest worry... there are changes which may be technically valid but have a significant disruptive impact on those people who are running clouds. Asking the people who are running production OpenStack clouds to review every patch to understand the risks and assess the migration impact is asking a lot.

IMHO there are a few takeaways from this thread... When a proposed patch is known to change default behavior, break backward compatibility, or cause an upgrade headache, we should definitely be more careful before finally approving the change. We should also have a mechanism to engage with users and operators so that they can weigh in. In the worst-case scenario where there is no good solution, at least they are informed that the pain is coming. One remaining question would be... what is that mechanism? Mail to the general list? The operators list? (Should those really be two separate lists?) Some impact tag that upgrade-minded operators can subscribe to?

For the cases where we underestimate the impact of a change, there is no magic bullet. So, like Sean said, we need to continue improving the number of things we cover with automated testing. We also need to continue encouraging a devops culture. When most developers are also people running clouds, we are better at estimating operational impact. I've seen a bit of us vs. them between operators and developers recently, and this is a dangerous trend. A sysadmin-friendly programming language was picked for OpenStack for a reason: to make sure that operation-minded developers and development-minded operators could all be part of the same game. If we create two separate groups, tension will only get worse.
-- Thierry Carrez (ttx)
[openstack-dev] [TripleO] Next week's meeting cancelled?
A bunch of the TripleO folk are going to be at LCA. I'd like to cancel the meeting rather than get them all up at (IIRC) 3am. If folk who aren't going want to hold the meeting, that's fine - but I won't be around to chair :) Cheers, Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud
Re: [openstack-dev] [Neutron][IPv6] Meeting time - change to 1300 UTC or 1500 UTC?
Looking at the calendar, our options for 1500 UTC require us to change the day that we meet. The following days are available: * Tuesdays * Fridays Thoughts? -- Sean M. Collins
[openstack-dev] [Neutron][IPv6] Meeting Agenda - Jan 2nd 2014 2100 UTC
Happy New Year! Since we're still hashing out the logistics of changing our meeting time, we'll meet at 2100 UTC - and hopefully transition to a new meeting day and time by next week. I have created a section in the wiki for today's agenda: https://wiki.openstack.org/wiki/Meetings/Neutron-IPv6-Subteam#Agenda_for_Jan_2nd Please feel free to add items to the agenda and we'll discuss. -- Sean M. Collins
Re: [openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility
Hi Thierry, Thanks for a great summary. I don't really share your view that there is an us-vs-them attitude emerging between operators and developers (but as someone with a foot in both camps, maybe I'm just thinking that because otherwise I'd become even more bi-polar :-) ). I would suggest, though, that the criteria for core reviewers are maybe more slanted towards developers than operators, and that it would be worth considering whether there is some way to recognise and incorporate the different perspective that operators can provide into the review process. Regards, Phil

-Original Message- From: Thierry Carrez [mailto:thie...@openstack.org] Sent: 02 January 2014 09:53 To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility [snip]
Re: [openstack-dev] [Nova] Grizzly - Havana db migration bug/1245502 still biting me?
Just FYI, this backport has now merged. Hopefully this means it'll get you past it now. Cheers, Josh

From: Joshua Hesketh [joshua.hesk...@rackspace.com] Sent: Wednesday, January 01, 2014 11:40 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] Grizzly - Havana db migration bug/1245502 still biting me? Hi Jon, The issue here is that the backport into Havana hasn't been merged yet: https://review.openstack.org/#/c/56149/ I apologise for the delay; it has been in review for a long time. If there are some stable maintainers that are able to take a look, that would be really appreciated. Cheers, Josh

From: Jonathan Proulx [j...@jonproulx.com] Sent: Tuesday, December 31, 2013 5:25 AM To: OpenStack Development Mailing List Subject: [openstack-dev] [Nova] Grizzly - Havana db migration bug/1245502 still biting me? Hi All, I'm mid-upgrade between Grizzly and Havana and seem to still be having issues with https://bugs.launchpad.net/nova/+bug/1245502 I grabbed the patched version of 185_rename_unique_constraints.py but that migration still keeps failing with many issues trying to drop nonexistent indexes and add preexisting indexes. For example: CRITICAL nova [-] (OperationalError) (1553, Cannot drop index 'uniq_instance_type_id_x_project_id_x_deleted': needed in a foreign key constraint) 'ALTER TABLE instance_type_projects DROP INDEX uniq_instance_type_id_x_project_id_x_deleted' () I'm on Ubuntu 12.04, having originally installed Essex and progressively upgraded using cloud archive packages since. # nova-manage db version 184 A dump of my nova database schema as it currently stands is at: https://gist.github.com/jon-proulx/79a77e8b771f90847ae9 The bug is marked fix released, but one of the last comments requests schemata from affected systems, so I'm not sure if replying to the bug is appropriate or if this should be a new bug.
-Jon
Re: [openstack-dev] [Neutron][qa] Parallel testing update
Hi again, I've now run the experimental job a good deal of times, and I've filed bugs for all the issues which came out. Most of them occurred no more than once among all test executions (I think about 30). They're all tagged with neutron-parallel [1]. For ease of tracking, I've associated all the bug reports with neutron, but some are probably more tempest or nova issues. Salvatore [1] https://bugs.launchpad.net/neutron/+bugs?field.tag=neutron-parallel On 27 December 2013 11:09, Salvatore Orlando sorla...@nicira.com wrote: Hi, We now have several patches under review which improve a lot how neutron handles parallel testing. In a nutshell, these patches try to ensure the ovs agent processes new, removed, and updated interfaces as soon as possible. These patches are: https://review.openstack.org/#/c/61105/ https://review.openstack.org/#/c/61964/ https://review.openstack.org/#/c/63100/ https://review.openstack.org/#/c/63558/ There is still room for improvement. For instance, the calls from the agent into the plugins might be considerably reduced. However, even if the above patches greatly shrink the time required for processing a device, we are still hitting a hard limit with the execution of ovs commands for setting local vlan tags and clearing flows (or adding the flow rule for dropping all the traffic). In some instances these commands slow down a lot, requiring almost 10 seconds to complete. This adds a delay to interface processing which in some cases leads to the hideous SSH timeout error (the same we see with bug 1253896 in normal testing). It is also worth noting that when this happens sysstat reveals CPU usage is very close to 100%. From the neutron side there is little we can do. Introducing parallel processing for interfaces, as we do for the l3 agent, is not actually a solution, since ovs-vswitchd v1.4.x, the one executed on gate tests, is not multithreaded.
If you think the situation might be improved by changing the logic for handling local vlan tags and putting ports on the dead vlan, I would be happy to talk about that. On my local machines I've seen a dramatic improvement in processing times by installing ovs 2.0.0, which has a multi-threaded vswitchd. Is this something we might consider for gate tests? Also, in order to reduce CPU usage on the gate (and make tests a bit faster), there is a tempest patch which stops creating and wiring neutron routers when they're not needed: https://review.openstack.org/#/c/62962/ Even in my local setup, which succeeds about 85% of the time, I'm still seeing some occurrences of the issue described in [1], which at the end of the day seems to be a dnsmasq issue. Beyond the 'big' structural problem discussed above, there are some minor problems with a few tests: 1) test_network_quotas.test_create_ports_until_quota_hit fails about 90% of the time. I think this is because the test itself should be made aware of parallel execution and asynchronous events, and there is a patch for this already: https://review.openstack.org/#/c/64217 2) test_attach_interfaces.test_create_list_show_delete_interfaces fails about 66% of the time. The failure is always on an assertion made after deletion of interfaces, which probably means the interface is not deleted within 5 seconds. I think this might be a consequence of the higher load on the neutron service, and we might try to enable multiple workers on the gate to this end, or just increase the tempest timeout. On a slightly different note, allow me to say that the way assertions are made in this test might be improved a bit. So far one has to go through the code to see why the test failed. Thanks for reading this rather long message. Regards, Salvatore [1] https://lists.launchpad.net/openstack/msg23817.html On 2 December 2013 22:01, Kyle Mestery (kmestery) kmest...@cisco.com wrote: Yes, this is all great Salvatore and Armando!
Thank you for all of this work and the explanation behind it all. Kyle On Dec 2, 2013, at 2:24 PM, Eugene Nikanorov enikano...@mirantis.com wrote: Salvatore and Armando, thanks for your great work and detailed explanation! Eugene. On Mon, Dec 2, 2013 at 11:48 PM, Joe Gordon joe.gord...@gmail.com wrote: On Dec 2, 2013 9:04 PM, Salvatore Orlando sorla...@nicira.com wrote: Hi, As you might have noticed, there has been some progress on parallel tests for neutron. In a nutshell: * Armando fixed the issue with IP address exhaustion on the public network [1] * Salvatore now has a patch which has a 50% success rate (the last failures are because of me playing with it) [2] * Salvatore is looking at putting full isolation back on track [3] * All the bugs affecting parallel tests can be queried here [10] * This blueprint tracks progress made towards enabling parallel testing [11] - The long story is as follows: Parallel testing basically is not working because parallelism means
[openstack-dev] [UX] django-bootstrap-form django 1.4 / 1.5
On 2014-01-02 08:33:37 +0100 (+0100), Matthias Runge wrote: [...] Django-bootstrap-form was added to global requirements by [1], but it seems it's not used at all, so I propose to remove it. https://git.openstack.org/cgit/openstack/requirements/commit/?id=46a6f06 https://blueprints.launchpad.net/horizon/+spec/bootstrap-update It looks like the UX team was working on changes related to the blueprint up until about a month ago, so I've tagged the subject in hopes of making this thread a little more visible to them. -- Jeremy Stanley
Re: [openstack-dev] [nova] Turbo-hipster
On 12/31/2013 3:58 PM, Michael Still wrote: Hi. So while turbo hipster is new, I've been reading every failure message it produces to make sure it's not too badly wrong. There were four failures posted last night while I slept: https://review.openstack.org/#/c/64521 This one is a TH bug. We shouldn't be testing stable branches. bug/1265238 has been filed to track this. https://review.openstack.org/#/c/61753 This is your review. The failed run's log is https://ssl.rcbops.com/turbo_hipster/logviewer/?q=/turbo_hipster/results/61/61753/8/check/gate-real-db-upgrade_nova_percona_user_001/1326092/user_001.log and you can see from the failure message that migrations 152 and 206 took too long. Migration 152 took 326 seconds, where our historical data of 2,846 test migrations says it should take 222 seconds. Migration 206 took 81 seconds, where we think it should take 56 seconds based on 2,940 test runs. Whilst I can't explain why those migrations took too long this time around, they are certainly exactly the sort of thing TH is meant to catch. If you think your patch isn't responsible (perhaps the machine is just being slow or something), you can always retest by leaving a review comment of recheck migrations. I have done this for you on this patch. Michael, is recheck migrations something that is going to be added to the wiki for test failures here? https://wiki.openstack.org/wiki/GerritJenkinsGit#Test_Failures https://review.openstack.org/#/c/61717 This review also had similar unexplained slowness, but has already been rechecked by someone else and now passes. I note that the slowness in both cases was from the same TH worker node, and I will keep an eye on that node today. https://review.openstack.org/#/c/56420 This review also had slowness in migration 206, but has been rechecked by the developer and now passes. It wasn't on the percona-001 worker that the other two were on, so perhaps this indicates that we need to relax the timing requirements for migration 206.
Hope this helps, Michael On Wed, Jan 1, 2014 at 12:34 AM, Gary Kotton gkot...@vmware.com wrote: Hi, It seems that she/he is behaving oddly again. I have posted a patch that does not have any database changes and it has given me a -1... Happy new year, Gary -- Thanks, Matt Riedemann
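For context, the kind of timing check Michael describes can be sketched as follows. The numbers and the tolerance knob here are made up for illustration; the real turbo-hipster heuristics are not reproduced in this thread:

```python
import statistics


def migration_too_slow(historical_seconds, observed_seconds, tolerance=1.3):
    """Return True when a migration run exceeds its historical mean
    wall time by more than a tolerance factor.

    `tolerance` is a hypothetical knob; relaxing it per-migration is
    exactly the kind of change suggested above for migration 206.
    """
    allowed = statistics.mean(historical_seconds) * tolerance
    return observed_seconds > allowed


# Migration 152: history says ~222s, but this run took 326s.
flagged = migration_too_slow([220, 222, 224], 326)
```

A per-worker baseline (rather than a global one) would also absorb the "one slow worker node" effect noted above.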
Re: [openstack-dev] [Neutron][3rd Party Testing]Remove voting until your testing structure works.
On 12/30/2013 09:32 PM, Anita Kuno wrote: Please. When your third party testing structure votes on patches and your testing structure is not stable, it will vote with a -1 on patches. This results in three consequences: 1. The patch it votes on starts a countdown for abandonment, which is frustrating for developers. 2. Reviewers who use -Verified-1 as a filter criterion will not review a patch with a -1 in the verification column in the Gerrit dashboard. This prevents developers from progressing in their work, and this also prevents reviewers from reviewing patches that need to be assessed. 3. Third party testing that does not provide publicly accessible logs leaves developers with no way to diagnose the issue, which makes it very difficult for a developer to fix, leaving the patch in a state of limbo. You can post messages to the patches, including a stable working url to the logs for your tests, as you continue to work on your third party tests. You are also welcome to post a success or failure message; just please refrain from allowing your testing infrastructure to vote on patches until your testing infrastructure is working reliably and has logs that developers can use to fix problems. The list of third party testing accounts is found here. [0] Right now there are three plugins that need to remove voting until they are stable. Please be active in #openstack-neutron, #openstack-qa, and the mailing list so that if there is an issue with your testing structure, people can come and talk to you. Thank you, Anita. [0] https://review.openstack.org/#/admin/groups/91,members Keep in mind that the email provided in this list is expected to be an email people can use to contact you with questions and concerns about your testing interactions. [0] The point of the exercise is to provide useful and helpful information to developers and reviewers so that we all improve our code and create a better, more integrated product than we have now.
Please check the email inboxes of the email addresses you have provided and please respond to inquiries in a timely fashion. Please also remember people will look for you on irc, so having a representative available on irc for discussion will give you some useful feedback on ensuring your third party testing structure is behaving as efficiently as possible. We have a certain level of tolerance for unproductive noise while you are getting the bugs knocked out of your system. If developers are trying to contact you for more information and there is no response, third party testing structures that fail to comply with the expectation that they will respond to questions will be addressed on a case-by-case basis. Thank you in advance for your kind attention, Anita. [0] https://review.openstack.org/#/admin/groups/91,members
Re: [openstack-dev] [Nova][BluePrint Register] Shrink the volume when a file in the instance was deleted.
On 25 December 2013 05:14, Qixiaozhen qixiaoz...@huawei.com wrote: Hi, all A blueprint is registered that is about shrinking the volume in thin provision. Have you got the link? Thin provisioning means allocating the disk space once the instance writes data to an area of the volume for the first time. However, if files in the instance are deleted, thin provisioning cannot deal with this situation: the space that was allocated for the files cannot be released. So it is necessary to shrink the volume when files are deleted in the instance. In this case the user will probably need to zero out the free space of their filesystem too, in some cases, unless nova does that for them, which sounds a bit dodgy. The operation of shrinking can be manually executed by the user with the web portal or CLI command, or periodically in the background. I wondered about an optimise disk call. A few thoughts: * I am not sure it can always be done online for all drivers; it may need an offline mode * Similar operations have ways of confirming and reverting to protect against data loss * Ideally keep all operations on the virtual disk, and no operations on its content * With chains of disks, you may want to simplify the chain too (where it makes sense) John
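To make the thin-provisioning gap concrete: a file's apparent size and the space actually backing it on disk can diverge, and deleting guest files (without zeroing or trimming) leaves exactly this kind of stale allocation behind on the host volume. A minimal sketch, assuming a POSIX filesystem with sparse-file support:

```python
import os
import tempfile


def allocated_bytes(path):
    """Space actually backing the file on disk.

    st_blocks counts 512-byte units per POSIX, regardless of the
    filesystem's own block size.
    """
    return os.stat(path).st_blocks * 512


with tempfile.NamedTemporaryFile(delete=False) as f:
    # 10 MiB apparent size, but no data written: a sparse file, the
    # inverse of the "allocated but no longer used" case discussed above.
    f.truncate(10 * 1024 * 1024)
    path = f.name

apparent = os.path.getsize(path)
backing = allocated_bytes(path)
os.unlink(path)
```

Shrinking a thin-provisioned volume is essentially the job of moving `backing` back down toward what the guest actually uses.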
Re: [openstack-dev] [nova] [neutron] PCI pass-through network support
On 22 December 2013 12:07, Irena Berezovsky ire...@mellanox.com wrote: Hi Ian, My comments are inline. I would like to suggest focusing the next PCI-passthrough IRC meeting on: 1. Closing the administration and 'tenant that powers the VM' use cases. 2. Decoupling the nova and neutron parts to start focusing on the neutron related details. When is the next meeting? I have lost track due to holidays, etc. John
Re: [openstack-dev] [Ceilometer][Oslo] Consuming Notifications in Batches
On 12/20/13, 11:57 PM, Jay Pipes jaypi...@gmail.com wrote: On 12/20/2013 04:43 PM, Julien Danjou wrote: On Fri, Dec 20 2013, Herndon, John Luke wrote: I think there is probably a tolerance for duplicates but you're right, missing a notification is unacceptable. Can anyone weigh in on how big of a deal duplicates are for meters? Duplicates aren't really unique to the batching approach, though. If a consumer dies after it's inserted a message into the data store but before the message is acked, the message will be requeued and handled by another consumer, resulting in a duplicate. Duplicates can be a problem for metering, since if you see the same event twice it's possible you will think it happened twice. As for event storage, it won't be a problem if you use a good storage driver that can have a unique constraint; you'll just drop it and log the fact that this should not have happened, or something like that. The above brings up a point related to the implementation of the existing SQL driver code that will need to be re-thought with the introduction of batch notification processing. Currently, the SQL driver's record_events() method [1] is written in a way that forces a new INSERT transaction for every record supplied to the method. If the record_events() method is called with 10K events, then 10K BEGIN; INSERT ...; COMMIT; transactions are executed against the server. Suffice to say, this isn't efficient. :) Ostensibly, from looking at the code, the reason that this approach was taken was to allow for the collection of duplicate event IDs, and return those duplicate event IDs to the caller.
Because of this code:

    for event_model in event_models:
        event = None
        try:
            with session.begin():
                event = self._record_event(session, event_model)
        except dbexc.DBDuplicateEntry:
            problem_events.append((api_models.Event.DUPLICATE,
                                   event_model))

The session object will be commit()'d after the session.begin() context manager exits, which will cause the aforementioned BEGIN; INSERT; COMMIT; transaction to be executed against the server for each event record. If we want to actually take advantage of the performance benefits of batching notification messages, the above code will need to be rewritten so that a single transaction is executed against the database for the entire batch of events. Yeah, this makes sense. Working on this driver is definitely on the to-do list (we also need to cache the event and trait types so several queries to the db are not incurred for each event). In the above code, we still have to deal with the DBDuplicateEntry error, but it gets much harder. The options I can think of are: 1) comb through the batch of events, remove the duplicate and try again or 2) allow the duplicate to be inserted and deal with it later. -john Best, -jay [1] https://github.com/openstack/ceilometer/blob/master/ceilometer/storage/impl_sqlalchemy.py#L932
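The single-transaction rewrite Jay describes, combined with collecting duplicates for the caller, could look roughly like this. This is a sketch against stdlib sqlite3 purely for illustration; the real driver uses SQLAlchemy sessions and the actual ceilometer schema, neither of which is reproduced here:

```python
import sqlite3


def record_events_batch(conn, events):
    """Insert a batch of (message_id, generated) rows inside a single
    transaction, collecting duplicates instead of aborting.

    One BEGIN/COMMIT covers the whole batch, replacing the per-row
    transactions criticized above; duplicate message_ids are returned
    to the caller, mirroring the existing problem_events behaviour.
    """
    duplicates = []
    with conn:  # a single transaction for the entire batch
        for message_id, generated in events:
            try:
                conn.execute(
                    "INSERT INTO events (message_id, generated) "
                    "VALUES (?, ?)", (message_id, generated))
            except sqlite3.IntegrityError:
                # Unique constraint caught the duplicate; keep going.
                duplicates.append(message_id)
    return duplicates


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (message_id TEXT PRIMARY KEY, generated REAL)")
dups = record_events_batch(conn, [("a", 1.0), ("b", 2.0), ("a", 3.0)])
# dups == ["a"]; only two rows are stored.
```

Note the trade-off: one failed row no longer rolls back its own tiny transaction, so the duplicate check has to happen per-statement inside the shared transaction (option 1 above), or be deferred entirely (option 2).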
Re: [openstack-dev] Novnc switch to sockjs-client instead of websockify
This seems like a great choice, assuming that we can verify that it works properly. Vish On Dec 31, 2013, at 10:33 PM, Thomas Goirand z...@debian.org wrote: Hi, I was wondering if it would be possible for NoVNC to switch from websockify to sockjs-client, which is available here: https://github.com/sockjs/sockjs-client This has the advantage of not using flash at all (pure javascript), and continuing to work on all browsers, with a much cleaner licensing. Thoughts welcome! Cheers, Thomas Goirand (zigo)
Re: [openstack-dev] Novnc switch to sockjs-client instead of websockify
On 01/01/2014 07:33 AM, Thomas Goirand wrote: Hi, I was wondering if it would be possible for NoVNC to switch from websockify to sockjs-client, which is available here: https://github.com/sockjs/sockjs-client This has the advantage of not using flash at all (pure javascript), and continuing to work on all browsers, with a much cleaner licensing. Looks good to me and good catch here! That would be an awesome improvement. Matthias
Re: [openstack-dev] [oslo] Common SSH
On Mon, Dec 30, 2013 at 5:23 AM, Flavio Percoco fla...@redhat.com wrote: On 26/12/13 20:05 +0200, Sergey Skripnick wrote: Hi all, I'm surprised there is no common ssh library in oslo so I filed this blueprint [0]. I would be happy to address any comments/suggestions. [0] https://blueprints.launchpad.net/oslo/+spec/common-ssh-client If you think the BP is ready to be reviewed - which I bet you do since you already submitted a patch - please set a target milestone and series. That'll make it pop up in the BPs review process. I left a comment on the blueprint, but I'll post to the list for a broader audience. I'd like to explore whether the paramiko team will accept this code (or something like it). This seems like a perfect opportunity for us to contribute upstream. Doug
Re: [openstack-dev] [TripleO] Next week's meeting cancelled?
Excerpts from Robert Collins's message of 2014-01-02 01:59:23 -0800: A bunch of the TripleO folk are going to be at LCA. I'd like to cancel the meeting rather than get them all up at (IIRC) 3am. If folk who aren't going want to hold the meeting, that's fine - but I won't be around to chair :) I think the meeting may still have value for those of us not going to LCA. I'll chair.
Re: [openstack-dev] [oslo] Common SSH
On 2013-12-30 04:23, Flavio Percoco wrote: On 26/12/13 20:05 +0200, Sergey Skripnick wrote: Hi all, I'm surprised there is no common ssh library in oslo so I filed this blueprint [0]. I would be happy to address any comments/suggestions. [0] https://blueprints.launchpad.net/oslo/+spec/common-ssh-client If you think the BP is ready to be reviewed - which I bet you do since you already submitted a patch - please set a target milestone and series. That'll make it pop up in the BPs review process. Cheers, FF I definitely think this is a good thing (Nova already has a simple wrapper around Paramiko, for example), but I agree with the comments on the bp that we should investigate contributing this to paramiko itself. It seems like something they should be open to, and then we get help with the maintenance from the experts on that project. Sergey, can you follow up with the paramiko devs to see what their feelings are on this? Thanks. -Ben
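For flavour, the kind of thin wrapper being discussed might take a shape like the one below. This is purely hypothetical, not the blueprint's actual API; the injected transport stands in for a real paramiko.SSHClient so the sketch stays runnable without a network connection:

```python
class SSHClient(object):
    """Hypothetical sketch of a common SSH wrapper.

    `transport_factory` is injected so tests (and this example) can
    substitute a stub for a real paramiko-backed transport.
    """

    def __init__(self, host, username, transport_factory):
        self.host = host
        self.username = username
        self._transport_factory = transport_factory

    def execute(self, command):
        """Run a command, always closing the transport afterwards."""
        transport = self._transport_factory(self.host, self.username)
        try:
            return transport.exec_command(command)
        finally:
            transport.close()


class _StubTransport(object):
    """Stands in for a real SSH connection in this example."""

    def __init__(self, host, username):
        self.host = host
        self.username = username

    def exec_command(self, command):
        # A real transport would return the remote exit code and output.
        return 0, "ran: %s" % command

    def close(self):
        pass


client = SSHClient("example.com", "stack", _StubTransport)
rc, out = client.execute("uptime")
```

Keeping the transport pluggable is also what would make it easy to propose the wrapper upstream to paramiko later, as suggested in the thread.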
Re: [openstack-dev] [Neutron][Testr] Brand new checkout of Neutron... getting insane unit test run results
On 01/01/2014 10:56 PM, Clark Boylan wrote: On Wed, Jan 1, 2014 at 7:33 PM, 黎林果 lilinguo8...@gmail.com wrote: I have met this problem too. The unit tests can't be run. The end info is: Ran 0 tests in 0.673s OK cp: cannot stat `.testrepository/-1': No such file or directory 2013/12/28 Jay Pipes jaypi...@gmail.com: On 12/27/2013 11:11 PM, Robert Collins wrote: I'm really sorry about the horrid UI - we're in the middle of fixing the plumbing to report this and support things like tempest better - from the bottom up. The subunit listing - testr reporting of listing errors is fixed on the subunit side, but not on the testr side yet. If you look at the end of the error: \rimport errors4neutron.tests.unit.linuxbridge.test_lb_neutron_agent\x85\xc5\x1a\\', stderr=None error: testr failed (3) You can see this^ which translates as import errors neutron.tests.unit.linuxbridge.test_lb_neutron_agent so neutron/tests/unit/linuxbridge/test_lb_neutron_agent.py is failing to import. Phew, thanks Rob! I was a bit stumped there :) I have identified the import issue (this is on a fresh checkout of Neutron, BTW, so I'm a little confused how this made it through the gate...) (.venv)jpipes@uberbox:~/repos/openstack/neutron$ python Python 2.7.4 (default, Sep 26 2013, 03:20:26) [GCC 4.7.3] on linux2 Type help, copyright, credits or license for more information. import neutron.tests.unit.linuxbridge.test_lb_neutron_agent Traceback (most recent call last): File stdin, line 1, in module File neutron/tests/unit/linuxbridge/test_lb_neutron_agent.py, line 29, in module from neutron.plugins.linuxbridge.agent import linuxbridge_neutron_agent File neutron/plugins/linuxbridge/agent/linuxbridge_neutron_agent.py, line 33, in module import pyudev ImportError: No module named pyudev Looks like pyudev needs to be added to requirements.txt... I've filed a bug: https://bugs.launchpad.net/neutron/+bug/1264687 with a patch here: https://review.openstack.org/#/c/64333/ Thanks again, much appreciated!
-jay On 28 December 2013 13:41, Jay Pipes jaypi...@gmail.com wrote: Please see: http://paste.openstack.org/show/57627/ This is on a brand new git clone of neutron and then running ./run_tests.sh -V (FWIW, the same behavior occurs when running with tox -epy27 as well...) I have zero idea what to do... any help would be appreciated! It's almost like the subunit stream is being dumped as-is into the console. Best! -jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev It looks like the problem is that there is a dependency on pyudev which only works properly on Linux. The neutron setup_hook does properly install pyudev on Linux (which explains why the tests run in the gate), but would not work properly on Windows or OS X. I assume folks are trying to run the tests on something other than Linux? Nope, the problem occurs on Linux. I was using Ubuntu 13.04. I abandoned my patch after some neutron-cores said it wasn't correct to put Linux-only dependencies in requirements.txt and said it was a packaging issue. The problem is that requirements.txt is *all about packaging issues*. Until we have some way of indicating in our requirements.txt files that a dependency is only for Linux/Windows/whatever, this is going to be a pain in the butt. Neutron may want to do something similar to what Nova does when libvirt is not importable, https://git.openstack.org/cgit/openstack/nova/tree/nova/tests/virt/libvirt/test_libvirt.py#n77 and use a fake in order to get the tests to run properly anyways. Possible, but that's just a hack at its core. Fakes should be used to speed up unit tests where all you're testing is the interface between the faked-out object and the calling object, not whether or not the real object works. 
Best, -jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
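The fake-when-not-importable pattern Clark points to in Nova can be sketched roughly as follows. This is an illustrative sketch only, not Neutron's or Nova's actual code; the `_FakeContext` class and its methods are invented stand-ins:

```python
import sys
import types

# Try the real module first; fall back to a minimal fake so that test
# modules importing code which does 'import pyudev' can still load on
# platforms where the dependency is unavailable.
try:
    import pyudev  # Linux-only dependency
except ImportError:
    pyudev = types.ModuleType("pyudev")

    class _FakeContext(object):
        """Hypothetical stand-in for pyudev.Context, enough to import."""
        def list_devices(self, **kwargs):
            return []

    pyudev.Context = _FakeContext
    # Register the fake so later 'import pyudev' statements succeed too.
    sys.modules["pyudev"] = pyudev

# Code under test can now instantiate pyudev.Context either way.
ctx = pyudev.Context()
```

As Jay notes below, this is a workaround for running the tests at all, not a substitute for testing the real udev interaction.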
Re: [openstack-dev] [TripleO] Docker and TripleO
On 2013-12-31 20:35, Robert Collins wrote: So, we've spoken about using containers on baremetal - e.g. the lxc provider - in the past, and with the [righteously deserved] noise Docker is receiving, I think we need to have a short expectation-setting discussion. Previously we've said that deploying machines to deploy containers to deploy OpenStack was overly meta - I stand by it being strictly unnecessary, but since Docker seems to have gotten a really good sweet spot together, I think we're going to want to revisit those discussions. However, I think we should do so no sooner than 6 months, and probably more like a year out. I say 6-12 months because: - Docker currently describes itself as 'not for production use' - It's really an optimisation from our perspective - We need to ship a production-ready version of TripleO ASAP, and I think retooling would delay us more than it would benefit us. - There are going to be some nasty bootstrapping issues - we have to deploy the bare metal substrate and update it in all cases anyway - And I think pushing ahead with (any) container without those resolved is unwise - because our goal as always has to be to push the necessary support into the rest of OpenStack, *not* as a TripleO-unique facility. This all ties into other threads that have been raised about future architectures we could use: I think we want to evolve to have better flexibility and performance, but let's get a v1 minimal but functional - HA, scalable, usable - version in place before we advance. -Rob No argument here. I agree that it's an optimization and you know what they say about premature optimization. :-) I do wonder whether the ongoing discussion about where container support should live in OpenStack is relevant to this as well. Or would we not intend to manage the TripleO containers using OpenStack? 
If we _do_ then that is going to be a dependency too - we won't want to commit to using containers until OpenStack as a whole has decided how we're going to support them. -Ben ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO] Tuskar CLI after architecture changes
On 01/02/2014 06:41 AM, Radomir Dopieralski wrote: On 20/12/13 17:34, Clint Byrum wrote: OpenStack is non-deterministic. Deterministic systems are rigid and unable to handle failure modes of any kind of diversity. I wonder how you are going to debug a non-deterministic system :-) Very carefully. We tend to err toward pushing problems back to the user and giving them tools to resolve the problem. Avoiding spurious problems is important too, no doubt. However, what Jay has been suggesting is that the situation a pessimistic locking system would avoid is entirely user-created, and thus lower priority than, say, actually having a complete UI for deploying OpenStack. I fail to see how leaving ourselves the ability to add locks when they become needed, by keeping tuskar-api in place, conflicts with actually having a complete UI for deploying OpenStack. Can you elaborate on that? I think all Clint was saying is that completing the UI for base OpenStack deployment (Tuskar UI) is a higher priority than trying to add a pessimistic lock model/concurrency to any particular part of the existing UI. That doesn't mean you can't work on a pessimistic locking model. It just means that Clint (and I) think that completing the as-yet-unfinished UI work is a more important task. Best, and Happy New Year! -jay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
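For what it's worth, "adding locks when they become needed" does not have to mean pessimistic locking: an optimistic check on a version counter is a common middle ground that pushes conflicts back to the user, as Clint describes. The following is a generic toy sketch of optimistic concurrency control, not Tuskar code; all names here are invented for illustration:

```python
class StaleUpdateError(Exception):
    """Raised when another writer changed the record first."""

class VersionedStore(object):
    """Toy in-memory store using optimistic concurrency control."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Caller gets the version along with the value and must hand
        # the same version back when writing.
        return self._data[key]

    def write(self, key, expected_version, value):
        version, _ = self._data.get(key, (0, None))
        if version != expected_version:
            # Someone else updated it; caller must re-read and retry.
            raise StaleUpdateError(key)
        self._data[key] = (version + 1, value)

store = VersionedStore()
store.write("plan", 0, "state-a")
ver, val = store.read("plan")
store.write("plan", ver, "state-b")      # succeeds: version matched
conflict = False
try:
    store.write("plan", ver, "state-c")  # fails: version moved on
except StaleUpdateError:
    conflict = True
```

The appeal of this shape is that no lock is ever held across a user think-time window, so nothing needs to expire or be broken by an admin.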
Re: [openstack-dev] Novnc switch to sockjs-client instead of websockify
On 2014-01-01 00:33, Thomas Goirand wrote: Hi, I was wondering if it would be possible for NoVNC to switch from websockify to sockjs-client, which is available here: https://github.com/sockjs/sockjs-client This has the advantage of not using flash at all (pure javascript), and continuing to work on all browsers, with a much cleaner licensing. Thoughts welcome! Cheers, Thomas Goirand (zigo) Sounds reasonable, but NoVNC isn't an OpenStack project so you'll need to bring this up with its developers: http://kanaka.github.io/noVNC/ -Ben ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [trove] Proposal to add Auston McReynolds to trove-core
Hi Ilya, I greatly appreciate what Denis is providing to the Trove community. Thank you for letting him devote his time to Trove. Denis and I have spoken privately at length about core (most recently a few weeks ago), and I believe he has a good idea of how to grow into a core member. Please have a conversation and ask him what I and the other core team members have said to him privately. I do not think a forum like this is a good place to discuss why people who weren't yet nominated _aren't_ being nominated yet. If you have concerns, please bring them up to the TC and myself. On Thu, Jan 2, 2014 at 4:30 AM, Ilya Sviridov isviri...@mirantis.com wrote: Hello Trove team, hello Michael. I believe that Auston does a great job and personally think that his reviews are always thorough and reasonable. But it is surprising not to see Denis Makogon (dmakogon, denis_makogon) as a candidate for core. He is a well-known, active community member driving Heat integration and Cassandra support in Trove. He takes part in all technical and infrastructure discussions as well. Denis also actively helps newcomers and is always responsive on IRC. Just looking at his code review statistics [1] [2] [3] and Trove weekly meeting participation [4], I am astonished that his name was not in the first email. 
[1] http://www.russellbryant.net/openstack-stats/trove-reviewers-180.txt [2] http://www.russellbryant.net/openstack-stats/trove-reviewers-90.txt [3] http://www.russellbryant.net/openstack-stats/trove-reviewers-30.txt [4] http://eavesdrop.openstack.org/meetings/trove/2013/ With best regards, Ilya Sviridov On Tue, Dec 31, 2013 at 3:34 PM, Paul Marshall paul.marsh...@rackspace.com wrote: +1 On Dec 30, 2013, at 1:13 PM, Vipul Sabhaya vip...@gmail.com wrote: +1 Sent from my iPhone On Dec 30, 2013, at 10:50 AM, Craig Vyvial cp16...@gmail.com wrote: +1 On Mon, Dec 30, 2013 at 12:00 PM, Greg Hill greg.h...@rackspace.com wrote: +1 On Dec 27, 2013, at 4:48 PM, Michael Basnight mbasni...@gmail.com wrote: Howdy, I'm proposing Auston McReynolds (amcrn) to trove-core. Auston has been working with Trove for a while now. He is a great reviewer. He is incredibly thorough, and has caught more than one critical error in reviews, and he helps connect large features that may overlap (config edits + multiple datastores comes to mind). The code he submits is top notch, and we frequently ask for his opinion on architecture / feature / design. https://review.openstack.org/#/dashboard/8214 https://review.openstack.org/#/q/owner:8214,n,z https://review.openstack.org/#/q/reviewer:8214,n,z Please respond with +1/-1, or any further comments. 
___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Michael Basnight ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron][Testr] Brand new checkout of Neutron... getting insane unit test run results
To be fair, neutron cores turned down reviews [1][2][3] for fear that the patches would break Hyper-V support for Neutron. Whether it has been hinted (erroneously) that this was a packaging issue is irrelevant for the sake of this discussion, and I suggested (when I turned down review [3]) that we make the requirement dependent on the distro, so that the problem could be solved once and for all (and without causing any side effects). Just adding the pyudev dependency to requirements.txt is not acceptable for the above-mentioned reason. I am sorry if people keep abandoning the issue without taking the bull by the horns. [1] https://review.openstack.org/#/c/64333/ [2] https://review.openstack.org/#/c/55966/ [3] https://review.openstack.org/#/c/58884/ Cheers, Armando On 2 January 2014 18:03, Jay Pipes jaypi...@gmail.com wrote: [...]
Re: [openstack-dev] [Neutron][Testr] Brand new checkout of Neutron... getting insane unit test run results
On 01/02/2014 12:50 PM, Armando M. wrote: To be fair, neutron cores turned down reviews [1][2][3] for fear that the patch would break Hyper-V support for Neutron. [...] I am sorry if people keep abandoning the issue without taking the bull by the horns. I am more than happy to take the bull by the horns. I just don't know what the bull is, nor which way the bull is facing. If I go to grab the horns, I don't want to find that the bull is facing away from me ;) -jay [1] https://review.openstack.org/#/c/64333/ [2] https://review.openstack.org/#/c/55966/ [3] https://review.openstack.org/#/c/58884/ Cheers, Armando On 2 January 2014 18:03, Jay Pipes jaypi...@gmail.com wrote: [...]
Re: [openstack-dev] [Neutron][Testr] Brand new checkout of Neutron... getting insane unit test run results
On Thu, Jan 2, 2014 at 9:50 AM, Armando M. arma...@gmail.com wrote: To be fair, neutron cores turned down reviews [1][2][3] for fear that the patch would break Hyper-V support for Neutron. [...] I am sorry if people keep abandoning the issue without taking the bull by the horns. [1] https://review.openstack.org/#/c/64333/ [2] https://review.openstack.org/#/c/55966/ [3] https://review.openstack.org/#/c/58884/
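The "make the requirement dependent on the distro" idea Armando raises was later standardized for pip as environment markers (PEP 508), e.g. `pyudev; sys_platform == 'linux'` in requirements.txt. At the time of this thread the equivalent had to be done at setup time. A rough sketch of that setup-time approach, with illustrative package names:

```python
import sys

# Base requirements shared by all platforms (names illustrative, not
# Neutron's actual requirements list).
requirements = ["eventlet", "netaddr"]

# pyudev only works on Linux, so append it conditionally here rather
# than listing it unconditionally in requirements.txt, which would
# break installation on Windows and OS X.
if sys.platform.startswith("linux"):
    requirements.append("pyudev")

# This list would then be passed to setup(install_requires=requirements).
```

The drawback, discussed in this thread, is that a conditional computed in setup code is invisible to tools that only read requirements.txt, which is exactly why a declarative marker syntax was eventually adopted.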
Re: [openstack-dev] [Ceilometer][Oslo] Consuming Notifications in Batches
On 12/20/2013 09:26 PM, Herndon, John Luke wrote: On Dec 20, 2013, at 12:13 PM, Gordon Sim g...@redhat.com wrote: On 12/20/2013 05:27 PM, Herndon, John Luke wrote: Other protocols may support bulk consumption. My one concern with this approach is error handling. Currently the executors treat each notification individually. So let's say the broker hands over 100 messages at a time. When the client is done processing the messages, the broker needs to know if message 25 had an error or not. We would somehow need to communicate back to the broker which messages failed. I think this may take some refactoring of executors/dispatchers. What do you think? [...] (2) What would you want the broker to do with the failed messages? What sort of things might fail? Is it related to the message content itself? Or is it failures suspected to be of a temporal nature? There will be situations where the message can't be parsed, and those messages can't just be thrown away. My current thought is that ceilometer could provide some sort of mechanism for sending messages that are invalid to an external data store (like a file, or a different topic on the amqp server) where a living, breathing human can look at them and try to parse out any meaningful information. Right, in those cases simply requeueing probably is not the right thing and you really want it dead-lettered in some way. I guess the first question is whether that is part of the notification system's function, or if it is done by the application itself (e.g. by storing it or republishing it). If it is the latter you may not need any explicit negative acknowledgement. Other errors might be "database not available", in which case requeueing the message is probably the right way to go. That does mean, however, that the backlog of messages starts to grow on the broker, so some scheme for dealing with this if the database outage goes on for a bit is probably important. 
It also means that the messages will keep being retried without any 'backoff' waiting for the database to be restored which could increase the load. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
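John's concern above — the broker handing over 100 messages and needing to know that, say, message 25 failed — amounts to tracking a per-message disposition for each batch, distinguishing unparseable messages (dead-letter, for a human) from transient failures (requeue and retry). A minimal sketch of that idea, using a plain function rather than oslo.messaging's actual executor API:

```python
def process_batch(messages, handler):
    """Process a batch, returning one disposition per message.

    Dispositions: 'ack' (done), 'dead-letter' (unparseable, route to a
    store a human can inspect), 'requeue' (transient failure, retry
    later -- ideally with backoff, per Gordon's point).
    """
    dispositions = []
    for msg in messages:
        try:
            handler(msg)
            dispositions.append("ack")
        except ValueError:   # e.g. message cannot be parsed
            dispositions.append("dead-letter")
        except IOError:      # e.g. database not available
            dispositions.append("requeue")
    return dispositions

# Toy handler standing in for the real dispatcher.
def handler(msg):
    if msg == "garbled":
        raise ValueError(msg)
    if msg == "db-down":
        raise IOError(msg)

result = process_batch(["a", "garbled", "db-down", "b"], handler)
# result == ["ack", "dead-letter", "requeue", "ack"]
```

Whatever transport is used, the disposition list is what would need to flow back to the broker, which is the refactoring of executors/dispatchers John alludes to.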
Re: [openstack-dev] [nova] [neutron] PCI pass-through network support
Hi John, We had one on 12/24/2013 with the log: http://eavesdrop.openstack.org/meetings/pci_passthrough_meeting/2013/pci_passthrough_meeting.2013-12-24-14.02.log.html The next one will be at UTC 1400 on Jan. 7th, Tuesday. --Robert On 1/2/14 10:06 AM, John Garbutt j...@johngarbutt.com wrote: On 22 December 2013 12:07, Irena Berezovsky ire...@mellanox.com wrote: Hi Ian, My comments are inline. I would like to suggest focusing the next PCI pass-through IRC meeting on: 1. Closing the administration and tenant that powers the VM use cases. 2. Decoupling the nova and neutron parts to start focusing on the neutron-related details. When is the next meeting? I have lost track due to holidays, etc. John ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility
On 2014-01-03 09:43:11 +1300 (+1300), Robert Collins wrote: I disagree here - educational institutions have followed the trend, not set it. Likewise corporate management. The trend setting occurred IMNSHO through organisations like Novell and Microsoft disrupting the computing marketplace through their products {itself neither good nor bad} but what was bad was their choice to not ship the source code [...] Agreed, this is very likely the reason behind the reason. In some ways, another form of Stockholm Syndrome. I think that mischaracterises devops :). One of the crucial sea-changes that has occurred in the intervening period between early Unix administration and now is the broad acceptance of untested code as unprofessional, broken, bad. [...] Probably thanks to my changing jobs around the time the modern devops movement began to gain in popularity, I hadn't associated test-centric development culture with it. I can definitely see the relationship though, and so concur it's a positive outcome (and not merely a throwback to the beforetime). -- Jeremy Stanley ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Turbo-hipster
Heh, I didn't know that wiki page existed. I've added an entry to the checklist. There's also some talk of adding some help text to the vote message turbo-hipster leaves in gerrit, but we haven't gotten around to doing that yet. Cheers, Michael On Fri, Jan 3, 2014 at 12:57 AM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: On 12/31/2013 3:58 PM, Michael Still wrote: Hi. So while turbo-hipster is new, I've been reading every failure message it produces to make sure it's not too badly wrong. There were four failures posted last night while I slept: https://review.openstack.org/#/c/64521 This one is a TH bug. We shouldn't be testing stable branches. bug/1265238 has been filed to track this. https://review.openstack.org/#/c/61753 This is your review. The failed run's log is https://ssl.rcbops.com/turbo_hipster/logviewer/?q=/turbo_hipster/results/61/61753/8/check/gate-real-db-upgrade_nova_percona_user_001/1326092/user_001.log and you can see from the failure message that migrations 152 and 206 took too long. Migration 152 took 326 seconds, where our historical data of 2,846 test migrations says it should take 222 seconds. Migration 206 took 81 seconds, where we think it should take 56 seconds based on 2,940 test runs. Whilst I can't explain why those migrations took too long this time around, they are certainly exactly the sort of thing TH is meant to catch. If you think your patch isn't responsible (perhaps the machine is just being slow or something), you can always retest by leaving a review comment of "recheck migrations". I have done this for you on this patch. Michael, is "recheck migrations" something that is going to be added to the wiki for test failures here? https://wiki.openstack.org/wiki/GerritJenkinsGit#Test_Failures https://review.openstack.org/#/c/61717 This review also had similar unexplained slowness, but has already been rechecked by someone else and now passes. 
I note that the slowness in both cases was from the same TH worker node, and I will keep an eye on that node today. https://review.openstack.org/#/c/56420 This review also had slowness in migration 206, but has been rechecked by the developer and now passes. It wasn't on the percona-001 worker that the other two were on, so perhaps this indicates that we need to relax the timing requirements for migration 206. Hope this helps, Michael On Wed, Jan 1, 2014 at 12:34 AM, Gary Kotton gkot...@vmware.com wrote: Hi, It seems that she/he is behaving oddly again. I have posted a patch that does not have any database changes and it has give me a –1…. Happy new year Gary ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Thanks, Matt Riedemann ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Rackspace Australia ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
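The kind of check Michael describes — flagging a migration whose wall-clock time exceeds what historical runs suggest it should take — can be sketched as below. The figures are the ones quoted in the thread for review 61753; the function itself is an illustration, not turbo-hipster's actual code:

```python
def check_migration_times(observed, limits):
    """Return the migration numbers whose observed duration (seconds)
    exceeds the limit derived from historical test runs."""
    return [num for num, took in sorted(observed.items())
            if took > limits.get(num, float("inf"))]

# Numbers quoted in the mail for the failed run:
observed = {152: 326, 206: 81}
limits = {152: 222, 206: 56}  # from 2,846 / 2,940 historical runs
slow = check_migration_times(observed, limits)
# slow == [152, 206] -- both migrations flagged, as in the failure message
```

The later remark about "relaxing the timing requirements for migration 206" corresponds to simply raising that migration's entry in the limits table.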
Re: [openstack-dev] [Neutron][qa] Parallel testing update
Hi Salvatore! Good work on this. About the quota limit tests, I believe they may be unit-tested, instead of functionally tested. When running those tests in parallel with any other tests that rely on having ports, networks or subnets available within quota, they have high chances of making those other tests fail. Cheers, Miguel Ángel Ajo - Original Message - From: Kyle Mestery mest...@siliconloons.com To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Sent: Thursday, January 2, 2014 7:53:05 PM Subject: Re: [openstack-dev] [Neutron][qa] Parallel testing update Thanks for the updates here Salvatore, and for continuing to push on this! This is all great work! On Jan 2, 2014, at 6:57 AM, Salvatore Orlando sorla...@nicira.com wrote: Hi again, I've now run the experimental job a good deal of times, and I've filed bugs for all the issues which came out. Most of them occurred no more than once among all test executions (I think about 30). They're all tagged with neutron-parallel [1]. For ease of tracking, I've associated all the bug reports with neutron, but some are probably more tempest or nova issues. Salvatore [1] https://bugs.launchpad.net/neutron/+bugs?field.tag=neutron-parallel On 27 December 2013 11:09, Salvatore Orlando sorla...@nicira.com wrote: Hi, We now have several patches under review which improve a lot how neutron handles parallel testing. In a nutshell, these patches try to ensure the ovs agent processes new, removed, and updated interfaces as soon as possible. These patches are: https://review.openstack.org/#/c/61105/ https://review.openstack.org/#/c/61964/ https://review.openstack.org/#/c/63100/ https://review.openstack.org/#/c/63558/ There is still room for improvement. For instance the calls from the agent into the plugins might be consistently reduced. 
However, even if the above patches greatly shrink the time required for processing a device, we are still hitting a hard limit with the execution of ovs commands for setting local vlan tags and clearing flows (or adding the flow rule for dropping all the traffic). In some instances these commands slow down a lot, requiring almost 10 seconds to complete. This adds a delay in interface processing which in some cases leads to the hideous SSH timeout error (the same we see with bug 1253896 in normal testing). It is also worth noting that when this happens sysstat reveals CPU usage is very close to 100%. From the neutron side there is little we can do. Introducing parallel processing for interfaces, as we do for the l3 agent, is not actually a solution, since ovs-vswitchd v1.4.x, the one executed on gate tests, is not multithreaded. If you think the situation might be improved by changing the logic for handling local vlan tags and putting ports on the dead vlan, I would be happy to talk about that. On my local machines I've seen a dramatic improvement in processing times by installing ovs 2.0.0, which has a multi-threaded vswitchd. Is this something we might consider for gate tests? Also, in order to reduce CPU usage on the gate (and making tests a bit faster), there is a tempest patch which stops creating and wiring neutron routers when they're not needed: https://review.openstack.org/#/c/62962/ Even in my local setup which succeeds about 85% of times, I'm still seeing some occurrences of the issue described in [1], which at the end of the day seems a dnsmasq issue. Beyond the 'big' structural problem discussed above, there are some minor problems with a few tests: 1) test_network_quotas.test_create_ports_until_quota_hit fails about 90% of times. 
I think this is because the test itself should be made aware of parallel execution and asynchronous events, and there is a patch for this already: https://review.openstack.org/#/c/64217 2) test_attach_interfaces.test_create_list_show_delete_interfaces fails about 66% of times. The failure is always on an assertion made after deletion of interfaces, which probably means the interface is not deleted within 5 seconds. I think this might be a consequence of the higher load on the neutron service and we might try to enable multiple workers on the gate to this aim, or just increase the tempest timeout. On a slightly different note, allow me to say that the way assertions are made in this test might be improved a bit. So far one has to go through the code to see why the test failed. Thanks for reading this rather long message. Regards, Salvatore [1] https://lists.launchpad.net/openstack/msg23817.html On 2 December 2013 22:01, Kyle Mestery (kmestery) kmest...@cisco.com wrote: Yes, this is all great Salvatore and Armando! Thank you for all of this work and the explanation behind it all. Kyle On Dec 2, 2013, at 2:24 PM, Eugene
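Miguel's suggestion above — move quota-limit checks to unit tests, since a functional quota test shares a tenant's quota with whatever runs in parallel — can be illustrated with a small sketch. All names here are hypothetical, not Neutron's actual quota API; the point is only that quota enforcement is pure logic and therefore tests deterministically:

```python
# Hypothetical sketch: quota enforcement is pure bookkeeping, so it can
# be unit-tested without racing against parallel tempest runs that
# consume ports/networks/subnets from the same tenant's quota.

class OverQuota(Exception):
    pass


class QuotaEnforcer:
    def __init__(self, limits):
        self.limits = limits                      # e.g. {"port": 3}
        self.used = {r: 0 for r in limits}

    def reserve(self, resource, count=1):
        # Refuse the reservation if it would exceed the configured limit.
        if self.used[resource] + count > self.limits[resource]:
            raise OverQuota(resource)
        self.used[resource] += count


def create_ports_until_quota_hit(limit=3):
    """Unit-test version of the scenario: returns True iff the
    (limit+1)-th creation is rejected."""
    q = QuotaEnforcer({"port": limit})
    for _ in range(limit):
        q.reserve("port")
    try:
        q.reserve("port")
    except OverQuota:
        return True
    return False
```

Nothing here touches a shared backend, so the test cannot be perturbed by (or perturb) other tests running in parallel, which is exactly the failure mode described for the functional version.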
Re: [openstack-dev] [Neutron][qa] Parallel testing update
Another way to tackle it would be to create a dedicated tenant for those tests, then the quota won't interact with anything else. On 3 January 2014 10:35, Miguel Angel Ajo Pelayo mangel...@redhat.com wrote: [...]
Re: [openstack-dev] [nova] Turbo-hipster
On 01/02/2014 04:29 PM, Michael Still wrote: Heh, I didn't know that wiki page existed. I've added an entry to the checklist. There's also some talk of adding some help text to the vote message turbo-hipster leaves in gerrit, but we haven't gotten around to doing that yet. Cheers, Michael So was there enough countable slowness earlier in the run that you could have predicted these runs would be slower overall? My experience looking at Tempest run data is there can be as much as a 60% variance between the fastest and slowest nodes (same instance type) within the same cloud provider, which is the reason we've never tried to performance gate on it. However if there was some earlier benchmark that would let you realize that the whole run was slow, so give it more of a buffer, that would probably be useful. -Sean -- Sean Dague Samsung Research America s...@dague.net / sean.da...@samsung.com http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [qa] Negative test generation
I took a little time to think some more about this and pushed a little working prototype of some ideas I had: https://review.openstack.org/#/c/64733/1. Comments about the approach are most welcome. Note that I minimized any refactoring of the existing hierarchy for this prototype. -David ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Turbo-hipster
Michael Still mi...@stillhq.com writes: Heh, I didn't know that wiki page existed. I've added an entry to the checklist. There's also some talk of adding some help text to the vote message turbo-hipster leaves in gerrit, but we haven't gotten around to doing that yet. I would rather not mention it on that page, which is the documentation for the project gating system and developer workflow (Zuul links to it when it leaves a failure message) so I have removed it. I _do_ think adding help text to the messages third-party tools leave, and/or linking to specific documentation (ideally also in the OpenStack wiki) from there is a good idea. However, there are _a lot_ of third-party test systems coming on-line, and I'm not sure that expanding the recheck language to support ever more complexity is a good idea. I can see how being able to say recheck foo would be useful in some circumstances, but given that just saying recheck will suffice, I'd prefer that we kept the general recommendation simple so developers can worry about something else. Certainly at a minimum, recheck should recheck all the systems; that's one of the proposed requirements here: https://review.openstack.org/#/c/63478/5/doc/source/third_party.rst I think it would be best if we stopped there. But if you still feel very strongly that you want a private extension to the syntax, please consider how necessary it is for most developers to know about it when you decide how prominently to feature it in messages or documentation about your tools. -Jim ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Turbo-hipster
Sean Dague s...@dague.net writes: On 01/02/2014 04:29 PM, Michael Still wrote: Heh, I didn't know that wiki page existed. I've added an entry to the checklist. There's also some talk of adding some help text to the vote message turbo-hipster leaves in gerrit, but we haven't gotten around to doing that yet. Cheers, Michael So was there enough countable slowness earlier in the run that you could have predicted these runs would be slower overall? My experience looking at Tempest run data is there can be as much as a 60% variance between the fastest and slowest nodes (same instance type) within the same cloud provider, which is the reason we've never tried to performance gate on it. However if there was some earlier benchmark that would let you realize that the whole run was slow, so give it more of a buffer, that would probably be useful. If you are able to do this and benchmark the performance of a cloud server reliably enough, we might be able to make progress on performance testing, which has been long desired. The large ops test is (somewhat accidentally) a performance test, and predictably, it has failed when we change cloud node provider configurations. A benchmark could make this test more reliable and other tests more feasible. -Jim ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Turbo-hipster
On 3 January 2014 11:26, James E. Blair jebl...@openstack.org wrote: If you are able to do this and benchmark the performance of a cloud server reliably enough, we might be able to make progress on performance testing, which has been long desired. The large ops test is (somewhat accidentally) a performance test, and predictably, it has failed when we change cloud node provider configurations. A benchmark could make this test more reliable and other tests more feasible. In bzr we found it much more reliable to do tests that isolate and capture the *effort*, not the time: most [not all] performance issues have both a time and effort domain, and the effort domain is usually correlated with time in a particular environment, but itself approximately constant across environments. For instance, MB sent in a request, or messages on the message bus, or writes to the file system, or queries sent to the DB. So the structure we ended up with - which was quite successful - was: - a cron job based benchmark that ran several versions through functional scenarios and reported timing data - gating tests that tested effort for operations - a human process whereby someone wanting to put a ratchet on some aspect of performance would write an effort based test or three to capture the status quo, then make it better and update the tests with their improvements. I think this would work well for OpenStack too - and in fact we have some things that are in this general direction already. -Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Turbo-hipster
On Fri, Jan 3, 2014 at 9:24 AM, James E. Blair jebl...@openstack.org wrote: However, there are _a lot_ of third-party test systems coming on-line, and I'm not sure that expanding the recheck language to support ever more complexity is a good idea. I can see how being able to say recheck foo would be useful in some circumstances, but given that just saying recheck will suffice, I'd prefer that we kept the general recommendation simple so developers can worry about something else. Fair enough. I feel like you and I should sit down and chat about this stuff at the LCA meetup next week. If we need to make tweaks based on that, then we will. Cheers, Michael -- Rackspace Australia ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [solum][general] some pip package installs from requirements.txt fail when using pip 1.5
I have a box with a newer version of pip (1.5) and it refuses to install netaddr >= 0.7.6 as it's an external/insecure pip package. This may cause issues for devs that are running newer versions of pip than standard system packages. There may be other packages in the larger global-requirements ( https://github.com/openstack/requirements/blob/master/global-requirements.txt ) that will suffer the same fate. The error message looks something like this: Downloading/unpacking netaddr>=0.7.6 (from -r requirements.txt (line 3)) Could not find a version that satisfies the requirement netaddr>=0.7.6 (from -r requirements.txt (line 3)) (from versions: 0.3.1, 0.3.1, 0.4, 0.4, 0.5.1, 0.5.1, 0.5.2, 0.5.2, 0.5, 0.5, 0.6.1, 0.6.1, 0.6.2, 0.6.2, 0.6.3, 0.6.3, 0.6.4, 0.6.4, 0.6, 0.6, 0.7.1, 0.7.1, 0.7.2, 0.7.2, 0.7.3, 0.7.3, 0.7, 0.7) Some insecure and unverifiable files were ignored (use --allow-unverified netaddr to allow). Cleaning up... No distributions matching the version for netaddr>=0.7.6 (from -r requirements.txt (line 3)) Storing debug log for failure in /home/paul6951/.pip/pip.log Dropping the netaddr version to 0.7.3 fixes the error I see, as does running 'pip install --allow-all-external --allow-unverified netaddr -r requirements.txt' ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Turbo-hipster
On 01/02/2014 05:39 PM, Michael Still wrote: On Fri, Jan 3, 2014 at 9:24 AM, James E. Blair jebl...@openstack.org wrote: However, there are _a lot_ of third-party test systems coming on-line, and I'm not sure that expanding the recheck language to support ever more complexity is a good idea. I can see how being able to say recheck foo would be useful in some circumstances, but given that just saying recheck will suffice, I'd prefer that we kept the general recommendation simple so developers can worry about something else. Fair enough. I feel like you and I should sit down and chat about this stuff at the LCA meetup next week. If we need to make tweaks based on that, then we will. Cheers, Michael I'd love to attend this chat, if possible. A number of the coming on-line third-party test systems are motivated by neutron plugins. I'd get a lot from hearing this discussion. Thanks, Anita. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ceilometer][Oslo] Consuming Notifications in Batches
On 1/2/14, 11:36 AM, Gordon Sim g...@redhat.com wrote: On 12/20/2013 09:26 PM, Herndon, John Luke wrote: On Dec 20, 2013, at 12:13 PM, Gordon Sim g...@redhat.com wrote: On 12/20/2013 05:27 PM, Herndon, John Luke wrote: Other protocols may support bulk consumption. My one concern with this approach is error handling. Currently the executors treat each notification individually. So let's say the broker hands 100 messages at a time. When the client is done processing the messages, the broker needs to know if message 25 had an error or not. We would somehow need to communicate back to the broker which messages failed. I think this may take some refactoring of executors/dispatchers. What do you think? [...] (2) What would you want the broker to do with the failed messages? What sort of things might fail? Is it related to the message content itself? Or is it failures suspected to be of a temporal nature? There will be situations where the message can't be parsed, and those messages can't just be thrown away. My current thought is that ceilometer could provide some sort of mechanism for sending messages that are invalid to an external data store (like a file, or a different topic on the amqp server) where a living, breathing human can look at them and try to parse out any meaningful information. Right, in those cases simply requeueing probably is not the right thing and you really want it dead-lettered in some way. I guess the first question is whether that is part of the notification system's function, or if it is done by the application itself (e.g. by storing it or republishing it). If it is the latter you may not need any explicit negative acknowledgement. Exactly, I'm thinking this is something we'd build into ceilometer and not oslo, since ceilometer is where the event parsing knowledge lives. From an oslo point of view, the message would be 'acked'. Other errors might be "database not available", in which case re-queueing the message is probably the right way to go. 
That does mean however that the backlog of messages starts to grow on the broker, so some scheme for dealing with this if the database outage goes on for a bit is probably important. It also means that the messages will keep being retried without any 'backoff' waiting for the database to be restored, which could increase the load. This is a problem we already have :( https://github.com/openstack/ceilometer/blob/master/ceilometer/notification.py#L156-L158 Since notifications cannot be lost, overflow needs to be detected and the messages need to be saved. I'm thinking the database being down is a rare occurrence that will be worthy of waking someone up in the middle of the night. One possible solution: flip the collector into an emergency mode and save notifications to disc until the issue is resolved. Once the db is up and running, the collector inserts all of these saved messages (as one big batch!). Thoughts? I'm not sure I understand what you are saying about retrying without a backoff. Can you explain? -john ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
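The "emergency mode" John proposes could look roughly like the sketch below: spool undeliverable notifications to local disk, then replay them as one batch once the store recovers. All names, the JSON-lines spool format, and the `db.insert(batch)` interface are assumptions for illustration, not the actual Ceilometer collector API:

```python
import json
import os


class SpoolingCollector:
    """Sketch of a collector that falls back to a local disk spool while
    the database is down, then replays the spool as one big batch."""

    def __init__(self, db, spool_path):
        self.db = db                    # assumed to expose insert(batch)
        self.spool_path = spool_path

    def record(self, notification):
        try:
            self.db.insert([notification])
        except ConnectionError:
            # Emergency mode: append one JSON line per notification so
            # nothing is lost while the DB is unavailable.
            with open(self.spool_path, "a") as f:
                f.write(json.dumps(notification) + "\n")

    def replay_spool(self):
        """Once the DB is back, insert everything spooled as one batch."""
        if not os.path.exists(self.spool_path):
            return 0
        with open(self.spool_path) as f:
            batch = [json.loads(line) for line in f]
        self.db.insert(batch)           # one big batch insert
        os.remove(self.spool_path)
        return len(batch)
```

A real implementation would also need to bound the spool and decide when to re-probe the database (the backoff concern Gordon raises), but the shape of the fallback is this simple.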
Re: [openstack-dev] [nova] Turbo-hipster
On Fri, Jan 3, 2014 at 9:46 AM, Anita Kuno ante...@anteaya.info wrote: I'd love to attend this chat, if possible. A number of the coming on-line third-party test systems are motivated by neutron plugins. I'd get a lot from hearing this discussion. I've created a BoF on Monday night straight after the CI miniconf in the same room. Let's all get together and have a chat and then go to dinner. Michael -- Rackspace Australia ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Turbo-hipster
On 01/02/2014 05:58 PM, Michael Still wrote: On Fri, Jan 3, 2014 at 9:46 AM, Anita Kuno ante...@anteaya.info wrote: I'd love to attend this chat, if possible. A number of the coming on-line third-party test systems are motivated by neutron plugins. I'd get a lot from hearing this discussion. I've created a BoF on Monday night straight after the CI miniconf in the same room. Let's all get together and have a chat and then go to dinner. Michael Great, thank you Michael. Anita. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Ceilometer] Vertica Storage Driver Testing
Hi, I'm working on adding a vertica (www.vertica.com) storage driver to ceilometer. I would love to get this driver into upstream. However, I've run into a bit of a snag with the tests. It looks like all of the existing storage drivers have "in-memory" versions that are used for unit tests. Vertica does not have an in-memory implementation, and is not trivial to set up. Given this constraint, I don't think it will be possible to run unit tests "out-of-the-box" against a real vertica database. Vertica is mostly sql compliant, so I could use a sqlite or h2 backend to test the query parts of the driver. Data loading can't be done with sqlite, and will probably need to be tested with mocks. Is this an acceptable approach for unit tests, or do the tests absolutely need to run against the database under test? Thanks! -john ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ceilometer] Vertica Storage Driver Testing
Excerpts from Herndon, John Luke's message of 2014-01-02 15:16:26 -0800: Hi, I'm working on adding a vertica (www.vertica.com) storage driver to ceilometer. I would love to get this driver into upstream. However, I've run into a bit of a snag with the tests. It looks like all of the existing storage drivers have "in-memory" versions that are used for unit tests. Vertica does not have an in-memory implementation, and is not trivial to set up. Given this constraint, I don't think it will be possible to run unit tests "out-of-the-box" against a real vertica database. Well arguably those other implementations aren't really running against a real database either, so I don't see a problem with this. Vertica is mostly sql compliant, so I could use a sqlite or h2 backend to test the query parts of the driver. Data loading can't be done with sqlite, and will probably need to be tested with mocks. Is this an acceptable approach for unit tests, or do the tests absolutely need to run against the database under test? A fake Vertica or mocking it out seems like a good idea. I'm not deeply involved with Ceilometer, but in general I think it is preferable to test only the _code_ in unit tests. However, it may be a good idea to adopt an approach similar to Nova's approach and require that a 3rd party run Vertica integration tests in the gate. ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
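A minimal sketch of the split John and Clint discuss — exercise the driver's plain SQL against an in-memory sqlite connection, and mock out the Vertica-specific bulk-load path. The driver class, table schema, and method names below are hypothetical, not the actual Ceilometer storage interface:

```python
import sqlite3
from unittest import mock


class VerticaStyleDriver:
    """Hypothetical driver: portable SQL queries plus a bulk loader that
    would use Vertica's COPY in production."""

    def __init__(self, conn):
        self.conn = conn

    def get_meter_names(self):
        # Portable SQL: works the same on sqlite and Vertica.
        cur = self.conn.execute("SELECT DISTINCT name FROM meters")
        return sorted(row[0] for row in cur)

    def bulk_load(self, rows):
        # Vertica-specific (COPY-based) in the real driver; sqlite cannot
        # emulate it, so unit tests replace it with a mock.
        raise NotImplementedError


def make_test_driver():
    """Build a driver wired to an in-memory sqlite DB with sample data,
    with the non-portable bulk loader mocked out."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE meters (name TEXT)")
    conn.executemany("INSERT INTO meters VALUES (?)",
                     [("cpu",), ("disk",), ("cpu",)])
    driver = VerticaStyleDriver(conn)
    driver.bulk_load = mock.Mock()
    return driver
```

This tests only the code, per Clint's point: query logic runs for real against sqlite, while the data-loading path is asserted on via the mock and left to third-party integration tests against actual Vertica.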
Re: [openstack-dev] MySQLdb situation with the gate
On 01/02/2014 12:48 PM, Monty Taylor wrote: Hey all! This morning, we experienced our first systemic breakage of the gate of 2014! There are a few reasons for this, which are related to pip and virtualenv making new releases. The new releases do WONDERFUL things and are more secure, but unfortunately, we've hit a few pain points which we are working through. The most visible one that's going to come across people's plate has to do with mysql-python. The upstream package still hasn't updated itself from attempting to explicitly use distribute, and it cannot be installed by pip 1.5. (this has nothing to do with OpenStack - mysql-python is simply uninstallable with pip 1.5 at the moment) Since we kinda use the heck out of mysql-python, and we're also pretty heavy users of pip, we have to address the situation. First of all, we made a pull request to upstream: https://github.com/farcepest/MySQLdb1/pull/44 That fixes the issue. In a perfect world, this will get merged and 1.2.5 will get released in the next 30 minutes. Holla! A perfect world happened and Andy Dustman has merged the PR and cut a 1.2.5 release. However, there is another pull request that is 3 months old that also addresses this issue, so we're having to make contingency plans in case it does not get addressed. The most immediate plan is that we've prepared a lightweight fork based on the above patches, called it python-MySQLdb, and uploaded it to PyPI. The only things changed in the fork are the packaging changes needed to get things working with modern pip. We'll be making patches to move requirements to consume that for the time being. As it's the same code as mysql-python, distros should not need to care about this fork, and we fully intend to delete it the instant upstream is fixed. (to be fair, by delete we probably mean upload an empty package which depends on mysql-python) I just deleted the package. Since we never used it for anything. I love it when stuff works out like that. 
We're not happy about this, but we also don't want the gate to be broken - or for local dev envs to be broken - for a significant period of time. Thanks! Monty ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ceilometer] Vertica Storage Driver Testing
On 1/2/14, 4:27 PM, Clint Byrum cl...@fewbar.com wrote: Excerpts from Herndon, John Luke's message of 2014-01-02 15:16:26 -0800: Hi, I'm working on adding a vertica (www.vertica.com) storage driver to ceilometer. I would love to get this driver into upstream. However, I've run into a bit of a snag with the tests. It looks like all of the existing storage drivers have "in-memory" versions that are used for unit tests. Vertica does not have an in-memory implementation, and is not trivial to set up. Given this constraint, I don't think it will be possible to run unit tests "out-of-the-box" against a real vertica database. Well arguably those other implementations aren't really running against a real database either, so I don't see a problem with this. Vertica is mostly sql compliant, so I could use a sqlite or h2 backend to test the query parts of the driver. Data loading can't be done with sqlite, and will probably need to be tested with mocks. Is this an acceptable approach for unit tests, or do the tests absolutely need to run against the database under test? A fake Vertica or mocking it out seems like a good idea. I'm not deeply involved with Ceilometer, but in general I think it is preferable to test only the _code_ in unit tests. However, it may be a good idea to adopt an approach similar to Nova's approach and require that a 3rd party run Vertica integration tests in the gate. I don't think it would be that hard to get the review or gate jobs to use a real vertica instance, actually. Who do I talk to about that? ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Turbo-hipster
On Fri, Jan 3, 2014 at 9:39 AM, Michael Still mi...@stillhq.com wrote: On Fri, Jan 3, 2014 at 9:24 AM, James E. Blair jebl...@openstack.org wrote: However, there are _a lot_ of third-party test systems coming on-line, and I'm not sure that expanding the recheck language to support ever more complexity is a good idea. I can see how being able to say recheck foo would be useful in some circumstances, but given that just saying recheck will suffice, I'd prefer that we kept the general recommendation simple so developers can worry about something else. Fair enough. I feel like you and I should sit down and chat about this stuff at the LCA meetup next week. If we need to make tweaks based on that, then we will. Further to this, I have just reloaded our zuul with a rules change. The following events will all cause a turbo hipster check run: - uploading a patchset - restoring a patchset - commenting recheck .* - commenting recheck migrations Cheers, Michael -- Rackspace Australia ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [solum][general] some pip package installs from requirements.txt fail when using pip 1.5
On 2014-01-02 22:46:06 +0000 (+0000), Paul Czarkowski wrote: [...] Dropping the netaddr version to 0.7.3 fixes the error I see, as does running 'pip install --allow-all-external --allow-unverified netaddr -r requirements.txt' Yes, this and needing a new virtualenv release for the issue Monty raised with them today seem to be the last remaining problems we're aware of stemming from the new pip 1.5 behaviors. I'm currently looking at updating nova's tox.ini to use... install_command = pip install -U --allow-external netaddr --allow-unverified netaddr {opts} {packages} The downside is that anyone with an earlier virtualenv installed will need to upgrade to a version bundling pip 1.5, since pip before 1.5 doesn't recognize the --allow-unverified option and exits nonzero when tox tries to pass it in. -- Jeremy Stanley ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
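[Editor's sketch] In context, the tox.ini change Jeremy describes would look something like the fragment below. The install_command line is quoted from his mail; the section name and deps lines are illustrative, not nova's actual file:

```ini
# tox.ini (fragment) -- force pip 1.5 to accept the externally hosted,
# unverified netaddr package despite its stricter new defaults
[testenv]
install_command = pip install -U --allow-external netaddr --allow-unverified netaddr {opts} {packages}
deps = -r{toxinidir}/requirements.txt
       -r{toxinidir}/test-requirements.txt
```

As Jeremy notes, this only works once the virtualenv in use bundles pip 1.5, since older pips abort on the unrecognized flags.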
[openstack-dev] neutron plugin for opendaylight
[Apologies if this message is duplicated] Hi Kyle/Team, I’m looking at OpenDaylight support in OpenStack and see the relevant blueprint for the neutron plugin for ODL at https://docs.google.com/document/d/1rdowsQSYBirS634RMFePOaJu1FysjUUZtobChTkfOIY/edit - is there a repo where we can review the code implemented so far and test it out? I can participate in the implementation effort as well; please let me know if there are some modules that need to be addressed. Regards, Vijay ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [TripleO] Docker and TripleO
No argument here. I agree that it's an optimization and you know what they say about premature optimization. :-) I do wonder whether the ongoing discussion about where container support should live in OpenStack is relevant to this as well. Or would we not intend to manage the TripleO containers using OpenStack? If we _do_ then that is going to be a dependency too - we won't want to commit to using containers until OpenStack as a whole has decided how we're going to support them. This is the crux of it :) - and why I'm proposing we defer this until folk have bandwidth to do a decent job of analysis. But as a teaser - here's the stack, for a given arbitrary service - overcloud nova-compute: nova-compute deployed in container $UUID, deployed by container-driver X on machine $UUID, deployed by Ironic. So we have: Ironic deploying the container substrate; the container substrate (e.g. nova + docker) deploying containers; the overcloud service itself. That's three distinct hypervisors in play: are there three clouds, or two clouds and three regions? Or two clouds and one multi-hypervisor region? How do we orchestrate graceful deployments of the container substrate (consider - if nova-compute is running in a container, we'd have to quiesce load on a machine at two separate levels...)? Will VIF plugging and iSCSI work properly in containers? There's some reason to think that iSCSI won't, for one... we hit a wall with it with nova-baremetal. That may mean that we deploy a single overcloud via two different hypervisors (simple services in containers, and compute/storage on bare metal). And then there's the question of where the container substrate should be owned: should it be part of the overcloud and deployed in the same stack - e.g. 'we get full machines and for some of them we choose to subdivide' - or should it be an undercloud service - 'you can ask for machines or containers'? To my mind there are too many unknowns to meaningfully design at this point.
We'll need to do a bunch of experiments to learn the constraints. -Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ceilometer] Vertica Storage Driver Testing
On Thu, Jan 2, 2014 at 5:36 PM, Robert Collins robe...@robertcollins.net wrote: On 3 January 2014 14:34, Robert Collins robe...@robertcollins.net wrote: On 3 January 2014 12:40, Herndon, John Luke john.hern...@hp.com wrote: On 1/2/14, 4:27 PM, Clint Byrum cl...@fewbar.com wrote: I don’t think it would be that hard to get the review or gate jobs to use a real vertica instance, actually. Who do I talk to about that? http://ci.openstack.org/third_party.html Oh, if you meant setting up a gate variant to use vertica community edition - I'd run it past the ceilometer folk and then just submit patches to devstack, devstack-gate and infra/config to do it. devstack - code for setting up a real vertica devstack-gate - handles passing the right flags to devstack for the configuration scenarios we test against infra/config - has the jenkins job builder definitions to define the jobs -Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev I don't think Vertica is open source. There is a community edition that requires registration to use. This is problematic for upstream gating, but seems like a good candidate for third party testing. Clark ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Ceilometer] Vertica Storage Driver Testing
On 01/02/2014 08:36 PM, Robert Collins wrote: snip I think general policy (thus far) has been that we're not going to put non Open Source software into upstream gate jobs. So you really should approach this via 3rd party testing instead. The DB2 folks are approaching it that way, for that reason. -Sean -- Sean Dague Samsung Research America s...@dague.net / sean.da...@samsung.com http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron][IPv6] Meeting time - change to 1300 UTC or 1500 UTC?
Sean, Tuesdays is better for China :) thank you so much. Thanks Best Regards, Yang Yu(于杨) On 2014-01-02 18:08, Collins, Sean sean_colli...@cable.comcast.com wrote: Looking at the calendar, our options for 1500 UTC require us to change the day that we meet. The following days are available: * Tuesdays * Fridays Thoughts? -- Sean M. Collins ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [elastic-recheck] Thoughts on next steps
A lot of elastic recheck this fall has been based on the ad hoc needs of the moment, in between diving down into the race bugs that were uncovered by it. This week away from it all helped provide a little perspective on what I think we need to do to call it *done* (i.e. something akin to a 1.0 even though we are CDing it). Here is my current thinking on the next major things that should happen. Opinions welcomed. (These are roughly in implementation order based on urgency) = Split of web UI = The elastic recheck page is becoming a mishmash of what was needed at the time. I think what we really have emerging is: * Overall Gate Health * Known (to ER) Bugs * Unknown (to ER) Bugs - more below I think the landing page should be Known Bugs, as that's where we want both bug hunters to go to prioritize things, as well as where people looking for known bugs should start. I think the overall Gate Health graphs should move to the zuul status page. Possibly as part of the collection of graphs at the bottom. We should have a secondary page (maybe column?) of the un-fingerprinted recheck bugs, largely to use as candidates for fingerprinting. This will let us eventually take over /recheck. = Data Analysis / Graphs = I spent a bunch of time playing with pandas over break (http://dague.net/2013/12/30/ipython-notebook-experiments/), it's kind of awesome. It also made me rethink our approach to handling the data. I think the rolling average approach we were taking is more precise than accurate. As these are statistical events they really need error bars. Because when we have a quiet night, and 1 job fails at 6am in the morning, the 100% failure rate it reflects in grenade needs to be quantified that it was 1 of 1, not 50 of 50. So my feeling is we should move away from the point graphs we have, and present these as weekly and daily failure rates (with graphs and error bars). And slice those per job.
My suggestion is that we do the actual visualization with matplotlib because it's super easy to output that from pandas data sets. Basically we'll be mining Elastic Search -> Pandas TimeSeries -> transforms and analysis -> output tables and graphs. This is different enough from our current jquery graphing that I want to get ACKs before doing a bunch of work here and finding out people don't like it in reviews. Also in this process upgrade the metadata that we provide for each of those bugs so it's a little more clear what you are looking at. = Take over of /recheck = There is still a bunch of useful data coming in on recheck bug data which hasn't been curated into ER queries. I think the right thing to do is treat these as a work queue of bugs we should be building patterns out of (or completely invalidating). I've got a preliminary gerrit bulk query piece of code that does this, which would remove the need of the daemon the way that's currently happening. The gerrit queries are a little long right now, but I think if we are only doing this on hourly cron, the additional load will be negligible. This would get us into a single view, which I think would be more informative than the one we currently have. = Categorize all the jobs = We need a bit of refactoring to let us comment on all the jobs (not just tempest ones). Basically we assumed pep8 and docs don't fail in the gate at the beginning. Turns out they do, and are good indicators of infra / external factor bugs. They are a part of the story so we should put them in. = Multi Line Fingerprints = We've definitely found bugs where we never had a really satisfying single line match, but we had some great matches if we could do multi line. We could do that in ER, however it will mean giving up logstash as our UI, because those queries can't be done in logstash. So in order to do this we'll really need to implement some tools - cli minimum, which will let us easily test a bug.
A custom web UI might be in order as well, though that's going to be its own chunk of work that we'll need more volunteers for. This would put us in a place where we should have all the infrastructure to track 90% of the race conditions, and talk about them in certainty as 1%, 5%, 0.1% bugs. -Sean -- Sean Dague Samsung Research America s...@dague.net / sean.da...@samsung.com http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
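[Editor's sketch] Sean's point about error bars — that 1 failure in 1 run and 50 failures in 50 runs are very different signals even though both read as "100%" — can be made concrete with a standard binomial confidence interval. The Wilson score interval below is one reasonable choice; the mail doesn't prescribe a specific method:

```python
import math

def failure_rate_interval(failures, runs, z=1.96):
    """Observed failure rate plus an approximate 95% Wilson score interval."""
    if runs == 0:
        return (0.0, 0.0, 1.0)
    p = failures / runs
    denom = 1 + z * z / runs
    center = (p + z * z / (2 * runs)) / denom
    half = z * math.sqrt(p * (1 - p) / runs + z * z / (4 * runs * runs)) / denom
    return (p, max(0.0, center - half), min(1.0, center + half))

# 1 failure in 1 run: nominally 100%, but the interval is very wide.
p1, lo1, hi1 = failure_rate_interval(1, 1)
# 50 failures in 50 runs: also 100%, but now a tight interval near 1.
p2, lo2, hi2 = failure_rate_interval(50, 50)
print(f"1/1:   rate={p1:.0%} interval=[{lo1:.0%}, {hi1:.0%}]")
print(f"50/50: rate={p2:.0%} interval=[{lo2:.0%}, {hi2:.0%}]")
```

In a pandas pipeline this would run per job per day/week after grouping the Elastic Search records, with the interval halfwidth feeding the error bars on the matplotlib plots.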
Re: [openstack-dev] [elastic-recheck] Thoughts on next steps
On Thu, Jan 2, 2014 at 6:29 PM, Sean Dague s...@dague.net wrote: snip This is great stuff. Out of curiosity, is doing the graphing with pandas and ES vs graphite so that we can graph things in a more ad hoc fashion? Also, for the dashboard, Kibana3 does a lot more stuff than Kibana2 which we currently use. I have been meaning to get Kibana3 running alongside Kibana2 and I think it may be able to do multi line queries (I need to double check that, but it has a lot more query and graphing capability). I think Kibana3 is worth looking into as well before we go too far down the road of custom UI. Clark ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Neutron] disable_security_group_extension_if_noop_driver() function in plugins
Hi, I'm not sure if this question has been asked, but I wonder what the purpose is of having the following function in the ovs, linuxbridge and ml2 plugins: disable_security_group_extension_if_noop_driver() With this logic, if I set neutron to use the NOOP firewall, creating a Nova instance will fail, because the 'security_group' resource doesn't exist. If I comment out this line and set the firewall driver in both Neutron and Nova to be the NOOP driver, everything seems to work fine. Thanks, Gary ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
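[Editor's sketch] For readers unfamiliar with the function Gary mentions: the idea, sketched below in simplified form (this is not the actual Neutron code), is that a plugin stops advertising the security-group extension when the configured firewall driver is the noop one — which is why Nova's subsequent call against the 'security_group' resource fails with a missing resource:

```python
def supported_extension_aliases(firewall_driver, aliases):
    """Return the extension aliases a plugin should advertise.

    Simplified sketch of what disable_security_group_extension_if_noop_driver()
    effectively does: with the noop firewall driver configured, drop the
    'security-group' alias, so the API resource simply does not exist.
    """
    if firewall_driver.rsplit('.', 1)[-1] == 'NoopFirewallDriver':
        return [a for a in aliases if a != 'security-group']
    return list(aliases)

aliases = ['quotas', 'security-group', 'binding']
assert supported_extension_aliases(
    'neutron.agent.firewall.NoopFirewallDriver', aliases) == ['quotas', 'binding']
assert supported_extension_aliases(
    'neutron.agent.linux.iptables_firewall.IptablesFirewallDriver',
    aliases) == aliases
```

Gary's workaround (commenting the call out and setting noop drivers on both sides) works precisely because the alias then stays advertised even though no filtering is actually performed.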
Re: [openstack-dev] [elastic-recheck] Thoughts on next steps
On 01/02/2014 09:44 PM, Clark Boylan wrote: snip This is great stuff. Out of curiosity, is doing the graphing with pandas and ES vs graphite so that we can graph things in a more ad hoc fashion? So, we need to go to ES for the fingerprints anyway (because that's where we mine them from), which means we need a way to process ES data into TimeSeries. In order to calculate frequencies we need largely equivalent TimeSeries that are base lines for # of jobs run of particular types. Given that we can get that with an ES query, it prevents the need of having to have a different data transformation process to get to the same kind of TimeSeries. It also lets us bulk query. With 1 ~20second ES query we get all states, of all jobs, across all queues, over the last 7 days (as well as information on review). And the transform to slice is super easy because it's 10s of thousands of records that are dictionaries, which makes for good input. You'd need to do a bunch of unbinning and transforms to massage the graphite data to pair with what we have in the fingerprint data. Eventually having tools to do the same thing with graphite is probably a good thing, largely for other analysis people want to do on that (I think long term having some data kits for our bulk data to let people play with it is goodness). I'd just put it after a 1.0 as I think it's not really needed. Also, for the dashboard, Kibana3 does a lot more stuff than Kibana2 which we currently use. I have been meaning to get Kibana3 running alongside Kibana2 and I think it may be able to do multi line queries (I need to double check that but it has a lot more query and graphing capability). I think Kibana3 is worth looking into as well before we go too far down the road of custom UI. Absolutely. There is a reason that's all the way at the bottom of the list, and honestly, something I almost didn't put in there.
But I figured we needed to understand the implications of multi line matches with our current UI, and the fact that they will make some things better, but discovering those matches will be harder with the existing UI. If Kibana3 solves it, score. One less thing to do. Because I'd really like to not be in the business of maintaining a custom web UI just for discovery of fingerprints. -Sean -- Sean Dague Samsung Research America s...@dague.net / sean.da...@samsung.com http://dague.net ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] Discuss the option delete_on_termination
Hi, All Attaching a volume when creating a server, the API contains 'block_device_mapping', such as: block_device_mapping: [ { volume_id: VOLUME_ID, device_name: /dev/vdc, delete_on_termination: true } ] It accepts the option 'delete_on_termination', but in the code it's hardcoded to True. Why? In another situation - attaching a volume to an existing server - there is no 'delete_on_termination' option at all. Should we add 'delete_on_termination' when attaching a volume to an existing server, or allow the value to be set from the request params? See also: https://blueprints.launchpad.net/nova/+spec/add-delete-on-termination-option Best regards! ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
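[Editor's sketch] The behavior being asked for — honoring the flag at teardown instead of hardcoding it — amounts to something like the following. This is an illustrative sketch, not nova's actual code:

```python
def volumes_to_delete_on_termination(block_device_mapping):
    """Return the volume ids whose mapping asked for deletion when the
    server terminates, defaulting to keeping the volume when the flag is
    absent (sketch only -- per the mail, nova currently hardcodes True
    for volumes attached at boot)."""
    return [bdm['volume_id'] for bdm in block_device_mapping
            if bdm.get('delete_on_termination', False)]

bdms = [
    {"volume_id": "vol-1", "device_name": "/dev/vdc", "delete_on_termination": True},
    {"volume_id": "vol-2", "device_name": "/dev/vdd", "delete_on_termination": False},
    {"volume_id": "vol-3", "device_name": "/dev/vde"},  # flag omitted -> keep
]
assert volumes_to_delete_on_termination(bdms) == ["vol-1"]
```

Extending the same flag to the attach-volume API, as the blueprint proposes, would mean recording it per attachment rather than only at server-create time.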
Re: [openstack-dev] [Neutron][IPv6] Meeting time - change to 1300 UTC or 1500 UTC?
Either day works for me. Thanks for setting it up, Sean! On Jan 2, 2014, at 5:08 AM, Collins, Sean sean_colli...@cable.comcast.com wrote: Looking at the calendar, our options for 1500 UTC require us to change the day that we meet. The following days are available: * Tuesdays * Fridays Thoughts? -- Sean M. Collins ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] Should we add an API to remove decommissioned or damaged hosts
Hi, All Should we add an API to remove decommissioned or damaged hosts? See also: https://blueprints.launchpad.net/nova/+spec/add-delete-host-api Best regards! Lee ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Should we add an API to remove decommissioned or damaged hosts
It's a duplicate of https://blueprints.launchpad.net/nova/+spec/remove-nova-compute Thanks, Jay 2014/1/3 黎林果 lilinguo8...@gmail.com Hi, All Should we add an API to remove decommissioned or damaged hosts? See also: https://blueprints.launchpad.net/nova/+spec/add-delete-host-api Best regards! Lee ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility
On Thu, Jan 2, 2014 at 8:23 PM, Thierry Carrez thie...@openstack.org wrote: snip One remaining question would be... what is that mechanism ? Mail to the general list ? the operators list ? (should those really be two separate lists ?) Some impact tag that upgrade-minded operators can subscribe to ? Whilst I don't think a minimum review period would have helped in this case, because of the volume of patches being submitted, I do think that where there are backwards-incompatible changes that aren't urgent, a post to the appropriate mailing lists should be made - and referenced in the commit message, which would help with the reviews as well. We could have a reasonable minimum period for people to respond to a mailing list post, which would not necessarily affect how fast the patch actually gets merged, because the message can be sent in advance of someone actually working on a patch.
Chris ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility
On Fri, Jan 3, 2014 at 7:30 AM, Tim Bell tim.b...@cern.ch wrote: Is there a mechanism to tag changes as being potentially more appropriate for the more ops related profiles ? I'm thinking more that when someone proposes a change they suspect could have an operations impact, they could highlight this as being one for particular focus. How about an OpsImpact tag ? Perhaps this would trigger a message to the mailing list on patch upload rather than merge, though, so ops people have some time to respond before it gets merged. Tim -Original Message- From: Robert Collins [mailto:robe...@robertcollins.net] Sent: 02 January 2014 21:47 To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility On 3 January 2014 00:53, Day, Phil philip@hp.com wrote: Hi Thierry, Thanks for a great summary. I don't really share your view that there is a us vs them attitude emerging between operators and developers (but as someone with a foot in both camps maybe I'm just thinking that because otherwise I'd become even more bi-polar :-) I would suggest though that the criteria for core reviewers is maybe more slanted towards developers than operators, and that it would be worth considering if there is some way to recognise and incorporate the different perspective that operators can provide into the review process. Perhaps they can start doing reviews? One could argue that a specific venue is needed for them to review effectively, as looking at each individual commit being proposed might not be efficient for folk looking at conceptual review - but that's at best a hypothesis. If ops focused folk spent an hour a day reviewing upcoming changes I'm fairly certain they would both be listened to and eventually earn core wings.
-Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [horizon] Javascript checkstyle improvement
On 12/27/2013 03:52 PM, Maxime Vidori wrote:

Hi all, I'm sending this mail to discuss JavaScript coding style improvements. Just as Python has pep8, it would be useful to have some rules for JavaScript too. JSHint provides rules for this, and I think it would be a good idea to discuss which of them should be adopted for Horizon. Based on http://www.jshint.com/docs/options/, here is a list of the rules that seem interesting:

- bitwise
- curly
- eqeqeq
- forin
- latedef
- noempty
- undef
- unused (with "vars")
- trailing

And a second list of options which could be adopted but need some discussion first:

- camelcase
- quotmark

I already made a first patch for the indentation: https://review.openstack.org/#/c/64272/

Thank you for driving this further! I see pros and cons here. On the one hand, I really like style improvements and unification of code and style. But: we're bundling foreign code (which is bad in general); now we need to change/format that code too, to match our style conventions? That would effectively create a fork, as in [1], where the changes were purely cosmetic. A patch like [2] would no longer pass, even though it is code exactly as distributed by upstream, and from a user's standpoint there is nothing wrong with [2]. It would be ideal to remove bundled code altogether and add an external dependency instead.

Matthias

[1] https://review.openstack.org/#/c/64272/5/horizon/static/horizon/js/angular/controllers/dummy.js
[2] https://review.openstack.org/#/c/64760/

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
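[Editor's note: for reference, the first list of JSHint options proposed in the thread above could be captured in a `.jshintrc` along these lines. This is only a sketch: the `browser` and `predef` entries are assumptions (globals a Horizon-like codebase would plausibly need), not part of the proposal, and the exact spelling of each value should be checked against the JSHint version in use.]

```json
{
    "bitwise":  true,
    "curly":    true,
    "eqeqeq":   true,
    "forin":    true,
    "latedef":  true,
    "noempty":  true,
    "undef":    true,
    "unused":   "vars",
    "trailing": true,

    "browser": true,
    "predef": ["horizon", "angular", "$"]
}
```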
Re: [openstack-dev] [Openstack] Quota delegation tool (for nova) ?
Hi, thanks a lot to all who have responded already! Joe, thanks a lot for those comments. See below.

On 01.01.2014 01:14, Joe Gordon wrote:

This sounds like what I would imagine should be a very common use case, so it would be really great if we can support this. And as always, support means tested in the gate.

Yes, I can imagine that this is indeed a very common use case for large installations. Fully agree.

To put this in other terms, it sounds like you want one role to set per-project quotas, and another role to set user quotas within that specific project (https://blueprints.launchpad.net/nova/+spec/per-user-quotas). Both of those quota mechanisms exist today, but the roles you desire do not exist by default. So I think all that is needed to support your use case is to make sure Nova works with your desired roles. That involves fixing any issues and writing a test we can gate on. So I think the outcome of this work will consist of:

* Changing the default policy.json file we use
* An easy upgrade path
* Documentation explaining the new default roles
* Some small changes to Nova to understand the new roles, along with unit tests
* Tempest tests

I'd like to ask people for their opinion on how such a scheme should be implemented. There are several aspects which need to be taken into account here:

- There are people with different roles in this game:
  - The main resource manager role is a super-user role which can, but does not have to, be identical to the cloud manager. Persons with this role should be able to change all numbers down in the tree. In general, the cloud manager and the resource manager role are not identical in my opinion. Persons with this role should also be able to nominate other resource managers and give them a fraction of the resources.
  - A normal resource manager is a bit like the main resource manager, with the exception that he can only manage the fraction of the resources he was allocated by a person above him.
  - A normal user: persons with this role can only consume resources.

This can be supported via our policy logic (http://git.openstack.org/cgit/openstack/nova/tree/etc/nova/policy.json). We just don't define that many roles by default.

Great! So maybe the roles issue is a much easier one than we thought initially.

- Several people can have the same role. This is necessary to cover e.g. holiday seasons or sick leave periods where one manager is not available. Maybe introducing a group concept would be appropriate here, such that roles are assigned to groups and people are assigned to the groups, instead of assigning roles directly to individuals.

I think keystone supports this today.

OK.

- When I say "quota", what I'm talking about is actually just a number, possibly with some unit. It could be a static limit on a specific resource like the number of VMs or the amount of memory or disk space, or it could be something different like computing performance, or even something like a currency in the longer term.

- What is the right place to store such groups or roles? What do people think?

We are currently only interested in limit settings for Nova. The described ideas could be implemented as part of Nova, or as an entirely independent external tool (which might be incorporated later). IMO the latter approach has some advantages, but I'd like to hear people's opinions about this.

I think this should be directly in Nova.

Yes. This will cover our main use case. Keep in touch!
Ulrich

We'll have some manpower available to work on the design and the implementation of this, so I'd expect to see some rapid progress if everybody agrees that this is a useful thing to do.

Great!

Thanks a lot for your comments/opinions!

Kind regards,
Ulrich

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
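[Editor's note: the policy.json approach discussed in this thread could look roughly like the following. This is a hypothetical fragment only: the `resource_manager` role is the custom role the thread proposes, not a Nova default, and the exact quota policy rule names must be checked against the policy.json of the deployed release.]

```json
{
    "context_is_admin": "role:admin",
    "admin_or_owner": "is_admin:True or project_id:%(project_id)s",

    "compute_extension:quotas:show": "rule:admin_or_owner",
    "compute_extension:quotas:update": "role:admin or role:resource_manager",
    "compute_extension:quotas:delete": "role:admin"
}
```

Under a sketch like this, a user holding the `resource_manager` role could call the quota-update API for the projects delegated to them, while ordinary users could only view quotas; the "fraction of resources" constraint from the thread would still need enforcement logic beyond what the policy engine alone expresses.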