Re: [openstack-dev] [stable][neutron] 2014.2.2 exceptions
On Wed, Feb 4, 2015 at 10:43 AM, Ihar Hrachyshka ihrac...@redhat.com wrote: Hi all, I'd like to ask for an exception to be granted for the following patches: - https://review.openstack.org/#/c/149818/ (FIPs are messed up and/or not working after L3 HA failover; makes the L3 HA feature unusable) - https://review.openstack.org/152841 (ipv6: router does not advertise the dhcp server for stateful subnets, making all compliant dhcp clients fail to get an ipv6 address) +1 to both from my perspective. Thanks /Ihar __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [horizon] JavaScript docs?
I agree. StoryBoard's storyboard-webclient project has a lot of existing code already that's pretty well documented, but without knowing what documentation system we were going to settle on, we never put any rule enforcement in place. If someone wants to take a stab at putting together a JavaScript docs build, that project would provide a good test bed that will let you test out the tools without having to also make them dance with python/sphinx at the same time. I.e., I have a bunch of JavaScript that you can hack on, and the domain knowledge of the Infra JS build tools. I'd be happy to support this effort. Michael On Wed Feb 04 2015 at 9:09:22 AM Thai Q Tran tqt...@us.ibm.com wrote: As we're moving toward Angular, it might make sense for us to adopt ngdoc as well. -Matthew Farina m...@mattfarina.com wrote: - To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org From: Matthew Farina m...@mattfarina.com Date: 02/04/2015 05:42AM Subject: [openstack-dev] [horizon] JavaScript docs? In python we have a style to document methods, classes, and so forth. But I don't see any guidance on how JavaScript should be documented. I was looking for something like jsdoc or ngdoc (an extension of jsdoc). Is there any guidance on how JavaScript should be documented? For anyone who doesn't know, Angular uses ngdoc (an extension to the commonly used jsdoc), which is written up at https://github.com/angular/angular.js/wiki/Writing-AngularJS-Documentation. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [api][nova] Openstack HTTP error codes
The downside of numbers rather than camel-case text is that they are less likely to stick in the memory of regular users. Not a huge thing, but a reduction in usability, I think. On the other hand they might lead to less guessing about the error with insufficient info, I suppose. To make the global registry easier, we can just use a per-service prefix, and then keep the error catalogue in the service code repo, pulling them into some sort of release periodically. On 3 February 2015 at 03:24, Sean Dague s...@dague.net wrote: On 02/02/2015 05:35 PM, Jay Pipes wrote: On 01/29/2015 12:41 PM, Sean Dague wrote: Correct. This actually came up at the Nova mid-cycle in a side conversation with Ironic and Neutron folks. HTTP error codes are not sufficiently granular to describe what happens when a REST service goes wrong, especially if it goes wrong in a way that would let the client do something other than blindly try the same request, or fail. Having a standard json error payload would be really nice. { fault: ComputeFeatureUnsupportedOnInstanceType, message: This compute feature is not supported on this kind of instance type. If you need this feature please use a different instance type. See your cloud provider for options. } That would let us surface more specific errors. snip Standardization here from the API WG would be really great. What about having a separate HTTP header that indicates the OpenStack Error Code, along with a generated URI for finding more information about the error? Something like: X-OpenStack-Error-Code: 1234 X-OpenStack-Error-Help-URI: http://errors.openstack.org/1234 That way is completely backwards compatible (since we wouldn't be changing response payloads) and we could handle i18n entirely via the HTTP help service running on errors.openstack.org. That could definitely be implemented in the short term, but if we're talking about API WG long-term evolution, I'm not sure why a standard error payload body wouldn't be better. Then if we are going to have global codes that are just numbers, we'll also need a global naming registry. Which isn't a bad thing, just someone will need to allocate the numbers in a separate global repo across all projects. -Sean -- Sean Dague http://dague.net -- Duncan Thomas __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
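[Editor's note: a small illustrative sketch, not an existing OpenStack API, of how the two proposals in this thread could fit together: the structured JSON fault body Sean describes plus the X-OpenStack-Error-* headers Jay suggests. The helper name, the "NOVA-" prefix and the example values are invented for illustration; whether the code belongs in a header, the payload, or both is exactly the open question in the thread.]

    import json


    def build_error_response(fault, message, code,
                             help_base="http://errors.openstack.org"):
        """Return (headers, body) for a hypothetical standardized error."""
        headers = {
            "Content-Type": "application/json",
            # A per-service prefix (here "NOVA-") keeps a global registry simple,
            # as suggested above; the number only has to be unique per service.
            "X-OpenStack-Error-Code": code,
            "X-OpenStack-Error-Help-URI": "%s/%s" % (help_base, code),
        }
        body = json.dumps({"fault": fault, "message": message})
        return headers, body


    headers, body = build_error_response(
        fault="ComputeFeatureUnsupportedOnInstanceType",
        message="This compute feature is not supported on this instance type.",
        code="NOVA-1234")
    print(headers["X-OpenStack-Error-Help-URI"])
    print(body)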
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
On 2015-02-04 13:40:29 +0200 (+0200), Duncan Thomas wrote: 4) Write a small daemon that runs as root, accepting commands over a unix domain socket or similar. Easier to audit, less code running as root. http://git.openstack.org/cgit/openstack/oslo.rootwrap/tree/oslo_rootwrap/daemon.py -- Jeremy Stanley __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
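[Editor's note: for context, a rough sketch of how a service talks to the rootwrap daemon linked above. The module, method and command names are recalled from oslo.rootwrap's client code and the nova/neutron wrappers around it, so treat the exact signatures (and the nova-rootwrap-daemon entry point) as assumptions rather than authoritative documentation; the point is only that the filter files stay exactly the same.]

    from oslo_rootwrap import client

    # The daemon is spawned via sudo on first use; command filtering still comes
    # from the same rootwrap.conf / *.filters files, so this changes performance,
    # not the security model.
    rootwrap = client.Client(
        ["sudo", "nova-rootwrap-daemon", "/etc/nova/rootwrap.conf"])

    # execute() sends the command over the UNIX socket instead of forking
    # sudo + rootwrap for every single call.
    returncode, out, err = rootwrap.execute(["ip", "netns", "list"])
    print(returncode, out)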
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
On 2015-02-04 18:38:16 +0200 (+0200), Duncan Thomas wrote: If I'm reading that correctly, it does not help with the filtering issues at all, since it needs exactly the same kind of filter. Daniel explained the concept far better than I. I didn't mean to imply that it does, merely that it fits your rather terse description of a daemon that runs as root, accepting commands over a unix domain socket or similar. -- Jeremy Stanley __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
On 02/04/2015 06:57 AM, Daniel P. Berrange wrote: On Wed, Feb 04, 2015 at 11:58:03AM +0100, Thierry Carrez wrote: What solutions do we have ? (1) we could get our act together and audit and fix those filter definitions. Remove superfluous usage of root rights, make use of advanced filters for where we actually need them. We have been preaching for that at many many design summits. This is a lot of work though... There were such efforts in the past, but they were never completed for some types of nodes. Worse, the bad filter definitions kept coming back, since developers take shortcuts, reviewers may not have sufficient security awareness to detect crappy filter definitions, and I don't think we can design a gate test that would have such awareness. (2) bite the bullet and accept that some types of nodes actually need root rights for so many different things, they should just run as root anyway. I know a few distributions which won't be very pleased by such a prospect, but that would be a more honest approach (rather than claiming we provide efficient isolation when we really don't). An added benefit is that we could replace a number of shell calls by Python code, which would simplify the code and increase performance. (3) intermediary solution where we would run as the nova user but run sudo COMMAND directly (instead of sudo nova-rootwrap CONFIG COMMAND). That would leave it up to distros to choose between a blanket sudoer or maintain their own filtering rules. I think it's a bit hypocritical though (pretend the distros could filter if they wanted it, when we dropped the towel on doing that ourselves). I'm also not convinced it's more secure than solution 2, and it prevents from reducing the number of shell-outs, which I think is a worthy idea. In all cases I would not drop the baby with the bath water, and keep rootwrap for all the cases where root rights are needed on a very specific set of commands (like neutron, or nova's api-metadata). The daemon mode should address the performance issue for the projects making a lot of calls. (4) I think that ultimately we need to ditch rootwrap and provide a proper privilege separated, formal RPC mechanism for each project. eg instead of having a rootwrap command, or rootwrap server attempting to validate safety of qemu-img create -f qcow2 /var/lib/nova/instances/instance1/disk.qcow2 we should have a nova-compute-worker daemon running as root, that accepts an RPC command from nova-compute running unprivileged. eg CreateImage(instane0001, qcow2, disk.qcow) This immediately makes it trivial to validate that we're not trying to trick qemu-img into overwriting some key system file. This is certainly alot more work than trying to patchup rootwrap, but it would provide a level of security that rootwrap can never achieve IMHO. This 4th idea sounds interesting, though we are assuming this new service running as root would be exempt of bug, especially if it uses the same libraries as non-root services... For example a major bug in python would give attacker direct root access while the rootwrap solution would in theory keep the intruder at the sudo level... For completeness, I'd like to propose a more long-term solution: (5) Get ride of root! Seriously, OpenStack could support security mechanism like SELinux or AppArmor in order to properly isolate service and let them run what they need to run. 
For what it's worth, the underlying issue here is having a single almighty superuser: root. Thus we should, at least, consider solutions that remove the need for such powers (e.g. kernel module loading, ptrace or raw sockets). Besides, as long as sensitive processes are not contained at the system level, the attack surface for a non-root user is still very wide (e.g. system calls, setuid binaries, IPC, ...). While this might sound impossible to implement upstream because it's too vendor-specific, or just because of other technical difficulties, I guess it still deserves a mention in this thread. Best regards, Tristan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] [lbaas] LBaaS Haproxy performance benchmarking
Thanks Miguel. From: Miguel Ángel Ajo majop...@redhat.com Reply-To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Date: Wednesday, February 4, 2015 at 1:10 AM To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [neutron] [lbaas] LBaaS Haproxy performance benchmarking You can try with httperf [1], or ab [2], for HTTP workloads. If you will use an overlay, make sure your network MTU is correctly configured to handle the extra size of the overlay (GRE / VXLAN packets), otherwise you will be introducing fragmentation overhead on the tenant networks. [1] http://www.hpl.hp.com/research/linux/httperf/ [2] http://httpd.apache.org/docs/2.2/programs/ab.html Miguel Ángel Ajo On Wednesday, February 4, 2015 at 01:58, Varun Lodaya wrote: Hi, We were trying to use haproxy as our LBaaS solution on the overlay. Has anybody done some baseline benchmarking with the LBaaSv1 haproxy solution? Also, any recommended tools which we could use to do that? Thanks, Varun __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
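[Editor's note: a hedged example of scripting one of the suggested tools for repeated runs. The small wrapper below shells out to ApacheBench against the load balancer VIP and picks out the throughput lines; the VIP URL is a placeholder, and -n / -c are ab's standard total-requests and concurrency flags.]

    import subprocess

    VIP_URL = "http://203.0.113.10/"  # placeholder for the load balancer VIP

    # 10000 requests in total (-n) at a concurrency of 100 (-c).
    output = subprocess.check_output(
        ["ab", "-n", "10000", "-c", "100", VIP_URL], universal_newlines=True)

    for line in output.splitlines():
        if line.startswith(("Requests per second", "Time per request",
                            "Failed requests")):
            print(line)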
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
On Wed, Feb 04, 2015 at 05:52:12PM +0100, Philipp Marek wrote: Here are my 2¢. (1) we could get our act together and audit and fix those filter definitions. Remove superfluous usage of root rights, make use of advanced filters for where we actually need them. We have been preaching for that at many many design summits. This is a lot of work though... There were such efforts in the past, but they were never completed for some types of nodes. Worse, the bad filter definitions kept coming back, since developers take shortcuts, reviewers may not have sufficient security awareness to detect crappy filter definitions, and I don't think we can design a gate test that would have such awareness. Sounds like a lot of work... ongoing, too. (2) bite the bullet and accept that some types of nodes actually need root rights for so many different things, they should just run as root anyway. I know a few distributions which won't be very pleased by such a prospect, but that would be a more honest approach (rather than claiming we provide efficient isolation when we really don't). An added benefit is that we could replace a number of shell calls by Python code, which would simplify the code and increase performance. Practical, but unsafe. I'd very much like to have some best-effort filter against bugs in my programming - even more so during development. (4) I think that ultimately we need to ditch rootwrap and provide a proper privilege separated, formal RPC mechanism for each project. ... we should have a nova-compute-worker daemon running as root, that accepts an RPC command from nova-compute running unprivileged. eg CreateImage(instane0001, qcow2, disk.qcow) ... This is certainly alot more work than trying to patchup rootwrap, but it would provide a level of security that rootwrap can never achieve IMHO. A lot of work, and if input sanitation didn't work in one piece of code, why should it here? I think this only leads to _more_ work, without any real benefit. If we can't get the filters right the first round, we won't make it here either. The difference is that the API I illustrate here has *semantic* context about the operation. In the API example the caller is not permitted to provide a directory path - only the name of the instance and the name of the disk image. The privileged nova-compute-worker program can thus enforce exactly what directory the image is created in, and ensure it doesn't clash with a disk image from another VM. This kind of validation is impractical when you are just given a 'qemu-img' command line args with a full directory path, so there is no semantic conext for the privileged rootwrap to know whether it is reasonable to create the disk image in that particular location. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][libvirt] Logging interactions of libvirt + QEMU in Nova
Daniel, Kashyap, One question that came up on IRC was, how/where to configure say a directory where core dumps from qemu would end up. Sean was seeing a scenario where he noticed a core dump from qemu in dmesg/syslog and was wondering how to specify a directory to capture a core dump if/when it occurs. thanks, dims On Wed, Feb 4, 2015 at 8:48 AM, Kashyap Chamarthy kcham...@redhat.com wrote: On Wed, Feb 04, 2015 at 10:27:34AM +, Daniel P. Berrange wrote: On Wed, Feb 04, 2015 at 11:23:34AM +0100, Kashyap Chamarthy wrote: Heya, I noticed a ping (but couldn't respond in time) on #openstack-nova IRC about turning on logging in libvirt to capture Nova failures. This was discussed on this list previously by Daniel Berrange, just spelling it out here for reference and completness' sake. (1) To see the interactions between libvirt and QEMU, in /etc/libvirt/libvirtd.conf, have these two config attributes: . . . log_filters=1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 1:util You really want 3:object in there /before/ 1:util too otherwise you'll be spammed with object ref/unref messages Thanks for this detail. log_outputs=1:file:/var/tmp/libvirtd.log Use /var/log/libvirt/libvirtd.log instead of /var/tmp Ah, yeah, it was an incorrect copy/paste. -- /kashyap __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Davanum Srinivas :: https://twitter.com/dims __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
On Wed, Feb 04, 2015 at 06:38:16PM +0200, Duncan Thomas wrote: If I'm reading that correctly, it does not help with the filtering issues at all, since it needs exactly the same kind of filter. Daniel explained the concept far better than I. Yep, the only thing rootwrap daemon mode does is to remove the overhead of spawning the rootwrap command. It does nothing to improve actual security - it is still a chocolate teapot from that POV. On 4 February 2015 at 18:33, Jeremy Stanley fu...@yuggoth.org wrote: On 2015-02-04 13:40:29 +0200 (+0200), Duncan Thomas wrote: 4) Write a small daemon that runs as root, accepting commands over a unix domain socket or similar. Easier to audit, less code running as root. http://git.openstack.org/cgit/openstack/oslo.rootwrap/tree/oslo_rootwrap/daemon.py Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [horizon] JavaScript docs?
As we're moving toward Angular, it might make sense for us to adopt ngdoc as well. -Matthew Farina m...@mattfarina.com wrote: - To: "OpenStack Development Mailing List (not for usage questions)" openstack-dev@lists.openstack.org From: Matthew Farina m...@mattfarina.com Date: 02/04/2015 05:42AM Subject: [openstack-dev] [horizon] JavaScript docs? In python we have a style to document methods, classes, and so forth. But I don't see any guidance on how JavaScript should be documented. I was looking for something like jsdoc or ngdoc (an extension of jsdoc). Is there any guidance on how JavaScript should be documented? For anyone who doesn't know, Angular uses ngdoc (an extension to the commonly used jsdoc), which is written up at https://github.com/angular/angular.js/wiki/Writing-AngularJS-Documentation. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
On Wed, Feb 04, 2015 at 06:05:16PM +0100, Philipp Marek wrote: (4) I think that ultimately we need to ditch rootwrap and provide a proper privilege separated, formal RPC mechanism for each project. ... we should have a nova-compute-worker daemon running as root, that accepts an RPC command from nova-compute running unprivileged. eg CreateImage(instane0001, qcow2, disk.qcow) ... This is certainly alot more work than trying to patchup rootwrap, but it would provide a level of security that rootwrap can never achieve IMHO. A lot of work, and if input sanitation didn't work in one piece of code, why should it here? I think this only leads to _more_ work, without any real benefit. If we can't get the filters right the first round, we won't make it here either. The difference is that the API I illustrate here has *semantic* context about the operation. In the API example the caller is not permitted to provide a directory path - only the name of the instance and the name of the disk image. The privileged nova-compute-worker program can thus enforce exactly what directory the image is created in, and ensure it doesn't clash with a disk image from another VM. This kind of validation is impractical when you are just given a 'qemu-img' command line args with a full directory path, so there is no semantic conext for the privileged rootwrap to know whether it is reasonable to create the disk image in that particular location. Sorry about being unclear. Yes, there's some semantic meaning at that level. But this level already exists at the current rootwrap caller site, too - and if that one can be tricked to do something against image.img rm -rf /, then the additional layer can be tricked, too. No, that is really not correct. If you are passing full command strings to rootwrap then the caller can trick rootwrap into running commands with those shell metacharacter exploits. If you have a formal API like the one I describe and correctly implement it, there would be no shell involved at all. ie the nova-compute-worker program would directly invoke the system call execve(/usr/bin/qemu-img, create, image.img rm -rf /) and this would at worse create a file called 'image.img rm -rf /' and *not* invoke the rm command as you get when you use shell. This is really just another example of why rootwrap/sudo as a concept is a bad idea. The shell should never be involved in executing any external commands that Nova/Neutron/etc need to run, because it is impractical to correctly validate shell commands anywhere in the stack. The only safe thing todo is to take shell out of the picture entirely. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
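[Editor's note: to make Daniel's point concrete, here is a minimal sketch of the kind of semantic, shell-free API he describes. Everything in it (the worker, the function, the base directory, the validation rules) is hypothetical, since no such nova-compute-worker exists, but it shows how restricting callers to names rather than paths, and passing an argument vector straight to execve(), removes the injection problem.]

    import os
    import re
    import subprocess

    INSTANCES_DIR = "/var/lib/nova/instances"   # enforced by the worker, not the caller
    NAME_RE = re.compile(r"^[A-Za-z0-9._-]+$")  # no slashes, spaces or shell metacharacters


    def create_image(instance_name, image_format, image_name):
        """Create a disk image for an instance; callers never supply a path."""
        if image_format not in ("raw", "qcow2"):
            raise ValueError("unsupported image format: %r" % image_format)
        for name in (instance_name, image_name):
            if not NAME_RE.match(name):
                raise ValueError("invalid name: %r" % name)

        path = os.path.join(INSTANCES_DIR, instance_name, image_name)

        # No shell is involved: this argument vector goes straight to execve(),
        # so a value like "disk.qcow2 && rm -rf /" is just a (rejected) filename,
        # never a second command.
        subprocess.check_call(["qemu-img", "create", "-f", image_format, path, "1G"])
        return path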
[openstack-dev] Report on virtual sprinting
Hi everyone, This cycle, the OpenStack Infrastructure team forewent having an in-person midcycle sprint and has instead taken advantage of the new #openstack-sprint channel for specific sprint topics we wished to cover. So far there have been 3 such virtual sprints. Two were completed by the Infrastructure team, the first in December and focused on fleshing out our infra-manual[0] and the second in January where we completed our Spec to split out our Puppet modules[1]. The third was also in January, hosted by the Third Party CI Working Group focused on improving documentation for rolling out external testing systems [2]. For each of these sprints, we created an Etherpad which had links to important information related to the sprint, details about what where we currently were work-wise and kept a running tally of the flow of work. Work was also tracked through near continous chat on #openstack-sprint, just like you might find at a physical sprint. We found these virtual sprints to be incredibly valuable for knocking out work items that we'd defined at the Summit and in Specs. By focusing on specific work items we were able to spend just a day or two on each sprint and we didn't have the travel time (and jet lag!) penalty that physical sprints have. They were also relatively easy to schedule around the particularly active travel schedule that our team members have. Virtual sprints in the #openstack-sprint channel are reserved much like project IRC meetings, pick a time and update to the wiki page at https://wiki.openstack.org/wiki/VirtualSprints Some lessons learned: * Schedule as far in advance as you can, taking into account the core people needed to complete the task * Block out time for the sprint in your schedule Just like being at a physical sprint, ignore much of the rest of IRC, mailing lists and other meetings and be present at the virtual sprint. Continous presence in channel helps the team tackle problems and coordinate work. * Consider other timezones by having hand-offs Not everyone on our team is in the same timezone, so to help folks who join later in your day by giving a quick summary of work done and next steps they may wish to focus on * Have sprint participants sign up for specific tasks during the sprint Use an Etherpad and bugs to track overall sprint progress and have contributors sign up for specific work items so there isn't overlap in work. Though it's still in heavy development and not ready for most projects to use yet, I'll also mention that Storyboard has the ability to create and assign tasks to individuals, this helped us tremendously during our Puppet Module sprint, where a lot modules were being created and we wanted to make sure we didn't overlap on work. Something to look forward! * Use a common Gerrit topic for the sprint In order to help others in the sprint review changes, use a common topic in Gerrit for all changes made during the sprint, this can be set upon submission to Gerrit with: git review -t sprint-topic-here, or afterwords by the owner in the Gerrit UI. * We'd like to bring in the Gerrit bot for future sprints Due to the way the Gerrit bot is configured, it takes a change to the global bot config via Gerrit to update it. We're looking into ways to coordinate this or make it easier so you can also see patchset updates for projects you wish to track in the sprint channel. 
[0] http://docs.openstack.org/infra/manual/ event wrap-up blog post with some stats at: http://princessleia.com/journal/?p=9952 [1] http://specs.openstack.org/openstack-infra/infra-specs/specs/puppet-modules.html [2] https://etherpad.openstack.org/p/third-party-ci-documentation -- Elizabeth Krumbach Joseph || Lyz || pleia2 __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
On 2015-02-04 11:58:03 +0100 (+0100), Thierry Carrez wrote: [...] The second problem is the quality of the filter definitions. Rootwrap is a framework to enable isolation. It's only as good as the filters each project defines. Most of them rely on CommandFilters that do not check any argument, instead of using more powerful filters (which are arguably more painful to maintain). Developers routinely add filter definitions that basically remove any isolation that might have been there, like allowing blank dd, tee, chown or chmod. [...] This part is my biggest concern at the moment, from a vulnerability management standpoint. I'm worried that it's an attractive nuisance resulting in a false sense of security in its current state because we're not calling this shortcoming out explicitly in documentation (as far as I'm aware), and so we're opening our operators/users up to unexpected risks and opening ourselves up to the possibility of a slew of vulnerability reports because this mechanism doesn't provide the level of protection it would seem to imply. -- Jeremy Stanley __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [stable][neutron] 2014.2.2 exceptions
Hi all, I'd like to ask for an exception to be granted for the following patches: - https://review.openstack.org/#/c/149818/ (FIPs are messed up and/or not working after L3 HA failover; makes the L3 HA feature unusable) - https://review.openstack.org/152841 (ipv6: router does not advertise the dhcp server for stateful subnets, making all compliant dhcp clients fail to get an ipv6 address) Thanks /Ihar __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
Here are my 2¢. (1) we could get our act together and audit and fix those filter definitions. Remove superfluous usage of root rights, make use of advanced filters for where we actually need them. We have been preaching for that at many many design summits. This is a lot of work though... There were such efforts in the past, but they were never completed for some types of nodes. Worse, the bad filter definitions kept coming back, since developers take shortcuts, reviewers may not have sufficient security awareness to detect crappy filter definitions, and I don't think we can design a gate test that would have such awareness. Sounds like a lot of work... ongoing, too. (2) bite the bullet and accept that some types of nodes actually need root rights for so many different things, they should just run as root anyway. I know a few distributions which won't be very pleased by such a prospect, but that would be a more honest approach (rather than claiming we provide efficient isolation when we really don't). An added benefit is that we could replace a number of shell calls by Python code, which would simplify the code and increase performance. Practical, but unsafe. I'd very much like to have some best-effort filter against bugs in my programming - even more so during development. (4) I think that ultimately we need to ditch rootwrap and provide a proper privilege separated, formal RPC mechanism for each project. ... we should have a nova-compute-worker daemon running as root, that accepts an RPC command from nova-compute running unprivileged. eg CreateImage(instane0001, qcow2, disk.qcow) ... This is certainly alot more work than trying to patchup rootwrap, but it would provide a level of security that rootwrap can never achieve IMHO. A lot of work, and if input sanitation didn't work in one piece of code, why should it here? I think this only leads to _more_ work, without any real benefit. If we can't get the filters right the first round, we won't make it here either. Regarding the idea of using containers ... take Cinder as an example. If the cinder container can access *all* the VM data, why should someone bother to get *out* of the container? Everything that she wants is already here... I'm not sure what the containers would buy us, but perhaps I just don't understand something here. So, IMO, solution 1 (one) would be the way to go ... it gets to security asymptotically (and might never reach it), but at least it provides a bit of help. And if the rootwrap filter specification would be linked to in the rootwrap config files, it would help newcomers to see the available syntax, instead of simply copying a bad example ;P -- : Ing. Philipp Marek : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com : DRBD® and LINBIT® are registered trademarks of LINBIT, Austria. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
Thierry Carrez thie...@openstack.org writes: You make a good point when you mention traditional distro here. I would argue that containers are slightly changing the rules of the don't-run-as-root game. Solution (2) aligns pretty well with container-powered OpenStack deployments -- running compute nodes as root in a container (and embracing abovementioned simplicity/performance gains) sounds like a pretty strong combo. This sounds at least a little like a suggestion that containers are a substitute for the security provided by running non-root. The security landscape around containers is complex, and while there are a lot of benefits, I believe the general consensus is that uid 0 processes should not be seen as fully isolated. From https://docs.docker.com/articles/security/ : Docker containers are, by default, quite secure; especially if you take care of running your processes inside the containers as non-privileged users (i.e., non-root). Which is not to say that using containers is not a good idea, but rather, if one does, one should avoid running as root (perhaps with capabilities), and use selinux (or similar). -Jim __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Manila]Question about the usage of connect_share_server_to_tenant_network
On 02/04/2015 03:04 AM, Li, Chen wrote: Hi list, For generic driver, there is a flag named “connect_share_server_to_tenant_network” in manila/share/drivers/service_instance.py. When it set to True, share-server(nova instance) would be created directly on the “share-network”. When it set to False, the subnet within share-network must connected to a router, and then manila would create its own subnet and connect to the router too, and start share-server in manila’s subnet. Based on https://wiki.openstack.org/wiki/Manila/Networking#Gateway_Mediated, I assume the difference here is L2 vs L3 connectivity. But, I wander whether there are some other reasons for generic driver to support this flag. So, my question here is: As an cloud admin, what I need to consider to help me figure out what value I should set for this flag ? L3 connectivity tends to be dramatically more efficient in a larger cloud because it limits the size of the broadcast domains. If you try to use L2 connectivity between the share server and all its clients, and any of those machines are separately in the physical world (separate racks, separate aisles, separate datacenters) then all your ARP traffic, etc, is traversing backbone links. The only benefit to L2 connectivity that I'm aware of is a potential performance improvement by removing the (virtual) router as a bottleneck. -Ben Swartzlander __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
On Wed, Feb 04, 2015 at 09:10:06AM -0800, James E. Blair wrote: Thierry Carrez thie...@openstack.org writes: You make a good point when you mention traditional distro here. I would argue that containers are slightly changing the rules of the don't-run-as-root game. Solution (2) aligns pretty well with container-powered OpenStack deployments -- running compute nodes as root in a container (and embracing abovementioned simplicity/performance gains) sounds like a pretty strong combo. This sounds at least a little like a suggestion that containers are a substitute for the security provided by running non-root. The security landscape around containers is complex, and while there are a lot of benefits, I believe the general consensus is that uid 0 processes should not be seen as fully isolated. From https://docs.docker.com/articles/security/ : Docker containers are, by default, quite secure; especially if you take care of running your processes inside the containers as non-privileged users (i.e., non-root). Which is not to say that using containers is not a good idea, but rather, if one does, one should avoid running as root (perhaps with capabilities), and use selinux (or similar). Yep, I've seen attempts by some folks to run nova-compute and libvirtd and QEMU inside a docker container. Because of the inherantly privileged nature of what Nova/libvirt/qemu need to do, you end up having to share all the host namespaces with the docker container, except for the filesystem namespace and even that you need to bind mount a bunch of stuff over. As a result the container isn't really offerring any security benefit over running the things outside the container. IOW the use of containers to confine nova is only solving a managability problem rather than a security problem. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][libvirt] Logging interactions of libvirt + QEMU in Nova
On Wed, Feb 04, 2015 at 11:15:18AM -0500, Davanum Srinivas wrote: Daniel, Kashyap, One question that came up on IRC was, how/where to configure say a directory where core dumps from qemu would end up. Sean was seeing a scenario where he noticed a core dump from qemu in dmesg/syslog and was wondering how to specify a directory to capture a core dump if/when it occurs. That's really outside the scope of libvirt. On Fedora/RHEL there is the abrt program, which captures core dumps from any process in the entire OS and has configurable actions for where to save them. IIUC Ubuntu has some general purpose core dump handler too but I don't know much about it myself. They all work by hooking into the kernel's core_pattern sysctl knob to redirect core dumps to their own binary instead of using $CWD Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
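[Editor's note: a small, hedged illustration of the mechanism Daniel mentions. Handlers such as abrt (or Ubuntu's apport) register themselves through the kernel's core_pattern knob, and you can read it back to see where core dumps from qemu, or any other process, will end up.]

    # A leading '|' means core dumps are piped to a helper program (abrt, apport,
    # systemd-coredump, ...) rather than written to the process's working directory.
    with open("/proc/sys/kernel/core_pattern") as f:
        pattern = f.read().strip()

    if pattern.startswith("|"):
        print("core dumps are piped to: %s" % pattern[1:])
    else:
        print("core dumps are written using pattern: %s" % pattern)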
Re: [openstack-dev] [nova][libvirt] Logging interactions of libvirt + QEMU in Nova
On Wed, Feb 04, 2015 at 11:57:56AM -0500, Davanum Srinivas wrote: Daniel, The last tip on this page possibly? http://wiki.stoney-cloud.org/wiki/Debugging_Qemu Note that tip is not merely affecting QEMU processes - the recommended change is affecting core dumps for everything on the entire OS. This is really what a tool like abrt is aiming todo already :-) Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
(4) I think that ultimately we need to ditch rootwrap and provide a proper privilege separated, formal RPC mechanism for each project. ... we should have a nova-compute-worker daemon running as root, that accepts an RPC command from nova-compute running unprivileged. eg CreateImage(instane0001, qcow2, disk.qcow) ... This is certainly alot more work than trying to patchup rootwrap, but it would provide a level of security that rootwrap can never achieve IMHO. A lot of work, and if input sanitation didn't work in one piece of code, why should it here? I think this only leads to _more_ work, without any real benefit. If we can't get the filters right the first round, we won't make it here either. The difference is that the API I illustrate here has *semantic* context about the operation. In the API example the caller is not permitted to provide a directory path - only the name of the instance and the name of the disk image. The privileged nova-compute-worker program can thus enforce exactly what directory the image is created in, and ensure it doesn't clash with a disk image from another VM. This kind of validation is impractical when you are just given a 'qemu-img' command line args with a full directory path, so there is no semantic conext for the privileged rootwrap to know whether it is reasonable to create the disk image in that particular location. Sorry about being unclear. Yes, there's some semantic meaning at that level. But this level already exists at the current rootwrap caller site, too - and if that one can be tricked to do something against image.img rm -rf /, then the additional layer can be tricked, too. I'm trying to get at the point everything that can be forgot to check at the rootwrap call site now, can be forgotten in the additional API too. So let's get the current call sites tight, and we're done. (Ha!) -- : Ing. Philipp Marek : LINBIT | Your Way to High Availability : DRBD/HA support and consulting http://www.linbit.com : DRBD® and LINBIT® are registered trademarks of LINBIT, Austria. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] [lbaas] LBaaS Haproxy performance benchmarking
Thanks Baptiste. I will try that tool. I worked with ab and was seeing really low results. But let me give httpress a shot :) Thanks, Varun On 2/3/15, 7:01 PM, Baptiste bed...@gmail.com wrote: On Wed, Feb 4, 2015 at 1:58 AM, Varun Lodaya varun_lod...@symantec.com wrote: Hi, We were trying to use haproxy as our LBaaS solution on the overlay. Has anybody done some baseline benchmarking with LBaaSv1 haproxy solution? Also, any recommended tools which we could use to do that? Thanks, Varun _ _ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev Hi Varun, large subject :) any injector could do the trick. I usually use inject (from HAProxy's author) and httpress. They can hammer a single URL, but if the purpose is to measure HAProxy's performance, then this is more than enough. Baptiste __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [stable][neutron] 2014.2.2 exceptions
Ihar Hrachyshka wrote: I'd like to ask for an exception to be granted for the following patches: - https://review.openstack.org/#/c/149818/ (FIPs are messed up and/or not working after L3 HA failover; makes the L3 HA feature unusable) - https://review.openstack.org/152841 (ipv6: router does not advertise the dhcp server for stateful subnets, making all compliant dhcp clients fail to get an ipv6 address) Both feel reasonable to me. -- Thierry Carrez (ttx) __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [api][nova] Openstack HTTP error codes
Ideally there would need to be a way to replicate errors.openstack.org and switch the url, for none-internet connected deployments, but TBH sites with that sort of requirement are used to weird breakages, so not a huge issue of it can't easily be done On 3 February 2015 at 00:35, Jay Pipes jaypi...@gmail.com wrote: On 01/29/2015 12:41 PM, Sean Dague wrote: Correct. This actually came up at the Nova mid cycle in a side conversation with Ironic and Neutron folks. HTTP error codes are not sufficiently granular to describe what happens when a REST service goes wrong, especially if it goes wrong in a way that would let the client do something other than blindly try the same request, or fail. Having a standard json error payload would be really nice. { fault: ComputeFeatureUnsupportedOnInstanceType, messsage: This compute feature is not supported on this kind of instance type. If you need this feature please use a different instance type. See your cloud provider for options. } That would let us surface more specific errors. snip Standardization here from the API WG would be really great. What about having a separate HTTP header that indicates the OpenStack Error Code, along with a generated URI for finding more information about the error? Something like: X-OpenStack-Error-Code: 1234 X-OpenStack-Error-Help-URI: http://errors.openstack.org/1234 That way is completely backwards compatible (since we wouldn't be changing response payloads) and we could handle i18n entirely via the HTTP help service running on errors.openstack.org. Best, -jay __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Duncan Thomas __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
I've spent a few hours today reading about Galera, a clustering solution for MySQL. Galera provides multi-master 'virtually synchronous' replication between multiple mysql nodes. i.e. I can create a cluster of 3 mysql dbs and read and write from any of them with certain consistency guarantees. I am no expert[1], but this is a TL;DR of a couple of things which I didn't know, but feel I should have done. The semantics are important to application design, which is why we should all be aware of them. * Commit will fail if there is a replication conflict foo is a table with a single field, which is its primary key. A: start transaction; B: start transaction; A: insert into foo values(1); B: insert into foo values(1); -- 'regular' DB would block here, and report an error on A's commit A: commit; -- success B: commit; -- KABOOM Confusingly, Galera will report a 'deadlock' to node B, despite this not being a deadlock by any definition I'm familiar with. Essentially, anywhere that a regular DB would block, Galera will not block transactions on different nodes. Instead, it will cause one of the transactions to fail on commit. This is still ACID, but the semantics are quite different. The impact of this is that code which makes correct use of locking may still fail with a 'deadlock'. The solution to this is to either fail the entire operation, or to re-execute the transaction and all its associated code in the expectation that it won't fail next time. As I understand it, these can be eliminated by sending all writes to a single node, although that obviously makes less efficient use of your cluster. * Write followed by read on a different node can return stale data During a commit, Galera replicates a transaction out to all other db nodes. Due to its design, Galera knows these transactions will be successfully committed to the remote node eventually[2], but it doesn't commit them straight away. The remote node will check these outstanding replication transactions for write conflicts on commit, but not for read. This means that you can do: A: start transaction; A: insert into foo values(1) A: commit; B: select * from foo; -- May not contain the value we inserted above[3] This means that even for 'synchronous' slaves, if a client makes an RPC call which writes a row to write master A, then another RPC call which expects to read that row from synchronous slave node B, there's no default guarantee that it'll be there. Galera exposes a session variable which will fix this: wsrep_sync_wait (or wsrep_causal_reads on older mysql). However, this isn't the default. It presumably has a performance cost, but I don't know what it is, or how it scales with various workloads. Because these are semantic issues, they aren't things which can be easily guarded with an if statement. We can't say: if galera: try: commit except: rewind time If we are to support this DB at all, we have to structure code in the first place to allow for its semantics. Matt [1] No, really: I just read a bunch of docs and blogs today. If anybody who is an expert would like to validate/correct that would be great. 
[2] http://www.percona.com/blog/2012/11/20/understanding-multi-node-writing-conflict-metrics-in-percona-xtradb-cluster-and-galera/ [3] http://www.percona.com/blog/2013/03/03/investigating-replication-latency-in-percona-xtradb-cluster/ -- Matthew Booth Red Hat Engineering, Virtualisation Team Phone: +442070094448 (UK) GPG ID: D33C3490 GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490 __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
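[Editor's note: a hedged sketch of opting in to the wsrep_sync_wait behaviour described above on a Galera-backed deployment, by setting the session variable on every new SQLAlchemy connection. The variable names come from the post (older Galera uses wsrep_causal_reads), the DSN is a placeholder, and the performance cost remains exactly the open question raised above.]

    from sqlalchemy import create_engine, event

    # Placeholder DSN; nothing connects until the engine is actually used.
    engine = create_engine("mysql+pymysql://user:pass@node-b/nova")


    @event.listens_for(engine, "connect")
    def _enable_causal_reads(dbapi_conn, connection_record):
        cursor = dbapi_conn.cursor()
        # 1 = wait for the incoming replication queue to be applied before reads,
        # closing the "write on node A, stale read on node B" window described above.
        cursor.execute("SET SESSION wsrep_sync_wait = 1")
        cursor.close()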
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
If I'm reading that correctly, it does not help with the filtering issues at all, since it needs exactly the same kind of filter. Daniel explained the concept far better than I. On 4 February 2015 at 18:33, Jeremy Stanley fu...@yuggoth.org wrote: On 2015-02-04 13:40:29 +0200 (+0200), Duncan Thomas wrote: 4) Write a small daemon that runs as root, accepting commands over a unix domain socket or similar. Easier to audit, less code running as root. http://git.openstack.org/cgit/openstack/oslo.rootwrap/tree/oslo_rootwrap/daemon.py -- Jeremy Stanley __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Duncan Thomas __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][libvirt] Logging interactions of libvirt + QEMU in Nova
Daniel, The last tip on this page possibly? http://wiki.stoney-cloud.org/wiki/Debugging_Qemu -- dims On Wed, Feb 4, 2015 at 11:18 AM, Daniel P. Berrange berra...@redhat.com wrote: On Wed, Feb 04, 2015 at 11:15:18AM -0500, Davanum Srinivas wrote: Daniel, Kashyap, One question that came up on IRC was, how/where to configure say a directory where core dumps from qemu would end up. Sean was seeing a scenario where he noticed a core dump from qemu in dmesg/syslog and was wondering how to specify a directory to capture a core dump if/when it occurs. That's really outside the scope of libvirt. On Fedora/RHEL there is the abrt program, which captures core dumps from any process in the entire OS and has configurable actions for where to save them. IIUC Ubuntu has some general purpose core dump handler too but I don't know much about it myself. They all work by hooking into the kernel's core_pattern sysctl knob to redirect core dumps to their own binary instead of using $CWD Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| -- Davanum Srinivas :: https://twitter.com/dims __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
On Wed, Feb 04, 2015 at 04:30:32PM +, Matthew Booth wrote: I've spent a few hours today reading about Galera, a clustering solution for MySQL. Galera provides multi-master 'virtually synchronous' replication between multiple mysql nodes. i.e. I can create a cluster of 3 mysql dbs and read and write from any of them with certain consistency guarantees. I am no expert[1], but this is a TL;DR of a couple of things which I didn't know, but feel I should have done. The semantics are important to application design, which is why we should all be aware of them. * Commit will fail if there is a replication conflict foo is a table with a single field, which is its primary key. A: start transaction; B: start transaction; A: insert into foo values(1); B: insert into foo values(1); -- 'regular' DB would block here, and report an error on A's commit A: commit; -- success B: commit; -- KABOOM Confusingly, Galera will report a 'deadlock' to node B, despite this not being a deadlock by any definition I'm familiar with. Yes ! and if I can add more information and I hope I do not make mistake I think it's a know issue which comes from MySQL, that is why we have a decorator to do a retry and so handle this case here: http://git.openstack.org/cgit/openstack/nova/tree/nova/db/sqlalchemy/api.py#n177 Essentially, anywhere that a regular DB would block, Galera will not block transactions on different nodes. Instead, it will cause one of the transactions to fail on commit. This is still ACID, but the semantics are quite different. The impact of this is that code which makes correct use of locking may still fail with a 'deadlock'. The solution to this is to either fail the entire operation, or to re-execute the transaction and all its associated code in the expectation that it won't fail next time. As I understand it, these can be eliminated by sending all writes to a single node, although that obviously makes less efficient use of your cluster. * Write followed by read on a different node can return stale data During a commit, Galera replicates a transaction out to all other db nodes. Due to its design, Galera knows these transactions will be successfully committed to the remote node eventually[2], but it doesn't commit them straight away. The remote node will check these outstanding replication transactions for write conflicts on commit, but not for read. This means that you can do: A: start transaction; A: insert into foo values(1) A: commit; B: select * from foo; -- May not contain the value we inserted above[3] This means that even for 'synchronous' slaves, if a client makes an RPC call which writes a row to write master A, then another RPC call which expects to read that row from synchronous slave node B, there's no default guarantee that it'll be there. Galera exposes a session variable which will fix this: wsrep_sync_wait (or wsrep_causal_reads on older mysql). However, this isn't the default. It presumably has a performance cost, but I don't know what it is, or how it scales with various workloads. Because these are semantic issues, they aren't things which can be easily guarded with an if statement. We can't say: if galera: try: commit except: rewind time If we are to support this DB at all, we have to structure code in the first place to allow for its semantics. Matt [1] No, really: I just read a bunch of docs and blogs today. If anybody who is an expert would like to validate/correct that would be great. 
[2] http://www.percona.com/blog/2012/11/20/understanding-multi-node-writing-conflict-metrics-in-percona-xtradb-cluster-and-galera/ [3] http://www.percona.com/blog/2013/03/03/investigating-replication-latency-in-percona-xtradb-cluster/ -- Matthew Booth Red Hat Engineering, Virtualisation Team Phone: +442070094448 (UK) GPG ID: D33C3490 GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490 __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
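For readers who have not seen it, the retry decorator Sahid links to above is essentially a "catch the deadlock error and re-run the whole transaction" wrapper. A simplified sketch of the pattern follows; this is not the actual nova code, and the exception class is a stand-in for whatever deadlock error your DB layer raises (oslo.db exposes one):

# Simplified sketch of the retry-on-deadlock pattern referenced above.
# Not nova's implementation: 'DBDeadlock' is a placeholder for the
# deadlock exception surfaced by Galera/MySQL, and the decorated function
# must contain the *entire* transaction so it is safe to re-execute.

import functools
import time


class DBDeadlock(Exception):
    """Placeholder for the deadlock error raised by the DB layer."""


def retry_on_deadlock(max_retries=5, delay=0.5):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator


@retry_on_deadlock()
def create_foo_record(value):
    # begin transaction; insert into foo ...; commit
    # (the whole unit of work lives here so a retry replays all of it)
    pass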
[openstack-dev] [oslo][nova][cinder] removing request_utils from oslo-incubator
About 12 hours ago in #openstack-oslo ankit_ag asked about the request_utils module that was removed from oslo-incubator and how to proceed to get it into nova. The module was deleted a few days ago [1] because nothing was actually using it and it appeared to be related to a nova blueprint [2], the spec for which was abandoned at the end of juno [3]. The one copy that had been synced into cinder wasn’t being used, and was also deleted [4] as part of this housekeeping work. As I said in the review, we removed the code from the incubator because it appeared to be a dead end. If that impression is incorrect, we should get the spec and blueprint resurrected (probably as a cross-project spec rather than just in nova) and then we can consider the best course for proceeding with the implementation. Doug [1] https://review.openstack.org/#/c/150370/ [2] https://blueprints.launchpad.net/nova/+spec/log-request-id-mappings [3] https://review.openstack.org/#/c/106878/ [4] https://review.openstack.org/#/c/150369/ __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] high dhcp lease times in neutron deployments considered harmful (or not???)
I proposed an alternative to adjusting the lease time early on the in the thread. By specifying the renewal time (DHCP option 58), we can have the benefits of a long lease time (resiliency to long DHCP server outages) while having a frequent renewal interval to check for IP changes. I favored this approach because it only required a patch to dnsmasq to allow that option to be set and patch to our agent to set that option, both of which are pretty straight-forward. - Just don't allow users to change their IPs without a reboot. How can we do this? Call Nova from Neutron to force a reboot when the port is updated? - Bounce the link under the VM when the IP is changed, to force the guest to re-request a DHCP lease immediately. I had thought about this as well and it's the approach that I think would be ideal, but the Nova VIF code would require changes to add support for changing interface state. It's definition of plugging and unplugging is actually creating and deleting the interfaces, which might not work so well with running VMs. Then more changes would have to be done on the Nova side to react to a port IP change notification from Neutron to trigger the interface bounce. Finally, a small change would have to be made to Neutron to send the IP change event to Nova. The amount of changes it required from the Nova side deterred me from pursuing it further. - Remove the IP spoofing firewall feature I think this makes sense as a tenant-configurable option for networks they own, but I don't think we should throw it out. It makes for good protection on networks facing Internet traffic that could have compromised hosts. Along the same line, we make use of shared networks, which has other shady tenants that might be dishonest when it comes to IP addresses. - Make the IP spoofing firewall allow an overlap of both old and new addresses until the DHCP lease time is up (or the instance reboots). Adds some additional async tasks, but this is clearly the required solution if we want to keep all our existing features. I didn't find a clean spot to put this. Spoofing rules are generated a long ways away from the code that knows about IP updates. Maybe we could tack it onto the response to the query from the agent for allowed address pairs. Then we have to deal with persisting these temporary allowed addresses to the DB (not a big deal, but still a schema change). Another issue here would be if Neutron then allocated that address for another port while it was still in use by the old node. We will probably have to block IPAM from re-allocating that address for another port during this window as well. However, this doesn't solve the general slowness of DHCP info propagation for other updates (subnet gateway change, DNS nameserver change, etc), so I would still like to go forward with the increased renewal interval. I will also look into eliminating the downtime completely with your last suggestion if it can be implemented without impacting too much stuff. On Tue, Feb 3, 2015 at 11:01 PM, Angus Lees g...@inodes.org wrote: There's clearly not going to be any amount of time that satisfies both concerns here. Just to get some other options on the table, here's some things that would allow a non-zero dhcp lease timeout _and_ address Kevin's original bug report: - Just don't allow users to change their IPs without a reboot. - Bounce the link under the VM when the IP is changed, to force the guest to re-request a DHCP lease immediately. - Remove the IP spoofing firewall feature (- my favourite, for what it's worth. 
I've never liked presenting a layer2 abstraction but then forcing specific layer3 addressing choices by default) - Make the IP spoofing firewall allow an overlap of both old and new addresses until the DHCP lease time is up (or the instance reboots). Adds some additional async tasks, but this is clearly the required solution if we want to keep all our existing features. On Wed Feb 04 2015 at 4:28:11 PM Aaron Rosen aaronoro...@gmail.com wrote: I believe I was the one who changed the default value of this. When we upgraded our internal cloud ~6k networks back then from folsom to grizzly we didn't account that if the dhcp-agents went offline that instances would give up their lease and unconfigure themselves causing an outage. Setting a larger value for this helps to avoid this downtime (as Brian pointed out as well). Personally, I wouldn't really expect my instance to automatically change it's ip - I think requiring the user to reboot the instance or use the console to correct the ip should be good enough. Especially since this will help buy you shorter down time if an agent fails for a little while which is probably more important than having the instance change it's ip. Aaron On Tue, Feb 3, 2015 at 5:25 PM, Kevin Benton blak...@gmail.com wrote: I definitely understand the use-case of having updatable stuff and I don't intend to support any proposals to strip away
Re: [openstack-dev] [neutron] [lbaas] LBaaS Haproxy performance benchmarking
You can try httperf [1] or ab [2] for HTTP workloads. If you use an overlay, make sure your network MTU is correctly configured to handle the extra size of the overlay (GRE / VXLAN packets); otherwise you will introduce fragmentation overhead on the tenant networks. [1] http://www.hpl.hp.com/research/linux/httperf/ [2] http://httpd.apache.org/docs/2.2/programs/ab.html Miguel Ángel Ajo On Wednesday, February 4, 2015 at 01:58, Varun Lodaya wrote: Hi, We were trying to use haproxy as our LBaaS solution on the overlay. Has anybody done some baseline benchmarking with the LBaaSv1 haproxy solution? Also, any recommended tools which we could use to do that? Thanks, Varun __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
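To put the MTU caveat above into rough numbers (the overhead figures below are the commonly quoted ones for an IPv4 underlay, not measurements from this thread), here is a quick sketch of how much smaller the tenant MTU needs to be than the underlay MTU:

# Rough illustration only; assumes an IPv4 underlay and the usual
# textbook header sizes. Tenant traffic carried inside VXLAN or GRE must
# still fit the underlay MTU, so the guest MTU has to shrink by the
# encapsulation overhead or packets will be fragmented.

UNDERLAY_MTU = 1500
OVERHEAD = {
    "vxlan": 50,  # inner ethernet 14 + outer IPv4 20 + UDP 8 + VXLAN 8
    "gre": 42,    # inner ethernet 14 + outer IPv4 20 + GRE 8 (with key)
}

for proto in sorted(OVERHEAD):
    print("%s: set tenant MTU <= %d" % (proto, UNDERLAY_MTU - OVERHEAD[proto]))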
Re: [openstack-dev] [Neutron] [ML2] [arp] [l2pop] arp responding for vlan network
Hi henry, It looks great and quite simple thanks to the work done by the ofagent team. This kind of work might be used also for DVR which now support VLAN networks [3]. I have some concerns about the patch submitted in [1], so let's review! [3]https://review.openstack.org/#/c/129884/ On Wed, Feb 4, 2015 at 8:06 AM, henry hly henry4...@gmail.com wrote: Hi ML2'ers, We encounter use case of large amount of vlan network deployment, and want to reduce ARP storm by local responding. Luckily from Icehouse arp local response is implemented, however vlan is missed for l2pop. Then came this BP[1], which implement the plugin support of l2pop for configurable network types, and the ofagent vlan l2pop. Now I find proposal for ovs vlan support for l2pop [2], it's very small and was submitted as a bugfix, so I want to know is it possible to be merged in the K cycle? Best regards Henry [1] https://review.openstack.org/#/c/112947/ [2] https://bugs.launchpad.net/neutron/+bug/1413056 __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Manila]Question about the usage of connect_share_server_to_tenant_network
Hi list, For the generic driver, there is a flag named connect_share_server_to_tenant_network in manila/share/drivers/service_instance.py. When it is set to True, the share server (nova instance) is created directly on the share-network. When it is set to False, the subnet within the share-network must be connected to a router; manila then creates its own subnet, connects it to that router too, and starts the share server in manila's subnet. Based on https://wiki.openstack.org/wiki/Manila/Networking#Gateway_Mediated, I assume the difference here is L2 vs L3 connectivity. But I wonder whether there are other reasons for the generic driver to support this flag. So, my question here is: as a cloud admin, what do I need to consider to figure out what value I should set for this flag? Thanks. -chen __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] high dhcp lease times in neutron deployments considered harmful (or not???)
On Wed, Feb 04, 2015 at 08:59:54, Kevin Benton wrote: I proposed an alternative to adjusting the lease time early on in the thread. By specifying the renewal time (DHCP option 58), we can have the benefits of a long lease time (resiliency to long DHCP server outages) while having a frequent renewal interval to check for IP changes. I favored this approach because it only required a patch to dnsmasq to allow that option to be set and a patch to our agent to set that option, both of which are pretty straightforward. It's hard to see a downside to this proposal. Even if one of the other ideas goes forward as well, a short DHCP renewal interval feels like a very good idea to me. Cory __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Manila]Questions about using not handle share-servers drivers with Flat network
Hi, Thanks very much for the reply. Really sorry for the late response. In your case if you have a driver that doesn't handle share servers, then the network is completely out of scope for Manila. Drivers that don't manage share servers have neither flat nor segmented networking in Manila, they have NO networking. -> So, you mean there is no way to make this work the way I want, right? But is it possible to enable that? If you noticed, we're trying to enable HDFS in manila: https://blueprints.launchpad.net/manila/+spec/hdfs-driver That's the main reason I want to emphasize that my driver does not handle share servers. Big data users want to have unified storage when they're working in the cloud, because instances are not a reliable resource in the cloud, and putting data together with instances while making sure of the data's reliability would be complicated. The biggest difference between HDFS and all the backends manila currently supports is that HDFS has different control and data paths. An HDFS cluster has one name node and multiple data nodes. A client talks to the name node first, gets the data location, and then talks to the data nodes to get the data. The export location represents the name node information only. -> We can't put any share server in the middle between user instances and the HDFS cluster. But it is possible to let HDFS work in the cloud with restrictions: -> It can only support one share-network at a time. This actually restricts the ability of the manila backend: no multi-tenancy at all. We want to use HDFS like this: connect users' share-network and the HDFS-cluster network by a router, similar to the current generic driver's behavior when connect_share_server_to_tenant_network = False, but with no share server. Access control is achieved based on HDFS's own users. We can add some access control based on keystone users and keystone tenants to prevent bad users from connecting to the HDFS cluster in the first place, if that's possible. Thanks. -chen From: Ben Swartzlander [mailto:b...@swartzlander.org] Sent: Wednesday, January 28, 2015 12:35 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Manila]Questions about using not handle share-servers drivers with Flat network On 01/27/2015 06:39 PM, Li, Chen wrote: Hi list, I have some questions. Hope I can get help from you guys. Manila has two driver modes. For drivers that handle share servers, the share-network is easy to understand. For drivers that do not handle share servers, manila requires the admin to do everything before the manila-share service starts, and when the service is running, it only serves requests that do not contain a share-network. I kept being confused about which/why users would create shares without a share-network. Although when working with this kind of driver the manila-share service can only support one specific network restricted by the backend, users do not know about backends; they should always want to create shares with a share-network, because users always want to connect shares to their instances that live in the cloud on a share-network. Then I have been told that these shares created without a share-network are assumed to be used on a public network. The public network does make a clear explanation of why the share-network does not matter anymore. But when I build my cloud with Manila, what I want to do is let backends serve my Flat network. I want to have 2 backends in Manila, both of which are drivers that do not handle share servers. I set 192.168.6.253 for backend1 and create a Flat network in neutron with subnet 192.168.6.0/24 with IP range from 192.168.6.1-192.168.6.252.
I set 192.168.7.253 for backend2 and create a Flat network in neutron with subnet 192.168.7.0/24 with IP range from 192.168.7.1-192.168.7.252. The reason I build my cloud like this is because I want to do some performance tests on both backends, to compare the two backends. I think it should not hard to do it, but manila do not support that currently. So, is this the behavior should work ? Or anything else I missed ? Manila needs to support backends that can create share servers and backends that can't create share servers. We do this because of the reality that different storage systems have different capabilities and designs, and we don't want to block anything that can reasonably described as a shared filesystem from working with Manila. For the purposes of Manila, a share server is a logically isolated instance of a file share server, with its own IP address, routing tables, security domain, and name services. Manila only tracks the existence of share servers that were created as the result of a share-create operation. Share servers created by manila have IP addresses assigned by Manila, and can be expected to be deleted by Manila sometime after the last share on that share server is deleted. Backends that simply create shares on a preexsting storage systems are not referred to as share servers and networking concerns for those systems are out
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
Matthew Booth mbo...@redhat.com wrote: This means that even for 'synchronous' slaves, if a client makes an RPC call which writes a row to write master A, then another RPC call which expects to read that row from synchronous slave node B, there's no default guarantee that it'll be there. Can I get some kind of clue as to how common this use case is? This is where we get into things like how nova.objects works and stuff, which is not my domain. We are going through a huge amount of thought in order to handle this use case but I’m not versed in where / how this use case exactly happens and how widespread it is. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [QA] Prototype of the script for Tempest auto-configuration
On 01/26/2015 09:39 AM, Timur Nurlygayanov wrote: Hi, Sorry for the late reply. Was on vacation. *Yaroslav*,thank you for raising the question, I realy like this feature, I discussed this script with several people during the OpenStack summit in Paris and heard many the same things - we need to have something like this to execute tempest tests automatically for validation of different production and test OpenStack clouds - it is real pain to create our own separate scripts for each project / team which will configure Tempest for some specific configurations / installations, because tempest configuration file can be changed and we will need to update our scripts. We need to discuss, first of all, what we need to change in this script before this script will be merged. As I can see, the spec description [1] not fully meet the current implementation [2] and the spec looks really general - probably we can describe separate 'simple' spec for this script and just abandon the current spec or update the spec to sync spec and this script? Good idea. *David*, we found many issues with the current version of script, many tempest tests failed for our custom OpenStack configurations (for example, with and without Swift or Ceph) and we have our own scripts which already can solve the problem. Can we join you and edit the patch together? (or we can describe our ideas in the comments for the patch). I welcome edits to this patch. -David Also, looks like we need review from Tempest core team - they can write more valuable comments and suggest some cool ideas for the implementation. [1] https://review.openstack.org/#/c/94473 [2] https://review.openstack.org/#/c/133245 On Fri, Jan 23, 2015 at 7:12 PM, Yaroslav Lobankov yloban...@mirantis.com mailto:yloban...@mirantis.com wrote: Hello everyone, I would like to discuss the following patch [1] for Tempest. I think that such feature as auto-configuration of Tempest would be very useful for many engineers and users. I have recently tried to use the script from [1]. I rebased the patch on master and ran the script. The script was finished without any errors and the tempest.conf was generated! Of course, this patch needs a lot of work, but the idea looks very cool! Also I would like to thank David Kranz for his working on initial version of the script. Any thoughts? [1] https://review.openstack.org/#/c/133245 Regards, Yaroslav Lobankov. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Timur, Senior QA Engineer OpenStack Projects Mirantis Inc __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Telco][NFV] Meeting Reminder - Wednesday 4th February 2015 @ 2200 UTC in #openstack-meeting
Hi all, Just a quick (belated) reminder that there is an OpenStack Telco Working group meeting in #openstack-meeting today @ 2200 UTC. I'm currently updating the agenda, please review and add any items here: https://etherpad.openstack.org/p/nfv-meeting-agenda Thanks, Steve __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
On 02/04/2015 12:05 PM, Sahid Orentino Ferdjaoui wrote: On Wed, Feb 04, 2015 at 04:30:32PM +, Matthew Booth wrote: I've spent a few hours today reading about Galera, a clustering solution for MySQL. Galera provides multi-master 'virtually synchronous' replication between multiple mysql nodes. i.e. I can create a cluster of 3 mysql dbs and read and write from any of them with certain consistency guarantees. I am no expert[1], but this is a TL;DR of a couple of things which I didn't know, but feel I should have done. The semantics are important to application design, which is why we should all be aware of them. * Commit will fail if there is a replication conflict foo is a table with a single field, which is its primary key. A: start transaction; B: start transaction; A: insert into foo values(1); B: insert into foo values(1); -- 'regular' DB would block here, and report an error on A's commit A: commit; -- success B: commit; -- KABOOM Confusingly, Galera will report a 'deadlock' to node B, despite this not being a deadlock by any definition I'm familiar with. It is a failure to certify the writeset, which bubbles up as an InnoDB deadlock error. See my article here: http://www.joinfu.com/2015/01/understanding-reservations-concurrency-locking-in-nova/ Which explains this. Yes ! and if I can add more information and I hope I do not make mistake I think it's a know issue which comes from MySQL, that is why we have a decorator to do a retry and so handle this case here: http://git.openstack.org/cgit/openstack/nova/tree/nova/db/sqlalchemy/api.py#n177 It's not an issue with MySQL. It's an issue with any database code that is highly contentious. Almost all highly distributed or concurrent applications need to handle deadlock issues, and the most common way to handle deadlock issues on database records is using a retry technique. There's nothing new about that with Galera. The issue with our use of the @_retry_on_deadlock decorator is *not* that the retry decorator is not needed, but rather it is used too frequently. The compare-and-swap technique I describe in the article above dramatically* reduces the number of deadlocks that occur (and need to be handled by the @_retry_on_deadlock decorator) and dramatically reduces the contention over critical database sections. Best, -jay * My colleague Pavel Kholkin is putting together the results of a benchmark run that compares the compare-and-swap method with the raw @_retry_on_deadlock decorator method. Spoiler: the compare-and-swap method cuts the runtime of the benchmark by almost *half*. Essentially, anywhere that a regular DB would block, Galera will not block transactions on different nodes. Instead, it will cause one of the transactions to fail on commit. This is still ACID, but the semantics are quite different. The impact of this is that code which makes correct use of locking may still fail with a 'deadlock'. The solution to this is to either fail the entire operation, or to re-execute the transaction and all its associated code in the expectation that it won't fail next time. As I understand it, these can be eliminated by sending all writes to a single node, although that obviously makes less efficient use of your cluster. * Write followed by read on a different node can return stale data During a commit, Galera replicates a transaction out to all other db nodes. Due to its design, Galera knows these transactions will be successfully committed to the remote node eventually[2], but it doesn't commit them straight away. 
The remote node will check these outstanding replication transactions for write conflicts on commit, but not for read. This means that you can do: A: start transaction; A: insert into foo values(1) A: commit; B: select * from foo; -- May not contain the value we inserted above[3] This means that even for 'synchronous' slaves, if a client makes an RPC call which writes a row to write master A, then another RPC call which expects to read that row from synchronous slave node B, there's no default guarantee that it'll be there. Galera exposes a session variable which will fix this: wsrep_sync_wait (or wsrep_causal_reads on older mysql). However, this isn't the default. It presumably has a performance cost, but I don't know what it is, or how it scales with various workloads. Because these are semantic issues, they aren't things which can be easily guarded with an if statement. We can't say: if galera: try: commit except: rewind time If we are to support this DB at all, we have to structure code in the first place to allow for its semantics. Matt [1] No, really: I just read a bunch of docs and blogs today. If anybody who is an expert would like to validate/correct that would be great. [2] http://www.percona.com/blog/2012/11/20/understanding-multi-node-writing-conflict-metrics-in-percona-xtradb-cluster-and-galera/ [3]
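For anyone who has not read the article Jay links, the compare-and-swap idea is simply to fold the "is the row still in the state I expect" check into the UPDATE itself and inspect the affected row count, instead of using SELECT ... FOR UPDATE. A minimal, self-contained sketch follows; the table and column names are invented for illustration and are not nova's schema:

# Minimal sketch of the compare-and-swap technique discussed above.
# The schema and names are illustrative only. Instead of locking the row
# with SELECT ... FOR UPDATE, we issue an UPDATE qualified by the value
# we believe is current; rowcount == 0 means another writer won the race
# and the caller should re-read and retry at the application level.

from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

metadata = MetaData()
instances = Table(
    "instances", metadata,
    Column("id", Integer, primary_key=True),
    Column("vm_state", String(16)),
)


def compare_and_swap_state(engine, instance_id, expected, new):
    with engine.begin() as conn:
        result = conn.execute(
            instances.update()
            .where(instances.c.id == instance_id)
            .where(instances.c.vm_state == expected)
            .values(vm_state=new)
        )
        swapped = result.rowcount == 1
    return swapped


if __name__ == "__main__":
    engine = create_engine("sqlite://")  # stand-in engine for the sketch
    metadata.create_all(engine)
    with engine.begin() as conn:
        conn.execute(instances.insert().values(id=1, vm_state="building"))
    print(compare_and_swap_state(engine, 1, "building", "active"))  # True
    print(compare_and_swap_state(engine, 1, "building", "error"))   # False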
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
Excerpts from Tristan Cacqueray's message of 2015-02-04 09:02:19 -0800: On 02/04/2015 06:57 AM, Daniel P. Berrange wrote: On Wed, Feb 04, 2015 at 11:58:03AM +0100, Thierry Carrez wrote: What solutions do we have ? (1) we could get our act together and audit and fix those filter definitions. Remove superfluous usage of root rights, make use of advanced filters for where we actually need them. We have been preaching for that at many many design summits. This is a lot of work though... There were such efforts in the past, but they were never completed for some types of nodes. Worse, the bad filter definitions kept coming back, since developers take shortcuts, reviewers may not have sufficient security awareness to detect crappy filter definitions, and I don't think we can design a gate test that would have such awareness. (2) bite the bullet and accept that some types of nodes actually need root rights for so many different things, they should just run as root anyway. I know a few distributions which won't be very pleased by such a prospect, but that would be a more honest approach (rather than claiming we provide efficient isolation when we really don't). An added benefit is that we could replace a number of shell calls by Python code, which would simplify the code and increase performance. (3) intermediary solution where we would run as the nova user but run sudo COMMAND directly (instead of sudo nova-rootwrap CONFIG COMMAND). That would leave it up to distros to choose between a blanket sudoer or maintain their own filtering rules. I think it's a bit hypocritical though (pretend the distros could filter if they wanted it, when we dropped the towel on doing that ourselves). I'm also not convinced it's more secure than solution 2, and it prevents from reducing the number of shell-outs, which I think is a worthy idea. In all cases I would not drop the baby with the bath water, and keep rootwrap for all the cases where root rights are needed on a very specific set of commands (like neutron, or nova's api-metadata). The daemon mode should address the performance issue for the projects making a lot of calls. (4) I think that ultimately we need to ditch rootwrap and provide a proper privilege separated, formal RPC mechanism for each project. eg instead of having a rootwrap command, or rootwrap server attempting to validate safety of qemu-img create -f qcow2 /var/lib/nova/instances/instance1/disk.qcow2 we should have a nova-compute-worker daemon running as root, that accepts an RPC command from nova-compute running unprivileged. eg CreateImage(instane0001, qcow2, disk.qcow) This immediately makes it trivial to validate that we're not trying to trick qemu-img into overwriting some key system file. This is certainly alot more work than trying to patchup rootwrap, but it would provide a level of security that rootwrap can never achieve IMHO. This 4th idea sounds interesting, though we are assuming this new service running as root would be exempt of bug, especially if it uses the same libraries as non-root services... For example a major bug in python would give attacker direct root access while the rootwrap solution would in theory keep the intruder at the sudo level... I don't believe that anyone assumes the new service would be without bugs. But just like the OpenSSH team saw years ago, privilege separation means that you can absolutely know what is running as root, and what is not. 
So when you decide to commit your resources to code audits, you _start_ with the things that run with elevated privileges. For completeness, I'd like to propose a more long-term solution: (5) Get rid of root! Seriously, OpenStack could support security mechanisms like SELinux or AppArmor in order to properly isolate services and let them run what they need to run. For what it's worth, the underlying issue here is having a single almighty super user: root, and thus we should, at least, consider solutions that remove the need for such powers (e.g. kernel module loading, ptrace or raw sockets). We don't need a security module to drop all of those capabilities entirely and run as a hobbled root user. By my measure, this process for nova-compute would only need CAP_NET_ADMIN, CAP_SYS_ADMIN and CAP_KILL. These capabilities can be audited per-agent and even verified as needed simply by running integration tests without each one to see what breaks. Besides, as long as sensitive processes are not contained at the system level, the attack surface for a non-root user is still very wide (e.g. system calls, setuid binaries, ipc, ...) While this might sound impossible to implement upstream because it's too vendor specific or just because of other technical difficulties, I guess it still deserves a mention in this thread. I think OpenStack can do its part by making privilege
Re: [openstack-dev] [neutron][lbaas] Can entity calls be made to driver when entities get associated/disassociated with root entity?
Thanks Doug. My apologies for the delayed reply. The change is merged, so replying here. It is a welcome change in one way, there is always a root entity now in perspective while creating any entity. Listener is created with loadbalancer and pool is created with listener. The problem itself is eliminated because there is no DEFERRED stage. But, this restricts pool in one listener. Basically reusing of a pools across listeners and loadbalancers is not possible now. The use case of creating both a HTTPS vip and HTTP vip for the same pool is lost. Basically, a user who will need that, should create 2 pools with the same members and manage them. Is that right? Thanks, Vijay V. From: Doug Wiegley [mailto:doug...@parksidesoftware.com] Sent: Tuesday, February 3, 2015 10:03 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [neutron][lbaas] Can entity calls be made to driver when entities get associated/disassociated with root entity? I’d recommend taking a look at Brandon’s review: https://review.openstack.org/#/c/144834/ which aims to simplify exactly what you’re describing. Please leave feedback there. Thanks, doug On Tue, Feb 3, 2015 at 7:13 AM, Vijay Venkatachalam vijay.venkatacha...@citrix.commailto:vijay.venkatacha...@citrix.com wrote: Hi: In OpenStack neutron lbaas implementation, when entities are created/updated by the user, they might not be associated with the root entity, which is loadbalancer. Since root entity has the driver information, the driver cannot be called by lbaas plugin during these operations by user. Such entities are set in DEFFERED status until the entity is associated with root entity. During this association operation (listener created with pool), the driver api is called for the current operation (listener create); and the driver is expected to perform the original operation (pool create) along with the current operation (listener create). This leads to complex handling at the driver, I think it will be better for the lbaas plugin to call the original operation (pool create) driver API in addition to the current operation (listener create) API during the association operation. That is the summary, please read on to understand the situation in detail. Let’s take the example of pool create in driver. a. A pool create operation will not translate to a pool create api in the driver. There is a pool create in the driver API but that is never called today. b. When a listener is created with loadbalancer and pool, the driver’s listener create api is called and the driver is expected to create both pool and listener. c. When a listener is first created without loadbalancer but with a pool, the call does not reach driver. Later when the listener is updated with loadbalancer id, the drivers listener update API is called and the driver is expected to create both pool and listener. d. When a listener configured with pool and loadbalancer is updated with new pool id, the driver’s listener update api is called. The driver is expected to delete the original pool that was associated, create the new pool and also update the listener As you can see this is leading to a quite a bit of handling in the driver code. This makes driver code complex. How about handling this logic in lbaas plugin and it can call the “natural” functions that were deferred. Whenever an entity is going from a DEFERRED to ACTIVE/CREATE status (through whichever workflow) the plugin can call the CREATE pool function of the driver. 
Whenever an entity is going from an ACTIVE/CREATED to DEFERRED status (through whichever workflow) the plugin can call the DELETE pool function of the driver. Thanks, Vijay V. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribehttp://openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
Excerpts from Daniel P. Berrange's message of 2015-02-04 03:57:53 -0800: On Wed, Feb 04, 2015 at 11:58:03AM +0100, Thierry Carrez wrote: The first one is performance -- each call would spawn a Python interpreter which would then call the system command. This was fine when there were just a few calls here and there, not so much when it's called a hundred times in a row. During the Juno cycle, a daemon mode was added to solve this issue. It is significantly faster than running sudo directly (the often-suggested alternative). Projects still have to start adopting it though. Neutron and Cinder have started work to do that in Kilo. The second problem is the quality of the filter definitions. Rootwrap is a framework to enable isolation. It's only as good as the filters each project defines. Most of them rely on CommandFilters that do not check any argument, instead of using more powerful filters (which are arguably more painful to maintain). Developers routinely add filter definitions that basically remove any isolation that might have been there, like allowing blank dd, tee, chown or chmod. I think this is really the key point which shows rootwrap as a concept is broken by design IMHO. Root wrap is essentially trying to provide an API for invoking privileged operations, but instead of actually designing an explicit API for the operations, we done by implicit one based on command args. From a security POV I think this approach is doomed to failure, but command arg strings are fr to expressive a concept to deal with. What solutions do we have ? (1) we could get our act together and audit and fix those filter definitions. Remove superfluous usage of root rights, make use of advanced filters for where we actually need them. We have been preaching for that at many many design summits. This is a lot of work though... There were such efforts in the past, but they were never completed for some types of nodes. Worse, the bad filter definitions kept coming back, since developers take shortcuts, reviewers may not have sufficient security awareness to detect crappy filter definitions, and I don't think we can design a gate test that would have such awareness. (2) bite the bullet and accept that some types of nodes actually need root rights for so many different things, they should just run as root anyway. I know a few distributions which won't be very pleased by such a prospect, but that would be a more honest approach (rather than claiming we provide efficient isolation when we really don't). An added benefit is that we could replace a number of shell calls by Python code, which would simplify the code and increase performance. (3) intermediary solution where we would run as the nova user but run sudo COMMAND directly (instead of sudo nova-rootwrap CONFIG COMMAND). That would leave it up to distros to choose between a blanket sudoer or maintain their own filtering rules. I think it's a bit hypocritical though (pretend the distros could filter if they wanted it, when we dropped the towel on doing that ourselves). I'm also not convinced it's more secure than solution 2, and it prevents from reducing the number of shell-outs, which I think is a worthy idea. In all cases I would not drop the baby with the bath water, and keep rootwrap for all the cases where root rights are needed on a very specific set of commands (like neutron, or nova's api-metadata). The daemon mode should address the performance issue for the projects making a lot of calls. 
(4) I think that ultimately we need to ditch rootwrap and provide a proper privilege separated, formal RPC mechanism for each project. eg instead of having a rootwrap command, or rootwrap server attempting to validate safety of qemu-img create -f qcow2 /var/lib/nova/instances/instance1/disk.qcow2 we should have a nova-compute-worker daemon running as root, that accepts an RPC command from nova-compute running unprivileged. eg CreateImage(instane0001, qcow2, disk.qcow) This immediately makes it trivial to validate that we're not trying to trick qemu-img into overwriting some key system file. This is certainly alot more work than trying to patchup rootwrap, but it would provide a level of security that rootwrap can never achieve IMHO. +1, I think you're right on Daniel. Count me in for future discussions and work on this. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
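For context on the daemon mode Thierry mentions: instead of forking sudo plus a fresh Python interpreter for every privileged call, a service keeps one privileged rootwrap daemon alive and sends it commands over a local channel. A rough sketch is below; the module path, class name and helper command are written from memory and are deployment-specific, so treat them as assumptions and check the oslo.rootwrap documentation rather than copying this verbatim:

# Hedged sketch of oslo.rootwrap daemon mode (API names are assumptions
# recalled from memory; verify against the oslo.rootwrap docs). The
# daemon is spawned once via sudo, applies the same filter files to every
# request, and later privileged commands avoid per-call interpreter startup.

from oslo_rootwrap import client

# Deployment-specific example command used to start the privileged daemon.
root_helper_daemon = ["sudo", "nova-rootwrap-daemon", "/etc/nova/rootwrap.conf"]

rootwrap = client.Client(root_helper_daemon)

# Each call is an RPC to the already-running daemon rather than a new
# "sudo nova-rootwrap ..." process.
returncode, out, err = rootwrap.execute(["ip", "link", "show"])
print(returncode)
print(out)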
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
Matthew Booth mbo...@redhat.com wrote: A: start transaction; B: start transaction; A: insert into foo values(1); B: insert into foo values(1); -- 'regular' DB would block here, and report an error on A's commit A: commit; -- success B: commit; -- KABOOM Confusingly, Galera will report a 'deadlock' to node B, despite this not being a deadlock by any definition I'm familiar with. So, one of the entire points of the enginefacade work is that we will ensure that writes will continue to be made to exactly one node in the cluster. Openstack does not have the problem defined above, because we only communicate with one node, even today. The work that we are trying to proceed with is to at least have *reads* make full use of the cluster. The above phenomenon is not a problem for openstack today except for the reduced efficiency, which enginefacade will partially solve. As I understand it, these can be eliminated by sending all writes to a single node, although that obviously makes less efficient use of your cluster. this is what we do right now and it continues to be the plan going forward. Having single-master is in fact the traditional form of clustering. In the Openstack case, this issue isn’t as bad as it seems, because openstack runs many different applications against the same database simultaneously. Different applications should refer to different nodes in the cluster as their “master”. There’s no conflict here because each app talks only to its own tables. During a commit, Galera replicates a transaction out to all other db nodes. Due to its design, Galera knows these transactions will be successfully committed to the remote node eventually[2], but it doesn't commit them straight away. The remote node will check these outstanding replication transactions for write conflicts on commit, but not for read. This means that you can do: A: start transaction; A: insert into foo values(1) A: commit; B: select * from foo; -- May not contain the value we inserted above[3] will need to get more detail on this. this would mean that galera is not in fact synchronous. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
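To make the per-session causal-reads point concrete, the flag can be enabled only on the connections that actually need read-your-writes semantics. A rough sketch, not the enginefacade implementation; the connection URL is a placeholder, and the variable is wsrep_causal_reads on older Galera versions and wsrep_sync_wait on newer ones:

# Sketch only, not oslo.db/enginefacade code. Setting the variable per
# MySQL session means just this connection waits for the node to catch up
# with the cluster before answering reads; other sessions are unaffected.

from sqlalchemy import create_engine, text

# Placeholder URL pointing at the Galera node used for reads.
engine = create_engine("mysql+pymysql://user:password@galera-node-b/nova")

with engine.connect() as conn:
    conn.execute(text("SET SESSION wsrep_sync_wait = 1"))
    rows = conn.execute(text("SELECT * FROM foo")).fetchall()
    # With the flag set, this read includes rows already committed on the
    # node that handled the write (at the cost of waiting, if needed).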
Re: [openstack-dev] [nova][libvirt] Logging interactions of libvirt + QEMU in Nova
On Wed, Feb 04, 2015 at 11:15:18AM -0500, Davanum Srinivas wrote: Daniel, Kashyap, One question that came up on IRC was, how/where to configure say a directory where core dumps from qemu would end up. Sean was seeing a scenario where he noticed a core dump from qemu in dmesg/syslog If you're using something similar to ABRT on Ubuntu (assuming you're using that), you should be able to find the stack traces. Seems like on Ubuntu, this is what they use https://wiki.ubuntu.com/Apport and was wondering how to specify a directory to capture a core dump if/when it occurs. Incidentally, last week I was debugging a QEMU seg fault with upstream folks[1] which involves qemu-nbd. To capture coredumps with QEMU, this is what I had to do on my Fedora 21 system. Ensure abrt-ccpp (Automatic bug reporting tool is running: $ systemctl status abrt-ccpp ● abrt-ccpp.service - Install ABRT coredump hook Loaded: loaded (/usr/lib/systemd/system/abrt-ccpp.service; enabled) Active: active (exited) since Tue 2015-02-03 16:16:14 CET; 1 day 3h ago Main PID: 1113 (code=exited, status=0/SUCCESS) CGroup: /system.slice/abrt-ccpp.service Then the coredump file should land in: /var/tmp/abrt/ccpp-2015-01-30-00\:01\:05-3145/ You can actually see that the specific coredump is for QEMU, by running (note, in this example it was `qemu-img` that was seg faulting - and there are patches accepted for this upstream): $ cd /var/tmp/abrt/ccpp-2015-01-30-00\:01\:05-3145/ $ file coredump coredump: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from '/home/kashyapc/build/qemu/qemu-img Assuming you have all the QEMU debug info packages installed, you can then invoke GDB to capture the trace backs. An example[2] $ gdb /var/tmp/abrt/ccpp-2015-01-30-00\:01\:05-3145/coredump [. . .] (gdb) bt full [. . .] [1] http://lists.nongnu.org/archive/html/qemu-devel/2015-01/msg04397.html [2] https://kashyapc.fedorapeople.org/virt/qemu-nbd-test/stack-traces-from-coredump.txt On Wed, Feb 4, 2015 at 8:48 AM, Kashyap Chamarthy kcham...@redhat.com wrote: On Wed, Feb 04, 2015 at 10:27:34AM +, Daniel P. Berrange wrote: On Wed, Feb 04, 2015 at 11:23:34AM +0100, Kashyap Chamarthy wrote: Heya, I noticed a ping (but couldn't respond in time) on #openstack-nova IRC about turning on logging in libvirt to capture Nova failures. This was discussed on this list previously by Daniel Berrange, just spelling it out here for reference and completness' sake. (1) To see the interactions between libvirt and QEMU, in /etc/libvirt/libvirtd.conf, have these two config attributes: . . . log_filters=1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 1:util You really want 3:object in there /before/ 1:util too otherwise you'll be spammed with object ref/unref messages Thanks for this detail. log_outputs=1:file:/var/tmp/libvirtd.log Use /var/log/libvirt/libvirtd.log instead of /var/tmp Ah, yeah, it was an incorrect copy/paste. -- /kashyap -- /kashyap __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [cinder][nova] Cinder Brick pypi library?
On Tue, Feb 3, 2015 at 9:41 PM, Walter A. Boring IV walter.bor...@hp.com wrote: Hey folks, I wanted to get some feedback from the Nova folks on using Cinder's Brick library. As some of you may or may not know, Cinder has an internal module called Brick. It's used for discovering and removing volumes attached to a host. Most of the code in the Brick module in cinder originated from the Nova libvirt volume drivers that do the same thing (discover attached volumes and then later remove them). Cinder uses the brick library for copy volume to image, as well as copy image to volume operations where the Cinder node needs to attach volumes to itself to do the work. The Brick code inside of Cinder has been used since the Havana release. Our plans in Cinder for the Kilo release is to extract the Brick module into it's own separate library that is maintained by the Cinder team as a subproject of Cinder and released as a pypi lib. Then for the L release, refactor Nova's libvirt volume drivers to use the Brick library. This will enable us to eliminate the duplicate code between Nova's libvirt volume drivers and Cinder's internal brick module. Both projects can benefit from a shared library. So the question I have is, does Nova have an interest in using the code in a pypi brick library? If not, then it doesn't make any sense for the Cinder team to extract it's brick module into a shared (pypi) library. Yes, nova is interested in using brick, I'm looking forward to seeing nova use it. The first release of brick will only contain the volume discovery and removal code. This is contained in the initiator directory of cinder/brick/ You can view the current brick code in Cinder here: https://github.com/openstack/cinder/tree/master/cinder/brick Thanks for the feedback, Walt __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
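As a rough sense of what consuming such a library might look like from nova's side (this is speculative: the module path, factory call and property names below are assumptions based on the in-tree cinder/brick code of the time, and the eventual pypi API could differ):

# Speculative sketch of consuming a brick-style initiator library; names
# are assumptions, not a published API. The idea: ask the library for a
# connector matching the transport protocol, then hand it the connection
# properties returned by Cinder's initialize_connection call.

from brick.initiator import connector  # assumed module path

root_helper = "sudo"  # or a rootwrap invocation, deployment-specific
conn = connector.InitiatorConnector.factory("ISCSI", root_helper)

# connection_properties would normally come from Cinder's
# initialize_connection API; the values below are placeholders.
connection_properties = {
    "target_portal": "192.0.2.10:3260",
    "target_iqn": "iqn.2015-02.org.example:volume-1234",
    "target_lun": 1,
}

device_info = conn.connect_volume(connection_properties)
print(device_info)  # e.g. the local device path for the attached volume
conn.disconnect_volume(connection_properties, device_info)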
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
Excerpts from Matthew Booth's message of 2015-02-04 08:30:32 -0800: * Write followed by read on a different node can return stale data During a commit, Galera replicates a transaction out to all other db nodes. Due to its design, Galera knows these transactions will be successfully committed to the remote node eventually[2], but it doesn't commit them straight away. The remote node will check these outstanding replication transactions for write conflicts on commit, but not for read. This means that you can do: A: start transaction; A: insert into foo values(1) A: commit; B: select * from foo; -- May not contain the value we inserted above[3] This means that even for 'synchronous' slaves, if a client makes an RPC call which writes a row to write master A, then another RPC call which expects to read that row from synchronous slave node B, there's no default guarantee that it'll be there. Galera exposes a session variable which will fix this: wsrep_sync_wait (or wsrep_causal_reads on older mysql). However, this isn't the default. It presumably has a performance cost, but I don't know what it is, or how it scales with various workloads. wsrep_sync_wait/wsrep_causal_reads doesn't actually hit the cluster any harder; it simply tells the local Galera node "if you're not caught up with the highest known sync point, don't answer queries yet". So it will slow down that particular query as it waits for an update from the leader about the sync point and, if necessary, waits for the local engine to catch up to that point. However, it isn't going to push that query off to all the other boxes or anything like that. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [horizon][keystone]
Hi all, I have been helping with the websso effort and wanted to get some feedback. Basically, users are presented with a login screen where they can select: credentials, default protocol, or discovery service. If the user selects credentials, it works exactly the same way it works today. If the user selects default protocol or discovery service, they can choose to be redirected to those pages. Keep in mind that this is a prototype; early feedback will be good. Here are the relevant patches: https://review.openstack.org/#/c/136177/ https://review.openstack.org/#/c/136178/ https://review.openstack.org/#/c/151842/ I have attached the files and present them below: __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [horizon] JavaScript docs?
On 02/04/2015 12:48 PM, Michael Krotscheck wrote: I agree. StoryBoard's storyboard-webclient project has a lot of existing code already that's pretty well documented, but without knowing what documentation system we were going to settle on we never put any rule enforcement in place. If someone wants to take a stab at putting together a javascript docs build, that project would provide a good test bed that will let you test out the tools without having to also make them dance with python/sphinx at the same time. I.E. I have a bunch of javascript that you can hack on, and the domain knowledge of the Infra JS Build tools. I'd be happy to support this effort. +100 Michael On Wed Feb 04 2015 at 9:09:22 AM Thai Q Tran tqt...@us.ibm.com wrote: As we're moving toward Angular, might make sense for us to adopt ngdoc as well. -Matthew Farina m...@mattfarina.com wrote: - To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org From: Matthew Farina m...@mattfarina.com Date: 02/04/2015 05:42AM Subject: [openstack-dev] [horizon] JavaScript docs? In python we have a style to document methods, classes, and so forth. But, I don't see any guidance on how JavaScript should be documented. I was looking for something like jsdoc or ngdoc (an extension of jsdoc). Is there any guidance on how JavaScript should be documented? For anyone who doesn't know, Angular uses ngdoc (an extension to the commonly used jsdoc) which is written up at https://github.com/angular/angular.js/wiki/Writing-AngularJS-Documentation . __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
How interesting, Why are people using galera if it behaves like this? :-/ Are the people that are using it know/aware that this happens? :-/ Scary Mike Bayer wrote: Matthew Boothmbo...@redhat.com wrote: A: start transaction; A: insert into foo values(1) A: commit; B: select * from foo;-- May not contain the value we inserted above[3] I’ve confirmed in my own testing that this is accurate. the wsrep_causal_reads flag does resolve this, and it is settable on a per-session basis. The attached script, adapted from the script given in the blog post, illustrates this. Galera exposes a session variable which will fix this: wsrep_sync_wait (or wsrep_causal_reads on older mysql). However, this isn't the default. It presumably has a performance cost, but I don't know what it is, or how it scales with various workloads. Well, consider our application is doing some @writer, then later it does some @reader. @reader has the contract that reads must be synchronous with any writes. Easy enough, @reader ensures that the connection it uses sets up set wsrep_causal_reads=1”. The attached test case confirms this is feasible on a per-session (that is, a connection attached to the database) basis, so that the setting will not impact the cluster as a whole, and we can forego using it on those @async_reader calls where we don’t need it. Because these are semantic issues, they aren't things which can be easily guarded with an if statement. We can't say: if galera: try: commit except: rewind time If we are to support this DB at all, we have to structure code in the first place to allow for its semantics. I think the above example is referring to the “deadlock” issue, which we have solved both with the “only write to one master” strategy. But overall, as you’re aware, we will no longer have the words “begin” or “commit” in our code. This takes place all within enginefacade. With this pattern, we will permanently end the need for any kind of repeated special patterns or boilerplate which occurs per-transaction on a backend-configurable basis. The enginefacade is where any such special patterns can take place, and for extended patterns such as setting up wsrep_causal_reads on @reader nodes or similar, we can implement a rudimentary plugin system for it such that we can have a “galera” backend to set up what’s needed. The attached script does essentially what the one associated with http://www.percona.com/blog/2013/03/03/investigating-replication-latency-in-percona-xtradb-cluster/ does. It’s valid because without wsrep_causal_reads turned on the connection, I get plenty of reads that lag behind the writes, so I’ve confirmed this is easily reproducible, and that with casual_reads turned on, it vanishes. The script demonstrates that a single application can set up “wsrep_causal_reads” on a per-session basis (remember, by “session” we mean “a mysql session”), where it takes effect for that connection alone, not affecting the performance of other concurrent connections even in the same application. With the flag turned on, the script never reads a stale row. The script illustrates calls upon both the casual reads connection and the non-causal reads in a randomly alternating fashion. 
I’m running it against a cluster of two virtual nodes on a laptop, so performance is very slow, but some sample output: 2015-02-04 15:49:27,131 100 runs 2015-02-04 15:49:27,754 w/ non-causal reads, got row 763 val is 9499, retries 0 2015-02-04 15:49:27,760 w/ non-causal reads, got row 763 val is 9499, retries 1 2015-02-04 15:49:27,764 w/ non-causal reads, got row 763 val is 9499, retries 2 2015-02-04 15:49:27,772 w/ non-causal reads, got row 763 val is 9499, retries 3 2015-02-04 15:49:27,777 w/ non-causal reads, got row 763 val is 9499, retries 4 2015-02-04 15:49:30,985 200 runs 2015-02-04 15:49:37,579 300 runs 2015-02-04 15:49:42,396 400 runs 2015-02-04 15:49:48,240 w/ non-causal reads, got row 6544 val is 6766, retries 0 2015-02-04 15:49:48,255 w/ non-causal reads, got row 6544 val is 6766, retries 1 2015-02-04 15:49:48,276 w/ non-causal reads, got row 6544 val is 6766, retries 2 2015-02-04 15:49:49,336 500 runs 2015-02-04 15:49:56,433 600 runs 2015-02-04 15:50:05,801 700 runs 2015-02-04 15:50:08,802 w/ non-causal reads, got row 533 val is 834, retries 0 2015-02-04 15:50:10,849 800 runs 2015-02-04 15:50:14,834 900 runs 2015-02-04 15:50:15,445 w/ non-causal reads, got row 124 val is 3850, retries 0 2015-02-04 15:50:15,448 w/ non-causal reads, got row 124 val is 3850, retries 1 2015-02-04 15:50:18,515 1000 runs 2015-02-04 15:50:22,130 1100 runs 2015-02-04 15:50:26,301 1200 runs 2015-02-04 15:50:28,898 w/ non-causal reads, got row 1493 val is 8358, retries 0 2015-02-04 15:50:29,988 1300 runs 2015-02-04 15:50:33,736 1400 runs 2015-02-04 15:50:34,219 w/ non-causal reads, got row 9661 val is 2877, retries 0 2015-02-04 15:50:38,796 1500 runs 2015-02-04 15:50:42,844 1600 runs 2015-02-04 15:50:46,838 1700 runs 2015-02-04
Re: [openstack-dev] [oslo][nova][cinder] removing request_utils from oslo-incubator
I was primarily trying to explain what happened for ankit_ag, since we don't seem to overlap on IRC. If someone cares about the feature, they should get the cross-project spec going because I don't think this is something only nova cores should be deciding. On Wed, Feb 4, 2015, at 04:34 PM, Davanum Srinivas wrote: Doug, So the ball is in the Nova core(s) court? -- dims On Wed, Feb 4, 2015 at 12:16 PM, Doug Hellmann d...@doughellmann.com wrote: About 12 hours ago in #openstack-oslo ankit_ag asked about the request_utils module that was removed from oslo-incubator and how to proceed to get it into nova. The module was deleted a few days ago [1] because nothing was actually using it and it appeared to be related to a nova blueprint [2], the spec for which was abandoned at the end of juno [3]. The one copy that had been synced into cinder wasn’t being used, and was also deleted [4] as part of this housekeeping work. As I said in the review, we removed the code from the incubator because it appeared to be a dead end. If that impression is incorrect, we should get the spec and blueprint resurrected (probably as a cross-project spec rather than just in nova) and then we can consider the best course for proceeding with the implementation. Doug [1] https://review.openstack.org/#/c/150370/ [2] https://blueprints.launchpad.net/nova/+spec/log-request-id-mappings [3] https://review.openstack.org/#/c/106878/ [4] https://review.openstack.org/#/c/150369/ __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Davanum Srinivas :: https://twitter.com/dims __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] [lbaas] LBaaS Haproxy performance benchmarking
Well, low results on ab or on haproxy??? Can you define low ? you should test your server without any openstack stuff on it, then apply the same test with openstack installation. There may be some negative impacts because of software installed by neutron (mainly iptables). Baptiste On Wed, Feb 4, 2015 at 6:29 PM, Varun Lodaya varun_lod...@symantec.com wrote: Thanks Baptiste. I will try that tool. I worked with ab and was seeing really low results. But let me give httpress a shot :) Thanks, Varun On 2/3/15, 7:01 PM, Baptiste bed...@gmail.com wrote: On Wed, Feb 4, 2015 at 1:58 AM, Varun Lodaya varun_lod...@symantec.com wrote: Hi, We were trying to use haproxy as our LBaaS solution on the overlay. Has anybody done some baseline benchmarking with LBaaSv1 haproxy solution? Also, any recommended tools which we could use to do that? Thanks, Varun _ _ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev Hi Varun, large subject :) any injector could do the trick. I usually use inject (from HAProxy's author) and httpress. They can hammer a single URL, but if the purpose is to measure HAProxy's performance, then this is more than enough. Baptiste __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
Excerpts from Joshua Harlow's message of 2015-02-04 13:24:20 -0800: How interesting, Why are people using galera if it behaves like this? :-/

Note that any true MVCC database will roll back transactions on conflicts. One must always have a deadlock detection algorithm of some kind. Galera behaves like this because it is enormously costly to be synchronous at all times for everything. So it is synchronous when you want it to be, and async when you don't. Note that it's likely NDB (aka MySQL Cluster) would work fairly well for OpenStack's workloads, and does not suffer from this. However, it requires low latency high bandwidth links between all nodes (infiniband recommended) or it will just plain suck. So Galera is a cheaper, easier to tune and reason about option.

Are the people that are using it know/aware that this happens? :-/

I think the problem really is that it is somewhat de facto, and used without being tested. The gate doesn't set up a three-node Galera db and test that OpenStack works right. Also it is inherently a race condition, and thus will be a hard one to test. That's where having knowledge of it and taking time to engineer a solution that makes sense is really the best course I can think of.

__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
On 5 February 2015 at 03:33, Monty Taylor mord...@inaugust.com wrote: On 02/04/2015 06:57 AM, Daniel P. Berrange wrote: to manage VMs on a laptop - you're going to use virtualbox or virt-manager. You're going to use nova-compute to manage compute hosts in a cloud - and in almost all circumstances the only thing that's going to be running on your compute hosts is going to be nova-compute.

Actually we know from user / operator stories at summits that many groups run *much more* than nova-compute on their compute hosts. Specifically the all-in-one scale-as-a-unit architecture is quite popular. That has all the components of cinder/swift/keystone/nova/neutron all running on all machines, and the cluster is scaled by adding just another identical machine. I suspect it's actually really quite common outside of a) paranoid (but not necessarily wrong :)) and b) ultra-scale environments.

... I'm with Daniel - (4) I think that ultimately we need to ditch rootwrap and provide a proper privilege-separated, formal RPC mechanism for each project. eg instead of having a rootwrap command, or rootwrap server attempting to validate safety of qemu-img create -f qcow2 /var/lib/nova/instances/instance1/disk.qcow2 we should have a nova-compute-worker daemon running as root, that accepts an RPC command from nova-compute running unprivileged. eg CreateImage(instance0001, qcow2, disk.qcow) This immediately makes it trivial to validate that we're not trying to trick qemu-img into overwriting some key system file. This is certainly a lot more work than trying to patch up rootwrap, but it would provide a level of security that rootwrap can never achieve IMHO.

I think a local root daemon is a solid idea, there's lots and lots of prior art on this, from the bazillion dbus daemons we have nowadays to the Android isolation between apps. So count me in too on design and speccing for this. I realise that for some cases rootwrap was absolutely a performance and stability issue, but daemon mode should address that - so I think this is a longer term project, giving us a significant step up in security.

-Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud

__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
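To make the shape of that proposal concrete, here is a rough sketch of the kind of handler a nova-compute-worker daemon could expose; the daemon, the RPC transport and the names are invented for illustration and are not an existing Nova interface:

    import os
    import subprocess

    INSTANCES_DIR = "/var/lib/nova/instances"  # the only tree the daemon will touch

    def create_image(instance_id, disk_format, filename, size_gb):
        # Handler for a CreateImage RPC call, running inside the root daemon.
        if disk_format not in ("raw", "qcow2"):
            raise ValueError("unsupported format: %s" % disk_format)
        if "/" in instance_id or "/" in filename or filename.startswith("."):
            raise ValueError("bad instance id or filename")
        # The daemon, not the unprivileged caller, composes the final path,
        # so the caller can never point qemu-img at an arbitrary system file.
        path = os.path.join(INSTANCES_DIR, instance_id, filename)
        subprocess.check_call(
            ["qemu-img", "create", "-f", disk_format, path, "%dG" % size_gb])

The validation here is semantic (instance, format, size) rather than an attempt to pattern-match a full command line, which is exactly the property rootwrap filters struggle to provide.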
Re: [openstack-dev] [oslo][nova][cinder] removing request_utils from oslo-incubator
Doug, So the ball is in the Nova core(s) court? -- dims On Wed, Feb 4, 2015 at 12:16 PM, Doug Hellmann d...@doughellmann.com wrote: About 12 hours ago in #openstack-oslo ankit_ag asked about the request_utils module that was removed from oslo-incubator and how to proceed to get it into nova. The module was deleted a few days ago [1] because nothing was actually using it and it appeared to be related to a nova blueprint [2], the spec for which was abandoned at the end of juno [3]. The one copy that had been synced into cinder wasn’t being used, and was also deleted [4] as part of this housekeeping work. As I said in the review, we removed the code from the incubator because it appeared to be a dead end. If that impression is incorrect, we should get the spec and blueprint resurrected (probably as a cross-project spec rather than just in nova) and then we can consider the best course for proceeding with the implementation. Doug [1] https://review.openstack.org/#/c/150370/ [2] https://blueprints.launchpad.net/nova/+spec/log-request-id-mappings [3] https://review.openstack.org/#/c/106878/ [4] https://review.openstack.org/#/c/150369/ __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Davanum Srinivas :: https://twitter.com/dims __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
On 5 February 2015 at 10:24, Joshua Harlow harlo...@outlook.com wrote: How interesting, Why are people using galera if it behaves like this? :-/

Because it's actually fairly normal. In fact it's an instance of point 7 on https://wiki.openstack.org/wiki/BasicDesignTenets - one of our oldest wiki pages :). In more detail, consider what happens in full isolation when you have the A and B example given, but B starts its transaction before A.

B BEGIN
A BEGIN
A INSERT foo
A COMMIT
B SELECT foo - NULL

Data inserted by a transaction with a higher transaction id isn't visible to the older transaction (in an MVCC-style engine - there are other engines, but this is common).

When you add clustering in, many cluster DBs are not synchronous: - postgresql replication is asynchronous - both log shipping and slony. Neither is Galera. So reads will see older data than has been committed to the cluster. Writes will conflict *if* the write was dependent on data that was changed. If rather than clustering you add multiple DB's, you get the same sort of thing unless you explicitly wire in 2PC and a distributed lock manager and oh my... and we have multiple DB's (cinder, nova etc) but no such coordination between them.

Now, if we say that we can't accept eventual consistency, that we have to have atomic visibility of changes, then we've a -lot- of work - because of the multiple DB's thing. However, eventual consistency can cause confusion if it's not applied well, and it may be that this layer is the wrong layer to apply it at - that's certainly a possibility.

Are the people that are using it know/aware that this happens? :-/

I hope so :) -Rob -- Robert Collins rbtcoll...@hp.com Distinguished Technologist HP Converged Cloud

__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
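The ordering above can be reproduced with two ordinary connections. This sketch is not from the thread; it assumes a reachable MySQL/InnoDB server and uses WITH CONSISTENT SNAPSHOT so that B's read view is pinned at BEGIN time under REPEATABLE READ:

    import pymysql

    a = pymysql.connect(host="localhost", user="user", password="pw", db="test")
    b = pymysql.connect(host="localhost", user="user", password="pw", db="test")

    with a.cursor() as cur:
        cur.execute("CREATE TABLE IF NOT EXISTS foo (id INT PRIMARY KEY)")
        cur.execute("DELETE FROM foo")
    a.commit()

    # B begins first and pins its read view.
    with b.cursor() as cur:
        cur.execute("START TRANSACTION WITH CONSISTENT SNAPSHOT")

    # A begins later, inserts and commits.
    with a.cursor() as cur:
        cur.execute("START TRANSACTION")
        cur.execute("INSERT INTO foo VALUES (1)")
    a.commit()

    # B still sees the table as it was when its snapshot was taken.
    with b.cursor() as cur:
        cur.execute("SELECT * FROM foo")
        print(cur.fetchall())  # () -- A's committed row is not visible to B yet
    b.commit()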
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
Matthew Booth mbo...@redhat.com wrote: A: start transaction; A: insert into foo values(1) A: commit; B: select * from foo; -- May not contain the value we inserted above[3]

I’ve confirmed in my own testing that this is accurate. The wsrep_causal_reads flag does resolve this, and it is settable on a per-session basis. The attached script, adapted from the script given in the blog post, illustrates this.

Galera exposes a session variable which will fix this: wsrep_sync_wait (or wsrep_causal_reads on older mysql). However, this isn't the default. It presumably has a performance cost, but I don't know what it is, or how it scales with various workloads.

Well, consider our application is doing some @writer, then later it does some @reader. @reader has the contract that reads must be synchronous with any writes. Easy enough, @reader ensures that the connection it uses sets up “set wsrep_causal_reads=1”. The attached test case confirms this is feasible on a per-session (that is, a connection attached to the database) basis, so that the setting will not impact the cluster as a whole, and we can forego using it on those @async_reader calls where we don’t need it.

Because these are semantic issues, they aren't things which can be easily guarded with an if statement. We can't say: if galera: try: commit except: rewind time If we are to support this DB at all, we have to structure code in the first place to allow for its semantics.

I think the above example is referring to the “deadlock” issue, which we have solved with the “only write to one master” strategy. But overall, as you’re aware, we will no longer have the words “begin” or “commit” in our code. This takes place all within enginefacade. With this pattern, we will permanently end the need for any kind of repeated special patterns or boilerplate which occurs per-transaction on a backend-configurable basis. The enginefacade is where any such special patterns can take place, and for extended patterns such as setting up wsrep_causal_reads on @reader nodes or similar, we can implement a rudimentary plugin system for it such that we can have a “galera” backend to set up what’s needed.

The attached script does essentially what the one associated with http://www.percona.com/blog/2013/03/03/investigating-replication-latency-in-percona-xtradb-cluster/ does. It’s valid because without wsrep_causal_reads turned on for the connection, I get plenty of reads that lag behind the writes, so I’ve confirmed this is easily reproducible, and that with wsrep_causal_reads turned on, it vanishes. The script demonstrates that a single application can set up “wsrep_causal_reads” on a per-session basis (remember, by “session” we mean “a mysql session”), where it takes effect for that connection alone, not affecting the performance of other concurrent connections even in the same application. With the flag turned on, the script never reads a stale row. The script illustrates calls upon both the causal-reads connection and the non-causal-reads connection in a randomly alternating fashion.
I’m running it against a cluster of two virtual nodes on a laptop, so performance is very slow, but some sample output: 2015-02-04 15:49:27,131 100 runs 2015-02-04 15:49:27,754 w/ non-causal reads, got row 763 val is 9499, retries 0 2015-02-04 15:49:27,760 w/ non-causal reads, got row 763 val is 9499, retries 1 2015-02-04 15:49:27,764 w/ non-causal reads, got row 763 val is 9499, retries 2 2015-02-04 15:49:27,772 w/ non-causal reads, got row 763 val is 9499, retries 3 2015-02-04 15:49:27,777 w/ non-causal reads, got row 763 val is 9499, retries 4 2015-02-04 15:49:30,985 200 runs 2015-02-04 15:49:37,579 300 runs 2015-02-04 15:49:42,396 400 runs 2015-02-04 15:49:48,240 w/ non-causal reads, got row 6544 val is 6766, retries 0 2015-02-04 15:49:48,255 w/ non-causal reads, got row 6544 val is 6766, retries 1 2015-02-04 15:49:48,276 w/ non-causal reads, got row 6544 val is 6766, retries 2 2015-02-04 15:49:49,336 500 runs 2015-02-04 15:49:56,433 600 runs 2015-02-04 15:50:05,801 700 runs 2015-02-04 15:50:08,802 w/ non-causal reads, got row 533 val is 834, retries 0 2015-02-04 15:50:10,849 800 runs 2015-02-04 15:50:14,834 900 runs 2015-02-04 15:50:15,445 w/ non-causal reads, got row 124 val is 3850, retries 0 2015-02-04 15:50:15,448 w/ non-causal reads, got row 124 val is 3850, retries 1 2015-02-04 15:50:18,515 1000 runs 2015-02-04 15:50:22,130 1100 runs 2015-02-04 15:50:26,301 1200 runs 2015-02-04 15:50:28,898 w/ non-causal reads, got row 1493 val is 8358, retries 0 2015-02-04 15:50:29,988 1300 runs 2015-02-04 15:50:33,736 1400 runs 2015-02-04 15:50:34,219 w/ non-causal reads, got row 9661 val is 2877, retries 0 2015-02-04 15:50:38,796 1500 runs 2015-02-04 15:50:42,844 1600 runs 2015-02-04 15:50:46,838 1700 runs 2015-02-04 15:50:51,049 1800 runs 2015-02-04 15:50:55,139 1900 runs 2015-02-04 15:50:59,632 2000 runs 2015-02-04 15:51:04,721 2100 runs 2015-02-04 15:51:10,670 2200 runs
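The attached script itself is not preserved in the archive; a rough reconstruction of the loop that produces output like the above (the table, node addresses and credentials are assumptions) would look something like this:

    import logging
    import random
    import pymysql

    logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
    log = logging.getLogger(__name__)

    # Write through one Galera node, read back through another.
    writer = pymysql.connect(host="node1", user="user", password="pw", db="test")
    reader = pymysql.connect(host="node2", user="user", password="pw", db="test")

    def run_once():
        row_id, val = random.randint(1, 10000), random.randint(1, 10000)
        with writer.cursor() as cur:
            cur.execute("REPLACE INTO t (id, val) VALUES (%s, %s)", (row_id, val))
        writer.commit()

        retries = 0
        while True:
            with reader.cursor() as cur:
                cur.execute("SELECT val FROM t WHERE id = %s", (row_id,))
                row = cur.fetchone()
            reader.commit()  # end the read snapshot so the next attempt sees new data
            if row and row[0] == val:
                return
            log.info("w/ non-causal reads, got row %s val is %s, retries %s",
                     row_id, row[0] if row else None, retries)
            retries += 1

    for run in range(1, 2201):
        run_once()
        if run % 100 == 0:
            log.info("%s runs", run)

Running the same loop with SET SESSION wsrep_causal_reads = 1 on the reader connection makes the retry log lines disappear.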
[openstack-dev] [QA] Meeting Thursday February 5th at 22:00 UTC
Just a quick reminder that the weekly OpenStack QA team IRC meeting will be tomorrow Thursday, February 5th at 22:00 UTC in the #openstack-meeting channel. The agenda for tomorrow's meeting can be found here: https://wiki.openstack.org/wiki/Meetings/QATeamMeeting Anyone is welcome to add an item to the agenda. To help people figure out what time 22:00 UTC is in other timezones tomorrow's meeting will be at: 17:00 EST 07:00 JST 08:30 ACDT 23:00 CET 16:00 CST 14:00 PST -Matt Treinish pgpgy54ExDtYT.pgp Description: PGP signature __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [cinder][nova] Cinder Brick pypi library?
On 2015-02-03 21:41:02 -0800 (-0800), Walter A. Boring IV wrote: [...] So the question I have is, does Nova have an interest in using the code in a pypi brick library? [...] It'll probably need a different/more specific name since brick is already taken on PyPI: https://pypi.python.org/pypi/brick -- Jeremy Stanley __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] django-openstack-auth and stable/icehouse
On 2015-02-04 12:03:14 +0100 (+0100), Alan Pevec wrote: [...] oslo.config==1.6.0 # git sha 99e530e django-openstack-auth==1.1.9 # git sha 2079383 [...] Clients are capped in stable/icehouse requirements but devstack in gate seems to be installing them from git master (note # git sha) Check that assumption. For example 99e530e is the git SHA tagged as 1.6.0 in oslo.config. This is output from `pbr freeze` rather than `pip freeze` and therefore reports this information PBR included in the EGG-INFO when the sdist/wheel was originally built. -- Jeremy Stanley __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Trove] No weekly Trove IRC meeting this week
Hello folks: Since we are having the Trove Mid-Cycle meetup this week (Feb 3-5), there will be no weekly Trove IRC meeting on Feb 4. We'll resume our weekly IRC meeting next week, on Feb 11th. Thanks, Nikhil __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
On 02/04/2015 07:59 PM, Angus Lees wrote: On Thu Feb 05 2015 at 9:02:49 AM Robert Collins robe...@robertcollins.net wrote: On 5 February 2015 at 10:24, Joshua Harlow harlo...@outlook.com wrote: How interesting, Why are people using galera if it behaves like this? :-/

Because it's actually fairly normal. In fact it's an instance of point 7 on https://wiki.openstack.org/wiki/BasicDesignTenets - one of our oldest wiki pages :). In more detail, consider what happens in full isolation when you have the A and B example given, but B starts its transaction before A. B BEGIN A BEGIN A INSERT foo A COMMIT B SELECT foo - NULL

Note that this still makes sense from each of A and B's individual view of the world. If I understood correctly, the big change with Galera that Matthew is highlighting is that read-after-write may not be consistent from the pov of a single thread.

No, this is not correct. There is nothing different about Galera here versus any asynchronously replicated database. A single thread, issuing statements in two entirely *separate sessions*, load-balanced across an entire set of database cluster nodes, may indeed see older data if the second session gets balanced to a slave node. Nothing has changed about this with Galera. The exact same patterns that you would use to ensure that you are able to read the data that you previously wrote can be used with Galera. Just have the thread start a transactional session and ensure all queries are executed in the context of that session. Done. Nothing about Galera changes anything here.

Not having read-after-write is *really* hard to code to (see for example x86 SMP cache coherency, C++ threading semantics, etc which all provide read-after-write for this reason). This is particularly true when the affected operations are hidden behind an ORM - it isn't clear what might involve a database call and sequencers (or logical clocks, etc) aren't made explicit in the API. I strongly suggest just enabling wsrep_causal_reads on all galera sessions, unless you can guarantee that the high-level task is purely read-only, and then moving on to something else ;) If we choose performance over correctness here then we're just signing up for lots of debugging of hard-to-reproduce race conditions, and the fixes are going to look like what wsrep_causal_reads does anyway. (Mind you, exposing sequencers at every API interaction would be awesome, and I look forward to a future framework and toolchain that makes that easy to do correctly)

IMHO, you all are reading WAY too much into this. The behaviour that Matthew is describing is the kind of thing that has been around for decades now with asynchronous slave replication. Applications have traditionally handled it by sending reads that can tolerate slave lag to a slave machine, and reads that cannot to the same machine that was written to. Galera doesn't change anything here. I'm really not sure what the fuss is about, frankly. I don't recommend mucking with wsrep_causal_reads if we don't have to. And, IMO, we don't have to muck with it at all.

Best, -jay

__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
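A minimal illustration of that point (the names and connection URL are assumed; this is not code from any OpenStack project): read-after-write is trivially consistent when the read goes through the same session, and therefore the same node, that performed the write:

    from sqlalchemy import create_engine, text
    from sqlalchemy.orm import sessionmaker

    engine = create_engine("mysql+pymysql://user:pass@writer-node/nova")
    Session = sessionmaker(bind=engine)

    session = Session()
    session.execute(text("INSERT INTO foo (id, val) VALUES (:id, :val)"),
                    {"id": 1, "val": 42})
    # Same session, same connection, same node: the row just written is
    # visible here no matter how far the other cluster nodes lag behind.
    row = session.execute(text("SELECT val FROM foo WHERE id = :id"),
                          {"id": 1}).fetchone()
    session.commit()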
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
On Wed, Feb 4, 2015 at 11:00 PM, Robert Collins robe...@robertcollins.net wrote: On 5 February 2015 at 10:24, Joshua Harlow harlo...@outlook.com wrote: How interesting, Why are people using galera if it behaves like this? :-/ Because its actually fairly normal. In fact its an instance of point 7 on https://wiki.openstack.org/wiki/BasicDesignTenets - one of our oldest wiki pages :). When I hear MySQL I don't exactly think of eventual consistency (#7), scalability (#1), horizontal scalability (#4), etc. For the past few months I have been advocating implementing an alternative to db/sqlalchemy, but of course it's a huge undertaking. NoSQL (or even distributed key-value stores) should be considered IMO. Just some food for thought :) -- *Avishay Traeger* *Storage RD* Mobile: +972 54 447 1475 E-mail: avis...@stratoscale.com Web http://www.stratoscale.com/ | Blog http://www.stratoscale.com/blog/ | Twitter https://twitter.com/Stratoscale | Google+ https://plus.google.com/u/1/b/108421603458396133912/108421603458396133912/posts | Linkedin https://www.linkedin.com/company/stratoscale __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova] memory reporting for huge pages
As part of https://blueprints.launchpad.net/nova/+spec/virt-driver-large-pages; we have introduced the ability to specify based on flavor/image that we want to use huge pages. Is there a way to query the number of huge pages available on each NUMA node of each compute node? I haven't been able to find one (short of querying the database directly) and it's proven somewhat frustrating. Currently we report the total amount of memory available, but when that can be broken up into several page sizes and multiple NUMA nodes per compute node it can be very difficult to determine whether a given flavor/image is bootable within the network, or to debug any issues that occur. Chris __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron][lbaas] Can entity calls be made to driver when entities get associated/disassociated with root entity?
Also please note that the interface still allows for adding shared support. It’s not just in the first implementation. Thanks, doug On Feb 4, 2015, at 5:15 PM, Phillip Toohill phillip.tooh...@rackspace.com wrote: Sharing/reusing pools is a planned future feature. We are currently trying to work towards getting something released and having shared pools would extend that timeline to not meet our expectations. From: Vijay Venkatachalam vijay.venkatacha...@citrix.com mailto:vijay.venkatacha...@citrix.com Reply-To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org mailto:openstack-dev@lists.openstack.org Date: Wednesday, February 4, 2015 12:19 PM To: doug...@parkside.io mailto:doug...@parkside.io doug...@parkside.io mailto:doug...@parkside.io, OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org mailto:openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [neutron][lbaas] Can entity calls be made to driver when entities get associated/disassociated with root entity? Thanks Doug. My apologies for the delayed reply. The change is merged, so replying here. It is a welcome change in one way, there is always a root entity now in perspective while creating any entity. Listener is created with loadbalancer and pool is created with listener. The problem itself is eliminated because there is no DEFERRED stage. But, this restricts pool in one listener. Basically reusing of a pools across listeners and loadbalancers is not possible now. The use case of creating both a HTTPS vip and HTTP vip for the same pool is lost. Basically, a user who will need that, should create 2 pools with the same members and manage them. Is that right? Thanks, Vijay V. From: Doug Wiegley [mailto:doug...@parksidesoftware.com mailto:doug...@parksidesoftware.com] Sent: Tuesday, February 3, 2015 10:03 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [neutron][lbaas] Can entity calls be made to driver when entities get associated/disassociated with root entity? I’d recommend taking a look at Brandon’s review: https://review.openstack.org/#/c/144834/ https://review.openstack.org/#/c/144834/ which aims to simplify exactly what you’re describing. Please leave feedback there. Thanks, doug On Tue, Feb 3, 2015 at 7:13 AM, Vijay Venkatachalam vijay.venkatacha...@citrix.com mailto:vijay.venkatacha...@citrix.com wrote: Hi: In OpenStack neutron lbaas implementation, when entities are created/updated by the user, they might not be associated with the root entity, which is loadbalancer. Since root entity has the driver information, the driver cannot be called by lbaas plugin during these operations by user. Such entities are set in DEFFERED status until the entity is associated with root entity. During this association operation (listener created with pool), the driver api is called for the current operation (listener create); and the driver is expected to perform the original operation (pool create) along with the current operation (listener create). This leads to complex handling at the driver, I think it will be better for the lbaas plugin to call the original operation (pool create) driver API in addition to the current operation (listener create) API during the association operation. That is the summary, please read on to understand the situation in detail. Let’s take the example of pool create in driver. a. A pool create operation will not translate to a pool create api in the driver. 
There is a pool create in the driver API but that is never called today. b. When a listener is created with loadbalancer and pool, the driver’s listener create api is called and the driver is expected to create both pool and listener. c. When a listener is first created without loadbalancer but with a pool, the call does not reach driver. Later when the listener is updated with loadbalancer id, the drivers listener update API is called and the driver is expected to create both pool and listener. d. When a listener configured with pool and loadbalancer is updated with new pool id, the driver’s listener update api is called. The driver is expected to delete the original pool that was associated, create the new pool and also update the listener As you can see this is leading to a quite a bit of handling in the driver code. This makes driver code complex. How about handling this logic in lbaas plugin and it can call the “natural” functions that were deferred. Whenever an entity is going from a DEFERRED to ACTIVE/CREATE status (through whichever workflow) the plugin can call the CREATE pool function of the driver. Whenever an entity is going from an ACTIVE/CREATED to DEFERRED status (through whichever
Re: [openstack-dev] [horizon] JavaScript docs?
I'd recommend taking a look at the dgeni project which is a formal breakout of the code that supports ngdoc into a usable library. https://github.com/angular/dgeni Eric - Original Message - From: Monty Taylor mord...@inaugust.com To: openstack-dev@lists.openstack.org Sent: Wednesday, February 4, 2015 4:00:17 PM Subject: Re: [openstack-dev] [horizon] JavaScript docs? On 02/04/2015 12:48 PM, Michael Krotscheck wrote: I agree. StoryBoard's storyboard-webclient project has a lot of existing code already that's pretty well documented, but without knowing what documentation system we were going to settle on we never put any rule enforcement in place. If someone wants to take a stab at putting together a javascript docs build, that project would provide a good test bed that will let you test out the tools without having to also make them dance with python/sphinx at the same time. I.E. I have a bunch of javascript that you can hack on, and the domain knowledge of the Infra JS Build tools. I'd be happy to support this effort. +100 Michael On Wed Feb 04 2015 at 9:09:22 AM Thai Q Tran tqt...@us.ibm.com wrote: As we're moving toward Angular, might make sense for us to adopt ngdoc as well. -Matthew Farina m...@mattfarina.com wrote: - To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org From: Matthew Farina m...@mattfarina.com Date: 02/04/2015 05:42AM Subject: [openstack-dev] [horizon] JavaScript docs? In python we have a style to document methods, classes, and so forth. But, I don't see any guidance on how JavaScript should be documented. I was looking for something like jsdoc or ngdoc (an extension of jsdoc). Is there any guidance on how JavaScript should be documented? For anyone who doesn't know, Angular uses ngdoc (an extension to the commonly used jsdoc) which is written up at https://github.com/angular/angular.js/wiki/Writing-AngularJS-Documentation . __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [api][nova] Openstack HTTP error codes
Duncan Thomas [mailto:duncan.tho...@gmail.com] on Wednesday, February 04, 2015 8:34 AM wrote: The downside of numbers rather than camel-case text is that they are less likely to stick in the memory of regular users. Not a huge think, but a reduction in usability, I think. On the other hand they might lead to less guessing about the error with insufficient info, I suppose. To make the global registry easier, we can just use a per-service prefix, and then keep the error catalogue in the service code repo, pulling them into some sort of release periodically [Rockyg] In discussions at the summit about assigning error codes, we determined it would be pretty straightforward to build a tool that could be called when a new code was needed and it would both assign an unused code and insert the error summary for the code in the DB it would keep to ensure uniqueness. If you didn’t provide a summary, it wouldn’t spit out an error code;-) Simple little tool that could be in oslo, or some cross-project code location. --Rocky On 3 February 2015 at 03:24, Sean Dague s...@dague.netmailto:s...@dague.net wrote: On 02/02/2015 05:35 PM, Jay Pipes wrote: On 01/29/2015 12:41 PM, Sean Dague wrote: Correct. This actually came up at the Nova mid cycle in a side conversation with Ironic and Neutron folks. HTTP error codes are not sufficiently granular to describe what happens when a REST service goes wrong, especially if it goes wrong in a way that would let the client do something other than blindly try the same request, or fail. Having a standard json error payload would be really nice. { fault: ComputeFeatureUnsupportedOnInstanceType, messsage: This compute feature is not supported on this kind of instance type. If you need this feature please use a different instance type. See your cloud provider for options. } That would let us surface more specific errors. snip Standardization here from the API WG would be really great. What about having a separate HTTP header that indicates the OpenStack Error Code, along with a generated URI for finding more information about the error? Something like: X-OpenStack-Error-Code: 1234 X-OpenStack-Error-Help-URI: http://errors.openstack.org/1234 That way is completely backwards compatible (since we wouldn't be changing response payloads) and we could handle i18n entirely via the HTTP help service running on errors.openstack.orghttp://errors.openstack.org. That could definitely be implemented in the short term, but if we're talking about API WG long term evolution, I'm not sure why a standard error payload body wouldn't be better. The if we are going to having global codes that are just numbers, we'll also need a global naming registry. Which isn't a bad thing, just someone will need to allocate the numbers in a separate global repo across all projects. -Sean -- Sean Dague http://dague.net __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribehttp://openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Duncan Thomas __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
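A sketch of the registry helper Rocky describes -- entirely hypothetical, since no such tool exists in oslo today, and the file name and code format are assumptions -- which refuses to allocate a code without a summary and keeps the per-service catalogue unique:

    import json

    REGISTRY_FILE = "error_codes.json"  # kept in the service's code repo

    def allocate_code(prefix, summary):
        if not summary.strip():
            raise ValueError("an error summary is required to allocate a code")
        try:
            with open(REGISTRY_FILE) as f:
                registry = json.load(f)
        except IOError:
            registry = {}
        used = [int(code.split("-", 1)[1])
                for code in registry if code.startswith(prefix + "-")]
        new_code = "%s-%04d" % (prefix, max(used) + 1 if used else 1)
        registry[new_code] = summary
        with open(REGISTRY_FILE, "w") as f:
            json.dump(registry, f, indent=2, sort_keys=True)
        return new_code

    # allocate_code("COMPUTE", "Feature not supported on this instance type")
    # -> "COMPUTE-0001"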
Re: [openstack-dev] [oslo][nova][cinder] removing request_utils from oslo-incubator
Doug, thanks for responding so quickly to this. I had flagged it and was searching for just this email. Ankit, thanks for raising the issue. It *is* key. I agree with Doug that this should be a cross-project spec. Right now, Glance is moving in this direction, but as many have seen on the mailing list, request-id mappings are getting more discussion and are a hot request from operators. It would be great if Ankit could write the cross-project spec by taking his nova work and genericizing it to be useable by multiple projects. My biggest question for a cross-project spec like this is how do we get the attention and eyeballs of the right people in the individual projects to look at and comment on this? Perhaps we can start with Oslo liaisons? Any other ideas? We are moving to introduce more and more cross-project needs as we try to make OpenStack more useable to both developers and users. --Rocky -Original Message- From: Doug Hellmann [mailto:d...@doughellmann.com] Sent: Wednesday, February 04, 2015 9:17 AM To: OpenStack Development Mailing List (not for usage questions) Subject: [openstack-dev] [oslo][nova][cinder] removing request_utils from oslo-incubator About 12 hours ago in #openstack-oslo ankit_ag asked about the request_utils module that was removed from oslo-incubator and how to proceed to get it into nova. The module was deleted a few days ago [1] because nothing was actually using it and it appeared to be related to a nova blueprint [2], the spec for which was abandoned at the end of juno [3]. The one copy that had been synced into cinder wasn’t being used, and was also deleted [4] as part of this housekeeping work. As I said in the review, we removed the code from the incubator because it appeared to be a dead end. If that impression is incorrect, we should get the spec and blueprint resurrected (probably as a cross-project spec rather than just in nova) and then we can consider the best course for proceeding with the implementation. Doug [1] https://review.openstack.org/#/c/150370/ [2] https://blueprints.launchpad.net/nova/+spec/log-request-id-mappings [3] https://review.openstack.org/#/c/106878/ [4] https://review.openstack.org/#/c/150369/ __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Nominating Melanie Witt for python-novaclient-core
Its been a week, so I have now added Melanie to this group. Welcome aboard! Michael On Wed, Jan 28, 2015 at 9:41 AM, Michael Still mi...@stillhq.com wrote: Greetings, I would like to nominate Melanie Witt for the python-novaclient-core team. (What is python-novaclient-core? Its a new group which will contain all of nova-core as well as anyone else we think should have core reviewer powers on just the python-novaclient code). Melanie has been involved with nova for a long time now. She does solid reviews in python-novaclient, and at least two current nova-cores have suggested her as ready for core review powers on that repository. Please respond with +1s or any concerns. References: https://review.openstack.org/#/q/project:openstack/python-novaclient+reviewer:%22melanie+witt+%253Cmelwitt%2540yahoo-inc.com%253E%22,n,z As a reminder, we use the voting process outlined at https://wiki.openstack.org/wiki/Nova/CoreTeam to add members to our core team. Thanks, Michael -- Rackspace Australia -- Rackspace Australia __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron] [lbaas] LBaaS Haproxy performance benchmarking
+1 for Tsung! From: Adam Harwell adam.harw...@rackspace.commailto:adam.harw...@rackspace.com Reply-To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.orgmailto:openstack-dev@lists.openstack.org Date: Wednesday, February 4, 2015 9:25 AM To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.orgmailto:openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [neutron] [lbaas] LBaaS Haproxy performance benchmarking At Rackspace we have been working on automated testing with Ansible and Tsung, but I don’t know if that code ever made it to a public repository… We found Tsung to be very useful for parallel testing though! :) --Adam https://keybase.io/rm_you From: Varun Lodaya varun_lod...@symantec.commailto:varun_lod...@symantec.com Reply-To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.orgmailto:openstack-dev@lists.openstack.org Date: Tuesday, February 3, 2015 at 6:58 PM To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.orgmailto:openstack-dev@lists.openstack.org Subject: [openstack-dev] [neutron] [lbaas] LBaaS Haproxy performance benchmarking Hi, We were trying to use haproxy as our LBaaS solution on the overlay. Has anybody done some baseline benchmarking with LBaaSv1 haproxy solution? Also, any recommended tools which we could use to do that? Thanks, Varun __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [neutron][lbaas] Can entity calls be made to driver when entities get associated/disassociated with root entity?
Sharing/reusing pools is a planned future feature. We are currently trying to work towards getting something released and having shared pools would extend that timeline to not meet our expectations. From: Vijay Venkatachalam vijay.venkatacha...@citrix.commailto:vijay.venkatacha...@citrix.com Reply-To: OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.orgmailto:openstack-dev@lists.openstack.org Date: Wednesday, February 4, 2015 12:19 PM To: doug...@parkside.iomailto:doug...@parkside.io doug...@parkside.iomailto:doug...@parkside.io, OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.orgmailto:openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [neutron][lbaas] Can entity calls be made to driver when entities get associated/disassociated with root entity? Thanks Doug. My apologies for the delayed reply. The change is merged, so replying here. It is a welcome change in one way, there is always a root entity now in perspective while creating any entity. Listener is created with loadbalancer and pool is created with listener. The problem itself is eliminated because there is no DEFERRED stage. But, this restricts pool in one listener. Basically reusing of a pools across listeners and loadbalancers is not possible now. The use case of creating both a HTTPS vip and HTTP vip for the same pool is lost. Basically, a user who will need that, should create 2 pools with the same members and manage them. Is that right? Thanks, Vijay V. From: Doug Wiegley [mailto:doug...@parksidesoftware.com] Sent: Tuesday, February 3, 2015 10:03 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [neutron][lbaas] Can entity calls be made to driver when entities get associated/disassociated with root entity? I’d recommend taking a look at Brandon’s review: https://review.openstack.org/#/c/144834/ which aims to simplify exactly what you’re describing. Please leave feedback there. Thanks, doug On Tue, Feb 3, 2015 at 7:13 AM, Vijay Venkatachalam vijay.venkatacha...@citrix.commailto:vijay.venkatacha...@citrix.com wrote: Hi: In OpenStack neutron lbaas implementation, when entities are created/updated by the user, they might not be associated with the root entity, which is loadbalancer. Since root entity has the driver information, the driver cannot be called by lbaas plugin during these operations by user. Such entities are set in DEFFERED status until the entity is associated with root entity. During this association operation (listener created with pool), the driver api is called for the current operation (listener create); and the driver is expected to perform the original operation (pool create) along with the current operation (listener create). This leads to complex handling at the driver, I think it will be better for the lbaas plugin to call the original operation (pool create) driver API in addition to the current operation (listener create) API during the association operation. That is the summary, please read on to understand the situation in detail. Let’s take the example of pool create in driver. a. A pool create operation will not translate to a pool create api in the driver. There is a pool create in the driver API but that is never called today. b. When a listener is created with loadbalancer and pool, the driver’s listener create api is called and the driver is expected to create both pool and listener. c. 
When a listener is first created without loadbalancer but with a pool, the call does not reach driver. Later when the listener is updated with loadbalancer id, the drivers listener update API is called and the driver is expected to create both pool and listener. d. When a listener configured with pool and loadbalancer is updated with new pool id, the driver’s listener update api is called. The driver is expected to delete the original pool that was associated, create the new pool and also update the listener As you can see this is leading to a quite a bit of handling in the driver code. This makes driver code complex. How about handling this logic in lbaas plugin and it can call the “natural” functions that were deferred. Whenever an entity is going from a DEFERRED to ACTIVE/CREATE status (through whichever workflow) the plugin can call the CREATE pool function of the driver. Whenever an entity is going from an ACTIVE/CREATED to DEFERRED status (through whichever workflow) the plugin can call the DELETE pool function of the driver. Thanks, Vijay V. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribehttp://openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
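To make that proposal concrete, a hedged sketch of the plugin-side transition handling described above; the method and status names are illustrative, not the actual LBaaS plugin API:

    DEFERRED = "DEFERRED"
    REALIZED = ("PENDING_CREATE", "ACTIVE")

    def handle_pool_transition(driver, context, pool, old_status, new_status):
        # Called by the LBaaS plugin whenever an association changes a
        # pool's status, so the driver only ever sees its "natural" calls.
        if old_status == DEFERRED and new_status in REALIZED:
            # The pool just became reachable from a root loadbalancer.
            driver.pool.create(context, pool)
        elif old_status in REALIZED and new_status == DEFERRED:
            # The pool was detached from its root entity; tear down backend state.
            driver.pool.delete(context, pool)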
Re: [openstack-dev] [Manila]Questions about using not handle share-servers drivers with Flat network
Hi, Thanks very much for the reply. Really sorry for the late response. In your case if you have a driver that doesn't handle share servers, then the network is complete out of scope for Manila. Drivers that don't manage share servers have neither flat not segment networking in Manila, they have NO networking. ð So, you mean there is no way I can work as I want, right ? But, is it possible to enable that ? If you noticed, we're trying to enable HDFS in manila: https://blueprints.launchpad.net/manila/+spec/hdfs-driver That's the main reason I want to emphasize on my driver do not handle share server. Big data users want to have a unify storage when they're working in cloud. Because instances are not reliable resource in cloud. Put data together with instances while making sure data's reliability would be complicated. The biggest difference between HDFS and all currently backends manila support is: HDFS has different control path and data path. For a HDFS cluster, it has one name node and multi data nodes. Client would talk to name node first, get data location and then talk to data nodes to get data. The Export location represent name node information only. ð We can't put any share-server in the middle of user instances and HDFS cluster. But, it do possible to let the HDFS work in the cloud with restrictions ð It can only support one share-network at one time. This actually restrict the ability of the manila backend, no multi-tenancy at all. We want to use HDFS like this: Connect users' share-network and HDFS-cluster-network by router. Similar to currently generic driver's behavior when connect_share_server_to_tenant_network = False while no share-server exist. Access control is achieved based on its own user. We can add some access control based on keystone users and keystone tenants to avoid bad users to connect to HDFS cluster at very beginning if that's possible. Thanks. -chen From: Ben Swartzlander [mailto:b...@swartzlander.org] Sent: Wednesday, January 28, 2015 12:35 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Manila]Questions about using not handle share-servers drivers with Flat network On 01/27/2015 06:39 PM, Li, Chen wrote: Hi list, I have some questions. Hope can get help from you guys. Manila has two driver modes. For handle share server drivers, the share-network is easy to understand. For not handle share-servers drivers, manila request admin to do everything before manila-share service start, and when the service is running, it only serves requests do not contain share-network. I kept confusing about which/why users would create shares without share-network. Although when working with this kind of driver, the manila-share service can only support one specific network restricted by the backend. But users do not know backends, they should always want to create shares with share-network, because users always want to connect shares to their instances that lives in the cloud with share-network. Then I have been told that these shares created without share-network are assumed to be used on a public network. The public network do make a clear explanation about why share-network not matter anymore. But, when I build my cloud with Manila, what I want to do is let backends to serve my Flat network. I want to have 2 backends in Manila, both of them are not handle share-servers drivers. I set 192.168.6.253 for backend1 and create a Flat network in neutron with subnet 192.168.6.0/24 with IP range from 192.168.6.1-192.168.6.252. 
I set 192.168.7.253 for backend2 and create a Flat network in neutron with subnet 192.168.7.0/24 with IP range from 192.168.7.1-192.168.7.252. The reason I build my cloud like this is because I want to do some performance tests on both backends, to compare the two backends. I think it should not hard to do it, but manila do not support that currently. So, is this the behavior should work ? Or anything else I missed ? Manila needs to support backends that can create share servers and backends that can't create share servers. We do this because of the reality that different storage systems have different capabilities and designs, and we don't want to block anything that can reasonably described as a shared filesystem from working with Manila. For the purposes of Manila, a share server is a logically isolated instance of a file share server, with its own IP address, routing tables, security domain, and name services. Manila only tracks the existence of share servers that were created as the result of a share-create operation. Share servers created by manila have IP addresses assigned by Manila, and can be expected to be deleted by Manila sometime after the last share on that share server is deleted. Backends that simply create shares on a preexsting storage systems are not referred to as share servers and networking concerns for those systems are out of
Re: [openstack-dev] [Manila]Questions about using not handle share-servers drivers with Flat network
On 02/04/2015 07:01 PM, Li, Chen wrote: Hi, Thanks very much for the reply. Really sorry for the late response. In your case if you have a driver that doesn't handle share servers, then the network is complete out of scope for Manila. Drivers that don't manage share servers have neither flat not segment networking in Manila, they have NO networking. ðSo, you mean there is no way I can work as I want, right ? But, is it possible to enable that ? I think you missed the point when I say networking is out of scope for non-share-server-handling drivers. All that that means is that Manila will not be involved with the management of the network resources for the backend or the network paths between the clients and the server. The reason for this is to allow administrators to configure the network however they like. Arbitrarily complicated network designs are possible when you use a driver with driver_manages_share_servers=False because you're free to do what you want and Manila doesn't care. I think people sometimes forget that Manila doesn't want to be involved with network management. We only touch the network where it's unavoidable, such as when we have to create new virtual machines that need to be reachable over the network from existing VMs. There already exist many other great tools inside and outside of OpenStack for doing network management and we want to avoid duplicating or overlapping with their functionality as much as possible. If you noticed, we’re trying to enable HDFS in manila: https://blueprints.launchpad.net/manila/+spec/hdfs-driver That’s the main reason I want to emphasize on my driver do not handle share server. Big data users want to have a unify storage when they’re working in cloud. Because instances are not reliable resource in cloud. Put data together with instances while making sure data’s reliability would be complicated. The biggest difference between HDFS and all currently backends manila support is: HDFS has different control path and data path. For a HDFS cluster, it has one name node and multi data nodes. Client would talk to “name node” first, get data location and then talk to data nodes to get data. The “Export location” represent “name node” information only. ðWe can’t put any “share-server” in the middle of user instances and HDFS cluster. But, it do possible to let the HDFS work in the cloud with restrictions ðIt can only support one share-network at one time. This actually restrict the ability of the manila backend, no multi-tenancy at all. We want to use HDFS like this: Connect users’ “share-network” and “HDFS-cluster-network” by router. Similar to currently generic driver’s behavior when “connect_share_server_to_tenant_network = False” while no “share-server” exist. Access control is achieved based on its own user. We can add some access control based on keystone users and keystone tenants to avoid bad users to connect to HDFS cluster at very beginning if that’s possible. Thanks. -chen *From:*Ben Swartzlander [mailto:b...@swartzlander.org] *Sent:* Wednesday, January 28, 2015 12:35 PM *To:* OpenStack Development Mailing List (not for usage questions) *Subject:* Re: [openstack-dev] [Manila]Questions about using not handle share-servers drivers with Flat network On 01/27/2015 06:39 PM, Li, Chen wrote: Hi list, I have some questions. Hope can get help from you guys. Manila has two driver modes. For handle share server drivers, the share-network is easy to understand. 
For not handle share-servers drivers, manila request admin to do everything before manila-share service start, and when the service is running, it only serves requests do not contain share-network. I kept confusing about which/why users would create shares without share-network. Although when working with this kind of driver, the manila-share service can only support one specific network restricted by the backend. But “users” do not know backends, they should always want to create shares with share-network, because users always want to connect shares to their instances that lives in the cloud with “share-network”. Then I have been told that these shares created without share-network are assumed to be used on a public network. The public network do make a clear explanation about why share-network not matter anymore. But, when I build my cloud with Manila, what I want to do is let backends to serve my “Flat network”. I want to have 2 backends in Manila, both of them are “*/not/* handle share-servers drivers”. I set 192.168.6.253 for backend1 and create a “Flat network” in neutron with subnet 192.168.6.0/24 with IP range from 192.168.6.1-192.168.6.252. I set 192.168.7.253 for backend2 and create a “Flat network” in neutron with subnet 192.168.7.0/24 with IP range from 192.168.7.1-192.168.7.252.
Re: [openstack-dev] [neutron] high dhcp lease times in neutron deployments considered harmful (or not???)
On Wed Feb 04 2015 at 8:02:04 PM Kevin Benton blak...@gmail.com wrote: I proposed an alternative to adjusting the lease time early on in the thread. By specifying the renewal time (DHCP option 58), we can have the benefits of a long lease time (resiliency to long DHCP server outages) while having a frequent renewal interval to check for IP changes. I favored this approach because it only required a patch to dnsmasq to allow that option to be set and a patch to our agent to set that option, both of which are pretty straightforward. Yep, I should have said +1 to this in my other post. Simple coding change that is strictly better than the current situation (other than a slight increase in DHCP request traffic). - Gus __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
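For context, DHCP option 58 is T1, the renewal time. Assuming a dnsmasq patched to accept that option (as Kevin notes, stock dnsmasq did not allow it at the time), the per-network opts file written by the DHCP agent could carry an entry along these lines; the tag and values are purely illustrative:

    # /var/lib/neutron/dhcp/<network-id>/opts (illustrative)
    # keep the long lease from dhcp_lease_duration, but ask clients
    # to renew every 300 seconds (option 58 = T1, the renewal time)
    tag:subnet-1,58,300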
[openstack-dev] [Glance] Meeting Thursday February 5th at 1500 UTC
Just a reminder to everyone that Glance’s weekly meeting is on February 5th at 1500 UTC. The agenda can be found here: https://etherpad.openstack.org/p/glance-team-meeting-agenda We meet in #openstack-meeting-4 Anyone is welcome to add an agenda item and participation is welcome. Cheers, Ian __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] OpenStack w/ Chef
Question: is there a tool that does the same sort of things as Mcollective but is specific to Chef? I know Mcollective is compatible with Chef, but I'd like to research tooling native to the Chef platform if possible... (if that makes any sense) *Adam Lawson* AQORN, Inc. 427 North Tatnall Street Ste. 58461 Wilmington, Delaware 19801-2230 Toll-free: (844) 4-AQORN-NOW ext. 101 International: +1 302-387-4660 Direct: +1 916-246-2072 __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera
On Thu Feb 05 2015 at 9:02:49 AM Robert Collins robe...@robertcollins.net wrote: On 5 February 2015 at 10:24, Joshua Harlow harlo...@outlook.com wrote: How interesting, Why are people using galera if it behaves like this? :-/ Because it's actually fairly normal. In fact it's an instance of point 7 on https://wiki.openstack.org/wiki/BasicDesignTenets - one of our oldest wiki pages :). In more detail, consider what happens in full isolation when you have the A and B example given, but B starts its transaction before A. B BEGIN A BEGIN A INSERT foo A COMMIT B SELECT foo -> NULL Note that this still makes sense from each of A and B's individual view of the world. If I understood correctly, the big change with Galera that Matthew is highlighting is that read-after-write may not be consistent from the pov of a single thread. Not having read-after-write is *really* hard to code against (see for example x86 SMP cache coherency, C++ threading semantics, etc which all provide read-after-write for this reason). This is particularly true when the affected operations are hidden behind an ORM - it isn't clear what might involve a database call and sequencers (or logical clocks, etc) aren't made explicit in the API. I strongly suggest just enabling wsrep_causal_reads on all galera sessions, unless you can guarantee that the high-level task is purely read-only, and then moving on to something else ;) If we choose performance over correctness here then we're just signing up for lots of debugging of hard to reproduce race conditions, and the fixes are going to look like what wsrep_causal_reads does anyway. (Mind you, exposing sequencers at every API interaction would be awesome, and I look forward to a future framework and toolchain that makes that easy to do correctly) - Gus __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
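For anyone who wants to try the suggested setting, it can be toggled per session; whether the variable is wsrep_causal_reads or the newer wsrep_sync_wait depends on the Galera release in use:

    -- make this session's reads wait until committed writes from other
    -- nodes have been applied locally (read-after-write across the cluster)
    SET SESSION wsrep_causal_reads = ON;   -- older Galera releases
    -- SET SESSION wsrep_sync_wait = 1;    -- newer releases (covers reads)
    SELECT * FROM foo;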
Re: [openstack-dev] [congress] sprinting towards Kilo2
Join the Congress team to push your code over the milestone line. The Congress team will be online from 9am to 5pm over the next two days. Reach out through the #congress IRC channel. We will start a google hangout and post the URL in the #congress channel. See you there! On Tuesday, February 3, 2015, sean roberts seanrobert...@gmail.com wrote: Over the last couple of meetings, we have discussed holding a hackathon this Thursday and Friday. You each have some code you are working on. Let’s each pick a 3-4 hour block of time to intensively collaborate. We can use the #congress IRC channel and google hangout. Reply to this thread, so we can allocate people’s time. ~ sean -- ~sean __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova] Nominating Melanie Witt for python-novaclient-core
On Feb 4, 2015, at 16:31, Michael Still mi...@stillhq.com wrote: Its been a week, so I have now added Melanie to this group. Welcome aboard! Thank you! I am honored by the nomination and all of your votes. It is my great pleasure to join python-novaclient-core. :) melanie (melwitt) signature.asc Description: Message signed with OpenPGP using GPGMail __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [python-novaclient][nova] future of --os-compute-api-version option and whole api versioning
First results is https://review.openstack.org/#/c/152569/ - if os-compute-api-version is not supplied don't send any header at all - it is probably worth doing a bit version parsing to see if it makes sense eg of format: r^([1-9]\d*)\.([1-9]\d*|0)$ or latest implemented - handle HTTPNotAcceptable if the user asked for a version which is not supported (can also get a badrequest if its badly formatted and got through the novaclient filter) Based on https://review.openstack.org/#/c/150641/ , HTTPNotFound (404) will be returned by API, so the current implementation of novaclient is not required any changes. - show the version header information returned Imo, API should return exception with proper message which already include this information. On Mon, Feb 2, 2015 at 2:02 PM, Andrey Kurilin akuri...@mirantis.com wrote: Thanks for the summary, I'll try to send first patch(maybe WIP) in few days. On Mon, Feb 2, 2015 at 1:43 PM, Christopher Yeoh cbky...@gmail.com wrote: On Sat, Jan 31, 2015 at 4:09 AM, Andrey Kurilin akuri...@mirantis.com wrote: Thanks for the answer. Can I help with implementation of novaclient part? Sure! Do you think its something you can get proposed into Gerrit by the end of the week or very soon after? Its the sort of timeframe we're looking for to get microversions enabled asap I guess just let me know if it turns out you don't have the time. So I think a short summary of what is needed is: - if os-compute-api-version is not supplied don't send any header at all - it is probably worth doing a bit version parsing to see if it makes sense eg of format: r^([1-9]\d*)\.([1-9]\d*|0)$ or latest - handle HTTPNotAcceptable if the user asked for a version which is not supported (can also get a badrequest if its badly formatted and got through the novaclient filter) - show the version header information returned Regards, Chris On Wed, Jan 28, 2015 at 11:50 AM, Christopher Yeoh cbky...@gmail.com wrote: On Fri, 23 Jan 2015 15:51:54 +0200 Andrey Kurilin akuri...@mirantis.com wrote: Hi everyone! After removing nova V3 API from novaclient[1], implementation of v1.1 client is used for v1.1, v2 and v3 [2]. Since we moving to micro versions, I wonder, do we need such mechanism of choosing api version(os-compute-api-version) or we can simply remove it, like in proposed change - [3]? If we remove it, how micro version should be selected? So since v3 was never officially released I think we can re-use os-compute-api-version for microversions which will map to the X-OpenStack-Compute-API-Version header. See here for details on what the header will look like. We need to also modify novaclient to handle errors when a version requested is not supported by the server. If the user does not specify a version number then we should not send anything at all. The server will run the default behaviour which for quite a while will just be v2.1 (functionally equivalent to v.2) http://specs.openstack.org/openstack/nova-specs/specs/kilo/approved/api-microversions.html [1] - https://review.openstack.org/#/c/138694 [2] - https://github.com/openstack/python-novaclient/blob/master/novaclient/client.py#L763-L769 [3] - https://review.openstack.org/#/c/149006 -- Best regards, Andrey Kurilin. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev -- Best regards, Andrey Kurilin. -- Best regards, Andrey Kurilin. 
__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
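A small sketch of the client-side validation described in the thread, using the regex quoted above; the function name and error handling are illustrative rather than the actual novaclient code:

    import re

    _VERSION_RE = re.compile(r"^([1-9]\d*)\.([1-9]\d*|0)$")

    def check_microversion(version):
        """Validate --os-compute-api-version before sending the header."""
        if version is None:
            return None  # send no header; the server uses its default
        if version == "latest" or _VERSION_RE.match(version):
            return version
        raise ValueError("Invalid microversion %r: expected 'X.Y' or 'latest'"
                         % version)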
[openstack-dev] [Murano] exception in deployment
Hi, Has anyone seen this exception in the recent revision? I just get the latest today. murano-engine outputs this exception when I deploy an environment. It fails to create a VM. I wonder if it's my environment or a murano defect. {packages: [{class_definitions: [io.murano.Environment, io.murano.resources.Network, io.murano.Application, io.murano.resources.WindowsInstance, io.murano.resources.HeatSWConfigInstance, io.murano.system.StatusReporter, io.murano.resources.LinuxInstance, io.murano.system.AgentListener, io.murano.resources.NeutronNetwork, io.murano.resources.LinuxMuranoInstance, io.murano.resources.HeatSWConfigLinuxInstance, io.murano.system.NetworkExplorer, io.murano.system.Agent, io.murano.SharedIp, io.murano.system.HeatStack, io.murano.system.InstanceNotifier, io.murano.resources.Instance, io.murano.system.SecurityGroupManager, io.murano.StackTrace, io.murano.Object, io.murano.Exception, io.murano.resources.LinuxUDInstance, io.murano.system.Resources], description: Core MuranoPL library\n, tags: [MuranoPL], updated: 2015-02-04T23:03:03, is_public: true, categories: [], name: Core library, author: murano.io, created: 2015-02-04T23:03:03, enabled: true, id: 5ebdb96fc6f542dca9a1af766ddbfa94, supplier: {}, fully_qualified_name: io.murano, type: Library, owner_id: }]} log_http_response /opt/stack/python-muranoclient/muranoclient/common/http.py:124 2015-02-04 17:37:50.950 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: Begin execution: io.murano.system.Resources.string (-7559542279869778927) called from File /tmp/murano-packages-cache/1aaac863-f712-4ec6-b7c3-683d3498b951/io.murano/Classes/resources/Instance.yaml, line 98:28 in method deploy of class io.murano.resources.Instance $.prepareUserData() _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:142 2015-02-04 17:37:50.950 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: End execution: io.murano.system.Resources.string (-7559542279869778927) _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:160 2015-02-04 17:37:50.952 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: Begin execution: io.murano.system.Resources.string (-7559542279869778927) called from File /tmp/murano-packages-cache/1aaac863-f712-4ec6-b7c3-683d3498b951/io.murano/Classes/resources/Instance.yaml, line 98:28 in method deploy of class io.murano.resources.Instance $.prepareUserData() _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:142 2015-02-04 17:37:50.952 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: End execution: io.murano.system.Resources.string (-7559542279869778927) _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:160 2015-02-04 17:37:50.953 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: Begin execution: io.murano.system.AgentListener.queueName (8227276669045428973) called from File /tmp/murano-packages-cache/1aaac863-f712-4ec6-b7c3-683d3498b951/io.murano/Classes/resources/Instance.yaml, line 98:28 in method deploy of class io.murano.resources.Instance $.prepareUserData() _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:142 2015-02-04 17:37:50.953 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: End execution: io.murano.system.AgentListener.queueName (8227276669045428973) _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:160 2015-02-04 17:37:50.954 28689 DEBUG murano.dsl.executor [-] 
cad324b38b064c6a92f8bdf55b01ce37: Begin execution: io.murano.system.Agent.queueName (-5211629396053631386) called from File /tmp/murano-packages-cache/1aaac863-f712-4ec6-b7c3-683d3498b951/io.murano/Classes/resources/Instance.yaml, line 98:28 in method deploy of class io.murano.resources.Instance $.prepareUserData() _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:142 2015-02-04 17:37:50.954 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: End execution: io.murano.system.Agent.queueName (-5211629396053631386) _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:160 2015-02-04 17:37:50.956 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: End execution: io.murano.resources.LinuxMuranoInstance.prepareUserData (-6721861659645843611) _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:160 2015-02-04 17:37:50.957 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: Begin execution: io.murano.system.Agent.prepare (-5222445134521330586) called from File /tmp/murano-packages-cache/1aaac863-f712-4ec6-b7c3-683d3498b951/io.murano.apps.linux.Telnet/Classes/telnet.yaml, line 32:9 in method deploy of class io.murano.apps.linux.Telnet $.instance.deploy() _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:142 Traceback (most recent
Re: [openstack-dev] [Neutron] fixed ip info shown for port even when dhcp is disabled
Hi John,Sure, this may not be in the scope of IPAM as originally proposed in [1]. Am just trying to see if there can be a solution for this scenario using IPAM. What I had in mind is also a centrally managed IPAM and not each VM allocating its own IP address. Let me clarify. It's not an uncommon requirement to use a commercial DHCP server in an enterprise/DC. I am referring to it as external DHCP server from Openstack's perspective.One scenario of deployment is when IPAM is done through Openstack and in which case, the MAC, IP, optionally segment can be programmed in the external DHCP server, assuming they have standard API's for that. Then, the VM will get the openstack assigned IP address from the DHCP server when it boots up. You also suggested this and sure, it's not a problem here. Now, another scenario and that's of interest is when IPAM itself is done directly using the commercial DHCP server for various reasons. I am just trying to see how this model will work (or can be extended to work) for this case. What's the role of pluggable IPAM [1] in this scenario? If at all there's a way to tell the external DHCP server, do your own allocation and return an IP address for me for this MAC, segment, then this IP address can be returned during the allocate_ip API by the pluggable IPAM? But, I am not sure, an external DHCP server will have this flexibility.Then, in such scenarios, the only way is to not allocate an IP address during create_port. When the VM comes up and sends a DHCP request the commercial DHCP server responds with an address. The pluggable IPAM then can use its own method, and when it discovers the IP address of the VM, it can call to update_port with the IP address.Any other way? [1] - http://specs.openstack.org/openstack/neutron-specs/specs/kilo/neutron-ipam.html Thanks,Paddu On Tuesday, February 3, 2015 8:57 AM, John Belamaric jbelama...@infoblox.com wrote: Hi Paddu, I think this is less an issue of the pluggable IPAM than it is the Neutron management layer, which requires an IP for a port, as far as I know. If the management layer is updated to allow a port to exist without a known IP, then an additional IP request type could be added to represent the placeholder you describing. However, doing so leaves IPAM out of the hands of Neutron and out of the hands of the external (presumably authoritative) IPAM system. This could lead to duplicate IP issues since each VM is deciding its own IP without any centralized coordination. I wouldn't recommend this approach to managing your IP space. John From: Padmanabhan Krishnan kpr...@yahoo.com Reply-To: Padmanabhan Krishnan kpr...@yahoo.com Date: Wednesday, January 28, 2015 at 4:58 PM To: John Belamaric jbelama...@infoblox.com, OpenStack Development Mailing List (not for usage questions) openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Neutron] fixed ip info shown for port even when dhcp is disabled Some follow up questions on this. In the specs, i see that during a create_port, there's provisions to query the external source by Pluggable IPAM to return the IP.This works fine for cases where the external source (say, DHCP server) can be queried for the IP address when a launch happens. Is there a provision to have the flexibility of a late IP assignment? I am thinking of cases, like the temporary unavailability of external IP source or lack of standard interfaces in which case data packet snooping is used to find the IP address of a VM after launch. 
Something similar to late binding of IP addresses.This means the create_port may not get the IP address from the pluggable IPAM. In that case, launch of a VM (or create_port) shouldn't fail. The Pluggable IPAM should have some provision to return something equivalent to unavailable during create_port and be able to do an update_port when the IP address becomes available. I don't see that option. Please correct me if I am wrong. Thanks,Paddu On Thursday, December 18, 2014 7:59 AM, Padmanabhan Krishnan kpr...@yahoo.com wrote: Hi John,Thanks for the pointers. I shall take a look and get back. Regards,Paddu On Thursday, December 18, 2014 6:23 AM, John Belamaric jbelama...@infoblox.com wrote: Hi Paddu, Take a look at what we are working on in Kilo [1] for external IPAM. While this does not address DHCP specifically, it does allow you to use an external source to allocate the IP that OpenStack uses, which may solve your problem. Another solution to your question is to invert the logic - you need to take the IP allocated by OpenStack and program the DHCP server to provide a fixed IP for that MAC. You may be interested in looking at this Etherpad [2] that Don Kehn put together gathering all the various DHCP blueprints and related info, and also at this BP [3] for including a DHCP relay so we can utilize external DHCP more easily. [1] https://blueprints.launchpad.net/neutron/+spec/neutron-ipam[2]
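A toy sketch of the late-binding flow being asked about: the IPAM driver hands back a placeholder at allocation time and pushes the real address via update_port once the external DHCP server has assigned one. Every class and method name here is hypothetical; this is not the interface proposed in the Kilo IPAM spec:

    # Hypothetical sketch only -- not the real pluggable IPAM driver API.
    PLACEHOLDER_IP = "0.0.0.0"  # stand-in until the external server answers

    class ExternalDhcpIpam(object):
        def allocate_ip(self, mac, segment_id):
            # The external DHCP server allocates lazily, so defer.
            return PLACEHOLDER_IP

        def on_address_learned(self, neutron, port_id, ip_address):
            # Called later, e.g. after snooping the DHCP ACK for this MAC,
            # using a python-neutronclient instance passed in by the caller.
            body = {"port": {"fixed_ips": [{"ip_address": ip_address}]}}
            neutron.update_port(port_id, body)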
Re: [openstack-dev] [Murano] exception in deployment
Hi Steven, You get this exception when Murano can't connect to the RabbitMQ used to communicate between engine and murano-agent on VM side. You need to configure connection to RabbitMQ twice, in regular way for oslo.messaging, and you need to configure connection to RabbitMQ in [rabbitmq] section. Please, check RabbitMQ configuration in configuration file (section [rabbitmq]). I would explain below why do we have two places to configure connection to RabbitMQ. Murano uses RabbitMQ for communication between components, specifically between: * API Engine * Engine Agent One of the recommended way of OpenStack deployment with Murano is to use two RabbitMQ installations: * OS Rabbit: used by OpenStack Murano * Murano Rabbit: used only by Murano OS Rabbit is used by Murano for communication between API Engine and it is same RabbitMQ instance that is used by rest of OpenStack components. Murano Rabbit is used by Murano for communication between Engine Agent. Murano Rabbit is isolated from the management network, but accessible from VMs. This deployment model is recommended due to security reasons - VMs spawned by Murano with murano-agent enabled use RabbitMQ for communication, and if it's same RabbitMQ instance (with same credentials/vhost) as one used by OpenStack, user may affect/compromise OpenStack cloud itself. On Thu, Feb 5, 2015 at 4:46 AM, Tran, Steven steven.tr...@hp.com wrote: Hi, Has anyone seen this exception in the recent revision? I just get the latest today. murano-engine outputs this exception when I deploy an environment. It fails to create a VM. I wonder if it’s my environment or a murano defect. {packages: [{class_definitions: [io.murano.Environment, io.murano.resources.Network, io.murano.Application, io.murano.resources.WindowsInstance, io.murano.resources.HeatSWConfigInstance, io.murano.system.StatusReporter, io.murano.resources.LinuxInstance, io.murano.system.AgentListener, io.murano.resources.NeutronNetwork, io.murano.resources.LinuxMuranoInstance, io.murano.resources.HeatSWConfigLinuxInstance, io.murano.system.NetworkExplorer, io.murano.system.Agent, io.murano.SharedIp, io.murano.system.HeatStack, io.murano.system.InstanceNotifier, io.murano.resources.Instance, io.murano.system.SecurityGroupManager, io.murano.StackTrace, io.murano.Object, io.murano.Exception, io.murano.resources.LinuxUDInstance, io.murano.system.Resources], description: Core MuranoPL library\n, tags: [MuranoPL], updated: 2015-02-04T23:03:03, is_public: true, categories: [], name: Core library, author: murano.io, created: 2015-02-04T23:03:03, enabled: true, id: 5ebdb96fc6f542dca9a1af766ddbfa94, supplier: {}, fully_qualified_name: io.murano, type: Library, owner_id: }]} log_http_response /opt/stack/python-muranoclient/muranoclient/common/http.py:124 2015-02-04 17:37:50.950 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: Begin execution: io.murano.system.Resources.string (-7559542279869778927) called from File /tmp/murano-packages-cache/1aaac863-f712-4ec6-b7c3-683d3498b951/io.murano/Classes/resources/Instance.yaml, line 98:28 in method deploy of class io.murano.resources.Instance $.prepareUserData() _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:142 2015-02-04 17:37:50.950 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: End execution: io.murano.system.Resources.string (-7559542279869778927) _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:160 2015-02-04 17:37:50.952 28689 DEBUG murano.dsl.executor [-] 
cad324b38b064c6a92f8bdf55b01ce37: Begin execution: io.murano.system.Resources.string (-7559542279869778927) called from File /tmp/murano-packages-cache/1aaac863-f712-4ec6-b7c3-683d3498b951/io.murano/Classes/resources/Instance.yaml, line 98:28 in method deploy of class io.murano.resources.Instance $.prepareUserData() _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:142 2015-02-04 17:37:50.952 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: End execution: io.murano.system.Resources.string (-7559542279869778927) _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:160 2015-02-04 17:37:50.953 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: Begin execution: io.murano.system.AgentListener.queueName (8227276669045428973) called from File /tmp/murano-packages-cache/1aaac863-f712-4ec6-b7c3-683d3498b951/io.murano/Classes/resources/Instance.yaml, line 98:28 in method deploy of class io.murano.resources.Instance $.prepareUserData() _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:142 2015-02-04 17:37:50.953 28689 DEBUG murano.dsl.executor [-] cad324b38b064c6a92f8bdf55b01ce37: End execution: io.murano.system.AgentListener.queueName (8227276669045428973) _invoke_method_implementation /opt/stack/murano/murano/dsl/executor.py:160
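Concretely, this means murano.conf carries two broker configurations: the oslo.messaging transport used between murano-api and murano-engine, and a separate [rabbitmq] section for the engine to murano-agent channel that must be reachable from the spawned VMs. Hosts and credentials below are placeholders, and the exact option names should be checked against your Murano release:

    # murano.conf (illustrative)
    [DEFAULT]
    # RabbitMQ shared with the rest of OpenStack (api <-> engine)
    rabbit_host = controller
    rabbit_userid = murano
    rabbit_password = SECRET

    [rabbitmq]
    # RabbitMQ reachable from tenant VMs (engine <-> murano-agent)
    host = murano-rabbit.example.com
    port = 5672
    login = murano
    password = SECRET
    virtual_host = /murano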
[openstack-dev] [python-cinderclient] Return request ID to caller
Hi, I have submitted a patch for cinder-client [1] to 'Return tuple containing header and body from client' instead of just the response. A cinder spec for the same is also under review [2]. This change will break OpenStack services which are using cinder-client. To avoid breaking services which are using cinder-client, we first need to make changes in those projects to check the return type of cinder-client. We are working on these cinder-client return type check changes in OpenStack services like nova, glance_store, heat, trove, manila etc. We have already submitted a patch for the same against nova: https://review.openstack.org/#/c/152820/ [1] https://review.openstack.org/#/c/152075/ [2] https://review.openstack.org/#/c/132161/ I want to seek early feedback from the community members on the above patches, so please give your thoughts on the same. Thanks, Abhijeet Malawade __ Disclaimer: This email and any attachments are sent in strictest confidence for the sole use of the addressee and may contain legally privileged, confidential, and proprietary data. If you are not the intended recipient, please advise the sender by replying promptly to this email and then delete and destroy this email and any attachments without any further use, copying or forwarding.__ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
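To make the caller-side impact concrete, here is a hedged illustration of what consuming code might look like under the proposal, assuming an already-constructed cinderclient handle; the tuple shape, compatibility check and header name are assumptions drawn from the review description, not the final design:

    result = cinderclient.volumes.get(volume_id)

    # Proposed behaviour: a (response, body) tuple instead of just the body.
    if isinstance(result, tuple):
        resp, volume = result
        request_id = resp.headers.get('x-openstack-request-id')  # assumed
    else:
        volume, request_id = result, None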
Re: [openstack-dev] [nova][libvirt] Logging interactions of libvirt + QEMU in Nova
On Wed, Feb 04, 2015 at 11:23:34AM +0100, Kashyap Chamarthy wrote: Heya, I noticed a ping (but couldn't respond in time) on #openstack-nova IRC about turning on logging in libvirt to capture Nova failures. This was discussed on this list previously by Daniel Berrange, just spelling it out here for reference and completness' sake. (1) To see the interactions between libvirt and QEMU, in /etc/libvirt/libvirtd.conf, have these two config attributes: . . . log_filters=1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 1:util You really want 3:object in there /before/ 1:util too otherwise you'll be spammed with object ref/unref messages log_outputs=1:file:/var/tmp/libvirtd.log Use /var/log/libvirt/libvirtd.log instead of /var/tmp Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
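Folding in both of Daniel's corrections, the libvirtd.conf snippet becomes:

    log_filters="1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 3:object 1:util"
    log_outputs="1:file:/var/log/libvirt/libvirtd.log"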
[openstack-dev] [sahara] team meeting Feb 5 1800 UTC
Hi folks, We'll be having the Sahara team meeting in #openstack-meeting-alt channel. Agenda: https://wiki.openstack.org/wiki/Meetings/SaharaAgenda#Next_meetings http://www.timeanddate.com/worldclock/fixedtime.html?msg=Sahara+Meetingiso=20150205T18 -- Sincerely yours, Sergey Lukjanov Sahara Technical Lead (OpenStack Data Processing) Principal Software Engineer Mirantis Inc. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] django-openstack-auth and stable/icehouse
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 02/04/2015 12:03 PM, Alan Pevec wrote: Dependencies in requirements.txt do not seem to be used in stable/icehouse gate jobs, recent pip freeze in stable/icehouse shows: ... oslo.config==1.6.0 # git sha 99e530e django-openstack-auth==1.1.9 # git sha 2079383 It's because of this: 2015-01-27 19:33:44.152 | Collecting oslo.config=1.4.0 (from python-keystoneclient=0.11.1-python-openstackclient=1.0.1) After that installs 1.6.0, consequent pip runs assume that 1.6.0 is always better than 1.4.0 and disregards version cap, hence does not downgrade the library. Should we finally cap versions for clients so that they don't fetch new library versions? Clients are capped in stable/icehouse requirements but devstack in gate seems to be installing them from git master (note # git sha) So we install python-openstackclient=1.0.1 in Icehouse devstack [1] even though we have 0.5 in requirements/Icehouse [2]. This should be fixed I guess. But that would not be enough since all versions of python-openstackclient don't cap the maximum version of keystoneclient. Anyway, in the end we see that 1.4.0 is installed, so probably pip downgraded it later in the run. It looks suspicious and hacky, but it works. As for git hashes you see in freeze output, they seem to be part of pbr metadata shipped with wheels, I see them even when setting local env with 'tox -e py27 --notest' locally when I'm pretty sure git is not involved. So all in all, I still vote for disabling django_openstack_auth =1.1.9 in gate for Icehouse. /Ihar -BEGIN PGP SIGNATURE- Version: GnuPG v1 iQEcBAEBAgAGBQJU0hctAAoJEC5aWaUY1u573NEH/3e+2c1eXDaYU87qz6ZzX9vw yG/2raO3S+/4UtA2Zb3EQYdTduUHeXnqk3caGZq0hcx3XdmzO01SVueKgQAaJLij 8p6p6WwYDr2h5+DXM2g9dfoRE/mPziwwzoGUw095dUzJBIAOsdUcB/OmyAxiJFD8 dXEiwu988pZ4oJgzbL28YhyMce3TK1dY1EFpfvYxhIYySCcVFv9enQVxaj4y6+dc aCw02TyUpObNFHYSqrIwsXMNuhaQAwlZ7wdc4IAcVbggcDdpDyToJicg80OSB2aN nhdp4Y4BlZt1grx8NgWgUSe/5G+JkzHjm3K3rllxa9l99i1lc9+zNOxD2cj8e5I= =qQHZ -END PGP SIGNATURE- __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
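For reference, capping clients on a stable branch means requirements entries of this shape; the version numbers below are illustrative, not the actual stable/icehouse values:

    # stable-branch requirements.txt (illustrative versions)
    python-keystoneclient>=0.7.0,<0.12.0
    python-openstackclient>=0.3.0,<0.4.0
    oslo.config>=1.2.0,<1.5.0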
Re: [openstack-dev] [Neutron] XenAPI questions
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 12/12/2014 07:43 AM, YAMAMOTO Takashi wrote: hi, good to hear. do you have any estimate when it will be available? will it cover dom0 side of the code found in neutron/plugins/openvswitch/agent/xenapi? We also have rootwrap script just for Xen. It would be great to have an ability to trigger Xen specific neutron based job from gerrit comments for any neutron patch. I undertand why Xen team may not want to run their CI on each neutron patch, but at least there should be a half-automated way to trigger it (either by CI machine tracking the files of interest, or responding to specially crafted gerrit comments). Thanks, /Ihar -BEGIN PGP SIGNATURE- Version: GnuPG v1 iQEcBAEBAgAGBQJU0fkkAAoJEC5aWaUY1u57bM4IAMcIPZ8AtlAhR+Jtqy+TCF9z 5+MJWzduNvvY/4AB2jUbIX3tgO2V0HnKD1EhbPYooz5Wq9dwFdlL+BR2su9dqkO+ 3PtAJf60XK4r24xaWKK0nCnxJGLzupF39UMiS3RzmMJ+fOrhGdPyJNlLIuLH2ye3 VApJ5HZkbxz/F7ikMYHfE8Uh5HN84ehXcDHIEMm1RgX3r5+kQLpuPyl2Y74I+6FR xE4vKjahbkGlXCoFUj4gnWqBk+YunawyAC/9X2uoxx6e8OwYcutn4yODS1JA6bDO oxri6fLaBaYGgnDy96gIHlrsqKH1HBqYZIjIiAwMsFYJo4LmkN5B3MWLU89imiE= =OHz/ -END PGP SIGNATURE- __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] django-openstack-auth and stable/icehouse
Dependencies in requirements.txt do not seem to be used in stable/icehouse gate jobs, recent pip freeze in stable/icehouse shows: ... oslo.config==1.6.0 # git sha 99e530e django-openstack-auth==1.1.9 # git sha 2079383 It's because of this: 2015-01-27 19:33:44.152 | Collecting oslo.config=1.4.0 (from python-keystoneclient=0.11.1-python-openstackclient=1.0.1) After that installs 1.6.0, consequent pip runs assume that 1.6.0 is always better than 1.4.0 and disregards version cap, hence does not downgrade the library. Should we finally cap versions for clients so that they don't fetch new library versions? Clients are capped in stable/icehouse requirements but devstack in gate seems to be installing them from git master (note # git sha) Cheers, Alan __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [Monasca][Monasca-agent] Logfile format to send metrics
Hello, In the Monasca-agent README here https://github.com/stackforge/monasca-agent#introduction it mentions: "Retrieving metrics from log files written in a specific format." Can anyone please point to such a format? Any such pointer would be good. If I understand it correctly, if I point the monasca-agent at a log file written in the specific format, it (the collector) can pick the metrics up from there and send them to the monasca-api. Right? Thanks. Pradip __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [Neutron] XenAPI questions
Hi, The next meeting will be tomorrow @ 15:00 UTC - We'd love to see you there and we can talk about the CI and Terry's work. We're currently meeting fortnightly and skipped one due to travel, which is why there haven't been minutes recently. Thanks, Bob -Original Message- From: YAMAMOTO Takashi [mailto:yamam...@valinux.co.jp] Sent: 04 February 2015 07:18 To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Neutron] XenAPI questions hi Bob, is there any news on the CI work? do you think the idea of small proxy program can work? i think Terry Wilson's ovsdb effort will eventually need something similar, unless we will maintain two versions of the library forever. btw, when will the next XenAPI IRC meeting be? (i checked wiki and previous meeting logs but it wasn't clear to me) YAMAMOTO Takashi hi, good to hear. do you have any estimate when it will be available? will it cover dom0 side of the code found in neutron/plugins/openvswitch/agent/xenapi? YAMAMOTO Takashi Hi Yamamoto, XenAPI and Neutron do work well together, and we have an private CI that is running Neutron jobs. As it's not currently the public CI it's harder to access logs. We're working on trying to move the existing XenServer CI from a nova- network base to a neutron base, at which point the logs will of course be publically accessible and tested against any changes, thus making it easy to answer questions such as the below. Bob -Original Message- From: YAMAMOTO Takashi [mailto:yamam...@valinux.co.jp] Sent: 11 December 2014 03:17 To: openstack-dev@lists.openstack.org Subject: [openstack-dev] [Neutron] XenAPI questions hi, i have questions for XenAPI folks: - what's the status of XenAPI support in neutron? - is there any CI covering it? i want to look at logs. - is it possible to write a small program which runs with the xen rootwrap and proxies OpenFlow channel between domains? (cf. https://review.openstack.org/#/c/138980/) thank you. YAMAMOTO Takashi ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ ___ OpenStack Development Mailing List (not for usage questions) Unsubscribe: OpenStack-dev- requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
[openstack-dev] [nova][libvirt] Logging interactions of libvirt + QEMU in Nova
Heya, I noticed a ping (but couldn't respond in time) on #openstack-nova IRC about turning on logging in libvirt to capture Nova failures. This was discussed on this list previously by Daniel Berrange, just spelling it out here for reference and completness' sake. (1) To see the interactions between libvirt and QEMU, in /etc/libvirt/libvirtd.conf, have these two config attributes: . . . log_filters=1:libvirt 1:qemu 1:conf 1:security 3:event 3:json 3:file 1:util log_outputs=1:file:/var/tmp/libvirtd.log . . . (2) Restart libvirtd: $ systemctl restart libvirtd (3) Enable Nova debug logs in /etc/nova/nova.conf: . . . verbose = True debug = True . . . (4) Restart the Nova service: $ openstack-service restart nova (5) Repeat the offending test. The libvirtd.log file should have the relevant details. Other resources --- - http://wiki.libvirt.org/page/DebugLogs - https://fedoraproject.org/wiki/How_to_debug_Virtualization_problems -- /kashyap __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][ec2-api] Tagging functionality in nova's EC2 API
The conclusion seems fine ATM, like cleanup, fixing bugs, etc. But we should review the spec(s) for EC2 tags and if the spec design looks fine, then we can review the EC2 Tags patch. If the spec design itself is not feasible, then we should revisit the spec and blueprint. Thanks Swami On Tue, Feb 3, 2015 at 9:32 PM, Alexandre Levine alev...@cloudscaling.com wrote: I'm writing this in regard to several reviews concering tagging functionality for EC2 API in nova. The list of the reviews concerned is here: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/ec2-volume-and-snapshot-tags,n,z I don't think it's a good idea to merge these reviews. The analysis is below: Tagging in AWS Main goal for the tagging functionality in AWS is to be able to efficiently distinguish various resources based on user-defined criteria: Tags enable you to categorize your AWS resources in different ways, for example, by purpose, owner, or environment. ... You can search and filter the resources based on the tags you add. (quoted from here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html) It means that one of the two main use-cases is to be able to use Tags as filter when you describe something. Another one is to be able to get information about particular tag with all of the resources tagged by it. Also there is a constraint: You can tag public or shared resources, but the tags you assign are available only to your AWS account and not to the other accounts sharing the resource. The important part here is shared resources which are visible to different users but tags are not shared - each user sees his own. Existing implementation in nova Existing implementation of tags in nova's EC2 API covers only instances. But it does so in both areas: 1. Tags management (create, delete, describe,...) 2. Instances filtering (describe_instances with filtering by tags). The implementation is based on storing tags in each instance's metadata. And nova DB sqlalchemy level uses tag: in queries to allow instances describing with tag filters. I see the following design flaws in existing implementation: 1. It uses instance's own metadata for storing information about assigned tags. Problems: - it doesn't scale when you want to start using tags for other resources. Following this design decision you'll have to store tags in other resources metadata, which mean different services APIs and other databases. So performance for searching for tags or tagged resources in main use cases should suffer. You'll have to search through several remote APIs, querying different metadatas to collect all info and then to compile the result. - instances are not shared resources, but images are. It means that, when developed, metadata for images will have to store different tags for different users somehow. 2. EC2-specific code (tag: searching in novaDB sqlalchemy) leaked into lower layers of nova. - layering is violated. There should be no EC2-specifics below EC2 API library in nova, ideally. - each other service will have to implement the same solution in its own DB level to support tagging for EC2 API. Proposed review changes The review in question introduces tagging for volumes and snapshots. It follows design decisions of existing instance tagging implementation, but realizes only one of the two use cases. It provides create, delete, describe for tags. But it doesn't provide describe_volumes or describe_snapshots for filtering. It suffers from the design flaws I listed above. 
It has to query remote API (cinder) for metadata. It didn't implement filtering by tag: in cinder DB level so we don't see implementation of describe_volumes with tags filtering. Current stackforge/ec2-api tagging implementation In comparison, the implementation of tagging in stackforge/ec2-api, stores all of the tags and their links to resources and users in a separate place. So we can efficiently list tags and its resources or filter by tags during describing of some of the resources. Also user-specific tagging is supported. Conclusion Keeping in mind all of the above, and seeing your discussion about deprecation of EC2 API in nova, I don't feel it's a good time to add such a half-baked code with some potential problems into nova. I think it's better to concentrate on cleaning up, fixing, reviving and making bullet-proof whatever functionality is currently present in nova for EC2 and used by clients. Best regards, Alex Levine __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage
Re: [openstack-dev] django-openstack-auth and stable/icehouse
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 02/04/2015 11:20 AM, Alan Pevec wrote: Bumping minimal oslo.config version due to the issue in django-openstack-auth seems like a wrong way to do it. Dependencies in requirements.txt do not seem to be used in stable/icehouse gate jobs, recent pip freeze in stable/icehouse shows: ... oslo.config==1.6.0 # git sha 99e530e django-openstack-auth==1.1.9 # git sha 2079383 It's because of this: 2015-01-27 19:33:44.152 | Collecting oslo.config=1.4.0 (from python-keystoneclient=0.11.1-python-openstackclient=1.0.1) After that installs 1.6.0, consequent pip runs assume that 1.6.0 is always better than 1.4.0 and disregards version cap, hence does not downgrade the library. Should we finally cap versions for clients so that they don't fetch new library versions? /Ihar -BEGIN PGP SIGNATURE- Version: GnuPG v1 iQEcBAEBAgAGBQJU0fpHAAoJEC5aWaUY1u57AwUIAJdW7ZBrknVyaAn0VIkty180 r49gYGEWaQCF7nMVzcnWKrs6aG3VOEJpyipAujzk0A2rF/gD9Bn9iHk2/hyjF/sZ iDmokiDuFPAB8pIpYdMNyYyKKMgCGoInyHW1PAbCIsj24qiFIzSQMbojvt8Bsgks 68gQk5CYXmi0gF6OiPUHEqj73vpPjXLNZHd2V/P87MAvsTiGRXXFWncT0F1cl5oJ i47uVOyhBK9zfZgDFfL/jPq35Ij71t9BXUQxdgxXavYbGjsnC+YEcOeAacUS4kBk hDliIq+HGPGK0eEgLe4BwHxrd5Skh60h0TPsx+BbVo8A0mydxee7XgUxEG2P2Fs= =sy8K -END PGP SIGNATURE- __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][cinder][neutron][security] Rootwrap on root-intensive nodes
On Wed, Feb 04, 2015 at 11:58:03AM +0100, Thierry Carrez wrote: The first one is performance -- each call would spawn a Python interpreter which would then call the system command. This was fine when there were just a few calls here and there, not so much when it's called a hundred times in a row. During the Juno cycle, a daemon mode was added to solve this issue. It is significantly faster than running sudo directly (the often-suggested alternative). Projects still have to start adopting it though. Neutron and Cinder have started work to do that in Kilo. The second problem is the quality of the filter definitions. Rootwrap is a framework to enable isolation. It's only as good as the filters each project defines. Most of them rely on CommandFilters that do not check any argument, instead of using more powerful filters (which are arguably more painful to maintain). Developers routinely add filter definitions that basically remove any isolation that might have been there, like allowing blank dd, tee, chown or chmod. I think this is really the key point which shows rootwrap as a concept is broken by design IMHO. Rootwrap is essentially trying to provide an API for invoking privileged operations, but instead of actually designing an explicit API for the operations, we've done it via an implicit one based on command args. From a security POV I think this approach is doomed to failure, as command arg strings are far too expressive a concept to deal with. What solutions do we have ? (1) we could get our act together and audit and fix those filter definitions. Remove superfluous usage of root rights, make use of advanced filters for where we actually need them. We have been preaching for that at many many design summits. This is a lot of work though... There were such efforts in the past, but they were never completed for some types of nodes. Worse, the bad filter definitions kept coming back, since developers take shortcuts, reviewers may not have sufficient security awareness to detect crappy filter definitions, and I don't think we can design a gate test that would have such awareness. (2) bite the bullet and accept that some types of nodes actually need root rights for so many different things, they should just run as root anyway. I know a few distributions which won't be very pleased by such a prospect, but that would be a more honest approach (rather than claiming we provide efficient isolation when we really don't). An added benefit is that we could replace a number of shell calls by Python code, which would simplify the code and increase performance. (3) intermediary solution where we would run as the nova user but run sudo COMMAND directly (instead of sudo nova-rootwrap CONFIG COMMAND). That would leave it up to distros to choose between a blanket sudoer or maintain their own filtering rules. I think it's a bit hypocritical though (pretend the distros could filter if they wanted it, when we dropped the towel on doing that ourselves). I'm also not convinced it's more secure than solution 2, and it prevents us from reducing the number of shell-outs, which I think is a worthy idea. In all cases I would not drop the baby with the bath water, and keep rootwrap for all the cases where root rights are needed on a very specific set of commands (like neutron, or nova's api-metadata). The daemon mode should address the performance issue for the projects making a lot of calls. (4) I think that ultimately we need to ditch rootwrap and provide a proper privilege separated, formal RPC mechanism for each project.
eg instead of having a rootwrap command, or rootwrap server attempting to validate safety of qemu-img create -f qcow2 /var/lib/nova/instances/instance1/disk.qcow2 we should have a nova-compute-worker daemon running as root, that accepts an RPC command from nova-compute running unprivileged. eg CreateImage(instane0001, qcow2, disk.qcow) This immediately makes it trivial to validate that we're not trying to trick qemu-img into overwriting some key system file. This is certainly alot more work than trying to patchup rootwrap, but it would provide a level of security that rootwrap can never achieve IMHO. Regards, Daniel -- |: http://berrange.com -o-http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :| __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
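To illustrate the filter-quality point from earlier in the thread, compare a blanket CommandFilter with a tighter definition in a rootwrap .filters file; these entries are illustrative and not taken from any project's actual filter set (note that RegExpFilter requires the argument count to match exactly):

    [Filters]
    # Blanket: any 'dd' invocation as root is allowed -- effectively no isolation
    dd: CommandFilter, dd, root

    # Tighter: only allow zeroing files under the instances directory
    dd_zero: RegExpFilter, dd, root, dd, if=/dev/zero, of=/var/lib/nova/instances/.*

And a toy sketch of the kind of narrow, validating privileged interface described in (4); the function and directory constant are invented for illustration and no such OpenStack daemon exists as described:

    # Hypothetical privileged worker: validates a narrow, typed request
    # instead of trying to police arbitrary command lines.
    import os
    import subprocess

    INSTANCES_DIR = "/var/lib/nova/instances"

    def create_image(instance, fmt, filename):
        if fmt not in ("raw", "qcow2"):
            raise ValueError("unsupported image format: %s" % fmt)
        if "/" in instance or "/" in filename:
            raise ValueError("path components are not allowed")
        path = os.path.join(INSTANCES_DIR, instance, filename)
        subprocess.check_call(["qemu-img", "create", "-f", fmt, path])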
Re: [openstack-dev] [Neutron] XenAPI questions
hi, i added an item to the agenda. https://wiki.openstack.org/wiki/Meetings/XenAPI#Next_meeting YAMAMOTO Takashi Hi, The next meeting will be tomorrow @ 15:00 UTC - We'd love to see you there and we can talk about the CI and Terry's work. We're currently meeting fortnightly and skipped one due to travel, which is why there haven't been minutes recently. Thanks, Bob -Original Message- From: YAMAMOTO Takashi [mailto:yamam...@valinux.co.jp] Sent: 04 February 2015 07:18 To: openstack-dev@lists.openstack.org Subject: Re: [openstack-dev] [Neutron] XenAPI questions hi Bob, is there any news on the CI work? do you think the idea of small proxy program can work? i think Terry Wilson's ovsdb effort will eventually need something similar, unless we will maintain two versions of the library forever. btw, when will the next XenAPI IRC meeting be? (i checked wiki and previous meeting logs but it wasn't clear to me) YAMAMOTO Takashi hi, good to hear. do you have any estimate when it will be available? will it cover dom0 side of the code found in neutron/plugins/openvswitch/agent/xenapi? YAMAMOTO Takashi Hi Yamamoto, XenAPI and Neutron do work well together, and we have an private CI that is running Neutron jobs. As it's not currently the public CI it's harder to access logs. We're working on trying to move the existing XenServer CI from a nova- network base to a neutron base, at which point the logs will of course be publically accessible and tested against any changes, thus making it easy to answer questions such as the below. Bob -Original Message- From: YAMAMOTO Takashi [mailto:yamam...@valinux.co.jp] Sent: 11 December 2014 03:17 To: openstack-dev@lists.openstack.org Subject: [openstack-dev] [Neutron] XenAPI questions hi, i have questions for XenAPI folks: - what's the status of XenAPI support in neutron? - is there any CI covering it? i want to look at logs. - is it possible to write a small program which runs with the xen rootwrap and proxies OpenFlow channel between domains? (cf. https://review.openstack.org/#/c/138980/) thank you. YAMAMOTO Takashi ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev ___ ___ OpenStack Development Mailing List (not for usage questions) Unsubscribe: OpenStack-dev- requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][ec2-api] Tagging functionality in nova's EC2 API
Thanks Alex for your detailed inspection of my work. Comments inline.. On 3 February 2015 at 21:32, Alexandre Levine alev...@cloudscaling.com wrote: I'm writing this in regard to several reviews concering tagging functionality for EC2 API in nova. The list of the reviews concerned is here: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/ec2-volume-and-snapshot-tags,n,z I don't think it's a good idea to merge these reviews. The analysis is below: *Tagging in AWS* Main goal for the tagging functionality in AWS is to be able to efficiently distinguish various resources based on user-defined criteria: Tags enable you to categorize your AWS resources in different ways, for example, by purpose, owner, or environment. ... You can search and filter the resources based on the tags you add. (quoted from here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html) It means that one of the two main use-cases is to be able to use Tags as filter when you describe something. Another one is to be able to get information about particular tag with all of the resources tagged by it. Also there is a constraint: You can tag public or shared resources, but the tags you assign are available only to your AWS account and not to the other accounts sharing the resource. The important part here is shared resources which are visible to different users but tags are not shared - each user sees his own. *Existing implementation in nova *Existing implementation of tags in nova's EC2 API covers only instances. But it does so in both areas: 1. Tags management (create, delete, describe,...) 2. Instances filtering (describe_instances with filtering by tags). The implementation is based on storing tags in each instance's metadata. And nova DB sqlalchemy level uses tag: in queries to allow instances describing with tag filters. I see the following design flaws in existing implementation: 1. It uses instance's own metadata for storing information about assigned tags. Problems: - it doesn't scale when you want to start using tags for other resources. Following this design decision you'll have to store tags in other resources metadata, which mean different services APIs and other databases. So performance for searching for tags or tagged resources in main use cases should suffer. You'll have to search through several remote APIs, querying different metadatas to collect all info and then to compile the result. - instances are not shared resources, but images are. It means that, when developed, metadata for images will have to store different tags for different users somehow. 2. EC2-specific code (tag: searching in novaDB sqlalchemy) leaked into lower layers of nova. - layering is violated. There should be no EC2-specifics below EC2 API library in nova, ideally. All of the Nova-EC2 mapping happens in Nova's DB currently. See InstanceIdMapping model in nova/db/sqlalchemy/models.py. EC2 API which resides in Nova will keep using the Nova database as long as it is functional. - each other service will have to implement the same solution in its own DB level to support tagging for EC2 API. *Proposed review changes* The review in question introduces tagging for volumes and snapshots. It follows design decisions of existing instance tagging implementation, but realizes only one of the two use cases. It provides create, delete, describe for tags. But it doesn't provide describe_volumes or describe_snapshots for filtering. I honestly forgot about those two methods. I can implement them. 
It suffers from the design flaws I listed above. It has to query remote API (cinder) for metadata. It didn't implement filtering by tag: in cinder DB level so we don't see implementation of describe_volumes with tags filtering. Cinder do support filtering based on tags, and I marked the work as TODO in https://review.openstack.org/#/c/112325/23/nova/volume/cinder.py . This was not the reason why I didn't implement describe_volumes and describe_snapshots. Those two methods just missed my attention :) Nova's EC2 API's tag filtering is also done in-memory presently if I'm correct, as Nova's API doesn't support filtering only on the basis of tag names or tag values alone.. *Current stackforge/ec2-api tagging implementation* In comparison, the implementation of tagging in stackforge/ec2-api, stores all of the tags and their links to resources and users in a separate place. So we can efficiently list tags and its resources or filter by tags during describing of some of the resources. Also user-specific tagging is supported. *Conclusion *Keeping in mind all of the above, and seeing your discussion about deprecation of EC2 API in nova, I don't feel it's a good time to add such a half-baked code with some potential problems into nova. I think it's better to concentrate on cleaning up, fixing, reviving and making bullet-proof
[openstack-dev] [horizon] JavaScript docs?
In python we have a style to document methods, classes, and so forth. But, I don't see any guidance on how JavaScript should be documented. I was looking for something like jsdoc or ngdoc (an extension of jsdoc). Is there any guidance on how JavaScript should be documented? For anyone who doesn't know, Angular uses ngdoc (an extension to the commonly used jsdoc) which is written up at https://github.com/angular/angular.js/wiki/Writing-AngularJS-Documentation. __ OpenStack Development Mailing List (not for usage questions) Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
Re: [openstack-dev] [nova][ec2-api] Tagging functionality in nova's EC2 API
Rushi, Thank you for the response. I totally understand the effort and your problems with getting it through at the time. Your design is completely in line with what's currently present in Nova for EC2, no doubt about that. I did whatever I could to review your patches and to consider whether it's worth going forward with them in the current circumstances. I believe it'll add more complications than value if we go on with them. The main design problems, introduced before you, won't go away:

1. User isolation for shared objects is close to impossible to implement in this model.

2. Listing tagged resources when describing tags will require going to all of the possible different APIs, and eventually their databases, and then compiling the result.

The stackforge/ec2-api implementation fortunately had no constraints from previously implemented code with conceptual problems, so it could and did store the tags and their associations with resources separately. That allowed efficient searching both for describing tags and for describing resources.

Strategically, if, as I understand, the eventual aim is to switch to the separate ec2-api solution, it makes little sense (to me) to add more functionality, especially partial functionality (no describe_volumes, describe_snapshots, and even if those were added, no tagging for other resources), to the current nova code. If the decision were made to enhance nova with new features like this, I'd still be for a separate table in the DB for all of the tags and their associations - it would've made for a universal, complete and efficient solution with one effort (see the sketch after the quoted message below).

And again, I more than agree with this: "...I can only wish that the patches got more attention when it was possible to get them merged :)" But that's a different story.

Best regards, Alex Levine

On 2/4/15 4:32 PM, Rushi Agrawal wrote: Thanks Alex for your detailed inspection of my work. Comments inline.. On 3 February 2015 at 21:32, Alexandre Levine alev...@cloudscaling.com wrote: I'm writing this in regard to several reviews concerning tagging functionality for the EC2 API in nova. The list of the reviews concerned is here: https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/ec2-volume-and-snapshot-tags,n,z I don't think it's a good idea to merge these reviews. The analysis is below:

*Tagging in AWS* The main goal of the tagging functionality in AWS is to be able to efficiently distinguish various resources based on user-defined criteria: "Tags enable you to categorize your AWS resources in different ways, for example, by purpose, owner, or environment. ... You can search and filter the resources based on the tags you add." (quoted from here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/Using_Tags.html) It means that one of the two main use cases is to be able to use tags as a filter when you describe something. The other is to be able to get information about a particular tag along with all of the resources tagged by it. There is also a constraint: "You can tag public or shared resources, but the tags you assign are available only to your AWS account and not to the other accounts sharing the resource." The important part here is shared resources, which are visible to different users while the tags are not shared - each user sees his own.

*Existing implementation in nova* The existing implementation of tags in nova's EC2 API covers only instances, but it does so in both areas: 1. Tags management (create, delete, describe, ...) 2. Instances filtering (describe_instances with filtering by tags).
The implementation is based on storing tags in each instance's metadata, and the nova DB sqlalchemy level uses tag: in queries to allow describing instances with tag filters.

I see the following design flaws in the existing implementation:

1. It uses the instance's own metadata for storing information about assigned tags. Problems: - it doesn't scale when you want to start using tags for other resources. Following this design decision you'll have to store tags in other resources' metadata, which means different services' APIs and other databases. So performance for searching for tags or tagged resources in the main use cases will suffer: you'll have to search through several remote APIs, querying different metadata stores to collect all the info and then compile the result. - instances are not shared resources, but images are. It means that, when developed, metadata for images will have to store different tags for different users somehow.

2. EC2-specific code (tag: searching in the nova DB sqlalchemy layer) leaked into lower layers of nova. - layering is violated. There should be no EC2 specifics below the EC2 API library in nova, ideally.

All of the Nova-EC2 mapping happens in Nova's DB currently. See the InstanceIdMapping model in nova/db/sqlalchemy/models.py.
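To illustrate the "separate table in the DB" idea Alex argues for above, here is a hedged SQLAlchemy sketch: a single association table any resource type could use, with the owning project recorded so shared resources keep per-account tag visibility. Table and column names are invented for illustration, not taken from any existing schema.

# Hedged sketch of a generic tag association table (illustrative names only).

from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()

class ResourceTag(Base):
    __tablename__ = "resource_tags"
    id = Column(Integer, primary_key=True)
    project_id = Column(String(64), nullable=False, index=True)
    resource_type = Column(String(32), nullable=False)  # 'instance', 'volume', 'snapshot', ...
    resource_id = Column(String(64), nullable=False, index=True)
    key = Column(String(255), nullable=False)
    value = Column(String(255), nullable=False)

engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)
session = Session()

session.add_all([
    ResourceTag(project_id="proj-a", resource_type="volume",
                resource_id="vol-1", key="env", value="prod"),
    ResourceTag(project_id="proj-a", resource_type="snapshot",
                resource_id="snap-1", key="env", value="prod"),
])
session.commit()

# A single indexed query answers "everything tagged env=prod for this project",
# regardless of which service owns the resource.
rows = (session.query(ResourceTag)
        .filter_by(project_id="proj-a", key="env", value="prod")
        .all())
print([(r.resource_type, r.resource_id) for r in rows])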
Re: [openstack-dev] [tc] do we really need project tags in the governance repository?
On Tue, Feb 3, 2015 at 8:04 PM, Joe Gordon joe.gord...@gmail.com wrote: On Tue, Jan 27, 2015 at 10:15 AM, Clint Byrum cl...@fewbar.com wrote: Excerpts from Thierry Carrez's message of 2015-01-27 02:46:03 -0800: Doug Hellmann wrote: On Mon, Jan 26, 2015, at 12:02 PM, Thierry Carrez wrote: [...] I'm open to alternative suggestions on where the list of tags, their definitions and the list of projects they apply to should live. If you don't like that being in the governance repository, what would be your preference?

From the very beginning I have taken the position that tags are by themselves not sufficiently useful for evaluating projects. If someone wants to choose between Ceilometer, Monasca, or StackTach, we're unlikely to come up with tags that will let them do that. They need in-depth discussions of deployment options, performance characteristics, and feature trade-offs.

They are still useful to give people a chance to discover that those 3 are competing in the same space, and potentially to get an idea of which one (if any) is deployed on more than one public cloud, better documented, or security-supported. I agree with you that an (opinionated) article comparing those 3 solutions would be a nice thing to have, but I'm just saying that basic, clearly defined reference project metadata still has a lot of value, especially as we grow the number of projects.

I agree with your statement that summary reference metadata is useful. I agree with Doug that it is inappropriate for the TC to assign it. That said, I object to only saying this is all information that can be found elsewhere or should live elsewhere, because that is just keeping the current situation -- where that information exists somewhere but can't be efficiently found by our downstream consumers. We need a taxonomy and clear definitions for tags, so that our users can easily find, understand and navigate such project metadata.

As someone new to the project, I would not think to look in the governance documents for state information about a project. I would search for things like "install guide openstack" or "component list openstack" and expect to find them in the documentation. So I think putting the information in those (or similar) places will actually make it easier to find for someone who hasn't been involved in the discussion of tags and the governance repository.

The idea here is to have the reference information in some Gerrit-controlled repository (currently openstack/governance, but I'm open to moving this elsewhere), and have that reference information consumed by the openstack.org website when you navigate to the Software section, to present a browseable/searchable list of projects with project metadata. I don't expect anyone to read the YAML file from the governance repository. On the other hand, the software section of the openstack.org website is by far the most visited page of all our web properties, so I expect most people to see that. Just like we gather docs and specs into single websites, we could also gather project metadata.

Let the projects set their tags. One thing that might make sense for the TC to do is to elevate certain tags to a more important status, ones for which they _will_ provide guidance on when to use them. However, the actual project-to-tag mapping would work quite well as a single file in whatever repository the project team thinks would be the best starting point for a new user.
One way we can implement this is to have the TC manage a library that converts a file with tag data into a document, along with a list of default tags, and each project can import that library and include it in its docs. This way the TC can suggest tags that make sense, but it's up to individual projects to apply them. This is similar to what nova is doing with our hypervisor feature capability matrix in https://review.openstack.org/#/c/136380/ , where we convert a config file into http://docs-draft.openstack.org/80/136380/7/check/gate-nova-docs/28be8b3//doc/build/html/support-matrix.html

I really like this Joe. Nice work Daniel. To Jay's response about tag ownership, I think a cross-project team like infra or docs makes sense, but I can't imagine taking it on in docs right now; too many other projects are planned. I think in this release the TC may have to suck it up and get it bootstrapped, but then make a plan for either distributed maintenance across projects or in a cross-project repo. Anne
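As a rough illustration of the "library that converts a file with tag data into a document" idea, here is a minimal sketch assuming PyYAML is available; the file layout, tag names and output format are invented, and the real governance data and the nova support-matrix extension may look quite different:

# Minimal sketch: turn a small tag-data file into a document fragment a
# project could include in its own docs build. All names are illustrative.

import yaml

TAG_DATA = """
project: nova
tags:
  - name: release:managed
    description: Releases are handled by the release team.
  - name: docs:install-guide
    description: Covered by the official install guide.
"""

def render_tags(source):
    data = yaml.safe_load(source)
    lines = ["Tags for %s" % data["project"], "=" * 20, ""]
    for tag in data["tags"]:
        lines.append("* %s -- %s" % (tag["name"], tag["description"]))
    return "\n".join(lines)

print(render_tags(TAG_DATA))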