Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target
Excerpts from Mike Spreitzer's message of 2014-10-16 22:24:30 -0700:

I like the idea of measuring complexity. I looked briefly at `python -m mccabe`. It seems to measure each method independently. Is this really fair? If I have a class with some big methods, and I break it down into more numerous and smaller methods, then the largest method gets smaller, but the number of methods gets larger. A large number of methods is itself a form of complexity. It is not clear to me that said re-org has necessarily made the class easier to understand. I can also break one class into two, but it is not clear to me that the project has necessarily become easier to understand. While it is true that when you truly make a project easier to understand you sometimes break it into more classes, it is also true that you can do a bad job of re-organizing a set of classes while still reducing the size of the largest method. Has the McCabe metric been evaluated on Python projects? There is a danger in focusing on what is easy to measure if that is not really what you want to optimize.

BTW, I find that one of the complexity issues for me when I am learning about a Python class is doing the whole-program type inference so that I know what the arguments are. It seems to me that if you want to measure complexity of Python code then something like the complexity of the argument typing should be taken into account.

Fences don't solve problems. Fences make it harder to cause problems. Of course you can still do the wrong thing and make the code worse. But you can't do _this_ wrong thing without asserting why you need to.
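For illustration, a minimal sketch of what `python -m mccabe` scores: every function starts at 1 and each branch point adds 1, and every function is scored independently, which is exactly why splitting a big method only moves the numbers around. The --min flag is from the mccabe README; the helpers are stubs for the example.

    # complexity.py
    def send_now(msg): pass       # stub, for illustration only
    def queue(part): pass         # stub, for illustration only

    def dispatch(msg):            # complexity 4: base 1 + if + elif + for
        if msg is None:
            return
        elif msg.urgent:
            send_now(msg)
        for part in msg.parts:
            queue(part)

    $ python -m mccabe --min 2 complexity.py
    # reports 'dispatch' with a complexity of 4 (exact output format varies by version)

Splitting the loop body into a helper would lower each per-function score while raising the method count, which is Mike's point: the total branchiness has not gone away.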
Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target
On Fri, Oct 17, 2014 at 03:03:43PM +1100, Michael Still wrote: I think nova wins. We have: ./nova/virt/libvirt/driver.py:3736:1: C901 'LibvirtDriver._get_guest_config' is too complex (67)

IMHO this tool is of pretty dubious value. I mean that function is long for sure, but it is by no means a serious problem in the Nova libvirt codebase. The stuff it complains about in the libvirt/config.py file is just an incredibly stupid thing to highlight. We've got plenty of big problems that need addressing in the OpenStack codebase and I don't see this tool highlighting any of them. Better to have people focus on solving actual real problems we have than trying to get some arbitrary code analysis score to hit a magic value.

Regards, Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
Re: [openstack-dev] [Neutron] BGPVPN implementation discussions
Good news, +1

2014-10-17 0:48 GMT+08:00 Mathieu Rohon mathieu.ro...@gmail.com:

Hi all, as discussed during today's l3-meeting, we keep on working on the BGPVPN service plugin implementation [1]. MPLS encapsulation is now supported in OVS [2], so we would like to submit a design to leverage OVS capabilities. A first design proposal, based on the l3agent, can be found here: https://docs.google.com/drawings/d/1NN4tDgnZlBRr8ZUf5-6zzUcnDOUkWSnSiPm8LuuAkoQ/edit

This solution is based on bagpipe [3], and its capacity to manipulate OVS, based on advertised and learned routes.

[1] https://blueprints.launchpad.net/neutron/+spec/neutron-bgp-vpn
[2] https://raw.githubusercontent.com/openvswitch/ovs/master/FAQ
[3] https://github.com/Orange-OpenSource/bagpipe-bgp

Thanks Mathieu
Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target
I like measuring code metrics, and I definitely support Joe's change here. I think of McCabe complexity as a proxy for testability and readability of code, both of which are IMO actual real problems in the nova codebase. If you are an experienced OpenStack dev you might find the code easy to move around, but large and complex functions are difficult for beginners to grok.

As an exercise, I took the method in libvirt/config.py and removed everything except the flow-control keywords (i.e. the things that affect the McCabe complexity): http://paste.openstack.org/show/121589/ - I would find it difficult to hold all that in my head at once. It's possible to argue that this is a false positive, but my experience is that this tool finds code which needs improvement.

That said, these should be descriptive metrics rather than prescriptive targets. There are products which chart a codebase's evolution over time, such as www.sonarsource.com, which are really great for provoking thought and conversation about code quality. Now I'm interested, I'll have a look into it.

Matthew
Re: [openstack-dev] [Nova] Automatic evacuate
-----Original Message----- From: Florian Haas [mailto:flor...@hastexo.com] Sent: Thursday, October 16, 2014 10:53 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] Automatic evacuate

On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal michal.jastrzeb...@intel.com wrote: In my opinion flavor defining is a bit hacky. Sure, it will provide us functionality fairly quickly, but it will also strip us of the flexibility Heat would give. Healing can be done in several ways: simple destroy - create (basic convergence workflow so far), evacuate with or without shared storage, even rebuild vm, probably a few more when we put more thought into it.

But then you'd also need to monitor the availability of *individual* guests, and down you go the rabbit hole. So suppose you're monitoring a guest with a simple ping. And it stops responding to that ping.

I was more referring to monitoring the host (not the guest), and for sure not by ping. I was thinking of the current zookeeper servicegroup implementation; we might want to use corosync and write a servicegroup plugin for that. There are several choices for that, and each requires testing before we make any decision. There is also the fencing case, which we agree is important, and I think nova should be able to do that (since it does evacuate, it should also do fencing). But for working fencing we really need working host health monitoring, so I suggest we take baby steps here and solve one issue at a time. And that would be host monitoring.

(1) Has it died? (2) Is it just too busy to respond to the ping? (3) Has its guest network stack died? (4) Has its host vif died? (5) Has the L2 agent on the compute host died? (6) Has its host network stack died? (7) Has the compute host died?

Suppose further it's using shared storage (running off an RBD volume or using an iSCSI volume, or whatever). Now you have almost as many recovery options as possible causes for the failure, and some of those recovery options will potentially destroy your guest's data. No matter how you twist and turn the problem, you need strongly consistent distributed VM state plus fencing. In other words, you need a full blown HA stack.

I'd rather use nova for low-level tasks and maybe low-level monitoring (imho nova should do that using servicegroup). But I'd use something more configurable for actual task triggering, like heat. That would give us a framework rather than a mechanism. Later we might want to apply HA to network or volume; then we'll have the mechanism ready and just the monitoring hook and healing will need to be implemented. We can use scheduler hints to place resources on HA-compatible hosts (whichever health action we'd like to use); this will be a bit more complicated, but will also give us more flexibility.

I apologize in advance for my bluntness, but this all sounds to me like you're vastly underrating the problem of reliable guest state detection and recovery. :)

Guest health in my opinion is just a bit out of scope here. If we have a robust way of detecting host health, we can pretty much assume that if a host dies, its guests follow. There are ways to detect guest health (libvirt watchdog, ceilometer, the ping you mentioned), but that should be done somewhere else. And for sure not by evacuation. I agree that we all should meet in Paris and discuss that so we can join our forces. This is one of the bigger gaps to be filled imho.

Pretty much every user I've worked with in the last 2 years agrees.
Granted, my view may be skewed as HA is typically what customers approach us for in the first place, but yes, this definitely needs a globally understood and supported solution.

Cheers, Florian
Re: [openstack-dev] [kolla] on Dockerfile patterns
On Thu, 16 Oct 2014, Lars Kellogg-Stedman wrote: On Fri, Oct 17, 2014 at 12:44:50PM +1100, Angus Lees wrote: You just need to find the pid of a process in the container (perhaps using docker inspect to go from container name - pid) and then: nsenter -t $pid -m -u -i -n -p -w

Note also that the 1.3 release of Docker (any day now) will sport a shiny new docker exec command [...]

Yesterday: http://blog.docker.com/2014/10/docker-1-3-signed-images-process-injection-security-options-mac-shared-directories/

-- Chris Dent tw:@anticdent freenode:cdent https://tank.peermore.com/tanks/cdent
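The name-to-pid-to-nsenter dance described above looks roughly like this (a sketch assuming util-linux's nsenter and a Docker new enough to expose State.Pid via inspect; my_container is a placeholder name):

    # resolve container name -> init pid, then enter its namespaces
    pid=$(docker inspect --format '{{.State.Pid}}' my_container)
    sudo nsenter -t $pid -m -u -i -n -p -w /bin/bash

With Docker 1.3 this collapses to docker exec -it my_container bash, per the announcement linked above.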
Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target
On Fri, 17 Oct 2014, Daniel P. Berrange wrote: IMHO this tool is of pretty dubious value. I mean that function is long for sure, but it is by no means a serious problem in the Nova libvirt codebase. The stuff it complains about in the libvirt/config.py file is just an incredibly stupid thing to highlight.

I find a lot of the OpenStack code very hard to read. If it is very hard to read it is very hard to maintain, whether that means fix or improve.

That said, the value I see in these kinds of tools is not specifically in preventing complexity, but in providing entry points for people who want to fix things. You don't know where to start (because you haven't yet got the insight or experience): run flake8 or pylint or some other tools, do what it tells you. In the process you will:

* learn more about the code
* probably find bugs
* make an incremental improvement to something that needs it

-- Chris Dent tw:@anticdent freenode:cdent https://tank.peermore.com/tanks/cdent
Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4
On 10/17/2014 05:57 AM, Fei Long Wang wrote: Hi Jeremy, Thanks for the heads up. Is there a bug opened to track this? If not, I'm going to open one and dig into it. Cheers.

Hey Fei Long, Thanks for taking care of this, please keep me in the loop. @Jeremy: Thanks for the heads up. Flavio

On 17/10/14 14:17, Jeremy Stanley wrote: As part of an effort to deprecate our specialized testing platform for Python 3.3, many of us have been working to confirm projects which currently gate on 3.3 can also pass their same test sets under Python 3.4 (which comes by default in Ubuntu Trusty). For the vast majority of projects, the differences between 3.3 and 3.4 are immaterial and no effort is required. For some, minor adjustments are needed...

For python-glanceclient, we have 22 failing tests in a tox -e py34 run. I spent the better part of today digging into them, and they basically all stem from the fact that PEP 456 switches the unordered data hash algorithm from FNV to SipHash in 3.4. The unit tests in python-glanceclient frequently rely on trying to match multi-parameter URL queries and JSON built from unordered data types against predetermined string representations. Put simply, this just doesn't work if you can't guarantee their ordering.

I'm left with a dilemma--I don't really have time to fix all of these (I started to go through and turn the fixture keys into format strings embedding dicts filtered through urlencode() for example, but it created as many new failures as it fixed), however I'd hate to drop Py3K testing for software which currently has it no matter how fragile. This is mainly a call for help to anyone with some background and/or interest in python-glanceclient's unit tests to get them working under Python 3.4, so that we can eliminate the burden of maintaining special 3.3 test infrastructure.

-- @flaper87 Flavio Percoco
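To make the failure mode concrete, a minimal sketch (illustrative, not glanceclient's actual test code) of why fixed-string assertions break once the hash algorithm, and therefore dict iteration order, changes:

    from urllib.parse import urlencode, parse_qs

    params = {'limit': '20', 'sort_key': 'name'}
    query = urlencode(params)  # key order follows dict order, which varies with the hash seed/algorithm

    # Fragile: may pass or fail depending on interpreter and hash seed
    # assert query == 'limit=20&sort_key=name'

    # Robust: compare parsed structures instead of serialized strings
    assert parse_qs(query) == {'limit': ['20'], 'sort_key': ['name']}

The same applies to comparing json.dumps() output against canned strings; comparing json.loads() of both sides (or serializing with sort_keys=True) sidesteps the ordering entirely.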
Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer
Hi Jim,

On 10/16/2014 07:23 PM, Jim Mankovich wrote: All, I would like to get some feedback on a proposal to change the current sensor naming implemented in ironic and ceilometer. I would like to provide vendor-specific sensors within the current structure for IPMI sensors in ironic and ceilometer, but I have found that the current implementation of sensor meters in ironic and ceilometer is IPMI specific (from a meter naming perspective). This is not suitable as it currently stands to support sensor information from a provider other than IPMI. Also, the current Resource ID naming makes it difficult for a consumer of sensors to quickly find all the sensors for a given Ironic Node ID, so I would like to propose changing the Resource ID naming as well.

Currently, sensors sent by ironic to ceilometer get named by ceilometer as hardware.ipmi.SensorType, and the Resource ID is the Ironic Node ID with a postfix containing the Sensor ID. For details pertaining to the issue with the Resource ID naming, see https://bugs.launchpad.net/ironic/+bug/1377157, ipmi sensor naming in ceilometer is not consumer friendly.

Here is an example of what meters look like for sensors in ceilometer with the current implementation:

| Name                      | Type  | Unit | Resource ID                                                  |
| hardware.ipmi.current     | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0-power_meter_(0x16)      |
| hardware.ipmi.temperature | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0-16-system_board_(0x15)  |

What I would like to propose is dropping the ipmi string from the name altogether and appending the Sensor ID to the name instead of to the Resource ID. So, transforming the above to the new naming would result in the following:

| Name                                      | Type  | Unit | Resource ID                          |
| hardware.current.power_meter_(0x16)       | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0 |
| hardware.temperature.system_board_(0x15)  | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0 |

+1 Very-very nit, feel free to ignore if inappropriate: maybe hardware.temperature.system_board.0x15? I.e. use separation with dots, do not use brackets?

This structure would provide the ability for a consumer to do a ceilometer resource list using the Ironic Node ID as the Resource ID to get all the sensors in a given platform. The consumer would then iterate over each of the sensors to get the samples it wanted. In order to retain the information as to who provided the sensors, I would like to propose that a standard sensor_provider field be added to the resource_metadata for every sensor, where the sensor_provider field would have a string value indicating the driver that provided the sensor information. This is where the string ipmi, or a vendor-specific string, would be specified.

+1

I understand that this proposed change is not backward compatible with the existing naming, but I don't really see a good solution that would retain backward compatibility.

For backward compatibility you could _also_ keep old ones (with ipmi in it) for IPMI sensors.

Any/All Feedback will be appreciated,

In this version it makes a lot of sense to me, +1 if Ceilometer folks are not against.

Jim
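The consumer flow Jim describes would then look roughly like this with the Juno-era ceilometer CLI (a sketch; the quoting matters because of the parentheses in sensor IDs):

    # all meters (sensors) reported for one Ironic node
    $ ceilometer meter-list -q resource_id=edafe6f4-5996-4df8-bc84-7d92439e15c0

    # samples for one particular sensor on that node
    $ ceilometer sample-list -m 'hardware.temperature.system_board_(0x15)' \
        -q resource_id=edafe6f4-5996-4df8-bc84-7d92439e15c0

Under the current naming, by contrast, the node ID alone never matches a Resource ID, so the first query has no direct equivalent.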
Re: [openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades
On 17 October 2014 02:28, Matt Riedemann mrie...@linux.vnet.ibm.com wrote: On 10/16/2014 7:26 PM, Christopher Aedo wrote: On Tue, Sep 9, 2014 at 2:19 PM, Mike Scherbakov mscherba...@mirantis.com wrote: On Tue, Sep 9, 2014 at 6:02 PM, Clint Byrum cl...@fewbar.com wrote: The idea is not simply deny or hang requests from clients, but provide them we are in maintenance mode, retry in X seconds

You probably would want 'nova host-servers-migrate host'

yeah for migrations - but as far as I understand, it doesn't help with disabling this host in the scheduler - there is a chance that some workloads will be scheduled to the host.

Regarding putting a compute host in maintenance mode using nova host-update --maintenance enable, it looks like the blueprint and associated commits were abandoned a year and a half ago: https://blueprints.launchpad.net/nova/+spec/host-maintenance

It seems that nova service-disable host nova-compute effectively prevents the scheduler from trying to send new work there. Is this the best approach to use right now if you want to pull a compute host out of an environment before migrating VMs off? I agree with Tim and Mike that having something respond down for maintenance rather than ignore or hang would be really valuable. But it also looks like that hasn't gotten much traction in the past - anyone feel like they'd be in support of reviving the notion of maintenance mode? -Christopher

host-maintenance-mode is definitely a thing in nova compute via the os-hosts API extension and the --maintenance parameter; the compute manager code is here [1]. The thing is, the only in-tree virt driver that implements it is xenapi, and I believe when you put the host in maintenance mode it's supposed to automatically evacuate the instances to some other host, but you can't target the other host or tell the driver, from the API, which instances you want to evacuate, e.g. all, none, running only, etc.

[1] http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2#n3990

We should certainly make that more generic. It doesn't update the VM state, so it's really only admin focused in its current form. The XenAPI logic only works when using XenServer pools with shared NFS storage, if my memory serves me correctly. Honestly, it's a bit of code I have planned on removing, along with the rest of the pool support.

In terms of requiring DB downtime in Nova, the current efforts are focusing on avoiding downtime all together, via expand/contract style migrations, with a little help from objects to avoid data migrations. That doesn't mean maintenance mode isn't useful for other things, like an emergency patching of the hypervisor.

John
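For the interim workflow Christopher describes, the two existing novaclient commands cover it; a sketch (compute-07 is a placeholder hostname, and note that today neither command makes the API respond "down for maintenance", they just stop new scheduling and move existing guests):

    # stop the scheduler from placing new instances on the host
    $ nova service-disable compute-07 nova-compute --reason "planned maintenance"

    # migrate the existing instances elsewhere
    $ nova host-servers-migrate compute-07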
Re: [openstack-dev] [QA] Proposal: A launchpad bug description template
Markus Zoeller wrote: TL;DR: A proposal for a template for launchpad bug entries which asks for the minimal needed data to work on a bug.

Note that Launchpad doesn't support bug entry templates. You can display bug reporting guidelines which appear under the textbox, but that's about it. Also note that the text is project-specific, so it needs to be entered in every openstack project. Depending on the exact nature of the project, I suspect the text should be different.

Regards, -- Thierry Carrez (ttx)
Re: [openstack-dev] [TripleO] Summit scheduling - using our time together wisely.
Clint Byrum wrote: * The Ops Summit is Wednesday/Thursday [3], which overlaps with these sessions. I am keenly interested in gathering more contribution from those already operating and deploying OpenStack. It can go both ways, but I think it might make sense to have more ops-centric topics discussed on Friday, when those participants might not be fully wrapped up in the ops sessions.

The Ops Summit is actually on Monday and Thursday. Not on Wednesday. You were wrong on the Internet.

-- Thierry Carrez (ttx)
Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target
On 10/17/2014 05:10 AM, Chris Dent wrote: On Fri, 17 Oct 2014, Daniel P. Berrange wrote: IMHO this tool is of pretty dubious value. I mean that function is long for sure, but it is by no means a serious problem in the Nova libvirt codebase. The stuff it complains about in the libvirt/config.py file is just an incredibly stupid thing to highlight.

I find a lot of the OpenStack code very hard to read. If it is very hard to read it is very hard to maintain, whether that means fix or improve.

Exactly, ++.

That said, the value I see in these kinds of tools is not specifically in preventing complexity, but in providing entry points for people who want to fix things. You don't know where to start (because you haven't yet got the insight or experience): run flake8 or pylint or some other tools, do what it tells you. In the process you will:

* learn more about the code
* probably find bugs
* make an incremental improvement to something that needs it

Agreed. -jay
Re: [openstack-dev] [Nova] Automatic evacuate
On Fri, Oct 17, 2014 at 9:53 AM, Jastrzebski, Michal michal.jastrzeb...@intel.com wrote: -----Original Message----- From: Florian Haas [mailto:flor...@hastexo.com] Sent: Thursday, October 16, 2014 10:53 AM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [Nova] Automatic evacuate

On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal michal.jastrzeb...@intel.com wrote: In my opinion flavor defining is a bit hacky. Sure, it will provide us functionality fairly quickly, but it will also strip us of the flexibility Heat would give. Healing can be done in several ways: simple destroy - create (basic convergence workflow so far), evacuate with or without shared storage, even rebuild vm, probably a few more when we put more thought into it.

But then you'd also need to monitor the availability of *individual* guests, and down you go the rabbit hole. So suppose you're monitoring a guest with a simple ping. And it stops responding to that ping.

I was more referring to monitoring the host (not the guest), and for sure not by ping. I was thinking of the current zookeeper servicegroup implementation; we might want to use corosync and write a servicegroup plugin for that. There are several choices for that, and each requires testing before we make any decision. There is also the fencing case, which we agree is important, and I think nova should be able to do that (since it does evacuate, it should also do fencing). But for working fencing we really need working host health monitoring, so I suggest we take baby steps here and solve one issue at a time. And that would be host monitoring.

You're describing all of the cases for which Pacemaker is the perfect fit. Sorry, I see absolutely no point in teaching Nova to do that.

(1) Has it died? (2) Is it just too busy to respond to the ping? (3) Has its guest network stack died? (4) Has its host vif died? (5) Has the L2 agent on the compute host died? (6) Has its host network stack died? (7) Has the compute host died?

Suppose further it's using shared storage (running off an RBD volume or using an iSCSI volume, or whatever). Now you have almost as many recovery options as possible causes for the failure, and some of those recovery options will potentially destroy your guest's data. No matter how you twist and turn the problem, you need strongly consistent distributed VM state plus fencing. In other words, you need a full blown HA stack.

I'd rather use nova for low-level tasks and maybe low-level monitoring (imho nova should do that using servicegroup). But I'd use something more configurable for actual task triggering, like heat. That would give us a framework rather than a mechanism. Later we might want to apply HA to network or volume; then we'll have the mechanism ready and just the monitoring hook and healing will need to be implemented. We can use scheduler hints to place resources on HA-compatible hosts (whichever health action we'd like to use); this will be a bit more complicated, but will also give us more flexibility.

I apologize in advance for my bluntness, but this all sounds to me like you're vastly underrating the problem of reliable guest state detection and recovery. :)

Guest health in my opinion is just a bit out of scope here. If we have a robust way of detecting host health, we can pretty much assume that if a host dies, its guests follow. There are ways to detect guest health (libvirt watchdog, ceilometer, the ping you mentioned), but that should be done somewhere else. And for sure not by evacuation.
You're making an important point here; you're asking for a robust way of detecting host health. I can guarantee you that the way of detecting host health that you suggest (i.e. from within Nova) will not be robust by HA standards for at least two years, if your patch lands tomorrow.

Cheers, Florian
[openstack-dev] [Horizon] Template Blueprint
Hello Horizoners, I would like to draw your attention to the excellent Template Blueprint [1] which David created. The aim of this is to create a template which will be used for all future blueprints. This way we can try to ensure that enough information/detail is provided in blueprints, as we have had problems with blueprints lacking in details in the past.

Please take a minute to review [1] and add your comments to the whiteboard. We are hoping to finalize this and start using this template ASAP. Thanks!

[1] https://blueprints.launchpad.net/horizon/+spec/template

-- Regards, Ana Krivokapic Software Engineer OpenStack team Red Hat Inc.
[openstack-dev] Cell Initialization
Hi, I was trying to create cells under openstack using devstack. My setup contains 3 machines: one toplevel and 2 compute cells. I'm following this documentation: http://docs.openstack.org/trunk/config-reference/content/section_compute-cells.html. Both of these cell instantiations generate errors.

1. The first one doesn't generate any error logs unless I issue a command at the parent, 'nova cell-show cell2'. At this point the toplevel cell throws the following error:

2014-10-17 12:03:34.888 ERROR oslo.messaging.rpc.dispatcher [-] Exception during message handling: Circular reference detected
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher File /usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py, line 134, in _dispatch_and_reply
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher incoming.message))
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher File /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py, line 72, in reply
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher self._send_reply(conn, reply, failure, log_failure=log_failure)
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher File /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py, line 62, in _send_reply
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher conn.direct_send(self.reply_q, rpc_common.serialize_msg(msg))
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher File /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/common.py, line 302, in serialize_msg
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher _MESSAGE_KEY: jsonutils.dumps(raw_msg)}
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher File /usr/lib/python2.7/site-packages/oslo/messaging/openstack/common/jsonutils.py, line 172, in dumps
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher return json.dumps(value, default=default, **kwargs)
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher File /usr/lib64/python2.7/json/__init__.py, line 250, in dumps
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher sort_keys=sort_keys, **kw).encode(obj)
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher File /usr/lib64/python2.7/json/encoder.py, line 207, in encode
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher chunks = self.iterencode(o, _one_shot=True)
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher File /usr/lib64/python2.7/json/encoder.py, line 270, in iterencode
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher return _iterencode(o, 0)
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher ValueError: Circular reference detected
2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
2014-10-17 12:03:34.890 ERROR oslo.messaging._drivers.common [-] Returning exception Circular reference detected to caller
2014-10-17 12:03:34.890 ERROR oslo.messaging._drivers.common [-] ['Traceback (most recent call last):\n', ' File /usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py, line 134, in _dispatch_and_reply\n incoming.message))\n', ' File /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py, line 72, in reply\n self._send_reply(conn, reply, failure, log_failure=log_failure)\n', ' File /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py, line 62, in _send_reply\n conn.direct_send(self.reply_q, rpc_common.serialize_msg(msg))\n', ' File /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/common.py, line 302, in serialize_msg\n _MESSAGE_KEY: jsonutils.dumps(raw_msg)}\n', ' File /usr/lib/python2.7/site-packages/oslo/messaging/openstack/common/jsonutils.py, line 172, in dumps\n return json.dumps(value, default=default, **kwargs)\n', ' File /usr/lib64/python2.7/json/__init__.py, line 250, in dumps\n sort_keys=sort_keys, **kw).encode(obj)\n', ' File /usr/lib64/python2.7/json/encoder.py, line 207, in encode\n chunks = self.iterencode(o, _one_shot=True)\n', ' File /usr/lib64/python2.7/json/encoder.py, line 270, in iterencode\n return _iterencode(o, 0)\n', 'ValueError: Circular reference detected\n']

This one seems to be similar to the bug reported here: https://bugs.launchpad.net/nova/+bug/1312002

2. In the second child cell initialization, the error crops up as soon as I add the toplevel cell in the child cell using the 'nova-manage' command:

2014-10-17 12:05:29.500 ERROR nova.cells.messaging [req-f74d05cf-061a-4488-bfcb-0cb1edec44e2 None None] Error locating next hop for message: Inconsistency in cell routing: destination is cell1!toplevel but routing_path is cell1!cell1
2014-10-17 12:05:29.500 TRACE nova.cells.messaging Traceback (most recent call last):
2014-10-17 12:05:29.500 TRACE nova.cells.messaging File
Re: [openstack-dev] [QA] Proposal: A launchpad bug description template
Thierry Carrez thie...@openstack.org wrote on 10/17/2014 11:28:56 AM: From: Thierry Carrez thie...@openstack.org To: openstack-dev@lists.openstack.org Date: 10/17/2014 11:31 AM Subject: Re: [openstack-dev] [QA] Proposal: A launchpad bug description template

Markus Zoeller wrote: TL;DR: A proposal for a template for launchpad bug entries which asks for the minimal needed data to work on a bug.

Note that Launchpad doesn't support bug entry templates. You can display bug reporting guidelines which appear under the textbox, but that's about it. Also note that the text is project-specific, so it needs to be entered in every openstack project. Depending on the exact nature of the project, I suspect the text should be different. Regards, -- Thierry Carrez (ttx)

Thanks for the note on Launchpad's capabilities. Providing the information in the bug reporting guidelines on Launchpad looks like a good place. Currently there is for Nova: Please include the exact version of Nova with which you're experiencing this issue. The wiki page about bugs [1] could be enhanced as well, and then we could let Launchpad link to this wiki page. Maybe this would reduce the maintenance of the template. Subsections could be introduced for project-specific debug data.

[1] https://wiki.openstack.org/wiki/Bugs

Regards, Markus Zoeller IRC: markus_z
[openstack-dev] [novaclient] E12* rules
Hi everyone! I'm working on enabling the E12* PEP8 rules in novaclient (status of my work listed below). IMO, PEP8 rules should be ignored only in extreme cases/for important reasons, and we should decrease the number of ignored rules. This helps to keep code in a more strict, readable form, which is very important when working in a community.

While working on rule E126, we started a discussion with Joe Gordon about the demand for these rules. I have no idea about the reasons why they should be ignored, so I want to know:
- Why should these rules be ignored?
- What do you think about enabling these rules?

Please leave your opinion about the E12* rules.

Already enabled rules:
E121, E125 - https://review.openstack.org/#/c/122888/
E122 - https://review.openstack.org/#/c/123830/
E123 - https://review.openstack.org/#/c/123831/

Abandoned rule:
E124 - https://review.openstack.org/#/c/123832/

Pending review:
E126 - https://review.openstack.org/#/c/123850/
E127 - https://review.openstack.org/#/c/123851/
E128 - https://review.openstack.org/#/c/127559/
E129 - https://review.openstack.org/#/c/123852/

-- Best regards, Andrey Kurilin.
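For readers who don't have the pep8 codes memorized, the E12x family covers continuation-line indentation. A small sketch of the kind of thing E128 flags, and the compliant form (do_something and its arguments are placeholders):

    # E128: continuation line under-indented for visual indent
    result = do_something(argument_one,
        argument_two)

    # OK: continuation aligned with the opening parenthesis
    result = do_something(argument_one,
                          argument_two)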
[openstack-dev] [Elections] Results of the TC Election
Please join me in congratulating the 6 newly elected members of the TC.

* Monty Taylor
* Sean Dague
* Doug Hellmann
* Russell Bryant
* Anne Gentle
* John Griffith

Full results: http://civs.cs.cornell.edu/cgi-bin/results.pl?id=E_c105db929e6c11f4

Thank you to all candidates who stood for election; having a good group of candidates helps engage the community in our democratic process. Thank you to Mark McLoughlin, who served in the previous TC and chose to run for a seat this time. Thank you to all who voted and who encouraged others to vote. We need to ensure your voice is heard.

Thanks to my fellow election official, Tristan Cacqueray; I appreciate your help and perspective.

Thank you for another great round. Here's to Kilo, Anita.
[openstack-dev] [Nova] Turbo hipster problems
Hi, Anyone aware why turbo-hipster is failing with:

real-db-upgrade_nova_percona_user_002: th-percona http://localhost Exception: [Errno 2] No such file or directory: '/var/lib/turbo-hipster/datasets_user_002' in 0s

Thanks Gary
Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4
On 2014-10-17 16:57:59 +1300 (+1300), Fei Long Wang wrote: Thanks for the heads up. Is there a bug opened to track this? If not, I'm going to open one and dig into it. Cheers.

Gah! You'd think *I* would know better at this point--sorry about that... I've now opened https://launchpad.net/bugs/1382582 to track this. Thanks for any assistance you're able to provide! -- Jeremy Stanley
Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4
On Fri, Oct 17, 2014 at 03:01:22PM +, Jeremy Stanley wrote: Gah! You'd think *I* would know better at this point--sorry about that... I've now opened https://launchpad.net/bugs/1382582 to track this. Thanks for any assistance you're able to provide!

This looks like a continuation of the old PYTHONHASHSEED bug: https://launchpad.net/bugs/1348818
Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target
I would also advise pinning the version of mccabe we're using. Mccabe was originally a proof-of-concept script that Ned Batchelder wrote and which Tarek Ziade vendored into Flake8. After we split it out in v2 of Flake8, we've found several (somewhat serious) reporting problems with the tool. Currently the package owner on PyPI hasn't granted me permissions to release a new version of the package, but we have several fixes in the repository: https://github.com/flintwork/mccabe. The changes are somewhat drastic but they should reduce the average function/method's complexity by 1 or 2 points.

I'm going to bother Florent again to give me permission to release the package since it has been far too long since a release has been cut. For what it's worth, Florent doesn't pay close attention to GitHub notifications, so chiming in on (or creating) issues on mccabe to release a new version will only spam *me*. So please don't pile on to anything existing or create a new one.

Cheers, Ian

On 10/17/14, 12:39 AM, Michael Davies mich...@the-davies.net wrote: On Fri, Oct 17, 2014 at 2:39 PM, Joe Gordon joe.gord...@gmail.com wrote: First step in fixing this, put a cap on it: https://review.openstack.org/129125

Thanks Joe - I've just put up a similar patch for Ironic: https://review.openstack.org/129132

-- Michael Davies mich...@the-davies.net Rackspace Australia
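A cap of the sort Joe and Michael describe is a one-line requirements change; a sketch (the file and bounds are illustrative, check the actual reviews above for the real ones):

    # test-requirements.txt
    mccabe>=0.2.1,<0.3  # cap until the fixed mccabe release is vetted

This keeps a surprise mccabe release from changing C901 results across every gate job overnight.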
Re: [openstack-dev] [nova] APIImpact flag for nova specs
On Oct 15, 2014, at 5:52 AM, Christopher Yeoh cbky...@gmail.com wrote: We don't require new templates as part of nova-specs and api changes don't necessarily change the api sample tpl files. We do ask for some jsonschema descriptions of the new APIs input but they work pretty well in the spec document itself. I agree it could be prone to spelling mistakes etc, though just being able to search for 'api' would be sufficient and people who review specs could pick up missing or misspelled flags in the commit message (and it wouldn't necessarily need to be restricted to just APIImpact as possible flags).

+1 to the APIImpact flag. That there could be misses is not a good reason to not do this. Which is to say, let's do this.

Everett
Re: [openstack-dev] [kolla] on Dockerfile patterns
docker exec would be awesome. So... what's Red Hat's stance on docker upgrades here? I'm running centos7, and docker's topped out at docker-0.11.1-22.el7.centos.x86_64 (though Red Hat package versions don't always reflect the upstream version). I tried running the docker 1.2 binary from docker.io but selinux flipped out on it. How long before docker exec is actually a useful solution for debugging on such systems?

Thanks, Kevin

From: Lars Kellogg-Stedman [l...@redhat.com] Sent: Thursday, October 16, 2014 7:14 PM To: OpenStack Development Mailing List (not for usage questions) Subject: Re: [openstack-dev] [kolla] on Dockerfile patterns

On Fri, Oct 17, 2014 at 12:44:50PM +1100, Angus Lees wrote: You just need to find the pid of a process in the container (perhaps using docker inspect to go from container name - pid) and then: nsenter -t $pid -m -u -i -n -p -w

Note also that the 1.3 release of Docker (any day now) will sport a shiny new docker exec command that will provide you with the ability to run commands inside the container via the docker client without having to involve nsenter (or nsinit). It looks like:

docker exec container_id ps -fe

Or:

docker exec -it container_id bash

-- Lars Kellogg-Stedman l...@redhat.com | larsks @ {freenode,twitter,github} Cloud Engineering / OpenStack | http://blog.oddbit.com/
Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer
On Thu, 16 Oct 2014, Jim Mankovich wrote: What I would like to propose is dropping the ipmi string from the name altogether and appending the Sensor ID to the name instead of to the Resource ID. So, transforming the above to the new naming would result in the following:

| Name                                      | Type  | Unit | Resource ID                          |
| hardware.current.power_meter_(0x16)       | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0 |
| hardware.temperature.system_board_(0x15)  | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0 |

[plus sensor_provider in resource_metadata]

If this makes sense for the kinds of queries that need to happen then we may as well do it, but I'm not sure it is. When I was writing the consumer code for the notifications, the names of the meters was a big open question that was hard to resolve because of insufficient data and input on what people really need to do with the samples.

The scenario you've listed is getting all sensors on a given single platform. What about the scenario where you want to create an alarm that says If temperature gets over X on any system board on any of my hardware, notify the authorities? Will having the _(0x##) qualifier allow that to work? I don't actually know; are those qualifiers standard in some way or are they specific to different equipment? If they are different, having them in the meter name makes creating a useful alarm in a heterogeneous environment a bit more of a struggle, doesn't it?

Perhaps (if they are not standard) this would work:

| hardware.current.power_meter | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0 |

with both sensor_provider and whatever that qualifier is called in the metadata? Then the name remains sufficiently generic to allow aggregates across multiple systems, while still having the necessary info to narrow to different sensors of the same type.

I understand that this proposed change is not backward compatible with the existing naming, but I don't really see a good solution that would retain backward compatibility.

I think we should strive to worry less about such things, especially when it's just names in data fields. Not always possible, or even a good idea, but sometimes it's a win.

-- Chris Dent tw:@anticdent freenode:cdent https://tank.peermore.com/tanks/cdent
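Chris's alarm scenario is worth making concrete, since it is the strongest argument for generic meter names. With a qualifier-free meter, one threshold alarm covers every matching resource; a sketch with the Juno-era ceilometer CLI (flag spellings from memory, threshold value illustrative):

    $ ceilometer alarm-threshold-create --name board-temp-high \
        --meter-name hardware.temperature.system_board \
        --statistic max --comparison-operator gt --threshold 80 --period 600

If each vendor's sensor ID (or the _(0x##) record ID) lands in the meter name instead, you need one such alarm per distinct spelling.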
[openstack-dev] [api] Request Validation - Stoplight
Hi API Working Group,

Last night at the OpenStack Meetup in Atlanta, a group of us discussed how request validation is being performed over various projects and how some teams are using Pecan/WSME, or warlock, jsonschema, etc. Each of these libraries has its own pros and cons. My understanding is that the API working group is in the early stages of looking into these various libraries and will likely provide guidance in the near future on this.

I would like to suggest another library to evaluate when deciding this. Some of our teams have started to use a library named Stoplight [1][2] in our projects. For example, in the Poppy CDN project, we found it worked around some of the issues we had with warlock, such as validating nested json correctly [3].

Stoplight is an input validation framework for python. It can be used to decorate any function (including routes in pecan or falcon) to validate its parameters. Some good examples can be found here [4] on how to use Stoplight.

Let us know your thoughts/interest and we would be happy to discuss further on if and how this would be valuable as a library for API request validation in OpenStack.

Thanks, Amit Gandhi Senior Manager - Rackspace

[1] https://pypi.python.org/pypi/stoplight
[2] https://github.com/painterjd/stoplight
[3] https://github.com/stackforge/poppy/blob/master/poppy/transport/pecan/controllers/v1/services.py#L108
[4] https://github.com/painterjd/stoplight/blob/master/stoplight/tests/test_validation.py#L138
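Whatever library wins, the decorator pattern being compared here is easy to picture. Below is a minimal hand-rolled sketch of decorator-based request-body validation (illustrative plain jsonschema, not Stoplight's actual API; see [4] above for real Stoplight usage):

    import functools
    import jsonschema

    SERVICE_SCHEMA = {
        'type': 'object',
        'properties': {'name': {'type': 'string'}},
        'required': ['name'],
    }

    def validate_body(schema):
        """Reject bad input before the route handler body runs."""
        def decorator(fn):
            @functools.wraps(fn)
            def wrapper(self, body, *args, **kwargs):
                try:
                    jsonschema.validate(body, schema)
                except jsonschema.ValidationError as e:
                    # a real controller would translate this into an HTTP 400
                    raise ValueError('Invalid request body: %s' % e.message)
                return fn(self, body, *args, **kwargs)
            return wrapper
        return decorator

    class ServicesController(object):
        @validate_body(SERVICE_SCHEMA)
        def post(self, body):
            return 'created %s' % body['name']

The appeal of this style, whichever framework provides it, is that the validation rules sit next to the route they protect instead of being buried in the handler body.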
Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4
On 2014-10-17 16:17:39 +0100 (+0100), Louis Taylor wrote: This looks like a continuation of the old PYTHONHASHSEED bug: https://launchpad.net/bugs/1348818

The underlying design choices in python-glanceclient's tests do cause both problems (can't run with a random hash seed, but also can't run under a different hash algorithm), and properly fixing one will fix the other. Unfortunately there isn't an easy workaround for the Python 3.4 testing issue, unlike bug 1348818. -- Jeremy Stanley
[openstack-dev] [TripleO] CI report : 04/10/2014 - 17/10/2014
Hi All, Nothing to report since the last report, 2 weeks of no breakages. thanks, Derek.
[openstack-dev] DB Datasets CI broken
Hi, I've just noticed that the DB Datasets CI (the artist formerly known as turbo hipster) is failing for many patches. I'm looking into it now. Michael -- Rackspace Australia
Re: [openstack-dev] DB Datasets CI broken
This was a bad image in nodepool. I've rebuilt the image and killed our pool of workers running the old image, and things seem to be ok now. I'm in the process of enqueueing rechecks for every failed turbo-hipster run now, but they'll take some time to all get executed. Thanks for your patience everyone.

Is it possible to add a verification step to nodepool so that it doesn't mark a new image as ready unless it passes some basic sanity checks?

Thanks, Michael

On Sat, Oct 18, 2014 at 8:44 AM, Michael Still mi...@stillhq.com wrote: Hi, I've just noticed that the DB Datasets CI (the artist formerly known as turbo hipster) is failing for many patches. I'm looking into it now. Michael -- Rackspace Australia

-- Rackspace Australia
Re: [openstack-dev] [api] API recommendation
On Oct 16, 2014 8:24 AM, Dean Troyer dtro...@gmail.com wrote: On Thu, Oct 16, 2014 at 4:57 AM, Salvatore Orlando sorla...@nicira.com wrote: From an API guideline viewpoint, I understand that https://review.openstack.org/#/c/86938/ proposes the introduction of a rather simple endpoint to query active tasks and filter them by resource uuid or state, for example.

That review/blueprint contains one thing that I want to address in more detail below along with Sal's comment on persistence...

While this is hardly questionable, I wonder if it might be worth typifying the task, ie: adding a resource_type attribute, and/or allowing to retrieve active tasks as a child resource of an object, eg:

GET /servers/server_id/tasks?state=running

or, if just for running tasks:

GET /servers/server_id/active_tasks

I'd prefer the filter approach, but more importantly, it should be the _same_ structure as listing resources themselves. To note: here is another API design detail, specifying resource types in the URL path: /server/server/foo vs /server/foo or what we have today, for example, in compute: /tenant/foo

The proposed approach for the multiple server create case also makes sense to me. Other than bulk operations there are indeed cases where a single API operation needs to perform multiple tasks. For instance, in Neutron, creating a port implies L2 wiring, setting up DHCP info, and securing it on the compute node by enforcing anti-spoof rules and security groups. This means there will be 3/4 active tasks. For this reason I wonder if it might be the case of differentiating between the concept of operation and tasks, where the former is the activity explicitly initiated by the API consumer, and the latter are the activities which need to complete to fulfil it. This is where we might leverage the already proposed request_id attribute of the task data structure.

I like the ability to track the fan-out, especially if I can get the state of the entire set of tasks in a single round-trip. This also makes it easier to handle backout of failed requests without having to maintain a lot of client-side state, or make a lot of round-trips.

Based on previous experience, I highly recommend maintaining separation between tracking work at an API call level aggregate and other subtasks. In non-provisioning scenarios, tasks may fire independent of API operations, so there wouldn't be an API handle to query on. It is great to manage per-API call level tasks in the framework. The other work type tasks are *much* more complicated beasts, deserving of their own design.

Finally, a note on persistence. How long should a completed task, successful or not, be stored? Do we want to store them until the resource they operated on is deleted? I don't think it's a great idea to store them indefinitely in the DB. Tying their lifespan to resources is probably a decent idea, but time-based cleanup policies might also be considered (e.g.: destroy a task record 24 hours after its completion).

I can envision an operator/user wanting to be able to pull a log of an operation/task for not only cloud debugging (x failed to build, when/why?) but also app-level debugging (concrete use case not ready at deadline). This would require a minimum of life-of-resource + some-amount-of-time. The time might also be variable; failed operations might actually need to stick around longer. Even as an operator with access to backend logging, pulling these state transitions out should not be hard, and should be available to the resource owner (project).
dt -- Dean Troyer dtro...@gmail.com
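To make the proposed shape concrete, a hypothetical response for the per-resource task listing discussed above might look like this (field names are illustrative, not taken from the spec under review; the elided UUIDs are placeholders):

    GET /servers/9eaf4bef-.../tasks?state=running

    {
      "tasks": [
        {
          "id": "b3e7d35f-...",
          "request_id": "req-2a61e8d5-...",
          "action": "create",
          "state": "running",
          "started_at": "2014-10-17T12:03:34Z",
          "resource_id": "9eaf4bef-..."
        }
      ]
    }

A shared request_id across sibling tasks is what would let a client recover the whole fan-out of one operation in a single query.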
Re: [openstack-dev] DB Datasets CI broken
On 2014-10-18 10:45:23 +1100 (+1100), Michael Still wrote: [...] Is it possible to add a verification step to nodepool so that it doesn't mark a new image as ready unless it passes some basic sanity checks?

Back in the beforetime, when devstack-gate had scripts which managed the worker pool as scheduled Jenkins jobs, it would run DevStack exercises on a test boot of the new image before using it to boot real images. Of course you can imagine the number of perfectly good images which were thrown away because of nondeterministic bugs causing false negative results there, so we probably wouldn't want to duplicate that exactly, but perhaps something more lightweight would be a reasonable compromise. Anyway, I consider it a good feature request (others may disagree), just nobody's reimplemented it in nodepool to date. -- Jeremy Stanley
Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer
Chris, See answers inline. I don't have any concrete answers as to how to deal with some of the questions you brought up, but I do have some more detail that may be useful to further the discussion.

On 10/17/2014 11:03 AM, Chris Dent wrote: On Thu, 16 Oct 2014, Jim Mankovich wrote: What I would like to propose is dropping the ipmi string from the name altogether and appending the Sensor ID to the name instead of to the Resource ID. So, transforming the above to the new naming would result in the following:

| Name                                      | Type  | Unit | Resource ID                          |
| hardware.current.power_meter_(0x16)       | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0 |
| hardware.temperature.system_board_(0x15)  | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0 |

[plus sensor_provider in resource_metadata]

If this makes sense for the kinds of queries that need to happen then we may as well do it, but I'm not sure it is. When I was writing the consumer code for the notifications, the names of the meters was a big open question that was hard to resolve because of insufficient data and input on what people really need to do with the samples. The scenario you've listed is getting all sensors on a given single platform. What about the scenario where you want to create an alarm that says If temperature gets over X on any system board on any of my hardware, notify the authorities? Will having the _(0x##) qualifier allow that to work? I don't actually know; are those qualifiers standard in some way or are they specific to different equipment? If they are different, having them in the meter name makes creating a useful alarm in a heterogeneous environment a bit more of a struggle, doesn't it?

The _(0x##) is an ipmitool display artifact that is tacked onto the end of the Sensor ID in order to provide more information beyond what the Sensor ID has in it. The ## is the sensor record ID, which is specific to IPMI. Whether or not a Sensor ID (sans _(0x##)) is unique is up to the vendor, but in general I believe all vendors will likely name their sensors uniquely; otherwise, how can a person differentiate textually what component in a platform the sensor represents?

Personally, I would like to see the _(0x##) removed from the Sensor ID string (by the ipmitool driver) before it returns sensors to the Ironic conductor. I just don't see any value in this extra info. This 0x## addition only helps if a vendor used the exact same Sensor ID string for multiple sensors of the same sensor type, i.e. multiple sensors of type Temperature, each with the exact same Sensor ID string of CPU, instead of giving each Sensor ID string a unique name like CPU 1, CPU 2, ...

Now if you want to get deeper into the IPMI realm (which I don't really want to advocate), the Entity ID Code actually tells you the component. From the IPMI spec, section 43.14 Entity IDs: The Entity ID field is used for identifying the physical entity that a sensor or device is associated with. If multiple sensors refer to the same entity, they will have the same Entity ID field value. For example, if a voltage sensor and a temperature sensor are both for a ‘Power Supply 1’ entity the Entity ID in their sensor data records would both be 10 (0Ah), per the Entity ID table. FYI: Entity 10 (0Ah) means power supply.

In a heterogeneous platform environment, the Sensor ID string is likely going to be different per vendor, so your question If temperature...on any system board...on any hardware, notify the authorities is going to be tough because each vendor may name their system board differently.
But, I bet that vendors use similar strings, so worst case, your alarm creation could require 1 alarm definition per vendor.

Perhaps (if they are not standard) this would work:

| hardware.current.power_meter | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0 |

with both sensor_provider and whatever that qualifier is called in the metadata?

I see generic naming as somewhat problematic. If you lump all the temperature sensors for a platform under hardware.temperature, the consumer will always need to query for the specific temperature sensor that it is interested in, like system board. The notion of having different samples from multiple sensors under a single generic name seems harder to deal with to me. If you have multiple temperature samples under the same generic meter name, how do you figure out what possible temperature samples actually exist?

Then the name remains sufficiently generic to allow aggregates across multiple systems, while still having the necessary info to narrow to different sensors of the same type.

I understand that this proposed change is not backward compatible with the existing naming, but I don't really see a good solution that would retain backward compatibility.

I think we should strive to worry less about such things, especially when it's just names in data fields. Not always possible, or even a good idea, but sometimes it's a win.

I'm always good with
Re: [openstack-dev] [api] Request Validation - Stoplight
Hi Amit,

Keeping in mind that this is nothing but my own personal view, my recommendation would be to not mandate the use of a particular validation framework, but instead to define what kind of validation clients should expect the server to perform in general. For example, I would expect a service to return an error code and not perform any action if I called "Create server" but did not include a request body; the actual manner in which that error is generated within the service does not matter from the client's perspective.

This is not to say the API Working Group wouldn't help you evaluate the potential of Stoplight to meet the needs of a service. To the contrary, by clearly defining the expectations of a service's responses to requests, you'll have a great idea of exactly what to look for in your evaluation, and your final decision would be based on objective results.

Thank you,
Sam Harwell

From: Amit Gandhi [mailto:amit.gan...@rackspace.com]
Sent: Friday, October 17, 2014 12:32 PM
To: OpenStack Development Mailing List (not for usage questions)
Cc: r...@ryanpetrello.com
Subject: [openstack-dev] [api] Request Validation - Stoplight

Hi API Working Group,

Last night at the OpenStack meetup in Atlanta, a group of us discussed how request validation is being performed across various projects, and how some teams are using pecan/WSME, warlock, jsonschema, etc. Each of these libraries has its own pros and cons.

My understanding is that the API Working Group is in the early stages of looking into these various libraries and will likely provide guidance on this in the near future. I would like to suggest another library to evaluate when deciding this. Some of our teams have started to use a library named Stoplight [1][2] in our projects. For example, in the Poppy CDN project, we found it worked around some of the issues we had with warlock, such as validating nested JSON correctly [3].

Stoplight is an input validation framework for Python. It can be used to decorate any function (including routes in pecan or falcon) to validate its parameters. Some good examples of how to use Stoplight can be found here [4].

Let us know your thoughts/interest and we would be happy to discuss further if and how this would be valuable as a library for API request validation in OpenStack.

Thanks,
Amit Gandhi
Senior Manager - Rackspace

[1] https://pypi.python.org/pypi/stoplight
[2] https://github.com/painterjd/stoplight
[3] https://github.com/stackforge/poppy/blob/master/poppy/transport/pecan/controllers/v1/services.py#L108
[4] https://github.com/painterjd/stoplight/blob/master/stoplight/tests/test_validation.py#L138
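As a concrete illustration of the expectation Sam describes (reject a missing or invalid body before acting on it), here is a rough sketch using jsonschema, one of the libraries already named in this thread; the schema and field names are purely illustrative, not taken from any real OpenStack API:

    import jsonschema

    # Toy "create server" request schema (illustrative only).
    CREATE_SERVER_SCHEMA = {
        'type': 'object',
        'properties': {
            'name': {'type': 'string', 'minLength': 1},
            'flavor': {'type': 'string'},
        },
        'required': ['name', 'flavor'],
        'additionalProperties': False,
    }

    def validate_create_server(body):
        """Return (ok, reason) without performing any action on failure."""
        if body is None:
            return False, 'request body is required'
        try:
            jsonschema.validate(body, CREATE_SERVER_SCHEMA)
        except jsonschema.ValidationError as exc:
            return False, exc.message
        return True, None

    # The caller maps a False result to an error response (e.g. 400)
    # before any state is touched; how the check is implemented stays
    # invisible to the client, which is the point above.

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev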
Re: [openstack-dev] DB Datasets CI broken
On Sat, Oct 18, 2014 at 11:02 AM, Jeremy Stanley fu...@yuggoth.org wrote:

On 2014-10-18 10:45:23 +1100 (+1100), Michael Still wrote: [...] Is it possible to add a verification step to nodepool so that it doesn't mark a new image as ready unless it passes some basic sanity checks?

Back in the beforetime, when devstack-gate had scripts which managed the worker pool as scheduled Jenkins jobs, it would run DevStack exercises on a test boot of the new image before using it to boot real nodes. Of course, you can imagine the number of perfectly good images which were thrown away because nondeterministic bugs caused false negative results there, so we probably wouldn't want to duplicate that exactly, but perhaps something more lightweight would be a reasonable compromise. Anyway, I consider it a good feature request (others may disagree); it's just that nobody has reimplemented it in nodepool to date.

Yeah, I'm starting to think along the lines of adding a simple sanity check to the shell worker in turbo hipster before the real tests run - things like checking that the git directory exists and contains a git repo with the branches we need. We could run that pre-flight script (or a variant of it) on images before marking them as ready.

For reference, what we think happened here is that the cache of SQL databases baked into the image was rsynced from our master while jhesketh was in the process of updating the SQL databases to a more recent version of OpenStack.
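A rough sketch of the kind of pre-flight check described above (the paths and branch names are placeholders):

    import os
    import subprocess

    def repo_is_sane(path, required_branches):
        """Cheap sanity check: path is a git repo with the branches we need."""
        if not os.path.isdir(os.path.join(path, '.git')):
            return False
        try:
            out = subprocess.check_output(['git', 'branch', '--list'], cwd=path)
        except (OSError, subprocess.CalledProcessError):
            return False
        branches = {line.strip('* \t') for line in out.decode().splitlines()}
        return set(required_branches) <= branches

    # e.g. repo_is_sane('/var/lib/turbo-hipster/git/openstack/nova',
    #                   ['master', 'stable/juno'])

Michael

--
Rackspace Australia

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev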
Re: [openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades
I’m glad to see this topic getting some focus once again. :-)

From several of the administrators I talk with, when they think of putting a host into maintenance mode, the common requests I hear are:

1. Don’t schedule more VMs to the host.
2. Provide an optional way to automatically migrate all (usually active) VMs off the host so that users’ workloads remain “unaffected” by the maintenance operation.

#1 can easily be achieved, as has been mentioned several times, by simply disabling the compute service. However, #2 involves a little more work, although it is certainly possible using the operations nova provides today (e.g., live migration).

I believe these types of discussions have come up several times over the past several OpenStack releases, certainly since Grizzly (i.e., when I started watching this space). It seems that the general direction is to have the type of workflow needed for #2 outside of nova (which is certainly a valid stance). To that end, it would be fairly straightforward to build some code that logically sits on top of nova and that, when entering maintenance:

1. Prevents VMs from being scheduled to the host;
2. Maintains state about the maintenance operation (e.g., not in maintenance, migrations in progress, in maintenance, or error);
3. Provides mechanisms to dictate, upon entering maintenance, which VMs (active, all, none) to migrate, with some throttling capabilities to prevent hundreds of parallel migrations on densely packed hosts (all done via a REST API).

If anyone has additional questions, comments, or would like to discuss some options, please let me know. If interested, upon request, I could even share a video of how such cases might work. :-) My colleagues and I have given these use cases a lot of thought and consideration, and I’d love to talk more about them (perhaps a small session in Paris would be possible).

- Joe

On Oct 17, 2014, at 4:18 AM, John Garbutt j...@johngarbutt.com wrote:

On 17 October 2014 02:28, Matt Riedemann mrie...@linux.vnet.ibm.com wrote:

On 10/16/2014 7:26 PM, Christopher Aedo wrote:

On Tue, Sep 9, 2014 at 2:19 PM, Mike Scherbakov mscherba...@mirantis.com wrote:

On Tue, Sep 9, 2014 at 6:02 PM, Clint Byrum cl...@fewbar.com wrote:

The idea is not simply to deny or hang requests from clients, but to tell them "we are in maintenance mode, retry in X seconds".

You probably would want 'nova host-servers-migrate host'.

Yeah, for migrations - but as far as I understand, it doesn't help with disabling this host in the scheduler - there can still be a chance that some workloads will be scheduled to the host.

Regarding putting a compute host in maintenance mode using nova host-update --maintenance enable, it looks like the blueprint and associated commits were abandoned a year and a half ago: https://blueprints.launchpad.net/nova/+spec/host-maintenance

It seems that nova service-disable host nova-compute effectively prevents the scheduler from trying to send new work there. Is this the best approach to use right now if you want to pull a compute host out of an environment before migrating VMs off?

I agree with Tim and Mike that having something respond "down for maintenance" rather than ignore or hang would be really valuable. But it also looks like that hasn't gotten much traction in the past - anyone feel like they'd be in support of reviving the notion of maintenance mode?
-Christopher

___ OpenStack-dev mailing list OpenStack-dev@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

host-maintenance-mode is definitely a thing in nova-compute via the os-hosts API extension and the --maintenance parameter; the compute manager code is here [1]. The thing is, the only in-tree virt driver that implements it is xenapi, and I believe when you put the host in maintenance mode it's supposed to automatically evacuate the instances to some other host, but you can't target the other host or tell the driver, from the API, which instances you want to evacuate, e.g. all, none, running only, etc.

[1] http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2#n3990

We should certainly make that more generic. It doesn't update the VM state, so it's really only admin-focused in its current form. The XenAPI logic only works when using XenServer pools with shared NFS storage, if my memory serves me correctly. Honestly, it's a bit of code I have planned on removing, along with the rest of the pool support.

In terms of requiring DB downtime in Nova, the current efforts are focusing on avoiding downtime altogether, via expand/contract style migrations, with a little help from objects to avoid data migrations. That doesn't mean maintenance mode is not useful for other things, like an emergency patching of the hypervisor.

John
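Pulling the thread together: a rough sketch of the kind of wrapper Joe describes, driven through python-novaclient (the calls are as I recall them from the 2014-era v1.1 client and should be treated as untested pseudocode; credentials and host names are placeholders):

    from novaclient.v1_1 import client as nova_client

    # Placeholder credentials.
    nova = nova_client.Client('admin', 'secret', 'admin',
                              'http://keystone.example.com:5000/v2.0')

    def enter_maintenance(host, max_parallel=2):
        # 1. Stop scheduling new VMs to the host.
        nova.services.disable(host, 'nova-compute')

        # 2. Migrate active VMs off, throttled so a densely packed host
        # isn't hit with hundreds of parallel migrations.
        servers = nova.servers.list(
            search_opts={'host': host, 'all_tenants': 1, 'status': 'ACTIVE'})
        for i in range(0, len(servers), max_parallel):
            for server in servers[i:i + max_parallel]:
                server.live_migrate()
            # A real implementation would poll each migration to
            # completion, and track state (migrating / in maintenance /
            # error), before starting the next batch.

    # For comparison, the os-hosts call behind "nova host-update
    # --maintenance enable" (which, per the discussion above, only the
    # xenapi driver acts on in-tree) would look roughly like:
    #     nova.hosts.update('compute-01', {'maintenance_mode': 'enable'})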