Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target

2014-10-17 Thread Clint Byrum
Excerpts from Mike Spreitzer's message of 2014-10-16 22:24:30 -0700:
 I like the idea of measuring complexity.  I looked briefly at `python -m 
 mccabe`.  It seems to measure each method independently.  Is this really 
 fair?  If I have a class with some big methods, and I break it down into 
 more numerous and smaller methods, then the largest method gets smaller, 
 but the number of methods gets larger.  A large number of methods is 
 itself a form of complexity.  It is not clear to me that said re-org has 
 necessarily made the class easier to understand.  I can also break one 
 class into two, but it is not clear to me that the project has necessarily 
 become easier to understand.  While it is true that when you truly make a 
 project easier to understand you sometimes break it into more classes, it 
 is also true that you can do a bad job of re-organizing a set of classes 
 while still reducing the size of the largest method.  Has the McCabe 
 metric been evaluated on Python projects?  There is a danger in focusing 
 on what is easy to measure if that is not really what you want to 
 optimize.
 
 BTW, I find that one of the complexity issues for me when I am learning 
 about a Python class is doing the whole-program type inference so that I 
 know what the arguments are.  It seems to me that if you want to measure 
 complexity of Python code then something like the complexity of the 
 argument typing should be taken into account.
 

Fences don't solve problems. Fences make it harder to cause problems.

Of course you can still do the wrong thing and make the code worse. But
you can't do _this_ wrong thing without asserting why you need to.
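
For reference, the per-method measurement Mike describes (`python -m
mccabe`) can be reproduced programmatically. A minimal sketch, assuming
the PathGraphingAstVisitor API shipped in the mccabe releases of that
era, with a hypothetical target file:

    import ast
    import mccabe

    # Parse the module and walk it with mccabe's visitor; each resulting
    # graph corresponds to one function or method, measured independently.
    with open('nova/virt/libvirt/config.py') as f:   # hypothetical path
        tree = ast.parse(f.read())

    visitor = mccabe.PathGraphingAstVisitor()
    visitor.preorder(tree, visitor)
    for graph in sorted(visitor.graphs.values(),
                        key=lambda g: g.complexity(), reverse=True):
        print(graph.name, graph.complexity())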



Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target

2014-10-17 Thread Daniel P. Berrange
On Fri, Oct 17, 2014 at 03:03:43PM +1100, Michael Still wrote:
 I think nova wins. We have:
 
 ./nova/virt/libvirt/driver.py:3736:1: C901
 'LibvirtDriver._get_guest_config' is too complex (67)

IMHO this tool is of pretty dubious value. I mean that function is long
for sure, but it is by no means a serious problem in the Nova libvirt
codebase. The stuff it complains about in the libvirt/config.py file is
just an incredibly stupid thing to highlight.

We've got plenty of big problems that need addressing in the OpenStack
codebase and I don't see this tool highlighting any of them. Better to
have people focus on solving actual real problems we have than trying
to get some arbitrary code analysis score to hit a magic value.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|



Re: [openstack-dev] [Neutron] BGPVPN implementation discussions

2014-10-17 Thread Damon Wang
Good news, +1

2014-10-17 0:48 GMT+08:00 Mathieu Rohon mathieu.ro...@gmail.com:

 Hi all,

 as discussed during today's l3-meeting, we keep on working on BGPVPN
 service plugin implementation [1].
 MPLS encapsulation is now supported in OVS [2], so we would like to
 submit a design to leverage OVS capabilities. A first design proposal,
 based on l3agent, can be found here :


 https://docs.google.com/drawings/d/1NN4tDgnZlBRr8ZUf5-6zzUcnDOUkWSnSiPm8LuuAkoQ/edit

 this solution is based on bagpipe [3], and its capacity to manipulate
 OVS, based on advertised and learned routes.

 [1]https://blueprints.launchpad.net/neutron/+spec/neutron-bgp-vpn
 [2]https://raw.githubusercontent.com/openvswitch/ovs/master/FAQ
 [3]https://github.com/Orange-OpenSource/bagpipe-bgp


 Thanks

 Mathieu




Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target

2014-10-17 Thread Matthew Gilliard
I like measuring code metrics, and I definitely support Joe's change
here. I think of McCabe complexity as a proxy for testability and
readability of code, both of which are IMO actual real problems in the
nova codebase. If you are an experienced openstack dev you might find
the code easy to move around, but large and complex functions are
difficult for beginners to grok.

As an exercise, I took the method in libvirt/config.py and removed
everything except the flow-control keywords (i.e. the things that affect
the McCabe complexity): http://paste.openstack.org/show/121589/ - I
would find it difficult to hold all that in my head at once. It's
possible to argue that this is a false-positive, but my experience is
that this tool finds code which needs improvement.
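
To make the counting concrete, here is a toy function (hypothetical, not
taken from libvirt/config.py) annotated with the decision points the
McCabe metric counts; by the usual counting it scores 5 (4 decisions + 1):

    def classify(values):
        result = []
        for v in values:              # +1 decision
            if v is None:             # +1
                continue
            elif v < 0:               # +1
                result.append('negative')
            elif v == 0:              # +1
                result.append('zero')
            else:                     # else branches add nothing
                result.append('positive')
        return result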

That said, these should be descriptive metrics rather than
prescriptive targets. There are products which chart a codebase's
evolution over time, such as www.sonarsource.com, which are really
great for provoking thought and conversation about code quality. Now
that I'm interested, I'll have a look into it.

  Matthew



Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-17 Thread Jastrzebski, Michal


 -Original Message-
 From: Florian Haas [mailto:flor...@hastexo.com]
 Sent: Thursday, October 16, 2014 10:53 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [Nova] Automatic evacuate
 
 On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal
 michal.jastrzeb...@intel.com wrote:
  In my opinion flavor defining is a bit hacky. Sure, it will provide us
  functionality fairly quickly, but also will strip us from flexibility
  Heat would give. Healing can be done in several ways, simple destroy
  -> create (basic convergence workflow so far), evacuate with or
  without shared storage, even rebuild vm, probably few more when we put
  more thoughts to it.
 
 But then you'd also need to monitor the availability of *individual* guests and
 down you go the rabbit hole.
 
 So suppose you're monitoring a guest with a simple ping. And it stops
 responding to that ping.

I was more referring to monitoring the host (not the guest), and for sure not by ping.
I was thinking of the current zookeeper servicegroup implementation; we might want
to use corosync and write a servicegroup plugin for that. There are several choices
for that, and each requires testing before we make any decision.

There is also the fencing case, which we agree is important, and I think nova should
be able to do that (since it does evacuate, it should also do fencing). But
for working fencing we really need working host health monitoring, so I suggest
we take baby steps here and solve one issue at a time. And that would be host
monitoring.

 (1) Has it died?
 (2) Is it just too busy to respond to the ping?
 (3) Has its guest network stack died?
 (4) Has its host vif died?
 (5) Has the L2 agent on the compute host died?
 (6) Has its host network stack died?
 (7) Has the compute host died?
 
 Suppose further it's using shared storage (running off an RBD volume or
 using an iSCSI volume, or whatever). Now you have almost as many recovery
 options as possible causes for the failure, and some of those recovery
 options will potentially destroy your guest's data.
 
 No matter how you twist and turn the problem, you need strongly consistent
 distributed VM state plus fencing. In other words, you need a full blown HA
 stack.
 
  I'd rather use nova for low level task and maybe low level monitoring
  (imho nova should do that using servicegroup). But I'd use something
  more configurable for actual task triggering like heat. That
  would give us framework rather than mechanism. Later we might want to
  apply HA on network or volume, then we'll have mechanism ready just
  monitoring hook and healing will need to be implemented.
 
  We can use scheduler hints to place resource on host HA-compatible
  (whichever health action we'd like to use), this will be a bit more
  complicated, but also will give us more flexibility.
 
 I apologize in advance for my bluntness, but this all sounds to me like you're
 vastly underrating the problem of reliable guest state detection and
 recovery. :)

Guest health in my opinion is just a bit out of scope here. If we have a robust
way of detecting host health, we can pretty much assume that if the host dies,
the guests follow.
There are ways to detect guest health (libvirt watchdog, ceilometer, ping you 
mentioned),
but that should be done somewhere else. And for sure not by evacuation.

 
  I agree that we all should meet in Paris and discuss that so we can
  join our forces. This is one of bigger gaps to be filled imho.
 
 Pretty much every user I've worked with in the last 2 years agrees.
 Granted, my view may be skewed as HA is typically what customers approach
 us for in the first place, but yes, this definitely needs a globally 
 understood
 and supported solution.
 
 Cheers,
 Florian
 



Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-17 Thread Chris Dent

On Thu, 16 Oct 2014, Lars Kellogg-Stedman wrote:


On Fri, Oct 17, 2014 at 12:44:50PM +1100, Angus Lees wrote:

You just need to find the pid of a process in the container (perhaps using
docker inspect to go from container name -> pid) and then:
 nsenter -t $pid -m -u -i -n -p -w


Note also that the 1.3 release of Docker (any day now) will sport a
shiny new docker exec command that will provide you with the ability
to run commands inside the container via the docker client.


Yesterday:
http://blog.docker.com/2014/10/docker-1-3-signed-images-process-injection-security-options-mac-shared-directories/


--
Chris Dent tw:@anticdent freenode:cdent
https://tank.peermore.com/tanks/cdent



Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target

2014-10-17 Thread Chris Dent

On Fri, 17 Oct 2014, Daniel P. Berrange wrote:


IMHO this tool is of pretty dubious value. I mean that function is long
for sure, but it is by no means a serious problem in the Nova libvirt
codebase. The stuff it complains about in the libvirt/config.py file is
just an incredibly stupid thing to highlight.


I find a lot of the OpenStack code very hard to read. If it is very
hard to read it is very hard to maintain, whether that means fix or
improve.

That said, the value I see in these kinds of tools is not
specifically in preventing complexity, but in providing entry points
for people who want to fix things. You don't know where to start
(because you haven't yet got the insight or experience): run
flake8 or pylint or some other tools, do what it tells you. In the
process you will:

* learn more about the code
* probably find bugs
* make an incremental improvement to something that needs it

--
Chris Dent tw:@anticdent freenode:cdent
https://tank.peermore.com/tanks/cdent



Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4

2014-10-17 Thread Flavio Percoco
On 10/17/2014 05:57 AM, Fei Long Wang wrote:
 Hi Jeremy,
 
 Thanks for the heads up. Is there a bug opened to track this? If not,
 I'm going to open one and dig into it. Cheers.

Hey Fei Long,

Thanks for taking care of this, please keep me in the loop.

@Jeremy: Thanks for the heads up

Flavio

 
 On 17/10/14 14:17, Jeremy Stanley wrote:
 As part of an effort to deprecate our specialized testing platform
 for Python 3.3, many of us have been working to confirm projects
 which currently gate on 3.3 can also pass their same test sets under
 Python 3.4 (which comes by default in Ubuntu Trusty). For the vast
 majority of projects, the differences between 3.3 and 3.4 are
 immaterial and no effort is required. For some, minor adjustments
 are needed...

 For python-glanceclient, we have 22 failing tests in a tox -e py34
 run. I spent the better part of today digging into them, and they
 basically all stem from the fact that PEP 456 switches the unordered
 data hash algorithm from FNV to SipHash in 3.4. The unit tests in
 python-glanceclient frequently rely on trying to match
 multi-parameter URL queries and JSON built from unordered data types
 against predetermined string representations. Put simply, this just
 doesn't work if you can't guarantee their ordering.
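
 To illustrate the ordering problem (a made-up sketch, not code from
 python-glanceclient's test suite): comparing serialized strings is
 fragile when the hash algorithm or seed changes, while comparing parsed
 structures or key-sorted JSON stays stable:

    import json
    from urllib.parse import urlencode, parse_qs

    params = {'name': 'foo', 'status': 'active'}

    # Fragile: the exact string depends on dict iteration order, which
    # under Python 3.4's hash randomization varies between runs.
    query = urlencode(params)

    # Robust: compare the parsed structure instead of the serialized form.
    assert parse_qs(query) == {'name': ['foo'], 'status': ['active']}

    # Robust for JSON fixtures: sort keys to get a deterministic string.
    assert json.dumps(params, sort_keys=True) == '{"name": "foo", "status": "active"}'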

 I'm left with a dilemma--I don't really have time to fix all of
 these (I started to go through and turn the fixture keys into format
 strings embedding dicts filtered through urlencode() for example,
 but it created as many new failures as it fixed), however I'd hate
 to drop Py3K testing for software which currently has it no matter
 how fragile. This is mainly a call for help to anyone with some
 background and/or interest in python-glanceclient's unit tests to
 get them working under Python 3.4, so that we can eliminate the
 burden of maintaining special 3.3 test infrastructure.
 


-- 
@flaper87
Flavio Percoco



Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer

2014-10-17 Thread Dmitry Tantsur

Hi Jim,

On 10/16/2014 07:23 PM, Jim Mankovich wrote:

All,

I would like to get some feedback on a proposal  to change to the
current sensor naming implemented in ironic and ceilometer.

I would like to provide vendor specific sensors within the current
structure for IPMI sensors in ironic and ceilometer, but I have found
that the current  implementation of sensor meters in ironic and
ceilometer is IPMI specific (from a meter naming perspective). This is
not suitable as it currently stands to support sensor information from a
provider other than IPMI. Also, the current Resource ID naming makes
it difficult for a consumer of sensors to quickly find all the sensors
for a given Ironic Node ID, so I would like to propose changing the
Resource ID naming as well.

Currently, sensors sent by ironic to ceilometer get named by ceilometer
as hardware.ipmi.SensorType, and the Resource ID is the Ironic
Node ID with a postfix containing the Sensor ID.  For details
pertaining to the issue with the Resource ID naming, see
https://bugs.launchpad.net/ironic/+bug/1377157, ipmi sensor naming in
ceilometer is not consumer friendly

Here is an example of what meters look like for sensors in ceilometer
with the current implementation:
| Name                       | Type  | Unit | Resource ID
| hardware.ipmi.current      | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0-power_meter_(0x16)
| hardware.ipmi.temperature  | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0-16-system_board_(0x15)

What I would like to propose is dropping the ipmi string from the name
altogether and appending the Sensor ID to the name  instead of to the
Resource ID.   So, transforming the above to the new naming would result
in the following:
| Name                                      | Type  | Unit | Resource ID
| hardware.current.power_meter_(0x16)       | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0
| hardware.temperature.system_board_(0x15)  | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0

+1

Very-very nit, feel free to ignore if inappropriate: maybe 
hardware.temperature.system_board.0x15 ? I.e. use separation with dots, 
do not use brackets?


This structure would provide the ability for a consumer to do a
ceilometer resource list using the Ironic Node ID as the Resource ID to
get all the sensors in a given platform.   The consumer would then
iterate over each of the sensors to get the samples it wanted.   In
order to retain the information as to who provided the sensors, I would
like to propose that a standard sensor_provider field be added to the
resource_metadata for every sensor where the sensor_provider field
would have a string value indicating the driver that provided the sensor
information. This is where the string ipmi, or a vendor specific
string would be specified.

+1


I understand that this proposed change is not backward compatible with
the existing naming, but I don't really see a good solution that would
retain backward compatibility.
For backward compatibility you could _also_ keep old ones (with ipmi in 
it) for IPMI sensors.




Any/All Feedback will be appreciated,
In this version it makes a lot of sense to me, +1 if Ceilometer folks 
are not against.



Jim






Re: [openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades

2014-10-17 Thread John Garbutt
On 17 October 2014 02:28, Matt Riedemann mrie...@linux.vnet.ibm.com wrote:


 On 10/16/2014 7:26 PM, Christopher Aedo wrote:

 On Tue, Sep 9, 2014 at 2:19 PM, Mike Scherbakov
 mscherba...@mirantis.com wrote:

 On Tue, Sep 9, 2014 at 6:02 PM, Clint Byrum cl...@fewbar.com wrote:

  The idea is not to simply deny or hang requests from clients, but to provide
  them "we are in maintenance mode, retry in X seconds"

 You probably would want 'nova host-servers-migrate host'

 yeah for migrations - but as far as I understand, it doesn't help with
  disabling this host in the scheduler - there can be a chance that some
  workloads will be scheduled to the host.


 Regarding putting a compute host in maintenance mode using nova
 host-update --maintenance enable, it looks like the blueprint and
 associated commits were abandoned a year and a half ago:
 https://blueprints.launchpad.net/nova/+spec/host-maintenance

 It seems that nova service-disable host nova-compute effectively
 prevents the scheduler from trying to send new work there.  Is this
 the best approach to use right now if you want to pull a compute host
 out of an environment before migrating VMs off?

 I agree with Tim and Mike that having something respond down for
 maintenance rather than ignore or hang would be really valuable.  But
 it also looks like that hasn't gotten much traction in the past -
 anyone feel like they'd be in support of reviving the notion of
 maintenance mode?

 -Christopher



 host-maintenance-mode is definitely a thing in nova compute via the os-hosts
 API extension and the --maintenance parameter, the compute manager code is
 here [1].  The thing is the only in-tree virt driver that implements it is
 xenapi, and I believe when you put the host in maintenance mode it's
 supposed to automatically evacuate the instances to some other host, but you
 can't target the other host or tell the driver, from the API, which
 instances you want to evacuate, e.g. all, none, running only, etc.

 [1]
 http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2#n3990

We should certainly make that more generic. It doesn't update the VM
state, so it's really only admin-focused in its current form.

The XenAPI logic only works when using XenServer pools with shared NFS
storage, if my memory serves me correctly. Honestly, it's a bit of code
I have planned on removing, along with the rest of the pool support.

In terms of requiring DB downtime in Nova, the current efforts are
focusing on avoiding downtime altogether, via expand/contract style
migrations, with a little help from objects to avoid data migrations.

That doesn't mean maintenance mode is not useful for other things,
like an emergency patching of the hypervisor.

John



Re: [openstack-dev] [QA] Proposal: A launchpad bug description template

2014-10-17 Thread Thierry Carrez
Markus Zoeller wrote:
 TL;DR: A proposal for a template for launchpad bug entries which asks 
for the minimal needed data to work on a bug.

Note that Launchpad doesn't support bug entry templates. You can display
bug reporting guidelines which appear under the textbox, but that's
about it.

Also note that the text is project-specific, so it needs to be entered
in every openstack project. Depending on the exact nature of the
project, I suspect the text should be different.

Regards,

-- 
Thierry Carrez (ttx)



Re: [openstack-dev] [TripleO] Summit scheduling - using our time together wisely.

2014-10-17 Thread Thierry Carrez
Clint Byrum wrote:
 * The Ops Summit is Wednesday/Thursday [3], which overlaps with these
   sessions. I am keenly interested in gathering more contribution from
   those already operating and deploying OpenStack. It can go both ways,
   but I think it might make sense to have more ops-centric topics
   discussed on Friday, when those participants might not be fully
   wrapped up in the ops sessions.

The Ops Summit is actually on Monday and Thursday. Not on Wednesday.
You were wrong on the Internet.

-- 
Thierry Carrez (ttx)



Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target

2014-10-17 Thread Jay Pipes

On 10/17/2014 05:10 AM, Chris Dent wrote:

On Fri, 17 Oct 2014, Daniel P. Berrange wrote:


IMHO this tool is of pretty dubious value. I mean that function is long
for sure, but it is by no means a serious problem in the Nova libvirt
codebase. The stuff it complains about in the libvirt/config.py file is
just an incredibly stupid thing to highlight.


I find a lot of the OpenStack code very hard to read. If it is very
hard to read it is very hard to maintain, whether that means fix or
improve.


Exactly, ++.


That said, the value I see in these kinds of tools is not
specifically in preventing complexity, but in providing entry points
for people who want to fix things. You don't know where to start
(because you haven't yet got the insight or experience): run
flake8 or pylint or some other tools, do what it tells you. In the
process you will:

* learn more about the code
* probably find bugs
* make an incremental improvement to something that needs it


Agreed.

-jay



Re: [openstack-dev] [Nova] Automatic evacuate

2014-10-17 Thread Florian Haas
On Fri, Oct 17, 2014 at 9:53 AM, Jastrzebski, Michal
michal.jastrzeb...@intel.com wrote:


 -Original Message-
 From: Florian Haas [mailto:flor...@hastexo.com]
 Sent: Thursday, October 16, 2014 10:53 AM
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [Nova] Automatic evacuate

 On Thu, Oct 16, 2014 at 9:25 AM, Jastrzebski, Michal
 michal.jastrzeb...@intel.com wrote:
  In my opinion flavor defining is a bit hacky. Sure, it will provide us
  functionality fairly quickly, but also will strip us from flexibility
  Heat would give. Healing can be done in several ways, simple destroy
   -> create (basic convergence workflow so far), evacuate with or
  without shared storage, even rebuild vm, probably few more when we put
  more thoughts to it.

  But then you'd also need to monitor the availability of *individual* guests 
 and
 down you go the rabbit hole.

 So suppose you're monitoring a guest with a simple ping. And it stops
 responding to that ping.

  I was more referring to monitoring the host (not the guest), and for sure not by ping.
  I was thinking of the current zookeeper servicegroup implementation; we might
  want to use corosync and write a servicegroup plugin for that. There are several
  choices for that, and each requires testing before we make any decision.

  There is also the fencing case, which we agree is important, and I think nova
  should be able to do that (since it does evacuate, it should also do fencing). But
  for working fencing we really need working host health monitoring, so I
  suggest we take baby steps here and solve one issue at a time. And that would be
  host monitoring.

You're describing all of the cases for which Pacemaker is the perfect
fit. Sorry, I see absolutely no point in teaching Nova to do that.

 (1) Has it died?
 (2) Is it just too busy to respond to the ping?
 (3) Has its guest network stack died?
 (4) Has its host vif died?
 (5) Has the L2 agent on the compute host died?
 (6) Has its host network stack died?
 (7) Has the compute host died?

 Suppose further it's using shared storage (running off an RBD volume or
 using an iSCSI volume, or whatever). Now you have almost as many recovery
 options as possible causes for the failure, and some of those recovery
 options will potentially destroy your guest's data.

 No matter how you twist and turn the problem, you need strongly consistent
 distributed VM state plus fencing. In other words, you need a full blown HA
 stack.

  I'd rather use nova for low level task and maybe low level monitoring
  (imho nova should do that using servicegroup). But I'd use something
   more configurable for actual task triggering like heat. That
  would give us framework rather than mechanism. Later we might want to
  apply HA on network or volume, then we'll have mechanism ready just
  monitoring hook and healing will need to be implemented.
 
  We can use scheduler hints to place resource on host HA-compatible
   (whichever health action we'd like to use), this will be a bit more
  complicated, but also will give us more flexibility.

 I apologize in advance for my bluntness, but this all sounds to me like 
 you're
 vastly underrating the problem of reliable guest state detection and
 recovery. :)

  Guest health in my opinion is just a bit out of scope here. If we have a
  robust way of detecting host health, we can pretty much assume that if the
  host dies, guests follow.
 There are ways to detect guest health (libvirt watchdog, ceilometer, ping you 
 mentioned),
 but that should be done somewhere else. And for sure not by evacuation.

You're making an important point here; you're asking for a robust way
of detecting host health. I can guarantee you that the way of
detecting host health that you suggest (i.e. from within Nova) will
not be robust by HA standards for at least two years, if your patch
lands tomorrow.

Cheers,
Florian



[openstack-dev] [Horizon] Template Blueprint

2014-10-17 Thread Ana Krivokapic

Hello Horizoners,

I would like to draw your attention to the excellent Template 
Blueprint[1] which David created. The aim of this is to create a 
template which will be used for all future blueprints. This way we can 
try to ensure that enough information/detail is provided in blueprints, 
as we have had problems with blueprints lacking in details in the past.


Please take a minute to review [1] and add your comments to the 
whiteboard. We are hoping to finalize this and start using this 
template ASAP.


Thanks!


[1] https://blueprints.launchpad.net/horizon/+spec/template

--
Regards,

Ana Krivokapic
Software Engineer
OpenStack team
Red Hat Inc.




[openstack-dev] Cell Initialization

2014-10-17 Thread Vineet Menon
Hi,

I was trying to create cells under openstack using devstack. My setup
contains 3 machines. One toplevel and 2 compute cells.
I'm following this documentation,
http://docs.openstack.org/trunk/config-reference/content/section_compute-cells.html
.

Both of these cell instantiations are generating errors.
1. The first one doesn't generate any error logs unless I issue a command at
the parent ('nova cell-show cell2'). At this point the toplevel cell throws
the following error:

2014-10-17 12:03:34.888 ERROR oslo.messaging.rpc.dispatcher [-] Exception
 during message handling: Circular reference detected
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher Traceback
 (most recent call last):
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
 /usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py, line
 134, in _dispatch_and_reply
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
 incoming.message))
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
 /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py,
 line 72, in reply
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
 self._send_reply(conn, reply, failure, log_failure=log_failure)
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
 /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py,
 line 62, in _send_reply
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
 conn.direct_send(self.reply_q, rpc_common.serialize_msg(msg))
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
 /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/common.py, line
 302, in serialize_msg
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
 _MESSAGE_KEY: jsonutils.dumps(raw_msg)}
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
 /usr/lib/python2.7/site-packages/oslo/messaging/openstack/common/jsonutils.py,
 line 172, in dumps
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher return
 json.dumps(value, default=default, **kwargs)
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
 /usr/lib64/python2.7/json/__init__.py, line 250, in dumps
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
 sort_keys=sort_keys, **kw).encode(obj)
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
 /usr/lib64/python2.7/json/encoder.py, line 207, in encode
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher chunks =
 self.iterencode(o, _one_shot=True)
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher   File
 /usr/lib64/python2.7/json/encoder.py, line 270, in iterencode
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher return
 _iterencode(o, 0)
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher ValueError:
 Circular reference detected
 2014-10-17 12:03:34.888 TRACE oslo.messaging.rpc.dispatcher
 2014-10-17 12:03:34.890 ERROR oslo.messaging._drivers.common [-] Returning
 exception Circular reference detected to caller
 2014-10-17 12:03:34.890 ERROR oslo.messaging._drivers.common [-]
 ['Traceback (most recent call last):\n', '  File
 /usr/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py, line
 134, in _dispatch_and_reply\nincoming.message))\n', '  File
 /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py,
 line 72, in reply\nself._send_reply(conn, reply, failure,
 log_failure=log_failure)\n', '  File
 /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py,
 line 62, in _send_reply\nconn.direct_send(self.reply_q,
 rpc_common.serialize_msg(msg))\n', '  File
 /usr/lib/python2.7/site-packages/oslo/messaging/_drivers/common.py, line
 302, in serialize_msg\n_MESSAGE_KEY: jsonutils.dumps(raw_msg)}\n', '
 File
 /usr/lib/python2.7/site-packages/oslo/messaging/openstack/common/jsonutils.py,
 line 172, in dumps\nreturn json.dumps(value, default=default,
 **kwargs)\n', '  File /usr/lib64/python2.7/json/__init__.py, line 250, in
 dumps\nsort_keys=sort_keys, **kw).encode(obj)\n', '  File
 /usr/lib64/python2.7/json/encoder.py, line 207, in encode\nchunks =
 self.iterencode(o, _one_shot=True)\n', '  File
 /usr/lib64/python2.7/json/encoder.py, line 270, in iterencode\nreturn
 _iterencode(o, 0)\n', 'ValueError: Circular reference detected\n']

 This one seems to be similar to the bug reported here,
https://bugs.launchpad.net/nova/+bug/1312002
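
For reference, that ValueError is what the stdlib json encoder raises for
any self-referencing structure; a trivial standalone repro (unrelated to
the actual cells payload being serialized):

    import json

    msg = {'method': 'cell_show'}
    msg['context'] = msg              # the dict now references itself

    try:
        json.dumps(msg)
    except ValueError as exc:
        print(exc)                    # Circular reference detected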

 2. In the second child cell initialization, the error crops up as soon as
I add the toplevel cell in the child cell using 'nova-manage' command.

2014-10-17 12:05:29.500 ERROR nova.cells.messaging
 [req-f74d05cf-061a-4488-bfcb-0cb1edec44e2 None None] Error locating next
 hop for message: Inconsistency in cell routing: destination is
 cell1!toplevel but routing_path is cell1!cell1
 2014-10-17 12:05:29.500 TRACE nova.cells.messaging Traceback (most recent
 call last):
 2014-10-17 12:05:29.500 TRACE nova.cells.messaging   File
 

Re: [openstack-dev] [QA] Proposal: A launchpad bug description template

2014-10-17 Thread Markus Zoeller
Thierry Carrez thie...@openstack.org wrote on 10/17/2014 11:28:56 AM:

 From: Thierry Carrez thie...@openstack.org
 To: openstack-dev@lists.openstack.org
 Date: 10/17/2014 11:31 AM
 Subject: Re: [openstack-dev] [QA] Proposal: A launchpad bug description 
template
 
 Markus Zoeller wrote:
  TL;DR: A proposal for a template for launchpad bug entries which asks 
 for the minimal needed data to work on a bug.
 
 Note that Launchpad doesn't support bug entry templates. You can display
 bug reporting guidelines which appear under the textbox, but that's
 about it.
 
 Also note that the text is project-specific, so it needs to be entered
 in every openstack project. Depending on the exact nature of the
 project, I suspect the text should be different.
 
 Regards,
 
 -- 
 Thierry Carrez (ttx)

Thanks for the note on Launchpad's capabilities. Providing the infor-
mation in the bug reporting guidelines on Launchpad looks like a good
place. Currently there is, for Nova, "Please include the exact version
of Nova with which you're experiencing this issue."

The wiki page about the bugs [1] could be enhanced as well and then 
we could let Launchpad link to this wiki page. Maybe this would reduce
the maintenance of the template. Subsections could be introduced for 
project specific debug data.

[1] https://wiki.openstack.org/wiki/Bugs

Regards, 
Markus Zoeller 
IRC: markus_z




[openstack-dev] [novaclient] E12* rules

2014-10-17 Thread Andrey Kurilin
Hi everyone!

I'm working on enabling E12* PEP8 rules in novaclient(status of my work
listed below). Imo, PEP8 rules should be ignored only in extreme cases/for
important reasons and we should decrease a number of ignored rules. This
helps to keep code in more strict, readable form, which is very important
when working in community.

While working on rule E126, we started a discussion with Joe Gordon about
the demand for these rules. I have no idea about the reasons why they should be
ignored, so I want to know:
- Why should these rules be ignored?
- What do you think about enabling these rules?

Please, leave your opinion about E12* rules.
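
For anyone who doesn't have the rule numbers memorized, a small contrived
illustration of the continuation-line style these checks police (E128,
for example, flags a continuation line that is under-indented relative
to the visual indent):

    def some_function(first, second):
        return first + second

    # E128: continuation line under-indented for visual indent
    result = some_function(1,
        2)

    # Compliant: align the continuation with the opening delimiter...
    result = some_function(1,
                           2)

    # ...or use a hanging indent with nothing after the opening parenthesis.
    result = some_function(
        1, 2)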

Already enabled rules:
  E121,E125 - https://review.openstack.org/#/c/122888/
  E122 - https://review.openstack.org/#/c/123830/
  E123 - https://review.openstack.org/#/c/123831/

Abandoned rule:
  E124 - https://review.openstack.org/#/c/123832/

Pending review:
  E126 - https://review.openstack.org/#/c/123850/
  E127 - https://review.openstack.org/#/c/123851/
  E128 - https://review.openstack.org/#/c/127559/
  E129 - https://review.openstack.org/#/c/123852/


-- 
Best regards,
Andrey Kurilin.


[openstack-dev] [Elections] Results of the TC Election

2014-10-17 Thread Anita Kuno
Please join me in congratulating the 6 newly elected members of the TC.

* Monty Taylor
* Sean Dague
* Doug Hellmann
* Russell Bryant
* Anne Gentle
* John Griffith

Full results:
http://civs.cs.cornell.edu/cgi-bin/results.pl?id=E_c105db929e6c11f4

Thank you to all candidates who stood for election; having a good group
of candidates helps engage the community in our democratic process.

Thank you to Mark McLoughlin, who served on the previous TC and chose
not to run for a seat this time.

Thank you to all who voted and who encouraged others to vote. We need to
ensure your voice is heard.

Thanks to my fellow election official, Tristan Cacqueray, I appreciate
your help and perspective.

Thank you for another great round.

Here's to Kilo,
Anita.



[openstack-dev] [Nova] Turbo hipster problems

2014-10-17 Thread Gary Kotton
Hi,
Anyone aware why Turbo hipster is failing with:

real-db-upgrade_nova_percona_user_002:th-perconahttp://localhost Exception: 
[Errno 2] No such file or directory: '/var/lib/turbo-hipster/datasets_user_002' 
in 0s

Thanks
Gary


Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4

2014-10-17 Thread Jeremy Stanley
On 2014-10-17 16:57:59 +1300 (+1300), Fei Long Wang wrote:
 Thanks for the heads up. Is there a bug opened to track this? If
 not, I'm going to open one and dig into it. Cheers.

Gah! You'd think *I* would know better at this point--sorry about
that... I've now opened https://launchpad.net/bugs/1382582 to track
this. Thanks for any assistance you're able to provide!
-- 
Jeremy Stanley



Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4

2014-10-17 Thread Louis Taylor
On Fri, Oct 17, 2014 at 03:01:22PM +, Jeremy Stanley wrote:
 Gah! You'd think *I* would know better at this point--sorry about
 that... I've now opened https://launchpad.net/bugs/1382582 to track
 this. Thanks for any assistance you're able to provide!

This looks like a continuation of the old PYTHONHASHSEED bug:

https://launchpad.net/bugs/1348818




Re: [openstack-dev] [all] add cyclomatic complexity check to pep8 target

2014-10-17 Thread Ian Cordasco
I would also advise pinning the version of mccabe we’re using. Mccabe was
originally a proof-of-concept script that Ned Batchelder wrote and which
Tarek Ziade vendored into Flake8. After we split it out in v2 of Flake8,
we’ve found several (somewhat serious) reporting problems with the tool.
Currently the package owner on PyPI hasn’t granted me permissions to
release a new version of the package, but we have several fixes in the
repository: https://github.com/flintwork/mccabe. The changes are somewhat
drastic but they should reduce the average function/method’s complexity by
1 or 2 points. I’m going to bother Florent again to give me permission to
release the package since it has been far too long since a release has
been cut.

For what it’s worth, Florent doesn’t pay close attention to GitHub
notifications so chiming in (or creating) issues on mccabe to release a
new version will only spam *me*. So please don’t pile on to anything
existing or create a new one.

Cheers,
Ian

On 10/17/14, 12:39 AM, Michael Davies mich...@the-davies.net wrote:

On Fri, Oct 17, 2014 at 2:39 PM, Joe Gordon
joe.gord...@gmail.com wrote:



First step in fixing this, put a cap on it:
https://review.openstack.org/129125

Thanks Joe - I've just put up a similar patch for Ironic:
https://review.openstack.org/129132


-- 
Michael Davies   mich...@the-davies.net
Rackspace Australia




Re: [openstack-dev] [nova] APIImpact flag for nova specs

2014-10-17 Thread Everett Toews
On Oct 15, 2014, at 5:52 AM, Christopher Yeoh 
cbky...@gmail.commailto:cbky...@gmail.com wrote:

We don't require new templates as part of nova-specs and api changes don't 
necessarily change the api sample tpl files. We do ask for some jsonschema 
descriptions of the new APIs input but they work pretty well in the spec 
document itself. I agree it could be prone to spelling mistakes etc, though 
just being able to search for 'api' would be sufficient and people who review 
specs could pick up missing or mispelled flags in the commit message (and it 
wouldn't necessarily need to be restricted to just APIImpact as possible flags).

+1 to APIImpact flag

That there could be misses is not a good reason to not do this. Which is to 
say, let’s do this.

Everett



Re: [openstack-dev] [kolla] on Dockerfile patterns

2014-10-17 Thread Fox, Kevin M
docker exec would be awesome.

So... what's redhat's stance on docker upgrades here?

I'm running centos7, and docker has topped out at 
docker-0.11.1-22.el7.centos.x86_64.
(though redhat package versions don't always reflect the upstream version)

I tried running docker 1.2 binary from docker.io but selinux flipped out on it.

How long before docker exec is actually a useful solution for debugging on such 
systems?

Thanks,
Kevin

From: Lars Kellogg-Stedman [l...@redhat.com]
Sent: Thursday, October 16, 2014 7:14 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [kolla] on Dockerfile patterns

On Fri, Oct 17, 2014 at 12:44:50PM +1100, Angus Lees wrote:
 You just need to find the pid of a process in the container (perhaps using
  docker inspect to go from container name -> pid) and then:
  nsenter -t $pid -m -u -i -n -p -w

Note also that the 1.3 release of Docker (any day now) will sport a
shiny new docker exec command that will provide you with the ability
to run commands inside the container via the docker client without
having to involve nsenter (or nsinit).

It looks like:

docker exec container_id ps -fe

Or:

docker exec -it container_id bash

--
Lars Kellogg-Stedman l...@redhat.com | larsks @ {freenode,twitter,github}
Cloud Engineering / OpenStack  | http://blog.oddbit.com/




Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer

2014-10-17 Thread Chris Dent

On Thu, 16 Oct 2014, Jim Mankovich wrote:

What I would like to propose is dropping the ipmi string from the name 
altogether and appending the Sensor ID to the name  instead of to the 
Resource ID.   So, transforming the above to the new naming would result in 
the following:



| Name                                      | Type  | Unit | Resource ID
| hardware.current.power_meter_(0x16)       | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0
| hardware.temperature.system_board_(0x15)  | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0


[plus sensor_provider in resource_metadata]

If this makes sense for the kinds of queries that need to happen then
we may as well do it, but I'm not sure it is. When I was writing the
consumer code for the notifications, the names of the meters were a big
open question that was hard to resolve because of insufficient data
and input on what people really need to do with the samples.

The scenario you've listed is getting all sensors on a given single
platform.

What about the scenario where you want to create an alarm that says
If temperature gets over X on any system board on any of my hardware,
notify the authorities? Will having the _(0x##) qualifier allow that
to work? I don't actually know, are those qualifiers standard in some
way or are they specific to different equipment? If they are different,
having them in the meter name makes creating a useful alarm in a
heterogeneous environment a bit more of a struggle, doesn't it?

Perhaps (if they are not standard) this would work:

 | hardware.current.power_meter | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0

with both sensor_provider and whatever that qualifier is called in the
metadata?

Then the name remains sufficiently generic to allow aggregates across
multiple systems, while still having the necessary info to narrow to
different sensors of the same type.

I understand that this proposed change is not backward compatible with the 
existing naming, but I don't really see a good solution that would retain 
backward compatibility.


I think we should strive to worry less about such things, especially
when it's just names in data fields. Not always possible, or even a
good idea, but sometimes it's a win.

--
Chris Dent tw:@anticdent freenode:cdent
https://tank.peermore.com/tanks/cdent



[openstack-dev] [api] Request Validation - Stoplight

2014-10-17 Thread Amit Gandhi
Hi API Working Group

Last night at the OpenStack Meetup in Atlanta, a group of us discussed how 
request validation is being performed across various projects and how some teams 
are using pecan/WSME, warlock, jsonschema, etc.

Each of these libraries has its own pros and cons.  My understanding is 
that the API working group is in the early stages of looking into these various 
libraries and will likely provide guidance in the near future on this.

I would like to suggest another library to evaluate when deciding this.  Some 
of our teams have started to use a library named “Stoplight”[1][2] in our 
projects.  For example, in the Poppy CDN project, we found it worked around 
some of the issues we had with warlock such as validating nested json correctly 
[3].

Stoplight is an input validation framework for python.  It can be used to 
decorate any function (including routes in pecan or falcon) to validate its 
parameters.
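
As a generic sketch of the decorator-based validation pattern being
described (this is not Stoplight's actual API - see [4] for real
examples), the idea is simply to check parameters before the decorated
function or route body runs:

    import functools

    def validate(**checks):
        # Reject a call when any keyword argument fails its check.
        def decorator(func):
            @functools.wraps(func)
            def wrapper(**kwargs):
                for name, check in checks.items():
                    if not check(kwargs.get(name)):
                        raise ValueError('invalid value for %r' % name)
                return func(**kwargs)
            return wrapper
        return decorator

    def is_service_name(value):
        # Hypothetical rule: a non-empty string of at most 64 characters.
        return isinstance(value, str) and 0 < len(value) <= 64

    @validate(name=is_service_name)
    def create_service(name=None):
        return {'name': name}

    create_service(name='my-cdn-service')   # passes validation
    # create_service(name='')               # would raise ValueError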

Some good examples can be found here [4] on how to use Stoplight.

Let us know your thoughts/interest and we would be happy to discuss further 
if and how this would be valuable as a library for API request validation in 
Openstack.


Thanks


Amit Gandhi
Senior Manager – Rackspace



[1] https://pypi.python.org/pypi/stoplight
[2] https://github.com/painterjd/stoplight
[3] 
https://github.com/stackforge/poppy/blob/master/poppy/transport/pecan/controllers/v1/services.py#L108
[4] 
https://github.com/painterjd/stoplight/blob/master/stoplight/tests/test_validation.py#L138



Re: [openstack-dev] [Glance][QA] python-glanceclient untestable in Python 3.4

2014-10-17 Thread Jeremy Stanley
On 2014-10-17 16:17:39 +0100 (+0100), Louis Taylor wrote:
 This looks like a continuation of the old PYTHONHASHSEED bug:
 
 https://launchpad.net/bugs/1348818

The underlying design choices in python-glanceclient's tests do
cause both problems (can't run with a random hash seed, but also
can't run under a different hash algorithm), and properly fixing one
will fix the other. Unfortunately there isn't an easy workaround for
the Python 3.4 testing issue, unlike bug 1348818.
-- 
Jeremy Stanley



[openstack-dev] [TripleO] CI report : 04/10/2014 - 17/10/2014

2014-10-17 Thread Derek Higgins
Hi All,

   Nothing to report since the last report, 2 weeks of no breakages.

thanks,
Derek.



[openstack-dev] DB Datasets CI broken

2014-10-17 Thread Michael Still
Hi,

I've just noticed that the DB Datasets CI (the artist formerly known
as turbo hipster) is failing for many patches. I'm looking into it
now.

Michael

-- 
Rackspace Australia



Re: [openstack-dev] DB Datasets CI broken

2014-10-17 Thread Michael Still
This was a bad image in nodepool. I've rebuilt the image and killed
our pool of workers running the old image and things seem to be ok
now. I'm in the process of enqueueing rechecks for every failed
turbo-hipster run now, but they'll take some time to all get executed.

Thanks for your patience everyone.

Is it possible to add a verification step to nodepool so that it
doesn't mark a new image as ready unless it passes some basic sanity
checks?

Thanks,
Michael

On Sat, Oct 18, 2014 at 8:44 AM, Michael Still mi...@stillhq.com wrote:
 Hi,

 I've just noticed that the DB Datasets CI (the artist formerly known
 as turbo hipster) is failing for many patches. I'm looking into it
 now.

 Michael

 --
 Rackspace Australia



-- 
Rackspace Australia



Re: [openstack-dev] [api] API recommendation

2014-10-17 Thread Peter Balland
On Oct 16, 2014 8:24 AM, Dean Troyer dtro...@gmail.com wrote:



 On Thu, Oct 16, 2014 at 4:57 AM, Salvatore Orlando sorla...@nicira.com
wrote:

 From an API guideline viewpoint, I understand that
https://review.openstack.org/#/c/86938/ proposes the introduction of a
rather simple endpoint to query active tasks and filter them by resource
uuid or state, for example.


 That review/blueprint contains one thing that I want to address in more
detail below along with Sal's comment on persistence...


 While this is hardly questionable, I wonder if it might be worth
typifying the task, ie: adding a resource_type attribute, and/or allowing
to retrieve active tasks as a chile resource of an object, eg.: GET
/servers/server_id/tasks?state=running or if just for running tasks GET
/servers/server_id/active_tasks


 I'd prefer the filter approach, but more importantly, it should be the
_same_ structure as listing resources themselves.

 To note: here is another API design detail, specifying resource types in
the URL path:

 /server/server/foo

 vs

 /server/foo

 or what we have today, for example, in compute:

 /tenant/foo

 The proposed approach for the multiple server create case also makes
sense to me. Other than bulk operations there are indeed cases where a
single API operation needs to perform multiple tasks. For instance, in
Neutron, creating a port implies L2 wiring, setting up DHCP info, and
securing it on the compute node by enforcing anti-spoof rules and security
groups. This means there will be 3/4 active tasks. For this reason I wonder
if it might be the case of differentiating between the concept of
operation and tasks where the former is the activity explicitly
initiated by the API consumer, and the latter are the activities which need
to complete to fulfil it. This is where we might leverage the already
proposed request_id attribute of the task data structure.


 I like the ability to track the fan-out, especially if I can get the
state of the entire set of tasks in a single round-trip.  This also makes
it easier to handle backout of failed requests without having to maintain a
lot of client-side state, or make a lot of round-trips.


Based on previous experience, I highly recommend maintaining separation
between tracking work at an API call level aggregate and other subtasks.
In non-provisioning scenarios, tasks may fire independent of API
operations, so there wouldn't be an API handle to query on. It is great to
manage per-API call level tasks in the framework. The other work type
tasks are *much* more complicated beasts, deserving of their own design.

 Finally, a note on persistency. How long a completed task, successfully
or not should be stored for? Do we want to store them until the resource
they operated on is deleted?
 I don't think it's a great idea to store them indefinitely in the DB.
Tying their lifespan to resources is probably a decent idea, but time-based
cleanup policies might also be considered (e.g.: destroy a task record 24
hours after its completion)


 I can envision an operator/user wanting to be able to pull a log of an
operation/task for not only cloud debugging (x failed to build, when/why?)
but also app-level debugging (concrete use case not ready at deadline).
This would require a minimum of life-of-resource + some-amount-of-time.
The time might also be variable, failed operations might actually need to
stick around longer.

 Even as an operator with access to backend logging, pulling these state
transitions out should not be hard, and should be available to the resource
owner (project).

 dt

 --

 Dean Troyer
 dtro...@gmail.com




Re: [openstack-dev] DB Datasets CI broken

2014-10-17 Thread Jeremy Stanley
On 2014-10-18 10:45:23 +1100 (+1100), Michael Still wrote:
[...]
 Is it possible to add a verification step to nodepool so that it
 doesn't mark a new image as ready unless it passes some basic sanity
 checks?

Back in the beforetime, when devstack-gate had scripts which managed
the worker pool as scheduled Jenkins jobs, it would run DevStack
exercises on a test boot of the new image before using it to boot
real images. Of course you can imagine the number of perfectly good
images which were thrown away because of nondeterministic bugs
causing false negative results there, so we probably wouldn't want
to duplicate that exactly, but perhaps something more lightweight
would be a reasonable compromise.

Anyway, I consider it a good feature request (others may disagree),
just nobody's reimplemented it in nodepool to date.
-- 
Jeremy Stanley



Re: [openstack-dev] [Ironic][Ceilometer] Proposed Change to Sensor meter naming in Ceilometer

2014-10-17 Thread Jim Mankovich

Chris,
See answers inline. I don't have any concrete answers as to how to deal
with some of questions you brought up, but I do have some more detail
that may be useful to further the discussion.

On 10/17/2014 11:03 AM, Chris Dent wrote:

On Thu, 16 Oct 2014, Jim Mankovich wrote:

What I would like to propose is dropping the ipmi string from the 
name altogether and appending the Sensor ID to the name instead of to 
the Resource ID. So, transforming the above to the new naming would 
result in the following:



| Name                                      | Type  | Unit | Resource ID
| hardware.current.power_meter_(0x16)       | gauge | W    | edafe6f4-5996-4df8-bc84-7d92439e15c0
| hardware.temperature.system_board_(0x15)  | gauge | C    | edafe6f4-5996-4df8-bc84-7d92439e15c0


[plus sensor_provider in resource_metadata]

If this makes sense for the kinds of queries that need to happen then
we may as well do it, but I'm not sure it is. When I was writing the
consumer code for the notifications the names of the meters was a big
open question that was hard to resolve because of insufficient data
and input on what people really need to do with the samples.

The scenario you've listed is getting all sensors on a given single
platform.

What about the scenario where you want to create an alarm that says
If temperature gets over X on any system board on any of my hardware,
notify the authorities? Will having the _(0x##) qualifier allow that
to work? I don't actually know, are those qualifiers standard in some
way or are they specific to different equipment? If they are different,
having them in the meter name makes creating a useful alarm in a
heterogeneous environment a bit more of a struggle, doesn't it?


The _(0x##) is an ipmitool display artifact that is tacked onto the end
of the Sensor ID in order to provide more information beyond what the
Sensor ID has in it. The ## is the sensor record ID, which is specific
to IPMI. Whether or not a Sensor ID (sans _(0x##)) is unique is up to
the vendor, but in general I believe all vendors will likely name their
sensors uniquely; otherwise, how can a person differentiate textually
what component in a platform the sensor represents?


Personally, I would like to see the _(0x##) removed from the Sensor ID
string (by the ipmitool driver) before it returns sensors to the Ironic
conductor. I just don't see any value in this extra info. The 0x##
addition only helps if a vendor used the exact same Sensor ID string for
multiple sensors of the same sensor type, i.e. multiple sensors of type
Temperature, each with the exact same Sensor ID string of CPU instead of
giving each Sensor ID string a unique name like CPU 1, CPU 2, ...
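
Something along these lines is what I'm picturing for the normalization,
just as a sketch (not actual Ironic or ipmitool driver code, and it
assumes the suffix always has the form _(0x##)):

    import re

    # Strip the trailing "_(0x##)" record-ID suffix that ipmitool appends
    # to the Sensor ID string.
    _RECORD_ID_SUFFIX = re.compile(r'_\(0x[0-9a-fA-F]+\)$')

    def normalize_sensor_id(sensor_id):
        # e.g. "System Board_(0x15)" -> "System Board"
        return _RECORD_ID_SUFFIX.sub('', sensor_id)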

Now if you want to get deeper into the IPMI realm (which I don't really
want to advocate), the Entity ID Code actually tells you the component.
From the IPMI spec, section 43.14 Entity IDs:

The Entity ID field is used for identifying the physical entity that a
sensor or device is associated with. If multiple sensors refer to the
same entity, they will have the same Entity ID field value. For example,
if a voltage sensor and a temperature sensor are both for a ‘Power
Supply 1’ entity, the Entity ID in their sensor data records would both
be 10 (0Ah), per the Entity ID table. FYI: Entity 10 (0Ah) means power
supply.


In a heterogeneous platform environment, the Sensor ID string is likely
going to be different per vendor, so your question "If temperature gets
over X on any system board on any hardware, notify the authorities" is
going to be tough, because each vendor may name their system board
differently. But I bet that vendors use similar strings, so worst case,
your alarm creation could require one alarm definition per vendor.



Perhaps (if they are not standard) this would work:

| hardware.current.power_meter | gauge | W | edafe6f4-5996-4df8-bc84-7d92439e15c0


with both sensor_provider and whatever that qualifier is called in the
metadata?


I see generic naming as somewhat problematic. If you lump all the
temperature sensors for a platform under hardware.temperature, the
consumer will always need to query for the specific temperature sensor
it is interested in, like system board. The notion of having different
samples from multiple sensors under a single generic name seems harder
to deal with to me. If you have multiple temperature samples under the
same generic meter name, how do you figure out which temperature
samples actually exist?
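
To make that concrete, with a single generic meter name the consumer ends
up discovering and filtering sensors on its own, roughly like this (just
a sketch over plain dicts; the sensor_id metadata key is an assumption,
and this isn't tied to any particular client library):

    # Samples as a consumer might see them under one generic meter name.
    samples = [
        {'meter': 'hardware.temperature',
         'resource_id': 'edafe6f4-5996-4df8-bc84-7d92439e15c0',
         'resource_metadata': {'sensor_id': 'System Board'},
         'volume': 42.0},
        {'meter': 'hardware.temperature',
         'resource_id': 'edafe6f4-5996-4df8-bc84-7d92439e15c0',
         'resource_metadata': {'sensor_id': 'CPU 1'},
         'volume': 55.0},
    ]

    # Which temperature sensors exist at all?
    sensor_names = {s['resource_metadata']['sensor_id'] for s in samples}

    # Every query then filters on metadata rather than on the meter name.
    board_temps = [s['volume'] for s in samples
                   if s['resource_metadata']['sensor_id'] == 'System Board']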



Then the name remains sufficiently generic to allow aggregates across
multiple systems, while still having the necessary info to narrow to
different sensors of the same type.

I understand that this proposed change is not backward compatible 
with the existing naming, but I don't really see a good solution that 
would retain backward compatibility.


I think we should strive to worry less about such things, especially
when it's just names in data fields. Not always possible, or even a
good idea, but sometimes it's a win.


I'm always good with 

Re: [openstack-dev] [api] Request Validation - Stoplight

2014-10-17 Thread Sam Harwell
Hi Amit,

Keeping in mind that this is nothing but my own personal view, my 
recommendation would be to not mandate the use of a particular validation 
framework, but to instead define what kind of validation clients should expect 
the server to perform in general. For example, I would expect a service to 
return an error code and not perform any action if I called Create server but 
did not include a request body, but the actual manner in which that error is 
generated within the service does not matter from the client's perspective.
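
For example, the expectation could be written down as nothing more than
"validate the body against a schema and return an error, with no side
effects, if it fails". A minimal sketch of that, using jsonschema purely
for illustration (the schema and helper here are made up, not from any
particular service):

    import jsonschema

    CREATE_SERVER_SCHEMA = {
        'type': 'object',
        'properties': {'server': {'type': 'object'}},
        'required': ['server'],
    }

    def check_request_body(body, schema=CREATE_SERVER_SCHEMA):
        # Returns (True, None) on success, or (False, reason); the caller
        # maps a failure to a 4xx response and performs no action.
        try:
            jsonschema.validate(body, schema)
            return True, None
        except jsonschema.ValidationError as exc:
            return False, exc.message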

This is not to say the API Working Group wouldn't help you evaluate the 
potential of Stoplight to meet the needs of a service. To the contrary, by 
clearly defining the expectations of a service's responses to requests, you'll 
have a great idea of exactly what to look for in your evaluation, and your 
final decision would be based on objective results.

Thank you,
Sam Harwell

From: Amit Gandhi [mailto:amit.gan...@rackspace.com]
Sent: Friday, October 17, 2014 12:32 PM
To: OpenStack Development Mailing List (not for usage questions)
Cc: r...@ryanpetrello.com
Subject: [openstack-dev] [api] Request Validation - Stoplight

Hi API Working Group

Last night at the OpenStack Meetup in Atlanta, a group of us discussed how 
request validation is being performed across various projects and how some teams 
are using Pecan/WSME, warlock, jsonschema, etc.

Each of these libraries has its own pros and cons.  My understanding is that 
the API working group is in the early stages of looking into these various 
libraries and will likely provide guidance on this in the near future.

I would like to suggest another library to evaluate when deciding this.  Some 
of our teams have started to use a library named Stoplight [1][2] in our 
projects.  For example, in the Poppy CDN project, we found it worked around 
some of the issues we had with warlock, such as correctly validating nested 
JSON [3].

Stoplight is an input validation framework for Python.  It can be used to 
decorate any function (including routes in Pecan or Falcon) to validate its 
parameters.
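
To give a rough idea of the decorator style in general, here is a hand-rolled 
sketch of the pattern (this is not Stoplight's actual API, just an illustration 
of decorator-based input validation):

    import functools

    class ValidationFailed(Exception):
        pass

    def validate(**rules):
        # Map each keyword argument name to a callable that returns True
        # when the supplied value is acceptable.
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                for name, rule in rules.items():
                    if name in kwargs and not rule(kwargs[name]):
                        raise ValidationFailed('invalid value for %s' % name)
                return func(*args, **kwargs)
            return wrapper
        return decorator

    @validate(service_id=lambda v: isinstance(v, str) and len(v) <= 64)
    def get_service(service_id=None):
        return {'service_id': service_id}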

Some good examples of how to use Stoplight can be found here [4].

Let us know your thoughts/interest and we would be happy to discuss further 
whether and how this would be valuable as a library for API request validation 
in OpenStack.


Thanks


Amit Gandhi
Senior Manager - Rackspace



[1] https://pypi.python.org/pypi/stoplight
[2] https://github.com/painterjd/stoplight
[3] 
https://github.com/stackforge/poppy/blob/master/poppy/transport/pecan/controllers/v1/services.py#L108
[4] 
https://github.com/painterjd/stoplight/blob/master/stoplight/tests/test_validation.py#L138

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] DB Datasets CI broken

2014-10-17 Thread Michael Still
On Sat, Oct 18, 2014 at 11:02 AM, Jeremy Stanley fu...@yuggoth.org wrote:
 On 2014-10-18 10:45:23 +1100 (+1100), Michael Still wrote:
 [...]
 Is it possible to add a verification step to nodepool so that it
 doesn't mark a new image as ready unless it passes some basic sanity
 checks?

 Back in the beforetime, when devstack-gate had scripts which managed
 the worker pool as scheduled Jenkins jobs, it would run DevStack
 exercises on a test boot of the new image before using it to boot
 real test nodes. Of course you can imagine the number of perfectly good
 images which were thrown away because of nondeterministic bugs
 causing false negative results there, so we probably wouldn't want
 to duplicate that exactly, but perhaps something more lightweight
 would be a reasonable compromise.

 Anyway, I consider it a good feature request (others may disagree),
 just nobody's reimplemented it in nodepool to date.

Yeah, I'm starting to think along the lines of adding a simple sanity
check to the shell worker in turbo hipster before the real tests run.
Things like checking that the git directory exists and contains a git
repo with the branches we need. We could run that pre-flight script
(or a variant of it) on images before marking them as ready.
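
Roughly what I have in mind for the pre-flight check, as a sketch (the
repo path and branch names here are made up):

    import os
    import subprocess

    def image_looks_sane(repo_path='/opt/git/openstack/nova',
                         branches=('master', 'stable/juno')):
        # The cached git directory must exist and contain a repository
        # with the branches the tests need.
        if not os.path.isdir(os.path.join(repo_path, '.git')):
            return False
        for branch in branches:
            if subprocess.call(['git', '-C', repo_path, 'rev-parse',
                                '--verify', '--quiet', branch]) != 0:
                return False
        return True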

For reference, what we think happened here is that the cache of SQL
databases baked into the image was rsynced from our master while
jhesketh was in the process of updating the SQL databases to a more
recent version of OpenStack.

Michael

-- 
Rackspace Australia

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [All] Maintenance mode in OpenStack during patching/upgrades

2014-10-17 Thread Joe Cropper
I’m glad to see this topic getting some focus once again.  :-)

From several of the administrators I talk with, when they think of putting a 
host into maintenance mode, the common requests I hear are:

1. Don’t schedule more VMs to the host
2. Provide an optional way to automatically migrate all (usually active) VMs 
off the host so that users’ workloads remain “unaffected” by the maintenance 
operation

#1 can easily be achieved, as has been mentioned several times, by simply 
disabling the compute service.  However, #2 involves a little more work, 
although certainly possible using all the operations provided by nova today 
(e.g., live migration, etc.).  I believe these types of discussions have come 
up several times over the past several OpenStack releases—certainly since 
Grizzly (i.e., when I started watching this space).
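
As a rough sketch, #1 and #2 in terms of commands nova already exposes today 
would look something like this (illustrative only; no throttling, state 
tracking, or error handling, and host-servers-migrate does cold migrations, so 
a real implementation would choose live vs. cold migration per VM):

    import subprocess

    def enter_maintenance(host):
        # Step 1: stop scheduling new VMs to the host.
        subprocess.check_call(['nova', 'service-disable', host, 'nova-compute'])
        # Step 2: migrate the existing servers off the host.
        subprocess.check_call(['nova', 'host-servers-migrate', host])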

It seems that the general direction is to have the type of workflow needed for 
#2 outside of nova (which is certainly a valid stance).  To that end, it would 
be fairly straightforward to build some code that logically sits on top of 
nova, that when entering maintenance:

1. Prevents VMs from being scheduled to the host;
2. Maintains state about the maintenance operation (e.g., not in maintenance, 
migrations in progress, in maintenance, or error);
3. Provides mechanisms to, upon entering maintenance, dictate which VMs 
(active, all, none) to migrate and provides some throttling capabilities to 
prevent hundreds of parallel migrations on densely packed hosts (all done via a 
REST API).

If anyone has additional questions, comments, or would like to discuss some 
options, please let me know.  If interested, upon request, I could even share a 
video of how such cases might work.  :-)  My colleagues and I have given these 
use cases a lot of thought and consideration and I’d love to talk more about 
them (perhaps a small session in Paris would be possible).

- Joe

On Oct 17, 2014, at 4:18 AM, John Garbutt j...@johngarbutt.com wrote:

 On 17 October 2014 02:28, Matt Riedemann mrie...@linux.vnet.ibm.com wrote:
 
 
 On 10/16/2014 7:26 PM, Christopher Aedo wrote:
 
 On Tue, Sep 9, 2014 at 2:19 PM, Mike Scherbakov
 mscherba...@mirantis.com wrote:
 
 On Tue, Sep 9, 2014 at 6:02 PM, Clint Byrum cl...@fewbar.com wrote:
 
 The idea is not simply to deny or hang requests from clients, but to provide
 them with "we are in maintenance mode, retry in X seconds".
 
 You probably would want 'nova host-servers-migrate host'
 
 yeah for migrations - but as far as I understand, it doesn't help with
 disabling this host in the scheduler - there can be a chance that some
 workloads will be scheduled to the host.
 
 
 Regarding putting a compute host in maintenance mode using nova
 host-update --maintenance enable, it looks like the blueprint and
 associated commits were abandoned a year and a half ago:
 https://blueprints.launchpad.net/nova/+spec/host-maintenance
 
 It seems that nova service-disable host nova-compute effectively
 prevents the scheduler from trying to send new work there.  Is this
 the best approach to use right now if you want to pull a compute host
 out of an environment before migrating VMs off?
 
 I agree with Tim and Mike that having something respond down for
 maintenance rather than ignore or hang would be really valuable.  But
 it also looks like that hasn't gotten much traction in the past -
 anyone feel like they'd be in support of reviving the notion of
 maintenance mode?
 
 -Christopher
 
 
 
 host-maintenance-mode is definitely a thing in nova compute via the os-hosts
 API extension and the --maintenance parameter, the compute manager code is
 here [1].  The thing is the only in-tree virt driver that implements it is
 xenapi, and I believe when you put the host in maintenance mode it's
 supposed to automatically evacuate the instances to some other host, but you
 can't target the other host or tell the driver, from the API, which
 instances you want to evacuate, e.g. all, none, running only, etc.
 
 [1]
 http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2#n3990
 
 We should certainly make that more generic. It doesn't update the VM
 state, so it's really only admin-focused in its current form.
 
 The XenAPI logic only works when using XenServer pools with shared NFS
 storage, if my memory serves me correctly. Honestly, it's a bit of code
 I have been planning to remove, along with the rest of the pool support.
 
 In terms of requiring DB downtime in Nova, the current efforts are
 focusing on avoiding downtime altogether, via expand/contract style
 migrations, with a little help from objects to avoid data migrations.
 
 That doesn't mean maintenance mode is not useful for other things,
 like emergency patching of the hypervisor.
 
 John