Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-08 Thread Russell Bryant
On 08/07/2014 08:06 PM, Michael Still wrote:
 It seems to me that the tension here is that there are groups who
 would really like to use features in newer libvirts that we don't CI
 on in the gate. Is it naive to think that a possible solution here is
 to do the following:
 
  - revert the libvirt version_cap flag

I don't feel strongly either way on this.  It seemed useful at the time
for being able to decouple upgrading libvirt and enabling features that
come with that.  I'd like to let Dan get back from vacation and weigh in
on it, though.
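
(For context, the flag in question is a nova.conf option. A sketch of its use follows; the option name and group are from my recollection of the original patch, so treat the exact spelling as an assumption.)

```ini
[libvirt]
# Cap the libvirt features nova will use at this version, even if a
# newer libvirt is installed on the host (assumed option name from the
# patch under discussion).
version_cap = 1.2.2
```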

  - instead implement a third party CI with the latest available
 libvirt release [1]

As for the general idea of doing CI, absolutely.  That was discussed
earlier in the thread, though nobody has picked up the ball yet.  I can
work on it, though.  We just need to figure out a sensible approach.

We've seen several times that building and maintaining 3rd party CI is a
*lot* of work.  Like you said in [1], doing this in infra's CI would be
ideal.  I think 3rd party should be reserved for when running it in the
project's infrastructure is not an option for some reason (requires
proprietary hw or sw, for example).

I wonder if the job could be as simple as one with an added step in the
config to install latest libvirt from source.  Dan, do you think someone
could add a libvirt-current.tar.gz to http://libvirt.org/sources/ ?
Using the latest release seems better than master from git.

I'll mess around and see if I can spin up an experimental job.
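
A rough sketch of what that added step might look like (the version number and tarball layout are assumptions, and a stable libvirt-current.tar.gz alias is exactly what's being asked for above; the commands are echoed rather than executed so the sketch is safe to run anywhere):

```shell
#!/bin/sh
# Hypothetical CI job step: fetch and build the latest libvirt release
# from source. Version and URL layout are assumptions for illustration.
LIBVIRT_VERSION=${LIBVIRT_VERSION:-1.2.7}
TARBALL="libvirt-${LIBVIRT_VERSION}.tar.gz"
URL="http://libvirt.org/sources/${TARBALL}"

# Echoed, not executed: a real job would run these directly.
echo "curl -O ${URL}"
echo "tar xzf ${TARBALL}"
echo "cd libvirt-${LIBVIRT_VERSION} && ./configure --prefix=/usr && make && sudo make install"
```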

  - document clearly in the release notes the versions of dependencies
 that we tested against in a given release: hypervisor versions (gate
 and third party), etc etc

Sure, that sounds like a good thing to document in release notes.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-08 Thread Russell Bryant
On 08/08/2014 01:46 AM, Luke Gorrie wrote:
 On 8 August 2014 02:06, Michael Still mi...@stillhq.com wrote:
 
 1: I think that ultimately should live in infra as part of check, but
 I'd be ok with it starting as a third party if that delivers us
 something faster. I'd be happy enough to donate resources to get that
 going if we decide to go with this plan.
 
 
 Can we cooperate somehow?
 
 We are already working on bringing up a third party CI covering QEMU 2.1
 and Libvirt 1.2.7. The intention of this CI is to test the software
 configuration that we are recommending for NFV deployments (including
 vhost-user feature which appeared in those releases), and to provide CI
 cover for the code we are offering for Neutron.
 
 Michele Paolino is working on this and the relevant nova/devstack changes.

It sounds like what you're working on is a separate thing.  You're
targeting coverage for a specific set of use cases, while this is a
flavor of the general CI coverage we're already doing, but with the
latest (not pegged) libvirt (and maybe qemu).

By all means, more testing is useful though.

-- 
Russell Bryant



Re: [openstack-dev] [all] The future of the integrated release

2014-08-08 Thread Russell Bryant
On 08/08/2014 05:06 AM, Thierry Carrez wrote:
 Michael Still wrote:
 [...] I think an implied side effect of
 the runway system is that nova-drivers would -2 blueprint reviews
 which were not occupying a slot.

 (If we start doing more -2's I think we will need to explore how to
 not block on someone with -2's taking a vacation. Some sort of role
 account perhaps).
 
 Ideally CodeReview-2s should be kept for blocking code reviews on
 technical grounds, not procedural grounds. For example it always feels
 weird to CodeReview-2 all feature patch reviews on Feature Freeze day --
 that CodeReview-2 really doesn't have the same meaning as a traditional
 CodeReview-2.
 
 For those procedural blocks (feature freeze, waiting for runway
 room...), it might be interesting to introduce a specific score
 (Workflow-2 perhaps) that drivers could set. That would not prevent code
 review from happening; it would just clearly express that the change is
 not ready to land for release-cycle or organizational reasons.
 
 Thoughts?
 

That sounds much nicer than using code review -2.

-- 
Russell Bryant



Re: [openstack-dev] [Glance] Image upload/download bandwidth cap

2014-08-08 Thread Russell Bryant
On 08/08/2014 04:17 PM, Jay Pipes wrote:
 On 08/08/2014 08:49 AM, Tomoki Sekiyama wrote:
 Hi all,

 I'm considering how I can apply a bandwidth limit to glance image
 downloads/uploads for network QoS.

 There was a review for bandwidth limiting, but it was abandoned.

 * Download rate limiting
https://review.openstack.org/#/c/21380/

 Was there any discussion at a past summit about why this was not merged?
 Or is there an alternative way to cap the bandwidth consumed by Glance?

 I appreciate any information about this.
 
 Hi Tomoki :)
 
 Would it be possible to integrate traffic control into the network
 configuration between the Glance endpoints and the nova-compute nodes
 over the control plane network?
 
 http://www.lartc.org/lartc.html#LARTC.RATELIMIT.SINGLE

Yep, that was my first thought as well.  It seems like something that
would ideally be handled outside of OpenStack itself.
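
For the archives, the LARTC recipe linked above boils down to something like this hedged sketch (the interface name and rate are assumptions, and the command is echoed rather than executed since applying it for real needs root on the glance-api host):

```shell
#!/bin/sh
# Sketch: cap outbound bandwidth with a token bucket filter (tbf), per
# the LARTC single-host rate limiting recipe. Device and rate are
# assumptions; tune burst/latency for the control-plane link.
DEV=${DEV:-eth0}        # control-plane interface (assumption)
RATE=${RATE:-100mbit}   # cap for image transfer traffic (assumption)
TC_CMD="tc qdisc add dev ${DEV} root tbf rate ${RATE} burst 32kbit latency 400ms"

# Echoed so the sketch runs unprivileged; drop the echo to apply it.
echo "${TC_CMD}"
```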

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-08 Thread Russell Bryant
On 08/08/2014 09:06 AM, Russell Bryant wrote:
  - instead implement a third party CI with the latest available
 libvirt release [1]
 
 As for the general idea of doing CI, absolutely.  That was discussed
 earlier in the thread, though nobody has picked up the ball yet.  I can
 work on it, though.  We just need to figure out a sensible approach.
 
 We've seen several times that building and maintaining 3rd party CI is a
 *lot* of work.  Like you said in [1], doing this in infra's CI would be
 ideal.  I think 3rd party should be reserved for when running it in the
 project's infrastructure is not an option for some reason (requires
 proprietary hw or sw, for example).
 
 I wonder if the job could be as simple as one with an added step in the
 config to install latest libvirt from source.  Dan, do you think someone
 could add a libvirt-current.tar.gz to http://libvirt.org/sources/ ?
 Using the latest release seems better than master from git.
 
 I'll mess around and see if I can spin up an experimental job.

Here's a first stab at it:

https://review.openstack.org/113020

-- 
Russell Bryant



Re: [openstack-dev] [Nova][Neutron][Technical Committee] nova-network - Neutron. Throwing a wrench in the Neutron gap analysis

2014-08-08 Thread Russell Bryant
On 08/06/2014 01:41 PM, Jay Pipes wrote:
 On 08/06/2014 01:40 AM, Tom Fifield wrote:
 On 06/08/14 13:30, Robert Collins wrote:
 On 6 August 2014 17:27, Tom Fifield t...@openstack.org wrote:
 On 06/08/14 13:24, Robert Collins wrote:

 What happened to your DB migrations then? :)


 Sorry if I misunderstood, I thought we were talking about downtime of
 running VMs here?

 While DB migrations are running things like the nova metadata service
 can/will misbehave - and user code within instances will be affected.
 That's arguably VM downtime.

 OTOH you could define it more narrowly as 'VMs are not powered off' or
 'VMs are not stalled for more than 2s without a time slice' etc etc -
 my sense is that most users are going to be particularly concerned
 about things for which they have to *do something* - e.g. VMs being
 powered off or rebooted - but having no network for a short period
 while vifs are replugged and the overlay network re-establishes itself
 would be much less concerning.

 I think you've got it there, Rob - nicely put :)

 In many cases the users I've spoken to who are looking for a live path
 out of nova-network on to neutron are actually completely OK with some
 API service downtime (metadata service is an API service by their
 definition). A little 'glitch' in the network is also OK for many of
 them.

 Contrast that with the original proposal in this thread (snapshot VMs
 in old nova-network deployment, store in Swift or something, then launch
 VM from a snapshot in new Neutron deployment) - it is completely
 unacceptable and is not considered a migration path for these users.
 
 Who are these users? Can we speak with them? Would they be interested in
 participating in the documentation and migration feature process?

Yes, I'd really like to see some participation in the development of a
solution if it's an important requirement.  Until then, it feels like
we're asking the open-ended question of what do you want.  Of course the
answer is a pony.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Russell Bryant
On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
 There is work to add support for this in devstack already which I
 prefer since it makes it easy for developers to get an environment
 which matches the build system:
 
   https://review.openstack.org/#/c/108714/

Ah, cool.  Devstack is indeed a better place to put the build scripting.
 So, I think we should:

1) Get the above patch working, and then merged.

2) Get an experimental job going to use the above while we work on #3

3) Before the job can move into the check queue and potentially become
voting, it needs to not rely on downloading the source on every run.
IIRC, we can have nodepool build an image to use for these jobs that
includes the bits already installed.

I'll switch my efforts over to helping get the above completed.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Russell Bryant
On 08/09/2014 12:33 PM, Jeremy Stanley wrote:
 On 2014-08-08 09:06:29 -0400 (-0400), Russell Bryant wrote:
 [...]
 We've seen several times that building and maintaining 3rd party
 CI is a *lot* of work.
 
 Building and maintaining *any* CI is a *lot* of work, not the least
 of which is the official OpenStack project CI (I believe Monty
 mentioned in #openstack-infra last night that our CI is about twice
 the size of Travis-CI now, not sure what metric he's comparing there
 though).

Dang, I'd love to see those numbers.  :-)

 Like you said in [1], doing this in infra's CI would be ideal. I
 think 3rd party should be reserved for when running it in the
 project's infrastructure is not an option for some reason
 (requires proprietary hw or sw, for example).
 
 Add to the "not an option for some reason" list software which is
 not easily obtainable through typical installation channels (PyPI,
 Linux distro-managed package repositories for their LTS/server
 releases, et cetera) or which requires gyrations which destabilize
 or significantly complicate maintenance of the overall system as
 well as reproducibility for developers. It may be possible to work
 around some of these concerns via access from multiple locations
 coupled with heavy caching, but for a one-off source the additional
 complexity is hard to justify.

Understood.  Some questions ... is building an image that has libvirt
and qemu pre-installed from source good enough?  It avoids the
dependency during job runs, but moves it to image build time, so it
still exists.

If the above still doesn't seem like a workable setup, then I think we
should go straight to an image with Fedora plus the virt-preview repo,
which kind of sounds easier anyway.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Russell Bryant
On 08/11/2014 07:58 AM, Russell Bryant wrote:
 On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
 There is work to add support for this in devstack already which I
 prefer since it makes it easy for developers to get an environment
 which matches the build system:

   https://review.openstack.org/#/c/108714/
 
 Ah, cool.  Devstack is indeed a better place to put the build scripting.
  So, I think we should:
 
 1) Get the above patch working, and then merged.
 
 2) Get an experimental job going to use the above while we work on #3
 
 3) Before the job can move into the check queue and potentially become
 voting, it needs to not rely on downloading the source on every run.
 IIRC, we can have nodepool build an image to use for these jobs that
 includes the bits already installed.
 
 I'll switch my efforts over to helping get the above completed.
 

I still think the devstack patch is good, but after some more thought, I
think a better long-term CI job setup would just be a Fedora image with
the virt-preview repo.  I think I'll try that ...

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Russell Bryant
On 08/11/2014 08:01 AM, Daniel P. Berrange wrote:
 On Mon, Aug 11, 2014 at 07:58:41AM -0400, Russell Bryant wrote:
 On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
 There is work to add support for this in devstack already which I
 prefer since it makes it easy for developers to get an environment
 which matches the build system:

   https://review.openstack.org/#/c/108714/

 Ah, cool.  Devstack is indeed a better place to put the build scripting.
  So, I think we should:

 1) Get the above patch working, and then merged.

 2) Get an experimental job going to use the above while we work on #3

 3) Before the job can move into the check queue and potentially become
 voting, it needs to not rely on downloading the source on every run.
 
 Don't we have the ability to mirror downloads locally to the build
 system for Python?  The proposed patch allows an alternate download
 URL to be set via an env variable so it could point to a local mirror
 instead of libvirt.org / qemu.org

There's a pypi mirror at least.  I'm not sure about mirroring other things.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-11 Thread Russell Bryant
On 08/11/2014 09:17 AM, Jeremy Stanley wrote:
 On 2014-08-11 08:04:34 -0400 (-0400), Russell Bryant wrote:
 Dang, I'd love to see those numbers.  :-)
 
 Me too. Now that I'm not travelling I'll see if I can find out what
 he meant by that.
 
 Understood.  Some questions ... is building an image that has libvirt
 and qemu pre-installed from source good enough?  It avoids the
 dependency during job runs, but moves it to image build time, so it
 still exists.
 
 Moving complex stability risks to image creation time still causes
 us to potentially fail to update our worker images as often, which
 means tests randomly run on increasingly stale systems in some
 providers/regions until the issue is noticed, identified and
 addressed. That said, we do already compile some things during job
 runs today (in particular, library bindings which get install-time
 linked by some Python modules).
 
 In reality, depending on more things gathered from different places
 on the Internet (be it Git repository sites like GitHub/Bitbucket,
 or private package collections) decreases our overall stability far
 more than compiling things does.
 
 If the above still doesn't seem like a workable setup, then I think we
 should go straight to an image with Fedora plus the virt-preview repo,
 which kind of sounds easier anyway.
 
 If it's published from EPEL or whatever Fedora's equivalent is, then
 that's probably fine. If it's served from a separate site, then that
 increases the chances that we run into network issues either at
 image build time or job run time. Also, we would want to make sure
 whatever solution we settle on is well integrated within DevStack
 itself, so that individual developers can recreate these conditions
 themselves without a lot of additional work.

EPEL is a repo produced by the Fedora project for RHEL and its
derivatives.  The virt-preview repo is hosted on fedorapeople.org, which
is where custom repos live.  I'd say it's more analogous to Ubuntu's PPAs.

https://fedorapeople.org/groups/virt/virt-preview/
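
Enabling it is roughly this (hedged sketch; the exact .repo file name and path are assumptions, so check the URL above for the current layout, and the commands are echoed rather than executed):

```shell
#!/bin/sh
# Sketch: enable the Fedora virt-preview repository, then update the
# virt stack from it. The .repo path below is an assumption.
REPO_URL="https://fedorapeople.org/groups/virt/virt-preview/fedora-virt-preview.repo"

# Echoed, not executed: run these as root on a Fedora host.
echo "curl -o /etc/yum.repos.d/fedora-virt-preview.repo ${REPO_URL}"
echo "yum update libvirt qemu-kvm"
```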

 One other thing to keep in mind... Fedora's lifecycle is too short
 for us to support outside of jobs for our master branches, so this
 would not be a solution beyond release time (we couldn't continue to
 run these jobs for Juno once released if the solution hinges on
 Fedora). Getting the versions we want developers and deployers to
 use into Ubuntu 14.04 Cloud Archive and CentOS (RHEL) 7 EPEL on the
 other hand would be a much more viable long-term solution.

Yep, makes sense.

For testing bleeding edge, I've also got my eye on how we could do this
with CentOS.  There is a virt SIG in CentOS that I'm hoping will produce
something similar to Fedora's virt-preview repo, but it's not there yet.
 I'm going to go off and discuss this with the SIG there.

http://wiki.centos.org/SpecialInterestGroup/Virtualization

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-12 Thread Russell Bryant
On 08/12/2014 05:54 AM, Daniel P. Berrange wrote:
 I am less concerned about the contents of this patch, and more concerned
 with how such a big de facto change in nova policy (that we sometimes
 accept untested code) was made without any discussion or consensus. In
 your comment on the revert [2], you say the 'whether not-CI-tested
 features should be allowed to be merged' debate is 'clearly unresolved.'
 How did you get to that conclusion? This was never brought up at the
 mid-cycles as an unresolved topic to be discussed. In our specs template
 we say "Is this untestable in the gate given current limitations
 (specific hardware / software configurations available)? If so, are
 there mitigation plans (3rd party testing, gate enhancements, etc)?"
 [3].  We have been blocking untested features for some time now.
 
 Those last lines are nonsense. We have never unconditionally blocked
 untested features, nor do I recommend that we do so. The specs template's
 testing section allows the contributor to *justify* why they think the
 feature is worth accepting despite the lack of testing. The reviewers
 make a judgement call on whether the justification is valid or not. This
 is a pragmatic approach to the problem.

That has been my interpretation and approach as well: we strongly prefer
functional testing for everything, but take a pragmatic approach and
evaluate proposals on a case by case basis.  It's clear we need to be a
bit more explicit here.

-- 
Russell Bryant



Re: [openstack-dev] [nova] Retrospective veto revert policy

2014-08-12 Thread Russell Bryant
On 08/12/2014 10:56 AM, Mark McLoughlin wrote:
 Hey
 
 (Terrible name for a policy, I know)
 
 From the version_cap saga here:
 
   https://review.openstack.org/110754
 
 I think we need a better understanding of how to approach situations
 like this.
 
 Here's my attempt at documenting what I think we're expecting the
 procedure to be:
 
   https://etherpad.openstack.org/p/nova-retrospective-veto-revert-policy
 
 If it sounds reasonably sane, I can propose its addition to the
 Development policies doc.

Looks reasonable to me.

-- 
Russell Bryant



Re: [openstack-dev] [nova] fair standards for all hypervisor drivers

2014-08-12 Thread Russell Bryant
On 08/12/2014 03:40 PM, Kashyap Chamarthy wrote:
 On Mon, Aug 11, 2014 at 08:05:26AM -0400, Russell Bryant wrote:
 On 08/11/2014 07:58 AM, Russell Bryant wrote:
 On 08/11/2014 05:53 AM, Daniel P. Berrange wrote:
 There is work to add support for this in devstack already which I
 prefer since it makes it easy for developers to get an environment
 which matches the build system:

   https://review.openstack.org/#/c/108714/

 Ah, cool.  Devstack is indeed a better place to put the build scripting.
  So, I think we should:

 1) Get the above patch working, and then merged.

 2) Get an experimental job going to use the above while we work on #3

 3) Before the job can move into the check queue and potentially become
 voting, it needs to not rely on downloading the source on every run.
 IIRC, we can have nodepool build an image to use for these jobs that
 includes the bits already installed.

 I'll switch my efforts over to helping get the above completed.


 I still think the devstack patch is good, but after some more thought, I
 think a better long-term CI job setup would just be a Fedora image with
 the virt-preview repo. 
 
 So, effectively, you're trying to add a minimal Fedora image w/
 virt-preview repo (as part of some post-install kickstart script). If
 so, where would the image be stored? I'm asking because Sean Dague
 previously mentioned mirroring issues with Fedora images (which later
 turned out to be intermittent network issues with OpenStack infra cloud
 providers), and floated the idea of storing an updated image on
 tarballs.openstack.org, as Trove[1] does. But OpenStack infra folks
 (fungi) raised some valid points on why not to do that.
 
 IIUC, if you intend to run this CI job's tests with the new image,
 there has to be a mechanism in place to ensure the cached copy (on
 tarballs.o.o) is updated.
 
 If I misunderstood what you said, please correct me.

Patches for this here:

https://review.openstack.org/#/c/113349/
https://review.openstack.org/#/c/113350/

The first one is the important part about how the image is created.
nodepool runs some prep scripts against the cloud's distro image and
then snapshots it.  That's the image stored to be used later for testing.

In this case, it enables the virt-preview repo and then calls out to the
regular devstack prep scripts to cache all packages needed for the test
locally on the image.

If there are issues with the reliability of fedorapeople.org, it will
indeed cause problems, but at least it's local to image creation and not
every test run.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Retrospective veto revert policy

2014-08-12 Thread Russell Bryant


 On Aug 12, 2014, at 5:10 PM, Michael Still mi...@stillhq.com wrote:
 
 This looks reasonable to me, with a slight concern that I don't know
 what step five looks like... What if we can never reach a consensus on
 an issue?

In an extreme case, the PTL has the authority to make the call.

In general I would like to think we can all just put on our big boy pants and 
talk through contentious issues to find a resolution that everyone can live 
with.

-- 
Russell Bryant


Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-13 Thread Russell Bryant
On 08/12/2014 06:57 PM, Michael Still wrote:
 Hi.
 
 One of the action items from the nova midcycle was that I was asked to
 make nova's expectations of core reviewers more clear. This email is an
 attempt at that.

Note that we also have:

https://wiki.openstack.org/wiki/Nova/CoreTeam

so once new criteria reach consensus, they should be added there.

-- 
Russell Bryant



Re: [openstack-dev] [all] The future of the integrated release

2014-08-13 Thread Russell Bryant
On 08/12/2014 10:05 PM, Michael Still wrote:
 there are hundreds of proposed features for
 Juno, nearly 100 of which have been accepted. However, we're kidding
 ourselves if we think we can land 100 blueprints in a release cycle.

FWIW, I think this is actually a huge improvement from previous cycles.  I
think we had almost double that # of blueprints on the list in the past.

I also don't think 100 is *completely* out of the question.  We're in
the 50-100 range already:

Icehouse - 67
Havana - 91
Grizzly - 66

Anyway, just wanted to share some numbers ... some improvements to
prioritization within that 100 is certainly still a good thing.

-- 
Russell Bryant



Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-13 Thread Russell Bryant
[...]

 I'm also not a fan of mid-cycle meetups because I feel they further
 stratify our contributors into two increasingly distinct camps - core
 vs non-core.
 
 I can see that a big benefit of a mid-cycle meetup is to be a focal
 point for collaboration, to forcibly break contributors out of their
 day-to-day work pattern to concentrate on discussing specific issues.
 It also obviously solves the distinct timezone problem we have with
 our dispersed contributor base. I think that we should be examining
 what we can achieve with some kind of virtual online mid-cycle meetups
 instead. Using technology like google hangouts or some similar live
 collaboration technology, not merely an IRC discussion. Pick a 2-3
 day period, schedule formal agendas / talking slots as you would with
 a physical summit and so on. I feel this would be more inclusive to
 our community as a whole and avoid excessive travel costs, thus allowing
 more of our community to attend the bigger design summits. It would
 even open the possibility of having multiple meetups during a cycle (eg
 we could arrange mini virtual events around each milestone if we wanted)

I think this is a nice concrete suggestion for an alternative.  I think
it's worth exploring in more detail.  I would much prefer something like
this as a replacement for the mid-cycle stuff and save the in-person
meetings for the existing twice-per-year summits.

-- 
Russell Bryant



Re: [openstack-dev] [all] The future of the integrated release

2014-08-13 Thread Russell Bryant
On 08/13/2014 08:52 AM, Mark McLoughlin wrote:
 On Tue, 2014-08-12 at 14:26 -0400, Eoghan Glynn wrote:
   It seems like this is exactly what the slots give us, though. The core
 review team picks a number of slots indicating how much work they think
 they can actually do (less than the available number of blueprints), and then
 actually do (less than the available number of blueprints), and then
 blueprints queue up to get a slot based on priorities and turnaround time
 and other criteria that try to make slot allocation fair. By having the
 slots, not only is the review priority communicated to the review team, it
 is also communicated to anyone watching the project.

 One thing I'm not seeing shine through in this discussion of slots is
 whether any notion of individual cores, or small subsets of the core
 team with aligned interests, can champion blueprints that they have
 a particular interest in.

 For example it might address some pain-point they've encountered, or
 impact on some functional area that they themselves have worked on in
 the past, or line up with their thinking on some architectural point.

 But for whatever motivation, such small groups of cores currently have
 the freedom to self-organize in a fairly emergent way and champion
 individual BPs that are important to them, simply by *independently*
 giving those BPs review attention.

 Whereas under the slots initiative, presumably this power would be
 subsumed by the group will, as expressed by the prioritization
 applied to the holding pattern feeding the runways?

 I'm not saying this is good or bad, just pointing out a change that
 we should have our eyes open to.
 
 Yeah, I'm really nervous about that aspect.
 
 Say a contributor proposes a new feature, a couple of core reviewers
 think it's important or exciting enough for them to champion it, but somehow
 the 'group will' is that it's not a high enough priority for this
 release, even if everyone agrees that it is actually cool and useful.
 
 What does imposing that 'group will' on the two core reviewers and
 contributor achieve? That the contributor and reviewers will happily
 turn their attention to some of the higher priority work? Or we lose a
 contributor and two reviewers because they feel disenfranchised?
 Probably somewhere in the middle.
 
 On the other hand, what happens if work proceeds ahead even if not
 deemed a high priority? I don't think we can say that the contributor
 and two core reviewers were distracted from higher priority work,
 because blocking this work is probably unlikely to shift their focus in
 a productive way. Perhaps other reviewers are distracted because they
 feel the work needs more oversight than just the two core reviewers? It
 places more of a burden on the gate?
 
 I dunno ... the consequences of imposing group will worry me more than
 the consequences of allowing small groups to self-organize like this.

Yes, this is by far my #1 concern with the plan.

I think perhaps some middle ground makes sense.

1) Start doing a better job of generating a priority list, and
identifying the highest priority items based on group will.

2) Expect that reviewers use the priority list to influence their
general review time.

3) Don't actually block other things, should small groups self-organize
and decide it's important enough to them, even if not to the group as a
whole.

That sort of approach still sounds like an improvement over what we have
today, which is a lack of good priority communication to direct general
review time.

-- 
Russell Bryant



Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-13 Thread Russell Bryant
On 08/13/2014 01:09 PM, Dan Smith wrote:
 Expecting cores to be at these sorts of things seems pretty reasonable
 to me, given the usefulness (and gravity) of the discussions we've been
 having so far. Companies with more cores will have to send more or make
 some hard decisions, but I don't want to cut back on the meetings until
 their value becomes unjustified.

I disagree.  IMO, *expecting* people to travel, potentially across the
globe, 4 times a year is an unreasonable expectation, and quite
uncharacteristic of open source projects.  If we can't figure out a way
to have the most important conversations in a way that is inclusive of
everyone, we're failing with our processes.

By all means, if a subset wants to meet up and make progress on some
things, I think that's fine.  I don't think anyone thinks it's not
useful.  However, discussions need to be summarized and taken back to
the list for discussion before decisions are made.  That's not the way
things are trending here, and I think that's a problem.

-- 
Russell Bryant



Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-13 Thread Russell Bryant
On 08/13/2014 02:33 PM, Dan Smith wrote:
 On 8/13/14 11:20 AM, Mike Bayer wrote:
 On Aug 13, 2014, at 1:44 PM, Russell Bryant rbry...@redhat.com
 wrote:
 I disagree.  IMO, *expecting* people to travel, potentially across
 the globe, 4 times a year is an unreasonable expectation, and
 quite uncharacteristic of open source projects.  If we can't figure
 out a way to have the most important conversations in a way that is
 inclusive of everyone, we're failing with our processes.

 By all means, if a subset wants to meet up and make progress on
 some things, I think that's fine.  I don't think anyone thinks it's
 not useful.
 
 Well, it doesn't seem at all excessive to me, given the rate and volume
 at which we do things around here. That said, if a significant number of
 cores think it's not doable, then I guess that's a data point.
 
 From what you said above, it sounds like you're okay with the meetings
 but not the requirement for cores. I said "expect" above -- is that a
 reasonable thing? Expect them to be present, unless they have a reason
 not to be there? Reasons could be personal or preference, but hopefully
 not "I never come to midcycles because $reason."
 
 It’s difficult to compare OpenStack to other open source projects, in
 that it is on such a more massive and high velocity scale than almost
 any others (perhaps the Linux kernel is similar). 
 
 Yeah, I have a hard time justifying anything by comparing us to other
 projects. I've been involved with plenty and don't think any of them are
 useful data points for what we should or should not do here in terms of
 anything related to velocity :)

I think we also need to be careful not to keep increasing expectations
because of velocity.  Burnout is a real problem.

-- 
Russell Bryant



Re: [openstack-dev] [all] The future of the integrated release

2014-08-13 Thread Russell Bryant
On 08/13/2014 04:01 PM, Doug Hellmann wrote:
 
 On Aug 13, 2014, at 9:11 AM, Russell Bryant rbry...@redhat.com wrote:
 
 On 08/13/2014 08:52 AM, Mark McLoughlin wrote:
 On Tue, 2014-08-12 at 14:26 -0400, Eoghan Glynn wrote:
 It seems like this is exactly what the slots give us, though. The core review
 team picks a number of slots indicating how much work they think they can
 actually do (less than the available number of blueprints), and then
 blueprints queue up to get a slot based on priorities and turnaround time
 and other criteria that try to make slot allocation fair. By having the
 slots, not only is the review priority communicated to the review team, it
 is also communicated to anyone watching the project.

 One thing I'm not seeing shine through in this discussion of slots is
 whether any notion of individual cores, or small subsets of the core
 team with aligned interests, can champion blueprints that they have
 a particular interest in.

 For example it might address some pain-point they've encountered, or
 impact on some functional area that they themselves have worked on in
 the past, or line up with their thinking on some architectural point.

 But for whatever motivation, such small groups of cores currently have
 the freedom to self-organize in a fairly emergent way and champion
 individual BPs that are important to them, simply by *independently*
 giving those BPs review attention.

 Whereas under the slots initiative, presumably this power would be
 subsumed by the group will, as expressed by the prioritization
 applied to the holding pattern feeding the runways?

 I'm not saying this is good or bad, just pointing out a change that
 we should have our eyes open to.

 Yeah, I'm really nervous about that aspect.

 Say a contributor proposes a new feature, a couple of core reviewers
 think it's important/exciting enough for them to champion it but somehow
 the 'group will' is that it's not a high enough priority for this
 release, even if everyone agrees that it is actually cool and useful.

 What does imposing that 'group will' on the two core reviewers and
 contributor achieve? That the contributor and reviewers will happily
 turn their attention to some of the higher priority work? Or we lose a
 contributor and two reviewers because they feel disenfranchised?
 Probably somewhere in the middle.

 On the other hand, what happens if work proceeds ahead even if not
 deemed a high priority? I don't think we can say that the contributor
 and two core reviewers were distracted from higher priority work,
 because blocking this work is probably unlikely to shift their focus in
 a productive way. Perhaps other reviewers are distracted because they
 feel the work needs more oversight than just the two core reviewers? It
 places more of a burden on the gate?

 I dunno ... the consequences of imposing 'group will' worry me more than
 the consequences of allowing small groups to self-organize like this.

 Yes, this is by far my #1 concern with the plan.

 I think perhaps some middle ground makes sense.

 1) Start doing a better job of generating a priority list, and
 identifying the highest priority items based on group will.

 2) Expect that reviewers use the priority list to influence their
 general review time.

 3) Don't actually block other things, should small groups self-organize
 and decide it's important enough to them, even if not to the group as a
 whole.

 That sort of approach still sounds like an improvement over what we have
 today, which is a lack of good priority communication to direct general
 review time.

 -- 
 Russell Bryant
 
 This is more formal than what we’ve been doing in Oslo, but it’s closer than 
 a strict slot-based approach. We talk about review priorities in the meeting 
 each week, and ask anyone in the meeting to suggest changes that need 
 attention. It’s up to the individual core reviewers to act on those 
 suggestions, though.

And I think that's a very healthy approach.

-- 
Russell Bryant



Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-13 Thread Russell Bryant
On 08/13/2014 02:44 PM, Russell Bryant wrote:
 On 08/13/2014 02:33 PM, Dan Smith wrote:
 On 8/13/14 11:20 AM, Mike Bayer wrote:
 On Aug 13, 2014, at 1:44 PM, Russell Bryant rbry...@redhat.com
 wrote:
 I disagree.  IMO, *expecting* people to travel, potentially across
 the globe, 4 times a year is an unreasonable expectation, and
 quite uncharacteristic of open source projects.  If we can't figure
 out a way to have the most important conversations in a way that is
 inclusive of everyone, we're failing with our processes.

 By all means, if a subset wants to meet up and make progress on
 some things, I think that's fine.  I don't think anyone thinks it's
 not useful.

 Well, it doesn't seem at all excessive to me, given the rate and volume
 at which we do things around here. That said, if a significant number of
 cores think it's not doable, then I guess that's a data point.

 From what you said above, it sounds like you're okay with the meetings
 but not the requirement for cores. I said "expect" above -- is that a
 reasonable thing? Expect them to be present, unless they have a reason
 not to be there? Reasons could be personal or preference, but hopefully
 not "I never come to midcycles because $reason."

 It’s difficult to compare OpenStack to other open source projects, in
 that it is on such a more massive and high velocity scale than almost
 any others (perhaps the Linux kernel is similar). 

 Yeah, I have a hard time justifying anything by comparing us to other
 projects. I've been involved with plenty and don't think any of them are
 useful data points for what we should or should not do here in terms of
 anything related to velocity :)
 
 I think we also need to be careful not to keep increasing expectations
 because of velocity.  Burnout is a real problem.
 

Let me try to say it another way.  You seemed to say that it wasn't much
to ask given the rate at which things happen in OpenStack.  I would
argue that given the rate, we should not try to ask more of individuals
(like this proposal) and risk burnout.  Instead, we should be doing our
best to be more open and inclusive to give the project the best chance to
grow, as that's the best way to get more done.

I think an increased travel expectation is a raised bar that will hinder
team growth, not help it.

-- 
Russell Bryant



Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit

2014-08-13 Thread Russell Bryant
On 08/13/2014 06:23 PM, Mark McLoughlin wrote:
 On Wed, 2014-08-13 at 12:05 -0700, James E. Blair wrote:
 cor...@inaugust.com (James E. Blair) writes:

 Sean Dague s...@dague.net writes:

 This has all gone far enough that someone actually wrote a Grease Monkey
 script to purge all the 3rd Party CI content out of Jenkins UI. People
 are writing mail filters to dump all the notifications. Dan Berrange
 filters them all out of his gerrit query tools.

 I should also mention that there is a pending change to do something
 similar via site-local Javascript in our Gerrit:

   https://review.openstack.org/#/c/95743/

 I don't think it's an ideal long-term solution, but if it works, we may
 have some immediate relief without all having to install greasemonkey
 scripts.

 You may have noticed that this has merged, along with a further change
 that shows the latest results in a table format.  (You may need to
 force-reload in your browser to see the change.)
 
 Beautiful! Thank you so much to everyone involved.

+1!  Love this.

-- 
Russell Bryant



Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-14 Thread Russell Bryant
On 08/13/2014 07:27 PM, Michael Still wrote:
 On Thu, Aug 14, 2014 at 3:44 AM, Russell Bryant rbry...@redhat.com wrote:
 On 08/13/2014 01:09 PM, Dan Smith wrote:
 Expecting cores to be at these sorts of things seems pretty reasonable
 to me, given the usefulness (and gravity) of the discussions we've been
 having so far. Companies with more cores will have to send more or make
 some hard decisions, but I don't want to cut back on the meetings until
 their value becomes unjustified.

 I disagree.  IMO, *expecting* people to travel, potentially across the
 globe, 4 times a year is an unreasonable expectation, and quite
 uncharacteristic of open source projects.  If we can't figure out a way
 to have the most important conversations in a way that is inclusive of
 everyone, we're failing with our processes.
 
 I am a bit confused by this stance to be honest. You yourself said
 when you were Icehouse PTL that you wanted cores to come to the
 summit. What changed?

Yes, I would love for core team members to come to the design summit
that's twice a year.  I still don't *expect* it of them in order to
remain a member of the team, and I certainly don't expect it 4 times a year.
It's a matter of frequency and requirement.

-- 
Russell Bryant



Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-14 Thread Russell Bryant
On 08/13/2014 11:31 PM, Michael Still wrote:
 On Thu, Aug 14, 2014 at 1:24 PM, Jay Pipes jaypi...@gmail.com wrote:
 
 Just wanted to quickly weigh in with my thoughts on this important topic. I
 very much valued the face-to-face interaction that came from the mid-cycle
 meetup in Beaverton (it was the only one I've ever been to).

 That said, I do not believe it should be a requirement that cores make it to
 the face-to-face meetings in-person. A number of folks have brought up very
 valid concerns about personal/family time, travel costs and burnout.
 
 I'm not proposing they be a requirement. I am proposing that they be
 strongly encouraged.

I'm not sure that's much different in reality.

 I believe that the issue raised about furthering the divide between core and
 non-core folks is actually the biggest reason I don't support a mandate to
 have cores at the face-to-face meetings, and I think we should make our best
 efforts to support quality virtual meetings that can be done on a more
 frequent basis than the face-to-face meetings that would be optional.
 
 I am all for online meetings, but we don't have a practical way to do
 them at the moment apart from IRC. Until someone has a concrete
 proposal that's been shown to work, I feel it's a straw man argument.

Yes, IRC is one option which we already use on a regular basis.  We can
also switch to voice communication for higher bandwidth when needed.  We
even have a conferencing server set up in OpenStack's infrastructure:

https://wiki.openstack.org/wiki/Infrastructure/Conferencing

In theory it even supports basic video conferencing, though I haven't
tested it on this server yet.

-- 
Russell Bryant



Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-14 Thread Russell Bryant
On 08/13/2014 07:27 PM, Michael Still wrote:
 The etherpad for the meetup has extensive notes. Any summary I write
 will basically be those notes in prose. What are you looking for in a
 summary that isn't in the etherpad? There also wasn't a summary of the
 Icehouse midcycle produced that I can find. Whilst I am happy to do
 one for Juno, it's a requirement that I hadn't planned for, and is
 therefore taking me some time to retrofit.
 
 I think we should chalk the request for summaries up to experience and
 talk through how to better provide such things at future meetups.

The summary from the Icehouse meetup is here:

http://lists.openstack.org/pipermail/openstack-dev/2014-February/027370.html

-- 
Russell Bryant



Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-14 Thread Russell Bryant
On 08/14/2014 10:04 AM, CARVER, PAUL wrote:
 Daniel P. Berrange [mailto:berra...@redhat.com] wrote:
 
 Depending on the usage needs, I think Google hangouts is a quite useful
 technology. For many-to-many sessions its limit of 10 participants can be
 an issue, but for a few-to-many broadcast it could be practical. What I
 find particularly appealing is the way it can live stream the session
 over youtube which allows for unlimited number of viewers, as well as
 being available offline for later catchup.
 
 I can't actually offer AT&T resources without getting some level of
 management approval first, but just for the sake of discussion here's
 some info about the telepresence system we use.
 
 -=-=-=-=-=-=-=-=-=-
 ATS B2B Telepresence conferences can be conducted with an external company's
 Telepresence room(s), which subscribe to the AT&T Telepresence Solution,
 or a limited number of other Telepresence service providers' networks.
 
 Currently, the number of Telepresence rooms that can participate in a B2B
 conference is limited to a combined total of 20 rooms (19 of which can be
 AT&T rooms, depending on the number of remote endpoints included).
 -=-=-=-=-=-=-=-=-=-
 
 We currently have B2B interconnect with over 100 companies and AT&T has
 telepresence rooms in many of our locations around the US and around
 the world. If other large OpenStack companies also have telepresence
 rooms that we could interconnect with I think it might be possible
 to get management agreement to hold a couple OpenStack meetups per
 year.
 
 Most of our rooms are best suited for 6 people, but I know of at least
 one 18 person telepresence room near me.

An ideal solution would allow attendees to join as individuals from
anywhere.  A lot of contributors work from home.  Is that sort of thing
compatible with your system?

-- 
Russell Bryant



Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-14 Thread Russell Bryant
) and the design summit sessions for Juno
 were a big disappointment after the meetup sessions, basically because
 of the time constraints. The meetups are nice since there is time to
 really hash over a topic and you're not rushed, whereas with the
 design summit sessions it felt like we'd be half way through the
 allotted time before we really started talking about anything of use
 and then shortly after that you'd be hearing "5 minutes left", and I
 felt like very few of the design sessions were actually useful, or
 things we've worked on in Juno, or at least high-priority/impact
 things (v3 API being an exception there, that was a useful session).
 I have seen what you describe, and have also been at sessions where
 there is active discussion for 15 minutes, all issues are resolved, and
 there is still a bunch of time left. The issue you cited could be
 addressed by accepting fewer topics and giving double or triple slots to
 topics that are important and expected to need a lot of discussion. The
 design summits are very useful for cores and newcomers alike and I would
 hate to see that fragmented by people deciding to not go to summits.

Yes, giving more than one slot is an option and not one we've used in
the nova track before that I can recall.  It's usually because we're
trying to pack so many things into the schedule.  It's probably worth it
for some topics.

However, I still think things have to be strictly scheduled for the
design summit, as compared to very loose for meetups.  At the design
summit, there are several tracks going on at once that people need to
jump between, as well as talks they are giving, and even customer
meetings.

-- 
Russell Bryant



Re: [openstack-dev] [oslo][db] Nominating Mike Bayer for the oslo.db core reviewers team

2014-08-15 Thread Russell Bryant
On 08/15/2014 09:13 AM, Jay Pipes wrote:
 On 08/15/2014 04:21 AM, Roman Podoliaka wrote:
 Hi Oslo team,

 I propose that we add Mike Bayer (zzzeek) to the oslo.db core
 reviewers team.

 Mike is an author of SQLAlchemy, Alembic, Mako Templates and some
 other stuff we use in OpenStack. Mike has been working on OpenStack
 for a few months contributing a lot of good patches and code reviews
 to oslo.db [1]. He has also been revising the db patterns in our
 projects and prepared a plan how to solve some of the problems we have
 [2].

 I think, Mike would be a good addition to the team.
 
 Uhm, yeah... +10 :)

^2 :-)

-- 
Russell Bryant



Re: [openstack-dev] OS or os are not acronyms for OpenStack

2014-08-15 Thread Russell Bryant
On 08/15/2014 11:00 AM, Mike Spreitzer wrote:
 Anita Kuno ante...@anteaya.info wrote on 08/15/2014 10:38:20 AM:
 
 OpenStack is OpenStack. The use of openstack is also acceptable in our
 development conversations.

 OS or os is operating system. I am starting to see some people use OS or
 os to mean OpenStack. This is confusing and also incorrect[0].

 ...
 
 I have seen OS for OpenStack from the start.  Just look at the
 environment variables that the CLI reads.

Yep, it's quite common and I think it's fine in the right context.

-- 
Russell Bryant



Re: [openstack-dev] OS or os are not acronyms for OpenStack

2014-08-15 Thread Russell Bryant
On 08/15/2014 01:28 PM, Mike Spreitzer wrote:
 Anita Kuno ante...@anteaya.info wrote on 08/15/2014 01:08:44 PM:
 
 ...
 I think you hit the nail on the head here, Russell, it's fine in the
 right context.

 The definition of the right context, however, is somewhat elusive. I have
 chosen (it is my own fault) to place myself in the area where the folks
 I deal with struggle with understanding context. The newcomers to the
 third party space and folks creating stackforge repos don't have the
 benefit of the understanding that core reviewers have (would I be
 accurate in saying that it is mostly nova reviewers who have responded
 to my initial post thus far?).
 
 I suffered from an instance of this confusion myself when I was just
 getting started,
 and have seen colleagues get confused too.  I suspect this problem hits
 many
 newcomers.

But surely, when it comes to learning OpenStack itself, the OpenStack
community, dev processes, tools, etc., this has got to be extremely
far down the list of barriers to entry.

-- 
Russell Bryant



Re: [openstack-dev] Enabling silent Docker tests for Nova?

2014-08-15 Thread Russell Bryant
On 08/15/2014 02:45 PM, Eric Windisch wrote:
 I have proposed a _silent_ check for Nova for integration of the Docker
 driver:
 
 https://review.openstack.org/#/c/114547/
 
 It has been established that this code cannot move back into Nova until
 the tests are running and have a solid history of success. That cannot
 happen unless we're allowed to run the tests. Running a silent check on
 changes to Nova is the first step in establishing that history.
 
 Joe Gordon suggests we need a spec to bring the driver back into Nova.
 Besides the fact that specs are closed and there is no intention of
 reintegrating the driver for Juno, I'm uncertain of proposing a spec
 without first having solid history of successful testing, especially
 given the historical context of this driver's relationship with Nova.
 
 If we could enable silent checks, we could help minimize API skew and
 branch breakages, improving driver quality and reducing maintenance
 while we prepare for the Kilo spec + merge windows. Furthermore, by
 having a history of testing, we can seek faster inclusion into Kilo.
 
 Finally, I acknowledge that we may be entering a window of significant
 load on the CI servers and I'm sensitive to the needs of the
 infrastructure team to remain both focused and to conserve precious
 compute resources. If this is an issue, then I'd like to plot a
 timeline, however rough, with the infrastructure team. 

CI resources aside, I think enabling it sounds fine and useful.

Given resource concerns, maybe just adding it to the experimental
pipeline would be sufficient?  That doesn't run as often, but still
gives you the chance to run on demand against nova patches.  There are
other things in experimental for nova as well, so there will be other
people triggering runs.
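
For anyone unfamiliar with how that works: which pipeline a job runs in
is just a matter of where it is listed in the project's Zuul layout.  A
rough sketch follows, with the caveat that the job name
check-tempest-dsvm-docker and the neighboring entries are illustrative
assumptions, not the actual openstack-infra/config contents:

```yaml
# Hypothetical fragment of a Zuul layout.yaml (job names are
# illustrative, not the real infra configuration).  Jobs listed under
# "experimental" run only when explicitly requested, instead of on
# every new patch set like jobs in the "check" pipeline.
projects:
  - name: openstack/nova
    check:
      - gate-nova-python27
    experimental:
      - check-tempest-dsvm-docker
```

Anyone can then request an on-demand run against a specific Nova patch
by leaving a Gerrit review comment containing "check experimental", so
the job builds up a test history without consuming CI resources on
every change.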

-- 
Russell Bryant



Re: [openstack-dev] Enabling silent Docker tests for Nova?

2014-08-15 Thread Russell Bryant
On 08/15/2014 02:53 PM, Russell Bryant wrote:
 On 08/15/2014 02:45 PM, Eric Windisch wrote:
 I have proposed a _silent_ check for Nova for integration of the Docker
 driver:

 https://review.openstack.org/#/c/114547/

 It has been established that this code cannot move back into Nova until
 the tests are running and have a solid history of success. That cannot
 happen unless we're allowed to run the tests. Running a silent check on
 changes to Nova is the first step in establishing that history.

 Joe Gordon suggests we need a spec to bring the driver back into Nova.
 Besides the fact that specs are closed and there is no intention of
 reintegrating the driver for Juno, I'm uncertain of proposing a spec
 without first having solid history of successful testing, especially
 given the historical context of this driver's relationship with Nova.

 If we could enable silent checks, we could help minimize API skew and
 branch breakages, improving driver quality and reducing maintenance
 while we prepare for the Kilo spec + merge windows. Furthermore, by
 having a history of testing, we can seek faster inclusion into Kilo.

 Finally, I acknowledge that we may be entering a window of significant
 load on the CI servers and I'm sensitive to the needs of the
 infrastructure team to remain both focused and to conserve precious
 compute resources. If this is an issue, then I'd like to plot a
 timeline, however rough, with the infrastructure team. 
 
 CI resources aside, I think enabling it sounds fine and useful.
 
 Given resource concerns, maybe just adding it to the experimental
 pipeline would be sufficient?  That doesn't run as often, but still
 gives you the chance to run on demand against nova patches.  There are
 other things in experimental for nova as well, so there will be other
 people triggering runs.
 

And I missed that it's already in experimental.  Oops.

Feature freeze is only a few weeks away (Sept 4).  How about we just
leave it in experimental until after that big push?  That seems pretty
reasonable.

-- 
Russell Bryant



Re: [openstack-dev] [QA] Picking a Name for the Tempest Library

2014-08-15 Thread Russell Bryant
On 08/15/2014 03:26 PM, Drew Fisher wrote:
 What about 'teapot' (as in the idiom 'tempest in a teapot' [1])?
 
 -Drew
 
 [1] http://en.wikipedia.org/wiki/Tempest_in_a_teapot

Though in this case it'd be 'teacup in tempest', I think?

There's also a TCup project [1] that uses tempest.  So, you have teapot
in tempest in tcup ... and that just gets confusing.  :-)

[1] https://wiki.openstack.org/wiki/RefStack/TCup

-- 
Russell Bryant



Re: [openstack-dev] [QA] Picking a Name for the Tempest Library

2014-08-15 Thread Russell Bryant


 On Aug 15, 2014, at 5:39 PM, Jay Pipes jaypi...@gmail.com wrote:
 
 On 08/15/2014 03:14 PM, Matthew Treinish wrote:
 Hi Everyone,
 
 So as part of splitting out common functionality from tempest into a library [1]
 we need to create a new repository. Which means we have the fun task of coming
 up with something to name it. I personally thought we should call it:
 
  - mesocyclone
 
 Which has the advantage of being a cloud/weather thing, and the name sort of
 fits because it's a precursor to a tornado. Also, it's an available namespace on
 both launchpad and pypi. But there has been expressed concern both that it is a
 bit on the long side (which might have 80-char line length implications) and
 that it's unclear from the name what it does.
 
 During the last QA meeting some alternatives were also brought up:
 
  - tempest-lib / lib-tempest
  - tsepmet
  - blackstorm
  - calm
  - tempit
  - integration-test-lib
 
 (although I'm not entirely sure I remember which ones were serious suggestions
 or just jokes)
 
 So as a first step I figured that I'd bring it up on the ML to see if anyone had
 any other suggestions (or maybe get a consensus around one choice). I'll take
 the list, check if the namespaces are available, and make a survey so that
 everyone can vote and hopefully we'll have a clear choice for a name from that.
 
 I suggest that tempest should be the name of the import'able library, and 
 that the integration tests themselves should be what is pulled out of the 
 current Tempest repository, into their own repo called 
 openstack-integration-tests or os-integration-tests.

Ooh, I like that idea!  +1

--
Russell Bryant


Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-18 Thread Russell Bryant
On 08/18/2014 06:18 AM, Thierry Carrez wrote:
 Doug Hellmann wrote:
 On Aug 13, 2014, at 4:42 PM, Russell Bryant rbry...@redhat.com wrote:
 Let me try to say it another way.  You seemed to say that it wasn't much
 to ask given the rate at which things happen in OpenStack.  I would
 argue that given the rate, we should not try to ask more of individuals
 (like this proposal) and risk burnout.  Instead, we should be doing our
 best to be more open and inclusive to give the project the best chance to
 grow, as that's the best way to get more done.

 I think an increased travel expectation is a raised bar that will hinder
 team growth, not help it.

 +1, well said.
 
 Sorry, I was away for a few days. This is a topic I have a few strong
 opinions on :)
 
 There is no denial that the meetup format is working well, comparatively
 better than the design summit format. There is also no denial that
 requiring 4 trips per year for a core dev is unreasonable. Where is
 the limit ? Wouldn't we be more productive and aligned if we did one per
 month ? No, the question is how to reach a sufficient level of focus and
 alignment while keeping the number of mandatory trips at 2 per year.
 
 I don't think our issue comes from not having enough F2F time. Our issue
 is that the design summit no longer reaches its objectives of aligning
 key contributors on a common plan, and we need to fix it.
 
 We established the design summit as the once-per-cycle opportunity to
 have face-to-face time and get alignment across the main contributors to
 a project. That used to be completely sufficient, but now it doesn't
 work as well... which resulted in alignment and team discussions moving
 to mid-cycle meetups instead. Why ? And what could we change
 to have those alignment discussions at the design summit again ?
 
 Why are design summits less productive than mid-cycle meetups these days
 ? Is it because there are too many non-contributors in the design summit
 rooms ? Is it the 40-min format ? Is it the distractions (having talks
 to give somewhere else, booths to attend, parties and dinners to be at)
 ? Is it that beginning of cycle is not the best moment ? Once we know
 WHY the design summit fails its main objective, maybe we can fix it.
 
 My gut feeling is that having a restricted audience and a smaller group
 lets people get to the bottom of an issue and reach consensus. And that
 you need at least half a day or a full day of open discussion to reach
 such alignment. And that it's not particularly great to get such
 alignment in the middle of the cycle, getting it at the start is still
 the right way to align with the release cycle.
 
 Nothing prevents us from changing part of the design summit format (even
 the Paris one!), and restrict attendance to some of the sessions. And if
 the main issue is the distraction from the conference colocation, we
 might have to discuss the future of co-location again. In that 2 events
 per year objective, we could make the conference the optional cycle
 thing, and a developer-oriented specific event the mandatory one.
 
 If we manage to have alignment at the design summit, then it doesn't
 spell the end of the mid-cycle things. But then, ideally the extra
 mid-cycle gatherings should be focused on getting specific stuff done,
 rather than general team alignment. Think workshop/hackathon rather than
 private gathering. The goal of the workshop would be published in
 advance, and people could opt to join that. It would be totally optional.

Great response ... I agree with everything you've said here.  Let's
figure out how to improve the design summit to better achieve team
alignment.

Of the things you mentioned, I think the biggest limit to alignment has
been the 40 minute format.  There are some topics that need more time.
It may be that we just need to take more advantage of the ability to
give a single topic multiple time slots to ensure enough time is
available.  As Dan discussed, there are some topics that we could stand
to turn down and distribute information another way that is just as
effective.

I would also say that the number of things going on at one time is
problematic.  Not only are there several design summit sessions going on
at once, but there are also conference sessions and customer meetings.  The
rapid rate of jumping around and context switching is exhausting.  It
also makes it a bit harder to get critical mass for an extended period
of time around a topic.  In mid-cycle meetups, there is one track and no
other things competing for time and attention.

I don't have a good suggestion for fixing this issue with so many things
competing for time and attention.  I used to be a big proponent of
splitting the event out completely, but I don't feel the same way
anymore.  In theory we could call the conference the optional event, but
in practice it's going to be required for many folks anyway.  I can't
speak for everyone, but I suspect if you're a senior engineer at your

Re: [openstack-dev] [TripleO][Nova] Specs and approvals

2014-08-19 Thread Russell Bryant
On 08/19/2014 05:31 AM, Robert Collins wrote:
 Hey everybody - https://wiki.openstack.org/wiki/TripleO/SpecReviews
 seems pretty sane as we discussed at the last TripleO IRC meeting.
 
 I'd like to propose that we adopt it with the following tweak:
 
 19:46:34 lifeless so I propose that +2 on a spec is a commitment to
 review it over-and-above the core review responsibilities
 19:47:05 lifeless if its not important enough for a reviewer to do
 that thats a pretty strong signal
 19:47:06 dprince lifeless: +1, I thought we already agreed to that
 at the meetup
 19:47:17 slagle yea, sounds fine to me
 19:47:20 bnemec +1
 19:47:30 lifeless dprince: it wasn't clear whether it was
 part-of-responsibility, or additive, I'm proposing we make it clearly
 additive
 19:47:52 lifeless and separately I think we need to make surfacing
 reviews-for-themes a lot better
 
 That is - +1 on a spec review is 'sure, I like it', +2 is specifically
 I will review this *over and above* my core commitment - the goal
 here is to have some very gentle choke on concurrent WIP without
 needing the transition to a managed pull workflow that Nova are
 discussing - which we didn't have much support for during the meeting.
 
 Obviously, any core can -2 for any of the usual reasons - this motion
 is about opening up +A to the whole Tripleo core team on specs.
 
 Reviewers, and other interested kibbitzers, please +1 / -1 as you feel fit :)

+1

I really like this.  In fact, I like it a lot more than the current
proposal for Nova.  I think the Nova team should consider this, as well.

It still rate limits code reviews by making core reviewers explicitly
commit to reviewing things.  This is like our previous attempt at
sponsoring blueprints, but the use of gerrit I think would make it more
successful.

It also addresses my primary concerns with the tensions between group
will and small groups no longer being able to self organize and push
things to completion without having to haggle through yet another process.

-- 
Russell Bryant



Re: [openstack-dev] [all] [ptls] The Czar system, or how to scale PTLs

2014-08-22 Thread Russell Bryant
On 08/22/2014 08:33 AM, Thierry Carrez wrote:
 Hi everyone,
 
 We all know being a project PTL is an extremely busy job. That's because
 in our structure the PTL is responsible for almost everything in a project:
 
 - Release management contact
 - Work prioritization
 - Keeping bugs under control
 - Communicate about work being planned or done
 - Make sure the gate is not broken
 - Team logistics (run meetings, organize sprints)
 - ...
 
 They end up being completely drowned in those day-to-day operational
 duties, miss the big picture, can't help in development that much
 anymore, get burnt out. Since you're either the PTL or not the PTL,
 you're very alone and succession planning is not working that great either.
 
 There have been a number of experiments to solve that problem. John
 Garbutt has done an incredible job at helping successive Nova PTLs
 handling the release management aspect. Tracy Jones took over Nova bug
 management. Doug Hellmann successfully introduced the concept of Oslo
 liaisons to get clear point of contacts for Oslo library adoption in
 projects. It may be time to generalize that solution.
 
 The issue is one of responsibility: the PTL is ultimately responsible
 for everything in a project. If we can more formally delegate that
 responsibility, we can avoid getting up to the PTL for everything, we
 can rely on a team of people rather than just one person.
 
 Enter the Czar system: each project should have a number of liaisons /
 official contacts / delegates that are fully responsible to cover one
 aspect of the project. We need to have Bugs czars, which are responsible
 for getting bugs under control. We need to have Oslo czars, which serve
 as liaisons for the Oslo program but also as active project-local oslo
 advocates. We need Security czars, which the VMT can go to to progress
 quickly on plugging vulnerabilities. We need release management czars,
 to handle the communication and process with that painful OpenStack
 release manager. We need Gate czars to serve as first-line-of-contact
 getting gate issues fixed... You get the idea.
 
 Some people can be czars of multiple areas. PTLs can retain some czar
 activity if they wish. Czars can collaborate with their equivalents in
 other projects to share best practices. We just need a clear list of
 areas/duties and make sure each project has a name assigned to each.
 
 Now, why czars ? Why not rely on informal activity ? Well, for that
 system to work we'll need a lot of people to step up and sign up for
 more responsibility. Making them czars makes sure that effort is
 recognized and gives them something back. Also if we don't formally
 designate people, we can't really delegate and the PTL will still be
 directly held responsible. The Release management czar should be able to
 sign off release SHAs without asking the PTL. The czars and the PTL
 should collectively be the new project drivers.
 
 At that point, why not also get rid of the PTL ? And replace him with a
 team of czars ? If the czar system is successful, the PTL should be
 freed from the day-to-day operational duties and will be able to focus
 on the project health again. We still need someone to keep an eye on the
 project-wide picture and coordinate the work of the czars. We need
 someone to pick czars, in the event multiple candidates sign up. We also
 still need someone to have the final say in case of deadlocked issues.
 
 People say we don't have that many deadlocks in OpenStack for which the
 PTL ultimate power is needed, so we could get rid of them. I'd argue
 that the main reason we don't have that many deadlocks in OpenStack is
 precisely *because* we have a system to break them if they arise. That
 encourages everyone to find a lazy consensus. That part of the PTL job
 works. Let's fix the part that doesn't work (scaling/burnout).
 

+1 on czars.  That's what was working best for me to start scaling
things in Nova, especially through my 2nd term (Icehouse).  John and
Tracy were a big help as you mentioned as examples.  There were others
that were stepping up, too.  I think it's been working well enough to
formalize it.

Another area worth calling out is a gate czar.  Having someone who
understands infra and QA quite well and is regularly on top of the
status of the project in the gate is helpful and quite important.

-- 
Russell Bryant



Re: [openstack-dev] [all] [ptls] The Czar system, or how to scale PTLs

2014-08-22 Thread Russell Bryant
On 08/22/2014 09:40 AM, Russell Bryant wrote:
 Another area worth calling out is a gate czar.  Having someone who
 understands infra and QA quite well and is regularly on top of the
 status of the project in the gate is helpful and quite important.

Oops, you said this one, too.  Anyway, +1.

-- 
Russell Bryant



Re: [openstack-dev] [nova] Server Groups - remove VM from group?

2014-08-25 Thread Russell Bryant
On 08/25/2014 12:56 PM, Joe Cropper wrote:
 That was indeed a rather long (and insightful) thread on the topic.
 It sounds like there are still some healthy discussions worth having
 on the subject -- either exploring your [potentially superseding]
 proposal, or minimally rounding out the existing server group API to
 support "add existing VM" [1] and "remove VM" -- I think these would
 make it a lot more usable (I'm thinking of the poor cloud
 administrator that makes a mistake when they boot an instance and
 either forgets to put it in a group or puts it in the wrong group --
 it's square 1 for them)?
 
 Is this queued up as a discussion point for Paris?  If so, count me in!

Adding a VM is far from trivial and is why we ripped it out before
merging.  That implies a potential reshuffling of a bunch of existing
VMs.  Consider an affinity group of instances A and B and then you add
running instance C to that group.  What do you expect to happen?  Live
migrate C to the host running A and B?  What if there isn't room?
Reschedule all 3 to find a host and live migrate all of them?  This kind
of orchestration is a good bit outside of the scope of what's done
inside of Nova today.

-- 
Russell Bryant



Re: [openstack-dev] [nova] Server Groups - remove VM from group?

2014-08-25 Thread Russell Bryant
On 08/25/2014 01:25 PM, Joe Cropper wrote:
 I was thinking something simple such as only allowing the add operation to 
 succeed IFF no policies are found to be in violation... and then nova 
 wouldn't need to get into all the complexities you mention?

Even something like this is a lot more complicated than it sounds due to
the fact that several operations can be happening in parallel.  I think
we just need to draw a line for Nova that just doesn't include this
functionality.

 And remove would be fairly straightforward as well since no constraints would 
 need to be checked. 

Right, remove is straightforward, but it seems a bit odd to have without
add.  I'm not sure there's much value to it.
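The "only add if no policy would be violated" idea under discussion can be
pictured with a toy sketch (illustrative names only, not Nova code).  The
snag is that the group's membership and host placement can change between
this check and the instance's actual scheduling, since operations run in
parallel:

```python
# Toy illustration of the proposed server-group membership check.
# Policy semantics follow the thread; names are made up for the example.
def can_add_to_group(policy: str, member_hosts: set, new_host: str) -> bool:
    if policy == "affinity":
        # All members must end up on the same host.
        return not member_hosts or member_hosts == {new_host}
    if policy == "anti-affinity":
        # No two members may share a host.
        return new_host not in member_hosts
    raise ValueError("unknown policy: %s" % policy)


print(can_add_to_group("anti-affinity", {"host1", "host2"}, "host3"))  # → True
print(can_add_to_group("affinity", {"host1"}, "host2"))  # → False
```

Even when a check like this passes at API time, a concurrent migration or
boot can invalidate it before the change lands, which is the core objection
above.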

-- 
Russell Bryant



Re: [openstack-dev] [all] Design Summit reloaded

2014-08-27 Thread Russell Bryant
On 08/27/2014 08:51 AM, Thierry Carrez wrote:
 Hi everyone,
 
 I've been thinking about what changes we can bring to the Design Summit
 format to make it more productive. I've heard the feedback from the
 mid-cycle meetups and would like to apply some of those ideas for Paris,
 within the constraints we have (already booked space and time). Here is
 something we could do:
 
 Day 1. Cross-project sessions / incubated projects / other projects
 
 I think that worked well last time. 3 parallel rooms where we can
 address top cross-project questions, discuss the results of the various
 experiments we conducted during juno. Don't hesitate to schedule 2 slots
 for discussions, so that we have time to come to the bottom of those
 issues. Incubated projects (and maybe other projects, if space allows)
 occupy the remaining space on day 1, and could occupy pods on the
 other days.

I would add "Don't hesitate to schedule 2 slots ..." to the description
for days 2 and 3, as well.  I think the same point applies for
project-specific sessions.  I don't think I've seen that used for
project sessions much, but I think it would help in some cases.

 Day 2 and Day 3. Scheduled sessions for various programs
 
 That's our traditional scheduled space. We'll have a 33% less slots
 available. So, rather than trying to cover all the scope, the idea would
 be to focus those sessions on specific issues which really require
 face-to-face discussion (which can't be solved on the ML or using spec
 discussion) *or* require a lot of user feedback. That way, appearing in
 the general schedule is very helpful. This will require us to be a lot
 stricter on what we accept there and what we don't -- we won't have
 space for courtesy sessions anymore, and traditional/unnecessary
 sessions (like my traditional release schedule one) should just move
 to the mailing-list.
 
 Day 4. Contributors meetups
 
 On the last day, we could try to split the space so that we can conduct
 parallel midcycle-meetup-like contributors gatherings, with no time
 boundaries and an open agenda. Large projects could get a full day,
 smaller projects would get half a day (but could continue the discussion
 in a local bar). Ideally that meetup would end with some alignment on
 release goals, but the idea is to make the best of that time together to
 solve the issues you have. Friday would finish with the design summit
 feedback session, for those who are still around.
 
 
 I think this proposal makes the best use of our setup: discuss clear
 cross-project issues, address key specific topics which need
 face-to-face time and broader attendance, then try to replicate the
 success of midcycle meetup-like open unscheduled time to discuss
 whatever is hot at this point.
 
 There are still details to work out (is it possible to split the space,
 should we use the usual design summit CFP website to organize the
 scheduled time...), but I would first like to have your feedback on
 this format. Also if you have alternative proposals that would make a
 better use of our 4 days, let me know.

+1 on the format.  I think it sounds like a nice iteration on our setup
to try some new ideas.

-- 
Russell Bryant



Re: [openstack-dev] [Neutron][stable] How to backport database schema fixes

2014-08-29 Thread Russell Bryant
On 08/29/2014 06:54 AM, Salvatore Orlando wrote:
 If you are running a version from a stable branch, changes in DB
 migrations should generally be forbidden as the policy states since
 those migrations are not likely to be executed again. Downgrading and
 then upgrading again is extremely risky and I don't think anybody would
 ever do that.
 
 However, if one is running stable branch X-2 where X is the current
 development branch, back porting migration fixes could make sense for
 upgrading to version X-1 if the migration being fixed is in the path
 between X-2 and X-1.
 Therefore I would forbid every fix to migrations earlier than the X-2
 release (there should not be any in theory, but Neutron has migrations
 back to Folsom). For the path between X-2 and X-1, fixes might be OK.

I think it's safe to backport to X-1.  The key bit is that the migration
in master and the backported version must be reentrant.  They need to
inspect the schema and only perform the change if it hasn't already been
applied.  This is a good best practice to adopt for *all* migrations to
make the backport option easier.
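As a minimal sketch of what "re-entrant" means here, using sqlite3 from the
standard library (the table and column names are made up for the example):

```python
import sqlite3


def upgrade(conn: sqlite3.Connection) -> None:
    # Re-entrant migration: inspect the schema first, and only perform
    # the change if it has not already been applied.
    cols = {row[1] for row in conn.execute("PRAGMA table_info(ports)")}
    if "mac_address" not in cols:
        conn.execute("ALTER TABLE ports ADD COLUMN mac_address TEXT")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ports (id INTEGER PRIMARY KEY)")
upgrade(conn)
upgrade(conn)  # running it again is a safe no-op rather than an error
cols = {row[1] for row in conn.execute("PRAGMA table_info(ports)")}
print(sorted(cols))  # → ['id', 'mac_address']
```

Because both master and the stable branch can apply the same change without
tripping over each other, the backported copy can stay identical to the
original migration.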

 However,
 rather than amending existing migration is always better to add new
 migrations - even if it's a matter of enabling a given change for a
 particular plugin (*). 

Agreed, in general.

It depends on the bug.  If there's an error in the migration that will
prevent the original code from running properly, breaking the migration,
that obviously needs to be fixed.

 As nova does, the best place for doing that is
 always immediately before release.

Doing what, adding placeholders?

Note that we actually add placeholders at the very *beginning* of a
release cycle.  The placeholders have to be put in place as the first
set of migrations in a release.  That way:

1) X-1 has those migration slots unused.

2) X has those slots reserved.

If we did it just *before* release, you can't actually backport into
those positions.  They've already run as no-ops.
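The mechanism can be pictured as a reserved no-op migration file created at
the very start of cycle X (sqlalchemy-migrate style, as Nova used at the
time; the path, number, and signature are illustrative):

```python
# e.g. nova/db/sqlalchemy/migrate_repo/versions/230_placeholder.py
# (illustrative path and number, not an actual Nova migration)

def upgrade(migrate_engine):
    # Reserved at the start of release X.  Deployments of X run this as a
    # no-op; if a fix later must be backported to stable/X-1, the backport
    # takes this number so both branches agree on which migrations ran.
    pass


upgrade(None)  # a placeholder does nothing, by design
```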

 With alembic, we do not need to add placeholders, but just adjust
 pointers just like you would when inserting an element in a dynamic list.

Good point.

 (*) we are getting rid of this conditional migration logic for juno anyway

Yay!

-- 
Russell Bryant



Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers

2014-09-04 Thread Russell Bryant
On 09/04/2014 06:24 AM, Daniel P. Berrange wrote:
 Position statement
 ==
 
 Over the past year I've increasingly come to the conclusion that
 Nova is heading for (or probably already at) a major crisis. If
 steps are not taken to avert this, the project is likely to loose
 a non-trivial amount of talent, both regular code contributors and
 core team members. That includes myself. This is not good for
 Nova's long term health and so should be of concern to anyone
 involved in Nova and OpenStack.
 
 For those who don't want to read the whole mail, the executive
 summary is that the nova-core team is an unfixable bottleneck
 in our development process with our current project structure.
 The only way I see to remove the bottleneck is to split the virt
 drivers out of tree and let them all have their own core teams
 in their area of code, leaving current nova core to focus on
 all the common code outside the virt driver impls. I nonetheless
 urge people to read the whole mail.

Fantastic write-up.  I can't +1 enough the problem statement, which I
think you've done a nice job of framing.  We've taken steps to try to
improve this, but none of them have been big enough.  I feel we've
reached a tipping point.  I think many others do too, and several
proposals being discussed all seem rooted in this same core issue.

When it comes to the proposed solution, I'm +1 on that too, but part of
that is that it's hard for me to ignore the limitations placed on us by
our current review infrastructure (gerrit).

If we ignored gerrit for a moment, is a rapid increase in splitting out
components the ideal workflow?  Would we be better off finding a way to
finally just implement a model more like the Linux kernel with
sub-system maintainers and pull requests to a top-level tree?  Maybe.
I'm not convinced that split of repos is obviously better.

You make some good arguments for why splitting has other benefits.
Besides, even if we weren't going to split them and instead wanted to
have separate branches, we'd have to take interface stability much more
seriously.  I think the work immediately needed overlaps quite a bit.

In any case, let's not get completely side-tracked on the end-game
workflow.  I am completely on board with the idea that we have to move to
a model that involves more than one team and spreads the responsibility
further than we have thus far.

I don't think we can afford to wait much longer without drastic change,
so let's make it happen.

-- 
Russell Bryant



Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers

2014-09-04 Thread Russell Bryant


- Original Message -
 On 09/04/2014 11:32 AM, Vladik Romanovsky wrote:
  +1
 
  I very much agree with Dan's proposal.
 
  I am concerned about the difficulties we will face merging
  patches that spread across various areas: manager, conductor,
  scheduler, etc.
  However, I think this is a small price to pay for having more focused
  teams.
 
  IMO, we will still have to pay it the moment the scheduler separates.
 
 There will be more pain the moment the scheduler separates, IMO,
 especially with its current design and interfaces.

I absolutely agree that the scheduler split is a non-starter without
stabilizing all of the relevant interfaces.  I hope there's not much debate
on that high-level point.  Of course, identifying exactly what those
interfaces should be is a bit more complicated, but I hope the focus can
stay there.

-- 
Russell Bryant



Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers

2014-09-05 Thread Russell Bryant
On 09/05/2014 10:06 AM, Jay Pipes wrote:
 On 09/05/2014 06:29 AM, John Garbutt wrote:
 Scheduler: I think we need to split out the scheduler with a similar
 level of urgency. We keep blocking features on the split, because we
 know we don't have the review bandwidth to deal with them. Right now I
 am talking about a compute related scheduler in the compute program,
 that might evolve to worry about other services at a later date.
 
 -1
 
 Without first cleaning up the interfaces around resource tracking, claim
 creation and processing, and the communication interfaces between the
 nova-conductor, nova-scheduler, and nova-compute.
 
 I see no urgency at all in splitting out the scheduler. The cleanup of
 the interfaces around the resource tracker and scheduler has great
 priority, though, IMO.

I'd just reframe things ... I'd like the work you're referring to here
be treated as an obvious key pre-requisite to a split, and this cleanup
is what should be treated with urgency by those with a vested interest
in getting more autonomy around scheduler development.

-- 
Russell Bryant



Re: [openstack-dev] doubling our core review bandwidth

2014-09-08 Thread Russell Bryant
On 09/08/2014 05:17 AM, Steven Hardy wrote:
 On Mon, Sep 08, 2014 at 03:14:24PM +1200, Robert Collins wrote:
 I hope the subject got your attention :).

 This might be a side effect of my having too many cosmic rays, but its
 been percolating for a bit.

 tl;dr I think we should drop our 'needs 2x+2 to land' rule and instead
 use 'needs 1x+2'. We can ease up a large chunk of pressure on our
 review bottleneck, with the only significant negative being that core
 reviewers may see less of the code going into the system - but they
  can always read more to stay in shape if that's an issue :)
 
 I think this may be a sensible move, but only if it's used primarily to
 land the less complex/risky patches more quickly.
 
 As has been mentioned already by Angus, +1 can (and IMO should) be used for
  any less trivial and/or risky patches, as the more-eyeballs thing is really
 important for big or complex patches (we are all fallible, and -core folks
 quite regularly either disagree, spot different types of issue, or just
 have better familiarity with some parts of the codebase than others).
 
 FWIW, every single week in the Heat queue, disagreements between -core
 reviewers result in issues getting fixed before merge, which would result
 in more bugs if the 1x+2 scheme was used unconditionally.  I'm sure other
 projects are the same, but I guess this risk can be mitigated with reviewer
 +1 discretion.

Agreed with this.  I think this is a worthwhile move for simpler
patches.  I've already done it plenty of times for a very small category
of things (like translations updates).  It would be worth having someone
write up a proposal that reflects this, with some examples that
demonstrate patches that really need the second review vs others that
don't.  In the end, it has to be based on trust in a -core team member
judgement call.

-- 
Russell Bryant



Re: [openstack-dev] On an API proxy from baremetal to ironic

2014-09-09 Thread Russell Bryant
On 09/09/2014 05:24 PM, Michael Still wrote:
 Hi.
 
 One of the last things blocking Ironic from graduating is deciding
 whether or not we need a Nova API proxy for the old baremetal
 extension to new fangled Ironic API. The TC has asked that we discuss
 whether we think this functionality is actually necessary.
 
 It should be noted that we're _not_ talking about migration of
 deployed instances from baremetal to Ironic. That is already
 implemented. What we are talking about is if users post-migration
 should be able to expect their previous baremetal Nova API extension
 to continue to function, or if they should use the Ironic APIs from
 that point onwards.
 
 Nova had previously thought this was required, but it hasn't made it
 in time for Juno unless we do a FFE, and it has been suggested that
  perhaps it's not needed at all because it is an admin extension.
 
 To be super specific, we're talking about the baremetal nodes admin
 extension here. This extension has the ability to:
 
  - list nodes running baremetal
  - show detail of one of those nodes
  - create a new baremetal node
  - delete a baremetal node
 
 Only the first two of those would be supported if we implemented a proxy.
 
 So, discuss.

I'm in favor of proceeding with deprecation without requiring the API proxy.

In the case of user facing APIs, the administrators in charge of
upgrading the cloud do not have full control over all of the apps using
the APIs.  In this particular case, I would expect that the cloud
administrators have *complete* control over the use of these APIs.

Assuming we have one overlap release (Juno) to allow the migration to
occur and given proper documentation of the migration plan and release
notes stating the fact that the old APIs are going away, we should be fine.

In summary, +1 to moving forward without the API proxy requirement.

-- 
Russell Bryant



Re: [openstack-dev] Kilo Cycle Goals Exercise

2014-09-10 Thread Russell Bryant
On 09/03/2014 11:37 AM, Joe Gordon wrote:
 As you all know, there has recently been several very active discussions
 around how to improve assorted aspects of our development process. One idea
 that was brought up is to come up with a list of cycle goals/project
 priorities for Kilo [0].
 
 To that end, I would like to propose an exercise as discussed in the TC
 meeting yesterday [1]:
 Have anyone interested (especially TC members) come up with a list of
 what they think the project wide Kilo cycle goals should be and post
 them on this thread by end of day Wednesday, September 10th. After which
 time we can begin discussing the results.
 The goal of this exercise is to help us see if our individual world
 views align with the greater community, and to get the ball rolling on a
 larger discussion of where as a project we should be focusing more time.

In OpenStack, we have no shortage of interest and enthusiasm on all
fronts, including development contributors, deployers, and cloud end
users.  When looking at project wide-priorities, we need to make sure
our tools, processes, and resulting technology facilitate turning all of
that interest into real value.  We need to identify which areas have the
most pain, and work to improve them.

A lot of this is certainly about Kilo, but it's longer term, too.  It's
the way we should always be thinking about this.

1) Dev community

We clearly have a lot of growing pains here.  What's quite encouraging
is that we also have a lot of hard work going into several different
proposals to figure out ways to help.

The largest projects (Nova and Neutron) are overwhelmed and approaching
breaking points.  We have to find ways to relieve this pressure.  This
may involve aggressively pursing project splits or other code review
workflow changes.  I think the problems and solutions here are
project-wide issues, as solutions put in place tend to rapidly spread to
the rest of OpenStack.  This is an area I'm especially concerned about
and eager to help look for solutions.  We should evaluate all potential
improvements against how well they help us scale our teams and processes
to remove bottlenecks to productivity in the dev community.

There are several other encouraging proposals related to easing pain in
the dev community:

 - re-working how we do functional testing by making it more project focused

 - discussions like this one to discuss both priorities, but also how we
turn priorities into real action (like the nova projects discussions
around using priorities in development)

 - evolving project leadership (the PTL position) so that we can provide
more guidance around delegation in a way that is reasonably consistent
across projects

 - continued discussion about the contents of the integrated release and
how we can continue to foster growth without sacrificing quality

We are always going to have problems like this, and I hope we continue
to think about, discuss, and improve the way we run our projects every
release cycle to come.

2) End Users

A few others have done a very nice job describing end user experience
problems.  Monty's description of getting an instance with an IP was
painful and embarrassing to read.  We've got to figure out ways to
provide better focus on these sorts of usability issues.  They're
obviously not getting the attention they deserve.

There have also been lots of good points about improving our API
consistency.  I totally agree.  I'd love to see a group of people step
up and emerge as leaders in this area across all projects.  I feel like
that's something we're sorely missing right now.

3) Deployers

OpenStack is still painful to deploy, and even more painful to manage.
I'm still quite pleased that we have a deployment program working on
this space.  I'd actually really like to see how we can facilitate
better integration and discussion between TripleO and the rest of the
project teams.

I'm also very pleased with the progress we've made in Nova towards the
initial support for live upgrades.  We still have more work to do in
Nova, but I'd also like to see more work done in other projects towards
the same goal.

For both deployers and the dev community, figuring out what went wrong
when OpenStack breaks sucks.  Others provided some good pointers to
several areas we can improve that area (better logs, tooling, ...) and I
hope we can make some real progress in this area in the coming cycle.

-- 
Russell Bryant



Re: [openstack-dev] [nova] Server Groups - remove VM from group?

2014-09-10 Thread Russell Bryant


 On Sep 10, 2014, at 2:03 PM, Joe Cropper cropper@gmail.com wrote:
 
 I agree, Chris.  I think a number of folks put in a lot of really great work 
 into the existing server groups and there has been a lot of interest on their 
 usage, especially given that the scheduler already has some constructs in 
 place to piggyback on them.
 
 I would like to craft up a blueprint proposal for Kilo to add two simple 
 extensions to the existing server group APIs that I believe will make them 
 infinitely more usable in any ‘real world’ scenario.  I’ll put more details 
 in the proposal, but in a nutshell:
 
 1. Adding a VM to a server group
 Only allow it to succeed if its policy wouldn’t be violated by the addition 
 of the VM
 

I'm not sure that determining this at the time of the API request is possible 
due to the parallel and async nature of the system. I'd love to hear ideas on 
how you think this might be done, but I'm really not optimistic and would 
rather just not go down this road. 
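A minimal sketch of the kind of check being debated here. It is purely illustrative: `ServerGroup` and `violates_policy` are hypothetical names, not Nova's actual data model, and it assumes the hosts of the current group members are known and stable.

```python
# Hypothetical sketch of the proposed "add to group" policy check.
# ServerGroup and violates_policy are illustrative names, not Nova's API.

from dataclasses import dataclass


@dataclass
class ServerGroup:
    policy: str        # "affinity" or "anti-affinity"
    member_hosts: set  # hosts of the current group members


def violates_policy(group: ServerGroup, candidate_host: str) -> bool:
    """Return True if adding a VM on candidate_host would break the policy."""
    if not group.member_hosts:
        return False  # empty group: any host is acceptable
    if group.policy == "affinity":
        # all members must share a single host
        return candidate_host not in group.member_hosts
    if group.policy == "anti-affinity":
        # no two members may share a host
        return candidate_host in group.member_hosts
    return False  # unknown policies are not enforced here
```

The objection above is visible in this shape: between evaluating `violates_policy()` and committing the membership row, a concurrent migration can change `member_hosts`, so without some locking or retry scheme the answer is only advisory.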

 2. Removing a VM from a server group
 Just allow it
 
 I think this would round out the support that’s there and really allow us to 
 capitalize on the hard work everyone’s already put into them.
 
 - Joe
 
 On Aug 26, 2014, at 6:39 PM, Chris Friesen chris.frie...@windriver.com 
 wrote:
 
 On 08/25/2014 11:25 AM, Joe Cropper wrote:
 I was thinking something simple such as only allowing the add
 operation to succeed IFF no policies are found to be in violation...
 and then nova wouldn't need to get into all the complexities you
 mention?
 
 Personally I would be in favour of this...nothing fancy, just add it if it 
 already meets all the criteria.  This is basically just a database operation 
 so I would hope we could make it reliable in the face of simultaneous things 
 going on with the instance.
 
 And remove would be fairly straightforward as well since no
 constraints would need to be checked.
 
 Agreed.
 
 Chris
 


Re: [openstack-dev] [nova] Server Groups - remove VM from group?

2014-09-10 Thread Russell Bryant
On 09/10/2014 06:46 PM, Joe Cropper wrote:
 Hmm, not sure I follow the concern, Russell.  How is that any different
 from putting a VM into the group when it’s booted as is done today?
  This simply defers the ‘group insertion time’ to some time after
 the VM’s initially been spawned, so I’m not sure this creates any more race
 conditions than what’s already there [1].
 
 [1] Sure, the to-be-added VM could be in the midst of a migration or
 something, but that would be pretty simple to check to make sure its task
 state is None or some such.

The way this works at boot is already a nasty hack.  It does policy
checking in the scheduler, and then has to re-do some policy checking at
launch time on the compute node.  I'm afraid of making this any worse.
In any case, it's probably better to discuss this in the context of a
more detailed design proposal.

-- 
Russell Bryant



Re: [openstack-dev] [nova][neutron][cinder] Averting the Nova crisis by splitting out virt drivers

2014-09-10 Thread Russell Bryant
On 09/10/2014 10:35 PM, Armando M. wrote:
 Hi,
 
 I devoured this thread; it was so interesting and full of
 insights. It's not news that we've been pondering this in the
 Neutron project for the past cycle or two.
 
 Likely, this effort is going to take more than two cycles, and would
 require a very focused team of people working closely together to
 address this (most likely the core team members plus a few other folks
 interested).
 
 One question I was unable to get a clear answer to was: what happens to
 existing/new bug fixes and features? Would the codebase go in lockdown
 mode, i.e. not accepting anything else that isn't specifically
 targeting this objective? Just using NFV as an example, I can't
 imagine having changes supporting NFV still being reviewed and merged
 while this process takes place...it would be like shooting at a moving
 target! If we did go into lockdown mode, what happens to all the
 corporate-backed agendas that aim at delivering new value to
 OpenStack?

Yes, I imagine a temporary slow-down on new feature development makes
sense.  However, I don't think it has to be across the board.  Things
should be considered case by case, like usual.

For example, a feature that requires invasive changes to the virt driver
interface might have a harder time during this transition, but a more
straightforward feature isolated to the internals of a driver might be
fine to let through.  Like anything else, we have to weigh cost/benefit.

 Should we relax what goes into the stable branches, i.e. consider
 having a Juno on steroids six months from now that includes some of
 the features/fixes that didn't land in time before this process kicks
 off?

No ... maybe I misunderstand the suggestion, but I definitely would not
be in favor of a Juno branch with features that haven't landed in master.

-- 
Russell Bryant



Re: [openstack-dev] Kilo Cycle Goals Exercise

2014-09-10 Thread Russell Bryant
On 09/11/2014 12:52 AM, Angus Lees wrote:
 So easy/obvious it probably isn't even worth mentioning:
 
 Drop support for python2.6

Yeah, that's been the plan.  We discussed this at the Juno summit and
representatives from most (all?) distributions carrying OpenStack were
there.  Dropping in Kilo seemed like a reasonable time frame at the time.

https://etherpad.openstack.org/p/juno-cross-project-future-of-python

And obviously tweeting about it makes it official, right?

https://twitter.com/russellbryant/status/466241078472228864

But seriously, we should probably put out a more official notice about
this once Kilo opens up.

-- 
Russell Bryant



Re: [openstack-dev] [nova] nova-specs for Kilo?

2014-09-10 Thread Russell Bryant
On 09/11/2014 01:32 AM, Joe Cropper wrote:
 Hi Folks,
 
 Just wondering if the nova-specs master branch will have a ‘kilo’
 directory created soon for Kilo proposals? I have a few things I’d like
 to submit, just looking for the proper home.

There's some more info on that here:

http://lists.openstack.org/pipermail/openstack-dev/2014-August/044431.html

-- 
Russell Bryant



Re: [openstack-dev] [nova] Server Groups - remove VM from group?

2014-09-12 Thread Russell Bryant
On 09/11/2014 05:01 PM, Jay Pipes wrote:
 On 09/11/2014 04:51 PM, Matt Riedemann wrote:
 On 9/10/2014 6:00 PM, Russell Bryant wrote:
 On 09/10/2014 06:46 PM, Joe Cropper wrote:
 Hmm, not sure I follow the concern, Russell.  How is that any different
 from putting a VM into the group when it’s booted as is done today?
   This simply defers the ‘group insertion time’ to some time after
 the VM’s initially been spawned, so I’m not sure this creates any more
 race
 conditions than what’s already there [1].

 [1] Sure, the to-be-added VM could be in the midst of a migration or
 something, but that would be pretty simple to check to make sure its task
 state is None or some such.

 The way this works at boot is already a nasty hack.  It does policy
 checking in the scheduler, and then has to re-do some policy checking at
 launch time on the compute node.  I'm afraid of making this any worse.
 In any case, it's probably better to discuss this in the context of a
 more detailed design proposal.


 This [1] is the hack you're referring to right?

 [1]
 http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2.b3#n1297

 
 That's the hack *I* had in the back of my mind.

Yep.

-- 
Russell Bryant



Re: [openstack-dev] [all] Design Summit planning

2014-09-12 Thread Russell Bryant
On 09/12/2014 07:37 AM, Thierry Carrez wrote:
 If you think this is wrong and think the design summit suggestion
 website is a better way to do it, let me know why! If some programs
 really can't stand the 'etherpad/IRC' approach I'll see how we can spin
 up a limited instance.

I think this is fine, especially if it's a better reflection of reality
and lets the teams work more efficiently.

However, one of the benefits of the old submission system was the
clarity of the process and openness to submissions from anyone.  We
don't want to be in a situation where non-core folks feel like they have
a harder time submitting a session.

Once this is settled, as long as the wiki pages [1][2] reflect the
process and are publicized, it should be fine.

[1] https://wiki.openstack.org/wiki/Summit
[2] https://wiki.openstack.org/wiki/Summit/Planning

-- 
Russell Bryant



Re: [openstack-dev] [Neutron] keep old specs

2014-09-15 Thread Russell Bryant
On 09/15/2014 10:01 AM, Kevin Benton wrote:
 Some of the specs had a significant amount of detail and thought put
 into them. It seems like a waste to bury them in a git tree history.
 
 By having them in a place where external parties (e.g. operators) can
 easily find them, they could get more visibility and feedback for any
 future revisions. Just being able to see that a feature was previously
 designed out and approved can prevent a future person from wasting a
 bunch of time typing up a new spec for the same feature. Hardly anyone
 is going to search deleted specs from two cycles ago if it requires
 checking out a commit.
 
 Why just restrict the whole repo to being documentation of what went
 in?  If that's all the specs are for, why don't we just wait to create
 them until after the code merges?

FWIW, I agree with you that it makes sense to keep them in a directory
that makes it clear that they were not completed.

There's a ton of useful info in them.  Even if they get re-proposed,
it's still useful to see the difference in the proposal as it evolved
between releases.

-- 
Russell Bryant



Re: [openstack-dev] [Nova] What's holding nova development back?

2014-09-15 Thread Russell Bryant
On 09/15/2014 05:42 AM, Daniel P. Berrange wrote:
 On Sun, Sep 14, 2014 at 07:07:13AM +1000, Michael Still wrote:
 Just an observation from the last week or so...

 The biggest problem nova faces at the moment isn't code review latency. Our
 biggest problem is failing to fix our bugs so that the gate is reliable.
 The number of rechecks we've done in the last week to try and land code is
 truly startling.
 
 I consider both problems to be pretty much equally important. I don't
 think solving review latency or test reliability in isolation is enough to
 save Nova. We need to tackle both problems as a priority. I tried to avoid
 getting into my concerns about testing in my mail on review team bottlenecks
 since I think we should address the problems independently / in parallel.

Agreed with this.  I don't think we can afford to ignore either one of them.

-- 
Russell Bryant



Re: [openstack-dev] reopen a change / pull request for nova-pythonclient ?

2014-09-17 Thread Russell Bryant
On 09/17/2014 11:47 AM, Alex Leonhardt wrote:
 hi,
 
 how does one re-open an abandoned change / pull request? It just timed
 out and was then abandoned -
 
 https://review.openstack.org/#/c/57834/
 
 please let me know

I re-opened it.  You should be able to update it now.

-- 
Russell Bryant



Re: [openstack-dev] reopen a change / pull request for nova-pythonclient ?

2014-09-17 Thread Russell Bryant
On 09/17/2014 11:56 AM, Daniel P. Berrange wrote:
 On Wed, Sep 17, 2014 at 04:47:06PM +0100, Alex Leonhardt wrote:
 hi,

 how does one re-open an abandoned change / pull request? It just timed
 out and was then abandoned -

 https://review.openstack.org/#/c/57834/

 please let me know
 
 Just re-upload the change, maintaining the same Change-Id line in the
 commit message.

Gerrit will reject it if it's still abandoned.  You have to restore it
first.

-- 
Russell Bryant



Re: [openstack-dev] [heat][nova] VM restarting on host failure in convergence

2014-09-17 Thread Russell Bryant
On 09/17/2014 09:03 AM, Jastrzebski, Michal wrote:
 In short, what we'll need from nova is a 100% reliable
 host-health monitor and an equally reliable rebuild/evacuate mechanism
 with fencing and a scheduler. In heat we need a scalable and reliable
 event listener and an engine to decide which action to perform in a given
 situation.

Unfortunately, I don't think Nova can provide this alone.  Nova only
knows about whether or not the nova-compute daemon is currently
communicating with the rest of the system.  Even if the nova-compute
daemon drops out, the compute node may still be running all instances
just fine.  We certainly don't want to impact those running workloads
unless absolutely necessary.

I understand that you're suggesting that we enhance Nova to be able to
provide that level of knowledge and control.  I actually don't think
Nova should have this knowledge of its underlying infrastructure.

I would put the host monitoring infrastructure (to determine if a host
is down) and fencing capability as out of scope for Nova and as a part
of the supporting infrastructure.  Assuming those pieces can properly
detect that a host is down and fence it, then all that's needed from
Nova is the evacuate capability, which is already there.  There may be
some enhancements that could be done to it, but surely it's quite close.

There's also the part where a notification needs to go out saying that
the instance has failed.  Something (which could be Heat in the case of
this proposal) can react to that, either directly or via ceilometer, for
example.  There is an API today to hard reset the state of an instance
to ERROR.  After a host is fenced, you could use this API to mark all
instances on that host as dead.  I'm not sure if there's an easy way to
do that for all instances on a host today.  That's likely an enhancement
we could make to python-novaclient, similar to the evacuate all
instances on a host enhancement that was done in novaclient.
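A hedged sketch of the novaclient-side enhancement described above: marking every instance on a fenced host as ERROR. `reset_state()` and the `host`/`all_tenants` search filters do exist in python-novaclient, but treat the exact wiring here as illustrative; the client is injected so the logic can be exercised without a live API.

```python
# Illustrative only: mark all instances on a fenced host as ERROR using
# a novaclient-like object.  The nova client is passed in rather than
# constructed, so any object with the same surface works.

def reset_host_instances(nova, host, state="error"):
    """Hard-reset the state of every instance on `host`; return their IDs."""
    # admin-only filter: list servers across tenants on one hypervisor
    servers = nova.servers.list(search_opts={"host": host, "all_tenants": 1})
    for server in servers:
        nova.servers.reset_state(server, state=state)
    return [s.id for s in servers]
```

After the external infrastructure fences the host, calling `reset_host_instances(nova, "compute-7")` would leave each instance in ERROR, ready for the existing evacuate flow.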

-- 
Russell Bryant



Re: [openstack-dev] [Oslo] First steps towards amqp 1.0

2013-12-09 Thread Russell Bryant
On 12/09/2013 12:56 PM, Gordon Sim wrote:
 In the case of Nova (and others that followed Nova's messaging
 patterns), I firmly believe that for scaling reasons, we need to move
 toward it becoming the norm to use peer-to-peer messaging for most
 things.  For example, the API and conductor services should be talking
 directly to compute nodes instead of through a broker.
 
 Is scale the only reason for preferring direct communication? I don't
 think an intermediary based solution _necessarily_ scales less
 effectively (providing it is distributed in nature, which for example is
 one of the central aims of the dispatch router in Qpid).
 
 That's not to argue that peer-to-peer shouldn't be used, just trying to
 understand all the factors.

Scale is the primary one.  If the intermediary based solution is easily
distributed to handle our scaling needs, that would probably be fine,
too.  That just hasn't been our experience so far with both RabbitMQ and
Qpid.

 One other pattern that can benefit from intermediated message flow is in
 load balancing. If the processing entities are effectively 'pulling'
 messages, this can more naturally balance the load according to capacity
 than when the producer of the workload is trying to determine the best
 balance.

Yes, that's another factor.  Today, we rely on the message broker's
behavior to equally distribute messages to a set of consumers.

One example is how Nova components talk to the nova-scheduler service.
All instances of the nova-scheduler service are reading off a single
'scheduler' queue, so messages hit them round-robin.

In the case of the zeromq driver, this logic is embedded in the client:
it has to know about all consumers and handles choosing where each
message goes itself.  See references to the 'matchmaker' code for this.

Honestly, using a more lightweight distributed router like Dispatch
sounds *much* nicer.
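A toy model (not oslo.messaging code) of the two dispatch styles contrasted above: a broker round-robins a shared queue among consumers, while the zeromq driver's matchmaker makes the sender pick a peer itself.

```python
# Toy illustration of broker-side vs client-side message distribution.
# Neither function is oslo.messaging's implementation; they just make
# the two patterns concrete.

import itertools


def broker_round_robin(messages, consumers):
    """Broker-style: consumers take turns pulling from one shared queue."""
    assignment = {}
    for msg, consumer in zip(messages, itertools.cycle(consumers)):
        assignment[msg] = consumer
    return assignment


def matchmaker_pick(topic, registry, key):
    """Client-side: the sender consults a host registry and picks a peer."""
    hosts = registry[topic]
    return hosts[hash(key) % len(hosts)]
```

In the first pattern, the scheduler instances reading the shared 'scheduler' queue see messages arrive round-robin; in the second, the sending client must already know every consumer, which is the matchmaker's job.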

  The exception
 to that is cases where we use a publish-subscribe model, and a broker
 serves that really well.  Notifications and notification consumers
 (such as Ceilometer) are the prime example.
 
 The 'fanout' RPC cast would perhaps be another?

Good point.

In Nova we have been working to get rid of the usage of this pattern.
In the latest code the only place it's used AFAIK is in some code we
expect to mark deprecated (nova-network).

 In terms of existing messaging drivers, you could accomplish this with
 a combination of both RabbitMQ or Qpid for brokered messaging and
 ZeroMQ for the direct messaging cases.  It would require only a small
 amount of additional code to allow you to select a separate driver for
 each case.

 Based on my understanding, AMQP 1.0 could be used for both of these
 patterns.  It seems ideal long term to be able to use the same
 protocol for everything we need.
 
 That is certainly true. AMQP 1.0 is fully symmetric so it can be used
 directly peer-to-peer as well as between intermediaries. In fact, apart
 from the establishment of the connection in the first place, a process
 need not see any difference in the interaction either way.
 
 We could use only ZeroMQ, as well.  It doesn't have the
 publish-subscribe stuff we need built in necessarily.  Surely that has
 been done multiple times by others already, though.  We could build it
 too, if we had to.
 
 Indeed. However the benefit of choosing a protocol is that you can use
 solutions developed outside OpenStack or any other single project.
 
 Can you (or someone) elaborate further on what will make this solution
 superior to our existing options?
 
 Superior is a very bold claim to make :-) I do personally think that an
 AMQP 1.0 based solution would be worthwhile for the reasons above. Given
 a hypothetical choice between say the current qpid driver and one that
 could talk to different back-ends, over a standard protocol for which
 e.g. semantic monitoring tools could be developed and which would make
 reasoning about partial upgrades or migrations easier, I know which I
 would lean to. Obviously that is not the choice here, since one already
 exists and the other is as yet hypothetical. However, as I say I think
 this could be a worthwhile journey and that would justify at least
 taking some initial steps.

Thanks for sharing some additional insight.

I was already quite optimistic, but you've helped solidify that.  I'm
very interested in diving deeper into how Dispatch would fit into the
various ways OpenStack is using messaging today.  I'd like to get a
better handle on how the use of Dispatch as an intermediary would scale
out for a deployment that consists of 10s of thousands of compute nodes,
for example.

Is it roughly just that you can have a network of N Dispatch routers
that route messages from point A to point B, and for notifications we
would use a traditional message broker (qpidd or rabbitmq) ?

-- 
Russell Bryant


Re: [openstack-dev] [Oslo] First steps towards amqp 1.0

2013-12-09 Thread Russell Bryant
On 12/09/2013 05:16 PM, Gordon Sim wrote:
 On 12/09/2013 07:15 PM, Russell Bryant wrote:
 On 12/09/2013 12:56 PM, Gordon Sim wrote:
 In the case of Nova (and others that followed Nova's messaging
 patterns), I firmly believe that for scaling reasons, we need to move
 toward it becoming the norm to use peer-to-peer messaging for most
 things.  For example, the API and conductor services should be talking
 directly to compute nodes instead of through a broker.

 Is scale the only reason for preferring direct communication? I don't
 think an intermediary based solution _necessarily_ scales less
 effectively (providing it is distributed in nature, which for example is
 one of the central aims of the dispatch router in Qpid).

 That's not to argue that peer-to-peer shouldn't be used, just trying to
 understand all the factors.

 Scale is the primary one.  If the intermediary based solution is easily
 distributed to handle our scaling needs, that would probably be fine,
 too.  That just hasn't been our experience so far with both RabbitMQ and
 Qpid.
 
 Understood. The Dispatch Router was indeed created from an understanding
 of the limitations and drawbacks of the 'federation' feature of qpidd
 (which was the primary mechanism for scaling beyond one broker) as well
 as learning lessons around the difficulties of message replication and
 storage.

Cool.  To make the current situation worse, AFAIK, we've never been able
to make Qpid federation work at all for OpenStack.  That may be due to
the way we use Qpid, though.

For RabbitMQ, I know people are at least using active-active clustering
of the broker.

 One other pattern that can benefit from intermediated message flow is in
 load balancing. If the processing entities are effectively 'pulling'
 messages, this can more naturally balance the load according to capacity
 than when the producer of the workload is trying to determine the best
 balance.

 Yes, that's another factor.  Today, we rely on the message broker's
 behavior to equally distribute messages to a set of consumers.
 
 Sometimes you even _want_ message distribution to be 'unequal', if the
 load varies by message or the capacity by consumer. E.g. If one consumer
 is particularly slow (or is given a particularly arduous task), it may
 not be optimal for it to receive the same portion of subsequent messages
 as other less heavily loaded or more powerful consumers.

Indeed.  We haven't tried to do that anywhere, but it would be an
improvement for some cases.

   The exception
 to that is cases where we use a publish-subscribe model, and a broker
 serves that really well.  Notifications and notification consumers
 (such as Ceilometer) are the prime example.

 The 'fanout' RPC cast would perhaps be another?

 Good point.

 In Nova we have been working to get rid of the usage of this pattern.
 In the latest code the only place it's used AFAIK is in some code we
 expect to mark deprecated (nova-network).
 
 Interesting. Is that because of problems in scaling the messaging
 solution or for other reasons?

It's primarily a scaling concern.  We're assuming that broadcasting
messages is generally an anti-pattern for the massive scale we're aiming
for.

 [...]
 I'm very interested in diving deeper into how Dispatch would fit into
 the various ways OpenStack is using messaging today.  I'd like to get
 a better handle on how the use of Dispatch as an intermediary would
 scale out for a deployment that consists of 10s of thousands of
 compute nodes, for example.

 Is it roughly just that you can have a network of N Dispatch routers
 that route messages from point A to point B, and for notifications we
 would use a traditional message broker (qpidd or rabbitmq) ?
 
 For scaling the basic idea is that not all connections are made to the
 same process and therefore not all messages need to travel through a
 single intermediary process.
 
 So for N different routers, each has a portion of the total number of
 publishers and consumers connected to them. Though clients can
 communicate even if they are not connected to the same router, each
 router only needs to handle the messages sent by the publishers directly
 attached, or sent to the consumer directly attached. It never needs to
 see messages between publishers and consumers that are not directly
 attached.
 
 To address your example, the 10s of thousands of compute nodes would be
 spread across N routers. Assuming these were all interconnected, a
 message from the scheduler would only travel through at most two of
 these N routers (the one the scheduler was connected to and the one the
 receiving compute node was connected to). No process needs to be able to
 handle 10s of thousands of connections itself (as contrasted with full
 direct, non-intermediated communication, where the scheduler would need
 to manage connections to each of the compute nodes).
 
 This basic pattern is the same as networks of brokers, but Dispatch
 router has been designed from the start

Re: [openstack-dev] [Nova] New API requirements, review of GCE

2013-12-10 Thread Russell Bryant
On 12/10/2013 08:47 AM, Christopher Yeoh wrote:
 On Tue, Dec 10, 2013 at 11:36 PM, Alexandre Levine
 alev...@cloudscaling.com wrote:
 
 Russell,
 
 I'm a little confused about importing it into stackforge repository.
 Please clarify it for me.
 Right now our code is a part of nova itself, adding lots of files
 and changing several existing ones, namely: service.py, cmd/api.py,
 setup.cfg, api-paste.ini and nova.conf.sample. This is only the
 core. Also, our functional code for some marginal GCE API
 functionality has to use a database (to store naming translations).
 So for creating the stackforge repository we have 3 different
 options with different costs in terms of additional labor:
 1. Create a copy of nova and add GCE as a part of it. I don't
 feel this is a customary way for such additions, but it'll be the
 least costly for us.
 2. Separate our code from nova while still keeping it a part of nova.
 The stackforge repository will contain only GCE code. It'd be installed
 after nova is installed, create another nova service, change
 api-paste.ini and nova.conf, and create tables in the nova DB (or its
 own DB, I'm not sure here). This will require some changes to the
 present code, but not many.
 3. Completely separate GCE and make it a standalone service using nova
 via REST. The most costly option for us now, though still doable.
 
 
 In the long run I suspect that option 3 will probably be the least work
 for everyone. Why be dependent on changes to Nova's internal APIs unless
 you really have to when you have the alternative of being dependent on a
 much more stable REST API?

#3 is preferable by far, IMO.

#2 is roughly what I was suggesting.  If it doesn't go into Nova, this
would be a good way to manage it as an add-on.  If it does go into Nova,
it would be a good way to stage such a big addition, since we'll be able
to see CI running the tests against it.  That's beneficial for the code
even if it doesn't go into Nova.

I'm still waiting to see what kind of support there is for this.  We
really need clear support for it being in the tree before accepting the
ongoing maintenance and review burden.

-- 
Russell Bryant



Re: [openstack-dev] [Nova] New API requirements, review of GCE

2013-12-10 Thread Russell Bryant
On 12/10/2013 11:13 AM, Alexandre Levine wrote:
 Yes, I understand it perfectly, Christopher, and cannot agree more. It's
 just more work to reach this right now than to use what's present. Still,
 in my opinion, even in the mid-run, just until the Icehouse release, it
 might be less work overall.
 I'm going to think it over.

So ... if you really do feel that way, I'm not sure it makes a lot of
sense to merge it one way if there's already a plan emerging to re-do
it.  We'd have to go through a painful deprecation cycle of the old code
where we're maintaining it in two places for a while.

-- 
Russell Bryant



Re: [openstack-dev] [Nova] New API requirements, review of GCE

2013-12-10 Thread Russell Bryant
On 12/10/2013 10:47 AM, Alexandre Levine wrote:
 Does nova actually have add-ons infrastructure in which our GCE API can
 fit? I see the plugins with xen-server only and have found this link:
 http://docs.openstack.org/developer/nova/api/nova.openstack.common.plugin.plugin.html.
 Is it the thing? I'm trying to understand what you meant by this: If it
 doesn't go into Nova, this would be a good way to manage it as an add-on.

I mean as a place to host the code with infrastructure that can help
make sure it stays in sync with Nova.

There's no real plugin infrastructure that this would fit in to.  It
would be a new service in your repo, based on Nova's service code, that
loads the GCE API.  Your code would be importing nova.* stuff, even
though it's in a separate repo.

 Also, if we do it as a separate service, does it remove the need for a
 support commitment from the nova core team?

Yes.

 And if it does who would have to commit instead?

It could be anyone.  I presume it would be the original authors who were
planning to help maintain it anyway, as well as anyone else that the
project can attract that is interested in GCE.

 Sorry if I ask some obvious questions.

No problem ... there isn't an existing example for you to follow exactly,
so it's not obvious.

-- 
Russell Bryant



[openstack-dev] [Nova] icehouse-2 blueprint deadline

2013-12-10 Thread Russell Bryant
Greetings,

We have a lot of blueprints targeted to icehouse-2 [1] that are still
under review; the ones without a priority set are the ones awaiting
review.  We need a deadline for blueprints to be finalized for this
milestone.  I propose that deadline be Thursday, December 19.

Any blueprints that have not been approved at that point will be moved
to the icehouse-3 milestone.

If you have an icehouse-2 blueprint, please check its status.
Specifically, check the Definition field of the blueprint.  Here is
what it means:

  Approved - blueprint has been approved for this milestone

  Pending Approval - waiting on a blueprint reviewer to follow up

  Review - waiting on the blueprint submitter to provide more
information.  Change to Pending Approval once you feel that the
information has been provided.

  Drafting - Details still being written up by the submitter, so the
blueprint has not been reviewed.  Update to Pending Approval when ready.

  Discussion - Blueprint approval hinges on the result of a discussion.
 The blueprint should contain a link to a mailing list thread discussing
the blueprint.  Once you feel the discussion has concluded and more
review is needed, update it to Pending Approval.

  New - new blueprints that have not been triaged yet.  A blueprint
reviewer (member of nova-drivers) will triage it soon.


After this deadline, anything left in Review, Drafting, or Discussion
state will be moved to icehouse-3.

Thanks,

[1] https://launchpad.net/nova/+milestone/icehouse-2

-- 
Russell Bryant



Re: [openstack-dev] [qa] [Solum] [tempest] Use of pecan test framework in functional tests

2013-12-10 Thread Russell Bryant
On 12/10/2013 04:10 PM, Georgy Okrokvertskhov wrote:
 Hi,
 
 In the Solum project we are currently creating test environments for
 future tests. We split unit tests and functional tests in order to use
 the Tempest framework from the beginning.
 
 Tempest framework assumes that you run your service and test APi
 endpoints by sending HTTP requests. Solum uses Pecan WSGI framework
 which has its own test framework based on WebTest. This framework allows
 to test application without sending actual HTTP traffic. It mocks low
 level stuff related to transport but keeps all high level WSGI part as
 it is a real life application\service.
 
 There is a question to QA\Tempest teams, what do you think about using
 pecan test framework in tempest for Pecan based applications?

I don't think that makes sense.  Then we're not using the code like it
would be used normally (via HTTP).
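To make the distinction concrete: WebTest-style testing invokes the WSGI application in-process, while Tempest exercises the service over real HTTP. A minimal stdlib-only sketch of the in-process style (the app and helper here are illustrative, not Solum code):

```python
from wsgiref.util import setup_testing_defaults

def app(environ, start_response):
    # Minimal WSGI application standing in for a Pecan-based service.
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]

def call_in_process(wsgi_app, path="/"):
    # Invoke the WSGI app directly, the way WebTest does: no socket,
    # no real HTTP traffic, just a synthesized environ.
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(wsgi_app(environ, start_response))
    return captured["status"], body

status, body = call_in_process(app)  # status == "200 OK", body == b"hello"
```

The convenience is real, but nothing on the wire gets exercised, which is the objection above.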

-- 
Russell Bryant



Re: [openstack-dev] [Horizon] Nominations to Horizon Core

2013-12-11 Thread Russell Bryant
On 12/10/2013 05:57 PM, Paul McMillan wrote:
 +1 on Tatiana Mazur, she's been doing a bunch of good work lately.
 
 I'm fine with being removed from core, provided you have someone else 
 qualified to address security issues as they come up. My contributions have 
 lately been reviewing and responding to security issues, vetting fixes for 
 those, and making sure they happen in a timely fashion. Fortunately, we 
 haven't had too many of those lately. Other than that, I've been lurking and 
 reviewing to make sure nothing egregious gets committed.
 
 If you don't have anyone else who is a web security specialist on the core 
 team, I'd like to stay. Since I'm also a member of the Django security team, 
 I offer a significant chunk of knowledge about how the underlying security 
 protections are intended to work.

Security reviews aren't done on gerrit, though.  They are handled in
launchpad bugs.  It seems you could still contribute in this way without
being on the horizon-core team responsible for reviewing normal changes
in gerrit.

The bigger point is that you don't have to be on whatever-core to
contribute productively to reviews.  I think every project has people
that make important review contributions, but aren't necessarily
reviewing regularly enough to be whatever-core.

-- 
Russell Bryant



Re: [openstack-dev] [Horizon] Nominations to Horizon Core

2013-12-11 Thread Russell Bryant
On 12/11/2013 08:14 PM, Bryan D. Payne wrote:
 Re: Removing Paul McMillan from core
 
 I would argue that it is critical that each project have 1-2 people on
 core that are security experts.  The VMT is an intentionally small team.
  They are moving to having specifically appointed security sub-teams on
 each project (I believe this is what I heard at the last summit).  These
 teams would be a subset of the core devs that can handle security
 reviews.  They idea is that these people would then be able to +1 / -1
 embargoed security patches.  So having someone like Paul on Horizon core
 would be very valuable for such things.

We can involve people in security reviews without having them on the
core review team.  They are separate concerns.

 In addition, I think that gerrit is exactly where security reviews
 *should* be happening.  Much better to catch things before they are
 merged, rather than as bugs after-the-fact.  Would we rather have a -1
 on a code review than a CVE?

This has been discussed quite a bit.  We can't handle security patches
on gerrit right now while they are embargoed because we can't completely
hide them.

-- 
Russell Bryant



Re: [openstack-dev] [Horizon] Nominations to Horizon Core

2013-12-12 Thread Russell Bryant
On 12/11/2013 11:08 PM, Bryan D. Payne wrote:
 We can involve people in security reviews without having them on the
 core review team.  They are separate concerns.
 
 
 Yes, but those people can't ultimately approve the patch.  So you'd need
 to have a security reviewer do their review, and then someone who isn't
 a security person be able to offer the +1/+2 based on the opinion of the
 security reviewer.  This doesn't make any sense to me.  You're involving
 an extra person needlessly, and creating extra work.

I don't want someone not regularly looking at changes going into the
code able to do the ultimate approval of any patch.  I think this is
working as designed.  Including the extra person in this case is a good
thing.

 
  
 
 This has been discussed quite a bit.  We can't handle security patches
 on gerrit right now while they are embargoed because we can't completely
 hide them.
 
 
 I think that you're confusing security reviews of new code changes with
 reviews of fixes to security problems.  In this part of my email, I'm
 talking about the former.  These are not embargoed.  They are just the
 everyday improvements to the system.  That is the best time to identify
 and gate on security issues.  Without someone on core that can give a -2
 when there's a problem, this will basically never happen.  Then we'll be
 back to fixing a greater number of things as bugs.

Anyone can offer a -1, and that will be paid attention to.  If that ever
doesn't happen, let's talk about it.

-- 
Russell Bryant



Re: [openstack-dev] [Nova] All I want for Christmas is one more +2 ...

2013-12-12 Thread Russell Bryant
On 12/12/2013 09:22 AM, Day, Phil wrote:
 Hi Cores,
 
  
 
 The “Stop, Rescue, and Delete should give guest a chance to shutdown”
 change https://review.openstack.org/#/c/35303/ was approved a couple of
 days ago, but failed to merge because the RPC version had moved on.  
 It's rebased and sitting there with one +2 and a bunch of +1s, so it would
 be really nice if it could land before it needs another rebase, please?

Approved.

FWIW, I'm fine with folks approving with a single +2 for cases where a
patch is approved but needed a simple rebase.  This happens pretty
often.  We even have a script that generates a list of patches still
open that were previously approved:

http://russellbryant.net/openstack-stats/nova-openapproved.txt

-- 
Russell Bryant



Re: [openstack-dev] Generic question: Any tips for 'keeping up' with the mailing lists?

2013-12-12 Thread Russell Bryant
On 12/12/2013 11:23 AM, Justin Hammond wrote:
 I am a developer who is currently having troubles keeping up with the
 mailing list due to volume, and my inability to organize it in my client.
 I am nearly forced to use Outlook 2011 for Mac and I have read and
 attempted to implement
 https://wiki.openstack.org/wiki/MailingListEtiquette but it is still a lot
 to deal with. I once read a post or wiki page on using X-Topics, but I
 have no idea how to set that in Outlook (Google has told me that the
 feature was removed).
 
 I'm not sure if this is a valid place for this question, but I *am* having
 difficulty as a developer.
 
 Thank you for anyone who takes the time to read this.

The trick is defining what "keeping up" means for you.  I doubt anyone
reads everything.  I certainly don't.

First, I filter all of openstack-dev into its own folder.  I'm sure
others filter more aggressively based on topic, but I don't since I know
I may be interested in threads in any of the topics.  Figure out what
filtering works for you.

I scan subjects for the threads I'd probably be most interested in.
While I'm scanning, I'm first looking for topic tags, like [Nova], then
I read the subject and decide whether I want to dive in and read the
rest.  It happens very quickly, but that's roughly my thought process.

With whatever is left over: mark all as read.  :-)
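The subject-scanning step can even be partially automated; a small illustrative sketch (the bracketed tag convention is the list's real one, the code is just a toy):

```python
import re

# Topic tags look like [Nova], [Horizon], etc. at the start of subjects.
TOPIC_RE = re.compile(r"\[([^\]]+)\]")

def interesting(subject, topics):
    # True if the subject line carries one of the topic tags I follow.
    tags = {t.lower() for t in TOPIC_RE.findall(subject)}
    return bool(tags & {t.lower() for t in topics})

subjects = [
    "[openstack-dev] [Nova] Future meeting times",
    "[openstack-dev] [Horizon] Nominations to Horizon Core",
]
picked = [s for s in subjects if interesting(s, {"nova"})]
# picked keeps only the [Nova] thread
```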

-- 
Russell Bryant



Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Russell Bryant
On 12/12/2013 12:02 PM, Clint Byrum wrote:
 I've been chasing quite a few bugs in the TripleO automated bring-up
 lately that have to do with failures because either there are no valid
 hosts ready to have servers scheduled, or there are hosts listed and
 enabled, but they can't bind to the network because for whatever reason
 the L2 agent has not checked in with Neutron yet.
 
 This is only a problem in the first few minutes of a nova-compute host's
 life. But it is critical for scaling up rapidly, so it is important for
 me to understand how this is supposed to work.
 
 So I'm asking, is there a standard way to determine whether or not a
 nova-compute is definitely ready to have things scheduled on it? This
 can be via an API, or even by observing something on the nova-compute
 host itself. I just need a definitive signal that the compute host is
 ready.

If a nova compute host has registered itself to start having instances
scheduled to it, it *should* be ready.  AFAIK, we're not doing any
network sanity checks on startup, though.

We already do some sanity checks on startup.  For example, nova-compute
requires that it can talk to nova-conductor.  nova-compute will block on
startup until nova-conductor is responding if they happened to be
brought up at the same time.

We could do something like this with a networking sanity check if
someone could define what that check should look like.
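The conductor-style startup wait boils down to a block-until-responsive loop; a generic sketch of the pattern (not the actual nova-compute code), where a networking check would just be another `ping` callable:

```python
import time

def wait_for_service(ping, timeout=60, interval=2):
    # Block until ping() reports the dependency is up, the way
    # nova-compute blocks on nova-conductor at startup.  ping is any
    # callable returning True when the service answers (it may also
    # raise while the service is still coming up).
    deadline = time.monotonic() + timeout
    while True:
        try:
            if ping():
                return True
        except Exception:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```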

-- 
Russell Bryant



Re: [openstack-dev] [Nova] New official bugtag 'Ironic' ?

2013-12-12 Thread Russell Bryant
On 12/12/2013 03:03 PM, Robert Collins wrote:
 We have official tags for most of the hypervisors, but not ironic as
 yet - any objections to adding one?

Nope, go ahead.  For reference, to add it we need to:

1) Make it an official tag in launchpad

2) Update https://wiki.openstack.org/wiki/BugTags

3) Update https://wiki.openstack.org/wiki/Nova/BugTriage

-- 
Russell Bryant



Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Russell Bryant
On 12/12/2013 12:35 PM, Clint Byrum wrote:
 Excerpts from Chris Friesen's message of 2013-12-12 09:19:42 -0800:
 On 12/12/2013 11:02 AM, Clint Byrum wrote:

 So I'm asking, is there a standard way to determine whether or not a
 nova-compute is definitely ready to have things scheduled on it? This
 can be via an API, or even by observing something on the nova-compute
 host itself. I just need a definitive signal that the compute host is
 ready.

 Is it not sufficient that nova service-list shows the compute service 
 as up?

 
 I could spin waiting for at least one. Not a bad idea actually. However,
 I suspect that will only handle the situations I've gotten where the
 scheduler returns NoValidHost.

Right, it solves this case

 I say that because I think if it shows there, it matches the all hosts
 filter and will have things scheduled on it. With one compute host I
 get failures after scheduling because neutron has no network segment to
 bind to. That is because the L2 agent on the host has not yet registered
 itself with Neutron.

but not this one.

-- 
Russell Bryant



Re: [openstack-dev] [Nova] [Neutron] How do we know a host is ready to have servers scheduled onto it?

2013-12-12 Thread Russell Bryant
On 12/12/2013 01:36 PM, Clint Byrum wrote:
 Excerpts from Kyle Mestery's message of 2013-12-12 09:53:57 -0800:
 On Dec 12, 2013, at 11:44 AM, Jay Pipes jaypi...@gmail.com wrote:
 On 12/12/2013 12:36 PM, Clint Byrum wrote:
 Excerpts from Russell Bryant's message of 2013-12-12 09:09:04 -0800:
 On 12/12/2013 12:02 PM, Clint Byrum wrote:
 I've been chasing quite a few bugs in the TripleO automated bring-up
 lately that have to do with failures because either there are no valid
 hosts ready to have servers scheduled, or there are hosts listed and
 enabled, but they can't bind to the network because for whatever reason
 the L2 agent has not checked in with Neutron yet.

 This is only a problem in the first few minutes of a nova-compute host's
 life. But it is critical for scaling up rapidly, so it is important for
 me to understand how this is supposed to work.

 So I'm asking, is there a standard way to determine whether or not a
 nova-compute is definitely ready to have things scheduled on it? This
 can be via an API, or even by observing something on the nova-compute
 host itself. I just need a definitive signal that the compute host is
 ready.

 If a nova compute host has registered itself to start having instances
 scheduled to it, it *should* be ready.  AFAIK, we're not doing any
 network sanity checks on startup, though.

 We already do some sanity checks on startup.  For example, nova-compute
 requires that it can talk to nova-conductor.  nova-compute will block on
 startup until nova-conductor is responding if they happened to be
 brought up at the same time.

 We could do something like this with a networking sanity check if
 someone could define what that check should look like.

 Could we ask Neutron if our compute host has an L2 agent yet? That seems
 like a valid sanity check.

 ++

 This makes sense to me as well. Although, not all Neutron plugins have
 an L2 agent, so I think the check needs to be more generic than that.
 For example, the OpenDaylight MechanismDriver we have developed
 doesn't need an agent. I also believe the Nicira plugin is agent-less,
 perhaps there are others as well.

 And I should note, does this sort of integration also happen with cinder,
 for example, when we're dealing with storage? Any other services which
 have a requirement on startup around integration with nova as well?

 
 Does cinder actually have per-compute-host concerns? I admit to being a
 bit cinder-stupid here.

No, it doesn't.

 Anyway, it seems to me that any service that is compute-host aware
 should be able to respond to the compute host whether or not it is a)
 aware of it, and b) ready to serve on it.
 
 For agent-less drivers that is easy, you just always return True. And
 for drivers with agents, you return false unless you can find an agent
 for the host.
 
 So something like:
 
 GET /host/%(compute-host-name)
 
 And then in the response include a ready attribute that would signal
 whether all networks that should work there, can work there.
 
 As a first pass, just polling until that is ready before nova-compute
 enables itself would solve the problems I see (and that I think users
 would see as a cloud provider scales out compute nodes). Longer term
 we would also want to aim at having notifications available for this
 so that nova-compute could subscribe to that notification bus and then
 disable itself if its agent ever goes away.
 
 I opened this bug to track the issue. I suspect there are duplicates of
 it already reported, but would like to start clean to make sure it is
 analyzed fully and then we can use those other bugs as test cases and
 confirmation:
 
 https://bugs.launchpad.net/nova/+bug/1260440

Sounds good.  I'm happy to do this in Nova, but we'll have to get the
Neutron API bit sorted out first.
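The first-pass polling step described above might look roughly like this; note that `GET /host/%(compute-host-name)` does not exist yet, so `get_host` here stands in for whatever client call ends up being defined:

```python
import time

def wait_until_host_ready(get_host, hostname, timeout=120, interval=5):
    # Poll the proposed GET /host/%(compute-host-name) endpoint until
    # its 'ready' attribute is true, before letting nova-compute
    # enable itself.  Agent-less plugins would simply always report
    # ready, per the discussion above.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_host(hostname).get("ready"):
            return True
        time.sleep(interval)
    return False
```

Longer term, the same check could be driven by notifications instead of polling, as suggested above.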

-- 
Russell Bryant



Re: [openstack-dev] [bugs] definition of triaged

2013-12-13 Thread Russell Bryant
On 12/12/2013 04:46 PM, Robert Collins wrote:
 Hi, I'm trying to overhaul the bug triage process for nova (initially)
 to make it much lighter and more effective.
 
 I'll be sending a more comprehensive mail shortly

Before you do, let's agree on what we're trying to solve.  Perhaps you were
going to cover that in your later message, but it wouldn't hurt to
discuss it now.

I actually didn't think our process was that broken.  It's more that I
feel we need a person leading a small team that is working on it regularly.

The idea with the tagging approach was to break up the triage problem
into smaller work queues.  I haven't kept up with the tagging part and
would really like to hand that off.  Then some of the work queues aren't
getting triaged as regularly as they need to.  I'd like to see a small
team making this a high priority with some of their time each week.

With all of that said, if you think an overhaul of the process is
necessary to get to the end goal of a more well triaged bug queue, then
I'm happy to entertain it.

-- 
Russell Bryant



Re: [openstack-dev] [governance] Becoming a Program, before applying for incubation

2013-12-13 Thread Russell Bryant

-- 
Russell Bryant



Re: [openstack-dev] [Nova][Docker] Environment variables

2013-12-16 Thread Russell Bryant
On 12/16/2013 09:27 AM, Daniel Kuffner wrote:
 Hi All,
 
 I have submitted a new blueprint which addresses a common pattern in
 the Docker world: using environment variables to configure a container.
 
 docker run -e SQL_URL=postgres://user:password@/db my-app
 
 The nova docker driver doesn't support setting any environment
 variables. To work around this issue I used cloud-init, which works
 fine. But this approach of course has the drawbacks that a) I have to
 install the cloud-init service, and b) my docker container doesn't
 work outside of OpenStack.
 
 I propose to allow a user to set docker environment variables via nova
 instance metadata. The metadata key should have a prefix like ENV_,
 which can be used to identify all environment variables. The prefix
 is removed and the remaining key and value are injected.
 
 The metadata unfortunately cannot be set in Horizon, but it can be used
 from the nova command line tool and from heat. Example heat template:
 
 myapp:
 Type: OS::Nova::Server
 Properties:
   flavor: m1.small
   image: my-app:latest
   meta-data:
 - ENV_SQL_URL: postgres://user:password@/db
 - ENV_SOMETHING_ELSE: Value
 
 
 Let me know what you think about that.
 
 Blueprint: 
 https://blueprints.launchpad.net/nova/+spec/docker-env-via-meta-data

Thanks for starting the discussion.  More people should do this for
their blueprints.  :-)

One of the things we should be striving for is to provide as consistent
of an experience as we can across drivers.  Right now, we have the
metadata service and config drive, and neither of those are driver
specific.  In the case of config drive, whether it's used or not is
exposed through the API.  As you point out, the meta-data service does
technically work with the docker driver.

I don't think we should support environment variables like this
automatically.  Instead, I think it would be more appropriate to add an
API extension for specifying env vars.  That way the behavior is more
explicit and communicated through the API.  The env vars would be passed
through all of the appropriate plumbing and down to drivers that are
able to support it.
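Whichever plumbing ends up carrying them, the driver-side extraction under the proposed ENV_ convention is simple; a sketch (the prefix convention comes from the blueprint proposal above, not from any existing Nova behavior):

```python
PREFIX = "ENV_"

def env_from_metadata(metadata):
    # Pull out every ENV_-prefixed key, strip the prefix, and return
    # the result as the container's environment dict.
    return {key[len(PREFIX):]: value
            for key, value in metadata.items()
            if key.startswith(PREFIX)}

meta = {
    "ENV_SQL_URL": "postgres://user:password@/db",
    "ENV_SOMETHING_ELSE": "Value",
    "image_type": "ignored",  # not an env var, so excluded
}
env = env_from_metadata(meta)
# env == {'SQL_URL': 'postgres://user:password@/db', 'SOMETHING_ELSE': 'Value'}
```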

This is all also assuming that containers support is staying in Nova and
not a new service.  That discussion seems to have stalled.  Is anyone
still pushing on that?  Any updates?

-- 
Russell Bryant



Re: [openstack-dev] [nova] time for a new major network rpc api version?

2013-12-16 Thread Russell Bryant
On 12/15/2013 05:12 PM, Robert Collins wrote:
 That said, doing anything to the network RPC API seems premature until
 the Neutron question is resolved.

This.

I've been pretty much ignoring this API since it has been frozen and
almost deprecated for a long time.  My plan was to revisit the status
of nova-network after the release of icehouse-2.

-- 
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Docker] Environment variables

2013-12-16 Thread Russell Bryant
On 12/16/2013 10:12 AM, Chuck Short wrote:
 I have something that is pushing for it to stay in nova (at least the
 compute drivers). I should have a gerrit branch for people to review soon.

OK.  Do you have any design notes for whatever you're proposing?  That
would probably be easier to review and discuss.

-- 
Russell Bryant



Re: [openstack-dev] [Nova][Docker] Environment variables

2013-12-16 Thread Russell Bryant
On 12/16/2013 10:18 AM, Daniel Kuffner wrote:
 Hi Russell,
 
 You actually propose to extend the whole nova stack to support
 environment variables. Would any other driver benefit from this API
 extension?
 
 Is that what you imagine?
 nova --env SQL_URL=postgres://user:password --image 

Yes.

 Regarding the discussion you mentioned: are there any public resources
 to read? I kind of missed it. Most likely it was before I was part of
 this community :)

It started here back in November:


http://lists.openstack.org/pipermail/openstack-dev/2013-November/019637.html

and then there have been a few messages on that thread this month, too.

-- 
Russell Bryant



Re: [openstack-dev] [Nova][Docker] Environment variables

2013-12-16 Thread Russell Bryant
On 12/16/2013 10:39 AM, Daniel P. Berrange wrote:
 On Mon, Dec 16, 2013 at 04:18:52PM +0100, Daniel Kuffner wrote:
 Hi Russell,

 You actually propose to extend the whole nova stack to support
 environment variables. Would any other driver benefit from this API
 extension?

 Is that what you imagine?
 nova --env SQL_URL=postgres://user:password --image 

  Regarding the discussion you mentioned: are there any public resources
  to read? I kind of missed it. Most likely it was before I was part of
  this community :)
 
 With glance images we have a way to associate arbitrary metadata
 attributes with the image. I could see using this mechanism to
 associate some default set of environment variables.
 
 eg use a 'env_' prefix for glance image attributes
 
 We've got a couple of cases now where we want to override these
 same things on a per-instance basis. Kernel command line args
 are one example. Other hardware overrides like disk/net device
 types are another possibility.
 
 Rather than invent new extensions for each, I think we should
 have a way to pass arbitrary attributes along with the boot
 API call, which a driver would handle in much the same way as
 they do glance image properties. Basically, think of it as
 a way to customize any image property per instance created.

That's a pretty nice idea.  I like it.
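The idea amounts to layering per-instance attributes over the image's defaults; a rough sketch (the function, the `env_` prefix convention, and the argument names are the thread's proposal, not an implemented API):

```python
def effective_attrs(image_properties, boot_overrides, prefix="env_"):
    # Start from the image's env_-prefixed glance property defaults,
    # then let per-instance attributes passed with the boot API call
    # win on any conflicts.
    merged = {k: v for k, v in image_properties.items()
              if k.startswith(prefix)}
    merged.update((k, v) for k, v in boot_overrides.items()
                  if k.startswith(prefix))
    return merged
```

The same merge would generalize to kernel command line args or device-type overrides by choosing other prefixes.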

-- 
Russell Bryant



Re: [openstack-dev] [nova][db] Thoughts on making instances.uuid non-nullable?

2013-12-16 Thread Russell Bryant
On 12/16/2013 11:45 AM, Matt Riedemann wrote:
 1. Add a migration to change instances.uuid to non-nullable. Besides the
 obvious con of having yet another migration script, this seems the most
 straight-forward. The instance object class already defines the uuid
 field as non-nullable, so it's constrained at the objects layer, just
 not in the DB model.  Plus I don't think we'd ever have a case where
 instance.uuid is null, right?  Seems like a lot of things would break
 down if that happened.  With this option I can build on top of it for
 the DB2 migration support to add the same FKs as the other engines.

Yeah, having instance.uuid nullable doesn't seem valuable to me, so this
seems OK.
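For illustration, option 1 amounts to tightening the column constraint; here is a sqlite3 sketch of the equivalent (SQLite has to rebuild the table, whereas MySQL/PostgreSQL would use a single ALTER TABLE; this is not the actual Nova migration script):

```python
import sqlite3

def make_uuid_non_nullable(conn):
    # SQLite cannot alter a column's nullability in place, so rebuild
    # the table.  Assumes no NULL uuids remain -- a real migration
    # would verify that (or backfill) first, even though the objects
    # layer already enforces non-null uuids.
    conn.executescript("""
        CREATE TABLE instances_new (
            id INTEGER PRIMARY KEY,
            uuid VARCHAR(36) NOT NULL
        );
        INSERT INTO instances_new (id, uuid) SELECT id, uuid FROM instances;
        DROP TABLE instances;
        ALTER TABLE instances_new RENAME TO instances;
    """)
```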

-- 
Russell Bryant



[openstack-dev] [Nova] Future meeting times

2013-12-18 Thread Russell Bryant
Greetings,

The weekly Nova meeting [1] has been held on Thursdays at 2100 UTC.
I've been getting some requests to offer an alternative meeting time.
I'd like to try out alternating the meeting time between two different
times to allow more people in our global development team to attend
meetings and engage in some real-time discussion.

I propose the alternate meeting time as 1400 UTC.  I realize that
doesn't help *everyone*, but it should be an improvement for some,
especially for those in Europe.

If we proceed with this, we would meet at 2100 UTC on January 2nd, 1400
UTC on January 9th, and alternate from there.  Note that we will not be
meeting at all on December 26th as a break for the holidays.

If you can't attend either of these times, please note that the meetings
are intended to be supplementary to the openstack-dev mailing list.  In
the meetings, we check in on status, raise awareness of important
issues, and progress some discussions with real-time debate, but the
most important discussions and decisions will always be brought to the
openstack-dev mailing list, as well.  With that said, active Nova
contributors are always encouraged to attend and participate if they are
able.

Comments welcome, especially some acknowledgement that there are people
that would attend the alternate meeting time.  :-)

Thanks,

[1] https://wiki.openstack.org/wiki/Meetings/Nova

-- 
Russell Bryant



Re: [openstack-dev] [nova] How do we format/version/deprecate things from notifications?

2013-12-18 Thread Russell Bryant
On 12/18/2013 12:44 PM, Nikola Đipanov wrote:
 On 12/18/2013 06:17 PM, Matt Riedemann wrote:


 On 12/18/2013 9:42 AM, Matt Riedemann wrote:
 The question came up in this patch [1], how do we deprecate and remove
 keys in the notification payload?  In this case I need to deprecate and
 replace the 'instance_type' key with 'flavor' per the associated
 blueprint.

 [1] https://review.openstack.org/#/c/62430/


 By the way, my thinking is it's handled like a deprecated config option,
 you deprecate it for a release, make sure it's documented in the release
 notes, and then drop it in the next release. Anyone that hasn't
 switched over is broken until they start consuming the new key.

 
 FWIW - I am OK with this approach - but we should at least document it.
 I am also thinking that we may want to make it explicit, like oslo.config
 does.

We really need proper versioning for notifications.  We've had a
blueprint open for about a year, but AFAICT, nobody is actively working
on it.

https://blueprints.launchpad.net/nova/+spec/versioned-notifications
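Until proper versioning lands, the config-option-style deprecation above means emitting both keys for one cycle; an illustrative sketch (not Nova's real notification code, and the payload shape here is simplified):

```python
def usage_payload(instance):
    # Emit both the old and the new key for one release so notification
    # consumers can migrate before 'instance_type' is dropped.
    flavor_name = instance["flavor"]["name"]
    return {
        "flavor": flavor_name,         # new key
        "instance_type": flavor_name,  # deprecated alias, removed next cycle
    }
```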

-- 
Russell Bryant



Re: [openstack-dev] Process for proposing patches attached to launchpad bugs?

2013-12-20 Thread Russell Bryant
On 12/20/2013 09:32 AM, Dolph Mathews wrote:
 In the past, I've been able to get authors of bug fixes attached to
 Launchpad bugs to sign the CLA and submit the patch through gerrit...
 although, in one case it took quite a bit of time (and thankfully it
 wasn't a critical fix or anything).
 
 This scenario just came up again (example: [1]), so I'm asking
 preemptively... what if the author is unwilling or unable to sign the
 CLA and propose through gerrit, or it's a critical bug fix and waiting
 on an author to go through the CLA process is undesirable for the
 community? Obviously that's a bit of a fail on our part, but what's the
 most appropriate and expedient way to handle it?
 
 Can we propose the patch to gerrit ourselves?
 
 If so, who should appear as the --author of the commit? Who should
 appear as Co-Authored-By, especially when the committer helps evolve
 the patch further in review?
 
 Alternatively, am I going about this all wrong?
 
 Thanks!
 
 [1]: https://bugs.launchpad.net/keystone/+bug/1198171/comments/8

It's not your code, so you really can't propose it without them having
signed the CLA, or propose it as your own.

Ideally, have someone else who hasn't looked at the patch fix the same bug.

From a quick look, it seems likely that this fix is small and
straightforward enough that a clean new implementation is going to end up
looking very similar.  Still, I think it's the right thing to do.

-- 
Russell Bryant



[openstack-dev] [Nova] Live upgrades and major rpc versions

2013-12-20 Thread Russell Bryant
Greetings,

Bumping the major rpc versions allows us to drop old backwards
compatibility code.  However, we have to do this in such a way that
doesn't break live upgrades.  We've expected live upgrades for CD to
work for a while, and we're also expecting to be able to support it from
Havana to Icehouse.

The approach for bumping major rpc versions in the past has been like this:

Step 1) https://review.openstack.org/#/c/53944/

Step 2) https://review.openstack.org/#/c/54493/

The approach outlined in the commit message for step 1 discusses how
this approach works with live upgrades in a CD environment.  However,
making changes like this in the middle of a release cycle breaks the
live upgrade from the N-1 to N release.

(Yes, these changes broke Havana-Icehouse live upgrades, but that has
since been resolved with some other patches.  This discussion is how we
avoid breaking it in the future.)

To support N-1 to N live upgrades, I propose that we use the same change
structure, but split it over a release boundary.  A practical example
for the conductor service:

Step 1) https://review.openstack.org/#/c/52218/

This patch adds a new revision of the conductor rpc API, 2.0.  I say we
merge a change like this just before the Icehouse release.  The way it's
written is very low risk to the release since it leaves most important
existing code (1.X) untouched.

Step 2) https://review.openstack.org/#/c/52219/

Once master is open for J development, merge a patch like this one as
step 2.  At this point, we would drop all support for 1.X.  It's no
longer needed because in J we're only trying to support upgrades from
Icehouse, and Icehouse supported 2.0.

Using this approach I think we can support live upgrades from N-1 to N
while still being able to drop some backwards compatibility code each
release cycle.
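The client-side shape of this pattern can be sketched as follows (names are illustrative, not Nova's real conductor interface; `client` is assumed to expose can_send_version() and call(), roughly like an oslo.messaging RPC client):

```python
class ConductorAPI(object):
    """Client-side compatibility shim for a major RPC version bump."""

    def __init__(self, client):
        self.client = client

    def do_thing(self, ctxt, thing):
        if self.client.can_send_version('2.0'):
            # Normal path once both sides run the N release.
            return self.client.call(ctxt, 'do_thing', version='2.0',
                                    thing=thing)
        # 1.x compatibility path for mixed N-1/N deployments; this
        # branch is what gets deleted once the next cycle opens.
        return self.client.call(ctxt, 'do_thing_legacy', version='1.63',
                                thing=thing)
```

A deployment whose servers still run the N-1 release simply keeps taking the legacy branch until the version cap is lifted.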

Once we get the details worked out, I'd like to capture the process on
the release checklist wiki page for Nova.

https://wiki.openstack.org/wiki/Nova/ReleaseChecklist

Thoughts?

-- 
Russell Bryant



[openstack-dev] [Nova] No meeting this week

2013-12-23 Thread Russell Bryant
No Nova meeting this week.  We will resume on Thursday, January 2, at
21:00 UTC.

https://wiki.openstack.org/wiki/Meetings/Nova

-- 
Russell Bryant



Re: [openstack-dev] [Horizon] import only module message and #noqa

2014-01-03 Thread Russell Bryant
On 01/03/2014 10:10 AM, Radomir Dopieralski wrote:
 I think that we can actually do a little bit better and remove many of
 the #noqa tags without forfeiting automatic checking. I submitted a
 patch: https://review.openstack.org/#/c/64832/
 
 This basically adds a h302_exceptions option to tox.ini, that lets us
 specify which names are allowed to be imported. For example, we can do:
 
 [hacking]
 h302_exceptions = django.conf.settings,
   django.utils.translation.ugettext_lazy,
   django.core.urlresolvers.
 
 To have settings, _ and everything from urlresolvers importable without
 the need for the #noqa tag.
 
 Of course every project can add their own names there, depending what
 they need.

Isn't that what import_exceptions is for?  For example, we have this
in nova:

import_exceptions = nova.openstack.common.gettextutils._

-- 
Russell Bryant



Re: [openstack-dev] [Horizon] import only module message and #noqa

2014-01-03 Thread Russell Bryant
On 01/03/2014 10:35 AM, Radomir Dopieralski wrote:
 On 03/01/14 16:18, Russell Bryant wrote:
 On 01/03/2014 10:10 AM, Radomir Dopieralski wrote:
 I think that we can actually do a little bit better and remove many of
 the #noqa tags without forfeiting automatic checking. I submitted a
 patch: https://review.openstack.org/#/c/64832/

 This basically adds a h302_exceptions option to tox.ini, that lets us
 specify which names are allowed to be imported. For example, we can do:

 [hacking]
 h302_exceptions = django.conf.settings,
   django.utils.translation.ugettext_lazy,
   django.core.urlresolvers.

 To have settings, _ and everything from urlresolvers importable without
 the need for the #noqa tag.

 Of course every project can add their own names there, depending what
 they need.

 Isn't that what import_exceptions is for?  For example, we have this
 in nova:

 import_exceptions = nova.openstack.common.gettextutils._

 Not exactly, as this will disable all import checks, just like # noqa.
 

Ah, makes sense.  Thanks.
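For readers following along, here is a toy model of the h302_exceptions
semantics proposed in this thread: an imported name is allowed only if
it exactly matches a listed entry, or falls under an entry ending with
'.' (meaning everything from that module).  This illustrates the
proposed option, not the actual hacking-check implementation:

```python
# Toy model of the proposed h302_exceptions whitelist check.

def import_allowed(name, exceptions):
    """Return True if importing `name` would not need a #noqa tag."""
    for exc in exceptions:
        if exc.endswith('.'):
            # Trailing dot: whitelist everything under this module.
            if name.startswith(exc):
                return True
        elif name == exc:
            # Otherwise the imported name must match exactly.
            return True
    return False
```

By contrast, import_exceptions suppresses every import-style check for
the listed name, which is why it behaves like a per-name # noqa.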

-- 
Russell Bryant



Re: [openstack-dev] [Gantt] Looking for some answers...

2014-01-06 Thread Russell Bryant
On 01/06/2014 02:30 PM, Vishvananda Ishaya wrote:
 
 On Jan 6, 2014, at 11:02 AM, Jay Pipes jaypi...@gmail.com wrote:
 
 Hello Stackers,
 
 I was hoping to get some answers on a few questions I had
 regarding the Gantt project [1]. Specifically, here are my
 queries:
 
 1) Why was Nova forked to the http://github.com/openstack/gantt 
 repository? Forking Nova just to then remove a bunch of code
 that doesn't relate to the scheduler code means that we bring
 10K+ commits and a git history along with the new project... this
 seems to be the wrong origin for a project the aims to be a
 separate service. There's a reason that Cinder and Neutron didn't
 start out as a fork of Nova, after all…
 
 Authorship history is nice, but this does seem a bit excessive. The
 cinder strategy of a single squashed fork would have been/still be
 fine I’m sure.

That's not exactly what was done here.

It's a new repo created with the history filtered out.  The history
was only maintained for the code that was kept.  That seems pretty ideal
to me.

 
 2) Why is Gantt in the /openstack GitHub organization? Wouldn't 
 the /stackforge organization be more appropriate for a project
 that isn't integrated? If I understand some of the backstory
 behind Gantt, the idea was to create a scheduler service from the
 existing Nova scheduler code in order to complete the work
 sometime in our lifetime. While I understand the drive to start
 with something that already exists and iterate over it, I don't
 understand why the project went right into the /openstack
 organization instead of following the /stackforge processes for
 housing code that bakes and gets iterated on before proposing for
 incubation. Some explanation would be great here.
 
 This is split-out of existing code so it is following the same path
 as cinder. The goal is to deprecate the existing nova scheduler in
 I. It is currently a new project under the nova program, I believe.

Correct (compute program, technically).  It's just a mechanical thing,
not new code.  Also, it's not an incubated or integrated project yet.
 It's just an official repo under the compute program.

 
 
 3) Where is feature planning happening for Gantt? The Launchpad
 site for Gantt [2] is empty. Furthermore, there are a number of
 blueprints for improving the Nova scheduler, notably the
 no-db-scheduler blueprint [3], which even has code submitted for
 it and is targeted to Icehouse-2. How are improvements like this
 planned to be ported (if at all) to Gantt?
 
 Not sure about the launchpad site. There is a regular scheduler
 group meeting and as I understand it the hope will be to do the
 no-db-scheduler blueprint. There was quite a bit of debate on
 whether to do the no-db-scheduler stuff before or after the
 forklift and I think the consensus was to do the forklift first.

The planning is just being done in nova blueprints right now.  Once
gantt has enough momentum, we can start using a separate launchpad
project.  But we haven't even finished step 1 of making the thing run yet.

 
 4) Is the aim of Gantt to provide a RESTful HTTP API in addition
 to the RPC-based API that the existing Nova scheduler exposes?
 
 In the short term the plan is to just replicate the rpc api, but I
 think a REST api will be considered long term.

Yep.

-- 
Russell Bryant



Re: [openstack-dev] [Gantt] Looking for some answers...

2014-01-06 Thread Russell Bryant
On 01/06/2014 02:52 PM, Boris Pavlovic wrote:
 Vish,
 
 and as I understand it the hope will be to do the no-db-scheduler blueprint.
 There was quite a bit of debate on whether to do the no-db-scheduler stuff
 before or after the forklift and I think the consensus was to do the
 forklift
 first.
 
 The current Nova scheduler is so deeply bound to Nova data models that
 it is useless for any other project.
 
 So I don't think that forking the Nova scheduler in such a state is
 useful for any other project.

It should be pretty easy to do this in gantt though.  Right now I would
probably do it against the current scheduler and then we'll port it
over.  I don't think we should do major work only in gantt until we're
ready to deprecate the current scheduler.

-- 
Russell Bryant



Re: [openstack-dev] [Gantt] New scheduler tree not up

2014-01-06 Thread Russell Bryant
On 01/06/2014 03:55 PM, Dugger, Donald D wrote:
 This is to let everyone know that we have created a new source tree,
 'https://github.com/openstack/gantt.git', which contains the code for the
 Nova scheduler.  The ultimate goal is to create a separate scheduler
 service that can be utilized by any part of OpenStack that needs a
 scheduling capability but we’re going to start out just by moving the
 current scheduler into a separate tree.
 
  
 
 Note that right now the new tree is not a new scheduler, it is just the
 current Nova scheduler code that has been moved to a new tree.  Any
 changes we want to make to the Nova scheduler should still happen inside
 the Nova tree.
 
  
 
 There are a few tasks that need to be completed before the new tree can
 be utilized by Nova:
 
  
 
 1)  Get the tests working in the Gantt tree.  Currently, we’ve made
 the tests non-voting as none of the tests pass yet; fixing this is
 clearly a critical task.
 
 2)  Do the plumbing such that Nova makes its scheduler calls into
 the new Gantt tree.  Given that the Gantt tree is a duplicate of the
 nova code this should be a fairly safe change but there is clearly work
 that needs to be done here.
 
 3)  Get the documentation working in the Gantt tree, that’s
 currently broken.
 
 4)  Start working on creating Gantt as a separately running, callable
 service (expect some lively sessions at the next summit; new RESTful
 APIs are probably needed at a minimum).
 
  
 
 Note that until we’ve completed task 2 there will be 2 separate
 scheduler source trees, the one in Nova and the one in Gantt.  Scheduler
 development should not change for now, all scheduler changes should be
 applied to the Nova tree.  I’ll be monitoring the Nova tree and will
 port over any scheduler changes into the Gantt tree.  (Hopefully we will
 complete task 2 before I go crazy doing that.)

Further, I think all gantt feature development should be explicitly
*not* allowed until the nova scheduler is deprecated and ready to be
replaced by gantt.  Other efforts (a REST API, supporting other
services) will have to wait.  This is to make the transition as quick as
possible.

-- 
Russell Bryant



Re: [openstack-dev] [OpenStack][Nova][cold migration] Why we need confirm resize after cold migration

2014-01-08 Thread Russell Bryant
On 01/08/2014 04:52 AM, Jay Lau wrote:
 Greetings,
 
 I have a question related to cold migration.
 
 Now in OpenStack nova, we support live migration, cold migration and resize.
 
 For live migration, we do not need to confirm after live migration finished.
 
 For resize, we need to confirm, as we want to give end user an
 opportunity to rollback.
 
 The problem is cold migration: because cold migration and resize share
 the same code path, once I submit a cold migration request and the
 cold migration finishes, the VM goes to the verify_resize state, and I
 need to confirm the resize. I felt a bit confused by this: why do I need
 to verify a resize for a cold migration operation? Why not reset the VM
 to its original state directly after cold migration?

The confirm step definitely makes more sense for the resize case.  I'm
not sure if there was a strong reason why it was also needed for cold
migration.

If nobody comes up with a good reason to keep it, I'm fine with removing
it.  It can't be changed in the v2 API, though.  This would be a v3 only
change.

 Also, I think that probably we need split compute.api.resize() to two
 apis: one is for resize and the other is for cold migrations.
 
 1) The VM state can be either ACTIVE and STOPPED for a resize operation
 2) The VM state must be STOPPED for a cold migrate operation.

I'm not sure why we would require different states here, though.  ACTIVE
and STOPPED are both allowed now.
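For context, here is a hedged sketch of the shared entry point being
discussed: flavor_id=None is treated as a cold migration (as in
nova/compute/api.py), and both operations currently end in the
VERIFY_RESIZE state, which is why a cold migration also needs an
explicit confirm today.  The function signature and dict shape are
illustrative, not the real Nova code:

```python
# Illustrative model of the shared resize/cold-migrate code path.

ACTIVE, STOPPED, VERIFY_RESIZE = 'ACTIVE', 'STOPPED', 'VERIFY_RESIZE'


def resize(instance, flavor_id=None):
    """Resize when flavor_id is given; cold-migrate when it is None."""
    if instance['vm_state'] not in (ACTIVE, STOPPED):
        raise ValueError('instance must be ACTIVE or STOPPED')
    instance['pending_op'] = 'migration' if flavor_id is None else 'resize'
    # Shared tail: both paths land in VERIFY_RESIZE awaiting confirm.
    instance['vm_state'] = VERIFY_RESIZE
    return instance
```

Splitting cold migration out of this shared tail (or auto-confirming
it) is exactly the change being debated in this thread.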

-- 
Russell Bryant



Re: [openstack-dev] [OpenStack][Nova][cold migration] Why we need confirm resize after cold migration

2014-01-08 Thread Russell Bryant
On 01/08/2014 09:53 AM, John Garbutt wrote:
 On 8 January 2014 10:02, David Xie david.script...@gmail.com wrote:
 In nova/compute/api.py#2289, function resize, there's a parameter named
 flavor_id; if it is None, the request is treated as a cold migration. Thus, nova
 should skip resize verification. However, it doesn't.

 Like Jay said, we should skip this step during cold migration, does it make
 sense?
 
 Not sure.
 
 On Wed, Jan 8, 2014 at 5:52 PM, Jay Lau jay.lau@gmail.com wrote:

 Greetings,

 I have a question related to cold migration.

 Now in OpenStack nova, we support live migration, cold migration and
 resize.

 For live migration, we do not need to confirm after live migration
 finished.

 For resize, we need to confirm, as we want to give end user an opportunity
 to rollback.

 The problem is cold migration: because cold migration and resize share
 the same code path, once I submit a cold migration request and the cold
 migration finishes, the VM goes to the verify_resize state, and I need to
 confirm the resize. I felt a bit confused by this: why do I need to verify
 a resize for a cold migration operation? Why not reset the VM to its
 original state directly after cold migration?
 
 I think the idea was to allow users/admins to check everything went OK,
 and only delete the original VM once they have confirmed the move went
 OK.
 
 I thought there was an auto_confirm setting. Maybe you want
 auto_confirm cold migrate, but not auto_confirm resize?

I suppose we could add an API parameter to auto-confirm these things.
That's probably a good compromise.

 Also, I think that probably we need split compute.api.resize() to two
 apis: one is for resize and the other is for cold migrations.

 1) The VM state can be either ACTIVE and STOPPED for a resize operation
 2) The VM state must be STOPPED for a cold migrate operation.
 
 We just stop the VM, then perform the migration.
 I don't think we need to require that it's stopped first.
 Am I missing something?

Don't think so ... I think we should leave it as is.

-- 
Russell Bryant



Re: [openstack-dev] [all] Organizing a Gate Blocking Bug Fix Day

2014-01-09 Thread Russell Bryant
On 01/09/2014 07:46 AM, Sean Dague wrote:
 I think we are all agreed that the current state of Gate Resets isn't
 good. Unfortunately some basic functionality is really not working
 reliably, like being able to boot a guest to a point where you can ssh
 into it.
 
 These are common bugs, but they aren't easy ones. We've had a few folks
 digging deep on these, but we, as a community, are not keeping up with
 them.
 
 So I'd like to propose Gate Blocking Bug Fix day, to be Monday Jan 20th.
 On that day I'd ask all core reviewers (and anyone else) on all projects
 to set aside that day to *only* work on gate blocking bugs. We'd like to
 quiet the queues to not include any other changes that day so that only
 fixes related to gate blocking bugs would be in the system.
 
 This will have multiple goals:
  #1 - fix some of the top issues
  #2 - ensure we classify (ER fingerprint) and register everything we're
 seeing in the gate fails
  #3 - ensure all gate bugs are triaged appropriately
 
 I'm hopeful that if we can get everyone looking at this on a single
 day, we can start to dislodge the log jam that exists.
 
 Specifically I'd like to get commitments from as many PTLs as possible
 that they'll both directly participate in the day, as well as encourage
 the rest of their project to do the same.

I'm in!

-- 
Russell Bryant



Re: [openstack-dev] [nova][neutron] top gate bugs: a plea for help

2014-01-09 Thread Russell Bryant
On 01/08/2014 05:53 PM, Joe Gordon wrote:
 Hi All, 
 
 As you know the gate has been in particularly bad shape (gate queue over
 100!) this week due to a number of factors. One factor is how many major
 outstanding bugs we have in the gate.  Below is a list of the top 4 open
 gate bugs.
 
 Here are some fun facts about this list:
 * All bugs have been open for over a month
 * All are nova bugs
 * These 4 bugs alone were hit 588 times which averages to 42 hits per
 day (data is over two weeks)!
 
 If we want the gate queue to drop and not have to continuously run
 'recheck bug x' we need to fix these bugs.  So I'm looking for
 volunteers to help debug and fix these bugs.

I created the following etherpad to help track the most important Nova
gate bugs: who is actively working on them, and any patches we have
in flight to help address them:

  https://etherpad.openstack.org/p/nova-gate-issue-tracking

Please jump in if you can.  We shouldn't wait for the gate bug day to
move on these.  Even if others are already looking at a bug, feel free
to do the same.  We need multiple sets of eyes on each of these issues.

-- 
Russell Bryant


