[openstack-dev] [qa] Do I need a spec for testing the compute os-networks API?

2014-08-08 Thread Matt Riedemann
This came up while reviewing the fix for bug 1327406 [1].  Basically the 
os-networks API behaves differently depending on your backing network 
manager in nova-network.


We run Tempest in the gate with the FlatDHCPManager, which has the bug; 
if you try to list networks as a non-admin user it won't return anything 
because you can't assign those networks to a tenant.  With VlanManager 
you do assign a tenant, so list-networks works.


I don't see any os-networks API testing in Tempest today and I'm looking 
to add something, at least for listing networks to show that this bug 
exists (plus get coverage).  The question is do I need a qa-spec to do 
this?  When I wrote the tests for os-quota-classes it was for a bug fix 
since we regressed when we thought the API was broken and unused and it 
was erroneously removed in Icehouse.  I figured I'd treat this the same 
way, but it's going to require changes to the servers client to call the 
os-networks API, plus a new test module.


As far as the test design, we'd skip if using neutron since this is a 
nova-network only test. As far as how to figure out the proper 
assertions given we don't know what the backing network manager is and 
the API is inconsistent in that regard, I might have some other hurdles 
there but would at least like to get a POC going.
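
To make that concrete, the kind of test module I have in mind is roughly 
this (a rough sketch only; the base class, the skip plumbing and the 
networks client method are assumptions, since the client support doesn't 
exist in Tempest yet):

    # Sketch of a nova-network-only compute API test; the class layout
    # and self.networks_client are illustrative assumptions, not real
    # Tempest code.
    from tempest.api.compute import base
    from tempest import config
    from tempest import test

    CONF = config.CONF


    class ComputeNetworksTest(base.BaseV2ComputeTest):
        """List networks through the compute os-networks API."""

        @classmethod
        def skip_checks(cls):
            super(ComputeNetworksTest, cls).skip_checks()
            if CONF.service_available.neutron:
                # os-networks only makes sense with nova-network.
                raise cls.skipException("nova-network is not available")

        @test.attr(type='gate')
        def test_list_networks_non_admin(self):
            # With FlatDHCPManager the list comes back empty for a
            # non-admin tenant (bug 1327406); with VlanManager the
            # tenant-assigned networks show up.
            networks = self.networks_client.list_networks()
            self.assertIsInstance(networks, list)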


I guess I can do the POC before the question of blueprints/specs needs 
to be answered...


[1] https://launchpad.net/bugs/1327406

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] Do I need a spec for testing the compute os-networks API?

2014-08-08 Thread Matt Riedemann



On 8/8/2014 9:50 AM, Andrea Frittoli wrote:

Thanks Matt for bringing this up.

There is a tiny start in flight here [0] - if you plan to work on
providing full testing coverage for the n-net api you may want to create
a spec with a link to an etherpad to help track / split the work.

andrea

[0] https://review.openstack.org/#/c/107552/21



On 8 August 2014 15:42, Matt Riedemann mrie...@linux.vnet.ibm.com wrote:

This came up while reviewing the fix for bug 1327406 [1].  Basically
the os-networks API behaves differently depending on your backing
network manager in nova-network.

We run Tempest in the gate with the FlatDHCPManager, which has the
bug; if you try to list networks as a non-admin user it won't return
anything because you can't assign those networks to a tenant.  With
VlanManager you do assign a tenant, so list-networks works.

I don't see any os-networks API testing in Tempest today and I'm
looking to add something, at least for listing networks to show that
this bug exists (plus get coverage).  The question is do I need a
qa-spec to do this?  When I wrote the tests for os-quota-classes it
was for a bug fix since we regressed when we thought the API was
broken and unused and it was erroneously removed in Icehouse.  I
figured I'd treat this the same way, but it's going to require
changes to the servers client to call the os-networks API, plus a
new test module.

As far as the test design, we'd skip if using neutron since this is
a nova-network only test. As far as how to figure out the proper
assertions given we don't know what the backing network manager is
and the API is inconsistent in that regard, I might have some other
hurdles there but would at least like to get a POC going.

I guess I can do the POC before the question of blueprints/specs
needs to be answered...

[1] https://launchpad.net/bugs/1327406

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Andrea,

Thanks, the client stuff was what I needed right now since that was the 
bulk of the code I needed for this simple POC to show the bug:


https://review.openstack.org/#/c/112944/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Retrospective veto revert policy

2014-08-12 Thread Matt Riedemann



On 8/12/2014 4:03 PM, Michael Still wrote:

This looks reasonable to me, with a slight concern that I don't know
what step five looks like... What if we can never reach a consensus on
an issue?

Michael

On Wed, Aug 13, 2014 at 12:56 AM, Mark McLoughlin mar...@redhat.com wrote:

Hey

(Terrible name for a policy, I know)

 From the version_cap saga here:

   https://review.openstack.org/110754

I think we need a better understanding of how to approach situations
like this.

Here's my attempt at documenting what I think we're expecting the
procedure to be:

   https://etherpad.openstack.org/p/nova-retrospective-veto-revert-policy

If it sounds reasonably sane, I can propose its addition to the
Development policies doc.

Mark.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






Just thinking out loud, you could do something like a 2/3 majority vote 
on the issue but that sounds too much like government, which is 
generally terrible.


Otherwise maybe the PTL is the tie-breaker?

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][core] Expectations of core reviewers

2014-08-14 Thread Matt Riedemann
 there is time to really 
hash over a topic and you're not rushed, whereas with the design summit 
sessions it felt like we'd be halfway through the allotted time before 
we really started talking about anything of use, and then shortly after 
that you'd be hearing '5 minutes left'.  I felt like very few of the 
design sessions were actually useful, or about things we've worked on in 
Juno, or at least high-priority/impact things (the v3 API being an 
exception there; that was a useful session).


Maybe that's just me, but honestly if I had to choose which to go to 
between a meetup and the summit, I'd pick the mid-cycle meetup.


I don't think any travel should be *required* to be a member of the core 
team, but I do think the meetups are more productive than the summit, so 
I'm just agreeing with danpb here that it seems the design sessions 
should be restructured somehow.





As I explain in the rest of my email below I'm not advocating
getting rid of mid-cycle events entirely. I'm suggesting that
we can attain a reasonable % of the benefits of f2f meetings
by doing more formal virtual meetups and so be more efficient
and inclusive overall.


I'd love to see more high-bandwidth mechanisms used to have discussions
in between f2f meetings. In fact, one of the outcomes of this last
midcycle was that we should have one about APIv3 with the folks that
couldn't attend for other reasons. It came up specifically because we
made more progress in ninety minutes than we had in the previous eight
months (yes, even with a design summit in the middle of that).

Expecting cores to be at these sorts of things seems pretty reasonable
to me, given the usefulness (and gravity) of the discussions we've been
having so far. Companies with more cores will have to send more or make
some hard decisions, but I don't want to cut back on the meetings until
their value becomes unjustified.


I think this gets to the crux of the original email -- we are
increasingly needing cores to understand the overall direction nova is
going. You could argue for example that our failure to land many high
priority blueprints in Juno is because cores aren't acting in a
coordinated manner. So, we're attempting to come up with ways to
improve coordination.


Having mid-cycle meetups where only a subset of cores will attend doesn't
feel like a great way to improve the coordination between /all/ cores.

Regards,
Daniel



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] stable/havana jobs failing due to keystone bug 1357652

2014-08-17 Thread Matt Riedemann
I'm seeing some nova stable/havana patches failing consistently on 
keystone bug 1357652 [1]; keystone won't start due to an import error.


I'm not seeing any recent changes for keystone in stable/havana so not 
sure if this is an infra issue or something else.


I'm also not seeing the hits in logstash for some reason, which is odd.

[1] https://bugs.launchpad.net/keystone/+bug/1357652

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-stable-maint] stable/havana jobs failing due to keystone bug 1357652

2014-08-17 Thread Matt Riedemann



On 8/17/2014 3:36 PM, Alan Pevec wrote:

2014-08-17 22:25 GMT+02:00 Matt Riedemann mrie...@linux.vnet.ibm.com:

The other thing I thought was we could cap the version of
python-keystoneclient in stable/havana, would that be bad? stable/havana is
going to be end of life pretty soon anyway.


No, we had caps on some clients and it was creating situations with
conflicting requirements; the last example was swiftclient2.
Another alternative was to start stable/* branches of the clients, but
that was rejected in the past.
The theory is that *clients are backward compatible, but I'm not sure if
the addition of new dependencies was considered when the decision to go
with master-only clients was made.

I think it's fine to add new test-requirements on stable, we should
just somehow get an early warning that client change is going to break
stable branch and update test-req preemptively.

Cheers,
Alan



OK, so here is where we appear to be:

1. We need the oslo.utils changes in python-keystoneclient reverted on 
master to get the stable/havana backports for global-requirements to 
pass Jenkins.  The revert is here:


https://review.openstack.org/#/q/status:open+project:openstack/python-keystoneclient+branch:master+topic:bug/1357652,n,z

2. The backports for oslo.i18n and oslo.utils to stable/havana are here:

https://review.openstack.org/#/q/status:open+project:openstack/requirements+branch:stable/havana+topic:bug/1357652,n,z

3. Once 1 and 2 are done, we can restore the changes to 
python-keystoneclient on master.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][libvirt] Non-readonly connection to libvirt in unit tests

2014-08-20 Thread Matt Riedemann



On 8/11/2014 4:42 AM, Daniel P. Berrange wrote:

On Mon, Aug 04, 2014 at 06:46:13PM -0400, Solly Ross wrote:

Hi,
I was wondering if there was a way to get a non-readonly connection
to libvirt when running the unit tests
on the CI.  If I call `LibvirtDriver._connect(LibvirtDriver.uri())`,
it works fine locally, but the upstream
CI barfs with libvirtError: internal error Unable to locate libvirtd
daemon in /usr/sbin (to override, set $LIBVIRTD_PATH to the name of the
libvirtd binary).
If I try to connect by calling libvirt.open(None), it also barfs, saying
I don't have permission to connect.  I could just set it to always use
fakelibvirt,
but it would be nice to be able to run some of the tests against a real
target.  The tests in question are part of 
https://review.openstack.org/#/c/111459/,
and involve manipulating directory-based libvirt storage pools.


Nothing in the unit tests should rely on being able to connect to the
libvirt daemon, as the unit tests should still be able to pass when the
daemon is not running at all. We should be either using fakelibvirt or
mocking any libvirt APIs that need to be tested

Regards,
Daniel



So this is busted then, right? The new flags being used aren't 
defined in fakelibvirt:


https://github.com/openstack/nova/commit/26504d71ceaecf22f135d8321769db801290c405
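
For reference, the "mock it out" approach Daniel describes would look 
something like this in a unit test (a rough sketch; the patch target, the 
_get_connection accessor and the driver constructor usage are my 
assumptions about the tree, not actual test code):

    # Sketch: the test never reaches a real libvirtd because the
    # connection entry point is patched; names are assumptions.
    import mock

    from nova import test
    from nova.virt import fake
    from nova.virt.libvirt import driver as libvirt_driver


    class LibvirtConnectionMockTestCase(test.NoDBTestCase):

        @mock.patch.object(libvirt_driver.LibvirtDriver, '_connect')
        def test_connection_is_mocked(self, mock_connect):
            fake_conn = mock.Mock()
            fake_conn.getLibVersion.return_value = 1002001
            mock_connect.return_value = fake_conn

            drvr = libvirt_driver.LibvirtDriver(fake.FakeVirtAPI(),
                                                read_only=True)
            # Whatever the driver does with its connection handle, it
            # only ever sees the mock, so no daemon is needed.
            self.assertIs(fake_conn, drvr._get_connection())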

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][libvirt] Non-readonly connection to libvirt in unit tests

2014-08-20 Thread Matt Riedemann



On 8/11/2014 4:42 AM, Daniel P. Berrange wrote:

On Mon, Aug 04, 2014 at 06:46:13PM -0400, Solly Ross wrote:

Hi,
I was wondering if there was a way to get a non-readonly connection
to libvirt when running the unit tests
on the CI.  If I call `LibvirtDriver._connect(LibvirtDriver.uri())`,
it works fine locally, but the upstream
CI barfs with libvirtError: internal error Unable to locate libvirtd
daemon in /usr/sbin (to override, set $LIBVIRTD_PATH to the name of the
libvirtd binary).
If I try to connect by calling libvirt.open(None), it also barfs, saying
I don't have permission to connect.  I could just set it to always use
fakelibvirt,
but it would be nice to be able to run some of the tests against a real
target.  The tests in question are part of 
https://review.openstack.org/#/c/111459/,
and involve manipulating directory-based libvirt storage pools.


Nothing in the unit tests should rely on being able to connect to the
libvirt daemon, as the unit tests should still be able to pass when the
daemon is not running at all. We should be either using fakelibvirt or
mocking any libvirt APIs that need to be tested

Regards,
Daniel



Also, doesn't this kind of break with the test requirement on 
libvirt-python now?  Earlier, when I was on trusty and trying to install 
that, it was failing because I didn't have a new enough version of 
libvirt-bin installed.  So if we require libvirt-python for tests and that 
requires libvirt-bin, what's stopping us from just removing fakelibvirt, 
since it's kind of useless now anyway, right?


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Prioritizing review of potentially approvable patches

2014-08-21 Thread Matt Riedemann



On 8/21/2014 7:09 AM, Sean Dague wrote:

FWIW, this is one of my normal morning practices, and the reason that
that query is part of most of the gerrit dashboards -
https://github.com/stackforge/gerrit-dash-creator/blob/master/dashboards/compute-program.dash

On 08/21/2014 06:57 AM, Daniel P. Berrange wrote:

Tagged with '[nova]' but this might be relevant data / idea for other
teams too.

With my code contributor hat on, one of the things that I find the most
frustrating about the Nova code review process is that a patch can get a +2
vote from one core team member and then sit around for days, weeks, even
months without getting a second +2 vote, even if it has no negative
feedback at all and is a simple and important bug fix.

If a patch is good enough to have received one +2 vote, then compared to
the open patches as a whole, this patch is much more likely to be one
that is ready for approval and merge. It will likely be easier to review,
since it can be assumed other reviewers have already caught the majority
of the silly / tedious / time consuming bugs.

Letting these patches languish with a single +2 for too long makes it very
likely that, when a second core reviewer finally appears, there will be a
merge conflict or other bit-rot that will cause it to have to undergo yet
another rebase and re-review. This is wasting the time of both our
contributors and our review team.

On this basis I suggest that core team members should consider patches
that already have a +2 to be high(er) priority items to review than open
patches as a whole.

Currently Nova has (on master branch)

   - 158 patches which have at least one +2 vote, and are not approved
   - 122 patches which have at least one +2 vote, are not approved and
 don't have any -1 code review votes.

So that's 122 patches that should be easy candidates for merging right
now. Another 30 can possibly be merged depending on whether the core
reviewer agrees with the -1 feedback given or not.

That is way more patches than we should have outstanding in that state.
It is not unreasonable to say that once a patch has a single +2 vote, we
should aim to get either a second +2 vote or further -1 review feedback
in a matter of days, and certainly no longer than a week.

If everyone on the core team looked at the list of potentially approvable
patches each day I think it would significantly improve our throughput.
It would also decrease the amount of review work overall by reducing the
chance that patches bitrot and need a rebase for merge conflicts. And most
importantly of all it will give our code contributors a better impression
that we care about them.

As an added carrot, working through this list will be an effective way
to improve your rankings [1] against other core reviewers, not that I
mean to suggest we should care about rankings over review quality ;-P

The next version of gerrymander[2] will contain a new command to allow
core reviewers to easily identify these patches

$ gerrymander todo-approvable -g nova --branch master

This will of course filter out patches which you yourself own since you
can't approve your own work. It will also filter out patches which you
have given feedback on already. What's left will be a list of patches
where you are able to apply the casting +2 vote to get to +A state.
If the '--strict' arg is added it will also filter out any patches which
have a -1 code review comment.

Regards,
Daniel

[1] http://russellbryant.net/openstack-stats/nova-reviewers-30.txt
[2] 
https://github.com/berrange/gerrymander/commit/790df913fc512580d92e808f28793e29783fecd7






___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Yeah:

https://review.openstack.org/#/projects/openstack/nova,dashboards/important-changes:review-inbox-dashboard

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][libvirt] Non-readonly connection to libvirt in unit tests

2014-08-21 Thread Matt Riedemann



On 8/21/2014 10:23 AM, Daniel P. Berrange wrote:

On Thu, Aug 21, 2014 at 11:14:33AM -0400, Solly Ross wrote:

(reply inline)

- Original Message -

From: Daniel P. Berrange berra...@redhat.com
To: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Sent: Thursday, August 21, 2014 11:05:18 AM
Subject: Re: [openstack-dev] [nova][libvirt] Non-readonly connection to libvirt 
in unit tests

On Thu, Aug 21, 2014 at 10:52:42AM -0400, Solly Ross wrote:

FYI, the context of this is that I would like to be able to test some
of the libvirt storage pool code against a live file system, as we
currently test the storage pool code.  To do this, we need at least to
be able to get a proper connection to a session daemon.  IMHO, since
these calls aren't expensive, so to speak, it should be fine to have
them run against a real libvirt.


No it really isn't OK to run against the real libvirt host system when
in the unit tests. Unit tests must *not* rely on external system state
in this way because it will lead to greater instability and unreliability
of our unit tests. If you want to test stuff against the real libvirt
storage pools then that becomes a functional / integration test suite
which is pretty much what tempest is targeting.


That's all well and good, but we *currently* manipulate the actual file
system manually in tests.  Should we then say that we should never manipulate
the actual file system either?  In that case, there are some tests which
need to be refactored.


Places where the tests manipulate the filesystem though should be doing
so in an isolated playpen directory, not in the live location where
a deployed nova runs, so that's not the same thing.


So If we require libvirt-python for tests and that requires
libvirt-bin, what's stopping us from just removing fakelibvirt since
it's kind of useless now anyway, right?


The thing about fakelibvirt is that it allows us to operate against
against a libvirt API without actually doing libvirt-y things like
launching VMs.  Now, libvirt does have a test:///default URI that
IIRC has similar functionality, so we could start to phase out fake
libvirt in favor of that.  However, there are probably still some
spots where we'll want to use fakelibvirt.


I'm actually increasingly of the opinion that we should not in fact
be trying to use the real libvirt library in the unit tests at all
as it is not really adding any value. We typically mock out all the
actual API calls we exercise so despite using libvirt-python we
are not in fact exercising its code or even validating that we're
passing the correct numbers of parameters to API calls. Pretty much
all we are really relying on is the existence of the various global
constants that are defined, and that has been nothing but trouble
because the constants may or may not be defined depending on the
version.


Isn't that what 'test:///default' is supposed to be?  A version of libvirt
with libvirt not actually touching the rest of the system?


Yes, that is what it allows for, however, even if we used that URI we
still wouldn't be actually exercising any of the libvirt code in any
meaningful way because our unit tests mock out all the API calls that
get touched. So using libvirt-python + test:///default URI doesn't
really seem to buy us anything, but it does still mean that developers
need to have libvirt installed in order to run  the unit tests. I'm
not convinced that is a beneficial tradeoff.


The downside of fakelibvirt is that it is a half-assed implementation
of libvirt that we evolve in an ad hoc fashion. I'm exploring the idea
of using Python's introspection abilities to query the libvirt-python
API and automatically generate a better 'fakelibvirt' that we can
guarantee to match the signatures of the real libvirt library. If we
had something like that which we had more confidence in, then we could
make the unit tests use that unconditionally. This would make our unit
tests more reliable since we would not be susceptible to different API
coverage in different libvirt module versions, which have tripped us up
so many times.


Regards,
Daniel



+1000 to removing the need to have libvirt installed to run unit tests, 
but that's what I'm seeing today unless I'm mistaken since we require 
libvirt-python which requires libvirt as already pointed out.


If you revert the change to require libvirt-python and try to run the 
unit tests, it fails, see bug 1357437 [1].


[1] https://bugs.launchpad.net/nova/+bug/1357437
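
As an aside, the introspection idea Daniel floats above would be something 
along these lines (just a sketch of the concept; it has to run somewhere 
libvirt-python is installed, and the output format is arbitrary):

    # Sketch: enumerate the constants and callables libvirt-python
    # exposes so a generated fake could mirror them.
    import inspect

    import libvirt


    def dump_libvirt_api():
        constants = {}
        callables = []
        for name, obj in inspect.getmembers(libvirt):
            if name.startswith('VIR_') and isinstance(obj, int):
                constants[name] = obj
            elif callable(obj) and not name.startswith('_'):
                callables.append(name)
        return constants, sorted(callables)


    if __name__ == '__main__':
        consts, funcs = dump_libvirt_api()
        print('%d constants, %d callables' % (len(consts), len(funcs)))
        for name in funcs:
            print(name)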

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][libvirt] Non-readonly connection to libvirt in unit tests

2014-08-21 Thread Matt Riedemann



On 8/21/2014 11:37 AM, Clark Boylan wrote:



On Thu, Aug 21, 2014, at 09:25 AM, Matt Riedemann wrote:



On 8/21/2014 10:23 AM, Daniel P. Berrange wrote:

On Thu, Aug 21, 2014 at 11:14:33AM -0400, Solly Ross wrote:

(reply inline)

- Original Message -

From: Daniel P. Berrange berra...@redhat.com
To: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Sent: Thursday, August 21, 2014 11:05:18 AM
Subject: Re: [openstack-dev] [nova][libvirt] Non-readonly connection to libvirt 
in unit tests

On Thu, Aug 21, 2014 at 10:52:42AM -0400, Solly Ross wrote:

FYI, the context of this is that I would like to be able to test some
of the libvirt storage pool code against a live file system, as we
currently test the storage pool code.  To do this, we need at least to
be able to get a proper connection to a session daemon.  IMHO, since
these calls aren't expensive, so to speak, it should be fine to have
them run against a real libvirt.


No it really isn't OK to run against the real libvirt host system when
in the unit tests. Unit tests must *not* rely on external system state
in this way because it will lead to greater instability and unreliability
of our unit tests. If you want to test stuff against the real libvirt
storage pools then that becomes a functional / integration test suite
which is pretty much what tempest is targeting.


That's all well and good, but we *currently* manipulate the actual file
system manually in tests.  Should we then say that we should never manipulate
the actual file system either?  In that case, there are some tests which
need to be refactored.


Places where the tests manipulate the filesystem though should be doing
so in an isolated playpen directory, not in the live location where
a deployed nova runs, so that's not the same thing.


So If we require libvirt-python for tests and that requires
libvirt-bin, what's stopping us from just removing fakelibvirt since
it's kind of useless now anyway, right?


The thing about fakelibvirt is that it allows us to operate against
against a libvirt API without actually doing libvirt-y things like
launching VMs.  Now, libvirt does have a test:///default URI that
IIRC has similar functionality, so we could start to phase out fake
libvirt in favor of that.  However, there are probably still some
spots where we'll want to use fakelibvirt.


I'm actually increasingly of the opinion that we should not in fact
be trying to use the real libvirt library in the unit tests at all
as it is not really adding any value. We typically mock out all the
actual API calls we exercise so despite using libvirt-python we
are not in fact exercising its code or even validating that we're
passing the correct numbers of parameters to API calls. Pretty much
all we are really relying on is the existence of the various global
constants that are defined, and that has been nothing but trouble
because the constants may or may not be defined depending on the
version.


Isn't that what 'test:///default' is supposed to be?  A version of libvirt
with libvirt not actually touching the rest of the system?


Yes, that is what it allows for, however, even if we used that URI we
still wouldn't be actually exercising any of the libvirt code in any
meaningful way because our unit tests mock out all the API calls that
get touched. So using libvirt-python + test:///default URI doesn't
really seem to buy us anything, but it does still mean that developers
need to have libvirt installed in order to run  the unit tests. I'm
not convinced that is a beneficial tradeoff.


The downside of fakelibvirt is that it is a half-assed implementation
of libvirt that we evolve in an ad hoc fashion. I'm exploring the idea
of using Python's introspection abilities to query the libvirt-python
API and automatically generate a better 'fakelibvirt' that we can
guarantee to match the signatures of the real libvirt library. If we
had something like that which we had more confidence in, then we could
make the unit tests use that unconditionally. This would make our unit
tests more reliable since we would not be susceptible to different API
coverage in different libvirt module versions, which have tripped us up
so many times.


Regards,
Daniel



+1000 to removing the need to have libvirt installed to run unit tests,
but that's what I'm seeing today unless I'm mistaken since we require
libvirt-python which requires libvirt as already pointed out.

If you revert the change to require libvirt-python and try to run the
unit tests, it fails, see bug 1357437 [1].

[1] https://bugs.launchpad.net/nova/+bug/1357437


Reverting the change to require libvirt-python is insufficient. That
revert will flip back to using system packages and include libvirt
python lib from your operating system. Libvirt will still be required
just via a different avenue (nova does try to fall back on its fake
libvirt but iirc that doesn't always work so well).

If you want to stop depending

Re: [openstack-dev] [nova][libvirt] Non-readonly connection to libvirt in unit tests

2014-08-21 Thread Matt Riedemann



On 8/21/2014 12:26 PM, Daniel P. Berrange wrote:

On Thu, Aug 21, 2014 at 12:23:12PM -0500, Matt Riedemann wrote:



On 8/21/2014 11:37 AM, Clark Boylan wrote:



On Thu, Aug 21, 2014, at 09:25 AM, Matt Riedemann wrote:



On 8/21/2014 10:23 AM, Daniel P. Berrange wrote:

On Thu, Aug 21, 2014 at 11:14:33AM -0400, Solly Ross wrote:

(reply inline)

- Original Message -

From: Daniel P. Berrange berra...@redhat.com
To: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Sent: Thursday, August 21, 2014 11:05:18 AM
Subject: Re: [openstack-dev] [nova][libvirt] Non-readonly connection to libvirt 
in unit tests

On Thu, Aug 21, 2014 at 10:52:42AM -0400, Solly Ross wrote:

FYI, the context of this is that I would like to be able to test some
of the libvirt storage pool code against a live file system, as we
currently test the storage pool code.  To do this, we need at least to
be able to get a proper connection to a session daemon.  IMHO, since
these calls aren't expensive, so to speak, it should be fine to have
them run against a real libvirt.


No it really isn't OK to run against the real libvirt host system when
in the unit tests. Unit tests must *not* rely on external system state
in this way because it will lead to greater instability and unreliability
of our unit tests. If you want to test stuff against the real libvirt
storage pools then that becomes a functional / integration test suite
which is pretty much what tempest is targeting.


That's all well and good, but we *currently* manipulate the actual file
system manually in tests.  Should we then say that we should never manipulate
the actual file system either?  In that case, there are some tests which
need to be refactored.


Places where the tests manipulate the filesystem though should be doing
so in an isolated playpen directory, not in the live location where
a deployed nova runs, so that's not the same thing.


So If we require libvirt-python for tests and that requires
libvirt-bin, what's stopping us from just removing fakelibvirt since
it's kind of useless now anyway, right?


The thing about fakelibvirt is that it allows us to operate against
against a libvirt API without actually doing libvirt-y things like
launching VMs.  Now, libvirt does have a test:///default URI that
IIRC has similar functionality, so we could start to phase out fake
libvirt in favor of that.  However, there are probably still some
spots where we'll want to use fakelibvirt.


I'm actually increasingly of the opinion that we should not in fact
be trying to use the real libvirt library in the unit tests at all
as it is not really adding any value. We typically mock out all the
actual API calls we exercise so despite using libvirt-python we
are not in fact exercising its code or even validating that we're
passing the correct numbers of parameters to API calls. Pretty much
all we are really relying on is the existence of the various global
constants that are defined, and that has been nothing but trouble
because the constants may or may not be defined depending on the
version.


Isn't that what 'test:///default' is supposed to be?  A version of libvirt
with libvirt not actually touching the rest of the system?


Yes, that is what it allows for, however, even if we used that URI we
still wouldn't be actually exercising any of the libvirt code in any
meaningful way because our unit tests mock out all the API calls that
get touched. So using libvirt-python + test:///default URI doesn't
really seem to buy us anything, but it does still mean that developers
need to have libvirt installed in order to run  the unit tests. I'm
not convinced that is a beneficial tradeoff.


The downside of fakelibvirt is that it is a half-assed implementation
of libvirt that we evolve in an ad hoc fashion. I'm exploring the idea
of using Python's introspection abilities to query the libvirt-python
API and automatically generate a better 'fakelibvirt' that we can
guarantee to match the signatures of the real libvirt library. If we
had something like that which we had more confidence in, then we could
make the unit tests use that unconditionally. This would make our unit
tests more reliable since we would not be susceptible to different API
coverage in different libvirt module versions, which have tripped us up
so many times.


Regards,
Daniel



+1000 to removing the need to have libvirt installed to run unit tests,
but that's what I'm seeing today unless I'm mistaken since we require
libvirt-python which requires libvirt as already pointed out.

If you revert the change to require libvirt-python and try to run the
unit tests, it fails, see bug 1357437 [1].

[1] https://bugs.launchpad.net/nova/+bug/1357437


Reverting the change to require libvirt-python is insufficient. That
revert will flip back to using system packages and include libvirt
python lib from your operating system. Libvirt will still be required
just via a different avenue (nova does try

Re: [openstack-dev] [all] new testtools breaking gate

2014-08-23 Thread Matt Riedemann
 were blowing up, so I guess now I know 
there was an infra switch flipped. Whatever, that's fine; if someone asks 
'wtf is going on with these since 8/21?', someone from infra is pretty 
quick to point out the change, and then it's just getting people that care 
enough about fixing the bugs to fix them.  I don't think you or the 
infra team should be responsible for that in all projects affected; it 
doesn't scale.


Maybe next time something like this comes up we get the PTLs to be the 
ones assigning a person (Clark's Infra Czar?!?!) responsible for 
coordinating these types of changes so they are ready.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Adding GateFailureFix tag to commit messages

2014-08-23 Thread Matt Riedemann



On 8/22/2014 4:11 AM, Daniel P. Berrange wrote:

On Thu, Aug 21, 2014 at 09:02:17AM -0700, Armando M. wrote:

Hi folks,

According to [1], we have ways to introduce external references to commit
messages.

These are useful to mark certain patches and their relevance in the context
of documentation, upgrades, etc.

I was wondering if it would be useful considering the addition of another
tag:

GateFailureFix

The objective of this tag, mainly for consumption by the review team, would
be to make sure that some patches get more attention than others, as they
affect the velocity of how certain critical issues are addressed (and gate
failures affect everyone).

As for machine consumption, I know that some projects use the
'gate-failure' tag to categorize LP bugs that affect the gate. The use of a
GateFailureFix tag in the commit message could make the tagging automatic,
so that we can keep a log of what all the gate failures are over time.

Not sure if this was proposed before, and I welcome any input on the matter.


We've tried a number of different tags in git commit messages before, in
an attempt to help prioritization of reviews, and unfortunately none of them
have been particularly successful so far.  I think a key reason for this
is that tags in the commit message are invisible when people are looking at
lists of possible changes to choose for review. Whether in the gerrit web
UI reports / dashboards or in command line tools like my own gerrymander,
reviewers are looking at lists of changes and primarily choosing which
to review based on the subject line, or other explicitly recorded metadata
fields. You won't typically look at the commit message until you've already
decided you want to review the change. So while GateFailureFix may cause
me to pay more attention during the review of it, it probably won't make
me start review any sooner.

Regards,
Daniel



Yup, I had the same thoughts.  The TrivialFix tag idea is similar and 
never took off, and I personally don't like that kind of tag anyway 
since it's very open to interpretation.


And if GateFailureFix wasn't going to be tied to bug fixes for known 
(tracked in elastic-recheck) failures, but just high-priority fixes for 
a given project, then it's false advertising for the change.  Gate 
failures typically affect all projects, whereas high-priority fixes for 
a project might be just isolated to that project, e.g. the recent 
testtools 0.9.36 setUp/tearDown and tox hashseed unit test failures are 
project-specific and high priority for the project to fix.


If you want a simple way to see high priority bugs that have code out 
for review, Tracy Jones has a nice page created for Nova [1].


[1] http://54.201.139.117/nova-bugs.html

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Adding GateFailureFix tag to commit messages

2014-08-23 Thread Matt Riedemann



On 8/21/2014 11:55 AM, Sean Dague wrote:

On 08/21/2014 11:02 AM, Armando M. wrote:

Hi folks,

According to [1], we have ways to introduce external references to
commit messages.

These are useful to mark certain patches and their relevance in the
context of documentation, upgrades, etc.

I was wondering if it would be useful considering the addition of
another tag:

GateFailureFix

The objective of this tag, mainly for consumption by the review team,
would be to make sure that some patches get more attention than others,
as they affect the velocity of how certain critical issues are addressed
(and gate failures affect everyone).

As for machine consumption, I know that some projects use the
'gate-failure' tag to categorize LP bugs that affect the gate. The use
of a GateFailureFix tag in the commit message could make the tagging
automatic, so that we can keep a log of what all the gate failures are
over time.

Not sure if this was proposed before, and I welcome any input on the matter.


A concern with this approach is it's pretty arbitrary, and not always
clear which bugs are being addressed and how severe they are.

An idea that came up in the Infra/QA meetup was to build a custom review
dashboard based on the bug list in elastic recheck. That would also
encourage people to categorize these bugs through that system, and I
think provide a virtuous circle around identifying the issues at hand.

I think Joe Gordon had a first pass at this, but I'd be more interested
in doing it this way because it means the patch author fixing a bug just
needs to know they are fixing the bug. Whether or not it's currently a
gate issue would be decided not by the commit message (static) but by
our system that understands what are the gate issues *right now* (dynamic).

-Sean



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Joe's change has merged:

https://review.openstack.org/#/c/109144/

There should be an "Open reviews" section in the elastic-recheck status 
page now:


http://status.openstack.org/elastic-recheck/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [trove] single trove config file and run check_uptodate?

2014-08-25 Thread Matt Riedemann
I was reviewing some install script recently where someone was trying to 
configure trove and I was asking about some config option they were 
setting which wasn't in the trove.conf.sample.  I was too lazy at the 
time to see if it was a valid option in the code, but it did make me wonder 
if there has been discussion about consolidating the various trove 
config files into a single config file with sections (heat used to have 
the same issue; now they have a single config file).  I'm assuming this 
has come up before but didn't see anything on the mailing list.


Building on that, if there were a single trove config file we could have 
a job that gates on making sure it's up to date by using the 
check_uptodate script used in other projects, e.g. cinder.  That should 
be a relatively easy drop-in; the work will be in cleaning up the config 
file (unless you just replace it with what generate_sample provides).
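
The gate check itself is tiny, something along these lines (a sketch only; 
the generator command and the sample path are placeholders, not the exact 
tooling trove or cinder carry):

    # Sketch: regenerate the sample config and fail if the committed
    # copy differs.  Paths and the generator command are placeholders.
    import filecmp
    import subprocess
    import sys

    COMMITTED_SAMPLE = 'etc/trove/trove.conf.sample'
    FRESH_SAMPLE = '/tmp/trove.conf.sample.fresh'

    # Stand-in for whatever the project uses to emit its sample config
    # (the oslo generate_sample tooling at the time).
    GENERATOR_CMD = ['./tools/config/generate_sample_config']

    subprocess.check_call(GENERATOR_CMD + [FRESH_SAMPLE])

    if not filecmp.cmp(COMMITTED_SAMPLE, FRESH_SAMPLE):
        sys.exit('%s is out of date; please regenerate it'
                 % COMMITTED_SAMPLE)
    print('%s is up to date' % COMMITTED_SAMPLE)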


I think the docs team is also interested in seeing this happen...

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] python-neutronclient, launchpad, and milestones

2014-08-29 Thread Matt Riedemann



On 7/29/2014 4:12 PM, Kyle Mestery wrote:

On Tue, Jul 29, 2014 at 3:50 PM, Nader Lahouti nader.laho...@gmail.com wrote:

Hi Kyle,

I have a BP listed in https://blueprints.launchpad.net/python-neutronclient
and it looks like it is targeted for 3.0 (it is needed for juno-3). The code is
ready and in review. Can it be included in the 2.3.7 release?


Yes, you can target it there. We'll see about including it in that
release, pending review.

Thanks!
Kyle


Thanks,
Nader.



On Tue, Jul 29, 2014 at 12:28 PM, Kyle Mestery mest...@mestery.com wrote:


All:

I spent some time today cleaning up python-neutronclient in LP. I
created a 2.3 series, and created milestones for the 2.3.5 (June 26)
and 2.3.6 (today) releases. I also targeted bugs which were released
in those milestones to the appropriate places. My next step is to
remove the 3.0 series, as I don't believe this is necessary anymore.

One other note: I've tentatively created a 2.3.7 milestone in LP, so
we can start targeting client bugs which merge there for the next
client release.

If you have any questions, please let me know.

Thanks,
Kyle

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



What are the thoughts on when a 2.3.7 release is going to happen? I'm 
specifically interested in getting the keystone v3 support [1] into a 
released version of the library.


9/4 and feature freeze seems like a decent target date.

[1] https://review.openstack.org/#/c/92390/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [feature freeze exception] Move to oslo.db

2014-09-03 Thread Matt Riedemann



On 9/3/2014 5:08 PM, Andrey Kurilin wrote:

Hi All!

I'd like to ask for a feature freeze exception for porting nova to use
oslo.db.

This change not only removes 3k LOC, but fixes 4 bugs (see the commit message
for more details) and provides relevant, stable common db code.

The main maintainers of oslo.db (Roman Podoliaka and Victor Sergeyev) are OK
with this.

Joe Gordon and Matt Riedemann have already signed up, so we need one
more vote from a core developer.

By the way, a lot of core projects have already been using oslo.db for a
while: keystone, cinder, glance, ceilometer, ironic, heat, neutron and
sahara. So migration to oslo.db won't produce any unexpected issues.

Patch is here: https://review.openstack.org/#/c/101901/

--
Best regards,
Andrey Kurilin.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Just re-iterating my agreement to sponsor this.  I'm waiting for the 
latest patch set to pass Jenkins and for Roman to review after his 
comments from the previous patch set and -1.  Otherwise I think this is 
nearly ready to go.


The turbo-hipster failures on the change appear to be infra issues in 
t-h rather than problems with the code.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno

2014-09-03 Thread Matt Riedemann
 that does reduce the number of people who
have agreed to review the code for that exception.



Michael has correctly picked up on a hint of snark in my email, so let
me explain where I was going with that:

The reason many features including my own may not make the FF is not
because there was not enough buy in from the core team (let's be
completely honest - I have 3+ other core members working for the same
company that are by nature of things easier to convince), but because of
any of the following:

* Crippling technical debt in some of the key parts of the code
* that we have not been acknowledging as such for a long time
* which leads to proposed code being arbitrarily delayed once it makes
the glaring flaws in the underlying infra apparent
* and that the specs process has been completely and utterly useless in
helping uncover (not that the process itself is useless, it is very useful
for other things)

I am almost positive we can turn this rather dire situation around
easily in a matter of months, but we need to start doing it! It will not
happen through pinning arbitrary numbers to arbitrary processes.

I will follow up with a more detailed email about what I believe we are
missing, once the FF settles and I have applied some soothing creme to
my burnout wounds, but currently my sentiment is:

Contributing features to Nova nowadays SUCKS!!1 (even as a core
reviewer) We _have_ to change that!

N.


Michael


 * exceptions must be granted before midnight, Friday this week
(September 5) UTC
 * the exception is valid until midnight Friday next week
(September 12) UTC when all exceptions expire

For reference, our rc1 drops on approximately 25 September, so the
exception period needs to be short to maximise stabilization time.

John Garbutt and I will both be granting exceptions, to maximise our
timezone coverage. We will grant exceptions as they come in and gather
the required number of cores, although I have also carved some time
out in the nova IRC meeting this week for people to discuss specific
exception requests.

Michael




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev







___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Feature Freeze Exception process for Juno

2014-09-04 Thread Matt Riedemann



On 9/4/2014 4:21 AM, Day, Phil wrote:


One final note: the specs referenced above didn't get approved until
Spec Freeze, which seemed to leave me with less time to implement
things.  In fact, it seemed that a lot of specs didn't get approved
until spec freeze.  Perhaps if we had more staggered approval of
specs, we'd have more staggered submission of patches, and thus less of a

sudden influx of patches in the couple weeks before feature proposal
freeze.

Yeah, I think the specs were getting approved too late into the cycle; I was
actually surprised at how far out the schedules were going in allowing things
in and then allowing exceptions after that.

Hopefully the ideas around priorities/slots/runways will help stagger some of
this also.


I think there is a problem with the pattern that seemed to emerge in June, where 
the J.1 period was taken up with spec review (a lot of good reviews happened 
early in that period, but the approvals kind of came in a lump at the end), 
meaning that the implementation work itself only seemed to really kick in 
during J.2 and, not surprisingly given the complexity of some of the changes, 
ran late into J.3.

We also, as previously noted, didn't do any prioritization between those specs 
that were approved, so it was always going to be a race as to who managed to get 
code up for review first.

It kind of feels to me as if the ideal model would be if we were doing spec 
review for K now (i.e. during the FF / stabilization period) so that we hit 
Paris with a lot of the input already registered and a clear idea of the range 
of things folks want to do.  We shouldn't really have to ask for session 
suggestions for the summit; they should be something that can be extracted 
from the proposed specs (maybe we do voting across the specs or something like 
that).  In that way the summit would be able to confirm the list of specs for 
K and the priority order.

With the current state of the review queue maybe we can't quite hit this 
pattern for K, but it would be worth aspiring to for L?

Phil

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



I like the idea of having our ducks somewhat in a row for the summit so 
we can hash out details in design sessions on high-priority specs and 
reserve time for figuring out what the priorities are.  I think that 
would go a long way in fixing some of the frustrations in the other 
thread about the mid-cycle meetups being the place where blueprint 
issues are hashed out rather than the summit, and the design sessions at 
the summit not feeling productive.


But as noted, there is also a feeling right now of focusing on Juno to 
get that out the door before anyone starts getting distracted with 
reviewing Kilo specs.  And I suppose once Juno is finished no one is 
going to want to talk about Kilo for awhile due to burnout.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Averting the Nova crisis by splitting out virt drivers

2014-09-04 Thread Matt Riedemann
 :|

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Regards,
Daniel



Even if we split the virt drivers out, libvirt would still be the 
default in the Tempest gate runs, right?


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Feature freeze + Juno-3 milestone candidates available

2014-09-05 Thread Matt Riedemann



On 9/5/2014 5:10 AM, Thierry Carrez wrote:

Hi everyone,

We just hit feature freeze[1], so please do not approve changes that add
features or new configuration options unless those have been granted a
feature freeze exception.

This is also string freeze[2], so you should avoid changing translatable
strings. If you have to modify a translatable string, you should give a
heads-up to the I18N team.

Finally, this is also DepFreeze[3], so you should avoid adding new
dependencies (bumping oslo or openstack client libraries is OK until
RC1). If you have a new dependency to add, raise a thread on
openstack-dev about it.

The juno-3 development milestone was tagged, it contains more than 135
features and 760 bugfixes added since the juno-2 milestone 6 weeks ago
(not even counting the Oslo libraries in the mix). You can find the full
list of new features and fixed bugs, as well as tarball downloads, at:

https://launchpad.net/keystone/juno/juno-3
https://launchpad.net/glance/juno/juno-3
https://launchpad.net/nova/juno/juno-3
https://launchpad.net/horizon/juno/juno-3
https://launchpad.net/neutron/juno/juno-3
https://launchpad.net/cinder/juno/juno-3
https://launchpad.net/ceilometer/juno/juno-3
https://launchpad.net/heat/juno/juno-3
https://launchpad.net/trove/juno/juno-3
https://launchpad.net/sahara/juno/juno-3

Many thanks to all the PTLs and release management liaisons who made us
reach this important milestone in the Juno development cycle. Thanks in
particular to John Garbutt, who keeps on doing an amazing job at the
impossible task of keeping the Nova ship straight in troubled waters
while we head toward the Juno release port.

Regards,

[1] https://wiki.openstack.org/wiki/FeatureFreeze
[2] https://wiki.openstack.org/wiki/StringFreeze
[3] https://wiki.openstack.org/wiki/DepFreeze



I should probably know this, but at least I'm asking first. :)

Here is an example of a new translatable user-facing error message [1].

From the StringFreeze wiki, I'm not sure if this is small or large.

Would a compromise to get this in be to drop the _() so it's just a 
string and not a message?
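
Concretely, the compromise I'm talking about is the difference between 
these two (illustration only; the message text and the _() import path are 
examples, not the actual code in the review):

    # Example only; the message text is made up, and the i18n alias
    # import path is an assumption about the tree.
    from nova.i18n import _

    volume_id = 'vol-1234'

    # Marked for translation: this gets extracted into the message
    # catalogs, which is exactly what StringFreeze is protecting.
    msg = _("Volume %s could not be attached.") % volume_id

    # Plain string: still shown to the user, but never handed to the
    # translators, so it sidesteps the freeze (and never gets
    # translated).
    msg = "Volume %s could not be attached." % volume_id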


Maybe I should just shut up and email the openstack-i18n mailing list [2].

[1] https://review.openstack.org/#/c/118535/
[2] http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-i18n

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] python-neutronclient, launchpad, and milestones

2014-09-06 Thread Matt Riedemann



On 8/29/2014 1:53 PM, Kyle Mestery wrote:

On Fri, Aug 29, 2014 at 1:40 PM, Matt Riedemann
mrie...@linux.vnet.ibm.com wrote:



On 7/29/2014 4:12 PM, Kyle Mestery wrote:


On Tue, Jul 29, 2014 at 3:50 PM, Nader Lahouti nader.laho...@gmail.com
wrote:


Hi Kyle,

I have a BP listed in
https://blueprints.launchpad.net/python-neutronclient
and it looks like it is targeted for 3.0 (it is needed for juno-3). The code
is ready and in review. Can it be included in the 2.3.7 release?


Yes, you can target it there. We'll see about including it in that
release, pending review.

Thanks!
Kyle


Thanks,
Nader.



On Tue, Jul 29, 2014 at 12:28 PM, Kyle Mestery mest...@mestery.com
wrote:



All:

I spent some time today cleaning up python-neutronclient in LP. I
created a 2.3 series, and created milestones for the 2.3.5 (June 26)
and 2.3.6 (today) releases. I also targeted bugs which were released
in those milestones to the appropriate places. My next step is to
remove the 3.0 series, as I don't believe this is necessary anymore.

One other note: I've tentatively created a 2.3.7 milestone in LP, so
we can start targeting client bugs which merge there for the next
client release.

If you have any questions, please let me know.

Thanks,
Kyle

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



What are the thoughts on when a 2.3.7 release is going to happen? I'm
specifically interested in getting the keystone v3 support [1] into a
released version of the library.

9/4 and feature freeze seems like a decent target date.


I can make that happen. I'll take a pass through the existing client
reviews to see what's there, and roll another release which would
include the keystone v3 work which is already merged.

Thanks,
Kyle


[1] https://review.openstack.org/#/c/92390/

--

Thanks,

Matt Riedemann



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



I think we're at or near dependency freeze so wondering what the plan is 
for cutting the final release of python-neutronclient before Juno 
release candidates start building (which I think is too late for a dep 
update).


Are there any Neutron FFEs that touch the client that people need to 
wait for?


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [neutron] [nova] non-deterministic gate failures due to unclosed eventlet Timeouts

2014-09-07 Thread Matt Riedemann



On 9/7/2014 8:39 AM, John Schwarz wrote:

Hi,

Long story short: for future reference, if you initialize an eventlet
Timeout, make sure you close it (either with a context manager or simply
timeout.close()), and be extra-careful when writing tests using
eventlet Timeouts, because these timeouts don't implicitly expire and
will cause unexpected behaviours (see [1]) like gate failures. In our
case this caused non-deterministic failures on the dsvm-functional test
suite.


Late last week, a bug was found ([2]) in which an eventlet Timeout
object was initialized but not closed. This instance was left inside
eventlet's inner workings and triggered non-deterministic "Timeout: 10
seconds" errors and failures in dsvm-functional tests.

As mentioned earlier, initializing a new eventlet.timeout.Timeout
instance also registers it to inner mechanisms that exist within the
library, and the reference remains there until it is explicitly removed
(and not until the scope leaves the function block, as some would have
thought). Thus, the old code (simply creating an instance without
assigning it to a variable) left no way to close the timeout object.
This reference remains throughout the life of a worker, so this can
(and did) affect other tests and procedures using eventlet under the
same process. Obviously this could easily affect production-grade
systems with very high load.
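
To make the leaky pattern concrete, here is a minimal sketch (illustrative
only, not the actual code from the review; it just shows the context-manager
and cancel() patterns from the eventlet docs linked below):

    import eventlet
    from eventlet.timeout import Timeout

    eventlet.monkey_patch()

    def slow_operation():
        eventlet.sleep(0.1)

    # Leaky pattern: the Timeout registers itself with the hub and is never
    # cancelled, so it can fire later inside unrelated code in the same process.
    Timeout(10)
    slow_operation()

    # Safe pattern 1: the context manager cancels the timeout on exit.
    with Timeout(10, False):
        slow_operation()

    # Safe pattern 2: cancel explicitly in a finally block.
    timeout = Timeout(10)
    try:
        slow_operation()
    finally:
        timeout.cancel()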

For future reference:
  1) If you run into a "Timeout: %d seconds" exception whose traceback
includes hub.switch() and self.greenlet.switch() calls, there might
be a latent Timeout somewhere in the code, and a search for all
eventlet.timeout.Timeout instances will probably produce the culprit.

  2) The setup used to reproduce this error for debugging purposes is a
baremetal machine running a VM with devstack. In the baremetal machine I
used about six "dd if=/dev/zero of=/dev/null" processes to simulate high CPU load
(full command can be found at [3]), and in the VM I ran the
dsvm-functional suite. Using only a VM with similar high CPU simulation
fails to produce the result.

[1]
http://eventlet.net/doc/modules/timeout.html#eventlet.timeout.eventlet.timeout.Timeout.Timeout.cancel
[2] https://review.openstack.org/#/c/119001/
[3]
http://stackoverflow.com/questions/2925606/how-to-create-a-cpu-spike-with-a-bash-command


--
John Schwarz,
Software Engineer, Red Hat.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Thanks, that might be what's causing this timeout/gate failure in the 
nova unit tests. [1]


[1] https://bugs.launchpad.net/nova/+bug/1357578

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [cinder] new cinderclient release this week?

2014-09-07 Thread Matt Riedemann
I think we're in dependency freeze or quickly approaching.  What are the 
plans from the Cinder team for doing a python-cinderclient release to 
pick up any final features before Juno rc1?


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] China blocking access to OpenStack git review push

2014-09-08 Thread Matt Riedemann



On 9/8/2014 2:20 PM, Thomas Goirand wrote:

Am I dreaming? The Chinese government is trying to push for the
cloud, they said. However, today, bad surprise:

# nmap -p 29418 23.253.232.87

Starting Nmap 6.00 ( http://nmap.org ) at 2014-09-09 03:10 CST
Nmap scan report for review.openstack.org (23.253.232.87)
Host is up (0.21s latency).
PORT  STATESERVICE
29418/tcp filtered unknown

Oh dear ... not fun!

FYI, this is from China Unicom (eg: CNC Group)

I'm guessing that this is the Great Firewall of China's awesome automated
ban script, which detected too many ssh connections to a weird port. It
has blocked a few of my servers recently too, as it has become way too
aggressive. I would much rather use git review from my laptop than
have to bounce around servers. :(

Are there alternative IPs that I could use for review.openstack.org?

Cheers,

Thomas Goirand (zigo)

P.S: If a Chinese official reads this: providing an easy way to unblock
(legitimate) server access would be the first action any reasonable
Chinese government official should take.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Yeah, the IBM DB2 third-party CI is run by a team in China and they've 
been blocked for a few weeks now.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] global-reqs on tooz pulls in worrisome transitive dep

2014-09-09 Thread Matt Riedemann

It took me a while to untangle this so prepare for links. :)

I noticed this change [1] today for global-requirements to require tooz 
[2] for a ceilometer blueprint [3].


The sad part is that tooz requires pymemcache [4] which is, from what I 
can tell, a memcached client that is not the same as python-memcached [5].


Note that python-memcached is listed in global-requirements already [6].

The problem I have with this is it doesn't appear that RHEL/Fedora 
package pymemcache (they do package python-memcached).  I see that 
openSUSE builds separate packages for each.  It looks like Ubuntu also 
has separate packages.


My question is, is this a problem?  I'm assuming RDO will just have to 
package python-pymemcache themselves but what about people not using RDO 
(SOL? Don't care? Other?).


Reverting the requirements change would probably mean reverting the 
ceilometer blueprint (or getting a version of tooz out that works with 
python-memcached which is probably too late for that right now).  Given 
the point in the schedule that seems pretty drastic.


Maybe I'm making more of this than it's worth but wanted to bring it up 
in case anyone else has concerns.


[1] https://review.openstack.org/#/c/93443/
[2] https://github.com/stackforge/tooz/blob/master/requirements.txt#L6
[3] 
http://specs.openstack.org/openstack/ceilometer-specs/specs/juno/central-agent-partitioning.html

[4] https://pypi.python.org/pypi/pymemcache
[5] https://pypi.python.org/pypi/python-memcached/
[6] 
https://github.com/openstack/requirements/blob/master/global-requirements.txt#L108


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [release] client release deadline - Sept 18th

2014-09-10 Thread Matt Riedemann



On 9/9/2014 4:19 PM, Sean Dague wrote:

As we try to stabilize OpenStack Juno, many server projects need to get
out final client releases that expose new features of their servers.
While this seems like not a big deal, each of these clients releases
ends up having possibly destabilizing impacts on the OpenStack whole (as
the clients do double duty in cross communicating between services).

As such in the release meeting today it was agreed clients should have
their final release by Sept 18th. We'll start applying the dependency
freeze to oslo and clients shortly after that, all other requirements
should be frozen at this point unless there is a high priority bug
around them.

-Sean



Thanks for bringing this up. We do our own packaging and need time for 
legal clearances, so having the final client releases done in a 
reasonable time before rc1 is helpful.  I've been pinging a few projects 
to do a final client release relatively soon.  python-neutronclient has 
a release this week and I think John was planning a python-cinderclient 
release this week also.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Server Groups - remove VM from group?

2014-09-11 Thread Matt Riedemann



On 9/10/2014 6:00 PM, Russell Bryant wrote:

On 09/10/2014 06:46 PM, Joe Cropper wrote:

Hmm, not sure I follow the concern, Russell.  How is that any different
from putting a VM into the group when it’s booted, as is done today?
This simply defers the ‘group insertion time’ to some time after
the VM’s been spawned, so I’m not sure this creates any more race
conditions than what’s already there [1].

[1] Sure, the to-be-added VM could be in the midst of a migration or
something, but that would be pretty simple to check: make sure its task
state is None or some such.


The way this works at boot is already a nasty hack.  It does policy
checking in the scheduler, and then has to re-do some policy checking at
launch time on the compute node.  I'm afraid of making this any worse.
In any case, it's probably better to discuss this in the context of a
more detailed design proposal.



This [1] is the hack you're referring to, right?

[1] 
http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py?id=2014.2.b3#n1297


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [release] client release deadline - Sept 18th

2014-09-15 Thread Matt Riedemann



On 9/10/2014 11:08 AM, Kyle Mestery wrote:

On Wed, Sep 10, 2014 at 10:01 AM, Matt Riedemann
mrie...@linux.vnet.ibm.com wrote:



On 9/9/2014 4:19 PM, Sean Dague wrote:


As we try to stabilize OpenStack Juno, many server projects need to get
out final client releases that expose new features of their servers.
While this seems like not a big deal, each of these clients releases
ends up having possibly destabilizing impacts on the OpenStack whole (as
the clients do double duty in cross communicating between services).

As such in the release meeting today it was agreed clients should have
their final release by Sept 18th. We'll start applying the dependency
freeze to oslo and clients shortly after that, all other requirements
should be frozen at this point unless there is a high priority bug
around them.

 -Sean



Thanks for bringing this up. We do our own packaging and need time for legal
clearances and having the final client releases done in a reasonable time
before rc1 is helpful.  I've been pinging a few projects to do a final
client release relatively soon.  python-neutronclient has a release this
week and I think John was planning a python-cinderclient release this week
also.


Just a slight correction: python-neutronclient will have a final
release once the L3 HA CLI changes land [1].

Thanks,
Kyle

[1] https://review.openstack.org/#/c/108378/


--

Thanks,

Matt Riedemann



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



python-cinderclient 1.1.0 was released on Saturday:

https://pypi.python.org/pypi/python-cinderclient/1.1.0

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] are we going to remove the novaclient v3 shell or what?

2014-09-17 Thread Matt Riedemann
This has come up a couple of times in IRC now but the people that 
probably know the answer aren't available.


There are python-novaclient patches that are adding new CLIs to the v2 
(v1_1) and v3 shells, but now that we have the v2.1 API (v2 on v3) why 
do we still have a v3 shell in the client?  Are there plans to remove that?


I don't really care either way, but need to know for code reviews.

One example: [1]

[1] https://review.openstack.org/#/c/108942/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] What's holding nova development back?

2014-09-17 Thread Matt Riedemann



On 9/16/2014 1:01 PM, Joe Gordon wrote:


On Sep 15, 2014 8:31 PM, Jay Pipes jaypi...@gmail.com
mailto:jaypi...@gmail.com wrote:
 
  On 09/15/2014 08:07 PM, Jeremy Stanley wrote:
 
  On 2014-09-15 17:59:10 -0400 (-0400), Jay Pipes wrote:
  [...]
 
  Sometimes it's pretty hard to determine whether something in the
  E-R check page is due to something in the infra scripts, some
  transient issue in the upstream CI platform (or part of it), or
  actually a bug in one or more of the OpenStack projects.
 
  [...]
 
  Sounds like an NP-complete problem, but if you manage to solve it
  let me know and I'll turn it into the first line of triage for Infra
  bugs. ;)
 
 
  LOL, thanks for making me take the last hour reading Wikipedia pages
about computational complexity theory! :P
 
  No, in all seriousness, I wasn't actually asking anyone to boil the
ocean, mathematically. I think doing a couple of things would help: making the
categorization more obvious (a UI thing, really) and doing some
(hopefully simple?) inspection of a control group of patches that we
know do not introduce any code changes themselves, comparing it to
another group of patches that we know *do* introduce code changes to
Nova, and then seeing if there is a set of E-R issues that consistently
appears in *both* groups. That set of E-R issues has a higher likelihood
of not being due to Nova, right?

We use launchpad's affected projects listings on the elastic recheck
page to say what may be causing the bug.  Tagging projects to bugs is a
manual process, but one that works pretty well.

UI: The elastic recheck UI definitely could use some improvements. I am
very poor at writing UIs, so patches welcome!

 
  OK, so perhaps it's not the most scientific or well-thought out plan,
but hey, it's a spark for thought... ;)
 
  Best,
  -jay
 
 
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
mailto:OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



I'm not great with UIs either, but would a dropdown of the affected 
projects be helpful? Then people could filter on their favorite 
project while the page stays sorted by top offenders as it is today.


There are times when the top bugs are infra issues (pip timeouts for 
example) so you have to scroll a ways before finding something for your 
project (nova isn't the only one).


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [release] client release deadline - Sept 18th

2014-09-17 Thread Matt Riedemann



On 9/15/2014 12:57 PM, Matt Riedemann wrote:



On 9/10/2014 11:08 AM, Kyle Mestery wrote:

On Wed, Sep 10, 2014 at 10:01 AM, Matt Riedemann
mrie...@linux.vnet.ibm.com wrote:



On 9/9/2014 4:19 PM, Sean Dague wrote:


As we try to stabilize OpenStack Juno, many server projects need to get
out final client releases that expose new features of their servers.
While this seems like not a big deal, each of these clients releases
ends up having possibly destabilizing impacts on the OpenStack whole
(as
the clients do double duty in cross communicating between services).

As such in the release meeting today it was agreed clients should have
their final release by Sept 18th. We'll start applying the dependency
freeze to oslo and clients shortly after that, all other requirements
should be frozen at this point unless there is a high priority bug
around them.

 -Sean



Thanks for bringing this up. We do our own packaging and need time
for legal
clearances and having the final client releases done in a reasonable
time
before rc1 is helpful.  I've been pinging a few projects to do a final
client release relatively soon.  python-neutronclient has a release this
week and I think John was planning a python-cinderclient release this
week
also.


Just a slight correction: python-neutronclient will have a final
release once the L3 HA CLI changes land [1].

Thanks,
Kyle

[1] https://review.openstack.org/#/c/108378/


--

Thanks,

Matt Riedemann



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



python-cinderclient 1.1.0 was released on Saturday:

https://pypi.python.org/pypi/python-cinderclient/1.1.0



python-novaclient 2.19.0 was released yesterday [1].

List of changes:

mriedem@ubuntu:~/git/python-novaclient$ git log 2.18.1..2.19.0 --oneline 
--no-merges

cd56622 Stop using intersphinx
d96f13d delete python bytecode before every test run
4bd0c38 quota delete tenant_id parameter should be required
3d68063 Don't display duplicated security groups
2a1c07e Updated from global requirements
319b61a Fix test mistake with requests-mock
392148c Use oslo.utils
e871bd2 Use Token fixtures from keystoneclient
aa30c13 Update requirements.txt to include keystoneclient
bcc009a Updated from global requirements
f0beb29 Updated from global requirements
cc4f3df Enhance network-list to allow --fields
fe95fe4 Adding Nova Client support for auto find host APIv2
b3da3eb Adding Nova Client support for auto find host APIv3
3fa04e6 Add filtering by service to hosts list command
c204613 Quickstart (README) doc should refer to nova
9758ffc Updated from global requirements
53be1f4 Fix listing of flavor-list (V1_1) to display swap value
db6d678 Use adapter from keystoneclient
3955440 Fix the return code of the command delete
c55383f Fix variable error for nova --service-type
caf9f79 Convert to requests-mock
33058cb Enable several checks and do not check docs/source/conf.py
abae04a Updated from global requirements
68f357d Enable check for E131
b6afd59 Add support for security-group-default-rules
ad9a14a Fix rxtx_factor name for creating a flavor
ff4af92 Allow selecting the network for doing the ssh with
9ce03a9 fix host resource repr to use 'host' attribute
4d25867 Enable H233
60d1283 Don't log sensitive auth data
d51b546 Enabled hacking checks H305 and H307
8ec2a29 Edits on help strings
c59a0c8 Add support for new fields in network create
67585ab Add version-list for listing REST API versions
0ff4afc Description is mandatory parameter when creating Security Group
6ee0b28 Filter endpoints by region whenever possible
32d13a6 Add missing parameters for server rebuild
f10d8b6 Fixes typo in error message of do_network_create
9f1ee12 Mention keystoneclient.Session use in docs
58cdcab Fix booting from volume when using api v3
52c5ad2 Sync apiclient from oslo-incubator
2acfb9b Convert server tests to httpretty
762bf69 Adding cornercases for set_metadata
313a2f8 Add way to specify key-name from environ

[1] https://pypi.python.org/pypi/python-novaclient/2.19.0

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-17 Thread Matt Riedemann



On 9/17/2014 7:59 PM, Ian Wienand wrote:

On 09/18/2014 09:49 AM, Clark Boylan wrote:

Recent sampling of test run times shows that our tempest jobs run
against clouds using PostgreSQL are significantly slower than jobs run
against clouds using MySQL.


FYI There is a possibly relevant review out for max_connections limits
[1], although it seems to have some issues with shmem usage

-i

[1] https://review.openstack.org/#/c/121952/

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



That's a backport of a fix from master where we were hitting fatal 
errors due to too many DB connections, which was brought on by the 
changes to cinder and glance to run as many workers as there were CPUs 
available.  So I don't think it's likely a factor here...


The errors pointed out in another part of the thread have been around 
for a while; I think we're hitting unique constraints because of the 
negative tests, so they are expected.


We should also note that the postgresql jobs run with the nova metadata 
API service, I'm not sure how much of a factor that would have here.


Is there anything else unique about those jobs from the MySQL ones?

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-18 Thread Matt Riedemann



On 9/18/2014 5:49 AM, Sean Dague wrote:

On 09/17/2014 11:50 PM, Clark Boylan wrote:

On Wed, Sep 17, 2014, at 06:48 PM, Clark Boylan wrote:

On Wed, Sep 17, 2014, at 06:37 PM, Matt Riedemann wrote:



On 9/17/2014 7:59 PM, Ian Wienand wrote:

On 09/18/2014 09:49 AM, Clark Boylan wrote:

Recent sampling of test run times shows that our tempest jobs run
against clouds using PostgreSQL are significantly slower than jobs run
against clouds using MySQL.


FYI There is a possibly relevant review out for max_connections limits
[1], although it seems to have some issues with shmem usage

-i

[1] https://review.openstack.org/#/c/121952/

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



That's a backport of a fix from master where we were hitting fatal
errors due to too many DB connections which was brought on by the
changes to cinder and glance to run as many workers as there were CPUs
available.  So I don't think it probably plays here...

The errors pointed out in another part of the thread have been around
for awhile, I think they are due to negative tests where we're hitting
unique constraints because of the negative tests, so they are expected.

We should also note that the postgresql jobs run with the nova metadata
API service, I'm not sure how much of a factor that would have here.

Is there anything else unique about those jobs from the MySQL ones?


Good question. There are apparently other differences. The postgres job
runs Keystone under eventlet instead of via apache mod_wsgi. It also
sets FORCE_CONFIGDRIVE=False instead of always. And the final difference
I can find is the one you point out, nova api metadata service is run as
an independent thing.

Could these things be related? It would be relatively simple to push a
change or two to devstack-gate to test this but there are enough options
here that I probably won't do that until we think at least one of these
options is at fault.

I am starting to feel bad that I picked on PostgreSQL and completely
forgot that there were other items in play here. I went ahead and
uploaded [0] to run all devstack jobs without keystone wsgi services
(eventlet) and [1] to run all devstack job with keystone wsgi services
and the initial results are pretty telling.

It appears that keystone eventlet is the source of the slowness in this
job. With keystone eventlet all of the devstack jobs are slower and with
keystone wsgi all of the jobs are quicker. Probably need to collect a
bit more data but this doesn't look good for keystone eventlet.

Thank you Matt for pointing me in this direction.

[0] https://review.openstack.org/#/c/122299/
[1] https://review.openstack.org/#/c/122300/


Don't feel bad. :)

The point that Clark highlights here is a good one. There is an
assumption that once someone creates a job in infra, the magic elves are
responsible for it.

But there are no magic elves. So jobs like this need sponsors.

Maybe the right thing to do is not conflate this configuration and put
an eventlet version of the keystone job only on keystone (because the
keystone team was the one that proposed having a config like that, but
it's so far away from their project they aren't ever noticing when it's
regressing).

Same issue with the metadata server split. That's really only a thing
Nova cares about. It shouldn't impact anyone else.

-Sean



Neutron cares about the nova metadata API service, right?

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] PostgreSQL jobs slow in the gate

2014-09-18 Thread Matt Riedemann



On 9/18/2014 12:35 AM, Morgan Fainberg wrote:

-Original Message-
From: Dean Troyer dtro...@gmail.com
Reply: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Date: September 17, 2014 at 21:21:47
To: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Subject:  Re: [openstack-dev] PostgreSQL jobs slow in the gate



Clark Boylan wrote:


It appears that keystone eventlet is the source of the slowness in this
job. With keystone eventlet all of the devstack jobs are slower and with
keystone wsgi all of the jobs are quicker. Probably need to collect a
bit more data but this doesn't look good for keystone eventlet.




On Wed, Sep 17, 2014 at 11:02 PM, Morgan Fainberg 
morgan.fainb...@gmail.com wrote:


I've kicked off a test[1] as well to check into some tunable options
(eventlet workers) for improving keystone eventlet performance. I'll circle
back with the infra team once we have a little more data on both fronts.
The Keystone team will use this data to figure out the best way to approach
this issue.



Brant submitted https://review.openstack.org/#/c/121384/ to up the Keystone
workers when API_WORKERS is set.

I submitted https://review.openstack.org/#/c/122013/ to set a scaling
default for API_WORKERS based on the available CPUs ((nproc+1)/2). There
is a summary in that commit message of the current reviews addressing the
workers in various projects.

I think it has become clear that DevStack needs to set a default for most
services that are currently either too big (nproc) or too small (Keystone
at 1). Of course, moving things to mod_wsgi moots all of that, but it'll
be a while before everything moves.

dt

--

Dean Troyer
dtro...@gmail.com


Dean,

We should probably look at increasing the default number of workers as well in 
the Keystone configuration (rather than just devstack). It looks like, with 
limited datasets, we are seeing a real improvement with Keystone and 4 workers 
(from my previously mentioned quick test). Thanks for the extra data points. 
This helps to confirm the issue is Keystone under eventlet.

Cheers,
Morgan

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



I pointed this out in Brant's devstack patch, but we had a product team 
internally bring up this same point in Icehouse: they were really 
limited due to the eventlet workers issue in Keystone, and once we 
provided the option (backported) it increased their throughput by 20%. 
We've been running with that in our internal Tempest runs (setting 
workers equal to the number of CPUs / 2) and so far so good.
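
For reference, the scaling default Dean mentions works out to roughly the
following (a sketch of the heuristic only, not the actual devstack code):

    import multiprocessing

    def default_api_workers():
        # (nproc + 1) / 2, but never fewer than one worker.
        return max((multiprocessing.cpu_count() + 1) // 2, 1)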


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] 2 weeks in the bug tracker

2014-09-21 Thread Matt Riedemann
 reviewed/approved, I'm not 
sure, but my point is I agree with making it socially acceptable to 
rewrite the commit message as part of the review.






I'm sure there are other thoughts, but my brain is running out of steam.
These were the things that popped to the top of my head. It's definitely
been really interesting to spend this much time with the tracker to
build a bigger picture of this feedback channel we have from our users.
Hopefully other folks found some of this handy.

-Sean



Agree with everything else said here.  It's also helpful that you're 
directly pinging people in IRC for action on things, e.g. "what's up 
with this bug (that you opened)?", or pointing out things that are ready 
for approval (I've been doing this more lately in IRC on what I consider 
trivial reviews that I've already +2'ed).


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] expected behaviour of _report_state() on rabbitmq failover

2014-09-21 Thread Matt Riedemann



On 9/10/2014 3:33 PM, Chris Friesen wrote:

On 09/10/2014 02:13 PM, Chris Friesen wrote:


As it stands, it seems that waiting for the RPC call to time out
blocks _report_state() from running again in report_interval seconds,
which delays the service update until the RPC timeout period expires.


Just noticed something...

In the case of _report_state(), does it really make sense to wait 60
seconds for RPC timeout when we're going to send a new service update in
10 seconds anyway?

More generally, the RPC timeout on the service_update() call should be
less than or equal to service.report_interval for the service.

Chris

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Maybe completely unrelated, but FYI while you're looking at this code path:

https://bugs.launchpad.net/neutron/+bug/1357055
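
On the timeout point itself, a hedged sketch of what capping the RPC timeout
at the report interval could look like using the usual oslo.messaging client
pattern (self.client here is assumed to be an oslo.messaging RPCClient; the
actual plumbing in nova differs):

    def service_update(self, context, service, report_interval):
        # Cap the RPC timeout so a dead broker can't stall the next heartbeat
        # for a full 60 seconds.
        cctxt = self.client.prepare(timeout=report_interval)
        return cctxt.call(context, 'service_update', service=service)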

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] expected behaviour of _report_state() on rabbitmq failover

2014-09-21 Thread Matt Riedemann



On 9/21/2014 7:56 PM, Matt Riedemann wrote:



On 9/10/2014 3:33 PM, Chris Friesen wrote:

On 09/10/2014 02:13 PM, Chris Friesen wrote:


As it stands, it seems that waiting for the RPC call to time out
blocks _report_state() from running again in report_interval seconds,
which delays the service update until the RPC timeout period expires.


Just noticed something...

In the case of _report_state(), does it really make sense to wait 60
seconds for RPC timeout when we're going to send a new service update in
10 seconds anyway?

More generally, the RPC timeout on the service_update() call should be
less than or equal to service.report_interval for the service.

Chris

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Maybe completely unrelated, but FYI while you're looking at this code path:

https://bugs.launchpad.net/neutron/+bug/1357055



Oops, completely wrong bug.  Here is the correct one:

https://bugs.launchpad.net/nova/+bug/1331537

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Neutron] Third-party testing

2013-12-09 Thread Matt Riedemann



On Sunday, December 08, 2013 11:32:50 PM, Yoshihiro Kaneko wrote:

Hi Neutron team,

I'm working on building Third-party testing for Neutron Ryu plugin.
I intend to use Jenkins and gerrit-trigger plugin.

It is required that third-party testing provides a verify vote for
all changes to a plugin/driver's code, and all code submissions
by the jenkins user.
https://wiki.openstack.org/wiki/Neutron_Plugins_and_Drivers#Testing_Requirements

For these requirements, what kind of filter should I set for the
trigger?
It is easy to set a file path of the plugin/driver:
   project: plain:neutron
   branch:  plain:master
   file:path:neutron/plugins/ryu/**
However, this is not enough because it lacks dependencies.
It is difficult to judge whether a patchset affects the plugin/driver.
In addition, gerrit trigger has a file path filter, but there is no
patchset owner filter, so it is not possible to set a trigger for a
patchset which is submitted by the jenkins user.

Can third-party testing execute tests for all patchsets, including
those which may not affect the plugin/driver?

Thanks,
Kaneko

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



I can't speak for the Neutron team, but in Nova the requirement is to 
run all patches through the vendor plugin third party CI, not just 
vendor-specific patches.


https://wiki.openstack.org/wiki/HypervisorSupportMatrix/DeprecationPlan

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] ExtraSpecs format bug

2013-12-10 Thread Matt Riedemann



On Thursday, December 05, 2013 1:38:45 PM, Costantino, Leandro I wrote:

Hi!

I am working on the Horizon 'side' of
https://bugs.launchpad.net/nova/+bug/1256119 , where basically
if you create an ExtraSpec key containing '/', then it cannot be
deleted anymore.

Is there any restriction about this?
Shall the format of the keys be limited to some specific format or any
combination should be valid?

For instance, heat uses this pattern for stack names:
[a-zA-Z][a-zA-Z0-9_.-]* .


Regards
Leandro




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



My response when this was brought up in IRC:

(3:12:11 PM) mriedem: lcostantino: looks like flavorid is checked 
against a regex in the code
(3:12:11 PM) mriedem: 
https://github.com/openstack/nova/blob/master/nova/compute/flavors.py#L57

(3:12:18 PM) mriedem: would think you could do the same for extra specs

I don't see any specific rules for extra_specs in the API docs or 
validation happening in the code.
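
If we did add validation, something along these lines is roughly what I'd
have in mind; a sketch only (the pattern is adapted from heat's stack-name
rule mentioned above and is not what nova currently enforces):

    import re

    # Illustrative pattern: ':' is allowed since scoped keys are common in
    # extra specs, but '/' is not.
    VALID_EXTRA_SPEC_KEY = re.compile(r'^[a-zA-Z][a-zA-Z0-9_.:-]*$')

    def validate_extra_spec_key(key):
        if not VALID_EXTRA_SPEC_KEY.match(key):
            raise ValueError('Invalid extra spec key: %r' % key)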


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Cells] compute api and objects

2013-12-10 Thread Matt Riedemann



On Monday, December 09, 2013 4:58:31 PM, Sam Morrison wrote:

Hi,

I’m trying to fix up some cells issues related to objects. Do all compute api 
methods take objects now?
cells is still sending DB objects for most methods (except start and stop) and 
I know there are more than that.

Eg. I know lock/unlock, shelve/unshelve take objects, I assume there are others 
if not all methods now?

Cheers,
Sam



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



I don't know the answer about cells, but posting a few bugs you've 
opened on the topic:


https://bugs.launchpad.net/nova/+bug/1251043
https://bugs.launchpad.net/nova/+bug/1257168

As for "Do all compute api methods take objects now?", I believe the 
answer is 'no'.  There are still some objects blueprints in the works.  
Here is a big one:


https://blueprints.launchpad.net/nova/+spec/compute-manager-objects

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova]

2013-12-10 Thread Matt Riedemann



On Tuesday, December 10, 2013 4:17:45 PM, Maithem Munshed 71510 wrote:

Hello,

I was wondering: what is the reason behind having nova audit resources
as opposed to using usage stats directly from what is reported by the
compute driver? The available resources reported from the audit can be
incorrect in some cases. Also, in many cases the reported usage stats
from the driver are correct, so auditing periodically while having the
usage stats from the driver is inefficient. One of the cases which results
in an incorrect audit is: existing VMs on a hypervisor that were created
prior to deploying nova. As a result, the scheduler will see more
available resources than what is actually available. I am aware that
Nova shouldn't be managing VMs that it hasn't created, but the
reported available resources should be as accurate as possible.

I have proposed the following blueprint to provide the option of using
usage stats directly from the driver :

https://blueprints.launchpad.net/nova/+spec/use-driver-usage-stats

I would like to know what your thoughts are and would appreciate feedback.

Regards,

Maithem



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


One (big) problem is the virt drivers don't follow a standard format 
for the usage diagnostics, which has been discussed before in the 
mailing list [1].


There is a nova blueprint [2] for standard auditing formats like in 
ceilometer which might be related to what you're looking for.


[1] 
http://lists.openstack.org/pipermail/openstack-dev/2013-October/016385.html
[2] 
https://blueprints.launchpad.net/nova/+spec/support-standard-audit-formats


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Sphinx 1.2 incompatibility (failing -docs jobs)

2013-12-10 Thread Matt Riedemann



On 12/10/2013 6:32 PM, Angus Salkeld wrote:

On 10/12/13 14:57 -0800, James E. Blair wrote:

Hi,

Sphinx 1.2 was just released and it is incompatible with distutils in
python 2.7.  See these links for more info:

 
https://bitbucket.org/birkenfeld/sphinx/pull-request/193/builddoc-shouldnt-fail-on-unicode-paths/diff

 http://bugs.python.org/issue19570


Is there a bug number that we can reference in recheck/reverifies?


The bug is 1259511.





This has caused all -docs jobs to fail.  This morning we merged a change
to openstack/requirements to pin Sphinx to version 1.2:

 https://review.openstack.org/#/c/61164/

Sergey Lukjanov, Clark Boylan, and Jeremy Stanley finished up the
automatic requirements proposal job (Thanks!), and so now updates have
been automatically proposed to all projects that subscribe:

 https://review.openstack.org/#/q/topic:openstack/requirements,n,z

Once those changes merge, -docs jobs for affected projects should start
working again.

Note that requirements updates for stable branches are proceeding
separately; you can track their progress here:

 https://review.openstack.org/#/q/I0487b4eca8f2755b882689289e3cdf429729b1fb,n,z


-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] State of the Gate - Dec 12

2013-12-12 Thread Matt Riedemann



On 12/12/2013 7:20 AM, Sean Dague wrote:

Current Gate Length: 12hrs*, 41 deep

(top of gate entered 12hrs ago)

It's been an *exciting* week this week. For people not paying attention
we had 2 external events which made things terrible earlier in the week.

==
Event 1: sphinx 1.2 complete breakage - MOSTLY RESOLVED
==

It turns out sphinx 1.2 + distutils (which pbr's magic calls through) means
total sadness. The fix for this was a requirements pin to sphinx < 1.2,
and until a project has taken that they will fail in the gate.

It also turns out that tox installs pre-release software by default (a
terrible default behavior), so you also need a tox.ini change like this
- https://github.com/openstack/nova/blob/master/tox.ini#L9 otherwise
local users will install things like sphinx 1.2b3. They will also break
in other ways.

Not all projects have merged this. If you are a project that hasn't,
please don't send any other jobs to the gate until you do. A lot of
delay was added to the gate yesterday by Glance patches being pushed to
the gate before their doc jobs were done.

==
Event 2: apt.puppetlabs.com outage - RESOLVED
==

We use that apt repository to set up the devstack nodes in nodepool with
puppet. We were triggering an issue with grenade where its apt-get
calls were failing, because it does apt-get update once to make sure
life is good. This only triggered in grenade (not other devstack runs)
because we do set -o errexit aggressively.

A fix in grenade to ignore these errors was merged yesterday afternoon
(the purple line at http://status.openstack.org/elastic-recheck/ - you can
see where it showed up).

==
Top Gate Bugs
==

We normally do this as a list, and you can see the whole list here -
http://status.openstack.org/elastic-recheck/ (now sorted by number of
FAILURES in the last 2 weeks)

That being said, our biggest race bug is currently this one -
https://bugs.launchpad.net/tempest/+bug/1253896 - and if you want to
merge patches, fixing that one bug will be huge.

Basically, you can't ssh into guests that get created. That's sort of a
fundamental property of a cloud. It shows up more frequently on neutron
jobs, possibly due to actually testing the metadata server path. There
have been many attempts at retry logic for this; we actually retry for
196 seconds to get in and only fail once we can't get in, so waiting
isn't helping. It doesn't seem like the env is under that much load.

Until we resolve this, life will not be good in landing patches.

-Sean



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



There have been a few threads [1][2] on gate failures and the process 
around what happens when we go about identifying, tracking and fixing them.


I couldn't find anything outside of the mailing list to keep a record of 
this, so I started a page here [3].


Feel free to contribute so we can point people to how they can easily 
help in working these faster.


[1] 
http://lists.openstack.org/pipermail/openstack-dev/2013-November/020280.html
[2] 
http://lists.openstack.org/pipermail/openstack-dev/2013-November/019931.html

[3] https://wiki.openstack.org/wiki/ElasticRecheck

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] cross-project bug fixes

2013-12-12 Thread Matt Riedemann



On Thursday, December 12, 2013 10:29:11 AM, Russell Bryant wrote:

On 12/12/2013 11:21 AM, Hamilton, Peter A. wrote:

I am in the process of getting a bug fix approved for a bug found in 
openstack.common:

https://review.openstack.org/#/c/60500/

The bug is present in both nova and cinder. The above patch is under nova; do I 
need to submit a separate cinder patch covering the same fix, or does the 
shared nature of the openstack.common module allow for updates across projects 
without needing separate project patches?


The part under nova/openstack/common needs to be fixed in the
oslo-incubator git repository first.  From there you sync the fix into
nova and cinder.



Peter,

FYI: https://wiki.openstack.org/wiki/Oslo#Syncing_Code_from_Incubator

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Cinder] suggestion to a new driver

2013-12-14 Thread Matt Riedemann



On Tuesday, December 10, 2013 2:59:19 AM, Ronen Angluster wrote:

Hello all!

we're developing a new storage appliance and, at the request of one of our
customers, would like to build a cinder driver.
I kept digging into the documentation for the past 2 weeks and could not
find anything that described the code-level API, i.e. nothing
describes what each function should receive and what it should return.
Is there a document that describes it that I missed? If not, who can
provide that missing information?

Ronen


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


I don't work on cinder, but I'm guessing this will get you started:

http://docs.openstack.org/api/openstack-block-storage/2.0/content/
https://github.com/openstack/cinder/blob/master/doc/source/devref/drivers.rst
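
To give a feel for the shape of the interface, here is a very rough skeleton;
the method names come from the base VolumeDriver class, but treat this as a
sketch and check cinder/volume/driver.py for the real contract:

    from cinder.volume import driver

    class MyApplianceDriver(driver.VolumeDriver):
        # Illustrative skeleton only, not a working driver.

        def create_volume(self, volume):
            # 'volume' carries the requested 'size' in GB among other fields.
            pass

        def delete_volume(self, volume):
            pass

        def initialize_connection(self, volume, connector):
            # Return a dict describing how the compute host attaches the volume.
            return {'driver_volume_type': 'iscsi', 'data': {}}

        def get_volume_stats(self, refresh=False):
            # The scheduler uses this to decide where to place volumes.
            return {'volume_backend_name': 'my_appliance',
                    'vendor_name': 'Example',
                    'driver_version': '0.1',
                    'storage_protocol': 'iSCSI',
                    'total_capacity_gb': 'unknown',
                    'free_capacity_gb': 'unknown'}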

I'm also not sure where the cinder team is with 3rd party CI 
requirements, but you'll want to at least read this also:


http://lists.openstack.org/pipermail/openstack-dev/2013-July/012557.html

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] time for a new major network rpc api version?

2013-12-15 Thread Matt Riedemann
I was looking at this review [1] and, grepping through the code on that 
method, found a lot of old code [2][3] that needs to be removed but 
can't be until we bump the network API version to 2.0.


I know Russell has started bumping the compute RPC API version, so would 
something like this for the network API fall under the same blueprint 
[4]?  I'm not sure if the network API was also in scope for that blueprint.


[1] https://review.openstack.org/#/c/60603/
[2] 
https://github.com/openstack/nova/blob/master/nova/network/floating_ips.py#L511
[3] 
https://github.com/openstack/nova/blob/master/nova/network/manager.py#L399
[4] 
https://blueprints.launchpad.net/nova/+spec/rpc-major-version-updates-icehouse


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][db] Thoughts on making instances.uuid non-nullable?

2013-12-16 Thread Matt Riedemann
I've got a blueprint [1] scheduled for icehouse-3 to add DB2 support to 
Nova. That's blocked by a patch working its way through 
sqlalchemy-migrate to add DB2 support [2] there.


I've held off pushing any nova patches up until the sqlalchemy-migrate 
DB2 support is merged (which is also blocked by 3rd party CI, which is a 
WIP of its own).


Thinking ahead though for nova, one of the main issues with DB2 in the 
migration scripts is DB2 10.5 doesn't support unique constraints over 
nullable columns.  The sqlalchemy-migrate code will instead create a 
unique index, since that's DB2's alternative.  However, since a unique 
index is not a unique constraint, the FK creation fails if the UC 
doesn't exist.


There are a lot of foreign keys in nova based on the instances.uuid 
column [3].  I need to figure out how I'm going to solve the UC problem 
for DB2 in that case.  Here are the options as I see them, looking for 
input on the best way to go.


1. Add a migration to change instances.uuid to non-nullable. Besides the 
obvious con of having yet another migration script, this seems the most 
straightforward. The instance object class already defines the uuid 
field as non-nullable, so it's constrained at the objects layer, just 
not in the DB model.  Plus I don't think we'd ever have a case where 
instance.uuid is null, right?  Seems like a lot of things would break 
down if that happened.  With this option I can build on top of it for 
the DB2 migration support to add the same FKs as the other engines.


2. When I push up the migration script changes for DB2, I make the 
instances.uuid (and any other similar cases) work in the DB2 case only, 
i.e. if the engine is 'ibm_db_sa', then instances.uuid is non-nullable. 
 This could be done in the 160_havana migration script since moving to 
DB2 with nova is going to require a fresh migration anyway (there are 
some other older scripts that I'll have to change to work with migrating 
to DB2).  I don't particularly care for this option since it makes the 
model inconsistent between backends, but the upside is it doesn't 
require a new migration for any other backend, only DB2 - and you'd have 
to run the migrations for DB2 support anyway.
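
For option 1 the migration itself would be tiny; a minimal sketch (illustrative
only, and it assumes sqlalchemy-migrate's changeset extensions are loaded, as
they are when the script runs through nova's migration framework):

    from sqlalchemy import MetaData, Table

    def upgrade(migrate_engine):
        meta = MetaData(bind=migrate_engine)
        instances = Table('instances', meta, autoload=True)
        # Any NULL uuids would have to be backfilled first; the objects layer
        # already treats uuid as non-nullable so there shouldn't be any.
        instances.c.uuid.alter(nullable=False)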


I'm trying to flesh this out early since I could start working on option 
1 at any time if it's the agreed upon solution, but looking for input 
first because I don't want to make assumptions about what everyone 
thinks here.


[1] https://blueprints.launchpad.net/nova/+spec/db2-database
[2] https://review.openstack.org/#/c/55572/
[3] 
https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/migrate_repo/versions/160_havana.py#L1335


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] How do we format/version/deprecate things from notifications?

2013-12-18 Thread Matt Riedemann
The question came up in this patch [1]: how do we deprecate and remove 
keys in the notification payload?  In this case I need to deprecate and 
replace the 'instance_type' key with 'flavor' per the associated blueprint.


[1] https://review.openstack.org/#/c/62430/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] How do we format/version/deprecate things from notifications?

2013-12-18 Thread Matt Riedemann



On 12/18/2013 9:42 AM, Matt Riedemann wrote:

The question came up in this patch [1], how do we deprecate and remove
keys in the notification payload?  In this case I need to deprecate and
replace the 'instance_type' key with 'flavor' per the associated blueprint.

[1] https://review.openstack.org/#/c/62430/



By the way, my thinking is it's handled like a deprecated config option: 
you deprecate it for a release, make sure it's documented in the release 
notes, and then drop it in the next release. Anyone who hasn't 
switched over is broken until they start consuming the new key.
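
In code terms I'd expect the transition period to look something like this (a
sketch, not the actual patch; instance_type_name is just a placeholder):

    # During the deprecation cycle the payload carries both keys; consumers
    # move to 'flavor' and 'instance_type' is dropped in the following release.
    payload = {
        'flavor': instance_type_name,
        'instance_type': instance_type_name,  # deprecated, kept for one release
    }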


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Adding DB migration items to the common review checklist

2013-12-18 Thread Matt Riedemann
I've seen this come up a few times in reviews and was thinking we should 
put something in the general review checklist wiki for it [1].


Basically I have three things I'd like to have in the list for DB 
migrations:


1. Unique constraints should be named. Different DB engines and 
SQLAlchemy dialects automatically name the constraint their own way, 
which can be troublesome for universal migrations. We should avoid this 
by enforcing that UCs are named when they are created. This means not 
using the unique=True arg in UniqueConstraint if the name arg isn't 
provided.


2. Foreign keys should be named for the same reasons in #1.

3. Foreign keys shouldn't be created against nullable columns. Some DB 
engines don't allow unique constraints over nullable columns and if you 
can't create the unique constraint you can't create the foreign key, so 
we should avoid this. If you need the FK, then the pre-req is to make 
the target column non-nullable. Think of the instances.uuid column in 
nova for example.
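
To make the above concrete, a minimal sketch of what 1-3 look like in a
sqlalchemy-migrate script (table and constraint names here are made up for
illustration, not taken from an actual nova migration):

    from sqlalchemy import (Column, ForeignKeyConstraint, Integer, MetaData,
                            String, Table, UniqueConstraint)

    def upgrade(migrate_engine):
        meta = MetaData(bind=migrate_engine)

        parents = Table(
            'parents', meta,
            Column('id', Integer, primary_key=True, nullable=False),
            # 3: the FK target column is non-nullable.
            Column('uuid', String(36), nullable=False),
            # 1: the unique constraint is named instead of using unique=True.
            UniqueConstraint('uuid', name='uniq_parents0uuid'),
            mysql_engine='InnoDB')
        parents.create()

        children = Table(
            'children', meta,
            Column('id', Integer, primary_key=True, nullable=False),
            Column('parent_uuid', String(36), nullable=False),
            # 2: the foreign key is named explicitly.
            ForeignKeyConstraint(['parent_uuid'], ['parents.uuid'],
                                 name='fk_children_parent_uuid'),
            mysql_engine='InnoDB')
        children.create()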


Unless anyone has a strong objection to this, I'll update the review 
checklist wiki with these items.


[1] https://wiki.openstack.org/wiki/ReviewChecklist

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Adding DB migration items to the common review checklist

2013-12-18 Thread Matt Riedemann



On 12/18/2013 1:14 PM, Brant Knudson wrote:

Matt -

Could a test be added that goes through the models and checks these
things? Other projects could use this too.

Here's an example of a test that checks if the tables are all InnoDB:
http://git.openstack.org/cgit/openstack/nova/tree/nova/tests/db/test_migrations.py?id=6e455cd97f04bf26bbe022be17c57e089cf502f4#n430

- Brant



Brant, I could see automating #3, since you could trace the FK to the UC 
and then check whether the columns in the UC are nullable.  I'm not 
sure how easy it would be to generically test 1 and 2, because we don't 
have strict naming conventions on UC/FK names as far as I know, but I 
guess we could start enforcing that with a test and whitelist any 
existing UC/FK names that don't fit the new convention.
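
As a starting point for #3, something like this is roughly what I had in
mind (a sketch only; it assumes the model metadata hangs off
nova.db.sqlalchemy.models.BASE and is not an actual nova test):

    from nova.db.sqlalchemy import models

    def test_foreign_keys_reference_non_nullable_columns():
        for table in models.BASE.metadata.sorted_tables:
            for fk in table.foreign_keys:
                referenced = fk.column  # the Column the FK points at
                assert not referenced.nullable, (
                    '%s.%s is a foreign key target but is nullable' %
                    (referenced.table.name, referenced.name))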


Thoughts on that or other ideas how to automate checking for this?





On Wed, Dec 18, 2013 at 11:27 AM, Matt Riedemann
mrie...@linux.vnet.ibm.com mailto:mrie...@linux.vnet.ibm.com wrote:

I've seen this come up a few times in reviews and was thinking we
should put something in the general review checklist wiki for it [1].

Basically I have three things I'd like to have in the list for DB
migrations:

1. Unique constraints should be named. Different DB engines and
SQLAlchemy dialects automatically name the constraint their own way,
which can be troublesome for universal migrations. We should avoid
this by enforcing that UCs are named when they are created. This
means not using the unique=True arg in UniqueConstraint if the name
arg isn't provided.

2. Foreign keys should be named for the same reasons in #1.

3. Foreign keys shouldn't be created against nullable columns. Some
DB engines don't allow unique constraints over nullable columns and
if you can't create the unique constraint you can't create the
foreign key, so we should avoid this. If you need the FK, then the
pre-req is to make the target column non-nullable. Think of the
instances.uuid column in nova for example.

Unless anyone has a strong objection to this, I'll update the review
checklist wiki with these items.

[1] https://wiki.openstack.org/__wiki/ReviewChecklist
https://wiki.openstack.org/wiki/ReviewChecklist

--

Thanks,

Matt Riedemann


_
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.__org
mailto:OpenStack-dev@lists.openstack.org
http://lists.openstack.org/__cgi-bin/mailman/listinfo/__openstack-dev 
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Adding DB migration items to the common review checklist

2013-12-18 Thread Matt Riedemann



On 12/18/2013 2:11 PM, Dan Prince wrote:



- Original Message -

From: Matt Riedemann mrie...@linux.vnet.ibm.com
To: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Sent: Wednesday, December 18, 2013 12:27:49 PM
Subject: [openstack-dev] Adding DB migration items to the common review 
checklist

I've seen this come up a few times in reviews and was thinking we should
put something in the general review checklist wiki for it [1].

Basically I have three things I'd like to have in the list for DB
migrations:

1. Unique constraints should be named. Different DB engines and
SQLAlchemy dialects automatically name the constraint their own way,
which can be troublesome for universal migrations. We should avoid this
by enforcing that UCs are named when they are created. This means not
using the unique=True arg in UniqueConstraint if the name arg isn't
provided.

2. Foreign keys should be named for the same reasons in #1.

3. Foreign keys shouldn't be created against nullable columns. Some DB
engines don't allow unique constraints over nullable columns and if you
can't create the unique constraint you can't create the foreign key, so
we should avoid this. If you need the FK, then the pre-req is to make
the target column non-nullable. Think of the instances.uuid column in
nova for example.

Unless anyone has a strong objection to this, I'll update the review
checklist wiki with these items.


No objection to these.

One possible addition would be to make sure that migrations stand on their own 
as much as possible. Code sharing, while good in many cases, can bite you in DB 
migrations because fixing a bug in the shared code may change the behavior of 
an old (released) migration. So by sharing migration code it can then become 
easier to break upgrade paths down the road. If we make some exceptions to 
this rule with nova.db.sqlalchemy we need to be very careful that we don't 
change the behavior in those functions. Automated tests help here too.

Dan


Not sure if this is pure coincidence, but case in point:

https://review.openstack.org/#/c/62965/





[1] https://wiki.openstack.org/wiki/ReviewChecklist

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] VM diagnostics - V3 proposal

2013-12-19 Thread Matt Riedemann



On Thursday, December 19, 2013 8:49:13 AM, Vladik Romanovsky wrote:

Or

ceilometer meter-list -q resource_id='vm_uuid'

- Original Message -

From: Daniel P. Berrange berra...@redhat.com
To: John Garbutt j...@johngarbutt.com
Cc: OpenStack Development Mailing List (not for usage questions) 
openstack-dev@lists.openstack.org
Sent: Thursday, 19 December, 2013 9:34:02 AM
Subject: Re: [openstack-dev] [nova] VM diagnostics - V3 proposal

On Thu, Dec 19, 2013 at 02:27:40PM +, John Garbutt wrote:

On 16 December 2013 15:50, Daniel P. Berrange berra...@redhat.com wrote:

On Mon, Dec 16, 2013 at 03:37:39PM +, John Garbutt wrote:

On 16 December 2013 15:25, Daniel P. Berrange berra...@redhat.com
wrote:

On Mon, Dec 16, 2013 at 06:58:24AM -0800, Gary Kotton wrote:

I'd like to propose the following for the V3 API (we will not touch
V2
in case operators have applications that are written against this –
this
may be the case for libvirt or xen. The VMware API support was added
in I1):

  1.  We formalize the data that is returned by the API [1]


Before we debate what standard data should be returned we need
detail of exactly what info the current 3 virt drivers return.
IMHO it would be better if we did this all in the existing wiki
page associated with the blueprint, rather than etherpad, so it
serves as a permanent historical record for the blueprint design.


+1


While we're doing this I think we should also consider whether
the 'get_diagnostics' API is fit for purpose more generally.
eg currently it is restricted to administrators. Some, if
not all, of the data libvirt returns is relevant to the owner
of the VM but they can not get at it.


Ceilometer covers that ground, we should ask them about this API.


If we consider what is potentially in scope for ceilometer and
subtract that from what the libvirt get_diagnostics impl currently
returns, you pretty much end up with the empty set. This might cause
us to question if 'get_diagnostics' should exist at all from the
POV of the libvirt driver's impl. Perhaps vmware/xen return data
that is out of scope for ceilometer ?


Hmm, a good point.


So perhaps I'm just being dumb, but I deployed ceilometer and could
not figure out how to get it to print out the stats for a single
VM from its CLI ? eg, can someone show me a command line invocation
for ceilometer that displays CPU, memory, disk and network I/O stats
in one go ?


Daniel
--
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


I just wanted to point out, for anyone that hasn't reviewed it yet, that 
Gary's latest design wiki [1] is quite a departure from his original 
set of patches for this blueprint, which was pretty straight-forward, 
just namespacing the diagnostics dict when using the nova v3 API.  The 
keys were all still hypervisor-specific.


The proposal now is much more generic and attempts to translate 
hypervisor-specific keys/data into a common standard versioned set and 
allows for some wiggle room for the drivers to still provide custom 
data if necessary.
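
To make the idea concrete, here is a purely hypothetical sketch of that kind 
of translation layer; the key names below are invented for illustration and 
are not the fields proposed in the wiki:

    # Hypothetical illustration only; real field names live in the wiki proposal.
    STANDARD_VERSION = '1.0'

    # Per-driver mapping from hypervisor-specific keys to standardized keys.
    LIBVIRT_KEY_MAP = {
        'cpu0_time': 'cpu_details.0.time',
        'memory': 'memory_details.maximum',
        'memory-actual': 'memory_details.used',
    }


    def translate_diagnostics(raw, key_map):
        """Translate a driver-specific diagnostics dict into a common,
        versioned format, keeping unknown keys as driver-specific extras."""
        standard = {'version': STANDARD_VERSION, 'driver_specific': {}}
        for key, value in raw.items():
            if key in key_map:
                standard[key_map[key]] = value
            else:
                # Wiggle room: anything we can't translate stays available.
                standard['driver_specific'][key] = value
        return standard


    # Example with invented libvirt-style raw output:
    raw = {'cpu0_time': 17300000000, 'memory': 524288, 'vda_read': 262144}
    print(translate_diagnostics(raw, LIBVIRT_KEY_MAP))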


I think this is a better long-term solution but is a lot more work than 
the original blueprint and given there seems to be some feeling of 
"does nova even need this API, can ceilometer provide it instead?" I'd 
like there to be some agreement within nova that this is the right way 
to go before Gary spends a bunch of time on it - and I as the bp 
sponsor spend a bunch of time reviewing it. :)


[1] https://wiki.openstack.org/wiki/Nova_VM_Diagnostics

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] VM diagnostics - V3 proposal

2013-12-20 Thread Matt Riedemann



On Friday, December 20, 2013 3:57:15 AM, Daniel P. Berrange wrote:

On Fri, Dec 20, 2013 at 12:56:47PM +0400, Oleg Gelbukh wrote:

Hi everyone,

I'm sorry for being late to the thread, but what about baremetal driver?
Should it support the get_diagnostics() as well?


Of course, where practical, every driver should aim to support every
method in the virt driver class API.

Regards,
Daniel


Although isn't the baremetal driver moving to ironic, or is there an 
ironic driver moving into nova?  I'm a bit fuzzy on what's going on 
there.  Point is, if we're essentially halting feature development on 
the nova baremetal driver I'd hold off on implementing get_diagnostics 
there for now.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] All I want for Christmas is one more +2 ...

2013-12-23 Thread Matt Riedemann



On 12/12/2013 8:22 AM, Day, Phil wrote:

Hi Cores,

The “Stop, Rescue, and Delete should give guest a chance to shutdown”
change https://review.openstack.org/#/c/35303/ was approved a couple of
days ago, but failed to merge because the RPC version had moved on.
It's rebased and sitting there with one +2 and a bunch of +1s, so it would be
really nice if it could land before it needs another rebase, please.

Thanks

Phil



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Since this is happening to others that are requesting reviews in the 
mailing list, even on patches with several +1s and a +2, and it's way 
after the fact, I'm going to link this:


http://lists.openstack.org/pipermail/openstack-dev/2013-September/015264.html

Maybe we should update the blurb here also to say 'in IRC' to nix any 
confusion about the mailing list.


https://wiki.openstack.org/wiki/ReviewChecklist#Notes_for_Non-Core_Developers

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Turbo-hipster

2014-01-02 Thread Matt Riedemann



On 12/31/2013 3:58 PM, Michael Still wrote:

Hi.

So while turbo hipster is new, I've been reading every failure message
it produces to make sure it's not too badly wrong. There were four
failures posted last night while I slept:

https://review.openstack.org/#/c/64521


This one is a TH bug. We shouldn't be testing stable branches.
bug/1265238 has been filed to track this.

https://review.openstack.org/#/c/61753


This is your review. The failed run's log is
https://ssl.rcbops.com/turbo_hipster/logviewer/?q=/turbo_hipster/results/61/61753/8/check/gate-real-db-upgrade_nova_percona_user_001/1326092/user_001.log
and you can see from the failure message that migrations 152 and 206
took too long.

Migration 152 took 326 seconds, where our historical data of 2,846
test migrations says it should take 222 seconds. Migration 206 took 81
seconds, where we think it should take 56 seconds based on 2,940 test
runs.

Whilst I can't explain why those migrations took too long this time
around, they are certainly exactly the sort of thing TH is meant to
catch. If you think your patch isn't responsible (perhaps the machine
is just being slow or something), you can always retest by leaving a
review comment of 'recheck migrations'. I have done this for you on
this patch.


Michael, is 'recheck migrations' something that is going to be added to 
the wiki for test failures here?


https://wiki.openstack.org/wiki/GerritJenkinsGit#Test_Failures



https://review.openstack.org/#/c/61717


This review also had similar unexplained slowness, but has already
been rechecked by someone else and now passes. I note that the
slowness in both cases was from the same TH worker node, and I will
keep an eye on that node today.

https://review.openstack.org/#/c/56420


This review also had slowness in migration 206, but has been rechecked
by the developer and now passes. It wasn't on the percona-001 worker
that the other two were on, so perhaps this indicates that we need to
relax the timing requirements for migration 206.

Hope this helps,
Michael

On Wed, Jan 1, 2014 at 12:34 AM, Gary Kotton gkot...@vmware.com wrote:

Hi,
It seems that she/he is behaving oddly again. I have posted a patch that
does not have any database changes and it has given me a -1….
Happy new year
Gary

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev







--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] minimum review period for functional changes that break backwards compatibility

2014-01-06 Thread Matt Riedemann



On 1/3/2014 10:30 AM, David Kranz wrote:

On 01/03/2014 08:52 AM, Thierry Carrez wrote:

Tim Bell wrote:

Is there a mechanism to tag changes as being potentially more
appropriate for the more ops-related profiles? I'm thinking that
when someone proposes a change they suspect could have an operations
impact, they could highlight this as being one for particular focus.

How about an OpsImpact tag ?

I think such a tag would help. That would encourage ops to start looking
more regularly into proposed changes by highlighting the few reviews
that are most likely to need their expertise.

We could have that tag post reviews to the -operators ML (in the same
way SecurityImpact posts to the -security ML), which would additionally
reinforce the need for this list as a separate list from the openstack
general list.


While this would be an improvement over the current situation, IMO we
are focused a bit too much here on operators vs others. I think we
need clearer guidelines on what an incompatible change is, and how to
balance "change it to something better" with "don't cause users upgrade
pain". There was a similar issue with API changes a while back and
providing the api stability guidelines really helped people understand
the issue better. Of course, similar to what Sean talked about, having
API tests in tempest that blocked incompatible api changes was probably
even more important.

  -David



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



There is discussion in this thread about "wouldn't it be nice to have a 
tag on commits for changes that impact upgrades?"  There is.


http://lists.openstack.org/pipermail/openstack-dev/2013-October/016619.html

https://wiki.openstack.org/wiki/GitCommitMessages#Including_external_references

Here is an example of a patch going through the gate now with UpgradeImpact:

https://review.openstack.org/#/c/62815/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Bogus -1 scores from turbo hipster

2014-01-07 Thread Matt Riedemann



On 12/30/2013 6:21 AM, Michael Still wrote:

Hi.

The purpose of this email to is apologise for some incorrect -1 review
scores which turbo hipster sent out today. I think it's important when
a third party testing tool is new to not have flakey results as people
learn to trust the tool, so I want to explain what happened here.

Turbo hipster is a system which takes nova code reviews, and runs
database upgrades against them to ensure that we can still upgrade for
users in the wild. It uses real user datasets, and also times
migrations and warns when they are too slow for large deployments. It
started voting on gerrit in the last week.

Turbo hipster uses zuul to learn about reviews in gerrit that it
should test. We run our own zuul instance, which talks to the
openstack.org zuul instance. This then hands out work to our pool of
testing workers. Another thing zuul does is it handles maintaining a
git repository for the workers to clone from.

This is where things went wrong today. For reasons I can't currently
explain, the git repo on our zuul instance ended up in a bad state (it
had a patch merged to master which wasn't in fact merged upstream
yet). As this code is stock zuul from openstack-infra, I have a
concern this might be a bug that other zuul users will see as well.

I've corrected the problem for now, and kicked off a recheck of any
patch with a -1 review score from turbo hipster in the last 24 hours.
I'll talk to the zuul maintainers tomorrow about the git problem and
see what we can learn.

Thanks heaps for your patience.

Michael



How do I interpret the warning and -1 from turbo-hipster on my patch 
here [1] with the logs here [2]?


I'm inclined to just do 'recheck migrations' on this since this patch 
doesn't have anything to do with this -1 as far as I can tell.


[1] https://review.openstack.org/#/c/64725/4/
[2] 
https://ssl.rcbops.com/turbo_hipster/logviewer/?q=/turbo_hipster/results/64/64725/4/check/gate-real-db-upgrade_nova_mysql_user_001/5186e53/user_001.log


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Change I005e752c: Whitelist external netaddr requirement, for bug 1266513, ineffective for me

2014-01-07 Thread Matt Riedemann



On 1/6/2014 8:55 PM, Sean Dague wrote:

On 01/06/2014 09:33 PM, Mike Spreitzer wrote:

I am suffering from bug 1266513, when trying to work on nova.  For
example, on MacOS 10.8.5, I clone nova and then (following the
instructions at https://wiki.openstack.org/wiki/DependsOnOSX) run `cd
nova; python tools/install_venv.py`.  It fails due to PyPI lacking a
sufficiently advanced netaddr.  So I applied patch
https://review.openstack.org/#/c/65019/ to my nova/tox.ini, delete my
nova/.venv, and try again.  It fails again, in just the same way
(including the message "Some externally hosted files were ignored (use
--allow-external to allow)").  Why is this patch not solving my problem?


Because it only fixes it for tox.

tox -epy27

run_tests.sh and install_venv need a whole other set of fixes that need
to go through oslo.

 -Sean



This [1] is the fix for oslo-incubator and run_tests.sh which will be 
synced to nova later.


[1] https://review.openstack.org/#/c/65151/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [elastic-recheck] Thoughts on next steps

2014-01-07 Thread Matt Riedemann



On 1/2/2014 8:29 PM, Sean Dague wrote:

A lot of elastic recheck this fall has been based on the ad hoc needs of
the moment, in between diving down into the race bugs that were
uncovered by it. This week away from it all helped provide a little
perspective on what I think we need to do to call it *done* (i.e.
something akin to a 1.0 even though we are CDing it).

Here is my current thinking on the next major things that should happen.
Opinions welcomed.

(These are roughly in implementation order based on urgency)

= Split of web UI =

The elastic recheck page is becoming a mishmash of what was needed at the
time. I think what we really have emerging is:
  * Overall Gate Health
  * Known (to ER) Bugs
  * Unknown (to ER) Bugs - more below

I think the landing page should be Known Bugs, as that's where we want
both bug hunters to go to prioritize things, as well as where people
looking for known bugs should start.

I think the overall Gate Health graphs should move to the zuul status
page. Possibly as part of the collection of graphs at the bottom.

We should have a secondary page (maybe column?) of the un-fingerprinted
recheck bugs, largely to use as candidates for fingerprinting. This will
let us eventually take over /recheck.

= Data Analysis / Graphs =

I spent a bunch of time playing with pandas over break
(http://dague.net/2013/12/30/ipython-notebook-experiments/), it's kind
of awesome. It also made me rethink our approach to handling the data.

I think the rolling average approach we were taking is more precise than
accurate. As these are statistical events they really need error bars.
Because when we have a quiet night, and 1 job fails at 6am in the
morning, the 100% failure rate it reflects in grenade needs to be
quantified that it was 1 of 1, not 50 of 50.

So my feeling is we should move away from the point graphs we have, and
present these as weekly and daily failure rates (with graphs and error
bars). And slice those per job. My suggestion is that we do the actual
visualization with matplotlib because it's super easy to output that
from pandas data sets.

Basically we'll be mining Elastic Search -> Pandas TimeSeries ->
transforms and analysis -> output tables and graphs. This is different
enough from our current jquery graphing that I want to get ACKs before
doing a bunch of work here and finding out people don't like it in reviews.
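
A minimal sketch of that approach, with invented sample data and assuming a
reasonably recent pandas and matplotlib, would be something like:

    # Minimal sketch: daily failure rates with binomial error bars.
    # The sample data is invented; real runs would come from Elastic Search.
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd

    runs = pd.Series(
        [False, True, False, False, True, False, True, False],
        index=pd.to_datetime([
            '2014-01-06 01:00', '2014-01-06 06:00', '2014-01-06 12:00',
            '2014-01-07 03:00', '2014-01-07 09:00', '2014-01-07 15:00',
            '2014-01-08 02:00', '2014-01-08 20:00',
        ]),
        name='failed',
    )

    daily = runs.resample('D')
    total = daily.count()        # how many runs that day (1 of 1 vs 50 of 50)
    failures = daily.sum()       # booleans sum to the number of failures
    rate = failures / total
    # Crude binomial standard error as the error bar.
    err = np.sqrt(rate * (1 - rate) / total)

    plt.errorbar(rate.index, rate.values, yerr=err.values, fmt='o')
    plt.ylabel('failure rate')
    plt.show()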

Also in this process upgrade the metadata that we provide for each of
those bugs so it's a little more clear what you are looking at.

= Take over of /recheck =

There is still a bunch of useful data coming in on recheck bug 
data which hasn't been curated into ER queries. I think the right thing
to do is treat these as a work queue of bugs we should be building
patterns out of (or completely invalidating). I've got a preliminary
gerrit bulk query piece of code that does this, which would remove the
need of the daemon the way that's currently happening. The gerrit
queries are a little long right now, but I think if we are only doing
this on hourly cron, the additional load will be negligible.

This would get us into a single view, which I think would be more
informative than the one we currently have.

= Categorize all the jobs =

We need a bit of refactoring to let us comment on all the jobs (not just
tempest ones). Basically we assumed pep8 and docs don't fail in the gate
at the beginning. Turns out they do, and are good indicators of infra /
external factor bugs. They are a part of the story so we should put them
in.

= Multi Line Fingerprints =

We've definitely found bugs where we never had a really satisfying
single line match, but we had some great matches if we could do multi line.

We could do that in ER, however it will mean giving up logstash as our
UI, because those queries can't be done in logstash. So in order to do
this we'll really need to implement some tools - cli minimum, which will
let us easily test a bug. A custom web UI might be in order as well,
though that's going to be its own chunk of work that we'll need more
volunteers for.

This would put us in a place where we should have all the infrastructure
to track 90% of the race conditions, and talk about them with certainty as
1%, 5%, 0.1% bugs.

 -Sean



Let's add regexp query support to elastic-recheck so that I could have 
fixed this better:


https://review.openstack.org/#/c/65303/

Then I could have just filtered the build_name with this:

build_name:/(check|gate)-(tempest|grenade)-[a-z\-]+/
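
For reference, a rough sketch of the raw Elasticsearch query body that a
regexp-capable query builder could emit for that (just a plain dict; the exact
DSL wrapping here is an assumption on my part, only the field name and pattern
come from the filter above):

    # Rough sketch only: the fingerprint text is a placeholder and the
    # bool/query_string wrapping is an assumption, not elastic-recheck code.
    import json

    query_body = {
        "query": {
            "bool": {
                "must": [
                    {"query_string": {
                        "query": 'message:"some fingerprint text"'}},
                    {"regexp": {
                        "build_name": "(check|gate)-(tempest|grenade)-[a-z\\-]+"}},
                ],
            },
        },
    }

    print(json.dumps(query_body, indent=2))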

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [elastic-recheck] Thoughts on next steps

2014-01-07 Thread Matt Riedemann



On 1/7/2014 5:26 PM, Sean Dague wrote:

On 01/07/2014 06:20 PM, Matt Riedemann wrote:



On 1/2/2014 8:29 PM, Sean Dague wrote:

A lot of elastic recheck this fall has been based on the ad hoc needs of
the moment, in between diving down into the race bugs that were
uncovered by it. This week away from it all helped provide a little
perspective on what I think we need to do to call it *done* (i.e.
something akin to a 1.0 even though we are CDing it).

Here is my current thinking on the next major things that should happen.
Opinions welcomed.

(These are roughly in implementation order based on urgency)

= Split of web UI =

The elastic recheck page is becoming a mishmash of what was needed at the
time. I think what we really have emerging is:
  * Overall Gate Health
  * Known (to ER) Bugs
  * Unknown (to ER) Bugs - more below

I think the landing page should be Known Bugs, as that's where we want
both bug hunters to go to prioritize things, as well as where people
looking for known bugs should start.

I think the overall Gate Health graphs should move to the zuul status
page. Possibly as part of the collection of graphs at the bottom.

We should have a secondary page (maybe column?) of the un-fingerprinted
recheck bugs, largely to use as candidates for fingerprinting. This will
let us eventually take over /recheck.

= Data Analysis / Graphs =

I spent a bunch of time playing with pandas over break
(http://dague.net/2013/12/30/ipython-notebook-experiments/), it's kind
of awesome. It also made me rethink our approach to handling the data.

I think the rolling average approach we were taking is more precise than
accurate. As these are statistical events they really need error bars.
Because when we have a quiet night, and 1 job fails at 6am in the
morning, the 100% failure rate it reflects in grenade needs to be
quantified that it was 1 of 1, not 50 of 50.

So my feeling is we should move away from the point graphs we have, and
present these as weekly and daily failure rates (with graphs and error
bars). And slice those per job. My suggestion is that we do the actual
visualization with matplotlib because it's super easy to output that
from pandas data sets.

Basically we'll be mining Elastic Search -> Pandas TimeSeries ->
transforms and analysis -> output tables and graphs. This is different
enough from our current jquery graphing that I want to get ACKs before
doing a bunch of work here and finding out people don't like it in
reviews.

Also in this process upgrade the metadata that we provide for each of
those bugs so it's a little more clear what you are looking at.

= Take over of /recheck =

There is still a bunch of useful data coming in on recheck bug 
data which hasn't been curated into ER queries. I think the right thing
to do is treat these as a work queue of bugs we should be building
patterns out of (or completely invalidating). I've got a preliminary
gerrit bulk query piece of code that does this, which would remove the
need of the daemon the way that's currently happening. The gerrit
queries are a little long right now, but I think if we are only doing
this on hourly cron, the additional load will be negligible.

This would get us into a single view, which I think would be more
informative than the one we currently have.

= Categorize all the jobs =

We need a bit of refactoring to let us comment on all the jobs (not just
tempest ones). Basically we assumed pep8 and docs don't fail in the gate
at the beginning. Turns out they do, and are good indicators of infra /
external factor bugs. They are a part of the story so we should put them
in.

= Multi Line Fingerprints =

We've definitely found bugs where we never had a really satisfying
single line match, but we had some great matches if we could do multi
line.

We could do that in ER, however it will mean giving up logstash as our
UI, because those queries can't be done in logstash. So in order to do
this we'll really need to implement some tools - cli minimum, which will
let us easily test a bug. A custom web UI might be in order as well,
though that's going to be its own chunk of work that we'll need more
volunteers for.

This would put us in a place where we should have all the infrastructure
to track 90% of the race conditions, and talk about them with certainty as
1%, 5%, 0.1% bugs.

 -Sean



Let's add regexp query support to elastic-recheck so that I could have
fixed this better:

https://review.openstack.org/#/c/65303/

Then I could have just filtered the build_name with this:

build_name:/(check|gate)-(tempest|grenade)-[a-z\-]+/


If you want to extend the query files with:

regex:
- build_name: /(check|gate)-(tempest|grenade)-[a-z\-]+/
- some_other_field: /some other regex/

And make it work with the query builder, I think we should consider it.
It would be good to know how much more expensive those queries get
though, because our ES is under decent load as it is.

 -Sean





Yeah, alternatively we could turn

Re: [openstack-dev] [nova] Bogus -1 scores from turbo hipster

2014-01-08 Thread Matt Riedemann



On Tuesday, January 07, 2014 4:53:01 PM, Michael Still wrote:

Hi. Thanks for reaching out about this.

It seems this patch has now passed turbo hipster, so I am going to
treat this as a more theoretical question than perhaps you intended. I
should note though that Joshua Hesketh and I have been trying to read
/ triage every turbo hipster failure, but that has been hard this week
because we're both at a conference.

The problem this patch faced is that we are having trouble defining
what is a reasonable amount of time for a database migration to run
for. Specifically:

2014-01-07 14:59:32,012 [output] 205 - 206...
2014-01-07 14:59:32,848 [heartbeat]
2014-01-07 15:00:02,848 [heartbeat]
2014-01-07 15:00:32,849 [heartbeat]
2014-01-07 15:00:39,197 [output] done

So applying migration 206 took slightly over a minute (67 seconds).
Our historical data (mean + 2 standard deviations) says that this
migration should take no more than 63 seconds. So this only just
failed the test.
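
Conceptually the timing check is just something like the following rough
sketch, with made-up historical numbers rather than the actual turbo-hipster
data or code:

    # Rough sketch of the timing check described above, with invented sample
    # data -- not the actual turbo-hipster implementation.
    import statistics

    # Historical wall-clock times (seconds) for migration 206 across test runs.
    historical_times = [48, 52, 55, 50, 53, 49, 57, 51]

    mean = statistics.mean(historical_times)
    stddev = statistics.stdev(historical_times)
    threshold = mean + 2 * stddev   # "mean + 2 standard deviations"

    observed = 67                   # this run took 67 seconds

    msg = "migration 206 took %.0fs, allowed up to %.0fs" % (observed, threshold)
    if observed > threshold:
        print("FAIL: " + msg)
    else:
        print("OK: " + msg)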

However, we know there are issues with our methodology -- we've tried
normalizing for disk IO bandwidth and it hasn't worked out as well as
we'd hoped. This week's plan is to try to use mysql performance schema
instead, but we have to learn more about how it works first.

I apologise for this mis-vote.

Michael

On Wed, Jan 8, 2014 at 1:44 AM, Matt Riedemann
mrie...@linux.vnet.ibm.com wrote:



On 12/30/2013 6:21 AM, Michael Still wrote:


Hi.

The purpose of this email to is apologise for some incorrect -1 review
scores which turbo hipster sent out today. I think its important when
a third party testing tool is new to not have flakey results as people
learn to trust the tool, so I want to explain what happened here.

Turbo hipster is a system which takes nova code reviews, and runs
database upgrades against them to ensure that we can still upgrade for
users in the wild. It uses real user datasets, and also times
migrations and warns when they are too slow for large deployments. It
started voting on gerrit in the last week.

Turbo hipster uses zuul to learn about reviews in gerrit that it
should test. We run our own zuul instance, which talks to the
openstack.org zuul instance. This then hands out work to our pool of
testing workers. Another thing zuul does is it handles maintaining a
git repository for the workers to clone from.

This is where things went wrong today. For reasons I can't currently
explain, the git repo on our zuul instance ended up in a bad state (it
had a patch merged to master which wasn't in fact merged upstream
yet). As this code is stock zuul from openstack-infra, I have a
concern this might be a bug that other zuul users will see as well.

I've corrected the problem for now, and kicked off a recheck of any
patch with a -1 review score from turbo hipster in the last 24 hours.
I'll talk to the zuul maintainers tomorrow about the git problem and
see what we can learn.

Thanks heaps for your patience.

Michael



How do I interpret the warning and -1 from turbo-hipster on my patch here
[1] with the logs here [2]?

I'm inclined to just do 'recheck migrations' on this since this patch
doesn't have anything to do with this -1 as far as I can tell.

[1] https://review.openstack.org/#/c/64725/4/
[2]
https://ssl.rcbops.com/turbo_hipster/logviewer/?q=/turbo_hipster/results/64/64725/4/check/gate-real-db-upgrade_nova_mysql_user_001/5186e53/user_001.log

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






Another question.  This patch [1] failed turbo-hipster after it was 
approved but I don't know if that's a gating or just voting job, i.e. 
should someone do 'reverify migrations' on that patch or just let it 
sit and ignore turbo-hipster?


[1] https://review.openstack.org/#/c/59824/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] new (docs) requirement for third party CI

2014-01-08 Thread Matt Riedemann
I'd like to propose that we add another item to the list here [1] that 
is basically related to what happens when the 3rd party CI job votes a 
-1 on your patch.  This would include:


1. Documentation on how to analyze the results and a good overview of 
what the job does (like the docs we have for check/gate testing now).

2. How to recheck the specific job if needed, i.e. 'recheck migrations'.
3. Who to contact if you can't figure out what's going on with the job.

Ideally this information would be in the comments when the job scores a 
-1 on your patch, or at least it would leave a comment with a link to a 
wiki for that job like we have with Jenkins today.


I'm all for more test coverage but we need some solid documentation 
around that when it's not owned by the community so we know what to do 
with the results if they seem like false negatives.


If no one is against this or has something to add, I'll update the wiki.

[1] 
https://wiki.openstack.org/wiki/HypervisorSupportMatrix/DeprecationPlan#Specific_Requirements


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] new (docs) requirement for third party CI

2014-01-08 Thread Matt Riedemann



On 1/8/2014 12:40 PM, Joe Gordon wrote:


On Jan 8, 2014 7:12 AM, Matt Riedemann mrie...@linux.vnet.ibm.com
mailto:mrie...@linux.vnet.ibm.com wrote:
 
  I'd like to propose that we add another item to the list here [1]
that is basically related to what happens when the 3rd party CI job
votes a -1 on your patch.  This would include:
 
  1. Documentation on how to analyze the results and a good overview of
what the job does (like the docs we have for check/gate testing now).
  2. How to recheck the specific job if needed, i.e. 'recheck migrations'.
  3. Who to contact if you can't figure out what's going on with the job.
 
  Ideally this information would be in the comments when the job scores
a -1 on your patch, or at least it would leave a comment with a link to
a wiki for that job like we have with Jenkins today.
 
  I'm all for more test coverage but we need some solid documentation
around that when it's not owned by the community so we know what to do
with the results if they seem like false negatives.
 
  If no one is against this or has something to add, I'll update the wiki.

-1 to putting this in the wiki. This isn't a nova only issue. We are
trying to collect the requirements here:

https://review.openstack.org/#/c/63478/


Cool, didn't know about that, thanks.  Good discussion going on in 
there, I left my thoughts as well. :)




 
  [1]
https://wiki.openstack.org/wiki/HypervisorSupportMatrix/DeprecationPlan#Specific_Requirements
 
  --
 
  Thanks,
 
  Matt Riedemann
 
 
  ___
  OpenStack-dev mailing list
  OpenStack-dev@lists.openstack.org
mailto:OpenStack-dev@lists.openstack.org
  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Icehouse mid-cycle meetup

2014-01-09 Thread Matt Riedemann



On 11/25/2013 6:30 PM, Mike Wilson wrote:

Hotel information has been posted. Look forward to seeing you all in
February :-).

-Mike


On Mon, Nov 25, 2013 at 8:14 AM, Russell Bryant rbry...@redhat.com
mailto:rbry...@redhat.com wrote:

Greetings,

Other groups have started doing mid-cycle meetups with success.  I've
received significant interest in having one for Nova.  I'm now excited
to announce some details.

We will be holding a mid-cycle meetup for the compute program from
February 10-12, 2014, in Orem, UT.  Huge thanks to Bluehost for
hosting us!

Details are being posted to the event wiki page [1].  If you plan to
attend, please register.  Hotel recommendations with booking links will
be posted soon.

Please let me know if you have any questions.

Thanks,

[1] https://wiki.openstack.org/wiki/Nova/IcehouseCycleMeetup
--
Russell Bryant




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



I've started an etherpad [1] for gathering ideas for morning 
unconference topics.  Feel free to post anything you're interested in 
discussing.


[1] https://etherpad.openstack.org/p/nova-icehouse-mid-cycle-meetup-items

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Gate issues

2014-01-23 Thread Matt Riedemann



On Thursday, January 23, 2014 9:02:56 AM, Davanum Srinivas wrote:

Gary,

Here's the bug - https://bugs.launchpad.net/nova/+bug/1271331 The oslo
change has been merged - https://review.openstack.org/#/c/68275/ Can't
yet find a Nova merge of the change.

-- dims

On Thu, Jan 23, 2014 at 9:11 AM, Gary Kotton gkot...@vmware.com wrote:

Hi,
1 in 3 failures today are due to:

2014-01-23 11:54:03.879 |
2014-01-23 11:54:03.880 | Traceback (most recent call last):
2014-01-23 11:54:03.880 |   File nova/tests/compute/test_compute.py, line
577, in test_poll_volume_usage_with_data
2014-01-23 11:54:03.880 | self.compute._last_vol_usage_poll)
2014-01-23 11:54:03.880 |   File /usr/lib/python2.7/unittest/case.py, line
420, in assertTrue
2014-01-23 11:54:03.880 | raise self.failureException(msg)
2014-01-23 11:54:03.880 | AssertionError: _last_vol_usage_poll was not
properly updated 1390477654.28
2014-01-23 11:54:03.880 |
==
2014-01-23 11:54:03.880 | FAIL:
nova.tests.db.test_sqlite.TestSqlite.test_big_int_mapping
2014-01-23 11:54:03.880 | tags: worker-1
2014-01-23 11:54:03.880 |
--
2014-01-23 11:54:03.880 | Empty attachments:
2014-01-23 11:54:03.880 |   pythonlogging:''
2014-01-23 11:54:03.881 |   stderr
2014-01-23 11:54:03.881 |   stdout
2014-01-23 11:54:03.881 |
2014-01-23 11:54:03.881 | Traceback (most recent call last):
2014-01-23 11:54:03.881 |   File nova/tests/db/test_sqlite.py, line 53, in
test_big_int_mapping
2014-01-23 11:54:03.881 | output, _ = utils.execute(get_schema_cmd,
shell=True)
2014-01-23 11:54:03.881 |   File nova/utils.py, line 166, in execute
2014-01-23 11:54:03.881 | return processutils.execute(*cmd, **kwargs)
2014-01-23 11:54:03.881 |   File nova/openstack/common/processutils.py,
line 168, in execute
2014-01-23 11:54:03.881 | result = obj.communicate()
2014-01-23 11:54:03.881 |   File /usr/lib/python2.7/subprocess.py, line
754, in communicate
2014-01-23 11:54:03.882 | return self._communicate(input)
2014-01-23 11:54:03.882 |   File /usr/lib/python2.7/subprocess.py, line
1314, in _communicate
2014-01-23 11:54:03.882 | stdout, stderr =
self._communicate_with_select(input)
2014-01-23 11:54:03.882 |   File /usr/lib/python2.7/subprocess.py, line
1438, in _communicate_with_select
2014-01-23 11:54:03.882 | data = os.read(self.stdout.fileno(), 1024)
2014-01-23 11:54:03.882 | OSError: [Errno 11] Resource temporarily
unavailable
2014-01-23 11:54:03.882 |
==
2014-01-23 11:54:03.882 | FAIL: process-returncode
2014-01-23 11:54:03.882 | tags: worker-1
2014-01-23 11:54:03.882 |
--
2014-01-23 11:54:03.882 | Binary content:
2014-01-23 11:54:03.882 |   traceback (test/plain; charset=utf8)
2014-01-23 11:54:03.883 |
==
2014-01-23 11:54:03.883 | FAIL: process-returncode
2014-01-23 11:54:03.883 | tags: worker-3
2014-01-23 11:54:03.883 |
--
2014-01-23 11:54:03.883 | Binary content:


Any ideas?

Thanks

Gary


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev







Russell has the nova sync local, was going to push up after the nova 
meeting this morning.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] stable/havana currently blocked - do not approve or recheck stable/* patches

2014-01-23 Thread Matt Riedemann



On Friday, January 17, 2014 12:38:36 PM, Sean Dague wrote:

Because of a pip issue with netaddr on stable/grizzly devstack, all the
stable/havana changes for jobs that include grenade are currently
blocked. This is because of stevedore's version enforcement of netaddr
versions inside the cinder scheduler.

John, Chmouel, Dean, and I have got eyes on it, waiting for a check to
return to figure out if we've gotten it sorted this time. Hopefully it
will be resolved soon, but until then havana jobs are just killing the gate.

-Sean



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Stable is OK again apparently so for anyone else waiting on a response 
here, go ahead and 'recheck no bug' stable branch patches that were 
waiting for this.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Hyper-V Nova CI Infrastructure

2014-01-24 Thread Matt Riedemann



On Friday, January 24, 2014 3:41:59 PM, Peter Pouliot wrote:

Hello OpenStack Community,

I am excited at this opportunity to make the community aware that the
Hyper-V CI infrastructure

is now up and running.  Let’s first start with some housekeeping
details.  Our Tempest logs are

publically available here: http://64.119.130.115. You will see them
show up in any

Nova Gerrit commit from this moment on.

Additionally if anyone is interested, all of the infrastructure and
automation

work can be found in the following locations:

http://github.com/openstack-hyper-v

http://github.com/cloudbase

Furthermore I would like to take a moment thank the team that made
this moment possible.

This is our first step, and as such was an incredible learning process
for everyone involved.

We were able to accomplish something that most thought couldn’t be done.

I would personally like to take some time thank those individuals
whose tireless effort helped get us to this stage.

Alessandro Pilotti apilo...@cloudbasesolutions.com

Hashir Abdi ha...@microsoft.com

Octavian Ciuhandu ociuha...@cloudbasesolutions.com

Nick Meier nick.me...@microsoft.com

Gabriel Samfira gsamf...@cloudbasesolutions.com

Tim Rogers tirog...@microsoft.com

Claudiu Neșa cn...@cloudbasesolutions.com

Vijay Tripathy vij...@microsoft.com mailto:vij...@microsoft.com

Thank you for the time and effort you put forth to get us to this
milestone. Without each

of you this would not have been possible. I look forward to continuing
the progress and evolving

the infrastructure to the next level.  Our job is not done, but we are
now on the right path.

Thank you all.

I would also at this time like to thank the OpenStack community for
your patience and guidance

through this process.   There are too many of you who helped,
instructed or guided us along the

way to list here.  I hope we can rely on your continued support going
forward as we progress

through this journey including CI testing for all the OpenStack
projects where Microsoft

technologies are involved.

Peter J. Pouliot CISSP

Sr. SDET OpenStack

Microsoft

New England Research & Development Center

1 Memorial Drive

Cambridge, MA 02142

P: 1.(857).4536436

E: ppoul...@microsoft.com mailto:ppoul...@microsoft.com

IRC: primeministerp



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Nice work, this is great to see.  Congratulations to the whole team 
that worked on making this happen!


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Hyper-V Nova CI Infrastructure

2014-01-25 Thread Matt Riedemann



On 1/24/2014 3:41 PM, Peter Pouliot wrote:

Hello OpenStack Community,

I am excited at this opportunity to make the community aware that the
Hyper-V CI infrastructure

is now up and running.  Let’s first start with some housekeeping
details.  Our Tempest logs are

publically available here: http://64.119.130.115. You will see them show
up in any

Nova Gerrit commit from this moment on.

Additionally if anyone is interested, all of the infrastructure and
automation

work can be found in the following locations:

http://github.com/openstack-hyper-v

http://github.com/cloudbase

Furthermore I would like to take a moment thank the team that made this
moment possible.

This is our first step, and as such was an incredible learning process
for everyone involved.

We were able to accomplish something that most thought couldn’t be done.

I would personally like to take some time thank those individuals whose
tireless effort helped get us to this stage.

Alessandro Pilotti apilo...@cloudbasesolutions.com

Hashir Abdi ha...@microsoft.com

Octavian Ciuhandu ociuha...@cloudbasesolutions.com

Nick Meier nick.me...@microsoft.com

Gabriel Samfira gsamf...@cloudbasesolutions.com

Tim Rogers tirog...@microsoft.com

Claudiu Neșa cn...@cloudbasesolutions.com

Vijay Tripathy vij...@microsoft.com mailto:vij...@microsoft.com

Thank you for the time and effort you put forth to get us to this
milestone. Without each

of you this would not have been possible. I look forward to continuing
the progress and evolving

the infrastructure to the next level.  Our job is not done, but we are
now on the right path.

Thank you all.

I would also at this time like to thank the OpenStack community for your
patience and guidance

through this process.   There are too many of you who helped, instructed
or guided us along the

way to list here.  I hope we can rely on your continued support going
forward as we progress

through this journey including CI testing for all the OpenStack projects
where Microsoft

technologies are involved.

Peter J. Pouliot CISSP

Sr. SDET OpenStack

Microsoft

New England Research & Development Center

1 Memorial Drive

Cambridge, MA 02142

P: 1.(857).4536436

E: ppoul...@microsoft.com mailto:ppoul...@microsoft.com

IRC: primeministerp



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



So now some questions. :)

I saw this failed on one of my nova patches [1].  It says the build 
succeeded but that the tests failed.  I talked with Alessandro about 
this yesterday and he said that's working as designed, something with 
how the scoring works with zuul?  The problem I'm having is figuring out 
why it failed.  I looked at the compute logs but didn't find any errors. 
 Can someone help me figure out what went wrong here?


[1] https://review.openstack.org/#/c/69047/1

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova] vmware minesweeper jobs failing?

2014-01-25 Thread Matt Riedemann
Seeing a few patches where vmware minesweeper is saying the build 
failed, but looks like an infra issue? An example:


http://208.91.1.172/logs/nova/69046/1/console.txt

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Reminder] - Gate Blocking Bug Day on Monday Jan 27th

2014-01-28 Thread Matt Riedemann



On 1/24/2014 2:29 PM, Sean Dague wrote:

Correction, Monday Jan 27th.

My calendar widget was apparently still on May for summit planning...

On 01/24/2014 07:40 AM, Sean Dague wrote:

It may feel like it's been gate bug day all the days, but we would
really like to get people together for gate bug day on Monday, and get
as many people, including as many PTLs as possible, to dive into issues
that we are hitting in the gate.

We have 2 goals for the day.

** Fingerprint all the bugs **

As of this second, we have fingerprints matching 73% of gate failures,
that tends to decay over time, as new issues are introduced, and old
ones are fixed. We have a hit list of issues here -
http://status.openstack.org/elastic-recheck/data/uncategorized.html

Ideally we want to get and keep the categorization rate up past 90%.
Basically the process is dive into a failed job, look at how it failed,
register a bug (or find an existing bug that was registered), and build
and submit a finger print.

** Tackle the Fingerprinted Bugs **

The fingerprinted bugs - http://status.openstack.org/elastic-recheck/
are now sorted by the # of hits we've gotten in the last 24hrs across
all queues, so that we know how much immediate pain this is causing us.

We'll do this on the #openstack-gate IRC channel, which I just created.
We'll be helping people through what's required to build fingerprints,
trying to get lots of eyes on the existing bugs, and see how many of
these remaining races we can drive out.

Looking forward to Monday!

-Sean



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



For those that haven't compared numbers yet, before the bug day 
yesterday the percentage of categorized failures was 73% and now it's 
96.4%, so fingerprint coverage is better.


I'll leave it up to Sean to provide a more executive-level summary if 
one is needed. :)


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Neutron] nova-network in Icehouse and beyond

2014-01-29 Thread Matt Riedemann



On 1/29/2014 10:47 AM, Russell Bryant wrote:

Greetings,

A while back I mentioned that we would revisit the potential deprecation
of nova-network in Icehouse after the icehouse-2 milestone.  The time
has come.  :-)

First, let me recap my high level view of the blockers to deprecating
nova-network in favor of Neutron:

   - Feature parity
 - The biggest gap here has been nova-network's multi-host mode.
   Neutron needs some sort of HA for l3 agents, as well as the
   ability to run in a mode that enables a single tenant's traffic
   to be actively handled by multiple nodes.

   - Testing / Quality parity
 - Neutron needs to reach testing and quality parity in CI.  This
   includes running the full tempest suite, for example.  For all
   tests run against nova with nova-network that are applicable, they
   need to be run against Neutron, as well.  All of these jobs should
   have comparable or better reliability than the ones with
   nova-network.

   - Production-ready open source components
 - nova-network provides basic, but usable in production networking
   based purely on open source components.  Neutron must have
   production-ready options based purely on open source components,
   as well, that provides comparable or better performance and
   reliability.

First, I would like to say thank you to those in the Neutron team that
have worked hard to make progress in various areas.  While there has
been good progress, we're not quite there on achieving these items.  As
a result, nova-network will *not* be marked deprecated in Icehouse.  We
will revisit this question again in a future release.  I'll leave it to
the Neutron team to comment further on the likelihood of meeting these
goals in the Juno development cycle.

Regarding nova-network, I would like to make some changes.  We froze
development on nova-network in advance of its deprecation.
Unfortunately, this process has taken longer than anyone thought or
hoped.  This has had some negative consequences on the nova-network code
(such as [1]).

Effective immediately, I would like to unfreeze nova-network
development.  What this really means:

   - We will no longer skip nova-network when making general
 architectural improvements to the rest of the code.  An example
 of playing catch-up in nova-network is [2].

   - We will accept new features, evaluated on a case by case basis,
 just like any other Nova feature.  However, we are explicitly
 *not* interested in features that widen the parity gaps between
 nova-network and Neutron.

   - While we will accept incremental features to nova-network, we
 are *not* interested in increasing the scope of nova-network
 to include support of any SDN controller.  We leave that as
 something exclusive to Neutron.

I firmly believe that Neutron is the future of networking for OpenStack.
  We just need to loosen up nova-network to move it along to ease some
pressure and solve some problems as we continue down this transition.

Thanks,

[1]
http://lists.openstack.org/pipermail/openstack-dev/2014-January/024052.html
[2] https://blueprints.launchpad.net/nova/+spec/nova-network-objects



Timely thread.  I was just going through nova/neutron-related blueprints 
and patches yesterday for Icehouse and noted these as something I think 
we definitely need as pre-reqs before going all-in with neutron:


https://blueprints.launchpad.net/neutron/+spec/instance-nw-info-api
https://bugs.launchpad.net/nova/+bug/1255594
https://bugs.launchpad.net/nova/+bug/1258620

There are patches up for the two bugs, but they need some work.

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Hyper-V Nova CI Infrastructure

2014-01-29 Thread Matt Riedemann



On Monday, January 27, 2014 7:17:27 AM, Alessandro Pilotti wrote:

On 25 Jan 2014, at 16:51 , Matt Riedemann mrie...@linux.vnet.ibm.com wrote:




On 1/24/2014 3:41 PM, Peter Pouliot wrote:

Hello OpenStack Community,

I am excited at this opportunity to make the community aware that the
Hyper-V CI infrastructure

is now up and running.  Let’s first start with some housekeeping
details.  Our Tempest logs are

publically available here: http://64.119.130.115. You will see them show
up in any

Nova Gerrit commit from this moment on.
snip


So now some questions. :)

I saw this failed on one of my nova patches [1].  It says the build succeeded 
but that the tests failed.  I talked with Alessandro about this yesterday and 
he said that's working as designed, something with how the scoring works with 
zuul?


I spoke with clarkb on infra, since we were also very puzzled by this 
behaviour. I’ve been told that when the job is non voting, it’s always reported 
as succeeded, which makes sense, although sligltly misleading.
The message in the Gerrit comment is clearly stating: Test run failed in ..m 
..s (non-voting)”, so this should be fair enough. It’d be great to have a way to get 
rid of the “Build succeded” message above.


The problem I'm having is figuring out why it failed.  I looked at the compute 
logs but didn't find any errors.  Can someone help me figure out what went 
wrong here?



The reason for the failure of this job can be found here:

http://64.119.130.115/69047/1/devstack_logs/screen-n-api.log.gz

Please search for (1054, Unknown column 'instances.locked_by' in 'field 
list')

In this case the job failed when “nova service-list” got called to verify 
whether the compute nodes have been properly added to the devstack instance in the 
overcloud.

During the weekend we also added a console.log to help simplify 
debugging, especially in the rare cases in which the job fails before getting 
to run tempest:

http://64.119.130.115/69047/1/console.log.gz


Let me know if this helps in tracking down your issue!

Alessandro



[1] https://review.openstack.org/#/c/69047/1

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Alex, thanks, figured it out, and yes, the console log is helpful. The 
failure was a real bug in my patch, which changed how the 180 migration 
was doing something that later broke another migration running against 
your MySQL backend - so nice catch.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][oslo] pep8 gating fails due to tools/config/check_uptodate.sh

2014-02-03 Thread Matt Riedemann



On 1/13/2014 10:49 AM, Sahid Ferdjaoui wrote:

Hello all,

It looks like 100% of the pep8 gate for nova is failing because of a reported bug;
we probably need to mark this as Critical.

https://bugs.launchpad.net/nova/+bug/1268614

Ivan Melnikov has pushed a patchset waiting for review:
https://review.openstack.org/#/c/66346/

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiRVJST1I6IEludm9jYXRpb25FcnJvcjogXFwnL2hvbWUvamVua2lucy93b3Jrc3BhY2UvZ2F0ZS1ub3ZhLXBlcDgvdG9vbHMvY29uZmlnL2NoZWNrX3VwdG9kYXRlLnNoXFwnXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjQzMjAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTM4OTYzMTQzMzQ4OSwibW9kZSI6IiIsImFuYWx5emVfZmllbGQiOiIifQ==


s.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



This broke us (nova) again today after python-keystoneclient-0.5.0 was 
released with a new config option. Joe Gordon pushed the patch to fix 
nova [1] so everyone will need to recheck their patches again once that 
merges.


This is going to be a continuing problem when external libs that nova 
pulls config options from get released, which now also includes 
oslo.messaging.


Ben Nemec floated some ideas in the previous bug [2]. I'll restate them 
here for discussion:


1) Set up a Jenkins job that triggers on keystoneclient releases to 
check whether it changed their config options and automatically propose 
an update to the other projects. I expect this could work like the 
requirements sync job.


2) Move the keystoneclient config back to a separate file and don't 
automatically generate it. This will likely result in it getting out of 
date again though. I assume that's why we started including 
keystoneclient directly in the generated config.


Joe also had an idea that we keep/generate a vanilla nova.conf.sample 
that only includes options from the nova tree itself which the 
check_uptodate script can check against, not the one generated under 
etc/nova/ which has the external lib options in it.  Then we can still 
get the generated nova.conf.sample that gets packaged by setup.cfg with 
the external lib options but not gate on it when those external packages 
are updated. (Joe, please correct my summary of your idea if it's wrong).


I was also thinking of something similar which could maybe just be done 
in memory where the check tool keeps track of the external config 
options and when validating the generated nova.conf.sample it ignores 
any 'failures' if they are in the list of external options.
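
A rough sketch of that in-memory filtering idea, with hypothetical option
names rather than the real check_uptodate tooling:

    # Rough sketch of the "ignore external options" idea; the option names and
    # the parsing are hypothetical, not the real check_uptodate.sh tooling.
    import re

    # Options we know come from external libs (keystoneclient, oslo.messaging).
    EXTERNAL_OPTIONS = {'auth_uri', 'rpc_backend'}

    OPT_RE = re.compile(r'^#?\s*([A-Za-z0-9_]+)\s*=')


    def option_names(sample_text):
        """Pull the option names out of a generated .conf sample."""
        names = set()
        for line in sample_text.splitlines():
            match = OPT_RE.match(line)
            if match:
                names.add(match.group(1))
        return names


    def check(committed_sample, generated_sample):
        missing = option_names(generated_sample) - option_names(committed_sample)
        # Only fail on options that come from the nova tree itself.
        return sorted(missing - EXTERNAL_OPTIONS)


    committed = "#api_paste_config=api-paste.ini\n"
    generated = ("#api_paste_config=api-paste.ini\n"
                 "#auth_uri=<None>\n"
                 "#new_nova_opt=1\n")
    print(check(committed, generated))   # -> ['new_nova_opt']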


Anyway, no matter how we fix it, we need to fix it, so let's weigh the 
pros and cons of the various options since this is worse than a race 
condition that breaks the gate: it simply breaks and blocks 
everything until fixed.


[1] https://review.openstack.org/#/c/70891/
[2] https://bugs.launchpad.net/nova/+bug/1268614/comments/15

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Bug Triage Day Proposal - Friday 7th February

2014-02-07 Thread Matt Riedemann



On Friday, February 07, 2014 1:15:11 PM, John Garbutt wrote:

So untouched bugs:
~200 - ~130

Awesome. Hope the US guys (and anyone else who is still working) have a
good afternoon bringing that down further.

Anyway, I am going to have my dinner, because I live in the UK, and I
have to play my Tuba this evening.


Haha, that made my day.

http://www.youtube.com/watch?v=d0aIqx1McVI



johnthetubaguy

On 7 February 2014 08:12, John Garbutt j...@johngarbutt.com wrote:

Just a quick reminder, it's bug day!

Let's collaborate in #openstack-nova

We can track progress here:
http://webnumbr.com/untouched-nova-bugs
And later progress:
http://status.openstack.org/bugday

Get those bugs tagged:
https://bugs.launchpad.net/nova/+bugs?field.tag=-*field.status%3Alist=NEW

Tag owners, and others, let's set the priorities:
https://wiki.openstack.org/wiki/Nova/BugTriage

But don't forget:
* Critical if the bug prevents a key feature from working properly
(regression) for all users (or without a simple workaround) or result
in data loss
* High if the bug prevents a key feature from working properly for
some users (or with a workaround)
* Medium if the bug prevents a secondary feature from working properly
* Low if the bug is mostly cosmetic
* Wishlist if the bug is not really a bug, but rather a welcome change
in behavior

Let's also watch out for stale bugs:
https://bugs.launchpad.net/nova/+bugs?orderby=date_last_updatedfield.status%3Alist=INPROGRESSassignee_option=any

John


PS
I am having to be an emergency taxi service first thing this morning,
but should be joining you this afternoon.

On 5 February 2014 01:01, Russell Bryant rbry...@redhat.com wrote:

On 02/04/2014 05:10 PM, John Garbutt wrote:

Hi,

Now that we are getting close to the end of Icehouse, it seems a good
time to make sure we tame the un-triaged bug backlog (try saying that
really quickly a few times over), and look at what really needs fixing
before Icehouse is released.

I propose that we have a bug triage day this Friday, February 7th.
That way, things should be in a more reasonable state by the Utah
mid-cycle meet up, on Monday.

If you have some bugs you keep meaning to raise, but haven't quite got
around to it yet, please do that before Friday, rather than after
Friday.

The usual process applies for Bug Triage. Applying official nova tags, etc:
 https://wiki.openstack.org/wiki/Nova/BugTriage
 https://wiki.openstack.org/wiki/BugTriage

To see how we are doing, take a look at:
 http://webnumbr.com/untouched-nova-bugs
 http://status.openstack.org/bugday

Let's also not forget about fixing bugs, particularly ones that show up here:
 http://status.openstack.org/elastic-recheck/

Hopefully you can join us on #openstack-nova for some bug triage fun
on Friday.

If there are horrid clashes, or other issues or ideas, do speak up.


Sounds great.  We're due for a bug day.  An improved bug queue as we
head toward the freeze would be very helpful.  Thanks!

--
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][ceilometer] Removing simple_tenant_usage and os-instance_usage_audit_log from V3 API

2014-02-08 Thread Matt Riedemann



On 2/7/2014 4:10 PM, Joe Gordon wrote:

Hi All,

I would like to propose removing the simple_tenant_usage and
os-instance_usage_audit_log extensions from the nova V3 API (while
keeping them in V2). Both of these are pre ceilometer extensions to
generate rudimentary usage information, something that we should be
using ceilometer for.

For those of you who aren't sure what os-instance_usage_audit_log is:
* 
https://github.com/openstack/nova/commit/2fdd73816c56b578a65466db4e5a86b9b191e1c1
* No python-novaclient support
* output from a devstack run http://paste.ubuntu.com/6893886/


+1

Not to mention no api-ref docs for os-instance_usage_audit_log.





best,
Joe

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] The future of nosetests with Tempest

2014-02-12 Thread Matt Riedemann

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Neutron] nova-network in Icehouse and beyond

2014-02-17 Thread Matt Riedemann



On 1/30/2014 7:10 AM, Christopher Yeoh wrote:

On Thu, Jan 30, 2014 at 2:08 PM, Michael Still mi...@stillhq.com wrote:

On Thu, Jan 30, 2014 at 2:29 PM, Christopher Yeoh cbky...@gmail.com wrote:

  So if nova-network doesn't go away this has implications for the V3 API
  as it currently doesn't support nova-network. I'm not sure that we have
  time to add support for it in icehouse now, but if nova-network is not
  going to go away then we need to add it to the V3 API or we will be
  unable to ever deprecate the V2 API.

Is the problem here getting the code written, or getting it through
reviews? i.e. How can I re-prioritise work to help you here?


So I think its a combination of both. There's probably around 10
extensions from V2 that would need looking at to port from V2. There's
some cases where the API supported both nova network and neutron,
proxying in the latter case and others where only nova network was
supported. So we'll need to make a decision pretty quickly around
whether we present a unified networking interface (eg proxy for neutron)
or have some interfaces which you only use when you use nova-network.
There's a bit of work either way. Also, given how long we'll have V3 for,
we want to take the opportunity to clean up the APIs we do port. And the
feature proposal deadline is now less than 3 weeks away, so combined with
the already existing work we have for i-3 it is going to be a little tight.

The other issue is we have probably at least 50 or so V3 API related
changesets in the queue at the moment, plus obviously more coming over
the next few weeks. So I'm a bit a wary of how much extra review
attention we can realistically expect.

The two problems together make me think that although its not
impossible, there's a reasonable level of risk that we wouldn't get it
all done AND merged in i-3. And I think we want to avoid the situation
where we have some of the things currently in the queue merged and some
of say the nova-network patches done, but not complete with either. More
people contributing patches and core review cycles will of course help
though so any help is welcome :-)

This is all dependent on nova-network never going away. If the intent is
that it would eventually be deprecated - say in the same timeframe as
the V2 API then I don't think its worth the extra effort/risk putting it
in the V3 API in icehouse.

Regards,

Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Given the above, I'm trying to figure out if the limits/used_limits API 
extensions will come back in nova V3?  I ask because I'm trying to get 
this patch [1] working for V2 and earlier in the review cycle it was 
asserted it could be a V2-only change since Neutron would be handled 
differently in V3, but now I'm confused.


[1] https://review.openstack.org/#/c/43822/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Meetup Summary

2014-02-18 Thread Matt Riedemann



On 2/17/2014 4:41 PM, Russell Bryant wrote:

Greetings,

Last week we had an in-person Nova meetup.  Bluehost was a wonderful and
generous host.  Many thanks to them.  :-)

Here's some observations and a summary of some of the things that we
discussed:

1) Mark McClain (Neutron PTL) and and Mark Washenberger (Glance PTL)
both attended.  Having some cross-project discussion time was
*incredibly* useful, so I'm thankful they attended.  This makes me very
optimistic about our plans to have a cross-project day at the Atlanta
design summit.  We need to try to get as many opportunities as possible
for this sort of collaboration.

2) Gantt  - We discussed the progress of the Gantt effort.  After
discussing the problems encountered so far and the other scheduler work
going on, the consensus was that we really need to focus on decoupling
the scheduler from the rest of Nova while it's still in the Nova tree.

Don was still interested in working on the existing gantt tree to learn
what he can about the coupling of the scheduler to the rest of Nova.
Nobody had a problem with that, but it doesn't sound like we'll be ready
to regenerate the gantt tree to be the real gantt tree soon.  We
probably need another cycle of development before it will be ready.

As a follow-up to this, I wonder if we should rename the current gantt
repository from openstack/gantt to stackforge/gantt to avoid any
possible confusion.  We should make it clear that we don't expect the
current repo to be used yet.

3) v3 API - We discussed the current status of this effort, including
the tasks API, and all other v3 work.  There are some notes here:

https://etherpad.openstack.org/p/NovaV3APIDoneCriteria

I actually think we need to talk about this some more before we mark v3
as stable.  I'll get notes together and start another thread soon.

4) We talked about Nova's integration with Neutron and made some good
progress.  We came up with a blueprint (ideally for Icehouse) to improve
Nova-Neutron interaction.

There are two cases we need to improve that have been particularly
painful.  The first is the network info cache.  Neutron can issue an API
callback to Nova to let us know that we need to refresh the cache.  The
second is knowing that VIF setup is complete.  Right now we have cases
where we issue a request to Neutron and it is processed asynchronously.
  We have no way to know when it has finished.  For example, we really
need to know that VIF plumbing is set up before we boot an instance and
it tries its DHCP request.  We can do this with nova-network, but with
Neutron it's just a giant race.  I'm actually surprised we've made it
this long without fixing this.


One or both of these issues (thinking VIF readiness) is also causing a 
gate failure in master and stable/havana:


https://bugs.launchpad.net/nova/+bug/1210483

I'd like to propose skipping that test if Tempest is configured with 
Neutron until we get the bug fixed/blueprint resolved.
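
A hypothetical sketch of the kind of skip I mean (the test and class names
are made up; only the mechanism of keying off Tempest's service_available
config with testtools.skipIf is the actual proposal):

    import testtools

    from tempest import config

    CONF = config.CONF


    class ServerNetworkOpsTest(testtools.TestCase):

        @testtools.skipIf(CONF.service_available.neutron,
                          'Skipped until bug 1210483 (VIF readiness) is fixed')
        def test_server_network_ops(self):
            # ... exercise the server/network behavior that hits the race ...
            pass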


By the way, can I get a link to the blueprint to reference in the bug 
(or vice-versa)?




5) Driver CI - We talked about the ongoing effort to set up CI for all
of the compute drivers.  The discussion was mostly a status review.  At
this point, the Xenserver and Docker drivers are both at risk of being
removed from Nova for the Icehouse release if CI is not up and running
in time.

6) Upgrades - we discussed the state of upgrading Nova.  It was mostly a
review of the excellent progress being made this cycle.  Dan Smith has
been doing a lot of work to get us closer to where we can upgrade the
control services at once with downtime, but roll through upgrading the
computes later after service is back up.  Joe Gordon has been working on
automating the testing of this to make sure we don't break it, so that
should be running soon.


Lastly, everyone in attendance seemed to really enjoy it, and the
overwhelming vote in the room was for doing the same thing again during
the Juno cycle.  Dates and location TBD.


+1



Thanks,



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Meetup Summary

2014-02-18 Thread Matt Riedemann



On 2/18/2014 1:12 PM, Russell Bryant wrote:

On 02/18/2014 12:36 PM, Matt Riedemann wrote:

4) We talked about Nova's integration with Neutron and made some good
progress.  We came up with a blueprint (ideally for Icehouse) to improve
Nova-Neutron interaction.

There are two cases we need to improve that have been particularly
painful.  The first is the network info cache.  Neutron can issue an API
callback to Nova to let us know that we need to refresh the cache.  The
second is knowing that VIF setup is complete.  Right now we have cases
where we issue a request to Neutron and it is processed asynchronously.
   We have no way to know when it has finished.  For example, we really
need to know that VIF plumbing is set up before we boot an instance and
it tries its DHCP request.  We can do this with nova-network, but with
Neutron it's just a giant race.  I'm actually surprised we've made it
this long without fixing this.


One or both of these issues (thinking VIF readiness) is also causing a
gate failure in master and stable/havana:

https://bugs.launchpad.net/nova/+bug/1210483

I'd like to propose skipping that test if Tempest is configured with
Neutron until we get the bug fixed/blueprint resolved.

By the way, can I get a link to the blueprint to reference in the bug
(or vice-versa)?


I haven't seen a blueprint for this yet.

Mark, is that something you were planning on driving?



Here we go:

https://blueprints.launchpad.net/nova/+spec/check-neutron-port-status

There are two patches up for it now from Aaron, still needs (exception) 
approval for Icehouse.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [nova][libvirt] Is there anything blocking the libvirt driver from implementing the host_maintenance_mode API?

2014-02-19 Thread Matt Riedemann
The os-hosts OS API extension [1] showed up before I was working on the 
project and I see that only the VMware and XenAPI drivers implement it, 
but was wondering why the libvirt driver doesn't - either no one wants 
it, or there is some technical reason behind not implementing it for 
that driver?


[1] 
http://docs.openstack.org/api/openstack-compute/2/content/PUT_os-hosts-v2_updateHost_v2__tenant_id__os-hosts__host_name__ext-os-hosts.html


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][oslo] Changes to oslo-incubator sync workflow

2014-02-19 Thread Matt Riedemann



On 2/19/2014 7:13 PM, Joe Gordon wrote:

Hi All,

As many of you know most oslo-incubator code is wildly out of sync.
Assuming we consider it a good idea to sync up oslo-incubator code
before cutting Icehouse, then we have a problem.

Today oslo-incubator code is synced in an ad-hoc manner, resulting in
duplicated efforts and wildly out of date code. Part of the challenges
today are backwards incompatible changes and new oslo bugs. I expect
that once we get a single project to have an up to date oslo-incubator
copy it will make syncing a second project significantly easier. So
because I (hopefully) have some karma built up in nova, I would like
to volunteer nova to be the guinea pig.


To fix this I would like to propose starting an oslo-incubator/nova
sync team. They would be responsible for getting nova's oslo code up
to date.  I expect this work to involve:
* Reviewing lots of oslo sync patches
* Tracking the current sync patches
* Syncing over the low hanging fruit, modules that work without changing nova.
* Reporting bugs to oslo team
* Working with oslo team to figure out how to deal with backwards
incompatible changes
   * Update nova code or make oslo module backwards compatible
* Track all this
* Create a roadmap for other projects to follow (re: documentation)

I am looking for volunteers to help with this effort, any takers?


best,
Joe Gordon

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Well I'll get the ball rolling...

In the past when this has come up there is always a debate over whether we 
should just sync to stay up to date, or whether that is dangerous and we 
should only sync when there is a need (which is what the review guidelines 
say now [1]).  There are pros and cons:


pros:

- we get bug fixes that we didn't know existed
- it should be less painful to sync if we do it more often

cons:

- it's more review overhead and some crazy guy thinks we need a special 
team dedicated to reviewing those changes :)
- there are some changes in o-i that would break nova; I'm specifically 
thinking of the oslo RequestContext which has domain support now (or 
some other keystone thingy) and nova has its own RequestContext - so if 
we did sync that from o-i it would change nova's logging context and 
break on us since we didn't use oslo context.


For that last con, I'd argue that we should move to the oslo 
RequestContext; I'm not sure why we aren't.  Would that module then not 
fall under low-hanging fruit?


I think the DB API modules have been a concern for auto-syncing before 
too but I can't remember why now...something about possibly changing the 
behavior of how the nova migrations would work?  But if they are already 
using the common code, I don't see the issue.


This is kind of an aside, but I'm kind of confused now about how the 
syncs work with things that fall under oslo.rootwrap or oslo.messaging, 
like this patch [2].  It doesn't completely match the o-i patch, i.e. 
it's not syncing over openstack/common/rootwrap/wrapper.py, and I'm 
assuming because that's in oslo.rootwrap now?  But then why does the 
code still exist in oslo-incubator?


I think the keystone guys are running into a similar issue where they 
want to remove a bunch of now-dead messaging code from keystone but 
can't because there are still some things in oslo-incubator using 
oslo.messaging code, or something weird like that. So maybe those 
modules are considered out of scope for this effort until the o-r/o-m 
code is completely out of o-i?


Finally, just like we'd like to have cores for each virt driver in nova 
and the neutron API in nova, I think this kind of thing, at least 
initially, would benefit from having some oslo cores involved in a team 
that are also familiar to a degree with nova, e.g. bnemec or dims.


[1] https://wiki.openstack.org/wiki/ReviewChecklist#Oslo_Syncing_Checklist
[2] https://review.openstack.org/#/c/73340/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] v3 API in Icehouse

2014-02-20 Thread Matt Riedemann



On 2/19/2014 12:26 PM, Chris Behrens wrote:

+1. I'd like to leave it experimental as well. I think the task work is 
important to the future of nova-api and I'd like to make sure we're not rushing 
anything. We're going to need to live with old API versions for a long time, so 
it's important that we get it right. I'm also not convinced there's a 
compelling enough reason for one to move to v3 as it is. Extension versioning 
is important, but I'm not sure it can't be backported to v2 in the meantime.


Thinking about what would differentiate V3, tasks is the big one but the 
common request ID [1] is something that could be a nice carrot for 
getting people to move eventually.


[1] https://blueprints.launchpad.net/nova/+spec/cross-service-request-id



- Chris


On Feb 19, 2014, at 9:36 AM, Russell Bryant rbry...@redhat.com wrote:

Greetings,

The v3 API effort has been going for a few release cycles now.  As we
approach the Icehouse release, we are faced with the following question:
Is it time to mark v3 stable?

My opinion is that I think we need to leave v3 marked as experimental
for Icehouse.

There are a number of reasons for this:

1) Discussions about the v2 and v3 APIs at the in-person Nova meetup
last week made me come to the realization that v2 won't be going away
*any* time soon.  In some cases, users have long term API support
expectations (perhaps based on experience with EC2).  In the best case,
we have to get all of the SDKs updated to the new API, and then get to
the point where everyone is using a new enough version of all of these
SDKs to use the new API.  I don't think that's going to be quick.

We really don't want to be in a situation where we're having to force
any sort of migration to a new API.  The new API should be compelling
enough that everyone *wants* to migrate to it.  If that's not the case,
we haven't done our job.

2) There's actually quite a bit still left on the existing v3 todo list.
We have some notes here:

https://etherpad.openstack.org/p/NovaV3APIDoneCriteria

One thing is nova-network support.  Since nova-network is still not
deprecated, we certainly can't deprecate the v2 API without nova-network
support in v3.  We removed it from v3 assuming nova-network would be
deprecated in time.

Another issue is that we discussed the tasks API as the big new API
feature we would include in v3.  Unfortunately, it's not going to be
complete for Icehouse.  It's possible we may have some initial parts
merged, but it's much smaller scope than what we originally envisioned.
Without this, I honestly worry that there's not quite enough compelling
functionality yet to encourage a lot of people to migrate.

3) v3 has taken a lot more time and a lot more effort than anyone
thought.  This makes it even more important that we're not going to need
a v4 any time soon.  Due to various things still not quite wrapped up,
I'm just not confident enough that what we have is something we all feel
is Nova's API of the future.


Let's all take some time to reflect on what has happened with v3 so far
and what it means for how we should move forward.  We can regroup for Juno.

Finally, I would like to thank everyone who has helped with the effort
so far.  Many hours have been put in to code and reviews for this.  I
would like to specifically thank Christopher Yeoh for his work here.
Chris has done an *enormous* amount of work on this and deserves credit
for it.  He has taken on a task much bigger than anyone anticipated.
Thanks, Chris!

--
Russell Bryant

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] v3 API in Icehouse

2014-02-20 Thread Matt Riedemann
 with v3 so far
and what it means for how we should move forward.  We can regroup for Juno.

Finally, I would like to thank everyone who has helped with the effort
so far.  Many hours have been put in to code and reviews for this.  I
would like to specifically thank Christopher Yeoh for his work here.
Chris has done an *enormous* amount of work on this and deserves credit
for it.  He has taken on a task much bigger than anyone anticipated.
Thanks, Chris!






___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][libvirt] Is there anything blocking the libvirt driver from implementing the host_maintenance_mode API?

2014-02-20 Thread Matt Riedemann



On 2/19/2014 4:05 PM, Matt Riedemann wrote:

The os-hosts OS API extension [1] showed up before I was working on the
project and I see that only the VMware and XenAPI drivers implement it,
but was wondering why the libvirt driver doesn't - either no one wants
it, or there is some technical reason behind not implementing it for
that driver?

[1]
http://docs.openstack.org/api/openstack-compute/2/content/PUT_os-hosts-v2_updateHost_v2__tenant_id__os-hosts__host_name__ext-os-hosts.html




By the way, am I missing something when I think that this extension is 
already covered if you're:


1. Looking to get the node out of the scheduling loop, you can just 
disable it with os-services/disable?


2. Looking to evacuate instances off a failed host (or one that's in 
maintenance mode), just use the evacuate server action.
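
For reference, a rough python-novaclient sketch of what I mean; the exact
call signatures are from memory and may not match the current client, so
treat this as illustrative only:

    from novaclient.v1_1 import client

    # Credentials and endpoint are placeholders.
    nova = client.Client('admin', 'password', 'admin',
                         'http://keystone:5000/v2.0')

    # 1. Take the compute node out of the scheduling loop.
    nova.services.disable('compute-host-1', 'nova-compute')

    # 2. Move instances off the host that is failed or in maintenance.
    for server in nova.servers.list(search_opts={'host': 'compute-host-1',
                                                 'all_tenants': 1}):
        nova.servers.evacuate(server, host='compute-host-2',
                              on_shared_storage=True)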


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] v3 API in Icehouse

2014-02-21 Thread Matt Riedemann



On 2/21/2014 1:53 AM, Christopher Yeoh wrote:

On Fri, 21 Feb 2014 06:53:11 +
Kenichi Oomichi oomi...@mxs.nes.nec.co.jp wrote:


-Original Message-
From: Christopher Yeoh [mailto:cbky...@gmail.com]
Sent: Thursday, February 20, 2014 11:44 AM
To: openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [Nova] v3 API in Icehouse

On Wed, 19 Feb 2014 12:36:46 -0500
Russell Bryant rbry...@redhat.com wrote:


Greetings,

The v3 API effort has been going for a few release cycles now.
As we approach the Icehouse release, we are faced with the
following question: Is it time to mark v3 stable?

My opinion is that I think we need to leave v3 marked as
experimental for Icehouse.



Although I'm very eager to get the V3 API released, I do agree with
you. As you have said we will be living with both the V2 and V3
APIs for a very long time. And at this point there would be simply
too many last minute changes to the V3 API for us to be confident
that we have it right enough to release as a stable API.


Through v3 API development, we have found a lot of the existing v2 API
input validation problems. but we have concentrated v3 API development
without fixing the problems of v2 API.

After Icehouse release, v2 API would be still CURRENT and v3 API would
be EXPERIMENTAL. So should we fix v2 API problems also in the
remaining Icehouse cycle?



So bug fixes are certainly fine with the usual caveats around backwards
compatibility (I think there's a few in there that aren't
backwards compatible especially those that fall into the category of
making the API more consistent).

https://wiki.openstack.org/wiki/APIChangeGuidelines

Chris

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



We also need to circle back to the issues/debates around what to do with 
the related bug(s) and how to handle something like this in V2 now with 
respect to proxying to neutron (granted that my premise in the last 
comment may be off a bit now):


https://review.openstack.org/#/c/43822/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][libvirt] Is there anything blocking the libvirt driver from implementing the host_maintenance_mode API?

2014-02-23 Thread Matt Riedemann



On Sunday, February 23, 2014 12:41:15 PM, Jay Pipes wrote:

On Sun, 2014-02-23 at 09:14 +0800, Jay Lau wrote:

So there is no need to implement the host_maintenance_mode API in the
libvirt driver, as host_maintenance_mode is mainly for VMware and
XenServer, and we can use evacuate and os-services/disable for libvirt
host maintenance, right?


At a minimum, can we please make the API consistent in regards to what
we call this operation (evacuation, host maintenance, enable/disable,
etc).

Death to API extensions.

Best,
-another jay



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Well, the other thing here is we have no Tempest coverage of this virt 
driver API since only xenapi and vmware drivers implement it and 
community Tempest runs against the libvirt driver.  I'm not aware of 
anything in Tempest that allows automatically detecting virt 
driver-specific APIs that we can test, like get_diagnostics.


XenServer CI and MineSweeper could otherwise run against these in Tempest; 
we just don't have tests defined.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] oslo.messaging rampant errors in nova-api logs

2014-02-24 Thread Matt Riedemann



On Monday, February 24, 2014 7:26:04 AM, Sean Dague wrote:

I'm looking at whether we can get ourselves to enforcing only known
ERRORs in logs. In doing so one of the most visible issues on non
neutron runs is oslo.messaging spewing approximately 50:

ERROR oslo.messaging.notify._impl_messaging [-] Could not send
notification to notifications

http://logs.openstack.org/45/75245/3/check/check-tempest-dsvm-full/7ad149e/logs/screen-n-api.txt.gz?level=TRACE

We could whitelist this, however, this looks like a deeper issue.
Something that should actually be solved prior to release.

Really need some eyes in here from people more familiar with the
oslo.messaging code, and why we'd be tripping a circular reference
violation here.

-Sean



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


FYI there is a bug for it too:

https://bugs.launchpad.net/nova/+bug/1283270

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Future of the Nova API

2014-02-24 Thread Matt Riedemann
 compelling enough to get people to *want* to migrate as
soon as they are able, then we haven't done our job.  Deprecation of the
old thing should only be done when we feel it's no longer wanted or used
by the vast majority.  I just don't see that happening any time soon.

We have a couple of ways forward right now.

1) Continue as we have been, and plan to release v3 once we have a
compelling enough feature set.

2) Take what we have learned from v3 and apply it to v2.  For example:

  - The plugin infrastructure is an internal implementation detail that
can be done with the existing API.

  - extension versioning is a concept we can add to v2

  - we've also been discussing the concept of a core minor version, to
reflect updates to the core that are backwards compatible.  This
seems doable in v2.

  - revisit a new major API when we get to the point of wanting to
effectively do a re-write, where we are majorly re-thinking the
way our API is designed (from an external perspective, not internal
implementation).

[1] http://en.wikipedia.org/wiki/Robustness_principle



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Future of the Nova API

2014-02-25 Thread Matt Riedemann
, but there's likely to be a lot of
   code duplication we can collapse (even within the v2 and v3 APIs
   unittests there's a lot of duplicated code that could be removed and
   I suspect if we tried we could share it between v2 and v3). There'd
   be a bunch of refactoring work required mostly around tests being
   able to more generically take input and more generically test output.
   So its not easy, but we could cut down on the overhead there.

So I think this is all hard to quantify but I don't think its as big as
people fear - I think the tempest overhead is the main concern because
it maps to extra check/gate resources but if we want backwards
incompatible changes we get that regardless. I really don't see the
in-tree nova overhead as that significant - some of it comes down just
to reviewers asking if a change is made to v2 does it need to be done to
ec2/v3 as well?


+1



So I think we come back to our choices:

- V2 stays forever. Never any backwards incompatible changes. For lots
   of reasons I've mentioned before don't like it. Also the longer we
   delay the harder it gets.

- V2 with V3 backport incorporating changes into V2. Probably less
   tempest testing load depending on how much is backported. But a *lot*
   of backporting working. It took us 2 cycles to get V3 to this stage,
   it'd be 3 in the end if we the release V3 in Juno. How many cycles
   would it take us to implement V3 changes in the V2 code? And in many
   cases its not a matter of just backporting patches, its starting from
   scratch. And we don't have a clear idea of when we can deprecate the
   V2 part of the code (the cleanup of which will be harder than just
   removing everything in the contrib directory ;-)

- Release V3. But we don't know how long we have to maintain V2 for.
   But if its just two years after the V3 release I think its a
   no-brainer that we just go the V3 route. If its 7 or 10 years then I
   think we'll probably find it hard to justify any backwards
   incompatible change and that will make me very sad given the state of
   the V2 API. (And as an aside if we suspect that never deprecate is
   the answer I think we should defer all the pending new API extensions
   in the queue for V2 - because we haven't done a sufficient evaluation
   of them and we'll have to live with what they do forever)

Whatever we decide I think its clear we need to be much much more
stricter about what new APIs we allow in and any really changes at all
to the API. Because we're stuck with the consequences for a very long
time. There's a totally different trade off between speed of
development and long term consequences if you make a mistake compared
to the rest of Nova.

Chris

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Future of the Nova API

2014-02-25 Thread Matt Riedemann



On Tuesday, February 25, 2014 4:02:13 PM, Dan Smith wrote:

+1, it seems we could explore for another cycle just to find out that
backporting everything to V2 isn't going to be what we want, and then
we've just wasted more time.



If we say it's just deprecated and frozen against new features, then
it's maintenance is just limited to bug fixes right?


No, any time we make a change to how the api communicates with compute,
conductor, or the scheduler, both copies of the API code have to be
changed. If we never get rid of v2 (and I don't think we have a good
reason to right now) then we're doing that *forever*. I do not want to
sign up for that.


Yeah, so objects is the big one here.  And it doesn't sound like we're 
talking about getting rid of V2 *right now*, we're talking about 
deprecating it after V3 is released (plan would be Juno for 
nova-network and tasks) and then maintaining it for some amount of time 
before it could be removed, and it doesn't sound like we know what that 
number is until we get some input from deployers/operators.




I'm really curious what deployers like RAX, HP Cloud, etc think about
freezing V2 to features and having to deploying V3 to get them. Does RAX
expose V3 right now? Also curious if RAX/HP/etc see the V3 value
statement when compared to what it will mean for their users.


I'd also be interested to see what happens with the Keystone V2 API 
because, as I understand it, it's deprecated already and there is no V3 
support in python-keystoneclient; that's all moved to 
python-openstackclient, which I don't think even Tempest is using yet, 
at least not for API tests.


So what kind of reaction are the Keystone people getting to that?  Do 
they plan on removing their V2 API at some point?  Or just maintain it 
with bug fixes forever?




--Dan

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] The future of nosetests with Tempest

2014-02-25 Thread Matt Riedemann



On 2/12/2014 1:57 PM, Matthew Treinish wrote:

On Wed, Feb 12, 2014 at 11:32:39AM -0700, Matt Riedemann wrote:



On 1/17/2014 8:34 AM, Matthew Treinish wrote:

On Fri, Jan 17, 2014 at 08:32:19AM -0500, David Kranz wrote:

On 01/16/2014 10:56 PM, Matthew Treinish wrote:

Hi everyone,

With some recent changes made to Tempest, compatibility with nosetests is going
away. We've started using newer features that nose just doesn't support. One
example of this is that we've started using testscenarios and we're planning to
do this in more places moving forward.

So at Icehouse-3 I'm planning to push the patch out to remove nosetests from the
requirements list and all the workarounds and references to nose will be pulled
out of the tree. Tempest will also start raising an unsupported exception when
you try to run it with nose so that there isn't any confusion on this moving
forward. We talked about doing this at summit briefly and I've brought it up a
couple of times before, but I believe it is time to do this now. I feel for
tempest to move forward we need to do this now so that there isn't any ambiguity
as we add even more features and new types of testing.

I'm with you up to here.


Now, this will have implications for people running tempest with python 2.6
since up until now we've set nosetests. There is a workaround for getting
tempest to run with python 2.6 and testr see:

https://review.openstack.org/#/c/59007/1/README.rst

but essentially this means that when nose is marked as unsupported on tempest
python 2.6 will also be unsupported by Tempest. (which honestly it basically has
been for while now just we've gone without making it official)

The way we handle different runners/os can be categorized as tested
in gate, unsupported (should work, possibly some hacks needed),
and hostile. At present, both nose and py2.6 I would say are in
the unsupported category. The title of this message and the content
up to here says we are moving nose to the hostile category. With
only 2 months to feature freeze I see no justification in moving
py2.6 to the hostile category. I don't see what new testing features
scheduled for the next two months will be enabled by saying that
tempest cannot and will not run on 2.6. It has been agreed I think
by all projects that py2.6 will be dropped in J. It is OK that py2.6
will require some hacks to work and if in the next few months it
needs a few more then that is ok. If I am missing another connection
between the py2.6 and nose issues, please explain.



So honestly we're already at this point in tempest. Nose really just doesn't
work with tempest, and we're adding more features to tempest, your negative test
generator being one of them, that interfere further with nose. I've seen several


I disagree here, my team is running Tempest API, CLI and scenario
tests every day with nose on RHEL 6 with minimal issues.  I had to
workaround the negative test discovery by simply sed'ing that out of
the tests before running it, but that's acceptable to me until we
can start testing on RHEL 7.  Otherwise I'm completely OK with
saying py26 isn't really supported and isn't used in the gate, and
it's a buyer beware situation to make it work, which includes
pushing up trivial patches to make it work (which I did a few of
last week, and they were small syntax changes or usages of
testtools).

I don't understand how the core projects can be running unit tests
in the gate on py26 but our functional integration project is going
to actively go out and make it harder to run Tempest with py26, that
sucks.

If we really want to move the test project away from py26, let's
make the concerted effort to get the core projects to move with it.


So as I said before the python 2.6 story for tempest remains the same after this
change. The only thing that we'll be doing is actively preventing nose from
working with tempest.



And FWIW, I tried the discover.py patch with unittest2 and
testscenarios last week and either I botched it, it's not documented
properly on how to apply it, or I screwed something up, but it
didn't work for me, so I'm not convinced that's the workaround.

What's the other option for running Tempest on py26 (keeping RHEL 6
in mind)?  Using tox with testr and pip?  I'm doing this all
single-node.


Yes, that is what the discover patch is used to enable. By disabling nose the
only path to run tempest with py2.6 is to use testr. (which is what it always
should have been)

Attila confirmed it was working here:
http://fpaste.org/76651/32143139/
in that example he applies 2 patches the second one is currently in the gate for
tempest. (https://review.openstack.org/#/c/72388/ ) So all that needs to be done
is to apply that discover patch:

https://code.google.com/p/unittest-ext/issues/detail?id=79

(which I linked to before)

Then tempest should run more or less the same between 2.7 and 2.6. (The only
difference I've seen is in how skips are handled)




patches this cycle that attempted

Re: [openstack-dev] [QA] The future of nosetests with Tempest

2014-02-26 Thread Matt Riedemann



On 2/25/2014 7:46 PM, Matt Riedemann wrote:



On 2/12/2014 1:57 PM, Matthew Treinish wrote:

On Wed, Feb 12, 2014 at 11:32:39AM -0700, Matt Riedemann wrote:



On 1/17/2014 8:34 AM, Matthew Treinish wrote:

On Fri, Jan 17, 2014 at 08:32:19AM -0500, David Kranz wrote:

On 01/16/2014 10:56 PM, Matthew Treinish wrote:

Hi everyone,

With some recent changes made to Tempest, compatibility with
nosetests is going
away. We've started using newer features that nose just doesn't
support. One
example of this is that we've started using testscenarios and
we're planning to
do this in more places moving forward.

So at Icehouse-3 I'm planning to push the patch out to remove
nosetests from the
requirements list and all the workarounds and references to nose
will be pulled
out of the tree. Tempest will also start raising an unsupported
exception when
you try to run it with nose so that there isn't any confusion on
this moving
forward. We talked about doing this at summit briefly and I've
brought it up a
couple of times before, but I believe it is time to do this now. I
feel for
tempest to move forward we need to do this now so that there isn't
any ambiguity
as we add even more features and new types of testing.

I'm with you up to here.


Now, this will have implications for people running tempest with
python 2.6
since up until now we've set nosetests. There is a workaround for
getting
tempest to run with python 2.6 and testr see:

https://review.openstack.org/#/c/59007/1/README.rst

but essentially this means that when nose is marked as unsupported
on tempest
python 2.6 will also be unsupported by Tempest. (which honestly it
basically has
been for while now just we've gone without making it official)

The way we handle different runners/os can be categorized as tested
in gate, unsupported (should work, possibly some hacks needed),
and hostile. At present, both nose and py2.6 I would say are in
the unsupported category. The title of this message and the content
up to here says we are moving nose to the hostile category. With
only 2 months to feature freeze I see no justification in moving
py2.6 to the hostile category. I don't see what new testing features
scheduled for the next two months will be enabled by saying that
tempest cannot and will not run on 2.6. It has been agreed I think
by all projects that py2.6 will be dropped in J. It is OK that py2.6
will require some hacks to work and if in the next few months it
needs a few more then that is ok. If I am missing another connection
between the py2.6 and nose issues, please explain.



So honestly we're already at this point in tempest. Nose really just
doesn't
work with tempest, and we're adding more features to tempest, your
negative test
generator being one of them, that interfere further with nose. I've
seen several


I disagree here, my team is running Tempest API, CLI and scenario
tests every day with nose on RHEL 6 with minimal issues.  I had to
workaround the negative test discovery by simply sed'ing that out of
the tests before running it, but that's acceptable to me until we
can start testing on RHEL 7.  Otherwise I'm completely OK with
saying py26 isn't really supported and isn't used in the gate, and
it's a buyer beware situation to make it work, which includes
pushing up trivial patches to make it work (which I did a few of
last week, and they were small syntax changes or usages of
testtools).

I don't understand how the core projects can be running unit tests
in the gate on py26 but our functional integration project is going
to actively go out and make it harder to run Tempest with py26, that
sucks.

If we really want to move the test project away from py26, let's
make the concerted effort to get the core projects to move with it.


So as I said before the python 2.6 story for tempest remains the same
after this
change. The only thing that we'll be doing is actively preventing nose
from
working with tempest.



And FWIW, I tried the discover.py patch with unittest2 and
testscenarios last week and either I botched it, it's not documented
properly on how to apply it, or I screwed something up, but it
didn't work for me, so I'm not convinced that's the workaround.

What's the other option for running Tempest on py26 (keeping RHEL 6
in mind)?  Using tox with testr and pip?  I'm doing this all
single-node.


Yes, that is what the discover patch is used to enable. By disabling
nose the
only path to run tempest with py2.6 is to use testr. (which is what it
always
should have been)

Attila confirmed it was working here:
http://fpaste.org/76651/32143139/
in that example he applies 2 patches the second one is currently in
the gate for
tempest. (https://review.openstack.org/#/c/72388/ ) So all that needs
to be done
is to apply that discover patch:

https://code.google.com/p/unittest-ext/issues/detail?id=79

(which I linked to before)

Then tempest should run more or less the same between 2.7 and 2.6.
(The only
difference I've seen is in how skips are handled

Re: [openstack-dev] How do I mark one option as deprecating another one ?

2014-02-27 Thread Matt Riedemann



On 2/27/2014 6:32 AM, Davanum Srinivas wrote:

Phil,

Correct. We don't have this functionality in oslo.config. Please
create a new feature/enhancement request against oslo

thanks,
dims


Done: https://bugs.launchpad.net/oslo/+bug/1285768



On Thu, Feb 27, 2014 at 4:47 AM, Day, Phil philip@hp.com wrote:

Hi Denis,



Thanks for the pointer, but I looked at that and my understanding is that
it only allows me to retrieve a value by an old name, but doesn't let me
know that the old name has been used.  So if all I wanted to do was change
the name/group of the config value it would be fine.  But in my case I need
to be able to implement:

If new_value_defined:
    do_something
else if old_value_defined:
    warn_about_deprecation
    do_something_else



Specifically I want to replace tenant_name based authentication with
tenant_id - so I need to know which has been specified.
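
Something like the following is what I end up writing by hand today; the
option names are just my example and the defaults are left at None so I can
tell which one was actually set:

    import logging

    from oslo.config import cfg

    LOG = logging.getLogger(__name__)

    CONF = cfg.CONF
    CONF.register_opts([
        cfg.StrOpt('tenant_id',
                   help='Tenant ID to authenticate with (preferred)'),
        cfg.StrOpt('tenant_name',
                   help='DEPRECATED: tenant name to authenticate with, '
                        'use tenant_id instead'),
    ])


    def get_tenant_auth():
        # Prefer the new option; fall back to the old one with a warning.
        if CONF.tenant_id:
            return 'tenant_id', CONF.tenant_id
        if CONF.tenant_name:
            LOG.warning('tenant_name is deprecated and will be removed; '
                        'please set tenant_id instead')
            return 'tenant_name', CONF.tenant_name
        raise ValueError('either tenant_id or tenant_name must be set')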



Phil





From: Denis Makogon [mailto:dmako...@mirantis.com]
Sent: 26 February 2014 14:31
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] How do I mark one option as deprecating another
one ?



Here what oslo.config documentation says.


Represents a Deprecated option. Here's how you can use it

    oldopts = [cfg.DeprecatedOpt('oldfoo', group='oldgroup'),
               cfg.DeprecatedOpt('oldfoo2', group='oldgroup2')]
    cfg.CONF.register_group(cfg.OptGroup('blaa'))
    cfg.CONF.register_opt(cfg.StrOpt('foo', deprecated_opts=oldopts),
                          group='blaa')

Multi-value options will return all new and deprecated options.  For
single options, if the new option is present ([blaa]/foo above) it will
override any deprecated options present.  If the new option is not
present and multiple deprecated options are present, the option
corresponding to the first element of deprecated_opts will be chosen.

I hope that it'll help you.


Best regards,

Denis Makogon.



On Wed, Feb 26, 2014 at 4:17 PM, Day, Phil philip@hp.com wrote:

Hi Folks,



I could do with some pointers on config value deprecation.



All of the examples in the code and documentation seem to deal with the
case of old_opt being replaced by new_opt but still returning the same
value.

Here, using deprecated_name and/or deprecated_opts in the definition of
new_opt lets me still get the value (and log a warning) if the config
still uses old_opt.



However my use case is different because, while I want to deprecate old_opt,
new_opt doesn't take the same value and I need to do different things
depending on which is specified, i.e. if old_opt is specified and new_opt
isn't, I still want to do some processing specific to old_opt and log a
deprecation warning.



Clearly I can code this up as a special case at the point where I look for
the options - but I was wondering if there is some clever magic in
oslo.config that lets me declare this as part of the option definition ?







As a second point,  I thought that using a deprecated option automatically
logged a warning, but in the latest Devstack wait_soft_reboot_seconds is
defined as:



    cfg.IntOpt('wait_soft_reboot_seconds',
               default=120,
               help='Number of seconds to wait for instance to shut down after'
                    ' soft reboot request is made. We fall back to hard reboot'
                    ' if instance does not shutdown within this window.',
               deprecated_name='libvirt_wait_soft_reboot_seconds',
               deprecated_group='DEFAULT'),







but if I include the following in nova.conf



 libvirt_wait_soft_reboot_seconds = 20





I can see the new value of 20 being used, but there is no warning logged
that I'm using a deprecated name?



Thanks

Phil




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev







--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] sqlalchemy-migrate release impending

2014-02-28 Thread Matt Riedemann



On 2/26/2014 11:34 AM, Sean Dague wrote:

On 02/26/2014 11:24 AM, David Ripton wrote:

I'd like to release a new version of sqlalchemy-migrate in the next
couple of days.  The only major new feature is DB2 support.  If anyone
thinks this is a bad time, please let me know.



So it would be nice if someone could actually work through the 0.9 sqla
support, because I think it's basically just a change in quoting
behavior that's left (mostly where quoting gets called) -
https://review.openstack.org/#/c/66156/

-Sean



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Looks like the 0.8.3 tag is up so it's just a matter of time before it 
shows up on pypi?


https://review.openstack.org/gitweb?p=stackforge/sqlalchemy-migrate.git;a=commit;h=21fcdad0f485437d010e5743626c63ab3acdaec5

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Concrete Proposal for Keeping V2 API

2014-03-03 Thread Matt Riedemann
 on.


Changing return codes always scares me, we risk breaking code that
says if '==200'. Although having versioned backwards compatible APIs
makes this a little better.


Accept them as wrong but not critically so. With this approach, we can
strive for correctness in the future without changing behavior of our
existing APIs. Nobody seems to complain about them right now, so
changing them seems to be little gain. If the client begins exposing a
version header (which we need for other things) then we could
alternately start returning accurate codes for those clients.


Wait what? client needs version headers? Can you expand

++ to accepting them as wrong and moving on.


The key point here is that we see a way forward with this in the v2 API
regardless of which path we choose.

7) Entrypoint based extensions

The v3 effort included improvements to the infrastructure used to
implement the API, both for proper extensions and modular construction
of the core API.  We definitely want that for v2, and since these are
just internal implementation details, there is no reason why we can't
implement these improvements in the v2 API.  Note that with the addition
of versioning of extensions and the core, we should be adding fewer API
extensions and may be able to collapse some that we already have.

8) Input Validation

Previously, much of our input validation happened at the database driver
layer for various reasons. This means that behavior of the API depends
on such user-invisible things as which RDBM is in use. Thus, our input
validation is already inconsistent across deployments. Further, the move
to objects as the communication mechanism between the API and the
backend means that more input validation is going on than we once had
anyway, and this is not something we can avoid unless we freeze the
backend anyway.

Exposing a definition like jsonschema is something that we should do
anyway, and the process of doing so should allow us to define what was
previously undefined. Given the variance of the behavior depending on
the database used on the backend, getting something reasonably strict
should be roughly equivalent (user-wise) to working against a provider
that was using something other than MySQL.
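
(To make the jsonschema point concrete, a sketch of the sort of definition
being talked about; the schema and field names are illustrative, not Nova's
actual validation code:)

    import jsonschema

    # Example schema for a hypothetical "rename server" request body.
    rename_server = {
        'type': 'object',
        'properties': {
            'server': {
                'type': 'object',
                'properties': {
                    'name': {'type': 'string',
                             'minLength': 1, 'maxLength': 255},
                },
                'required': ['name'],
                'additionalProperties': False,
            },
        },
        'required': ['server'],
        'additionalProperties': False,
    }


    def validate_body(body):
        try:
            jsonschema.validate(body, rename_server)
        except jsonschema.ValidationError as ex:
            # The API layer would turn this into an HTTP 400 for the caller.
            raise ValueError('Invalid request body: %s' % ex.message)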


As long as we can bump the API version (backwards compatible bump of
course) then ++.



9) v3 ... if not now, when?

Not in the foreseeable future.  We don't see it happening unless some
group of people wanted to rebuild the API from scratch in such a way
that it doesn't look anything like the current API.  The new API would
have to be different and compelling enough that we'd be willing to
maintain it along side the current API for several years.



Hooray for deleting code


Personally I'd like to see this bake for a bit and not turn into a huge 
thread like the other one already is; let things settle between now and 
the Juno summit and then let people talk about this in person before 
there are any proposals to completely delete all of the V3 code.  
Unfortunately it didn't come up before the meetup in Utah, but if it can 
sit for a bit I think it'd be helpful to hash it out in Atlanta.





10) Conclusion and Proposal

The v3 API effort has produced a lot of excellent work.  However, the
majority opinion seems to be that we should avoid the cost of
maintaining two APIs if at all possible.  We should apply what has been
learned to the existing API where we can and focus on making v2
something that we can continue to maintain for years to come.

We recognize and accept that it is a failure of Nova project leadership
that we did not come to this conclusion much sooner.  We hope to have
learned from the experience to help avoiding a situation like this
happening again in the future.

Please provide your input on this proposal, even if it is just agreement
with this as the way forward.

Thanks,

Proposal authors / sponsors:

 Russell Bryant
 Dan Smith
 John Garbutt
 Andrew Laski

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Toward SQLAlchemy 0.9.x compatibility everywhere for Icehouse

2014-03-04 Thread Matt Riedemann



On 3/3/2014 8:59 AM, Thomas Goirand wrote:

On 03/03/2014 01:14 PM, Thomas Goirand wrote:

On 03/03/2014 11:24 AM, Thomas Goirand wrote:

It looks like my patch fixes the first unit test failure, though we
still need a fix for the second problem:
AttributeError: 'module' object has no attribute 'AbstractType'


Replying to myself...

It looks like AbstractType is not needed except for backwards
compatibility in SQLA 0.7 and 0.8, and it's gone in 0.9. See:

http://docs.sqlalchemy.org/en/rel_0_7/core/types.html
http://docs.sqlalchemy.org/en/rel_0_8/core/types.html
http://docs.sqlalchemy.org/en/rel_0_9/core/types.html

(reference to AbstractType is gone from the 0.9 doc)

Therefore, I'm tempted to just remove lines 336 and 337, though I am
unsure of what was intended in this piece of code.
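
If outright removal feels risky, a minimal compatibility shim (assuming
AbstractType was only ever kept as a backwards-compatible alias and used
for isinstance()-style checks) could look like this:

from sqlalchemy import types as sqla_types

# Fall back to TypeEngine where AbstractType no longer exists (0.9+).
# Assumes the attribute was only used for isinstance()-style checks.
AbstractType = getattr(sqla_types, 'AbstractType', sqla_types.TypeEngine)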

Your thoughts?

Thomas


Seems Sean already fixed that one, and it was lost in the git review
process (with patches going back and forth). I added it again as a
separate patch, and the unit tests are now OK. It just passed the
gating tests! :)

Cheers, and thanks to Sean and everyone else for the help, hoping to get
this series approved soon,

Thomas Goirand (zigo)


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



You're going to need to rebase on this [1] now since we have a Tempest 
job running against sqlalchemy-migrate patches as of yesterday.  I'm 
trying to figure out why that's failing in devstack-gate-cleanup-host, 
though, so any help there is appreciated.  I'm assuming we missed 
something in the job setup [2].


[1] https://review.openstack.org/#/c/77669/
[2] https://review.openstack.org/#/c/77679/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] pep8 gating fails due to tools/config/check_uptodate.sh

2014-03-04 Thread Matt Riedemann



On 3/4/2014 4:34 PM, Joe Gordon wrote:

So since tools/config/check_uptodate.sh is oslo code, I assumed this
issue falls into the domain of oslo-incubator.

Until this gets resolved, nova is considering
https://review.openstack.org/#/c/78028/


Keystone too: https://review.openstack.org/#/c/78030/



On Wed, Feb 5, 2014 at 9:21 AM, Daniel P. Berrange berra...@redhat.com wrote:

On Wed, Feb 05, 2014 at 11:56:35AM -0500, Doug Hellmann wrote:

On Wed, Feb 5, 2014 at 11:40 AM, Chmouel Boudjnah chmo...@enovance.comwrote:



On Wed, Feb 5, 2014 at 4:20 PM, Doug Hellmann doug.hellm...@dreamhost.com

wrote:



Including the config file in either the developer documentation or the
packaging build makes more sense. I'm still worried that adding it to the
sdist generation means you would have to have a lot of tools installed just
to make the sdist. However, we could




I think that may slightly complicate devstack, since we rely
heavily on config samples to set up the services.



Good point, we would need to add a step to generate a sample config for
each app instead of just copying the one in the source repository.


Which is what 'python setup.py build' for an app would take care of.

Regards,
Daniel
--
|: http://berrange.com  -o-   http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
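
For reference, a hedged sketch of the kind of option-listing hook a
sample-config generator can consume, instead of keeping a hand-maintained
sample in the tree (the option name and the entry-point wiring are
illustrative assumptions, not real Nova code):

from oslo_config import cfg

_opts = [
    cfg.IntOpt('workers',
               default=4,
               help='Illustrative option only, not a real Nova option.'),
]


def list_opts():
    # A sample-config generator discovers this through an
    # 'oslo.config.opts' entry point declared in setup.cfg
    # (assumption about the wiring, not the project's actual setup).
    return [('DEFAULT', _opts)]

Generating the sample from hooks like that at build or docs time would
avoid the check_uptodate.sh churn every time an option's help text
changes.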





--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

