Re: [openstack-dev] [Nova] FFE Request: Oslo: i18n Message improvements

2014-03-07 Thread Matt Riedemann



On 3/7/2014 4:15 AM, Sean Dague wrote:

On 03/06/2014 04:46 PM, James Carey wrote:

 Please consider an FFE for i18n Message improvements:
BP: https://blueprints.launchpad.net/nova/+spec/i18n-messages

 The base enablement for lazy translation has already been sync'd
from oslo.   This patch was to enable lazy translation support in Nova.
  It is titled re-enable lazy translation because this was enabled during
Havana but was pulled due to issues that have since been resolved.

 In order to enable lazy translation it is necessary to do the
following things:

   (1) Fix a bug in oslo with respect to how keywords are extracted from
the format strings when saving replacement text for use when the message
translation is done.   This is
https://bugs.launchpad.net/nova/+bug/1288049, which I'm actively working
on a fix for in oslo.  Once that is complete it will need to be sync'd
into nova.

   (2) Remove concatenation (+) of translatable messages.  The current
class that is used to hold the translatable message
(gettextutils.Message) does not support concatenation.  There were a few
cases in Nova where this was done and they are converted to other means
of combining the strings in:
https://review.openstack.org/#/c/78095 (Remove use of concatenation on
messages)
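A sketch of the failure mode and the replacement pattern (illustrative only;
this models the behavior described above, not the real gettextutils.Message
implementation):

```python
# Message-like object that rejects '+', as gettextutils.Message does:
class Message(str):
    def __add__(self, other):
        raise TypeError('Message objects do not support concatenation')

msg = Message('Failed to boot instance')
# msg + ': timeout' would raise TypeError.  Combining via string
# formatting works, and keeps each translatable string intact:
combined = '%(msg)s: %(detail)s' % {'msg': msg, 'detail': 'timeout'}
# combined == 'Failed to boot instance: timeout'
```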

   (3) Remove the use of str() on exceptions.  The intent of this is to
return the message contained in the exception, but these messages may
contain unicode, so str cannot be used on them and gettextutils.Message
enforces this.  Thus these need
to either be removed and allow python formatting to do the right thing,
or changed to unicode().  Since unicode() will change to str() in Py3,
the forward compatible six.text_type() is used instead.  This is done in:
https://review.openstack.org/#/c/78096 (Remove use of str() on exceptions)
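A minimal sketch of the distinction, using a local stand-in for
six.text_type rather than the six library itself:

```python
import sys

# six.text_type is just the unicode string type on each major version:
text_type = str if sys.version_info[0] >= 3 else unicode  # noqa: F821

# A translated message may contain non-ASCII characters:
message = u'Instance non trouv\u00e9e'
# On Python 2, str() of a unicode value implicitly encodes to ASCII and
# raises UnicodeEncodeError; text_type() is safe on both versions, and
# on Python 3 it is simply str().
text = text_type(message)
```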

   (4) The addition of the call that enables the use of lazy messages.
  This is in:
https://review.openstack.org/#/c/73706 (Re-enable lazy translation).
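The idea behind a lazy message, as a toy model (hypothetical; not the oslo
implementation): capture the msgid and parameters now, and defer rendering
into a concrete locale until output time.

```python
class LazyMessage(object):
    """Toy model of lazy translation: capture now, translate later."""
    def __init__(self, msgid, params=None):
        self.msgid = msgid
        self.params = params or {}

    def translate(self, catalog):
        # catalog maps msgid -> translated format string for one locale;
        # fall back to the original msgid when no translation exists.
        return catalog.get(self.msgid, self.msgid) % self.params

msg = LazyMessage('Instance %(id)s not found', {'id': 'abc'})
# The same stored message can be rendered per-request, per-locale:
english = msg.translate({})
french = msg.translate(
    {'Instance %(id)s not found': 'Instance %(id)s introuvable'})
# english == 'Instance abc not found'
# french  == 'Instance abc introuvable'
```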

 Lazy translation has been enabled in the other projects, so it would
be beneficial for Nova to be consistent with them with respect to
message translation.


Unless it has landed in *every other* integrated project besides Nova, I
don't find this compelling.

I have tested that the changes in (2) and (3) work
when lazy translation is not enabled.  Thus if a problem is found, the
two line change in (4) could be removed to get to the previous behavior.

 I've been talking to Matt Riedemann and Dan Berrange about this.
  Matt has agreed to be a sponsor.


If this is enabled in other projects, where is the Tempest scenario test
that actually demonstrates that this is working on real installs?

I get that everyone has features that didn't hit. However, now is not
the time for that; now is the time for people to get focused on bug
hunting. And especially if we are talking about *another* oslo sync.

-1

-Sean



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



The Tempest requirement just came up yesterday.  FWIW, this i18n stuff
has been working its way in since Grizzly, and the requirement for
Tempest coverage is new.  I'm not saying it's not valid, but the timing
sucks - but that's life.


Also, the oslo sync would be to one module, gettextutils, which I don't 
think pulls in anything else from oslo.


Anyway, this is in Keystone, Glance, Cinder, Neutron and Ceilometer at 
least.  Patches are working their way through Heat as I understand it.


I'm not trying to turn this into a crusade, just trying to get out what 
I know about the current state of things.  I'll let Jim Carey or Jay 
Bryant discuss it more since they've been more involved in the 
blueprints across all the projects.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova][db] Thoughts on making instances.uuid non-nullable?

2014-03-09 Thread Matt Riedemann



On 12/16/2013 11:01 AM, Shawn Hartsock wrote:

+1 on a migration to make uuid a non-nullable column. I advocated a few
patches back in Havana that make assumptions based on the UUID being
present and unique per instance. If it gets nulled the VMware drivers
will have breakage and I have no idea how to avoid that reasonably
without the UUID.


On Mon, Dec 16, 2013 at 11:59 AM, Russell Bryant rbry...@redhat.com
mailto:rbry...@redhat.com wrote:

On 12/16/2013 11:45 AM, Matt Riedemann wrote:
  1. Add a migration to change instances.uuid to non-nullable.
Besides the
  obvious con of having yet another migration script, this seems
the most
  straight-forward. The instance object class already defines the uuid
  field as non-nullable, so it's constrained at the objects layer, just
  not in the DB model.  Plus I don't think we'd ever have a case where
  instance.uuid is null, right?  Seems like a lot of things would break
  down if that happened.  With this option I can build on top of it for
  the DB2 migration support to add the same FKs as the other engines.

Yeah, having instance.uuid nullable doesn't seem valuable to me, so this
seems OK.

--
Russell Bryant





--
# Shawn.Hartsock - twitter: @hartsock - plus.google.com/+ShawnHartsock





I've been working on this more and am running up against some issues, 
part of this has to do with my lack of sqlalchemy know-how and 
inexperience with writing DB migrations, so dumping some info/problems 
here to see where people would like me to take this.


My original thinking for doing a migration was to delete the instances
records where uuid == None, moving them into shadow_instances, and then
make instances.uuid nullable=False.  Some of the problems with this
approach are:


1. There are at least 5 other tables related to instances that need to 
be handled for a delete: InstanceFault, InstanceMetadata, 
InstanceSystemMetadata, InstanceInfoCache and 
SecurityGroupInstanceAssociation. Also, these tables don't define their
instance_uuid column the same way: some have it nullable=False and
others don't.


2. I'm not sure if I can use a session in the migration to make it a 
transaction.


3. This would make the instances and shadow_instances tables have 
different schemas, i.e. instances.uuid would be nullable=False in 
instances but nullable=True in shadow_instances.  Maybe this doesn't matter.


The whole reason behind using shadow_instances (or any backup table I 
guess) was so I could restore the records on DB downgrade.


So the more I think about this, I'm getting to the point of asking:

1. Do we even care about instances where uuid is None?  I'd have to 
think those wouldn't be working well in the current code with how 
everything relies on uuid for foreign keys and tracking relationships to 
volumes, images and networks across services.  If the answer is 'no' 
then the migration is pretty simple, just delete the records where uuid 
is None and be done with it.  You couldn't downgrade to get them back, 
but in this case we're asserting that we don't want them back.
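Under option 1 the heart of the migration is just a delete plus the
constraint change. A runnable sketch of the delete step against a toy sqlite
schema (the real migration would use sqlalchemy-migrate and also alter the
column to NOT NULL):

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.executescript("""
    CREATE TABLE instances (id INTEGER PRIMARY KEY, uuid TEXT);
    INSERT INTO instances (uuid) VALUES ('aaa-111'), (NULL), ('bbb-222');
""")

# Option 1: drop the orphaned records outright; we assert we don't want
# them back, so there is nothing to restore on downgrade.
conn.execute("DELETE FROM instances WHERE uuid IS NULL")
remaining = conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
# remaining == 2: only the rows with a uuid survive
```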


2. Have an alternative schema in the DB2 case. This would be handled in 
the 216_havana migration when the instances table is defined and 
created, we'd just make the uuid column non-nullable in the DB2 case and 
leave it nullable for all other engines.  Anyone moving to DB2 would 
have to install from scratch anyway since there is no tooling to migrate 
a MySQL DB to DB2, for example.  As it stands, the 216_havana migration 
in my patch [1] already has a different schema for DB2 because of the 
foreign keys it can't create due to this problem.
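Option 2 essentially branches the DDL on the engine name ('ibm_db_sa' is
assumed here as the DB2 dialect name). The effect, sketched against sqlite:

```python
import sqlite3

def uuid_ddl(engine_name):
    # Hypothetical sketch: only the DB2 dialect gets NOT NULL, since DB2
    # installs start from scratch and need it for the foreign keys; all
    # other engines keep the historical nullable definition.
    nullable = engine_name != 'ibm_db_sa'
    return 'uuid VARCHAR(36)' + ('' if nullable else ' NOT NULL')

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE instances (id INTEGER PRIMARY KEY, %s)'
             % uuid_ddl('ibm_db_sa'))
# On the DB2-style schema a NULL uuid is now rejected:
try:
    conn.execute('INSERT INTO instances (uuid) VALUES (NULL)')
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
# rejected == True
```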


Anyway, looking for some thoughts on how to best handle this, or if 
anyone has other ideas or good reasons why either approach couldn't be used.


[1] https://review.openstack.org/#/c/69047/

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova][db] Thoughts on making instances.uuid non-nullable?

2014-03-10 Thread Matt Riedemann



On 3/9/2014 9:05 PM, ChangBo Guo wrote:




2014-03-10 4:47 GMT+08:00 Jay Pipes jaypi...@gmail.com
mailto:jaypi...@gmail.com:



  3. This would make the instances and shadow_instances tables have
  different schemas, i.e. instances.uuid would be nullable=False in
  instances but nullable=True in shadow_instances.  Maybe this
doesn't matter.

No, I don't think this matters much, to be honest. I'm not entirely sure
what the long-term purpose of the shadow tables are in Nova -- perhaps
someone could clue me in to whether the plan is to keep them around?

As I know, the shadow_* tables are used by the command 'nova-manage db
archive_deleted_rows', which moves records with deleted=True to the
shadow_* tables. That means these tables are used by another process, so
I think we need other tables to store the old records in your migration.


So you mean move records where instances.uuid == None to 
shadow_instances?  That's not going to work though if we make the uuid 
column non-nullable on both instances and shadow_instances, unless you 
generate a random UUID for the shadow_instances table records that get 
moved, which is just another hack - and that would break moving them 
back on downgrade since you wouldn't know which records to move back, 
i.e. since you wouldn't be able to query shadow_instances for records 
where instances.uuid == None.


Other thoughts?  If you did really want to back these records up, I 
think it would have to be a different backup table rather than 
shadow_instances since I think we want to keep the schema the same 
between instances and shadow_instances.







--
ChangBo Guo(gcb)





--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova][db] Thoughts on making instances.uuid non-nullable?

2014-03-10 Thread Matt Riedemann



On 3/9/2014 9:18 PM, Jay Pipes wrote:

On Mon, 2014-03-10 at 10:05 +0800, ChangBo Guo wrote:




2014-03-10 4:47 GMT+08:00 Jay Pipes jaypi...@gmail.com:


  3. This would make the instances and shadow_instances tables
 have
  different schemas, i.e. instances.uuid would be
 nullable=False in
  instances but nullable=True in shadow_instances.  Maybe this
 doesn't matter.


 No, I don't think this matters much, to be honest. I'm not
 entirely sure
 what the long-term purpose of the shadow tables are in Nova --
 perhaps
 someone could clue me in to whether the plan is to keep them
 around?


As I know, the shadow_* tables are used by the command 'nova-manage db
archive_deleted_rows', which moves records with deleted=True to the
shadow_* tables. That means these tables are used by another process,
so I think we need other tables to store the old records in your
migration.


Yeah, that's what I understood the shadow tables were used for, I just
didn't know what the long-term future of these tables was... curious if
there's been any discussion about that.

Best,
-jay






I think Joe Gordon was working on something in the hopes of eventually 
killing the shadow tables but I can't remember exactly what that was now.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova] FFE Request: Ephemeral RBD image support

2014-03-11 Thread Matt Riedemann



On 3/10/2014 11:20 AM, Dmitry Borodaenko wrote:

On Fri, Mar 7, 2014 at 8:55 AM, Sean Dague s...@dague.net wrote:

On 03/07/2014 11:16 AM, Russell Bryant wrote:

On 03/07/2014 04:19 AM, Daniel P. Berrange wrote:

On Thu, Mar 06, 2014 at 12:20:21AM -0800, Andrew Woodward wrote:

I'd like to request an FFE for the remaining patches in the Ephemeral
RBD image support chain

https://review.openstack.org/#/c/59148/
https://review.openstack.org/#/c/59149/

are still open after their dependency
https://review.openstack.org/#/c/33409/ was merged.

These should be low risk as:
1. We have been testing with this code in place.
2. It's nearly all contained within the RBD driver.

This is needed as it implements an essential functionality that has
been missing in the RBD driver and this will become the second release
it's been attempted to be merged into.


Add me as a sponsor.


OK, great.  That's two.

We have a hard deadline of Tuesday to get these FFEs merged (regardless
of gate status).



As alt release manager, FFE approved based on Russell's approval.

The merge deadline for Tuesday is the release meeting, not end of day.
If it's not merged by the release meeting, it's dead, no exceptions.


Both commits were merged, thanks a lot to everyone who helped land
this in Icehouse! Especially to Russell and Sean for approving the FFE,
and to Daniel, Michael, and Vish for reviewing the patches!



There was a bug reported today [1] that looks like a regression in this 
new code, so we need people involved in this looking at it as soon as 
possible because we have a proposed revert in case we need to yank it 
out [2].


[1] https://bugs.launchpad.net/nova/+bug/1291014
[2] 
https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bug/1291014,n,z


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova] FFE Request: Ephemeral RBD image support

2014-03-11 Thread Matt Riedemann



On 3/11/2014 3:11 PM, Jay Pipes wrote:

On Tue, 2014-03-11 at 14:18 -0500, Matt Riedemann wrote:


On 3/10/2014 11:20 AM, Dmitry Borodaenko wrote:

On Fri, Mar 7, 2014 at 8:55 AM, Sean Dague s...@dague.net wrote:

On 03/07/2014 11:16 AM, Russell Bryant wrote:

On 03/07/2014 04:19 AM, Daniel P. Berrange wrote:

On Thu, Mar 06, 2014 at 12:20:21AM -0800, Andrew Woodward wrote:

I'd like to request an FFE for the remaining patches in the Ephemeral
RBD image support chain

https://review.openstack.org/#/c/59148/
https://review.openstack.org/#/c/59149/

are still open after their dependency
https://review.openstack.org/#/c/33409/ was merged.

These should be low risk as:
1. We have been testing with this code in place.
2. It's nearly all contained within the RBD driver.

This is needed as it implements an essential functionality that has
been missing in the RBD driver and this will become the second release
it's been attempted to be merged into.


Add me as a sponsor.


OK, great.  That's two.

We have a hard deadline of Tuesday to get these FFEs merged (regardless
of gate status).



As alt release manager, FFE approved based on Russell's approval.

The merge deadline for Tuesday is the release meeting, not end of day.
If it's not merged by the release meeting, it's dead, no exceptions.


Both commits were merged, thanks a lot to everyone who helped land
this in Icehouse! Especially to Russell and Sean for approving the FFE,
and to Daniel, Michael, and Vish for reviewing the patches!



There was a bug reported today [1] that looks like a regression in this
new code, so we need people involved in this looking at it as soon as
possible because we have a proposed revert in case we need to yank it
out [2].

[1] https://bugs.launchpad.net/nova/+bug/1291014
[2]
https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bug/1291014,n,z


Note that I have identified the source of the problem and am pushing a
patch shortly with unit tests.

Best,
-jay





My concern is how much else in nova assumes it is working with the
glance v2 API, because there was a nova blueprint [1] to make nova work
with the glance v2 API but that never landed in Icehouse. So I'm worried
about whack-a-mole type problems here, especially since there is no
tempest coverage for testing multiple image location support via nova.


[1] https://blueprints.launchpad.net/nova/+spec/use-glance-v2-api

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova] FFE Request: Ephemeral RBD image support

2014-03-11 Thread Matt Riedemann



On 3/11/2014 5:11 PM, Dmitry Borodaenko wrote:

On Tue, Mar 11, 2014 at 1:31 PM, Matt Riedemann
mrie...@linux.vnet.ibm.com wrote:

There was a bug reported today [1] that looks like a regression in this
new code, so we need people involved in this looking at it as soon as
possible because we have a proposed revert in case we need to yank it
out [2].

[1] https://bugs.launchpad.net/nova/+bug/1291014
[2] 
https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bug/1291014,n,z


Note that I have identified the source of the problem and am pushing a
patch shortly with unit tests.


My concern is how much else in nova assumes it is working with the glance v2
API, because there was a nova blueprint [1] to make nova work with the glance
v2 API but that never landed in Icehouse. So I'm worried about whack-a-mole
type problems here, especially since there is no tempest coverage for
testing multiple image location support via nova.

[1] https://blueprints.launchpad.net/nova/+spec/use-glance-v2-api


As I mentioned in the bug comments, the code that made the assumption
about glance v2 API actually landed in September 2012:
https://review.openstack.org/13017

The multiple image location patch simply made use of a method that was
already there for more than a year.

-DmitryB




Yeah, I pointed that out today in IRC also.

So kudos to Jay for getting a patch up quickly, and a really nice one at 
that with extensive test coverage.


What I'd like to see in Juno is a tempest test that covers the multiple 
image locations code since it seems we obviously don't have that today. 
 How hard is something like that with an API test?


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] An analysis of code review in Nova

2014-03-12 Thread Matt Riedemann
 this later and will add unit tests then' or 'it's hard to 
test this path without a lot of changes to how the tests are working'. 
That's unacceptable to me, and I generally give up on the review after that.


So to move this all forward, I think that bp above should be top
priority for the vmware team in Juno to keep bp patches moving at the
pace they do, because the features and refactoring just keep coming and
at least for me it's very easy to burn out on looking at those reviews.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova] FFE Request: Ephemeral RBD image support

2014-03-12 Thread Matt Riedemann



On 3/12/2014 6:32 PM, Dan Smith wrote:

I'm confused as to why we arrived at the decision to revert the commits
since Jay's patch was accepted. I'd like some details about this
decision, and what new steps we need to take to get this back in for Juno.


Jay's fix resolved the immediate problem that was reported by the user.
However, after realizing why the bug manifested itself and why it didn't
occur during our testing, all of the core members involved recommended a
revert as the least-risky course of action at this point. If it took
almost no time for that change to break a user that wasn't even using
the feature, we're fearful about what may crop up later.

We talked with the patch author (zhiyan) in IRC for a while after making
the decision to revert about what the path forward for Juno is. The
tl;dr as I recall is:

  1. Full Glance v2 API support merged
  2. Tests in tempest and nova that exercise Glance v2, and the new
 feature
  3. Push the feature patches back in

--Dan




Those are essentially the steps as I remember them too.  Sean changed 
the dependencies in the blueprints so the nova glance v2 blueprint is 
the root dependency, then multiple images and then the other download 
handler blueprints at the top.  I haven't checked, but the blueprints
should be marked as not complete (not sure what that would be now) and
marked for next; the v2 glance root blueprint should be marked as high
priority too so we get the proper focus when Juno opens up.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] An analysis of code review in Nova

2014-03-13 Thread Matt Riedemann



On 3/12/2014 7:29 PM, Arnaud Legendre wrote:

Hi Matt,

I totally agree with you and actually we have been discussing this a lot 
internally the last few weeks.
. As a top priority, the driver MUST integrate with oslo.vmware. This will be 
achieved through this chain of patches [1]. We want these patches to be merged 
before other things.
I think we should stop introducing more complexity which makes the task of refactoring 
more and more complicated. The integration with oslo.vmware is not a refactoring but 
should be seen as a way to get a more lightweight version of the driver which 
will make the task of refactoring a bit easier.
. Then, we want to actually refactor; we have had several meetings to decide
what is the best strategy to adopt going forward (and avoid reproducing the
same mistakes).
The highest priority is spawn(): we need to make it modular, remove nested 
methods. This refactoring work should include the integration with the image 
handler framework [2] and introducing the notion of image type object to avoid 
all these conditions on types of images inside the core logic.


Breaking up the spawn method to make it modular and thus testable or 
refactoring to use oslo.vmware, order there doesn't seem to really 
matter to me since both sound good.  But this scares me:


This refactoring work should include the integration with the image 
handler framework


Hopefully the refactoring being talked about here with oslo.vmware and 
breaking spawn into chunks can be done *before* any work to refactor the 
vmware driver to use the multiple image locations feature - it will 
probably have to be given that was reverted out of Icehouse and will 
have some prerequisite work to do before it will land in Juno.



. I would like to see you cores involved in this design since you will be
reviewing the code at some point. "Involved" here can be interpreted as reviewing
the design, and/or actually participating in the design discussions. I would like
to get your POV on this.

Let me know if this approach makes sense.

Thanks,
Arnaud

[1] https://review.openstack.org/#/c/70175/
[2] https://review.openstack.org/#/c/33409/


- Original Message -
From: Matt Riedemann mrie...@linux.vnet.ibm.com
To: openstack-dev@lists.openstack.org
Sent: Wednesday, March 12, 2014 11:28:23 AM
Subject: Re: [openstack-dev] [nova] An analysis of code review in Nova



On 2/25/2014 6:36 AM, Matthew Booth wrote:

I'm new to Nova. After some frustration with the review process,
specifically in the VMware driver, I decided to try to visualise how the
review process is working across Nova. To that end, I've created 2
graphs, both attached to this mail.

Both graphs show a nova directory tree pruned at the point that a
directory contains less than 2% of total LOCs. Additionally, /tests and
/locale are pruned as they make the resulting graph much busier without
adding a great deal of useful information. The data for both graphs was
generated from the most recent 1000 changes in gerrit on Monday 24th Feb
2014. This includes all pending changes, just over 500, and just under
500 recently merged changes.

pending.svg shows the percentage of LOCs which have an outstanding
change against them. This is one measure of how hard it is to write new
code in Nova.

merged.svg shows the average length of time between the
ultimately-accepted version of a change being pushed and being approved.

Note that there are inaccuracies in these graphs, but they should be
mostly good. Details of generation here:
https://github.com/mdbooth/heatmap. This code is obviously
single-purpose, but is free for re-use if anyone feels so inclined.

The first graph above (pending.svg) is the one I was most interested in,
and shows exactly what I expected it to. Note the size of 'vmwareapi'.
If you check out Nova master, 24% of the vmwareapi driver has an
outstanding change against it. It is practically impossible to write new
code in vmwareapi without stomping on an oustanding patch. Compare that
to the libvirt driver at a much healthier 3%.

The second graph (merged.svg) is an attempt to look at why that is.
Again comparing the VMware driver with the libvirt driver, we can see that at 12
days, it takes much longer for a change to be approved in the VMware
driver than in the libvirt driver. I suspect that this isn't the whole
story, which is likely a combination of a much longer review time with
very active development.

What's the impact of this? As I said above, it obviously makes it very
hard to come in as a new developer of the VMware driver when almost a
quarter of it has been rewritten, but you can't see it. I am very new to
this and others should validate my conclusions, but I also believe this
is having a detrimental

Re: [openstack-dev] Duplicate code for processing REST APIs

2014-03-13 Thread Matt Riedemann



On 3/13/2014 4:13 PM, Roman Podoliaka wrote:

Hi Steven,

Code from the openstack/common/ dir is 'synced' from oslo-incubator. The
'sync' is effectively a copy of the oslo-incubator subtree into a
project's source tree. As syncs are not done at the same time, the code
of synced modules may indeed be different for each project depending on
which commit of oslo-incubator was synced.

Thanks,
Roman

On Thu, Mar 13, 2014 at 2:03 PM, Steven Kaufer kau...@us.ibm.com wrote:

While investigating some REST API updates I've discovered that there is a
lot of duplicated code across the various OpenStack components.

For example, the paginate_query function exists in all these locations and
there are a few slight differences between most of them:

https://github.com/openstack/ceilometer/blob/master/ceilometer/openstack/common/db/sqlalchemy/utils.py#L61
https://github.com/openstack/cinder/blob/master/cinder/openstack/common/db/sqlalchemy/utils.py#L37
https://github.com/openstack/glance/blob/master/glance/openstack/common/db/sqlalchemy/utils.py#L64
https://github.com/openstack/heat/blob/master/heat/openstack/common/db/sqlalchemy/utils.py#L62
https://github.com/openstack/keystone/blob/master/keystone/openstack/common/db/sqlalchemy/utils.py#L64
https://github.com/openstack/neutron/blob/master/neutron/openstack/common/db/sqlalchemy/utils.py#L61
https://github.com/openstack/nova/blob/master/nova/openstack/common/db/sqlalchemy/utils.py#L64

Does anyone know if there is any work going on to move stuff like this into
oslo and then deprecate these functions?  There are also many functions that
process the REST API request parameters (getting the limit, marker, sort
data, etc.) that are also replicated across many components.
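For illustration, the duplicated helpers all implement the same
sort/marker/limit pattern; an in-memory toy version (not the actual oslo
code):

```python
def paginate(items, sort_key, limit, marker=None):
    """Toy version of the paginate_query pattern: sort by the sort key,
    seek past the marker item, then apply the limit."""
    items = sorted(items, key=lambda x: x[sort_key])
    if marker is not None:
        keys = [x[sort_key] for x in items]
        items = items[keys.index(marker) + 1:]
    return items[:limit]

rows = [{'id': i} for i in (3, 1, 2, 5, 4)]
page1 = paginate(rows, 'id', limit=2)            # rows with ids 1, 2
page2 = paginate(rows, 'id', limit=2, marker=2)  # rows with ids 3, 4
```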

If no existing work is done in this area, how should this be tackled?  As a
blueprint for Juno?

Thanks,

Steven Kaufer
Cloud Systems Software
kau...@us.ibm.com 507-253-5104
Dept HMYS / Bld 015-2 / G119 / Rochester, MN 55901








Steve, more info here on oslo-incubator:

https://wiki.openstack.org/wiki/Oslo#Incubation

Welcome! :)

--

Thanks,

Matt Riedemann




Re: [openstack-dev] Constructive Conversations

2014-03-18 Thread Matt Riedemann



On 3/7/2014 1:56 PM, Kurt Griffiths wrote:

Folks,

I’m sure that I’m not the first person to bring this up, but I’d like to
get everyone’s thoughts on what concrete actions we, as a community, can
take to improve the status quo.

There have been a variety of instances where community members have
expressed their ideas and concerns via email or at a summit, or simply
submitted a patch that perhaps challenges someone’s opinion of The Right
Way to Do It, and responses to that person have been far less
constructive than they could have been[1]. In an open community, I don’t
expect every person who comments on a ML post or a patch to be
congenial, but I do expect community leaders to lead by example when it
comes to creating an environment where every person’s voice is valued
and respected.

What if every time someone shared an idea, they could do so without fear
of backlash and bullying? What if people could raise their concerns
without being summarily dismissed? What if “seeking first to
understand”[2] were a core value in our culture? It would not only
accelerate our pace of innovation, but also help us better understand
the needs of our cloud users, helping ensure we aren’t just building
OpenStack in the right way, but also building /the right OpenStack/.

We need open minds to build an open cloud.

Many times, we /do/ have wonderful, constructive discussions, but the
times we don’t cause wounds in the community that take a long time to
heal. Psychologists tell us that it takes a lot of good experiences to
make up for one bad one. I will be the first to admit I’m not perfect.
Communication is hard. But I’m convinced we can do better. We /must/ do
better.

How can we build on what is already working, and make the bad
experiences as rare as possible?

A few ideas to seed the discussion:

  * Identify a set of core values that the community already embraces
for the most part, and put them down “on paper.”[3] Leaders can keep
these values fresh in everyone’s minds by (1) leading by example,
and (2) referring to them regularly in conversations and talks.
  * PTLs can add mentoring skills and a mindset of seeking first to
understand” to their list of criteria for evaluating proposals to
add a community member to a core team.
  * Get people together in person, early and often. Mid-cycle meetups
and mini-summits provide much higher-resolution communication
channels than email and IRC, and are great ways to clear up
misunderstandings, build relationships of trust, and generally get
everyone pulling in the same direction.

What else can we do?

Kurt

[1] There are plenty of examples, going back years. Anyone who has been
in the community very long will be able to recall some to mind. Recent
ones I thought of include Barbican’s initial request for incubation on
the ML, dismissive and disrespectful exchanges in some of the design
sessions in Hong Kong (bordering on personal attacks), and the
occasional “WTF?! This is the dumbest idea ever!” patch comment.
[2] https://www.stephencovey.com/7habits/7habits-habit5.php
[3] We already have a code of conduct
https://www.openstack.org/legal/community-code-of-conduct/ but I think
a list of core values would be easier to remember and allude to in
day-to-day discussions. I’m trying to think of ways to make this idea
practical. We need to stand up for our values, not just /say/ we have them.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Not to detract from what you're saying, but this is 'meh' to me.  My 
company has some different kind of values thing every 6 months, it seems, 
and maybe it's just me, but I never really pay attention to any of it.  I 
think I have to put something on my annual goals/results about it, but 
it's just fluffy wording.


To me this is a self-policing community: if someone is being a dick, the 
others should call them on it, or the PTL for the project should stand 
up against it and set the tone for the community and culture his project 
wants to have.  That's been my experience at least.


Maybe some people would find codifying this helpful, but there are 
already lots of wikis and things that people can't remember on a daily 
basis, so adding another probably isn't going to help the problem. 
Bullies don't tend to care about codes, but if people stand up against 
them in public they should be outcast.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Openstack-dev] [Nova] use Keystone V3 token to volume attachment

2014-03-19 Thread Matt Riedemann



On 3/19/2014 2:48 AM, Shao Kai SK Li wrote:

Hello:

  I am working on this
patch(https://review.openstack.org/#/c/77524/) to fix bugs about volume
attach failure with keystone V3 token.

  Just wonder, is there some blue prints or plans in Juno to address
keystone V3 support in nova ?

  Thank you in advance.


Best Regards~~~

Li, Shaokai





I have this on the nova meeting agenda for tomorrow [1].  I would think 
at a minimum this means running compute tests in Tempest against a 
keystone v3 backend.  I'm not sure what the current state of Tempest is 
regarding keystone v3.  Note that this isn't the only thing that made it 
into nova in Icehouse related to keystone v3 [2].


[1] https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting
[2] https://review.openstack.org/69972
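As a rough sketch of what running the compute tests against a keystone v3 backend could mean in practice, the fragment below shows the sort of tempest.conf settings involved. The option names are best-effort recollections from that era of Tempest and should be verified against the actual sample config in use:

```ini
# tempest.conf (illustrative fragment; verify option names against your
# Tempest release's etc/tempest.conf.sample)
[identity]
# Identity API version the tests authenticate with: v2 or v3
auth_version = v3
# Endpoints for each API version (hostname is a placeholder)
uri = http://keystone.example.com:5000/v2.0/
uri_v3 = http://keystone.example.com:5000/v3/
```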

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Openstack-dev] [Nova] use Keystone V3 token to volume attachment

2014-03-20 Thread Matt Riedemann



On 3/19/2014 10:02 AM, Matthew Treinish wrote:

On Wed, Mar 19, 2014 at 09:35:34AM -0500, Matt Riedemann wrote:



On 3/19/2014 2:48 AM, Shao Kai SK Li wrote:

Hello:

  I am working on this
patch(https://review.openstack.org/#/c/77524/) to fix bugs about volume
attach failure with keystone V3 token.

  Just wonder, is there some blue prints or plans in Juno to address
keystone V3 support in nova ?

  Thank you in advance.


Best Regards~~~

Li, Shaokai





I have this on the nova meeting agenda for tomorrow [1].  I would
think at a minimum this means running compute tests in Tempest
against a keystone v3 backend.  I'm not sure what the current state
of Tempest is regarding keystone v3.  Note that this isn't the only
thing that made it into nova in Icehouse related to keystone v3 [2].


On the tempest side there are some dedicated keystone v3 API tests; I'm not
sure how well things are covered there, though. As for using keystone v3 auth
for the other tests, tempest doesn't quite support that yet. Andrea Frittoli is
working on a bp to get this working:

https://blueprints.launchpad.net/tempest/+spec/multi-keystone-api-version-tests

But, at this point it will probably end up being an early Juno thing before this
can be enabled everywhere in tempest.

-Matt Treinish




Furthermore Russell talked to Dolph in IRC and Dolph created this 
blueprint for planning the path forward from keystone v2 to v3:


https://blueprints.launchpad.net/keystone/+spec/document-v2-to-v3-transition

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] An analysis of code review in Nova

2014-03-22 Thread Matt Riedemann



On 3/22/2014 5:19 AM, Shawn Hartsock wrote:

On Fri, Mar 14, 2014 at 6:58 PM, Dan Smith d...@danplanet.com wrote:

Review latency will be directly affected by how well the refactoring
changes are staged. If they are small, on-topic and easy to validate,
they will go quickly. They should be linearized unless there are some
places where multiple sequences of changes make sense (i.e. refactoring
a single file that results in no changes required to others).



I'm going to bring this to the next
https://wiki.openstack.org/wiki/Meetings/VMwareAPI we can start
working on how we'll set the order for this kind of work. Currently we
have a whole blueprint for refactoring a single method. That seems
silly. I'll want to come up with a plan around how to restructure the
driver so we can avoid some of the messes we've seen in the past.


I think the point of starting with refactoring the nested method mess in 
the spawn method was it (1) seemed relatively trivial (think fast review 
turnaround) and (2) should be a good bang for the buck kind of change, 
as a lot of the original complaint was related to how hard it is to 
verify changes in the giant spawn method are tested - which you also 
point out below.




I want to avoid one big refactor effort that drags on, but I also want
to address bigger problems we have inside the driver. For example,


I also want to avoid a big refactor effort dragging on, and I like the 
thinking on design changes, but are they going to be happening at 
the same time?  Or is the complete re-design going to supersede the 
refactoring?  My only concern is biting off more than can be chewed in 
juno-1.


Plus there is the refactor to use oslo.vmware, where does that fit into 
this?



vm_util.py seems to have become burdened with work that it shouldn't
have. It also performs a great number of unnecessary round trips using
a vm_ref to pull individual vm details over one at a time. Introducing
a VirtualMachine object that held all these references would simplify
some operations (I'm not the first person to suggest this and it
wasn't novel to me either when it was presented.)

It would seem Juno-1 would be the time to make these changes and we
need to serialize this work to keep reviewers from losing their
marbles trying to track it all. I would like to work out a plan for
this in conjunction with interested core-reviewers who would be
willing to more or less sponsor this work. Because of the problems
Matt points out, I don't want to tackle this in a haphazard or
piece-meal way since it could completely disrupt any new blueprint
work people may have targeted for Juno.


Yeah, definitely need a plan here.  I'd like to see things prioritized 
based on what can be fixed in a relatively isolated way which gives a 
good return on coding/reviewing investment, e.g. pulling those nested 
methods out of spawn so they can be unit tested individually with mock 
rather than a large, seemingly rigid and scaffolded test framework.




Having said that, on this driver, new blueprints in the last several
cycles have introduced serious feature regressions. Several minor bug
fixes have altered and introduced key architectural components that
have broken multiple critical features. In my professional opinion
this has a root cause based on the drivers tightly coupled and
non-cohesive internal design.

The driver design is tightly coupled in that a change in one area
forces multiple updates and changes *throughout* the rest of the
driver. This is true in testing as well. The testing design often
requires you to trace the entire codebase if you add a single optional
parameter to a single method. This does not have to be true.


Yup, this is my major complaint and ties into what I'm saying above, I 
find it really difficult to determine most of the time where a change is 
tested.  Because of the nature of the driver code and my lack of 
actually writing features in it, as a reviewer I don't know if a change 
in X is going to break Y, so I rely on solid test coverage and the 
testing needs to be more natural to follow than it currently is.




The driver design is non-cohesive in that important details and
related information is spread throughout the driver. You must be aware
at all times (for example) whether or not your current operation
requires you to check if your vm_ref is outdated (we just worked on
several last minute critical bugs for RC1 where myself and others
pulled all nighters to fix the issue in a bad case of Heroic
Programming).

I would like to stop the http://c2.com/cgi/wiki?CodeVomit please. May we?



I know this isn't going to be easy so I'm really glad you're planning on 
tackling it in Juno.  I'll tentatively sign up to help sponsor this but 
I'm not going to be able to commit all of my review bandwidth to a ton 
of changes for refactor and re-design.  Hopefully targets will become 
more clear as the team gets the plans in place.


--

Thanks,

Matt Riedemann

Re: [openstack-dev] [qa] [neutron] Neutron Full Parallel job - Last 4 days failures

2014-03-28 Thread Matt Riedemann



On 3/27/2014 8:00 AM, Salvatore Orlando wrote:


On 26 March 2014 19:19, James E. Blair jebl...@openstack.org
mailto:jebl...@openstack.org wrote:

Salvatore Orlando sorla...@nicira.com mailto:sorla...@nicira.com
writes:

  On another note, we noticed that the duplicated jobs currently
executed for
  redundancy in neutron actually seem to point all to the same
build id.
  I'm not sure then if we're actually executing each job twice or just
  duplicating lines in the jenkins report.

Thanks for catching that, and I'm sorry that didn't work right.  Zuul is
in fact running the jobs twice, but it is only looking at one of them
when sending reports and (more importantly) decided whether the change
has succeeded or failed.  Fixing this is possible, of course, but turns
out to be a rather complicated change.  Since we don't make heavy use of
this feature, I lean toward simply instantiating multiple instances of
identically configured jobs and invoking them (eg neutron-pg-1,
neutron-pg-2).

Matthew Treinish has already worked up a patch to do that, and I've
written a patch to revert the incomplete feature from Zuul.


That makes sense to me. I think it is just a matter of how the results
are reported to gerrit, since from what I gather in logstash the jobs are
executed twice for each new patchset or recheck.


For the status of the full job, I gave a look at the numbers reported by
Rossella.
All the bugs are already known; some of them are not even bugs; others
have been recently fixed (given the time span of Rossella analysis and
the fact it covers also non-rebased patches it might be possible to have
this kind of false positive).

of all full job failures, 44% should be discarded.
Bug 1291611 (12%) is definitely not a neutron bug... hopefully.
Bug 1281969 (12%) is really too generic.
It bears the hallmark of bug 1283522, and therefore the high number might
be due to the fact that trunk was plagued by this bug up to a few days
before the analysis.
However, it's worth noting that there is also another instance of lock
timeout which has caused 11 failures in full job in the past week.
A new bug has been filed for this issue:
https://bugs.launchpad.net/neutron/+bug/1298355
Bug 1294603 was related to a test now skipped. It is still being debated
whether the problem lies in test design, neutron LBaaS or neutron L3.

The following bugs seem not to be neutron bugs:
1290642, 1291920, 1252971, 1257885

Bug 1292242 appears to have been fixed while the analysis was going on
Bug 1277439 instead is already known to affect neutron jobs occasionally.

The actual state of the job is perhaps better than what the raw numbers
say. I would keep monitoring it, and then make it voting after the
Icehouse release is cut, so that we'll be able to deal with possible
higher failure rate in the quiet period of the release cycle.



-Jim








I reported this bug [1] yesterday.  This was hit in our internal Tempest 
runs on RHEL 6.5 with x86_64 and the nova libvirt driver with the 
neutron openvswitch ML2 driver.  We're running without tenant isolation 
on python 2.6 (no testr yet) so the tests are in serial.  We're running 
basically the full API/CLI/Scenarios tests though, no filtering on the 
smoke tag.


Out of 1,971 tests run, we had 3 failures where a nova instance failed 
to spawn because networking callback events failed, i.e. neutron sends a 
server event request to nova and it's a bad URL so nova API pukes and 
then the networking request in neutron server fails.  As linked in the 
bug report I'm seeing the same neutron server log error showing up in 
logstash for community jobs but it's not 100% failure.  I haven't seen 
the n-api log error show up in logstash though.


Just bringing this to people's attention in case anyone else sees it.

[1] https://bugs.launchpad.net/nova/+bug/1298640

--

Thanks,

Matt Riedemann




[openstack-dev] [nova] Looking for clarification on the diagnostics API

2013-10-10 Thread Matt Riedemann
Tempest recently got some new tests for the nova diagnostics API [1] which 
failed when I was running against the powervm driver since it doesn't 
implement that API.  I started looking at other drivers that did and found 
that libvirt, vmware and xenapi at least had code for the get_diagnostics 
method.  I found that the vmware driver was re-using it's get_info method 
for get_diagnostics which led to bug 1237622 [2] but overall caused some 
confusion about the difference between the compute driver's get_info and 
get_diagnostics mehods.  It looks like get_info is mainly just used to get 
the power_state of the instance.

First, the get_info method has a nice docstring for what it needs returned 
[3] but the get_diagnostics method doesn't [4].  From looking at the API 
docs [5], the diagnostics API basically gives an example of values to get 
back which is completely based on what the libvirt driver returns. Looking 
at the xenapi driver code, it looks like it does things a bit differently 
than the libvirt driver (maybe doesn't return the exact same keys, but it 
returns information based on what Xen provides).

I'm thinking about implementing the diagnostics API for the powervm driver 
but I'd like to try and get some help on defining just what should be 
returned from that call.  There are some IVM commands available to the 
powervm driver for getting hardware resource information about an LPAR so 
I think I could implement this pretty easily.

I think it basically comes down to providing information about the 
processor, memory, storage and network interfaces for the instance but if 
anyone has more background information on that API I'd like to hear it.

[1] 
https://github.com/openstack/tempest/commit/da0708587432e47f85241201968e6402190f0c5d
 

[2] https://bugs.launchpad.net/nova/+bug/1237622 
[3] 
https://github.com/openstack/nova/blob/2013.2.rc1/nova/virt/driver.py#L144 

[4] 
https://github.com/openstack/nova/blob/2013.2.rc1/nova/virt/driver.py#L299 

[5] http://paste.openstack.org/show/48236/ 
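For illustration, get_diagnostics() in this style boils down to returning a flat, hypervisor-specific dict. The sketch below is modeled loosely on the kind of output the libvirt driver produces; the keys and values are examples for discussion, not a defined contract.

```python
# Minimal sketch of a driver get_diagnostics() implementation. In a real
# driver these values would come from hypervisor queries (e.g. IVM commands
# for powervm); they are hard-coded here purely for illustration.
def get_diagnostics(instance_name):
    return {
        'cpu0_time': 17300000000,   # CPU time consumed on vCPU 0, in ns
        'memory': 524288,           # configured memory, in KB
        'vda_read_req': 112,        # disk read requests on the first disk
        'vda_read': 262144,         # bytes read from the first disk
        'vnet0_rx': 2070139,        # bytes received on the first NIC
        'vnet0_tx': 140208,         # bytes transmitted on the first NIC
    }
```

The open question in this thread is exactly that the key names above are driver-chosen rather than standardized, which is what makes the API hard to test generically.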



Thanks,

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States


Re: [openstack-dev] [nova] Looking for clarification on the diagnostics API

2013-10-10 Thread Matt Riedemann
Looks like this has been brought up a couple of times:

https://lists.launchpad.net/openstack/msg09138.html 

https://lists.launchpad.net/openstack/msg08555.html 

But they seem to kind of end up in the same place I already am - it seems 
to be an open-ended API that is hypervisor-specific.



Thanks,

MATT RIEDEMANN






[openstack-dev] [nova][powervm] my notes from the meeting on powervm CI

2013-10-10 Thread Matt Riedemann
Based on the discussion with Russell and Dan Smith in the nova meeting 
today, here are some of my notes from the meeting that can continue the 
discussion.  These are all pretty rough at the moment so please bear with 
me, this is more to just get the ball rolling on ideas.

Notes on powervm CI:

1. What OS to run on?  Fedora 19, RHEL 6.4?
- Either of those is probably fine, we use RHEL 6.4 right now 
internally.
2. Deployment - RDO? SmokeStack? Devstack?
- SmokeStack is preferable since it packages rpms which is what 
we're using internally.
3. Backing database - mysql or DB2 10.5?
- Prefer DB2 since that's what we want to support in Icehouse and 
it's what we use internally, but there are differences in how long it 
takes to create a database with DB2 versus MySQL so when you multiply that 
times 7 databases (keystone, cinder, glance, nova, heat, neutron, 
ceilometer) it's going to add up unless we can figure out a better way to 
do it (single database with multiple schemas?).  Internally we use a 
pre-created image with the DB2 databases already created; we just run the 
migrate scripts against them so we don't have to wait for the create times 
every run - would that fly in community?
4. What is the max amount of time for us to report test results?  Dan 
didn't seem to think 48 hours would fly. :)
5. What are the minimum tests that need to run (excluding APIs that the 
powervm driver doesn't currently support)?
- smoke/gate/negative/whitebox/scenario/cli?  Right now we have 
1152 tempest tests running, those are only within api/scenario/cli and we 
don't run everything.
6. Network service? We're running with openvswitch 1.10 today so we 
probably want to continue with that if possible.
7. Cinder backend? We're running with the storwize driver but we do we do 
about the remote v7000?

Again, just getting some thoughts out there to help us figure out our 
goals for this, especially around 4 and 5.



Thanks,

MATT RIEDEMANN


Re: [openstack-dev] [Hyper-V] Havana status

2013-10-10 Thread Matt Riedemann
Getting integration testing hooked up for the hyper-v driver with tempest 
should go a long way here which is a good reason to have it.  As has been 
mentioned, there is a core team of people that understand the internals of 
the hyper-v driver and the subtleties of when it won't work, and only 
those with a vested interest in using it will really care about it.

My team has the same issue with the powervm driver.  We don't have 
community integration testing hooked up yet.  We run tempest against it 
internally so we know what works and what doesn't, but besides standard 
code review practices that apply throughout everything (strong unit test 
coverage, consistency with other projects, hacking rules, etc), any other 
reviewer has to generally take it on faith that what's in there works as 
it's supposed to.  Sure, there is documentation available on what the 
native commands do and anyone can dig into those to figure it out, but I 
wouldn't expect that low-level of review from anyone that doesn't 
regularly work on the powervm driver.  I think the same is true for 
anything here.  So the equalizer is a rigorously tested and broad set of 
integration tests, which is where we all need to get to with tempest and 
continuous integration.

We've had the same issues as mentioned in the original note about things 
slipping out of releases or taking a long time to get reviewed, and we've 
had to fork code internally because of it which we then have to continue 
to try and get merged upstream - and it's painful, but it is what it is, 
that's the nature of the business.

Personally my experience has been that the more I give the more I get. The 
more I'm involved in what others are doing and the more I review others' 
code, the more I can build a relationship which is mutually beneficial. 
Sometimes I can only say 'hey, you need unit tests for this or this 
doesn't seem right but I'm not sure', but unless you completely automate 
code coverage metrics and build that back into reviews, e.g. does your 
1000 line blueprint have 95% code coverage in the tests, you still need 
human reviewers on everything, regardless of context.  Even then it's not 
going to be enough, there will always be a need for people with a broader 
vision of the project as a whole that can point out where things are going 
in the wrong direction even if it fixes a bug.

The point is I see both sides of the argument, I'm sure many people do. In 
a large complicated project like this it's inevitable.  But I think the 
quality and adoption of OpenStack speaks for itself and I believe a key 
component of that is the review system and that's only as good as the 
people which are going to uphold the standards across the project.  I've 
been on enough development projects that give plenty of lip service to 
code quality and review standards which are always the first thing to go 
when a deadline looms, and those projects are always ultimately failures.



Thanks,

MATT RIEDEMANN




From:   Tim Smith tsm...@gridcentric.com
To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
Date:   10/10/2013 07:48 PM
Subject:Re: [openstack-dev] [Hyper-V] Havana status



On Thu, Oct 10, 2013 at 1:50 PM, Russell Bryant rbry...@redhat.com 
wrote:
 
Please understand that I only want to help here.  Perhaps a good way for
you to get more review attention is get more karma in the dev community
by helping review other patches.  It looks like you don't really review
anything outside of your own stuff, or patches that touch hyper-v.  In
the absence of significant interest in hyper-v from others, the only way
to get more attention is by increasing your karma.

NB: I don't have any vested interest in this discussion except that I want 
to make sure OpenStack stays Open, i.e. inclusive. I believe the concept 
of reviewer karma, while seemingly sensible, is actually subtly counter 
to the goals of openness, innovation, and vendor neutrality, and would 
also lead to overall lower commit quality.

Brian Kernighan famously wrote: Debugging is twice as hard as writing the 
code in the first place. A corollary is that constructing a mental model 
of code is hard; perhaps harder than writing the code in the first place. 
It follows that reviewing code is not an easy task, especially if one has 
not been intimately involved in the original development of the code under 
review. In fact, if a reviewer is not intimately familiar with the code 
under review, and therefore only able to perform the functions of human 
compiler and style-checker (functions which can be and typically are 
performed by automatic tools), the rigor of their review is at best 
less-than-ideal, and at worst purely symbolic.

It is logical, then, that a reviewer should review

Re: [openstack-dev] [nova][powervm] my notes from the meeting on powervm CI

2013-10-10 Thread Matt Riedemann
Dan Smith d...@danplanet.com wrote on 10/10/2013 08:26:14 PM:

 From: Dan Smith d...@danplanet.com
 To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
 Date: 10/10/2013 08:31 PM
 Subject: Re: [openstack-dev] [nova][powervm] my notes from the 
 meeting on powervm CI
 
  4. What is the max amount of time for us to report test results?  Dan
  didn't seem to think 48 hours would fly. :)
 
 Honestly, I think that 12 hours during peak times is the upper limit of
 what could be considered useful. If it's longer than that, many patches
 could go into the tree without a vote, which defeats the point.

Yeah, I was just joking about the 48 hour thing, 12 hours seems excessive
but I guess that has happened when things are super backed up with gate
issues and rechecks.

Right now things take about 4 hours, with Tempest being around 1.5 hours
of that. The rest of the time is setup and install, which includes heat
and ceilometer. So I guess that raises another question, if we're really
setting this up right now because of nova, do we need to have heat and
ceilometer installed and configured in the initial delivery of this if
we're not going to run tempest tests against them (we don't right now)?

I think some aspect of the slow setup time is related to DB2 and how
the migrations perform with some of that, but the overall time is not
considerably different from when we were running this with MySQL so
I'm reluctant to blame it all on DB2.  I think some of our topology
could have something to do with it too since the IVM hypervisor is running
on a separate system and we are gated on how it's performing at any
given time.  I think that will be our biggest challenge for the scale
issues with community CI.

 
  5. What are the minimum tests that need to run (excluding APIs that 
the
  powervm driver doesn't currently support)?
  - smoke/gate/negative/whitebox/scenario/cli?  Right now we 
have
  1152 tempest tests running, those are only within api/scenario/cli and
  we don't run everything.
 
 I think that a full run of tempest should be required. That said, if
 there are things that the driver legitimately doesn't support, it makes
 sense to exclude those from the tempest run, otherwise it's not useful.
 
 I think you should publish the tempest config (or config script, or
 patch, or whatever) that you're using so that we can see what it means
 in terms of the coverage you're providing.

Just to clarify, do you mean publish what we are using now or publish
once it's all working?  I can certainly attach our nose.cfg and
latest x-unit results xml file.

 
  6. Network service? We're running with openvswitch 1.10 today so we
  probably want to continue with that if possible.
 
 Hmm, so that means neutron? AFAIK, not much of tempest runs with
 Nova/Neutron.
 
 I kinda think that since nova-network is our default right now (for
 better or worse) that the run should include that mode, especially if
 using neutron excludes a large portion of the tests.
 
 I think you said you're actually running a bunch of tempest right now,
 which conflicts with my understanding of neutron workiness. Can you 
clarify?

Correct, we're running with neutron using the ovs plugin. We basically 
have
the same issues that the neutron gate jobs have, which is related to 
concurrency
issues and tenant isolation (we're doing the same as devstack with neutron
in that we don't run tempest with tenant isolation).  We are running most
of the nova and most of the neutron API tests though (we don't have all
of the neutron-dependent scenario tests working though, probably more due
to incompetence in setting up neutron than anything else).

 
  7. Cinder backend? We're running with the storwize driver but we do we
  do about the remote v7000?
 
 Is there any reason not to just run with a local LVM setup like we do in
 the real gate? I mean, additional coverage for the v7000 driver is
 great, but if it breaks and causes you to not have any coverage at all,
 that seems, like, bad to me :)

Yeah, I think we'd just run with a local LVM setup, that's what we do for
x86_64 and s390x tempest runs. For whatever reason we thought we'd do
storwize for our ppc64 runs, probably just to have a matrix of coverage.

 
  Again, just getting some thoughts out there to help us figure out our
  goals for this, especially around 4 and 5.
 
 Yeah, thanks for starting this discussion!
 
 --Dan
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [cinder] dd performance for wipe in cinder

2013-10-11 Thread Matt Riedemann
Have you looked at the volume_clear and volume_clear_size options in 
cinder.conf?

https://github.com/openstack/cinder/blob/2013.2.rc1/etc/cinder/cinder.conf.sample#L1073
 


The default is to zero out the volume.  You could try 'none' to see if 
that helps with performance.
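For reference, the relevant section of cinder.conf would look roughly like 
this (option names are from the Havana-era sample file linked above; the 
values are only illustrative):

```ini
[DEFAULT]
# Method used to wipe a volume on delete: 'zero' (the default, dd from
# /dev/zero over the volume), 'shred', or 'none' to skip wiping.
volume_clear = none

# When wiping, only clear the first N MiB of the volume instead of the
# whole thing; 0 means clear the entire volume.
volume_clear_size = 100
```

Note that skipping the wipe trades delete speed for the risk of leaking old 
data to the next tenant that gets those extents.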



Thanks,

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States




From:   cosmos cosmos cosmos0...@gmail.com
To: openstack-dev@lists.openstack.org, 
Date:   10/11/2013 04:26 AM
Subject:[openstack-dev]  dd performance for wipe in cinder



Hello.
My name is Rucia for Samsung SDS.

I am having trouble with cinder volume deletion.
I am developing support for big data storage in LVM.

But it takes too much time to delete a cinder LVM volume because of 
dd.
The cinder volume is 200GB, to support hadoop master data.
When I delete a cinder volume using 'dd if=/dev/zero of=$cinder-volume 
count=100 bs=1M' it takes about 30 minutes.

Is there a better and quicker way to delete it?

Cheers. 
Rucia.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Hyper-V] Havana status

2013-10-11 Thread Matt Riedemann
I'd like to see the powervm driver fall into that first category.  We 
don't nearly have the rapid development that the hyper-v driver does, but 
we do have some out of tree stuff anyway simply because it hasn't landed 
upstream yet (DB2, config drive support for the powervm driver, etc), and 
maintaining that out of tree code is not fun.  So I definitely don't want 
to move out of tree.

Given that, I think at least I'm trying to contribute overall [1][2] by 
doing reviews outside my comfort zone, bug triage, fixing bugs when I can, 
and because we run tempest in house (with neutron-openvswitch) we find 
issues there that I get to push patches for.

Having said all that, it's moot for the powervm driver if we don't get the 
CI hooked up in Icehouse and I completely understand that so it's a top 
priority.


[1] 
http://stackalytics.com/?release=havanametric=commitsproject_type=openstackmodule=company=user_id=mriedem
 

[2] 
https://review.openstack.org/#/q/reviewer:6873+project:openstack/nova,n,z 


Thanks,

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States




From:   Russell Bryant rbry...@redhat.com
To: openstack-dev@lists.openstack.org, 
Date:   10/11/2013 11:33 AM
Subject:Re: [openstack-dev] [Hyper-V] Havana status



On 10/11/2013 12:04 PM, John Griffith wrote:
 
 
 
 On Fri, Oct 11, 2013 at 9:12 AM, Bob Ball bob.b...@citrix.com
 mailto:bob.b...@citrix.com wrote:
 
  -Original Message-
  From: Russell Bryant [mailto:rbry...@redhat.com
 mailto:rbry...@redhat.com]
  Sent: 11 October 2013 15:18
  To: openstack-dev@lists.openstack.org
 mailto:openstack-dev@lists.openstack.org
  Subject: Re: [openstack-dev] [Hyper-V] Havana status
 
   As a practical example for Nova: in our case that would simply
 include the
  following subtrees: nova/virt/hyperv and 
nova/tests/virt/hyperv.
 
  If maintainers of a particular driver would prefer this sort of
  autonomy, I'd rather look at creating new repositories.  I'm
 completely
  open to going that route on a per-driver basis.  Thoughts?
 
 I think that all drivers that are officially supported must be
 treated in the same way.
 
 If we are going to split out drivers into a separate but still
 official repository then we should do so for all drivers.  This
 would allow Nova core developers to focus on the architectural side
 rather than how each individual driver implements the API that is
 presented.
 
 Of course, with the current system it is much easier for a Nova core
 to identify and request a refactor or generalisation of code written
 in one or multiple drivers so they work for all of the drivers -
 we've had a few of those with XenAPI where code we have written has
 been pushed up into Nova core rather than the XenAPI tree.
 
 Perhaps one approach would be to re-use the incubation approach we
 have; if drivers want to have the fast-development cycles uncoupled
 from core reviewers then they can be moved into an incubation
 project.  When there is a suitable level of integration (and
 automated testing to maintain it of course) then they can graduate.
  I imagine at that point there will be more development of new
 features which affect Nova in general (to expose each hypervisor's
 strengths), so there would be fewer cases of them being restricted
 just to the virt/* tree.
 
 Bob
 
 ___
 OpenStack-dev mailing list
 OpenStack-dev@lists.openstack.org
 mailto:OpenStack-dev@lists.openstack.org
 http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
 
 
 I've thought about this in the past, but always come back to a couple of
 things.
 
 Being a community driven project, if a vendor doesn't want to
 participate in the project then why even pretend (ie having their own
 project/repo, reviewers etc).  Just post your code up in your own github
 and let people that want to use it pull it down.  If it's a vendor
 project, then that's fine; have it be a vendor project.
 
 In my opinion pulling out and leaving things up to the vendors as is
 being described has significant negative impacts.  Not the least of
 which is consistency in behaviors.  On the Cinder side, the core team
 spends the bulk of their review time looking at things like consistent
 behaviors, missing features or paradigms that are introduced that
 break other drivers.  For example looking at things like, are all the
 base features implemented, do they work the same way, are we all using
 the same vocabulary, will it work in an multi-backend environment.  In
 addition, it's rare that a vendor implements a new feature in their
 driver that doesn't impact/touch the core code somewhere.
 
 Having

Re: [openstack-dev] [nova] Looking for clarification on the diagnostics API

2013-10-12 Thread Matt Riedemann
There is also a tempest patch now to ease some of the libvirt-specific 
keys checked in the new diagnostics tests there:

https://review.openstack.org/#/c/51412/ 

To relay some of my concerns that I put in that patch:

I'm not sure how I feel about this. It should probably be more generic but 
I think we need more than just a change in tempest to enforce it, i.e. we 
should have a nova patch that changes the doc strings for the abstract 
compute driver method to specify what the minimum keys are for the info 
returned, maybe a doc api sample change, etc?

For reference, here is the mailing list post I started on this last week:

http://lists.openstack.org/pipermail/openstack-dev/2013-October/016385.html

There are also docs here (these examples use xen and libvirt):

http://docs.openstack.org/grizzly/openstack-compute/admin/content/configuring-openstack-compute-basics.html

And under procedure 4.4 here:

http://docs.openstack.org/admin-guide-cloud/content/ch_introduction-to-openstack-compute.html#section_manage-the-cloud


=

I also found this wiki page related to metering and the nova diagnostics 
API:

https://wiki.openstack.org/wiki/EfficientMetering/FutureNovaInteractionModel 


So it seems like if at some point this will be used with ceilometer it 
should be standardized a bit which is what the Tempest part starts but I 
don't want it to get lost there.


Thanks,

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States




From:   Gary Kotton gkot...@vmware.com
To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
Date:   10/12/2013 01:42 PM
Subject:Re: [openstack-dev] [nova] Looking for clarification on 
the diagnostics API



Yup, it seems to be hypervisor specific. I have added the VMware 
support following your correction in the VMware driver.
Thanks
Gary 

From: Matt Riedemann mrie...@us.ibm.com
Reply-To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org
Date: Thursday, October 10, 2013 10:17 PM
To: OpenStack Development Mailing List openstack-dev@lists.openstack.org
Subject: Re: [openstack-dev] [nova] Looking for clarification on the 
diagnostics API

Looks like this has been brought up a couple of times:

https://lists.launchpad.net/openstack/msg09138.html

https://lists.launchpad.net/openstack/msg08555.html

But they seem to kind of end up in the same place I already am - it seems 
to be an open-ended API that is hypervisor-specific.



Thanks, 

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States





From:Matt Riedemann/Rochester/IBM
To:OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
Date:10/10/2013 02:12 PM
Subject:[nova] Looking for clarification on the diagnostics API


Tempest recently got some new tests for the nova diagnostics API [1] which 
failed when I was running against the powervm driver since it doesn't 
implement that API.  I started looking at other drivers that did and found 
that libvirt, vmware and xenapi at least had code for the get_diagnostics 
method.  I found that the vmware driver was re-using it's get_info method 
for get_diagnostics which led to bug 1237622 [2] but overall caused some 
confusion about the difference between the compute driver's get_info and 
get_diagnostics mehods.  It looks like get_info is mainly just used to get 
the power_state of the instance.

First, the get_info method has a nice docstring for what it needs returned 
[3] but the get_diagnostics method doesn't [4].  From looking at the API 
docs [5], the diagnostics API basically gives an example of values to get 
back which is completely based on what the libvirt driver returns. Looking 
at the xenapi driver code, it looks like it does things a bit differently 
than the libvirt driver (maybe doesn't return the exact same keys, but it 
returns information based on what Xen provides). 

I'm thinking about implementing the diagnostics API for the powervm driver 
but I'd like to try and get some help on defining just what should be 
returned from that call.  There are some IVM commands available to the 
powervm driver for getting hardware resource information about an LPAR so 
I think I could implement this pretty easily.

I think it basically comes down to providing information about the 
processor, memory, storage and network interfaces for the instance but if 
anyone has more background information on that API I'd like to hear it.
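To make the hypervisor-specific shape concrete, here is a rough sketch of 
the flat key/value payload the libvirt driver returns (key names are 
modeled on the admin guide examples above; the values are invented):

```python
# Sketch of a libvirt-style get_diagnostics() result. It is a flat dict
# whose key names encode the device (cpu0, vda, vnet0), so every driver
# ends up inventing its own vocabulary for the same four categories:
# CPU time, memory, disk I/O and network counters.
libvirt_style_diagnostics = {
    'cpu0_time': 17300000000,      # ns of CPU time (invented value)
    'memory': 524288,              # KiB
    'vda_errors': -1,
    'vda_read': 262144,
    'vda_read_req': 112,
    'vda_write': 5778432,
    'vda_write_req': 488,
    'vnet0_rx': 2070139,
    'vnet0_rx_drop': 0,
    'vnet0_rx_errors': 0,
    'vnet0_rx_packets': 26701,
    'vnet0_tx': 140208,
    'vnet0_tx_drop': 0,
    'vnet0_tx_errors': 0,
    'vnet0_tx_packets': 662,
}

# Grouping the keys by prefix shows the implicit categories a generic
# (e.g. powervm) implementation would need to cover.
prefixes = sorted({k.split('_')[0] for k in libvirt_style_diagnostics})
```

A powervm implementation could report the same four categories from IVM, 
but without an agreed minimum key set the result would again only be 
meaningful to callers that know which driver they are talking to.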

[1] 
https://github.com/openstack/tempest/commit/da0708587432e47f85241201968e6402190f0c5d

[2] https://bugs.launchpad.net/nova/+bug/1237622
[3] 
https://github.com/openstack/nova/blob/2013.2.rc1/nova/virt/driver.py#L144
[4

Re: [openstack-dev] [nova][powervm] my notes from the meeting on powervm CI

2013-10-18 Thread Matt Riedemann
I just opened this bug, it's going to be one of the blockers for us to get 
PowerVM CI going in Icehouse:

https://bugs.launchpad.net/nova/+bug/1241619 



Thanks,

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States




From:   Matt Riedemann/Rochester/IBM@IBMUS
To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
Date:   10/11/2013 10:59 AM
Subject:Re: [openstack-dev] [nova][powervm] my notes from the 
meeting on  powervm CI







Matthew Treinish mtrein...@kortar.org wrote on 10/10/2013 10:31:29 PM:

 From: Matthew Treinish mtrein...@kortar.org 
 To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
 Date: 10/10/2013 11:07 PM 
 Subject: Re: [openstack-dev] [nova][powervm] my notes from the 
 meeting on powervm CI 
 
 On Thu, Oct 10, 2013 at 07:39:37PM -0700, Joe Gordon wrote:
  On Thu, Oct 10, 2013 at 7:28 PM, Matt Riedemann mrie...@us.ibm.com 
wrote:
   
 4. What is the max amount of time for us to report test results? 
 Dan
 didn't seem to think 48 hours would fly. :)
   
Honestly, I think that 12 hours during peak times is the upper 
limit of
what could be considered useful. If it's longer than that, many 
patches
could go into the tree without a vote, which defeats the point.
  
   Yeah, I was just joking about the 48 hour thing, 12 hours seems 
excessive
   but I guess that has happened when things are super backed up with 
gate
   issues and rechecks.
  
   Right now things take about 4 hours, with Tempest being around 1.5 
hours
   of that. The rest of the time is setup and install, which includes 
heat
   and ceilometer. So I guess that raises another question, if we're 
really
   setting this up right now because of nova, do we need to have heat 
and
   ceilometer installed and configured in the initial delivery of this 
if
   we're not going to run tempest tests against them (we don't right 
now)?
  
  
  
  In general the faster the better, and if things get to slow enough 
that we
  have to wait for powervm CI to report back, I
  think its reasonable to go ahead and approve things without hearing 
back.
   In reality if you can report back in under 12 hours this will rarely
  happen (I think).
  
  
  
   I think some aspect of the slow setup time is related to DB2 and how
   the migrations perform with some of that, but the overall time is 
not
   considerably different from when we were running this with MySQL so
   I'm reluctant to blame it all on DB2.  I think some of our topology
   could have something to do with it too since the IVM hypervisor is 
running
   on a separate system and we are gated on how it's performing at any
   given time.  I think that will be our biggest challenge for the 
scale
   issues with community CI.
  
   
 5. What are the minimum tests that need to run (excluding 
 APIs that the
 powervm driver doesn't currently support)?
 - smoke/gate/negative/whitebox/scenario/cli?  Right 
 now we have
 1152 tempest tests running, those are only within 
api/scenario/cli and
 we don't run everything.
 
 Well that's almost a full run right now, the full tempest jobs have 1290 
tests
 of which we skip 65 because of bugs or configuration. (don't run neutron 
api
 tests without neutron) That number is actually pretty high since you are
 running with neutron. Right now the neutron gating jobs only have 221 
jobs and
 skip 8 of those. Can you share the list of things you've got working 
with
 neutron so we can up the number of gating tests? 

Here is the nose.cfg we run with: 



Some of the tests are excluded because of performance issues that still 
need to 
be worked out (like test_list_image_filters - it works but it takes over 
20 
minutes sometimes). 

Some of the tests are excluded because of limitations with DB2, e.g. 
test_list_servers_filtered_by_name_wildcard 

Some of them are probably old excludes on bugs that are now fixed. We have 
to 
go back through what's excluded every once in awhile to figure out what's 
still broken and clean things up. 

Here is the tempest.cfg we use on ppc64: 



And here are the xunit results from our latest run: 



Note that we have known issues with some cinder and neutron failures 
in there. 

 
   
I think that a full run of tempest should be required. That 
said, if
there are things that the driver legitimately doesn't support, it 
makes
sense to exclude those from the tempest run, otherwise it's not 
useful.
  
  
  ++
  
  
  

I think you should publish the tempest config (or config script, 
or
patch, or whatever) that you're using so that we can see what it 
means
in terms of the coverage you're providing.
  
   Just to clarify, do you mean publish what we are using now or 
publish
   once it's all working?  I can certainly attach

Re: [openstack-dev] [nova][powervm] my notes from the meeting on powervm CI

2013-10-18 Thread Matt Riedemann
And this guy: https://bugs.launchpad.net/nova/+bug/1241628 



Thanks,

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States




From:   Matt Riedemann/Rochester/IBM@IBMUS
To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
Date:   10/18/2013 09:25 AM
Subject:Re: [openstack-dev] [nova][powervm] my notes from the 
meeting on  powervm CI



I just opened this bug, it's going to be one of the blockers for us to get 
PowerVM CI going in Icehouse: 

https://bugs.launchpad.net/nova/+bug/1241619 



Thanks, 

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development 

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com 


3605 Hwy 52 N
Rochester, MN 55901-1407
United States





From:Matt Riedemann/Rochester/IBM@IBMUS 
To:OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
Date:10/11/2013 10:59 AM 
Subject:Re: [openstack-dev] [nova][powervm] my notes from the 
meeting onpowervm CI 







Matthew Treinish mtrein...@kortar.org wrote on 10/10/2013 10:31:29 PM:

 From: Matthew Treinish mtrein...@kortar.org 
 To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
 Date: 10/10/2013 11:07 PM 
 Subject: Re: [openstack-dev] [nova][powervm] my notes from the 
 meeting on powervm CI 
 
 On Thu, Oct 10, 2013 at 07:39:37PM -0700, Joe Gordon wrote:
  On Thu, Oct 10, 2013 at 7:28 PM, Matt Riedemann mrie...@us.ibm.com 
wrote:
   
 4. What is the max amount of time for us to report test results? 
 Dan
 didn't seem to think 48 hours would fly. :)
   
Honestly, I think that 12 hours during peak times is the upper 
limit of
what could be considered useful. If it's longer than that, many 
patches
could go into the tree without a vote, which defeats the point.
  
   Yeah, I was just joking about the 48 hour thing, 12 hours seems 
excessive
   but I guess that has happened when things are super backed up with 
gate
   issues and rechecks.
  
   Right now things take about 4 hours, with Tempest being around 1.5 
hours
   of that. The rest of the time is setup and install, which includes 
heat
   and ceilometer. So I guess that raises another question, if we're 
really
   setting this up right now because of nova, do we need to have heat 
and
   ceilometer installed and configured in the initial delivery of this 
if
   we're not going to run tempest tests against them (we don't right 
now)?
  
  
  
  In general the faster the better, and if things get to slow enough 
that we
  have to wait for powervm CI to report back, I
  think its reasonable to go ahead and approve things without hearing 
back.
   In reality if you can report back in under 12 hours this will rarely
  happen (I think).
  
  
  
   I think some aspect of the slow setup time is related to DB2 and how
   the migrations perform with some of that, but the overall time is 
not
   considerably different from when we were running this with MySQL so
   I'm reluctant to blame it all on DB2.  I think some of our topology
   could have something to do with it too since the IVM hypervisor is 
running
   on a separate system and we are gated on how it's performing at any
   given time.  I think that will be our biggest challenge for the 
scale
   issues with community CI.
  
   
 5. What are the minimum tests that need to run (excluding 
 APIs that the
 powervm driver doesn't currently support)?
 - smoke/gate/negative/whitebox/scenario/cli?  Right 
 now we have
 1152 tempest tests running, those are only within 
api/scenario/cli and
 we don't run everything.
 
 Well that's almost a full run right now, the full tempest jobs have 1290 
tests
 of which we skip 65 because of bugs or configuration. (don't run neutron 
api
 tests without neutron) That number is actually pretty high since you are
 running with neutron. Right now the neutron gating jobs only have 221 
jobs and
 skip 8 of those. Can you share the list of things you've got working 
with
 neutron so we can up the number of gating tests? 

Here is the nose.cfg we run with: 



Some of the tests are excluded because of performance issues that still 
need to 
be worked out (like test_list_image_filters - it works but it takes over 
20 
minutes sometimes). 

Some of the tests are excluded because of limitations with DB2, e.g. 
test_list_servers_filtered_by_name_wildcard 

Some of them are probably old excludes on bugs that are now fixed. We have 
to 
go back through what's excluded every once in awhile to figure out what's 
still broken and clean things up. 

Here is the tempest.cfg we use on ppc64: 



And here are the xunit results from our latest run: 



Note that we have known issues with some cinder and neutron failures 
in there. 

 
   
I think

Re: [openstack-dev] [Neutron] IPv6 DHCP options for dnsmasq

2013-10-22 Thread Matt Riedemann
FWIW, we've wanted IPv6 support too but there are limitations in 
sqlalchemy and python 2.6 and since openstack is still supporting both of 
those, we are gated on that.



Thanks,

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States




From:   Sean M. Collins s...@coreitpro.com
To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
Date:   10/22/2013 10:33 AM
Subject:Re: [openstack-dev] [Neutron] IPv6  DHCP options for 
dnsmasq



On Tue, Oct 22, 2013 at 08:58:52AM +0200, Luke Gorrie wrote:
 Deutsche Telekom too. We are working on making Neutron interoperate well
 with a service provider network that's based on IPv6. I look forward to
 talking about this with people in Hong Kong :)

I may be mistaken, but I don't see a summit proposal for Neutron, on the
subject of IPv6. Are there plans to have one?

-- 
Sean M. Collins
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Openstack on power pc/Freescale linux

2013-10-22 Thread Matt Riedemann
We run openstack on ppc64 with RHEL 6.4 using the powervm nova virt 
driver.  What do you want to know?



Thanks,

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States




From:   Qing He qing...@radisys.com
To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
Date:   10/22/2013 05:49 PM
Subject:[openstack-dev]  [nova] Openstack on power pc/Freescale 
linux



All,
I'm wondering if anyone has tried OpenStack on Power PC/Freescale Linux?

Thanks,
Qing

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Openstack on power pc/Freescale linux

2013-10-22 Thread Matt Riedemann
Yeah, my team does.  We're using openvswitch 1.10, qpid 0.22, DB2 10.5 
(but MySQL also works).  Do you have specific issues/questions?

We're working on getting continuous integration testing working for the 
nova powervm driver in the icehouse release, so you can see some more 
details about what we're doing with openstack on power in this thread:

http://lists.openstack.org/pipermail/openstack-dev/2013-October/016395.html 




Thanks,

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development

Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com


3605 Hwy 52 N
Rochester, MN 55901-1407
United States




From:   Qing He qing...@radisys.com
To: OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
Date:   10/22/2013 07:43 PM
Subject:Re: [openstack-dev] [nova] Openstack on power pc/Freescale 
linux



Thanks Matt.
I'd like to know if anyone has tried to run the controller, API server and 
MySQL database, msg queue, etc. (the brain of OpenStack) on ppc.
Qing
 
From: Matt Riedemann [mailto:mrie...@us.ibm.com] 
Sent: Tuesday, October 22, 2013 4:17 PM
To: OpenStack Development Mailing List
Subject: Re: [openstack-dev] [nova] Openstack on power pc/Freescale linux
 
We run openstack on ppc64 with RHEL 6.4 using the powervm nova virt 
driver.  What do you want to know?



Thanks, 

MATT RIEDEMANN
Advisory Software Engineer
Cloud Solutions and OpenStack Development 


Phone: 1-507-253-7622 | Mobile: 1-507-990-1889
E-mail: mrie...@us.ibm.com 


3605 Hwy 52 N
Rochester, MN 55901-1407
United States





From:Qing He qing...@radisys.com 
To:OpenStack Development Mailing List 
openstack-dev@lists.openstack.org, 
Date:10/22/2013 05:49 PM 
Subject:[openstack-dev]  [nova] Openstack on power pc/Freescale 
linux 




All,
I'm wondering if anyone has tried OpenStack on Power PC/Freescale Linux?

Thanks,
Qing

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [sqlalchemy-migrate] Blueprint for review: Add DB2 10.5 Support

2013-10-31 Thread Matt Riedemann
I've got a sqlalchemy-migrate blueprint up for review to add DB2 support 
in migrate.

https://blueprints.launchpad.net/sqlalchemy-migrate/+spec/add-db2-support 

This is a pre-req for getting DB2 support into Nova so I'm targeting 
icehouse-1.  We've been running with the migrate patches internally since 
Folsom, but getting them into migrate was difficult before OpenStack took 
over maintenance of the project.

Please let me know if there are any questions/issues or something I need 
to address here.

Thanks,

Matt Riedemann
Cloud Solutions and OpenStack Development
Email: mrie...@us.ibm.com
Office Phone: 507-253-7622
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Do we have some guidelines for mock, stub, mox when writing unit test?

2013-11-10 Thread Matt Riedemann
I don't see anything explicit in the wiki and hacking guides, they 
mainly just say to have unit tests for everything and tell you how to 
run/debug them.


Generally mock is supposed to be used over mox now for python 3 support.

There is also a blueprint to remove the usage of mox in neutron:

https://blueprints.launchpad.net/neutron/+spec/remove-mox

For all new patches, we should be using mock over mox because of the 
python 3 support of mock (and lack thereof for mox).


As for when to use mock vs stubs, I think you'll get different opinions 
from different people. Stubs are quick and easy and that's what I used 
early on when I started contributing to the project, but since then I have 
preferred mox/mock since they validate that methods are actually called 
with specific parameters, which can get lost when simply stubbing a 
method call out. In other words, if I'm stubbing a method and doing 
assertions within it (which you'll usually see) and that method is never 
called (maybe the code changed since the test was written), the 
assertions are lost and the test is essentially broken.
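As a minimal illustration of that failure mode (the class and method names 
below are invented; this uses the stdlib unittest.mock for brevity, though 
at the time the same API came from the external mock package):

```python
import unittest
from unittest import mock


class Driver(object):
    """Toy stand-in for a virt driver; names are invented."""

    def attach(self, instance, volume_id):
        self._do_attach(instance, volume_id)

    def _do_attach(self, instance, volume_id):
        raise NotImplementedError


class TestAttach(unittest.TestCase):
    def test_stub_style(self):
        driver = Driver()

        def fake_do_attach(inst, vol):
            # Assertions live inside the stub: if attach() ever stops
            # calling _do_attach, this body never runs and the test
            # still passes without checking anything.
            assert (inst, vol) == ('inst1', 'vol1')

        driver._do_attach = fake_do_attach
        driver.attach('inst1', 'vol1')

    def test_mock_style(self):
        driver = Driver()
        with mock.patch.object(driver, '_do_attach') as fake:
            driver.attach('inst1', 'vol1')
            # Fails loudly if the call never happens, or happens with
            # different arguments.
            fake.assert_called_once_with('inst1', 'vol1')
```

The stub-style test only verifies anything as long as the production code 
keeps calling the stubbed method; the mock-style test turns that implicit 
assumption into an explicit, enforced assertion.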


So I think in general it's best to use mock now unless you have a good 
reason not to.


On 11/10/2013 7:40 AM, Jay Lau wrote:

Hi,

I noticed that we are now using mock, mox and stub for unit test, just
curious do we have any guidelines for this, in which condition shall we
use mock, mox or stub?

Thanks,

Jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] sqlalchemy-migrate needs a new release

2013-11-12 Thread Matt Riedemann
I don't know what's all involved in putting out a release for 
sqlalchemy-migrate but if there is a way that I can help, please let me 
know.  I'll try to catch dripton in IRC today.


As for CI with DB2, it's in the blueprint as a work item, I just don't 
know enough about the infra side of things to get that going, so I'd 
need some help there.


DB2 Express-C is the free version which is the plan to run the unit 
tests in CI, but the only problem I see with that is it's a trial 
license and I wouldn't want to have to redo images or licenses every 3 
months or however long it lasts. I would think that IBM would be able to 
provide a permanent license for CI though, otherwise our alternative is 
running the tests in-house and reporting the results back (something 
like what the nova virt drivers have to do and vmware is already doing).


Thanks,

Matt Riedemann

On 11/12/2013 1:50 AM, Roman Podoliaka wrote:

Hey David,

Thank you for undertaking this task!

I agree, that merging of DB2 support can be postponed for now, even if
it looks totally harmless (though I see no way to test it, as we don't
have DB2 instances running on Infra test nodes).

Thanks,
Roman

On Mon, Nov 11, 2013 at 10:54 PM, Davanum Srinivas dava...@gmail.com wrote:

@dripton, @Roman Many thanks :)

On Mon, Nov 11, 2013 at 3:35 PM, David Ripton drip...@redhat.com wrote:

On 11/11/2013 11:37 AM, Roman Podoliaka wrote:


As you may know, in our global requirements list [1] we are currently
depending on SQLAlchemy 0.7.x versions (which is 'old stable' branch
and will be deprecated soon). This is mostly due to the fact, that the
latest release of sqlalchemy-migrate from PyPi doesn't support
SQLAlchemy 0.8.x+.

At the same time, distros have been providing patches for fixing this
incompatibility for a long time now. Moreover, those patches have been
merged to sqlalchemy-migrate master too.

As we are now maintaining sqlalchemy-migrate, we could make a new
release of it. This would allow us to bump the version of SQLAlchemy
release we are depending on (as soon as we fix all the bugs we have)
and let distros maintainers stop carrying their own patches.

This has been discussed at the design summit [2], so we basically just
need a volunteer from the Gerrit ACL group [3] to make a new release.

Is sqlalchemy-migrate stable enough to make a new release? I think yes.
The commits we've merged since we adopted this library only fix
a few issues with SQLAlchemy 0.8.x compatibility and enable running of
tests (we are currently testing all new changes on py26/py27,
SQLAlchemy 0.7.x/0.8.x, SQLite/MySQL/PostgreSQL).

Who wants to help? :)

Thanks,
Roman

[1]
https://github.com/openstack/requirements/blob/master/global-requirements.txt
[2] https://etherpad.openstack.org/p/icehouse-oslo-db-migrations
[3] https://review.openstack.org/#/admin/groups/186,members



I'll volunteer to do this release.  I'll wait 24 hours from the timestamp of
this email for input first.  So, if anyone has opinions about the timing of
this release, please speak up.

(In particular, I'd like to do a release *before* Matt Riedemann's DB2
support patch https://review.openstack.org/#/c/55572/ lands, just in case it
breaks anything.  Of course we could do another release shortly after it
gets in, to make folks who use DB2 happy.)

--
David Ripton   Red Hat   drip...@redhat.com

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




--
Davanum Srinivas :: http://davanum.wordpress.com







Re: [openstack-dev] [nova] Do we have some guidelines for mock, stub, mox when writing unit test?

2013-11-13 Thread Matt Riedemann



On 11/12/2013 5:04 PM, Chuck Short wrote:




On Tue, Nov 12, 2013 at 4:49 PM, Mark McLoughlin mar...@redhat.com wrote:

On Tue, 2013-11-12 at 16:42 -0500, Chuck Short wrote:
> Hi
>
> On Tue, Nov 12, 2013 at 4:24 PM, Mark McLoughlin mar...@redhat.com wrote:
> > On Tue, 2013-11-12 at 13:11 -0800, Shawn Hartsock wrote:
> > > Maybe we should have some 60% rule... that is: If you change more
> > > than half of a test... you should *probably* rewrite the test in Mock.
> >
> > A rule needs a reasoning attached to it :)
> >
> > Why do we want people to use mock?
> >
> > Is it really for Python3? If so, I assume that means we've ruled out
> > the python3 port of mox? (Ok by me, but would be good to hear why)
> > And, if that's the case, then we should encourage whoever wants to
> > port mox based tests to mock.
>
> The upstream maintainer is not going to port mox to python3 so we have
> a fork of mox called mox3. Ideally, we would drop the usage of mox in
> favour of mock so we don't have to carry a forked mox.

Isn't that the opposite conclusion you came to here:

http://lists.openstack.org/pipermail/openstack-dev/2013-July/012474.html

i.e. using mox3 results in less code churn?

Mark.



Yes, that was my original position, but I thought we agreed in the thread
(further on) that we would use mox3 and then migrate to mock later on.

Regards
chuck





So it sounds like we're good with using mox for new tests again? Given 
that Chuck got mox3 into global-requirements here:


https://github.com/openstack/requirements/commit/998dda263d7c7881070e3f16e4523ddcd23fc36d

can we stave off the need to transition everything from mox to mock?

I can't seem to find the nova blueprint to convert everything from mox 
to mock, maybe it was obsoleted already.


Anyway, if mox(3) is OK and we don't need to use mock, it seems like we 
could add something to the developer guide here because I think this 
question comes up frequently:


http://docs.openstack.org/developer/nova/devref/unit_tests.html

Does anyone disagree?

BTW, I care about this because I've been keeping in mind the mox/mock 
transition when doing code reviews and giving a -1 when new tests are 
using mox (since I thought that was a no-no now).
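If the developer guide does grow a section on this, a short example could help reviewers see what the mock style looks like next to mox; below is a minimal sketch (plain illustrative classes, not nova code; `unittest.mock` is the modern stdlib home of what was at the time the external `mock` package):

```python
import unittest
from unittest import mock  # historically the external "mock" package


class Driver(object):
    """Stand-in for a class under test (hypothetical, not nova code)."""

    def spawn(self, instance):
        raise NotImplementedError()


class SpawnTestCase(unittest.TestCase):
    def test_spawn(self):
        driver = Driver()
        # mock style: patch, exercise the code, then assert on the
        # recorded calls afterwards; mox would instead record expectations
        # up front, replay them, and call VerifyAll() at the end.
        with mock.patch.object(driver, 'spawn', return_value='ok') as spawn:
            self.assertEqual('ok', driver.spawn('inst-1'))
        spawn.assert_called_once_with('inst-1')
```

The "assert after the fact" flow is the main stylistic difference from mox's record/replay/verify cycle, which is what makes half-converted tests awkward to review.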

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [sqlalchemy-migrate] Blueprint for review: Add DB2 10.5 Support

2013-11-14 Thread Matt Riedemann

Joe,

Hey, I missed this question.  I moved email accounts for the 
openstack-dev mailing list and missed this in my old pile.


So I touched on this a bit in response here [1] and also a bit when 
talking about the plans for CI for the nova PowerVM virt driver here 
[2].  The blueprint for adding DB2 support to sqlalchemy-migrate and the 
DB2 enablement wiki [3] does call out CI.  Getting the 
sqlalchemy-migrate unit tests to run against DB2 isn't that hard, I just 
haven't figured out if it's something I can do with community 
infrastructure or running as an external third party test, and I think 
whether we use Express-C or not would matter there since that has a 
trial license.


I'm open to suggestions/comments/ideas.

[1] 
http://lists.openstack.org/pipermail/openstack-dev/2013-November/018714.html 

[2] 
http://lists.openstack.org/pipermail/openstack-dev/2013-October/016395.html

[3] https://wiki.openstack.org/wiki/DB2Enablement

--

Thanks,

Matt Riedemann


From: Joe Gordon joe.gord...@gmail.com
To: OpenStack Development Mailing List (not for usage questions)
openstack-dev@lists.openstack.org,
Date: 11/07/2013 09:41 PM
Subject: Re: [openstack-dev] [sqlalchemy-migrate] Blueprint for review:
Add DB2 10.5 Support




With OpenStack's test- and gating-oriented mindset, how can we gate on
this functionality going forward?


On Fri, Nov 1, 2013 at 3:30 AM, Matt Riedemann mrie...@us.ibm.com wrote:
I've got a sqlalchemy-migrate blueprint up for review to add DB2 support
in migrate.
https://blueprints.launchpad.net/sqlalchemy-migrate/+spec/add-db2-support

This is a pre-req for getting DB2 support into Nova so I'm targeting
icehouse-1.  We've been running with the migrate patches internally
since Folsom, but getting them into migrate was difficult before
OpenStack took over maintenance of the project.

Please let me know if there are any questions/issues or something I need
to address here.

Thanks,

Matt Riedemann
Cloud Solutions and OpenStack Development
Email: mrie...@us.ibm.com
Office Phone: 507-253-7622





Re: [openstack-dev] [sqlalchemy-migrate] Blueprint for review: Add DB2 10.5 Support

2013-11-15 Thread Matt Riedemann



On 11/14/2013 10:38 PM, Matt Riedemann wrote:

Joe,

Hey, I missed this question.  I moved email accounts for the
openstack-dev mailing list and missed this in my old pile.

So I touched on this a bit in response here [1] and also a bit when
talking about the plans for CI for the nova PowerVM virt driver here
[2].  The blueprint for adding DB2 support to sqlalchemy-migrate and the
DB2 enablement wiki [3] does call out CI.  Getting the
sqlalchemy-migrate unit tests to run against DB2 isn't that hard, I just
haven't figured out if it's something I can do with community
infrastructure or running as an external third party test, and I think
whether we use Express-C or not would matter there since that has a
trial license.

I'm open to suggestions/comments/ideas.

[1]
http://lists.openstack.org/pipermail/openstack-dev/2013-November/018714.html

[2]
http://lists.openstack.org/pipermail/openstack-dev/2013-October/016395.html
[3] https://wiki.openstack.org/wiki/DB2Enablement



Thanks to Brant Knudson for pointing out that DB2 Express-C doesn't 
have a time restriction:


http://www.ibm.com/developerworks/downloads/im/db2express/

It is a fully licensed product available for free download. It does not 
have any time restrictions.


I must have confused that with the Enterprise Server Edition that we were 
using in house for some bigger deployments for CI with Tempest.


So it sounds like Express-C is what we could use to get 
sqlalchemy-migrate unit tests running against DB2 using the community 
infrastructure (I hope), I just need some help with getting that going. 
I know Roman got the migrate UT running for MySQL and PostgreSQL here:


https://review.openstack.org/#/c/40436/

I'll try working with Roman, Monty and any infra guys that will talk to 
me to get this going.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [sqlalchemy-migrate] Blueprint for review: Add DB2 10.5 Support

2013-11-15 Thread Matt Riedemann



On 11/15/2013 10:15 AM, Matt Riedemann wrote:



On 11/14/2013 10:38 PM, Matt Riedemann wrote:

Joe,

Hey, I missed this question.  I moved email accounts for the
openstack-dev mailing list and missed this in my old pile.

So I touched on this a bit in response here [1] and also a bit when
talking about the plans for CI for the nova PowerVM virt driver here
[2].  The blueprint for adding DB2 support to sqlalchemy-migrate and the
DB2 enablement wiki [3] does call out CI.  Getting the
sqlalchemy-migrate unit tests to run against DB2 isn't that hard, I just
haven't figured out if it's something I can do with community
infrastructure or running as an external third party test, and I think
whether we use Express-C or not would matter there since that has a
trial license.

I'm open to suggestions/comments/ideas.

[1]
http://lists.openstack.org/pipermail/openstack-dev/2013-November/018714.html


[2]
http://lists.openstack.org/pipermail/openstack-dev/2013-October/016395.html

[3] https://wiki.openstack.org/wiki/DB2Enablement



Thanks to Brant Knudson for pointing out that DB2 Express-C doesn't
have a time restriction:

http://www.ibm.com/developerworks/downloads/im/db2express/

It is a fully licensed product available for free download. It does not
have any time restrictions.

I must have confused that with the Enterprise Server Edition that we were
using in house for some bigger deployments for CI with Tempest.

So it sounds like Express-C is what we could use to get
sqlalchemy-migrate unit tests running against DB2 using the community
infrastructure (I hope), I just need some help with getting that going.
I know Roman got the migrate UT running for MySQL and PostgreSQL here:

https://review.openstack.org/#/c/40436/

I'll try working with Roman, Monty and any infra guys that will talk to
me to get this going.



Just to circle back on this before anyone throws in their two cents and 
tells me that 3rd party CI is the way to go, I caught Monty in IRC and 
came to that conclusion already.


While DB2 Express-C is free and doesn't expire, it's closed source, so 
it's a question of whether the infra team could maintain it, and the 
policy is that no closed-source code runs in the community infrastructure.


So I'll plan on getting the sqlalchemy-migrate unit tests reporting back 
for DB2 using 3rd party CI and triggers.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] Split of the openstack-dev list (summary so far)

2013-11-16 Thread Matt Riedemann
 of nova-network? (3 messages)
[openstack-dev] [Nova] New API requirements, review of GCE (6 messages)
[openstack-dev] how can I know a new instance is created from the code ?
(3 messages)
[openstack-dev] [Nova] Icehouse Blueprints (2 messages)
[openstack-dev] [Solum/Heat] Is Solum really necessary? (14 messages)
[openstack-dev] Nova XML serialization bug 1223358 moving discussion
here to get more people involved (4 messages)
[openstack-dev] [RFC] Straw man to start the incubation / graduation
requirements discussion (11 messages)
[openstack-dev] [Savanna] DiskBuilder / savanna-image-elements (4 messages)
[openstack-dev] [Keystone] Blob in keystone v3 certificate API (2 messages)
[openstack-dev] [oslo] team meeting Friday 15 November @ 14:00 UTC (2
messages)
[openstack-dev] [Trove][Savanna][Murano] Unified Agent proposal
discussion at Summit (6 messages)
[openstack-dev] [oslo] tracking graduation status for incubated code
[openstack-dev] [OpenStack-dev][Neutron][Tempest]Can Tempest embrace
some complicated network scenario tests (3 messages)
[openstack-dev] [nova][cinder][oslo][scheduler] How to leverage oslo
schduler/filters for nova and cinder (6 messages)
[openstack-dev] [Nova] Hypervisor CI requirement and deprecation  plan
[openstack-dev] [Ceilometer] compute agent cannot start (7 messages)
[openstack-dev] [Horizon] Use icon set instead of instance Action (4
messages)
[openstack-dev] [OpenStack][Horizon] poweroff/shutdown action in horizon
(3 messages)
[openstack-dev] [Murano] Implementing Elastic Applications (3 messages)

Now - tell me in the above list where the mass of StackForge related
email overwhelming madness is coming from. I count 4 topics and 26
messages out of a total of 44 topics and 328 messages.

So - before we take the extreme move of segregation, can we just try
threaded mail readers for a while and see if it helps?

Monty




Thanks for the tip, Monty. I just started using Thunderbird last week 
and already had my tags sorting most of the dev list into folders, but 
I just installed the Conversations add-on to further clean things up.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [stable/havana] gate broken

2013-11-17 Thread Matt Riedemann



On Sunday, November 17, 2013 7:46:39 AM, Gary Kotton wrote:

Hi,
The gating for the stable version is broken when the running the
neutron gate. Locally this works but the gate has problem. All of the
services are up and running correctly. There are some exceptions with
the ceilometer service but that is not related to the neutron gating.

The error message is as follows:
2013-11-17 11:00:05.855 | 2013-11-17 11:00:05
(http://logs.openstack.org/46/56746/1/check/check-tempest-devstack-vm-neutron/a02894b/console.html#_2013-11-17_11_00_05_855)
2013-11-17 11:00:17.239 | Process leaked file descriptors. See
http://wiki.jenkins-ci.org/display/JENKINS/Spawning+processes+from+build for more information
(http://logs.openstack.org/46/56746/1/check/check-tempest-devstack-vm-neutron/a02894b/console.html#_2013-11-17_11_00_17_239)
2013-11-17 11:00:17.437 | Build step 'Execute shell' marked build as failure
(http://logs.openstack.org/46/56746/1/check/check-tempest-devstack-vm-neutron/a02894b/console.html#_2013-11-17_11_00_17_437)
2013-11-17 11:00:19.129 | [SCP] Connecting to static.openstack.org
(http://logs.openstack.org/46/56746/1/check/check-tempest-devstack-vm-neutron/a02894b/console.html#_2013-11-17_11_00_19_129)
Thanks
Gary




I've seen this fail on at least two stable/havana patches in nova 
today, so I opened this bug:


https://bugs.launchpad.net/openstack-ci/+bug/1252024

--

Thanks,

Matt Riedemann




[openstack-dev] [nova] [api] How to handle bug 1249526?

2013-11-17 Thread Matt Riedemann
This is mainly just a newbie question, but it looks like it could be an easy 
fix. The bug report is just asking for the nova os-fixed-ips API 
extension to return the 'reserved' status for the fixed IP. I don't see 
that in the v3 API list though; was it dropped in v3? If it's not 
being ported to v3, I'm sure there was a good reason, so maybe this isn't 
worth implementing in the v2 API, even though it seems like a pretty 
harmless backwards-compatible change. Am I missing something here?


--

Thanks,

Matt Riedemann




[openstack-dev] [infra] How to determine patch set load for a given project

2013-11-19 Thread Matt Riedemann
We have a team working on getting CI set up for DB2 10.5 in 
sqlalchemy-migrate, and they were asking me if there was a way to 
calculate the patch load for that project.


I asked around in the infra IRC channel and Jeremy Stanley pointed out 
that there might be something available in 
http://graphite.openstack.org/ by looking for the project's test stats.


I found that if you expand stats_counts > zuul > job and then search for 
your project (sqlalchemy-migrate in this case), you can find the jobs 
and their graphs for load. In my case I care about stats for 
gate-sqlalchemy-migrate-python27.


I'm having a little trouble interpreting the data though. From looking 
at what's out there for review now, there is one new patch created on 
11/19 and the last new one before that was on 11/15. I see spikes in the 
graph around 11/15, 11/18 and 11/19, but I'm not sure what the 11/18 
spike is from?
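For anyone else poking at these stats, the Graphite render endpoint can return the raw datapoints as JSON, which is easier to interpret than the graphs. A sketch only: the stats_counts.zuul.job metric path is inferred from the tree described above, and actually fetching the URL (e.g. with urllib) is left out:

```python
import json
from urllib.parse import urlencode

GRAPHITE = 'http://graphite.openstack.org/render'


def job_count_url(job, days=7):
    """Build a Graphite render URL for a zuul job's run counts as JSON.

    The metric path below is an assumption based on the stats_counts >
    zuul > job tree; adjust it to whatever the browser shows.
    """
    target = 'stats_counts.zuul.job.%s.SUCCESS' % job
    return GRAPHITE + '?' + urlencode(
        {'target': target, 'from': '-%dd' % days, 'format': 'json'})


def total_runs(render_json):
    """Sum the non-null datapoints from a Graphite JSON response body."""
    data = json.loads(render_json)
    return sum(int(value) for series in data
               for value, _timestamp in series['datapoints']
               if value is not None)
```

Summing the datapoints over a window gives one number per job, which makes day-to-day spikes like the 11/18 one easier to compare against the review activity.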


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova] VNC issue with multi compute node with openstack havana

2013-11-20 Thread Matt Riedemann



On Wednesday, November 20, 2013 7:49:50 AM, Vikash Kumar wrote:

Hi,

  I used devstack Multi-Node + VLANs to install openstack-havana
recently. Installation was successful and I verified basic things like
VM launch and ping between VMs.

  I have two nodes: 1. Ctrl+Compute  2. Compute

  The VM which gets launched on the second compute node (here 2, see
above) doesn't get a VNC console. I tried to access it from both Horizon
and the URL given by the nova CLI.

   The *n-novnc* screen on the first node, which is the controller (here 1),
gave this error log:

   Traceback (most recent call last):
  File
/usr/local/lib/python2.7/dist-packages/websockify/websocket.py, line
711, in top_new_client
self.new_client()
  File /opt/stack/nova/nova/console/websocketproxy.py, line 68, in
new_client
tsock = self.socket(host, port, connect=True)
  File
/usr/local/lib/python2.7/dist-packages/websockify/websocket.py, line
188, in socket
sock.connect(addrs[0][4])
  File /usr/local/lib/python2.7/dist-packages/eventlet/greenio.py,
line 192, in connect
socket_checkerr(fd)
  File /usr/local/lib/python2.7/dist-packages/eventlet/greenio.py,
line 46, in socket_checkerr
raise socket.error(err, errno.errorcode[err])
error: [Errno 111] ECONNREFUSED


  The vnc related configuration in nova.conf on Ctrl+Compute node:

   vncserver_proxyclient_address = 127.0.0.1
   vncserver_listen = 127.0.0.1
   vnc_enabled = true
   xvpvncproxy_base_url = http://192.168.2.151:6081/console
   novncproxy_base_url = http://192.168.2.151:6080/vnc_auto.html

   and on second Compute node:
  /* I corrected the I.P. of first two address, by default it sets to
127.0.0.1 */
   vncserver_proxyclient_address = 192.168.2.157
   vncserver_listen = 0.0.0.0
   vnc_enabled = true
   xvpvncproxy_base_url = http://192.168.2.151:6081/console
   novncproxy_base_url = http://192.168.2.151:6080/vnc_auto.html

I also added the host name of the compute node to the hosts file of the
controller node. With this, error 111 was gone and a new error came:

connecting to: 192.168.2.157:-1
  7: handler exception: [Errno -8] Servname not supported for ai_socktype
  7: Traceback (most recent call last):
  File
/usr/local/lib/python2.7/dist-packages/websockify/websocket.py, line
711, in top_new_client
self.new_client()
  File /opt/stack/nova/nova/console/websocketproxy.py, line 68, in
new_client
tsock = self.socket(host, port, connect=True)
  File
/usr/local/lib/python2.7/dist-packages/websockify/websocket.py, line
180, in socket
socket.IPPROTO_TCP, flags)
  gaierror: [Errno -8] Servname not supported for ai_socktype


   What needs to be done to resolve this?

Thnx








This mailing list is for development discussion only. For support, you 
should go to the general mailing list:


https://wiki.openstack.org/wiki/Mailing_Lists#General_List

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Diagnostic] Diagnostic API: summit follow-up

2013-11-20 Thread Matt Riedemann



On Wednesday, November 20, 2013 7:52:39 AM, Oleg Gelbukh wrote:

Hi, fellow stackers,

There was a conversation during 'Enhance debugability' session at the
summit about Diagnostic API which allows gate to get 'state of world'
of OpenStack installation. 'State of world' includes hardware- and
operating system-level configurations of servers in cluster.

This info would help to compare the expected effect of tests on a
system with its actual state, thus providing Tempest with ability to
see into it (whitebox tests) as one of possible use cases. Another use
case is to provide input for validation of OpenStack configuration files.

We're putting together an initial version of data model of API with
example values in the following etherpad:
https://etherpad.openstack.org/p/icehouse-diagnostic-api-spec

This version covers most hardware and system-level configurations
managed by OpenStack in Linux system. What is missing from there? What
information you'd like to see in such an API? Please, feel free to
share your thoughts in ML, or in the etherpad directly.


--
Best regards,
Oleg Gelbukh
Mirantis Labs




Hi Oleg,

There has been some discussion over the nova virtapi's get_diagnostics 
method.  The background is in a thread from October [1].  The timing is 
pertinent since the VMware team is working on implementing that API for 
their nova virt driver [2].  The main issue is there is no consistency 
between the nova virt drivers and how they would implement the 
get_diagnostics API, they only return information that is 
hypervisor-specific.  The API docs and current Tempest test covers the 
libvirt driver's implementation, but wouldn't work for say xen, vmware 
or powervm drivers.


I think the solution right now is to namespace the keys in the dict 
that is returned from the API so a caller could at least check for that 
and know how to handle processing the result, but it's not ideal.
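The namespacing idea is simple enough to show in a few lines; this is just an illustration of the proposal (the helper, field names, and values are all hypothetical, not nova code):

```python
def namespace_diagnostics(driver_name, diagnostics):
    """Prefix every key with the driver name so a caller can tell which
    hypervisor the fields came from before trying to interpret them."""
    return {'%s:%s' % (driver_name, key): value
            for key, value in diagnostics.items()}


# Two drivers returning different, hypervisor-specific fields:
libvirt_diags = namespace_diagnostics(
    'libvirt', {'cpu0_time': 17300000000, 'vda_read': 262144})
vmware_diags = namespace_diagnostics(
    'vmware', {'memoryMB': 512})

# A caller can now dispatch on the namespace instead of guessing the schema:
merged = dict(libvirt_diags, **vmware_diags)
```

It doesn't make the payloads consistent across drivers, which is why it's only a stopgap, but it at least makes the inconsistency explicit to callers.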


Does your solution take into account the nova virtapi's get_diagnostics 
method?


[1] 
http://lists.openstack.org/pipermail/openstack-dev/2013-October/016385.html

[2] https://review.openstack.org/#/c/51404/

--

Thanks,

Matt Riedemann




Re: [openstack-dev] The recent gate performance and how it affects you

2013-11-20 Thread Matt Riedemann



On Wednesday, November 20, 2013 2:44:52 PM, Clark Boylan wrote:

Joe Gordon has been doing great work tracking test failures and how
often they affect us. Post-Havana-release the failure rate has
increased dramatically, negatively affecting the gate and forcing it to
run in a near worst-case scenario. That is, changes are being tested in
parallel, but the head of the queue is more often than not running into a
failed job, forcing all changes behind it to be retested, and so on.

This led to a gate queue 130 deep with the head of the queue 18 hours
behind its approval. We have identified fixes for some of the worst
current bugs, and in order to get them in we have restarted Zuul, effectively
cancelling the gate queue, and have queued these changes up at the front
of the queue. Once these changes are in and we are happy with the
bug-fixing results, we will requeue changes that were in the queue when it
got cancelled.

How do we avoid this in the future? Step one is reviewers that are
approving changes (or reverifying them) should keep an eye on the gate
queue. If it is struggling, adding more changes to that queue probably
won't help. Instead we should focus on identifying the bugs, submitting
changes to elastic-recheck to track these bugs, and work towards fixing
the bugs. Everyone is affected by persistent gate failures, we need to
work together to fix them.

Thank you for your patience,

Clark




Let me also say that I think it's really helpful that Joe has been 
sending out recaps to the mailing list about the top offenders so 
people can help pitch in on investigating and fixing those (like we saw 
with the Neutron team's response to Joe's recent post about the top 
gate failures).


People get heads-down in their own projects and what they are working 
on and it's hard to keep up with what's going on in the infra channel 
(or nova channel for that matter), so sending out a recap that everyone 
can see in the mailing list is helpful to reset where things are at and 
focus possibly various isolated investigations (as we saw happen this 
week).


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Diagnostic] Diagnostic API: summit follow-up

2013-11-21 Thread Matt Riedemann



On 11/20/2013 9:35 PM, Lingxian Kong wrote:

Hi Matt,

I noticed there is no consensus there [1]; any progress outside the ML?

[1]
http://lists.openstack.org/pipermail/openstack-dev/2013-October/016385.html



2013/11/21 Oleg Gelbukh ogelb...@mirantis.com

Matt,

Thank you for bringing this up. I've been following this thread and
the idea is somewhat aligned with our approach, but we'd like to
take one step further.

In this Diagnostic API, we want to collect information about system
state from sources outside of OpenStack. We probably should
extract this call from the Nova API and use it in our implementation to
get hypervisor-specific information about the virtual machines that
exist on the node. But the idea is to get a view into the system
state alternative to that provided by the OpenStack APIs.

Maybe we should reconsider our naming to avoid confusion and call
this Instrumentation API or something like that?

--
Best regards,
Oleg Gelbukh


On Wed, Nov 20, 2013 at 6:45 PM, Matt Riedemann mrie...@linux.vnet.ibm.com wrote:



On Wednesday, November 20, 2013 7:52:39 AM, Oleg Gelbukh wrote:

Hi, fellow stackers,

There was a conversation during 'Enhance debugability' session at the
summit about Diagnostic API which allows gate to get 'state of world'
of OpenStack installation. 'State of world' includes hardware- and
operating system-level configurations of servers in cluster.

This info would help to compare the expected effect of tests on a
system with its actual state, thus providing Tempest with ability to
see into it (whitebox tests) as one of possible use cases. Another use
case is to provide input for validation of OpenStack configuration files.

We're putting together an initial version of data model of API with
example values in the following etherpad:
https://etherpad.openstack.org/p/icehouse-diagnostic-api-spec

This version covers most hardware and system-level configurations
managed by OpenStack in Linux system. What is missing from there? What
information you'd like to see in such an API? Please, feel free to
share your thoughts in ML, or in the etherpad directly.

--
Best regards,
Oleg Gelbukh
Mirantis Labs




Hi Oleg,

There has been some discussion over the nova virtapi's
get_diagnostics method. The background is in a thread from
October [1]. The timing is pertinent since the VMware team is
working on implementing that API for their nova virt driver [2].
The main issue is there is no consistency between the nova
virt drivers and how they would implement the get_diagnostics
API, they only return information that is hypervisor-specific.
The API docs and current Tempest test covers the libvirt
driver's implementation, but wouldn't work for say xen, vmware
or powervm drivers.

I think the solution right now is to namespace the keys in the
dict that is returned from the API so a caller could at least
check for that and know how to handle processing the result, but
it's not ideal.

Does your solution take into account the nova virtapi's
get_diagnostics method?

[1]
http://lists.openstack.org/pipermail/openstack-dev/2013-October/016385.html
[2] https://review.openstack.org/#/c/51404/

--

Thanks,

Matt Riedemann







--
*---*
*Lingxian Kong*
Huawei Technologies Co.,LTD.
IT Product Line CloudOS PDU
China, Xi'an
Mobile: +86-18602962792
Email: konglingx...@huawei.com

Re: [openstack-dev] Top Gate Bugs

2013-11-21 Thread Matt Riedemann
+topic:57578,n,z
but went far enough to revert the change that introduced that test. A
couple people were going to keep hitting those changes to run them
through more tests and see if 1251920 goes away.

I don't quite understand why this test is problematic (Joe indicated
it went in at about the time 1251920 became a problem). I would be
very interested in finding out why this caused a problem.

You can see frequencies for bugs with known signatures at
http://status.openstack.org/elastic-recheck/

Hope this helps.

Clark




Joe is tracking some notes in an etherpad here:

https://etherpad.openstack.org/p/critical-patches-gatecrash-November-2013

I've added https://review.openstack.org/#/c/57069/ and 
https://review.openstack.org/#/c/57042/ to the list.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova][Glance] Support of v1 and v2 glance APIs in Nova

2013-11-21 Thread Matt Riedemann
 client. But
   that seems better than having that code in nova.
  
   I know in Glance we've largely taken the view that the client
should be as thin and lightweight as possible so users of the client can
make use of it however they best see fit. There was an earlier patch
that would have moved the whole image service layer into glanceclient
that was rejected. So I think there is a division in philosophies here
as well
 
  Hmm, I would be a fan of supporting both use cases, nova style and
  more complex. Just seems better for glance to own as much as possible
  of the glance client-like code. But I am a nova guy, I would say that!
  Anyway, that's a different conversation.
 
  John
 






I'm joining this thread a bit late but wanted to raise a few points for 
consideration.


1. It doesn't look like the 'use-glance-v2-api' blueprint [1] has gone 
anywhere since this thread seems to have hit a dead-end.


2. There is a blueprint [2] for nova supporting the cinder v2 API now 
too and the related review is actually defaulting to use v2, so given 
the history on this with the glance discussion, I think it's relevant to 
drop it into the same conversation.


3. As for the keystone service catalog being used to abstract some of 
this, there was a related blueprint [3] for abstracting the glance URI 
that nova would talk to. The blueprint was closed because I think Joe 
Gordon had something else cooking for enhancing the keystone service 
catalog, but there weren't any details put into the closed blueprint 
that Yang Yu opened. Where are we with that?


I plan on bringing this up as a blueprint topic in today's nova meeting.

[1] https://blueprints.launchpad.net/nova/+spec/use-glance-v2-api
[2] https://blueprints.launchpad.net/nova/+spec/support-cinderclient-v2
[3] https://blueprints.launchpad.net/nova/+spec/nova-enable-glance-arbitrary-url


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova][Glance] Support of v1 and v2 glance APIs in Nova

2013-11-22 Thread Matt Riedemann



On Thursday, November 21, 2013 9:56:41 AM, Matt Riedemann wrote:



On 11/3/2013 5:22 AM, Joe Gordon wrote:


On Nov 1, 2013 6:46 PM, John Garbutt j...@johngarbutt.com wrote:
 
  On 29 October 2013 16:11, Eddie Sheffield eddie.sheffi...@rackspace.com wrote:
  
   John Garbutt j...@johngarbutt.com said:
  
   Going back to Joe's comment:
   Can both of these cases be covered by configuring the keystone
catalog?
   +1
  
   If both v1 and v2 are present, pick v2, otherwise just pick
what is in
   the catalogue. That seems cool. Not quite sure how the multiple
glance
   endpoints work in the keystone catalog, but should work I assume.
  
   We hard code nova right now, and so we probably want to keep that
route too?
  
   Nova doesn't use the catalog from Keystone when talking to Glance.
There is a config value glance_api_servers which defines a list of
Glance servers that gets randomized and cycled through. I assume that's
what you're referring to with we hard code nova. But currently there's
nowhere in this path (internal nova to glance) where the keystone
catalog is available.
 
  Yes. I was not very clear. I am proposing we change that. We could
try
  shoehorn the multiple glance nodes in the keystone catalog, then
cache
  that in the context, but maybe that doesn't make sense. This is a
  separate change really.

FYI:  We cache the cinder endpoints from keystone catalog in the context
already. So doing something like that with glance won't be without
precedent.

 
  But clearly, we can't drop the direct configuration of glance servers
  for some time either.
 
   I think some of the confusion may be that Glanceclient at the
programmatic client level doesn't talk to keystone. That happens
higher in the CLI level which doesn't come into play here.
  
   From: Russell Bryant rbry...@redhat.com
   On 10/17/2013 03:12 PM, Eddie Sheffield wrote:
   Might I propose a compromise?
  
   1) For the VERY short term, keep the config value and get the
change otherwise
   reviewed and hopefully accepted.
  
   2) Immediately file two blueprints:
  - python-glanceclient - expose a way to discover available
versions
  - nova - depends on the glanceclient bp and allowing
autodiscovery of glance
   version
   and making the config value optional (tho not
deprecated / removed)
  
   Supporting both seems reasonable.  At least then *most* people
don't
   need to worry about it and it just works, but the override is
there if
   necessary, since multiple people seem to be expressing a desire
to have
   it available.
  
   +1
  
   Can we just do this all at once?  Adding this to glanceclient
doesn't
   seem like a huge task.
  
   I worry about us never getting the full solution, but it seems
to have
   got complicated.
  
   The glanceclient side is done, as far as allowing access to the
list of available API versions on a given server. It's getting Nova to
use this info that's a bit sticky.
 
  Hmm, OK. Could we not just cache the detected version, to reduce the
  impact of that decision.
 
   On 28 October 2013 15:13, Eddie Sheffield eddie.sheffi...@rackspace.com wrote:
   So...I've been working on this some more and hit a bit of a
snag. The
   Glanceclient change was easy, but I see now that doing this in
nova will require
   a pretty huge change in the way things work. Currently, the API
version is
   grabbed from the config value, the appropriate driver is
instantiated, and calls
   go through that. The problem comes in that the actually glance
server isn't
   communicated with until very late in the process. Nothing sees
the servers at
   the level where the driver is determined. Also there isn't a
single glance server
   but a list of them, and in the event of certain communication
failures the list is
   cycled through until success or a number of retries has passed.
  
   So to change this to auto configuring will require turning this
upside down,
   cycling through the servers at a higher level, choosing the
appropriate driver
   for that server, and handling retries at that same level.
  
   Doable, but a much larger task than I first was thinking.
  
   Also, I don't really want the added overhead of getting the api
versions before
   every call, so I'm thinking that going through the list of
servers at startup and
   discovering the versions then and caching that somehow would be
helpful as well.
  
   Thoughts?
  
   I do worry about that overhead. But with Joe's comment, does it
not
   just boil down to caching the keystone catalog in the context?
  
   I am not a fan of all the specific talk to glance code we have in
   nova, moving more of that into glanceclient can only be a good
thing.
   For the XenServer integration, for efficiency reasons, we need
glance
   to talk from dom0, so it has dom0 making the final HTTP call
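The caching idea John floats above could be sketched roughly as follows. This is a hypothetical sketch only: `get_server_versions` stands in for whatever version-discovery call python-glanceclient ends up exposing, `api_servers` mirrors the existing `glance_api_servers` config list, and the prefer-v2 rule is illustrative.

```python
import random


class GlanceApiResolver:
    """Pick a glance server and API version, caching discovery results."""

    def __init__(self, api_servers, get_server_versions):
        self._api_servers = list(api_servers)
        self._discover = get_server_versions
        self._version_cache = {}  # server -> chosen API version

    def pick(self):
        """Return (server, version), preferring v2 when both are offered."""
        server = random.choice(self._api_servers)
        if server not in self._version_cache:
            # Discovery happens once per server, not before every call.
            available = self._discover(server)
            self._version_cache[server] = "2" if "2" in available else "1"
        return server, self._version_cache[server]
```

Doing the discovery lazily and memoizing it per server keeps the per-call overhead Eddie worries about down to a dictionary lookup after the first request.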

[openstack-dev] [nova] who wants to own docker bug triage?

2013-11-23 Thread Matt Riedemann
Going through nova bug triage today I noticed a pretty straight-forward 
untagged bug for docker but then noticed we didn't have a docker tag in 
our bug tag table [1].  I went ahead and added one and the queries show 
a decent number of results, so people were already using the tag.


The question is, who wants their name in the box next to that tag for 
owning triage?


[1] https://wiki.openstack.org/wiki/Nova/BugTriage#Step_2:_Triage_Tagged_Bugs


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] who wants to own docker bug triage?

2013-11-23 Thread Matt Riedemann



On Saturday, November 23, 2013 3:28:28 PM, Robert Collins wrote:

Cool; also, if it's not, we should add that as an official tag so that
it type-completes in LP.

On 24 November 2013 10:21, Matt Riedemann mrie...@linux.vnet.ibm.com wrote:

Going through nova bug triage today I noticed a pretty straight-forward
untagged bug for docker but then noticed we didn't have a docker tag in our
bug tag table [1].  I went ahead and added one and the queries show a decent
number of results, so people were already using the tag.

The question is, who wants their name in the box next to that tag for owning
triage?

[1]
https://wiki.openstack.org/wiki/Nova/BugTriage#Step_2:_Triage_Tagged_Bugs

--

Thanks,

Matt Riedemann








Good idea.  I don't know how to do that though.  Any guides I can 
follow to make that happen?


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Tempest] Drop python 2.6 support

2013-11-25 Thread Matt Riedemann



On Monday, November 25, 2013 7:35:51 AM, Zhi Kun Liu wrote:

Hi all,

I saw that Tempest will drop python 2.6 support in design summit
https://etherpad.openstack.org/p/icehouse-summit-qa-parallel.


Drop tempest python 2.6 support: Remove all nose hacks in the code


Delete nose, use unittest2 with testr/testtools and everything
*should* just work (tm)



Does that mean Tempest won't be able to run on python 2.6 in the future?

--
Regards,
Zhi Kun Liu




Well so if you're running a single-node setup of OpenStack on a VM on 
top of RHEL 6 and running Tempest from there, yeah, this is an 
inconvenience, but it's a pretty simple fix, right?  I just run my 
OpenStack RHEL 6 VM and have an Ubuntu 12.04 or Fedora 19 or whatever 
distro-that-supports-py27 I want running Tempest against it.  Am I 
missing something?


FWIW, trying to keep up with the changes in Tempest when you're running 
on python 2.6 is no fun, especially with how tests are skipped 
(skipException causes a test failure if you don't have a special 
environment variable set).  Plus you don't get parallel execution of 
the tests.


So I agree with the approach even though it's going to hurt me in the 
short-term.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova] Hypervisor CI requirement and deprecation plan

2013-11-25 Thread Matt Riedemann



On 11/15/2013 9:28 AM, Dan Smith wrote:

Hi all,

As you know, Nova adopted a plan to require CI testing for all our
in-tree hypervisors by the Icehouse release. At the summit last week, we
determined the actual plan for deprecating non-compliant drivers. I put
together a page detailing the specific requirements we're putting in
place as well as a plan and timeline for how the deprecation process
will proceed:

https://wiki.openstack.org/wiki/HypervisorSupportMatrix/DeprecationPlan

I also listed the various drivers and whether we've heard any concrete
plans from them. Driver owners should feel free to add details to that
and correct any of the statements if incorrect.

Thanks!

--Dan




I'll play devil's advocate here and ask this question before someone 
else does.  I'm assuming that the requirement of a 'full' tempest run 
means running this [1].  Is that correct?  It's just confusing sometimes 
because there are other things in Tempest that aren't in the 'full' run, 
like stress tests.


Assuming that's what 'full' means, it's running API, CLI, third party 
(boto), and scenario tests.  Does it make sense to require a nova virt 
driver's CI to run API tests for keystone, heat and swift?  Or couldn't 
the nova virt driver CI be scoped down to just the compute API tests? 
The argument against that is probably that the network/image/volume 
tests may create instances using nova to do their API testing also.  The 
same would apply for the CLI tests since those are broken down by 
service, i.e. why would I need to run keystone and ceilometer CLI tests 
for a nova virt driver?


If nothing else, I think we could firm up the wording on the wiki a bit 
around the requirements and what that means for scope.


[1] https://github.com/openstack/tempest/blob/master/tox.ini#L33

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova] Hypervisor CI requirement and deprecation plan

2013-11-25 Thread Matt Riedemann



On Monday, November 25, 2013 4:37:29 PM, Russell Bryant wrote:

On 11/25/2013 05:19 PM, Matt Riedemann wrote:

I'll play devil's advocate here and ask this question before someone
else does.  I'm assuming that the requirement of a 'full' tempest run
means running this [1].  Is that correct?  It's just confusing sometimes
because there are other things in Tempest that aren't in the 'full' run,
like stress tests.

Assuming that's what 'full' means, it's running API, CLI, third party
(boto), and scenario tests.  Does it make sense to require a nova virt
driver's CI to run API tests for keystone, heat and swift?  Or couldn't
the nova virt driver CI be scoped down to just the compute API tests?
The argument against that is probably that the network/image/volume
tests may create instances using nova to do their API testing also.  The
same would apply for the CLI tests since those are broken down by
service, i.e. why would I need to run keystone and ceilometer CLI tests
for a nova virt driver?

If nothing else, I think we could firm up the wording on the wiki a bit
around the requirements and what that means for scope.

[1] https://github.com/openstack/tempest/blob/master/tox.ini#L33



I think the short answer is, whatever we're running against all Nova
changes in the gate.


Maybe a silly question, but is what is run against the check queue any 
different from the gate queue?




I expect that for some drivers, a more specific configuration is going
to be needed to exclude tests for features not implemented in that
driver.  That's fine.

Soon we also need to start solidifying criteria for what features *must*
be implemented in a driver.  I think we've let some drivers in with far
too many features not supported.  That's a separate issue from the CI
requirement, though.



--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova] Hypervisor CI requirement and deprecation plan

2013-11-26 Thread Matt Riedemann



On Tuesday, November 26, 2013 10:07:02 AM, Sean Dague wrote:

On 11/26/2013 09:56 AM, Russell Bryant wrote:

On 11/26/2013 09:38 AM, Bob Ball wrote:

-Original Message-
From: Russell Bryant [mailto:rbry...@redhat.com]
Sent: 26 November 2013 13:56
To: openstack-dev@lists.openstack.org
Cc: Sean Dague
Subject: Re: [openstack-dev] [Nova] Hypervisor CI requirement and
deprecation plan

On 11/26/2013 04:48 AM, Bob Ball wrote:


I hope we can safely say that we should run against all gating tests which

require Nova?  Currently we run quite a number of tests in the gate that
succeed even when Nova is not running as the gate isn't just for Nova but
for all projects.

Would you like to come up with a more detailed proposal?  What tests
would you cut, and how much time does it save?


I don't have a detailed proposal yet - but it's very possible that we'll want 
one in the coming weeks.

In terms of the time saved, I noticed that a tempest smoke run with Nova absent 
took 400 seconds on one of my machines (a particularly slow one) - so I imagine 
that would translate to maybe a 300 second / 5 minute reduction in overall 
time.  Total smoke took approximately 800 seconds on the same machine.


I don't think the smoke tests are really relevant here.  That's not
related to Nova vs non-Nova tests, right?


If the approach could be acceptable then yes, I'm happy to come up with a 
detailed set of tests that I would propose cutting.

My primary hesitation with the approach is it would need Tempest reviewers to 
be aware of this extra type of test, and flag up if a test is added to the full 
tempest suite which should also be in the nova tempest suite.


Right now I don't think it's acceptable.  I was suggesting a more
detailed proposal to help convince me.  :-)


So we already have the beginnings of service tags in Tempest, that would
let you slice exactly like this. I don't think the infrastructure is
fully complete yet, but the idea being that you could run the subset of
tests that interact with compute or networking in any real way.

Realize... that's not going to drop that many tests for something like
compute, it's touched a lot.

-Sean





Good to know about the service tags, I think I remember them being broken at 
some point after those tempest.conf.sample changes. :)


My overall concern, and I think the other guys doing this for virt 
drivers will agree, is trying to scope down the exposure to unrelated 
failures.  For example, if there is a bug in swift breaking the gate, 
it could start breaking the nova virt driver CI as well.  When things 
get bad in the gate, it takes some monstrous effort to rally people 
across the projects to come together to unblock it (like what Joe 
Gordon was doing last week).


I'm running Tempest internally about once per day when we rebase code 
with the community and that's to cover running with the PowerVM driver 
for nova, Storwize driver for cinder, OVS for neutron, with qpid and 
DB2.  We're running almost a full run except for the third party boto 
tests and swift API tests.  The thing is, when something fails, I have 
to figure out if it's environmental (infra), a problem with tempest 
(think instability with neutron in the gate), a configuration issue, or 
a code bug.  That's a lot for one person to have to cover, even a small 
team.  That's why at some points we just have to ignore/exclude tests 
that continuously fail but we can't figure out (think intermittent gate 
breaker bugs that are open for months).  Now multiply this out across 
all the nova virt drivers, the neutron plugins and I'm assuming at some 
point the various glance backends and cinder drivers (haven't heard if 
they are planning on the same types of CI requirements yet).  I think 
either we're going to have a lot of flaky/unstable driver CI going on 
so the scores can't be trusted, or we're going to develop a lot of 
people that get really good at infra/QA (which would be a plus in the 
long-run, but maybe not what those teams set out to be).


I don't have any good answers, I'm just trying to raise the issue since 
this is complicated.  I think it's also hard for people that aren't 
forced to invest in infra/QA on a daily basis to understand and 
appreciate the amount of effort it takes just to keep the wheels 
spinning, so I want to keep expectations at a reasonable level.


Don't get me wrong, I absolutely agree with requiring third party CI 
for the various vendor-specific drivers and plugins, that's a 
no-brainer for openstack to scale.  I think it will just be very 
interesting to see the kinds of results coming out of all of these 
disconnected teams come icehouse-3.


--

Thanks,

Matt Riedemann



Re: [openstack-dev] Working with Vagrant and packstack

2013-11-30 Thread Matt Riedemann



On Friday, November 29, 2013 2:16:23 AM, Peeyush Gupta wrote:

Hi all,

I have been trying to set up an openstack environment using vagrant
and packstack. I provisioned a Fedora-19 VM  through vagrant and used
a shell script to take care of installation and other things. The
first thing that shell script does is yum install -y
openstack-packstack and then packstack --allinone. Now, the issue
is that the second command requires me to enter the root's password
explicitly. I mean it doesn't matter if I am running this as root or
using sudo, I have to enter the password explicitly everytime. I tried
to pass the password to the VM through pipes and other methods, but
nothing works.

Did anyone face the same problem? Is there any way around this? Or
does it mean that I can't use puppet/packstack with vagrant?

Thanks,
~Peeyush Gupta




This sounds like a support question so it should be posted to the 
general mailing list:


https://wiki.openstack.org/wiki/Mailing_Lists#General_List

The openstack-dev list is for development discussion topics.

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] Do we have some guidelines for mock, stub, mox when writing unit test?

2013-12-04 Thread Matt Riedemann
 on this soon IMHO, as this comes
up with literally every commit.

Cheers,

Nikola

[1] https://review.openstack.org/#/c/59694/
[2] https://pypi.python.org/pypi/mox
[3] https://pypi.python.org/pypi/mox3/0.7.0









--

Thanks,

Matt Riedemann




Re: [openstack-dev] Top Gate Bugs

2013-12-06 Thread Matt Riedemann
 about the need 
for having a new column in the Service table for indicating whether or 
not the service was automatically disabled, as Phil Day points out in 
bug 1250049 [6].  That way the ComputeFilter in the scheduler could 
handle that case a bit differently, at least from a 
logging/serviceability standpoint, e.g. info/warning level message vs 
debug.


[1] https://bugs.launchpad.net/nova/+bug/1257644
[2] https://review.openstack.org/#/c/52189/
[3] https://review.openstack.org/#/c/56224/
[4] https://bugs.launchpad.net/nova/+bug/1254872
[5] http://www.redhat.com/archives/libvir-list/2012-July/msg01675.html
[6] https://bugs.launchpad.net/nova/+bug/1250049
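A minimal sketch of the ComputeFilter logging split suggested here, assuming bug 1250049's idea of recording whether a disable was automatic. The dict-shaped service record and the "AUTO:" reason prefix are assumptions for illustration, not nova's actual schema.

```python
import logging

LOG = logging.getLogger(__name__)

# Assumed marker distinguishing automatic disables from operator ones.
AUTO_DISABLED_PREFIX = "AUTO:"


def host_passes(service):
    """Skip disabled hosts, logging operator disables louder than
    automatic ones (warning vs. debug)."""
    if not service.get("disabled"):
        return True
    reason = service.get("disabled_reason") or ""
    if reason.startswith(AUTO_DISABLED_PREFIX):
        LOG.debug("Skipping auto-disabled compute host: %s", reason)
    else:
        LOG.warning("Skipping operator-disabled compute host: %s", reason)
    return False
```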

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [qa][keystone] Keystoneclient tests to tempest

2013-12-08 Thread Matt Riedemann



On Sunday, December 08, 2013 11:26:07 AM, Brant Knudson wrote:


We'd like to get the keystoneclient tests out of keystone. They're
serving a useful purpose of catching problems with non-backwards
compatible changes in keystoneclient so we still want them run.
Problem is they're running at the wrong time -- only on changes to
keystone and not changes to keystoneclient.

The tests need to be run:

When keystoneclient changes
 - run the tests against the change

When the tests change
 - run the change against the current keystoneclient and also old clients

When keystone changes
 - run the tests against the change with current client

So here's what I think we need to do to get keystone client tests out
of keystone:

 1) Figure out where to put the tests - is it tempest or something else?
 2) Write up a test and put it there
 3) Have a job that when there's a change in the tests it runs against
current client lib
 4) Expand the job to also run against old clients
- or is there 1 job per version?
- what versions? (keystone does master, essex-3, and 0.1.1)
- e.g. tox -e master,essex-3,0.1.1
- suggest start with these versions and then consider what to use
in future
 5) Now we can start adding tests
 6) Have a job that when there's a change in keystoneclient it runs
these tests against the change
 7) When there's a change in keystone, run these tests against the change
 8) Copy the keystoneclient tests from keystone to the new location --
will require some changes
 9) Remove the tests from keystone \o/
10) Move tests back to keystone where it makes sense -- use webtest like
v3 tests
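Step (4) might look something like the tox fragment below; the env names, git URL, and pinned version are illustrative guesses, not a working configuration:

```ini
[testenv:client-master]
deps = git+https://git.openstack.org/openstack/python-keystoneclient#egg=python-keystoneclient
commands = python -m testtools.run discover keystoneclient_tests

[testenv:client-0.1.1]
deps = python-keystoneclient==0.1.1
commands = python -m testtools.run discover keystoneclient_tests
```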

I created an etherpad with this same info so it's easier to discuss:
https://etherpad.openstack.org/p/KeystoneTestsToTempest

- Brant





I'll ask the super obvious question, why not move the keystoneclient 
tests to keystoneclient?


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [Nova][QA] Disabling the v3 API tests in the gate

2014-06-12 Thread Matt Riedemann



On 6/12/2014 12:40 AM, Christopher Yeoh wrote:

On Thu, Jun 12, 2014 at 7:30 AM, Matthew Treinish mtrein...@kortar.org wrote:

Hi everyone,

As part of debugging all the bugs that have been plaguing the gate
the past
couple of weeks one of the things that came up is that we're still
running the
v3 API tests in the gate. AIUI at summit Nova decided that the v3
API won't
exist as a separate major version. So I'm not sure there is much
value in
continuing to run the API tests.


So the v3 API won't exist as a separate major version, but I think its
very important we keep up with the tempest tests so we don't regress.
Over time these v3 api features will either be ported to
v2.1 microversions (the vast majority I expect) or dropped. At that
point they'll be moved to tempest testing v2.1 microversions.

  But whatever we do we'll need to test against v2 (which we're stuck
with for a very long time) and v2.1 microversions (rolling possible
backwards incompatible changes to the v2 api) for quite a while.


The main motivator for doing this is the total run time of tempest, the v3
tests
add ~7-10min of time to the gating jobs right now. [1] (which is
just a time
test, not how it'll be implemented) While this doesn't seem like much it
actually would make a big difference in our total throughput. Every
little bit
counts. There are probably some other less quantifiable benefits to
removing the
extra testing like for example slightly decreasing the load on nova
in an
already stressed environment like the gating nodes.

So I'd like to propose that we disable running the v3 API tests in
the gate. I
was thinking we would keep the tests around in tree for as long as
there was
a v3 API in any supported nova branch, but instead of running them
in the gate
just have a nightly bit-rot job on the tests and also add it to the
experimental
queue.


I'd really prefer we don't take this route, but its better than nothing.
Incidentally the v3 tempest api tests have in the past found race
conditions which did theoretically occur in the v2 api as well. Just the
different architecture exposed them a bit better.

Chris





I think it'd be OK to move them to the experimental queue and a periodic 
nightly job until the v2.1 stuff shakes out.  The v3 API is marked 
experimental right now so it seems fitting that it'd be running tests in 
the experimental queue until at least the spec is approved and 
microversioning starts happening in the code base.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [all] Gate still backed up - need assistance with nova-network logging enhancements

2014-06-12 Thread Matt Riedemann



On 6/10/2014 5:36 AM, Michael Still wrote:

https://review.openstack.org/99002 adds more logging to
nova/network/manager.py, but I think you're not going to love the
debug log level. Was this the sort of thing you were looking for
though?

Michael

On Mon, Jun 9, 2014 at 11:45 PM, Sean Dague s...@dague.net wrote:

Based on some back of envelope math the gate is basically processing 2
changes an hour, failing one of them. So if you want to know how long
the gate is, take the length / 2 in hours.

Right now we're doing a lot of revert roulette, trying to revert things
that we think landed about the time things went bad. I call this
roulette because in many cases the actual issue isn't well understood. A
key reason for this is:

*nova network is a blackhole*

There is no work unit logging in nova-network, and no attempted
verification that the commands it ran did a thing. Most of these
failures that we don't have good understanding of are the network not
working under nova-network.

So we could *really* use a volunteer or two to prioritize getting that
into nova-network. Without it we might manage to turn down the failure
rate by reverting things (or we might not) but we won't really know why,
and we'll likely be here again soon.

 -Sean

--
Sean Dague
http://dague.net









I mentioned this in the nova meeting today also, but the associated bug 
for the nova-network ssh timeout issue is bug 1298472 [1].


My latest theory on that one is if there could be a race/network leak in 
the ec2 third party tests in Tempest or something in the ec2 API in 
nova, because I saw this [2] showing up in the n-net logs.  My thinking 
is the tests or the API are not tearing down cleanly and eventually 
network resources are leaked and we start hitting those timeouts.  Just 
a theory at this point, but the ec2 3rd party tests do run concurrently 
with the scenario tests so things could be colliding at that point, but 
I haven't had time to dig into it, plus I have very little experience in 
those tests or the ec2 API in nova.


[1] https://bugs.launchpad.net/tempest/+bug/1298472
[2] http://goo.gl/6f1dfw

--

Thanks,

Matt Riedemann




Re: [openstack-dev] Message level security plans.

2014-06-12 Thread Matt Riedemann



On 6/12/2014 10:08 AM, Kelsey, Timothy Joh wrote:

Hello OpenStack folks,

First please allow me to introduce myself, my name is Tim Kelsey and I’m a 
security developer working at HP. I am very interested in projects like Kite 
and the work that’s being undertaken to introduce message level security into 
OpenStack and would love to help out on that front. In an effort to ascertain 
the current state of development it would be great to hear from the people who 
are involved in this and find out what's being worked on or planned in 
blueprints.

Many Thanks,

--
Tim Kelsey
Cloud Security Engineer
HP Helion




Are you talking about log messages or RPC messages?  For log messages, 
there is a thread that started yesterday on masking auth tokens [1].


If RPC, I'm aware of at least one issue filed against Qpid [2] for 
allowing a way to tell Qpid not to log a message since it might contain 
sensitive information (like auth tokens).


Looks like there is also an older blueprint for trusted messaging here [3].

[1] http://lists.openstack.org/pipermail/openstack-dev/2014-June/037345.html
[2] https://issues.apache.org/jira/browse/QPID-5772
[3] https://blueprints.launchpad.net/oslo.messaging/+spec/trusted-messaging

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Gate proposal - drop Postgresql configurations in the gate

2014-06-12 Thread Matt Riedemann



On 6/12/2014 9:38 AM, Mike Bayer wrote:


On 6/12/14, 8:26 AM, Julien Danjou wrote:

On Thu, Jun 12 2014, Sean Dague wrote:


That's not catchable in unit or functional tests?

Not in an accurate manner, no.


Keeping jobs alive based on the theory that they might one day be useful
is something we just don't have the liberty to do any more. We've not
seen an idle node in zuul in 2 days... and we're only at j-1. j-3 will
be at least +50% of this load.

Sure, I'm not saying we don't have a problem. I'm just saying it's not a
good solution to fix that problem IMHO.


Just my 2c without having a full understanding of all of OpenStack's CI
environment: PostgreSQL is definitely different enough that MySQL strict
mode could still allow issues to slip through quite easily. As for
capacity issues, this might be longer term, but I'm hoping to make
database-related tests much faster if we can move to a model that spends
much less time creating databases and schemas.



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Is there some organization out there that uses PostgreSQL in production 
that could stand up 3rd party CI with it?


I know that at least for the DB2 support we're adding across the 
projects we're doing 3rd party CI for that. Granted it's a proprietary 
DB unlike PG but if we're talking about spending resources on testing 
for something that's not widely used, but there is a niche set of users 
that rely on it, we could/should move that to 3rd party CI.


I'd much rather see us spend our test resources on getting multi-node 
testing running in the gate so we can test migrations in Nova.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Olso] Periodic task coalescing

2014-06-12 Thread Matt Riedemann



On 6/12/2014 8:55 AM, Tom Cammann wrote:

Hello,

I'm addressing https://bugs.launchpad.net/oslo/+bug/1326020 which is
dealing with periodic tasks.

There is currently a code block that checks if a task is 0.2 seconds
away from being run and if so it run now instead. Essentially
coalescing nearby tasks together.

 From oslo-incubator/openstack/common/periodic_task.py:162

# If a periodic task is _nearly_ due, then we'll run it early
idle_for = min(idle_for, spacing)
if last_run is not None:
    delta = last_run + spacing - time.time()
    if delta > 0.2:
        idle_for = min(idle_for, delta)
        continue
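To see what that check actually does, here is a standalone sketch of the same decision, with made-up timestamps; the function and variable names are mine, not oslo's:

```python
def compute_idle(last_run, spacing, now, idle_for):
    """Simplified rework of the oslo-incubator check above.

    Returns (seconds_to_idle, run_now).  A task more than 0.2s from
    being due just shortens the idle time; anything closer is run
    early -- the "coalescing" in question.
    """
    idle_for = min(idle_for, spacing)
    if last_run is not None:
        delta = last_run + spacing - now
        if delta > 0.2:
            return min(idle_for, delta), False  # not due yet
    return idle_for, True  # due, or close enough to coalesce

# spacing=10s, last run 9.9s ago: only 0.1s early, so it runs now
print(compute_idle(last_run=100.0, spacing=10.0, now=109.9, idle_for=60.0))
# → (10.0, True)
```

So with second-granularity spacing config, the only effect of the 0.2s window is to occasionally fire a task a fraction of a second early.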

However the resolution in the config for various periodic tasks is by
the second, and I have been unable to find a task that has a
millisecond resolution. I intend to get rid of this coalescing in this
bug fix.

It fits in with this bug fix as I intend to make the tasks run on their
specific spacing boundaries, i.e. if spacing is 10 seconds, it will run
at 17:30:10, 17:30:20, etc.
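One way to get that boundary-aligned behaviour (a sketch of the idea, not the actual proposed fix) is to always sleep until the next whole multiple of the spacing:

```python
def next_boundary(now, spacing):
    """Next wall-clock second that is a whole multiple of `spacing`.

    With spacing=10, a check at 17:30:13 yields 17:30:20, then
    17:30:30, and so on.
    """
    return (int(now) // spacing + 1) * spacing

# seconds since midnight for 17:30:13, with a 10-second spacing
now = 17 * 3600 + 30 * 60 + 13
print(next_boundary(now, 10))  # → 63020, i.e. 17:30:20
```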

Is there any reason to keep the coalescing of tasks?

Thanks,

Tom


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Seems reasonable to remove this.  For historical context, it looks like 
this code was moved over to oslo-incubator from nova in early Havana 
[1]. Going back to grizzly-eol on nova, the periodic task code was in 
nova.manager. From what I can tell, the 0.2 check was added here [2]. 
There isn't really an explicit statement about why that was added in the 
commit message or the related bug though. Maybe it had something to do 
with the tests or the dynamic looping call that was added?  You could 
see if Michael (mikal) remembers.


[1] https://review.openstack.org/#/c/25885/
[2] https://review.openstack.org/#/c/18618/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][QA] Disabling the v3 API tests in the gate

2014-06-12 Thread Matt Riedemann



On 6/12/2014 10:51 AM, Matthew Treinish wrote:

On Fri, Jun 13, 2014 at 12:41:19AM +0930, Christopher Yeoh wrote:

On Fri, Jun 13, 2014 at 12:25 AM, Dan Smith d...@danplanet.com wrote:


I think it'd be OK to move them to the experimental queue and a periodic

nightly job until the v2.1 stuff shakes out.  The v3 API is marked
experimental right now so it seems fitting that it'd be running tests in
the experimental queue until at least the spec is approved and
microversioning starts happening in the code base.



I think this is reasonable. Continuing to run the full set of tests on
every patch for something we never expect to see the light of day (in its
current form) seems wasteful to me. Plus, we're going to (presumably) be
ramping up tests on v2.1, which means to me that we'll need to clear out
some capacity to make room for that.



That's true, though I was suggesting that as v2.1 microversions roll out we drop
the test out of v3 and move it to v2.1 microversions testing, so there's no
change in capacity required.


That's why I wasn't proposing that we rip the tests out of the tree. I'm just
trying to weigh the benefit of leaving them enabled on every run against
the increased load they cause in an arguably overworked gate.



Matt - how much of the time overhead is scenario tests? That's something
that would have a lot less impact if moved to an experimental queue.
Although the v3 api as a whole won't be officially exposed, the api tests
test specific features fairly independently which are slated for
v2.1 microversions on a case by case basis and I don't want to see those
regress. I guess my concern is how often the experimental queue results get
really looked at and how hard/quick it is to revert when lots of stuff
merges in a short period of time.


The scenario tests tend to be the slower tests in tempest. I have to disagree
that removing them would have lower impact. The scenario tests provide the best
functional verification, which is part of the reason we always have failures in
the gate on them. While it would make the gate faster, the decrease in what we're
testing isn't worth it. Also, for reference, I pulled the test run times that
were greater than 10sec out of a recent gate run:
http://paste.openstack.org/show/83827/

The experimental jobs aren't automatically run; they have to be manually
triggered by leaving a 'check experimental' comment. So for changes on which we want
to test the v3 api, a comment would have to be left. To prevent regression is why
we'd also have the nightly job, which I think is a better compromise for the v3
tests while we wait to migrate them to the v2.1 microversion tests.

Another, option is that we make the v3 job run only on the check queue and not
on the gate. But the benefits of that are slightly more limited, because we'd
still be holding up the check queue.

-Matt Treinish



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Yeah the scenario tests need to stay, that's how we've exposed the two 
big ssh bugs in the last couple of weeks which are obvious issues at scale.


I still think experimental/periodic is the way to go, not a hybrid of 
check-on/gate-off.  If we want to explicitly test v3 API changes we can 
do that with 'recheck experimental'.  Granted someone has to remember to 
run those, much like checking/rechecking 3rd party CI results.


One issue I've had with the nightly periodic job is finding out where 
the results are in an easy to consume format.  Is there something out 
there for that?  I'm thinking specifically of things we've turned off in 
the gate before like multi-backend volume tests and 
allow_tenant_isolation=False.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Message level security plans.

2014-06-12 Thread Matt Riedemann



On 6/12/2014 10:31 AM, Kelsey, Timothy Joh wrote:

Thanks for the info Matt, I guess I should have been clearer about what I
was asking. I was indeed referring to the trusted RPC messaging proposal
you linked. I'm keen to find out what's happening with that and where I can
help.



Looks like there was a short related thread in the dev list last month:

http://lists.openstack.org/pipermail/openstack-dev/2014-May/034392.html

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Gate still backed up - need assistance with nova-network logging enhancements

2014-06-12 Thread Matt Riedemann



On 6/12/2014 10:41 AM, Davanum Srinivas wrote:

Hey Matt,

There is a connection pool in
https://github.com/boto/boto/blob/develop/boto/connection.py which
could be causing issues...

-- dims

On Thu, Jun 12, 2014 at 10:50 AM, Matt Riedemann
mrie...@linux.vnet.ibm.com wrote:



On 6/10/2014 5:36 AM, Michael Still wrote:


https://review.openstack.org/99002 adds more logging to
nova/network/manager.py, but I think you're not going to love the
debug log level. Was this the sort of thing you were looking for
though?

Michael

On Mon, Jun 9, 2014 at 11:45 PM, Sean Dague s...@dague.net wrote:


Based on some back of envelope math the gate is basically processing 2
changes an hour, failing one of them. So if you want to know how long
the gate is, take the length / 2 in hours.

Right now we're doing a lot of revert roulette, trying to revert things
that we think landed about the time things went bad. I call this
roulette because in many cases the actual issue isn't well understood. A
key reason for this is:

*nova network is a blackhole*

There is no work unit logging in nova-network, and no attempted
verification that the commands it ran did a thing. Most of these
failures that we don't have good understanding of are the network not
working under nova-network.

So we could *really* use a volunteer or two to prioritize getting that
into nova-network. Without it we might manage to turn down the failure
rate by reverting things (or we might not) but we won't really know why,
and we'll likely be here again soon.

  -Sean

--
Sean Dague
http://dague.net


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev







I mentioned this in the nova meeting today also but the associated bug for
the nova-network ssh timeout issue is bug 1298472 [1].

My latest theory on that one is that there could be a race/network leak in the
ec2 third party tests in Tempest or something in the ec2 API in nova,
because I saw this [2] showing up in the n-net logs.  My thinking is the
tests or the API are not tearing down cleanly and eventually network
resources are leaked and we start hitting those timeouts.  Just a theory at
this point, but the ec2 3rd party tests do run concurrently with the
scenario tests so things could be colliding at that point, but I haven't had
time to dig into it, plus I have very little experience in those tests or
the ec2 API in nova.

[1] https://bugs.launchpad.net/tempest/+bug/1298472
[2] http://goo.gl/6f1dfw

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






mtreinish also pointed out that the nightly periodic job to run tempest 
with nova-network and without tenant isolation is failing from hitting 
over quotas on floating IPs [1].  That's also hitting security group 
rule failures [2], possibly those are related.


[1] 
http://logs.openstack.org/periodic-qa/periodic-tempest-dsvm-full-non-isolated-master/b92b844/console.html#_2014-06-12_08_02_55_875
[2] 
http://logs.openstack.org/periodic-qa/periodic-tempest-dsvm-full-non-isolated-master/b92b844/console.html#_2014-06-12_08_02_56_623


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Review guidelines for API patches

2014-06-12 Thread Matt Riedemann



On 6/12/2014 5:58 PM, Christopher Yeoh wrote:

On Fri, Jun 13, 2014 at 8:06 AM, Michael Still mi...@stillhq.com wrote:

In light of the recent excitement around quota classes and the
floating ip pollster, I think we should have a conversation about the
review guidelines we'd like to see for API changes proposed against
nova. My initial proposal is:

  - API changes should have an associated spec


+1

  - API changes should not be merged until there is a tempest change to
test them queued for review in the tempest repo


+1

Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



We do have some API change guidelines here [1].  I don't want to go 
overboard on every change and require a spec if it's not necessary, i.e. 
if it falls into the 'generally ok' list in that wiki.  But if it's 
something that's not documented as a supported API (so it's completely 
new) and is pervasive (going into novaclient so it can be used in some 
other service), then I think that warrants some spec consideration so we 
don't miss something.


To compare, this [2] is an example of something that is updating an 
existing API but I don't think warrants a blueprint since I think it 
falls into the 'generally ok' section of the API change guidelines.


[1] https://wiki.openstack.org/wiki/APIChangeGuidelines
[2] https://review.openstack.org/#/c/99443/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] Review guidelines for API patches

2014-06-12 Thread Matt Riedemann



On 6/12/2014 8:58 PM, Matt Riedemann wrote:



On 6/12/2014 5:58 PM, Christopher Yeoh wrote:

On Fri, Jun 13, 2014 at 8:06 AM, Michael Still mi...@stillhq.com wrote:

In light of the recent excitement around quota classes and the
floating ip pollster, I think we should have a conversation about the
review guidelines we'd like to see for API changes proposed against
nova. My initial proposal is:

  - API changes should have an associated spec


+1

  - API changes should not be merged until there is a tempest
change to
test them queued for review in the tempest repo


+1

Chris


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



We do have some API change guidelines here [1].  I don't want to go
overboard on every change and require a spec if it's not necessary, i.e.
if it falls into the 'generally ok' list in that wiki.  But if it's
something that's not documented as a supported API (so it's completely
new) and is pervasive (going into novaclient so it can be used in some
other service), then I think that warrants some spec consideration so we
don't miss something.

To compare, this [2] is an example of something that is updating an
existing API but I don't think warrants a blueprint since I think it
falls into the 'generally ok' section of the API change guidelines.

[1] https://wiki.openstack.org/wiki/APIChangeGuidelines
[2] https://review.openstack.org/#/c/99443/



I think I'd like to say I think something about something a few more 
times... :)


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Gate proposal - drop Postgresql configurations in the gate

2014-06-12 Thread Matt Riedemann



On 6/12/2014 5:11 PM, Michael Still wrote:

On Thu, Jun 12, 2014 at 10:06 PM, Sean Dague s...@dague.net wrote:

We're definitely deep into capacity issues, so it's going to be time to
start making tougher decisions about things we decide aren't different
enough to bother testing on every commit.


I think one of the criticisms that could be made about OpenStack at
the moment is that we're not opinionated enough. We have a lot of bugs
because we support huge numbers of drivers of varying quality and
completeness. Do you think its time for the gate to be an opinionated
set of tests of how OpenStack can be deployed? Perhaps we should gate
on only one permutation of a possible OpenStack cloud, and then let
people who want to propose deviations from that permutation run their
own CI as third parties.

I'm not particularly advocating this stance, but it is an option and
I'd like to see it explored a bit more.

Michael



Yeah, I was sort of thinking along the same lines - does any of the survey
data help here, i.e. what's the percentage of deployments using mysql vs
postgresql?


Another example is that we want testing for Ceph/RBD, but I don't expect that
to be in the upstream CI/gate; I more or less expect that from some 3rd
party CI run by someone who uses it in production and really, really cares
about its quality and maintenance in the tree.


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][ceilometer] FloatingIp pollster spamming n-api logs (bug 1328694)

2014-06-13 Thread Matt Riedemann



On 6/12/2014 10:31 AM, John Garbutt wrote:

On 11 June 2014 20:07, Joe Gordon joe.gord...@gmail.com wrote:

On Wed, Jun 11, 2014 at 11:38 AM, Matt Riedemann
mrie...@linux.vnet.ibm.com wrote:

On 6/11/2014 10:01 AM, Eoghan Glynn wrote:

Thanks for bringing this to the list Matt, comments inline ...


tl;dr: some pervasive changes were made to nova to enable polling in
ceilometer which broke some things and in my opinion shouldn't have been
merged as a bug fix but rather should have been a blueprint.

===

The detailed version:

I opened bug 1328694 [1] yesterday and found that came back to some
changes made in ceilometer for bug 1262124 [2].

Upon further inspection, the original ceilometer bug 1262124 made some
changes to the nova os-floating-ips API extension and the database API
[3], and changes to python-novaclient [4] to enable ceilometer to use
the new API changes (basically pass --all-tenants when listing floating
IPs).

The original nova change introduced bug 1328694 which spams the nova-api
logs due to the ceilometer change [5] which does the polling, and right
now in the gate ceilometer is polling every 15 seconds.



IIUC that polling cadence in the gate is in the process of being reverted
to the out-of-the-box default of 600s.


I pushed a revert in ceilometer to fix the spam bug and a separate patch
was pushed to nova to fix the problem in the network API.



Thank you for that. The revert is just now approved on the ceilometer
side,
and is wending its merry way through the gate.


The bigger problem I see here is that these changes were all made under
the guise of a bug when I think this is actually a blueprint.  We have
changes to the nova API, changes to the nova database API, CLI changes,
potential performance impacts (ceilometer can be hitting the nova
database a lot when polling here), security impacts (ceilometer needs
admin access to the nova API to list floating IPs for all tenants),
documentation impacts (the API and CLI changes are not documented), etc.

So right now we're left with, in my mind, two questions:

1. Do we just fix the spam bug 1328694 and move on, or
2. Do we revert the nova API/CLI changes and require this goes through
the nova-spec blueprint review process, which should have happened in
the first place.



So just to repeat the points I made on the unlogged #os-nova IRC channel
earlier, for posterity here ...

Nova already exposed an all_tenants flag in multiple APIs (servers,
volumes,
security-groups etc.) and these would have:

(a) generally pre-existed ceilometer's usage of the corresponding APIs

and:

(b) been tracked and proposed at the time via straight-forward LP
bugs,
as  opposed to being considered blueprint material

So the manner of the addition of the all_tenants flag to the floating_ips
API looks like it just followed existing custom & practice.

Though that said, the blueprint process and in particular the nova-specs
aspect, has been tightened up since then.

My preference would be to fix the issue in the underlying API, but to use
this as a teachable moment ... i.e. to require more oversight (in the
form of a reviewed & approved BP spec) when such API changes are proposed
in the future.

Cheers,
Eoghan


Are there other concerns here?  If there are no major objections to the
code that's already merged, then #2 might be excessive but we'd still
need docs changes.

I've already put this on the nova meeting agenda for tomorrow.

[1] https://bugs.launchpad.net/ceilometer/+bug/1328694
[2] https://bugs.launchpad.net/nova/+bug/1262124
[3] https://review.openstack.org/#/c/81429/
[4] https://review.openstack.org/#/c/83660/
[5] https://review.openstack.org/#/c/83676/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



While there is precedent for --all-tenants with some of the other APIs,
I'm concerned about where this stops.  When ceilometer wants polling on some
other resources that the nova API exposes, will it need the same thing?
Doing all of this polling for resources in all tenants in nova puts an undue
burden on the nova API and the database.

Can we do something with notifications here instead?  That's where the
nova-spec process would have probably caught this.
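For illustration, the consumer side of a notification-driven approach might look something like the sketch below; the event names and payload keys are hypothetical, not nova's actual notification schema:

```python
def make_floating_ip_tracker(usage):
    """Fold hypothetical floating-ip events into a per-tenant count,
    instead of repeatedly listing every tenant's floating IPs via a
    polled, admin-only API call."""
    def handle(event_type, payload):
        tenant = payload.get('tenant_id')
        if event_type == 'floating_ip.create':
            usage[tenant] = usage.get(tenant, 0) + 1
        elif event_type == 'floating_ip.delete':
            usage[tenant] = max(usage.get(tenant, 0) - 1, 0)
    return handle

usage = {}
handle = make_floating_ip_tracker(usage)
handle('floating_ip.create', {'tenant_id': 'demo'})
handle('floating_ip.create', {'tenant_id': 'demo'})
handle('floating_ip.delete', {'tenant_id': 'demo'})
print(usage)  # → {'demo': 1}
```

The point being that the load moves from the nova API/database to the message bus, at the cost of needing well-versioned notification payloads.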


++ to notifications and not polling.


Yeah, I think we need to revert this, and go through the specs
process. Its been released in Juno-1 now, so this revert feels bad,
but perhaps its the best of a bad situation?

Word of caution, we need to get notifications versioned correctly if
we want this as a more formal external API. I think Heat have
similar issues in this area, efficiently knowing about something
happening in Nova. So we do need

Re: [openstack-dev] [Nova] Nominating Ken'ichi Ohmichi for nova-core

2014-06-14 Thread Matt Riedemann



On 6/14/2014 5:40 AM, Sean Dague wrote:

On 06/13/2014 06:40 PM, Michael Still wrote:

Greetings,

I would like to nominate Ken'ichi Ohmichi for the nova-core team.

Ken'ichi has been involved with nova for a long time now.  His reviews
on API changes are excellent, and he's been part of the team that has
driven the new API work we've seen in recent cycles forward. Ken'ichi
has also been reviewing other parts of the code base, and I think his
reviews are detailed and helpful.

Please respond with +1s or any concerns.


+1



References:

   
https://review.openstack.org/#/q/owner:ken1ohmichi%2540gmail.com+status:open,n,z

   https://review.openstack.org/#/q/reviewer:ken1ohmichi%2540gmail.com,n,z

   http://www.stackalytics.com/?module=nova-group&user_id=oomichi

As a reminder, we use the voting process outlined at
https://wiki.openstack.org/wiki/Nova/CoreTeam to add members to our
core team.

Thanks,
Michael






___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



+1

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova][qa] Do all turbo-hipster jobs fail in stable/havana?

2014-06-17 Thread Matt Riedemann



On 6/16/2014 11:58 PM, Joshua Hesketh wrote:

Hi there,

Very sorry for the mishap. I manually enqueued our zuul to run tests on
changes that turbo-hipster had recently missed and did not pay attention
to the branch they were for.

Turbo-Hipster doesn't run tests on stable or non-master branches so it
should have never attempted to. Because I enqueued the changes manually
it accidentally attempted to run them and didn't know how to handle it
correctly.

I have removed the negative votes. Please let me know if I have missed any.

Sorry again for the trouble.

Cheers,
Josh

On 6/17/14 11:44 AM, wu jiang wrote:

Hi all,

Is turbo-hipster OK for stable/havana?

I found all turbo-hipster jobs after 06/09 failed in stable/havana [1].
And the 'recheck migrations' command didn't trigger a re-examination by
turbo-hipster, but the Jenkins recheck works.

Thanks.

WingWJ

---

[1]
https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/havana,n,z


[2] https://review.openstack.org/#/c/67613/
[3] https://review.openstack.org/#/c/72521/
[4] https://review.openstack.org/#/c/98874/


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






Yeah I have some on stable/icehouse with -1 votes from t-h:

https://review.openstack.org/#/c/99215/
https://review.openstack.org/#/c/97811/

--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] nova default quotas

2014-06-17 Thread Matt Riedemann



On 6/10/2014 3:56 PM, Matt Riedemann wrote:



On 6/4/2014 11:02 AM, Day, Phil wrote:

 Matt and I chatted on IRC and have come up with an outlined plan; if
we missed anything, please don't hesitate to comment or ask.

 

 https://etherpad.openstack.org/p/quota-classes-goof-up

I added a few thoughts / questions

*From:*Joe Gordon [mailto:joe.gord...@gmail.com]
*Sent:* 02 June 2014 21:52
*To:* OpenStack Development Mailing List (not for usage questions)
*Subject:* Re: [openstack-dev] [nova] nova default quotas

On Mon, Jun 2, 2014 at 12:29 PM, Matt Riedemann
mrie...@linux.vnet.ibm.com wrote:



On 6/2/2014 12:53 PM, Joe Gordon wrote:




On Thu, May 29, 2014 at 10:46 AM, Matt Riedemann

mrie...@linux.vnet.ibm.com wrote:



 On 5/27/2014 4:44 PM, Vishvananda Ishaya wrote:

 I’m not sure that this is the right approach. We really
have to
 add the old extension back for compatibility, so it
might be
 best to simply keep that extension instead of adding a
new way
 to do it.

 Vish

 On May 27, 2014, at 1:31 PM, Cazzolato, Sergio J
 sergio.j.cazzol...@intel.com wrote:

 I have created a blueprint to add this
functionality to nova.

 https://review.openstack.org/#/c/94519/


 -Original Message-
 From: Vishvananda Ishaya
[mailto:vishvana...@gmail.com]
 Sent: Tuesday, May 27, 2014 5:11 PM
 To: OpenStack Development Mailing List (not for
usage questions)
 Subject: Re: [openstack-dev] [nova] nova default
quotas

 Phil,

 You are correct and this seems to be an error. I
don't think
 in the earlier ML thread[1] that anyone remembered
that the
 quota classes were being used for default quotas.
IMO we
 need to revert this removal as we (accidentally)
removed a
 Havana feature with no notification to the
community. I've
 reactivated a bug[2] and marked it critcal.

 Vish

 [1]


http://lists.openstack.org/pipermail/openstack-dev/2014-February/027574.html

 [2] https://bugs.launchpad.net/nova/+bug/1299517

 On May 27, 2014, at 12:19 PM, Day, Phil
philip@hp.com wrote:

 Hi Vish,

 I think quota classes have been removed from
Nova now.

 Phil


 Sent from Samsung Mobile


  Original message 
 From: Vishvananda Ishaya
 Date:27/05/2014 19:24 (GMT+00:00)
 To: OpenStack Development Mailing List (not
for usage
 questions)
 Subject: Re: [openstack-dev] [nova] nova
default quotas

 Are you aware that there is already a way to do
this
 through the cli using quota-class-update?


http://docs.openstack.org/user-guide-admin/content/cli_set_quotas.html
 (near the bottom)

 Are you suggesting that we also add the ability
to use
 just regular quota-update? I'm not sure i see
the need
 for both.

 Vish

 On May 20, 2014, at 9:52 AM, Cazzolato, Sergio J
 sergio.j.cazzol...@intel.com wrote:

 I would like to hear your thoughts about an idea to add a
way to manage the default quota values through the API
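Conceptually, the default-quota behaviour being discussed resolves in layers. A simplified sketch of that resolution order (not nova's actual quota driver code; resource names are illustrative):

```python
def effective_quota(resource, config_defaults, class_quotas, project_overrides):
    """A per-project override wins, then the 'default' quota class,
    then the config-file default."""
    if resource in project_overrides:
        return project_overrides[resource]
    if resource in class_quotas:
        return class_quotas[resource]
    return config_defaults[resource]

config_defaults = {'instances': 10, 'floating_ips': 10}   # nova.conf values
class_quotas = {'instances': 20}          # e.g. set via quota-class-update
project_overrides = {'floating_ips': 5}   # e.g. set via quota-update

print(effective_quota('instances', config_defaults, class_quotas, project_overrides))
print(effective_quota('floating_ips', config_defaults, class_quotas, project_overrides))
```

This is why removing quota classes silently broke the ability to change deployment-wide defaults without editing the config file.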

[openstack-dev] [neutron][nova] nova needs a new release of neutronclient for OverQuotaClient exception

2014-06-23 Thread Matt Riedemann
There are at least two changes [1][2] proposed to Nova that use the new 
OverQuotaClient exception in python-neutronclient, but the unit test 
jobs no longer test against trunk-level code of the client packages so 
they fail.  So I'm here to lobby for a new release of 
python-neutronclient if possible so we can keep these fixes moving.  Are 
there any issues with that?


[1] https://review.openstack.org/#/c/62581/
[2] https://review.openstack.org/#/c/101462/
--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa][infra] etherpad on elastic-recheck testing improvements

2014-06-25 Thread Matt Riedemann
Sean asked me to jot some thoughts down on how we can automate some of 
our common review criteria for elastic-recheck queries, so that's here:


https://etherpad.openstack.org/p/elastic-recheck-testing

There is some low-hanging fruit in there I think, but the bigger / more 
useful change is actually automating running the proposed query against 
ES and validating the results within some defined criteria.
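As a toy version of what "validating the results within some defined criteria" could mean - the thresholds and the `build_status` field here are illustrative, not the actual review rules:

```python
def query_looks_sane(hits, min_hits=1, max_hits=20000):
    """Reject a proposed elastic-recheck query if it matches nothing,
    matches implausibly many runs, or matches any successful run
    (a good fingerprint should only hit failed runs).

    `hits` is a list of dicts standing in for ES search results."""
    if not (min_hits <= len(hits) <= max_hits):
        return False
    return all(h.get('build_status') == 'FAILURE' for h in hits)

good = [{'build_status': 'FAILURE'}] * 3
too_broad = [{'build_status': 'FAILURE'}, {'build_status': 'SUCCESS'}]
print(query_looks_sane(good), query_looks_sane(too_broad))  # → True False
```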


--

Thanks,

Matt Riedemann


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] milestone-proposed is dead, long lives proposed/foo

2014-06-28 Thread Matt Riedemann



On 6/27/2014 7:44 AM, Thierry Carrez wrote:

Hi everyone,

Since the dawn of time, we have been using milestone-proposed branches
for milestone and final release branches. Those would get
milestone-critical and release-critical bugfixes backports, while the
master branch can continue to be open for development.

However, reusing the same blanket name for every such branch is causing
various issues, especially around upgrade testing. It also creates havoc
in local repositories which may have kept traces of previous
incarnations of milestone-proposed.

For all those reasons, we decided at the last summit to use unique
pre-release branches, named after the series (for example,
proposed/juno). That branch finally becomes stable/juno at release
time. In parallel, we abandoned the usage of release branches for
development milestones, which are now tagged directly on the master
development branch.

The visible impact of this change will be apparent when we reach Juno
RC1s. RC bugfixes will have to be backported to proposed/juno instead
of milestone-proposed. Tarballs automatically generated from this
branch will be named PROJECT-proposed-juno.tar.gz instead of
PROJECT-milestone-proposed.tar.gz. All relevant process wiki pages will
be adapted to match the new names in the coming weeks.

We are also generally changing[1] ACLs which used to apply to
milestone-proposed branches so that they now apply to proposed/*
branches. If you're a stackforge or non-integrated project which made
use of milestone-proposed branches, you should probably switch to using
proposed/foo branches when that patch lands.

[1] https://review.openstack.org/#/c/102822/
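For a project switching over, the backport flow might look like the following. This is a throwaway demo repository with illustrative branch and commit names; a real backport would of course go through Gerrit with git-review:

```shell
set -e
# Demonstrate cherry-picking a master fix onto a proposed/juno branch
# in a scratch repository (branch and commit names are illustrative).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email demo@example.com
git config user.name demo
echo base > file.txt
git add file.txt
git commit -qm "base release content"
git branch proposed/juno            # cut the pre-release branch
echo fix >> file.txt
git commit -qam "RC bugfix on master"
fix_sha=$(git rev-parse HEAD)
git checkout -q proposed/juno       # backport the fix
git cherry-pick "$fix_sha" > /dev/null
git log --oneline
```

At release time proposed/juno would then be renamed to stable/juno as described above.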

Regards,



We've been using a similar concept internally (we call it 
havana-proposed, icehouse-proposed, etc.), so it sounds like the same 
idea.  We're supporting more than the last two stable releases, and it's 
hard to tell when we need to quickly turn out a release candidate build 
(like for a security issue going back to Folsom or something), so it 
makes sense for us to try and avoid branch naming collisions.


Besides the branch naming, we basically follow the same release process 
as upstream based on the same wikis, with some additional automation 
for tagging and cleanup after the release.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] Log / error message format best practices standards

2014-06-28 Thread Matt Riedemann



On 6/26/2014 1:54 PM, Jay Pipes wrote:

On 06/26/2014 12:14 PM, boden wrote:

We were recently having a discussion over here in trove regarding a
standardized format to use for log and error messages - obviously
consistency is ideal (within and across projects). As this discussion
involves the broader dev community, bringing this topic to the list for
feedback...

I'm aware of the logging standards wiki[1], however this page does not
describe in depth a standardized format to use for log / error messages.

In particular w/r/t program values in messages:

(a) For in-line program values, I've seen both single quoted and
unquoted formatting used. e.g.
single quote: LOG.info("The ID '%s' is not valid." % (resource.id))
unquoted: LOG.info("The ID %s is not valid." % (resource.id))


No opinion on this one.


(b) For program values appended to the message, I've seen various
formats used. e.g.
LOG.info("This path is invalid: %s" % (obj.path))
LOG.info("This path is invalid %s" % (obj.path))
LOG.info("This path is invalid - %s" % (obj.path))


The first would be my preference (i.e. using a ":" to delineate the
target of the log message).
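A minimal sketch of the two conventions being discussed, using only the stdlib (the paths and IDs are made up):

```python
import logging

logging.basicConfig(format="%(levelname)s: %(message)s")
LOG = logging.getLogger(__name__)

path = "/tmp/does/not/exist"
resource_id = "abc123"

# Appended value: a ':' delimits the message from the value.
LOG.error("This path is invalid: %s", path)
# In-line value: single quotes make clear the value came from code.
LOG.error("The ID '%s' is not valid.", resource_id)
```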


 From a consistency perspective, it seems we should consider
standardizing a best practice for such formatting.


Possibly, though this is likely getting into the realm of femto-nits and
bike-shedding.


Ha, you read my mind, i.e. bike-shedding.

There are a few wikis and devref docs on style guides in openstack 
including logging standards, I'd say make sure there is common sense in 
there and then leave the rest to the review team to police the logs in 
new changes - if it's ugly, change it with a patch.


We don't need to boil the ocean to develop a set of standards/processes 
so heavyweight that people aren't going to follow them anyway.


This sounds exactly like the kind of thing I see a lot within the 
workings of my corporate overlord and it drives me crazy, so I'm a bit 
biased here. :)


FWIW, Sean Dague has a draft logging standards spec for nova here:

https://review.openstack.org/#/c/91446/




For in-line values (#a above) I find single quotes the most consumable
as they are a clear indication the value came from code and moreover
provide a clear set of delimiters around the value. However to date
unquoted appears to be the most widely used.

For appended values (#b above) I find a delimiter such as ':' most
consumable as it provides a clear boundary between the message and
value. Using ':' seems fairly common today, but you'll find other
formatting throughout the code.

If we wanted to squash this topic the high level steps are
(approximately):
- Determine and document message format.
- Ensure the format is part of the dev process (coding + review).
- Cross team work to address existing messages not following the format.


Thoughts / comments?


[1] https://wiki.openstack.org/wiki/LoggingStandards






--

Thanks,

Matt Riedemann




[openstack-dev] [trove] how to trigger a recheck of reddwarf CI?

2014-06-29 Thread Matt Riedemann
The reddwarf third-party CI is failing on an oslo sync patch [1] but 
Jenkins is fine.  I'm unable to find any wiki or guideline on how to 
recheck just the reddwarf CI; is that possible?


[1] https://review.openstack.org/#/c/103232/
--

Thanks,

Matt Riedemann




[openstack-dev] [nova][i18n] why isn't the client translated?

2014-06-30 Thread Matt Riedemann
I noticed that there is no locale directory or setup.cfg entry for 
babel, which surprises me.  The v1_1 shell in python-novaclient has a 
lot of messages marked for translation using the _() function but the v3 
shell doesn't, presumably because someone figured out we don't translate 
the client messages anyway.


I'm just wondering why we don't translate the client?

--

Thanks,

Matt Riedemann




[openstack-dev] [nova] how to unit test scripts outside of nova/nova?

2014-07-01 Thread Matt Riedemann
As part of the enforce-unique-instance-uuid-in-db blueprint [1] I'm 
writing a script to scan the database and find any NULL instance_uuid 
records that will cause the new database migration to fail so that 
operators can run this before they run the migration, otherwise the 
migration blocks if these types of records are found.


I have the script written [2], but wanted to also write unit tests for 
it. I guess I assumed the script would go under nova/tools/db like the 
schema_diff.py script, but I'm not sure how to unit test anything 
outside of the nova/nova tree.


Nova's testr configuration is only discovering tests within nova/tests 
[3].  But I don't think I can put the unit tests under nova/tests and 
then import the module from nova/tools.


So I'm a bit stuck.  I could take the easy way out and just throw the 
script under nova/db/sqlalchemy/migrate_repo and put my unit tests under 
nova/tests/db/, and I'd also get pep8 checking with that, but that 
doesn't seem right.  Then again, I'm possibly over-thinking this.


Anyone else have any ideas?

[1] 
https://blueprints.launchpad.net/nova/+spec/enforce-unique-instance-uuid-in-db

[2] https://review.openstack.org/#/c/97946/
[3] http://git.openstack.org/cgit/openstack/nova/tree/.testr.conf#n5

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] how to unit test scripts outside of nova/nova?

2014-07-01 Thread Matt Riedemann



On 7/1/2014 4:03 PM, Matthew Treinish wrote:

On Tue, Jul 01, 2014 at 03:21:06PM -0500, Matt Riedemann wrote:

As part of the enforce-unique-instance-uuid-in-db blueprint [1] I'm writing
a script to scan the database and find any NULL instance_uuid records that
will cause the new database migration to fail so that operators can run this
before they run the migration, otherwise the migration blocks if these types
of records are found.

I have the script written [2], but wanted to also write unit tests for it. I
guess I assumed the script would go under nova/tools/db like the
schema_diff.py script, but I'm not sure how to unit test anything outside of
the nova/nova tree.

Nova's testr configuration is only discovering tests within nova/tests [3].
But I don't think I can put the unit tests under nova/tests and then import
the module from nova/tools.


So we hit a similar issue in tempest when we wanted to unit test some utility
scripts in tempest/tools. Changing the discovery path to find tests outside of
nova/tests is actually a pretty easy change[4], but I don't think that will 
solve
the use case with tox. What happened when we tried to do this in the tempest
use case was that the tools dir wasn't included when the project was
installed, so when we ran with tox it couldn't find the files we were trying
to test. The solution we came up with there was to put the script under the
tempest namespace and add unit tests in tempest/tests. (We also added an
entry point for the script to expose it as a command when tempest was
installed.)



So I'm a bit stuck.  I could take the easy way out and just throw the script
under nova/db/sqlalchemy/migrate_repo and put my unit tests under
nova/tests/db/, and I'd also get pep8 checking with that, but that doesn't
seem right - but I'm also possibly over-thinking this.

Anyone else have any ideas?


I think it really comes down to how you want to present the utility to the end
users. To enable unit testing it, it's just easier to put it in the nova
namespace. I couldn't come up with a good way to get around the
install/namespace issue. (maybe someone else who is more knowledgeable here has
a good way to get around this) So then you can symlink it to the tools dir or
add an entry point (or bake it into nova-manage) to make it easy to find. I
think the issue with putting it in nova/db/sqlalchemy/migrate_repo is that it's
hard to find.



[1] 
https://blueprints.launchpad.net/nova/+spec/enforce-unique-instance-uuid-in-db
[2] https://review.openstack.org/#/c/97946/
[3] http://git.openstack.org/cgit/openstack/nova/tree/.testr.conf#n5

[4] 
http://git.openstack.org/cgit/openstack/tempest/tree/tempest/test_discover/test_discover.py

-Matt Treinish






Matt,

Thanks for the help, I completely forgot about making the new script an 
entry point in setup.cfg, that's a good idea.
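For pbr-based projects that approach amounts to a few lines in setup.cfg; the script and module names below are hypothetical, chosen only to illustrate the shape:

```ini
[entry_points]
console_scripts =
    nova-null-instance-uuid-scan = nova.db.sqlalchemy.null_instance_uuid_scan:main
```

With that, the script gets installed as a command alongside the other nova binaries, and because the module lives under the nova namespace its unit tests are discoverable by testr.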


Before I saw this I did move the script under 
nova/db/sqlalchemy/migrate_repo and moved the tests under nova/tests/db 
and have that all working now, so will probably just move forward with 
that rather than try to do some black magic with test discovery and 
getting the module imported.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [oslo] Openstack and SQLAlchemy

2014-07-07 Thread Matt Riedemann



On 7/2/2014 8:23 PM, Mike Bayer wrote:


I've just added a new section to this wiki, MySQLdb + eventlet = sad,
summarizing some discussions I've had in the past couple of days about
the ongoing issue that MySQLdb and eventlet were not meant to be used
together.   This is a big one to solve as well (though I think it's
pretty easy to solve).

https://wiki.openstack.org/wiki/OpenStack_and_SQLAlchemy#MySQLdb_.2B_eventlet_.3D_sad



On 6/30/14, 12:56 PM, Mike Bayer wrote:

Hi all -

For those who don't know me, I'm Mike Bayer, creator/maintainer of
SQLAlchemy, Alembic migrations and Dogpile caching.   In the past month
I've become a full time Openstack developer working for Red Hat, given
the task of carrying Openstack's database integration story forward.
To that extent I am focused on the oslo.db project which going forward
will serve as the basis for database patterns used by other Openstack
applications.

I've summarized what I've learned from the community over the past month
in a wiki entry at:

https://wiki.openstack.org/wiki/Openstack_and_SQLAlchemy

The page also refers to an ORM performance proof of concept which you
can see at https://github.com/zzzeek/nova_poc.

The goal of this wiki page is to publish to the community what's come up
for me so far, to get additional information and comments, and finally
to help me narrow down the areas in which the community would most
benefit by my contributions.

I'd like to get a discussion going here, on the wiki, on IRC (where I am
on freenode with the nickname zzzeek) with the goal of solidifying the
blueprints, issues, and SQLAlchemy / Alembic features I'll be focusing
on as well as recruiting contributors to help in all those areas.  I
would welcome contributors on the SQLAlchemy / Alembic projects directly
as well, as we have many areas that are directly applicable to Openstack.

I'd like to thank Red Hat and the Openstack community for welcoming me
on board and I'm looking forward to digging in more deeply in the coming
months!

- mike






Regarding the eventlet + mysql sadness, I remembered this [1] in the 
nova.db.api code.


I'm not sure if that's just nova-specific right now (I'm a bit too lazy 
at the moment to check if it's in other projects), but I'm not seeing it 
in neutron, for example, which makes me wonder if it could help with the 
neutron db lock timeouts we see in the gate [2].  Don't let the bug 
status fool you, that thing is still showing up, or a variant of it is.


There are at least 6 lock-related neutron bugs hitting the gate [3].

[1] https://review.openstack.org/59760
[2] https://bugs.launchpad.net/neutron/+bug/1283522
[3] http://status.openstack.org/elastic-recheck/

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [oslo] Openstack and SQLAlchemy

2014-07-07 Thread Matt Riedemann



On 7/7/2014 3:28 PM, Jay Pipes wrote:



On 07/07/2014 04:17 PM, Mike Bayer wrote:


On 7/7/14, 3:57 PM, Matt Riedemann wrote:




Regarding the eventlet + mysql sadness, I remembered this [1] in the
nova.db.api code.

I'm not sure if that's just nova-specific right now, I'm a bit too
lazy at the moment to check if it's in other projects, but I'm not
seeing it in neutron, for example, and makes me wonder if it could
help with the neutron db lock timeouts we see in the gate [2].  Don't
let the bug status fool you, that thing is still showing up, or a
variant of it is.

There are at least 6 lock-related neutron bugs hitting the gate [3].

[1] https://review.openstack.org/59760
[2] https://bugs.launchpad.net/neutron/+bug/1283522
[3] http://status.openstack.org/elastic-recheck/



yeah, tpool, correct me if I'm misunderstanding, we take some API code
that is 90% fetching from the database, we have it all under eventlet,
the purpose of which is, IO can be shoveled out to an arbitrary degree,
e.g. 500 concurrent connections type of thing, but then we take all the
IO (MySQL access) and put it into a thread pool anyway.


Yep. It makes no sense to do that, IMO.

The solution is to use a non-blocking MySQLdb library which will yield
appropriately for evented solutions like gevent and eventlet.

Best,
-jay




Yeah, never mind my comment; it's not working without an eventlet patch 
(details in the nova bug here [1]), and it sounds like it's still not 
100% even with the patch.


[1] https://bugs.launchpad.net/nova/+bug/1171601

--

Thanks,

Matt Riedemann




[openstack-dev] [nova] new nasty gate bug 1338844 with nova-network races

2014-07-07 Thread Matt Riedemann
I noticed the bug [1] today.  Given the trend in logstash, it might be 
related to some fixes proposed to try and resolve the other big nova ssh 
timeout bug 1298472.  It appears to only be in jobs using nova-network.


[1] https://bugs.launchpad.net/nova/+bug/1338844

--

Thanks,

Matt Riedemann




[openstack-dev] [nova] whatever happened to removing instance.locked in icehouse?

2014-07-08 Thread Matt Riedemann
I came across this [1] today and noticed the note to remove 
instance.locked in favor of locked_by is still in master, so apparently 
not being removed in Icehouse.


Is anyone aware of intentions to remove instance.locked, or we don't 
care, or other?  If we don't care, maybe we should remove the note in 
the code.


I found it and thought about this because the check_instance_lock 
decorator in nova.compute.api doesn't check the locked_by field [2] but 
I'm guessing it probably should...


[1] https://review.openstack.org/#/c/38196/13/nova/objects/instance.py
[2] 
http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/api.py?id=2014.2.b1#n184


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] new nasty gate bug 1338844 with nova-network races

2014-07-09 Thread Matt Riedemann



On 7/7/2014 9:29 PM, Matt Riedemann wrote:

I noticed the bug [1] today.  Given the trend in logstash, it might be
related to some fixes proposed to try and resolve the other big nova ssh
timeout bug 1298472.  It appears to only be in jobs using nova-network.

[1] https://bugs.launchpad.net/nova/+bug/1338844



Looks like jogo got the fix here:

https://review.openstack.org/#/c/105651/

--

Thanks,

Matt Riedemann




[openstack-dev] [gate] concurrent workers are overwhelming postgresql in the gate - bug 1338841

2014-07-09 Thread Matt Riedemann
Bug 1338841 [1] started showing up yesterday and I first noticed it on 
the change to set osapi_volume_workers equal to the number of CPUs 
available by default.  Similar patches for trove (api/conductor workers) 
and glance (api/registry workers) have landed in the last week also, and 
nova has been running with multiple api/conductor workers by default 
since Icehouse.


It looks like the cinder change tipped the default postgresql 
max_connections over and we started getting asynchronous connection 
failures in that job. [2]


We can also note that the postgresql job is the only one that runs the 
nova api-metadata service, which has its own workers.


The VMs the jobs are running on have 8 VCPUs, so that's at least 88 
workers between nova (3), cinder (1), glance (2), trove (2), neutron, 
heat and ceilometer.


So osapi_volume_workers (8) + n-api-meta workers (8) seems to have 
tipped it over.


The first attempt at a fix is to simply double the default 
max_connections value [3].


While looking up the postgresql configuration docs, I also read a bit on 
synchronous_commit=off and fsync=off, which sound like we might want to 
also think about using one of those in devstack runs since they are 
supposed to be more performant if you don't care about disaster recovery 
(which we don't in gate runs on VMs).
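Concretely, the tuning under discussion amounts to a few lines in postgresql.conf. The values here are illustrative; fsync=off trades crash safety for speed, which is only acceptable on throwaway gate VMs:

```ini
max_connections = 200      # double the stock default of 100
synchronous_commit = off   # commit without waiting for WAL flush
#fsync = off               # even faster, but unsafe outside CI
```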


Anyway, bumping max connections might fix the gate, I'm just sending 
this out to see if there are any postgresql experts out there with 
additional tips or insights on things we can tweak or look for, 
including whether or not it might be worthwhile to set 
synchronous_commit=off or fsync=off for gate runs.


[1] https://bugs.launchpad.net/nova/+bug/1338841
[2] http://goo.gl/yRBDjQ
[3] https://review.openstack.org/#/c/105854/

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [gate] concurrent workers are overwhelming postgresql in the gate - bug 1338841

2014-07-09 Thread Matt Riedemann



On 7/9/2014 2:59 PM, Matt Riedemann wrote:

Bug 1338841 [1] started showing up yesterday and I first noticed it on
the change to set osapi_volume_workers equal to the number of CPUs
available by default.  Similar patches for trove (api/conductor workers)
and glance (api/registry workers) have landed in the last week also, and
nova has been running with multiple api/conductor workers by default
since Icehouse.

It looks like the cinder change tipped the default postgresql
max_connections over and we started getting asynchronous connection
failures in that job. [2]

We can also note that the postgresql job is the only one that runs the
nova api-metadata service, which has its own workers.

The VMs the jobs are running on have 8 VCPUs, so that's at least 88
workers between nova (3), cinder (1), glance (2), trove (2), neutron,
heat and ceilometer.

So osapi_volume_workers (8) + n-api-meta workers (8) seems to have
tipped it over.

The first attempt at a fix is to simply double the default
max_connections value [3].

While looking up the postgresql configuration docs, I also read a bit on
synchronous_commit=off and fsync=off, which sound like we might want to
also think about using one of those in devstack runs since they are
supposed to be more performant if you don't care about disaster recovery
(which we don't in gate runs on VMs).

Anyway, bumping max connections might fix the gate, I'm just sending
this out to see if there are any postgresql experts out there with
additional tips or insights on things we can tweak or look for,
including whether or not it might be worthwhile to set
synchronous_commit=off or fsync=off for gate runs.

[1] https://bugs.launchpad.net/nova/+bug/1338841
[2] http://goo.gl/yRBDjQ
[3] https://review.openstack.org/#/c/105854/



Typo in my math on the workers, it should be:

nova (3*8), cinder (1*8), glance (2*8), trove (2*8), neutron (1), heat 
(1) and ceilometer (1) = 67.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] Inter cloud resource federation [Alliance]

2014-07-09 Thread Matt Riedemann



On 7/9/2014 12:33 PM, Tiwari, Arvind wrote:

Hi All,

I am investigating on inter cloud resource federation across OS based
cloud deployments, this is needed to support multi regions, cloud
bursting, VPC and more use cases. I came up with a design (link below)
which advocate a new service (a.k.a. Alliance), this service sits close
to Keystone and help abstracting all the inter cloud concerns from
Keystone. This service will be abstracted from end users and there won’t
be any direct interactions between user and Alliance service. Keystone
will be delegating all inter cloud concerns to Alliance.

https://wiki.openstack.org/wiki/Inter_Cloud_Resource_Federation

Apart from basic resource federation use cases, Alliance service will
add following features

1.UUID token support across cloud

2.PKI Token support

3.Inter Cloud Token Validation

4.Inter Cloud Communication to allow

•Region/endpoint Discovery

•Service Discovery

•Remote Resource Provisioning

5.Resource Access Across Clouds

6.SSO Across Cloud

7.SSOut Across Cloud (or Inter Cloud Token Revocation)

8.Notification to propagate meter info, resource de-provisioning ….

I would appreciate if you guys take a look and share your perspective. I
am open to any questions, suggestions, discussions on the same.

Thanks for your time,

Arvind

Please excuse any typographical errors.






Is this only identity (keystone), or are other things like booting 
instances in nova from public/private clouds also abstracted from the 
client?  And if so, have you heard of nova-cells?


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] fastest way to run individual tests ?

2014-07-09 Thread Matt Riedemann



On 6/12/2014 6:17 AM, Daniel P. Berrange wrote:

On Thu, Jun 12, 2014 at 07:07:37AM -0400, Sean Dague wrote:

On 06/12/2014 06:59 AM, Daniel P. Berrange wrote:

Does anyone have any tip on how to actually run individual tests in an
efficient manner. ie something that adds no more than 1 second penalty
over  above the time to run the test itself. NB, assume that i've primed
the virtual env with all prerequisite deps already.



The overhead is in the fact that we have to discover the world, then
throw out the world.

You can actually run an individual test via invoking the testtools.run
directly:


python -m testtools.run nova.tests.test_versions


(Also, when testr explodes because of an import error this is about the
only way to debug what's going on).


Most excellent, thank you. I knew someone must know a way to do it :-)

Regards,
Daniel



I've been beating my head against the wall a bit on unit tests too this 
week, and here is another tip that just uncovered something for me when 
python -m testtools.run and nosetests didn't help.


I sourced the tox virtualenv and then ran the test from there, which 
gave me the actual error, so something like this:


source .tox/py27/bin/activate
python -m testtools.run test

Props to Matt Odden for helping me with the source of the venv tip.

--

Thanks,

Matt Riedemann




[openstack-dev] [nova][qa] proposal for moving forward on cells/tempest testing

2014-07-14 Thread Matt Riedemann
Today we only gate on exercises in devstack for cells testing coverage 
in the gate-devstack-dsvm-cells job.


The cells tempest non-voting job was moved to the experimental queue 
here [1] since it doesn't work with a lot of the compute API tests.


I think we all agreed to tar and feather comstud if he didn't get 
Tempest working (read: passing) with cells enabled in Juno.


The first part of this is just figuring out where we sit with what's 
failing in Tempest (in the check-tempest-dsvm-cells-full job).


I'd like to propose that we do the following to get the ball rolling:

1. Add an option to tempest.conf under the compute-feature-enabled 
section to toggle cells and then use that option to skip tests that we 
know will fail in cells, e.g. security group tests.


2. Open bugs for all of the tests we're skipping so we can track closing 
those down, assuming they aren't already reported. [2]


3. Once the known failures are being skipped, we can move 
check-tempest-dsvm-cells-full out of the experimental queue.  I'm not 
proposing that it'd be voting right away; I think we have to see it burn 
in for a while first.
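The toggle in step 1 would look something like this in tempest.conf. The option name is an assumption, since it doesn't exist yet at the time of writing:

```ini
[compute-feature-enabled]
# Set True when devstack deploys nova with cells; tests known to fail
# with cells (e.g. security group tests) then skip themselves.
cells = False
```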


With at least this plan we should be able to move forward on identifying 
issues and getting some idea for how much of Tempest doesn't work with 
cells and the effort involved in making it work.


Thoughts? If there aren't any objections, I said I'd work on the qa-spec 
and can start doing the grunt-work of opening bugs and skipping tests.


[1] https://review.openstack.org/#/c/87982/
[2] https://bugs.launchpad.net/nova/+bugs?field.tag=cells+

--

Thanks,

Matt Riedemann




[openstack-dev] [oslo][glance] what to do about tons of http connection pool is full warnings in g-api log?

2014-07-14 Thread Matt Riedemann
I opened bug 1341777 [1] against glance but it looks like it's due to 
the default log level for requests.packages.urllib3.connectionpool in 
oslo's log module.


The problem is this warning shows up nearly 420K times in 7 days in 
Tempest runs:


WARNING urllib3.connectionpool [-] HttpConnectionPool is full, 
discarding connection: 127.0.0.1


So either glance is doing something wrong, or that's logging too high of 
a level (I think it should be debug in this case).  I'm not really sure 
how to scope this down though, or figure out what is so damn chatty in 
glance-api that is causing this.  It doesn't seem to be causing test 
failures, but the rate at which this is logged in glance-api is surprising.


[1] https://bugs.launchpad.net/glance/+bug/1341777

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [oslo][glance] what to do about tons of http connection pool is full warnings in g-api log?

2014-07-14 Thread Matt Riedemann



On 7/14/2014 4:09 PM, Matt Riedemann wrote:

I opened bug 1341777 [1] against glance but it looks like it's due to
the default log level for requests.packages.urllib3.connectionpool in
oslo's log module.

The problem is this warning shows up nearly 420K times in 7 days in
Tempest runs:

WARNING urllib3.connectionpool [-] HttpConnectionPool is full,
discarding connection: 127.0.0.1

So either glance is doing something wrong, or that's logging too high of
a level (I think it should be debug in this case).  I'm not really sure
how to scope this down though, or figure out what is so damn chatty in
glance-api that is causing this.  It doesn't seem to be causing test
failures, but the rate at which this is logged in glance-api is surprising.

[1] https://bugs.launchpad.net/glance/+bug/1341777



I found this older thread [1] which led to this in oslo [2] but I'm not 
really sure how to use it to make the connectionpool logging quieter in 
glance, any guidance there?  It looks like in Joe's change to nova for 
oslo.messaging he just changed the value directly in the log module in 
nova, something I thought was forbidden.


[1] 
http://lists.openstack.org/pipermail/openstack-dev/2014-March/030763.html

[2] https://review.openstack.org/#/c/94001/
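For reference, the direct approach (as opposed to going through oslo's default_log_levels machinery) is a one-liner on the offending logger; the logger name below assumes glance is hitting the copy of urllib3 vendored into requests, as the warning prefix suggests:

```python
import logging

# Raise the threshold so "pool is full" WARNINGs are suppressed while
# real errors still get through.
pool_logger = logging.getLogger("requests.packages.urllib3.connectionpool")
pool_logger.setLevel(logging.ERROR)
```

Changing it directly in a project's copy of the oslo log module is the discouraged route discussed above; the supported knob is the default_log_levels option.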

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [oslo][glance] what to do about tons of http connection pool is full warnings in g-api log?

2014-07-14 Thread Matt Riedemann



On 7/14/2014 5:18 PM, Ben Nemec wrote:

On 07/14/2014 04:21 PM, Matt Riedemann wrote:



On 7/14/2014 4:09 PM, Matt Riedemann wrote:

I opened bug 1341777 [1] against glance but it looks like it's due to
the default log level for requests.packages.urllib3.connectionpool in
oslo's log module.

The problem is this warning shows up nearly 420K times in 7 days in
Tempest runs:

WARNING urllib3.connectionpool [-] HttpConnectionPool is full,
discarding connection: 127.0.0.1

So either glance is doing something wrong, or that's logging too high of
a level (I think it should be debug in this case).  I'm not really sure
how to scope this down though, or figure out what is so damn chatty in
glance-api that is causing this.  It doesn't seem to be causing test
failures, but the rate at which this is logged in glance-api is surprising.

[1] https://bugs.launchpad.net/glance/+bug/1341777



I found this older thread [1] which led to this in oslo [2] but I'm not
really sure how to use it to make the connectionpool logging quieter in
glance, any guidance there?  It looks like in Joe's change to nova for
oslo.messaging he just changed the value directly in the log module in
nova, something I thought was forbidden.

[1]
http://lists.openstack.org/pipermail/openstack-dev/2014-March/030763.html
[2] https://review.openstack.org/#/c/94001/



There was a change recently in incubator to address something related,
but since it's setting to WARN I don't think it would get rid of this
message:
https://github.com/openstack/oslo-incubator/commit/3310d8d2d3643da2fc249fdcad8f5000866c4389

It looks like Joe's change was a cherry-pick of the incubator change to
add oslo.messaging, so discouraged but not forbidden (and apparently
during feature freeze, which is understandable).

-Ben




Yeah, it sounds like either a problem in glance, since it doesn't allow 
configuring the max pool size (so it defaults to 1), or an issue in 
python-swiftclient that's being tracked in a different bug:


https://bugs.launchpad.net/python-swiftclient/+bug/1295812
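
For what it's worth, the pool size in question is a requests/urllib3 
transport-adapter setting: if a client never mounts an adapter with a 
bigger pool, extra parallel connections get discarded with exactly that 
warning. A sketch of what configuring it looks like (the sizes are 
illustrative, not what glance or swiftclient actually use):

```python
import requests
from requests.adapters import HTTPAdapter

# Mount an adapter with a larger pool so parallel connections are kept and
# reused instead of discarded (the discard is what logs the "connection
# pool is full" warning).  Pool sizes here are illustrative only.
session = requests.Session()
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=10)
session.mount('http://', adapter)
session.mount('https://', adapter)
```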

--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova][qa] proposal for moving forward on cells/tempest testing

2014-07-15 Thread Matt Riedemann



On 7/15/2014 12:36 AM, Sean Dague wrote:

On 07/14/2014 07:44 PM, Matt Riedemann wrote:

Today we only gate on exercises in devstack for cells testing coverage
in the gate-devstack-dsvm-cells job.

The non-voting cells tempest job was moved to the experimental queue
here [1] since it doesn't work with a lot of the compute API tests.

I think we all agreed to tar and feather comstud if he didn't get
Tempest working (read: passing) with cells enabled in Juno.

The first part of this is just figuring out where we sit with what's
failing in Tempest (in the check-tempest-dsvm-cells-full job).

I'd like to propose that we do the following to get the ball rolling:

1. Add an option to tempest.conf under the compute-feature-enabled
section to toggle cells and then use that option to skip tests that we
know will fail in cells, e.g. security group tests.


I don't think we should do that. Part of creating the feature matrix in
devstack-gate included the follow-on idea of doing extension selection
based on branch or feature.

I'm happy if that gets finished, then tests are skipped by known not
working extensions, but just landing a ton of tempest ifdefs that will
all be removed is feeling very gorpy. Especially as we're now at Juno 2,
which was supposed to be the checkpoint for this being on track for
completion and... people are just talking about starting.


2. Open bugs for all of the tests we're skipping so we can track closing
those down, assuming they aren't already reported. [2]

3. Once the known failures are being skipped, we can move
check-tempest-dsvm-cells-full out of the experimental queue.  I'm not
proposing that it'd be voting right away, I think we have to see it burn
in for awhile first.

With at least this plan we should be able to move forward on identifying
issues and getting some idea for how much of Tempest doesn't work with
cells and the effort involved in making it work.

Thoughts? If there aren't any objections, I said I'd work on the qa-spec
and can start doing the grunt-work of opening bugs and skipping tests.

[1] https://review.openstack.org/#/c/87982/
[2] https://bugs.launchpad.net/nova/+bugs?field.tag=cells+



All the rest is fine, I just think we should work on the proper way to
skip things.

-Sean






OK, I don't know anything about the extensions in devstack-gate or how 
the skips would work then; I'll have to bug some people in IRC unless 
there is an easy example that can be pointed out here.
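
In the meantime, here is roughly what I had in mind for the item 1 
toggle, as a sketch only; it uses stdlib unittest and a fake config 
object as stand-ins for tempest's real testtools and oslo.config plumbing:

```python
import unittest

# Stand-in for tempest's CONF.compute_feature_enabled section; the 'cells'
# option here is hypothetical and would be defined in tempest.conf.
class FakeComputeFeatureEnabled(object):
    cells = True

CONF = FakeComputeFeatureEnabled()

class SecurityGroupsTest(unittest.TestCase):
    @unittest.skipIf(CONF.cells, "security groups are not supported with cells")
    def test_create_security_group(self):
        self.fail("would exercise the compute API here")

# Running the case shows it is skipped rather than failed when cells is on.
result = unittest.TestResult()
unittest.defaultTestLoader.loadTestsFromTestCase(SecurityGroupsTest).run(result)
assert len(result.skipped) == 1 and not result.failures
```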


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [oslo][glance] what to do about tons of http connection pool is full warnings in g-api log?

2014-07-15 Thread Matt Riedemann



On 7/14/2014 5:28 PM, Matt Riedemann wrote:



On 7/14/2014 5:18 PM, Ben Nemec wrote:

On 07/14/2014 04:21 PM, Matt Riedemann wrote:



On 7/14/2014 4:09 PM, Matt Riedemann wrote:

I opened bug 1341777 [1] against glance but it looks like it's due to
the default log level for requests.packages.urllib3.connectionpool in
oslo's log module.

The problem is this warning shows up nearly 420K times in 7 days in
Tempest runs:

WARNING urllib3.connectionpool [-] HttpConnectionPool is full,
discarding connection: 127.0.0.1

So either glance is doing something wrong, or that's logging too
high of
a level (I think it should be debug in this case).  I'm not really sure
how to scope this down though, or figure out what is so damn chatty in
glance-api that is causing this.  It doesn't seem to be causing test
failures, but the rate at which this is logged in glance-api is
surprising.

[1] https://bugs.launchpad.net/glance/+bug/1341777



I found this older thread [1] which led to this in oslo [2] but I'm not
really sure how to use it to make the connectionpool logging quieter in
glance, any guidance there?  It looks like in Joe's change to nova for
oslo.messaging he just changed the value directly in the log module in
nova, something I thought was forbidden.

[1]
http://lists.openstack.org/pipermail/openstack-dev/2014-March/030763.html

[2] https://review.openstack.org/#/c/94001/



There was a change recently in incubator to address something related,
but since it's setting to WARN I don't think it would get rid of this
message:
https://github.com/openstack/oslo-incubator/commit/3310d8d2d3643da2fc249fdcad8f5000866c4389


It looks like Joe's change was a cherry-pick of the incubator change to
add oslo.messaging, so discouraged but not forbidden (and apparently
during feature freeze, which is understandable).

-Ben




Yeah it sounds like either a problem in glance because they don't allow
configuring the max pool size so it defaults to 1, or it's an issue in
python-swiftclient and is being tracked in a different bug:

https://bugs.launchpad.net/python-swiftclient/+bug/1295812



It looks like the issue for the g-api logs was bug 1295812 in 
python-swiftclient, around the time it moved to using python-requests.


I noticed last night that the n-cpu/c-vol logs started spiking with the 
urllib3 connectionpool warning on 7/11 which is when python-glanceclient 
started using requests, so I've changed bug 1341777 to a 
python-glanceclient bug.


--

Thanks,

Matt Riedemann




[openstack-dev] python-glanceclient with requests is spamming the logs

2014-07-15 Thread Matt Riedemann
I've been looking at bug 1341777 since yesterday, originally because of 
the g-api logs and this warning:


HttpConnectionPool is full, discarding connection: 127.0.0.1

But that's been around a while and sounds like an issue with 
python-swiftclient since it started using python-requests (see bug 1295812).


I also noticed that the warning started spiking in the n-cpu and 
c-vol logs on 7/11, and traced that back to this change in 
python-glanceclient to start using requests:


https://review.openstack.org/#/c/78269/

This is nasty because it's generating around 166K warnings since 7/11 in 
those logs:


http://goo.gl/p0urYm

It's a big change in glanceclient so I wouldn't want to propose a revert, 
but hopefully the glance team can sort this out quickly since 
it's going to impact our Elasticsearch cluster.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] [nova] request to tag novaclient 2.18.0

2014-07-18 Thread Matt Riedemann



On 7/17/2014 5:48 PM, Steve Baker wrote:

On 18/07/14 00:44, Joe Gordon wrote:




On Wed, Jul 16, 2014 at 11:28 PM, Steve Baker sba...@redhat.com wrote:

On 12/07/14 09:25, Joe Gordon wrote:




On Fri, Jul 11, 2014 at 4:42 AM, Jeremy Stanley fu...@yuggoth.org wrote:

On 2014-07-11 11:21:19 +0200 (+0200), Matthias Runge wrote:
 this broke horizon stable and master; heat stable is
affected as
 well.
[...]

I guess this is a plea for applying something like the oslotest
framework to client libraries so they get backward-compat
jobs run
against unit tests of all dependant/consuming software...
branchless
tempest already alleviates some of this, but not the case of
changes
in a library which will break unit/functional tests of another
project.


We actually do have some tests for backwards compatibility, and
they all passed. Presumably because both heat and horizon have
poor integration test.

We ran

  * check-tempest-dsvm-full-havana

http://logs.openstack.org/66/94166/3/check/check-tempest-dsvm-full-havana/8e09faa
SUCCESS in 40m 47s (non-voting)
  * check-tempest-dsvm-neutron-havana

http://logs.openstack.org/66/94166/3/check/check-tempest-dsvm-neutron-havana/b4ad019
SUCCESS in 36m 17s (non-voting)
  * check-tempest-dsvm-full-icehouse

http://logs.openstack.org/66/94166/3/check/check-tempest-dsvm-full-icehouse/c0c62e5
SUCCESS in 53m 05s
  * check-tempest-dsvm-neutron-icehouse

http://logs.openstack.org/66/94166/3/check/check-tempest-dsvm-neutron-icehouse/a54aedb
SUCCESS in 57m 28s


on the offending patches (https://review.openstack.org/#/c/94166/)

Infra patch that added these tests:
https://review.openstack.org/#/c/80698/



Heat-proper would have continued working fine with novaclient
2.18.0. The regression was with raising novaclient exceptions,
which is only required in our unit tests. I saw this break coming
and switched to raising via from_response
https://review.openstack.org/#/c/97977/22/heat/tests/v1_1/fakes.py

Unit tests tend to deal with more internals of client libraries
just for mocking purposes, and there have been multiple breaks in
unit tests for heat and horizon when client libraries make
internal changes.

This could be avoided if the client gate jobs run the unit tests
for the projects which consume them.

That may work but isn't this exactly what integration testing is for?

If you mean tempest then no, this is different.

Client projects have done a good job of keeping their public library
APIs stable. An exception type is public API, but the constructor for
raising that type arguably is more of a gray area since only the client
library should be raising its own exceptions.

However heat and horizon unit tests need to raise client exceptions to
test their own error condition handling, so exception constructors could
be considered public API, but only for unit test mocking in other projects.

This problem couldn't have been caught in an integration test because
nothing outside the unit tests directly raises a client exception.

There have been other breakages where internal client library changes
have broken the mocking in our unit tests (I recall a neutronclient
internal refactor).

In many cases the cause may be inappropriate mocking in the unit tests,
but that is cold comfort when the gates break when a client library is
released.

Maybe we can just start with adding heat and horizon to the check jobs
of the clients they consume, but the following should also be considered:
grep python-.*client */requirements.txt

This could give client libraries more confidence that internal changes
don't break anything, and allows them to fix mocking in other projects
before their changes land.






I don't think we should have to change the gate jobs just so that other 
projects can test against the internals of their dependent clients; that 
sounds like flawed unit test design to me.


Looking at 
https://review.openstack.org/#/c/97977/22/heat/tests/v1_1/fakes.py for 
example, why is a fake_exception needed to mock out novaclient's 
NotFound exception?  A better way to do this is to have whatever is 
expected to raise the NotFound use mock with a side_effect that raises 
novaclient.exceptions.NotFound; then mock handles setting the spec on 
the mock and you don't have to worry about the internal construction of 
the exception class in your unit tests.
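
Roughly like this, with a locally defined stand-in where real tests 
would import novaclient.exceptions.NotFound:

```python
from unittest import mock

# Stand-in for novaclient.exceptions.NotFound; real tests would import the
# actual class so the except clause matches what novaclient raises.
class NotFound(Exception):
    pass

# Code under test: treat a vanished server as already deleted.
def ensure_deleted(nova, server_id):
    try:
        nova.servers.delete(server_id)
    except NotFound:
        return 'already gone'
    return 'deleted'

# side_effect raises the exception from the mocked call; the code under
# test only depends on the type, not on a hand-built fake_exception module.
nova = mock.Mock()
nova.servers.delete.side_effect = NotFound('no such server')
assert ensure_deleted(nova, 'abc123') == 'already gone'
```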


--

Thanks,

Matt Riedemann



Re: [openstack-dev] [gate] Automatic elastic rechecks

2014-07-18 Thread Matt Riedemann



On 7/17/2014 9:01 AM, Matthew Booth wrote:

Elastic recheck is a great tool. It leaves me messages like this:

===
I noticed jenkins failed, I think you hit bug(s):

check-devstack-dsvm-cells: https://bugs.launchpad.net/bugs/1334550
gate-tempest-dsvm-large-ops: https://bugs.launchpad.net/bugs/1334550

We don't automatically recheck or reverify, so please consider doing
that manually if someone hasn't already. For a code review which is not
yet approved, you can recheck by leaving a code review comment with just
the text:

 recheck bug 1334550

For bug details see: http://status.openstack.org/elastic-recheck/
===

In an ideal world, every person seeing this would diligently check that
the fingerprint match was accurate before submitting a recheck request.

In the real world, how about we just do it automatically?

Matt



We don't want automatic rechecks because then we're just piling on to 
races: you can have jenkins failures where we have a fingerprint for one 
job failure, but some other job failing on your patch is an unrecognized 
failure (no e-r fingerprint query yet).  If we never force people to 
investigate the failures and write fingerprints because we're always 
automatically rechecking things for them, we'll drop our categorization 
rates and most likely eventually end up with a locked gate once we hit 
2-3 really nasty races at the same time.


So the best way to avoid a locked gate is to stay on top of managing the 
worst offenders and making sure everyone is actually looking at what 
failed so we can quickly identify new races.


--

Thanks,

Matt Riedemann




Re: [openstack-dev] About the ERROR:cliff.app Service Unavailable during deploy openstack by devstack.

2014-07-24 Thread Matt Riedemann



On 7/14/2014 3:47 AM, Meng Jie MJ Li wrote:

HI,


I tried to use devstack to deploy openstack, but ran into an issue:
ERROR: cliff.app Service Unavailable (HTTP 503).  I tried several times
with the same result.

2014-07-14 05:53:39.430 | + create_keystone_accounts
2014-07-14 05:53:39.431 | ++ get_or_create_project admin
2014-07-14 05:53:39.433 | +++ openstack project show admin -f value -c id
2014-07-14 05:53:40.147 | +++ openstack project create admin -f value -c id
2014-07-14 05:53:40.771 | ERROR: cliff.app Service Unavailable (HTTP 503)


2014-07-14 05:53:41.519 | +++ openstack user create admin --password
admin --project --email ad...@example.com -f value -c id
2014-07-14 05:53:42.080 | usage: openstack user create [-h] [-f
{shell,table,value}] [-c COLUMN]
2014-07-14 05:53:42.080 |  [--max-width integer]
[--prefix PREFIX]
2014-07-14 05:53:42.080 |  [--password
user-password] [--password-prompt]
2014-07-14 05:53:42.080 |  [--email user-email]
[--project project]
2014-07-14 05:53:42.080 |  [--enable | --disable]
2014-07-14 05:53:42.080 |  user-name
2014-07-14 05:53:42.081 | openstack user create: error: argument
--project: expected one argument
2014-07-14 05:53:42.109 | ++ USER_ID=
2014-07-14 05:53:42.109 | ++ echo
2014-07-14 05:53:42.109 | + ADMIN_USER=
2014-07-14 05:53:42.110 | ++ get_or_create_role admin
2014-07-14 05:53:42.111 | +++ openstack role show admin -f value -c id
2014-07-14 05:53:42.682 | +++ openstack role create admin -f value -c id
2014-07-14 05:53:43.235 | ERROR: cliff.app Service Unavailable (HTTP 503)





Searching Google, I found someone who hit the same problem, logged
in https://bugs.launchpad.net/devstack/+bug/129.  I tried the
workaround below but it didn't work.
=
1st, I tried setting HOST_IP to 127.0.0.1.
Next, I set it to *9.21.xxx.xxx* , which is the address of my eth0
interface, and added
export no_proxy=localhost,127.0.0.1,*9.21.xxx.xxx*

Neither of these fixed the problem.





My localrc file:

HOST_IP=9.21.xxx.xxx
FLAT_INTERFACE=eth0
#FIXED_RANGE=10.4.128.0/20
#FIXED_NETWORK_SIZE=4096
#FLOATING_RANGE=192.168.42.128/25
MULTI_HOST=1
LOGFILE=/opt/stack/logs/stack.sh.log
ADMIN_PASSWORD=admin
MYSQL_PASSWORD=admin
RABBIT_PASSWORD=admin
SERVICE_PASSWORD=admin
SERVICE_TOKEN=xyzpdqlazydog
===

Any help appreciated


Regards
Mengjie










There was a recent change to devstack to default to running keystone in 
apache; that might be what you're hitting.  There is an env var to 
disable that so keystone doesn't run in apache, but you'd have to look 
up the change for the details.  It should be in the history of 
devstack's lib/keystone file.
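
If I'm remembering the change correctly, the toggle is a localrc 
variable along these lines; treat the variable name as an assumption 
and verify it against lib/keystone:

```shell
# localrc -- ASSUMPTION: KEYSTONE_USE_MOD_WSGI is the toggle added by the
# recent devstack change; confirm the exact name in devstack's lib/keystone.
KEYSTONE_USE_MOD_WSGI=False
```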


--

Thanks,

Matt Riedemann




[openstack-dev] [neutron] requesting python-neutronclient release for MacAddressInUseClient exception

2014-07-28 Thread Matt Riedemann
Nova needs a python-neutronclient release to use the new 
MacAddressInUseClient exception type defined here [1].


[1] https://review.openstack.org/#/c/109052/
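
For context, this is the shape of what nova wants to do with the new 
type, sketched with a stand-in class where nova would import the real 
neutronclient exception:

```python
# Stand-in for the new MacAddressInUseClient exception; nova would import
# it from python-neutronclient once the release is tagged.
class MacAddressInUseClient(Exception):
    pass

def create_port(client, network_id, mac):
    """Translate a duplicate-MAC failure into a retryable condition."""
    try:
        return client.create_port(network_id, mac)
    except MacAddressInUseClient:
        # With a dedicated type, nova can retry with a fresh MAC instead
        # of treating this as a generic client error and failing the boot.
        return 'retry-with-new-mac'

# Fake client demonstrating the duplicate-MAC path.
class FakeNeutron(object):
    def create_port(self, network_id, mac):
        raise MacAddressInUseClient(mac)

assert create_port(FakeNeutron(), 'net-1', 'fa:16:3e:00:00:01') == 'retry-with-new-mac'
```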

--

Thanks,

Matt Riedemann



