[openstack-dev] [qa] identity v3 issue causing non-admin job to fail

2015-07-14 Thread David Kranz
Now that the tempest periodic jobs are back (thanks infra!), I was 
looking into the real failures. It seems the main one is caused by the 
fact that the v3 check for primary creds fails if 'admin_domain_name' in 
the identity section is None, which it is when devstack configures 
tempest for non-admin.


The problem is with this code and there is even a comment related to 
this issue. There are various ways to fix this but I'm not sure what the 
value should be for the non-admin case. Andrea, any ideas?


 -David

def get_credentials(fill_in=True, identity_version=None, **kwargs):
    params = dict(DEFAULT_PARAMS, **kwargs)
    identity_version = identity_version or CONF.identity.auth_version
    # In case of v3 add the domain from config if not specified
    if identity_version == 'v3':
        domain_fields = set(x for x in auth.KeystoneV3Credentials.ATTRIBUTES
                            if 'domain' in x)
        if not domain_fields.intersection(kwargs.keys()):
            # TODO(andreaf) It might be better here to use a dedicated config
            # option such as CONF.auth.tenant_isolation_domain_name
            params['user_domain_name'] = CONF.identity.admin_domain_name
        auth_url = CONF.identity.uri_v3
    else:
        auth_url = CONF.identity.uri
    return auth.get_credentials(auth_url,
                                fill_in=fill_in,
                                identity_version=identity_version,
                                **params)
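
One possible direction, sketched below purely for discussion (the option name used is invented here, this is not a proposed patch), would be to fall back to a dedicated non-admin option along the lines of the existing TODO, so the non-admin configuration never ends up with user_domain_name = None:

        if not domain_fields.intersection(kwargs.keys()):
            # 'default_credentials_domain_name' is a hypothetical option used
            # only to illustrate the idea; the point is that the fallback
            # should not depend on admin_domain_name being configured.
            params['user_domain_name'] = (
                CONF.identity.admin_domain_name or
                CONF.auth.default_credentials_domain_name)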


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] Tempest bug triage questions

2015-06-22 Thread David Kranz

On 06/22/2015 10:15 AM, Yaroslav Lobankov wrote:

Hello everyone,

I have some questions about the bug triage procedure for Tempest:

1. Some bugs in Tempest have status Fix committed. Should we move 
statuses of these bugs to Fix released?
Yes, tempest doesn't have the kind of releases where Fix committed makes 
sense.


2. Many bugs have statuses In progress, but patches for these bugs 
have -1 from someone (or their workflow is -1)
and it looks like these patches are abandoned, while statuses of 
such patches are Review in progress.

What should we do with such bugs?
This is kind of tricky without the project having a manager. I think 
we usually ping with a comment in the bug to see what the holdup is. If 
there is no response, we would set it back to Triaged or Confirmed.


3. What should we do with bugs like this [1]? It says that a 
TimeoutException occurred, but the log of the test is no longer 
available at the link. I don't know how to reproduce the issue, and 
without the log there is little to go on. What should I do with this 
bug?
This bug is about how tempest should handle cases where a resource 
deletion fails. I think it is a legitimate issue though the answer is 
not clear given a gate that may be very slow at times.


 -David



Thank you!

[1] https://bugs.launchpad.net/tempest/+bug/1322011

Regards,
Yaroslav Lobankov.




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA][Tempest] Proposing Jordan Pittier for Tempest Core

2015-06-22 Thread David Kranz

+1

On 06/22/2015 04:23 PM, Matthew Treinish wrote:


Hi Everyone,

I'd like to propose we add Jordan Pittier (jordanP) to the tempest core team.
Jordan has been a steady contributor and reviewer on tempest over the past few
cycles and he's been actively engaged in the Tempest community. Jordan has had
one of the higher review counts on Tempest for the past cycle, and he has
consistently been providing reviews that show insight into both the project
internals and its future direction. I feel that Jordan will make an excellent
addition to the core team.

As per the usual, if the current Tempest core team members would please vote +1
or -1(veto) to the nomination when you get a chance. We'll keep the polls open
for 5 days or until everyone has voted.

Thanks,

Matt Treinish

References:

https://review.openstack.org/#/q/reviewer:jordan.pittier%2540scality.com+project:openstack/tempest+OR+project:openstack/tempest-lib,n,z

http://stackalytics.com/?metric=marks&user_id=jordan-pittier




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] Status of account creds in the [identity] section of tempest.conf

2015-06-18 Thread David Kranz
We had a discussion about this at the qa meeting today around the 
following proposal:


tl;dr The test accounts feature provides the same functionality as the 
embedded credentials. We should deprecate the account information 
embedded directly in tempest.conf in favor of test-accounts, and remove 
those options at the beginning of the M cycle. We would also rework the 
non-isolated jobs to use parallel test accounts, with and without admin 
creds. Starting now, new features such as cleanup and tempest config 
will not be required to work well (or at all) if the embedded creds are 
used instead of test accounts.
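
For reference, the test accounts file is just a YAML list of pre-provisioned
credentials, roughly of the shape below (usernames, tenants and passwords here
are placeholders), pointed to by test_accounts_file in the [auth] section of
tempest.conf:

- username: 'tempest_user_1'
  tenant_name: 'tempest_tenant_1'
  password: 'secret'
- username: 'tempest_user_2'
  tenant_name: 'tempest_tenant_2'
  password: 'secret'
  roles:
    - 'Member'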


We have (at least) three use cases that are important, and we want 
tempest to work well with all of them, but that means something 
different in each case:


1. throw-away clouds (ci, gate)
2. test clouds
3. production clouds

For (1), the most important thing is that failing tests not cause false 
negatives in other tests due to re-using a tenant. This makes tenant 
isolation continue to be a good choice here, and requiring admin is not 
an issue. In a perfect world where tempest never left behind any 
resources regardless of an error at any line of code, test accounts 
could be used. But we are probably a long way from that.


For (3), we cannot use admin creds for tempest runs, and test accounts 
with cleanup allow parallel execution, accepting the risk of a leak 
causing a false negative. The only way to avoid that risk is to stamp 
out all leak bugs in tempest.


For (2), either isolation or test accounts with cleanup can be used.

The tempest.conf values are not used in any of these scenarios. Is there 
a reason they are needed for anything?


 -David





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [QA] Meeting Thursday June 18th at 17:00 UTC

2015-06-17 Thread David Kranz

Hi everyone,

Just a quick reminder that the weekly OpenStack QA team IRC meeting will be
tomorrow Thursday, June 18th at 17:00 UTC in the #openstack-meeting channel.

The agenda for tomorrow's meeting can be found here:
https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
Anyone is welcome to add an item to the agenda.

To help people figure out what time 17:00 UTC is in other timezones tomorrow's
meeting will be at:

13:00 EDT
02:00 JST
02:30 ACST
19:00 CEST
12:00 CDT
10:00 PDT

-David Kranz


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] The Nova API in Kilo and Beyond

2015-06-05 Thread David Kranz

On 06/05/2015 07:32 AM, Sean Dague wrote:

One of the things we realized at the summit was that we'd been working
through a better future for the Nova API for the past 5 cycles, gotten
somewhere quite useful, but had really done a poor job on communicating
what was going on and why, and where things are headed next.

I've written a bunch of English to explain it (which should be on the
planet shortly as well) -
https://dague.net/2015/06/05/the-nova-api-in-kilo-and-beyond-2/ (with
lots of help from Ed Leafe, John Garbutt, and Matt Gilliard on content and
copy editing).

Yes, this is one of those terrible mailing list posts that points people
to read a thing not on the list (I apologize). But at 2700 words, I
think you'll find it more comfortable to read not in email.

Discussion is welcome here for any comments folks have. Some details
were trimmed for the sake of it not being a 6000 word essay, and to make
it accessible to people that don't have a ton of Nova internals
knowledge. We'll do our best to field questions, all of which will be
integrated into the eventual dev ref version of this.

Thanks for your time,

-Sean

Thanks, Sean. Great writeup. There are two issues I think might need 
more clarification/amplification:


1. Does the microversion methodology, and the motivation for true 
interoperability, imply that there needs to be a new version for every 
bug fix that could be detected by users of an api? There was back and 
forth about that in the review about the ip6 server list filter bug you 
referenced. If so, this is a pretty strong constraint that will need 
more guidance for reviewers about which kinds of changes need new 
versions and which don't.


2. What is the policy for making incompatible changes, now that 
versioning allows such changes to be made? If someone doesn't like 
the name of one of the keys in a returned dict, and submits a change 
with new microversion, how should that be evaluated? IIRC, this was an 
issue that inspired some dislike about the original v3 work.
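
For context on how clients interact with this, a request opts into newer 
behavior by pinning the microversion header; a minimal sketch (the endpoint, 
token and version value are placeholders, not from any real deployment):

import requests

resp = requests.get(
    "http://nova.example.com/v2.1/servers",
    headers={
        "X-Auth-Token": "TOKEN",
        # Omitting this header gets the base v2.1 behavior; pinning it makes
        # the response contract explicit for the client.
        "X-OpenStack-Nova-API-Version": "2.3",
    },
)
print(resp.status_code, resp.json())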


 -David

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [QA] Meeting Thursday June 4th at 17:00 UTC

2015-06-03 Thread David Kranz

Hi everyone,

Just a quick reminder that the weekly OpenStack QA team IRC meeting will be
tomorrow Thursday, June 4th at 17:00 UTC in the #openstack-meeting channel.

The agenda for tomorrow's meeting can be found here:
https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
Anyone is welcome to add an item to the agenda.

To help people figure out what time 17:00 UTC is in other timezones tomorrow's
meeting will be at:

13:00 EDT
02:00 JST
02:30 ACST
19:00 CEST
12:00 CDT
10:00 PDT

-David Kranz

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] SafeConfigParser.write duplicates defaults: bug or feature?

2015-06-02 Thread David Kranz
The verify_tempest_config script has an option to write a new conf file. 
I noticed that when you do this, the items in DEFAULT are duplicated in 
every section that is written. Looking at the source I can see why this 
happens. I guess it is not harmful but is this considered a bug in the 
write method?
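
One way this kind of duplication commonly arises is that items() merges the 
DEFAULT entries into every section, so rebuilding a parser section by section 
from items() re-emits each default under every section header on write(). A 
small illustration of that mechanism (not the verify_tempest_config code 
itself; Python 3's configparser behaves the same way here as SafeConfigParser):

import configparser
import sys

src = configparser.ConfigParser()
src.read_string("""
[DEFAULT]
debug = True

[compute]
image_ref = cirros

[network]
public_network_id = ext-net
""")

dst = configparser.ConfigParser()
for section in src.sections():
    dst.add_section(section)
    for key, value in src.items(section):  # includes 'debug' from DEFAULT
        dst.set(section, key, value)

dst.write(sys.stdout)  # 'debug = True' now appears under both sections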


 -David

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] Need volunteers for tempest bug triage

2015-06-01 Thread David Kranz

On 05/30/2015 09:15 AM, Kashyap Chamarthy wrote:

On Sat, May 30, 2015 at 03:52:02PM +0300, Yaroslav Lobankov wrote:

Hi everyone,

Is it possible for other people (not only core reviewers) to
participate in bug triage? I would like to help in doing this.

Absolutely. There's no such silly rule that only core reviewers can do
bug triage.
Of course it is helpful for anyone to help with bug triage. Beware that 
confirming new bugs is a bit trickier
for tempest than other projects because it has such a high rate of 
Invalid bug reports. This is because bugs in other projects will often cause
tempest to fail and many people just file tempest bugs with a stacktrace 
from the console. You often have to dig further to find the real issue.


For that reason we like to always have a core reviewer watching bug 
traffic and being the bug supervisor in the below referenced wiki.

That was the point of this message.

 -David


While not mandatory, it can make your life a bit easier while
troubleshooting if you have spent some time test/debugging OpenStack
environments.

Take a look here:

 https://wiki.openstack.org/wiki/BugTriage

Also, you might want to refer to these useful notes from Sean Dague about
what to consider while triaging bugs (though, the example below is from
the Nova bug tracker, it's generally applicable across components):

 
http://lists.openstack.org/pipermail/openstack-dev/2014-September/046517.html




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] Need volunteers for tempest bug triage

2015-05-29 Thread David Kranz
The rotation has gotten a bit thin, the untriaged bug count is growing, 
with no one signing up for this past week:


https://etherpad.openstack.org/p/qa-bug-triage-rotation

It would help if every core reviewer could be doing this every other 
month. Getting some more sign ups would be very helpful!


 -David

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [new][cloudpulse] Announcing a project to HealthCheck OpenStack deployments

2015-05-13 Thread David Kranz

On 05/13/2015 09:06 AM, Simon Pasquier wrote:

Hello,

Like many others commented before, I don't quite understand how unique 
are the Cloudpulse use cases.


For operators, I got the feeling that existing solutions fit well:
- Traditional monitoring tools (Nagios, Zabbix, ) are necessary 
anyway for infrastructure monitoring (CPU, RAM, disks, operating 
system, RabbitMQ, databases and more) and diagnostic purposes. Adding 
OpenStack service checks is fairly easy if you already have the toolchain.
Is it really so easy? Rabbitmq has an aliveness test that is easy to 
hook into. I don't know exactly what it does, other than what the doc 
says, but I should not have to. If I want my standard monitoring system 
to call into a cloud and ask "is nova healthy?", "is glance healthy?", 
etc., are there such calls?


There are various sets of calls associated with nagios, zabbix, etc. but 
those seem like after-market parts for a car. Seems to me the services 
themselves would know best how to check if they are healthy, 
particularly as that could change version to version. Has there been 
discussion of adding a health-check (admin) api in each service? Lacking 
that, is there documentation from any OpenStack projects about how to 
check the health of nova? When I saw this thread start, that is what I 
thought it was going to be about.


 -David

- OpenStack projects like Rally or Tempest can generate synthetic 
loads and run end-to-end tests. Integrating them with a monitoring 
system isn't terribly difficult either.


As far as Monitoring-as-a-service is concerned, do you have plans to 
integrate/leverage Ceilometer?


BR,
Simon

On Tue, May 12, 2015 at 7:20 PM, Vinod Pandarinathan (vpandari) 
vpand...@cisco.com mailto:vpand...@cisco.com wrote:


Hello,

  I'm pleased to announce the development of a new project called
CloudPulse.  CloudPulse provides OpenStack
health-checking services to operators, tenants, and
applications. This project will begin as
a StackForge project based upon an empty cookiecutter[1] repo. 
The repos to work in are:

Server: https://github.com/stackforge/cloudpulse
Client: https://github.com/stackforge/python-cloudpulseclient

Please join us via iRC on #openstack-cloudpulse on freenode.

I am holding a doodle poll to select times for our first meeting
the week after summit.  This doodle poll will close May 24th and
meeting times will be announced on the mailing list at that time.
At our first IRC meeting,
we will draft additional core team members, so if you're interested
in joining a fresh new development effort, please attend our first
meeting.
Please take a moment if you're interested in CloudPulse to fill out
the doodle poll here:

https://doodle.com/kcpvzy8kfrxe6rvb

The initial core team is composed of
Ajay Kalambur,
Behzad Dastur, Ian Wells, Pradeep chandrasekhar, Steven
DakeandVinod Pandarinathan.
I expect more members to join during our initial meeting.

 A little bit about CloudPulse:
 Cloud operators need notification of OpenStack failures before a
customer reports the failure. Cloud operators can then take timely
corrective actions with minimal disruption to applications. Many
cloud applications, including
those I am interested in (NFV) have very stringent service level
agreements.  Loss of service can trigger contractual
costs associated with the service.  Application high availability
requires an operational OpenStack Cloud, and the reality
is that occasionally OpenStack clouds fail in some mysterious
ways.  This project intends to identify when those failures
occur so corrective actions may be taken by operators, tenants,
and the applications themselves.

OpenStack is considered healthy when OpenStack API services
respond appropriately.  Further OpenStack is
healthy when network traffic can be sent between the tenant
networks and can access the Internet.  Finally OpenStack
is healthy when all infrastructure cluster elements are in an
operational state.

For information about blueprints check out:
https://blueprints.launchpad.net/cloudpulse
https://blueprints.launchpad.net/python-cloudpulseclient

For more details, check out our Wiki:
https://wiki.openstack.org/wiki/Cloudpulse

Please join the CloudPulse team in designing and implementing a
world-class Carrier Grade system for checking
the health of OpenStack clouds.  We look forward to seeing you on
IRC on #openstack-cloudpulse.

Regards,
Vinod Pandarinathan
[1] https://github.com/openstack-dev/cookiecutter


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

Re: [openstack-dev] [new][cloudpulse] Announcing a project to HealthCheck OpenStack deployments

2015-05-13 Thread David Kranz

On 05/13/2015 09:51 AM, Simon Pasquier wrote:



On Wed, May 13, 2015 at 3:27 PM, David Kranz dkr...@redhat.com 
mailto:dkr...@redhat.com wrote:


On 05/13/2015 09:06 AM, Simon Pasquier wrote:

Hello,

Like many others commented before, I don't quite understand how
unique are the Cloudpulse use cases.

For operators, I got the feeling that existing solutions fit well:
- Traditional monitoring tools (Nagios, Zabbix, ) are
necessary anyway for infrastructure monitoring (CPU, RAM, disks,
operating system, RabbitMQ, databases and more) and diagnostic
purposes. Adding OpenStack service checks is fairly easy if you
already have the toolchain.

Is it really so easy? Rabbitmq has an aliveness test that is
easy to hook into. I don't know exactly what it does, other than
what the doc says, but I should not have to. If I want my standard
monitoring system to call into a cloud and ask "is nova healthy?",
"is glance healthy?", etc., are there such calls?


Regarding RabbitMQ aliveness test, it has its own limits (more on that 
latter, I've got an interesting RabbitMQ outage that I'm going to 
discuss in a new thread) and it doesn't replicate exactly what the 
clients (eg OpenStack services) are doing.
I'm sure it has limits but my point was that the developers of rabbitmq 
understood that it would be difficult for users to know exactly what 
should be poked at inside to check health, so they provide a call to do it.


Regarding the service checks, there are already plenty of scripts that 
exist for Nagios, Collectd and so on. Some of them are listed in the 
Wiki [1].
I understand and that is what I meant by after-market. If someone 
adds a new feature to service X that needs to be monitored for the 
service to be considered healthy, then all those different scripts need 
to chase after it to keep up to date. Poking at service internals to 
check the health of a service is an abstraction violation. As someone 
on this thread said, tempest/rally can be used to check a certain kind 
of health but it is akin to black-box testing, whereas health monitoring 
should be more akin to white-box testing.



There are various sets of calls associated with nagios, zabbix,
etc. but those seem like after-market parts for a car. Seems to
me the services themselves would know best how to check if they
are healthy, particularly as that could change version to version.
Has there been discussion of adding a health-check (admin) api in
each service? Lacking that, is there documentation from any
OpenStack projects about how to check the health of nova? When I
saw this thread start, that is what I thought it was going to be
about.


Starting with Kilo, you could configure your OpenStack API services 
with the healthcheck middleware [2]. This has been inspired by what 
Swift's been doing for some time now [3]. IIUC the default healthcheck 
is minimalist and doesn't check that dependent services (like 
RabbitMQ, database) are healthy but the framework is extensible and 
more healthchecks can be added.
I can see that but the real value would be in abstracting the details of 
what it means for a service to be healthy inside the implementation and 
exporting an api. If that were present, the question of whether calling 
it used middleware or not would be secondary. I'm not sure what the 
value-add of middleware would be in this case.
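
For reference, the Kilo-era middleware Simon mentions is wired into a 
service's paste pipeline; a rough sketch of an api-paste.ini fragment (the 
backend choice and file path are just examples, not taken from any project's 
tree):

[filter:healthcheck]
paste.filter_factory = oslo_middleware:Healthcheck.factory
backends = disable_by_file
disable_by_file_path = /etc/nova/healthcheck_disable

The filter is then placed at the front of the API pipeline so that a request 
to /healthcheck can be answered before it reaches the rest of the stack.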


 -David






 -David


BR,
Simon

[1] 
https://wiki.openstack.org/wiki/Operations/Tools#Monitoring_and_Trending
[2] 
http://docs.openstack.org/developer/oslo.middleware/api.html#oslo_middleware.Healthcheck
[3] 
http://docs.openstack.org/kilo/config-reference/content/object-storage-healthcheck.html




- OpenStack projects like Rally or Tempest can generate synthetic
loads and run end-to-end tests. Integrating them with a
monitoring system isn't terribly difficult either.

As far as Monitoring-as-a-service is concerned, do you have plans
to integrate/leverage Ceilometer?

BR,
Simon

On Tue, May 12, 2015 at 7:20 PM, Vinod Pandarinathan (vpandari)
vpand...@cisco.com mailto:vpand...@cisco.com wrote:

Hello,

  I'm pleased to announce the development of a new project
called CloudPulse.  CloudPulse provides Openstack
health-checking services to both operators, tenants, and
applications. This project will begin as
a StackForge project based upon an empty cookiecutter[1]
repo. The repos to work in are:
Server: https://github.com/stackforge/cloudpulse
Client: https://github.com/stackforge/python-cloudpulseclient

Please join us via iRC on #openstack-cloudpulse on freenode.

I am holding a doodle poll to select times for our first
meeting the week after summit.  This doodle poll will close
May 24th and meeting times will be announced on the mailing
list at that time.  At our first IRC meeting

Re: [openstack-dev] [api] Changing 403 Forbidden to 400 Bad Request for OverQuota was: [nova] Which error code should we return when OverQuota

2015-05-06 Thread David Kranz

On 05/06/2015 02:07 PM, Jay Pipes wrote:

Adding [api] topic. API WG members, please do comment.

On 05/06/2015 08:01 AM, Sean Dague wrote:

On 05/06/2015 07:11 AM, Chris Dent wrote:

On Wed, 6 May 2015, Sean Dague wrote:


All other client errors, just be a 400. And use the emerging error
reporting json to actually tell the client what's going on.


Please do not do this. Please use the 4xx codes as best as you
possibly can. Yes, they don't always match, but there are several of
them for reasons™ and it is usually possible to find one that sort
of fits.

Using just 400 is bad for a healthy HTTP ecosystem. Sure, for the
most part people are talking to OpenStack through official clients
but a) what happens when they aren't, b) is that the kind of world
we want?

I certainly don't. I want a world where the HTTP APIs that OpenStack
and other services present actually use HTTP and allow a diversity
of clients (machine and human).


Absolutely. And the problem is there is not enough namespace in the HTTP
error codes to accurately reflect the error conditions we hit. So the
current model means the following:

If you get any error code, it means multiple failure conditions. Throw
it away, grep the return string to decide if you can recover.

My proposal is to be *extremely* specific for the use of anything
besides 400, so there is only 1 situation that causes that to arise. So
403 means a thing, only one thing, ever. Not 2 kinds of things that you
need to then figure out what you need to do.

If you get a 400, well, that's multiple kinds of errors, and you need to
then go conditional.

This should provide a better experience for all clients, human and 
machine.


I agree with Sean on this one.


Using response codes effectively makes it easier to write client code
that is either simple or is able to use generic libraries effectively.

Let's be honest: OpenStack doesn't have a great record of using HTTP
effectively or correctly. Let's not make it worse.

In the case of quota, 403 is fairly reasonable because you are in
fact Forbidden from doing the thing you want to do. Yes, with the
passage of time you may very well not be forbidden so the semantics
are not strictly matching but it is more immediately expressive yet
not quite as troubling as 409 (which has a more specific meaning).


Except it's not, because you are saying to use 403 for 2 issues (Don't
have permissions and Out of quota).

Turns out, we have APIs for adjusting quotas, which your user might have
access to. So part of 403 space is something you might be able to code
yourself around, and part isn't. Which means you should always ignore it
and write custom logic client side.

Using something beyond 400 is *not* more expressive if it has more than
one possible meaning. Then it's just muddy. My point is that all errors
besides 400 should have *exactly* one cause, so they are specific.


Yes, agreed.

I think Sean makes an excellent point that if you have more than one 
condition that results in a 403 Forbidden, it actually does not make things more 
that results in a 403 Forbidden, it actually does not make things more 
expressive. It actually just means both humans and clients need to now 
delve deeper into the error context to determine if this is something 
they actually don't have permission to do, or whether they've exceeded 
their quota but otherwise have permission to do some action.


Best,
-jay

+1
The basic problem is we are trying to fit a square (generic api) peg in 
a round (HTTP request/response) hole.
But if we do say we are recognizing sub-error-codes, it might be good 
to actually give them numbers somewhere in the response (maybe an error 
code header) rather than relying on string matching to determine the 
real error. String matching is fragile and has icky i18n implications.
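
Purely as an illustration of that suggestion (the header and field names 
below are invented, not an existing OpenStack convention), such a response 
might look like:

HTTP/1.1 403 Forbidden
Content-Type: application/json
X-Compute-Error-Code: over-quota

{"forbidden": {"code": "over-quota",
               "message": "Instance quota exceeded for project demo"}}

so that clients can branch on the code value rather than parsing the message 
text.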


 -David


p.s. And, yes, Chris, I definitely do see your side of the coin on 
this. It's nuanced, and a grey area...





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [swift] Go! Swift!

2015-04-30 Thread David Kranz

On 04/30/2015 12:52 PM, Jay Pipes wrote:

On 04/30/2015 12:40 PM, John Dickinson wrote:

Swift is a scalable and durable storage engine for storing
unstructured data. It's been proven time and time again in production
in clusters all over the world.

We in the Swift developer community are constantly looking for ways
to improve the codebase and deliver a better quality codebase to
users everywhere. During the past year, the Rackspace Cloud Files
team has been exploring the idea of reimplementing parts of Swift in
Go. Yesterday, they released some of this code, called hummingbird,
for the first time. It's been proposed to a feature/hummingbird
branch in Swift's source repo.

https://review.openstack.org/#/c/178851

I am very excited about this work being in the greater OpenStack
Swift developer community. If you look at the patch above, you'll see
that there are various parts of Swift reimplemented in Go. During the
next six months (i.e. before Tokyo), I would like us to answer this
question:

What advantages does a compiled-language object server bring, and do
they outweigh the costs of using a different language?
Although I have come to like python in certain ways, here is my take on 
advantages:


1. Performance
2. Code understandability, when paired with a good IDE. Particularly 
with folks new to a code base. With static typing:
a) You can hover over any variable and know its type (and value 
when in the debugger)
b) You can find definitions and references discriminated using real 
name scopes, not getting false hits for different variables with the 
same name

c) You can examine static call graphs at any point in the code
d) The IDE can do refactoring for you without worrying about 
variable names that are the same even though they are in different scopes
e) check the features of any real modern IDE that uses static 
typing and type inference to understand the code


This sort of question has spawned many religious wars of course. 
Statically typed languages like Java can be very clunky to use with a 
lot of boilerplate code to write. It is easier to prototype things using 
a language like Python because you do not have to determine type 
structure up front. This is a double-edged sword IMO, when you are not 
prototyping.


Of course, there are a ton of things we need to explore on this
topic, but I'm happy that we'll be doing it in the context of the
open community instead of behind closed doors. We will have a
fishbowl session in Vancouver on this topic. I'm looking forward to
the discussion.


Awesome discussion topic. I've long argued that OpenStack should be 
the API, not the implementation, to allow for experimentation in other 
languages such as Golang.
I have always thought there were valid arguments on both sides for the 
API vs. implementation debate, but I don't think it is necessary to 
resolve that to proceed in such a direction. Seems this is more about 
the current OpenStack position that all implementations must be written 
in Python. There would be nothing (other than vociferous objections from 
some folks :-) ) to stop us from saying that OpenStack is an 
implementation and not an API, but not all OpenStack project 
implementations are required to use Python.


 -David




Kudos to the Rackspace Cloud Files team for this effort. I'll 
definitely dig into the code.





Best,
-jay




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] trimming down Tempest smoke tag

2015-04-28 Thread David Kranz

On 04/28/2015 06:38 AM, Sean Dague wrote:

The Tempest Smoke tag was originally introduced to provide a quick view
of your OpenStack environment to ensure that a few basic things were
working. It was intended to be fast.

However, during Icehouse the smoke tag was repurposed as a way to let
neutron not backslide (so it's massively overloaded with network tests).
It current runs at about 15 minutes on neutron jobs. This is why grenade
neutron takes *so* long, because we run tempest smoke twice.

The smoke tag needs a diet. I believe our working definition should be
something as follows:

  - Total run time should be fast (<= 5 minutes)
  - No negative tests
  - No admin tests
  - No tests that test optional extensions
  - No tests that test advanced services (like lbaas, vpnaas)
  - No proxy service tests

The criteria for a good set of tests is CRUD operations on basic
services. For instance, with compute we should have built a few servers,
ensure we can shut them down. For neutron we should have done some basic
network / port plugging.
That makes sense. On IRC, Sean and I agreed that this would include 
creation of users, projects, etc. So some of the keystone smoke tests 
will be left in even though they are admin tests. IMO, it is debatable 
whether admin is relevant as part of the criteria for smoke.
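
For reference, the smoke set is simply whatever tests carry the smoke 
attribute; a rough sketch of how a test is tagged (not an actual test from 
the tree):

from tempest.api.compute import base
from tempest import test


class ServersSmokeTest(base.BaseV2ComputeTest):

    @test.attr(type='smoke')
    def test_create_and_delete_server(self):
        # create a server, wait for it to go ACTIVE, then delete it
        pass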


We also previously had the 'smoke' tag include all of the scenario
tests, which was fine when we had 6 scenario tests. However as those
have grown I think that should be trimmed back to a few basic through
scenarios.

The results of this are -
https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:smoke,n,z

The impacts on our upstream gate will mean that grenade jobs will speed
up dramatically (20 minutes faster on grenade neutron).

There is one edge condition which exists, which is the
check-tempest-dsvm-neutron-icehouse job. Neutron couldn't pass either a
full or parallel tempest run in icehouse (it's far too racy). So that's
current running the smoke-serial tag. This would end up reducing the
number of tests run on that job. However, based on the number of
rechecks I've had to run in this series, that job is currently at about
a 30% fail rate - http://goo.gl/N2w7qc - which means some test reduction
is probably in order anyway, as it's mostly just preventing other people
from landing unrelated patches.

This was something we were originally planning on doing during the QA
Sprint but ran out of time. It looks like we'll plan to land this right
after Tempest 4 is cut this week, so that people that really want the
old behavior can stay on the Tempest 4 release, but master is moving
forward.

I think that once we trim down we can decide to add specific tests
later. I expect smoke to be a bit more fluid over time, so it's not a
tag that anyone should count on a test going into that tag and staying
forever.
Agreed. The criteria and purpose should stay the same but individual 
tests may be added or removed from smoke.

Thanks for doing this.

 -David


-Sean




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Question for the TC candidates

2015-04-24 Thread David Kranz

On 04/24/2015 10:42 AM, Chris Dent wrote:

On Fri, 24 Apr 2015, Ed Leafe wrote:


I read the downstream to mean what you refer to as people who
deploy workloads on them. In this context, I saw the operators as the
end-users of the work the devs do. If that gave the impression that I
don't care about people who actually run their stuff on the clouds we
build, I apologize - I was simply trying to answer what was asked.


I left the terminology intentional vague to allow people to interpret
them as they felt appropriate. To be clear I was thinking in these
sorts of ways:

* Downstream: The companies that are packaging and selling OpenStack
  in some fashion. I left these people out because I personally think
  the OpenStack project does _far_ too much to keep these people happy
  and should actually do less and since that is a somewhat
  contentious position I wanted to leave it out of the discussion (at
  least initially) as it would just muddy the waters.
Interesting. Can you list a few specific things we do as a project that 
are only for the benefit of such companies and that you believe we 
should stop doing?

 -David



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] official clients and tempest

2015-04-08 Thread David Kranz
Since tempest no longer uses the official clients as a literal code 
dependency, except for the cli tests which are being removed, the 
clients have been dropping from requirements.txt. But when debugging 
issues uncovered by tempest, or when debugging tempest itself, it is 
useful to use the cli to check various things. I think it would be a 
good service to users of tempest to include the client libraries when 
tempest is installed on a machine. Is there a reason to not do this?


 -David

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] official clients and tempest

2015-04-08 Thread David Kranz

On 04/08/2015 02:36 PM, Matthew Treinish wrote:

On Wed, Apr 08, 2015 at 01:08:03PM -0400, David Kranz wrote:

Since tempest no longer uses the official clients as a literal code
dependency, except for the cli tests which are being removed, the clients
have been dropping from requirements.txt. But when debugging issues
uncovered by tempest, or when debugging tempest itself, it is useful to use
the cli to check various things. I think it would be a good service to users
of tempest to include the client libraries when tempest is installed on a
machine. Is there a reason to not do this?


Umm, so that is not what requirements.txt is for, we should only put what is
required to run tempest in the requirements file. It's a package 
dependencies
list, not a list of everything you find useful for developing tempest code.
I was more thinking of users of tempest than developers of tempest, 
though it is useful to both.
But we can certainly say that this is an issue for those who provide 
tempest to users.


 -David




I get what you're going for but doing that as part of the tempest install is not
the right place for it. We can put it as a recommendation in the developer
documentation or have scripts somewhere which sets setups up a dev env or
something.

-Matt Treinish




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] [devstack] Confusion about member roles in tempest/devstack

2015-04-06 Thread David Kranz
There have been a number of changes in tempest recently, coordinated 
with devstack, that are a bit unclear.


The following values are defined in tempest config as defaults:

[auth]
# Roles to assign to all users created by tempest (list value)
#tempest_roles =

[object-storage]
# Role to add to users created for swift tests to enable creating
# containers (string value)
#operator_role = Member

[orchestration]
# Role required for users to be able to manage stacks (string value)
#stack_owner_role = heat_stack_owner

These are the values created in tempest.conf by devstack:

[auth]

tempest_roles = Member


[orchestration]
stack_owner_role = _member_

So a couple of questions.

Why do we have Member and _member_, and what is the difference supposed 
to be?


Experimentally, it seems that the tempest roles cannot be empty, so why 
is that the default?


The heat_stack_owner role used to be created in juno devstack but no 
longer. Is there a reason to leave this as the default?


Any explanations appreciated !

 -David


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] [devstack] Confusion about member roles in tempest/devstack

2015-04-06 Thread David Kranz

On 04/06/2015 03:14 PM, Matthew Treinish wrote:

On Mon, Apr 06, 2015 at 02:25:14PM -0400, David Kranz wrote:

There have been a number of changes in tempest recently, coordinated
with devstack, that are a bit unclear.

Well, the issue was that before tempest was making all sorts of incorrect
implicit assumptions about the underlying configuration. As part of the test
accounts part 2 bp [1] we needed to correct these and make things more explicit
which resulted in a number of changes around the configuration in tempest.

FWIW, I push to have detailed commit messages to try and make it clear from the
git log and explain the rationale behind changes like this.


The following values are defined in tempest config as defaults:

[auth]
# Roles to assign to all users created by tempest (list value)
#tempest_roles =

So this option is used to set roles on every user created by tenant isolation.
Outside of tenant isolation this option does nothing.


[object-storage]
# Role to add to users created for swift tests to enable creating
# containers (string value)
#operator_role = Member

[orchestration]
# Role required for users to be able to manage stacks (string value)
#stack_owner_role = heat_stack_owner

These are the values created in tempest.conf by devstack:

[auth]

tempest_roles = Member


[orchestration]
stack_owner_role = _member_

So a couple of questions.

Why do we have Member and _member_, and what is the difference supposed to
be?

IIRC _member_ is the default role with keystone v3 which is used to show
membership in a project. I'm sure Jamie or Morgan will correct me if I'm wrong
on this.


Experimentally, it seems that the tempest roles cannot be empty, so why is
that the default?

So, I'm surprised by this, the tests which require the role Member to be set on
the created users should be specifically requesting this now. (as part of the
test accounts bp we had to make these expectations explicit) It should only be
required for the swift tests that do container manipulation.[2] I'm curious to
see what you're hitting here. The one thing is from the git log there may be
an interaction here depending on the keystone api version you're using. [3] My
guess is that it's needed for using keystone v2 in a v3 env, or vice versa, but
I'm sure Andrea will chime in if this is wrong.
Seems right to me. I should have said it is the identity v3 tests that 
fail if you leave the default for tempest_roles. It does seem that 
Member is related to swift tests and it would be less confusing if this 
were called SwiftOperator instead of Member. The only hardcoded 
reference to Member in tempest now is in javelin and that is going to be 
removed https://review.openstack.org/#/c/169108/


Andrea, can you explain why the role that is required in the 
tempest_roles is Member?
If this is really the way it needs to be we should have this as the 
default in tempest.conf rather than having devstack always set it, no?


 -David





The heat_stack_owner role used to be created in juno devstack but no longer.
Is there a reason to leave this as the default?

IIRC, the use of explicit role was removed in kilo (and maybe backported into
juno?) and was replaced with the use of delegations. It removed the need for
an explicit role to manipulate heat stacks. The config option is necessary
because of branchless tempest considerations and that you might need a specific
role to perform stack operations. [4][5] The use of _member_ on master is to
indicate that the no special role is needed to perform stack operations. When
icehouse support goes eol we probably can remove this option from tempest.

-Matt Treinish

[1] 
http://specs.openstack.org/openstack/qa-specs/specs/test-accounts-continued.html
[2] 
http://git.openstack.org/cgit/openstack/tempest/commit/?id=8f26829e939a695732cd5a242dddf63a9a84ecb8
[3] 
http://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=72f026b60d350ede39e22e08b8f7f286fd0d2633
[4] 
http://git.openstack.org/cgit/openstack/tempest/commit/?id=db9721dfecd99421f89ca9e263a97271e5f79ca0
[5] 
http://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=886cbb2a86e475a7982df1d98ea8452d0f9873fd




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] [rally] How will Tempest discover/run tests migrated to specific projects?

2015-03-31 Thread David Kranz

On 03/30/2015 10:44 AM, Matthew Treinish wrote:

On Mon, Mar 30, 2015 at 12:21:18PM +0530, Rohan Kanade wrote:

Since tests can now be removed from Tempest 
https://wiki.openstack.org/wiki/QA/Tempest-test-removal and migrated to
their specific projects.

Does Tempest plan to discover/run these tests in tempest gates? If yes, how
is that going to be done?  Will there be a discovery mechanism in Tempest
to discover tests from individual projects?


No, the idea behind that wiki page is to outline the procedure for finding
something that is out of scope and doesn't belong in tempest and is also safe
to remove from the tempest jobs. The point of going through that entire
procedure is that the test being removed should not be run in the tempest gates
anymore and will become the domain of the other project.

Also, IMO the moved test ideally won't be in the same pattern of a tempest test
or have the same constraints of a tempest test and would ideally be more coupled
to the project under test's internals. So that wouldn't be appropriate to
include in a tempest run either.

For example, the first test we removed with that procedure was:

https://review.openstack.org/#/c/158852/

which removed the flavor negative tests from tempest. These were just testing
operations that would go no deeper than Nova's DB layer. Which was something
we couldn't verify in tempest. They also didn't really belong in tempest because
they were just implicitly verifying Nova's DB layer through API responses. The
replacement tests:

http://git.openstack.org/cgit/openstack/nova/tree/nova/tests/functional/wsgi/test_flavor_manage.py

were able to verify the state of the DB was correct and ensure the correct
behavior both in the api and nova's internals. This kind of testing is something
which doesn't belong in tempest or any other external test suite. It is also
what I feel we should be targeting for with project specific in-tree functional
testing and the kind of thing we should be using the removal process on that
wiki page for.


-Matt Treinish


Matt, while everything you say here is true, I don't think it answers 
the whole question. neutron is also planning to move the tempest 
networking tests into the neutron repo with safeguards to prevent 
incompatible changes, but also keeping the tests in a form that is not 
so different from tempest.


The problem is that deployers/users/refstack/etc. (let's call them 
verifiers) want an OpenStack functional verification suite. Until now 
that has been easy since most of what that requires is in tempest, and 
Rally calls tempest. But to a verifier, the fact that all the tests used 
for verification are in one tempest repo is an implementation detail. 
OpenStack verifiers do not want to lose neutron tests because they moved 
out of tempest. So verifiers will need to do something about this and it 
would be better if we all did it as a community by agreeing on a UX and 
method for locating and running all the tests that should be included in 
an OpenStack functional test suite. Even now, there are tests that are 
useful for verification that are not in tempest.


I think the answer that Boris gave 
http://lists.openstack.org/pipermail/openstack-dev/2015-March/060173.html is 
trying to address this by saying that Rally will take on the role of 
being the OpenStack verification suite (including performance tests). 
I don't know if that is the best answer and tempest/rally could agree on 
a UX/discovery/import mechanism, but I think we are looking at one of 
those two choices.


 -David


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] Question about is_admin_available()

2015-03-10 Thread David Kranz
In the process of writing a unit test for this I discovered that it can 
call out to keystone for a token under some configurations through the 
call to get_configured_credentials. This surprised me since I thought it 
would just check for the necessary admin credentials in either 
tempest.conf or accounts.yaml. Is this a bug?


 -David


def is_admin_available():
    is_admin = True
    # If tenant isolation is enabled admin will be available
    if CONF.auth.allow_tenant_isolation:
        return is_admin
    # Check whether test accounts file has the admin specified or not
    elif os.path.isfile(CONF.auth.test_accounts_file):
        check_accounts = accounts.Accounts(name='check_admin')
        if not check_accounts.admin_available():
            is_admin = False
    else:
        try:
            cred_provider.get_configured_credentials('identity_admin')
        except exceptions.InvalidConfiguration:
            is_admin = False
    return is_admin


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [heat][qa] Forward plan for heat scenario tests

2015-03-09 Thread David Kranz
Since test_server_cfn_init was recently moved from tempest to the heat 
functional tests, there are no subclasses of OrchestrationScenarioTest.
If there is no plan to add any more heat scenario tests to tempest I 
would like to remove that class. So I want to confirm that future 
scenario tests will go in the heat tree.


 -David

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] testing implementation-specific features not covered by OpenStack APIs

2015-03-03 Thread David Kranz

On 03/03/2015 11:28 AM, Radoslaw Zarzynski wrote:

As we know Tempest provides many great tests for verification of
conformance with OpenStack interfaces - the tempest/api directory is
full of such useful stuff. However, regarding the #1422728 ticket [1]
(dependency on private HTTP header of Swift), I think we all need to
answer for one single but fundamental question: which interfaces we
truly want to test? I see two options:

1) implementation-specific private interfaces (like the Swift interface),
2) well-specified and public OpenStack APIs (eg. the Object Storage
API v1 [2]).
As Jordan said, these two are one and the same. One could imagine a 
situation where there was an abstract object storage api
and swift was an implementation, but that view has been rejected by the 
OpenStack community many times (though not without some controversy).


I think that Tempest should not relay on any behaviour not specified
in public API (Object Storage API v1 in this case). Test for Swift-
specific features/extensions is better be shipped along with Swift
and actually it already has pretty good internal test coverage.
I agree, depending on what "specified" means. Lack of adequate 
documentation should not be equated with being unspecified for the 
purpose of determining test coverage criteria. This is partly addressed 
in the api stability document 
https://wiki.openstack.org/wiki/APIChangeGuidelines under "The 
existing API is not well documented".


As I already wrote in similar thread regarding Horizon, from my
perspective, the OpenStack is much more than yet another IaaS/PaaS
implementation or a bunch of currently developed components. I think
its main goal is to specify a universal set of APIs covering all
functional areas relevant for cloud computing, and to place that set
of APIs in front as many implementations as possible. Having an
open source reference implementation of a particular API is required
to prove its viability, but is secondary to having an open and
documented API. I am sure the same idea of interoperability should
stand behind Tempest - the OpenStack's Test Suite.
The community has (thus far) rejected the notion that our code is a 
reference implementation for an abstract api. But yes, tempest is 
supposed to be able to run against any OpenStack (TM?) cloud.


 -David



Regards,
Radoslaw Zarzynski

[1] https://bugs.launchpad.net/tempest/+bug/1422728
[2] http://developer.openstack.org/api-ref-objectstorage-v1.html

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Re-evaluating the suitability of the 6 month release cycle

2015-02-24 Thread David Kranz

On 02/24/2015 09:37 AM, Chris Dent wrote:

On Tue, 24 Feb 2015, Sean Dague wrote:


That also provides a very concrete answer to will people show up.
Because if they do, and we get this horizontal refactoring happening,
then we get to the point of being able to change release cadences
faster. If they don't, we remain with the existing system. Vs changing
the system and hoping someone is going to run in and backfill the 
breaks.


Isn't this the way of the world? People only put halon in the
machine room after the fire.

I agree that people showing up is a real concern, but I also think
that we shy away too much from the productive energy of stuff
breaking. It's the breakage that shows where stuff isn't good
enough.

[Flavio said]:

To this I'd also add that bug fixing is way easier when you have
aligned releases for projects that are expected to be deployed
together. It's easier to know what the impact of a change/bug is
throughout the infrastructure.


Can't this be interpreted as an excuse for making software which
does not have a low surface area and a good API?

(Note I'm taking a relatively unrealistic position for sake of
conversation.)
I'm not so sure about that. IMO, much of this goes back to the question 
of whether OpenStack services are APIs or implementations. This was 
debated with much heat at the Diablo summit (Hi Jay). I frequently have 
conversations where there is an issue about release X vs Y when it is 
really about api versions. Even if we say that we are about 
implementations as well as apis, we can start to organize our processes 
and code as if we were just apis. If each service had a well-defined, 
versioned, discoverable, well-tested api, then projects could follow 
their own release schedule, relying on distros or integrators to put the 
pieces together and verify the quality of the whole stack for the users. 
Such entities could still collaborate on that task, and still identify 
longer release cycles, using stable branches. The upstream project 
could still test the latest released versions together. Some of these 
steps are now being taken to resolve gate issues and horizontal resource 
issues. Doing this would vastly increase agility but with some costs:


1. The upstream project would likely have to give up on the worthy goal 
of providing an actual deployable stack that could be used as an 
alternative to AWS, etc. That saddens me, but for various reasons, 
including that we do no scale/performance testing on the upstream code, 
we are not achieving that goal anyway. The big tent proposals are also a 
move away from that goal.


2. We would have to give up on incompatible api changes. But with the 
replacement of nova v3 with microversions we are already doing that. 
Massive adoption with release agility is simply incompatible with 
allowing incompatible api changes.


Most of this is just echoing what Jay said. I think this is the way any 
SOA would be designed. If we did this, and projects released frequently, 
would there be a reason for anyone to be chasing master?


 -David


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova][Tempest] Tempest will deny extra properties on Nova v2/v2.1 API

2015-02-24 Thread David Kranz

On 02/24/2015 06:55 AM, Ken'ichi Ohmichi wrote:

Hi Ghanshyam,

2015-02-24 20:28 GMT+09:00 GHANSHYAM MANN ghanshyamm...@gmail.com:

On Tue, Feb 24, 2015 at 6:48 PM, Ken'ichi Ohmichi ken1ohmi...@gmail.com
wrote:

Hi

Nova team is developing Nova v2.1 API + microversions in this cycle,
and the status of Nova v2.1 API has been changed to CURRENT from
EXPERIMENTAL.
That said new API properties should be added via microversions, and
v2/v2.1 API(*without* microversions) should return the same response
without any new properties.
Now Tempest allows extra properties of a Nova API response because we
thought Tempest should not block Nova API development.

However, I think Tempest needs to deny extra properties in
non-microversions test cases because we need to block accidental
changes of v2/v2.1 API and encourage to use microversions for API
changes.
https://review.openstack.org/#/c/156130/ is trying to do that, but I'd
like to get opinions before that.

If the above change is merged, we can not use Tempest on OpenStack
environments which provide the original properties.


I think that will be nice to block additional properties.

Do you mean OpenStack environment with micro-versions enabled?
In those cases too tempest should run successfully as it requests on V2 or
V2.1 endpoint not on microversion.

My previous words were unclear, sorry.
The above OpenStack environment means the environment which is
customized by a cloud service provider and it returns a response which
includes the provider original properties.

On microversions discussion, we considered the customized API by
a cloud service provider for the design. Then I guess there are some
environments return extra properties and Tempest will deny them if
the patch is merged. I'd like to know the situation is acceptable or not
as Tempest purpose.
Ken'ichi, can you please provide a pointer to the referenced 
microversions discussion and/or summarize the conclusion?


The commit message is saying that returning extra values without a new 
microversion is an incompatible (disallowed) change. This was already 
true, unless creating a new extension, according to 
https://wiki.openstack.org/wiki/APIChangeGuidelines.


It seems to me that extra properties (unless they use a syntax marking 
them as such) are either allowed or not. If not, tempest should fail on 
them. If service providers are allowed to add returned properties without 
being required to use some special syntax to distinguish them, that is a 
bad API. If tempest can't tell the difference between a legitimately 
added property and someone misspelling an optional property in a 
response, I'm not sure how we can test for the unintentional change case.
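
To make the tradeoff concrete, here is a minimal sketch of the kind of 
enforcement being discussed, using an illustrative schema rather than 
the actual Nova response schema: with additionalProperties set to False, 
any property that is not explicitly listed, whether a provider addition 
or a misspelling, fails validation.

import jsonschema

flavor_schema = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "name": {"type": "string"},
        "ram": {"type": "integer"},
    },
    "required": ["id", "name", "ram"],
    "additionalProperties": False,
}

# A response matching the documented API passes.
jsonschema.validate({"id": "1", "name": "m1.tiny", "ram": 512},
                    flavor_schema)

# A response carrying a provider-specific extra property is rejected.
try:
    jsonschema.validate({"id": "1", "name": "m1.tiny", "ram": 512,
                         "vendor:foo": "bar"}, flavor_schema)
except jsonschema.ValidationError as e:
    print(e.message)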


 -David



Thanks
Ken Ohmichi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





[openstack-dev] [qa][swift] Signature of return values in tempest swift client

2015-02-13 Thread David Kranz
Almost all of the OpenStack REST APIs return little of user value in the 
response headers, with JSON bodies containing the returned data. The 
tempest client methods had been returning two values, with one always 
being ignored. To clean that up before moving the service clients to 
tempest-lib, we changed the client methods to return one value instead, 
as recorded in this blueprint: 
https://blueprints.launchpad.net/tempest/+spec/clients-return-one-value.


This is mostly done except for swift. Swift is different in that most 
interesting data is in the headers except for GET methods, and applying 
the same methodology as the others does not make sense to me. There are 
various ways the swift client could be changed to return one value, or 
it could be left as is. I am soliciting proposals from those most 
interested in the swift tests. If there is no input or consensus on how 
this should be changed, I will leave the swift client as-is and close 
the blueprint.
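
One option, sketched only roughly here and not meant as the exact code, 
is to wrap the parsed body in an object that also carries the response 
headers, so the client returns a single value but the headers, which 
matter most for swift, stay reachable:

class ResponseBody(dict):
    """Dict-like body that also carries the HTTP response headers."""

    def __init__(self, response, body=None):
        super(ResponseBody, self).__init__(body or {})
        self.response = response


# Hypothetical use inside a client method:
#     resp, body = self.get('containers/%s' % container_name)
#     return ResponseBody(resp, json.loads(body))
# Callers that only care about the body treat the result as a dict, and
# callers that need headers (the common swift case) read .response.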


 -David

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [stable] juno is fubar in the gate

2015-02-10 Thread David Kranz

On 02/10/2015 10:35 AM, Matthew Treinish wrote:

On Tue, Feb 10, 2015 at 11:19:20AM +0100, Thierry Carrez wrote:

Joe, Matt  Matthew:

I hear your frustration with broken stable branches. With my
vulnerability management team member hat, responsible for landing
patches there with a strict deadline, I can certainly relate with the
frustration of having to dive in to unbork the branch in the first
place, rather than concentrate on the work you initially planned on doing.

That said, wearing my stable team member hat, I think it's a bit unfair
to say that things are worse than they were and call for dramatic
action. The stable branch team put a structure in place to try to
continuously fix the stable branches rather than reactively fix it when
we need it to work. Those champions have been quite active[1] unbreaking
it in the past months. I'd argue that the branch is broken much less
often than it used to. That doesn't mean it's never broken, though, or
that those people are magicians.

I don't at all for 2 reasons. The first being in every discussion we had at 2
summits I raised the increased maint. burden for a longer support window and
was told that people were going to stand up so it wouldn't be an issue. I have
yet to see that happen. I have not seen anything to date that would convince
me that we are at all ready to be maintaining 3 stable branches at once.

The second is while I've seen that etherpad, I still view there still being a
huge disconnect here about what actually maintaining the branches requires. The
issue which I'm raising is about issues related to the gating infrastructure and
how to ensure that things stay working. There is a non-linear overhead involved
with making sure any gating job stays working. (on stable or master) People need
to take ownership of jobs to make sure they keep working.


One issue in the current situation is that the two groups (you and the
stable maintainers) seem to work in parallel rather than collaborate.
It's quite telling that the two groups maintained separate etherpads to
keep track of the fixes that needed landing.

I don't actually view it as that. Just looking at the etherpad it has a very
small subset of the actual types of issues we're raising here.

For example, there was a week in late Nov. when 2 consecutive oslo project
releases broke the stable gates. After we unwound all of this and landed the
fixes in the branches the next step was to changes to make sure we didn't allow
breakages in the same way:

http://lists.openstack.org/pipermail/openstack-dev/2014-November/051206.html

This was also happened at the same time as a new testtools stack release which
broke every branch (including master). Another example is all of the setuptools
stack churn from the famed Christmas releases. That was another critical
infrastructure piece that fell apart and was mostly handled by the infra team.
All of these things are getting fixed because they have to be, to make sure
development on master can continue not because those with a vested interest in
the stable branches working for 15 months are working on them.

The other aspect here are development efforts to make things more stable in this
space. Things like the effort to pin the requirements on stable branches which
Joe is spearheading. These are critical to the long term success of the stable
branches yet no one has stepped up to help with it.

I view this as a disconnect between what people think maintaining a stable
branch means and what it actually entails. Sure, the backporting of fixes to
intermittent failures is part of it. But, the most effort is spent on making
sure the gating machinery stays well oiled and doesn't breakdown.


[1] https://etherpad.openstack.org/p/stable-tracker

Matthew Treinish wrote:

So I think it's time we called the icehouse branch and marked it EOL. We
originally conditioned the longer support window on extra people stepping
forward to keep things working. I believe this latest issue is just the latest
indication that this hasn't happened. Issue 1 listed above is being caused by
the icehouse branch during upgrades. The fact that a stable release was pushed
at the same time things were wedged on the juno branch is just the latest
evidence to me that things aren't being maintained as they should be. Looking at
the #openstack-qa irc log from today or the etherpad about trying to sort this
issue should be an indication that no one has stepped up to help with the
maintenance and it shows given the poor state of the branch.

I disagree with the assessment. People have stepped up. I think the
stable branches are less often broken than they were, and stable branch
champions (as their tracking etherpad shows) have made a difference.
There just has been more issues as usual recently and they probably
couldn't keep track. It's not a fun job to babysit stable branches,
belittling the stable branch champions' results is not the best way to
encourage them to continue in this position. I 

Re: [openstack-dev] [stable] juno is fubar in the gate

2015-02-10 Thread David Kranz

On 02/10/2015 12:20 PM, Jeremy Stanley wrote:

On 2015-02-10 11:50:28 -0500 (-0500), David Kranz wrote:
[...]

I would rather give up branchless tempest than the ability for
real distributors/deployers/operators to collaborate on stable
branches.

[...]

Keep in mind that branchless tempest came about in part due to
downstream use cases as well, not merely as a means to simplify our
testing implementation. Specifically, the interoperability (defcore,
refstack) push was for a testing framework and testset which could
work against multiple deployed environments regardless of what
release(s) they're running and without having to decide among
multiple versions of a tool to do so (especially since they might be
mixing components from multiple OpenStack integrated releases at any
given point in time).
Yes, but that goes out the window in the real world because tempest is 
not really branchless when we periodically
throw out older releases, as we must.  And the earlier we toss out 
things like icehouse, the less branchless it is from the 
interoperability perspective.
Also, tempest is really based on api versions of services, not 
integrated releases, so I'm not sure where mixing components comes into 
play.


In any event, this is a tradeoff, and since refstack or whoever has to 
deal with releases that are no longer supported upstream anyway,
they could just adopt whatever the solution is from the get-go. That said, 
I feel like the current situation is caused by a perfect storm of 
branchless tempest, unpinned versions, and running multiple releases on 
the same machine, so there could be other ways to untangle things. I just 
think it is a bad idea to throw the concept of stable branches overboard 
just because the folks who care about it can't deal with the current 
complexity. Once we simplify it, some way or another, I am sure more 
folks will step up or those who have already can get more done.


 -David


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack Foundation] Finding people to work on the EC2 API in Nova

2015-02-06 Thread David Kranz

On 02/06/2015 07:49 AM, Sean Dague wrote:

On 02/06/2015 07:39 AM, Alexandre Levine wrote:

Rushi,

We're adding new tempest tests into our stackforge-api/ec2-api. The
review will appear in a couple of days. These tests will be good for
running against both nova/ec2-api and stackforge/ec2-api. As soon as
they are there, you'll be more than welcome to add even more.

Best regards,
   Alex Levine


Honestly, I'm more more pro having the ec2 tests in a tree that isn't
Tempest. Most Tempest reviewers aren't familiar with the ec2 API, their
focus has been OpenStack APIs.

Having a place where there is a review team that is dedicated only to
the EC2 API seems much better.

-Sean


+1

 And once similar coverage to the current tempest ec2 tests is 
achieved, either by copying from tempest or creating anew, we should 
remove the ec2 tests from tempest.


 -David


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] Prototype of the script for Tempest auto-configuration

2015-02-04 Thread David Kranz

On 01/26/2015 09:39 AM, Timur Nurlygayanov wrote:

Hi,

Sorry for the late reply. Was on vacation.


*Yaroslav*, thank you for raising the question. I really like this 
feature. I discussed this script with several people during the 
OpenStack summit in Paris and heard many of the same things - we need 
something like this to execute tempest tests automatically for 
validation of different production and test OpenStack clouds. It is a 
real pain to create separate scripts for each project/team to 
configure Tempest for specific configurations/installations, because 
the tempest configuration file can change and we would then need to 
update our scripts.


We need to discuss, first of all, what we need to change in this 
script before it can be merged.
As I can see, the spec description [1] does not fully match the current 
implementation [2] and the spec looks really general - probably we 
could write a separate 'simple' spec for this script and just abandon 
the current spec, or update the spec to bring it in sync with the script?

Good idea.


*David*, we found many issues with the current version of the script; 
many tempest tests failed for our custom OpenStack configurations (for 
example, with and without Swift or Ceph), and we have our own scripts 
which can already solve the problem. Can we join you and edit the 
patch together? (Or we can describe our ideas in comments on the 
patch.)

I welcome edits to this patch.

 -David


Also, looks like we need review from Tempest core team - they can 
write more valuable comments and suggest some cool ideas for the 
implementation.


[1] https://review.openstack.org/#/c/94473
[2] https://review.openstack.org/#/c/133245


On Fri, Jan 23, 2015 at 7:12 PM, Yaroslav Lobankov 
yloban...@mirantis.com mailto:yloban...@mirantis.com wrote:


Hello everyone,

I would like to discuss the following patch [1] for Tempest. I
think that such feature
as auto-configuration of Tempest would be very useful for many
engineers and users.
I have recently tried to use the script from [1]. I rebased the
patch on master and ran the script.
The script was finished without any errors and the tempest.conf
was generated! Of course,
this patch needs a lot of work, but the idea looks very cool!

Also I would like to thank David Kranz for his working on initial
version of the script.

Any thoughts?

[1] https://review.openstack.org/#/c/133245

Regards,
Yaroslav Lobankov.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe:
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




--

Timur,
Senior QA Engineer
OpenStack Projects
Mirantis Inc


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [QA] Meeting Thursday January 15th at 17:00 UTC

2015-01-14 Thread David Kranz

Hi everyone,

Just a quick reminder that the weekly OpenStack QA team IRC meeting will be
tomorrow Thursday, January 15th at 17:00 UTC in the #openstack-meeting
channel.

The agenda for tomorrow's meeting can be found here:
https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
Anyone is welcome to add an item to the agenda.

It's also worth noting that a few weeks ago we started having a regular
dedicated Devstack topic during the meetings. So if anyone is interested in
Devstack development please join the meetings to be a part of the discussion.

To help people figure out what time 17:00 UTC is in other timezones tomorrow's
meeting will be at:

12:00 EST
02:00 JST
03:30 ACDT
18:00 CET
11:00 CST
09:00 PST

-David Kranz


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] Moving tempest clients to tempest-lib (was Meeting Thursday January 8th at 22:00 UTC)

2015-01-13 Thread David Kranz

On 01/08/2015 05:34 AM, Ken'ichi Ohmichi wrote:

Hi,

Unfortunately, I cannot join tomorrow meeting.
So I'd like to share the progress of tempest-lib RestClient
dev before the meeting.

As Paris summit consensus, we have a plan to move RestClient
from tempest to tempest-lib for moving API tests to each project
in the future. And we are cleaning the code of RestClient up in
tempest now. The progress will be complete with some patches[1].
After merging them, I will move the code to tempest-lib.

This dev requires many patches/reviews, and many people have
already worked well. Thank you very much for helping this dev,
and I appreciate continuous effort.

[1]: 
https://review.openstack.org/#/q/status:open+project:openstack/tempest+branch:master+topic:rest-client,n,z

Thanks
Ken Ohmichi
Ken, I have a question about this. The end goal is to move the service 
clients, so they must also be free of CONF references. But your 
current changes create a ServiceClient that still uses CONF in its 
constructor rather than taking those values as arguments, so I'm not 
sure what ServiceClient is adding. I also think whatever class the 
service clients inherit from cannot contain CONF values?


I was assuming the final arrangement would be something like, using 
neutron as an example:


tempest_lib.RestClient(all needed args)
    tempest_lib.NeutronClient(all needed args to super)
        tempest.NeutronClient(pass CONF values to super)

and where the tempest_lib neutron client would be used by neutron tests 
either through inheritance or delegation. Is that different than your 
vision?
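
To make that concrete, here is a very rough sketch of the layering I 
have in mind; the class names and constructor arguments are 
illustrative, not the actual tempest/tempest-lib interfaces:

class RestClient(object):                     # would live in tempest-lib
    def __init__(self, auth_provider, service, region):
        self.auth_provider = auth_provider
        self.service = service
        self.region = region

    def get(self, url):
        # Stand-in for the real HTTP machinery.
        return {'url': url, 'service': self.service}


class NetworksClient(RestClient):             # also tempest-lib, CONF-free
    def list_networks(self):
        return self.get('/v2.0/networks')


class TempestNetworksClient(NetworksClient):  # thin wrapper kept in tempest
    def __init__(self, auth_provider, conf):
        # Only this layer reads tempest configuration, passing plain
        # values down so the library classes never see CONF.
        super(TempestNetworksClient, self).__init__(
            auth_provider, conf.network.catalog_type, conf.network.region)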


 -David


---

2015-01-08 2:44 GMT+09:00 David Kranz dkr...@redhat.com:

Hi everyone,

Just a quick reminder that the weekly OpenStack QA team IRC meeting will be
tomorrow Thursday, January 8th at 22:00 UTC in the #openstack-meeting
channel.

The agenda for tomorrow's meeting can be found here:
https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
Anyone is welcome to add an item to the agenda.

It's also worth noting that a few weeks ago we started having a regular
dedicated Devstack topic during the meetings. So if anyone is interested in
Devstack development please join the meetings to be a part of the
discussion.

To help people figure out what time 22:00 UTC is in other timezones
tomorrow's
meeting will be at:

17:00 EST
07:00 JST
08:30 ACDT
23:00 CET
16:00 CST
14:00 PST

-David Kranz


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [QA] Meeting Thursday January 8th at 22:00 UTC

2015-01-07 Thread David Kranz

Hi everyone,

Just a quick reminder that the weekly OpenStack QA team IRC meeting will be
tomorrow Thursday, January 8th at 22:00 UTC in the #openstack-meeting
channel.

The agenda for tomorrow's meeting can be found here:
https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
Anyone is welcome to add an item to the agenda.

It's also worth noting that a few weeks ago we started having a regular
dedicated Devstack topic during the meetings. So if anyone is interested in
Devstack development please join the meetings to be a part of the discussion.

To help people figure out what time 22:00 UTC is in other timezones tomorrow's
meeting will be at:

17:00 EST
07:00 JST
08:30 ACDT
23:00 CET
16:00 CST
14:00 PST

-David Kranz

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [all] Proper use of 'git review -R'

2014-12-30 Thread David Kranz
Many times when I review a revision of an existing patch, I can't see 
just the change from the previous version due to other rebases. The 
git-review documentation mentions this issue and suggests using -R to 
make life easier for reviewers when submitting new revisions. Can some 
one explain when we should *not* use -R after doing 'git commit 
--amend'? Or is using -R just something that should be done but many 
folks don't know about it?


-David

From git-review doc:

-R, --no-rebase
Do not automatically perform a rebase before submitting the
change to Gerrit.

When submitting a change for review, you will usually want it to
be based on the tip of upstream branch in order to avoid possible
conflicts. When amending a change and rebasing the new patchset,
the Gerrit web interface will show a difference between the two
patchsets which contains all commits in between. This may confuse
many reviewers that would expect to see a much simpler differ‐
ence.


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Proper use of 'git review -R'

2014-12-30 Thread David Kranz

On 12/30/2014 11:37 AM, Jeremy Stanley wrote:

On 2014-12-30 09:46:35 -0500 (-0500), David Kranz wrote:
[...]

Can some one explain when we should *not* use -R after doing 'git
commit --amend'?

[...]

In the standard workflow this should never be necessary. The default
behavior in git-review is to attempt a rebase and then undo it
before submitting. If the rebase shows merge conflicts, the push
will be averted and the user instructed to deal with those
conflicts. Using -R will skip this check and allow you to push
changes which can't merge due to conflicts.


 From git-review doc:

-R, --no-rebase
Do not automatically perform a rebase before submitting the
change to Gerrit.

When submitting a change for review, you will usually want it to
be based on the tip of upstream branch in order to avoid possible
conflicts. When amending a change and rebasing the new patchset,
the Gerrit web interface will show a difference between the two
patchsets which contains all commits in between. This may confuse
many reviewers that would expect to see a much simpler differ‐
ence.

While not entirely incorrect, it could stand to be updated with
slightly more clarification around the fact that git-review (since
around 1.16 a few years ago) does not push an automatically rebased
change for you unless you are using -F/--force-rebase.

If you are finding changes which are gratuitously rebased, this is
likely either from a contributor who does not use the recommended
change update workflow, has modified their rebase settings or
perhaps is running a very, very old git-review version.
Thanks for the replies. The rebases I was referring to are not 
gratuitous; they just make it harder for the reviewer. I take a few 
things away from this.


1. This is really a UI issue, and one that is experienced by many. What 
is desired is an option to look at different revisions of the patch that 
show only what the author actually changed, unless there was a conflict.


2. Using -R is dangerous unless you really know what you are doing. The 
doc string makes it sound like an innocuous way to help reviewers.


 -David



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] tempest stable/icehouse builds are broken

2014-12-29 Thread David Kranz
Some kind of regression has caused stable/icehouse builds to fail, which 
is preventing any code from merging in tempest. This is being tracked 
at https://bugs.launchpad.net/python-heatclient/+bug/1405579. Jeremy 
(fungi) provided a hacky work-around here 
https://review.openstack.org/#/c/144347/ which I hope can soon be +A'd by 
a tempest core until there is some better fix.


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] neutron client returns one value has finally merged

2014-12-19 Thread David Kranz

Neutron patches can resume as  normal. Thanks for the patience.

 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] Please do not merge neutron test changes until client returns one value is merged

2014-12-17 Thread David Kranz
This https://review.openstack.org/#/c/141152/ gets rid of the useless second 
return value from neutron client methods according to this spec: 
https://github.com/openstack/qa-specs/blob/master/specs/clients-return-one-value.rst.

Because the client and test changes have to be in the same patch, this one is 
very large. So please let it merge before any other neutron stuff. 
Any neutron patches will require the simple change of removing the unused first 
return value from neutron client methods. Thanks!
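
To be explicit about what that change looks like in a test, here is a 
trivial sketch; the client below is a stand-in, not the real tempest 
class:

class FakeNetworksClient(object):
    def list_networks(self):
        return {'networks': []}   # a single value, not a (resp, body) tuple


client = FakeNetworksClient()

# Before the change, callers unpacked and discarded the first value:
#     resp, body = client.list_networks()
# After the change, they simply take the body:
body = client.list_networks()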

 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Reason for mem/vcpu ratio in default flavors

2014-12-11 Thread David Kranz
Perhaps this is a historical question, but I was wondering how the 
default OpenStack flavor memory-to-vCPU ratio of 2/1 was determined? 
According to http://aws.amazon.com/ec2/instance-types/, EC2 defines the 
flavors for General Purpose (M3) at about 3.7/1, with Compute Intensive 
(C3) at about 1.9/1 and Memory Intensive (R3) at about 7.6/1.


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] branchless tempest and the use of 'all' for extensions in tempest.conf

2014-12-03 Thread David Kranz
A recently proposed tempest test was making explicit calls to the nova 
extension discovery API rather than using test.requires_ext. The reason 
was that we configure tempest.conf in the gate with 'all' for 
extensions, and the test involved an extension that was new in Juno, so 
the icehouse run failed. Since the methodology of branchless tempest 
requires that new conf flags be added for new features, we should stop 
having devstack configure with 'all'. Does anyone disagree with that, 
or have a better solution?
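
For anyone not familiar with the mechanism, here is a self-contained 
sketch of the check that test.requires_ext relies on (names simplified, 
not the exact tempest code). Configuring 'all' makes every check pass, 
which is how a Juno-only extension slips through on an icehouse job:

def is_extension_enabled(extension_name, configured_extensions):
    if not configured_extensions:
        return False
    if 'all' in configured_extensions:
        return True
    return extension_name in configured_extensions


# 'os-some-new-extension' is a made-up name standing in for an
# extension that only exists in Juno.
print(is_extension_enabled('os-some-new-extension', ['all']))          # True
print(is_extension_enabled('os-some-new-extension',
                           ['os-flavor-access']))                      # False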


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Plans for removing xml support

2014-11-12 Thread David Kranz
Code has started going into tempest for several projects 
(nova, neutron, keystone) to allow removal of XML support in kilo. There 
have been many (heated) on-and-off threads about this on the list over 
the years. I'm sure many projects would like to do this, but there is 
evidence that not all have an understanding that this is OK. Is this a 
TC issue? If so, could there be a clear statement about XML support?


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA][All] Prelude to functional testing summit discussions

2014-10-30 Thread David Kranz

On 10/30/2014 07:49 AM, Sean Dague wrote:

On 10/29/2014 12:30 PM, Matthew Treinish wrote:

Hi everyone,

Before we start the larger discussion at summit next week about the future of
testing in OpenStack - specifically about spinning up functional testing and how
it relates to tempest - I would like to share some of my thoughts on how we can
get things started and how I think they'll eventually come together.

Currently in tempest we have a large number of tests (mostly api-focused)
which are probably a better fit for a project's functional test suite. The best
example I can think of is the nova flavors tests. Validation of flavor
manipulation doesn't need to run in the integrated test suite on every commit to
every project because it only requires Nova. A simple win for initiating in-tree
functional testing would be to move these kinds of tests into the projects and
run the tests from the project repos instead of from tempest.
I think a lot of the negative API testing is also a great thing to be 
done back at the project level. All of that testing should be able to 
work without a full OpenStack, as it should be caught and managed by 
the API service and never get any further than that.



This would have the advantage of making tempest slimmer for every project
and begin the process of getting projects to take responsibility for their
functional testing rather than relying on tempest. As tests are moved tempest
can start to become the integration test suite it was meant to be. It would
retain only tests that involve multiple projects and stop being the OpenStack
black box testing suite. I think that this is the right direction for tempest
moving forward, especially as we move to having project-specific functional
testing.

Doing this migration is dependent on some refactors in tempest and moving
the required bits to tempest-lib so they can be easily consumed by the
other projects. This will be discussed at summit, is being planned
for implementation this cycle, and is similar to what is currently in progress
for the cli tests.

The only reason this testing existed in tempest in the first place was as
mechanism to block and then add friction against breaking api changes. Tempest's
api testing has been been pretty successful at achieving these goals. We'll want
to ensure that migrated tests retain these characteristics. If we are using
clients from tempest-lib we should get this automatically since to break
the api you'd have to change the api client. Another option proposed was to
introduce a hacking rule that would block changes to api tests at the same time
other code was being changed.

There is also a concern for external consumers of tempest if we move the tests
out of the tempest tree (I'm thinking refstack). I think the solution is
to maintain a load_tests discovery method inside of tempest or elsewhere that
will run the appropriate tests from the other repos for something like refstack.
Assuming that things are built in a compatible way using the same framework then
running the tests from separate repos should be a simple matter of pointing the
test runner in the right direction.
I think we can see where this takes us. I'm still skeptical of cross 
project loading of tests because it's often quite fragile. However, if 
you look at what refstack did they had a giant evaluation of all of 
tempest and pruned a bunch of stuff out. I would imagine maybe there 
is a conversation there about tests that refstack feels are important 
to stay in Tempest for their validation reasons. I think having a few 
paths that are tested both in Tempest and in project functional tests 
is not a bad thing.
Refstack is not the only thing that cares about validation of real 
clouds. As we move forward with this, it would be good to separate the 
issue of which repo a functional test lives in from whether that test 
can be run against a real cloud. IMO, overuse of mocking 
(broadly defined) in functional tests should be avoided unless it is 
configurable to also work in an unmocked fashion. Whether the way to 
combine all of the functional tests is by cross-project loading of tests 
or by some other means is more of an implementation detail.


But I think that's an end of cycle at best discussion.

Also, there probably need to be a few discussions anyway of 
refstack/tempest/defcore. The fact that Keystone was dropped from 
defcore because there were no non admin Keystone tests explicitly in 
Tempest (even though we make over 5000 keystone non admin API calls 
over a tempest run) was very odd. That is something that could have 
been fixed in a day.



I also want to comment on the role of functional testing. What I've proposed
here is only one piece of what project specific functional testing should be
and just what I feel is a good/easy start. I don't feel that this should be
the only testing done in the projects.  I'm suggesting this as a first
step because the tests already exist and it should be a relatively 

Re: [openstack-dev] [QA][All] Prelude to functional testing summit discussions

2014-10-30 Thread David Kranz

On 10/30/2014 09:52 AM, Sean Dague wrote:

On 10/30/2014 09:33 AM, David Kranz wrote:

On 10/30/2014 07:49 AM, Sean Dague wrote:

On 10/29/2014 12:30 PM, Matthew Treinish wrote:

Hi everyone,

Before we start the larger discussion at summit next week about the future of
testing in OpenStack - specifically about spinning up functional testing and how
it relates to tempest - I would like to share some of my thoughts on how we can
get things started and how I think they'll eventually come together.

Currently in tempest we have a large number of tests (mostly api-focused)
which are probably a better fit for a project's functional test suite. The best
example I can think of is the nova flavors tests. Validation of flavor
manipulation doesn't need to run in the integrated test suite on every commit to
every project because it only requires Nova. A simple win for initiating in-tree
functional testing would be to move these kinds of tests into the projects and
run the tests from the project repos instead of from tempest.
I think a lot of the negative API testing is also a great thing to 
be done back at the project level. All of that testing should be 
able to work without a full OpenStack, as it should be caught and 
managed by the API service and never get any further than that.



This would have the advantage of making tempest slimmer for every project
and begin the process of getting projects to take responsibility for their
functional testing rather than relying on tempest. As tests are moved tempest
can start to become the integration test suite it was meant to be. It would
retain only tests that involve multiple projects and stop being the OpenStack
black box testing suite. I think that this is the right direction for tempest
moving forward, especially as we move to having project-specific functional
testing.

Doing this migration is dependent on some refactors in tempest and moving
the required bits to tempest-lib so they can be easily consumed by the
other projects. This will be discussed at summit, is being planned
for implementation this cycle, and is similar to what is currently in progress
for the cli tests.

The only reason this testing existed in tempest in the first place was as
mechanism to block and then add friction against breaking api changes. Tempest's
api testing has been been pretty successful at achieving these goals. We'll want
to ensure that migrated tests retain these characteristics. If we are using
clients from tempest-lib we should get this automatically since to break
the api you'd have to change the api client. Another option proposed was to
introduce a hacking rule that would block changes to api tests at the same time
other code was being changed.

There is also a concern for external consumers of tempest if we move the tests
out of the tempest tree (I'm thinking refstack). I think the solution is
to maintain a load_tests discovery method inside of tempest or elsewhere that
will run the appropriate tests from the other repos for something like refstack.
Assuming that things are built in a compatible way using the same framework then
running the tests from separate repos should be a simple matter of pointing the
test runner in the right direction.
I think we can see where this takes us. I'm still skeptical of cross 
project loading of tests because it's often quite fragile. However, 
if you look at what refstack did they had a giant evaluation of all 
of tempest and pruned a bunch of stuff out. I would imagine maybe 
there is a conversation there about tests that refstack feels are 
important to stay in Tempest for their validation reasons. I think 
having a few paths that are tested both in Tempest and in project 
functional tests is not a bad thing.
Refstack is not the only thing that cares about validation of real 
clouds. As we move forward with this, it would be good to separate 
the issue of which repo a functional test lives in from whether that 
test can be run against a real cloud. IMO, overuse of 
mocking (broadly defined) in functional tests should be avoided 
unless it is configurable to also work in an unmocked fashion. 
Whether the way to combine all of the functional tests is by 
cross-project loading of tests or by some other means is more of an 
implementation detail.
Part of the perspective I'm bringing in is actually knowing what to do 
when your tests fail. Using Tempest against real clouds is great, 
people should keep doing that. But if you are rolling out a real cloud 
yourself, in the future you should be running the functional tests in 
staging to ensure you are functioning. Those will also provide you, 
hopefully, with a better path to understand what's wrong.
Sean, sorry if I was unclear. By real clouds, I just meant the tests 
should be able to use OpenStack APIs with no mocking.


 -David





This will mean that as an arbitrary 3rd party accessing a public 
cloud, you don't have a test suite that pushes every button of the 
cloud. But I

Re: [openstack-dev] [QA][All] Prelude to functional testing summit discussions

2014-10-30 Thread David Kranz

On 10/30/2014 11:12 AM, Sean Dague wrote:

On 10/30/2014 10:47 AM, Eoghan Glynn wrote:

Matthew wrote:

This would have the advantage of making tempest slimmer for every project
and begin the process of getting projects to take responsibility for their
functional testing rather than relying on tempest.

[much snipping]


Sean wrote:

Ok, so part of this remains to be seen about what the biggest bang for the
buck is. The class of bugs I feel like we need to nail in Nova right now are
going to require tests that bring up pieces of the wsgi stack, but are
probably not runable on a real deploy. Again, this is about debugability.

So this notion of the biggest bang for our buck is an aspect of the drive
for in-tree functional tests, that's not entirely clear to me as yet.

i.e. whether individual projects should be prioritizing within this effort:

(a) the creation of net-new coverage for scenarios (especially known or
 suspected bugs) that were not previously tested, in a non-unit sense

(b) the relocation of existing integration test coverage from Tempest to
 the project trees, in order to make the management of Tempest more
 tractable

It feels like there may be a tension between (a) and (b) in terms of the
pay-off for this effort. I'd interested in hearing other opinions on this,
on what aspect projects are expecting (and expected) to concentrate on
initially.

For what it's worth I have a bunch of early targets listed for Nova for
our summit session -
https://etherpad.openstack.org/p/kilo-nova-functional-testing

My focus in kilo is going to be first about A), as that provides value
out of the gate (pun intended). Then peel off some stuff from B as makes
sense.

-Sean

That seems sensible from the point of view of nova and of overall 
health, and not all projects have to pursue the same priorities at the 
same time. But a big part of the benefit of (b) is the impact it has on 
all the other projects, which would stop getting as many gate 
failures, and that benefit could be achieved right now by simply 
changing the set of tempest tests that run against each project.


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ceilometer] [qa] [oslo] Declarative HTTP Tests

2014-10-24 Thread David Kranz

On 10/23/2014 06:27 AM, Chris Dent wrote:


I've proposed a spec to Ceilometer

   https://review.openstack.org/#/c/129669/

for a suite of declarative HTTP tests that would be runnable both in
gate check jobs and in local dev environments.

There's been some discussion that this may be generally applicable
and could be best served by a generic tool. My original assertion
was let's make something work and then see if people like it but I
thought I also better check with the larger world:

* Is this a good idea?

I think so


* Do other projects have similar ideas in progress?
Tempest faced a similar problem around negative tests in particular. We 
have code in tempest that automatically generates a series of negative
test cases based on illegal variations of a schema. If you want to look 
at it, the NegativeAutoTest class is probably a good place to start. We have
discussed using a similar methodology for positive test cases but never 
did anything with that.


Currently only a few of the previous negative tests have been replaced 
with auto-gen tests. In addition to the issue of how to represent the 
schema, the other major issue we encountered was the need to create 
resources used by the auto-generated tests and a way to integrate a 
resource description into the schema. We use JSON for the schema and 
hoped one day to be able to receive base schemas from the projects 
themselves.
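
As an illustration (not a verbatim copy of anything in the tempest 
tree), a descriptor for the generator looks roughly like the following, 
expressed here as a Python dict. The 'resources' entry is the part that 
caused us trouble, since the framework has to create or look up a 
flavor before it can generate the invalid variations:

negative_test_descriptor = {
    "name": "get-flavor-details",
    "http-method": "GET",
    "url": "flavors/%s",
    "resources": [
        # The generator substitutes an invalid flavor id here and
        # expects the API to answer with a 404.
        {"name": "flavor", "expected_result": 404}
    ]
}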


* Is this concept something for which a generic tool should be
  created _prior_ to implementation in an individual project?

* Is there prior art? What's a good format?
Marc Koderer and I did a lot of searching and asking folks if there was 
some python code that we could use as a starting point but in the end 
did not find anything. I do not have a list of what we considered and 
rejected.


 -David


Thanks.




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] periodic jobs for master

2014-10-22 Thread David Kranz

On 10/22/2014 06:07 AM, Thierry Carrez wrote:

Ihar Hrachyshka wrote:

[...]
For stable branches, we have so called periodic jobs that are
triggered once in a while against the current code in a stable branch,
and report to openstack-stable-maint@ mailing list. An example of
failing periodic job report can be found at [2]. I envision that
similar approach can be applied to test auxiliary features in gate. So
once something is broken in master, the interested parties behind the
auxiliary feature will be informed in due time.
[...]

The main issue with periodic jobs is that since they are non-blocking,
they can get ignored really easily. It takes a bit of organization and
process to get those failures addressed.

It's only recently (and a lot thanks to you) that failures in the
periodic jobs for stable branches are being taken into account quickly
and seriously. For years the failures just lingered until they blocked
someone's work enough for that person to go and fix them.

So while I think periodic jobs are a good way to increase corner case
testing coverage, I am skeptical of our collective ability to have the
discipline necessary for them not to become a pain. We'll need a strict
process around them: identified groups of people signed up to act on
failure, and failure stats so that we can remove jobs that don't get
enough attention.

While I share some of your skepticism, we have to find a way to make 
this work.
Saying we are doing our best to ensure the quality of upstream OpenStack 
based on a single tier of testing (the gate) that is limited to 
40-minute runs is not plausible. Of course a lot more testing happens 
downstream, but we can do better as a community. I think we should 
rephrase this subject as 'non-gating jobs'. We could have various kinds 
of stress and longevity 
jobs running to good effect if we can solve this process problem.


Following on your process suggestion, in practice the most likely way 
this could actually work is to have a rotation of build guardians that 
agree to keep an eye on jobs for a short period of time. There would 
need to be a separate rotation list for each project that has 
non-gating, project-specific jobs. This will likely happen as we move 
towards deeper functional testing in projects. The qa team would be the 
logical pool for a rotation of more global jobs of the kind I think Ihar 
was referring to.


As for failure status, each of these non-gating jobs would have its 
own name, so logstash could be used to debug failures. Do we already have 
anything that tracks failure rates of jobs?


 -David




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Thoughts on OpenStack Layers and a Big Tent model

2014-09-24 Thread David Kranz

On 09/24/2014 02:48 PM, Clint Byrum wrote:

Excerpts from Robert Collins's message of 2014-09-23 21:14:47 -0700:

No one helped me edit this :)

http://rbtcollins.wordpress.com/2014/09/24/what-poles-for-the-tent/

I hope I haven't zoned out and just channelled someone else here ;)


This sounds like API's are what matters. You did spend some time
working with Simon Wardley, didn't you? ;)

I think it's a sound argument, but I'd like to banish the term reference
implementation from any discussions around what OpenStack, as a project,
delivers. It has too many negative feelings wrapped up in it.

I also want to call attention to how what you describe feels an awful
lot like POSIX to me. Basically offering guarantees of API compatibility,
but then letting vendors run wild around and behind it.

I'm not sure if that is a good thing, or a bad thing. I do, however,
think if we can avoid a massive vendor battle that involves multiple
vendors pushing multiple implementations, we will save our companies a
lot of money, and our users will get what they need sooner.
I like what Rob had to say here, and have expressed similar views. 
Having competition between implementations is good for everyone (except 
for the losers) if that competition takes place in a way that shields 
users and the ecosystem from the aftermath of such competition. That is 
what standards, defined APIs, whatever we want to call it, is all about. 
By analogy, competition by electronics companies around who can make the 
best-performing Blu-ray player with the most features is a good thing 
for users and that ecosystem. Competition about whether the ecosystem 
should use Blu-ray or HD DVD, not so much: 
http://en.wikipedia.org/wiki/High_definition_optical_disc_format_war.


This is what I see as the main virtue of the TC blessing things as the 
one OpenStack way to do X. There is also the potential of efficiency if 
more people contribute to the same project that is doing X as compared 
to multiple projects doing X. But as we have seen, that efficiency is 
only realized if X turns out to be the right thing. There is no 
particular reason to think the TC will be great at picking winners.


Blessing APIs, though difficult, would have a huge benefit and provide 
more room for leeway and experimentation. Blessing code, before it has 
been proven in the real world, is the worst of all worlds when it turns 
out to be wrong.


I believe our scale problems can be addressed by thoughtful 
decentralization, and I hope we move in that direction. In terms of how 
many pieces of 'run a real cloud' we have in our tent, we may 
have shot too high. But some of the recent proposals to move to an 
extreme in the other direction would be a mistake IMO. To be important, 
and be competitive with non-OpenStack cloud solutions, we need to 
provide a critical mass so that most other interesting things can glom 
on and form a larger ecosystem.


 -David




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





Re: [openstack-dev] [qa] Tempest Bug triage

2014-09-12 Thread David Kranz

On 09/12/2014 05:11 AM, Kashyap Chamarthy wrote:

On Thu, Sep 11, 2014 at 03:52:56PM -0400, David Kranz wrote:

So we had a Bug Day this week and the results were a bit disappointing due
to lack of participation. We went from 124 New bugs to 75.

There were also many cases where bugs referred to logs that no longer
existed. This suggests that we really need to keep up with bug triage
in real time.

Alternatively, strongly recommend people to post *contextual* logs to
the bug, so they're there for reference forever and makes life less
painful while triaging bugs. Many times bugs are just filed in a hurry,
posting a quick bunch of logstash URLs which expires sooner or later.

Sure, posting contextual logs takes time, but as you can well imagine,
it results in higher quality reports (hopefully), and saves time for
others who have to take a fresh look at the bug and have to begin with
the maze of logs.
This would be in addition to, not an alternative. Of course better bug 
reports with as much information as possible, and an understanding of 
how long log files will be retained, etc., would always be better. But due to 
the sorry state we are now in, it is simply unrealistic to expect people 
to start investigating failures in code they do not understand that are 
obviously unrelated to the code they are trying to babysit through the 
gate. I wish it were otherwise, and believe this may change as  we 
achieve the goal of focusing our test time on tests that are related to 
the code being tested (in-project functional testing).


The reason for rotating bug triage is that otherwise it was not happening 
at all. When there is a not-so-much-fun task for which everyone is 
responsible, no one is responsible. It is better to share the load in a 
well-understood way and know who has taken on responsibility at any point in 
time.


 -David


--
/kashyap

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev





Re: [openstack-dev] Kilo Cycle Goals Exercise

2014-09-11 Thread David Kranz

On 09/11/2014 07:32 AM, Eoghan Glynn wrote:



As you all know, there has recently been several very active discussions
around how to improve assorted aspects of our development process. One idea
that was brought up is to come up with a list of cycle goals/project
priorities for Kilo [0].

To that end, I would like to propose an exercise as discussed in the TC
meeting yesterday [1]:
Have anyone interested (especially TC members) come up with a list of what
they think the project wide Kilo cycle goals should be and post them on this
thread ...

Here's my list of high-level cycle goals, for consideration ...


1. Address our usability debts

With some justification, we've been saddled with the perception
of not caring enough about the plight of users and operators. The
frustrating thing is that much of this is very fixable, *if* we take
time out from the headlong rush to add features. Achievable things
like documentation completeness, API consistency, CLI intuitiveness,
logging standardization, would all go a long way here.

These things are of course all not beyond the wit of man, but we
need to take the time out to actually do them. This may involve
a milestone, or even longer, where we accept that the rate of
feature addition will be deliberately slowed down.


2. Address the drags on our development velocity

Despite the Trojan efforts of the QA team, the periodic brownouts
in the gate are having a serious impact on our velocity. Over the
past few cycles, we've seen the turnaround time for patch check/
verification spike up unacceptably long multiple times, mostly
around the milestones.

Whatever we can do to smoothen out these spikes, whether it be
moving much of the Tempest coverage into the project trees, or
switching focus onto post-merge verification as suggested by
Sean on this thread, or even considering some more left-field
approaches such as staggered milestones, we need to grasp this
nettle as a matter of urgency.

Further back in the pipeline, the effort required to actually get
something shepherded through review is steadily growing. To the
point that we need to consider some radical approaches that
retain the best of our self-organizing model, while setting more
reasonable  reliable expectations for patch authors, and making
it more likely that narrow domain expertise is available to review
their contributions in timely way. For the larger projects, this
is likely to mean something different (along the lines of splits
or sub-domains) than it does for the smaller projects.


3. Address the long-running what's in and what's out questions

The way some of the discussions about integration and incubation
played out this cycle have made me sad. Not all of these discussions
have been fully supported by the facts on the ground IMO. And not
all of the issues that have been held up as justifications for
whatever course of exclusion or inclusion would IMO actually be
solved in that way.

I think we need to move the discussion around a new concept of
layering, or redefining what it means to be in the tent, to a
more constructive and collaborative place than heretofore.


4. Address the fuzziness in cross-service interactions

In a semi-organic way, we've gone and built ourselves a big ol'
service-oriented architecture. But without necessarily always
following the strong contracts, loose coupling, discoverability,
and autonomy that a SOA approach implies.

We need to take the time to go back and pay down some of the debt
that has accreted over multiple cycles around these these
cross-service interactions. The most pressing of these would
include finally biting the bullet on the oft-proposed but never
delivered-upon notion of stabilizing notifications behind a
well-defined contract. Also, the more recently advocated notions
of moving away from coarse-grained versioning of the inter-service
APIs, and supporting better introspection and discovery of
capabilities.

+1
IMO, almost all of the other ills discussed recently derive from this 
single failure.


 -David

by end of day Wednesday, September 10th.

Oh, yeah, and impose fewer arbitrary deadlines ;)

Cheers,
Eoghan


After which time we can
begin discussing the results.
The goal of this exercise is to help us see if our individual world views
align with the greater community, and to get the ball rolling on a larger
discussion of where as a project we should be focusing more time.


best,
Joe Gordon

[0]
http://lists.openstack.org/pipermail/openstack-dev/2014-August/041929.html
[1]
http://eavesdrop.openstack.org/meetings/tc/2014/tc.2014-09-02-20.04.log.html

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev






[openstack-dev] [qa] Tempest Bug triage

2014-09-11 Thread David Kranz
So we had a Bug Day this week and the results were a bit disappointing 
due to lack of participation. We went from 124 New bugs to 75. There 
were also many cases where bugs referred to logs that no longer existed. 
This suggests that we really need to keep up with bug triage in real 
time. Since bug triage should involve the Core review team, we propose 
to rotate the responsibility of triaging bugs weekly. I put up an 
etherpad here https://etherpad.openstack.org/p/qa-bug-triage-rotation 
and I hope the tempest core review team will sign up. Given our size, 
this should involve signing up once every two months or so. I took next 
week.


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] Reminder: Tempest Bug Day: Tuesday September 9 (tomorrow)

2014-09-08 Thread David Kranz


It's been a while since we had a bug day. We now have 121 (now 124) NEW 
bugs:


https://bugs.launchpad.net/tempest/+bugs?field.searchtext=&field.status%3Alist=NEW&orderby=-importance

The first order of business is to triage these bugs. This is a large 
enough number that I hesitate to
mention anything else, but there are also many In Progress bugs that 
should be looked at to see if they should

be closed or an assignee removed if no work is actually planned:

https://bugs.launchpad.net/tempest/+bugs?search=Search&field.status=In+Progress

I hope we will see a lot of activity on this bug day. During the 
Thursday meeting right after we can see if
there are ideas for how to manage the bugs on a more steady-state basis. 
We could also discuss how the grenade and

devstack bugs should fit in to such activities.

-David



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa][all][Heat] Packaging of functional tests

2014-09-05 Thread David Kranz

On 09/05/2014 12:10 PM, Matthew Treinish wrote:

On Fri, Sep 05, 2014 at 09:42:17AM +1200, Steve Baker wrote:

On 05/09/14 04:51, Matthew Treinish wrote:

On Thu, Sep 04, 2014 at 04:32:53PM +0100, Steven Hardy wrote:

On Thu, Sep 04, 2014 at 10:45:59AM -0400, Jay Pipes wrote:

On 08/29/2014 05:15 PM, Zane Bitter wrote:

On 29/08/14 14:27, Jay Pipes wrote:

On 08/26/2014 10:14 AM, Zane Bitter wrote:

Steve Baker has started the process of moving Heat tests out of the
Tempest repository and into the Heat repository, and we're looking for
some guidance on how they should be packaged in a consistent way.
Apparently there are a few projects already packaging functional tests
in the package projectname.tests.functional (alongside
projectname.tests.unit for the unit tests).

That strikes me as odd in our context, because while the unit tests run
against the code in the package in which they are embedded, the
functional tests run against some entirely different code - whatever
OpenStack cloud you give it the auth URL and credentials for. So these
tests run from the outside, just like their ancestors in Tempest do.

There's all kinds of potential confusion here for users and packagers.
None of it is fatal and all of it can be worked around, but if we
refrain from doing the thing that makes zero conceptual sense then there
will be no problem to work around :)

I suspect from reading the previous thread about In-tree functional
test vision that we may actually be dealing with three categories of
test here rather than two:

* Unit tests that run against the package they are embedded in
* Functional tests that run against the package they are embedded in
* Integration tests that run against a specified cloud

i.e. the tests we are now trying to add to Heat might be qualitatively
different from the projectname.tests.functional suites that already
exist in a few projects. Perhaps someone from Neutron and/or Swift can
confirm?

I'd like to propose that tests of the third type get their own top-level
package with a name of the form projectname-integrationtests (second
choice: projectname-tempest on the principle that they're essentially
plugins for Tempest). How would people feel about standardising that
across OpenStack?

By its nature, Heat is one of the only projects that would have
integration tests of this nature. For Nova, there are some functional
tests in nova/tests/integrated/ (yeah, badly named, I know) that are
tests of the REST API endpoints and running service daemons (the things
that are RPC endpoints), with a bunch of stuff faked out (like RPC
comms, image services, authentication and the hypervisor layer itself).
So, the integrated tests in Nova are really not testing integration
with other projects, but rather integration of the subsystems and
processes inside Nova.

I'd support a policy that true integration tests -- tests that test the
interaction between multiple real OpenStack service endpoints -- be left
entirely to Tempest. Functional tests that test interaction between
internal daemons and processes to a project should go into
/$project/tests/functional/.

For Heat, I believe tests that rely on faked-out other OpenStack
services but stress the interaction between internal Heat
daemons/processes should be in /heat/tests/functional/ and any tests the
rely on working, real OpenStack service endpoints should be in Tempest.

Well, the problem with that is that last time I checked there was
exactly one Heat scenario test in Tempest because tempest-core doesn't
have the bandwidth to merge all (any?) of the other ones folks submitted.

So we're moving them to openstack/heat for the pure practical reason
that it's the only way to get test coverage at all, rather than concerns
about overloading the gate or theories about the best venue for
cross-project integration testing.

Hmm, speaking of passive aggressivity...

Where can I see a discussion of the Heat integration tests with Tempest QA
folks? If you give me some background on what efforts have been made already
and what is remaining to be reviewed/merged/worked on, then I can try to get
some resources dedicated to helping here.

We received some fairly strong criticism from sdague[1] earlier this year,
at which point we were  already actively working on improving test coverage
by writing new tests for tempest.

Since then, several folks, myself included, committed very significant
amounts of additional effort to writing more tests for tempest, with some
success.

Ultimately the review latency and overhead involved in constantly rebasing
changes between infrequent reviews has resulted in slow progress and
significant frustration for those attempting to contribute new test cases.

It's been clear for a while that tempest-core have significant bandwidth
issues, as well as not necessarily always having the specific domain
expertise to thoroughly review some tests related to project-specific
behavior or functionality.

So I view this as actually a breakdown 

[openstack-dev] [qa] Tempest Bug Day: Tuesday September 9

2014-09-02 Thread David Kranz

It's been a while since we had a bug day. We now have 121 NEW bugs:

https://bugs.launchpad.net/tempest/+bugs?field.searchtext=&field.status%3Alist=NEW&orderby=-importance

The first order of business is to triage these bugs. This is a large 
enough number that I hesitate to
mention anything else, but there are also many In Progress bugs that 
should be looked at to see if they should

be closed or an assignee removed if no work is actually planned:

https://bugs.launchpad.net/tempest/+bugs?search=Search&field.status=In+Progress

I hope we will see a lot of activity on this bug day. During the 
Thursday meeting right after we can see if
there are ideas for how to manage the bugs on a more steady-state basis. 
We could also discuss how the grenade and

devstack bugs should fit in to such activities.

-David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] Lack of consistency in returning response from tempest clients

2014-08-29 Thread David Kranz
While reviewing patches for moving response checking to the clients, I 
noticed that there are places where client methods do not return any value.
This is usually, but not always, a delete method. IMO, every rest client 
method should return at least the response. Some services return just 
the response for delete methods and others return (resp, body). Does any 
one object to cleaning this up by just making all client methods return 
resp, body? This is mostly a change to the clients. There were only a 
few places where a non-delete  method was returning just a body that was 
used in test code.
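
Roughly, the kind of change I mean (a sketch only, not actual tempest
code; the client and method names are just illustrative):

class ExampleVolumesClient(object):
    """Illustrative only: every method returns (resp, body)."""

    def __init__(self, rest_client):
        # rest_client is assumed to expose get/delete that already
        # return a (resp, body) pair.
        self.rest = rest_client

    def show_volume(self, volume_id):
        resp, body = self.rest.get("volumes/%s" % volume_id)
        return resp, body

    def delete_volume(self, volume_id):
        # Today some clients return nothing (or only resp) here.
        resp, body = self.rest.delete("volumes/%s" % volume_id)
        return resp, body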


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] Lack of consistency in returning response from tempest clients

2014-08-29 Thread David Kranz

On 08/29/2014 10:56 AM, Sean Dague wrote:

On 08/29/2014 10:19 AM, David Kranz wrote:

While reviewing patches for moving response checking to the clients, I
noticed that there are places where client methods do not return any value.
This is usually, but not always, a delete method. IMO, every rest client
method should return at least the response. Some services return just
the response for delete methods and others return (resp, body). Does any
one object to cleaning this up by just making all client methods return
resp, body? This is mostly a change to the clients. There were only a
few places where a non-delete  method was returning just a body that was
used in test code.

Yair and I were discussing this yesterday. As the response correctness
checking is happening deeper in the code (and you are seeing more and
more people assigning the response object to _ ) my feeling is Tempest
clients should probably return a body obj that's basically:

class ResponseBody(dict):
    def __init__(self, body={}, resp=None):
        self.update(body)
        self.resp = resp

Then all the clients would have single return values, the body would be
the default thing you were accessing (which is usually what you want),
and the response object is accessible if needed to examine headers.

-Sean

Heh. I agree with that and it is along a similar line to what I proposed 
here https://review.openstack.org/#/c/106916/ but using a dict rather 
than an attribute dict. I did not propose this since it is such a big 
change. All the test code would have to be changed to remove the resp or 
_ that is now receiving the response. But I think we should do this 
before the client code is moved to tempest-lib.
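
To make the difference concrete, here is a rough illustration (with a
made-up payload) of what test code would see with a ResponseBody-style
wrapper like the one above:

body = ResponseBody({'volume': {'id': 'abc', 'status': 'available'}},
                    resp={'status': '200'})

# Dict-style access to the payload, as tests do today:
assert body['volume']['status'] == 'available'

# The response is still reachable when a test actually needs headers
# or status:
assert body.resp['status'] == '200'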


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] gate debugging

2014-08-27 Thread David Kranz

On 08/27/2014 02:54 PM, Sean Dague wrote:

Note: thread intentionally broken, this is really a different topic.

On 08/27/2014 02:30 PM, Doug Hellmann wrote:

On Aug 27, 2014, at 1:30 PM, Chris Dent chd...@redhat.com wrote:


On Wed, 27 Aug 2014, Doug Hellmann wrote:


I have found it immensely helpful, for example, to have a written set
of the steps involved in creating a new library, from importing the
git repo all the way through to making it available to other projects.
Without those instructions, it would have been much harder to split up
the work. The team would have had to train each other by word of
mouth, and we would have had constant issues with inconsistent
approaches triggering different failures. The time we spent building
and verifying the instructions has paid off to the extent that we even
had one developer not on the core team handle a graduation for us.

+many more for the relatively simple act of just writing stuff down

“Write it down.” is my theme for Kilo.

I definitely get the sentiment. Write it down is also hard when you
are talking about things that do change around quite a bit. OpenStack as
a whole sees 250 - 500 changes a week, so the interaction pattern moves
around enough that it's really easy to have *very* stale information
written down. Stale information is even more dangerous than no
information some times, as it takes people down very wrong paths.

I think we break down on communication when we get into a conversation
of "I want to learn gate debugging" because I don't quite know what that
means, or where the starting point of understanding is. So those
intentions are well meaning, but tend to stall. The reality was there
was no road map for those of us that dive in, it's just understanding
how OpenStack holds together as a whole and where some of the high risk
parts are. And a lot of that comes with days staring at code and logs
until patterns emerge.

Maybe if we can get smaller more targeted questions, we can help folks
better? I'm personally a big fan of answering the targeted questions
because then I also know that the time spent exposing that information
was directly useful.

I'm more than happy to mentor folks. But I just end up finding the "I
want to learn" at the generic level something that's hard to grasp onto
or figure out how we turn it into action. I'd love to hear more ideas
from folks about ways we might do that better.

-Sean

Race conditions are what makes debugging very hard. I think we are in 
the process of experimenting with such an idea: asymmetric gating by 
moving functional tests to projects, making them deeper and more 
extensive, and gating against their own projects. The result should be 
that when a code change is made, we will spend much more time running 
tests of code that is most likely to be growing a race bug from the 
change. Of course there is a risk that we will impair integration 
testing and we will have to be vigilant about that. One mitigating 
factor is that if cross-project interaction uses apis (official or not) 
that are well tested by the functional tests, there is less risk that a 
bug will show up only when those apis are used by another project.


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] gate debugging

2014-08-27 Thread David Kranz

On 08/27/2014 03:43 PM, Sean Dague wrote:

On 08/27/2014 03:33 PM, David Kranz wrote:

On 08/27/2014 02:54 PM, Sean Dague wrote:

Note: thread intentionally broken, this is really a different topic.

On 08/27/2014 02:30 PM, Doug Hellmann wrote:

On Aug 27, 2014, at 1:30 PM, Chris Dent chd...@redhat.com wrote:


On Wed, 27 Aug 2014, Doug Hellmann wrote:


I have found it immensely helpful, for example, to have a written set
of the steps involved in creating a new library, from importing the
git repo all the way through to making it available to other projects.
Without those instructions, it would have been much harder to split up
the work. The team would have had to train each other by word of
mouth, and we would have had constant issues with inconsistent
approaches triggering different failures. The time we spent building
and verifying the instructions has paid off to the extent that we even
had one developer not on the core team handle a graduation for us.

+many more for the relatively simple act of just writing stuff down

“Write it down.” is my theme for Kilo.

I definitely get the sentiment. Write it down is also hard when you
are talking about things that do change around quite a bit. OpenStack as
a whole sees 250 - 500 changes a week, so the interaction pattern moves
around enough that it's really easy to have *very* stale information
written down. Stale information is even more dangerous than no
information some times, as it takes people down very wrong paths.

I think we break down on communication when we get into a conversation
of "I want to learn gate debugging" because I don't quite know what that
means, or where the starting point of understanding is. So those
intentions are well meaning, but tend to stall. The reality was there
was no road map for those of us that dive in, it's just understanding
how OpenStack holds together as a whole and where some of the high risk
parts are. And a lot of that comes with days staring at code and logs
until patterns emerge.

Maybe if we can get smaller more targeted questions, we can help folks
better? I'm personally a big fan of answering the targeted questions
because then I also know that the time spent exposing that information
was directly useful.

I'm more than happy to mentor folks. But I just end up finding the "I
want to learn" at the generic level something that's hard to grasp onto
or figure out how we turn it into action. I'd love to hear more ideas
from folks about ways we might do that better.

 -Sean


Race conditions are what makes debugging very hard. I think we are in
the process of experimenting with such an idea: asymmetric gating by
moving functional tests to projects, making them deeper and more
extensive, and gating against their own projects. The result should be
that when a code change is made, we will spend much more time running
tests of code that is most likely to be growing a race bug from the
change. Of course there is a risk that we will impair integration
testing and we will have to be vigilant about that. One mitigating
factor is that if cross-project interaction uses apis (official or not)
that are well tested by the functional tests, there is less risk that a
bug will show up only when those apis are used by another project.

So, sorry, this is really not about systemic changes (we're running
those in parallel), but more about skills transfer in people getting
engaged. Because we need both. I guess that's the danger of breaking the
thread is apparently I lost part of the context.

-Sean

I agree we need both. I made the comment because if we can make gate 
debugging less daunting
then less skill will be needed and I think that is key. Honestly, I am 
not sure the full skill you have can be transferred. It was gained 
partly through

learning in simpler times.

 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa][all][Heat] Packaging of functional tests

2014-08-26 Thread David Kranz

On 08/26/2014 10:14 AM, Zane Bitter wrote:
Steve Baker has started the process of moving Heat tests out of the 
Tempest repository and into the Heat repository, and we're looking for 
some guidance on how they should be packaged in a consistent way. 
Apparently there are a few projects already packaging functional tests 
in the package projectname.tests.functional (alongside 
projectname.tests.unit for the unit tests).


That strikes me as odd in our context, because while the unit tests 
run against the code in the package in which they are embedded, the 
functional tests run against some entirely different code - whatever 
OpenStack cloud you give it the auth URL and credentials for. So these 
tests run from the outside, just like their ancestors in Tempest do.


There's all kinds of potential confusion here for users and packagers. 
None of it is fatal and all of it can be worked around, but if we 
refrain from doing the thing that makes zero conceptual sense then 
there will be no problem to work around :)
Thanks, Zane. The point of moving functional tests to projects is to be 
able to run more of them
in gate jobs for those projects, and allow tempest to survive being 
stretched-to-breaking horizontally as we scale to more projects. At the 
same time, there are benefits to the 
tempest-as-all-in-one-functional-and-integration-suite that we should 
try not to lose:


1. Strong integration testing without thinking too hard about the actual 
dependencies
2. Protection from mistaken or unwise api changes (tempest two-step 
required)
3. Exportability as a complete blackbox functional test suite that can 
be used by Rally, RefStack, deployment validation, etc.


I think (1) may be the most challenging because tests that are moved out 
of tempest might be testing some integration that is not being covered
by a scenario. We will need to make sure that tempest actually has a 
complete enough set of tests to validate integration. Even if this is 
all implemented in a way where tempest can see in-project tests as 
plugins, there will still not be time to run them all as part of 
tempest on every commit to every project, so a selection will have to be 
made.


(2) is quite difficult. In Atlanta we talked about taking a copy of 
functional tests into tempest for stable apis. I don't know how workable 
that is but don't see any other real options except vigilance in reviews 
of patches that change functional tests.


(3) is what Zane was addressing. The in-project functional tests need to 
be written in a way that they can, at least in some configuration, run 
against a real cloud.





I suspect from reading the previous thread about In-tree functional 
test vision that we may actually be dealing with three categories of 
test here rather than two:


* Unit tests that run against the package they are embedded in
* Functional tests that run against the package they are embedded in
* Integration tests that run against a specified cloud

i.e. the tests we are now trying to add to Heat might be qualitatively 
different from the projectname.tests.functional suites that already 
exist in a few projects. Perhaps someone from Neutron and/or Swift can 
confirm?
That seems right, except that I would call the third functional tests 
and not integration tests, because the purpose is not really 
integration but deep testing of a particular service. Tempest would 
continue to focus on integration testing. Is there some controversy 
about that?

The second category could include whitebox tests.

I don't know about swift, but in neutron the intent was to have these 
tests be configurable to run against a real cloud, or not. Maru Newby 
would have details.


I'd like to propose that tests of the third type get their own 
top-level package with a name of the form 
projectname-integrationtests (second choice: projectname-tempest 
on the principle that they're essentially plugins for Tempest). How 
would people feel about standardising that across OpenStack?

+1 But I would not call it integrationtests for the reason given above.

 -David


thanks,
Zane.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] [ptls] The Czar system, or how to scale PTLs

2014-08-26 Thread David Kranz

On 08/26/2014 10:04 AM, Doug Hellmann wrote:

On Aug 26, 2014, at 5:13 AM, Thierry Carrez thie...@openstack.org wrote:


OK, now that we have evacuated the terminology issue (we'll use liaison
or janitor or secretary, not czar), and side-stepped the offtopic
development (this is not about suppressing PTLs, just a framework to let
them delegate along predetermined lines if they want to)... which of
those unnamed roles do we need ?

In the thread were mentioned:
- Bugs janitor (keep reported bugs under control)
- Oslo liaison (already in place)
- Security mule (VMT first point of contact)
- Release secretary (communication with integrated release management)
- Infrastructure contact (for gate and other infra issues)
- Docs lieutenant (docs point of contact)

Anita mentioned the 3rd party space person, but I wonder if it would
not be specific to some projects. Would it actually be separate from the
Infra contact role ?

Do we need someone to cover the QA space ? Anything else missing ?

It seems the QA team is also feeling pressure due to the growing community, so 
it seems wise to ensure every team has someone designated to help with 
coordinating work on QA projects.
Very much so, and having such a someone would help. But I also think 
that the moving of functional tests to be housed in-project will help 
even more.


 -David


Doug


At first glance I don't think we need a role for logistics (chairing
meetings and organizing meetups), design summit planning, roadmapping,
user point of contact, or spokesperson -- as I expect the PTL to retain
those roles anyway...

--
Thierry Carrez (ttx)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] The future of the integrated release

2014-08-21 Thread David Kranz

On 08/21/2014 02:39 PM, gordon chung wrote:

 The point I've been making is
 that by the TC continuing to bless only the Ceilometer project as the
 OpenStack Way of Metering, I think we do a disservice to our users by
 picking a winner in a space that is clearly still unsettled.

can we avoid using the word 'blessed' -- it's extremely vague and 
seems controversial. from what i know, no one is being told project 
x's services are the be all end all and based on experience, companies 
(should) know this. i've worked with other alternatives even though i 
contribute to ceilometer.

 Totally agree with Jay here, I know people who gave up on trying to
 get any official project around deployment because they were told they
 had to do it under the TripleO umbrella
from the pov of a project that seems to be brought up constantly and 
maybe it's my naivety, i don't really understand the fascination with 
branding and the stigma people have placed on 
non-'openstack'/stackforge projects. it can't be a legal thing because 
i've gone through that potential mess. also, it's just as easy to 
contribute to 'non-openstack' projects as 'openstack' projects (even 
easier if we're honest).
Yes, we should be honest. The "even easier" part is what Sandy cited as 
the primary motivation for pursuing stacktach instead of ceilometer.


I think we need to consider the difference between why OpenStack wants 
to bless a project, and why a project might want to be blessed by 
OpenStack. Many folks believe that for OpenStack to be successful it 
needs to present itself as a stack that can be tested and deployed, not 
a sack of parts that only the most extremely clever people can manage to 
assemble into an actual cloud. In order to have such a stack, some code 
(or, alternatively, dare I say API...) needs to be blessed. Reasonable 
debates will continue about which pieces are essential to this stack, 
and which should be left to deployers, but metering was seen as such a 
component and therefore something needed to be blessed. The hope was 
that every one would jump on that and make it great but it seems that 
didn't quite happen (at least yet).


Though Open Source has many advantages over proprietary development, the 
ability to choose a direction and marshal resources for efficient 
delivery is the biggest advantage of proprietary development like what 
AWS does. The TC process of blessing is, IMO, an attempt to compensate 
for that in an OpenSource project. Of course if the wrong code is 
blessed, the negative  impact can be significant. Blessing APIs would be 
more forgiving, though with its own perils. I am reminded of this 
session, in which Jay was involved, at my first OpenStack summit: 
http://essexdesignsummit.sched.org/event/66f38d3bb4a1b8b169b81179e7f03215#.U_ZLI3Wx02Q


As for why projects have a desire to be blessed, I suspect in many cases 
it is because the OpenStack brand will attract contributors to their 
project.


 -David




in my mind, the goal of the programs is to encourage collaboration 
from projects with the same focus (whether they overlap or not). that 
way, even if there's differences in goal/implementation, there's a 
common space between them so users can easily decide. also, hopefully 
with the collaboration, it'll help teams realise that certain problems 
have already been solved and certain parts of code can be shared 
rather than having project x, y, and z all working in segregated 
streams, racing as fast as they can to claim supremacy (how you'd 
decide is another mess) and then n number of months/years later we 
decide to throw away (tens/hundreds) of thousands of person hours of 
work because we just created massive projects that overlap.


suggestion: maybe it's better to drop the branding codenames and just 
refer to everything as their generic feature? ie. identity, telemetry, 
orchestration, etc...


cheers,
/gord/


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] The future of the integrated release

2014-08-21 Thread David Kranz

On 08/21/2014 04:12 PM, Clint Byrum wrote:

Excerpts from David Kranz's message of 2014-08-21 12:45:05 -0700:

On 08/21/2014 02:39 PM, gordon chung wrote:

The point I've been making is
that by the TC continuing to bless only the Ceilometer project as the
OpenStack Way of Metering, I think we do a disservice to our users by
picking a winner in a space that is clearly still unsettled.

can we avoid using the word 'blessed' -- it's extremely vague and
seems controversial. from what i know, no one is being told project
x's services are the be all end all and based on experience, companies
(should) know this. i've worked with other alternatives even though i
contribute to ceilometer.

Totally agree with Jay here, I know people who gave up on trying to
get any official project around deployment because they were told they
had to do it under the TripleO umbrella

from the pov of a project that seems to be brought up constantly and
maybe it's my naivety, i don't really understand the fascination with
branding and the stigma people have placed on
non-'openstack'/stackforge projects. it can't be a legal thing because
i've gone through that potential mess. also, it's just as easy to
contribute to 'non-openstack' projects as 'openstack' projects (even
easier if we're honest).

Yes, we should be honest. The "even easier" part is what Sandy cited as
the primary motivation for pursuing stacktach instead of ceilometer.

I think we need to consider the difference between why OpenStack wants
to bless a project, and why a project might want to be blessed by
OpenStack. Many folks believe that for OpenStack to be successful it
needs to present itself as a stack that can be tested and deployed, not
a sack of parts that only the most extremely clever people can manage to
assemble into an actual cloud. In order to have such a stack, some code
(or, alternatively, dare I say API...) needs to be blessed. Reasonable
debates will continue about which pieces are essential to this stack,
and which should be left to deployers, but metering was seen as such a
component and therefore something needed to be blessed. The hope was
that every one would jump on that and make it great but it seems that
didn't quite happen (at least yet).

Though Open Source has many advantages over proprietary development, the
ability to choose a direction and marshal resources for efficient
delivery is the biggest advantage of proprietary development like what
AWS does. The TC process of blessing is, IMO, an attempt to compensate
for that in an OpenSource project. Of course if the wrong code is
blessed, the negative  impact can be significant. Blessing APIs would be

Hm, I wonder if the only difference there is when AWS blesses the wrong
thing, they evaluate the business impact, and respond by going in a
different direction, all behind closed doors. The shame is limited to
that inner circle.
It is only limited to the inner circle if the wrong thing had no 
public api in wide use. The advantage of blessing apis rather than 
implementations is that mistakes can be corrected. I realize many people 
hate that idea.


Here, with full transparency, calling something the wrong thing is
pretty much public humiliation for the team involved.

So it stands to reason that we shouldn't call something the right
thing if we aren't comfortable with the potential public shaming.
Of course not, and no one would argue that we should. The question being 
debated is whether the benefits of choosing wisely are worth the risk of 
choosing wrongly, as compared to the different-in-nature risks of not 
choosing at all. Not so easy to answer IMO.


 -David


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [QA] Picking a Name for the Tempest Library

2014-08-20 Thread David Kranz

On 08/18/2014 04:57 PM, Matthew Treinish wrote:

On Sat, Aug 16, 2014 at 06:27:19PM +0200, Marc Koderer wrote:

Hi all,

Am 15.08.2014 um 23:31 schrieb Jay Pipes jaypi...@gmail.com:

I suggest that tempest should be the name of the import'able library, and that the integration 
tests themselves should be what is pulled out of the current Tempest repository, into their own repo called 
openstack-integration-tests or os-integration-tests.

why not keeping it simple:

tempest: importable test library
tempest-tests: all the test cases

Simple, obvious and clear ;)


While I agree that I like how this looks, and that it keeps things simple, I
don't think it's too feasible. The problem is the tempest namespace is already
kind of large and established. The libification effort, while reducing some of
that, doesn't eliminate it completely. So what this ends meaning is that we'll
have to do a rename for a large project in order to split certain functionality
out into a smaller library. Which really doesn't seem like the best way to do
it, because a rename is a considerable effort.

Another wrinkle to consider is that the tempest namespace on pypi is already in
use: https://pypi.python.org/pypi/Tempest so if we wanted to publish the library
as tempest we'd need to figure out what to do about that.

-Matt Treinish
Yes, I agree. Tempest is also used by Refstack, Rally, and I'm sure many 
other parts of our ecosystem. I would vote for tempest-lib as the 
library and keeping tempest to mean the same thing in the ecosystem as 
it does at present. I would also not be opposed to a different name than 
tempest-lib if it were related to its function.


 -David



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Implications of moving functional tests to projects (was Re: Which program for Rally)

2014-08-12 Thread David Kranz

Changing subject line to continue thread about new $subj

On 08/12/2014 08:56 AM, Doug Hellmann wrote:


On Aug 11, 2014, at 12:00 PM, David Kranz dkr...@redhat.com 
mailto:dkr...@redhat.com wrote:



On 08/06/2014 05:48 PM, John Griffith wrote:
I have to agree with Duncan here.  I also don't know if I fully 
understand the limit in options.  Stress test seems like it 
could/should be different (again overlap isn't a horrible thing) and 
I don't see it as siphoning off resources so not sure of the issue. 
 We've become quite wrapped up in projects, programs and the like 
lately and it seems to hinder forward progress more than anything else.


I'm also not convinced that Tempest is where all things belong, in 
fact I've been thinking more and more that a good bit of what 
Tempest does today should fall more on the responsibility of the 
projects themselves.  For example functional testing of features 
etc, ideally I'd love to have more of that fall on the projects and 
their respective teams.  That might even be something as simple to 
start as saying if you contribute a new feature, you have to also 
provide a link to a contribution to the Tempest test-suite that 
checks it.  Sort of like we do for unit tests, cross-project 
tracking is difficult of course, but it's a start.  The other idea 
is maybe functional test harnesses live in their respective projects.


Honestly I think who better to write tests for a project than the 
folks building and contributing to the project.  At some point IMO 
the QA team isn't going to scale.  I wonder if maybe we should be 
thinking about proposals for delineating responsibility and goals in 
terms of functional testing?




All good points. Your last paragraph was discussed by the QA team 
leading up to and at the Atlanta summit. The conclusion was that the 
api/functional tests focused on a single project should be part of 
that project. As Sean said, we can envision there being half (or some 
other much smaller number) as many such tests in tempest going forward.


Details are under discussion, but the way this is likely to play out 
is that individual projects will start by creating their own 
functional tests outside of tempest. Swift already does this and 
neutron seems to be moving in that direction. There is a spec to 
break out parts of tempest 
(https://github.com/openstack/qa-specs/blob/master/specs/tempest-library.rst) 
into a library that might be used by projects implementing functional 
tests.


Once a project has sufficient functional testing, we can consider 
removing its api tests from tempest. This is a bit tricky because 
tempest needs to cover *all* cross-project interactions. In this 
respect, there is no clear line in tempest between scenario tests 
which have this goal explicitly, and api tests which may also involve 
interactions that might not be covered in a scenario. So we will need 
a principled way to make sure there is complete cross-project 
coverage in tempest with a smaller number of api tests.


 -David


We need to be careful about dumping the tests from tempest now that 
the DefCore group is relying on them as well. Tempest is no longer 
just a developer/QA/operations tool. It's also being used as the basis 
of a trademark enforcement tool. That's not to say we can't change the 
test suite, but we have to consider a new angle when doing so.


Doug
Thanks, Doug. We need to get away from acceptance test == tempest 
while retaining the ability to define and run an acceptance test as 
easily as tempest can be run now. My view is that functional tests in 
projects should have the capability to be run against real clouds, and 
that in-project functional tests should look like, and be 
interchangeable with, the api tests in tempest. The in-project tests 
would be focused on completeness of api testing and the tempest tests 
focused on cross-project interaction, but they could be run in similar 
ways. Then a trademark enforcement tool, or any other kind of acceptance 
test, could select which tests to run. I think this view may be a bit 
controversial but your point obviously needs to be addressed.



 -David



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org 
mailto:OpenStack-dev@lists.openstack.org

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Which program for Rally

2014-08-11 Thread David Kranz

On 08/06/2014 05:48 PM, John Griffith wrote:
I have to agree with Duncan here.  I also don't know if I fully 
understand the limit in options.  Stress test seems like it 
could/should be different (again overlap isn't a horrible thing) and I 
don't see it as siphoning off resources so not sure of the issue. 
 We've become quite wrapped up in projects, programs and the like 
lately and it seems to hinder forward progress more than anything else.


I'm also not convinced that Tempest is where all things belong, in 
fact I've been thinking more and more that a good bit of what Tempest 
does today should fall more on the responsibility of the projects 
themselves.  For example functional testing of features etc, ideally 
I'd love to have more of that fall on the projects and their 
respective teams.  That might even be something as simple to start as 
saying if you contribute a new feature, you have to also provide a 
link to a contribution to the Tempest test-suite that checks it. 
 Sort of like we do for unit tests, cross-project tracking is 
difficult of course, but it's a start.  The other idea is maybe 
functional test harnesses live in their respective projects.


Honestly I think who better to write tests for a project than the 
folks building and contributing to the project.  At some point IMO the 
QA team isn't going to scale.  I wonder if maybe we should be thinking 
about proposals for delineating responsibility and goals in terms of 
functional testing?




All good points. Your last paragraph was discussed by the QA team 
leading up to and at the Atlanta summit. The conclusion was that the 
api/functional tests focused on a single project should be part of that 
project. As Sean said, we can envision there being half (or some other 
much smaller number) as many such tests in tempest going forward.


Details are under discussion, but the way this is likely to play out is 
that individual projects will start by creating their own functional 
tests outside of tempest. Swift already does this and neutron seems to 
be moving in that direction. There is a spec to break out parts of 
tempest 
(https://github.com/openstack/qa-specs/blob/master/specs/tempest-library.rst) 
into a library that might be used by projects implementing functional 
tests.


Once a project has sufficient functional testing, we can consider 
removing its api tests from tempest. This is a bit tricky because 
tempest needs to cover *all* cross-project interactions. In this 
respect, there is no clear line in tempest between scenario tests which 
have this goal explicitly, and api tests which may also involve 
interactions that might not be covered in a scenario. So we will need a 
principled way to make sure there is complete cross-project coverage in 
tempest with a smaller number of api tests.


 -David
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Which program for Rally

2014-08-11 Thread David Kranz

On 08/11/2014 04:21 PM, Matthew Treinish wrote:


I apologize for the delay in my response to this thread, between 
travelling
and having a stuck 'a' key on my laptop this is the earliest I could 
respond.
I opted for a separate branch on this thread to summarize my views and 
I'll

respond inline later on some of the previous discussion.

On Wed, Aug 06, 2014 at 12:30:35PM +0200, Thierry Carrez wrote:
 Hi everyone,

 At the TC meeting yesterday we discussed Rally program request and
 incubation request. We quickly dismissed the incubation request, as
 Rally appears to be able to live happily on top of OpenStack and would
 benefit from having a release cycle decoupled from the OpenStack
 integrated release.

 That leaves the question of the program. OpenStack programs are created
 by the Technical Committee, to bless existing efforts and teams that are
 considered *essential* to the production of the OpenStack integrated
 release and the completion of the OpenStack project mission. There are 3
 ways to look at Rally and official programs at this point:

 1. Rally as an essential QA tool
 Performance testing (and especially performance regression testing) is
 an essential QA function, and a feature that Rally provides. If the QA
 team is happy to use Rally to fill that function, then Rally can
 obviously be adopted by the (already-existing) QA program. That said,
 that would put Rally under the authority of the QA PTL, and that raises
 a few questions due to the current architecture of Rally, which is more
 product-oriented. There needs to be further discussion between the QA
 core team and the Rally team to see how that could work and if that
 option would be acceptable for both sides.

So ideally this is where Rally would belong, the scope of what Rally is
attempting to do is definitely inside the scope of the QA program. I 
don't see
any reason why that isn't the case. The issue is with what Rally is in 
its
current form. Its scope is too large and monolithic, and it 
duplicates much of

the functionality we either already have or need in current QA or Infra
projects. But, nothing in Rally is designed to be used outside of it. 
I actually
feel pretty strongly that in it's current form Rally should *not* be a 
part of

any OpenStack program.

All of the points Sean was making in the other branch on this thread 
(which I'll
probably respond to later) are a huge concerns I share with Rally. He 
basically
summarized most of my views on the topic, so I'll try not to rewrite 
everything.
But, the fact that all of this duplicate functionality was implemented 
in a
completely separate manner which is Rally specific and can't really be 
used

unless all of Rally is used is of a large concern. What I think the path
forward here is to have both QA and Rally work together on getting common
functionality that is re-usable and shareable. Additionally, I have some
concerns over the methodology that Rally uses for its performance 
measurement.
But, I'll table that discussion because I think it would partially 
derail this

discussion.

So one open question is long-term where would this leave Rally if we 
want to
bring it in under the QA program. (after splitting up the 
functionality to be more
conducive with all our existing tools and projects) The one thing 
Rally does
here which we don't have an analogous solution for is, for lack of 
a better term,
the post processing layer. The part that performs the 
analysis on
the collected data and generates the graphs. That is something that 
we'll have
an eventual need for and that is something that we can work on 
turning rally

into as we migrate everything to actually work together.

There are probably also other parts of Rally which don't fit into an 
existing

QA program project, (or the QA program in general) and in those cases we
probably should split them off as smaller projects to implement that 
bit. For
example, the SLA stuff Rally has that probably should be a separate 
entity as

well, but I'm unsure if that fits under QA program.

My primary concern is the timing for doing all of this work. We're 
approaching
J-3 and honestly this feels like it would take the better part of an 
entire
cycle to analyze, plan, and then implement. Starting an analysis of 
how to do

all of the work at this point I feel would just distract everyone from
completing our dev goals for the cycle. Probably the Rally team, if 
they want
to move forward here, should start the analysis of this structural 
split and we

can all pick this up together post-juno.


 2. Rally as an essential operator tool
 Regular benchmarking of OpenStack deployments is a best practice for
 cloud operators, and a feature that Rally provides. With a bit of a
 stretch, we could consider that benchmarking is essential to the
 completion of the OpenStack project mission. That program could one day
 evolve to include more such operations best practices tools. In
 addition to the slight stretch 

[openstack-dev] [QA] Meeting Thursday July 31 at 17:00UTC

2014-07-29 Thread David Kranz

Just a quick reminder that the weekly OpenStack QA team IRC meeting will be
this Thursday, July 31st at 17:00 UTC in the #openstack-meeting channel.

The agenda for Thursday's meeting can be found here:
https://wiki.openstack.org/wiki/Meetings/QATeamMeeting
Anyone is welcome to add an item to the agenda.

To help people figure out what time 17:00 UTC is in other timezones, Thursday's
meeting will be at:

13:00 EDT
02:00 JST
02:30 ACST
19:00 CEST
12:00 CDT
10:00 PDT

-David Kranz


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] The role of an abstract client in tempest

2014-07-25 Thread David Kranz
Even as a core contributor for several years, it has never been clear to me 
what the scope of these tests should be.
As we move forward with the necessity of moving functional testing to 
projects, we need to answer this question for real, understanding that 
part of the mission for these tests now is validation of clouds.  Doing 
so is made difficult by the fact that the tempest api tests take a very 
opinionated view of how services are invoked. In particular, the tempest 
client is very low-level and at present the way a functional test is 
written depends on how and where it is going to run.


In an ideal world, functional tests could execute in a variety of 
environments ranging from those that completely bypass wsgi layers and 
make project api calls directly, to running in a fully integrated real 
environment as the tempest tests currently do. The challenge is that 
there are mismatches between how the tempest client looks to test code 
and how doing object-model api calls looks. Most of this discrepancy is 
because many pieces of invoking a service are hard-coded into the tests 
rather than being abstracted in a client. Some examples are:


1. Response validation
2. json serialization/deserialization
3. environment description (tempest.conf)
4. Forced usage of addCleanUp

Maru Newby and I have proposed changing the test code to use a more 
abstract client by defining the expected signature and functionality
of methods on the client. Roughly, the methods would take positional 
arguments for pieces that go in the url part of a REST call, and kwargs 
for the json payload. The client would take care of these enumerated 
issues (if necessary) and return an attribute dict. The test code itself 
would then just be service calls and checks of returned data. Returned 
data would be inspected as resource.id instead of resource['id']. There 
is a strawman example of this for a few neutron apis here: 
https://review.openstack.org/#/c/106916/
Doing this would have the twin advantages of eliminating the need for 
boilerplate code in tests and making it possible to run the tests in 
different environments. It would also allow the inclusion of project 
functional tests in more general validation scenarios.
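
As a rough sketch of the shape we have in mind (the class, method and
payload names below are made up for illustration; this is not the
actual strawman code):

class AttributeDict(dict):
    """Wrap returned resources so tests can write resource.id."""
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


class ExampleNetworksClient(object):
    """Positional args become URL pieces; kwargs the JSON payload."""

    def __init__(self, rest_client):
        # rest_client is assumed to handle serialization, response
        # validation and any cleanup registration behind the scenes.
        self.rest = rest_client

    def create_network(self, **kwargs):
        resp, body = self.rest.post('networks', {'network': kwargs})
        return AttributeDict(body['network'])

    def show_network(self, network_id):
        resp, body = self.rest.get('networks/%s' % network_id)
        return AttributeDict(body['network'])

Test code then reduces to service calls and checks, e.g.:

    net = networks_client.create_network(name='test-net')
    assert net.name == 'test-net'
    assert networks_client.show_network(net.id).id == net.id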


Since we are proposing to move parts of tempest into a stable library 
https://review.openstack.org/108858, we need to define the client in a 
way that meets all the needs outlined here before doing so. The actual 
work of defining the client in tempest and changing the code that uses 
it could largely be done one service at a time, in the tempest code, 
before being split out.


What do folks think about this idea?

 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Thoughts on the patch test failure rate and moving forward

2014-07-25 Thread David Kranz

On 07/25/2014 10:01 AM, Steven Hardy wrote:

On Wed, Jul 23, 2014 at 02:39:47PM -0700, James E. Blair wrote:
snip

   * Put the burden for a bunch of these tests back on the projects as
 functional tests. Basically a custom devstack environment that a
 project can create with a set of services that they minimally need
 to do their job. These functional tests will live in the project
 tree, not in Tempest, so can be atomically landed as part of the
 project normal development process.

+1 - FWIW I don't think the current process where we require tempest
cores to review our project test cases is working well, so allowing
projects to own their own tests will be a major improvement.

++
We will still need some way to make sure it is difficult to break api 
compatibility by submitting a change to both code and its tests, which
currently requires a tempest two-step. Also, tempest will still need 
to retain integration testing of apis that use apis from other projects.


In terms of how this works in practice, will the in-tree tests still be run
via tempest, e.g will there be a (relatively) stable tempest api we can
develop the tests against, as Angus has already mentioned?
That is a really good question. I hope the answer is that they can still 
be run  by tempest, but don't have to be. I tried to address this in a 
message within the last hour 
http://lists.openstack.org/pipermail/openstack-dev/2014-July/041244.html


 -David


Steve

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [cinder][qa] cinder client versions and tempest

2014-07-24 Thread David Kranz
I noticed that the cinder list-extensions url suffix is underneath the 
v1/v2 in the GET url but the returned result is the same either way. 
Some of the

returned items have v1 in the namespace, and others v2.

Also, in tempest, there is a single config section for cinder and only a 
single extensions client even though we run cinder
tests for v1 and v2 through separate volume clients. I would have 
expected that listing extensions would be separate calls for v1
and v2 and that the results might be different, implying that tempest 
conf should have a separate section (and service enabled) for volumes v2
rather than treating the presence of v1 and v2 as flags in 
volume-feature-enabled. Am I missing something here?
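
For reference, what I am comparing looks roughly like this (paths
simplified; this is not the tempest client code):

import requests

def list_extensions(base_url, version, tenant_id, token):
    # e.g. base_url = "http://cinder-host:8776"
    url = "%s/%s/%s/extensions" % (base_url, version, tenant_id)
    resp = requests.get(url, headers={"X-Auth-Token": token})
    return resp.json()["extensions"]

list_extensions(base, "v1", tenant, token) and
list_extensions(base, "v2", tenant, token) come back the same, with a
mix of v1 and v2 namespaces in the entries.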


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tc][rally][qa] Application for a new OpenStack Program: Performance and Scalability

2014-07-22 Thread David Kranz

On 07/22/2014 10:44 AM, Sean Dague wrote:

Honestly, I'm really not sure I see this as a different program, but it is
really something that should be folded into the QA program. I feel like
a top level effort like this is going to lead to a lot of duplication in
the data analysis that's currently going on, as well as functionality
for better load driver UX.

-Sean

+1
It will also lead to pointless discussions/arguments about which 
activities are part of QA and which are part of

Performance and Scalability Testing.

QA Program mission:

 Develop, maintain, and initiate tools and plans to ensure the upstream 
stability and quality of OpenStack, and its release readiness at any 
point during the release cycle.


It is hard to see how $subj falls outside of that mission. Of course 
rally would continue to have its own repo, review team, etc. as do 
tempest and grenade.


  -David



On 07/21/2014 05:53 PM, Boris Pavlovic wrote:

Hi Stackers and TC,

The Rally contributor team would like to propose a new OpenStack program
with a mission to provide scalability and performance benchmarking, and
code profiling tools for OpenStack components.

We feel we've achieved a critical mass in the Rally project, with an
active, diverse contributor team. The Rally project will be the initial
project in a new proposed Performance and Scalability program.

Below, the details on our proposed new program.

Thanks for your consideration,
Boris



[1] https://review.openstack.org/#/c/108502/


Official Name
=

Performance and Scalability

Codename


Rally

Scope
=

Scalability benchmarking, performance analysis, and profiling of
OpenStack components and workloads

Mission
===

To increase the scalability and performance of OpenStack clouds by:

* defining standard benchmarks
* sharing performance data between operators and developers
* providing transparency of code paths through profiling tools

Maturity


* Meeting logs http://eavesdrop.openstack.org/meetings/rally/2014/
* IRC channel: #openstack-rally
* Rally performance jobs are in (Cinder, Glance, Keystone  Neutron)
check pipelines.
*  950 commits over last 10 months
* Large, diverse contributor community
  * 
http://stackalytics.com/?release=juno&metric=commits&project_type=All&module=rally
  * http://stackalytics.com/report/contribution/rally/180

* Non official lead of project is Boris Pavlovic
  * Official election In progress.

Deliverables


Critical deliverables in the Juno cycle are:

* extending Rally Benchmark framework to cover all use cases that are
required by all OpenStack projects
* integrating OSprofiler in all core projects
* increasing functional  unit testing coverage of Rally.

Discussion
==

One of the major goals of Rally is to make it simple to share results of
standardized benchmarks and experiments between operators and
developers. When an operator needs to verify certain performance
indicators meet some service level agreement, he will be able to run
benchmarks (from Rally) and share with the developer community the
results along with his OpenStack configuration. These benchmark results
will assist developers in diagnosing particular performance and
scalability problems experienced with the operator's configuration.

Another interesting area is Rally and the OpenStack CI process. Currently,
working on performance issues upstream tends to be a more social than
technical process. We can use Rally in the upstream gates to identify
performance regressions and measure improvement in scalability over
time. The use of Rally in the upstream gates will allow a more rigorous,
scientific approach to performance analysis. In the case of an
integrated OSprofiler, it will be possible to get detailed information
about API call flows (e.g. duration of API calls in different services).




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [gate] The gate: a failure analysis

2014-07-21 Thread David Kranz

On 07/21/2014 04:13 PM, Jay Pipes wrote:

On 07/21/2014 02:03 PM, Clint Byrum wrote:

Thanks Matthew for the analysis.

I think you missed something though.

Right now the frustration is that unrelated intermittent bugs stop your
presumably good change from getting in.

Without gating, the result would be that even more bugs, many of them 
not intermittent at all, would get in. Right now, the one random developer
who has to hunt down the rechecks and do them is inconvenienced. But
without a gate, _every single_ developer will be inconvenienced until
the fix is merged.

The false negative rate is _way_ too high. Nobody would disagree there.
However, adding more false negatives and allowing more people to ignore
the ones we already have, seems like it would have the opposite effect:
Now instead of annoying the people who hit the random intermittent bugs,
we'll be annoying _everybody_ as they hit the non-intermittent ones.


+10

Right, but perhaps there is a middle ground. We must not allow changes 
in that can't pass through the gate, but we can separate the problems
of constant rechecks using too many resources, and of constant rechecks 
causing developer pain. If failures were deterministic we would skip the 
failing tests until they were fixed. Unfortunately many of the common 
failures can blow up any test, or even the whole process. Following on 
what Sam said, what if we automatically reran jobs that failed in a 
known way, and disallowed "recheck/reverify no bug"? Developers would 
then have to track down what bug caused a failure or file a new one. But 
they would have to do so much less frequently, and as more common 
failures were catalogued it would become less and less frequent.
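
As a rough sketch of what I mean (the signatures and the failure catalogue 
below are made up for illustration; this is not elastic-recheck or any 
existing tool):

# Toy sketch of "rerun only known failures": match a failed job's console
# log against catalogued failure signatures. Bug ids and regexes here are
# illustrative placeholders.
import re

KNOWN_FAILURES = {
    'bug/1306029': re.compile(r'Timed out waiting for .* to become ACTIVE'),
    'bug/1322011': re.compile(r'TimeoutException: .* failed to delete'),
}


def classify_failure(console_log):
    """Return the id of a catalogued intermittent failure, or None."""
    for bug, pattern in KNOWN_FAILURES.items():
        if pattern.search(console_log):
            return bug
    return None


def handle_failed_job(console_log, requeue, ask_for_triage):
    bug = classify_failure(console_log)
    if bug is not None:
        # Known intermittent failure: rerun automatically, no manual recheck.
        requeue(reason=bug)
    else:
        # Unknown failure: the developer has to find or file a bug first.
        ask_for_triage()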


Some might (reasonably) argue that this would be a bad thing because it 
would reduce the incentive for people to fix bugs if there were less 
pain being inflicted. But given how hard it is to track down these race 
bugs, and that we as a community have no way to force time to be spent 
on them, and that it does not appear that these bugs are causing real 
systems to fall down (only our gating process), perhaps something 
different should be considered?


 -David


Best,
-jay


Excerpts from Matthew Booth's message of 2014-07-21 03:38:07 -0700:

On Friday evening I had a dependent series of 5 changes all with
approval waiting to be merged. These were all refactor changes in the
VMware driver. The changes were:

* VMware: DatastorePath join() and __eq__()
https://review.openstack.org/#/c/103949/

* VMware: use datastore classes get_allowed_datastores/_sub_folder
https://review.openstack.org/#/c/103950/

* VMware: use datastore classes in file_move/delete/exists, mkdir
https://review.openstack.org/#/c/103951/

* VMware: Trivial indentation cleanups in vmops
https://review.openstack.org/#/c/104149/

* VMware: Convert vmops to use instance as an object
https://review.openstack.org/#/c/104144/

The last change merged this morning.

In order to merge these changes, over the weekend I manually submitted:

* 35 rechecks due to false negatives, an average of 7 per change
* 19 resubmissions after a change passed, but its dependency did not

Other interesting numbers:

* 16 unique bugs
* An 87% false negative rate
* 0 bugs found in the change under test

Because we don't fail fast, that is an average of at least 7.3 hours in
the gate. Much more in fact, because some runs fail on the second pass,
not the first. Because we don't resubmit automatically, that is only if
a developer is actively monitoring the process continuously, and
resubmits immediately on failure. In practise this is much longer,
because sometimes we have to sleep.

All of the above numbers are counted from the change receiving an
approval +2 until final merging. There were far more failures than this
during the approval process.

Why do we test individual changes in the gate? The purpose is to find
errors *in the change under test*. By the above numbers, it has failed
to achieve this at least 16 times previously.

Probability of finding a bug in the change under test: Small
Cost of testing:   High
Opportunity cost of slowing development:   High

and for comparison:

Cost of reverting rare false positives:    Small

The current process expends a lot of resources, and does not achieve its
goal of finding bugs *in the changes under test*. In addition to using a
lot of technical resources, it also prevents good change from making its
way into the project and, not unimportantly, saps the will to live of
its victims. The cost of the process is overwhelmingly greater than its
benefits. The gate process as it stands is a significant net negative to
the project.

Does this mean that it is worthless to run these tests? Absolutely not!
These tests are vital to highlight a severe quality deficiency in
OpenStack. Not addressing this is, imho, an existential risk to the
project. However, the current approach is 

Re: [openstack-dev] [QA] Proposed Changes to Tempest Core

2014-07-21 Thread David Kranz
+1

On Jul 21, 2014, at 6:37 PM, Matthew Treinish mtrein...@kortar.org wrote:

 
 Hi Everyone,
 
 I would like to propose 2 changes to the Tempest core team:
 
 First, I'd like to nominate Andrea Frittoli to the Tempest core team. Over the
 past cycle Andrea has steadily become more actively engaged in the Tempest
 community. Besides his code contributions around refactoring Tempest's
 authentication and credentials code, he has been providing reviews that have
 been of consistently high quality that show insight into both the project
 internals and its future direction. In addition he has been active in the
 qa-specs repo both providing reviews and spec proposals, which has been very
 helpful as we've been adjusting to using the new process. Keeping in mind that
 becoming a member of the core team is about earning the trust from the members
 of the current core team through communication and quality reviews, not 
 simply a
 matter of review numbers, I feel that Andrea will make an excellent addition 
 to
 the team.
 
 As per the usual, if the current Tempest core team members would please vote 
 +1
 or -1(veto) to the nomination when you get a chance. We'll keep the polls open
 for 5 days or until everyone has voted.
 
 References:
 
 https://review.openstack.org/#/q/reviewer:%22Andrea+Frittoli+%22,n,z
 
 http://stackalytics.com/?user_id=andrea-frittolimetric=marksmodule=qa-group
 
 
 The second change that I'm proposing today is to remove Giulio Fidente from 
 the
 core team. He asked to be removed from the core team a few weeks back because 
 he
 is no longer able to dedicate the required time to Tempest reviews. So if 
 there
 are no objections to this I will remove him from the core team in a few days.
 Sorry to see you leave the team Giulio...
 
 
 Thanks,
 
 Matt Treinish

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa][all] Branchless Tempest beyond pure-API tests, impact on backporting policy

2014-07-10 Thread David Kranz

On 07/10/2014 08:53 AM, Matthew Treinish wrote:

On Thu, Jul 10, 2014 at 08:37:40AM -0400, Eoghan Glynn wrote:

Note that the notifications that capture these resource state transitions
are a long-standing mechanism in openstack that ceilometer has depended
on from the very outset. I don't think it's realistic to envisage these
interactions will be replaced by REST APIs any time soon.

I wasn't advocating doing everything over a REST API. (API is an
overloaded term) I just meant that if we're depending on
notifications for communication between projects then we should
enforce a stability contract on them. Similar to what we already
have with the API stability guidelines for the REST APIs. The fact
that there is no direct enforcement on notifications, either through
social policy or testing, is what I was taking issue with.

I also think if we decide to have a policy of enforcing notification
stability then we should directly test the notifications from an
external repo to block slips. But, that's a discussion for later, if
at all.

A-ha, OK, got it.

I've discussed enforcing such stability with jogo on IRC last night, and
kicked off a separate thread to capture that:

   http://lists.openstack.org/pipermail/openstack-dev/2014-July/039858.html

However the time-horizon for that effort would be quite a bit into the
future, compared to the test coverage that we're aiming to have in place
for juno-2.


The branchless Tempest spec envisages new features will be added and
need to be skipped when testing stable/previous, but IIUC requires
that the presence of new behaviors is externally discoverable[5].

I think the test case you proposed is fine. I know some people will
argue that it is expanding the scope of tempest to include more
whitebox-like testing, because the notifications are an internal 
side-effect of the api call, but I don't see it that way. It feels
more like exactly what tempest is there to enable testing, a
cross-project interaction using the api.

In my example, APIs are only used to initiate the action in cinder
and then to check the metering data in ceilometer.

But the middle-piece, i.e. the interaction between cinder  ceilometer,
is not mediated by an API. Rather, its carried via an unversioned
notification.

Yeah, exactly, that's why I feel it's a valid Tempest test case.

Just to clarify: you meant to type it's a valid Tempest test case
as opposed to it's *not* a valid Tempest test case, right?

Heh, yes I meant to say, it is a valid test case.

  

What I was referring to as the counter argument, and where the
difference of opinion was, is that the test will be making REST API
calls to both trigger a nominally internal mechanism (the
notification) from the services and then using the ceilometer api to
validate the notification worked.

Yes, that's exactly the idea.


But, arguably the real intent of these tests is to validate
that internal mechanism, which is basically a whitebox test. The
argument was that by testing it in tempest we're testing
notifications poorly; because of its black box limitation,
notifications will just be tested indirectly. Which I feel is a
valid point, but not a sufficient reason to exclude the notification
tests from tempest.

Agreed.


I think the best way to move forward is to have functional whitebox
tests for the notifications as part of the individual projects
generating them, and that way we can directly validate the
notifications. But, I also feel there should be tempest tests on top
of that that verify the ceilometer side of consuming the
notification and the api exposing that information.
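
To make that concrete, the kind of cross-API check being discussed would 
look roughly like this (the client objects and their calls are assumed for 
illustration, not the exact tempest ones):

import time


def check_volume_creation_is_metered(volumes_client, telemetry_client,
                                     timeout=60, interval=2):
    """Create a volume through the cinder API, then poll the ceilometer API
    until a 'volume' sample for that resource shows up.

    The two client objects are assumed to expose create_volume() and
    list_samples() roughly like the tempest clients do; the exact names and
    signatures here are illustrative only.
    """
    volume = volumes_client.create_volume(size=1)
    deadline = time.time() + timeout
    while time.time() < deadline:
        samples = telemetry_client.list_samples(meter='volume',
                                                resource_id=volume['id'])
        if samples:
            return True
        time.sleep(interval)
    return False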

Excellent. So, indeed, more fulsome coverage of the notification
logic with in-tree tests on the producer side would definitely
be welcome, and could be seen as a phase zero of an overall effort to
fix/improve the notification mechanism.
   

But, there is also a slight misunderstanding here. Having a
feature be externally discoverable isn't a hard requirement for a
config option in tempest, it's just *strongly* recommended. Mostly,
because if there isn't a way to discover it how are end users
expected to know what will work.

A-ha, I missed the subtle distinction there and thought that this
discoverability was a *strict* requirement. So how bad a citizen would
a project be considered to be if it chose not to meet that strong
recommendation?

You'd be far from the only ones who are doing that, for an existing example
look at anything on the nova driver feature matrix. Most of those aren't
discoverable from the API. So I think it would be ok to do that, but when we
have efforts like:

https://review.openstack.org/#/c/94473/

it'll make that more difficult. Which is why I think having discoverability
through the API is important. (it's the same public cloud question)

So for now, would it suffice for the master versus stable/icehouse
config to be checked-in in static form pending the completion of that
BP on tempest-conf-autogen?

Yeah, I think that'll be fine. The 

Re: [openstack-dev] [qa][all] Branchless Tempest beyond pure-API tests, impact on backporting policy

2014-07-10 Thread David Kranz

On 07/10/2014 09:47 AM, Thierry Carrez wrote:

Hi!

There is a lot of useful information in that post (even excluding the
part brainstorming solutions) and it would be a shame if it was lost in
a sub-thread. Do you plan to make a blog post, or reference wiki page,
out of this ?

Back to the content, I think a more layered testing approach (as
suggested) is a great way to reduce our gating issues, but also to
reduce the configuration matrix issue.

On the gating side, our current solution is optimized to detect rare
issues. It's a good outcome, but the main goal should really be to
detect in-project and cross-project regressions, while not preventing us
from landing patches. Rare issues detection should be a side-effect of
the data we generate, not the life-and-death issue it currently is.

So limiting co-gating tests to external interfaces blackbox testing,
while the project would still run more whitebox tests on its own
behavior sounds like a good idea. It would go a long way to limit the
impact a rare issue in project A has on project B velocity, which is
where most of the gate frustration comes from.

Adding another level of per-project functional testing also lets us test
more configuration options outside of co-gating tests. If we can test
that MySQL and Postgres behave the same from Nova's perspective in
Nova-specific functional whitebox testing, then we really don't need to
test both in cogating tests. By being more specific in what we test for
each project, we can actually test more things by running less tests.

+10

Once we recognize that co-gating of every test on every commit does not 
scale, many other options come into play.
This issue is closely related to the decision by the qa group in Atlanta 
that migrating api tests from tempest to projects was a good idea.
All of this will have to be done incrementally, presumably on a project 
by project basis. I think neutron may lead the way.
There are many issues around sharing test framework code that Matt 
raised in another message.
When there are good functional api tests running in a project, a subset 
could be selected to run in the gate. This was the original intent of the 
'smoke' tag in tempest.

 -David


Sean Dague wrote:

I think we need to actually step back a little and figure out where we
are, how we got here, and what the future of validation might need to
look like in OpenStack. Because I think there have been some
communication gaps. (Also, for people I've had vigorous conversations
about this before, realize my positions have changed somewhat,
especially on separation of concerns.)

(Also note, this is all mental stream right now, so I will not pretend
that it's an entirely coherent view of the world, my hope in getting
things down is we can come up with that coherent view of the world together.)

== Basic History ==

In the essex time frame Tempest was 70 tests. It was basically a barely
adequate sniff test for integration for OpenStack. So much so that our
first 3rd Party CI system, SmokeStack, used its own test suite, which
legitimately found completely different bugs than Tempest. Not
surprising, Tempest was a really small number of integration tests.

As we got to Grizzly Tempest had grown to 1300 tests, somewhat
organically. People were throwing a mix of tests into the fold, some
using Tempest's client, some using official clients, some trying to hit
the database doing white box testing. It had become kind of a mess and a
Rorschach test. We had some really weird design summit sessions because
many people had only looked at a piece of Tempest, and assumed the rest
was like it.

So we spent some time defining scope. Tempest couldn't really be
everything to everyone. It would be a few things:
  * API testing for public APIs with a contract
  * Some throughput integration scenarios to test some common flows
(these were expected to be small in number)
  * 3rd Party API testing (because it had existed previously)

But importantly, Tempest isn't a generic function test suite. Focus is
important, because Tempest's mission always was highly aligned with what
eventually became called Defcore. Some way to validate some
compatibility between clouds. Be that clouds built from upstream (is the
cloud of 5 patches ago compatible with the cloud right now), clouds from
different vendors, public clouds vs. private clouds, etc.

== The Current Validation Environment ==

Today most OpenStack projects have 2 levels of validation. Unit tests and
Tempest. That's sort of like saying your house has a basement and a
roof. For sufficiently small values of house, this is fine. I don't
think our house is sufficiently small any more.

This has caused things like Neutron's unit tests, which actually bring
up a full wsgi functional stack and test plugins through http calls
through the entire wsgi stack, replicated 17 times. It's the reason that
Neutron unit tests take many GB of memory to run, and often run longer
than Tempest runs. (Maru has been doing hero's 

Re: [openstack-dev] [qa] [rfc] move scenario tests to tempest client

2014-07-10 Thread David Kranz

On 07/10/2014 08:08 AM, Frittoli, Andrea (HP Cloud) wrote:

++

The ugly monkey patch approach is still working fine for my downstream
testing, but that's something I'd be happy to get rid of.

Something that may be worth considering is to have an abstraction layer on top
of tempest clients, to allow switching the actual implementation below:

- REST call as now for the gate  jobs
- python calls for running the tests in non-integrated environments - these
would live in-tree with the services rather than in tempest - similar  to what
the neutron team is doing to run tests in tree
- python calls to the official clients, so that a tempest run could still be
used to verify the python bindings  in a dedicated job
+1 to using tempest client. The requirement for enhanced debugging 
features was not seen as critical when the original decision was made. I 
don't think the badness of the current situation was anticipated.

The abstraction layer comment is related to the discussion about 
moving functional api tests to projects with a retargetable client. I 
was discussing this with
Maru yesterday at the neutron mid-cycle.  In addition to abstracting the 
client, we need to abstract the way the test code calls the client and 
handles the results.
We looked at some of the networking tests in tempest. In addition to the 
boilerplate code for checking success response codes, which I have 
started moving to the clients, there is also boilerplate code around 
deserializing the body that is returned. There is also a lack of 
uniformity in what the various client method signatures look like. Here 
is a strawman proposal we came up with to unify this:


Most of our REST APIs accept some parameters that are inserted into the 
url string, and others that become part of a json payload. They return a 
response with a json body. A python client method should accept 
arguments for each inserted parameter, and **kwargs for the json part. 
If multiple success response codes might be returned, the method would 
take another argument specifying which should be checked.


The client methods should no longer return (resp, body) where body is 
somewhat raw. Since response checking will now be done all at the 
client, the resp return value is
no longer needed. The json body should be returned as a single return 
value, but as an attribute dict. Any extraneous top-level dict envelope 
can be stripped out. For example, the neutron create apis return a 
top-level dict with one value and a key that is the same name as the 
resource being created.
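
A rough sketch of what a client method could look like under this strawman 
(all names here are made up for illustration, and the rest client it sits 
on is assumed to return (resp, body)):

import json


class AttributeDict(dict):
    """A dict whose keys can also be read as attributes."""
    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


class SubnetsClient(object):
    """Illustrative client, not the real tempest one."""

    def __init__(self, rest_client):
        # rest_client is assumed to expose post(url, body) -> (resp, body)
        self.rest = rest_client

    def create_subnet(self, network_id, expected_code=201, **kwargs):
        # url parameters are real arguments, the json payload comes from kwargs
        post_body = {'subnet': dict(network_id=network_id, **kwargs)}
        resp, body = self.rest.post('subnets', json.dumps(post_body))
        if resp.status != expected_code:
            raise AssertionError('Expected %s, got %s'
                                 % (expected_code, resp.status))
        # The success check is done here; strip the single-key envelope and
        # return an attribute dict so tests can write subnet.cidr etc.
        return AttributeDict(json.loads(body)['subnet'])

A test would then just call create_subnet(net_id, cidr='10.0.0.0/24') and 
assert on the returned fields, with no response-code or deserialization 
boilerplate in the test body.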


Doing this would have several advantages:

1. The test code would be smaller.
2. The test code would only  be involved with behavior checking and any 
client-specific checking or serialize/deserialize would be done by the 
client.
3. As a result of (2), there would be sufficient abstraction that a 
variety of clients could be used by the same test code.


 -David

andrea

-Original Message-
From: Sean Dague [mailto:s...@dague.net]
Sent: 10 July 2014 12:23
To: OpenStack Development Mailing List (not for usage questions)
Subject: [openstack-dev] [qa] [rfc] move scenario tests to tempest client

As I've been staring at failures in the gate a lot over the past month, we've
managed to increasingly tune the tempest client for readability and
debugability. So when something fails in an API test, pinpointing its
failure point is getting easier. The scenario tests... not so much.

Using the official clients in the scenario tests was originally thought of as
a way to get some extra testing on those clients through Tempest.
However it has a ton of debt associated with it. And I think that client
testing should be done as functional tests in the client trees[1], not as a
side effect of Tempest.

  * It makes the output of a fail path radically different between the 2 types
  * It adds a bunch of complexity on tenant isolation (and basic duplication
between building accounts for both clients)
  * It generates a whole bunch of complexity around waiting for
resources, and safe creates which garbage collect. All of which has to be done
above the client level because the official clients don't provide that
functionality.

In addition the official clients don't do the right thing when hitting API
rate limits, so are dubious in running on real clouds. There was a proposed
ugly monkey patch approach which was just too much for us to deal with.

Migrating to tempest clients I think would clean up a ton of complexity, and
provide for a more straight forward debuggable experience when using Tempest.

I'd like to take a temperature on this though, so comments welcomed.

-Sean

[1] -
http://lists.openstack.org/pipermail/openstack-dev/2014-July/039733.html
(see New Thinking about our validation layers)

--
Sean Dague
http://dague.net



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org

[openstack-dev] [qa] Strange error in non-isolated periodic tests

2014-07-10 Thread David Kranz
I was trying to bring the periodic non-isolated jobs back to health. One 
problem with them is all the scenario tests fail with


Captured traceback:
~~~
Traceback (most recent call last):
  File "tempest/scenario/test_dashboard_basic_ops.py", line 39, in setUpClass
    super(TestDashboardBasicOps, cls).setUpClass()
  File "tempest/scenario/manager.py", line 72, in setUpClass
    network_resources=cls.network_resources)
  File "tempest/common/isolated_creds.py", line 41, in __init__
    self._get_admin_clients())
  File "tempest/common/isolated_creds.py", line 54, in _get_admin_clients
    auth.get_default_credentials('identity_admin')
  File "tempest/clients.py", line 481, in __init__
    credentials)
  File "tempest/clients.py", line 707, in _get_ceilometer_client
    import ceilometerclient.client
ImportError: No module named ceilometerclient.client






That was from 
http://logs.openstack.org/periodic-qa/periodic-tempest-dsvm-full-non-isolated-master/f1bd962/console.html.gz
The only difference between this and the normal tempest jobs that are 
passing is that allow_tenant_isolation is off and it is running 
serially. I cannot reproduce it in my local environment either. Does 
anyone have any clues?


 -David
___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [keystone][qa] Returning 203 in keystone v2 list apis?

2014-07-03 Thread David Kranz
While moving success response code checking in tempest to the client, I 
noticed that exactly one of the calls to list users for a tenant checked 
for 200 or 203. Looking at 
http://docs.openstack.org/api/openstack-identity-service/2.0/content/, 
it seems that most of the list apis can return 203. But given that 
almost all of the tempest tests only pass on getting 200, I am guessing 
that 203 is not actually ever being returned. Is the doc just wrong? If 
not, what kind of call would trigger a 203 response?


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa][nova] Please run 'check experimental' on changes to nova v3 tests in tempest

2014-07-02 Thread David Kranz
Due to the status of nova v3, to save time, running the tempest v3 tests 
has been moved out of the gate/check jobs to the experimental queue. So 
please run 'check experimental' on v3-related patches.


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] checking success response in clients

2014-06-30 Thread David Kranz
We approved 
https://github.com/openstack/qa-specs/blob/master/specs/client-checks-success.rst 
which recommends that checking of correct success codes be moved to the 
tempest clients. This has been done for the image tests but not others 
yet. But new client/test code coming in should definitely be doing the 
checks in the client rather than the test bodies. Here is the image 
change for reference: https://review.openstack.org/#/c/101310/
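
For anyone who has not looked at the image change, the pattern is roughly 
this (a simplified sketch, not the code that actually merged):

class UnexpectedResponseCode(Exception):
    pass


def expect_success(resp, expected_codes):
    # The client raises on an unexpected status, so test bodies no longer
    # need to assert resp.status themselves.
    if resp.status not in expected_codes:
        raise UnexpectedResponseCode('Expected one of %s, got %s'
                                     % (sorted(expected_codes), resp.status))


class ImagesClient(object):
    """Illustrative only, not the merged tempest client."""

    def __init__(self, rest_client):
        self.rest = rest_client   # assumed to expose delete() -> (resp, body)

    def delete_image(self, image_id):
        resp, body = self.rest.delete('images/%s' % image_id)
        expect_success(resp, {204})
        return body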


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [marconi][qa] marconi check run in tempest is on experimental queue

2014-06-23 Thread David Kranz
Please remember to do 'check experimental' after uploading new marconi 
patches in tempest.


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Rethink how we manage projects? (was Gate proposal - drop Postgresql configurations in the gate)

2014-06-16 Thread David Kranz

On 06/16/2014 05:33 AM, Thierry Carrez wrote:

David Kranz wrote:

[...]
There is a different way to do this. We could adopt the same methodology
we have now around gating, but applied to each project on its own
branch. These project branches would be integrated into master at some
frequency or when some new feature in project X is needed by project Y.
Projects would want to pull from the master branch often, but the push
process would be less frequent and run a much larger battery of tests
than we do now.

So we would basically discover the cross-project bugs when we push to
the "master" master branch. I think you're just delaying discovery of
the most complex issues, and push the responsibility to resolve them
onto a inexistent set of people. Adding integration branches only makes
sense if you have an integration team. We don't have one, so we'd call
back on the development teams to solve the same issues... with a delay.
You are assuming that the problem is cross-project bugs. A lot of these 
bugs are not really bugs that
are *caused* by cross-project interaction. Many are project-specific 
bugs that could have been squashed before being integrated if enough 
testing had been done, but since we do all of our testing in a 
fully-integrated environment we often don't know where they came from. I 
am not suggesting this proposal would help much to get out of the 
current jam, just make it harder to get into it again once master is 
stabilized.


In our specific open development setting, delaying is bad because you
don't have a static set of developers that you can assume will be on
call ready to help with what they have written a few months later:
shorter feedback loops are key to us.
I hope you did not think I was suggesting a few months as a typical 
frequency for a project updating master. That would be unacceptable. But 
there is a continuum between on every commit and months. I was 
thinking of perhaps once a week but it would really depend on a lot of 
things that happen.



Doing this would have the following advantages:

1. It would be much harder for a race bug to get in. Each commit would
be tested many more times on its branch before being merged to master
than at present, including tests specialized for that project. The
qa/infra teams and others would continue to define acceptance at the
master level.
2. If a race bug does get in, projects have at least some chance to
avoid merging the bad code.
3. Each project can develop its own gating policy for its own branch
tailored to the issues and tradeoffs it has. This includes focus on
spending time running their own tests. We would no longer run a complete
battery of nova tests on every commit to swift.
4. If a project branch gets into the situation we are now in:
  a) it does not impact the ability of other projects to merge code
  b) it is highly likely the bad code is actually in the project so
it is known who should help fix it
  c) those trying to fix it will be domain experts in the area that
is failing
5. Distributing the gating load and policy to projects makes the whole
system much more scalable as we add new projects.

Of course there are some drawbacks:

1. It will take longer, sometimes much longer, for any individual commit
to make it to master. Of course if a super-serious issue made it to
master and had to be fixed immediately it could be committed to master
directly.
2. Branch management at the project level would be required. Projects
would have to decide gating criteria, timing of pulls, and coordinate
around integration to master with other projects.
3. There may be some technical limitations with git/gerrit/whatever that
I don't understand but which would make this difficult.
4. It makes the whole thing more complicated from a process standpoint.

An extra drawback is that you can't really do CD anymore, because your
"master" master branch gets big chunks of new code in one go at push time.
That depends on how big and delayed the chunks are. The question is: how 
do we test commits enough to make sure they don't cause new races 
without using vastly more resources than we have, and without it taking 
days to test a patch? I am suggesting an alternative as a possible 
least-bad approach, not a panacea. I didn't think that doing CD implied 
literally that the unit of integration was exactly one developer commit.



I have used this model in previous large software projects and it worked
quite well. This may also be somewhat similar to what the linux kernel
does in some ways.

Please keep in mind that some techniques which are perfectly valid (and
even recommended) when you have a captive set of developers just can't
work in our open development setting. Some techniques which work
perfectly for a release-oriented product just don't cut it when you also
want the software to be consumable in a continuous delivery fashion. We
certainly can and should learn from other experiences, but we also need
to recognize our challenges

[openstack-dev] [qa] Clarification of policy for qa-specs around adding new tests

2014-06-16 Thread David Kranz
I have been reviewing some of these specs and sense a lack of clarity 
around what is expected. In the pre-qa-specs world we did not want 
tempest blueprints to be used by projects to track their tempest test 
submissions because the core review team did not want to have to spend a 
lot of time dealing with that. We said that each project could have one 
tempest blueprint that would point to some other place (project 
blueprints, spreadsheet, etherpad, etc.) that would track specific tests 
to be added. I'm not sure what aspect of the new qa-spec process would 
make us feel differently about this. Has this policy changed? We should 
spell out the expectation in any event. I will update the README when we 
have a conclusion.


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Rethink how we manage projects? (was Gate proposal - drop Postgresql configurations in the gate)

2014-06-13 Thread David Kranz

On 06/13/2014 07:31 AM, Sean Dague wrote:

On 06/13/2014 02:36 AM, Mark McLoughlin wrote:

On Thu, 2014-06-12 at 22:10 -0400, Dan Prince wrote:

On Thu, 2014-06-12 at 08:06 -0400, Sean Dague wrote:

We're definitely deep into capacity issues, so it's going to be time to
start making tougher decisions about things we decide aren't different
enough to bother testing on every commit.

In order to save resources why not combine some of the jobs in different
ways. So for example instead of:

  check-tempest-dsvm-full
  check-tempest-dsvm-postgres-full

Couldn't we just drop the postgres-full job and run one of the Neutron
jobs w/ postgres instead? Or something similar, so long as at least one
of the jobs which runs most of Tempest is using PostgreSQL I think we'd
be mostly fine. Not shooting for 100% coverage for everything with our
limited resource pool is fine, lets just do the best we can.

Ditto for gate jobs (not check).

I think that's what Clark was suggesting in:

https://etherpad.openstack.org/p/juno-test-maxtrices


Previously we've been testing Postgresql in the gate because it has a
stricter interpretation of SQL than MySQL. And when we didn't test
Postgresql it regressed. I know, I chased it for about 4 weeks in grizzly.

However Monty brought up a good point at Summit, that MySQL has a strict
mode. That should actually enforce the same strictness.

My proposal is that we land this change to devstack -
https://review.openstack.org/#/c/97442/ and backport it to past devstack
branches.

Then we drop the pg jobs, as the differences between the 2 configs
should then be very minimal. All the *actual* failures we've seen
between the 2 were completely about this strict SQL mode interpretation.
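
For anyone who hasn't played with it, the difference is easy to demo (a toy 
sketch; the connection details are placeholders and this is not the 
devstack change itself):

# Toy demo of strict vs. non-strict MySQL behavior. Connection details
# are placeholders.
import pymysql

conn = pymysql.connect(host='127.0.0.1', user='root',
                       password='secret', database='test')
with conn.cursor() as cur:
    cur.execute("CREATE TABLE IF NOT EXISTS strict_demo (name VARCHAR(3))")

    # Permissive mode: the value is silently truncated (warning only).
    cur.execute("SET SESSION sql_mode = ''")
    cur.execute("INSERT INTO strict_demo (name) VALUES ('toolong')")

    # Strict mode: the same insert now fails instead of losing data.
    cur.execute("SET SESSION sql_mode = 'TRADITIONAL'")
    try:
        cur.execute("INSERT INTO strict_demo (name) VALUES ('toolong')")
    except pymysql.err.Error as exc:
        print("strict mode rejected the insert: %s" % exc)
conn.commit()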


I suppose I would like to see us keep it in the mix. Running SmokeStack
for almost 3 years I found many an issue dealing w/ PostgreSQL. I ran it
concurrently with many of the other jobs and I too had limited resources
(much less than what we have in infra today).

Would MySQL strict SQL mode catch stuff like this (old bugs, but still
valid for this topic I think):

  https://bugs.launchpad.net/nova/+bug/948066

  https://bugs.launchpad.net/nova/+bug/1003756


Having support for and testing against at least 2 databases helps keep
our SQL queries and migrations cleaner... and is generally a good
practice given we have abstractions which are meant to support this sort
of thing anyway (so by all means let us test them!).

Also, Having compacted the Nova migrations 3 times now I found many
issues by testing on multiple databases (MySQL and PostgreSQL). I'm
quite certain our migrations would be worse off if we just tested
against the single database.

Certainly sounds like this testing is far beyond the "might one day be
useful" level Sean talks about.

The migration compaction is a good point. And I'm happy to see there
were some bugs exposed as well.

Here is where I remain stuck

We are now at a failure rate in which it's 3 days (minimum) to land a
fix that decreases our failure rate at all.

The way we are currently solving this is by effectively building manual
zuul and taking smart humans in coordination to end run around our
system. We've merged 18 fixes so far -
https://etherpad.openstack.org/p/gatetriage-june2014 this way. Merging a
fix this way is at least an order of magnitude more expensive on people
time because of the analysis and coordination we need to go through to
make sure these things are the right things to jump the queue.

That effort, over 8 days, has gotten us down to *only* a 24hr merge
delay. And there are no more smoking guns. What's left is a ton of
subtle things. I've got ~ 30 patches outstanding right now (a bunch are
things to clarify what's going on in the build runs especially in the
fail scenarios). Every single one of them has been failed by Jenkins at
least once. Almost every one was failed by a different unique issue.

So I'd say at best we're 25% of the way towards solving this. That being
said, because of the deep queues, people are just recheck grinding (or
hitting the jackpot and landing something through that then fails a lot
after landing). That leads to bugs like this:

https://bugs.launchpad.net/heat/+bug/1306029

Which was seen early in the patch - https://review.openstack.org/#/c/97569/

Then kind of destroyed us completely for a day -
http://status.openstack.org/elastic-recheck/ (it's the top graph).

And, predictably, a week into a long gate queue everyone is now grumpy.
The sniping between projects, and within projects in assigning blame
starts to spike at about day 4 of these events. Everyone assumes someone
else is to blame for these things.

So there is real community impact when we get to these states.



So, I'm kind of burnt out trying to figure out how to get us out of
this. As I do take it personally when we as a project can't merge code.
As that's a terrible state to be in.

Pleading to get more people to dive in, is mostly not helping.

So my only 

[openstack-dev] [qa] Meaning of 204 from DELETE apis

2014-06-12 Thread David Kranz
Tempest has a number of tests in various services for deleting objects 
that mostly return 204. Many, but not all, of these tests go on to check 
that the resource was actually deleted but do so in different ways. 
Sometimes they go into a timeout loop waiting for a GET on the object to 
fail. Sometimes they immediately call DELETE again or GET and assert 
that it fails. According to what I can see about the HTTP spec, 204 
should mean that the object was deleted. So is waiting for something to 
disappear unnecessary? Is immediate assertion wrong? Does this behavior 
vary service to service? We should be as consistent about this as 
possible but I am not sure what the expected behavior of all services 
actually is.
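
To be concrete, the two verification styles look roughly like this (the 
client and exception names are illustrative, not real tempest code):

# Sketch of the two verification styles seen in the tests. The client is
# assumed to expose delete_thing()/get_thing(); names are illustrative.
import time


def delete_and_assert_gone(client, thing_id, not_found_exc):
    # Style 1: trust the 204 and assert the resource is gone immediately.
    client.delete_thing(thing_id)          # expected to return 204
    try:
        client.get_thing(thing_id)
    except not_found_exc:
        return
    raise AssertionError('%s still visible right after a 204' % thing_id)


def delete_and_wait_gone(client, thing_id, not_found_exc,
                         timeout=60, interval=1):
    # Style 2: treat deletion as eventually consistent and poll for the 404.
    client.delete_thing(thing_id)
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            client.get_thing(thing_id)
        except not_found_exc:
            return
        time.sleep(interval)
    raise AssertionError('%s still visible after %s seconds'
                         % (thing_id, timeout))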


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] Meaning of 204 from DELETE apis

2014-06-12 Thread David Kranz

On 06/12/2014 05:27 PM, Jay Pipes wrote:

On 06/12/2014 05:17 PM, David Kranz wrote:

Tempest has a number of tests in various services for deleting objects
that mostly return 204. Many, but not all, of these tests go on to check
that the resource was actually deleted but do so in different ways.
Sometimes they go into a timeout loop waiting for a GET on the object to
fail. Sometimes they immediately call DELETE again or GET and assert
that it fails. According to what I can see about the HTTP spec, 204
should mean that the object was deleted. So is waiting for something to
disappear unnecessary? Is immediate assertion wrong? Does this behavior
vary service to service? We should be as consistent about this as
possible but I am not sure what the expected behavior of all services
actually is.


The main problem I've seen is that while the resource is deleted, it 
stays in a deleting state for some time, and quotas don't get adjusted 
until the server is finally set to a terminated status.
So you are talking about nova here. In tempest I think we need to more 
clearly distinguish when delete is being called to test the delete api 
vs. as part of some cleanup. There was an irc discussion related to this 
recently.  The question is, if I do a delete and get a 204, can I expect 
that immediately doing another delete or get will fail? And that 
question needs an answer for each api that has delete in order to have 
proper tests for delete.


 -David


Best,
-jay


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] Should Assignee, Milestone really be in the qa-spec?

2014-06-11 Thread David Kranz
While reviewing some specs I noticed that I had put myself down for more 
Juno-2 work than is likely to be completed. I suspect this will happen 
routinely with many folks. Also, assignees may change. This information 
is not really part of the spec at all. Since we are still using 
blueprints to actually track progress, I think it would be better to use 
the corresponding fields in blueprints to make sure these values reflect 
reality on an ongoing basis.


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] Should Assignee, Milestone really be in the qa-spec?

2014-06-11 Thread David Kranz

On 06/11/2014 03:50 PM, Matthew Treinish wrote:

On Wed, Jun 11, 2014 at 03:17:48PM -0400, David Kranz wrote:

While reviewing some specs I noticed that I had put myself down for more
Juno-2 work than is likely to be completed. I suspect this will happen
routinely with many folks. Also, assignees may change. This information is
not really part of the spec at all. Since we are still using blueprints to
actually track progress, I think it would be better to use the corresponding
fields in blueprints to make sure these values reflect reality on an ongoing
basis.


TL;DR: While they're not part of the spec they are part of the proposal and I
feel they have value when I'm reviewing a spec because that does influence my
feedback.

This was something we actually debated when we first added the spec template.
I think that they both have value, and my view at the time and still is that we
want both, in the spec review.

The milestone target isn't a hard date but just a realistic time frame of how
long you're expecting the work to take. I think the template even says something
like targeted milestone of completion to reflect this. The target milestone is a
part of the spec proposal and as a part of reviewing the spec I'm considering it
to gauge what work the spec is going to entail. I don't think having to jump
back and forth between the spec and the blueprint is a good way to do it. The
date in the spec is definitely not binding I have 2 approved bps that I targeted
for J-1 that I'm deferring one milestone. I think just tracking that in the BP
after the spec is approved is fine.

As for the assignees I think that's also fine to keep in the template mostly
just because of the limitation of LP to only have a single assignee. It's also
come up where people are drafting specs but don't plan to work on anything. So
seeing in the proposal that someone has signed up to do the work I think is
important. Just like the milestone I think we'll track this in LP after it's
been approved. As someone who's reviewing the specs I look to see that someone
has signed up to work before I approve it and how many people are working on it
to gauge how involved implementing it will be. It probably makes some sense to
add something to the template to indicate that once it's approved LP will
contain the contribution history and the current assignee.
I think that's fine. My issue was really more with having to use gerrit 
to do real-time tracking than with their presence in the spec when it is 
reviewed.


 -David


-Matt Treinish


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] shared review dashboard proposal

2014-06-10 Thread David Kranz

On 06/09/2014 02:24 PM, Sean Dague wrote:

On 06/09/2014 01:38 PM, David Kranz wrote:

On 06/02/2014 06:57 AM, Sean Dague wrote:

Towards the end of the summit there was a discussion about us using a
shared review dashboard to see if a common view by the team would help
accelerate people looking at certain things. I spent some time this
weekend working on a tool to make building custom dashboard urls much
easier.

My current proposal is the following, and would like comments on it:
https://github.com/sdague/gerrit-dash-creator/blob/master/dashboards/qa-program.dash

All items in the dashboard are content that you've not voted on in the
current patch revision, that you don't own, and that have passing
Jenkins test results.

1. QA Specs - these need more eyes, so we highlight them at top of page
2. Patches that are older than 5 days, with no code review
3. Patches that you are listed as a reviewer on, but haven't voted on
current version
4. Patches that already have a +2, so should be landable if you agree.
5. Patches that have no negative code review feedback on them
6. Patches older than 2 days, with no code review

Thanks, Sean. This is working great for me, but I think there is another
important item that is missing and hope it is possible to add, perhaps
even among the most important items:

Patches that you gave a -1, but the response is a comment explaining why
the -1 should be withdrawn rather than a new patch.

So how does one automatically detect those using the gerrit query language?

-Sean
Based on the docs I looked at, you can't. The one downside of everyone 
using a dashboard like this is that if a patch does not show in your 
view, it is as if it does not exist for you. So at least for now, if you 
want someone to remove a -1 based on some argument, you have to ping 
them directly. Not the end of the world.


 -David



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] shared review dashboard proposal

2014-06-09 Thread David Kranz

On 06/02/2014 06:57 AM, Sean Dague wrote:

Towards the end of the summit there was a discussion about us using a
shared review dashboard to see if a common view by the team would help
accelerate people looking at certain things. I spent some time this
weekend working on a tool to make building custom dashboard urls much
easier.

My current proposal is the following, and would like comments on it:
https://github.com/sdague/gerrit-dash-creator/blob/master/dashboards/qa-program.dash

All items in the dashboard are content that you've not voted on in the
current patch revision, that you don't own, and that have passing
Jenkins test results.

1. QA Specs - these need more eyes, so we highlight them at top of page
2. Patches that are older than 5 days, with no code review
3. Patches that you are listed as a reviewer on, but haven't voted on
current version
4. Patches that already have a +2, so should be landable if you agree.
5. Patches that have no negative code review feedback on them
6. Patches older than 2 days, with no code review
Thanks, Sean. This is working great for me, but I think there is another 
important item that is missing and hope it is possible to add, perhaps 
even among the most important items:


Patches that you gave a -1, but the response is a comment explaining why 
the -1 should be withdrawn rather than a new patch.


 -David


These are definitely a judgement call on what people should be looking
at, but this seems a pretty reasonable triaging list. I'm happy to have
a discussion on changes to this list.

The url for this is -  http://goo.gl/g4aMjM

(the long url is very long:
https://review.openstack.org/#/dashboard/?foreach=%28project%3Aopenstack%2Ftempest+OR+project%3Aopenstack-dev%2Fgrenade+OR+project%3Aopenstack%2Fqa-specs%29+status%3Aopen+NOT+owner%3Aself+NOT+label%3AWorkflow%3C%3D-1+label%3AVerified%3E%3D1%2Cjenkins+NOT+label%3ACode-Review%3C%3D-1%2Cself+NOT+label%3ACode-Review%3E%3D1%2Cselftitle=QA+Review+InboxQA+Specs=project%3Aopenstack%2Fqa-specsNeeds+Feedback+%28Changes+older+than+5+days+that+have+not+been+reviewed+by+anyone%29=NOT+label%3ACode-Review%3C%3D2+age%3A5dYour+are+a+reviewer%2C+but+haven%27t+voted+in+the+current+revision=reviewer%3AselfNeeds+final+%2B2=%28project%3Aopenstack%2Ftempest+OR+project%3Aopenstack-dev%2Fgrenade%29+label%3ACode-Review%3E%3D2+limit%3A50Passed+Jenkins%2C+No+Negative+Feedback=NOT+label%3ACode-Review%3E%3D2+NOT+label%3ACode-Review%3C%3D-1+limit%3A50Wayward+Changes+%28Changes+with+no+code+review+in+the+last+2days%29=NOT+label%3ACode-Review%3C%3D2+age%3A2d

The url can be regenerated easily using the gerrit-dash-creator.

-Sean



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa][nova] Status of v3 tests in tempest

2014-05-20 Thread David Kranz

On 05/20/2014 03:19 PM, Christopher Yeoh wrote:
On Tue, May 20, 2014 at 8:58 PM, Sean Dague s...@dague.net wrote:


On 05/19/2014 11:49 PM, Christopher Yeoh wrote:

 - if/else inlined in tests based on the microversion mode that is
 being tested at the moment (perhaps least amount of code but cost is
 readability)
 - class inheritance (override specific bits where necessary - bit more
 code, but readability better?)
 - duplicated tests (min sharing)

Realistically, the current approach won't scale to micro versions. We
really won't be able to have 100 directories for Nova, or a 100 class
inheritances.

When a micro version happens, it will affect a small number of
interfaces. So the important thing will be testing those interfaces
before and after that change. We'll have to be really targeted here.
Much like the way the database migration tests with data injection
are.

Honestly, I think this is going to be hard to fully map until
we've got
an interesting version sitting in front of us.


So I agree that we won't be able to have a new directory for every 
microversion. But for the v2/v3 changes

we already have a lot of typical minor changes we'll need to handle. Eg.

- a parameter that has been renamed or removed (effectively the same 
thing from an API point of view)

- a success status code that has changed
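
As a purely illustrative sketch (not existing tempest code; the field names 
and codes are invented), differences like these could be captured by 
overriding a couple of class attributes shared by otherwise identical tests:

class V2Expectations(object):
    """Per-version details; the values here are invented for illustration."""
    rebuild_success_code = 202
    admin_password_field = 'adminPass'


class V3Expectations(V2Expectations):
    # Only the bits that changed between the versions are overridden.
    rebuild_success_code = 200
    admin_password_field = 'admin_password'


def check_rebuild_response(expect, status, body):
    """Shared test logic, parameterized by the per-version expectations."""
    assert status == expect.rebuild_success_code
    assert expect.admin_password_field in body['server']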

Something like say a tasks API would I think be quite different 
because there would be a lot less shared code for the tests and so 
we'll need a different solution.


I guess what I'm saying is once we have a better idea of how the 
microversion interface will work then I think doing the work to 
minimise the code duplication on the tempest side is worth it because 
we have lots of examples of the sorts of cases we'll need to handle.


I agree. I think what Sean is saying, and this was the original intent 
of starting this thread, is that the structure we come up with for micro 
versions will look a lot different than the v2/v3 consolidation that was 
in progress in tempest when the decision to abandon v3 as a monolithic 
new api was made. So we have to stop the current changes based on a 
monolithic v2/v3, and then come up with a new organization based on 
micro versions when the nova approach has solidified sufficiently.


 -David



Regards,

Chris

-Sean

--
Sean Dague
http://dague.net


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa][nova] Status of v3 tests in tempest

2014-05-19 Thread David Kranz
It seems the nova team decided in Atlanta that v3 as currently 
understood is never going to exist:

https://etherpad.openstack.org/p/juno-nova-v3-api.

There are a number of patches in flight that tweak how we handle 
supporting both v2/v3 in tempest to reduce duplication.
We need to decide what to do about this. At a minimum, I think we should 
stop any work that is inspired by any v3-related activity
except to revert any v2/v3 integration that was already done. We should 
really rip out the v3 stuff that was recently added. I know Matt had 
some concern about that regarding testing v3 in stable/icehouse but 
perhaps he can say more.


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa][nova] Status of v3 tests in tempest

2014-05-19 Thread David Kranz

On 05/19/2014 01:24 PM, Frittoli, Andrea (HP Cloud) wrote:

Thanks for bringing this up.

We won't be testing v3 in Juno, but we'll need coverage for v2.1.

In my understanding it will be a v2-compatible API - so including proxies to
glance, cinder and neutron - but with micro-versions to bring in v3 features
such as CamelCase and Tasks.
So we should be able to reuse a good chunk of the v3 test code for testing
v2.1. By adding some config options for the v2.1 to v3 differences, we could
try to use the same tests for icehouse v3 and juno v2.1.
While it is true that we may reuse some of the actual test code 
currently in v3, the overall code structure for micro-versions will be
much different than for a parallel v2/v3. I wanted to make sure everyone 
on the qa list knows that v3 is being scrapped and that we should 
stop making changes that are intended only to enhance the 
maintainability of an active v2/v3 scenario.


With regard to icehouse, my understanding is that we are basically 
deprecating v3 as an api before it was ever declared stable. Should we 
continue to carry technical debt in tempest to support testing the 
unstable v3 in icehouse? Another alternative, if we really want to 
continue testing v3 on icehouse but want to remove v3 from tempest, 
would be to create a stable/icehouse branch in tempest and run that 
against changes to stable/icehouse in projects in addition to running 
tempest master.


 -David


We may have to implement support for micro-versions in tempest's own rest
client as well.

andrea


-Original Message-
From: David Kranz [mailto:dkr...@redhat.com]
Sent: 19 May 2014 10:49
To: OpenStack Development Mailing List
Subject: [openstack-dev] [qa][nova] Status of v3 tests in tempest

It seems the nova team decided in Atlanta that v3 as currently understood
is never going to exist:
https://etherpad.openstack.org/p/juno-nova-v3-api.

There are a number of patches in flight that tweak how we handle supporting
both v2/v3 in tempest to reduce duplication.
We need to decide what to do about this. At a minimum, I think we should
stop any work that is inspired by any v3-related activity except to revert
any v2/v3 integration that was already done. We should really rip out the v3
stuff that was recently added. I know Matt had some concern about that
regarding testing v3 in stable/icehouse but perhaps he can say more.

   -David



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa][nova] Status of v3 tests in tempest

2014-05-19 Thread David Kranz

Removing [nova]

On 05/19/2014 02:55 PM, Sean Dague wrote:

My suggestion is that we stop merging new Nova v3 tests from here
forward. However I think until we see the fruits of the v2.1 effort I
don't want to start ripping stuff out.
Fair enough but we need to revert, or at least stop taking patches, for 
https://blueprints.launchpad.net/tempest/+spec/nova-api-test-inheritance
which is trying to make supporting two monolithic apis share code. We 
will share code for micro versions but it will be distributed and not 
based on class inheritance.


 -David


The path to removal is going to be disabling Nova v3 in devstack-gate,
when the Nova team decides it's right to do that. Once it's disconnected
we can start the removals. Because the interface wasn't considered stable
in icehouse, I don't think we need to keep it around for the icehouse tree.

-Sean

On 05/19/2014 07:42 AM, David Kranz wrote:

On 05/19/2014 01:24 PM, Frittoli, Andrea (HP Cloud) wrote:

Thanks for bringing this up.

We won't be testing v3 in Juno, but we'll need coverage for v2.1.

In my understanding it will be a v2-compatible API - so including proxies to
glance, cinder and neutron - but with micro-versions to bring in v3 features
such as CamelCase and Tasks.
So we should be able to reuse a good chunk of the v3 test code for testing
v2.1. By adding some config options for the v2.1 to v3 differences, we could
try to use the same tests for icehouse v3 and juno v2.1.

While it is true that we may reuse some of the actual test code
currently in v3, the overall code structure for micro-versions will be
much different than for a parallel v2/v3. I wanted to make sure everyone
on the qa list knows that v3 is being scrapped and that we should
stop making changes that are intended only to enhance the
maintainability of an active v2/v3 scenario.

With regard to icehouse, my understanding is that we are basically
deprecating v3 as an api before it was ever declared stable. Should we
continue to carry technical debt in tempest to support testing the
unstable v3 in icehouse? Another alternative, if we really want to
continue testing v3 on icehouse but want to remove v3 from tempest,
would be to create a stable/icehouse branch in tempest and run that
against changes to stable/icehouse in projects in addition to running
tempest master.

  -David

We may have to implement support for micro-versions in tempest's own rest
client as well.

andrea


-Original Message-
From: David Kranz [mailto:dkr...@redhat.com]
Sent: 19 May 2014 10:49
To: OpenStack Development Mailing List
Subject: [openstack-dev] [qa][nova] Status of v3 tests in tempest

It seems the nova team decided in Atlanta that v3 as currently understood
is never going to exist:
https://etherpad.openstack.org/p/juno-nova-v3-api.

There are a number of patches in flight that tweak how we handle supporting
both v2/v3 in tempest to reduce duplication.
We need to decide what to do about this. At a minimum, I think we should
stop any work that is inspired by any v3-related activity except to revert
any v2/v3 integration that was already done. We should really rip out the v3
stuff that was recently added. I know Matt had some concern about that
regarding testing v3 in stable/icehouse but perhaps he can say more.

   -David



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] Checking for return codes in tempest client calls

2014-05-09 Thread David Kranz

On 05/09/2014 11:29 AM, Matthew Treinish wrote:

On Thu, May 08, 2014 at 09:50:03AM -0400, David Kranz wrote:

On 05/07/2014 10:48 AM, Ken'ichi Ohmichi wrote:

Hi Sean,

2014-05-07 23:28 GMT+09:00 Sean Dague s...@dague.net:

On 05/07/2014 10:23 AM, Ken'ichi Ohmichi wrote:

Hi David,

2014-05-07 22:53 GMT+09:00 David Kranz dkr...@redhat.com:

I just looked at a patch https://review.openstack.org/#/c/90310/3 which was
given a -1 due to not checking that every call to list_hosts returns 200. I
realized that we don't have a shared understanding or policy about this. We
need to make sure that each api is tested to return the right response, but
many tests need to call multiple apis in support of the one they are
actually testing. It seems silly to have the caller check the response of
every api call. Currently there are many, if not the majority of, cases
where api calls are made without checking the response code. I see a few
possibilities:

1. Move all response code checking to the tempest clients. They are already
checking for failure codes and are now doing validation of json response and
headers as well. Callers would only do an explicit check if there were
multiple success codes possible.

2. Have a clear policy of when callers should check response codes and apply
it.

I think the first approach has a lot of advantages. Thoughts?

Thanks for proposing this, I also prefer the first approach.
We will be able to remove a lot of status code checks if we go in
this direction.
It is necessary for bp/nova-api-test-inheritance tasks also.
Current https://review.openstack.org/#/c/92536/ removes status code checks
because some Nova v2/v3 APIs return different codes and the codes are already
checked in client side.

but it is necessary to create a lot of patch for covering all API tests.
So for now, I feel it is OK to skip status code checks in API tests
only if client side checks are already implemented.
After implementing all client validations, we can remove them of API
tests.

Do we still have instances where we want to make a call that we know
will fail and not throw the exception?

I agree there is a certain clarity in putting this down in the rest
client. I just haven't figured out if it's going to break some behavior
that we currently expect.

If a server returns unexpected status code, Tempest fails with client
validations
like the following sample:

Traceback (most recent call last):
   File /opt/stack/tempest/tempest/api/compute/servers/test_servers.py,
line 36, in test_create_server_with_admin_password
 resp, server = self.create_test_server(adminPass='testpassword')
   File /opt/stack/tempest/tempest/api/compute/base.py, line 211, in
create_test_server
 name, image_id, flavor, **kwargs)
   File /opt/stack/tempest/tempest/services/compute/json/servers_client.py,
line 95, in create_server
 self.validate_response(schema.create_server, resp, body)
   File /opt/stack/tempest/tempest/common/rest_client.py, line 596,
in validate_response
 raise exceptions.InvalidHttpSuccessCode(msg)
InvalidHttpSuccessCode: The success code is different than the expected one
Details: The status code(202) is different than the expected one([200])


Thanks
Ken'ichi Ohmichi


Note that there are currently two different methods on RestClient
that do this sort of thing. Your stacktrace shows
validate_response which expects to be passed a schema. The other
is expected_success which takes the expected response code and is
only used by the image clients.
Both of these will need to stay around since not all APIs have
defined schemas but the expected_success method should probably be
changed to accept a list of valid success responses rather than just
one as it does at present.

So expected_success() is just a better way of doing something like:

assert.Equals(resp.status, 200)

There isn't anything specific about the images clients with it.
validate_response() should just call expected_success(), which I pushed out
here:
https://review.openstack.org/93035
Right, I was just observing that it was only used by the image clients 
at present.




I hope we can get agreement to move response checking to the client.
There was no opposition when we started doing this in nova to check
schema. Does anyone see a reason not to do this? It would both
simplify the code and make sure responses are checked in all cases.

Sean, do you have a concrete example of what you are concerned about
here? Moving the check from the value returned by a client call to
inside the client code should not have any visible effect unless the
value was actually wrong but not checked by the caller. But this
would be a bug that was just found if a test started failing.


Please draft a spec/bp for doing this, we can sort out the implementation
details in the spec review

Re: [openstack-dev] [qa] Checking for return codes in tempest client calls

2014-05-08 Thread David Kranz

On 05/07/2014 10:48 AM, Ken'ichi Ohmichi wrote:

Hi Sean,

2014-05-07 23:28 GMT+09:00 Sean Dague s...@dague.net:

On 05/07/2014 10:23 AM, Ken'ichi Ohmichi wrote:

Hi David,

2014-05-07 22:53 GMT+09:00 David Kranz dkr...@redhat.com:

I just looked at a patch https://review.openstack.org/#/c/90310/3 which was
given a -1 due to not checking that every call to list_hosts returns 200. I
realized that we don't have a shared understanding or policy about this. We
need to make sure that each api is tested to return the right response, but
many tests need to call multiple apis in support of the one they are
actually testing. It seems silly to have the caller check the response of
every api call. Currently there are many, if not the majority of, cases
where api calls are made without checking the response code. I see a few
possibilities:

1. Move all response code checking to the tempest clients. They are already
checking for failure codes and are now doing validation of json response and
headers as well. Callers would only do an explicit check if there were
multiple success codes possible.

2. Have a clear policy of when callers should check response codes and apply
it.

I think the first approach has a lot of advantages. Thoughts?

Thanks for proposing this, I also prefer the first approach.
We will be able to remove a lot of status code checks if we go in
this direction.
It is necessary for bp/nova-api-test-inheritance tasks also.
Current https://review.openstack.org/#/c/92536/ removes status code checks
because some Nova v2/v3 APIs return different codes and the codes are already
checked in client side.

but it is necessary to create a lot of patch for covering all API tests.
So for now, I feel it is OK to skip status code checks in API tests
only if client side checks are already implemented.
After implementing all client validations, we can remove them of API
tests.

Do we still have instances where we want to make a call that we know
will fail and not throw the exception?

I agree there is a certain clarity in putting this down in the rest
client. I just haven't figured out if it's going to break some behavior
that we currently expect.

If a server returns unexpected status code, Tempest fails with client
validations
like the following sample:

Traceback (most recent call last):
   File /opt/stack/tempest/tempest/api/compute/servers/test_servers.py,
line 36, in test_create_server_with_admin_password
 resp, server = self.create_test_server(adminPass='testpassword')
   File /opt/stack/tempest/tempest/api/compute/base.py, line 211, in
create_test_server
 name, image_id, flavor, **kwargs)
   File /opt/stack/tempest/tempest/services/compute/json/servers_client.py,
line 95, in create_server
 self.validate_response(schema.create_server, resp, body)
   File /opt/stack/tempest/tempest/common/rest_client.py, line 596,
in validate_response
 raise exceptions.InvalidHttpSuccessCode(msg)
InvalidHttpSuccessCode: The success code is different than the expected one
Details: The status code(202) is different than the expected one([200])


Thanks
Ken'ichi Ohmichi

Note that there are currently two different methods on RestClient that 
do this sort of thing. Your stacktrace shows validate_response which 
expects to be passed a schema. The other is expected_success which 
takes the expected response code and is only used by the image clients.
Both of these will need to stay around since not all APIs have defined 
schemas but the expected_success method should probably be changed to 
accept a list of valid success responses rather than just one as it does 
at present.
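
For illustration, a minimal sketch of the kind of change being discussed 
(hypothetical code, not the actual RestClient implementation):

# Sketch only: let the client assert the success code once, accepting
# either a single expected status or a list of them, so callers no
# longer repeat assertEqual(resp.status, 200) after every call.
class InvalidHttpSuccessCode(Exception):
    pass


def expected_success(expected_code, read_code):
    if not isinstance(expected_code, (list, tuple)):
        expected_code = [expected_code]
    if read_code not in expected_code:
        raise InvalidHttpSuccessCode(
            'Status %s is not in the expected %s' % (read_code, expected_code))


# Hypothetical usage inside a client method:
#     resp, body = self.get('os-hosts')
#     expected_success(200, resp.status)
#     resp, body = self.post('servers', post_body)
#     expected_success([200, 202], resp.status)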


I hope we can get agreement to move response checking to the client. 
There was no opposition when we started doing this in nova to check 
schema. Does anyone see a reason not to do this? It would both simplify 
the code and make sure responses are checked in all cases.


Sean, do you have a concrete example of what you are concerned about 
here? Moving the check from the value returned by a client call to 
inside the client code should not have any visible effect unless the 
value was actually wrong but not checked by the caller. But this would 
be a bug that was just found if a test started failing.


 -David

 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [qa] Checking for return codes in tempest client calls

2014-05-07 Thread David Kranz
I just looked at a patch https://review.openstack.org/#/c/90310/3 which 
was given a -1 due to not checking that every call to list_hosts returns 
200. I realized that we don't have a shared understanding or policy 
about this. We need to make sure that each api is tested to return the 
right response, but many tests need to call multiple apis in support of 
the one they are actually testing. It seems silly to have the caller 
check the response of every api call. Currently there are many, if not 
the majority of, cases where api calls are made without checking the 
response code. I see a few possibilities:


1. Move all response code checking to the tempest clients. They are 
already checking for failure codes and are now doing validation of json 
response and headers as well. Callers would only do an explicit check if 
there were multiple success codes possible.


2. Have a clear policy of when callers should check response codes and 
apply it.


I think the first approach has a lot of advantages. Thoughts?

 -David



___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] Checking for return codes in tempest client calls

2014-05-07 Thread David Kranz

On 05/07/2014 10:07 AM, Duncan Thomas wrote:

On 7 May 2014 14:53, David Kranz dkr...@redhat.com wrote:

I just looked at a patch https://review.openstack.org/#/c/90310/3 which was
given a -1 due to not checking that every call to list_hosts returns 200. I
realized that we don't have a shared understanding or policy about this.

snip


Thoughts?

While I don't know the tempest code well enough to opine where the
check should be, every call should definitely be checked and failures
reported - I've had a few cases where I've debugged failures (some
tempest, some other tests) where somebody says 'my volume attach isn't
working' and the reason turned out to be because their instance never
came up properly, or snapshot delete failed because the create failed
but wasn't logged. Anything that causes the test to automatically
report the narrowest definition of the fault is definitely a good
thing.


Yes. To be clear, all calls raise an exception on failure. What we don't 
check on every call is whether an API that is supposed to return 200 might 
have returned 201, etc.


 -David


___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Openstack-dev][qa] Running tempest tests against existing openstack installation

2014-05-05 Thread David Kranz

On 05/05/2014 02:26 AM, Swapnil Kulkarni wrote:

Hello,

I am trying to run tempest tests against an existing openstack 
deployment. I have configured tempest.conf for the environment 
details. But when I execute run_tempest.sh, it does not run any tests.


Although when I run testr run, the tests fail with *NoSuchOptError: no 
such option: IMAGE_ID*


This must be coming from not having changed this value in the [compute] 
section of tempest.conf:


#image_ref={$IMAGE_ID}

See etc/tempest.conf.sample.
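
For example, something like this (the values are placeholders for images 
that already exist in the glance of the cloud under test):

[compute]
# UUIDs of two images already uploaded to the cloud under test
image_ref = <uuid-of-an-existing-image>
image_ref_alt = <uuid-of-a-second-existing-image>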

 -David



The trace has been added at [1]

If anyone has tried that before, any pointers are much appreciated.

[1] http://paste.openstack.org/show/78885/

Best Regards,
Swapnil Kulkarni
irc : coolsvap
cools...@gmail.com
+91-87960 10622(c)
http://in.linkedin.com/in/coolsvap
*It's better to SHARE*




___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Branchless Tempest QA Spec - final draft

2014-05-01 Thread David Kranz

On 05/01/2014 11:36 AM, Matthew Treinish wrote:

On Thu, May 01, 2014 at 06:18:10PM +0900, Ken'ichi Ohmichi wrote:

# Sorry for sending this again, previous mail was unreadable.

2014-04-28 11:54 GMT+09:00 Ken'ichi Ohmichi ken1ohmi...@gmail.com:

This is also why there are a bunch of nova v2 extensions that just add
properties to an existing API. I think in v3 the proposal was to do this with
microversioning of the plugins. (we don't have a way to configure
microversioned v3 api plugins in tempest yet, but we can cross that bridge when
the time comes) Either way it will allow tempest to have in config which
behavior to expect.

Good point, my current understanding is:
When adding new API parameters to the existing APIs, these parameters should
be API extensions according to the above guidelines. So we have three options
for handling API extensions in Tempest:

1. Consider them optional, so incompatible changes to them cannot be
blocked. (Current)
2. Consider them required based on tempest.conf, so incompatible changes
can be blocked.
3. Consider them required automatically via microversioning, so incompatible
changes can be blocked.

I investigated option 3 above, and have one question about the current
Tempest implementation.

The verify_tempest_config tool currently gets the API extension list from
each service, including Nova, and verifies the API extension config in
tempest.conf based on that list.
Could we use that list to select which extension tests run, instead of
only for verification?
As you said in the previous IRC meeting, an API test is currently skipped
if it is decorated with requires_ext() and the extension is not specified
in tempest.conf. I feel it would be nice if Tempest got the API extension
list and selected API tests automatically based on it.

So we used to do this type of autodiscovery in tempest, but we stopped because
it let bugs slip through the gate. This topic has come up several times in the
past, most recently in discussing reorganizing the config file. [1] This is why
we put [2] in the tempest README. I agree autodiscovery would be simpler, but
the problem is that because we use tempest as the gate, if there were a bug that
caused autodiscovery to differ from what was expected, the tests would just
silently skip. This would often go unnoticed because of the sheer volume of
tempest tests. (I think we're currently at ~2300.) I also feel that explicitly
defining what is expected to be enabled is a key requirement for branchless
tempest, for the same reason.




The verify_tempest_config tool was an attempt at a compromise between being
explicit and using autodiscovery, by using the APIs to help create a config
file that reflects the current configuration state of the services. It's
still a WIP though, and it's really just meant to be a user tool. I don't ever
see it being included in our gate workflow.
I think we have to accept that there are two legitimate use cases for 
tempest configuration:


1. The entity configuring tempest is the same as the entity that 
deployed. This is the gate case.
2. Tempest is to be pointed at an existing cloud but was not part of a 
deployment process. We want to run the tests for the supported 
services/extensions.


We should modularize the code around discovery so that the discovery 
functions return the changes to conf that would have to be made. The 
callers can then decide how that information is to be used. This would 
support both use cases. I have some changes to the verify_tempest_config 
code that do this, which I will push up if the concept is agreed.
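
A rough sketch of this idea (hypothetical names, not the current
verify_tempest_config code):

# Discovery returns the *proposed* config changes without applying
# them; the caller decides whether to treat a difference as an error
# (gate case) or to write it into tempest.conf (existing-cloud case).

def discover_compute_extensions(extensions_client, configured_exts):
    """Return a dict of (section, option) -> value changes, or {}."""
    actual = set(e['alias']
                 for e in extensions_client.list_extensions()['extensions'])
    if actual == set(configured_exts):
        return {}
    return {('compute-feature-enabled', 'api_extensions'): sorted(actual)}


# Gate-style caller: fail on any non-empty result.
# Existing-cloud caller: merge the result into the generated tempest.conf.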


 -David

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

