[openstack-dev] [infra] Third Party CI naming and contact (action required)

2014-08-13 Thread James E. Blair
Hi,

We've updated the registration requirements for third-party CI systems
here:

  http://ci.openstack.org/third_party.html

We now have 86 third-party CI systems registered and have undertaken an
effort to make things more user-friendly for the developers who interact
with them.  There are two important changes to be aware of:

1) We now generally name third-party systems in a descriptive manner
including the company and product they are testing.  We have renamed
currently-operating CI systems to match these standards to the best of
our abilities.  Some of them ended up with particularly bad names (like
"Unknown Function...").  If your system is one of these, please join us
in #openstack-infra on Freenode to establish a more descriptive name.

2) We have established a standard wiki page template to supply a
description of the system, what is tested, and contact information for
each system.  See https://wiki.openstack.org/wiki/ThirdPartySystems for
an index of such pages and instructions for creating them.  Each
third-party CI system will have its own page in the wiki and it must
include a link to that page in every comment that it leaves in Gerrit.

If you operate a third-party CI system, please ensure that you register
a wiki page and update your system to link to it in every new Gerrit
comment by the end of August.  Beginning in September, we will disable
systems that have not been updated.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit

2014-08-13 Thread James E. Blair
cor...@inaugust.com (James E. Blair) writes:

> Sean Dague  writes:
>
>> This has all gone far enough that someone actually wrote a Grease Monkey
>> script to purge all the 3rd Party CI content out of Jenkins UI. People
> are writing mail filters to dump all the notifications. Dan Berrange
> filters them all out of his gerrit query tools.
>
> I should also mention that there is a pending change to do something
> similar via site-local Javascript in our Gerrit:
>
>   https://review.openstack.org/#/c/95743/
>
> I don't think it's an ideal long-term solution, but if it works, we may
> have some immediate relief without all having to install greasemonkey
> scripts.

You may have noticed that this has merged, along with a further change
that shows the latest results in a table format.  (You may need to
force-reload in your browser to see the change.)

The table only includes CI systems that leave their results in the same
format that we use for Jenkins; we will update the recommendations for
third party CI systems to encourage the use of that format.

This is all still fairly brittle, based mostly on javascript-powered
screen scraping.  However, I'm hoping we can get something like it in a
Gerrit plugin for a more long-term solution to the problem.

Thanks again to Radoslav Gerganov for writing the original change.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit

2014-08-13 Thread James E. Blair
Dan Smith  writes:

>> You may have noticed that this has merged, along with a further change
>> that shows the latest results in a table format.  (You may need to
>> force-reload in your browser to see the change.)
>
> Friggin. Awesome.
>
>> Thanks again to Radoslav Gerganov for writing the original change.
>
> Thanks to all involved, as this is a major improvement for everyone!
>
> One thing that we discussed at the nova meetup was having a space for
> each CI we *expect* to vote. I haven't looked at the implementation
> here, but I assume it just parses the comments to generate the table.
> How hard would it be to make the table show all the CI systems we expect
> so that it's very obvious that one has gone MIA (as they often do)
> before we merge a patch? I think we struggle right now with merging
> things that a CI system would have NAKed, but only because we didn't
> notice that it hadn't voted.

I think my preferred method of doing this would be to drive all
third-party CI systems from OpenStack's Zuul.  We are not far from being
able to do that technically, though it's not clear to me that there is
the will for third party CI systems to do that.  However, if we really
are expecting them to report back and are willing to hold up merging a
change because of it, then we really should consider that.

As we look into using a Gerrit plugin for test results, we could see
about adding that as a feature.

Something that we could technically do immediately would be to add
third-party CI systems as reviewers to changes when they are uploaded.
Then they would show up in the list of reviewers at the top of the
change.  This would require maintaining a list of which systems were
expected to vote on which repositories.  Rather than keeping that in
git, we could use group membership for it (and implement something we
have talked about off-and-on, which is using Gerrit group membership to
grant voting privileges to each individual repository).  That would also
allow projects to self-manage both who is permitted to vote on their
repos and who is expected to vote.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit

2014-08-13 Thread James E. Blair
Chmouel Boudjnah  writes:

> On Wed, Aug 13, 2014 at 3:05 PM, James E. Blair  wrote:
>
>> You may have noticed that this has merged, along with a further change
>> that shows the latest results in a table format.  (You may need to
>> force-reload in your browser to see the change.)
>>
>
>
> Very cool!! this is really nice UI, super useful
>
> one little suggestion for the folks who know how to do that, if it's
> possible to do: sort between the voting and the non-voting so we can
> easily spot which ones are worthwhile to look at or not.

If it is not worth looking at a job that is run by the OpenStack CI
system, please propose a patch to openstack-infra/config to delete it
from the Zuul config.  We only want to run what's useful, and we have
other methods (the silent and experimental queues) to develop new jobs
without creating noise.

If there is a third-party CI system reporting non-voting jobs, um, I
don't know what that means.  If it bothers you, you might ask the
third-party CI system to disable them and if they don't, then ask us to
disable the third-party CI system.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit

2014-08-13 Thread James E. Blair
Chmouel Boudjnah  writes:

> On Wed, Aug 13, 2014 at 6:27 PM, James E. Blair  wrote:
>
>> If it is not worth looking at a job that is run by the OpenStack CI
>> system, please propose a patch to openstack-infra/config to delete it
>> from the Zuul config.  We only want to run what's useful, and we have
>> other methods (the silent and experimental queues) to develop new jobs
>> without creating noise.
>>
>> If there is a third-party CI system reporting non-voting jobs, um, I
>> don't know what that means.  If it bothers you, you might ask the
>> third-party CI system to disable them and if they don't, then ask us to
>> disable the third-party CI system.
>
> I didn't mean that they were not worthwhile to look at; I was just thinking
> it could be useful to sort them so we can easily identify from a UI
> perspective which ones are voting or not.

I think part of why I responded with that is because this is not the
first time I have heard someone say they don't want to see the
non-voting jobs.  If that's a real desire, we should get to the bottom
of it.  We don't actually want the CI system to be annoying.  :)

From my perspective, anything red should represent something that
someone needs to process and either correct or make an informed decision
to ignore.  That is, a failing non-voting job should either cause the
submitter to fix a problem with their code (same as a failing voting
job), or investigate the result and determine that in this case, the
non-voting job can be safely ignored (ie, is an expected result of the
change).

If we have non-voting jobs that don't match that criteria, we should
remove them.  Or make them voting.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Acceptable methods for establishing per-test-suite behaviors

2014-08-22 Thread James E. Blair
Hi,

One of the things we've wanted for a while in some projects is a
completely separate database environment for each test when using MySQL.
To that end, I wrote a MySQL schema fixture that is in use in nodepool:

http://git.openstack.org/cgit/openstack-infra/nodepool/tree/nodepool/tests/__init__.py#n75

While a per-test schema is more overhead than what you're asking about,
it's sometimes very desirable and quite simple.
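
For illustration, a minimal sketch of the idea (not the actual nodepool
fixture; it assumes the fixtures library, PyMySQL, and a local MySQL
account that is allowed to create databases) might look like this:

  import uuid

  import fixtures
  import pymysql


  class MySQLSchemaFixture(fixtures.Fixture):
      """Create a throwaway database for one test and drop it afterwards."""

      def setUp(self):
          super(MySQLSchemaFixture, self).setUp()
          # Random suffix so concurrent tests never collide.
          self.name = 'test_' + uuid.uuid4().hex[:16]
          self._execute('CREATE DATABASE %s' % self.name)
          self.addCleanup(self._execute, 'DROP DATABASE %s' % self.name)

      def _execute(self, statement):
          # Credentials are placeholders; the name is generated above, so
          # interpolating it into DDL is safe enough for a test fixture.
          conn = pymysql.connect(host='127.0.0.1',
                                 user='openstack_citest',
                                 password='openstack_citest')
          cursor = conn.cursor()
          cursor.execute(statement)
          cursor.close()
          conn.close()

  # In a testtools test case:
  #     db = self.useFixture(MySQLSchemaFixture())
  #     engine = sqlalchemy.create_engine(
  #         'mysql+pymysql://openstack_citest:openstack_citest'
  #         '@127.0.0.1/%s' % db.name)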

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] [neutron] [tc] Neutron Incubator workflow

2014-08-26 Thread James E. Blair
Hi,

After reading https://wiki.openstack.org/wiki/Network/Incubator I have
some thoughts about the proposed workflow.

We have quite a bit of experience and some good tools around splitting
code out of projects and into new projects.  But we don't generally do a
lot of importing code into projects.  We've done this once, to my
recollection, in a way that preserved history, and that was with the
switch to keystone-lite.

It wasn't easy; it's major git surgery and would require significant
infra-team involvement any time we wanted to do it.

However, reading the proposal, it occurred to me that it's pretty clear
that we expect these tools to be able to operate outside of the Neutron
project itself, to even be releasable on their own.  Why not just stick
with that?  In other words, the goal of this process should be to create
separate projects with their own development lifecycle that will
continue indefinitely, rather than expecting the code itself to merge
into the neutron repo.

This has advantages in simplifying workflow and making it more
consistent.  Plus it builds on known integration mechanisms like APIs
and python project versions.

But more importantly, it helps scale the neutron project itself.  I
think that a focused neutron core upon which projects like these can
build on in a reliable fashion would be ideal.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] [glance] python namespaces considered harmful to development, lets not introduce more of them

2014-08-27 Thread James E. Blair
Sean Dague  writes:

> On 08/27/2014 11:14 AM, Flavio Percoco wrote:
>> On 08/27/2014 04:31 PM, Sean Dague wrote:
>> 1. Do a partial rename and then complete it after the glance migration
>> is done. If I'm not missing anything, we should be able to do something
>> like:
>>  - Rename the project internally
>>  - Release a new version with the new name `glancestore`
>>  - Switch glance over to `glancestore`
>>  - Complete the rename process with support from infra
>> 
>> 2. Let this patch land, complete Glance's switch-over using namespaces
>> and then do the rename all together.
>> 
>> Do you have any other suggestion that would help avoiding namespaces
>> without blocking glance.store?
>
> I think those are the 2 paths. I think path #1 is completely sane. All
> it will mean is that the repo name isn't exactly the package name, but I
> think that's fine. We can take the repo name later in the release cycle.
> So I'm really pro #1 if you can do that.

I'm happy to rename the project as early as this Saturday if that is
convenient.  I expect the system to be quiet enough for that by then.

It would be helpful if some glance.store folks are around to expedite
approval of any changes needed as a result of the rename before the rush
begins again on Monday.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] [neutron] [tc] Neutron Incubator workflow

2014-08-27 Thread James E. Blair
Kevin Benton  writes:

> From what I understand, the intended projects for the incubator can't
> operate without neutron because they are just extensions/plugins/drivers.

I could have phrased that better.  What I meant was that they could
operate without being actually in the Neutron repo, not that they could
not operate without Neutron itself.

The proposal for the incubator is that extensions be developed outside
of the Neutron repo.  My proposed refinement is that they stay outside
of the Neutron repo.  They live their entire lives as extension modules
in separate projects.

> For example, if the DVR modifications to the reference L3 plugin
> weren't already being developed in the tree, DVR could have been developed
> in the incubator and then merged into Neutron once the bugs were ironed out
> so a huge string of Gerrit patches didn't need to be tracked. If that had
> happened, would it make sense to keep the L3 plugin as a completely
> separate project or merge it? I understand this is the approach the load
> balancer folks took by making Octavia a separate project, but I think it
> can still operate on its own, where the reference L3 plugin (and many of
> the other incubator projects) are just classes that expect to be able to
> make core Neutron calls.

The list of Juno/Kilo candidates doesn't seem to have projects that are
quite so low-level.

If a feature is going to become part of the neutron core, then it should
be developed in the neutron repository.  If we need a place to land code
that isn't master, it's actually far easier to just use a feature branch
on the neutron repo.  Commits can land there as needed, master can be
periodically merged into it, and when the feature is ready, the feature
branch can be merged into master.

I think between those two options: incubate/spin-out components that are
high-level enough not to have deep integration in the neutron core, and
using feature branches for large experimental changes to the core
itself, we can handle the problems the incubator repo is intended to
address.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [all] Gerrit Downtime on August 30, 2014

2014-08-28 Thread James E. Blair
Hi,

Gerrit will be unavailable starting at 1600-1630 UTC on Saturday,
August 30, 2014 to rename the glance.store project to glancestore.

I apologize for the late notice; however, in another thread on the -dev
list, you'll find the rationale for executing this change swiftly.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Gerrit Downtime on August 30, 2014

2014-08-28 Thread James E. Blair
Flavio Percoco  writes:

> On 08/28/2014 05:39 PM, James E. Blair wrote:
>> Hi,
>> 
>> Gerrit will be unavailable starting at 1600-1630 UTC on Saturday,
>> August 30, 2014 to rename the glance.store project to glancestore.
>
> I went with glance_store
>
> Hope that's fine!

Even better!

> Thanks a lot for addressing this so quickly.

No problem.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack-Infra] [third-party] [infra] New mailing lists for third party announcements and account requests

2014-08-29 Thread James E. Blair
Stefano Maffulli  writes:

> On Fri 29 Aug 2014 12:47:00 PM PDT, Elizabeth K. Joseph wrote:
>> Third-party-request
>>
>> This list is the new place to request the creation or modification of
>> your third party account. Note that old requests sent to the
>> openstack-infra mailing list don't need to be resubmitted, they are
>> already in the queue for creation.
>
> I'm not happy about this decision: creating new lists is expensive, it
> multiplies entry points for newcomers, which need to be explained *and*
> understood. We're multiplying processes, rules, points of contact and
> places to monitor, be aware of... I feel overwhelmed. I wonder how much
> worse that feeling is for people who are not 150% of their time
> following discussions online and offline on all OpenStack channels.

I'm thrilled about it.  Creating new lists is cheap, a lot cheaper than
asking people who want to discuss infrastructure tooling to wade through
hundreds of administrative messages about ssh keys, email addresses,
etc.

> Are you sure that a mailing list is the most appropriate way of handling
> requests? Aren't bug trackers more appropriate instead?  And don't we
> have a bug tracker already?

It's the best way we have right now, until we have time to make it more
self-service.  We received one third-party CI request in 2 years, then
we received 88 more in 6 months.  Our current process is built around
the old conditions.  I don't know if the request list will continue
indefinitely, but the announce list will.  We definitely need a
low-volume place to announce changes to third-party CI operators.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Announcing Gertty 1.0.0: A console interface to Gerrit

2014-09-04 Thread James E. Blair
Announcing Gertty 1.0.0

Gertty is a console-based interface to the Gerrit Code Review system.

If that doesn't sound interesting to you, then just skip right on to
the next message.  This mailing list gets a lot of traffic, and it's
going to take you a while to read it all in that web browser you're
using.

Gertty was written by and for coremudgeons.  But it's not just because
we think mutt is the apex of user interface design.

We write code in a terminal.  We read logs in a terminal.  We debug code
in a terminal.  We commit in a terminal.  You know what's next.

This is why I wrote Gertty:

 * Workflow -- the interface is designed to support a workflow similar
   to reading network news or mail.  In particular, it is designed to
   deal with a large number of review requests across a large number
   of projects.

 * Offline Use -- Gertty syncs information about changes in subscribed
   projects to a local database and local git repos.  All review
   operations are performed against that database and then synced back
   to Gerrit.

 * Speed -- user actions modify locally cached content and need not
   wait for server interaction.

 * Convenience -- because Gertty downloads all changes to local git
   repos, a single command instructs it to checkout a change into that
   repo for detailed examination or testing of larger changes.

 * Information Architecture -- in a console environment, Gertty can
   display information to reviewers in a more compact and relevant
   way.

 * Colors -- I think ANSI escape sequences are a neat idea.

Here are some reasons you may want to use Gertty:

 * Single page diff -- when you look at a diff, all of the files are
   displayed on the same screen making it easier to see the full
   context of a change as you scroll effortlessly around the files
   that comprise it.  This may be the most requested feature in
   Gerrit.  It was harder to make Gertty show only one file than
   it was to do all of them, so that's what we have.  You still get the
   choice of side-by-side or unified diff, color coding, inline
   comments, and intra-line diffs.

 * The checkout and cherry-pick commands -- Gertty works directly on
   your local git repos, even the same ones you hack on.  It doesn't
   change them unless you ask it to, so normally you don't notice it's
   there, but with a simple command you can tell Gertty to check out a
   change into your working tree, or cherry-pick a bunch of changes
   onto a branch to build up a new patch series.  It's like "git
   review -d" if you've ever used it, but instead of typing "git
   review -d what-was-that-change-number-again?" you type "c".

 * Your home address is seat 7A (or especially if it's 1A) -- Gertty
   works seamlessly online or offline so you can review changes while
   you're flying to your 15th mid-cycle meetup.  Gertty syncs all of
   the open changes for subscribed projects to a local database and
   performs all of its operations there.  When it's able to connect to
   Gerrit, it uploads your reviews instantly.  When it's unable, they
   are queued for the next time you are online.  It handles the
   transition between online and offline effortlessly.  If your
   Internet connection is slow or unreliable, Gertty helps with that
   too.

 * You review a lot of changes -- Gertty is fast.  All of the typical
   review operations are performed against the local database or the
   local git repos.  Gertty can review changes as fast as you can.  It
   has commands to instantly navigate from change to change, and
   shortcuts to leave votes on a change with a single keypress.

 * You are particular about the changes you review -- Gertty lets you
   subscribe to projects, and then displays each of those projects
   along with the number of open changes and changes you have not
   reviewed.  Open up those projects like you would a newsgroup or
   email folder, and scroll down the list of changes.  If you don't
   have anything to say about a change but want to see it again the
   next time it's updated, just hit a key to mark it reviewed.  If you
   don't want to see a change ever again, hit a different key to kill
   it.  Gertty helps you review all of the changes you want to review,
   and none of the changes you don't.

 * Radical customization -- The queries that Gertty uses by default
   can be customized.  It uses the same search syntax as Gerrit and
   supports most of its operators.  It has user-defined dashboards that
   can be bound to any key.  In fact, any command can be bound to any
   key.  The color palette can be customized.  You spend a lot of time
   reviewing changes, you should be comfortable.

 * Your terminal is an actual terminal -- Gertty works just fine in 80
   columns, but it is also happy to spread out into hundreds of
   columns for ideal side-by-side diffing.

 * Colors -- you think ANSI escape sequences are a neat idea.

If you're ready to give it a shot, here's what to do:

  pip install gertty
  wget https://git.openstack.org/cgit/stackforge/gertty/plain/examples/openstack-gertty.yaml -O ~/.gertty.yaml
  # edit ~/.gertty.yaml and update anything that says "CHANGEME"
  gertty

It will walk you through what to do next.  For help on any screen, hit
F1 or "?".

For more information on installation or usage, see the README here:
https://pypi.python.org/pypi/gertty

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Announcing Gertty 1.0.0: A console interface to Gerrit

2014-09-04 Thread James E. Blair
cor...@inaugust.com (James E. Blair) writes:

> If you're ready to give it a shot, here's what to do:
>
>   pip install gertty
>   wget 
> https://git.openstack.org/cgit/stackforge/gertty/plain/examples/openstack-gertty.yaml
>  -O ~/.gertty.yaml
>   # edit ~/.gertty.yaml and update anything that says "CHANGEME"
>   gertty
>
> It will walk you through what to do next.  For help on any screen, hit
> F1 or "?".
>
> For more information on installation or usage, see the README here:
> https://pypi.python.org/pypi/gertty

Of course it's that easy.  If you already have a Gerrit HTTP password.

But you probably don't.

Gertty uses Gerrit's new REST API, all over HTTP.  To dodge all those
complicated issues about how to authenticate a REST API user when web
access is authenticated with OpenID or LDAP or x.509 or whatever, Gerrit
can generate a password for use with the REST API.  To do this, visit
the following URL:

  https://review.openstack.org/#/settings/http-password

And click "Generate Password".  Copy the resulting value into your
~/.gertty.yaml file.

Also, if you aren't into saving passwords in plain text files, you can
omit the password entry entirely from the YAML file, and Gertty should
prompt you to enter it on startup.  You'll still need to use the
randomly generated password from Gerrit as described above.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Announcing Gertty 1.0.0: A console interface to Gerrit

2014-09-08 Thread James E. Blair
I just released 1.0.1 with some bug fixes for issues found by early
adopters.  Thanks!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Kilo Cycle Goals Exercise

2014-09-08 Thread James E. Blair
Thanks for starting this, Joe.

I think that we need to address the operator and user experience by
improving the consistency and stability of OpenStack overall.  Here are
five ways of doing that:

1) Improve log correlation and utility

If we're going to improve the stability of OpenStack, we have to be
able to understand what's going on when it breaks.  That's true both
for developers trying to diagnose a failure in an integration test and
for operators who are all too often
diagnosing the same failure in a real deployment.  Consistency in
logging across projects as well as a cross-project request token would
go a long way toward this.

2) Improve API consistency

As projects are becoming more integrated (which is happening at least
partially as we move functionality _out_ of previously monolithic
projects), the API between them becomes more important.  We keep
generating APIs with different expectations that behave in very
different ways across projects.  We need to standardize on API
behavior and expectations, for the sake of developers of OpenStack who
are increasingly using them internally, but even more so for our users
who expect a single API and are bewildered when they get dozens
instead.

3) A real SDK

OpenStack is so nearly impossible to use that we have a substantial
amount of code in the infrastructure program to do things that,
frankly, we are a bit surprised that the client libraries don't do.
Just getting an instance with an IP address is an enormous challenge,
and something that took us years to get right.  We still have problems
deleting instances.  We need client libraries (an SDK if you will) and
command line clients that are easy for users to understand and work
with, and hide the gory details of how the sausage is made.

In OpenStack, we have chosen to let a thousand flowers bloom and
deployers have a wide array of implementation options available.
However, it's unreasonable to expect all of our users to understand
all of the implications of all of those choices.  Our SDK must help
users deal with that complexity.

4) Reliability

Parts of OpenStack break all the time.  In general, we accept that the
environment a cloud operates in can be unreliable (we design for
failure).  However, that should be the exception, not the norm.  Our
current failure modes and rates are hurting everyone -- developers
merging changes in the gate, operators in continual fire-fighting
mode, and users who have to handle and recover from every kind of
internal error that OpenStack externalizes.  We need to focus on
making OpenStack itself operate reliably.

5) Functional testing

We've hit the limit of what we can reasonably accomplish by putting
all of our testing efforts into cross-project integration testing.
Instead, we need to functionally test individual projects much more
strongly, so that we can reserve integration testing (which is much
more complicated) for catching real "integration" bugs rather than
expecting it to catch all functional bugs.  To that end, we should help
projects focus on robust functional testing in the Kilo cycle.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] how to provide tests environments for python things that require C extensions

2014-09-08 Thread James E. Blair
Sean Dague  writes:

> The crux of the issue is that zookeeper python modules are C extensions.
> So you have to either install from packages (which we don't do in unit
> tests) or install from pip, which means forcing zookeeper dev packages
> locally. Realistically this is the same issue we end up with for mysql
> and pg, but given their wider usage we just forced that pain on developers.
...
> Which feels like we need some decoupling on our requirements vs. tox
> targets to get there. CC to Monty and Clark as our super awesome tox
> hackers to help figure out if there is a path forward here that makes sense.

From a technical standpoint, all we need to do to make this work is to
add the zookeeper python client bindings to (test-)requirements.txt.
But as you point out, that makes it more difficult for developers who
want to run unit tests locally without having the requisite libraries
and header files installed.

We could add another requirements file with heavyweight optional
dependencies, and use that in gate testing, but also have a lightweight
tox environment that does not include them for ease of use in local
testing.

What would be really great is if we could use setuptools extras_require
for this:

https://pythonhosted.org/setuptools/setuptools.html#declaring-extras-optional-features-with-their-own-dependencies

However, I'm not sure what the situation is with support for that in pip
(and we might need pbr support too).
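
As a rough sketch of what that could look like (hypothetical project and
package names, and plain setuptools shown here rather than pbr):

  # setup.py (sketch)
  from setuptools import setup

  setup(
      name='exampleproject',
      packages=['exampleproject'],
      extras_require={
          # Heavyweight, C-extension-backed optional dependencies go here;
          # 'zkpython' stands in for whichever ZooKeeper bindings we pick.
          'zookeeper': ['zkpython'],
      },
  )

  # If pip (and pbr) cooperate, the gate tox environment could install the
  # optional group with "pip install .[zookeeper]" while the default local
  # environments leave it out.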

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Bringing back auto-abandon

2014-09-10 Thread James E. Blair
Steven Hardy  writes:

> Yeah, I don't know what the optimal solution is - my attention has recently
> been drawn to queries generated via gerrit-dash-creator, which I'm finding
> help a lot.

This is one of several great solutions to the problem.  Any query in
Gerrit can include an age specifier.  To get the old behavior, just add
"age:-2week" (that translates to "last updated less than 2 weeks ago")
to any query -- whether a dashboard or your own bookmarked query like
this one:

  
https://review.openstack.org/#/q/status:open+age:-2week+project:openstack/nova,n,z

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] adding RSS feeds to specs repositories

2014-09-10 Thread James E. Blair
Doug Hellmann  writes:

> I originally thought we would want to add these feeds to
> planet.openstack.org, but given the length of some of the specs I’m
> less sure of that. Instead, now I think it would be better to
> publicize the list of URLs for people who want to subscribe to some or
> all of them separately. After some of them land and we have a few
> feeds published, I will find a good place to do that.

You could add them to the specs.o.o index here:

  
http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/specs/index.html

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Bringing back auto-abandon

2014-09-10 Thread James E. Blair
James Polley  writes:

> On Thu, Sep 11, 2014 at 6:52 AM, James E. Blair  wrote:
>
>> Steven Hardy  writes:
>>
>> > Yeah, I don't know what the optimal solution is - my attention has
>> recently
>> > been drawn to queries generated via gerrit-dash-creator, which I'm
>> finding
>> > help a lot.
>>
>> This is one of several great solutions to the problem.  Any query in
>> Gerrit can include an age specifier.  To get the old behavior, just add
>> "age:-2week" (that translates to "last updated less than 2 weeks ago")
>> to any query -- whether a dashboard or your own bookmarked query like
>> this one:
>>
>>
>> https://review.openstack.org/#/q/status:open+age:-2week+project:openstack/nova,n,z
>
>
> If someone uploads a patch, and 15 days later it's had no comments at all,
> would it be visible in this query? My understanding is that it wouldn't, as
> it was last updated more than two weeks ago
>
> In my mind, a patch that's had no comments in two weeks should be high on
> the list of thing that need feedback. As far as I know, Gerrit doesn't have
> any way to sort by oldest-first though, so even if a two-week-old patch was
> visible in the query, it would be at the bottom of the list.

Indeed, however, a slightly different query will get you exactly what
you're looking for.  This will show changes that are at least 2 days
old, have no code reviews, are not WIP, and have passed Jenkins:

  project:openstack/nova status:open label:Verified>=1,jenkins NOT 
label:Workflow<=-1 NOT label:Code-Review<=2 age:2d

or the direct link:

  
https://review.openstack.org/#/q/project:openstack/nova+status:open+label:Verified%253E%253D1%252Cjenkins+NOT+label:Workflow%253C%253D-1+NOT+label:Code-Review%253C%253D2+age:2d,n,z

Incidentally, that is the query in the "Wayward Changes" section of the
"Review Inbox" dashboard (thanks Sean!); for nova, you can see it here:

  
https://review.openstack.org/#/projects/openstack/nova,dashboards/important-changes:review-inbox-dashboard

The key here is that there are a lot of changes in a lot of different
states, and one query isn't going to do everything that everyone wants
it to do.  Gerrit has a _very_ powerful query language that can actually
help us make sense of all the changes we have in our system without
externalizing the cost of that onto contributors in the form of
forced-abandoning of changes.  Dashboards can help us share the
knowledge of how to get the most out of it.

  https://review.openstack.org/Documentation/user-dashboards.html
  https://review.openstack.org/Documentation/user-search.html

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Set WIP for stale patches?

2014-09-17 Thread James E. Blair
"Sullivan, Jon Paul"  writes:

> I think this highlights exactly why this should be an automated
> process.  No errors in application, and no errors in interpretation of
> what has happened.
>
> So the -1 from Jenkins was a reaction to the comment created by adding
> the workflow -1.  This is going to happen on all of the patches that
> have their workflow value altered (tests will run, result would be
> whatever the result of the test was, of course).

Jenkins only runs tests in reaction to comments if they say "recheck".

> But I also agree that the Jenkins vote should not be included in the
> determination of marking a patch WIP, but a human review should (So
> Code-Review and not Verified column).
>
> And in fact, for the specific example to hand, the last Jenkins vote
> was actually a +1, so as I understand it should not have been marked
> WIP.

I'd like to help you see the reviews you want to see without
externalizing your individual workflow onto contributors.  What tool do
you use to find reviews?

If it's gerrit's webui, have you tried using the Review Inbox dashboard?
Here it is for the tripleo-image-elements project:

  
https://review.openstack.org/#/projects/openstack/tripleo-image-elements,dashboards/important-changes:review-inbox-dashboard

If you would prefer something else, we can customize those dashboards to
do whatever you want, including ignoring changes that have not been
updated in 2 weeks.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Set WIP for stale patches?

2014-09-18 Thread James E. Blair
"Sullivan, Jon Paul"  writes:

> This is not solely about finding reviews.  It is about pruning stale
> reviews.  I think the auto-abandon code was excellent at doing this,
> but alas, it is no more.

What's the purpose of pruning stale reviews?  I've read the IRC log of
the meeting you mentioned.  It's becoming apparent to me that this is
about making some numbers that reviewstats produces look good.

I am rather disappointed in that.

If reviewstats is not measuring what you want it to measure (eg, how
well you keep up with incoming reviews) then you should change how it
measures it.  If you want the number of open reviews to be low, then
change the definition of open reviews to what you think it should be --
don't create automated processes to WIP changes just to get your numbers
down.

The reason that we made it so that any core team member could WIP or
abandon a change was so that you could make a judgement call and say
"this change needs more work" or "this change is defunct".  You might
even use a tool to help you find those reviews and make those judgement
calls.  But no automated tool can make those decisions.  Luckily, it
does not need to.

If you want to organize your review queue, use a tool like
gerrit-dash-creator:

  https://github.com/stackforge/gerrit-dash-creator

If you want to change how stats are generated, patch reviewstats.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] The recent gate performance and how it affects you

2013-11-21 Thread James E. Blair
Matt Riedemann  writes:

> People get heads-down in their own projects and what they are working
> on and it's hard to keep up with what's going on in the infra channel
> (or nova channel for that matter), so sending out a recap that
> everyone can see in the mailing list is helpful to reset where things
> are at and focus possibly various isolated investigations (as we saw
> happen this week).

Further on that point, Joe and I and others have been brainstorming
about how to prevent this situation and improve things when it does
happen.  To that end, I'd like to propose we adopt some process around
gate-blocking bugs:

1) The QA team should have the ability to triage bugs in _all_ OpenStack
projects, specifically so that they may set gate-blocking bugs to
critical priority.

2) If there isn't an immediately obvious assignee for the bug, send an
email to the -dev list announcing it and asking for someone to take or
be assigned to the bug.

I think the expectation should be that the bug triage teams or PTLs
should help get someone assigned to the bug in a reasonable time (say,
24 hours, or ideally much less).

3) If things get really bad, as they have recently, we send a mail to
the list asking core devs to stop approving patches that don't address
gate-blocking bugs.

I don't think any of this is revolutionary -- we have more or less done
these things already in this situation, but we usually take a while to
get there.  I think setting expectations around this and standardizing
how we proceed will make us better able to handle it.

Separately we will be following up with information on some changes that
we hope will reduce the likelihood of nondeterministic bugs creeping in
in the first place.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Unwedging the gate

2013-11-25 Thread James E. Blair
Joe Gordon  writes:

> On Sun, Nov 24, 2013 at 10:48 PM, Robert Collins
> wrote:
>
>> On 25 November 2013 19:25, Joe Gordon  wrote:
>> >
>> >
>> >
>> > On Sun, Nov 24, 2013 at 9:58 PM, Robert Collins <
>> robe...@robertcollins.net>
>> > wrote:
>> >>
>> >> I have a proposal - I think we should mark all recheck bugs critical,
>> >> and the respective project PTLs should actively shop around amongst
>> >> their contributors to get them fixed before other work: we should
>> >> drive the known set of nondeterministic issues down to 0 and keep it
>> >> there.
>> >
>> >
>> >
>> > Yes! In fact we are already working towards that. See
>> >
>> http://lists.openstack.org/pipermail/openstack-dev/2013-November/020048.html
>>
>> Indeed I saw that thread - I think I'm proposing something slightly
>> different, or perhaps 'gate blocking' needs clearing up. Which is -
>> that once we have sufficient evidence to believe there is a
>> nondeterministic bug in trunk, whether or not the gate is obviously
>> suffering, we should consider it critical immediately. I don't think
>> we need 24h action on such bugs at that stage - gate blocking zomg
>> issues obviously do though!
>>
>
> I see what your saying. That sounds like a good idea, all gate bugs are
> critical, but only zomg gate is bad gets 24h action.

This is fundamentally the same idea -- we're talking about degrees.  And
I'm afraid that the difference in degree between a "gate bug" and a
"zomg gate bug" has more to do with the number of changes in the gate
queue than the bug itself.

So yeah, my proposal is that nondeterministic bugs that show up in the
gate should be marked critical, and the expectation is that PTLs should
help get people assigned to them.

Nondeterministic bugs that show up in the gate with no one working on
them are just waiting for a big queue or another nondetermistic bug to
come along and halt everything.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Gate broken right now

2013-11-26 Thread James E. Blair
Chmouel Boudjnah  writes:

> On Tue, Nov 26, 2013 at 9:52 AM, Flavio Percoco  wrote:
>
>> This seems to be the issue you're talking about, is it?
>>
>> http://logs.openstack.org/49/57049/2/gate/gate-swift-
>> python26/c1aedf1/console.html
>>
>
> Thanks for the heads up indeed, I was wondering if there was somebody on
> shift/awake from infra during european hours?

There is not, currently, though we would love to have more people on the
infra-core (and infra-root) teams.  If you know someone, or you
yourself, or your company are interested in contributing in this area,
please get in contact and we can help.  There is a significant time
commitment to working on infra due to the wide range of complex systems;
however, a number of companies are working with these systems internally
and should be able to spare some expertise in contributing to their
upstream development and operation without undue burden.

> This seems to be a pure infrastructure issue with the python26 gate as
> this is happen for all py26 tests.

Monty corrected the problem with the bad Jenkins slave around 14:22 UTC.

In the short term, we plan to start running unit tests on single-use
slaves, just as we do for devstack jobs, which should mean single-node
errors will auto-correct (also, it makes the system more auto-scalable
according to workload).

In the long term we're looking at using non-Jenkins test runners which
should avoid this sort of problem by being significantly less complex.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Retiring "reverify no bug"

2013-12-09 Thread James E. Blair
Hi,

On Wednesday December 11, 2013 we will remove the ability to use
"reverify no bug" to re-trigger gate runs for changes that have failed
tests.

This was previously discussed[1] on this list.  There are a few key
things to keep in mind:

* This only applies to "reverify", not "recheck".  That is, it only
  affects the gate pipeline, not the check pipeline.  You can still use
  "recheck no bug" to make sure that your patch still works.

* Core reviewers can still resubmit a change to the queue by leaving
  another "Approved" vote.  Please don't abuse this to bypass the intent
  of this change: to help identify and close gate-blocking bugs.

* You may still use "reverify bug #" to re-enqueue if there is a bug
  report for a failure, and of course you are encouraged to file a bug
  report if there is not.  Elastic-recheck is doing a great job of
  indicating which bugs might have caused a failure.

As discussed in the previous thread, the goal is to prevent new
transient bugs from landing in code by ensuring that if a change fails a
gate test that it is because of a known bug, and not because it's
actually introducing a bug, so please do your part to help in this
effort.

[1] http://lists.openstack.org/pipermail/openstack-dev/2013-November/020280.html

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Retiring "reverify no bug"

2013-12-09 Thread James E. Blair
Mark McLoughlin  writes:

> I wonder could we make it standard practice for an infra bug to get
> filed whenever there's a known issue causing gate jobs to fail so that
> everyone can use that bug number when re-triggering?
>
> (Apologies if that's already happening)
>
> I guess we'd want to broadcast that bug number with statusbot?
>
> Basically, the times I've used 'reverify no bug' is where I see some job
> failures that look like an infra issue that was already resolved.

Yes, in those cases a bug should be filed on the openstack-ci project
(either by us or by anyone encountering such a bug if we haven't gotten
to it yet).

In the past we have sometimes done that, but not always.  This will
force us to be better about it.  :)

And yes, I'd like to use statusbot for that (I'm currently working on
making it more reliable), but otherwise searching for the most recently
filed bug in openstack-ci would probably get you the right one (if there
is one) on those occasions.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Sphinx 1.2 incompatibility (failing -docs jobs)

2013-12-10 Thread James E. Blair
Hi,

Sphinx 1.2 was just released and it is incompatible with distutils in
python 2.7.  See these links for more info:

  
https://bitbucket.org/birkenfeld/sphinx/pull-request/193/builddoc-shouldnt-fail-on-unicode-paths/diff
  http://bugs.python.org/issue19570

This has caused all -docs jobs to fail.  This morning we merged a change
to openstack/requirements to cap Sphinx below version 1.2:

  https://review.openstack.org/#/c/61164/

Sergey Lukjanov, Clark Boylan, and Jeremy Stanley finished up the
automatic requirements proposal job (Thanks!), and so now updates have
been automatically proposed to all projects that subscribe:

  https://review.openstack.org/#/q/topic:openstack/requirements,n,z

Once those changes merge, -docs jobs for affected projects should start
working again.

Note that requirements updates for stable branches are proceeding
separately; you can track their progress here:

  https://review.openstack.org/#/q/I0487b4eca8f2755b882689289e3cdf429729b1fb,n,z

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Infra] Next two infra meetings canceled

2013-12-23 Thread James E. Blair
Hi,

Since they fall on the evenings of some major holidays, we're canceling
the next two Project Infrastructure meetings.  Enjoy the holidays!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Turbo-hipster

2014-01-02 Thread James E. Blair
Michael Still  writes:

> Heh, I didn't know that wiki page existed. I've added an entry to the 
> checklist.
>
> There's also some talk of adding some help text to the vote message
> turbo-hipster leaves in gerrit, but we haven't gotten around to doing
> that yet.

I would rather not mention it on that page, which is the documentation
for the project gating system and developer workflow (Zuul links to it
when it leaves a failure message) so I have removed it.

I _do_ think adding help text to the messages third-party tools leave,
and/or linking to specific documentation (ideally also in the OpenStack
wiki) from there is a good idea.

However, there are _a lot_ of third-party test systems coming on-line,
and I'm not sure that expanding the "recheck language" to support ever
more complexity is a good idea.  I can see how being able to say
"recheck foo" would be useful in some circumstances, but given that just
saying "recheck" will suffice, I'd prefer that we kept the general
recommendation simple so developers can worry about something else.

Certainly at a minimum, "recheck" should recheck all the systems; that's
one of the proposed requirements here:

  https://review.openstack.org/#/c/63478/5/doc/source/third_party.rst

I think it would be best if we stopped there.  But if you still feel
very strongly that you want a private extension to the syntax, please
consider how necessary it is for most developers to know about it when
you decide how prominently to feature it in messages or documentation
about your tools.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Turbo-hipster

2014-01-02 Thread James E. Blair
Sean Dague  writes:

> On 01/02/2014 04:29 PM, Michael Still wrote:
>> Heh, I didn't know that wiki page existed. I've added an entry to the 
>> checklist.
>> 
>> There's also some talk of adding some help text to the vote message
>> turbo-hipster leaves in gerrit, but we haven't gotten around to doing
>> that yet.
>> 
>> Cheers,
>> Michael
>
> So was there enough countable slowness earlier in the run that you could
> have predicted these runs would be slower overall?
>
> My experience looking at Tempest run data is there can be as much as an
> +60% variance from fastest and slowest nodes (same instance type) within
> the same cloud provider, which is the reason we've never tried to
> performance gate on it.
>
> However if there was some earlier benchmark that would let you realize
> that the whole run was slow, so give it more of a buffer, that would
> probably be useful.

If you are able to do this and benchmark the performance of a cloud
server reliably enough, we might be able to make progress on performance
testing, which has been long desired.  The large ops test is (somewhat
accidentally) a performance test, and predictably, it has failed when we
change cloud node provider configurations.  A benchmark could make this
test more reliable and other tests more feasible.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [elastic-recheck] Thoughts on next steps

2014-01-03 Thread James E. Blair
Sean Dague  writes:

> So my feeling is we should move away from the point graphs we have,
> and present these as weekly and daily failure rates (with graphs and
> error bars). And slice those per job. My suggestion is that we do the
> actual visualization with matplotlib because it's super easy to output
> that from pandas data sets.

I am very excited about this and everything above it!

> = Take over of /recheck =
>
> There is still a bunch of useful data coming in on "recheck bug <number>"
> data which hasn't been curated into ER queries. I think the right
> thing to do is treat these as a work queue of bugs we should be
> building patterns out of (or completely invalidating). I've got a
> preliminary gerrit bulk query piece of code that does this, which
> would remove the need of the daemon the way that's currently
> happening. The gerrit queries are a little long right now, but I think
> if we are only doing this on hourly cron, the additional load will be
> negligible.

I think this is fine and am all for reducing complexity, but consider
this alternative: over the break, I moved both components of
elastic-recheck onto a new server (status.openstack.org).  Since they
are now co-located, you could have the component of e-r that watches the
stream to provide responses to gerrit also note recheck actions.  You
could stick the data in a file, memcache, trove database, etc, and the
status page could display that "work queue".  No extra daemons required.
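
As a rough sketch of that idea (not elastic-recheck's actual code; the
account, key path, and output file are placeholders), noting recheck
comments from the event stream could look something like this:

  import json
  import os
  import re

  import paramiko

  RECHECK_RE = re.compile(r'^(recheck|reverify) bug (\d+)',
                          re.IGNORECASE | re.MULTILINE)


  def watch_rechecks(user, key='~/.ssh/id_rsa',
                     outfile='recheck-queue.jsonl'):
      client = paramiko.SSHClient()
      client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
      client.connect('review.openstack.org', port=29418, username=user,
                     key_filename=os.path.expanduser(key))
      # One JSON event per line; the same stream the daemon already reads.
      stdin, stdout, stderr = client.exec_command('gerrit stream-events')
      for line in stdout:
          event = json.loads(line)
          if event.get('type') != 'comment-added':
              continue
          match = RECHECK_RE.search(event.get('comment', ''))
          if not match:
              continue
          record = {'bug': match.group(2),
                    'project': event['change']['project'],
                    'change': event['change']['number']}
          # A flat file is the simplest "work queue"; memcache or a trove
          # database would do just as well for the status page to read.
          with open(outfile, 'a') as f:
              f.write(json.dumps(record) + '\n')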

I think the main user-visible aspect of this decision is the delay
before unprocessed bugs are made visible.  If a bug starts affecting a
number of jobs, it might be nice to see what bug numbers people are
using for rechecks without waiting for the next cron run.

On another topic, it's worth mentioning that we now (again, this is new
from over the break) have timeouts _inside_ the devstack-gate jobs that
should hit before the Jenkins timeout, so log collection for
devstack-gate jobs that run long and hit the timeout should still happen
(meaning that e-r can now see these failures).

Thanks for all your work on this.  I think it's extremely useful and
exciting!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] [devstack-gate] Nominating Sean Dague for devstack-gate-core

2014-01-03 Thread James E. Blair
Occasionally it becomes clear that a part of the project infrastructure
has its own community interested in it.  Such is the case with
devstack-gate, which is the nexus of infra and openstack.  Not only does
it interact with infrastructure systems (in a surprisingly complex way)
to prepare an integration test run, but it also is very much concerned
with how those tests are run -- which components are enabled under what
circumstances, and what configuration they use.

For some time, Sean Dague has shown that not only does he understand the
mechanics of how the integration test operates, but what the project
hopes to accomplish by running such tests.  Both the patches he has
submitted and his reviews indicate a comprehensive knowledge and desire
to help maintain and improve the system over time.

Therefore I propose that we create a devstack-gate-core group, and to it
we add the core infrastructure team and Sean.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Turbo-hipster

2014-01-03 Thread James E. Blair
Dan Prince  writes:

> - Original Message -
>> From: "Michael Still" 
>> - commenting "recheck .*"
>> - commenting "recheck migrations"
>
> With the growing interest in 3rd party testing systems would using 'recheck 
> turbo-hipster' make more sense here?
>
> I'm fine with 'recheck migrations' in addition for turbo-hipster but it would 
> make sense to align the recheck naming scheme with the title of the reviewer 
> for the 3rd party testing system.

This is the can of worms I was hoping we would not open.  Or try to get
them all back into the can and close it again is perhaps the better
metaphor.

I do not think that system-specific recheck commands will actually be
that useful or important.  As I mentioned, I understand the theoretical
usefulness of being able to say "oops, I can tell this one system messed
up, let's ask it to try again".  But given that that case is covered by
asking all systems to recheck, it seems harmless to simply ask all of
them to recheck.

I just don't think that asking developers to learn a micro-language of
commands left in gerrit comments is the best use of time.  Maybe I'm
wrong about that, but I was hoping we could try not creating it first to
see if there's really a need.  Dealing with the expected errors of a
system first coming online doesn't, in my mind, demonstrate that need.

If it turns out that asking programmers not to create a new language is
futile, yes, I'd rather we have some predictability.  Most third party
CI systems have descriptive names, so rather than saying "recheck
turbo-hipster", perhaps we should change turbo-hipster's display name to
something related to "migrations".

So _if_ we want to have system-specific recheck commands, then I think
we should ask operators to make them in the form "recheck <system name>",
and that they output a brief sentence of help text to remind people of
that in the report message in Gerrit.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [requirements] - taskflow preventing sqla 0.8 upgrade

2014-01-05 Thread James E. Blair
Joshua Harlow  writes:

> It seems simple to have variations of venvs (or something similar)
> that taskflow tox.ini can have that specify the different 0.7, 0.8,
> 0.9, when sqlalchemy 1.0 comes out then this should become a nonissue
> (hopefully). I will bug the infra folks to see what can be done here
> (hopefully this is as simple as it sounds).

It is.  See pecan for an example:

  http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/jenkins_job_builder/config/projects.yaml#n1636

And thanks to Ryan Petrello for setting that system up!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [elastic-recheck] Thoughts on next steps

2014-01-05 Thread James E. Blair
Sean Dague  writes:

>> I think the main user-visible aspect of this decision is the delay
>> before unprocessed bugs are made visible.  If a bug starts affecting a
>> number of jobs, it might be nice to see what bug numbers people are
>> using for rechecks without waiting for the next cron run.
>
> So my experience is that most rechecks happen > 1 hr after a patch
> fails. And the people that are sitting on patches for bugs that have
> never been seen before find their way to IRC.
>
> The current state of the world is not all roses and unicorns. The
> recheck daemon has died, and not been noticed that it was dead for
> *weeks*. So a guarantee that we are only 1 hr delayed would actually
> be on average better than the delays we've seen over the last six
> months of following the event stream.

I wasn't suggesting that we keep the recheck daemon, I was suggesting
moving the real-time observation of rechecks into the elastic-recheck
daemon which will remain an important component of this system for the
foreseeable future.  It is fairly reliable and if it does die, we will
desperately want to get it running again and fix the underlying problem
because it is so helpful.

> I also think that caching should probably actually happen in gerritlib
> itself. There is a concern that too many things are hitting gerrit,
> and the result is that everyone is implementing their own client side
> caching to try to be nice. (like the pickles in Russell's review stats
> programs). This seems like the wrong place to do be doing it.

That's not a bad idea, however it doesn't really address the fact that
you're looking for events -- you need to run a very large bulk query to
find all of the reviews over a certain amount of time.  You could reduce
this by caching results and then only querying reviews that are newer
than the last update.  But even so, you'll always have to query for that
window.  That's not as bad as querying for the same two weeks of data
every X minutes, but since there's already a daemon watching all of the
events anyway in real time, you already have the information if you just
don't discard it.
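
To make that concrete, here is a rough sketch -- illustrative only, not
the actual elastic-recheck code, and the account name and key path are
made up -- of what "not discarding it" could look like: keep listening
to Gerrit's event stream and remember comment events as they arrive, so
no bulk re-query is needed later:

  import json
  import paramiko

  def watch_events(handle_comment):
      client = paramiko.SSHClient()
      client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
      client.connect('review.openstack.org', port=29418,
                     username='er-daemon', key_filename='/path/to/key')
      # "gerrit stream-events" emits one JSON event per line in real time.
      stdin, stdout, stderr = client.exec_command('gerrit stream-events')
      for line in stdout:
          event = json.loads(line)
          if event.get('type') == 'comment-added':
              handle_comment(event)  # e.g. record "recheck bug N" comments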

> But, part of the reason for this email was to sort these sorts of
> issues out, so let me know if you think the caching issue is an
> architectural blocker.
>
> Because if we're generally agreed on the architecture forward and are
> just reviewing for correctness, the code can move fast, and we can
> actually have ER 1.0 by the end of the month. Architecture review in
> gerrit is where we grind to a halt.

It looks like the bulk queries take about 4 full minutes of Gerrit CPU
time to fetch data from the last two weeks (and the last two weeks have
been quiet; I'd expect the next two weeks to take longer).  I don't
think it's going to kill us, but I think there are some really easy ways
to make this way more efficient, which isn't just about being nice to
Gerrit, but is also about being responsive for users.

My first preference is still to use the real-time data that the e-r
daemon collects already and feed it to the dashboard.

If you feel like the inter-process communication needed for that will
slow you down too much, then my second preference would be to introduce
local caching of the results so that you can query for
"-age:" instead of the full two weeks every time.  (And
if it's generalized enough, sure let's add that to gerritlib.)
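
To illustrate that second option, a rough sketch -- run_query() is a
hypothetical helper that executes "gerrit query --format=JSON <query>"
and returns the parsed results, and the window sizes are arbitrary:

  import time

  cache = {}       # change number -> last known review data
  last_run = None  # timestamp of the previous refresh

  def refresh(run_query):
      global last_run
      if last_run is None:
          query = '-age:2w'                         # first run: full window
      else:
          minutes = int((time.time() - last_run) / 60) + 1
          query = '-age:%dm' % minutes              # only recently updated
      for review in run_query(query):
          cache[review['number']] = review
      last_run = time.time()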

I really think we at least ought to do one of those.  Running the same
bulk query repeatedly is, in this case, so inefficient that I think this
little bit of optimization is not premature.

Thanks again for working on this.  I really appreciate it and the time
you're spending on architecture.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Changes coming in gate structure

2014-01-22 Thread James E. Blair
Robert Collins  writes:

> On 23 January 2014 09:39, Sean Dague  wrote:
>> 
>> Changes coming in gate structure
>> 
>
>> Svelt Gate
>> ==
>>
>> The gate jobs will be trimmed down immensely. Nothing project
>> specific, so pep8 / unit tests all ripped out, no functional test
>> runs. Less overall configs. Exactly how minimal we'll figure out as we
>> decide what we can live without. The floor for this would be
>> devstack-tempest-full and grenade.
>>
>> This is basically sanity check that the combination of patches in
>> flight doesn't ruin the world for everyone.
>
> So two things occur to me here -
>  - this increases thread-the-needle risks.
>  - what value does the sanity check still offer?

Here's how I see this: we have a process that ensures that the code
remains as perfect as its tests.  We have seen that the combined system
of openstack-and-its-tests is not perfect.  The complexity of that
system is great, and it seems we will likely never remove all
non-deterministic failures from this system.  At least, we seem to be
able to introduce them as fast as they are removed.

So this is an adaptation to two aspects of our current situation:
single-project-test failures are causing gate failures, and the
non-deterministic failure rate in cross-project-tests are as well.  The
first is addressed by moving single-project tests out of the gate.  So a
single project is more likely to break itself but less likely to break
others.  This is a trade-off, but it localizes the pain in case of
error.  The second is addressed by removing some of our
cross-project-test variants from the gate.  The more variants we run,
the more likely we are to hit non-deterministic bugs.  You have argued
before that the rate is not high enough for us to prevent these bugs from
entering, but perhaps with the reduction it will be low enough that it
allows us to get work done.

It's possible both of those decisions may let in more bugs than we are
comfortable with.  However, given the current state where cross-project
frustrations are high and development has slowed dramatically,
localizing failures when possible and focusing the gate on ensuring that
there's at least a basic workable system seems worth trying.

If we take too much out of the gate, we can put it back in.  Regardless,
I think the changes to require recent check votes will help and be a
long-lasting improvement.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Proposing Sergey Lukjanov for infra-core

2014-02-10 Thread James E. Blair
Hi,

I'm very pleased to propose that we add Sergey Lukjanov to the
infra-core team.

He is among the top reviewers of projects in openstack-infra, and is
very familiar with how jenkins-job-builder and zuul are used and
configured.  He has done quite a bit of work in helping new projects
through the process and ensuring that changes to the CI system are
correct.  In addition to providing very helpful reviews he has also
contributed significant patches to our Python projects illustrating a
high degree of familiarity with the code base and project direction.
And as a bonus, we're all looking forward to once again having an
infra-core member in a non-US time zone!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Proposing Sergey Lukjanov for infra-core

2014-02-10 Thread James E. Blair
Clark Boylan  writes:

> On Mon, Feb 10, 2014 at 9:48 AM, James E. Blair  wrote:
>> Hi,
>>
>> I'm very pleased to propose that we add Sergey Lukjanov to the
>> infra-core team.
>>
>> He is among the top reviewers of projects in openstack-infra, and is
>> very familiar with how jenkins-job-builder and zuul are used and
>> configured.  He has done quite a bit of work in helping new projects
>> through the process and ensuring that changes to the CI system are
>> correct.  In addition to providing very helpful reviews he has also
>> contributed significant patches to our Python projects illustrating a
>> high degree of familiarity with the code base and project direction.
>> And as a bonus, we're all looking forward to once again having an
>> infra-core member in a non-US time zone!
>>
>> -Jim
>>
>> ___
>> OpenStack-dev mailing list
>> OpenStack-dev@lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
> +1 here. Sergey has done a great job of keeping up with reviews
> lately, which has been quite helpful.

I seem to be Monty's IRC-to-email gateway today.  He adds a very
emphatic +1, which makes it unanimous.

Congratulations, Sergey, and thanks for all the help!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack-Infra] [TripleO] promoting devtest_seed and devtest_undercloud to voting, + experimental queue for nova/neutron etc.

2014-02-14 Thread James E. Blair
Sean Dague  writes:

> On 02/14/2014 03:43 PM, Robert Collins wrote:
>> Thanks to a massive push this week, both the seed *and* undercloud
>> jobs are now passing on tripleo-gate nodes, but they are not yet
>> voting.
>> 
>> I'd kind of like to get them voting on tripleo jobs (check only). We
>> don't have 2 clouds yet, so if the tripleo ci-cloud suffers a failure,
>> we'd have -1's everywhere. I think this would be an ok tradeoff (its
>> check after all), but I'd like -infra admin folks opinion on this -
>> would it cause operational headaches for you, over and above the
>> current risks w/ the tripleo-ci cloud?

You won't end up with -1's everywhere, you'll end up with jobs stuck in
the queue indefinitely, as we saw when the tripleo cloud failed
recently.  What's worse is that now that positive check results are
required for enqueuing into the gate, you will also not be able to merge
anything.

>> OTOH - we actually got passing ops with a fully deployed virtual cloud
>> - which is awesome.

Great! :)

>> Now we need to push through to having the overcloud deploy tests pass,
>> then the other scenarios we depend on - upgrades w/rebuild, and we'll
>> be in good shape to start optimising (pre-heated clouds, local distro
>> mirrors etc) and broadening (other distros ...).
>> 
>> Lastly, I'm going to propose a merge to infra/config to put our
>> undercloud story (which exercises the seed's ability to deploy via
>> heat with bare metal) as a check experimental job on our dependencies
>> (keystone, glance, nova, neutron) - if thats ok with those projects?
>>
>> -Rob
>
> My biggest concern with adding this to check experimental, is the
> experimental results aren't published back until all the experimental
> jobs are done.
>
> We've seen really substantial delays, plus a 5 day complete outage a
> week ago, on the tripleo cloud. I'd like to see that much more proven
> before it starts to impact core projects, even in experimental.

Until the tripleo cloud is multi-region, HA, and has a proven track
record of reliability, we can't have jobs that run on its nodes in any
pipeline for any non-tripleo project, for those reasons.  I do look
forward to when that is the case.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] reverify/recheck

2014-02-18 Thread James E. Blair
Sean Dague  writes:

> We're still working through kinks in the new system, which is why it's
> not fully documented yet.

We did not intend to change the general operation of 'recheck' and
'reverify', however, we did have some bugs in the early stages where we
missed a possible state-change.  I believe they have been worked out now
and at this point you should be able to leave 'reverify bug #' on a
change that has failed the gate and it will have its check jobs re-run
and then automatically re-enqueued in the gate pipeline if it gets a +1.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Proposal to move from Freenode to OFTC

2014-03-03 Thread James E. Blair
Hi,

Freenode has been having a rough time lately due to a series of DDoS
attacks which have been increasingly disruptive to collaboration.
Fortunately there's an alternative.

OFTC <http://www.oftc.net/> is a robust and established alternative
to Freenode.  It is a smaller network whose mission statement makes it a
less attractive target.  It's significantly more stable than Freenode
and has friendly and responsive operators.  The infrastructure team has
been exploring this area and we think OpenStack should move to using
OFTC.

This would obviously be a big change, but we think that with the
following process, we can move fairly smoothly:

0) Establish channel and bot registrations on OFTC.  This has already
been done (for all the channels listed on the wiki and all the bots
managed via the infrastructure program).  That actually puts us ahead of
Freenode where we still haven't managed to register all the channels we
use.

1) Create an irc.openstack.org CNAME record that points to
chat.freenode.net.  Update instructions to suggest users configure their
clients to use that alias.

2) Set a date and time for a cutover, at least several weeks out.  Make
multiple announcements about the move on mailing lists, blogs, etc.

3) Set channel topics in OFTC to remind people that the move has not yet
occurred.

Nearer to the cutover date:

4) Ask a few people (perhaps some core members of each team) to join
OFTC a few days early to be there to assist anyone who shows up there
and is confused.

5) On the cutover, change the CNAME and links to web clients in the
wiki.  The infrastructure team will switch IRC bots (including channel
logging) at the cutover time as well.  Send reminder announcements.

6) Ask those same people from #4 to stick around Freenode for a few
weeks after the cutover to assist anyone who shows up there and is
confused.

7) Set channel topics in Freenode to remind people that we have moved to
OFTC.

If there aren't objections to this plan, I think we can propose a motion
to the TC with a date and move forward with it fairly soon.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Proposal to move from Freenode to OFTC

2014-03-03 Thread James E. Blair
Sean Dague  writes:

[I don't have anything substantial to add to q1 right now, though it's a
good one.]

> #2) how bad do we believe nick contention might be in this transition?
> We've got 1000+ people that have well known nicks on freenode, that
> might hit conflicts on oftc.

Now might be a good time to go register your nick (and your casual
Friday nick as well) and see.  OFTC has a similar policy to Freenode
around releasing nicks that have not been used in years, and the
operators in #oftc have been responsive to such requests.  That way, if
this is a problem, we might find out early.

> #3) while for IRC veterans this is a simple matter of changing a config
> in your IRC proxy, we have been training new folks for a long time (and
> all through our wiki and documentation) that Freenode is our place. That
> might have required some of these folks to get firewall rules for
> freenode created. What kind of timeline are you thinking about for the
> cutover to hopefully catch all these folks?

Good point (though with Freenode's constant server rotation, I wonder
how many folks actually have IP-based firewall rules [as opposed to just
a port rule which should work for OFTC as well]).

I was thinking about a month, for starters.  That gives us plenty of
time for notice, and early April puts us in the RC phase but not too
close to the release.  If we decide to proceed but feel this isn't
enough time for this or other issues, we should probably push it to
early May (the "off week" after the release).

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Gerrit downtime and upgrade on 2014-04-28

2014-04-25 Thread James E. Blair
Hi,

This is the third and final reminder that next week Gerrit will be
unavailable for a few hours starting at 1600 UTC on April 28th.

You may read about the changes that will impact you as a developer
(please note that the SSH host key change is particularly important) at
this location:

  https://wiki.openstack.org/wiki/GerritUpgrade

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Gerrit downtime and upgrade on 2014-04-28

2014-04-25 Thread James E. Blair
Jay Faulkner  writes:

> Can you guys publish the ssh host key we should expect from the new
> gerrit server?

Certainly!  As the wiki page[1] notes, you can see the current ssh host
key fingerprints at:

  https://review.openstack.org/#/settings/ssh-keys

Of course, right now, that's for the current key.  After the upgrade
when you visit that page it will display the values for the new key.

It might seem odd to verify the fingerprints for the server you are
connecting to by visiting a web page on the same server; however, since
it is over HTTPS, some additional confidence is provided by the trust in
the CA system.

Of course, for some of us, that's not a lot.  So on Monday, we'll send a
GPG signed email with the fingerprints as well.  And this is just
another reminder that as a community, we should endeavor to build our
GPG web of trust.  See you at the Summit!
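
And for anyone who would rather check the key independently of both the
web page and the email, here is a rough sketch of computing the
fingerprint yourself -- assuming the classic MD5-style colon-separated
format that Gerrit displays, and the usual review.openstack.org port:

  import base64
  import hashlib
  import subprocess

  def host_key_fingerprint(host='review.openstack.org', port=29418):
      out = subprocess.check_output(
          ['ssh-keyscan', '-p', str(port), '-t', 'rsa', host])
      for line in out.decode().splitlines():
          if line.startswith('#') or not line.strip():
              continue
          _, keytype, blob = line.split()[:3]
          digest = hashlib.md5(base64.b64decode(blob)).hexdigest()
          return keytype, ':'.join(digest[i:i + 2] for i in range(0, 32, 2))

  print(host_key_fingerprint())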

-Jim

[1] https://wiki.openstack.org/wiki/GerritUpgrade

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack-Infra] [infra] elements vs. openstack-infra puppet for CI "infra" nodes

2014-05-09 Thread James E. Blair
"Elizabeth K. Joseph"  writes:

> On the Debian side, I also have a bug (with some mirror discussion and
> an attached review) here:
>
> https://bugs.launchpad.net/openstack-ci/+bug/1311855
>
> After discussing this particular patch+bug with the rest of the -infra
> team, there wasn't a ton of interest in running an infra-based mirror
> due to the package index out of sync issue in unofficial mirrors,
> which would be a problem for us.
>
> I had hoped we could sit down and chat about this at the summit for
> both Fedora and Debian mirrors, but unfortunately I won't be able to
> attend (been very sick this week, doctor didn't approve getting on a
> plane on Sunday). So I'm hoping some other infra folks can sync up
> with Dan and the TripleO crew to chat about how we can best get these
> changes in so they'll work effectively for everyone. Also happy to
> continue this discussion here on list or resume at a meeting after
> summit.

Thanks Dan and Liz.  Let's do try to sync up on this at the summit.  I
think this is important and there are good arguments for both
approaches.  I don't think it's an easy question to answer so let's
get together and consider the options.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] *urgent* Jenkins keeps verifying non-stop

2014-05-19 Thread James E. Blair
Jerry Xinyu Zhao  writes:

> Better send to infra's list too.
>
>
>
> On Mon, May 19, 2014 at 10:06 AM, Yi Sun  wrote:
>
>> More info, I add a follow up comment on an old change set, and then it
>> happened.
>> Yi
>>
>>
>> On Mon, May 19, 2014 at 2:49 AM, Carlos Gonçalves wrote:
>>
>>> I was able to break the loop by uploading a new patchset to Gerrit.
>>> Infra team, could you please clean the mess caused by Jenkins on Gerrit,
>>> please?
>>>
>>> Thanks,
>>> Carlos Goncalves
>>>
>>> On 19 May 2014, at 07:54, Carlos Gonçalves  wrote:
>>>
>>> Hi,
>>>
>>> Could someone from the infra team check what's happening to Jenkins here
>>> https://review.openstack.org/#/c/92477/? It keeps re-verifying the
>>> change over and over for no apparent reason.
>>>
>>> Thanks,
>>> Carlos Goncalves

Thanks.  There was a problem with a behavioral change in the new Gerrit
and our clean-check implementation in Zuul.  This change and its
dependency have merged so we don't expect this to happen again:

  https://review.openstack.org/#/c/94243/

I manually removed the comments from that change.  Sorry for the
inconvenience.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack-Infra] [infra] Meeting Tuesday May 20th at 19:00 UTC

2014-05-20 Thread James E. Blair
"Elizabeth K. Joseph"  writes:

> On Mon, May 19, 2014 at 9:40 AM, Elizabeth K. Joseph
>  wrote:
>> Hi everyone,
>>
>> The OpenStack Infrastructure (Infra) team is hosting our weekly
>> meeting on Tuesday May 20th, at 19:00 UTC in #openstack-meeting
>
> Great post-summit meeting, thanks to everyone who joined us.

Yes, great to see new people!

I'm sorry the open discussion period was short today.  It isn't always
like that, but sometimes is.

If you do have something to discuss that you want to make sure to get on
the agenda, feel free to add items by editing the wiki page here:
https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Gerrit downtime on May 23 for project renames

2014-05-21 Thread James E. Blair
Hi,

On Friday, May 23 at 21:00 UTC Gerrit will be unavailable for about 20
minutes while we rename some projects.  Existing reviews, project
watches, etc, should all be carried over.  The current list of projects
that we will rename is:

stackforge/barbican -> openstack/barbican
openstack/oslo.test -> openstack/oslotest
openstack-dev/openstack-qa -> openstack-attic/openstack-qa
openstack/melange -> openstack-attic/melange
openstack/python-melangeclient -> openstack-attic/python-melangeclient
openstack/openstack-chef -> openstack-attic/openstack-chef
stackforge/climate -> stackforge/blazar
stackforge/climate-nova -> stackforge/blazar-nova
stackforge/python-climateclient -> stackforge/python-blazarclient
openstack/database-api -> openstack-attic/database-api
openstack/glance-specs -> openstack/image-specs
openstack/neutron-specs -> openstack/networking-specs
openstack/oslo-specs -> openstack/common-libraries-specs

Though that list is subject to change.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Nominating Nikita Konovalov for storyboard-core

2014-05-21 Thread James E. Blair
Nikita Konovalov has been reviewing changes to both storyboard and
storyboard-webclient for some time.  He is the second most active
storyboard reviewer and is very familiar with the codebase (having
written a significant amount of the server code).  He regularly provides
good feedback, understands where the project is heading, and in general
is in accord with the current core team, which has been treating his +1s
as +2s for a while now.

Please respond with +1s or concerns, and if the consensus is in favor, I
will add him to the group.

Nikita, thank you very much for your work!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Nominating Sergey Lukjanov for infra-root

2014-05-21 Thread James E. Blair
The Infrastructure program has a unique three-tier team structure:
contributors (that's all of us!), core members (people with +2 ability
on infra projects in Gerrit) and root members (people with
administrative access).  Read all about it here:

  http://ci.openstack.org/project.html#team

Sergey has been an extremely valuable member of infra-core for some time
now, providing reviews on a wide range of infrastructure projects which
indicate a growing familiarity with the large number of complex systems
that make up the project infrastructure.  In particular, Sergey has
expertise in systems related to the configuration of Jenkins jobs, Zuul,
and Nodepool which is invaluable in diagnosing and fixing operational
problems as part of infra-root.

Please respond with any comments or concerns.

Thanks again Sergey for all your work!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Gerrit downtime on May 23 for project renames

2014-05-21 Thread James E. Blair
Tom Fifield  writes:

> May I ask, will the old names have some kind of redirect to the new names?

Of course you may ask!  And it's a great question!  But sadly the answer
is "no".  Unfortunately, Gerrit's support for renaming projects is not
very good (which is why we need to take downtime to do it).

I'm personally quite fond of stable URLs.  However, these started as an
"experiment" so we were bound to get some things wrong (and will
probably continue to do so) and it's better to try to fix them early.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Nominating Joshua Hesketh for infra-core

2014-05-21 Thread James E. Blair
The Infrastructure program has a unique three-tier team structure:
contributors (that's all of us!), core members (people with +2 ability
on infra projects in Gerrit) and root members (people with
administrative access).  Read all about it here:

  http://ci.openstack.org/project.html#team

Joshua Hesketh has been reviewing a truly impressive number of infra
patches for quite some time now.  He has an excellent grasp of how the
CI system functions, no doubt in part because he runs a copy of it and
has been doing significant work on evolving it to continue to scale.
His reviews of python projects are excellent and particularly useful,
but he also has a grasp of how the whole system fits together, which is
a key thing for a member of infra-core.

Please respond with any comments or concerns.

Thanks, Joshua, for all your work!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Gerrit downtime on May 23 for project renames

2014-05-22 Thread James E. Blair
Thierry Carrez  writes:

> James E. Blair wrote:
>> openstack/oslo-specs -> openstack/common-libraries-specs
>
> I understand (and agree with) the idea that -specs repositories should
> be per-program.
>
> That said, you could argue that "oslo" is a shorthand for "common
> libraries" and is the code name for the *program* (rather than bound to
> any specific project). Same way "infra" is shorthand for
> "infrastructure". So I'm not 100% convinced this one is necessary...

"data-processing-specs" has been pointed out as a similarly awkward
name.  According to the programs.yaml file, each program does have a
codename, and the compute program's codename is 'nova'.  I suppose we
could have said the repos are per-program though using the program's
codename.  But that doesn't actually help someone who wants to write a
swift-bench spec know that it should go in the swift-specs repo.

I'm happy to drop oslo from the rename list if Doug wants to mull this
over a bit more.  The only thing I hate more than renaming repos is
renaming repos twice.  I'm hoping we can have some kind of consistency,
though.  People are in quite a hurry to have these created (we made 5
more for official openstack programs yesterday, plus a handful for
stackforge).

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Storyboard] [UX] Atlanta Storyboard UX Summary

2014-05-22 Thread James E. Blair
Sean Dague  writes:

> It's worth noting, most (>90%) of OpenStack developers aren't trying to
> land or track features across projects. And realistically, in my
> experience working code into different repositories, the blueprint / bug
> culture between projects varies widely (what requires an artifact, how
> big that artifact is, etc).

Fortunately, storyboard doesn't make that simple case any harder.  A
story with a single task looks just like a simple bug in any other
system (including a bug in LP that only "affects" one project).  There
are probably some things we can change in the UI that make that easier
for new users.

It's worth noting that at this point, all blueprint-style stories are
likely to affect at least two projects (nova, nova-specs), possibly many
more (novaclient, tempest, *-manual, ...).  So being able to support
this kind of work is key.  But yeah, we should make the simple case of
"report a bug in nova" easy.  And I think we will.  But storyboard is
barely self-hosting at this point and there's still quite a bit to flesh
out.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Spec repo names

2014-05-23 Thread James E. Blair
Hi,

With apologies to the specs repos we just created, the more I think
about this, the more I think that the right answer is that we should
stick with codenames for the spec repos.  The codenames are actually
more discoverable for potential contributors and collaborators.  If
you're looking for the place to submit a spec for swift-bench, you're
much more likely to find the 'swift-specs' repo than 'object-specs'.
And while some of our older programs have nice catchy names, the newer
ones can be a mouthful.  Here's a list of likely names based on the
program name:

Program Names
-
compute-specs
object-specs
image-specs
identity-specs
dashboard-specs
networking-specs
volume-specs
telemetry-specs
orchestration-specs
database-specs
baremetal-specs
common-libraries-specs
infra-specs
docs-specs
qa-specs
deployment-specs
devstack-specs
release-management-specs
queue-specs
data-processing-specs
key-management-specs

Note that "database-specs" is potentially quite confusing.

And here's a list based on the program's codename:

Codenames
-
nova-specs
swift-specs
glance-specs
keystone-specs
horizon-specs
neutron-specs
cinder-specs
ceilometer-specs
heat-specs
trove-specs
ironic-specs
oslo-specs
infra-specs
docs-specs
qa-specs
tripleo-specs
devstack-specs
release-management-specs
marconi-specs
sahara-specs
barbican-specs

When I look at the two of those, I have to admit that it's the second
one I find more intuitive and I'm pretty sure I'll end up calling it
'sahara-specs' in common usage no matter the name.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Gerrit downtime on May 23 for project renames

2014-05-23 Thread James E. Blair
jebl...@openstack.org (James E. Blair) writes:

> Hi,
>
> On Friday, May 23 at 21:00 UTC Gerrit will be unavailable for about 20
> minutes while we rename some projects.  Existing reviews, project
> watches, etc, should all be carried over.

This is complete.  The actual list of renamed projects is:

stackforge/barbican -> openstack/barbican
openstack/oslo.test -> openstack/oslotest
openstack-dev/openstack-qa -> openstack-attic/openstack-qa
openstack/melange -> openstack-attic/melange
openstack/python-melangeclient -> openstack-attic/python-melangeclient
openstack/openstack-chef -> openstack-attic/openstack-chef
openstack/database-api -> openstack-attic/database-api
stackforge/climate -> stackforge/blazar
stackforge/climate-nova -> stackforge/blazar-nova
stackforge/python-climateclient -> stackforge/python-blazarclient
stackforge/murano-api -> stackforge/murano
openstack/baremetal-specs -> openstack/ironic-specs
openstack/object-specs -> openstack/swift-specs
openstack/orchestration-specs -> openstack/heat-specs
openstack/identity-specs -> openstack/keystone-specs
openstack/telemetry-specs -> openstack/ceilometer-specs

The consensus about specs repo names seemed to be converging on
'codename' significantly enough that we felt it would be best to drop
the previous plan to rename to program name, and instead rename the
handful of brand-new repos that had been created to their codenames.
Giving that kind of notice isn't our favorite thing to do, to put it
mildly, but I think this is for the best and we can get back to actually
writing specs now.  And maybe code.

Thanks to Sergey, Monty, Clark and Jeremy all of whom did a great deal
of work on this.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Election Stats and Review Discussion

2014-05-23 Thread James E. Blair
Joshua Harlow  writes:

> Is there any kind of central location where we can look at what a TC
> candidate has done before (what their proposals we, what they voted on, or
> any other similar kind of information; in a way this is similar to having
> visibility into what a US congressman/senator does, which is *some of the*
> information people use to determine who they should vote for in the US)?

Yes!

In fact we recently (well, recently is relative) started doing all of
the actual voting in Gerrit.  Naturally.

So you can see all of the proposals that have been, well, proposed,
approved, rejected, etc., since then through Gerrit, as well as the
votes of the TC members[1].

  https://review.openstack.org/#/q/project:openstack/governance,n,z

We're also working on getting the approved bits published to a web site.

The meetings are still quite important for the same reason that project
meetings are -- it's very helpful for us to get on the same page about
issues.

-Jim
[1] Note that the votes of previous TC members do not appear in the grid
at the top of the page, however, they are visible in the comments and in
the Gerrit API.

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] Hide CI comments in Gerrit

2014-05-27 Thread James E. Blair
Radoslav Gerganov  writes:

> Hi Monty,
>
>> As a next step - why not take the Javascript you've got there and submit
>> it as a patch to the file above? We can probably figure out a way to
>> template the third party CI names ... but starting one step at a time is
>> a great idea.
>> 
>
> Thanks for the pointer, I have submitted 
> https://review.openstack.org/#/c/95743
>
> It is still a work in progress as I need to figure out how to augment
> only review pages, not the entire Gerrit interface. What would be a
> good way to test this kind of changes? I have been told there is
> Gerrit dev box out there. I guess I can also test it with an http
> proxy that rewrites the pages from the upstream Gerrit but I am
> looking for something easier :)
>
> Having template for CI names would be very helpful indeed.

Very cool!

When that's ready, we can manually apply it to review-dev only to make
sure it works before we put it into production.

We're definitely interested in templating the CI names, but we haven't
actually asked anyone to start doing that yet (we're getting some other
changes ready at the same time so that the third-party CI folks can make
all the requested changes at once).  Realistically, it's going to take a
little while for all of that to get into place.  We could just list the
names for now and then change to a regex later, or I wonder if it would
be possible to detect them based on the presence of a "Verified" vote?

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Nova] [Ironic] [Infra] Making Ironic vote as a third-party Nova driver

2014-05-28 Thread James E. Blair
Devananda van der Veen  writes:

> Hi all!
>
> This is a follow-up to several summit discussions on
> how-do-we-deprecate-baremetal, a summary of the plan forward, a call to
> raise awareness of the project's status, and hopefully gain some interest
> from folks on nova-core to help with spec and code reviews.
>
> The nova.virt.ironic driver lives in Ironic's git tree today [1]. We're
> cleaning it up and submitting it to Nova again this cycle. I've posted
> specs [2] outlining the design and planned upgrade process. Earlier today,
> we enabled voting in Ironic's check and gate queues for the
> tempest-dsvm-virtual-ironic job. This runs a tempest scenario test [3]
> against devstack, exercising Nova with the Ironic driver to PXE boot a
> virtual machine. It has been running for a few months on Ironic, and has
> been stable for more than a month. However, because Ironic is not
> integrated, we also can't vote in check/gate queues on integrated projects
> (like Nova). We can - and do - report the test result in a non-voting way,
> though that's easy to miss, since it looks like every other non-voting test.
>
> At the summit [4], it was suggested that we make this job report as though
> it were a third-party CI test for a Nova driver. This would be removed at
> the time that Ironic graduates and the job is allowed to vote in the gate.
> Until that time, I'm happy to have the nova.virt.ironic driver reporting as
> a third-party driver (even though it's not) simply to help raise awareness
> (third-party CI jobs are watched more closely than non-voting jobs) and
> decrease the likelihood that Nova developers will inadvertently break
> Ironic's gate.
>
> Given that there's a concrete plan forward, why am I sending this email to
> all three teams? A few reasons:
> - document the plan that we discussed
> - many people from infra and nova were not present during the discussion
> and may not be aware of the details
> - I may have gotten something wrong (it was a long week)
> - and mostly because I don't technically know how to make an upstream job
> report as though it's a third-party job, and am hoping someone wants to
> volunteer to help figure that out

I think it's a reasonable plan.  To elaborate a bit, I think we
identified three categories of jobs that we run:

a) jobs that are voting
b) jobs that are non-voting because they are advisory
c) jobs that are non-voting for policy reasons but we feel fairly
   strongly about

There's a pretty subtle distinction between b and c.  Ideally, there
shouldn't be any.  We've tried to minimize the number of non-voting jobs
to make sure that people don't ignore them.  Nonetheless, it seems that
a large enough number of people still do that non-voting jobs are
considered ineffective in Nova.  I think it's worth noting the potential
danger of de-emphasizing the actual results.  It may make other
non-voting jobs even less effective than they already are.

The intent is to make the jobs described by (c) into voting jobs, but in
a way that they can still be overridden if need be.  The aim is to help
new (eg, incubated) projects join the integrated gate in a way that lets
them prove they are sufficiently mature to do so without impacting the
currently integrated projects.  I believe we're currently thinking that
point is after their integration approval.  If we are comfortable with
incubated projects being able to block the integrated gate earlier, we
could simply make the non-voting jobs voting instead.

Back to the proposal at hand.  I think we should call the kinds of jobs
described in (c) as "non-binding".

The best way to do that is to register a second user with Gerrit for
Zuul to use, and have it report non-binding jobs with a +/- 1 vote in
the check queue that is separate from the normal "Jenkins" vote.  In
order to do that, we will have to modify Zuul to be able to support a
second user, and associate that user with a pipeline.  Then configure a
new "non-binding" pipeline to use that user and run the desired jobs.

Note that a similar problem of curation may occur with the non-binding
jobs.  If we run jobs for the incubated projects Foo and Bar, they will
share a vote in Gerrit, and Nova developers will have to examine the
results of -1 votes; if Bar consistently fails tests, it may need to be
made non-voting or removed to avoid obscuring Foo's results.

I expect the Zuul modification to take an experienced Zuul developer
about 2-3 days to write, or an inexperienced one about a week.  If no
one else has started it by then, I will probably have some time around
the middle of the cycle to hack on it.  In the mean time, we may want to
make sure that the number of non-voting jobs is at a minimum (and
further reduce them if possible), and emphasize to reviewers the
importance of checking posted results.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Designate Incubation Request

2014-05-28 Thread James E. Blair
Sean Dague  writes:

> I would agree this doesn't make sense in Neutron.
>
> I do wonder if it makes sense in the Network program. I'm getting
> suspicious of the programs for projects model if every new project
> incubating in seems to need a new program. Which isn't really a
> reflection on designate, but possibly on our program structure.

One of the reasons we created programs was so that we wouldn't have to
feel compelled to constrain our growth because of how it relates to our
bureaucracy (specifically "core").  So I don't think we should limit the
number of programs for that reason.

To the task at hand -- Designate has its own group of people working on
it and moreover is in an entirely different problem space than Neutron.
Its PTL needs to be familiar with people and technology that are vastly
different.  I think DNSaaS makes sense as a new program, and I'm
personally delighted to see the incubation request.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Nominating Nikita Konovalov for storyboard-core

2014-05-29 Thread James E. Blair
jebl...@openstack.org (James E. Blair) writes:

> Nikita Konovalov has been reviewing changes to both storyboard and
> storyboard-webclient for some time.  He is the second most active
> storyboard reviewer and is very familiar with the codebase (having
> written a significant amount of the server code).  He regularly provides
> good feedback, understands where the project is heading, and in general
> is in accord with the current core team, which has been treating his +1s
> as +2s for a while now.
>
> Please respond with +1s or concerns, and if the consensus is in favor, I
> will add him to the group.
>
> Nikita, thank you very much for your work!

Nikita is now in storyboard-core.  Congratulations!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Nominating Joshua Hesketh for infra-core

2014-05-29 Thread James E. Blair
jebl...@openstack.org (James E. Blair) writes:

> The Infrastructure program has a unique three-tier team structure:
> contributors (that's all of us!), core members (people with +2 ability
> on infra projects in Gerrit) and root members (people with
> administrative access).  Read all about it here:
>
>   http://ci.openstack.org/project.html#team
>
> Joshua Hesketh has been reviewing a truly impressive number of infra
> patches for quite some time now.  He has an excellent grasp of how the
> CI system functions, no doubt in part because he runs a copy of it and
> has been doing significant work on evolving it to continue to scale.
> His reviews of python projects are excellent and particularly useful,
> but he also has a grasp of how the whole system fits together, which is
> a key thing for a member of infra-core.
>
> Please respond with any comments or concerns.
>
> Thanks, Joshua, for all your work!

Joshua is now in infra-core.  Congratulations!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra] Nominating Sergey Lukjanov for infra-root

2014-05-29 Thread James E. Blair
jebl...@openstack.org (James E. Blair) writes:

> The Infrastructure program has a unique three-tier team structure:
> contributors (that's all of us!), core members (people with +2 ability
> on infra projects in Gerrit) and root members (people with
> administrative access).  Read all about it here:
>
>   http://ci.openstack.org/project.html#team
>
> Sergey has been an extremely valuable member of infra-core for some time
> now, providing reviews on a wide range of infrastructure projects which
> indicate a growing familiarity with the large number of complex systems
> that make up the project infrastructure.  In particular, Sergey has
> expertise in systems related to the configuration of Jenkins jobs, Zuul,
> and Nodepool which is invaluable in diagnosing and fixing operational
> problems as part of infra-root.
>
> Please respond with any comments or concerns.
>
> Thanks again Sergey for all your work!

Sergey is now in infra-root.  Congratulations!

-Jim
(And Jeremy is no longer the "new guy"!)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] An alternative approach to enforcing "expected election behaviour"

2014-06-17 Thread James E. Blair
Eoghan Glynn  writes:

> TL;DR: how about we adopt a "soft enforcement" model, relying 
>on sound judgement and good faith within the community?

Thank you very much for bringing this up and proposing it to the TC.  As
others have suggested, having a concrete alternative is very helpful in
revealing both the positive and negative aspects of a proposal.

I think our recent experience has shown that the fundamental problem is
that not all of the members of our community knew what kind of behavior
we expected around elections.  That's understandable -- we had hardly
articulated it.  I think the best solution to that is therefore to
articulate and communicate that.

I believe Anita's proposal starts off by doing a very good job of
exactly that, so I would like to see a final resolution based on that
approach with very similar text to what she has proposed.  That
statement of expected behavior should then be communicated by election
officials to all participants in announcements related to all elections.
Those two simple acts will, I believe, suffice to address the problem we
have seen.

I do agree that a heavy bureaucracy is not necessary for this.  Our
community has a Code of Conduct established and administered by the
Foundation.  I think we should focus on minimizing additional process
and instead try to make this effort slot into the existing framework as
easily as possible by expecting the election officials to forward
potential violations to the Foundation's Executive Director (or
delegate) to handle as they would any other potential CoC violation.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Proposal to move from Freenode to OFTC

2014-03-06 Thread James E. Blair
Sean Dague  writes:

> #1) do we believe OFTC is fundamentally better equipped to resist a
> DDOS, or do we just believe they are a smaller target? The ongoing DDOS
> on meetup.com the past 2 weeks is a good indicator that being a smaller
> fish only helps for so long.

After speaking with a Freenode and OFTC staffer, I am informed that OFTC
is generally and currently not the target of DDoS attacks, likely due to
their smaller profile.  If they were subject to such attacks, they would
likely be less prepared to deal with them than Freenode, however, in
that event, they would expect to extend their capabilities to deal with
it, partially borrowing on experience from Freenode.  And finally,
Freenode is attempting to work with sponsors and networks that can help
mitigate the ongoing DDoS attacks.

I agree that this is not a decision to be taken lightly.  I believe that
we can effect the move successfully if we plan it well and execute it
over an appropriate amount of time.  My own primary concern is actually
the loss of network effect.  If you're only on one network, Freenode is
probably the place to be since so many other projects are there.
Nevertheless, I think our project is substantial enough that we can move
with little attrition.

The fact is though that Freenode has had significant service degradation
due to DDoS attacks for quite some time -- the infra team notices this
every time we have to chase down which side of a netsplit our bots ended
up on and try to bring them back.  We also had an entire day recently
(it was a Saturday) where we could not use Freenode at all.

There isn't much we can do about DDoS attacks on Freenode.  If we stay,
we're going to continue to deal with the occasional outage and spend a
significant amount of time chasing bots.  It's clear that Freenode is
better able to deal with attacks than OFTC would be.  However, OFTC
doesn't have to deal with them because they aren't happening; and that's
worth considering.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Spec repos for blueprint development and review

2014-03-24 Thread James E. Blair
Hi,

So recently we started this experiment with the compute and qa programs
to try using Gerrit to review blueprints.  Launchpad is deficient in
this area, and while we hope Storyboard will deal with it much better,
it's not ready yet.

As a development organization, OpenStack scales by adopting common tools
and processes, and true to form, we now have a lot of other projects
that would like to join the "experiment".  At some point that stops
being an experiment and becomes practice.

However, at this very early point, we haven't settled on answers to some
really basic questions about how this process should work.  Before we
extend it to more projects, I think we need to establish a modicum of
commonality that helps us integrate it with our tooling at scale, and
just as importantly, helps new contributors and people who are working
on multiple projects have a better experience.

I'd like to hold off on creating any new specs repos until we have at
least the following questions answered:

a) Should the specs repos be sphinx documents?
b) Should the follow the Project Testing Interface[1]?
c) Some basic agreement on what information is encoded?
   eg: don't encode implementation status (it should be in launchpad)
   do encode branches (as directories? as ...?)
d) Workflow process -- what are the steps to create a new spec and make
   sure it also exists and is tracked correctly in launchpad?

-Jim

[1] https://wiki.openstack.org/wiki/ProjectTestingInterface

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Spec repos for blueprint development and review

2014-03-24 Thread James E. Blair
Russell Bryant  writes:

> On 03/24/2014 12:34 PM, James E. Blair wrote:
>> Hi,
>> 
>> So recently we started this experiment with the compute and qa programs
>> to try using Gerrit to review blueprints.  Launchpad is deficient in
>> this area, and while we hope Storyboard will deal with it much better,
>> it's not ready yet.
>
> This seems to be a point of confusion.  My view is that Storyboard isn't
> intended to implement what gerrit provides.  Given that, it seems like
> we'd still be using this whether the tracker is launchpad or storyboard.

I don't think it's intended to implement what Gerrit provides, however,
I'm not sure what Gerrit provides is _exactly_ what's needed here.  I do
agree that Gerrit is a much better tool than launchpad for collaborating
on some kinds of blueprints.

However, one of the reasons we're creating StoryBoard is so that we have
a tool that is compatible with our workflow and meets our requirements.
It's not just about tracking work items; it should be a tool for
creating, evaluating, and progressing changes to projects (stories),
across all stages.

I don't envision the end-state for storyboard to be that we end up
copying data back and forth between it and Gerrit.  Since we're
designing a system from scratch, we might as well design it to do what
we want.

One of our early decisions was to say that UX and code stories have
equally important use cases in StoryBoard.  Collaboration around UX
style blueprints (especially those with graphical mock-ups) sets a
fairly high bar for the kind of interaction we will support.

Gerrit is a great tool for reviewing code and other text media.  But
somehow it is even worse than launchpad for collaborating when visual
media are involved.  Quite a number of blueprints could benefit from
better support for that (not just UI mockups but network diagrams, etc).
We can learn a lot from the experiment of using Gerrit for blueprint
review, and I think it's going to help make StoryBoard a lot better for
all of our use cases.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [qa] [neutron] Neutron Full Parallel job - Last 4 days failures

2014-03-26 Thread James E. Blair
Salvatore Orlando  writes:

> On another note, we noticed that the duplicated jobs currently executed for
> redundancy in neutron actually seem to point all to the same build id.
> I'm not sure then if we're actually executing each job twice or just
> duplicating lines in the jenkins report.

Thanks for catching that, and I'm sorry that didn't work right.  Zuul is
in fact running the jobs twice, but it is only looking at one of them
when sending reports and (more importantly) deciding whether the change
has succeeded or failed.  Fixing this is possible, of course, but turns
out to be a rather complicated change.  Since we don't make heavy use of
this feature, I lean toward simply instantiating multiple instances of
identically configured jobs and invoking them (eg "neutron-pg-1",
"neutron-pg-2").

Matthew Treinish has already worked up a patch to do that, and I've
written a patch to revert the incomplete feature from Zuul.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Infra] PTL Candidacy

2014-03-28 Thread James E. Blair
Hi,

I would like to announce my candidacy for the Infrastructure PTL.

I have developed and operated the project infrastructure for several
years and have been honored to serve as the PTL for the Icehouse cycle.

I was instrumental not only in creating the project gating system and
development process, but also in scaling it from three projects to 250.

We face more scaling challenges of a different nature this cycle.
Interest in our development tools and processes in their own right has
increased dramatically.  This is great for us and the project as it
provides new opportunities for contributions.  Helping the
infrastructure projects evolve into more widely useful tools while
maintaining the sharp focus on serving OpenStack's needs that made them
compelling in the first place is a challenge I look forward to.

The amazing growth of the third-party CI system is an area where we can
make a lot of improvement.  During Icehouse, everyone was under deadline
pressure so we tried to limit system or process modifications that would
impact that.  During Juno, I would like to improve the experience both
for third-party CI providers as well as the developers who use their
results.

I am thrilled to be a part of one of the most open free software project
infrastructures, and I would very much like to continue to serve as its
PTL.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Dependency freeze exception for happybase (I would like version 0.8)

2014-03-28 Thread James E. Blair
Sean Dague  writes:

> So how did Ceilometer get into this situation? Because the ceilometer
> requirements are happybase>=0.4,<=0.6

Is this a case where testing minimums might have helped?
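
For illustration, by "testing minimums" I mean running the unit tests
against the lowest versions the requirements claim to support -- very
roughly something like the sketch below (the sed expression and file
names are hypothetical, not an existing job):

  # Turn each ">=X" lower bound into an exact "==X" pin and test that.
  sed -e 's/>=\([0-9][^,<>]*\).*/==\1/' requirements.txt > min-requirements.txt
  pip install -r min-requirements.txt -e .
  testr run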

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack-Infra] [infra] Consolidating efforts around Fedora/Centos gate job

2014-04-11 Thread James E. Blair
Ian Wienand  writes:

> Then we have the question of the nodepool setup scripts working on
> F20.  I just tested the setup scripts from [3] and it all seems to
> work on a fresh f20 cloud image.  I think this is due to kchamart,
> peila2 and others who've fixed parts of this before.
>
> So, is there:
>
>  1) anything blocking having f20 in the nodepool?
>  2) anything blocking a simple, non-voting job to test devstack
> changes on f20?

I'm not aware of blockers; I think this is great, so thanks to all the
people who have worked to make it happen!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [infra][nova][docker]

2014-04-12 Thread James E. Blair
Michael Still  writes:

> Agreed, where it makes sense. In general we should be avoiding third
> party CI unless we need to support something we can't in the gate -- a
> proprietary virt driver, or weird network hardware for example. I
> think we've now well and truly demonstrated that third party CI
> implementations are hard to run well.
>
> Docker doesn't meet either of those tests.
>
> However, I can see third party CI being a stepping stone required by
> the infra team to reduce their workload -- in other words that they'd
> like to see things running consistently as a third party CI before
> they move it into their world. However, I'll leave that call to the
> infra team.

Well put, Michael and Russell.  We see the test infrastructure as a
commons that helps enable new projects to join the community.  Any tests
that involve open source components and that can physically run in the
project infrastructure almost certainly should.  That helps with
repeatability, maintenance, and integration into the OpenStack community
and, eventually, the project itself.

As far as workload goes, I wouldn't ask for a third-party CI setup first
to demonstrate viability because, as you say, it's quite a bit of work.
What would be very helpful is to do as much local testing of proposed
jobs as possible before submitting a review to infra to add them.  Most
folks are pretty good about that already.  Also, reviews of job changes
are always welcome in the openstack-infra/config repository!  Both of
those things will help with infra review workload tremendously, and
still take less time than running private infrastructure.

I look forward to seeing docker tests running in OpenStack
infrastructure soon.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] TC Candidacy

2014-04-14 Thread James E. Blair
Hi,

I'd like to announce my candidacy for re-election to the TC.

About Me
========

I am the PTL for the OpenStack Infrastructure Program, which I have been
helping to build for nearly three years.  I also served on the TC during
the Icehouse cycle.

I am responsible for a significant portion of our project infrastructure
and developer workflow.  I set up gerrit, helped write git-review, and
moved all of the OpenStack projects from bazaar to git.  All of that is
to say that I've given a lot of thought and action to helping scale
OpenStack to the number of projects and developers it has today.

I also wrote zuul, nodepool, and devstack-gate to make sure that we are
able to test all components of the project on every commit.  There are a
lot of moving parts in OpenStack and I strongly believe that at the end
of the day they need to work together as a cohesive whole.  A good deal
of my technical work is focused on achieving that.

Throughout my time working on OpenStack I have always put the needs of
the whole project first, above those of any individual contributor,
organization, or program.  I also believe in the values we have
established as a project: open source, design, development, and
community.  To that end, I have worked hard to ensure that the project
infrastructure is run just like an OpenStack project, using the same
tools and processes, and I think we've succeeded in creating one of the
most open operational project infrastructures ever.

My Platform
===========

An important shift has taken place since the convening of the first
all-elected TC.  I am proud of what we have done to address important
issues of consistency and quality that affect all of the projects.

As we continue to add new projects and programs (at a measured pace),
the role of the TC will need to continue to evolve to provide
coordination and leadership in cross-project issues.  We have made a lot
of progress on raising our expectations around testing, and there is
more to do there.  The TC also has a role in improving user and operator
experience by facilitating cross-project standards.

Recently, the TC has begun to work with the Board of Directors on issues
that concern us both.  The trademark issue is particularly important --
it is no less than our identity as a project.  To me, the question is,
"what is the OpenStack community working to produce?"  I believe that
software derived from OpenStack with limited and reasonable
modifications should be able to use the OpenStack trademark.  However, I
do not believe our intent is to produce a crippled, open-core product.
We are building one of the most significant open source systems and it
is important that our name continue to stand for that.  I think the
careful progress that the TC and Board have been making in this area
will ultimately reflect that.

I believe that the TC has been doing good work and is heading in the
right direction for the project.  I would love to continue to help it do
so and would appreciate your vote.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Gerrit downtime and upgrade on 2014-04-28

2014-04-22 Thread James E. Blair
Zaro  writes:

> Hello All.  The OpenStack infra team has been working to put
> everything in place so that we can upgrade review.o.o from Gerrit
> version 2.4.4 to version 2.8.4  We are happy to announce that we are
> finally ready to make it happen!
>
> We will begin the upgrade on Monday, April 28th at 1600 UTC (the
> OpenStack recommended 'off' week).
>
> We would like to advise that you can expect a couple hours of downtime
> followed by several more hours of automated systems not quite working
> as expected.  Hopefully you shouldn't notice anyway because you should
> all be on vacation :)

Hi,

This is a reminder that next week, Gerrit will be unavailable for a few
hours starting at 1600 UTC on April 28th.

There are a few changes that will impact developers.  We will have more
detailed documentation about this soon, but here are the main things you
should know about:

* The "Important Changes" view is going away.  Instead, Gerrit 2.8
  supports complex custom dashboards.  We will have an equivalent of the
  "Important Changes" screen implemented as a custom dashboard.

* The "Approval" review label will be renamed to "Workflow".  The +1
  value will still be "Approved" and will be available to core
  developers -- nothing about the approval process is changing.

* The new "Workflow" label will have a "-1 Work In Progress" value which
  will replace the "Work In Progress" button and review state.  Core
  reviewers and change owners will have permission to set that value
  (which will be removed when a new patchset is uploaded).

* We will also take this opportunity to change Gerrit's SSH host key.
  We will supply instructions for updating your known_hosts file.  As a
  reminder, you can always verify the fingerprints on this page:
  https://review.openstack.org/#/settings/ssh-keys
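
  For reference, refreshing a known_hosts entry typically looks roughly
  like the following (a sketch only -- use the official instructions and
  verify the fingerprints once they are published):

    ssh-keygen -R '[review.openstack.org]:29418'
    ssh-keyscan -t rsa -p 29418 review.openstack.org >> ~/.ssh/known_hosts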

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [Infra] PTL Candidacy

2014-09-22 Thread James E. Blair
I would like to announce my candidacy for the Infrastructure PTL.

I have developed and operated the project infrastructure for several
years and have been honored to serve as the PTL for the Juno cycle.

I was instrumental not only in creating the project gating system and
development process, but also in scaling it from three projects to 400.

During the Juno cycle, we have just started on a real effort to make the
project infrastructure consumable in its own right.  There is a lot of
interest from downstream consumers of our tools but our infrastructure
is not set up for that kind of re-use.  We're slowly changing that so
that people who run infrastructure systems similar to ours can
contribute back upstream just like any other OpenStack project.

I am anticipating a number of changes to the OpenStack project that are
related: the further acceptance of a "Big Tent", and changes to the
gating structure to accommodate it.  I believe that changes to our
testing methodology, including a smaller integrated gate and more
functional testing, which we outlined at the QA/Infra sprint, fit right
into that.  These are multi-release efforts, and I am looking forward to
continuing them in Kilo.

All of these efforts mean a lot of new people working on a lot of new
areas of the Infrastructure program in parallel.  A big part of the work
in the next cycle will be helping to coordinate those efforts and make
the Infrastructure program a little less monolithic to support all of
this work.

I am thrilled to be a part of one of the most open free software project
infrastructures, and I would very much like to continue to serve as its
PTL.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] [all] Freezing openstack-infra/config repo on Sept 25

2014-09-23 Thread James E. Blair
Hi,

On Thursday, Sept 25 at 00:01 UTC we will freeze project-configuration
related changes to the openstack-infra/config repo.

This is part of an effort to move project-related configuration out of
the config repository and into its own repo.  The goals are both to make
the existing config repo (soon to be renamed to system-config) more
re-usable by other parties, as well as to make it easier for people
interested in maintaining the CI and developer systems of the OpenStack
project to collaborate.

In other words, we're moving all of the Jenkins, Zuul, and Gerrit
configuration files into their own repo.  This means you can review
these changes without having to deal with all that puppet stuff we're
always doing.  And if you want to make our puppet very shiny but don't
want to deal with all those new projects and their jobs, we'll have a
repo for that too.  I'll write more about this after the move.

The work is described in this spec (including a list of affected files):

http://specs.openstack.org/openstack-infra/infra-specs/specs/config-repo-split.html

We anticipate that the actual cutover will happen sometime Thursday, and
we will follow up with an update when the freeze is over.  Note that
after the cutover, project configuration changes will need to be
proposed to the new repo (and any changes that miss the freeze cutoff
will need to be re-proposed).

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] [all] Announcing the project-config repo

2014-09-25 Thread James E. Blair
We have moved (most) project configuration data out of the
openstack-infra/config repository into a new repository called
openstack-infra/project-config.

This repo contains only config files related to configuring software
projects in the OpenStack project infrastructure.  This includes:

  * Zuul
  * Jenkins Job Builder
  * Gerrit
  * Nodepool
  * IRC bots
  * The index page for specs.openstack.org

There are some things that are still in the config repo that we would
like to move but require further refactoring.  However, the bulk of
project related configuration is in the new repository.

Why Was This Done?
==================

We have done this for a number of reasons:

  * To make it easier for people who care about the "big tent" of
OpenStack to review changes to add new projects and changes to the
CI system.

  * To make it easier for people who care about system administration of
the project infrastructure to review those changes.

  * To make the software that we use to run the infrastructure a little
more reusable by downstream consumers.

For more about the rationale and the mechanics of the split itself, see
this spec:

  http://specs.openstack.org/openstack-infra/infra-specs/specs/config-repo-split.html

How To Use the New Repo
=======================

All of the same files are present with their history, but we have
reorganized the repo to make it a bit more convenient.  Most files are
simply one or two levels down from the root directory under what I
sincerely hope is a meaningful name.  For instance:

  zuul/layout.yaml
  jenkins/jobs/devstack-gate.yaml
  ...and so on...

Here is a browseable link to the repo:

  http://git.openstack.org/cgit/openstack-infra/project-config/tree/

And you know about our documentation, right?  It's all been updated with
the new paths.  Highlights include:

  * The stackforge howto:  http://ci.openstack.org/stackforge.html
  * Our Zuul docs:  http://ci.openstack.org/zuul.html
  * Our JJB docs:  http://ci.openstack.org/jjb.html
  * And many others accessible from:  http://ci.openstack.org/

Finally, all those neat jobs that tell you that you added a job without
a definition or didn't put something in alphabetical order are all
running on the new repo as well.

What Next?
==========

If you had an outstanding patch against the config repo that was
affected by the split, you will need to re-propose it to the
project-config repo.
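
One hypothetical way to carry a change over (the change number, branch
name, and patch file below are made up; adjust moved paths by hand):

  cd config
  git review -d 123456                       # check out the old change locally
  git format-patch -1 --stdout > /tmp/my-change.patch
  cd ../project-config
  git checkout -b reproposed-change
  git am -3 /tmp/my-change.patch             # fix up any moved files, then:
  git review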

You should review changes in project-config.  Yes -- you.  If you have
any idea what this stuff is, reviewing changes to this repo will be a
big help to all the projects that are using our infrastructure.

This repo has its own group of core reviewers.  Currently it includes
only infra-core, but regular reviewers who understand the major systems
involved, the general requirements for new projects, and the overall
direction of the testing infrastructure will be nominated for membership
in the project-config-core team.

As always, feel free to reply to this email or visit us in
#openstack-infra with any questions.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Nominating Anita Kuno for project-config-core

2014-09-26 Thread James E. Blair
I'm pleased to nominate Anita Kuno to the project-config core team.

The project-config repo is a constituent of the Infrastructure Program
and has a core team structured to be a superset of infra-core with
additional reviewers who specialize in the area.

Anita has been reviewing new projects in the config repo for some time
and I have been treating her approval as required for a while.  She has
an excellent grasp of the requirements and process for creating new
projects and is very helpful to the people proposing them (who are often
creating their first commit to any OpenStack repository).

She also did most of the work in actually creating the project-config
repo from the config repo.

Please respond with support or concerns and if the consensus is in
favor, we will add her next week.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Nominating Sean Dague for project-config-core

2014-09-26 Thread James E. Blair
I'm pleased to nominate Sean Dague to the project-config core team.

The project-config repo is a constituent of the Infrastructure Program
and has a core team structured to be a superset of infra-core with
additional reviewers who specialize in the area.

For some time, Sean has been the person we consult to make sure that
changes to the CI system are testing what we think we should be testing
(and just as importantly, not testing what we think we should not be
testing).  His knowledge of devstack, devstack-gate, tempest, and nova
is immensely helpful in making sense of what we're actually trying to
accomplish.

Please respond with support or concerns and if the consensus is in
favor, we will add him next week.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Nominating Andreas Jaeger for project-config-core

2014-09-26 Thread James E. Blair
I'm pleased to nominate Andreas Jaeger to the project-config core team.

The project-config repo is a constituent of the Infrastructure Program
and has a core team structured to be a superset of infra-core with
additional reviewers who specialize in the area.

Andreas has been doing an incredible amount of work simplifying the
Jenkins and Zuul configuration for some time.  He's also been making it
more complicated where it needs to be -- making the documentation jobs
in particular a model of efficient re-use that is far easier to
understand than what he replaced.  In short, he's an expert in Jenkins
and Zuul configuration and both his patches and reviews are immensely
helpful.

Please respond with support or concerns and if the consensus is in
favor, we will add him next week.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] periodic jobs for master

2014-10-24 Thread James E. Blair
Andrea Frittoli  writes:

> I also believe we can find ways to make post-merge / periodic checks useful.
> We need to do that to keep the gate to a sane scale.

Yes, we have a plan to do that that we outlined at the infra/QA meetup
this summer and described to this list in this email:

http://lists.openstack.org/pipermail/openstack-dev/2014-July/041057.html

Particularly this part, but please read the whole message if you have
not already, or have forgotten it:

  * For all non gold standard configurations, we'll dedicate a part of
our infrastructure to running them in a continuous background loop,
as well as making these configs available as experimental jobs. The
idea here is that we'll actually be able to provide more
configurations that are operating in a more traditional CI (post
merge) context. People that are interested in keeping these bits
functional can monitor those jobs and help with fixes when needed.
The experimental jobs mean that if developers are concerned about
the effect of a particular change on one of these configs, it's easy
to request a pre-merge test run.  In the near term we might imagine
this would allow for things like ceph, mongodb, docker, and possibly
very new libvirt to be validated in some way upstream.

  * Provide some kind of easy to view dashboards of these jobs, as well
as a policy that if some job is failing for > some period of time,
it's removed from the system. We want to provide whatever feedback
we can to engaged parties, but people do need to realize that
engagement is key. The biggest part of putting tests into OpenStack
isn't landing the tests, but dealing with their failures.
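
For the first item above, the "continuous background loop" is
essentially a timer-triggered Zuul pipeline.  A rough sketch of what
that could look like in layout.yaml (pipeline and job names invented
here; details vary by Zuul version):

  pipelines:
    - name: periodic
      manager: IndependentPipelineManager
      precedence: low
      trigger:
        timer:
          - time: '0 6 * * *'

  projects:
    - name: openstack/nova
      periodic:
        - periodic-tempest-dsvm-ceph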

I'm glad to see people interested in this.  If you're ready to
contribute to it, please stop by #openstack-infra or join our next team
meeting[1] to discuss how you can help.

-Jim

[1] https://wiki.openstack.org/wiki/Meetings/InfraTeamMeeting

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all][doc] project testing interface

2014-10-24 Thread James E. Blair
Angelo Matarazzo  writes:

> Hi all,
> I have a question for you devs.
> I don't understand the difference between this link
> http://git.openstack.org/cgit/openstack/governance/tree/reference/project-testing-interface.rst
> and
> https://wiki.openstack.org/wiki/ProjectTestingInterface
>
> Some parts don't match (e.g. unittest running section).
> If the git link is the right doc should we update the wiki page?
>
> I found the reference to the wiki page here:
> https://lists.launchpad.net/openstack/msg08058.html

The git repo is authoritative now, and the wiki is out of date.  Feel
free to update the wiki to point to the git repo.  We're working on
publishing the governance repo and so should have a nicer looking page
and URL for that soon.  Thanks!

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Nominations for jenkins-job-builder core

2014-06-20 Thread James E. Blair
Hi,

The Jenkins Job Builder project (part of the Infrastructure program) is
quite popular even outside of OpenStack and has a group of specialist
core reviewers supplemental to the rest of the Infrastructure program.

To that group I would like to add Darragh Bailey:

https://review.openstack.org/#/q/reviewer:%22Darragh+Bailey%22+project:openstack-infra/jenkins-job-builder,n,z

and Marc Abramowitz:

https://review.openstack.org/#/q/reviewer:%22Marc+Abramowitz%22+project:openstack-infra/jenkins-job-builder,n,z

Both have contributed significantly to the project and have a sustained
record of reviews that show understanding of the project and development
environment.

Please feel free to respond with messages of support or concern, and if
the consensus is in favor, we will add them to the core team next week.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [infra] Gerrit downtime on Saturday, June 28, 2014

2014-06-26 Thread James E. Blair
Hi,

On Saturday, June 28 at 15:00 UTC Gerrit will be unavailable for about
15 minutes while we rename some projects.  Existing reviews, project
watches, etc, should all be carried over.  The current list of projects
that we will rename is:

stackforge/designate -> openstack/designate
openstack-dev/bash8 -> openstack-dev/bashate
stackforge/murano-repository -> stackforge-attic/murano-repository
stackforge/murano-metadataclient -> stackforge-attic/murano-metadataclient
stackforge/murano-common -> stackforge-attic/murano-common
stackforge/murano-conductor -> stackforge-attic/murano-conductor
stackforge/murano-tests -> stackforge-attic/murano-tests

This list is subject to change.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit

2014-06-27 Thread James E. Blair
Sean Dague  writes:

> It seems what we actually want is a dashboard of these results. We want
> them available when we go to Gerrit, but we don't want them in Gerrit
> itself.

I agree with most of what you wrote, particularly that we want them
available in Gerrit and with a sensible presentation.  Though I don't
think that necessarily excludes storing the data in Gerrit, as Khai
points out.

I think one ideal interface is a table in Gerrit of all the jobs run on
a change (from all systems) and their results (with links).  That sounds
like the project David is working on (that Khai pointed out), which
isn't surprising -- we sketched out the initial design with him about
three years ago; our needs are very similar.

David's proposal is a table like:

  Job/Category | started on | ended on | href | status
  foo [...]
  bar [...]
  bat [...]

I think that's a great approach because it provides the needed
information to reviewers in an appropriate interface in Gerrit.

I suspect that whether the data are stored in Gerrit or a different
system, the (future) vinz web ui could retrieve it from either.

An alternate approach would be to have third-party CI systems register
jobs with OpenStack's Zuul rather than using their own account.  This
would mean only a single report of all jobs (upstream and 3rd-party)
per-patchset.  It significantly reduces clutter and makes results more
accessible -- but even with one system we've never actually wanted to
have Jenkins results in comments, so I think one of the other options
would be preferred.  Nonetheless, this is possible with a little bit of
work.

If anyone is seriously interested in working on Sean's proposal or
something similar, please write up a spec in the
openstack-infra/infra-specs repo.  If you'd like to help with David's
proposal, the upstream Gerrit project is the best place to collaborate.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit

2014-06-27 Thread James E. Blair
Sean Dague  writes:

> This has all gone far enough that someone actually wrote a Grease Monkey
> script to purge all the 3rd Party CI content out of Jenkins UI. People
> are writing mail filters to dump all the notifications. Dan Berange
> filters all them out of his gerrit query tools.

I should also mention that there is a pending change to do something
similar via site-local Javascript in our Gerrit:

  https://review.openstack.org/#/c/95743/

I don't think it's an ideal long-term solution, but if it works, we may
have some immediate relief without all having to install greasemonkey
scripts.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit

2014-06-28 Thread James E. Blair
Matt Riedemann  writes:

> I would be good with Jenkins not reporting on a successful run, or if
> rather than a comment from Jenkins the vote in the table had a link to
> the test results, so if you get a -1 from Jenkins you can follow the
> link from the -1 in the table rather than the comment (to avoid
> cluttering up the review comments, especially if it's a +1).

The problem with that is it makes non-voting jobs very difficult to see,
not to mention ancillary information like job runtime.  Plus it adds
extra clicks to get to jobs whose output is frequently reviewed even
with positive votes, such as docs jobs (on docs-draft.o.o).

I think a new table on the page, separate from the comment stream, with
only the latest results of each job is the ideal presentation.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all] 3rd Party CI vs. Gerrit

2014-06-30 Thread James E. Blair
Joshua Hesketh  writes:

> On 6/28/14 10:40 AM, James E. Blair wrote:
>> An alternate approach would be to have third-party CI systems register
>> jobs with OpenStack's Zuul rather than using their own account.  This
>> would mean only a single report of all jobs (upstream and 3rd-party)
>> per-patchset.  It significantly reduces clutter and makes results more
>> accessible -- but even with one system we've never actually wanted to
>> have Jenkins results in comments, so I think one of the other options
>> would be preferred.  Nonetheless, this is possible with a little bit of
>> work.
>
> I agree this isn't the preferred solution, but I disagree with the
> little bit of work. This would require CI systems registering with
> gearman which would mean security issues. The biggest problem with
> this though is that zuul would be stuck waiting from results from 3rd
> parties which often have very slow return times.

"Security issues" is a bit vague.  They already register with Gerrit;
I'm only suggesting that the point of aggregation would change.  I'm
anticipating that they would use authenticated SSL, with ACLs scoped to
the names of jobs each system is permitted to run.  From the perspective
of overall security as well as network topology (ie, firewalls), very
little changes.  The main differences are third party CI systems don't
have to run Zuul anymore, and we go back to having a smaller number of
votes/comments.

Part of the "little bit of work" I was referring to was adding a
timeout.  That should truly be not much work, and work we're planning on
doing anyway to help with the tripleo cloud.

But anyway, it's not important to design this out if we prefer another
solution (and I prefer the table of results separated from comments).

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack-Infra] [QA][Infra] Mid-Cycle Meet-up Registration Closed

2014-07-03 Thread James E. Blair
Matthew Treinish  writes:

> Hi Everyone,
>
> Just a quick update, we have to close registration for the Infra/QA mid-cycle
> meet-up. Based on the number of people who have signed up on the wiki page [1]
> we are basically at the maximum capacity for the rooms we reserved. So if you
> had intended to come but didn't sign up on the wiki unfortunately there isn't
> any space left.

We've had a few people contact us after registration closed.  I've added
their names, in order, to a waitlist on the wiki page:

  https://wiki.openstack.org/wiki/Qa_Infra_Meetup_2014

If you have already registered and find that you can no longer attend,
please let us know ASAP.

If you can only attend some days, please note that in the comments
field.

If you would like to attend but have not registered, you may add your
name to the end of the waitlist in case there are cancellations; but we
can't guarantee anything.

Thanks to everyone who has expressed interest!  And I'm sorry we can't
accommodate everyone.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Thoughts on the patch test failure rate and moving forward

2014-07-23 Thread James E. Blair
OpenStack has a substantial CI system that is core to its development
process.  The goals of the system are to facilitate merging good code,
prevent regressions, and ensure that there is at least one configuration
of upstream OpenStack that we know works as a whole.  The "project
gating" technique that we use is effective at preventing many kinds of
regressions from landing, however more subtle, non-deterministic bugs
can still get through, and these are the bugs that are currently
plaguing developers with seemingly random test failures.

Most of these bugs are not failures of the test system; they are real
bugs.  Many of them have even been in OpenStack for a long time, but are
only becoming visible now due to improvements in our tests.  That's not
much help to developers whose patches are being hit with negative test
results from unrelated failures.  We need to find a way to address the
non-deterministic bugs that are lurking in OpenStack without making it
easier for new bugs to creep in.

The CI system and project infrastructure are not static.  They have
evolved with the project to get to where they are today, and the
challenge now is to continue to evolve them to address the problems
we're seeing now.  The QA and Infrastructure teams recently hosted a
sprint where we discussed some of these issues in depth.  This post from
Sean Dague goes into a bit of the background: [1].  The rest of this
email outlines the medium and long-term changes we would like to make to
address these problems.

[1] https://dague.net/2014/07/22/openstack-failures/

==Things we're already doing==

The elastic-recheck tool[2] is used to identify "random" failures in
test runs.  It tries to match failures to known bugs using signatures
created from log messages.  It helps developers prioritize bugs by how
frequently they manifest as test failures.  It also collects information
on unclassified errors -- we can see how many (and which) test runs
failed for an unknown reason and our overall progress on finding
fingerprints for random failures.

[2] http://status.openstack.org/elastic-recheck/
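
As a rough illustration, a fingerprint is just a small YAML file holding
an Elasticsearch query; the file name and error message here are
invented:

  # queries/1331274.yaml -- named for the Launchpad bug it tracks
  query: >
    message:"Timed out waiting for server to become ACTIVE" AND
    tags:"console"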

We added a feature to Zuul that lets us manually "promote" changes to
the top of the Gate pipeline.  When the QA team identifies a change that
fixes a bug that is affecting overall gate stability, we can move that
change to the top of the queue so that it may merge more quickly.
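
The promotion itself is a one-line command with the Zuul client on the
Zuul server -- roughly (change number made up):

  zuul promote --pipeline gate --changes 12345,3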

We added the clean check facility in reaction to the January gate
breakdown.  While it does mean that any individual patch might see more tests
run on it, it's now largely kept the gate queue at a countable number of
hours, instead of regularly growing to more than a work day in
length. It also means that a developer can Approve a code merge before
tests have returned, and not ruin it for everyone else if there turned
out to be a bug that the tests could catch.

==Future changes==

===Communication===
We used to be better at communicating about the CI system.  As it and
the project grew, we incrementally added to our institutional knowledge,
but we haven't been good about maintaining that information in a form
that new or existing contributors can consume to understand what's going
on and why.

We have started on a major effort in that direction that we call the
"infra-manual" project -- it's designed to be a comprehensive "user
manual" for the project infrastructure, including the CI process.  Even
before that project is complete, we will write a document that
summarizes the CI system and ensure it is included in new developer
documentation and linked to from test results.

There are also a number of ways for people to get involved in the CI
system, whether focused on Infrastructure or QA, but it is not always
clear how to do so.  We will improve our documentation to highlight how
to contribute.

===Fixing Faster===

We introduce bugs to OpenStack at some constant rate, which piles up
over time. Our systems currently treat all changes as equally risky and
important to the health of the system, which makes landing code changes
to fix key bugs slow when we're at a high reset rate. We've got a manual
process of promoting changes today to get around this, but that's
actually quite costly in people time, and takes getting all the right
people together at once to promote changes. You can see a number of the
changes we promoted during the gate storm in June [3], and it was no
small number of fixes to get us back to a reasonably passing gate. We
think that optimizing this system will help us land fixes to critical
bugs faster.

[3] https://etherpad.openstack.org/p/gatetriage-june2014

The basic idea is to use the data from elastic recheck to identify that
a patch is fixing a critical gate related bug. When one of these is
found in the queues it will be given higher priority, including bubbling
up to the top of the gate queue automatically. The manual promote
process should no longer be needed, and instead bugs fixing elastic
recheck tracked issues will be promoted automatically.

[openstack-dev] [Infra] Infrastructure PTL candidacy

2013-09-20 Thread James E. Blair
Hi,

I would like to announce my candidacy for the Infrastructure PTL.

I have been developing and operating the infrastructure for the
OpenStack project for more than two years and have been instrumental in
establishing our current project gating system and development
methodology.

I am extremely proud that we have created what is possibly the most
transparent, open, reusable and accessible infrastructure of any free
software project.  I want to continue to move toward having all
infrastructure maintained through code-review and open to contributions.
I want our infrastructure to be re-usable by other projects and easy for
other projects to contribute improvements, while keeping sight of the
fact that running services for the OpenStack project is our primary
goal.

I look forward to continuing to help the Infrastructure program meet the
incredibly unique needs of the OpenStack project as well as be a model
for other free software projects.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Introducing the NNFI scheduler for Zuul

2013-09-26 Thread James E. Blair
We recently made a change to Zuul's scheduling algorithm (how it
determines which changes to combine together and run tests).  Now when a
change fails tests (or has a merge conflict), Zuul will move it out of
the series of changes that it is stacking together to be tested, but it
will still keep that change's position in the queue.  Jobs for changes
behind it will be restarted without the failed change in their proposed
repo states.  And if something later fails ahead of it, Zuul will once
again put it back into the stream of changes it's testing and give it
another chance.

To visualize this, we've updated the status screen to include a tree
view:

  http://status.openstack.org/zuul/

(If you already have that loaded, be sure to hit reload.)

In Zuul, this is called the Nearest Non-Failing Item (NNFI) algorithm
because in short, each item in a queue is at all times being tested
based on the nearest non-failing item ahead of it in the queue.
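
To make the selection rule concrete, here is a tiny sketch in Python
(an illustration only, not Zuul's actual code; queue items are assumed
to carry a boolean "failing" attribute):

  def nearest_non_failing_ahead(queue, index):
      """Return the closest non-failing item ahead of queue[index],
      or None if everything ahead is failing (test on the branch tip)."""
      for item in reversed(queue[:index]):
          if not item.failing:
              return item
      return None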

On the infrastructure side, this is going to drive our use of cloud
resources even more, as Zuul will now try to run as many jobs as it can,
continuously.  Every time a change fails, all of the jobs for changes
behind it will be aborted and restarted with a new proposed future
state.

For developers, this means that changes should land faster, and more
throughput overall, as Zuul won't be waiting as long to re-test changes
after a job has failed.  And that's what this is ultimately about --
virtual machines are cheap compared to developer time, so the more
velocity our automated tests can sustain, the more velocity our project
can achieve.

-Jim


(PS: There is a known problem with the status page not being able to
display the tree correctly while Zuul is in the middle of recalculating
the change graph.  That should be fixed by next week, but in the mean
time, just enjoy the show.)

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] Gerrit downtime for repository renames

2013-10-03 Thread James E. Blair
Hi,

On Saturday October 5th at 1600 UTC, Gerrit will be offline for a
short time while we rename source code repositories.  To convert
that time to your local timezone, see:

  http://www.timeanddate.com/worldclock/fixedtime.html?iso=20131005T16

We will be renaming the following repositories:

  stackforge/python-savannaclient->  openstack/python-savannaclient
  stackforge/savanna ->  openstack/savanna
  stackforge/savanna-dashboard   ->  openstack/savanna-dashboard
  stackforge/savanna-extra   ->  openstack/savanna-extra
  stackforge/savanna-image-elements  ->  openstack/savanna-image-elements
  stackforge/python-tuskarclient ->  openstack/python-tuskarclient
  stackforge/tuskar  ->  openstack/tuskar
  stackforge/tuskar-ui   ->  openstack/tuskar-ui

As usual, we will announce updates on Freenode in #openstack-dev and
will be available in #openstack-infra to help with any issues.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] TC Candidacy

2013-10-04 Thread James E. Blair
Hi,

I'd like to announce my candidacy for the TC.

About Me
========

I am the PTL for the OpenStack Infrastructure Program, which I have been
helping to build for the past two years.

I am responsible for a significant portion of our project infrastructure
and developer workflow.  I set up gerrit, helped write git-review, and
moved all of the OpenStack projects from bzr to git.  All of that is to
say that I've given a lot of thought and action to helping scale
OpenStack to the number of projects and developers it has today.

I also wrote zuul, nodepool, and devstack-gate to make sure that we are
able to test all components of the project on every commit.  There are a
lot of moving parts in OpenStack and I strongly believe that at the end
of the day they need to work together as a cohesive whole.  A good deal
of my technical work is focused on achieving that.

Throughout my time working on OpenStack I have always put the needs of
the whole project first, above those of any individual contributor,
organization, or program.  I also believe in the values we have
established as a project: open source, design, development, and
community.  To that end, I have worked hard to ensure that the project
infrastructure is run just like an OpenStack project, using the same
tools and processes, and I think we've succeeded in creating one of the
most open operational project infrastructures ever.

My Platform
===========

As a significant OpenStack user, I'm excited about the direction that
OpenStack is heading.  I'm glad that we're accepting new programs that
expand the scope of our project to make it more useful for everyone.  I
believe a major function of the Technical Committee is to curate and
shepherd new programs through the incubation process.  However, I
believe that it should be more involved than it has been.  We have been
very quick to promote out of integration some exciting new projects that
may not have been fully integrated.  As a member of the TC, I support
our continued growth, and I want to make sure that the ties that hold
our collection of projects together are strong, and more than just a
marketing umbrella.

I have also seen a shift since the formation of the OpenStack
Foundation.  Our project is a technical meritocracy, but when the
Foundation's board of directors was established, some issues of
project-wide scope have been taken up by the board of directors while
the Technical Committee has been content to limit their involvement.
The Foundation board is extremely valuable, and I want the Technical
Committee to work closely with them on issues that concern them both.

Adding new projects is not the only purpose of the TC, which is charged
in the bylaws as being responsible for all technical matters relating to
the project.  The reformation of the Technical Committee with an
all-elected membership provides an opportunity to strengthen the
technical meritocracy of the OpenStack project by electing people who
will execute the full mission of the TC.  I would be happy to serve in
that capacity and would appreciate your vote.

Thanks,

Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Gerrit downtime for repository renames

2013-10-04 Thread James E. Blair
jebl...@openstack.org (James E. Blair) writes:

> Hi,
>
> On Saturday October 5th at 1600 UTC, Gerrit will be offline for a
> short time while we rename source code repositories.  To convert
> that time to your local timezone, see:
>
>   http://www.timeanddate.com/worldclock/fixedtime.html?iso=20131005T16
>
> We will be renaming the following repositories:
>
>   stackforge/python-savannaclient->  openstack/python-savannaclient
>   stackforge/savanna ->  openstack/savanna
>   stackforge/savanna-dashboard   ->  openstack/savanna-dashboard
>   stackforge/savanna-extra   ->  openstack/savanna-extra
>   stackforge/savanna-image-elements  ->  openstack/savanna-image-elements
>   stackforge/python-tuskarclient ->  openstack/python-tuskarclient
>   stackforge/tuskar  ->  openstack/tuskar
>   stackforge/tuskar-ui   ->  openstack/tuskar-ui
>
> As usual, we will announce updates on Freenode in #openstack-dev and
> will be available in #openstack-infra to help with any issues.
>
> -Jim

Additionally we will rename:

  stackforge/fuel-ostf-tests  ->  stackforge/fuel-ostf

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Gerrit tools

2013-10-23 Thread James E. Blair
"Daniel P. Berrange"  writes:

> Actually, from my POV, the neat one there is the qgerrit script - I had
> no idea you could query this info so easily.

FYI the query syntax for SSH and the web is the same, so you can also
make a bookmark for a query like that.  The search syntax is here:

  https://review.openstack.org/Documentation/user-search.html
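
For example, the same query string works over the SSH API (the project
and query here are just an example):

  ssh -p 29418 USERNAME@review.openstack.org gerrit query \
      --format=JSON status:open project:openstack/nova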

In the next version of Gerrit, you can actually make a dashboard based
on such queries.

However, note the following in the docs about the "file" operator:

  Currently this operator is only available on a watched project and may
  not be used in the search bar.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [OpenStack-Infra] [summit] Youtube stream from Design Summit?

2013-11-03 Thread James E. Blair
Stefano Maffulli  writes:

> This lead us to give up trying for Hong Kong. We searched for 
> alternatives (see the infra meeting logs a few months back and my 
> personal experiments [1] based on UDS experience) and the only solid 
> one was to use a land line to call in the crucial people that *have to* 
> be in a session. In the end I and Thierry made a judgement call to drop 
> that too because we didn't hear anyone demanding it and setting them up 
> in a reliable way for all sessions required a lot of efforts (that in 
> the past we felt went wasted anyway).
>
> We decided to let moderators find their preferred way to pull in those 
> 1-3 people ad-hoc with their preferred methods.

Indeed, we have an Asterisk server that we set up with the intent that
it should be able to support a large number of simultaneous attendees
(we tested at least 100) for remote summit participation.  But since Stefano and
Thierry knew there was no interest, we did not do the final prep needed
to support the summit.

If people really are interested in remote participation, we would be
happy to help with that using open source tools that are accessible to a
large number of people over a wide range of technology.  Please register
your interest with the track leads, Stefano, or the infrastructure team
a bit earlier next time, and it can happen.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Swift] code changes feedback

2013-06-17 Thread James E. Blair
"Pecoraro, Alex"  writes:

> Is there, by chance, a way to submit a patch file to gerrit via a web
> interface? Just wondering because I'm having a heck of time submitting
> my changes. My desktop computer seems to be blocked by the corporate
> firewall and when I try on my laptop connected to an external network
> I get an error from git review: ValueError: too many values to unpack.

Hi,

Gerrit only accepts commits via git.  Git-review isn't necessary, but it
(usually!) makes it a lot easier.  I'd be happy to help figure out what
the problem is.  If you'd like real-time interactive help, you can
always join us in #openstack-infra on Freenode.  Or we can use email,
but you might consider replying to me personally (keep in mind the
Reply-To header is set to the list) as the next bit may not be
interesting for the rest of the folks here.

Can you send me the output of the following commands:

  git review --version
  git review -v

Thanks,

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] Diskimage-builder, Heat, and the Gate

2013-06-25 Thread James E. Blair
Sean Dague  writes:

> Cool proposed change coming in from the Heat folks - 
> https://review.openstack.org/#/c/34278/ to use dib to build their base
> images in devstack. From a development perspective, will make
> experimenting with Heat a lot easier.
>
> However, this raises an issue as we look towards using this in the
> gate, because diskimage-builder isn't currently gated by
> devstack-gate. But if Heat wants to use it we're talking about pulling
> upstream master as part of the build. Which opens us up to an
> asymmetric gate where a dib change can land which breaks the gate,
> then all nova changes are blocked from merging until it gets in.
>
> I think we need to be really explicit that on the gate, every git tree
> we pull in is also gated, to ensure nothing breaks other projects
> ability to merge code. Asymmetric gating is no fun.

Agreed -- and in fact this is enforced by code (which is why 34278 is
currently failing).  Devstack-gate sets the FAIL_ON_CLONE option in
devstack so that if devstack clones a repo that is not managed by
devstack-gate, the tests fail.

Anyway, back to the issue -- yes asymmetric gating is not fun, so the
only way we should be incorporating code into the devstack gate is
either via another gated project, or packaged and released code (be it
an egg or an operating system package).

> This gets a little odder in that dib is out on stackforge and not as
> part of an openstack* github tree. Which on the one hand is just
> naming, on the other hand if heat's going to need that to get through
> the gate, maybe we should rethink it being on stackforge
> vs. openstack-dev?

It looks like the TC agreed that dib should be part of a tripleo
program, and the TC is currently working out the details of programs.
So it looks like there is unlikely to be a hurdle for having dib
officially recognized.  We just need to decide where to put it.

-Jim

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

