Re: [openstack-dev] [dib][heat] dib-utils/dib-run-parts/dib v2 concern

2017-03-16 Thread Gregory Haynes
On Thu, Mar 16, 2017, at 05:18 AM, Steven Hardy wrote:
> On Wed, Mar 15, 2017 at 04:22:37PM -0500, Ben Nemec wrote:
> > While looking through the dib v2 changes after the feature branch was merged
> > to master, I noticed this commit[1], which brings dib-run-parts back into dib
> > itself.  Unfortunately I missed the original proposal to do this, but I have
> > some concerns about the impact of this change.
> > 
> > Originally the split was done so that dib-run-parts and one of the
> > os-*-config projects (looks like os-refresh-config) that depends on it could
> > be included in a stock distro cloud image without pulling in all of dib.
> > Note that it is still present in the requirements of orc: 
> > https://github.com/openstack/os-refresh-config/blob/master/requirements.txt#L5
> > 
> > Disk space in a distro cloud image is at a premium, so pulling in a project
> > like diskimage-builder to get one script out of it was not acceptable, at
> > least from what I was told at the time.
> > 
> > I believe this was done so a distro cloud image could be used with Heat out
> > of the box, hence the heat tag on this message.  I don't know exactly what
> > happened after we split out dib-utils, so I'm hoping someone can confirm
> > whether this requirement still exists.  I think Steve was the one who made
> > the original request.  There were a lot of Steves working on Heat at the
> > time though, so it's possible I'm wrong. ;-)
> 
> I don't think I'm the Steve you're referring to, but I do have some
> additional info as a result of investigating this bug:
> 
> https://bugs.launchpad.net/tripleo/+bug/1673144
> 
> It appears we have three different versions of dib-run-parts on the
> undercloud (and, presumably overcloud nodes) at the moment, which is a
> pretty major headache from a maintenance/debugging perspective.
> 

I looked at the bug and I think there may only be two different
versions? The versions in /bin and /usr/bin seem to come from the same
package (so I hope they are the same version). I don't understand what
is going on with the ./lib version but that seems like either a local
package / checkout or something else non-dib related.

Two versions is certainly less than ideal, though :).
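
One way to check whether the copies really differ is to hash each one. The following is a minimal sketch, not an existing tool: the function name find_copies is hypothetical, and it only looks at the directories you hand it (PATH in the usage example), so a copy living under a ./lib checkout would need to be added to the list by hand.

```python
import hashlib
import os
from collections import defaultdict


def find_copies(name, search_dirs):
    """Group every executable copy of `name` by its SHA-256 content hash."""
    by_digest = defaultdict(list)
    for directory in search_dirs:
        path = os.path.join(directory, name)
        if os.path.isfile(path) and os.access(path, os.X_OK):
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_digest[digest].append(path)
    return dict(by_digest)


if __name__ == "__main__":
    # Skip empty PATH entries so we never probe the current directory by accident.
    dirs = [d for d in os.environ.get("PATH", "").split(os.pathsep) if d]
    for digest, paths in find_copies("dib-run-parts", dirs).items():
        print(digest[:12], *paths)
```

More than one line of output means more than one distinct version is installed; paths printed on the same line have identical contents.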

> However we resolve this, *please* can we avoid permanently forking the
> tool, as e.g. in that bug, where do I send the patch to fix leaking
> profiledir directories?  What package needs an update?  What is
> installing the script being run that's not owned by any package?
> 
> Yes, I know the answer to some of those questions, but I'm trying to
> point out that duplicating this script and shipping it from multiple
> repos/packages is pretty horrible from a maintenance perspective,
> especially for new or casual contributors.
> 

I agree. You answered my previous question of whether os-refresh-config
is still in use (sounds like it definitely is) so this complicates
things a bit.

> If we have to fork it, I'd suggest we should rename the script to avoid
> the confusion I outline in the bug above, e.g one script -> one repo ->
> one package?

I really like this idea of renaming the script in dib which should
clarify the source of each script and prevent conflicts, but this still
leaves the fork-related issues. If we go the route of just keeping the
current state (of there being a fork) I think we should do the rename.

The issue I spoke of (complications with depending on dib-utils when
installing dib in a venv) I think came from a combination of this
dependency and not requiring a package install (you used to be able to
./bin/disk-image-create without installation). Now that we require
installation this may be less of an issue.

So the two reasonable options seem to be: 
* Deal with the forking cost. Not the biggest cost when you notice
dib-utils hasn't had a commit in over 3 months and that one was a robot
commit to add some github flair.
* Switch back to dib-utils in the other repo. I'm starting to prefer
this slightly given that it seems there's a valid use case for it to
live externally and our installation story has become a lot more clean.
AFAIK this shouldn't prevent us from making the script more portable,
but please correct me if there's something I'm missing.

> 
> Thanks!
> 
> Steve
> 

Cheers,
- Greg


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [dib][heat] dib-utils/dib-run-parts/dib v2 concern

2017-03-15 Thread Gregory Haynes
On Wed, Mar 15, 2017, at 04:22 PM, Ben Nemec wrote:
> While looking through the dib v2 changes after the feature branch was 
> merged to master, I noticed this commit[1], which brings dib-run-parts 
> back into dib itself.  Unfortunately I missed the original proposal to 
> do this, but I have some concerns about the impact of this change.
> 
> Originally the split was done so that dib-run-parts and one of the 
> os-*-config projects (looks like os-refresh-config) that depends on it 
> could be included in a stock distro cloud image without pulling in all 
> of dib.  Note that it is still present in the requirements of orc: 
> https://github.com/openstack/os-refresh-config/blob/master/requirements.txt#L5
> 

I had forgotten about this, but you're completely correct - the
os-refresh-config phases are run via dib-run-parts. The reason for
moving dib-run-parts back into dib was to simplify some of the
installation insanity we had going on. I want to say it was one reason
you couldn't run disk-image-create from a virtualenv without sourcing it
first.
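
For anyone unfamiliar with the pattern, running a "phase" through a run-parts style tool boils down to executing every executable file in a directory in sorted name order. The sketch below illustrates only that general idea and is not the dib-run-parts implementation: the real script does more (the profiledir mentioned elsewhere in this thread is one example), and the function name run_parts is an assumption made for illustration.

```python
import os
import subprocess


def run_parts(directory):
    """Run each executable file in `directory` in sorted name order.

    Returns the names that ran successfully. The first script that exits
    non-zero raises subprocess.CalledProcessError and stops the run.
    """
    ran = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        # Skip subdirectories and files without an execute bit (e.g. a README).
        if not (os.path.isfile(path) and os.access(path, os.X_OK)):
            continue
        subprocess.run([path], check=True)
        ran.append(name)
    return ran
```

Because the ordering comes purely from file names, scripts are usually given numeric prefixes such as 10-first and 20-second to control when they run.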

> Disk space in a distro cloud image is at a premium, so pulling in a 
> project like diskimage-builder to get one script out of it was not 
> acceptable, at least from what I was told at the time.
> 
> I believe this was done so a distro cloud image could be used with Heat 
> out of the box, hence the heat tag on this message.  I don't know 
> exactly what happened after we split out dib-utils, so I'm hoping 
> someone can confirm whether this requirement still exists.  I think 
> Steve was the one who made the original request.  There were a lot of 
> Steves working on Heat at the time though, so it's possible I'm wrong.
> ;-)
> 
> Anyway, I don't know that anything is broken at the moment since I 
> believe dib-run-parts was brought over unchanged, but the retirement of 
> dib-utils was proposed in https://review.openstack.org/#/c/445617 and I 
> would like to resolve this question before we do anything like that.
> 

I think you're right in that nothing should be broken ATM since the API
is consistent. I agree that it doesn't make a lot of sense to retire
something which is depended on by other non-retired projects. The
biggest issue I can see with us leaving dib-utils in its current state
is there's the opportunity for the two implementations to drift and have
slightly different dib-run-parts APIs. Maybe we could prevent this by
deprecating dib-utils (or leaving a big warning in the README that this
tool is frozen) and leaving os-refresh-config as is. Although it isn't
ideal for os-refresh-config to depend on a deprecated tool I am not sure
anyone is making use of os-refresh-config currently so I am hesitant to
suggest we add back the complexity to DIB.

> Thanks.
> 
> -Ben
> 
> 1: 
> https://github.com/openstack/diskimage-builder/commit/d65678678ec0416550d768f323ceace4d0861bca
> 

Thanks!
- Greg



Re: [openstack-dev] [tripleo][diskimage-builder] Status of diskimage-builder

2017-03-06 Thread Gregory Haynes
On Sat, Mar 4, 2017, at 12:13 PM, Andre Florath wrote:
> Hello!
> 
> Thanks Greg for sharing your thoughts.  The idea of splitting off DIB
> from OpenStack is new for me, therefore I collect some pros and
> cons:
> 
> Stay in OpenStack:
> 
> + Use available OpenStack infrastructure and methods
> + OpenStack should include a possibility to create images for ironic,
>   VMs and docker. (Yes - there are others, but DIB is the best! :-) )
> + Customers use DIB because it's part of OpenStack and for OpenStack
>   (see e.g. [1])
> + Popularity of OpenStack attracts more developers than a separate
>   project (IMHO running DIB as a separate project even lowers the low
>   number of contributors).
> + 'Short Distances' if there are special needs for OpenStack.
> + Some OpenStack projects use DIB - and also use internal 'knowledge'
>   (like build-, run- or test-dependencies) - it would be not that easy
>   to completely separate this in short term.
> 

Ah, I may have not been super clear - I definitely agree that we
wouldn't want to move off of being hosted by OpenStack infra (for all
the reasons you list). There are actually two classes of project hosted
by OpenStack infra - OpenStack projects and OpenStack related projects
which have differing requirements
(https://docs.openstack.org/infra/manual/creators.html#decide-status-of-your-project).
What I've noticed is we tend to align more with the openstack-related
projects in terms of what we ask for / how we develop (e.g. not
following the normal release cycle, not really being a 'deliverable' of
an OpenStack release). AIUI though the distinction of whether you're an
official project team or a related project just distinguishes what
restrictions are placed on you, not whether you can be hosted by
OpenStack infra.

> As a separate project:
> 
> - Possibly less organizational overhead.
> - Independent releases possible.
> - Develop / include / concentrate also for / on other non-OpenStack
>   based virtualization platforms (EC2, Google Cloud, ...)
> - Extend the use cases to something like 'DIB can install a wide range
>   of Linux distributions on everything you want'.
>   Example: DIB Element to install Raspberry Pi [2] (which is currently
>   not the core use-case but shows how flexible DIB is).
> 
> In my opinion the '+' arguments are more important, therefore DIB
> should stay within OpenStack as a sub-project.  I don't really care
> about the master: TripleO, Infra, glance, ...
> 
> 

Out of this list I think infra is really the only one which makes sense.
TripleO is the current setup and makes only slightly more sense than
Glance at this point: we'd be an odd appendage in both situations.
Having been in this situation for some time I tend to agree that it
isn't a big issue; it tends to just be a mild annoyance every now and
then. IMO it'd be nice to resolve this issue once and for all, though
:).

> I want to touch an important point: Greg you are right that there are
> only a very few developers contributing for DIB.  One reason
> is IMHO, that it is not very attractive to work on DIB; some examples:
> 
> o The documentation how to set up a DIB development environment [3]
>   is out of date.
> o Testing DIB is a nightmare: a developer has no chance to test
>   as it is done in the CI (which is currently setup by other OpenStack
>   projects?). Round-trip times of ~2h - and then it often fails,
>   because of some mirror problem...
> o It takes sometimes very long until a patch is reviewed and merged
>   (e.g. still open since 1y1d [6]; basic refactoring [7] was filed
>   about 9 months ago and still not in the master).
> o There are currently about 100 elements in DIB. Some of them are
>   highly hardware dependent; some are known not to work; a lot of them
>   need refactoring.

I can't agree more on all of this. TBH I think working on docs is
probably the most effective thing someone could do with DIB ATM because,
as you say, that's how you enable people to contribute. The theory is
that this is also what helps with the review latency - ask newer
contributors to help with initial reviews. That being said, I'd be
surprised if the large contributor count grows much unless some of the
use cases change, simply because it's very much a plumbing tool for many
of our consumers, not something people are looking to drive feature
development into.

> 
> It is important to work on these topics to make DIB more attractive and
> possibly have more contributors.  Discussions about automated
> development environment setup [4] or better developer tests [5] started
> but need more attention and discussions (and maybe a different setting
> than a patch / review).
> In addition we should concentrate on the core functionalities: block
> device setup, minimal system installation, bootloader, kernel and
> ramdisk creation and a stable extensible element interface; drop
> non-core elements or move them to the projects where they are used.
> 

+1

> Kind regards
> 
> Andre
> 
> 
> [1] 

Re: [openstack-dev] [tripleo][diskimage-builder] Status of diskimage-builder

2017-03-03 Thread Gregory Haynes
Hello,

Thanks for bringing this back to life.

As I am sure some are aware I have been mostly absent from DIB lately,
so don't let me stop you all from going forward with this or any of the
other plans. I just wanted to do a bit of a braindump on my thought
process from a while back on why I never went through with trying to
become an independent openstack project team.

The main issue that prevented me from going forward with this was that I
worried we were too small for it to work effectively.  IME DIB tends to
have a fair amount of drive by contributors and a very small (roughly
2-3)  set of main contributors who are very part-time and who certainly
aren't primarily focused on DIB (or even upstream OpenStack).
Fortunately, I think the project does fine with this setup: The number
of new features scales up or down to meet our contributor capacity and
we haven't required a ton of firefighting in recent memory. Not only
that, we actually seem to be extremely stable in this setup which is
great given how we can break many other projects in ways which are non
trivial to debug.

Our low contributor capacity does pose some problems when you try to
become an OpenStack project team though. Firstly, someone needs to agree
to be PTL and, essentially, take the responsibilities seriously [1]. In
addition to the issue of having someone willing to do this, I worried
that the responsibilities would take up a non trivial amount of time
(for our low activity project) which previously went to other tasks
keeping the project afloat. I also was not sure we would be doing anyone
any favors if a cycle or two down the road we ended up in a spot where
no one is interested in running for PTL even though the project itself
is doing fine. Maybe some of the TC folks can correct me if I'm wrong
but that seems to create a fair bit of churn where a decision has to be
made on whether to attic the project or do something else like appoint a
PTL.

All that to say - If we decide to go the route of becoming an
independent openstack project would we have someone willing to be PTL
and do we think that would be an effective use of our time?



WRT us being consumed by glance or infra - I think either of these could
work. I hadn't heard anything to the effect of infra not wanting us, but
AFAIK none of us has stepped up to really ask. One issue with infra is
that, typically, OpenStack projects do not depend directly on infra
projects. I am sure others have a better idea of the pitfalls here. OTOH
we have a pretty large shared set of knowledge between DIB and infra
which makes this option fairly attractive.

My primary concern with glance is that AFAIK the only relation we've had
historically is the word 'image' in our project description. That is to
say, I don't know of any shared knowledge between the contributor base.
As a result I am not really a fan of this option.

For both of these it's not really an issue of whether we'd like to 'own'
the project IMO (it's all the same open source project after all, we
don't own it). It's mostly a matter of whether it's technically feasible
(e.g. are there issues with infra due to things like dependencies) and
whether it makes any sense from a collaboration standpoint (otherwise
we'll end up right back where we are but with another parent project
team).



I'd like to propose a third option which I think may be best - We could
become an independent non-openstack project hosted by openstack infra.
This would allow us to effectively continue operating as we do today
which is IMO ideal. Furthermore, this would resolve some of the issues
we've had relating to the release process where we desired to be
release:independent and tag our own releases (we would no longer be the
release team's concern, rather than needing to be special-cased). I feel
like we've been effectively operating in this manner (a non-openstack
independent project) so it seems a natural fit to me. Hopefully some of
the more openstack-process enlightened can chime in to confirm that this
is doable and ok, or to point out any big issues I am missing here...


HTH,
Greg

--

1: https://docs.openstack.org/project-team-guide/ptl.html


On Thu, Mar 2, 2017, at 03:31 PM, Emilien Macchi wrote:
> On Thu, Jan 12, 2017 at 3:06 PM, Yolanda Robla Mota 
> wrote:
> > From my point of view, i've been using that either on infra with
> > puppet-infracloud, glean.. and now with TripleO. So in my opinion, it shall
> > be an independent project, with core contributors from both sides.
> >
> > On Thu, Jan 12, 2017 at 8:51 PM, Paul Belanger 
> > wrote:
> >>
> >> On Thu, Jan 12, 2017 at 02:11:42PM -0500, James Slagle wrote:
> >> > On Thu, Jan 12, 2017 at 1:55 PM, Emilien Macchi 
> >> > wrote:
> >> > > On Thu, Jan 12, 2017 at 12:06 PM, Paul Belanger
> >> > >  wrote:
> >> > >> Greetings,
> >> > >>
> >> > >> With the containerization[1] of tripleo, I'd like to know more about
> >> > >> 

Re: [openstack-dev] [infra][diskimage-builder] containers, Containers, CONTAINERS!

2017-01-16 Thread Gregory Haynes
On Thu, Jan 12, 2017, at 03:32 PM, Andre Florath wrote:
> Hello!
> 
> > The end result of this would be we have distro-minimal which depends on
> > kernel, minimal-userspace, and yum/debootstrap to build a vm/baremetal
> > capable image. We could also create a distro-container element which
> > only depends on minimal-userspace and yum/debootstrap and creates a
> > minimal container. The point being - the top level -container or
> > -minimal elements are basically convenience elements for exporting a few
> > vars and pulling in the proper elements at this point and the
> > elements/code are broken down by the functionality they provide rather
> > than use case.
> 
> This sounds awesome! Do we have some outline (etherpad) around
> where we collect all those ideas?
> 

Not that I know of... we have the ML now though :).

In seriousness though, doing this as part of the spec, or as a different
spec sounds like a great idea.

> Kind regards
> 
> Andre
> 
> 



Re: [openstack-dev] [infra][diskimage-builder] containers, Containers, CONTAINERS!

2017-01-13 Thread Gregory Haynes
On Wed, Jan 11, 2017, at 03:04 PM, Paul Belanger wrote:
> On Sun, Jan 08, 2017 at 02:45:28PM -0600, Gregory Haynes wrote:
> > On Fri, Jan 6, 2017, at 09:57 AM, Paul Belanger wrote:
> > > On Fri, Jan 06, 2017 at 09:48:31AM +0100, Andre Florath wrote:
> > > > Hello Paul,
> > > > 
> > > > thank you very much for your contribution - it is very appreciated.
> > > > 
> > 
> > Seconded - I'm very excited for some effort to be put in to improving
> > the use case of making containers with DIB. Thanks :).
> > 
> > > > You addressed a topic with your patch set that was IMHO not in a wide
> > > > focus: generating images for containers.  The ideas in the patches are
> > > > good and should be implemented.
> > > > 
> > > > Nevertheless I'm missing the concept behind your patches. What I saw
> > > > are a couple of (independent?) patches - and it looks that there is
> > > > one 'big goal' - but I did not really get it.  My proposal is (as it
> > > > is done for other bigger changes or introducing new concepts) that
> > > > you write a spec for this first [1].  That would help other people
> > > > (see e.g. Matthew) to use the same blueprint also for other
> > > > distributions.
> > 
> > I strongly agree with the point that this is something we're going to end
> > up repeating across many distros so we should make sure there's some
> > common patterns for doing so. A spec seems fine to me, but ideally the
> > end result involves some developer documentation. A spec is probably a
> > good place to get started on getting some consensus which we can turn in
> > to the dev docs.
> > 
> This plan is to start with ubuntu, then move to debian, then fedora and
> finally centos. Fedora and CentOS are obviously harder, since a
> debootstrap tool doesn't exist.
> 

Right, although I believe we've solved a fair amount of the hard bits
with our yum-minimal element which performs a similar operation to
debootstrap for laying down the root file tree.

> > > Sure, I can write a spec if needed but the TL;DR is:
> > > 
> > > Use diskimage-builder to build debootstrap --variant=minbase chroot, and
> > > nothing else. So I can then take the generated tarball and do something
> > > else with it.
> > > 
> > > > One possibility would be to classify different element sets and define
> > > > the dependency between them.  E.g. to have a element class 'container'
> > > > which can be referenced by other classes, but is not able to reference
> > > > these (e.g. VM or hardware specific things).
> > > > 
> > 
> > It sounds like we need to step back a bit and get a clear idea of how we're
> > going to manage the full use case matrix of distro * (minimal / full) *
> > (container / vm / baremetal), which is something that would be nice to
> > get consensus on in a spec. This is something that keeps tripping up
> > both users and devs and I think adding containers to the matrix is sort
> > of a tipping point in terms of complexity so again, some docs after
> > figuring out our plan would be *awesome*.
> > 
> > Currently we have distro-minimal elements which are minimal
> > vm/baremetal, and distro elements which actually are full vm/baremetal
> > elements. I assume by adding an element class you mean add a set of
> > distro-container elements? If so, I worry that we might be falling in to
> > a common dib antipattern of making distro-specific elements. I have an
> > alternate proposal:
> > 
> > Let's make two elements: kernel, and minimal-userspace which,
> > respectively, install the kernel package and a minimal set of userspace
> > packages for dib to function (e.g. dependencies for dib-run-parts,
> > package-installs). The kernel package should be doable as basically a
> > package-installs and a pkg-map. The minimal-userspace element gets
> > tricky because it needs to install deps which are required for things
> > like package-installs to function (which is why the various distro
> > elements do this independently).  Even so, I think it would be nice to
> > take care of installing these from within the chroot rather than from
> > outside (see https://review.openstack.org/#/c/392253/ for a good reason
> > why). If we do this then the minimal-userspace element can have some
> > common logic to enter the chroot as part of root.d and then install the
> > needed deps.
> > 
> > The end result of this would be we have distro-minimal which depe

Re: [openstack-dev] [infra][diskimage-builder] containers, Containers, CONTAINERS!

2017-01-08 Thread Gregory Haynes
On Fri, Jan 6, 2017, at 09:57 AM, Paul Belanger wrote:
> On Fri, Jan 06, 2017 at 09:48:31AM +0100, Andre Florath wrote:
> > Hello Paul,
> > 
> > thank you very much for your contribution - it is very appreciated.
> > 

Seconded - I'm very excited for some effort to be put in to improving
the use case of making containers with DIB. Thanks :).

> > You addressed a topic with your patch set that was IMHO not in a wide
> > focus: generating images for containers.  The ideas in the patches are
> > good and should be implemented.
> > 
> > Nevertheless I'm missing the concept behind your patches. What I saw
> > are a couple of (independent?) patches - and it looks that there is
> > one 'big goal' - but I did not really get it.  My proposal is (as it
> > is done for other bigger changes or introducing new concepts) that
> > you write a spec for this first [1].  That would help other people
> > (see e.g. Matthew) to use the same blueprint also for other
> > distributions.

I strongly agree with the point that this is something we're going to end
up repeating across many distros so we should make sure there's some
common patterns for doing so. A spec seems fine to me, but ideally the
end result involves some developer documentation. A spec is probably a
good place to get started on getting some consensus which we can turn in
to the dev docs.

> Sure, I can write a spec if needed but the TL;DR is:
> 
> Use diskimage-builder to build debootstrap --variant=minbase chroot, and
> nothing else. So I can then take the generated tarball and do something
> else with it.
> 
> > One possibility would be to classify different element sets and define
> > the dependency between them.  E.g. to have a element class 'container'
> > which can be referenced by other classes, but is not able to reference
> > these (e.g. VM or hardware specific things).
> > 

It sounds like we need to step back a bit and get a clear idea of how we're
going to manage the full use case matrix of distro * (minimal / full) *
(container / vm / baremetal), which is something that would be nice to
get consensus on in a spec. This is something that keeps tripping up
both users and devs and I think adding containers to the matrix is sort
of a tipping point in terms of complexity so again, some docs after
figuring out our plan would be *awesome*.
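
To put a rough number on that matrix, here is a small sketch that simply enumerates it. The distro list is an assumption taken from the four distros named elsewhere in this thread (ubuntu, debian, fedora, centos); the other two axes are the ones listed above.

```python
from itertools import product

# Assumption: the four distros mentioned elsewhere in this thread.
DISTROS = ("ubuntu", "debian", "fedora", "centos")
SIZES = ("minimal", "full")
TARGETS = ("container", "vm", "baremetal")


def use_case_matrix(distros=DISTROS, sizes=SIZES, targets=TARGETS):
    """Return every (distro, size, target) combination as a list of tuples."""
    return list(product(distros, sizes, targets))


if __name__ == "__main__":
    cells = use_case_matrix()
    print(len(cells), "combinations")
    for distro, size, target in cells:
        print(f"{distro}-{size}-{target}")
```

With four distros that is 4 * 2 * 3 = 24 cells, which is why one element per cell does not scale and why breaking elements down by function looks attractive.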

Currently we have distro-minimal elements which are minimal
vm/baremetal, and distro elements which actually are full vm/baremetal
elements. I assume by adding an element class you mean add a set of
distro-container elements? If so, I worry that we might be falling in to
a common dib antipattern of making distro-specific elements. I have an
alternate proposal:

Let's make two elements: kernel, and minimal-userspace which,
respectively, install the kernel package and a minimal set of userspace
packages for dib to function (e.g. dependencies for dib-run-parts,
package-installs). The kernel package should be doable as basically a
package-installs and a pkg-map. The minimal-userspace element gets
tricky because it needs to install deps which are required for things
like package-installs to function (which is why the various distro
elements do this independently).  Even so, I think it would be nice to
take care of installing these from within the chroot rather than from
outside (see https://review.openstack.org/#/c/392253/ for a good reason
why). If we do this then the minimal-userspace element can have some
common logic to enter the chroot as part of root.d and then install the
needed deps.

The end result of this would be we have distro-minimal which depends on
kernel, minimal-userspace, and yum/debootstrap to build a vm/baremetal
capable image. We could also create a distro-container element which
only depends on minimal-userspace and yum/debootstrap and creates a
minimal container. The point being - the top level -container or
-minimal elements are basically convenience elements for exporting a few
vars and pulling in the proper elements at this point and the
elements/code are broken down by the functionality they provide rather
than use case.
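
The composition described above can be sketched as a small dependency table plus a resolver. This is only an illustration of the proposal: the table is hypothetical (ubuntu stands in for "distro", debootstrap for "yum/debootstrap"), the kernel, minimal-userspace and ubuntu-container entries do not claim to exist in DIB today, and DIB's own dependency handling is not modelled here.

```python
# Hypothetical dependency table illustrating the proposal above.
ELEMENT_DEPS = {
    "ubuntu-minimal": ["kernel", "minimal-userspace", "debootstrap"],
    "ubuntu-container": ["minimal-userspace", "debootstrap"],
    "kernel": [],
    "minimal-userspace": [],
    "debootstrap": [],
}


def expand(element, deps=ELEMENT_DEPS):
    """Return the set of elements that `element` pulls in, transitively.

    The starting element itself is not included in the result.
    """
    resolved = set()
    stack = list(deps[element])
    while stack:
        current = stack.pop()
        if current in resolved:
            continue
        resolved.add(current)
        stack.extend(deps.get(current, []))
    return resolved


if __name__ == "__main__":
    only_in_minimal = expand("ubuntu-minimal") - expand("ubuntu-container")
    print(sorted(only_in_minimal))
```

In this model the only difference between the vm/baremetal image and the container image is the kernel element, which is the point of splitting by functionality rather than by use case.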

> > There are additional two major points:
> > 
> > * IMHO you addressed only some elements that need adaptations to be
> >   able to be used in containers.  One element I stumbled over yesterday
> >   is the base element: it is always included until you explicitly
> >   exclude it.  This base element depends on a complete init-system -
> >   which is for a container unneeded overhead. [2]

I think you're on the right track with removing base - we had consensus
a while back that it should go away but we never got around to it. The
big issue is going to be preserving backwards compat or making it part
of a major version bump and not too painful to upgrade to. I think we
can have this convo on the patchset, though.

> 
> Correct, for this I simply pass the -n flag to disk-image-create. This
> removes the need to include the base element. If we want to make a
> future optimization
> 

Re: [openstack-dev] Fedora AArch64 (64-bit ARM) support in diskimage-builder

2016-11-14 Thread Gregory Haynes

On Mon, Nov 14, 2016, at 05:06 PM, dmar...@redhat.com wrote:
> On 11/14/2016 04:40 PM, Ben Nemec wrote:
> >
> >
> > On 11/11/2016 10:55 AM, dmar...@redhat.com wrote:
> >>
> >> I have been looking at using diskimage-builder on Fedora AArch64. While
> >> there is 64-bit ARM support for Ubuntu (arm64), there appear to be a few
> >> things missing for Fedora.  Is this the correct list to ask questions,
> >> and propose minor changes to diskimage-builder in support of this 
> >> effort?
> >
> > Yes.  I would suggest you tag your emails with [diskimage-builder] so 
> > the relevant people are more likely to see them.
> 
> Will do.
> 
> 
> Thank you for the feedback,
> 
> d.marlin
> 

Additionally, if you're an IRC fan, we hang out in #openstack-dib on
freenode.

Thanks,
Greg



Re: [openstack-dev] [diskimage-builder] Stepping down as diskimage-builder core

2016-09-18 Thread Gregory Haynes
On Sat, Sep 17, 2016, at 10:23 AM, Clint Byrum wrote:
> It's been my honor to be included with the core reviewer group on
> diskimage-builder, but my focus has changed quite a bit and I'm not
> able to keep up with the review load. As a result, I've lost most of
> the context and struggle to understand patches enough to +2 them.
> 
> Given that, I'm laying down my +2 in diskimage-builder.
> 

Thanks for all your help, Clint!

-Greg



Re: [openstack-dev] [devstack] on stud replacement for tls-proxy option on Ubuntu Xenial

2016-09-04 Thread Gregory Haynes
On Sun, Sep 4, 2016, at 10:51 AM, Gregory Haynes wrote:
> On Sat, Sep 3, 2016, at 09:41 AM, Dean Troyer wrote:
>> On Fri, Sep 2, 2016 at 11:30 PM, Masanori Itoh
>> <masanori.i...@gmail.com> wrote:
>>> It eliminates 'stud' usage and replace it by apache2/mod_ssl, right?
>>>
>>> But, there are use cases like:
>>> - use apache2/mod_wsgi for better performance
>>> and
>>> - have an out-of-the-box SSL terminator (box)
>>
>> What is the ongoing use case for a distinct terminator?  To simulate
>> an actual non-Apache terminator?  I really don't know how often this
>> is needed, if the TLS can be properly handled otherwise.
>
> I actually think the split isn't a bad idea - IMO I would have just
> made one user facing var for the sake of simplicity (which is what
> Dean seems to be proposing below) but internally it's still probably
> good to make a split between ssl frontend and internal apis. The way
> things are coded this split is already paid for: service installers
> already bind to their own internal port and tell the tls proxy about
> the mapping, and by having the split we get a bunch of extra
> flexibility.
>
> My thinking is that with [1] we can support a mix of services which
> support ssl termination on their own (simply run them with ssl
> termination on and have apache ssl to the backend as well), run only
> http (done in the patch), or run within apache (keystone/horizon)
> using a single service and code path which looks almost identical. It
> also is trivial to support the use case Masanori was describing with
> this setup (although I'm still unsure whether devstack actually wants
> to support that).
>
>>
>>> Also, we have 'USE_TLS' option enabling to terminate SSL by
>>> apache2/mod_ssl.
>>
>> This should become a default True
>>
>>> So, I think it's better to leave 'tls-proxy' using a non-apache SSL
>>> terminator like 'stud' or 'hitch'
>>> as an option for the use case above.
>>> My fix is like that.
>>>
>>> What do you think about?
>>
>> The addition of stud was originally driven in order to test client
>> TLS operation because of the mess that enabling SSL/TLS on the
>> services directly was at the time and haproxy was overkill.  The
>> complicated split configuration should really be an anti-pattern for
>> OpenStack deployment and needs to just go away in favor of a TLS-
>> everywhere approach using preferably Apache or haproxy (although this
>> is still overkill for DevStack IMHO).  We should be doing that anyway
>> to ensure that all of the internally used client libs are properly
>> TLS-capable (I believe this is the case now).
>>
>> The TLS_PROXY setting could either go away or be an alias for
>> USE_TLS.  And honestly, we should think about setting USE_TLS True by
>> default anyway.
>
> Getting tls on by default is exactly what I was going for in
> https://review.openstack.org/#/c/364016/2 but this is making me
> rethink the method I proposed there. I now think the easiest way to
> attack this is to make tls-proxy work again, and then make USE_SSL
> turn on tls-proxy by default (or require it to be enabled). After that
> whether a service manages ssl directly is an unrelated issue we can
> solve service-by-service while still working and having tls on by
> default - we just turn on tls for the services and have them tell the
> tls-proxy to talk to the backend over tls.
>
>>
>> Eek, making DevStack more secure, who woulda thunk it?
>>
>> dt
>>
>> --
>>
>> Dean Troyer
>> dtro...@gmail.com
>
> Cheers,
> Greg

Aaand I chopped off [1] -

1: https://review.openstack.org/#/c/364013/


Re: [openstack-dev] [devstack] on stud replacement for tls-proxy option on Ubuntu Xenial

2016-09-04 Thread Gregory Haynes
On Sat, Sep 3, 2016, at 09:41 AM, Dean Troyer wrote:
> On Fri, Sep 2, 2016 at 11:30 PM, Masanori Itoh
> <masanori.i...@gmail.com> wrote:
>> It eliminates 'stud' usage and replace it by apache2/mod_ssl, right?
>>
>>  But, there are use cases like:
>>  - use apache2/mod_wsgi for better performance
>>  and
>>  - have an out-of-the-box SSL terminator (box)
>
> What is the ongoing use case for a distinct terminator?  To simulate
> an actual non-Apache terminator?  I really don't know how often this
> is needed, if the TLS can be properly handled otherwise.

I actually think the split isn't a bad idea - IMO I would have just made
one user facing var for the sake of simplicity (which is what Dean seems
to be proposing below) but internally it's still probably good to make a
split between ssl frontend and internal apis. The way things are coded
this split is already paid for: service installers already bind to their
own internal port and tell the tls proxy about the mapping, and by
having the split we get a bunch of extra flexibility.

My thinking is that with [1] we can support a mix of services which
support ssl termination on their own (simply run them with ssl
termination on and have apache ssl to the backend as well), run only
http (done in the patch), or run within apache (keystone/horizon)
using a single service and code path which looks almost identical. It
also is trivial to support the use case Masanori was describing with
this setup (although I'm still unsure whether devstack actually wants
to support that).

>
>> Also, we have 'USE_TLS' option enabling to terminate SSL by
>> apache2/mod_ssl.
>
> This should become a default True
>
>> So, I think it's better to leave 'tls-proxy' using a non-apache SSL
>>  terminator like 'stud' or 'hitch'
>>  as an option for the use case above.
>>  My fix is like that.
>>
>>  What do you think about?
>
> The addition of stud was originally driven in order to test client TLS
> operation because of the mess that enabling SSL/TLS on the services
> directly was at the time and haproxy was overkill.  The complicated
> split configuration should really be an anti-pattern for OpenStack
> deployment and needs to just go away in favor of a TLS-everywhere
> approach using preferably Apache or haproxy (although this is still
> overkill for DevStack IMHO).  We should be doing that anyway to ensure
> that all of the internally used client libs are properly TLS-capable
> (I believe this is the case now).
>
> The TLS_PROXY setting could either go away or be an alias for
> USE_TLS.  And honestly, we should think about setting USE_TLS True by
> default anyway.

Getting tls on by default is exactly what I was going for in
https://review.openstack.org/#/c/364016/2 but this is making me rethink
the method I proposed there. I now think the easiest way to attack this
is to make tls-proxy work again, and then make USE_SSL turn on tls-proxy
by default (or require it to be enabled). After that whether a service
manages ssl directly is an unrelated issue we can solve service-by-
service while still working and having tls on by default - we just turn
on tls for the services and have them tell the tls-proxy to talk to the
backend over tls.
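
A rough sketch of the mapping idea described above, in case it helps the
discussion. This is not devstack code: the class name, method names and
port numbers are all made up for illustration. Each service binds its
own internal port and reports the mapping, along with whether its
backend speaks tls, to a single registry that the tls-proxy
configuration could be generated from.

```python
class TlsProxyRegistry:
    """Hypothetical registry of public-port -> backend mappings."""

    def __init__(self):
        self._routes = {}

    def register(self, service, public_port, internal_port, backend_tls=False):
        # Each service binds its own internal port and reports the mapping.
        if public_port in self._routes:
            raise ValueError("public port %d is already mapped" % public_port)
        # Whether the proxy talks tls to the backend is decided per service.
        scheme = "https" if backend_tls else "http"
        backend = "%s://127.0.0.1:%d" % (scheme, internal_port)
        self._routes[public_port] = (service, backend)

    def routes(self):
        # Sorted by public port so any generated proxy config is stable.
        return [(port, service, backend)
                for port, (service, backend) in sorted(self._routes.items())]


if __name__ == "__main__":
    registry = TlsProxyRegistry()
    registry.register("glance", 9292, 19292)                  # plain http backend
    registry.register("nova", 8774, 18774, backend_tls=True)  # tls to the backend
    for port, service, backend in registry.routes():
        print("%s: terminate tls on %d, forward to %s" % (service, port, backend))
```

With something like this, turning on tls for an individual service only
changes the flag it passes when registering; the frontend side stays the
same.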

>
> Eek, making DevStack more secure, who woulda thunk it?
>
> dt
>
> --
>
> Dean Troyer
> dtro...@gmail.com

Cheers,
Greg


Re: [openstack-dev] A daily life of a dev in ${name_your_fav_project}

2016-08-27 Thread Gregory Haynes
On Fri, Aug 26, 2016, at 11:03 AM, Joshua Harlow wrote:
> Hi folks (dev and more!),
> 
> I was having a conversation with some folks at godaddy around our future 
> plans for a developer lab (where we can have various setups of 
> networking, compute, storage...) for 'exploring' purposes (testing out a 
> new LBAAS for example or ...) and as well as for 'developer' purposes 
> (aka, the 'I need to reproduce a bug or work on a feature that requires 
> having a setup mimicking closer to what we have in staging or
> production').
> 
> And it got me thinking about how other developers (and other companies) 
> are doing this. Do various companies have shared labs that their 
> developers get partitions of for (periods of) usage (for example for a 
> volume vendor I would expect this) or if you are a networking company do 
> you hand out miniature networks (with associated gear) as needed (or do 
> you build out such labs via SDN and software only)?
> 
> Then of course there are the people developing libraries (somewhat of my 
> territory), part of that development can just be done locally and 
> running of tox and such via that, but often times even that is not 
> sufficient (for example pick oslo.messaging or oslo.db, altering this in 
> ways that could pass unittests could still end up breaking its 
> integration with other projects); so the gate helps here (but the gate 
> really is a 'last barrier') so have folks that have been working on say 
> zeromq or the newer amqp versions, what is the daily life of testing and 
> exploring features and development for you?
> 
> Are any of the environments that people may be getting build-out on 
> demand (ie in a cloud-like manner)? For example I could see how it could 
> be pretty nifty to request a environment be built out with say something 
> like the following as a descriptor language:
> 
> build_out:
>   nova:
>     git_url: git://git.openstack.org/openstack/nova
>     git_ref: 
>   neutron:
>     git_url: 
>     git_ref: my sha
> 
> topology:
>   use_standard_config: true
>   build_with_switch_type: XYZ...

I've been playing around with using diskimage-builder to build images
using an input which looks amazingly similar to the top half of this
[1].  I haven't quite taken the plunge to use this for my usual dev
environment but my hope has been to use this + the built in dib caching
+ docker/other container/qcow2 outputting to perform more realistic
tests of what I've been deving. The nifty bit is we've had tooling to
override the location of any repositories in order to be useful with
Zuul so it is trivial to support a set of inputs like this and then
override them to refer to an on-disk location (rather than git.o.o)[2].
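
As a minimal sketch of that override step: the source-repositories
element [2] reads DIB_REPOLOCATION_<name> and DIB_REPOREF_<name>
environment variables, so a descriptor like the one quoted above could
be turned into overrides roughly as follows. The function name, the
descriptor field names and the example paths are hypothetical, and the
rule for turning a repository name into a variable suffix is an
assumption that should be checked against the element's README.

```python
import re


def repo_override_env(descriptor):
    """Build source-repositories override variables from a descriptor.

    descriptor maps a repository name to a dict with optional 'git_url'
    and 'git_ref' keys (hypothetical field names).
    """
    env = {}
    for name, spec in sorted(descriptor.items()):
        # Assumption: characters that are not valid in a shell variable
        # name are replaced with underscores to form the suffix.
        suffix = re.sub(r"[^A-Za-z0-9]", "_", name)
        if spec.get("git_url"):
            env["DIB_REPOLOCATION_" + suffix] = spec["git_url"]
        if spec.get("git_ref"):
            env["DIB_REPOREF_" + suffix] = spec["git_ref"]
    return env


if __name__ == "__main__":
    overrides = repo_override_env({
        "nova": {"git_url": "/opt/stack/nova", "git_ref": "abc123"},
        "os-refresh-config": {"git_url": "/src/os-refresh-config"},
    })
    for key in sorted(overrides):
        print("export %s=%s" % (key, overrides[key]))
```

Entries without a url or ref are simply skipped, so the element's
defaults still apply to them.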

> 
> I hope this info is not just useful to myself (and maybe it's been 
> talked about before, but nothing of recent that I can recall) and I'd be 
> very much interested in hearing what other companies (big and small) are 
> doing here (and also from folks that are not associated with any 
> company, which I guess brings in the question of the OSIC lab).
> 
> -Josh
> 

Cheers,
Greg

[1]:
https://review.openstack.org/#/c/336933/2/elements/python-apps/README.rst
[2]:
http://docs.openstack.org/developer/diskimage-builder/elements/source-repositories/README.html



Re: [openstack-dev] [TripleO][DIB] Proposal to move DIB to its own project team

2016-07-29 Thread Gregory Haynes
On Fri, Jul 29, 2016, at 11:55 AM, Ben Nemec wrote:
> As I noted in the meeting yesterday, I think the lack of response from
> TripleO regarding this topic is kind of answer enough.  TripleO has
> moved away from having a heavy dependency on diskimage-builder (it's
> basically used to install some packages and a handful of elements that
> we haven't been able to replace yet), so I don't see a problem with
> moving dib out of TripleO, as long as we still have some TripleO folks
> on the core team and tripleo-ci continues to test all changes against
> it.  We still care about keeping dib working, but the motivation from
> the TripleO side to do feature development in dib is pretty nonexistent
> at this point, so if a new team wants to take that on then I'm good with
> it.
> 
> Note that the diskimage-builder core team has always been separate from
> the tripleo-core team, so ultimately I guess this would just be a
> governance change?
> 

Awesome, that is what I hoped/expected and why I figured this was a
reasonable move to make. It's good to hear some confirmation.

The cores thing is a bit tricky - there is a separate
diskimage-builder-core group but tripleo-core is a member of
diskimage-builder-core. I think tripleo-core should get moved out from
being diskimage-builder-core but there's some folks who are not in
diskimage-builder-core that are in tripleo-core and are active in DIB.
Maybe we can take all tripleo-core folk who have done 2 or more reviews
this past cycle and add them to diskimage-builder-core?

Cheers,
Greg



[openstack-dev] [TripleO][DIB] Proposal to move DIB to its own project team

2016-07-21 Thread Gregory Haynes
Hello everyone,

The subject sort of says it all - I'd like to propose making
diskimage-builder its own project team.

When we started diskimage-builder and many of the other TripleO
components we designed them with the goal in mind of creating tools that
are useful outside of the TripleO context (in addition to fulfilling our
immediate needs).  To that effect diskimage-builder has become more of a
cross-project tool designed and used by several of the OpenStack
projects and as a result it no longer seems to make sense for
diskimage-builder to be part of the TripleO project team. Our two core
groups have diverged to a large extent over the last several cycles
which has removed much of the value of being part of that project team
while creating some awkward communication issues. To be clear - I
believe this is purely a result of the TripleO project team succeeding
in its goal to improve OpenStack by use of the virtuous cycle and this
is an ideal result of that goal.

Is this something the DIB and TripleO folks agree/disagree with? If
we all agree then I think this should be a fairly straightforward
process, otherwise I welcome some discussion on the topic :).

Cheers,
Greg

-- 
  Gregory Haynes
  g...@greghaynes.net



Re: [openstack-dev] [DIB] [TripleO] Should we have a DIB meeting?

2016-07-18 Thread Gregory Haynes
 
> I'm glad to see some interest. I've proposed a time slot for the new
> meeting here:
>
> https://review.openstack.org/#/c/343871/
>
> which is currently Thursdays, 2000 UTC. I realize the timing isn't
> ideal, but given how widely distributed regular contributors are it's
> not easy to pick a good time. I welcome input here or on the review
> proposal.
 
This works for me, but I am PST so that might be obvious. It'll be
early for the AU folk / late for the EU folk but I am not sure we can
do any better...
 
>
> Thanks,
> Stephane
>
 
Thanks a ton for organizing this!
-Greg


Re: [openstack-dev] [DIB] [TripleO] Should we have a DIB meeting?

2016-07-15 Thread Gregory Haynes
On Fri, Jul 15, 2016, at 06:20 PM, Stephane Miller wrote:
> There are a lot of interesting and complex changes going on with DIB
> at the moment. This is a good thing, but it also means that we have
> more complex decisions to make for the project. We've already taken
> one step to address this in proposing a specs process
> (https://review.openstack.org/#/c/336109/). However, I'm thinking we
> could also use some higher-bandwidth communication.
>
> I'm proposing that we have a regular, IRC-based meeting for the
> project. This could be done on its own, or as part of the TripleO
> meeting. I don't think we necessarily need to do this every week, but
> a fortnightly chance to get together to chat about big changes,
> design, etc would be great.
>
> DIB and TripleO DIB community, what are your thoughts?
>
> - Stephane
 
Thanks for sending this out. I'd like to add another cause for having a
weekly meeting:
 
I have noticed a pattern where our users don't have a great way to
communicate with us and often this results in some incorrect
assumptions about DIB as well as us not realizing some features that
could help these users. I've been trying to make a point to get in
contact with these downstreams and it's been pretty clear that just a
small amount of communication between them and some folks who are more
familiar with DIB goes a long way. Maybe this could serve as a sort of
open office hours for these downstreams to bring up potential issues /
ask questions / etc?
 
Something that might also work rather than decreasing frequency is
making the meeting smaller than the standard hour (maybe just a 30 min
max)? Obviously meeting times are never a minimum though, so we could
also just hope to regularly decide we are done early.
 
Anyhow, I think the risk here is very low and there's a few things it
could help with so IMO it's worth trying out.
 
Cheers,
Greg


Re: [openstack-dev] [DIB] [Ironic] [TripleO] Moving IPA element out of DIB tree

2016-07-15 Thread Gregory Haynes
On Fri, Jul 15, 2016, at 03:46 PM, Ben Nemec wrote:
> I think this probably makes sense, but some more thoughts inline.
> 
> On 07/15/2016 03:13 PM, Stephane Miller wrote:
> > To better support diskimage-builder based IPA images going forward, we'd
> > like to move the ironic-agent element into the ironic-python-agent
> > repository. This will involve:
> > 
> > - Improving support for having multiple copies of an element, so that we
> > may deprecate the diskimage-builder repository copy of the element. See
> > this change and related: https://review.openstack.org/#/c/334785
> > - Moving the element into the repository. This change has been proposed
> > as https://review.openstack.org/#/c/335583/
> > - Deprecating the diskimage-builder copy of the element (TBD)
> > - Adding tests to gate IPA changes on DIB builds (TBD)

We now have some machinery to write per-element tests which result in an
image build and the ability to assert properties of that image. AFAIK no
downstreams of DIB have begun using it but this seems like a great
candidate.

> 
> We could potentially add tripleo-ci to the IPA repo, which would take
> care of this.  As an added bonus, it could cover both the introspection
> and deployment use cases for IPA.
> 
> On the other hand, if a separate Ironic job were added to cover this,
> tripleo could stop ever building new IPA images in CI except in the
> promote jobs when we bump our version of IPA.  This would delay our
> finding problems with IPA element changes, but realistically I'm not
> sure how many of those are happening these days anyway.  I'd expect that
> most changes are happening in IPA itself, which we don't currently CI.
> 
> > - Add upload of DIB-built images to tarballs.openstack.org (TBD)
> 
> We would also need to resolve https://review.openstack.org/#/c/334042/
> 
> I'm not clear why, but the ironic-agent element was given special
> treatment in disk-image-create (which is evil, but what's done is done)
> and we'd need to figure out why and a solution that wouldn't require
> referencing an out-of-tree element in diskimage-builder.
> 

I agree that this is something we should solve, but I don't think it's a
blocker for the element moving out of tree - I think the (nasty) dib
special-casing will still apply as long as the element is named the
same?

I suspect this is a ways off, but an interesting question will be what
distros to base these images off of. AIUI the current published image is
CoreOS based which is something we haven't written an element for (yet).
I don't think there's any issues here, just a lot of options - Do we add
CoreOS support, do you publish multiple images built on various distros
we currently support?

> > 
> > Many IPA deployers currently use DIB based IPA images using the
> > ironic-agent element. However, IPA does not officially support DIB - IPA
> > changes are not tested against DIB, nor are DIB-built images published.
> 
> tripleo-ci actually does publish images, but they aren't well publicized
> at this point, and it only does so when we promote a repo.
> 
> > 
> > This has the following disadvantages:
> > 
> > - The DIB element is not versioned along with IPA, resulting in
> > potential version mismatch and breakage
> > - ironic-agent element changes are not tested with DIB prior to merge
> 
> This isn't true today.  tripleo-ci runs against all diskimage-builder
> changes and uses an IPA ramdisk.  The version mismatch is a legit
> problem with the current setup, although I'm not aware of any actual
> breakages that have happened (which doesn't necessarily mean they
> haven't :-).
> 

I think there's another aspect to this which is that by hosting IPA in
tree we are effectively saying that DIB should co-gate with IPA changes
(how else can IPA test changes to its element?). The problem with this
is that DIB installs a lot of things and there isn't much value in us
co-gating with everything we install - it also isn't sustainable.
Really, we want IPA to gate on changes to the IPA element and for DIB to
have robust enough testing that it will reliably produce a workable OS
for the IPA install logic to run in. I think that moving the IPA element
into the IPA tree makes a lot of sense from this standpoint.

As for breakages from not co-gating - all of the dib + ironic breakages
I remember were when we used the old ramdisk element which had a lot
more ironic specific logic in the element. Now that IPA is a thing and
isn't a bunch of bash inside of DIB the surface area for DIB to break
Ironic is actually pretty low (which is awesome).

> > 
> > Understandably, tripleo and other projects may have concerns with regard
> > to this change. I hope to start a discussion here so that those concerns
> > can be addressed. Further in-depth discussion of this issue can be found
> > in the relevant launchpad bug:
> > https://bugs.launchpad.net/ironic-python-agent/+bug/1590935
> > 
> > Thanks,
> > Stephane


Cheers,
Greg


[openstack-dev] [DIB][TripleO] Refreshing the DIB specs process

2016-06-30 Thread Gregory Haynes
Hello everyone,

I believe our DIB specs process is in need of a refresh. Currently, we
seem to avoid specs altogether. I think this has worked while we have
mostly maintained our status-quo of fixing bugs which pop up and adding
fairly straightforward elements. Recently, however, we seem to be making
a push toward some larger changes which require more careful thought and
discussion. I think this is great and I really want this type of
development to continue and so I would like to steer us towards using
specs for these larger changes in order to keep our development process
sustainable.

The biggest barrier I see to us using specs is that historically our
specs have lived in the tripleo-specs repo. When we had a significant
overlap between tripleo-core and dib-core this worked well, but lately
many of the dib reviewers are not tripleo-core. This means that if we
were to use tripleo-specs we would not be able to approve our own specs
(which, obviously, doesn't make a lot of sense). As a result, I'd like
to propose the creation of a specs directory inside of the
diskimage-builder repo[1] which we use for our specs process.

Additionally, one of the goals I have for our specs process is to not
stifle the ability for developers to quickly fix bugs. Relative to other
projects we seem to have a high rate of trivial bugfixes which come in
(I believe due to the nature of the problem we are solving) and we need
to not place unnecessary roadblocks on getting those merged. Similarly
to other projects, I have documented a trivial specs clause to our specs
process so we can hopefully facilitate this.

Cheers,
Greg

1: https://review.openstack.org/#/c/336109/

-- 
  Gregory Haynes
  g...@greghaynes.net



Re: [openstack-dev] [lbaas][octavia] suggestion for today's meeting agenda: How to make the Amphora-agent support additional Linux flavors

2016-06-29 Thread Gregory Haynes
On Wed, Jun 29, 2016, at 02:18 PM, Nir Magnezi wrote:
> Hi Greg,
>
> Thanks for the reply, comments inline.
>
> On Wed, Jun 29, 2016 at 9:59 PM, Gregory Haynes
> <g...@greghaynes.net> wrote:
>> On Wed, Jun 29, 2016, at 10:26 AM, Nir Magnezi wrote:
>>> Hello,
>>>
>>> Lately, I've been working on a fix[1] for the amphora-agent, which
>>> currently only supports Debian based flavors such as Ubuntu.
>>>
>>> The main Issues here:
>>> 1. NIC hot plugs: Ubuntu's ethX.cfg files look different from
>>>    ifcfg-ethX files which are accepted in Linux flavors such as RHEL,
>>>    CentOS and Fedora, read more in the fix commit msg.
>>> 2. The usage of Flavor specific features such as 'upstart'.
>>>
>>> I would like to have a discussion about the second bullet mentioned
>>> above.
>>> Due to the fact that in Octavia the loadbalancer runs inside of an
>>> instance (Amphora), There are few actions that need to take place in
>>> the Amphora instance boot process:
>>> a. namespace and NIC created.
>>> b. amphora agent starts
>>> c. haproxy (and possibly keepalived) start
>>>
>>> The Amphora-agent leverages[2] the capabilities of 'upstart' to make
>>> that happen, which is a bit problematic if we wish it to work on
>>> other flavors.
>>> The default cloud image for Amphora today is Ubuntu, yet there are
>>> few more options[3] such as CentOS and Fedora.
>>> Unlike the Ubuntu base image, which uses 'sysvinit', The latter two
>>> flavors use 'systemd'.
>>> This creates incompatibility with the jinja2[4][5] templates used by
>>> the agent.
>>>
>>> The way I see it there are two possible solutions for this:
>>> 1. Include a systemd equivalent in the fix[1] that will essentially
>>>    duplicate the functionality mentioned above and work in the other
>>>    flavors.
>>> 2. Have the amphora agent be the only binary that needs to be
>>>    configured to start upon boot, and that agent will take care of
>>>    plugging namespaces and NICs and also spawning needed processes.
>>>    This is how it is done in lbaas and l3 agents.
>>>
>>> While the latter solution looks like a more "clean" design, the
>>> trade-off here is a bigger change to the amphora agent.
>>>
>>> [1] https://review.openstack.org/#/c/331841/
>>> [2] 
>>> https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/agent/api_server/listener.py#L128
>>> [3]https://github.com/openstack/octavia/blob/master/diskimage-create/diskimage-create.sh#L27
>>> [4]https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/agent/api_server/templates/upstart.conf.j2
>>> [5]https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/agent/api_server/templates/sysvinit.conf.j2
>>>
>>>
>>> Thanks,
>>> Nir
>>
>> I have an alternative suggestion - Maybe we shouldn't be templating
>> out the init scripts? What we are effectively doing here is code-gen
>> which leads to problems exactly like this, and fixing it with more
>> code gen actually makes the problem more difficult.
>>
> The incompatibility with systemd is not due to the usage of templates,
> and code-generated files are a nice and useful tool to have.
 
Sure, it's not a direct result, but it just shouldn't be necessary here
and it makes this problem far more complicated than it needs to be. If
we weren't using templating then supporting non-upstart would be as easy
as creating a trivial init script and including it in the amphora
element (which only requires copying a file in to that element, done.).
 
>
>> I see two fairly straightforward ways to not require this templating:
>>
>> 1) Use the agent to write out config for the init scripts in to
>>    /etc/default/amphora and have the init scripts consume that file
>>    (source variables in that file). The init script can then simply
>>    be a static file which we can even bake in to the image directly.
>
> systemd does not use init scripts, which is why the current code is
> incompatible with the distros I mentioned.
 
Right, what I am saying is to separate out configuration from the
init/upstart/systemd files and if necessary source that
configuration. This is how init/upstart/systemd scripts are written
for almost every application for a reason and why ubuntu has
/etc/default and why systemd has things like EnvironmentFile. It
sounds like the second option is what we're leaning towards though, in
which case this isn't needed.
 
>>

Re: [openstack-dev] [lbaas][octavia] suggestion for today's meeting agenda: How to make the Amphora-agent support additional Linux flavors

2016-06-29 Thread Gregory Haynes
On Wed, Jun 29, 2016, at 10:26 AM, Nir Magnezi wrote:
> Hello,
>
> Lately, I've been working on a fix[1] for the amphora-agent, which
> currently only supports Debian based flavors such as Ubuntu.
>
> The main Issues here:
> 1. NIC hot plugs: Ubuntu's ethX.cfg files look different from
>    ifcfg-ethX files which are accepted in Linux flavors such as RHEL,
>    CentOS and Fedora, read more in the fix commit msg.
> 2. The usage of Flavor specific features such as 'upstart'.
>
> I would like to have a discussion about the second bullet
> mentioned above.
> Due to the fact that in Octavia the loadbalancer runs inside of an
> instance (Amphora), There are few actions that need to take place in
> the Amphora instance boot process:
> a. namespace and NIC created.
> b. amphora agent starts
> c. haproxy (and possibly keepalived) start
>
> The Amphora-agent leverages[2] the capabilities of 'upstart' to make
> that happen, which is a bit problematic if we wish it to work on other
> flavors.
> The default cloud image for Amphora today is Ubuntu, yet there are few
> more options[3] such as CentOS and Fedora.
> Unlike the Ubuntu base image, which uses 'sysvinit', The latter two
> flavors use 'systemd'.
> This creates incompatibility with the jinja2[4][5] templates used by
> the agent.
>
> The way I see it there are two possible solutions for this:
> 1. Include a systemd equivalent in the fix[1] that will essentially
>    duplicate the functionality mentioned above and work in the other
>    flavors.
> 2. Have the amphora agent be the only binary that needs to be
>    configured to start upon boot, and that agent will take care of
>    plugging namespaces and NICs and also spawning needed processes.
>    This is how it is done in lbaas and l3 agents.
>
> While the latter solution looks like a more "clean" design, the
> trade-off here is a bigger change to the amphora agent.
>
> [1] https://review.openstack.org/#/c/331841/
> [2] 
> https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/agent/api_server/listener.py#L128
> [3]https://github.com/openstack/octavia/blob/master/diskimage-create/diskimage-create.sh#L27
> [4]https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/agent/api_server/templates/upstart.conf.j2
> [5]https://github.com/openstack/octavia/blob/master/octavia/amphorae/backends/agent/api_server/templates/sysvinit.conf.j2
>
>
> Thanks,
> Nir
 
I have an alternative suggestion - Maybe we shouldn't be templating out
the init scripts? What we are effectively doing here is code-gen which
leads to problems exactly like this, and fixing it with more code gen
actually makes the problem more difficult.
 
I see two fairly straightforward ways to not require this templating:
 
1) Use the agent to write out config for the init scripts in to
   /etc/default/amphora and have the init scripts consume that file
   (source variables in that file). The init script can then simply be a
   static file which we can even bake in to the image directly.
 
2) Move the code which requires the templating in to another executable
   which the init scripts call out to. e.g. create an amphora-net-init
   executable that runs the same code as in the pre-up section of the
   upstart script. Then there is no need for templating in the init
   scripts themselves (they will all simply call the same executable)
   and we can also do something like bake init scripts directly in to
   the image.
 
 
My thinking is that this is only going to get more complex - what are we
going to do to support the new ubuntu LTS which is systemd based? Or if
folk show up with other distros that match neither? It's a lot easier to
decouple the init scripts, which are target specific, from configuration
specific components and avoid these issues altogether.
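 
To make option 1 a bit more concrete, here is a minimal sketch of the
agent-side half, assuming the agent only has to emit a KEY=value file
that a static init script sources (systemd's EnvironmentFile takes a
similar KEY=value format, though its quoting rules are not identical to
the shell's). The function name and the AMPHORA_* variable names are
invented for illustration and are not taken from the Octavia code.

```python
import re
import shlex

_VALID_KEY = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def render_env_file(settings):
    """Render settings as shell-sourceable KEY=value lines."""
    lines = []
    for key in sorted(settings):
        # Reject keys that would not be valid shell variable names.
        if not _VALID_KEY.fullmatch(key):
            raise ValueError("invalid variable name: %r" % key)
        # shlex.quote keeps values with spaces or shell metacharacters
        # intact when the file is sourced by a POSIX shell.
        lines.append("%s=%s" % (key, shlex.quote(str(settings[key]))))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical settings; a real agent would write this to a file that
    # the static init script (or unit file) reads at start-up.
    print(render_env_file({
        "AMPHORA_NETNS": "amphora-haproxy",
        "AMPHORA_NIC": "eth1",
    }), end="")
```

The init script or unit file then never changes per deployment; only
this one data file does.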
 
HTH,
Greg


Re: [openstack-dev] [nova] Placement API WSGI code -- let's just use flask

2016-06-23 Thread Gregory Haynes
On Wed, Jun 22, 2016, at 09:07 AM, Chris Dent wrote:
> On Tue, 21 Jun 2016, Sylvain Bauza wrote:
> 
> > To be honest, Chris and you were saying that you don't like Flask, and
> > I'm a bit agreeing with you. Why now it's a good possibility ?
> 
> As I said when I started the other version of this same thread: What is
> most important to me is generating a consensus that we can actually
> commit to. To build a _real_ consensus it is important to have
> strong opinions that are weakly held[1] otherwise we are not
> actually evaluating the options.
> 
> You are right: I don't like Flask. It claims to be a microframework
> but to me it is overweight. I do, however, like it more than the
> chaos that is the current Nova WSGI stack.

This seems to be a recurring complaint in this thread - has any
consideration been given to using werkzeug[1] directly (it's the library
underneath Flask)? IMO this isn't a big win because the extra stuff that
comes in with Flask shouldn't present additional problems for us, but if
that really is the sticking point then it might be worth a look.

> 
> Flask has a very strong community and it does lots of stuff well
> such that we, in OpenStack, could just stop worrying about it. That
> is one reasonable way to approach doing WSGI moving forward.
> 

++. If there are issues we hit in Flask because of the extra components
we are so worried about then maybe we could work with them to resolve
these issues? I get the impression we are throwing out the baby with the
bathwater avoiding it altogether due to this fear.

> > Why not Routes and Paste couldn't be using also ?
> 
> More on this elsewhere in the thread.
> 

Cheers,
Greg

1: https://github.com/pallets/werkzeug



Re: [openstack-dev] [diskimage-builder] ERROR: embedding is not possible, but this is required for cross-disk install

2016-06-20 Thread Gregory Haynes
On Mon, Jun 20, 2016, at 06:01 PM, Jay Pipes wrote:
> On 06/20/2016 06:41 PM, Paul Belanger wrote:
> > On Mon, Jun 20, 2016 at 04:52:38PM -0400, Jay Pipes wrote:
> >> Hi dib-gurus,
> >>
> >> I'm trying to build a simple ubuntu VM image on a local Gigabyte BRIX
> >> with an AMD A8-5557M APU with Ubuntu 16.04 installed and getting an odd
> >> error.
> >> Hoping someone has some ideas...
> >>
> >> The command I am running is:
> >>
> >> disk-image-create -o /tmp/ubuntu.qcow2 --image-size=10 ubuntu vm
> >>
> > Do you have the same issue if you use ubuntu-minimal? I only suggest it
> > since openstack-infra defaults to -minimal elements now, which is a more
> > tested code path IMO.
> 
> Hmm, unfortunately same error...
> 
> I've been working with Andre Florath on a private thread. He asked me to 
> drop dib into a debug shell and output the contents of:
> 
> /sbin/fdisk -l /dev/loop0p1
> 
> That returned some nastysauce:
> 
> root@brix-1:/# echo $IMAGE_BLOCK_DEVICE
> /dev/loop0p1
> root@brix-1:/# /sbin/fdisk -l /dev/loop0p1
> 
> Disk /dev/loop0p1: 1743 MB, 1743584768 bytes
> 255 heads, 63 sectors/track, 211 cylinders, total 3405439 sectors
> Units = sectors of 1 * 512 = 512 bytes
> Sector size (logical/physical): 512 bytes / 512 bytes
> I/O size (minimum/optimal): 512 bytes / 512 bytes
> Disk identifier: 0x
> 
> Disk /dev/loop0p1 doesn't contain a valid partition table
> 
> The quest continues... :)
> 
> -jay

Ah, the plot thickens.

Partition creation happens here:
http://git.openstack.org/cgit/openstack/diskimage-builder/tree/elements/vm/block-device.d/10-partition#n22

It might be worth dropping into a bash shell right afterward to see
whether that is actually succeeding. It's possible that parted is
failing but we aren't catching that failure for some reason, or when we
dd the image back on to the block device we are failing horribly and
writing over the partition table (that would be a totally new bug to me
though).

Alternatively, if you could paste a complete log of your image build
somewhere so we can dig through it, maybe we can spot something
interesting.
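
For a quick check that doesn't depend on fdisk, something along these
lines could be pointed at the raw image file or at the whole loop device
(the partition table lives on the whole device such as /dev/loop0, not
on a partition like loop0p1). This is only a sketch: the helper name is
made up, and it only looks for the 0x55 0xAA boot signature at the end
of the first 512-byte sector, which is necessary but not sufficient for
a valid msdos label.

```python
import struct


def has_mbr_signature(path):
    """Return True if the first 512-byte sector ends with 0x55 0xAA."""
    with open(path, "rb") as image:
        sector = image.read(512)
    if len(sector) < 512:
        return False
    # The boot signature occupies bytes 510-511 of the first sector;
    # read as a little-endian 16-bit value it is 0xAA55.
    return struct.unpack_from("<H", sector, 510)[0] == 0xAA55
```

If this returns False right after the partitioning step, that would
suggest the label was never written; if it flips to False later, then
something overwrote it.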

Cheers,
Greg



Re: [openstack-dev] [diskimage-builder] ERROR: embedding is not possible, but this is required for cross-disk install

2016-06-20 Thread Gregory Haynes
On Mon, Jun 20, 2016, at 03:52 PM, Jay Pipes wrote:
> Hi dib-gurus,
> 
> I'm trying to build a simple ubuntu VM image on a local Gigabyte BRIX
> with an AMD A8-5557M APU with Ubuntu 16.04 installed and getting an odd
> error. Hoping someone has some ideas...
> 
> The command I am running is:
> 
> disk-image-create -o /tmp/ubuntu.qcow2 --image-size=10 ubuntu vm
> 
> Everything goes smoothly until trying to write the MBR, at which point I 
> get the following error:
> 
> + /usr/sbin/grub-install '--modules=biosdisk part_msdos' 
> --target=i386-pc /dev/loop0
> Installing for i386-pc platform.
> /usr/sbin/grub-install: warning: this msdos-style partition label has no 
> post-MBR gap; embedding won't be possible.
> /usr/sbin/grub-install: error: embedding is not possible, but this is 
> required for cross-disk install.
> /dev/loop0: [0047]:3 (/tmp/image.hk8wiFJe/image.raw)
> 
> Anybody got any ideas?
> 
> Thanks in advance!
> -jay

Hey Jay,

I just tried to reproduce this on my 14.04 box and wasn't able to so I
am betting there's some kind of new bug with us on 16.04. Do you get the
same error if you run without --image-size=10? Last time we had an issue
like this a new grub version changed the default behavior, so I'd
suspect something along those lines.

I am trying out a new run on a 16.04 box but it's going to be a bit
before the cloud image downloads (cloud-image.ubuntu.com is pretty
slow)...

Cheers,
Greg



Re: [openstack-dev] [DIB] Adding GPT support

2016-06-14 Thread Gregory Haynes
On Tue, Jun 14, 2016, at 07:36 PM, Tony Breeds wrote:
> Hi All,
> I'd like to add GPT partitioning support to DIB.  My current
> understanding is that the only partitioning in DIB currently is
> provided by partitioning-sfdisk, which will not support GPT.
> 

This isn't made very clear by looking at the elements, but there are
actually two ways to partition images. There is the partitioning-sfdisk
element (which I am guessing is what you found) that partitions with
sfdisk. There is also the vm element which is the way most users
partition / create a bootloader for their images. The vm element uses
parted. There is also a patch up which adds GPT/UEFI support[1].
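
When comparing what the two paths produce, it can help to check which
disk label an image actually ended up with. A minimal sketch, assuming
512-byte logical sectors (the function name is made up for this
example): a GPT disk carries the ASCII signature "EFI PART" at the start
of the second sector, while an msdos label only has the 0x55 0xAA boot
signature at the end of the first one.

```python
def disklabel_type(path, sector_size=512):
    """Guess the disk label of a raw image: 'gpt', 'msdos' or 'unknown'."""
    with open(path, "rb") as image:
        header = image.read(2 * sector_size)
    # GPT header lives in LBA 1 and starts with the "EFI PART" signature.
    if header[sector_size:sector_size + 8] == b"EFI PART":
        return "gpt"
    # Both msdos labels and GPT protective MBRs end sector 0 with 55 AA,
    # so this check has to come second.
    if header[510:512] == b"\x55\xaa":
        return "msdos"
    return "unknown"
```

This only inspects signatures; it does not validate the tables
themselves, so parted or sfdisk remain the tools to trust for details.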

> My proposed solution is:
> 
> 1. Create a new element called partitioning-parted that (surprise
>    surprise) uses parted to create the disk label and partitions.  This
>    would live alongside partitioning-sfdisk but provide a somewhat
>    compatible way

I'd still like to see this - it would be great to break the partitioning
bits out of the vm element and in to a partitioning-parted element which
the vm element depends on.

> 2. Teach partitioning-parted how to parse DIB_PARTITIONING_SFDISK_SCHEMA
>    and use parted to produce msdos disklabeled partition tables
> 3. Deprecate partitioning-sfdisk
> 4. Remove partitioning-sfdisk in line with the std. deprecation process.

From my cursory reading it seems like parted is the thing to use for
this and there's really no reason to choose sfdisk over parted? I don't
have a ton of knowledge about this, but if that is the case then I like
this plan. I definitely want to make sure that there's no reason a user
would prefer to use sfdisk over parted, though...

> 
> Does this sound like a reasonable plan?
> Yours Tony.

Something else worth mentioning is that Andreas has been working on some
refactoring of our block-device.d phase[2]. I think the changes you're
looking for are mostly addressed in [1] but if you're hoping to do
something larger it's probably good to make sure they align with Andreas'
goals.

1:
https://review.openstack.org/#/c/287784/22/elements/vm/block-device.d/10-partition
2: https://review.openstack.org/#/c/319591/



Re: [openstack-dev] Reasoning behind my vote on the Go topic

2016-06-08 Thread Gregory Haynes
On Wed, Jun 8, 2016, at 03:46 AM, Thierry Carrez wrote:
> Another option (raised by dims) is to find a way to allow usage of 
> golang (or another language) in a more granular way: selectively allow 
> projects which really need another tool to use it. The benefit is that 
> it lets project teams make a case and use the best tool for the job, 
> while limiting the cross-project impact and without opening the 
> distraction floodgates of useless rewrites. The drawback is that 
> depending on how it's done, it may place the TC in the role of granting 
> "you're tall enough to use Go" badges, creating new endless discussions 
> and more "you're special" exceptions. That said, I'm still interested in 
> exploring that option, for one reason. I think that whenever a project 
> team considers adding a component or a rewrite in another language, they 
> are running into an interesting problem with Python, on which they 
> really could use advice from the rest of the OpenStack community. I'd 
> really like to see a cross-project team of Python performance experts to 
> look at the problem this specific team has that makes them want to use 
> another language. That seems like a great way to drive more practice 
> sharing and reduce fragmentation in "OpenStack" in general. We might 
> just need to put the bar pretty high so that we are not flooded by silly 
> rewrite requests.
> 

++.  There's a lot of value in these issues getting bubbled up to the
cross-project level: If we have identified a serious hurdle then this
knowledge really shouldn't live inside of a single project. Otherwise,
if we haven't identified such an issue, then we (the greater OpenStack
community) can offer some alternative solutions which is also a huge
win.

I completely understand the fear that we might be creating an endless
review stream for the TC by making them the review squad for getting
approval to use a new language, and I agree that we need to make sure
that doesn't happen. OTOH, I strongly believe that in almost all of the
cases which would be proposed some alternative solutions could be found.
I worry that if we just tell these folks 'the solution you thought of
isn't allowed' rather than offer an outlet for seriously investigating
the issue, we're likely to see teams try to find ways around that
restriction when really we want to identify another solution to the
problem. A perf team sounds like a great way to help both the tribal
knowledge problem and support the type of problem solving we are asking
for. Sign me up :).

Cheers,
Greg



Re: [openstack-dev] [nova] Using image metadata to sanity check supplied authentication data at nova 'create' or 'recreate' time?

2016-06-06 Thread Gregory Haynes
On Mon, Jun 6, 2016, at 05:44 PM, Gregory Haynes wrote:
>
> On Mon, Jun 6, 2016, at 05:31 PM, Michael Still wrote:
>> On Tue, Jun 7, 2016 at 7:41 AM, Clif Houck <m...@clifhouck.com> wrote:
>>> Hello all,
>>>
>>> At Rackspace we're running into an interesting problem: Consider
>>> a user
>>> who boots an instance in Nova with an image which only supports SSH
>>> public-key authentication, but the user doesn't provide a public
>>> key in
>>> the boot request. As far as I understand it, today Nova will happily
>>> boot that image and it may take the user some time to realize their
>>> mistake when they can't login to the instance.
>>
>> What about images where the authentication information is inside the
>> image? For example, there's just a standard account baked in that
>> everyone knows about? In that case Nova doesn't need to inject
>> anything into the instance, and therefore the metadata doesn't need
>> to supply anything.
>
> We have an element in diskimage-builder[1] which allows a user to pass
> a kernel boot param to inject an ssh key if needed due to a reason
> like this. Obviously, this wouldn't 'just work' in any normal cloud
> deploy since the kernel boot params are baked in to the image itself
> (this is currently useful to ironic users who boot ramdisks) but maybe
> the pattern is helpful: Check something once at boot time via init
> script and that's it. The downside being that a user has to reboot the
> image to inject the key, but IMO it's a huge decrease in complexity
> (over something like file injection) for something a user who just
> booted a new image should be OK with.
>
> Cheers,
> Greg
 
Looks like I left out the actual useful info:
 
[1]:http://docs.openstack.org/developer/diskimage-builder/elements/dynamic-login/README.html


Re: [openstack-dev] [nova] Using image metadata to sanity check supplied authentication data at nova 'create' or 'recreate' time?

2016-06-06 Thread Gregory Haynes
 
On Mon, Jun 6, 2016, at 05:31 PM, Michael Still wrote:
> On Tue, Jun 7, 2016 at 7:41 AM, Clif Houck  wrote:
>> Hello all,
>>
>>  At Rackspace we're running into an interesting problem: Consider
>>  a user
>>  who boots an instance in Nova with an image which only supports SSH
>>  public-key authentication, but the user doesn't provide a public
>>  key in
>>  the boot request. As far as I understand it, today Nova will happily
>>  boot that image and it may take the user some time to realize their
>>  mistake when they can't login to the instance.
>
> What about images where the authentication information is inside the
> image? For example, there's just a standard account baked in that
> everyone knows about? In that case Nova doesn't need to inject
> anything into the instance, and therefore the metadata doesn't need to
> supply anything.
 
We have an element in diskimage-builder[1] which allows a user to pass a
kernel boot param to inject an ssh key if needed due to a reason like
this. Obviously, this wouldn't 'just work' in any normal cloud deploy
since the kernel boot params are baked in to the image itself (this is
currently useful to ironic users who boot ramdisks) but maybe the
pattern is helpful: Check something once at boot time via init script
and that's it. The downside being that a user has to reboot the image to
inject the key, but IMO it's a huge decrease in complexity (over
something like file injection) for something a user who just booted a
new image should be OK with.
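
To illustrate the "check once at boot" pattern, here is a small sketch
of what such an init-time helper might do. The parameter name sshkey and
the function name are assumptions for this example only; check the
element's README for what it really accepts. The idea is to read the
kernel command line (normally /proc/cmdline), pull out one name=value
pair, and act on it if present.

```python
import shlex


def cmdline_param(cmdline, name):
    """Return the value of name=value from a kernel command line, or None."""
    # shlex.split keeps double-quoted values containing spaces together.
    for token in shlex.split(cmdline):
        key, sep, value = token.partition("=")
        if sep and key == name:
            return value
    return None


if __name__ == "__main__":
    # On a real system the string would be read from /proc/cmdline.
    example = 'ro quiet sshkey="ssh-rsa AAAA user@example"'
    print(cmdline_param(example, "sshkey"))
```

An init script would run this once, append the returned key to the
relevant authorized_keys file if it is not None, and otherwise do
nothing.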
 
Cheers,
Greg


[openstack-dev] [TripleO][diskimage-builder] Proposing Stephane Miller to dib-core

2016-06-01 Thread Gregory Haynes
Hello everyone,

I'd like to propose adding Stephane Miller (cinerama) to the
diskimage-builder core team. She has been a huge help with our reviews
for some time now and I think she would make a great addition to our
core team. I know I have benefited a lot from her bash expertise in many
of my reviews and I am sure others have as well :).

I've spoken with many of the active cores privately and only received
positive feedback on this, so rather than use this as an all out vote
(although feel free to add your ++'s) I'd like to use this as a final
call for objections. If none have been raised by next Wednesday (6/8)
I'll go ahead and add her to dib-core.

Cheers,
Greg

-- 
  Gregory Haynes
  g...@greghaynes.net



Re: [openstack-dev] [TripleO] [diskimage-builder] Howto refactor?

2016-06-01 Thread Gregory Haynes
On Wed, Jun 1, 2016, at 01:06 AM, Ian Wienand wrote:
> On 06/01/2016 02:10 PM, Andre Florath wrote:
> > My long term goal is, to add some functionality to the DIB's block
> > device layer, like to be able to use multiple partitions, logical
> > volumes and mount points.
> 
> Some thoughts...
> 
> There's great specific info in the readme's of the changes you posted
> ... but I'm missing a single big picture context of what you want to
> build on-top of all this and how the bits fit into it.  We don't have
> a formalised spec or blueprint process, but something that someone who
> knows *nothing* about this can read and follow through will help; I'd
> suggest an etherpad, but anything really.  At this point, you are
> probably the world-expert on dib's block device layer, you just need
> to bring the rest of us along :)

++, but one clarification: We do have a spec process which is to use the
tripleo-specs repo. Since this is obviously not super clear and there is
a SnR issue for folks who are only dib core maybe we should move specs
in to the dib repo?

I also agree that some type of overview is extremely useful. I hesitate
to recommend writing specs because of how much extra work it tends to be
for all of us, but I honestly think that a couple of these features
could individually use specs - more detail in review comments on
https://review.openstack.org/#/c/319591/5.

> 
> There seems to be a few bits that are useful outside the refactor.
> Formalising python elements, extra cleanup phases, dib-run-parts
> fixes, etc.  Splitting these out, we can get them in quicker and it
> reduces the cognitive load for the rest.  I'd start there.

Splitting these out would help a lot. This whole set of features is
going to take a while to iterate on (sorry! - reviewer capacity is
limited and there are big changes here) and a few of these are pretty
straightforward things I think we really want (such as the cleanup
phase). There's also a lot of risk to us in merging large changes since
we are the poster child for how having external dependencies makes
testing hard. Making smaller changes lets us release / debug /
potentially revert them individually which is a huge win.

> 
> #openstack-infra is probably fine to discuss this too, as other dib
>   knowledgeable people hang out there.
> 
> -i

As for what to do about the existing and potentially conflicting changes
-  that's harder to answer. I think there's a very valid concern from
the original authors about scope creep of their original goal. We also,
obviously, don't want to land something that will make it more difficult
for us to enhance later on.

I think with the LVM patch there actually isn't much risk of making
your work more difficult - the proposed change is pretty small and has a
small input surface area - it should be easy to preserve its behavior
while also supporting a more general solution. For the EFI change there
are some issues you've hit on that need to be fixed, but I am not sure
they are going to require basing the change off a more general fix. It
might be as easy as copying the element contents in to a new dir when a
more general solution is completed, in which case getting the changes
completed in smaller portions is more beneficial IMO.

I also wanted to say thanks a ton for the work and reviews - it is
extremely useful stuff and we desperately need the help. :)

Cheers,
Greg



Re: [openstack-dev] [all][tc] Languages vs. Scope of "OpenStack"

2016-05-23 Thread Gregory Haynes
On Mon, May 23, 2016, at 05:24 PM, Morgan Fainberg wrote:
>
>
> On Mon, May 23, 2016 at 2:57 PM, Gregory Haynes
> <g...@greghaynes.net> wrote:
>> On Fri, May 20, 2016, at 07:48 AM, Thierry Carrez wrote:
>> > John Dickinson wrote:
>> > > [...]
>> > >> So the real question we need to answer is... where does
>> > >> OpenStack
>> > >> stop, and where does the wider open source community start ? If
>> > >> OpenStack is purely an "integration engine", glue code for other
>> > >> lower-level technologies like hypervisors, databases, or
>> > >> distributed
>> > >> block storage, then the scope is limited, Python should be
>> > >> plenty
>> > >> enough, and we don't need to fragment our community. If
>> > >> OpenStack is
>> > >> "whatever it takes to reach our mission", then yes we need to
>> > >> add one
>> > >> language to cover lower-level/native optimization, because we'll
>> > >> need that... and we need to accept the community cost as a
>> > >> consequence of that scope choice. Those are the only two options
>> > >> on
>> > >> the table.
>> > >>
>> > >> I'm actually not sure what is the best answer. But I'm convinced
>> > >> we,
>> > >> as a community, need to have a clear answer to that. We've been
>> > >> avoiding that clear answer until now, creating tension between
>> > >> the
>> > >> advocates of an ASF-like collection of tools and the advocates
>> > >> of a
>> > >> tighter-integrated "openstack" product. We have created silos
>> > >> and
>> > >> specialized areas as we got into the business of developing time-
>> > >> series databases or SDNs. As a result, it's not "one community"
>> > >> anymore. Should we further encourage that, or should we focus on
>> > >> what the core of our mission is, what we have in common, this
>> > >> integration engine that binds all those other open source
>> > >> projects
>> > >> into one programmable infrastructure solution ?
>> > >
>> > > You said the answer in your question. OpenStack isn't defined
>> > > as an
>> > > integration engine[3]. The definition of OpenStack is whatever it
>> > > takes to fulfill our mission[4][5]. I don't mean that as a
>> > > tautology.
>> > > I mean that we've already gone to the effort of defining
>> > > OpenStack. It's
>> > > our mission statement. We're all about building a cloud platform
>> > > upon
>> > > which people can run their apps ("cloud-native" or otherwise),
>> > > so we
>> > > write the software needed to do that.
>> > >
>> > > So where does OpenStack stop and the wider community start?
>> > > OpenStack
>> > > includes the projects needed to fulfill its mission.
>> >
>> > I'd totally agree with you if OpenStack was developed in a
>> > vacuum. But
>> > there is a large number of open source projects and libraries that
>> > OpenStack needs to fulfill its mission that are not in
>> > "OpenStack": they
>> > are external open source projects we depend on. Python, MySQL,
>> > libvirt,
>> > KVM, Ceph, OpenvSwitch, RabbitMQ... We are not asking that those
>> > should
>> > be included in OpenStack, and we are not NIHing replacements for
>> > those
>> > in OpenStack either.
>> >
>> > So it is not as clear-cut as you present it, and you can
>> > approach this
>> > dependency question from two directions.
>> >
>> > One is community-centric: "anything produced by our community is
>> > OpenStack". If we are missing a lower-level piece to achieve our
>> > mission
>> > and are developing it ourselves as a result, then it is
>> > OpenStack, even
>> > if it ends up being a message queue or a database.
>> >
>> > The other approach is product-centric: "lower-level pieces are
>> > OpenStack
>> > dependencies, rather than OpenStack itself". If we are missing a
>> > lower-level piece to achieve our mission and are developing it as a
>> > result, it could be developed on OpenStack infrastructure by
>> > members of
>> > the OpenStack communit

Re: [openstack-dev] [all][tc] Languages vs. Scope of "OpenStack"

2016-05-23 Thread Gregory Haynes
On Fri, May 20, 2016, at 07:48 AM, Thierry Carrez wrote:
> John Dickinson wrote:
> > [...]
> >> So the real question we need to answer is... where does OpenStack
> >> stop, and where does the wider open source community start ? If
> >> OpenStack is purely an "integration engine", glue code for other
> >> lower-level technologies like hypervisors, databases, or distributed
> >> block storage, then the scope is limited, Python should be plenty
> >> enough, and we don't need to fragment our community. If OpenStack is
> >> "whatever it takes to reach our mission", then yes we need to add one
> >> language to cover lower-level/native optimization, because we'll
> >> need that... and we need to accept the community cost as a
> >> consequence of that scope choice. Those are the only two options on
> >> the table.
> >>
> >> I'm actually not sure what is the best answer. But I'm convinced we,
> >> as a community, need to have a clear answer to that. We've been
> >> avoiding that clear answer until now, creating tension between the
> >> advocates of an ASF-like collection of tools and the advocates of a
> >> tighter-integrated "openstack" product. We have created silos and
> >> specialized areas as we got into the business of developing time-
> >> series databases or SDNs. As a result, it's not "one community"
> >> anymore. Should we further encourage that, or should we focus on
> >> what the core of our mission is, what we have in common, this
> >> integration engine that binds all those other open source projects
> >> into one programmable infrastructure solution ?
> >
> > You said the answer in your question. OpenStack isn't defined as an
> > integration engine[3]. The definition of OpenStack is whatever it
> > takes to fulfill our mission[4][5]. I don't mean that as a tautology.
> > I mean that we've already gone to the effort of defining OpenStack. It's
> > our mission statement. We're all about building a cloud platform upon
> > which people can run their apps ("cloud-native" or otherwise), so we
> > write the software needed to do that.
> >
> > So where does OpenStack stop and the wider community start? OpenStack
> > includes the projects needed to fulfill its mission.
> 
> I'd totally agree with you if OpenStack was developed in a vacuum. But 
> there is a large number of open source projects and libraries that 
> OpenStack needs to fulfill its mission that are not in "OpenStack": they 
> are external open source projects we depend on. Python, MySQL, libvirt, 
> KVM, Ceph, OpenvSwitch, RabbitMQ... We are not asking that those should 
> be included in OpenStack, and we are not NIHing replacements for those 
> in OpenStack either.
> 
> So it is not as clear-cut as you present it, and you can approach this 
> dependency question from two directions.
> 
> One is community-centric: "anything produced by our community is 
> OpenStack". If we are missing a lower-level piece to achieve our mission 
> and are developing it ourselves as a result, then it is OpenStack, even 
> if it ends up being a message queue or a database.
> 
> The other approach is product-centric: "lower-level pieces are OpenStack 
> dependencies, rather than OpenStack itself". If we are missing a 
> lower-level piece to achieve our mission and are developing it as a 
> result, it could be developed on OpenStack infrastructure by members of 
> the OpenStack community but it is not "OpenStack the product", it's an 
> OpenStack *dependency*. It is not governed by the TC, it can use any 
> language and tool deemed necessary.
> 
> On this second approach, there is the obvious question of where 
> "lower-level" starts, which as you explained above is not really 
> clear-cut. A good litmus test for it could be whenever Python is not 
> enough. If you can't develop it effectively with the language that is 
> currently sufficient for the rest of OpenStack, then developing it as an 
> OpenStack dependency in whatever language is appropriate might be the 
> solution...
> 
> That is what I mean by 'scope': where does "OpenStack" stop, and where 
> do "OpenStack dependencies" start ? It is a lot easier and a lot less 
> community-costly to allow additional languages in OpenStack dependencies 
> (we already have plenty there).
> 

I strongly agree with both of the points about what OpenStack is defined
as. We are a set of projects attempting to fulfill our mission. In
doing so, we try to use outside dependencies to help us as much as
possible. Sometimes we cannot find an outside dependency to satisfy a
need whether due to optimization needs, licensing issues, usability
problems, or simply because an outside project doesn't exist. That is
when things become less clear-cut and we might need to develop software
not purely/directly related to fulfilling our mission.

In the product-centric approach I worry that we are going to be paying
many of the costs which the existing TC resolutions hoped to prevent. We
still will have to maintain and debug these 

Re: [openstack-dev] [all][tc] Languages vs. Scope of "OpenStack"

2016-05-23 Thread Gregory Haynes

On Mon, May 23, 2016, at 02:57 PM, Sean Dague wrote:
> On 05/23/2016 03:34 PM, Gregory Haynes wrote:
> > 
> > On Mon, May 23, 2016, at 11:48 AM, Doug Hellmann wrote:
> >> Excerpts from Chris Dent's message of 2016-05-23 17:07:36 +0100:
> >>> On Mon, 23 May 2016, Doug Hellmann wrote:
> >>>> Excerpts from Chris Dent's message of 2016-05-20 14:16:15 +0100:
> >>>>> I don't think language does (or should) have anything to do with it.
> >>>>>
> >>>>> The question is whether or not the tool (whether service or
> >>>>> dependent library) is useful to and usable outside the openstack-stack.
> >>>>> For example gnocchi is useful to openstack but you can use it with other
> >>>>> stuff, therefore _not_ openstack. More controversially: swift can be
> >>>>> usefully used all by its lonesome: _not_ openstack.
> >>>>
> > 
> > Making a tool which is useful outside of the OpenStack context just
> > seems like good software engineering - it seems odd that we would try
> > and ensure our tools do not fit this description. Fortunately, many (or
> > even most) of the tools we create *are* useful outside of the OpenStack
> > world - pbr, git-review, diskimage-builder, (I hope) many of the oslo
> > libraries. This is really a question of defining useful interfaces more
> > than anything else, not a statement of whether a tool is part of our
> > community.
> 
> Only if you are willing to pay the complexity and debt cost of having
> optional backends all over the place.
> 
> For instance, I think we're well beyond that point that Keystone being
> optional should be a thing anywhere (and it is a thing in a number of
> places). Keystone should be our auth system, all projects 100% depend on
> it, and if you have different site needs, put that into a Keystone
> backend.
> 

Services and Projects seem to be getting conflated here. IIUC, your two
points apply only to services - we certainly aren't paying any
complexity costs for making pbr optional and the same could be said for
many of our tools.

I don't have a ton of context for why some services are electing to pay
the cost of making Keystone optional. The point I was hoping to make is
that there is value in defining an interface which is useful outside of
OpenStack, and this is a very common pattern with many of our tools. I
completely agree that there are additional costs to doing so at times,
and obviously they have to be weighed against the benefits. That is
really a problem-specific issue, though.

> Most of the oslo libraries require other oslo libraries, which is fine.
> They aren't trying to solve the general purpose case of logging or
> configuration or db access. They are trying to solve a specific set of
> patterns that are applicable to OpenStack projects.
> 

This is true up to a point - there isn't any inherent value in
overfitting a problem to be OpenStack specific. To beat on the pbr
hammer some more - we created a tool that fulfills our needs and making
it in a way where others can use it didn't cost us anything. This isn't
always the case but sometimes it is, and there is absolutely value in
making a tool which others can use.

>   -Sean
> 
> -- 
> Sean Dague
> http://dague.net

Cheers,
Greg

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [all][tc] Languages vs. Scope of "OpenStack"

2016-05-23 Thread Gregory Haynes

On Mon, May 23, 2016, at 11:48 AM, Doug Hellmann wrote:
> Excerpts from Chris Dent's message of 2016-05-23 17:07:36 +0100:
> > On Mon, 23 May 2016, Doug Hellmann wrote:
> > > Excerpts from Chris Dent's message of 2016-05-20 14:16:15 +0100:
> > >> I don't think language does (or should) have anything to do with it.
> > >>
> > >> The question is whether or not the tool (whether service or
> > >> dependent library) is useful to and usable outside the openstack-stack.
> > >> For example gnocchi is useful to openstack but you can use it with other
> > >> stuff, therefore _not_ openstack. More controversially: swift can be
> > >> usefully used all by its lonesome: _not_ openstack.
> > >

Making a tool which is useful outside of the OpenStack context just
seems like good software engineering - it seems odd that we would try
and ensure our tools do not fit this description. Fortunately, many (or
even most) of the tools we create *are* useful outside of the OpenStack
world - pbr, git-review, diskimage-builder, (I hope) many of the oslo
libraries. This is really a question of defining useful interfaces more
than anything else, not a statement of whether a tool is part of our
community.

> > > Add keystone, cinder, and ironic to that list.
> > 
> > Hmmm. You can, but would people want to (that is, would it be a sound
> > choice?)? Or _do_ people? Maybe that's the distinction? As far as I
> 
> Yes, I'm aware of cases of each of those projects being used without
> "the rest" of OpenStack. I used keystone like that to secure some
> internal APIs myself.
> 

This has become a very popular way of using Ironic as well. We even
have an OpenStack project (bifrost) which is used to deploy Ironic in
this fashion.

Cheers,
Greg


Re: [openstack-dev] [TripleO] [diskimage-builder] LVM in diskimage-builder

2016-05-18 Thread Gregory Haynes
On Tue, May 17, 2016, at 02:32 PM, Andre Florath wrote:
> Hi All!
> 
> AFAIK the diskimage-builder started as a testing tool, but it looks
> like it is evolving more and more into a general purpose tool for
> creating docker and VM disk images.
> 
> Currently there are ongoing efforts to add LVM [1]. But because some
> features that I need are missing, I created my own prototype to get a
> 'feeling' for the complexity and a possible way of doing things [2]. I
> contacted Yolanda (the author of the original patch) and we agreed to
> join forces here to implement a patch that fits both our needs.
> 

Glad to hear this. I'd recommend first making sure the public interfaces
defined in [1] don't conflict with the features you'd like to add (or
even potentially like to add). AFAICT this only matters for the LVM
config passed in via DIB_LVM_SETTINGS. The other features should be
doable in a follow-on patch without breaking backwards compatibility and
this is probably the best path forward (rather than adding them in to
[1]). Obviously, if there are any other potentially conflicting public
interfaces I am missing we should fix those before [1] goes in.

As for the DIB_LVM_SETTINGS variable - I don't entirely understand the
issues being raised but we can continue that conversation on [1] since
it is a lot easier to discuss there.

> Yolanda proposed that, before we start implementing things, we
> contact OpenStack developers via this mailing list and ask about
> possible additional requirements and comments.
> 
> Here is a short list of my requirements - and as far as I understood
> Yolanda, her requirements are a subset:
> 
> MUST be able to
> o use one partition as PV
> o use one VG
> o use many LV (up to 10)
> o specify the mount point for each of the LVs
> o specify mount points that 'overlap', e.g.
>   /, /var, /var/log, /var/spool
> o use the default file system (options) as specified via command line
> o survive everyday use - and not only a dedicated test
>   environment: must be robust and handle error scenarios
> o use '/' (root fs) as LV
> o run within different distributions - like Debian, Ubuntu, Centos7.
> 
> NICE TO HAVE
> o Multiple partitions as PVs
> o Multiple VGs
> o LVM without any partition
>   Or: why do we need partitions these days? ;-)
>   Problem: I have no idea how to boot from a pure LVM image.
> 
> Every idea or comment will help!  Please feel invited to have a
> (short) look / review at the implementation [1] and the design study
> [2].
> 
> Kind regards
> 
> Andre
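
One note on the 'overlapping' mount points requirement: whatever format
the settings end up in, the parent has to be mounted before its
children. A minimal sketch of that ordering, assuming the mount points
arrive as a plain list of absolute paths (mount_order is a made-up
helper name, not something in [1] or [2]):

```python
from pathlib import PurePosixPath


def mount_order(mount_points):
    """Order mount points so every parent comes before its children."""
    # Fewer path components means closer to '/', so sort by depth first
    # and by name second to keep the result deterministic.
    return sorted(mount_points, key=lambda p: (len(PurePosixPath(p).parts), p))


order = mount_order(["/var/log", "/", "/var/spool", "/var"])
print(order)
```

Unmounting would walk the same list in reverse.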


Thanks,
Greg


Re: [openstack-dev] [tc] supporting Go

2016-05-11 Thread Gregory Haynes
On Wed, May 11, 2016, at 01:11 PM, Robert Collins wrote:
> As a community, we decided long ago to silo our code: Nova and Swift
> could have been in one tree, with multiple different artifacts - think
> of all the cross-code-base-friction we would not have had if we'd done
> that! The cultural consequence though is that bringing up new things
> is much less disruptive: add a new tree, get it configured in CI, move
> on. We're a team-centric structure for governance, and the costs of
> doing cross-team code-level initiatives are already prohibitive: we
> have already delegated all those efforts to the teams that own the
> repositories: requirements syncing, translations syncing, lint fixups,
> security audits... Having folk routinely move cross project is
> something we seem to be trying to optimise for, but we're not actually
> achieving.
> 

I think this greatly understates how much cross project work goes on. I
agree that cross-team feature development is prohibitively difficult
most of the time, but I do see a lot of cross-project reading/debugging
happening on a day to day basis. I worry that this type of work is
extremely valuable but not very visible and therefore easy to overlook.
I know I do this a lot as both a deployer and a developer, and I have to
imagine others do as well.

> So, given that that is the model - why is language part of it? Yes,
> there are minimum overheads to having a given language in CI - we need
> to be able to do robust reliable builds [or accept periodic downtime
> when the internet is not cooperating], and that sets a lower fixed
> cost, varying per language. Right now we support Python, Java,
> Javascript, Ruby in CI (as I understand it - infra focused folk please
> jump in here :)).
> 

The upfront costs are not a huge issue IMO, for reasons you hit on -
folks wanting the support for a new language are silo'd off enough that
they can pay down upfront costs without affecting the rest of the
community much. What I worry about are the maintenance costs and the
cross-team costs which are hard to quantify in a list of requirements
for a new language:

It's reasonable to assume that new languages will have similar amounts of
upstream breakage to what we get from python tooling (such as new pip
releases), and so we would just be increasing the load here by a factor
of the number of languages we gate on. This is especially concerning
given that a lot of the firefighting to solve these types of issues
seems to center around one team doing cross-project work (infra).

The ability for folks to easily debug problems across projects is a huge
issue. This isn't even mostly a language issue, it's a tooling issue: we
are going to have to use and/or develop a lot of tools to replace the
python counterparts we rely on, and debugging issues (especially in the
gate) is going to require knowledge of each one.

-Greg


Re: [openstack-dev] [tc] supporting Go

2016-05-11 Thread Gregory Haynes
On Wed, May 11, 2016, at 03:24 AM, Thierry Carrez wrote:
> 
> That said I know that the Swift team spent a lot of time in the past 6 
> years optimizing their Python code, so I'm not sure we can generalize 
> this "everything to do with the algorithms" analysis to them ?
> 

I agree. The swift case is clearly not an easy engineering problem and
the awesome write up we got on it points out what issues they are
running in to that aren't easily solved. At that point the onus is on us
to either be certain there is a much easier way for them to solve the
issues they are running in to without the cost to the community or to
accept that another tool might be good for this job. Personally, I can
completely empathize with running into a problem that requires a lot of
direct OS interaction and just wanting a tool that gets out of my way
for it. I might have picked a different tool, but I don't think that is
the point of the conversation here :). I know we have a lot of Python
gurus here though, so maybe they feel differently.

Even if we're OK with part of Swift being in Go, I do still think there
is a lot of value in us remaining a Python project rather than a
Python+Go project. I really think Swift might be *the* exception to the
rule here and we don't need to completely open ourselves up to anyone
reimplementing in Go as a result. Yes, some of the cost of Go will be
lowered when we support a single project having Go in tree, but there
are plenty of additional costs - especially for us all understanding a
common language/toolset - which aren't affected much by having one
component of one project be in a different language.

Cheers,
Greg


Re: [openstack-dev] [tc] supporting Go

2016-05-11 Thread Gregory Haynes
On Wed, May 11, 2016, at 05:09 AM, Hayes, Graham wrote:
> On 10/05/2016 23:28, Gregory Haynes wrote:
> >
> > OK, I'll bite.
> >
> > I had a look at the code and there's a *ton* of low hanging fruit. I
> > decided to hack in some fixes or emulation of fixes to see whether I
> > could get any major improvements. Each test I ran 4 workers using
> > SO_REUSEPORT and timed doing 1k axfr's with 4 in parallel at a time and
> > recorded 5 timings. I also added these changes on top of one another in
> > the order they follow.
> 
> Thanks for the analysis - any suggestions about how we can improve the
> current design are more than welcome .
> 
> For this test, was it a single static zone? What size was it?
> 

This was a small single static zone - so the most time possible was
spent in python, as opposed to blocking on the network.

> >
> > Base timings: [9.223, 9.030, 8.942, 8.657, 9.190]
> >
> > Stop spawning a thread per request - there are a lot of ways to do this
> > better, but let's not even mess with that and just literally remove the
> > thread spawning that happens per request, because it's a silly idea here:
> > [8.579, 8.732, 8.217, 8.522, 8.214] (almost 10% speed increase).
> >
> > Stop instantiating an oslo config object per request - this should be a
> > no-brainer; we don't need to parse config inside of a request handler:
> > [8.544, 8.191, 8.318, 8.086] (a few more percent).
> >
> > Now, the slightly less low hanging fruit - there are 3 round trips to
> > the database *every request*. This is where the vast majority of request
> > time is spent (not in python). I didn't actually implement a full on
> > cache (I just hacked around the db queries), but this should be trivial
> > to do since designate does know when to invalidate the cache data. Some
> > numbers on how much a warm cache will help:
> >
> > Caching zone: [5.968, 5.942, 5.936, 5.797, 5.911]
> >
> > Caching records: [3.450, 3.357, 3.364, 3.459, 3.352].
> >
> > I would also expect real-world usage to be similar in that you should
> > only get 1 cache miss per worker per notify, and then all the other
> > public DNS servers would be getting cache hits. You could also remove
> > the cost of that 1 cache miss by pre-loading data in to the cache.
> 
> I actually would expect the real world use of this to have most of the
> servers have a cache miss.
> 
> We shuffle the order of the miniDNS servers sent out to the user facing
> DNS servers, so I would expect them to hit different minidns servers
> at nearly same time, and each of them try to generate the cache entry.
> 
> For pre-loading - this could work, but I *really* don't like relying on
> a cache for one of the critical path components.
> 

I am not sure what the issue with caching in general is, but it's not
far-fetched to preload an axfr into a cache before you send out any
notifies (since you know exactly when that will happen). For the herding
issue - that's just a matter of how you design your cache coherence
system. Usually you want to design that around your threading/worker
model and since we actually get a speed increase by turning the current
threading off it might be worth fixing that first...

That being said - this doesn't need to be designed amazingly to reap the
benefits being argued for. I haven't heard any goals of 'make every
single request as low latency as possible' (which is when you would
worry about dealing with cold cache costs), but instead that there's a
need to scale up to a potentially large number of clients all requesting
the same axfr at once. In that scenario even the most simple caching
setup would make a huge difference.
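
To make the herding point concrete, here is a rough sketch of the
simplest version of it inside a single worker process: concurrent
misses on the same zone wait for one load instead of each hitting the
DB. This is not designate code; SingleFlightCache, slow_load and demo
are made-up names, and herding across separate minidns servers would
still need preloading or a shared cache on top of this.

```python
import threading
import time


class SingleFlightCache:
    """Cache where concurrent misses on one key trigger a single load."""

    def __init__(self, loader):
        self._loader = loader           # hypothetical: key -> value (e.g. DB reads)
        self._values = {}
        self._key_locks = {}
        self._guard = threading.Lock()  # protects the two dicts above

    def get(self, key):
        with self._guard:
            if key in self._values:
                return self._values[key]
            key_lock = self._key_locks.setdefault(key, threading.Lock())
        with key_lock:                  # one loader per key at a time
            with self._guard:
                if key in self._values:  # filled while we were waiting
                    return self._values[key]
            value = self._loader(key)
            with self._guard:
                self._values[key] = value
            return value


def demo(n_threads=8):
    calls = []

    def slow_load(key):
        calls.append(key)
        time.sleep(0.05)                # stand-in for the DB round trips
        return key.upper()

    cache = SingleFlightCache(slow_load)
    results = []
    threads = [
        threading.Thread(target=lambda: results.append(cache.get("example.org.")))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, len(calls)
```

Calling demo() starts eight requests for the same zone at once; all of
them get the same answer and the loader runs a single time.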

> >
> > All said and done, I think that's almost a 3x speed increase with
> > minimal effort. So, can we stop saying that this has anything to do with
> > Python as a language, when it has everything to do with the algorithms
> > being used?
> 
> As I have said before - for us, the time spent : performance
> improvement ratio is just much higher (for our dev team at least) with
> Go.
> 
> We saw a 50x improvement for small SOA queries, and ~ 10x improvement
> for 2000 record AXFR (without caching). The majority of your
> improvement came from caching, so I would imagine that would speed up
> the Go implementation as well.
> 

There has to be something very different between your python test setup
and mine. In my testing there simply wasn't enough time spent in
Python to get even a 2x speed increase by removing all execution time. I
wonder if this is because the code originally spawns a thread per
request and therefore if you run it with a large number of parallel
requests you'll effectively thread bomb all the workers?

The 

Re: [openstack-dev] [tc] supporting Go

2016-05-10 Thread Gregory Haynes
On Tue, May 10, 2016, at 11:10 AM, Hayes, Graham wrote:
> On 10/05/2016 01:01, Gregory Haynes wrote:
> >
> > On Mon, May 9, 2016, at 03:54 PM, John Dickinson wrote:
> >> On 9 May 2016, at 13:16, Gregory Haynes wrote:
> >>>
> >>> This is a bit of an aside but I am sure others are wondering the same
> >>> thing - Is there some info (specs/etherpad/ML thread/etc) that has more
> >>> details on the bottleneck you're running in to? Given that the only
> >>> clients of your service are the public facing DNS servers I am now even
> >>> more surprised that you're hitting a python-inherent bottleneck.
> >>
> >> In Swift's case, the summary is that it's hard[0] to write a network
> >> service in Python that shuffles data between the network and a block
> >> device (hard drive) and effectively utilizes all of the hardware
> >> available. So far, we've done very well by fork()'ing child processes,
> >> using cooperative concurrency via eventlet, and basic "write more
> >> efficient code" optimizations. However, when it comes down to it,
> >> managing all of the async operations across many cores and many drives
> >> is really hard, and there just isn't a good, efficient interface for
> >> that in Python.
> >
> > This is a pretty big difference from hitting an unsolvable performance
> > issue in the language and instead is a case of language preference -
> > which is fine. I don't really want to fall in to the language-comparison
> > trap, but I think more detailed reasoning for why it is preferable over
> > python in specific use cases we have hit is good info to include /
> > discuss in the document you're drafting :). Essentially its a matter of
> > weighing the costs (which lots of people have hit on so I won't) with
> > the potential benefits and so unless the benefits are made very clear
> > (especially if those benefits are technical) its pretty hard to evaluate
> > IMO.
> >
> > There seemed to be an assumption in some of the designate rewrite posts
> > that there is some language-inherent performance issue causing a
> > bottleneck. If this does actually exist then that is a good reason for
> > rewriting in another language and is something that would be very useful
> > to clearly document as a case where we support this type of thing. I am
> > highly suspicious that this is the case though, but I am trying hard to
> > keep an open mind...
> 
> The way this component works makes it quite difficult to make any major
> improvement.

OK, I'll bite.

I had a look at the code and there's a *ton* of low hanging fruit. I
decided to hack in some fixes or emulation of fixes to see whether I
could get any major improvements. For each test I ran 4 workers using
SO_REUSEPORT and timed doing 1k axfr's with 4 in parallel at a time and
recorded 5 timings. I also added these changes on top of one another in
the order they follow.

Base timings: [9.223, 9.030, 8.942, 8.657, 9.190]

Stop spawning a thread per request - there are a lot of ways to do this
better, but let's not even mess with that and just literally remove the
thread spawning that happens per request, because it's a silly idea here:
[8.579, 8.732, 8.217, 8.522, 8.214] (almost 10% speed increase).

Stop instantiating an oslo config object per request - this should be a
no-brainer; we don't need to parse config inside of a request handler:
[8.544, 8.191, 8.318, 8.086] (a few more percent).
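
For anyone who wants the shape of these two changes without reading
the patch, a rough sketch follows. It is not the designate code: the
stdlib configparser stands in for oslo.config, a fixed thread pool is
just one of the "better ways" (in my test I simply removed the
threads), and CONFIG_TEXT, handle_request and serve are made-up names.

```python
import configparser
from concurrent.futures import ThreadPoolExecutor

CONFIG_TEXT = "[service]\nworkers = 4\n"  # hypothetical config contents


def load_config(text=CONFIG_TEXT):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


# Parse the config once at startup instead of inside the handler.
CONF = load_config()

# Create a fixed pool once instead of spawning a thread per request.
POOL = ThreadPoolExecutor(max_workers=CONF.getint("service", "workers"))


def handle_request(payload):
    # Hypothetical handler: it only reads the already-parsed config.
    return (payload, CONF.getint("service", "workers"))


def serve(payloads):
    # map() returns results in the same order as the inputs.
    return list(POOL.map(handle_request, payloads))
```

The only point is that nothing expensive is set up on the per-request
path any more.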

Now, the slightly less low hanging fruit - there are 3 round trips to
the database *every request*. This is where the vast majority of request
time is spent (not in python). I didn't actually implement a full on
cache (I just hacked around the db queries), but this should be trivial
to do since designate does know when to invalidate the cache data. Some
numbers on how much a warm cache will help:

Caching zone: [5.968, 5.942, 5.936, 5.797, 5.911]

Caching records: [3.450, 3.357, 3.364, 3.459, 3.352].

I would also expect real-world usage to be similar in that you should
only get 1 cache miss per worker per notify, and then all the other
public DNS servers would be getting cache hits. You could also remove
the cost of that 1 cache miss by pre-loading data in to the cache.
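
A minimal sketch of the kind of cache I mean, assuming each worker
keeps its own copy and that the code which updates a zone can call
invalidate() (or preload()) before sending the notify. ZoneCache and
load_zone are made-up names; this is not what I hacked in for the
timings above.

```python
class ZoneCache:
    """Per-worker cache of zone data, dropped whenever the zone changes."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical: zone name -> data (the DB reads)
        self._entries = {}

    def get(self, zone):
        if zone not in self._entries:
            self._entries[zone] = self._loader(zone)  # cache miss
        return self._entries[zone]

    def invalidate(self, zone):
        self._entries.pop(zone, None)

    def preload(self, zone):
        # Refresh before the notify goes out, so no request pays for the miss.
        self._entries[zone] = self._loader(zone)


db_reads = []


def load_zone(zone):
    db_reads.append(zone)  # count how often we would have hit the DB
    return "records for " + zone


cache = ZoneCache(load_zone)
cache.preload("example.org.")
answers = [cache.get("example.org.") for _ in range(3)]
```

After the preload the three requests are all served from memory, so
the loader has only run once.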

All said and done, I think that's almost a 3x speed increase with
minimal effort. So, can we stop saying that this has anything to do with
Python as a language, when it has everything to do with the algorithms
being used?
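
If anyone wants to check the arithmetic, the snippet below takes the
mean of each set of timings listed above and divides the base mean by
it. The labels are mine, and the config run has only the four values I
recorded.

```python
from statistics import mean

timings = {
    "base": [9.223, 9.030, 8.942, 8.657, 9.190],
    "no thread per request": [8.579, 8.732, 8.217, 8.522, 8.214],
    "config parsed once": [8.544, 8.191, 8.318, 8.086],
    "zone cached": [5.968, 5.942, 5.936, 5.797, 5.911],
    "records cached": [3.450, 3.357, 3.364, 3.459, 3.352],
}


def summarize(runs_by_step):
    """Return {step: (mean seconds, speedup relative to the base mean)}."""
    base = mean(runs_by_step["base"])
    return {
        step: (mean(runs), base / mean(runs))
        for step, runs in runs_by_step.items()
    }


summary = summarize(timings)
for step, (avg, speedup) in summary.items():
    print(f"{step:24s} mean={avg:.3f}s  speedup={speedup:.2f}x")
```

On these numbers the end-to-end speedup comes out at roughly 2.65x.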

> 
> MiniDNS (the component) takes data and sends a zone transfer every time 
> a recordset gets updated. That is a full (AXFR) zone transfer, so every
> record in the zone gets sent to each of the DNS servers that end users
> can hit.
> 
> This can be quite a large number - ns[1-

Re: [openstack-dev] [tc] supporting Go

2016-05-09 Thread Gregory Haynes

On Mon, May 9, 2016, at 03:54 PM, John Dickinson wrote:
> On 9 May 2016, at 13:16, Gregory Haynes wrote:
> >
> > This is a bit of an aside but I am sure others are wondering the same
> > thing - Is there some info (specs/etherpad/ML thread/etc) that has more
> > details on the bottleneck you're running in to? Given that the only
> > clients of your service are the public facing DNS servers I am now even
> > more surprised that you're hitting a python-inherent bottleneck.
> 
> In Swift's case, the summary is that it's hard[0] to write a network
> service in Python that shuffles data between the network and a block
> device (hard drive) and effectively utilizes all of the hardware
> available. So far, we've done very well by fork()'ing child processes,
> using cooperative concurrency via eventlet, and basic "write more
> efficient code" optimizations. However, when it comes down to it,
> managing all of the async operations across many cores and many drives
> is really hard, and there just isn't a good, efficient interface for
> that in Python.

This is quite different from hitting an unsolvable performance issue in
the language; it is instead a case of language preference - which is
fine. I don't really want to fall into the language-comparison trap, but
I think more detailed reasoning for why it is preferable over python in
the specific use cases we have hit is good info to include / discuss in
the document you're drafting :). Essentially it's a matter of weighing
the costs (which lots of people have hit on so I won't) against the
potential benefits, so unless the benefits are made very clear
(especially if those benefits are technical) it's pretty hard to
evaluate IMO.

There seemed to be an assumption in some of the designate rewrite posts
that there is some language-inherent performance issue causing a
bottleneck. If this does actually exist then that is a good reason for
rewriting in another language and is something that would be very useful
to clearly document as a case where we support this type of thing. I am
highly suspicious that this is the case though, but I am trying hard to
keep an open mind...

> 
> Initial results from a golang reimplementation of the object server in
> Python are very positive[1]. We're not proposing to rewrite Swift
> entirely in Golang. Specifically, we're looking at improving object
> replication time in Swift. This service must discover what data is on
> a drive, talk to other servers in the cluster about what they have,
> and coordinate any data sync process that's needed.
> 
> [0] Hard, not impossible. Of course, given enough time, we can do
>  anything in a Turing-complete language, right? But we're not talking
>  about possible, we're talking about efficient tools for the job at
>  hand.

Sorry to be a pedant but that is just plain false - there are plenty of
intractable problems.

> 
> [1] http://d.not.mn/python_vs_golang_gets.png and
>  http://d.not.mn/python_vs_golang_puts.png
> 
> 
> --John


Re: [openstack-dev] [tc] supporting Go

2016-05-09 Thread Gregory Haynes

On Mon, May 9, 2016, at 01:01 PM, Hayes, Graham wrote:
> On 09/05/2016 20:46, Adam Young wrote:
> > On 05/09/2016 02:14 PM, Hayes, Graham wrote:
> >> On 09/05/2016 19:09, Fox, Kevin M wrote:
> >>> I think you'll find that being able to embed a higher performance 
> >>> language inside python will be much easier to do for optimizing a 
> >>> function or two rather than deal with having a separate server have to be 
> >>> created, authentication be added between the two, and 
> >>> marshalling/unmarshalling the data to and from it to optimize one little 
> >>> piece. Last I heard, you couldn't just embed go in python. C/C++ is 
> >>> pretty easy to do. Maybe I'm wrong and it's possible to embed go now. 
> >>> Someone, please chime in if you know of a good way.
> >>>
> >>> Thanks,
> >>> Kevin
> >> We won't be replacing any particular function, we will be replacing a
> >> whole service.
> >>
> >> There is no auth (or inter-service communications) from this component,
> >> all it does is query the DB and spit out DNS packets.
> >>
> >> I can't talk for what swift are doing, but we have a very targeted scope
> >> for our Go work.
> >>
> >> - Graham
> > I'm assuming you have a whole body of work discussing Bind and why it is
> > not viable for these cases.  Is there a concise version of the discussion?
> 
> Because we work with multiple DNS servers. This is a component that
> sits between Designate and the end user DNS servers (like Bind,
> PowerDNS, NSD and others, or service providers like Akamai or DynECT)
> 
> The best solution for us to push out DNS information to other DNS
> servers was to use the DNS protocol, so we have a small DNS server that
> knows how to get zones and recordsets from our DB, and send them as
> zone transfers to the end user facing servers.
> 
> The design discussion happened 2 years ago now - this blueprint has the 
> most detail [0].
> 
> Ironically the entire conversation was driven by a need to scale the
> Bind9 backend by supporting an async API.
> 
> 0 - https://blueprints.launchpad.net/designate/+spec/mdns-master

This is a bit of an aside but I am sure others are wondering the same
thing - Is there some info (specs/etherpad/ML thread/etc) that has more
details on the bottleneck you're running in to? Given that the only
clients of your service are the public facing DNS servers I am now even
more surprised that you're hitting a python-inherent bottleneck.


Cheers,
Greg


Re: [openstack-dev] [trove][sahara][infra][Octavia][manila] discussion of image building in Trove

2016-05-05 Thread Gregory Haynes

> 
> The approach being proposed by Pete is something that is equally
> applicable to DIB, I think. I believe that he makes a valid observation
> and our current element design may in fact be bad.
>  
> The invocation of DIB[1] is
> 
> ${PATH_DISKIMAGEBUILDER}/bin/disk-image-create -a amd64 -o "${VM}" \
> -x ${QEMU_IMG_OPTIONS} ${DISTRO} ${EXTRA_ELEMENTS} vm
> heat-cfntools \
> cloud-init-datasources ${DISTRO}-guest ${DISTRO}-${SERVICE_TYPE}
> 
> Observe that we include the ${DISTRO} element, and we prefix ${DISTRO}
> into the name of the guest and the database to get, for example,
> 
>   ... ubuntu ubuntu-guest ubuntu-mysql
> 
> The minimal set of bash and data files could be used instead and we won't
> have this matrix of datastore-by-distro proliferation that you speak of.
> That's why I believe that this solution is equally applicable to DIB.
> 
> [1] trove-integration/scripts/files/functions_qemu
> 

Ah, yep - the typical pattern for elements which are not distro elements
is to not make the distro part of the element name and then have the
element itself use the provided tools as much as possible to perform
tasks in a distro-agnostic fashion (see the package-installs element for
an example tool).

Thanks,
Greg


Re: [openstack-dev] [trove][sahara][infra][Octavia][manila] discussion of image building in Trove

2016-05-04 Thread Gregory Haynes
On Wed, May 4, 2016, at 10:33 AM, Peter MacKinnon wrote:
> 
> Well, certainly one downside in the case of Trove (and probably 
> elsewhere) with DIB is the src tree matrix of datastore-by-distro 
> elements required to support various guest image combinations, leading 
> to a proliferation of directories and files. We feel this can be greatly 
> simplified using a libguestfs approach using a minimal set of bash and 
> directly applicable data files (e.g., systemd unit files, conf files, 
> etc.).
> 

I am confused by this, so maybe I am just misunderstanding. Are you
saying that DIB requires you to support more distro combinations? What
combination of distros you support is entirely up to trove as a
downstream and has absolutely nothing to do with the image build tool,
or maybe you mean something else?

> >
> > What seemed very apparent to me in the summit session is that there are
> > a set of issues for Trove relating to image building, mostly relating to
> > reliability and ease of use. There was no one who even mentioned let
> > alone strongly cared about the issues which actually differentiate the
> > existing DIB build process from libguestfs (which is isolation). If that
> > has changed for some reason, then my recommendation would be to use a
> > tool like virt-dib which will allow for a single image building code
> > base while solving all the raised issues in the spec. I suspect when
> > this is tried out the downsides to booting a VM will highly outweigh the
> > benefits for almost all users (especially in trove's gate),
> 
> Anecdotally, it takes the same amount of time for a CentOS7 MySQL build 
> (~ 7 minutes) with either toolchain.
> 

I suspect this is actually "about the same amount of time with hardware
virtualization", which the gate does not have. Otherwise, awesome - let's
just use virt-dib / a libguestfs backend for DIB then.

> > but if the
> > libguestfs docs are to be believed this should be trivial to try out.
> 
> Not quite sure what you mean by "to be believed"?
> 

Just that it seems trivial to try out and there are no downsides
mentioned.

> 
> The various image building frameworks have been noted here 
> http://docs.openstack.org/image-guide/create-images-automatically.html 
> including libguestfs. So it's not like it is an unknown quantity. In the 
> interest of innovation I'm not sure I understand the hearty reluctance 
> to explore this path. We are proposing simply another Trove repo with an 
> alternate (and recognized) image build method. This is not displacing 
> any established tool for Trove; such a tool doesn't exist today. The 
> elements in trove-integration don't really count since they are largely 
> developed for Ubuntu only, inject Trove guestagent src from git only, 
> and, beyond MySQL 5.6, are not tested by the gate.
> 

The reluctance is that, as you say, the existing set of scripts has
deficiencies that need to be fixed. Rather than fix them, we are going
to put effort in to rewriting them to use another tool which does not
address the existing deficiencies. Distro support is still just as much
of a trove issue as it is now - it's up to the trove scripts to support
the various distros. It seems a lot more reasonable to fix the problems
in the scripts rather than rewrite to use another tool given that the
problems mentioned have nothing to do with the image building tool.


Re: [openstack-dev] [trove][sahara][infra][Octavia][manila] discussion of image building in Trove

2016-05-04 Thread Gregory Haynes
 
On Wed, May 4, 2016, at 09:57 AM, Ethan Gafford wrote:
>
> Sahara has support for several image generation-related cases:
> 1) Packing an image pre-cluster spawn in Nova.
> 2) Building clusters from a "clean" OS image post-Nova spawn, by
>    downloading and installing Hadoop/Spark/Storm/WhatHaveYou.
> 3) Validating images after Nova spawn (to ensure that they're up to
>    spec for the plugin in question.)
> Because of this, we are pulling image generation logic into the
> plugins themselves, from a monolithic sahara-image-elements
> repository. This will aid us in our eventual hope to pull plugins
> out of the Sahara tree entirely; more immediately, it will allow us
> to define logic for all of these cases in one place, written
> eventually in Python. In our Sahara session at summit, which was
> also attended by several members of the Ironic team (thanks all), we
> discussed our current plan to use libguestfs rather than DIB as our
> toolchain; our intent to define a linear, idempotent set of steps to
> pack images for any plugin lends itself much more neatly to
> libguestfs' API than to DIB's.
 
Ok, there is clearly some kind of communication breakdown between DIB
and some of its users that we *really* need to solve. AFAIK the primary
set of DIB developers had no idea this was going on. I am going to push
up a patch to our docs to make it a lot more clear where to find us
(#tripleo or the ML). I'm not really sure what else we could do to make
it easier for users to find us / raise issues with us so we can explore
ways to solve them, but any suggestions would be a huge help.
 
The idea that a linear set of idempotent steps is mutually exclusive
with DIB's API is really interesting. IIUC this is something we
grappled with in TripleO when creating the tool and simply never spent
time looking into solving for the existing distro elements, although
that is an element issue, not an API issue. DIB's API is just a linear
set of commands you opt in to (by way of elements); if you make those
commands idempotent then you have what you want. The problem is that
the upstream elements themselves are not written in that way, so you
couldn't depend on them.  I think there are some pretty obvious ways to
solve this, potentially by making an element which provides the same
API as libguestfs virt-customize (this would be very simple to do, I
just haven't heard of anyone wanting it) or by hacking on some of the
in-tree elements. Could you go into some more detail on how you use
this feature?
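
To make the "linear set of idempotent commands" idea concrete, here is a
minimal sketch in Python. It is an illustration only and not
diskimage-builder's real API: the names ensure_package, run_steps, build
and the ELEMENTS table are hypothetical, and the image is modelled as a
plain dict. Each step checks the target state before changing it, so
running the same element list twice changes nothing the second time.

```python
# Hypothetical sketch: NOT diskimage-builder's real API. Each "element"
# contributes an ordered list of steps; a step is idempotent because it
# checks the target state before changing it.
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]
Step = Tuple[str, Callable[[State], bool], Callable[[State], None]]


def ensure_package(name: str) -> Step:
    """Build a step that records `name` as installed only if it is missing."""
    key = "pkg:" + name
    return (
        "install " + name,
        lambda state: key in state,                         # already done?
        lambda state: state.__setitem__(key, "installed"),  # apply change
    )


def run_steps(steps: List[Step], state: State) -> List[str]:
    """Run steps in order; return the names of steps that changed state."""
    changed = []
    for name, is_done, apply in steps:
        if not is_done(state):
            apply(state)
            changed.append(name)
    return changed


# Elements are opted in to explicitly; their steps are concatenated in order.
ELEMENTS: Dict[str, List[Step]] = {
    "base": [ensure_package("cloud-init")],
    "hadoop": [ensure_package("java"), ensure_package("hadoop")],
}


def build(selected: List[str], state: State) -> List[str]:
    """Flatten the selected elements into one linear step list and run it."""
    steps = [step for element in selected for step in ELEMENTS[element]]
    return run_steps(steps, state)


image_state: State = {}
first_run = build(["base", "hadoop"], image_state)
second_run = build(["base", "hadoop"], image_state)
```

The second run returns an empty list because every step finds its work
already done, which is the property an element author would have to
guarantee for each script.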
 
> Beyond that, the Sahara team has certainly seen profound difficulty in
> the field when customers attempt to generate their own images, and
> even for Sahara cores, building images is occasionally quite
> harrowing. These issues are seldom based on real issues in the scripts
> themselves, but are frequently the result of bleed between the host
> and guest; when these issues occur for a customer, they become
> extremely difficult to diagnose remotely. Still, it's entirely
> possible that DIB has answers to these problems, and it'd be a
> universal good to identify real flaws in DIB, or just to educate the
> uninitiated into how DIB can be made to work more cleanly if the
> features are already there (which they may well be; far be it from me
> to claim exhaustive mastery of DIB.) The technical reasons we like
> libguestfs over DIB are:
 
It would be *extremely* helpful to know about these issues when they
come up. There are definitely some things we can do to prevent this but
again, there hasn't been a lot of feedback that this is an issue people
are running into. There are a few different ways to prevent these types
of problems, ranging from doing something like virt-dib to having dib use
another chroot during the image building process, but I would really like
to hear about some specific issues to get a better idea of what the root
cause is. Hopefully we could even get these filed as bugs.
 
> 1) Libguestfs manipulates images in a clean helper VM created by
>    libguestfs in a predictable way. As such, no mounts are made on the
>    host, no scripts can affect the host system, and no variables on
>    the host system are likely to affect the image packing process. See
>    http://libguestfs.org/guestfs-security.1.html for information on
>    libguestfs security.
 
Yep, isolation is definitely something DIB currently gives up in order
to provide speed/lower resource usage. That being said, there are lots
of things that could be done to either mitigate these issues or remove
them altogether (by compromising on speed/resource requirements). My
biggest concern is that we are creating two sets of code to perform the
same task due to a false dichotomy - the obvious case being that at a
minimum we could still be using something compatible with DIB while
utilizing a vm/libguestfs. This would then have the benefits of having a
common toolset among the community without the raised issues.
 
> 2) In-place image manipulation means that 

Re: [openstack-dev] [trove][sahara][infra][Octavia][manila] discussion of image building in Trove

2016-05-04 Thread Gregory Haynes
On Wed, May 4, 2016, at 08:55 AM, Flavio Percoco wrote:
> On 04/05/16 15:05 +, Amrith Kumar wrote:
> >I'm emailing the ML on the subject of a review ongoing in the Trove project 
> >regarding image building[1].
> >
> >TL;DR
> >
> >One of the most frequent questions that new users of Trove ask is how and 
> >where to get guest images with which to experiment with Trove, and how to 
> >build these images for themselves. While documentation about this exists in 
> >multiple places (including [2], [3]) this is still something that can do 
> >with some improvement.
> >
> >Trove currently uses diskimage-builder for building images used in testing 
> >the product and these can serve as a good basis for anyone wishing to build 
> >an image for their own use of Trove. The review [1] makes the argument for 
> >the libguestfs based approach to building images and advocates that Trove 
> >should use this instead of diskimage-builder.
> 
> At the summit we discussed the possibility of providing an implementation
> that would allow for both DIB and libguestfs to be used but to give
> priority to DIB. Since there's no real intention of just switching tools
> at this point, I believe it'd be good to amend the spec so that it doesn't
> mention libguestfs should be used instead of DiB.
> 
> The goal at this stage is to provide both and help these move forward.
> 
> >I believe that a broader discussion of this is required and I appreciate 
> >Greg Haynes' proposal at the design summit to have this discussion on the 
> >ML. I took the action item to bring this discussion to the ML.
> >
> >Details follow ...
> >
> >Before going further, I will state my views on these matters.
> >
> >1. It is important for the Trove project to do things quickly to make it 
> >easier for end users who wish to use Trove and who wish to build their own 
> >images. I am not concerned what tool or tools a person will use to build 
> >these images.

++. One of the biggest issues I see users of DIB hit is ease of use for
'just make me an image, I don't care about twiddling knobs'. A wrapper
script in trove is one way to help with this, but I am sure there are
other solutions as well... maybe by rethinking some of our fear about
using elements as entry points to an image build, or by simply making
elements with better defaults.

> >
> >2. If we provide multiple alternatives to image building as part of the 
> >Trove project, we should make sure that images built with all sets of tools 
> >are equivalent and usable interchangeably. Failing to do this will make it 
> >harder for users to use Trove because we will be providing them with a false 
> >choice (i.e. the alternatives aren't really alternatives). This is harder 
> >than it sounds given the combination of tools, operating systems, and the 
> >source(s) from which you can get database software.
> 
> Maintaining both in the long run will be harder especially because, as
> you mentioned, the output must be usable interchangeably. However, I
> think we're at a point, based on the comments in [1] made by Pino
> Toscano, Luigi Toscano and some other folks that it'd be beneficial for
> us to have this discussion and to also experiment/test other options.
> 
> The Sahara team seems to be going in a direction that differs with the
> one used by the infra team and the one we're headed to (although they
> overlap in some areas).
> 

I would highly recommend against having two sets of image building code
for Trove - given DIB's current design there should not be any need for
this and there's a HUGE downside to maintaining two sets of code to do
the same thing in-tree. Ideally a single set of code would be used while
being able to be run in different environments if there are mutually
exclusive requirements being proposed by users.

What seemed very apparent to me in the summit session is that there is
a set of issues for Trove relating to image building, mostly relating to
reliability and ease of use. No one even mentioned, let alone strongly
cared about, the issue which actually differentiates the existing DIB
build process from libguestfs (which is isolation). If that has changed
for some reason, then my recommendation would be to use a tool like
virt-dib, which will allow for a single image building code base while
solving all the issues raised in the spec. I suspect that when this is
tried out the downsides of booting a VM will far outweigh the benefits
for almost all users (especially in trove's gate), but if the
libguestfs docs are to be believed this should be trivial to try out.


> >3. Trove already has elements for all supported databases using DIB in the 
> >trove-integration project but these elements are not packaged for customer 
> >use. Making them usable by customers is a relatively small effort including 
> >providing a wrapper script (derived from redstack[4]) and providing an 
> >element to install the guest agent software from a fixed location in 
> 

Re: [openstack-dev] [tripleo][releases] Remove diskimage-builder from releases

2016-04-19 Thread Gregory Haynes
On Tue, Apr 19, 2016, at 01:25 PM, Ian Wienand wrote:
> On 04/20/2016 03:25 AM, Doug Hellmann wrote:
> > It's not just about control, it's also about communication. One of
> > the most frequent refrains we hear is "what is OpenStack", and one
> > way we're trying to answer that is to publicize all of the things
> > we release through releases.openstack.org.
> 
> So for dib, this is mostly about documentation?
> 
> We don't have the issues around stable branches mentioned in the
> readme, nor do we worry about the requirements/constraints (proposal
> bot has always been sufficient?).
> 
> > Centralizing tagging also helps us ensure consistent versioning
> > rules, good timing, good release announcements, etc.
> 
> We so far haven't had issues keeping the version number straight.
> 
> As mentioned, the timing has extra constraints due to use in periodic
> infra jobs that I don't think the release team want to be involved
> with.  It's not like the release team will be going through the
> changes in a new release and deciding if they seem OK or not (although
> they're welcome to do dib reviews, before things get committed :) So I
> don't see what timing constraints will be monitored in this case.
> 
> When you look at this from my point of view, dib was left/is in an
> unreleasable state that I've had to clean up [1], we've now missed yet
> another day's build [2] and I'm not sure what's different except I now
> have to add probably 2 days latency to the process of getting fixes
> out there.
> 
> To try and be constructive : is what we want a proposal-bot job that
> polls for the latest release and adds it to the diskimage-builder.yaml
> file?  That seems to cover the documentation component of this.
> 
> Or, if you want to give diskimage-builder-release group permissions on
> the repo, so we can +2 changes on the diskimage-builder.yaml file, we
> could do that. [3]
> 
> -i
> 
> [1] https://review.openstack.org/#/c/306925/
> [2] https://review.openstack.org/#/c/307542/
> [3] my actual expectation of this happening is about zero

I think I have a handle on the different release tags after some reading
/ IRC chat, and AIUI - as long as DIB is an official project then in the
current system its releases will be going through the release team.
Therefore, the discussion about release tag type isn't really relevant
to the concerns we seem to be bringing up, which mostly center around us
worrying over the new step required to get a release out - there doesn't
exist a tag which will change this.

My concern with this extra step is mostly wondering what the practical
benefit to us is. Obviously there is some complexity being added to us
getting out a release, and we're also involving a whole new team of folks
in doing this, so I think this absolutely warrants some benefit to us
over the existing system. There are a couple of obvious things (having
releases go through review rather than just git push is extremely nice),
but IMO this doesn't outweigh the downsides of adding an additional
review team. I also feel like this is an issue we could solve while
still allowing us to retain release control.


So, a couple questions:

* Is there disagreement with the sentiment that adding the extra
releases review team to our release process is not desirable for DIB? I
am really wondering if there's some practical benefit here we might be
missing...

* Are there some benefits inherent to the extra releases review team
that we are missing and that outweigh the costs of the added process? I
want to make sure to distinguish between things (like gerrit perms with
our existing setup) which we could work with the releases team to fix,
and things that we can't.

Thanks,
Greg



Re: [openstack-dev] [tripleo][releases] Remove diskimage-builder from releases

2016-04-19 Thread Gregory Haynes
On Tue, Apr 19, 2016, at 10:25 AM, Doug Hellmann wrote:
> Excerpts from Jeremy Stanley's message of 2016-04-19 15:41:26 +:
> > On 2016-04-19 09:22:57 -0400 (-0400), Doug Hellmann wrote:
> > > Excerpts from Ian Wienand's message of 2016-04-19 12:11:35 +1000:
> > [...]
> > > > I don't expect the stable release team to be involved with all this;
> > > > but if we miss windows then we're left either going to efforts getting
> > > > one of a handful of people with permissions to do manual rebuilds or
> > > > waiting yet another day to get something fixed.  Add some timezones
> > > > into this, and simple fixes are taking many days to get into builds.
> > > > Thus adding points where we can extend this by another 24 hours
> > > > really, well, sucks.
> > > 
> > > How often does that situation actually come up?
> > 
> > Semi-often. The project is officially under TripleO but it's sort of
> > a shared jurisdiction between some TripleO and Infra contributors. I
> > think the release team for diskimage-builder used to shoot for
> > tagging weekly (sans emergencies), though that's slacked off a bit
> > and is more like every 2 weeks lately.
> 
> That's about the same as or less often than we tag Oslo libraries.
> 
> > DIB is an unfortunate combination of a mostly stable framework and a
> > large pre-written set of scripts and declarative data which is
> > constantly evolving for widespread use outside the OpenStack
> > ecosystem (so most of the change volume is to the latter). As Ian
> > points out, the Infra team has already been tempted to stop relying
> > on DIB releases at all (or worse, maintain a fork) to reduce overall
> > latency for getting emergency fixes reflected in our worker images.
> 
> Sure, that's a compelling argument. I'm not opposed to making it easier
> for timely releases, just trying to understand the pressure.
> 
> > I suspect that most of the concern over using OpenStack release
> > process for DIB (and similarly Infra projects) is that the added
> > complexities introduce delays, especially if there's not a release
> > team member available to do on-the-spot approvals on weekends and
> > such. I don't know whether extending that to add tagging ACLs for
> > the library-release group would help? That would bring the total up
> > to 6 people, two more of whom are in non-American timezones, so
> > might be worth a try.
> > 
> > It's also worth keeping in mind that we've sort of already
> > identified two classes of official OpenStack projects. One is
> > "OpenStack the Product" only able to be distributed under the Apache
> > license and its contributors bound by a contributor license
> > agreement. The other is the output of a loose collective of groups
> > writing ancillary tooling consumed by the OpenStack community and
> > also often used for a lot of other things completely unrelated to
> > OpenStack. I can see where strict coordinated release process and
> > consistency for the former makes sense, but a lot of projects in the
> > latter category likely see it as unnecessary overkill for their
> > releases.
> 
> It's not just about control, it's also about communication. One of
> the most frequent refrains we hear is "what is OpenStack", and one
> way we're trying to answer that is to publicize all of the things
> we release through releases.openstack.org. Centralizing tagging
> also helps us ensure consistent versioning rules, good timing, good
> release announcements, etc.
> 
> Since dib is part of tripleo, and at least 2 other projects depend
> on it directly (sahara-image-elements and manila-image-elements),
> I would expect the tripleo team to want it included on the site,
> to publish release announcements, etc.
> 
> On the other hand, dib is using the release:independent model, which
> indicates that the team in fact doesn't think it should be considered
> part of the "product." Maybe we can use that as our flag for which
> projects should really be managed by the release team and which
> should not, but we don't want projects that want to be part of official
> releases to use that model.
> 
> With what I know today, I can't tell which side of the line dib is
> really on. Maybe someone can clarify?
> 
> Doug


There is a bit more nuance to getting releases out for downstream
consumers than just getting DIB fixes out quickly. Often there is a
situation where a downstream needs a fix/feature soonish but not
critically - maybe DIB is creating images for infra where networking is
not working due to a dib bug. Infra can delete the last round of images
and still function, but it's worthwhile to get things fixed soon. In that
case we don't want to just rush a release out the door (there are often
other things which have been merged and carry some risk), and we (or at
least I) like to wait to cut a DIB release until a morning, preferably
when a DIB core is around to help debug / verify the fix and any
fallout. I am not sure there is an easy fix where we can keep this model
without being 

Re: [openstack-dev] [DIB][Bifrost] Avoid running DIB with simple-init until a new release (1.14.1) is made

2016-04-06 Thread Gregory Haynes
On Wed, Apr 6, 2016, at 11:19 AM, Gregory Haynes wrote:
> This is a notice for users of diskimage-builder's simple-init element (I
> added Bifrost because I believe that is the recommended usage there).
> 
> There is a bug in the latest release (1.14.0) of diskimage-builder which
> will delete ssh host keys on the image building host when using the
> simple-init element. The fix is proposed[1] and we are working on
> merging it ASAP then cutting a new release. If you have a CI-type setup
> (possibly via nodepool) which uses the simple-init element then it's
> probably a good idea to disable it temporarily and check that you still
> have ssh host keys.
> 
> I really hope this hasn't bitten anyone other than infra (sorry infra), but
> if it has bitten you then I'm sorry!
> 
> -Greg
> 
> 1: https://review.openstack.org/#/c/302373/
> 

The new release has just been uploaded to pypi. Sorry again for the
issues!



[openstack-dev] [DIB][Bifrost] Avoid running DIB with simple-init until a new release (1.14.1) is made

2016-04-06 Thread Gregory Haynes
This is a notice for users of diskimage-builder's simple-init element (I
added Bifrost because I believe that is the recommended usage there).

There is a bug in the latest release (1.14.0) of diskimage-builder which
will delete ssh host keys on the image building host when using the
simple-init element. The fix is proposed[1] and we are working on
merging it ASAP then cutting a new release. If you have a CI-type setup
(possibly via nodepool) which uses the simple-init element then it's
probably a good idea to disable it temporarily and check that you still
have ssh host keys.

I really hope this hasn't bitten anyone other than infra (sorry infra), but
if it has bitten you then I'm sorry!

-Greg

1: https://review.openstack.org/#/c/302373/

-- 
  Gregory Haynes
  g...@greghaynes.net



Re: [openstack-dev] [neutron] - Changing the Neutron default security group rules

2016-03-02 Thread Gregory Haynes
Clearly, some operators and users disagree with the opinion that 'by
default security groups should be closed off' given that we have several
large public providers who have changed these defaults (despite there
being no documented way to do so), and we have users in this thread
expressing that opinion. Given that, I am not sure there is any value
behind us expressing we have different opinions on what defaults should
be (let alone enforcing them by not allowing them to be configured)
unless there are some technical reasons beyond 'this is not what my
policy is, what my customers want', etc. I also understand the goal of
trying to make clouds more similar for better interoperability (and I
think that is extremely important), but the reality is we have created
a situation where clouds are already not identical here in an even
worse, undocumented way because we are enforcing a certain set of
opinions here.

To me this is an extremely clear indication that at a minimum the
defaults should be configurable since discussion around them seems to
devolve into different opinions on security policies, and there is no
way we should be in the business of dictating that.

Cheers, Greg


Re: [openstack-dev] [TripleO][DIB] diskimage-builder and python 2/3 compatibility

2015-12-09 Thread Gregory Haynes
Excerpts from Ian Wienand's message of 2015-12-09 09:35:15 +:
> On 12/09/2015 07:15 AM, Gregory Haynes wrote:
> > We ran into a couple of issues adding Fedora 23 support to
> > diskimage-builder caused by python2 not being installed by default.
> > This can be solved pretty easily by installing python2, but given that
> > this is eventually where all our supported distros will end up I would
> > like to find a better long term solution (one that allows us to make
> > images which have the same python installed that the distro ships by
> > default).
> 
> So I wonder if we're maybe hitting premature optimisation with this

That's a fair point. My thinking is that this is a thing we are hitting
now, and if we do not fix this then we are going to end up adding a
python2 dependency everywhere. This isn't the worst thing, but if we end
up wanting to remove that later it will be a backwards incompat issue.
So IMO if it's easy enough to get correct now it would be awesome to do
rather than ripping python2 out from underneath users at a later date.

> 
> > We use +x and a #! to specify a python
> > interpreter, but this needs to be python3 on distros which do not ship a
> > python2, and python elsewhere.
> 
> > Create a symlink in the chroot from /usr/local/bin/dib-python to
> > whatever the appropriate python executable is for that distro.
> 
> This is a problem for anyone wanting to ship a script that "just
> works" across platforms.  I found a similar discussion about a python
> launcher at [1] which covers most points and is more or less what
> is described above.
> 
> I feel like contribution to some sort of global effort in this regard
> might be the best way forward, and then ensure dib uses it.
> 
> -i
> 
> [1] https://mail.python.org/pipermail/linux-sig/2015-October/00.html
> 

My experience has been that this is something the python community
doesn't necessarily want (it would be pretty trivial to fix with a
python2or3 runner). I half expected some feedback of "please don't do
that, treat python2 and 3 as separate languages", which was a big reason
for this post. This is even more complicated by it being a
distro-specific issue (some distros do ship a /usr/bin/python which
points to either 2 or 3, depending on what is available). Basically,
yes, it would be great for the larger python community to solve this but
I am not hopeful of that actually happening.

We do need to come up with some kind of fix in the meantime. As I
mentioned above, the 'just install python2' fix has some annoyances
later down the road, and my thinking is that the symlink approach is not
much more work without the same problems for us...

Cheers,
Greg



[openstack-dev] [TripleO][DIB] diskimage-builder and python 2/3 compatibility

2015-12-08 Thread Gregory Haynes
Hello everyone,

I am hoping for some feedback from our developers/users on a potential
solution to a python 2/3 compatibility issue we are hitting in dib:

We ran into a couple of issues adding Fedora 23 support to
diskimage-builder caused by python2 not being installed by default.
This can be solved pretty easily by installing python2, but given that
this is eventually where all our supported distros will end up I would
like to find a better long term solution (one that allows us to make
images which have the same python installed that the distro ships by
default).

The problem is that many elements provide python scripts (such as
package-installs-v2 in the package-installs element) which we exec
during the image build process. We use +x and a #! to specify a python
interpreter, but this needs to be python3 on distros which do not ship a
python2, and python elsewhere. There is also one script
(package-installs-squash) which runs outside of the chroot, but
otherwise all python scripts run inside the chroot.


Some brainstorming was done in #tripleo, and we came up with the
following plan for fixing this:

Create a symlink in the chroot from /usr/local/bin/dib-python to
whatever the appropriate python executable is for that distro. We will
then use dib-python in our #! lines for scripts which run in the chroot.

Scripts outside of the chroot will have to perform their own python
version detection and call any python using the detected python
interpreter - there is only one such script (package-installs-squash),
so this did not seem like a big issue.
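
As a rough illustration of the plan above, here is a minimal Python
sketch of both halves: picking an interpreter and pointing a dib-python
symlink at it. The function names, the candidate order (python2, then
python, then python3) and the injectable `which` lookup are assumptions
made for the example, not what the element will necessarily do. The
demo uses a fake lookup table and a temporary directory standing in for
the chroot so it does not touch the real system.

```python
# Hypothetical sketch of the dib-python idea; names and the candidate
# order are assumptions, not the actual diskimage-builder implementation.
import os
import shutil
import tempfile
from typing import Callable, Optional, Sequence

CANDIDATES = ("python2", "python", "python3")  # assumed preference order


def detect_python(
    candidates: Sequence[str] = CANDIDATES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate interpreter that exists."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def link_dib_python(chroot: str, target: str) -> str:
    """Point <chroot>/usr/local/bin/dib-python at `target`.

    `target` is the interpreter path as seen from inside the chroot, so
    the link may look dangling when inspected from the host.
    """
    link = os.path.join(chroot, "usr", "local", "bin", "dib-python")
    os.makedirs(os.path.dirname(link), exist_ok=True)
    if os.path.lexists(link):
        os.remove(link)  # replace a stale link so re-runs are safe
    os.symlink(target, link)
    return link


# Demo: pretend we are on a distro that only ships python3.
fake_paths = {"python3": "/usr/bin/python3"}
found = detect_python(which=fake_paths.get)
with tempfile.TemporaryDirectory() as chroot:
    link = link_dib_python(chroot, found)
    link_target = os.readlink(link)
```

Scripts that run in the chroot would then start with
#!/usr/local/bin/dib-python, while a script outside the chroot could
call something like detect_python() directly.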


The other solutions we considered:

Switching everything to depend on python3 - This is ruled out since
rhel7/centos7 do not ship a python3 package.

Automatically rewriting #! lines as we exec them - This could work. I
personally think the symlink is simpler and solves the problem equally
well.
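
For comparison, the #! rewriting alternative could look roughly like
the Python sketch below. The rewrite_shebang name and the regular
expression are hypothetical, and the pattern only covers the simple
"#!/path/to/python" and "#!/usr/bin/env python" forms without
interpreter arguments; anything else is left untouched.

```python
# Hypothetical sketch of the "rewrite #! lines" alternative; the regex
# only handles simple python shebangs without extra arguments.
import re

PYTHON_SHEBANG = re.compile(
    r"^#!\s*(/usr/bin/env\s+python[0-9.]*|\S*/python[0-9.]*)\s*$"
)


def rewrite_shebang(script_text: str, interpreter: str) -> str:
    """Replace a python #! line with one that uses `interpreter`."""
    first, sep, rest = script_text.partition("\n")
    if PYTHON_SHEBANG.match(first):
        return "#!" + interpreter + sep + rest
    return script_text  # not a python script: leave it alone


env_script = "#!/usr/bin/env python\nprint('hi')\n"
bash_script = "#!/bin/bash\necho hi\n"
rewritten = rewrite_shebang(env_script, "/usr/bin/python3")
untouched = rewrite_shebang(bash_script, "/usr/bin/python3")
```

Every script has to be read and possibly rewritten before it is
exec'd, which is part of why the symlink seems simpler to me.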


Thoughts?

Cheers,
Greg



Re: [openstack-dev] [Magnum][Testing] Reduce Functional testing ongate.

2015-11-13 Thread Gregory Haynes
Excerpts from Hongbin Lu's message of 2015-11-13 16:05:24 +:
> I am going to share something that might be off the topic a bit.
> 
> Yesterday, I was pulled to the #openstack-infra channel to participate in a 
> discussion, which is related to the atomic image download in Magnum. It looks 
> like the infra team is not satisfied with the large image size. In particular, 
> they need to double the timeout to accommodate the job [1] [2], which made 
> them unhappy. Is there a way to reduce the image size? Or even better, is it 
> possible to build the image locally instead of downloading it?
> 
> [1] https://review.openstack.org/#/c/242742/
> [2] https://review.openstack.org/#/c/244847/
> 
> Best regards,
> Hongbin

I am not sure how much of the current job is related to image
downloading (a previous message suggested maybe it isn't much?). If it
is an issue though - we have a tool for making images (DIB[1]) which is
already used by many OpenStack projects and it would be great if support
was added for it to make images that are useful to Magnum. DIB is also
pretty good at making images which are as small as possible, so it might
be a good fit.

I looked at doing this a while ago, and IIRC the atomic images were just
an lvm with a partition for a rootfs and a partition for a docker
overlay fs. The docs look like more options could be supported, but
regardless this seems like something DIB could do if someone was willing
to invest the effort.

Cheers,
Greg

1: http://docs.openstack.org/developer/diskimage-builder/



Re: [openstack-dev] [all] Outcome of distributed lock manager discussion @ the summit

2015-11-04 Thread Gregory Haynes
Excerpts from Clint Byrum's message of 2015-11-04 21:17:15 +:
> Excerpts from Joshua Harlow's message of 2015-11-04 12:57:53 -0800:
> > Ed Leafe wrote:
> > > On Nov 3, 2015, at 6:45 AM, Davanum Srinivas  wrote:
> > >> Here's a Devstack review for zookeeper in support of this initiative:
> > >>
> > >> https://review.openstack.org/241040
> > >>
> > >> Thanks,
> > >> Dims
> > >
> > > I thought that the operators at that session made it very clear that they 
> > > would *not* run any Java applications, and that if OpenStack required a 
> > > Java app to run, they would no longer use it.
> > >
> > > I like the idea of using Zookeeper as the DLM, but I don't think it 
> > > should be set up as a default, even for devstack, given the vehement 
> > > opposition expressed.
> > >
> > 
> > What should be the default then?
> > 
> > As for 'vehement opposition' I didn't see that as being there, I saw a 
> > small set of people say 'I don't want to run java or I can't run java', 
> > some comments about requiring using Oracle's JVM (which isn't correct, 
> > OpenJDK works for folks that I have asked in the zookeeper community and 
> > elsewhere) and the rest of the folks were ok with it...
> > 
> > If people want an alternate driver, propose it IMHO...
> > 
> 
> The few operators who stated this position are very much appreciated
> for standing up and making it clear. It has helped us not step into a
> minefield with a native ZK driver!
> 
> Consul is the most popular second choice, and should work fine for the
> use cases we identified. It will not be sufficient if we ever have
> a use case where many agents must lock many resources, since Consul
> does not offer a way to grant lock access in a fair manner (ZK does,
> and we're not aware of any others that do actually). Using Consul or
> etcd for this case would result in situations where lock waiters may
> wait _forever_, and will likely wait longer than they should at times.
> Hopefully we can simply avoid the need for this in OpenStack altogether.
> 
> I do _not_ think we should wait for constrained operators to scream
> at us about ZK to write a Consul driver. It's important enough that we
> should start documenting all of the issues we expect to see with Consul
> (it's not widely packaged, for instance) and writing a driver with its
> own devstack plugin.
> 
> If there are Consul experts who did not make it to those sessions,
> it would be greatly appreciated if you can spend some time on this.
> 
> What I don't want to see happen is we get into a deadlock where there's
> a large portion of users who can't upgrade and no driver to support them.
> So let's stay ahead of the problem, and get a set of drivers that works
> for everybody!
> 
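To make the fairness point in the quoted message concrete, here is a minimal in-process sketch of a fair (FIFO) lock queue, loosely modeled on the sequence-number idea behind the ZooKeeper lock recipe. The class and method names are hypothetical, and nothing here talks to ZK, Consul, or etcd; it only shows why handing the lock to the lowest outstanding sequence number keeps any one waiter from being starved.

```python
import itertools


class FairLockQueue:
    """Toy model of a fair lock: waiters are served strictly in arrival order."""

    def __init__(self):
        self._next_seq = itertools.count()
        self._waiters = {}  # sequence number -> waiter name

    def enqueue(self, name):
        """Register a waiter and return its sequence number (its place in line)."""
        seq = next(self._next_seq)
        self._waiters[seq] = name
        return seq

    def holder(self):
        """The waiter with the lowest sequence number holds the lock."""
        if not self._waiters:
            return None
        return self._waiters[min(self._waiters)]

    def release(self, seq):
        """Drop a waiter (lock released, or the waiter gave up)."""
        del self._waiters[seq]


queue = FairLockQueue()
a = queue.enqueue("agent-a")
b = queue.enqueue("agent-b")
c = queue.enqueue("agent-c")

# Each agent gets the lock in the order it arrived.
order = []
for seq in (a, b, c):
    order.append(queue.holder())
    queue.release(seq)
```

Without an ordering like this, every release becomes a race among all waiters, and nothing bounds how many times a given waiter can lose that race.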

One additional note - out of the three possible options I see for tooz
drivers in production (zk, consul, etcd) we currently only have drivers
for ZK. This means that unless new drivers are created, when we depend
on tooz we will be requiring folks to deploy ZK.

It would be *awesome* if some folks stepped up to create and support at
least one of the alternate backends.

Although I am a fan of the ZK solution, I have an old WIP patch for
creating an etcd driver. I would like to revive and maintain it, but I
would also need one more maintainer per the new rules for in tree
drivers...

Cheers,
Greg

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [requirements] [infra] speeding up gate runs?

2015-11-04 Thread Gregory Haynes
Excerpts from Jeremy Stanley's message of 2015-11-04 21:31:58 +:
> On 2015-11-04 15:34:26 -0500 (-0500), Sean Dague wrote:
> > This seems like incorrect logic. We should test devstack can do all the
> > things on a devstack change, not on every neutron / trove / nova change.
> > I'm fine if we want to have a slow version of this for devstack testing
> > which starts from a massively stripped down state, but for the 99% of
> > patches that aren't devstack changes, this seems like overkill.
> 
> We are, however, trying to get away from preinstalling additional
> distro packages on our job workers (in favor of providing a warm
> local cache) and leaving it up to the individual projects/jobs to
> define the packages they'll need to be able to run. I'll save the
> lengthy list of whys, it's been in progress for a while and we're
> finally close to making it a reality.

++

One way this could be done in DIB is to either:

bind mount the wheelhouse in from the build host, build an additional
image we don't use which fills up the wheelhouse, then bind mount that
into the image we upload.

OR

make a chroot inside of our image build which creates the wheelhouse,
then either bind mount it out or copy it out into the image we upload.

Either way, it's pretty nasty and non-trivial. I think the path of least
resistance for now is probably making a wheel mirror.

Cheers,
Greg



[openstack-dev] [TripleO] Proposing Ian Wienand as core reviewer on diskimage-builder

2015-11-03 Thread Gregory Haynes
Hello everyone,

I would like to propose adding Ian Wienand as a core reviewer on the
diskimage-builder project. Ian has been making a significant number of
contributions for some time to the project, and has been a great help in
reviews lately. Thus, I think we could benefit greatly by adding him as
a core reviewer.

Current cores - Please respond with any approvals/objections by next Friday
(November 13th).

Cheers,
Greg



Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Gregory Haynes
Excerpts from Chris Friesen's message of 2015-10-09 19:36:03 +:
> On 10/09/2015 12:55 PM, Gregory Haynes wrote:
> 
> > There is a more generalized version of this algorithm for concurrent
> > scheduling I've seen a few times - Pick N options at random, apply
> > heuristic over that N to pick the best, attempt to schedule at your
> > choice, retry on failure. As long as you have a fast heuristic and your
> > N is sufficiently smaller than the total number of options then the
> > retries are rare-ish and cheap. It also can scale out extremely well.
> 
> If you're looking for a resource that is relatively rare (say you want a 
> particular hardware accelerator, or a very large number of CPUs, or even to 
> be 
> scheduled "near" to a specific other instance) then you may have to retry 
> quite 
> a lot.
> 
> Chris
> 

Yep. You can either be fast or correct. There is no solution which will
both scale easily and allow you to schedule to a very precise node
efficiently or this would be a solved problem.

There is a not too bad middle ground here though - you can definitely do
some filtering beforehand efficiently (especially if you have some kind
of local cache similar to what Josh mentioned with ZK) and then this is
less of an issue. This is definitely a big step in complexity though...

Cheers,
Greg



Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Gregory Haynes
Excerpts from Joshua Harlow's message of 2015-10-08 15:24:18 +:
> On this point, and just thinking out loud. If we consider saving
> compute_node information into say a node in said DLM backend (for
> example a znode in zookeeper[1]); this information would be updated
> periodically by that compute_node *itself* (it would say contain
> information about what VMs are running on it, what their utilization is
> and so-on).
> 
> For example the following layout could be used:
> 
> /nova/compute_nodes/
> 
>  data could be:
> 
> {
> vms: [],
> memory_free: XYZ,
> cpu_usage: ABC,
> memory_used: MNO,
> ...
> }
> 
> Now if we imagine each/all schedulers having watches
> on /nova/compute_nodes/ ([2] consul and etcd have equivalent concepts
> afaik) then when a compute_node updates that information a push
> notification (the watch being triggered) will be sent to the
> scheduler(s) and the scheduler(s) could then update a local in-memory
> cache of the data about all the hypervisors that can be selected from
> for scheduling. This avoids any reading of a large set of data in the
> first place (besides an initial read-once on startup to read the
> initial list + setup the watches); in a way it's similar to push
> notifications. Then when scheduling a VM -> hypervisor there isn't any
> need to query anything but the local in-memory representation that the
> scheduler is maintaining (and updating as watches are triggered)...
> 
> So this is why I was wondering about what capabilities of cassandra are
> being used here; because the above I think are unique capabilities of
> DLM like systems (zookeeper, consul, etcd) that could be advantageous
> here...
> 
> [1]
> https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#sc_zkDataModel_znodes
> 
> [2]
> https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html#ch_zkWatches

I wonder if we would even need to make something so specialized to get
this kind of local caching. I don't know what the current ZK tools are
but the original Chubby paper described that clients always have a
write-through cache for nodes which they set up subscriptions for in
order to break the cache.

Also, re: etcd - The last time I checked their subscription API was
woefully inadequate for performing this type of thing without herding
issues.
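The watch-driven local cache described in the quoted message can be sketched roughly as follows. This is a toy, in-process stand-in and not the ZooKeeper, Consul, or etcd API: `ToyStore`, `Scheduler`, and the node names are all hypothetical, and real watch semantics (re-registration, session loss, ordering) are ignored. It only shows the shape of the idea: one initial read, then pushed updates, with scheduling decisions made purely from local memory.

```python
class ToyStore:
    """In-process stand-in for a DLM-style store that supports watches."""

    def __init__(self):
        self._data = {}
        self._watchers = []

    def watch(self, callback):
        # Initial read-once of the current state, then register for pushes.
        for key, value in self._data.items():
            callback(key, value)
        self._watchers.append(callback)

    def put(self, key, value):
        # A compute node updating its own entry triggers every watcher.
        self._data[key] = value
        for callback in self._watchers:
            callback(key, value)


class Scheduler:
    """Keeps a local in-memory view of compute nodes, updated by watches."""

    def __init__(self, store):
        self.cache = {}
        store.watch(self._on_update)

    def _on_update(self, node, stats):
        self.cache[node] = stats

    def pick(self, memory_needed):
        # Only the local cache is consulted; the store is never queried here.
        fits = [n for n, s in self.cache.items()
                if s["memory_free"] >= memory_needed]
        return max(fits, key=lambda n: self.cache[n]["memory_free"],
                   default=None)


store = ToyStore()
store.put("/nova/compute_nodes/node1", {"vms": [], "memory_free": 2048})
scheduler = Scheduler(store)  # picks up node1 via the initial read
store.put("/nova/compute_nodes/node2", {"vms": [], "memory_free": 8192})
first_choice = scheduler.pick(4096)

# node2 reports that it has filled up; the scheduler's cache follows along.
store.put("/nova/compute_nodes/node2", {"vms": ["vm-1"], "memory_free": 1024})
second_choice = scheduler.pick(2000)
```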



Re: [openstack-dev] Scheduler proposal

2015-10-09 Thread Gregory Haynes
Excerpts from Zane Bitter's message of 2015-10-09 17:09:46 +:
> On 08/10/15 21:32, Ian Wells wrote:
> >
> > > 2. if many hosts suit the 5 VMs then this is *very* unlucky, because 
> > we should be choosing a host at random from the set of
> > suitable hosts and that's a huge coincidence - so this is a tiny
> > corner case that we shouldn't be designing around
> >
> > Here is where we differ in our understanding. With the current
> > system of filters and weighers, 5 schedulers getting requests for
> > identical VMs and having identical information are *expected* to
> > select the same host. It is not a tiny corner case; it is the most
> > likely result for the current system design. By catching this
> > situation early (in the scheduling process) we can avoid multiple
> > RPC round-trips to handle the fail/retry mechanism.
> >
> >
> > And so maybe this would be a different fix - choose, at random, one of
> > the hosts above a weighting threshold, not choose the top host every
> > time? Technically, any host passing the filter is adequate to the task
> > from the perspective of an API user (and they can't prove if they got
> > the highest weighting or not), so if we assume weighting an operator
> > preference, and just weaken it slightly, we'd have a few more options.
> 
> The optimal way to do this would be a weighted random selection, where 
> the probability of any given host being selected is proportional to its 
> weighting. (Obviously this is limited by the accuracy of the weighting 
> function in expressing your actual preferences - and it's at least 
> conceivable that this could vary with the number of schedulers running.)
> 
> In fact, the choice of the name 'weighting' would normally imply that 
> it's done this way; hearing that the 'weighting' is actually used as a 
> 'score' with the highest one always winning is quite surprising.
> 
> cheers,
> Zane.
> 
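A minimal sketch of the weighted random selection suggested in the quoted message, where the chance of picking a host is proportional to its weight. The host names and weights are made up for illustration, and this is not how the existing filter/weigher code works; it uses the standard library's `random.choices`.

```python
import random


def pick_weighted(weights, rng=None):
    """Pick one host with probability proportional to its weight.

    weights: dict mapping host name -> non-negative weight.
    """
    rng = rng or random.Random()
    hosts = list(weights)
    return rng.choices(hosts, weights=[weights[h] for h in hosts], k=1)[0]


# Hypothetical weights: host-b should win about three times as often as host-a.
weights = {"host-a": 1.0, "host-b": 3.0}
rng = random.Random(42)
draws = [pick_weighted(weights, rng) for _ in range(10000)]
share_b = draws.count("host-b") / len(draws)
```

With several schedulers drawing independently, identical requests spread across the suitable hosts instead of all landing on the single top-scoring one.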

There is a more generalized version of this algorithm for concurrent
scheduling I've seen a few times - Pick N options at random, apply
heuristic over that N to pick the best, attempt to schedule at your
choice, retry on failure. As long as you have a fast heuristic and your
N is sufficiently smaller than the total number of options then the
retries are rare-ish and cheap. It also can scale out extremely well.
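The steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not any existing scheduler: `schedule`, `try_claim`, and the host table are hypothetical names, the heuristic is simply "most free RAM", and the claim is a local check-and-decrement standing in for whatever atomic operation would really reserve the host.

```python
import random


def schedule(hosts, score, try_claim, sample_size=2, max_retries=20, rng=None):
    """Pick N hosts at random, take the best by score, try to claim it.

    Returns the claimed host, or None if every attempt failed.
    """
    rng = rng or random.Random()
    hosts = list(hosts)
    for _ in range(max_retries):
        sample = rng.sample(hosts, min(sample_size, len(hosts)))
        best = max(sample, key=score)
        if try_claim(best):
            return best
        # Claim failed (host full, or another scheduler won): draw again.
    return None


# Hypothetical cluster state: free RAM in GB per host.
free_ram = {"host-a": 4, "host-b": 16, "host-c": 8}


def try_claim(host, needed=8):
    """Stand-in for the real, atomic claim against the host."""
    if free_ram[host] >= needed:
        free_ram[host] -= needed
        return True
    return False


chosen = schedule(free_ram, score=free_ram.get, try_claim=try_claim,
                  rng=random.Random(0))
```

The scoring only ever touches `sample_size` hosts, so the cost per attempt stays flat as the cluster grows; the price is that the chosen host is the best of the sample rather than the best overall.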

Obviously you lose some of the ability to micro-manage where things are
placed with a scheduling setup like that, but if scaling up is the
concern I really hope that isn't a problem...

Cheers,
Greg



Re: [openstack-dev] [TripleO] diskimage-builder 1.0.0

2015-07-27 Thread Gregory Haynes
Excerpts from Gregory Haynes's message of 2015-06-29 12:44:18 +:
 Hello all,
 
 DIB has come a long way and we seem to have a fairly stable interface
 for the elements and the image creation scripts. As such, I think it's
 about time we commit to a major version release. Hopefully this can give
 our users the (correct) impression that DIB is ready for use by folks
 who want some level of interface stability.
 
 AFAICT our bug list does not have any major issues that might require us
 to break our interface, so I don't see any harm in 'just going for it'.
 If anyone has input on fixes/features we should consider including
 before a 1.0.0 release please speak up now. If there are no objections
 by next week I'd like to try and cut a release then. :)
 
 Cheers,
 Greg

I just cut the 1.0.0 release, so no going back now. Enjoy!

-Greg



Re: [openstack-dev] [TripleO] Moving instack upstream

2015-07-22 Thread Gregory Haynes
Excerpts from Derek Higgins's message of 2015-07-21 19:29:49 +:
 Hi All,
 Something we discussed at the summit was to switch the focus of 
 tripleo's deployment method to deploy using instack using images built 
 with tripleo-puppet-elements. Up to now all the instack work has been 
 done downstream of tripleo as part of rdo. Having parts of our 
 deployment story outside of upstream gives us problems mainly because it 
 becomes very difficult to CI what we expect deployers to use while we're 
 developing the upstream parts.
 
 Essentially what I'm talking about here is pulling instack-undercloud 
 upstream along with a few of its dependency projects (instack, 
 tripleo-common, tuskar-ui-extras etc..) into tripleo and using them in 
 our CI in place of devtest.
 
 Getting our CI working with instack is close to working but has taken 
 longer than I expected because of various complications and distractions 
 but I hope to have something over the next few days that we can use to 
 replace devtest in CI, in a lot of ways this will start out by taking a 
 step backwards but we should finish up in a better place where we will 
 be developing (and running CI on) what we expect deployers to use.
 
 Once I have something that works I think it makes sense to drop the jobs 
 undercloud-precise-nonha and overcloud-precise-nonha, while switching 
 overcloud-f21-nonha to use instack, this has a few effects that need to 
 be called out
 
 1. We will no longer be running CI on (and as a result not supporting) 
 most of the bash based elements
 2. We will no longer be running CI on (and as a result not supporting) 
 ubuntu

I'd like to point out that this means DIB will no longer have an image
booting test for Ubuntu. I have created a review[1] to try and get some
coverage of this in a dib-specific test, hopefully we can get it merged
before we remove the tripleo ubuntu tests?

 
 Should anybody come along in the future interested in either of these 
 things (and prepared to put the time in) we can pick them back up again. 
 In fact the move to puppet element based images should mean we can more 
 easily add in extra distros in the future.
 
 3. While we find our feet we should remove all tripleo-ci jobs from non 
 tripleo projects, once we're confident with it we can explore adding our 
 jobs back into other projects again

I assume DIB will be keeping the tripleo jobs for now?

 
 Nothing has changed yet, in order to check we're all on the same page 
 this is high level details of how I see things should proceed so shout 
 now if I got anything wrong or you disagree.
 
 Sorry for not sending this out sooner for those of you who weren't at 
 the summit,
 Derek.
 

-Greg

[1] https://review.openstack.org/#/c/204639/



Re: [openstack-dev] [TripleO] diskimage-builder 1.0.0

2015-07-06 Thread Gregory Haynes
Excerpts from James Slagle's message of 2015-06-30 15:30:49 +:
 On Mon, Jun 29, 2015 at 8:44 AM, Gregory Haynes g...@greghaynes.net wrote:
  Hello all,
 
  DIB has come a long way and we seem to have a fairly stable interface
  for the elements and the image creation scripts. As such, I think it's
  about time we commit to a major version release. Hopefully this can give
  our users the (correct) impression that DIB is ready for use by folks
  who want some level of interface stability.
 
  AFAICT our bug list does not have any major issues that might require us
  to break our interface, so I don't see any harm in 'just going for it'.
  If anyone has input on fixes/features we should consider including
  before a 1.0.0 release please speak up now. If there are no objections
  by next week I'd like to try and cut a release then. :)
 
 Sounds good to me. I think the stable interfaces also includes the
 elements expected environment variables. It probably makes sense to
 document somewhere what the stable interfaces are, so that people
 doing releases know how to version the release appropriately based on
 any changes to those interfaces.
 
 Should we also remove the deprecated disk-image-get-kernel prior to
 the 1.0.0? There's a few other deprecations as well (map-packages),
 but I don't think we ever fully moved off of that in dib itself.
 

Both are great ideas. I've created [1] and [2] for this.

1: https://review.openstack.org/#/c/198810/
2: https://review.openstack.org/#/c/198814/



[openstack-dev] [TripleO] diskimage-builder 1.0.0

2015-06-29 Thread Gregory Haynes
Hello all,

DIB has come a long way and we seem to have a fairly stable interface
for the elements and the image creation scripts. As such, I think it's
about time we commit to a major version release. Hopefully this can give
our users the (correct) impression that DIB is ready for use by folks
who want some level of interface stability.

AFAICT our bug list does not have any major issues that might require us
to break our interface, so I don't see any harm in 'just going for it'.
If anyone has input on fixes/features we should consider including
before a 1.0.0 release please speak up now. If there are no objections
by next week I'd like to try and cut a release then. :)

Cheers,
Greg



[openstack-dev] [TripleO] Splitting out dib-core

2015-06-26 Thread Gregory Haynes
Hello TripleOers,

At the last mid-cycle we briefly discussed whether we should have
separate groups for tripleo and DIB core and decided it wasn't
necessary. I would like to revisit that topic.

It seems clear to me that we have some existing tripleo cores who are
becoming less familiar with the tripleo project as a whole but are
highly active in the DIB project. We also have new contributors who are
fairly active in the DIB project but are not active in the other tripleo
projects. I would really like a path where new contributors like this
can become core on DIB but that isn't really an option right now with
one core group for both tripleo and DIB.

I'd like to propose we make a new gerrit group for dib-core and add the
existing tripleo-core members to it (note: this is different than adding
tripleo-core as a member group of dib-core, which is an alternative).
This would mean that new tripleo cores would not automatically be DIB
cores, and new DIB cores would not automatically be tripleo cores.

Thoughts?

-Greg



Re: [openstack-dev] [TripleO] Splitting out dib-core

2015-06-26 Thread Gregory Haynes
Excerpts from Clint Byrum's message of 2015-06-26 17:21:29 +:
 Excerpts from Gregory Haynes's message of 2015-06-26 08:17:36 -0700:
  Hello TripleOers,
  
  At the last mid-cycle we briefly discussed whether we should have
  separate groups for tripleo and DIB core and decided it wasn't
  necessary. I would like to revisit that topic.
  
  It seems clear to me that we have some existing tripleo cores who are
  becoming less familiar with the tripleo project as a whole but are
  highly active in the DIB project. We also have new contributors who are
  fairly active in the DIB project but are not active in the other tripleo
  projects. I would really like a path where new contributors like this
  can become core on DIB but that isn't really an option right now with
  one core group for both tripleo and DIB.
  
  I'd like to propose we make a new gerrit group for dib-core and add the
  existing tripleo-core members to it (note: this is different than adding
  tripleo-core as a member group of dib-core, which is an alternative).
  This would mean that new tripleo cores would not automatically be DIB
  cores, and new DIB cores would not automatically be tripleo cores.
 
 Why wouldn't we just make dib-core a sub-team for people not familiar
 with the broader TripleO effort, and just add tripleo-core to dib-core?
 

If I parse this correctly - you're asking about making dib-core a
superset of tripleo-core. I actually hadn't considered this, and I think
it's a good idea. The only reason we may not want this is if tripleo is
drifting far enough away from DIB that it doesn't make sense for new
tripleo cores to become DIB cores, but I don't think that is the case.

So, +1 on that idea.



Re: [openstack-dev] [TripleO] Core reviewer update proposal

2015-05-05 Thread Gregory Haynes
Excerpts from James Slagle's message of 2015-05-05 11:57:46 +:
 TripleO cores, please respond with +1/-1 votes and any
 comments/objections within 1 week.

+1



Re: [openstack-dev] [magnum] New fedora 21 Atomic images available for testing

2015-04-26 Thread Gregory Haynes
Excerpts from Steven Dake (stdake)'s message of 2015-04-23 23:27:00 +:
 Hi folks,
 
 I have spent the last couple of days trying to bring some sanity to the image 
 building process for Magnum.
 
 I have found a tool which the Atomic upstream produces which allows a simple 
 repeatable building process for Fedora Atomic images using any upstream repos 
 of our choosing.
 
 I put in a kubernetes 0.15 COPR repo in this build.  Please test and get back 
 to me either on irc or the ML.
 
 The image is available for download from here:
 https://fedorapeople.org/groups/magnum/fedora-21-atomic-3.qcow2.xz<https://fedorapeople.org/groups/magnum/>
 
 Regards,
 -steve

Hey Steve,

I'm wondering if you've looked at diskimage-builder for building these
images? There are already a fair number of openstack projects using this to
make disk images and I imagine it wouldn't be too hard for it to build
images of the type you need...

Cheers,
Greg



Re: [openstack-dev] [TripleO] splitting out image building from devtest_overcloud.sh

2015-04-15 Thread Gregory Haynes
Excerpts from Dan Prince's message of 2015-04-15 02:14:12 +:
 I've been trying to cleanly model some Ceph and HA configurations in
 tripleo-ci that use Puppet (we are quite close to having these things in
 CI now!)
 
 Turns out the environment variables needed for these things are getting
 to be quite a mess. Furthermore we'd actually need to add to the
 environment variable madness to get it all working. And then there are
 optimization we'd like to add (like building a single image instead of
 one per role).
 
 One thing that would really help in this regard is splitting out image
 building from devtest_overcloud.sh. I took a stab at some initial
 patches to do this today.
 
 build-images: drive DIB via YAML config file
 https://review.openstack.org/#/c/173644/
 
 devtest_overcloud.sh: split out image building
 https://review.openstack.org/#/c/173645/
 
 If these sit well we could expand the effort to load images a bit more
 dynamically (a load-images script which could also be driven via a
 disk_images.yaml config file) and then I think devtest_overcloud.sh
 would be a lot more flexible for us Puppet users.
 
 Thoughts? I still have some questions myself but I wanted to get this
 out because we really do need some extra flexibility to be able to
 cleanly tune our scripts for more CI jobs.
 
 Dan
 

Have you looked at possibly using infra's nodepool for this? It is a bit
overkill, but currently nodepool lets you define a yaml file of images
for it to build using dib. If we're not ok with bringing in all the
extras that nodepool has, maybe we could work on splitting out part of
nodepool for our needs, and having both projects use this.

Cheers,
Greg



Re: [openstack-dev] [TripleO] splitting out image building from devtest_overcloud.sh

2015-04-15 Thread Gregory Haynes
Excerpts from Gregory Haynes's message of 2015-04-16 02:50:17 +:
 Excerpts from Dan Prince's message of 2015-04-15 02:14:12 +:
  I've been trying to cleanly model some Ceph and HA configurations in
  tripleo-ci that use Puppet (we are quite close to having these things in
  CI now!)
  
  Turns out the environment variables needed for these things are getting
  to be quite a mess. Furthermore we'd actually need to add to the
  environment variable madness to get it all working. And then there are
  optimization we'd like to add (like building a single image instead of
  one per role).
  
  One thing that would really help in this regard is splitting out image
  building from devtest_overcloud.sh. I took a stab at some initial
  patches to do this today.
  
  build-images: drive DIB via YAML config file
  https://review.openstack.org/#/c/173644/
  
  devtest_overcloud.sh: split out image building
  https://review.openstack.org/#/c/173645/
  
  If these sit well we could expand the effort to load images a bit more
  dynamically (a load-images script which could also be driven via a
  disk_images.yaml config file) and then I think devtest_overcloud.sh
  would be a lot more flexible for us Puppet users.
  
  Thoughts? I still have some questions myself but I wanted to get this
  out because we really do need some extra flexibility to be able to
  cleanly tune our scripts for more CI jobs.
  
  Dan
  
 
 Have you looked at possibly using infra's nodepool for this? It is a bit
 overkill, but currently nodepool lets you define a yaml file of images
 for it to build using dib. If we're not ok with bringing in all the
 extras that nodepool has, maybe we could work on splitting out part of
 nodepool for our needs, and having both projects use this.
 
 Cheers,
 Greg

Did some digging and looks like infra has some planned work for this
already[1]. This would be great for TripleO as well for the same reasons
that infra wants it.

I do get that you have a need for this today though and what I'm
describing is a ways out, so I am +1 on your current approach for now.

Cheers,
Greg

1: 
http://specs.openstack.org/openstack-infra/infra-specs/specs/nodepool-workers.html



Re: [openstack-dev] [TripleO] Consistent variable documentation for diskimage-builder elements

2015-04-12 Thread Gregory Haynes
Excerpts from Clint Byrum's message of 2015-04-08 23:11:29 +:
 
 I discussed a format for something similar here:
 
 https://review.openstack.org/#/c/162267/
 
 Perhaps we could merge the effort.
 
 The design and implementation in that might take some time, but if we
 can document the variables at the same time we prepare the inputs for
 isolation, that seems like a winning path forward.
 

The solution presented there would be awesome for not having to document
the variables manually at all - we can do some sphinx plugin magic to
autogen the doc sections and even get some annoying-to-write-out
features like static links for each var (I'm sure you knew this, just
spelling it out).

I agree that it'd be better to not put a lot of effort into switching all
the READMEs over right now and instead work on the argument isolation.
My hope is that in the meanwhile new elements we create and possibly
READMEs we end up editing get moved over to this new format. Then, we
can try and autogen something that is pretty similar when the time
comes.

Now, let's get that arg isolation done already. ;)

Cheers,
Greg



[openstack-dev] [TripleO] Consistent variable documentation for diskimage-builder elements

2015-04-07 Thread Gregory Haynes
Hello,

I'd like to propose a standard for consistently documenting our
diskimage-builder elements. I have pushed a review which transforms the
apt-sources element to this format[1][2]. Essentially, I'd like to move
in the direction of making all our element README.rst files contain a
subsection called Environment Variables with a Definition List[3] where
each entry is the environment variable. Under that environment variable
we will have a field list[4] with Required, Default, Description, and
optionally Example.
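As a rough illustration of the proposed layout, the sketch below renders such a section from plain data: a definition list whose terms are the variable names, each followed by a field list. The helper `render_env_vars`, the variable `DIB_EXAMPLE_VAR`, and the choice of `-` for the section underline are assumptions made for this example, not something taken from the review.

```python
def render_env_vars(variables):
    """Render an 'Environment Variables' RST section.

    Each variable becomes a definition-list term with a field list
    (Required, Default, Description, optional Example) as its body.
    """
    title = "Environment Variables"
    lines = [title, "-" * len(title), ""]
    for var in variables:
        lines.append(var["name"])
        lines.append("  :Required: " + ("Yes" if var["required"] else "No"))
        lines.append("  :Default: " + var.get("default", "None"))
        lines.append("  :Description: " + var["description"])
        if "example" in var:
            lines.append("  :Example: ``" + var["example"] + "``")
        lines.append("")
    return "\n".join(lines)


doc = render_env_vars([
    {
        "name": "DIB_EXAMPLE_VAR",  # hypothetical variable name
        "required": False,
        "default": "None",
        "description": "Illustrative entry showing the proposed layout.",
        "example": "DIB_EXAMPLE_VAR=value",
    },
])
print(doc)
```

Keeping the four fields in a fixed order is what lets a reader scan straight to the variable name and then to Required or Default without reading the prose around it.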

The goal here is that rather than users being presented with a wall of
text that they need to dig through to remember the name of a variable,
there is a quick way for them to get the information they need. It also
should help us to remember to document the vital bits of information for
each variable we use.

Thoughts?

Cheers,
Greg

1 - https://review.openstack.org/#/c/171320/
2 - 
http://docs-draft.openstack.org/20/171320/1/check/gate-diskimage-builder-docs/d3bdf04//doc/build/html/elements/apt-sources/README.html
3 - http://docutils.sourceforge.net/docs/user/rst/quickref.html#definition-lists
4 - http://docutils.sourceforge.net/docs/user/rst/quickref.html#field-lists



Re: [openstack-dev] [all] Do we need release announcements for all the things?

2015-03-12 Thread Gregory Haynes
Excerpts from Clint Byrum's message of 2015-03-12 20:22:04 +:
 I spend a not-insignificant amount of time deciding which threads to
 read and which to fully ignore each day, so extra threads mean extra
 work, even with a streamlined workflow of single-key-press-per-thread.
 
 So I'm wondering what people are getting from these announcements being
 on the discussion list. I feel like they'd be better off in a weekly
 digest, on a web page somewhere, or perhaps with a tag that could be
 filtered out for those that don't benefit from them.
 

++

Or maybe even just send them to the already existing openstack-announce
list?



Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera

2015-02-05 Thread Gregory Haynes
Excerpts from Joshua Harlow's message of 2015-02-06 01:26:25 +:
 Angus Lees wrote:
  On Fri Feb 06 2015 at 4:25:43 AM Clint Byrum cl...@fewbar.com
  mailto:cl...@fewbar.com wrote:
  I'd also like to see consideration given to systems that handle
  distributed consistency in a more active manner. etcd and Zookeeper are
  both such systems, and might serve as efficient guards for critical
  sections without raising latency.
 
 
  +1 for moving to such systems.  Then we can have a repeat of the above
  conversation without the added complications of SQL semantics ;)
 
 
 So just an fyi:
 
 http://docs.openstack.org/developer/tooz/ exists.
 
 Specifically:
 
 http://docs.openstack.org/developer/tooz/developers.html#tooz.coordination.CoordinationDriver.get_lock
 
 It has a locking api that it provides (that plugs into the various 
 backends); there is also a WIP https://review.openstack.org/#/c/151463/ 
 driver that is being worked for etc.d.
 

An interesting note about the etcd implementation is that you can
select per-request whether you want to wait for quorum on a read or not.
This means that in theory you could obtain higher throughput for most
operations which do not require this and then only gain quorum for
operations which require it (e.g. locks).
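
To make the per-request choice concrete, here is a minimal sketch that only
builds the request URLs. It assumes etcd's v2 HTTP API, where a read can carry
a quorum=true query parameter; the base URL, the key names and the helper
name etcd_v2_read_url are hypothetical.

```python
from urllib.parse import quote, urlencode


def etcd_v2_read_url(base_url, key, quorum=False):
    """Build the GET URL for reading a key through the etcd v2 keys API.

    Assumption: quorum reads are requested with ?quorum=true. Reads without
    it may be answered from a member's local state.
    """
    url = "{}/v2/keys/{}".format(base_url.rstrip("/"), quote(key.lstrip("/")))
    if quorum:
        url += "?" + urlencode({"quorum": "true"})
    return url


# Cheap read for ordinary data, quorum read for lock state (hypothetical keys).
fast_url = etcd_v2_read_url("http://127.0.0.1:2379", "/config/flavor")
lock_url = etcd_v2_read_url("http://127.0.0.1:2379", "/locks/resize", quorum=True)
```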



Re: [openstack-dev] [all][oslo.db][nova] TL; DR Things everybody should know about Galera

2015-02-05 Thread Gregory Haynes
Excerpts from Angus Lees's message of 2015-02-06 02:36:32 +:
 On Fri Feb 06 2015 at 12:59:13 PM Gregory Haynes g...@greghaynes.net
 wrote:
 
  Excerpts from Joshua Harlow's message of 2015-02-06 01:26:25 +:
   Angus Lees wrote:
On Fri Feb 06 2015 at 4:25:43 AM Clint Byrum cl...@fewbar.com
mailto:cl...@fewbar.com wrote:
I'd also like to see consideration given to systems that handle
distributed consistency in a more active manner. etcd and
  Zookeeper are
both such systems, and might serve as efficient guards for critical
sections without raising latency.
   
   
+1 for moving to such systems.  Then we can have a repeat of the above
conversation without the added complications of SQL semantics ;)
   
  
   So just an fyi:
  
   http://docs.openstack.org/developer/tooz/ exists.
  
   Specifically:
  
   http://docs.openstack.org/developer/tooz/developers.html#tooz.coordination.CoordinationDriver.get_lock
  
   It has a locking api that it provides (that plugs into the various
   backends); there is also a WIP https://review.openstack.org/#/c/151463/
   driver that is being worked for etc.d.
  
 
  An interesting note about the etcd implementation is that you can
  select per-request whether you want to wait for quorum on a read or not.
  This means that in theory you could obtain higher throughput for most
  operations which do not require this and then only gain quorum for
  operations which require it (e.g. locks).
 
 
 Along those lines and in an effort to be a bit less doom-and-gloom, I spent
 my lunch break trying to find non-marketing documentation on the Galera
 replication protocol and how it is exposed. (It was surprisingly difficult
 to find such information *)
 
 It's easy to get the transaction ID of the last commit
 (wsrep_last_committed), but I can't find a way to wait until at least a
 particular transaction ID has been synced.  If we can find that latter
 functionality, then we can expose that sequencer all the way through (HTTP
 header?) and then any follow-on commands can mention the sequencer of the
 previous write command that they really need to see the effects of.
 
 In practice, this should lead to zero additional wait time, since the
 Galera replication has almost certainly already caught up by the time the
 second command comes in - and we can just read from the local server with
 no additional delay.
 
 See the various *Index variables in the etcd API, for how the same idea
 gets used there.
 
  - Gus
 
 (*) In case you're also curious, the only doc I found with any details was
 http://galeracluster.com/documentation-webpages/certificationbasedreplication.html
 and its sibling pages.

My fear with something like this is that this is already a very hard
problem to get correct, and this would add a fair amount of
complexity client side to achieve it. There is also an issue in that
this would be a Galera-specific solution, which means we'll be adding another
dimension to our feature testing matrix if we really wanted to support
it.

IMO we *really* do not want to be in the business of writing distributed
locking systems, but rather should be finding a way to either not
require them or rely on existing solutions.



Re: [openstack-dev] [tripleo] Undercloud heat version expectations

2015-01-30 Thread Gregory Haynes
Excerpts from Gregory Haynes's message of 2015-01-30 18:28:19 +:
 Excerpts from Steven Hardy's message of 2015-01-30 10:29:05 +:
  Hi all,
  
  I've had a couple of discussions lately causing me to question $subject,
  and in particular what our expectations are around tripleo-heat-templates
  working with older (e.g non trunk) versions of Heat in the undercloud.
  
  For example, in [1], we're discussing merging a template-level workaround
  for a heat bug which has been fixed for nearly 4 months (I've now proposed
  a stable/juno backport..) - this raises the question, do we actually
  support tripleo-heat-templates with a stable/juno heat in the undercloud?
  
  Related to this is discussion such as [2], where ideally I'd like us to
  start using some new-shiny features we've been landing in heat to make the
  templates cleaner - is this valid, e.g can I start proposing template
  changes to tripleo-heat-templates which will definitely require
  new-for-kilo heat functionality?
  
  Thanks,
  
  Steve
  
  [1] https://review.openstack.org/#/c/151038/
  [2] https://review.openstack.org/#/c/151389/
  
 
 Hey Steve,
 
 A while ago (last mid cycle IIRC) we decided that rather than maintain
 stable branches we would ensure that we could deploy stable openstack
 releases from trunk. I believe Heat falls under this umbrella, and we
 need to make sure that we support deploying at least the latest stable
 heat release.
 
 That being said, we're lacking in this plan ATM. We *really* should have
 a stable release CI job. We do have a spec though[1].
 
 Cheers,
 Greg
 
 
 [1] 
 http://git.openstack.org/cgit/openstack/tripleo-specs/tree/specs/juno/backwards-compat-policy.rst

We had a discussion in IRC about this and I wanted to bring up the points
that were made on the ML. By the end of the discussion I think the
consensus there was that we should resurrect the stable branches.
Therefore, I am especially seeking input from people who have arguments
for keeping our current 'deploy stable openstack from master' goals.

Our goal of being able to deploy stable openstack branches using HEAD of
tripleo tools makes some new feature development more difficult on
master than it needs to be. Specifically, dprince has been feeling this
pain in the tripleo/puppet integration work he is doing. There is also
some new heat feature work we could benefit from (like the patches
above) that we're going to have to wait multiple cycles for or maintain
multiple implementations of. Therefore we should look into resurrecting
our stable branches.

The backwards compat spec specifies that tripleo-image-elements and
tripleo-heat-templates are co-dependent WRT backwards compat. This
probably made some sense at the time of the spec writing since
alternatives to tripleo-image-elements did not exist, but with the
tripleo/puppet work we need to revisit this.

Thoughts? Comments?

Cheers,
Greg



Re: [openstack-dev] [tripleo] Undercloud heat version expectations

2015-01-30 Thread Gregory Haynes
Excerpts from Steven Hardy's message of 2015-01-30 10:29:05 +:
 Hi all,
 
 I've had a couple of discussions lately causing me to question $subject,
 and in particular what our expectations are around tripleo-heat-templates
 working with older (e.g non trunk) versions of Heat in the undercloud.
 
 For example, in [1], we're discussing merging a template-level workaround
 for a heat bug which has been fixed for nearly 4 months (I've now proposed
 a stable/juno backport..) - this raises the question, do we actually
 support tripleo-heat-templates with a stable/juno heat in the undercloud?
 
 Related to this is discussion such as [2], where ideally I'd like us to
 start using some new-shiny features we've been landing in heat to make the
 templates cleaner - is this valid, e.g can I start proposing template
 changes to tripleo-heat-templates which will definitely require
 new-for-kilo heat functionality?
 
 Thanks,
 
 Steve
 
 [1] https://review.openstack.org/#/c/151038/
 [2] https://review.openstack.org/#/c/151389/
 

Hey Steve,

A while ago (last mid cycle IIRC) we decided that rather than maintain
stable branches we would ensure that we could deploy stable openstack
releases from trunk. I believe Heat falls under this umbrella, and we
need to make sure that we support deploying at least the latest stable
heat release.

That being said, we're lacking in this plan ATM. We *really* should have
a stable release CI job. We do have a spec though[1].

Cheers,
Greg


[1] 
http://git.openstack.org/cgit/openstack/tripleo-specs/tree/specs/juno/backwards-compat-policy.rst



Re: [openstack-dev] [TripleO] nominating James Polley for tripleo-core

2015-01-14 Thread Gregory Haynes
Excerpts from Clint Byrum's message of 2015-01-14 18:14:45 +:
 Hello! It has been a while since we expanded our review team. The
 numbers aren't easy to read with recent dips caused by the summit and
 holidays. However, I believe James has demonstrated superb review skills
 and a commitment to the project that shows broad awareness of the
 project.
 
 Below are the results of a meta-review I did, selecting recent reviews
 by James with comments and a final score. I didn't find any reviews by
 James that I objected to.
 
 https://review.openstack.org/#/c/133554/ -- Took charge and provided
 valuable feedback. +2
 https://review.openstack.org/#/c/114360/ -- Good -1 asking for better
 commit message and then timely follow-up +1 with positive comments for
 more improvement. +2
 https://review.openstack.org/#/c/138947/ -- Simpler review, +1'd on Dec.
 19 and no follow-up since. Allowing 2 weeks for holiday vacation, this
 is only really about 7 - 10 working days and acceptable. +2
 https://review.openstack.org/#/c/146731/ -- Very thoughtful -1 review of
 recent change with alternatives to the approach submitted as patches.
 https://review.openstack.org/#/c/139876/ -- Simpler review, +1'd in
 agreement with everyone else. +1
 https://review.openstack.org/#/c/142621/ -- Thoughtful +1 with
 consideration for other reviewers. +2
 https://review.openstack.org/#/c/113983/ -- Thorough spec review with
 grammar pedantry noted as something that would not prevent a positive
 review score. +2
 
 All current tripleo-core members are invited to vote at this time. Thank
 you!
 

Definite +1.

-Greg



Re: [openstack-dev] [all] [api] gabbi: A tool for declarative testing of APIs

2015-01-12 Thread Gregory Haynes
Excerpts from Chris Dent's message of 2015-01-12 19:20:18 +:
 
 After some discussion with Sean Dague and a few others it became
 clear that it would be a good idea to introduce a new tool I've been
 working on to the list to get a sense of its usefulness generally,
 work towards getting it into global requirements, and get the
 documentation fleshed out so that people can actually figure out how
 to use it well.
 
 tl;dr: Help me make this interesting tool useful to you and your
 HTTP testing by reading this message and following some of the links
 and asking any questions that come up.
 
 The tool is called gabbi
 
  https://github.com/cdent/gabbi
  http://gabbi.readthedocs.org/
  https://pypi.python.org/pypi/gabbi
 
 It describes itself as a tool for running HTTP tests where requests
 and responses are represented in a declarative form. Its main
 purpose is to allow testing of APIs where the focus of test writing
 (and reading!) is on the HTTP requests and responses, not on a bunch of
 Python (that obscures the HTTP).
 
 The tests are written in YAML and the simplest test file has this form:
 
 ```
 tests:
 - name: a test
   url: /
 ```
 
 This test will pass if the response status code is '200'.
 
 The test file is loaded by a small amount of python code which transforms
 the file into an ordered sequence of TestCases in a TestSuite[1].
 
 ```
 def load_tests(loader, tests, pattern):
     """Provide a TestSuite to the discovery process."""
     test_dir = os.path.join(os.path.dirname(__file__), TESTS_DIR)
     return driver.build_tests(test_dir, loader, host=None,
                               intercept=SimpleWsgi,
                               fixture_module=sys.modules[__name__])
 ```
 
 The loader provides either:
 
 * a host to which real over-the-network requests are made
 * a WSGI app which is wsgi-intercept-ed[2]
 
 If an individual TestCase is asked to be run by the testrunner, those tests
 that are prior to it in the same file are run first, as prerequisites.
 
 Each test file can declare a sequence of nested fixtures to be loaded
 from a configured (in the loader) module. Fixtures are context managers
 (they establish the fixture upon __enter__ and destroy it upon
 __exit__).
 
 With a proper group_regex setting in .testr.conf each YAML file can
 run in its own process in a concurrent test runner.
 
 The docs contain information on the format of the test files:
 
  http://gabbi.readthedocs.org/en/latest/format.html
 
 Each test can state request headers and bodies and evaluate both response
 headers and response bodies. Request bodies can be strings in the
 YAML, files read from disk, or JSON created from YAML structures.
 Response verification can use JSONPath[3] to inspect the details of
 response bodies. Response header validation may use regular
 expressions.
 
 There is limited support for referring to the previous request
 to construct URIs, potentially allowing traversal of a full HATEOAS
 compliant API.
 
 At the moment the most complete examples of how things work are:
 
 * Ceilometer's pending use of gabbi:
https://review.openstack.org/#/c/146187/
 * Gabbi's testing of gabbi:
https://github.com/cdent/gabbi/tree/master/gabbi/gabbits_intercept
(the loader and faked WSGI app for those yaml files is in:
https://github.com/cdent/gabbi/blob/master/gabbi/test_intercept.py)
 
 One obvious thing that will need to happen is a suite of concrete
 examples on how to use the various features. I'm hoping that
 feedback will help drive that.
 
 In my own experimentation with gabbi I've found it very useful. It's
 helped me explore and learn the ceilometer API in a way that existing
 test code has completely failed to do. It's also helped reveal
 several warts that will be very useful to fix. And it is fast. To
 run and to write. I hope that with some work it can be useful to you
 too.
 
 Thanks.
 
 [1] Getting gabbi to play well with PyUnit style tests and
  with infrastructure like subunit and testrepository was one of
  the most challenging parts of the build, but the result has been
  a lot of flexibility.
 
 [2] https://pypi.python.org/pypi/wsgi_intercept
 [3] https://pypi.python.org/pypi/jsonpath-rw
 

Awesome! I was discussing trying to add extensions to RAML[1] so we
could do something like this the other day. Is there any reason you
didn't use an existing modeling language like this?

Cheers,
Greg

[1] http://raml.org/



Re: [openstack-dev] [heat][tripleo] Making diskimage-builder install from forked repo?

2015-01-08 Thread Gregory Haynes
Excerpts from Steven Hardy's message of 2015-01-08 17:37:55 +:
 Hi all,
 
 I'm trying to test a fedora-software-config image with some updated
 components.  I need:
 
 - Install latest master os-apply-config (the commit I want isn't released)
 - Install os-refresh-config fork from https://review.openstack.org/#/c/145764
 
 I can't even get the o-a-c from master part working:
 
 export PATH=${PWD}/dib-utils/bin:$PATH
 export ELEMENTS_PATH=tripleo-image-elements/elements:heat-templates/hot/software-config/elements
 export DIB_INSTALLTYPE_os_apply_config=source
 
 diskimage-builder/bin/disk-image-create vm fedora selinux-permissive \
   os-collect-config os-refresh-config os-apply-config \
   heat-config-ansible \
   heat-config-cfn-init \
   heat-config-docker \
   heat-config-puppet \
   heat-config-salt \
   heat-config-script \
   ntp \
   -o fedora-software-config.qcow2
 
 This is what I'm doing, both tools end up as pip installed versions AFAICS,
 so I've had to resort to manually hacking the image post-DiB using
 virt-copy-in.
 
 Pretty sure there's a way to make DiB do this, but don't know what, anyone
 able to share some clues?  Do I have to hack the elements, or is there a
 better way?
 
 The docs are pretty sparse, so any help would be much appreciated! :)
 
 Thanks,
 
 Steve
 

Hey Steve,

source-repositories is your friend here :) (check out
dib/elements/source-repositories/README). One potential gotcha is that
because source-repositories is an element it really only applies to tools
used within images (and os-apply-config is used outside the image). To
fix this we have a shim in tripleo-incubator/scripts/pull-tools which
emulates the functionality of source-repositories.

Example usage:

* checkout os-apply-config to the ref you wish to use
* export DIB_REPOLOCATION_os_apply_config=/path/to/oac
* export DIB_REPOREF_os_refresh_config=refs/changes/64/145764/1
* start your devtesting
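
The variable names in the list above follow a simple pattern. A minimal
Python sketch of it, assuming the repository name becomes the suffix by
replacing every character outside [A-Za-z0-9_] with an underscore (inferred
from the os_apply_config example, not checked against the element's code).
dib_repo_overrides is a hypothetical helper, not part of diskimage-builder.

```python
import re


def dib_repo_overrides(repo_name, location=None, ref=None):
    """Return the environment overrides for one source repository.

    Assumption: "os-apply-config" maps to the suffix "os_apply_config".
    """
    suffix = re.sub(r"[^A-Za-z0-9_]", "_", repo_name)
    env = {}
    if location is not None:
        env["DIB_REPOLOCATION_" + suffix] = location
    if ref is not None:
        env["DIB_REPOREF_" + suffix] = ref
    return env


# Reproduces the two exports from the example usage.
overrides = {}
overrides.update(dib_repo_overrides("os-apply-config", location="/path/to/oac"))
overrides.update(
    dib_repo_overrides("os-refresh-config", ref="refs/changes/64/145764/1"))
```

The resulting dict could be merged into os.environ before starting devtest.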

HTH,
Greg

___
OpenStack-dev mailing list
OpenStack-dev@lists.openstack.org
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [heat][tripleo] Making diskimage-builder install from forked repo?

2015-01-08 Thread Gregory Haynes
Excerpts from Gregory Haynes's message of 2015-01-08 18:06:16 +:
 Excerpts from Steven Hardy's message of 2015-01-08 17:37:55 +:
  Hi all,
  
  I'm trying to test a fedora-software-config image with some updated
  components.  I need:
  
  - Install latest master os-apply-config (the commit I want isn't released)
  - Install os-refresh-config fork from 
  https://review.openstack.org/#/c/145764
  
  I can't even get the o-a-c from master part working:
  
  export PATH=${PWD}/dib-utils/bin:$PATH
  export ELEMENTS_PATH=tripleo-image-elements/elements:heat-templates/hot/software-config/elements
  export DIB_INSTALLTYPE_os_apply_config=source
  
  diskimage-builder/bin/disk-image-create vm fedora selinux-permissive \
os-collect-config os-refresh-config os-apply-config \
heat-config-ansible \
heat-config-cfn-init \
heat-config-docker \
heat-config-puppet \
heat-config-salt \
heat-config-script \
ntp \
-o fedora-software-config.qcow2
  
  This is what I'm doing, both tools end up as pip installed versions AFAICS,
  so I've had to resort to manually hacking the image post-DiB using
  virt-copy-in.
  
  Pretty sure there's a way to make DiB do this, but don't know what, anyone
  able to share some clues?  Do I have to hack the elements, or is there a
  better way?
  
  The docs are pretty sparse, so any help would be much appreciated! :)
  
  Thanks,
  
  Steve
  
 
 Hey Steve,
 
 source-repositories is your friend here :) (check out
 dib/elements/source-repositories/README). One potential gotcha is that
 because source-repositories is an element it really only applies to tools
 used within images (and os-apply-config is used outside the image). To
 fix this we have a shim in tripleo-incubator/scripts/pull-tools which
 emulates the functionality of source-repositories.
 
 Example usage:
 
 * checkout os-apply-config to the ref you wish to use
 * export DIB_REPOLOCATION_os_apply_config=/path/to/oac
 * export DIB_REPOREF_os_refresh_config=refs/changes/64/145764/1
 * start your devtesting

Actually, Chris's response is 100% correct. Even in the source
installtype we appear to be pip installing these tools so this will not
work.

In our CI we work around this by creating a local pypi mirror and
configuring pip to fall back to an upstream mirror. We then build
sdists for anything we want to install via git and add them to our
'overlay mirror':

Code:
http://git.openstack.org/cgit/openstack-infra/tripleo-ci/tree/toci_devtest.sh#n139
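
As a rough illustration of the overlay idea (not a copy of what toci_devtest.sh
does), the sketch below renders a pip.conf that lists a local overlay index
first and an upstream index second. index-url and extra-index-url are real pip
options; the URLs and the helper name overlay_pip_conf are hypothetical. Note
that pip considers both indexes when resolving a package rather than strictly
falling back, so the overlay sdists need versions that win that comparison.

```python
import configparser
import io


def overlay_pip_conf(overlay_url, upstream_url):
    """Render pip.conf text with an overlay index plus an upstream index."""
    conf = configparser.ConfigParser()
    conf["global"] = {
        "index-url": overlay_url,         # local mirror holding our sdists
        "extra-index-url": upstream_url,  # everything else comes from here
    }
    buf = io.StringIO()
    conf.write(buf)
    return buf.getvalue()


pip_conf_text = overlay_pip_conf(
    "http://127.0.0.1:8766/simple",      # hypothetical overlay mirror
    "http://pypi.example.org/simple",    # hypothetical upstream mirror
)
```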

Obviously, this isn't the most user-friendly approach, but it's an option.

Good luck,
Greg



Re: [openstack-dev] [TripleO] CI/CD report - 2014-12-12 - 2014-12-19

2014-12-19 Thread Gregory Haynes
Excerpts from James Polley's message of 2014-12-19 17:10:41 +:
 Two major CI outages this week
 
 2014-12-12 - 2014-12-15 - pip install MySQL-python failing on fedora
 - There was an updated mariadb-devel package, which caused pip install of
 the python bindings to fail as gcc could not build using the provided
 headers.
  - derekh put in a workaround on the 15th but we have to wait until
 upstream provides a fixed package for a permanent resolution
 
 2014-12-17 - failures in many projects on py33 tests
 - Caused by an unexpected interaction between new features in pbr and the
 way docutils handles python3 compatibility
 - derekh resolved this by tweaking the build process to not build pbr -
 just download the latest pbr from upstream

I am a bad person and forgot to update our CI outage etherpad, but we
had another outage that was caused by the setuptools PEP440 breakage:

https://review.openstack.org/#/c/141659/

We might be able to revert this now if the world is fixed



Re: [openstack-dev] [TripleO] Bug Squashing Day

2014-12-18 Thread Gregory Haynes
Excerpts from Gregory Haynes's message of 2014-12-16 19:47:54 +:
   On Wed, Dec 10, 2014 at 10:36 PM, Gregory Haynes g...@greghaynes.net
   wrote:
  
   A couple weeks ago we discussed having a bug squash day. AFAICT we all
   forgot, and we still have a huge bug backlog. I'd like to propose we
   make next Wed. (12/17, in whatever 24-hour window is Wed. in your time zone)
   a bug squashing day. Hopefully we can add this as an item to our weekly
   meeting on Tues. to help remind everyone the day before.
 
 Friendly Reminder that tomorrow (or today for some time zones) is our
 bug squash day! I hope to see you all in IRC squashing some of our
 (least) favorite bugs.
 
 Random Factoid: We currently have 299 open bugs.

Thanks to everyone who participated in our bug squash day! We are now
down to 264 open bugs (down from 299). There was also a fair number of
bugs filed today as part of our (anti) bug squashing efforts, bringing
our total bugs operated on today to 50.

Thanks, again!

Cheers,
Greg



Re: [openstack-dev] [TripleO] Bug Squashing Day

2014-12-16 Thread Gregory Haynes
  On Wed, Dec 10, 2014 at 10:36 PM, Gregory Haynes g...@greghaynes.net
  wrote:
 
  A couple weeks ago we discussed having a bug squash day. AFAICT we all
  forgot, and we still have a huge bug backlog. I'd like to propose we
  make next Wed. (12/17, in whatever 24-hour window is Wed. in your time zone)
  a bug squashing day. Hopefully we can add this as an item to our weekly
  meeting on Tues. to help remind everyone the day before.

Friendly Reminder that tomorrow (or today for some time zones) is our
bug squash day! I hope to see you all in IRC squashing some of our
(least) favorite bugs.

Random Factoid: We currently have 299 open bugs.

Cheers,
Greg



[openstack-dev] [TripleO] Bug Squashing Day

2014-12-10 Thread Gregory Haynes
A couple weeks ago we discussed having a bug squash day. AFAICT we all
forgot, and we still have a huge bug backlog. I'd like to propose we
make next Wed. (12/17, in whatever 24-hour window is Wed. in your time zone)
a bug squashing day. Hopefully we can add this as an item to our weekly
meeting on Tues. to help remind everyone the day before.

Cheers,
Greg

-- 
  Gregory Haynes
  g...@greghaynes.net



[openstack-dev] [TripleO] Using python logging.conf for openstack services

2014-12-04 Thread Gregory Haynes
Hello TripleOers,

I got a patch together to move us off of our upstart
'exec service | logger -t service' hack [1] and this got me wondering - why
aren't we using the python logging.conf supported by most OpenStack
projects [2] to write out logs to files in our desired location?

This is highly desirable for a couple reasons:

* Less complexity / more straightforward. Basically we wouldn't have to
run rsyslog or similar and have app config to talk to syslog then syslog
config to put our logs where we want. We also don't have to battle with
upstart + rsyslog vs systemd-journald differences and maintain two sets
of configuration.

* We get actual control over formatting. This is a double edged sword in
that AFAICT you *have* to control formatting if you're using a
logging.conf with a custom log handler. This means it would be a bit of
a divergence from our "use the defaults" policy but there are some
logging formats in the OpenStack docs [3] named "normal", maybe this
could be acceptable? The big win here is we can avoid issues like having
duplicate timestamps [4] (this issue still exists on Ubuntu, at least)
without having to do two sets of configuration, one for upstart +
rsyslog, one for systemd.

* This makes setting custom logging configuration a lot more feasible
for operators. As-is, if an operator wants to forward logs to an
existing central log server we don't really have a good way for them to
do this. We would also need to come up with a way to
expose the rsyslog/journald config options needed to do this to
operators. If we are using logging.conf we can just use our existing
passthrough-config system to let operators simply write out custom
logging.conf files which are already documented by OpenStack.
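
For readers who have not used this mechanism, here is a minimal sketch of a
logging.conf that sends records straight to a file with an explicit format,
loaded with the standard library's logging.config.fileConfig. The section
layout is the stock fileConfig format; the "normal" format string, the file
names and the demo.service logger are illustrative assumptions, not the
values from the OpenStack docs.

```python
import logging
import logging.config
import os
import tempfile

# Hypothetical logging.conf; @LOGFILE@ is replaced with a real path below.
LOGGING_CONF = """
[loggers]
keys = root

[handlers]
keys = file

[formatters]
keys = normal

[logger_root]
level = INFO
handlers = file

[handler_file]
class = FileHandler
level = INFO
formatter = normal
args = (@LOGFILE@, 'a')

[formatter_normal]
format = %(asctime)s %(levelname)s %(name)s: %(message)s
"""


def configure_and_log(message):
    """Write the config to disk, load it, log one record, return the log text."""
    tmpdir = tempfile.mkdtemp()
    log_path = os.path.join(tmpdir, "service.log")
    conf_path = os.path.join(tmpdir, "logging.conf")
    with open(conf_path, "w") as conf_file:
        conf_file.write(LOGGING_CONF.replace("@LOGFILE@", repr(log_path)))
    logging.config.fileConfig(conf_path, disable_existing_loggers=False)
    logging.getLogger("demo.service").info(message)
    logging.shutdown()  # flush and close the file handler before reading
    with open(log_path) as log_file:
        return log_file.read()


log_text = configure_and_log("service started")
```

No syslog daemon is involved here: the service process owns both the
destination and the format, which is the point of the first two bullets.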

Thoughts? Comments? Concerns?

Cheers,
Greg

[1] - https://review.openstack.org/#/c/138844/
[2] -
http://docs.openstack.org/admin-guide-cloud/content/section_manage-logs.html
* Note that Swift does not support this
[3] -
http://docs.openstack.org/trunk/config-reference/content/section_keystone-logging.conf.html
[4] - https://review.openstack.org/#/c/104619/

-- 
  Gregory Haynes
  g...@greghaynes.net



Re: [openstack-dev] [TripleO] Kilo Mid-Cycle Meetup Planning

2014-11-18 Thread Gregory Haynes
Excerpts from Gregory Haynes's message of 2014-10-09 20:32:26 +:
 Hello TripleO-ers,
 
 Last time around there was a lot of feedback that we should plan our
 mid-cycle meetup a lot sooner, so let's do that! I've created a (mostly
 bare) etherpad here:
 
 https://etherpad.openstack.org/p/kilo-tripleo-midcycle-meetup
 
 Note that there are currently no possible venues listed. If you are able
 to provide a possible venue (Thank you!) please reply and/or add it to
 the etherpad.
 
 I have also listed a few possible Mon-Fri meetup dates. Do not take this
 as any indication that Mon-Fri is an ideal meetup length or time of
 week, and feel free to add feedback / combinations of your own.
 Personally, I felt pretty burned out by Friday last time so maybe
 Mon-Wed is a better size?

A bit of a status update: We are in the process of getting approval for a
Seattle location to host our mid-cycle sprint. If you haven't already
(which is most everyone at this point) please add your preferences to
the list of dates at the bottom of the etherpad.

More updates to come!

Cheers,
Greg



Re: [openstack-dev] [TripleO] Strategy for testing and merging the merge.py-free templates

2014-10-14 Thread Gregory Haynes
Excerpts from Tomas Sedovic's message of 2014-10-14 08:55:30 +:
 James Slagle proposed something like this when I talked to him on IRC:
 
 1. teach devtest about the new templates, driven by a 
 OVERCLOUD_USE_MERGE_PY switch (defaulting to the merge.py-based templates)
 2. Do a CI run of the new template patches, merge them
 3. Add a (initially non-voting?) job to test the heat-only templates
 4. When we've resolved all the issues stopping us from the switch, make 
 the native templates default, deprecate the merge.py ones

This sounds good to me. I would support even making it voting from the
start if it is on-par with pass rates of our other jobs (which seems
like it should be). The main question here is whether we have the
capacity for this (derekh?) as I know we tend to run close to our
capacity limit worth of jobs.

Cheers,
Greg



[openstack-dev] [TripleO] Kilo Mid-Cycle Meetup Planning

2014-10-09 Thread Gregory Haynes
Hello TripleO-ers,

Last time around there was a lot of feedback that we should plan our
mid-cycle meetup a lot sooner, so let's do that! I've created a (mostly
bare) etherpad here:

https://etherpad.openstack.org/p/kilo-tripleo-midcycle-meetup

Note that there are currently no possible venues listed. If you are able
to provide a possible venue (Thank you!) please reply and/or add it to
the etherpad.

I have also listed a few possible Mon-Fri meetup dates. Do not take this
as any indication that Mon-Fri is an ideal meetup length or time of
week, and feel free to add feedback / combinations of your own.
Personally, I felt pretty burned out by Friday last time so maybe
Mon-Wed is a better size?

Cheers,
Greg



Re: [openstack-dev] [TripleO] a need to assert user ownership in preserved state

2014-10-06 Thread Gregory Haynes
Excerpts from Clint Byrum's message of 2014-10-02 14:15:30 +:
 Excerpts from Gregory Haynes's message of 2014-10-01 19:09:38 -0700:
  If we really want to go with this type of approach we could also just
  copy the existing /etc/passwd into the image that's being built. Then
  when users are added they should be added in after existing users.
  
 
 I do like this approach, and it isn't one I had considered. We will know
 what image we want to update from in nearly every situation. Also this
 supports another case, which is rolling back to the previous image,
 quite well.
 
 Really this is just an automated form of static UID assignment.
 

Now that I've proposed this I'd like to make an argument against copying
/etc/passwd as our long-term solution (sorry). I do think it's a not-bad
immediate fix, but long term I'd prefer actual static UID assignment out
of the solutions proposed so far.

It seems like determining how to build a new image based on the state of
a previous image is exactly the anti-pattern that read-only / ephemeral
instances aim to solve: minimizing the entropy collected over time from
doing updates. We're also adding a requirement that user databases are now
precious data which cannot ever be lost for a given image type. It's
worth noting that these are both issues that operators will encounter.

Static UID/GID assignment requires more developer work but AFAICT it
passes fewer potential issues off onto operators (which should be our
goal).
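
As a rough illustration of what static assignment could look like, here
is a minimal sketch. The STATIC_IDS table, its UID numbers and the helper
names are made up for this example and are not taken from any TripleO
element; only the groupadd/useradd options are standard.

```python
# Hypothetical static allocation table: the service names are real, but the
# numbers are invented for illustration and are not an agreed TripleO pool.
STATIC_IDS = {
    "ceilometer": 901,
    "cinder": 902,
    "swift": 903,
}


def check_unique(static_ids):
    """Raise ValueError if two services were given the same UID."""
    seen = {}
    for name, uid in static_ids.items():
        if uid in seen:
            raise ValueError(f"UID {uid} assigned to both {seen[uid]} and {name}")
        seen[uid] = name


def account_commands(service, static_ids=STATIC_IDS):
    """Return groupadd/useradd argv lists that pin a service to its static ID."""
    if service not in static_ids:
        raise KeyError(f"no static UID allocated for {service!r}")
    uid = str(static_ids[service])
    return [
        # Create the group first so the user's primary GID matches its UID.
        ["groupadd", "--system", "--gid", uid, service],
        ["useradd", "--system", "--uid", uid, "--gid", uid, service],
    ]


check_unique(STATIC_IDS)
for argv in account_commands("cinder"):
    print(" ".join(argv))
```

Because the table is the only source of UIDs, the order in which elements
create their users no longer matters.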

Cheers,
Greg



Re: [openstack-dev] [TripleO] a need to assert user ownership in preserved state

2014-10-02 Thread Gregory Haynes
Excerpts from James Polley's message of 2014-10-02 05:37:25 +:
> All three of the options presented here seem to assume that UIDs will always
> be allocated at image-build time. I think that's because most of these UIDs
> will be used to write files into the chroot at image-create time - if I could
> think of some way around that, I think we could avoid this problem more
> neatly by not assigning the UIDs until first boot
>
> But since we can't do that, would it be possible to compromise by having the
> UIDs read in from heat metadata, and using the current allocation process if
> none is provided?
>
> This should allow people who prefer to have static UIDs to have simple
> drop-in config, but also allow people who want to dynamically read from
> existing images to scrape the details and then drop them in.
>
> To aid people who have existing images, perhaps we could provide a small tool
> (if one doesn't already exist) that simply reads /etc/passwd and returns a
> JSON username:uid map, to be added into the heat local environment when
> building the next image?
>

What I was suggesting before as an alternate solution is a simpler
version of this - just copy the existing /etc/passwd and friends into
the chroot at the start of building a new image. This should cause new
users to be created in a safe way.

I like UID pinning better as a solution, though.
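
For reference, the small tool James describes could be sketched roughly
as below. This is only an illustration: the function name is invented
here, and how the resulting JSON would be fed into the heat environment
is left open.

```python
import json


def passwd_to_uid_map(passwd_text):
    """Parse passwd(5)-formatted text into a {username: uid} dictionary."""
    uid_map = {}
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # passwd(5) fields: name:password:UID:GID:GECOS:directory:shell
        fields = line.split(":")
        if len(fields) < 7:
            continue  # skip malformed entries rather than guessing
        uid_map[fields[0]] = int(fields[2])
    return uid_map


sample = (
    "root:x:0:0:root:/root:/bin/bash\n"
    "ceilometer:x:1000:1000::/var/lib/ceilometer:/bin/false\n"
)
print(json.dumps(passwd_to_uid_map(sample), indent=2, sort_keys=True))
```

Against a real image the same function would be fed the contents of that
image's /etc/passwd instead of the inline sample.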

Cheers,
Greg



Re: [openstack-dev] [TripleO] a need to assert user ownership in preserved state

2014-10-01 Thread Gregory Haynes
Excerpts from Clint Byrum's message of 2014-10-02 01:50:33 +:
> Recently we've been testing image based updates using TripleO, and we've
> run into an interesting conundrum.
>
> Currently, our image build scripts create a user per service for the
> image. We don't, at this time, assert a UID, so it could get any UID in
> the /etc/passwd database of the image.
>
> However, if we add a service that happens to have its users created
> before a previously existing service, the UIDs shift by one. When
> this new image is deployed, the username might be 'ceilometer', but
> /mnt/state/var/lib/ceilometer is now owned by 'cinder'.

Wow, nice find!

>
> Here are 3 approaches, which are not mutually exclusive to one another.
> There are likely others, and I'd be interested in hearing your ideas.
>
> * Static UIDs for all state-preserving services. Basically we'd just
>   allocate these UIDs from a static pool and those are always the UIDs
>   no matter what. This is the simplest solution, but does not help
>   anybody who is already looking to update a TripleO cloud. Also, this
>   would cause problems if TripleO wants to merge with any existing
>   system that might also want to use similar UIDs. This also provides
>   no guard against non-static UIDs storing things on the state
>   partition.

+1 for this approach for the reasons mentioned.

>
> * Fix the UIDs on image update. We can back up /etc/passwd and
>   /etc/group to /mnt/state, and on bootup we can diff the two, and any
>   UIDs that changed can be migrated. This could be very costly if the
>   swift storage UID changed, with millions of files present on the
>   system. This merge process is also not atomic and may not be
>   reversible, so it is a bit scary to automate this.

If we really want to go with this type of approach we could also just
copy the existing /etc/passwd into the image that's being built. Then
when users are added they should be added in after existing users.

I still prefer the first solution, though.
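
To make the failure mode and the copy-/etc/passwd idea concrete, here is
a toy model. It assumes users simply get the next unused UID in creation
order, which is a simplification (real useradd behaviour depends on
distro configuration), and the function names and starting UID are
invented for the example.

```python
def allocate_uids(services, existing=None, first_uid=1000):
    """Give each new service the next unused UID, in creation order.

    `existing` stands in for a user database copied from the old image.
    """
    uids = dict(existing or {})
    next_uid = max(uids.values(), default=first_uid - 1) + 1
    for name in services:
        if name not in uids:
            uids[name] = next_uid
            next_uid += 1
    return uids


def owner_name(uids, uid):
    """Reverse lookup: which username does this numeric UID resolve to?"""
    return next(name for name, value in uids.items() if value == uid)


old_image = allocate_uids(["ceilometer", "glance"])
# The state partition stores numeric owners, as written by the old image.
state_dir_uid = old_image["ceilometer"]

# New image built from scratch, with cinder's user created first.
fresh = allocate_uids(["cinder", "ceilometer", "glance"])
# New image seeded with the old image's user database.
seeded = allocate_uids(["cinder", "ceilometer", "glance"], existing=old_image)

print(owner_name(fresh, state_dir_uid))   # resolves to a different user
print(owner_name(seeded, state_dir_uid))  # still resolves to ceilometer
```

Seeding keeps the old mapping intact, but only for as long as the old
user database is available when the new image is built.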

>
> * Assert ownership when registering state path. We could have any
>   state-preserving elements register their desire for any important
>   globs for the state drive to be owned by a particular symbolic
>   username. This is just a different, more manual way to fix the UIDs
>   and carries the same cons.
>
> So, what do people think?
>



Re: [openstack-dev] [tripleo] Adding hp1 back running tripleo CI

2014-09-15 Thread Gregory Haynes
This is a total shot in the dark, but a couple of us ran into issues
with the Ubuntu Trusty kernel (I know I hit it on HP hardware) that was
causing severely degraded performance for TripleO. This was fixed by a
recently released kernel in Trusty... maybe you could be running into
this?

-Greg

> Also it's worth noting the test I have been using to compare jobs is the
> F20 overcloud job, something has happened recently causing this job to
> run slower than it used to run (possibly up to 30 minutes slower), I'll
> now try to get to the bottom of this. So the times may not end up being
> as high as referenced above but I'm assuming the relative differences
> between the two clouds won't change.
>
> thoughts?
> Derek
>



[openstack-dev] [TripleO] Propose adding StevenK to core reviewers

2014-09-09 Thread Gregory Haynes
Hello everyone!

I have been working on a meta-review of StevenK's reviews and I would
like to propose him as a new member of our core team.

As I'm sure many have noticed, he has been above our stats requirements
for several months now. More importantly, he has been reviewing a wide
breadth of topics and seems to have a strong understanding of our code
base. He also seems to be doing a great job at providing valuable
feedback and being attentive to responses on his reviews.

As such, I think he would make a great addition to our core team. Could
the other core team members please reply with their votes, whether they
agree or disagree?

Thanks!
Greg



Re: [openstack-dev] [Glance][Nova][All] requests 2.4.0 breaks glanceclient

2014-09-03 Thread Gregory Haynes
Excerpts from Kuvaja, Erno's message of 2014-09-03 12:30:08 +:
> Hi All,
>
> While investigating glanceclient gating issues we narrowed it down to
> requests 2.4.0 which was released 2014-08-29. Urllib3 seems to be raising a
> new ProtocolError which does not get caught and breaks at least glanceclient.
> The following error can be seen on the console: ProtocolError: ('Connection
> aborted.', gaierror(-2, 'Name or service not known')).
>
> Unfortunately we hit such an issue just under the freeze. Apparently this
> breaks novaclient as well and there is a change
> (https://review.openstack.org/#/c/118332/) proposed to requirements to limit
> the version 2.4.0.
>
> Are there any other projects using requirements and seeing issues with the
> latest version?

We've run into this in tripleo, specifically with os-collect-config.
Here's the upstream bug:
https://github.com/kennethreitz/requests/issues/2192

We had to pin it in our project to unwedge CI (otherwise we would be
blocked on cutting an os-collect-config release).



Re: [openstack-dev] [Ironic] (Non-)consistency of the Ironic hash ring implementation

2014-09-02 Thread Gregory Haynes
Excerpts from Nejc Saje's message of 2014-09-01 07:48:46 +:
> Hey guys,
>
> in Ceilometer we're using consistent hash rings to do workload
> partitioning[1]. We've considered generalizing your hash ring
> implementation and moving it up to oslo, but unfortunately your
> implementation is not actually consistent, which is our requirement.
>
> Since you divide your ring into a number of equal sized partitions,
> instead of hashing hosts onto the ring, when you add a new host,
> an unbounded number of keys get re-mapped to different hosts (instead of
> the 1/#nodes remapping guaranteed by hash ring). I've confirmed this
> with the test in aforementioned patch[2].

I am just getting started with the ironic hash ring code, but this seems
surprising to me. AIUI we do require some rebalancing when a conductor
is removed or added (which is normal use of a CHT) but not for every
host added. This is supported by the fact that we currently don't have a
rebalancing routine, so I would be surprised if ironic worked at all if
we required it for each host that is added.

Can anyone in Ironic with a bit more experience confirm/deny this?
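
To make the property under discussion concrete, here is a small
self-contained comparison. Neither class is Ironic's or Ceilometer's
code: HostHashedRing is a textbook consistent hash ring, ModuloPartitions
is a deliberately naive stand-in for "equal partitions dealt out to
hosts", and the host and key names are invented for the example.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a large integer, stable across runs."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class HostHashedRing:
    """Consistent hashing: each host is hashed onto the ring many times."""

    def __init__(self, hosts, replicas=64):
        self._points = sorted(
            (_hash(f"{host}-{i}"), host)
            for host in hosts
            for i in range(replicas)
        )
        self._hashes = [point for point, _ in self._points]

    def get_host(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._hashes, _hash(key)) % len(self._points)
        return self._points[idx][1]


class ModuloPartitions:
    """Naive scheme: fixed partitions dealt out to hosts round-robin."""

    def __init__(self, hosts, partitions=1024):
        self._hosts = list(hosts)
        self._partitions = partitions

    def get_host(self, key):
        partition = _hash(key) % self._partitions
        return self._hosts[partition % len(self._hosts)]


def remapped_fraction(ring_cls, hosts, new_host, keys):
    """Fraction of keys whose host changes when new_host is added."""
    before = ring_cls(hosts)
    after = ring_cls(hosts + [new_host])
    moved = sum(1 for k in keys if before.get_host(k) != after.get_host(k))
    return moved / len(keys)


keys = [f"node-{i}" for i in range(10000)]
hosts = [f"conductor-{i}" for i in range(4)]
consistent = remapped_fraction(HostHashedRing, hosts, "conductor-4", keys)
naive = remapped_fraction(ModuloPartitions, hosts, "conductor-4", keys)
print(f"host-hashed ring: {consistent:.2f}, modulo partitions: {naive:.2f}")
```

With hosts hashed onto the ring, only keys claimed by the new host move;
with the naive modulo assignment most keys change hosts. Whether Ironic's
ring behaves like either of these is exactly the open question above.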

>
> If this is good enough for your use-case, great, otherwise we can get a
> generalized hash ring implementation into oslo for use in both projects
> or we can both use an external library[3].
>
> Cheers,
> Nejc
>
> [1] https://review.openstack.org/#/c/113549/
> [2]
> https://review.openstack.org/#/c/113549/21/ceilometer/tests/test_utils.py
> [3] https://pypi.python.org/pypi/hash_ring
>

Thanks,
Greg



Re: [openstack-dev] [TripleO][Ironic] Unique way to get a registered machine?

2014-08-22 Thread Gregory Haynes
Excerpts from Steve Kowalik's message of 2014-08-22 06:32:04 +:
> At the moment, if you run register-nodes a second time with the same
> list of nodes, it will happily try and register them and then blow up
> when Ironic or Nova-bm returns an error. If operators are going to
> update their master list of nodes to add or remove machines and then run
> register-nodes again, we need a way to skip registering nodes that are
> already registered -- except that I don't really want to extract out the
> UUID of the registered nodes, because that puts an onus on the operators
> to make sure that the UUID is listed in the master list, and that would
> mean requiring manual data entry, or some way to get that data back out in
> the tool they use to manage their master list, which may not even have
> an API. Because our intent is for this to be a bridge between an operator's
> master list and a baremetal service, the intent is for this to run
> again and again when changes happen.

I don't understand why inputting the UUID into the master list requires
manual entry. Why can't we, on insertion, also insert the UUID into the
nodes list? One potential downside is that operators cannot fully regen
the nodes list when editing it but have to 'merge in' changes, but IMO
this is a good enough start and preferable to some non-straightforward
implicit behavior done by our own routine.

> This means we need a way to uniquely identify the machines in the list
> so we can tell if they are already registered.
>
> For the pxe_ssh driver, this means the set of MAC addresses must
> intersect.
>
> For other drivers, we think that the pm_address for each machine will
> be unique. Would it be possible to add some advice to that effect to
> Ironic's driver API?

Building off my previous comment - if we really want to provide an
implicit updating mechanism so operators can re-gen node lists in their
entirety then why not build it as a new processing stage? I'm thinking:

1) gen list of just_nodes.json
2) run update_nodes to update nodes.json containing new data and old
UUIDs where applicable
3) pass nodes.json into register-nodes

Your proposal of auto detecting updated nodes would live purely in the
update_nodes script but operators could elect to skip this if their node
generation tooling supports it.
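
A rough sketch of what step 2 might do, using the matching rules Steve
describes (intersecting MACs, or an identical power management address).
update_nodes does not exist yet, and the field names "mac", "pm_addr" and
"uuid" are assumptions about the nodes.json layout made for this example.

```python
def _same_machine(new, old):
    """Match on any shared MAC, else on an identical pm address."""
    new_macs = {mac.lower() for mac in new.get("mac", [])}
    old_macs = {mac.lower() for mac in old.get("mac", [])}
    if new_macs & old_macs:
        return True
    addr = new.get("pm_addr")
    return addr is not None and addr == old.get("pm_addr")


def update_nodes(just_nodes, old_nodes):
    """Return the new node list with UUIDs carried over from old_nodes."""
    merged = []
    for new in just_nodes:
        node = dict(new)  # copy so the operator's input is left untouched
        for old in old_nodes:
            if "uuid" in old and _same_machine(new, old):
                node["uuid"] = old["uuid"]
                break
        merged.append(node)
    return merged


old_nodes = [{"mac": ["52:54:00:AA:BB:01"], "pm_addr": "10.0.0.11", "uuid": "abc"}]
just_nodes = [
    {"mac": ["52:54:00:aa:bb:01"], "pm_addr": "10.0.0.11"},  # already known
    {"mac": ["52:54:00:aa:bb:02"], "pm_addr": "10.0.0.12"},  # newly added
]
for node in update_nodes(just_nodes, old_nodes):
    print(node.get("uuid", "<unregistered>"), node["pm_addr"])
```

register-nodes would then only need to skip entries that already carry a
UUID, and operators whose tooling tracks UUIDs itself can skip this stage.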

- Greg



Re: [openstack-dev] [TripleO] Future CI jobs

2014-08-20 Thread Gregory Haynes
Excerpts from Derek Higgins's message of 2014-08-20 09:06:48 +:
> On 19/08/14 20:58, Gregory Haynes wrote:
> > Excerpts from Giulio Fidente's message of 2014-08-19 12:07:53 +:
> > > One last comment, maybe a bit OT but I'm raising it here to see what
> > > other people's opinion is: how about we modify the -ha job so that at
> > > some point we actually kill one of the controllers and spawn a second
> > > user image?
> >
> > I think this is a great long term goal, but IMO performing an update
> > isn't really the type of verification we want for this kind of test. We
> > really should have some minimal tempest testing in place first so we can
> > verify that when these types of failures occur our cloud remains in a
> > functioning state.
>
> Greg, you said performing an update, did you mean killing a controller
> node?
>
> if so I agree, verifying our cloud is still in working order with
> tempest would get us more coverage than spawning a node. So once we have
> tempest in place we can add a test to kill a controller node.
>

Ah, I misread the original message a bit, but it sounds like we're all on
the same page.



Re: [openstack-dev] [TripleO] Future CI jobs

2014-08-19 Thread Gregory Haynes
Excerpts from Derek Higgins's message of 2014-08-19 10:41:11 +:
> Hi All,
>
> I'd like to firm up our plans around the ci jobs we discussed at the
> tripleo sprint. At the time we jotted down the various jobs on an
> etherpad; to better visualize the matrix of coverage I've put it into a
> spreadsheet[1]. Before we go about making these changes I'd like to go
> through a few questions to firm things up
>
> 1. Did we miss any jobs that we should have included?
> gfidente mentioned on IRC about adding blockstoragescale and
> swiftstoragescale jobs into the mix, should we add this to the matrix so
> that each is tested on at least one of the existing jobs?
>
> 2. Which jobs should run where? i.e. we should probably only aim to run
> a subset of these jobs (possibly 1 fedora and 1 ubuntu?) on non tripleo
> projects.
>
> 3. Are there any jobs here we should remove?
>
> 4. Is there anything we should add to the test matrix?
> Here I'm thinking we should consider dependent libraries i.e. have at
> least one job that uses the git version of dependent libraries rather
> than the released library
>
> 5. On selinux we had said that we would set it to enforcing on Fedora
> jobs, once it's ready we can flick the switch. This may cause us
> breakages as projects evolve but we can revisit if they are too frequent.
>
> Once anybody with an opinion has had a chance to look over the
> spreadsheet, I'll start to make changes to our existing jobs so that
> they match jobs on the spreadsheet and then add the new jobs (one at a time)
>
> Feel free to add comments to the spreadsheet or reply here.
>
> thanks,
> Derek
>
> [1]
> https://docs.google.com/spreadsheets/d/1LuK4FaG4TJFRwho7bcq6CcgY_7oaGnF-0E6kcK4QoQc/edit?usp=sharing
>

Looks great! One suggestion: due to capacity issues we had a
prioritization of these jobs and were going to walk down the list to add
new jobs as capacity became available. It might be a good idea to add a
column for this?

-Greg



Re: [openstack-dev] [TripleO] Future CI jobs

2014-08-19 Thread Gregory Haynes
Excerpts from Giulio Fidente's message of 2014-08-19 12:07:53 +:
> One last comment, maybe a bit OT but I'm raising it here to see what
> other people's opinion is: how about we modify the -ha job so that at
> some point we actually kill one of the controllers and spawn a second
> user image?

I think this is a great long term goal, but IMO performing an update
isn't really the type of verification we want for this kind of test. We
really should have some minimal tempest testing in place first so we can
verify that when these types of failures occur our cloud remains in a
functioning state.

- Greg



Re: [openstack-dev] [TripleO] devtest environment for virtual or true bare metal

2014-08-18 Thread Gregory Haynes
Excerpts from Ben Nemec's message of 2014-08-08 22:25:35 +:
> That sounds essentially correct.  Note that all 15 vms aren't used in a
> normal devtest run, but we create them all anyway because of some
> difficulties adding new environments in some situations (namely CI, I
> believe).
>
> On 08/05/2014 11:27 AM, LeslieWang wrote:
> > Hi Ben,
> > Thanks for your reply.
> > Actually I'm a little confused by the virtual environment. I guess what it
> > means is as below:
> >  - 1 Seed VM as deployment starting point.
> >  - Both undercloud and overcloud images are loaded into Glance of Seed VM.
> >  - 15 VMs are created. 1 for undercloud, 1 for overcloud controller, the
> >    remaining 13 are for overcloud compute.
> >  - 1 Host machine acts as container for all 15 VMs. It can be separated
> >    from Seed VM.
> >  - Seed VM communicates with Host machine to create 15 VMs and install
> >    corresponding images.
> > Is it correct? Or can you roughly introduce the topology of the devtest
> > virtual environment?
> > Best Regards
> > Leslie Wang
>

Yes, though I'm not sure that '15 VMs are created' is entirely correct - we
create 15 VM definitions. This does include qcow2 images but these
typically start much smaller than their full possible size on disk.
The distinction is, as Ben points out, that we do not actually deploy to
all of these VMs.

In addition to the CI issues, we specify a static number that is large
enough for full runs to more correctly emulate a real deployment -
typically you have a set number of servers to deploy on which is not
directly determined by your deployment requirements.

Cheers,
Greg



Re: [openstack-dev] [TripleO] fix poor tarball support in source-repositories

2014-08-18 Thread Gregory Haynes
Excerpts from Clint Byrum's message of 2014-08-16 14:33:20 +:
> That is a separate bug, but I think the answer to that is to use rsync
> instead of mv and globs. So this:
>
> mv $tmp/./* $destdir
>
> becomes this:
>
> rsync -a --remove-source-files $tmp/. $destdir
>

+1 on this approach. It's straightforward to explain and fairly easy to
reason about.
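
For anyone wondering what the glob version gets wrong: the shell `*` does
not match dotfiles, so hidden entries are left behind in the temporary
directory. Below is a small Python illustration of "move everything,
hidden entries included". The element itself is shell; this is not its
code, and the helper name is invented for the example.

```python
import os
import shutil
import tempfile


def move_contents(src_dir, dest_dir):
    """Move every entry in src_dir, hidden ones included, into dest_dir.

    Assumes dest_dir has no entries with the same names as those in src_dir.
    """
    os.makedirs(dest_dir, exist_ok=True)
    moved = []
    # os.listdir returns dotfiles too, unlike the shell glob `*`.
    for name in sorted(os.listdir(src_dir)):
        shutil.move(os.path.join(src_dir, name), os.path.join(dest_dir, name))
        moved.append(name)
    return moved


with tempfile.TemporaryDirectory() as work:
    tmp = os.path.join(work, "unpacked")
    dest = os.path.join(work, "dest")
    os.makedirs(os.path.join(tmp, "pkg"))
    for name in (".gitignore", "setup.py", os.path.join("pkg", "__init__.py")):
        with open(os.path.join(tmp, name), "w") as handle:
            handle.write("")
    moved_names = move_contents(tmp, dest)
    print(moved_names)
```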

I saw mention of checking the in-tarball contents to determine what to
do - I would prefer us to be explicit about the behavior to take rather
than depend on some internal structure of an external package. While we
have to depend on the structure of the tarball to a certain extent, IMO
we should minimize this if possible.


