Re: [openstack-dev] [tripleo] tripleo-upgrade pike branch

2018-01-19 Thread John Trowbridge
On Fri, Jan 19, 2018 at 10:21 AM, Wesley Hayutin 
wrote:

> Thanks Marius for sending this out and kicking off a conversation.
>
> On Tue, Jan 2, 2018 at 12:56 PM, Marius Cornea  wrote:
>
>> Hi everyone and Happy New Year!
>>
>> As the migration of the tripleo-upgrade repo to the openstack namespace
>> is now complete, I think it's time to create a Pike branch to capture
>> the current state so we can use it for Pike testing and keep the
>> master branch for Queens changes. The update/upgrade steps change
>> between versions, and the aim of branching the repo is to keep the
>> update/upgrade steps clean per branch and avoid conditionals based on
>> release. Also, tripleo-upgrade should be compatible with the different
>> tools used for deployment (tripleo-quickstart, infrared, manual
>> deployments), which use different variables for the release version,
>> so if we used conditionals we would need extra steps to normalize
>> these variables.
>>
>
> I understand the desire to create a branch to protect the work that has
> been done previously.
> The interesting thing is that you guys are proposing to use a branched
> ansible role with
> a branchless upstream project.  I want to make sure we have enough review
> so that we don't hit issues
> in the future.   Maybe that is OK, but I have at least one concern.
>
> My concern is about gating the tripleo-upgrade role and its branches.
> When tripleo-quickstart, which is branchless, is changed, will we have
> to kick off a job for each tripleo-upgrade branch?  That immediately
> doubles the load on the gates.
>

I do not think CI repos should be branched, even beyond the concern Wes
brought up about a larger gate matrix. Think about how much would need to
get backported. To start you would just have the 2 branches, but
eventually you will have 3. Likely all 3 will have slight differences in
how different pieces of the upgrade are called (otherwise why branch), so
when you need to fix something on all branches the backports have a high
potential to be non-trivial too.

Release conditionals are not perfect, but I don't think compatibility is
really a major issue. Just document how to set the release, and the
different CI tools that use your role will have to adapt to that.
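
Purely as an illustration (none of these task or variable names are from
the actual role), the conditional approach could look roughly like this,
with one normalization step mapping whatever variable each tool sets onto
a single fact:

# Hypothetical normalization: map whatever the calling tool sets
# (e.g. release or osp_release) onto one variable.
- name: Normalize the release variable
  set_fact:
    upgrade_release: "{{ release | default(osp_release | default('master')) }}"

# Hypothetical task that only runs for Pike upgrades.
- name: Run a Pike-specific upgrade step
  shell: echo "pike-only step"
  when: upgrade_release == 'pike'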

>
> It's extremely important to properly gate this role against the versions
> of TripleO and OSP.  I see very limited check and gate jobs on
> tripleo-upgrade at the moment; I have only found [1].  I think we need to
> see some external and internal jobs checking and gating this role, with
> comments posted to changes.
>
> [1] https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-containers-multinode-upgrades-pike/
>
>
>
>>
>> I wanted to bring this topic up for discussion to see if branching is
>> the proper thing to do here.
>>
>> Thanks,
>> Marius
>>
>> 
>
>


Re: [openstack-dev] [tripleo] TripleO CI end of sprint status

2017-12-15 Thread John Trowbridge
On Fri, Dec 15, 2017 at 1:15 PM, Ben Nemec  wrote:

>
>
> On 12/15/2017 10:26 AM, Emilien Macchi wrote:
>
>> On Fri, Dec 15, 2017 at 5:04 AM, Arx Cruz  wrote:
>> [...]
>>
>>> The goal of this sprint was to add to quickstart a way to reproduce
>>> upstream jobs in your personal RDO Cloud tenant, making it easy for
>>> developers to debug and reproduce their code.
>>>
>>
>> This phrase confused some non-Red-Hat OpenStack contributors on
>> #openstack-tc:
>> http://eavesdrop.openstack.org/irclogs/%23openstack-tc/%23openstack-tc.2017-12-15.log.html#t2017-12-15T15:37:59
>>
>> 2 questions came up:
>>
>> 1) Do we need RDO Cloud access to reproduce TripleO CI jobs?
>>
>> I think the answer is no. What you need though is an OpenStack cloud,
>> with the work that is being done here:
>> https://review.openstack.org/#/c/525743
>>
>> I'll let the TripleO CI team confirm that, no, you don't need RDO
>> Cloud access.
>>
>
> /me makes yet another note to try OVB against a public cloud
>
> At the moment, at least for the OVB jobs, you pretty much do need access
> to either RDO cloud or rh1/2.  It _may_ work against some public clouds,
> but I don't know of anyone trying it yet so I can't really recommend it.
>

Ah right, didn't think about the OVB part. That has nothing to do with the
reproducer script though... It is just not possible to reproduce OVB jobs
against a non-OVB cloud. The multinode jobs will work against any cloud
though.


>
>
>>
>> 2) Can anyone have access to RDO Cloud resources?
>>
>> One of the reasons for creating RDO Cloud was so that developers can
>> get resources to build OpenStack.
>> RDO community organizes something called "test days", where anyone is
>> welcome to join and test OpenStack on centos7 with RDO packages.
>> See: https://dmsimard.com/2017/11/29/come-try-a-real-openstack-queens-deployment/
>> The event is announced on RDO users mailing list:
>> https://lists.rdoproject.org/pipermail/users/2017-December/79.html
>> Other than that, I'm not sure about the process if someone needs
>> full-time access. FWIW, I never saw any rejection in the past. We
>> welcome contributors and we want to help however we can.
>>
>
> I am aware of a few people who have been rejected for RDO cloud access,
> and given the capacity constraints it is currently under I suspect there
> would need to be strong justification for new users.  I'm _not_ an RDO
> cloud admin though, so that's not an official statement of any kind.
>
> Also note that the test day is not happening on RDO cloud, but on a
> separate single node cloud (per
> https://etherpad.openstack.org/p/rdo-queens-m2-cloud).  It would not be
> particularly well suited to
> reproducing CI and presumably won't be around for long.
>
> So the story's not great right now unless you already have access to cloud
> resources.  The developer hardware requirements problem is not quite solved
> yet. :-/
>
>


Re: [openstack-dev] [tripleo] TripleO CI end of sprint status

2017-12-15 Thread John Trowbridge
On Fri, Dec 15, 2017 at 11:26 AM, Emilien Macchi  wrote:

> On Fri, Dec 15, 2017 at 5:04 AM, Arx Cruz  wrote:
> [...]
> > The goal of this sprint was to add to quickstart a way to reproduce
> > upstream jobs in your personal RDO Cloud tenant, making it easy for
> > developers to debug and reproduce their code.
>
> This phrase confused some non-Red-Hat OpenStack contributors on
> #openstack-tc:
> http://eavesdrop.openstack.org/irclogs/%23openstack-tc/%23openstack-tc.2017-12-15.log.html#t2017-12-15T15:37:59
>
> 2 questions came up:
>
> 1) Do we need RDO Cloud access to reproduce TripleO CI jobs?
>
> I think the answer is no. What you need though is an OpenStack cloud,
> with the work that is being done here:
> https://review.openstack.org/#/c/525743
>
> I'll let the TripleO CI team confirm that, no, you don't need RDO
> Cloud access.
>

Correct, the reproducer script work does not require being run specifically
on RDO Cloud. Downloading images will be
a bit slower, since the images are hosted on the same infra as RDO Cloud.
However, the script simply creates the
resources nodepool would create on any OpenStack cloud, then runs the exact
script from CI.
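
To make that concrete, a rough sketch of the idea (not the actual
reproducer; the cloud, image, flavor and network names are all made up) is
just booting a job node with Ansible's os_server module and then running
the same toci script on it that the CI job would run:

- hosts: localhost
  tasks:
    # Boot a node equivalent to what nodepool would provide for the job.
    - name: Create a CI node on the target OpenStack cloud
      os_server:
        cloud: mycloud             # clouds.yaml entry, hypothetical
        name: oooq-reproducer-node
        image: centos-7-cloud      # hypothetical image name
        flavor: m1.large           # hypothetical flavor
        key_name: my-key
        network: private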


[openstack-dev] [TripleO] Proposing Ronelle Landy for Tripleo-Quickstart/Extras/CI core

2017-11-29 Thread John Trowbridge
I would like to propose Ronelle be given +2 for the above repos. She has
been a solid contributor to tripleo-quickstart and extras almost since the
beginning. She has solid review numbers, but more importantly has always
done quality reviews. She also has been working in the very intense rover
role on the CI squad in the past CI sprint, and has done very well in that
role.


Re: [openstack-dev] [tripleo] Log collection of overcloud nodes

2017-03-29 Thread John Trowbridge


On 03/28/2017 06:57 PM, Alex Schultz wrote:
> Hey folks,
> 
> So as part of the capture-environment-status-and-logs blueprint[0],
> I'm working on adding a single command to perform status and log
> collection of the overcloud nodes via the tripleoclient that can be
> used on the undercloud.  I would like to bring up switching over to
> this as part of our CI log collection activities as many of the
> relevant logs we want are already captured via the sosreport tool.
> Additionally this is the way many operators are collecting and
> reporting their logs when submitting issues.
> 
> I think this would benefit us to switch as sosreports also capture
> additional status of the services at the time of  the report and we
> can improve sosreports via plugins to help diagnose frequent service
> related problems.  I believe we're duplicating some of the items
> already covered via sosreport in tripleo-quickstart-extras[1] and I
> think it would be beneficial to not continue to duplicate this work
> but rather use already available tooling.  For CI once we have these
> sosreport bundles, it would be fairly straightforward to only extract
> relevant information for debugging use.
> 

I think this is a great idea. Looking at the tripleo-common patch, it
seems like we will end up with a sosreport tarball in the undercloud
swift at the end?

Maybe once we have the command available in tripleoclient, we can change
the collect-logs role to instead run that command and extract logs from
the tarball so we keep the browsability of CI logs.

Also, does this mean we need to add a dependency on sosreport to the
tripleo-common packaging?
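
Not from the patch, just a sketch of what that collect-logs change might
look like once the command exists (the command name, paths and the
local_working_dir variable are placeholders):

# Hypothetical: run the (future) tripleoclient command on the undercloud.
- name: Collect overcloud status and logs via tripleoclient
  shell: >
    source /home/stack/stackrc;
    openstack overcloud support collect --output /home/stack/sosreports
  delegate_to: undercloud

# Unpack the resulting tarball on the undercloud so the logs the role
# publishes stay browsable.
- name: Extract the sosreport tarball
  unarchive:
    src: /home/stack/sosreports/sosreport-controller-0.tar.xz
    dest: "{{ local_working_dir }}/logs"
    remote_src: yes
  delegate_to: undercloud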



[openstack-dev] [TripleO] Propose Attila Darazs and Gabriele Cerami for tripleo-ci core

2017-03-15 Thread John Trowbridge
Both Attila and Gabriele have been rockstars with the work to transition
tripleo-ci to run via quickstart, and both have become extremely
knowledgeable about how tripleo-ci works during that process. They are
both very capable of providing thorough and thoughtful reviews of
tripleo-ci patches.

On top of this Attila has greatly increased the communication from the
tripleo-ci squad as the liaison, with weekly summary emails of our
meetings to this list.

- trown



Re: [openstack-dev] [tripleo] propose Alex Schultz core on tripleo-heat-templates

2017-03-13 Thread John Trowbridge


On 03/13/2017 10:30 AM, Emilien Macchi wrote:
> Hi,
> 
> Alex is already core on instack-undercloud and puppet-tripleo.

+1 it is actually a bit odd to be +2 on puppet-tripleo without being +2
on THT, since so many changes span the two repos.




[openstack-dev] [tripleo] Moving mirror server images to tarballs.o.o (Was: [rdo-list] RDO-CI Scrum Weekly Scrum Recap)

2017-02-08 Thread John Trowbridge
Moving this thread from rdo-list, but leaving it in copy as it is relevant
there too. However, it is more of a TripleO topic.

On 02/07/2017 03:56 PM, Paul Belanger wrote:
> On Tue, Feb 07, 2017 at 03:34:09PM -0500, Ronelle Landy wrote:
>> Greetings All,
>>
>> Links to the RDO-CI team scrum[1] and recording[2] from this week's
>> meeting are below.
>>
>> Highlights:
>>
>>  - TripleO CI images outage 
>>   [3] details the images missing from the mirror, causing CI failures
>>   We need a less 'reactive' approach to checking that the gates are working
>>   Action Items:
>> Attila added [4] to run check jobs and measure passing
>> IRC Bot needed to ping #oooq with failures
> It would be great if we can finally replace this private infrastructure in
> tripleo-ci with our community openstack-infra servers. This avoids the need
> for the tripleo-ci project to maintain servers.  Also, we likely have more
> HDD space available.
> 
> I recently helped kolla move their images to tarballs.o.o, we can do the
> same for tripleo-ci.
> 

That would be awesome. We want to cover stable release images as well,
and the current tripleo-ci mirror server seems inadequate for that. Are
there any docs on how to publish to tarballs.o.o? Right now, the image
upload happens in two stages. First, all periodic jobs of a particular
type (HA I think?) upload the images they built to the mirror server.
Second, when all periodic jobs pass on a particular dlrn hash the
symlink for the "known good" image is updated.

It seems like to start we could maybe update this symlink swapping logic
to instead publish the image on tarballs.o.o, and use that location as
the known good. That would still result in using the current mirror
server as a store of images under test, but it would give the gates
which consume the known good images (and users/devs who do that) much
more stability.

Maybe it would be possible to use tarballs.o.o for the testing images as
well, but then we would need some sort of cleanup and some way to mark
which images there are actually good (which seems a bit more complicated).
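
(For anyone unfamiliar with the current promotion step, it boils down to
something like the following sketch; the paths, variable and symlink names
here are made up, not taken from the real job.)

# Once every periodic job has passed for a given dlrn hash, repoint the
# "known good" symlink at that hash's image directory.
- name: Promote the tested images to known good
  file:
    src: "/var/www/html/images/{{ dlrn_hash }}"
    dest: /var/www/html/images/current-tripleo
    state: link
    force: yes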

>>  -  Upgrade CI status
>>   There is a periodic job running on internal jenkins [5] testing upgrades
>>   [6] adds the sanity checks so that upgrade jobs will be able to check 
>> services without a full overcloud deployment
>>  - Ocata Pipeline status
>>   Phase 1 passed on Friday - but still some reviews outstanding (image 
>> build, release configs)
>>   Phase 2 - jobs have been set up internally
>> All baremetal jobs are failing due to JJB mismatches (master vs 
>> rdo_trunk for ocata)
>>  - DCI team is interested in using quickstart-extras
>>   RDO-CI team to support this usage 
>>   
>>
>> [1] - https://review.rdoproject.org/etherpad/p/rdo-infra-scrum
>> [2] - https://bluejeans.com/s/ugrW6/
>> [3] - https://bugs.launchpad.net/tripleo/+bug/1661656
>> [4] - https://review.openstack.org/430277
>> [5] - <internal jenkins>/RDO/view/openstack_virtual_baremetal/job/tripleo-quickstart-newton-to-master-upgrade-ovb-qeos7-rhos-dev-ci-multi-nic/
>> [6] - https://review.openstack.org/#/c/429716/
>>
>> /R
>>
>> Ronelle Landy
>>
> 



Re: [openstack-dev] [TripleO] Proposing Sergey (Sagi) Shnaidman for core on tripleo-ci

2017-02-01 Thread John Trowbridge


On 01/30/2017 10:56 AM, Emilien Macchi wrote:
> Sagi, you're now core on TripleO CI repo. Thanks for your hard work on
> tripleo-quickstart transition, and also helping by keeping CI in good
> shape, your work is amazing!
> 
> Congrats!
> 
> Note: I couldn't add you to tripleo-ci group, but only to tripleo-core
> (Gerrit permissions), which means you can +2 everything but we trust
> you to use it only on tripleo-ci. I'll figure out the Gerrit
> permissions later.
> 

I also told Sagi that he should feel free to +2 any
tripleo-quickstart/extras patches which are aimed at transitioning
tripleo-ci to use quickstart. I didn't really think about this as an
extra permission, as any tripleo core has +2 on
tripleo-quickstart/extras. However, I seem to have surprised the other
quickstart cores with this. None were opposed to the idea, but just
wanted to make sure that it was clearly communicated that this is allowed.

If there is some objection to this, we can consider it further. FWIW,
Sagi has been consistently providing high quality critical reviews for
tripleo-quickstart/extras for some time now, and was pivotal in the
setup of the quickstart based OVB job.



Re: [openstack-dev] [TripleO] Proposing Dmitry Tantsur and Alex Schultz as instack-undercloud cores

2017-01-31 Thread John Trowbridge


On 01/31/2017 11:02 AM, Ben Nemec wrote:
> In the spirit of all the core team changes, here are a couple more I'd
> like to propose.
> 
> Dmitry has been very helpful reviewing in instack-undercloud for a long
> time so this is way overdue.  I'm also going to propose that he be able
> to +2 anything Ironic-related in TripleO since that is his primary area
> of expertise.
> 

+1 long overdue. Dmitry reviews a ton of TripleO patches.

> Alex has ramped up quickly on TripleO and has also been helping out with
> instack-undercloud quite a bit.  He's already core for the puppet
> modules, and a lot of the changes to instack-undercloud these days are
> primarily in the puppet manifest so it's not a huge stretch to add him.
> 

+1 as well.




Re: [openstack-dev] [tripleo] [tripleo-quickstart] pending reviews for composable upgrade for Ocata

2017-01-26 Thread John Trowbridge


On 01/26/2017 10:00 AM, Emilien Macchi wrote:
> On Thu, Jan 26, 2017 at 9:51 AM, John Trowbridge  wrote:
>>
>>
>> On 01/26/2017 04:03 AM, mathieu bultel wrote:
>>> Hi,
>>>
>>> I'm sending this email to the list to request reviews of the
>>> composable upgrade work I have done in TripleO quickstart. It has been
>>> pending for a while (Dec 4 for one of those 2 reviews), and I have
>>> addressed all the comments on time, rebased and so on [1].
>>> Those reviews are required, and very important, for 3 reasons:
>>> 1/ They address the following BP: [2]
>>> 2/ They would give the other Squads and DFGs a tool to start playing
>>> with composable upgrades in order to support their own components.
>>> 3/ They will be a first shot at the Tripleo-CI / Tripleo-Quickstart
>>> transition for supporting the tripleo-ci upgrade jobs that we
>>> implemented a few weeks ago.
>>>
>>> I updated the documentation (README) regarding the upgrade workflow, and
>>> the commit message explains the deployment workflow. I know it's not easy
>>> to review this stuff, and tripleo-quickstart cores probably don't give
>>> much importance to this subject. I don't think I can do much more now to
>>> make the review easier for the cores.
>>>
>>> This was one of my concerns about adding all the very specific extras
>>> roles (upgrade / baremetal / scale) in one common repo, losing
>>> flexibility and reactivity, but it's more than that...
>>>
>>> I'm planning to write a "How To" to help other DFGs/Squads work on
>>> upgrades, but since this work is still under review, I'm stuck.
>>>
>>> Thanks.
>>>
>>> [1]
>>> tripleo-quickstart repo:
>>> https://review.openstack.org/#/c/410831/
>>> tripleo-quickstart-extras repo:
>>> https://review.openstack.org/#/c/416480/
>>>
>>> [2]
>>>
>>> https://blueprints.launchpad.net/tripleo/+spec/tripleo-composable-upgrade-job
>>>
>>
>> We discussed this a bit this morning on #tripleo, and the consensus
>> there was that we should be focusing upgrade CI efforts for the end of
>> Ocata cycle on the existing tripleo-ci multinode upgrades job. This is
>> due to priority constraints on both sides.
>>
>> On the quickstart side, we really need to focus on having good
>> replacements for all of the basic jobs which are solid (OVB, multinode,
>> scenarios), so we can switch over in early Pike.
>>
>> On the upgrades side, we really need to focus on having coverage for as
>> many services to upgrade as possible.
> 
> I'm currently working on this front, by implementing the
> scenarioXXX-upgrade jobs (with multinode, but not oooq yet):
> https://review.openstack.org/#/c/425727/
> 
> Any feedback on the review is welcome, I hope it's aligned with our plans.
> 

I think this approach is great and should make transitioning it to run
via quickstart simple once we have the scenario jobs in quickstart.

>> As such, I think we should use the existing job for upgrades, and port
>> it to quickstart after we have switched over the basic jobs in early Pike.
>>
>> One note about making it easier to get patches reviewed. As a group, I
>> think we have been reviewing quickstart/extras patches at a very good
>> pace. However, adding a very large feature with no CI covering it makes
>> me personally totally uninterested in reviewing it. Not only does it
>> require me to follow some manual instructions just to see that it
>> actually works, but
>> there is nothing preventing it from being completely broken within days
>> of merging the feature.
>>
>> Another thing we should probably document for Tripleo CI somewhere is
>> that we should be trying to create multinode based CI for anything that
>> does not require nova/ironic interactions. Upgrades are in this category.
>>
> 
> 
> 



Re: [openstack-dev] [tripleo] [tripleo-quickstart] pending reviews for composable upgrade for Ocata

2017-01-26 Thread John Trowbridge


On 01/26/2017 04:03 AM, mathieu bultel wrote:
> Hi,
> 
> I'm sending this email to the list to request reviews of the
> composable upgrade work I have done in TripleO quickstart. It has been
> pending for a while (Dec 4 for one of those 2 reviews), and I have
> addressed all the comments on time, rebased and so on [1].
> Those reviews are required, and very important, for 3 reasons:
> 1/ They address the following BP: [2]
> 2/ They would give the other Squads and DFGs a tool to start playing
> with composable upgrades in order to support their own components.
> 3/ They will be a first shot at the Tripleo-CI / Tripleo-Quickstart
> transition for supporting the tripleo-ci upgrade jobs that we
> implemented a few weeks ago.
> 
> I updated the documentation (README) regarding the upgrade workflow, and
> the commit message explains the deployment workflow. I know it's not easy
> to review this stuff, and tripleo-quickstart cores probably don't give
> much importance to this subject. I don't think I can do much more now to
> make the review easier for the cores.
> 
> This was one of my concerns about adding all the very specific extras
> roles (upgrade / baremetal / scale) in one common repo, losing
> flexibility and reactivity, but it's more than that...
> 
> I'm planning to write a "How To" to help other DFGs/Squads work on
> upgrades, but since this work is still under review, I'm stuck.
> 
> Thanks.
> 
> [1]
> tripleo-quickstart repo:
> https://review.openstack.org/#/c/410831/
> tripleo-quickstart-extras repo:
> https://review.openstack.org/#/c/416480/
> 
> [2]
> 
> https://blueprints.launchpad.net/tripleo/+spec/tripleo-composable-upgrade-job
> 

We discussed this a bit this morning on #tripleo, and the consensus
there was that we should be focusing upgrade CI efforts for the end of
Ocata cycle on the existing tripleo-ci multinode upgrades job. This is
due to priority constraints on both sides.

On the quickstart side, we really need to focus on having good
replacements for all of the basic jobs which are solid (OVB, multinode,
scenarios), so we can switch over in early Pike.

On the upgrades side, we really need to focus on having coverage for as
many services to upgrade as possible.

As such, I think we should use the existing job for upgrades, and port
it to quickstart after we have switched over the basic jobs in early Pike.

One note about making it easier to get patches reviewed. As a group, I
think we have been reviewing quickstart/extras patches at a very good
pace. However, adding a very large feature with no CI covering it, makes
me personally totally uninterested to review. Not only does it require
me to follow some manual instructions just to see it actually works, but
there is nothing preventing it from being completely broken within days
of merging the feature.

Another thing we should probably document for Tripleo CI somewhere is
that we should be trying to create multinode based CI for anything that
does not require nova/ironic interactions. Upgrades are in this category.



Re: [openstack-dev] [TripleO] Proposing Sergey (Sagi) Shnaidman for core on tripleo-ci

2017-01-24 Thread John Trowbridge


On 01/24/2017 12:03 PM, Juan Antonio Osorio wrote:
> Sagi (sshnaidm on IRC) has done significant work in TripleO CI (both
> on the current CI solution and in getting tripleo-quickstart jobs for
> it); So I would like to propose him as part of the TripleO CI core team.
> 
> I think he'll make a great addition to the team and will help move CI
> issues forward quicker.
> 

+1



Re: [openstack-dev] [tripleo] Atlanta PTG

2017-01-23 Thread John Trowbridge


On 01/21/2017 05:37 AM, Michele Baldessari wrote:
> Hi Emilien,
> 
> while not a design session per se, I would love to propose a short slot
> for TripleO CI Q&A, if we have some time left. In short, I'd like to be
> more useful around CI failures, but I lack the understanding of a few
> aspects of our current CI (promotion, when do images get built, etc.),
> that would benefit quite a bit from a short session where we have a few
> CI folks in the room that could answer questions or give some tips.
> I know of quite a few other people that are in the same boat and maybe
> this will help a bit our current issue where only a few folks always
> chase CI issues.
> 
> If there is consensus (and some CI folks willing to attend ;) and time
> for this, I'll be happy to organize this and prepare a bunch of
> questions ideas beforehand.
> 

Great idea. We have a room for three days, so it is not like summit,
where time is really limited.

> Thoughts?
> Michele
> 
> On Wed, Jan 04, 2017 at 07:26:52AM -0500, Emilien Macchi wrote:
>> I would like to bring this topic up on your inbox, so we can continue
>> to make progress on the agenda. Feel free to follow existing examples
>> in the etherpad and propose a design dession.
>>
>> Thanks,
>>
>> On Wed, Dec 21, 2016 at 9:06 AM, Emilien Macchi  wrote:
>>> General infos about PTG: https://www.openstack.org/ptg/
>>>
>>> Some useful informations about PTG/TripleO:
>>>
>>> * When? We have a room between Wednesday and Friday included.
>>> Important sessions will happen on Wednesday and Thursday. We'll
>>> probably have sessions on Friday, but it might be more hands-on and
>>> hackfest, where people can enjoy the day to work together.
>>>
>>> * Let's start to brainstorm our topics:
>>> https://etherpad.openstack.org/p/tripleo-ptg-pike
>>>   Feel free to add any topic, as soon as you can. We need to know asap
>>> which sessions will be share with other projects (eg: tripleo/mistral,
>>> tripleo/ironic, tripleo/heat, etc).
>>>
>>>
>>> Please let us know any question or feedback,
>>> Looking forward to seeing you there!
>>> --
>>> Emilien Macchi
>>
>>
>>
>> -- 
>> Emilien Macchi
>>
> 



Re: [openstack-dev] [tripleo] short term roadmap (actions required)

2017-01-18 Thread John Trowbridge


On 01/17/2017 04:36 PM, Emilien Macchi wrote:
> I'm trying to draw up a list of things that are important to know so we
> can successfully deliver the Ocata release; please take some time to read
> and comment if needed.
> 
> == Triaging Ocata & Pike bugs
> 
> As we discussed in our weekly meeting, we decided to:
> 
> * move ocata-3 low/medium unassigned bugs to pike-1
> * move ocata-3 high/critical unassigned bugs to ocata-rc1
> * keep ocata-3 In Progress bugs to ocata-3 until next week and move
> them to ocata-rc1 if not fixed on time.
> 
> Which means, if you plan to file a new bug:
> 
> * low/medium: target it for pike-1
> * high/critical: target it for ocata-rc1
> 
> We still have 66 bugs In Progress for ocata-3. The top priority for
> this week is to make progress on those bugs and close them in time for
> the final Ocata release.
> 
> 
> == Releasing tripleoclient next week
> 
> If you're working on tripleoclient, you might want to help in fixing
> the bugs still targeted for Ocata:
> https://goo.gl/R2hO4Z
> We'll release python-tripleoclient final ocata by next week.
> 
> 
> == Freezing features next week
> 
> If you're working on a feature in TripleO which is part of a blueprint
> targeted for ocata-3, keep in mind you have until next week to get it
> merged.
> After January 27th, We will block (by a -2 in Gerrit) any patch that
> adds a feature in master until we release Ocata and branch
> stable/ocata.
> Some exceptions can be made, but they have to be requested on
> openstack-dev and team + PTL will decide if whether or not we accept
> it.
> If your blueprint is not High or Critical, there is little chance we will
> accept it.
> 
> 
> == Preparing Pike together
> 
> In case you missed it, we're preparing Pike sessions for next PTG:
> https://etherpad.openstack.org/p/tripleo-ptg-pike
> Feel free to propose a session and announce/discuss it on the
> openstack-dev mailing-list.
> 
> 
> == CI freeze
> 
> From January 27th until the final Ocata release, we will freeze any change
> in our CI, except critical fixes, but they need to be reported in
> Launchpad and team + PTL need to know (ML openstack-dev).
> 

I think this is a really good idea. Could we have one exception for
changes to only the tripleo-quickstart toci scripts and the
scripts/quickstart directory in tripleo-ci? Those files are only
relevant to the quickstart jobs in the experimental queue, and we want
to continue making progress stabilizing them in the last weeks of Ocata.

> 
> If there is any question or feedback, please don't hesitate to use this 
> thread.
> 
> Thanks and let's make Ocata our best release ever ;-)
> 



Re: [openstack-dev] [tripleo][ci] Stop using "current" DLRN repo

2017-01-13 Thread John Trowbridge


On 01/13/2017 09:25 AM, Emilien Macchi wrote:
> On Fri, Jan 13, 2017 at 9:09 AM, Gabriele Cerami  wrote:
>>
>> Hi,
>>
>> following a suggestion from Alan Pevec I'm proposing to stop using
>> the "current" repo from DLRN and start using "consistent" instead.
>> The main difference should only be that "consistent" is not affected by
>> packages in ftbfs, so we're testing with a bit more stability.
> 
> We might want to exclude tripleo projects because we always need the
> latest on this case. Otherwise, lgtm.

The "current" repo being replaced is actually already only used for the
tripleo gated projects (and stable branches). We definitely need
"current" in the tripleo gated project case, and because we don't have a
special repo for only tripleo projects in the stable case, we need it
there too.
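
(As an aside, the switch itself is just a change of baseurl; an
illustrative sketch, with the URL following the usual RDO trunk layout
rather than being taken from the patch:)

# Hypothetical task showing the only real difference between the two repos.
- name: Configure the delorean repo from the consistent symlink
  yum_repository:
    name: delorean
    description: RDO trunk packages (consistent)
    baseurl: https://trunk.rdoproject.org/centos7-master/consistent/
    gpgcheck: no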

> 
>> This is the proposal
>> https://review.openstack.org/419455
>>
>> please comment especially if I did not understand correctly what the
>> difference between "current" and "consistent" is.
>>
>>
>> Thanks.
>>
> 
> 
> 



Re: [openstack-dev] [tripleo] quick reminder on review policy

2017-01-04 Thread John Trowbridge


On 01/03/2017 07:30 PM, Emilien Macchi wrote:
> I've noticed some TripleO core reviewers self approving patch without
> respecting our review policy, specially in tripleo-quickstart.
> 

This is slightly misleading. To me, self-approving is +2/+A on your own
patch.

What has been going on in tripleo-quickstart is different though. We have
allowed cores to +A a patch from another core with only a single +2.
That is against the policies laid out in tripleo-specs[1,2]. However,
following those policies will effectively make it impossible for cores
of tripleo-quickstart to get their own work merged in anything
approaching a reasonable amount of time.

This is because there are currently only 3 cores reviewing
tripleo-quickstart with any regularity. So the policies below mean that
any patch submitted by a core must be reviewed by every other core. I
think it has actually been a full month since we even had all 3 cores
working at the same time due to holidays and PTO (currently we only have 2).

If we want to apply the policies below to quickstart, I get it... they
are, after all, the agreed-upon policies. I think this puts completing the
move of CI to quickstart this cycle at very high risk though, which also
means getting container CI is at risk.

[1]
http://specs.openstack.org/openstack/tripleo-specs/specs/policy/expedited-approvals.html#single-2-approvals
[2]
http://specs.openstack.org/openstack/tripleo-specs/specs/policy/expedited-approvals.html#self-approval



Re: [openstack-dev] [tripleo] [tripleo-quickstart] Tripleo-Quickstart root privileges

2016-12-01 Thread John Trowbridge


On 11/22/2016 03:49 PM, Gabriele Cerami wrote:
> On 22 Nov, Yolanda Robla Mota wrote:
>> Hi all
>> I wanted to start a thread about the current privileges model for TripleO 
>> quickstart.
>> Currently there is the assumption that quickstart does not need root 
>> privileges after the environment and provision roles. However, this 
>> assumption cannot be valid for several use cases.
>> In particular, I need to create working directories outside the home
>> directory of the user running quickstart. This can be useful in
>> environments where the /home partition is small and cannot be modified (so
>> there is not enough disk space to host TripleO quickstart artifacts there).
>> This is the change I'm working on for that use case:
>> https://review.openstack.org/#/c/384892
> 
> Hi,
> 
> I may suggest a compromise that will allow us not to break the model and
> to move forward with your patch.
> If you can make it work, you can try to move the working_dir creation
> tasks to the provision role.
> You already moved working_dir default var to common role, so it should
> work.
> 
> Any other thoughts ?
> Thanks for raising the question.
> 

Sorry for the slow response, and thanks for raising this question. I
added Lars to the thread as well, because he was the original architect
of the current privilege model in quickstart.

There were two reasons (I can think of anyways) for the current model:

1. Doing tasks as root on the virthost makes clean up trickier. With the
current model, deleting the non-root quickstart user cleans up almost
everything. By keeping all of the root privilege tasks in the provision
and environment roles, it is much easier to reason about the few things
that do not get cleaned up when deleting the quickstart user. If we
start allowing root privilege tasks in the libvirt role, this will be
harder.

2. Theoretically (I have not heard of anyone actually doing this),
someone could set up a virthost for use by quickstart, and then
hand it over to someone with only non-root privileges. While I do not
know of anyone using quickstart this way today, it is a compelling use
case for setting up training environments using quickstart. An
admin/trainer could set up a bunch of virthosts for a training and the
students would only have non-root access to the machines.

I think at the very least, we want to maintain the default running of
quickstart with the current model. If some feature absolutely needs to
break this model, it needs to be guarded by a variable defaulted to false.
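
For illustration only (the flag name and path are hypothetical, not from
any existing patch), such a guarded task would look something like:

# Root-needing task, guarded by a flag that defaults to false
# (e.g. "enable_root_tasks: false" in the role's defaults).
- name: Create a working directory outside the user's home
  become: true
  file:
    path: /opt/oooq-workdir
    state: directory
    owner: "{{ ansible_user_id }}"
  when: enable_root_tasks | default(false) | bool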

In the specific case of https://review.openstack.org/#/c/384892 I do
think we could do the directory creation tasks earlier, and then we do
not need to break the model at all to support your use case.

There is also https://review.openstack.org/#/c/399704/ that is running
into the same thing, but again, I think we could probably move all of
the root stuff to earlier roles (though I have yet to thoroughly review
that, so I am less sure).

I have also been working with some folks from the OPNFV Apex (which is
tripleo based) team to port their CI to quickstart. I have not seen
patches yet, but it does seem some of the networking requirements may
require us to run the virtual machines under qemu:///system, which will
break the current privilege model completely. Their case is why we may
need to make the model optional.

@Yolanda wdyt about the suggestion to move directory creation to an
earlier role in your patch? Also, thanks for all your work on quickstart!

-trown



Re: [openstack-dev] [tripleo] CI scenarios design - how to add more services

2016-11-28 Thread John Trowbridge


On 11/22/2016 09:02 PM, Emilien Macchi wrote:
> == Context
> 
> In Newton we added new multinode jobs called "scenarios".
> The challenge we tried to solve was "how to test the maximum number of
> services without overloading the nodes that run tests".
> 
> Each scenarios deploys a set of services, which allows us to
> horizontally scale the number of scenarios to increase the service
> testing coverage.
> See the result here:
> https://github.com/openstack-infra/tripleo-ci#service-testing-matrix
> 
> To implement this model, we took example of Puppet OpenStack CI:
> https://github.com/openstack/puppet-openstack-integration#description
> We even tried to keep consistent the services/scenarios relations, so
> it's consistent and easier to maintain.
> 
> Everything was fine until we had to add new services during the Ocata cycle.
> Because tripleo-ci repository is not branched, adding Barbican service
> in the TripleO environment for scenario002 would break Newton CI jobs.
> During my vacations, the team created a new scenario, scenario004,
> that deploys Barbican and that is only run for Ocata jobs.
> I don't think we should proceed this way, and let me explain why.
> 
> == Problem
> 
> How to scale the number of services that we test without increasing
> the number of scenarios and therefore the complexity of maintaining
> them on long-term.
> 
> 
> == Solutions
> 
> The list is not exhaustive, feel free to add more.
> 
> 1) Re-use experience from Puppet OpenStack CI and have environments
> that are in a branched repository.
> In Puppet OpenStack CI, the repository that deploys environments
> (puppet-openstack-integration) is branched. So if puppet-barbican is
> ready to be tested in Ocata, we'll patch
> puppet-openstack-integration/master to start testing it and it won't
> break stable jobs.
> Like this, we were successfully able to maintain a fair number of
> scenarios and keep our coverage increasing over each cycle.
> 
> I see 2 sub-options here:
> 
> a) Move CI environments and pingtest into
> tripleo-heat-templates/environments/ci/(scenarios|pingtest). This repo
> is branched and we could add a README to explain these files are used
> in CI and we don't guarantee they would work outside TripleO CI tools.

I also like this solution the best. It has the added benefit of being
able to review the CI for a new service in the same patch (or patch
chain) that adds the new service. We already have the low-memory
environment in tht which, while not CI specific, is definitely a CI
requirement.

> b) Branch tripleo-ci repository. Personally I don't like this solution
> because a lot of patches in this repo are not related to OpenStack
> versions, which means we would need to backport most of the things
> from master.
> 
> 2) Introduce branch-based scenario tests -
> https://review.openstack.org/#/c/396008/
> It duplicates a lot of code and it's imho not really effective, though
> this solution would correctly works.
> 
> 3) Introduce a new scenario each time we have new services (like we
> did with scenario004).
> By adding new scenarios at each release because we test new services
> is imho the wrong choice because:
> a) it adds complexity in how we're going to maintain these scenarios.
> b) it consumes more of the CI resources that we need when some patches
> have to run all scenario jobs.
> 
> 
> So I gave my opinion on the solutions, discussion is now open and my
> hope is that we find a consensus soon, so we can make progress in our
> testing coverage.
> Thanks,
> 



Re: [openstack-dev] [TripleO] Proposing Julie Pichon for tripleo core

2016-11-22 Thread John Trowbridge


On 11/22/2016 12:01 PM, Dougal Matthews wrote:
> Hi all,
> 
> I would like to propose we add Julie (jpich) to the TripleO core team for
> python-tripleoclient and tripleo-common. This nomination is based partially
> on review stats[1] and also my experience with her reviews and
> contributions.
> 
> Julie has consistently provided thoughtful and detailed reviews since the
> start of the Newton cycle. She has made a number of contributions which
> improve the CLI and has been extremely helpful with other tasks that don't
> often get enough attention (backports, bug triaging/reporting and improving
> our processes[2]).
> 
> I think she will be a valuable addition to the review team.
> 
> Dougal
> 
> 
> [1]: http://stackalytics.com/report/contribution/tripleo-group/90
> [2]: https://review.openstack.org/#/c/352852/
> 
> 
> 
> 

+1



Re: [openstack-dev] [tripleo] proposing Michele Baldessari part of core team

2016-11-07 Thread John Trowbridge


On 11/04/2016 01:40 PM, Emilien Macchi wrote:
> Michele Baldessari (bandini on IRC) has consistently demonstrated high
> levels of contribution in TripleO projects, specifically in the High
> Availability area, where he's a guru for us (I still don't understand
> how pacemaker works, but hopefully he does).
> 
> He has done incredible work on composable services and also on
> improving our HA configuration by following reference architectures.
> Always here during meetings, and on #tripleo to give support to our
> team, he's a great team player and we are lucky to have him onboard.
> I believe he would be a great core reviewer on HA-related work and we
> expect his review stats to continue improving as his scope broadens
> over time.
> 
> As usual, feedback is welcome and please vote for this proposal!
> 
> Thanks,
> 

+1 I thought he was already core :P.



Re: [openstack-dev] [tripleo][ironic] introspection and CI

2016-10-18 Thread John Trowbridge


On 10/18/2016 07:20 AM, Wesley Hayutin wrote:
> See my response inline.
> 
> On Tue, Oct 18, 2016 at 6:07 AM, Dmitry Tantsur  wrote:
> 
>> On 10/17/2016 11:10 PM, Wesley Hayutin wrote:
>>
>>> Greetings,
>>>
>>> The RDO CI team is considering adding retries to our calls to
>>> introspection
>>> again [1].
>>> This is very handy for bare metal environments where retries may be
>>> needed due
>>> to random chaos in the environment itself.
>>>
>>> We're trying to balance two things here..
>>> 1. reduce the number of false negatives in CI
>>> 2. try not to overstep what CI should do vs. what the product should do.
>>>
>>> We would like to hear your comments if you think this is acceptable for
>>> CI or if
>>> this may be overstepping.
>>>
>>> Thank you
>>>
>>>
>>> [1] http://paste.openstack.org/show/586035/
>>>
>>
>> Hi!
>>
>> I probably lack some context of what exactly problems you face. I don't
>> have any disagreement with retrying it, just want to make sure we're not
>> missing actual bugs.
>>
> 
> I agree, we have to be careful not to paper over bugs while we try to
> overcome typical environmental delays that come w/ booting, rebooting $x
> number of random hardware nodes.
> To make this a little more crystal clear, what I'm trying to determine is where
> progressive delays and retries should be injected into the workflow of
> deploying an overcloud.
> Should we add options in the product itself that allow for $x number of
> retries w/ a configurable set of delays for introspection? [2]  Is the
> expectation that this works the first time, every time?
> Are we overstepping what CI should do by implementing [1].

IMO, yes, we are overstepping what CI should be doing with [1], mostly
because we are providing a better UX in CI than an actual user will get.
> 
> Additionally would it be appropriate to implement [1], while [2] is
> developed for the next release and is it OK to use [1] with older releases?
> 

However, I think it is ok to implement [1] in CI, if the following are true:

1) There is an in progress bug to make this UX better for non-CI user.
2) For older releases if said bug is deemed inappropriate for backport.

> Thanks for your time and responses.
> 
> 
> [1] http://paste.openstack.org/show/586035/
> [2]
> https://github.com/openstack/tripleo-common/blob/master/workbooks/baremetal.yaml#L169
> 



Re: [openstack-dev] [tripleo] [tripleo-ci] tripleo-quickstart-extras and tripleo third party ci

2016-10-17 Thread John Trowbridge
Added tripleo tag as this missed my email filter, and therefore may have
missed others as well.

On 10/14/2016 02:01 PM, Wesley Hayutin wrote:
> Greetings,
> 
> Hey everyone, I wanted to post a link to a blueprint I'm interested in
> discussing at summit with everyone.  Please share your thoughts and
> comments in the spec / gerrit review.
> 
> https://blueprints.launchpad.net/tripleo/+spec/tripleo-third-party-ci-quickstart
> 
> Thank you!
> 
> 
> 
> 



Re: [openstack-dev] [tripleo] let's talk (development) environment deployment tooling and workflows

2016-09-21 Thread John Trowbridge


On 09/19/2016 01:21 PM, Steven Hardy wrote:
> Hi Alex,
> 
> Firstly, thanks for this detailed feedback - it's very helpful to have
> someone with a fresh perspective look at the day-1 experience for TripleO,
> and while some of what follows are "know issues", it's great to get some
> perspective on them, as well as ideas re how we might improve things.
> 
> On Thu, Sep 15, 2016 at 09:09:24AM -0600, Alex Schultz wrote:
>> Hi all,
>>
>> I've recently started looking at the various methods for deploying and
>> developing tripleo.  What I would like to bring up is the current
>> combination of the tooling for managing the VM instances and the
>> actual deployment method to launch the undercloud/overcloud
>> installation.  While running through the various methods and reading
>> up on the documentation, I'm concerned that they are not currently
>> flexible enough for a developer (or operator for that matter) to be
>> able to setup the various environment configurations for testing
>> deployments and doing development.  Additionally I ran into issues
>> just trying to get them working at all so this probably doesn't help when
>> trying to attract new contributors as well.  The focus of this email
>> and of my experience seems to relate with workflow-simplification
>> spec[0].  I would like to share my experiences with the various
>> tooling available and raise some ideas.
>>
>> Example Situation:
>>
>> For example, I have a laptop with 16G of RAM and an SSD and I'd like
>> to get started with tripleo.  How can I deploy tripleo?
> 
> So, this is probably problem #1, because while I have managed to deploy a
> minimal TripleO environment on a laptop with 16G of RAM, I think it's
> pretty widely known that it's not really enough (certainly with our default
> configuration, which has unfortunately grown over time as more and more
> things got integrated).
> 
> I see two options here:
> 
> 1. Document the reality (which is really you need a physical machine with
> at least 32G RAM unless you're prepared to deal with swapping).
> 
> 2. Look at providing a "TripleO lite" install option, which disables some
> services (both on the undercloud and default overcloud install).
> 
> Either of these are defintely possible, but (2) seems like the best
> long-term solution (although it probably means another CI job).
> 
>> Tools:
>>
>> instack:
>>
>> I started with the tripleo docs[1] that reference using the instack
>> tools for virtual environment creation while deploying tripleo.   The
>> docs say you need at least 12G of RAM[2].  The docs lie (step 7[3]).
>> So after basically shutting everything down and letting it deploy with
>> all my RAM, the deployment fails because the undercloud runs out of
>> RAM and OOM killer kills off heat.  This was not because I had reduced
>> the amount of ram for the undercloud node or anything.  It was because
>> by default, 6GB of RAM with no swap is configured for the undercloud
>> (not sure if this is a bug?).  So I added a swap file to the
>> undercloud and continued. My next adventure was having the overcloud
>> deployment fail because lack of memory as puppet fails trying to spawn
>> a process and gets denied.  The instack method does not configure swap
>> for the VMs that are deployed and the deployment did not work with 5GB
>> RAM for each node.  So for a full 16GB I was unable to follow the
>> documentation and use instack to successfully deploy.  At this point I
>> switched over to trying to use tripleo-quickstart.  Eventually I was
>> able to figure out a configuration with instack to get it to deploy
>> when I figured out how to enable swap for the overcloud deployment.
> 
> Yeah, so this definitely exposes that we need to update the docs, and also
> provide an easy install-time option to enable swap on all-the-things for
> memory constrained environments.
> 
>> tripleo-quickstart:
>>
>> The next thing I attempted to use was the tripleo-quickstart[4].
>> Following the directions I attempted to deploy against my localhost.
>> It turns out that doesn't work as expected since ansible likes to do
>> magic when dealing with localhost[5].  Ultimately I was unable to get
>> it working against my laptop locally because I ran into some libvirt
>> issues.  But I was able to get it to work when I pointed it at a
>> separate machine.  It should be noted that tripleo-quickstart creates
>> an undercloud with swap which was nice because then it actually works,
>> but is an inconsistent experience depending on which tool you used for
>> your deployment.
> 
> Yeah, so while a lot of folks have good luck with tripleo-quickstart, it
> has the disadvantage of not currently being the tool used in upstream
> TripleO CI (which folks have looked at fixing, but it's not yet happened).
> 
> The original plan was for tripleo-quickstart to completely replace the
> instack-virt-setup workflow:
> 
> https://blueprints.launchpad.net/tripleo/+spec/tripleo-quickstart
> 
> But for a variety of reasons, we never quite

Re: [openstack-dev] [TripleO] TripleO Core nominations

2016-09-16 Thread John Trowbridge


On 09/15/2016 05:20 AM, Steven Hardy wrote:
> Hi all,
> 
> As we work to finish the last remaining tasks for Newton, it's a good time
> to look back over the cycle, and recognize the excellent work done by
> several new contributors.
> 
> We've seen a different contributor pattern develop recently, where many
> folks are subsystem experts and mostly focus on a particular project or
> area of functionality.  I think this is a good thing, and it's hopefully
> going to allow our community to scale more effectively over time (and it
> fits pretty nicely with our new composable/modular architecture).
> 
> We do still need folks who can review with the entire TripleO architecture
> in mind, but I'm very confident folks will start out as subsystem experts
> and over time broaden their area of experience to encompass more of
> the TripleO projects (we're already starting to see this IMO).
> 
> We've had some discussion in the past[1] about strictly defining subteams,
> vs just adding folks to tripleo-core and expecting good judgement to be
> used (e.g only approve/+2 stuff you're familiar with - and note that it's
> totally fine for a core reviewer to continue to +1 things if the patch
> looks OK but is outside their area of experience).
> 
> So, I'm in favor of continuing that pattern and just welcoming some of our
> subsystem expert friends to tripleo-core, let me know if folks feel
> strongly otherwise :)
> 
> The nominations, are based partly on the stats[2] and partly on my own
> experience looking at reviews, patches and IRC discussion with these folks
> - I've included details of the subsystems I expect these folks to focus
> their +2A power on (at least initially):
> 
> 1. Brent Eagles
> 
> Brent has been doing some excellent work mostly related to Neutron this
> cycle - his reviews have been increasingly detailed, and show a solid
> understanding of our composable services architecture.  He's also provided
> a lot of valuable feedback on specs such as dpdk and sr-iov.  I propose
> Brent continues this excellent Neutron focussed work, while also expanding
> his review focus such as the good feedback he's been providing on new
> Mistral actions in tripleo-common for custom-roles.
> 
> 2. Pradeep Kilambi
> 
> Pradeep has done a large amount of pretty complex work around Ceilometer
> and Aodh over the last two cycles - he's dealt with some pretty tough
> challenges around upgrades and has consistently provided good review
> feedback and solid analysis via discussion on IRC.  I propose Prad
> continues this excellent Ceilometer/Aodh focussed work, while also
> expanding review focus aiming to cover more of t-h-t and other repos over
> time.
> 
> 3. Carlos Camacho
> 
> Carlos has been mostly focussed on composability, and has done a great job
> of working through the initial architecture implementation, including
> writing some very detailed initial docs[3] to help folks make the transition
> to the new architecture.  I'd suggest that Carlos looks to maintain this
> focus on composable services, while also building depth of reviews in other
> repos.
> 
> 4. Ryan Brady
> 
> Ryan has been one of the main contributors implementing the new Mistral
> based API in tripleo-common.  His reviews, patches and IRC discussion have
> consistently demonstrated that he's an expert on the mistral
> actions/workflows and I think it makes sense for him to help with review
> velocity in this area, and also look to help with those subsystems
> interacting with the API such as tripleoclient.
> 
> 5. Dan Sneddon
> 
> For many cycles, Dan has been driving direction around our network
> architecture, and he's been consistently doing a relatively small number of
> very high-quality and insightful reviews on both os-net-config and the
> network templates for tripleo-heat-templates.  I'd suggest Dan continues
> this focus, and he's indicated he may have more bandwidth to help with
> reviews around networking in future.
> 
> Please can I get feedback from existing core reviewers - you're free to +1
> these nominations (or abstain), but any -1 will veto the process.  I'll
> wait one week, and if we have consensus add the above folks to
> tripleo-core.
> 
> Finally, there are quite a few folks doing great work that are not on this
> list, but seem to be well on track towards core status.  Some of those
> folks I've already reached out to, but if you're not nominated now, please
> don't be disheartened, and feel free to chat to me on IRC about it.  Also
> note the following:
> 
>  - We need folks to regularly show up, establishing a long-term pattern of
>doing useful reviews, but core status isn't about raw number of reviews,
>it's about consistent downvotes and detailed, well considered and
>insightful feedback that helps increase quality and catch issues early.
> 
>  - Try to spend some time reviewing stuff outside your normal area of
>expertise, to build understanding of the broader TripleO system - as
>discussed ab

Re: [openstack-dev] [TripleO] Proposing Attila Darazs for tripleo-quickstart core​

2016-07-29 Thread John Trowbridge
There were no objections, so I made the change in gerrit.

On 07/26/2016 10:32 AM, John Trowbridge wrote:
> I would like to add Attila to the tripleo-quickstart core reviewers
> group. Much of his work has been on some of the auxiliary roles that
> quickstart makes use of in RDO CI; however, his numbers on quickstart
> itself[1] are in line with the other core reviewers.
> 
> I will be out for paternity leave the next 4 weeks, so it will also be
> nice to have 3 core reviewers during that time in case I don't end up
> doing too many reviews.
> 
> If there are no objections I will make the change at the end of the week.
> 
> - trown
> 
> [1] http://stackalytics.com/report/contribution/tripleo-quickstart/90
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] Proposing Attila Darazs for tripleo-quickstart core​

2016-07-26 Thread John Trowbridge
I would like to add Attila to the tripleo-quickstart core reviewers
group. Much of his work has been on some of the auxiliary roles that
quickstart makes use of in RDO CI; however, his numbers on quickstart
itself[1] are in line with the other core reviewers.

I will be out for paternity leave the next 4 weeks, so it will also be
nice to have 3 core reviewers during that time in case I don't end up
doing too many reviews.

If there are no objections I will make the change at the end of the week.

- trown

[1] http://stackalytics.com/report/contribution/tripleo-quickstart/90

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Proposing Gabriele Cerami for tripleo-quickstart core

2016-07-25 Thread John Trowbridge
Since there were no objections to this, I have added Gabriele to the
tripleo-quickstart core group.

This also means that all patches to tripleo-quickstart now require 2x +2
to merge.

On 07/18/2016 11:06 AM, John Trowbridge wrote:
> Howdy,
> 
> I would like to propose Gabriele (panda on IRC), for tripleo-quickstart
> core. He has worked on some pretty major features for the project
> (explicit teardown, devmode), and has a good understanding of the code base.
> 
> This will bring us to three dedicated core reviewers for
> tripleo-quickstart (myself and larsks being the other two), so I would
> also like to implement a 2x +2 policy at this time. Note, that all cores
> of TripleO are also cores on tripleo-quickstart, and should feel free to
> +2 changes as they are comfortable.
> 
> If there are no objections, I will put in a change at the end of the week.
> 
> Thanks,
> 
> - trown
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] Proposing Gabriele Cerami for tripleo-quickstart core

2016-07-18 Thread John Trowbridge
Howdy,

I would like to propose Gabriele (panda on IRC), for tripleo-quickstart
core. He has worked on some pretty major features for the project
(explicit teardown, devmode), and has a good understanding of the code base.

This will bring us to three dedicated core reviewers for
tripleo-quickstart (myself and larsks being the other two), so I would
also like to implement a 2x +2 policy at this time. Note, that all cores
of TripleO are also cores on tripleo-quickstart, and should feel free to
+2 changes as they are comfortable.

If there are no objections, I will put in a change at the end of the week.

Thanks,

- trown

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] Making TripleO CI easier to consume outside of TripleO CI

2016-07-12 Thread John Trowbridge
Howdy folks,

In the TripleO meeting two weeks ago, it came up that tripleo-quickstart
is being used as a CI tool in RDO. This came about organically, because
we needed to use RDO CI to self-gate quickstart (it relies on having a
baremetal virthost). It displaced another ansible based CI tool there
(khaleesi) and most (all?) of the extra functionality from that tool
(upgrades, scale, baremetal, etc.) has been moved into discrete ansible
roles that are able to plug in to quickstart.[1]

We are still left with two different tool sets, where one should suffice
(and focus CI efforts in one place).

I see two different ways to resolve this.

1. Actively work on making the tripleo-ci scripts consumable outside of
tripleo-ci. We have a project in RDO (WeiRDO)[2] that is consuming
upstream CI for packstack and puppet, so it is not totally far-fetched
to add support for TripleO jobs.

Pros:
- All CI development just happens directly in tripleo-ci and RDO just
inherits that work.

Cons:
- This is totally untried, and therefore an unknown amount of work.
- It is all or nothing in that there is no incremental path to get the
CI scripts working outside of CI.
- We have to rewrite a bunch of working ansible code in bash which IMO
is the wrong direction for a modern CI system.


2. Actively work on making tripleo-ci consume the ansible work in
tripleo-quickstart and the external role ecosystem around it.

Pros:
- This could be done incrementally, replacing a single function from
tripleo.sh with an invocation of tripleo-quickstart that performs that
function instead (see the sketch after the cons below).
- We would be able to pull in a lot of extra functionality via these
external roles for free(ish).

Cons:
- Similarly unknown amount of work to completely switch.
- CI development would be done in multiple repos, though each would have
discrete and well defined functionality.
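To make the incremental idea concrete, a rough sketch (script name and
flags are illustrative, from memory):

  # today tripleo-ci drives each step as a bash function, e.g.:
  ./tripleo.sh --undercloud

  # the same step could instead be delegated to the ansible role that
  # already implements it for quickstart:
  ansible-playbook -i hosts playbooks/undercloud-install.yml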


Personally, I don't think we should do anything drastic with CI until
after we release Newton, so we don't add any risk of impacting new
features that haven't landed yet. I do think it would be a good goal for
Ocata to have a CI system in TripleO that is consumable outside of
TripleO. In any case, this email is simply to garner feedback if other's
think this is a worthy thing to pursue and opinions on how we can get there.


[1]
https://github.com/redhat-openstack?utf8=%E2%9C%93&query=ansible-role-tripleo
(note not all of these are actively used/developed)
[2] https://github.com/rdo-infra/weirdo



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] CI package build issues

2016-06-24 Thread John Trowbridge
A quick update on this. It appears that
https://review.rdoproject.org/r/1500 did indeed resolve the issue. There
have been no hits on the logstash query [1] since that merged.

[1]
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22cleaning%20directory%20and%20cloning%20again%5C%22%20AND%20filename%3A%5C%22console.html%5C%22
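For readability, that URL-encoded query decodes to roughly:

  message:"cleaning directory and cloning again" AND filename:"console.html"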

On 06/23/2016 03:50 PM, John Trowbridge wrote:
> 
> 
> On 06/23/2016 02:56 PM, Dan Prince wrote:
>> After discovering some regressions today we found what we think is a
>> package build issue in our CI environment which might be the cause of
>> our issues:
>>
>> https://bugs.launchpad.net/tripleo/+bug/1595660
>>
>> Specifically, there is a case where DLRN might not be giving an error
>> code if build failures occur, and thus our jobs don't get the updated
>> package symlink and thus give a false positive.
>>
>> Until we get this solved be careful when merging. You might look for
>> 'packages not built correctly: not updating the consistent symlink' in
>> the job output. I see over 200 of these in the last 24 hours:
>>
> 
> I updated the bug, but will reply here for completeness. The "not
> updating the consistent symlink" message appears 100% of the time when
> not building all packages in rdoinfo.
> 
> Instead what happened is we built HEAD of master instead of the refspec
> from zuul.
> 
> http://logs.openstack.org/17/324117/22/check-tripleo/gate-tripleo-ci-centos-7-nonha/3758449/console.html#_2016-06-23_03_40_49_234238
> 
> c48410a05ec0ffd11c717bcf350badc9e5f0e910 is the commit it should have built.
> 
> 4ef338574b1a7cef8b1b884d439556b24fb09718 was built instead.
> 
> So the logstash query we could use is instead:
> 
> http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22cleaning%20directory%20and%20cloning%20again%5C%22%20AND%20filename%3A%5C%22console.html%5C%22
> 
> I think https://review.rdoproject.org/r/1500 is the fix.
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] CI package build issues

2016-06-23 Thread John Trowbridge


On 06/23/2016 02:56 PM, Dan Prince wrote:
> After discovering some regressions today we found what we think is a
> package build issue in our CI environment which might be the cause of
> our issues:
> 
> https://bugs.launchpad.net/tripleo/+bug/1595660
> 
> Specifically, there is a case where DLRN might not be giving an error
> code if build failures occur, and thus our jobs don't get the updated
> package symlink and thus give a false positive.
> 
> Until we get this solved be careful when merging. You might look for
> 'packages not built correctly: not updating the consistent symlink' in
> the job output. I see over 200 of these in the last 24 hours:
> 

I updated the bug, but will reply here for completeness. The "not
updating the consistent symlink" message appears 100% of the time when
not building all packages in rdoinfo.

Instead what happened is we built HEAD of master instead of the refspec
from zuul.

http://logs.openstack.org/17/324117/22/check-tripleo/gate-tripleo-ci-centos-7-nonha/3758449/console.html#_2016-06-23_03_40_49_234238

c48410a05ec0ffd11c717bcf350badc9e5f0e910 is the commit it should have built.

4ef338574b1a7cef8b1b884d439556b24fb09718 was built instead.
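In other words, the job needs to fetch and check out the zuul-provided ref
rather than whatever master points at; roughly (zuul v2 variable names from
memory):

  git fetch "$ZUUL_URL/$ZUUL_PROJECT" "$ZUUL_REF"
  git checkout FETCH_HEAD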

So the logstash query we could use is instead:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22cleaning%20directory%20and%20cloning%20again%5C%22%20AND%20filename%3A%5C%22console.html%5C%22

I think https://review.rdoproject.org/r/1500 is the fix.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Proposed TripleO core changes

2016-06-09 Thread John Trowbridge


On 06/09/2016 10:10 AM, Dougal Matthews wrote:
> On 9 June 2016 at 15:03, Steven Hardy  wrote:
> 
>> Hi all,
>>
>> I've been in discussion with Martin André and Tomas Sedovic, who are
>> involved with the creation of the new tripleo-validations repo[1]
>>
>> We've agreed that rather than create another gerrit group, they can be
>> added to tripleo-core and agree to restrict +A to this repo for the time
>> being (hopefully they'll both continue to review more widely, and obviously
>> Tomas is a former TripleO core anyway, so welcome back! :)
>>
> 
> +1, I think this approach works fine. Requiring sub groups only makes sense
> if we don't feel we can trust people, but then they shouldn't be core. It
> might
> be worth documenting this somewhere however as we have a few restricted
> cores.
> 

So, I am not strongly opinionated in either direction. However, I do
think sub groups can make some sense. I don't know if it makes sense or
not for tripleo-validations, but I quite like it for tripleo-quickstart.
I like that I can choose to trust someone with +2 on tripleo-quickstart
without forcing the rest of tripleo to trust them with +2 on all
projects. I think this could be an interesting model where we could have
at least one main tripleo core in any sub group who is responsible for
mentoring new people to become core in their sub group, and hopefully
eventually into the main tripleo core group.

There is also an accounting reason for sub groups. With no sub groups,
we would just have one large core team. However, effectively many of
these cores would actually be sub group specialists, and not +2ing
outside of their specialty. It is then hard to have any useful
accounting of how many tripleo-cores are specialists vs. generalists vs.
generalists with a specialty. Not sure if that is worth the overhead of
sub groups in and of itself, just wanted to point out there is more
benefit than just the trust issue.

> 
> If folks feel strongly we should create another group we can, but this
>> seems like a low-overhead approach, and well aligned with the scope of the
>> repo, let me know if you disagree.
>>
>> Also, while reviewing the core group[2] I noticed the following members who
>> are no longer active and should probably be removed:
>>
>> - Radomir Dopieralski
>> - Martyn Taylor
>> - Clint Byrum
>>
>> I know Clint is still involved with DiB (which has a separate core group),
>> but he's indicated he's no longer going to be directly involved in other
>> tripleo development, and AFAIK neither Martyn or Radomir are actively
>> involved in TripleO reviews - thanks to them all for their contribution,
>> we'll gladly add you back in the future should you wish to return :)
>>
>> Please let me know if there are any concerns or objections, if there are
>> none I will make these changes next week.
>>
>> Thanks,
>>
>> Steve
>>
>> [1] https://github.com/openstack/tripleo-validations
>> [2] https://review.openstack.org/#/admin/groups/190,members
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] Nodes management in our shiny new TripleO API

2016-05-20 Thread John Trowbridge


On 05/19/2016 09:31 AM, Dmitry Tantsur wrote:
> Hi all!
> 
> We started some discussions on https://review.openstack.org/#/c/300200/
> about the future of node management (registering, configuring and
> introspecting) in the new API, but I think it's more fair (and
> convenient) to move it here. The goal is to fix several long-standing
> design flaws that affect the logic behind tripleoclient. So fasten your
> seatbelts, here it goes.
> 
> If you already understand why we need to change this logic, just scroll
> down to "what do you propose?" section.
> 
> "introspection bulk start" is evil
> --
> 
> As many of you obviously know, TripleO used the following command for
> introspection:
> 
>  openstack baremetal introspection bulk start
> 
> As not everyone knows though, this command does not come from
> ironic-inspector project, it's part of TripleO itself. And the ironic
> team has some big problems with it.
> 
> The way it works is
> 
> 1. Take all nodes in "available" state and move them to "manageable" state
> 2. Execute introspection for all nodes in "manageable" state
> 3. Move all nodes with successful introspection to "available" state.
> 
> Step 3 is pretty controversial, step 1 is just horrible. This is not how
> the ironic-inspector team designed introspection to work (hence it
> refuses to run on nodes in "available" state), and that's not how the
> ironic team expects the ironic state machine to be handled. To explain
> it I'll provide a brief information on the ironic state machine.
> 
> ironic node lifecycle
> -
> 
> With recent versions of the bare metal API (starting with 1.11), nodes
> begin their life in a state called "enroll". Nodes in this state are not
> available for deployment, nor for most of other actions. Ironic does not
> touch such nodes in any way.
> 
> To make nodes alive an operator uses "manage" provisioning action to
> move nodes to "manageable" state. During this transition the power and
> management credentials (IPMI, SSH, etc) are validated to ensure that
> nodes in "manageable" state are, well, manageable. This state is still
> not available for deployment. With nodes in this state an operator can
> execute various pre-deployment actions, such as introspection, RAID
> configuration, etc. So to sum it up, nodes in "manageable" state are
> being configured before exposing them into the cloud.
> 
> The last step before the deployment is to make nodes "available" using
> the "provide" provisioning action. Such nodes are exposed to nova, and
> can be deployed to at any moment. No long-running configuration actions
> should be run in this state. The "manage" action can be used to bring
> nodes back to "manageable" state for configuration (e.g. reintrospection).
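As a minimal CLI illustration of those transitions (assuming the OSC
baremetal plugins are installed; <node> is a placeholder):

  openstack baremetal node manage <node>            # enroll -> manageable
  openstack baremetal introspection start <node>    # inspect while manageable
  openstack baremetal node provide <node>           # manageable -> available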
> 
> so what's the problem?
> --
> 
> The problem is that TripleO essentially bypasses this logic by keeping
> all nodes "available" and walking them through provisioning steps
> automatically. Just a couple of examples of what gets broken:
> 
> (1) Imagine I have 10 nodes in my overcloud, 10 nodes ready for
> deployment (including potential autoscaling) and I want to enroll 10
> more nodes.
> 
> Both introspection and ready-state operations nowadays will touch both
> 10 new nodes AND 10 nodes which are ready for deployment, potentially
> making the latter not ready for deployment any more (and definitely
> moving them out of pool for some time).
> 
> Particularly, any manual configuration made by an operator before making
> nodes "available" may get destroyed.
> 
> (2) TripleO has to disable automated cleaning. Automated cleaning is a
> set of steps (currently only wiping the hard drive) that happens in
> ironic 1) before nodes are available, 2) after an instance is deleted.
> As TripleO CLI constantly moves nodes back-and-forth from and to
> "available" state, cleaning kicks in every time. Unless it's disabled.
> 
> Disabling cleaning might sound a sufficient work around, until you need
> it. And you actually do. Here is a real life example of how to get
> yourself broken by not having cleaning:
> 
> a. Deploy an overcloud instance
> b. Delete it
> c. Deploy an overcloud instance on a different hard drive
> d. Boom.
> 
> As we didn't pass cleaning, there is still a config drive on the disk
> used in the first deployment. With 2 config drives present cloud-init
> will pick a random one, breaking the deployment.
> 
> To top it all, TripleO users tend to not use root device hints, so
> switching root disks may happen randomly between deployments. Have fun
> debugging.
> 
> what do you propose?
> 
> 
> I would like the new TripleO mistral workflows to start following the
> ironic state machine closer. Imagine the following workflows:
> 
> 1. register: take JSON, create nodes in "manageable" state. I do believe
> we can automate the enroll->manageable transition, as it serves the
> purpose of validation (and discovery, but lets pu

Re: [openstack-dev] [TripleO] Can we create some subteams?

2016-04-13 Thread John Trowbridge


On 04/11/2016 05:54 AM, John Trowbridge wrote:
> Hola OOOers,
> 
> It came up in the meeting last week that we could benefit from a CI
> subteam with its own meeting, since CI is taking up a lot of the main
> meeting time.
> 
> I like this idea, and think we should do something similar for the other
> informal subteams (tripleoclient, UI), and also add a new subteam for
> tripleo-quickstart (and maybe one for releases?).
> 
> We should make separate ACLs for these subteams as well. The informal
> approach of adding cores who can +2 anything but are told to only +2
> what they know doesn't scale very well.
> 
> - trown
> 

I went ahead and did this for tripleo-quickstart[1], and added Lars to
the tripleo-quickstart core team. It is relatively painless for anyone
else wanting to do the same.

- trown

[1] https://review.openstack.org/#/c/304145/



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Can we create some subteams?

2016-04-11 Thread John Trowbridge


On 04/11/2016 06:19 AM, Steven Hardy wrote:
> On Mon, Apr 11, 2016 at 05:54:11AM -0400, John Trowbridge wrote:
>> Hola OOOers,
>>
>> It came up in the meeting last week that we could benefit from a CI
>> subteam with its own meeting, since CI is taking up a lot of the main
>> meeting time.
>>
>> I like this idea, and think we should do something similar for the other
>> informal subteams (tripleoclient, UI), and also add a new subteam for
>> tripleo-quickstart (and maybe one for releases?).
> 
> +1, from the meeting and other recent discussions it sounds like defining
> some sub-teams would be helpful, let's try to enumerate those discussed:
> 
> - tripleo-ci
> - API (Mistral based API which is landing in tripleo-common atm)
> - Tripleo-UI
> - os-net-config
> - python-tripleoclient
> - tripleo-quickstart
> 
> Of these, I think tripleo-ci, tripleo-ui, os-net-config and
> tripleo-quickstart all make sense to create sub-teams.
> 
> However it's less clear if we should create separate teams for
> tripleoclient and the API - IMO everyone should care about the CLI flow, so
> it'd be good to encourage broader participation there, but if there's
> consensus we can add that.
> 
> In the API case it's tough because it's being proposed to tripleo-common,
> so it'll be difficult to have an ACL which only affects the location of the
> API code.  Also it's another key interface where we probably want to really
> encourage broad participation in review/development - currently there's a
> small team working on the API implementation but I really hope that changes
> when we move the Mistral based API to be in the default deployment flow.
> 

For gerrit ACLs, I was thinking the main tripleo-core group would have
core on all of the subteams, and the subteam groups would be just for
adding folks who only have "limited" core responsibilities/privileges.

If we have more strictly limited subteams though, I agree that CLI and
API should probably not be split out.

If we do split out API, I think the ACL being on the whole
tripleo-common repo is fine. There is not a ton of non-API related stuff
in tripleo-common anyways.

> Regarding releases, there actually already is a tripleo-release group, but
> I'm not sure we want to maintain that model long-term, instead we should be
> moving towards the common openstack/releases tooling ref:
> 
> http://lists.openstack.org/pipermail/openstack-dev/2016-March/090737.html
> 
> Improving our release workflow and figuring out how we align/integrate
> better with the OpenStack coordinated/centralized release is high on my
> TODO list as PTL for Newton, and it's definitely something I'm keen to
> discuss further e.g at summit (and get help with! :)
> 
>> We should make separate ACLs for these subteams as well. The informal
>> approach of adding cores who can +2 anything but are told to only +2
>> what they know doesn't scale very well.
> 
> Agreed, there's definitely value in doing this now and it will provide more
> value as our community grows.
> 
> Steve
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] Propose Lars Kellogg-Stedman for tripleo-quickstart core

2016-04-11 Thread John Trowbridge
Hello again,

This is semi-related to the subteam thread, but deserves its own thread.

I would like to give Lars +2 on tripleo-quickstart. He has contributed
almost as much as I have to the project in terms of
LOC, and in terms of real impact probably more. He is responsible for
some of the best features in tripleo-quickstart (image caching,
unprivileged usage).

Further, he had +2 before we moved the code to the openstack git, and has
been instrumental in the design and promotion of tripleo-quickstart.

If we go the subteam route, I would be happy to create the
tripleo-quickstart subteam and add him myself. Otherwise, could we add
him to the tripleo core team with implicit restriction to +2/+A only
tripleo-quickstart as we have in the past?


Thanks,
-trown

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] Can we create some subteams?

2016-04-11 Thread John Trowbridge
Hola OOOers,

It came up in the meeting last week that we could benefit from a CI
subteam with its own meeting, since CI is taking up a lot of the main
meeting time.

I like this idea, and think we should do something similar for the other
informal subteams (tripleoclient, UI), and also add a new subteam for
tripleo-quickstart (and maybe one for releases?).

We should make separate ACLs for these subteams as well. The informal
approach of adding cores who can +2 anything but are told to only +2
what they know doesn't scale very well.

- trown

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] tripleo-quickstart import

2016-03-31 Thread John Trowbridge


On 03/30/2016 02:16 PM, Paul Belanger wrote:
> On Tue, Mar 29, 2016 at 08:30:22PM -0400, John Trowbridge wrote:
>> Hola,
>>
>> With the approval of the tripleo-quickstart spec[1], it is time to
>> actually start doing the work. The first work item is moving it to the
>> openstack git. The spec talks about moving it as is, and this would
>> still be fine.
>>
>> However, there are roles in the tripleo-quickstart tree that are not
>> directly related to the instack-virt-setup replacement aspect that is
>> approved in the spec (image building, deployment). I think these should
>> be split into their own ansible-role-* repos, so that they can be
>> consumed using ansible-galaxy. It would actually even make sense to do
>> that with the libvirt role responsible for setting up the virtual
>> environment. The tripleo-quickstart would then be just an integration
>> layer making consuming these roles for virtual deployments easy.
>>
>> This way if someone wanted to make a different role for say OVB
>> deployments, it would be easy to use the other roles on top of a
>> differently provisioned undercloud.
>>
>> Similarly, if we wanted to adopt ansible to drive tripleo-ci, it would
>> be very easy to only consume the roles that make sense for the tripleo
>> cloud.
>>
>> So the first question is, should we split the roles out of
>> tripleo-quickstart?
>>
>> If so, should we do that before importing it to the openstack git?
>>
>> Also, should the split out roles also be on the openstack git?
>>
> So, we actually have a few ansible roles in OpenStack, mostly imported by
> myself.  The OpenStack ansible teams has a few too.
> 
> I would propose, keep them included in your project for now and maybe start a
> different discussion with all the ansible projects (kolla, ansible-openstack,
> windmill, etc) to see how to best move forward.  I've discussed with openstack
> ansible in the past about moving the roles I have uploaded into their team and
> hope to bring it up again at Austin.
> 

Awesome, thanks for the feedback Paul. I went ahead and started the
import process:

https://review.openstack.org/#/c/299932

>> Maybe this all deserves its own spec and we tackle it after completing
>> all of the work for the first spec. I put this on the meeting agenda for
>> today, but we didn't get to it.
>>
>> - trown
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] tripleo-quickstart import

2016-03-29 Thread John Trowbridge


On 03/29/2016 08:30 PM, John Trowbridge wrote:
> Hola,
> 
> With the approval of the tripleo-quickstart spec[1], it is time to
> actually start doing the work. The first work item is moving it to the
> openstack git. The spec talks about moving it as is, and this would
> still be fine.
> 
> However, there are roles in the tripleo-quickstart tree that are not
> directly related to the instack-virt-setup replacement aspect that is
> approved in the spec (image building, deployment). I think these should
> be split into their own ansible-role-* repos, so that they can be
> consumed using ansible-galaxy. It would actually even make sense to do
> that with the libvirt role responsible for setting up the virtual
> environment. The tripleo-quickstart would then be just an integration
> layer making consuming these roles for virtual deployments easy.
> 
> This way if someone wanted to make a different role for say OVB
> deployments, it would be easy to use the other roles on top of a
> differently provisioned undercloud.
> 
> Similarly, if we wanted to adopt ansible to drive tripleo-ci, it would
> be very easy to only consume the roles that make sense for the tripleo
> cloud.
> 
> So the first question is, should we split the roles out of
> tripleo-quickstart?
> 
> If so, should we do that before importing it to the openstack git?
> 
> Also, should the split out roles also be on the openstack git?
> 
> Maybe this all deserves its own spec and we tackle it after completing
> all of the work for the first spec. I put this on the meeting agenda for
> today, but we didn't get to it.
> 
> - trown
> 

whoops
[1]
https://github.com/openstack/tripleo-specs/blob/master/specs/mitaka/tripleo-quickstart.rst
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] tripleo-quickstart import

2016-03-29 Thread John Trowbridge
Hola,

With the approval of the tripleo-quickstart spec[1], it is time to
actually start doing the work. The first work item is moving it to the
openstack git. The spec talks about moving it as is, and this would
still be fine.

However, there are roles in the tripleo-quickstart tree that are not
directly related to the instack-virt-setup replacement aspect that is
approved in the spec (image building, deployment). I think these should
be split into their own ansible-role-* repos, so that they can be
consumed using ansible-galaxy. It would actually even make sense to do
that with the libvirt role responsible for setting up the virtual
environment. The tripleo-quickstart would then be just an integration
layer making consuming these roles for virtual deployments easy.
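With split-out roles, a consumer could pull one in directly, e.g. (the
role repo name here is purely illustrative):

  ansible-galaxy install git+https://git.openstack.org/openstack/ansible-role-tripleo-image-build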

This way if someone wanted to make a different role for say OVB
deployments, it would be easy to use the other roles on top of a
differently provisioned undercloud.

Similarly, if we wanted to adopt ansible to drive tripleo-ci, it would
be very easy to only consume the roles that make sense for the tripleo
cloud.

So the first question is, should we split the roles out of
tripleo-quickstart?

If so, should we do that before importing it to the openstack git?

Also, should the split out roles also be on the openstack git?

Maybe this all deserves its own spec and we tackle it after completing
all of the work for the first spec. I put this on the meeting agenda for
today, but we didn't get to it.

- trown

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] gnocchi backport exception for stable/mitaka

2016-03-29 Thread John Trowbridge
+1 I think this is a good exception for the same reasons.

On 03/29/2016 12:27 PM, Steven Hardy wrote:
> On Tue, Mar 29, 2016 at 11:05:00AM -0400, Pradeep Kilambi wrote:
>> Hi Everyone:
>>
>> As Mitaka branch was cut yesterday, I would like to request a backport
>> exception to get gnocchi patches[1][2][3] into stable/mitaka. It
>> should be a low-risk feature as we decided not to set ceilometer to use
>> gnocchi by default. So ceilometer would work as is and gnocchi is
>> deployed alongside as a new service but not used out of the box. So
>> this should make upgrades pretty much a non-issue as things should
>> work exactly like before. If someone wants to use the gnocchi backend, they
>> can add an env template file to override the backend. In Newton, we'll
>> flip the switch to make gnocchi the default backend.
>>
>> If we can please vote to agree to get this in as an exception it would
>> be super useful.
> 
> +1, provided we're able to confirm this plays nicely wrt upgrades I think
> we should allow this.
> 
> We're taking a much stricter stance re backports for stable/mitaka, but I
> think this is justified for the following reasons:
> 
> - The patches have been posted in plenty of time, but have suffered from a
>   lack of reviews and a lot of issues getting CI passing, were it not for
>   those issues this should really have landed by now.
> 
> - The Ceilometer community have been moving towards replacing the database
>   dispatcher with gnocchi since kilo, and it should provide us with a
>   (better performing) alternative the current setup AIUI.
> 
> Thus I think this is a case where an exception is probably justified, but
> to be clear I'm generally opposed to granting exceptions for mitaka beyond
> the few things we may discover in the next few days prior to the
> coordinated release (in Newton I hope we can formalize this to be more
> aligned with the normal feature-freeze and RC process).
> 
> Steve
> 
>>
>> Thanks,
>> ~ Prad
>>
>> [1] https://review.openstack.org/#/c/252032/
>> [2] https://review.openstack.org/#/c/290710/
>> [3] https://review.openstack.org/#/c/238013/
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI jobs failures

2016-03-07 Thread John Trowbridge


On 03/06/2016 11:58 AM, James Slagle wrote:
> On Sat, Mar 5, 2016 at 11:15 AM, Emilien Macchi  wrote:
>> I'm kind of hijacking Dan's e-mail but I would like to propose some
>> technical improvements to stop having so much CI failures.
>>
>>
>> 1/ Stop creating swap files. We don't have SSD, this is IMHO a terrible
>> mistake to swap on files because we don't have enough RAM. In my
>> experience, swapping on non-SSD disks is even worse than not having
>> enough RAM. We should stop doing that I think.
> 
> We have been relying on swap in tripleo-ci for a little while. While
> not ideal, it has been an effective way to at least be able to test
> what we've been testing given the amount of physical RAM that is
> available.
> 
> The recent change to add swap to the overcloud nodes has proved to be
> unstable. But that has more to do with it being racey with the
> validation deployment afaict. There are some patches currently up to
> address those issues.
> 
>>
>>
>> 2/ Split CI jobs in scenarios.
>>
>> Currently we have CI jobs for ceph, HA, non-ha, containers and the
>> current situation is that jobs fail randomly, due to performances issues.
>>
>> Puppet OpenStack CI had the same issue where we had one integration job
>> and we never stopped adding more services until all becomes *very*
>> unstable. We solved that issue by splitting the jobs and creating scenarios:
>>
>> https://github.com/openstack/puppet-openstack-integration#description
>>
>> What I propose is to split TripleO jobs in more jobs, but with less
>> services.
>>
>> The benefit of that:
>>
>> * more services coverage
>> * jobs will run faster
>> * less random issues due to bad performances
>>
>> The cost is of course it will consume more resources.
>> That's why I suggest 3/.
>>
>> We could have:
>>
>> * HA job with ceph and a full compute scenario (glance, nova, cinder,
>> ceilometer, aodh & gnocchi).
>> * Same with IPv6 & SSL.
>> * HA job without ceph and full compute scenario too
>> * HA job without ceph and basic compute (glance and nova), with extra
>> services like Trove, Sahara, etc.
>> * ...
>> (note: all jobs would have network isolation, which is to me a
>> requirement when testing an installer like TripleO).
> 
> Each of those jobs would at least require as much memory as our
> current HA job. I don't see how this gets us to using less memory. The
> HA job we have now already deploys the minimal amount of services that
> is possible given our current architecture. Without the composable
> service roles work, we can't deploy less services than we already are.
> 
> 
> 
>>
>> 3/ Drop non-ha job.
>> I'm not sure why we have it, and the benefit of testing that comparing
>> to HA.
> 
> In my opinion, I actually think that we could drop the ceph and non-ha
> job from the check-tripleo queue.
> 
> non-ha doesn't test anything realistic, and it doesn't really provide
> any faster feedback on patches. It seems at most it might run 15-20
> minutes faster than the HA job on average. Sometimes it even runs
> slower than the HA job.
> 
> The ceph job we could move to the experimental queue to run on demand
> on patches that might affect ceph, and it could also be a daily
> periodic job.
> 
> The same could be done for the containers job, an IPv6 job, and an
> upgrades job. Ideally with a way to run an individual job as needed.
> Would we need different experimental queues to do that?
> 
> That would leave only the HA job in the check queue, which we should
> run with SSL and network isolation. We could deploy less testenv's
> since we'd have less jobs running, but give the ones we do deploy more
> RAM. I think this would really alleviate a lot of the transient
> intermittent failures we get in CI currently. It would also likely run
> faster.
> 
> It's probably worth seeking out some exact evidence from the RDO
> centos-ci, because I think they are testing with virtual environments
> that have a lot more RAM than tripleo-ci does. It'd be good to
> understand if they have some of the transient failures that tripleo-ci
> does as well.
> 

The HA job in RDO CI is also more unstable than nonHA, although this is
usually not to do with memory contention. Most of the time that I see
the HA job fail spuriously in RDO CI, it is because of the Nova
scheduler race. I would bet that this race is the cause for the
fluctuating amount of time jobs take as well, because the recovery
mechanism for this is just to retry. Those retries can add 15 min. per
retry to the deploy. In RDO CI there is a 60min. timeout for deploy as
well. If we can't deploy to virtual machines in under an hour, to me
that is a bug. (Note, I am speaking of `openstack overcloud deploy` when
I say deploy, though start to finish can take less than an hour with
decent CPUs)

RDO CI uses the following layout:
Undercloud: 12G RAM, 4 CPUs
3x Control Nodes: 4G RAM, 1 CPU
Compute Node: 4G RAM, 1 CPU

Is there any ability in our current CI setup to auto-identify the cause
of a failure? The nova schedu

Re: [openstack-dev] [ironic] [stable] Suggestion to remove stable/liberty and stable branches support from ironic-python-agent

2016-02-19 Thread John Trowbridge


On 02/19/2016 07:29 AM, Lucas Alvares Gomes wrote:
> Hi,
> 
> By removing stable branches you mean stable branches for mitaka and
> newer releases or that includes stable/liberty which already exist as
> well?
> 
> I think the latter is more complicated, I don't think we should drop
> stable/liberty like that because other people (apart from TripleO) may
> also depend on that. I mean, it wouldn't be very "stable" if stable
> branches were deleted before their supported phases.
>
I would argue it is also not very stable if there is no testing against
it :).

For the RDO use case in particular, it is about having LIO support in
liberty, so that it is feature complete with the bash ramdisk. Then the
bash ramdisk can return to the bit bucket.

The tricky bit is that RDO does not include patches in our packages
built from trunk (trunk.rdoproject.org), and for liberty we first check
if stable/liberty exists, then fall back to master if it does not. So a
stable/liberty branch that is not actually the recommended way to build
IPA for liberty is not ideal for us.
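Roughly, what that per-project check amounts to in our trunk builder is
(sketch only; the variable names are illustrative):

  if git ls-remote --exit-code --heads "$UPSTREAM_REPO" stable/liberty >/dev/null 2>&1; then
      BRANCH=stable/liberty   # build from the stable branch when it exists
  else
      BRANCH=master           # otherwise fall back to master
  fi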

All of that said, I totally understand not wanting to delete a branch.
Especially since I think I am the one Dmitry is referring to as asking
for it. (Though I think what I wanted was releases, which is subtly
different.)

I think there are some hacks I could make in our trunk builder if I at
least have a ML post like this as justification. I am not 100% sure that
is possible though.

> But that said, I'm +1 to not have stable branches for newer releases.
> 
> Cheers,
> Lucas
> 
> On Fri, Feb 19, 2016 at 12:17 PM, Dmitry Tantsur  wrote:
>> Hi all!
>>
>> Initially we didn't plan on having stable branches for IPA at all. Our gate
>> is using the prebuilt image generated from the master branch even on
>> Ironic/Inspector stable branches. The branch in question was added by
>> request of RDO folks, and today I got a request from trown to remove it:
>>
>>  dtantsur: btw, what do you think the chances are that IPA gets rid
>> of stable branch?
>>  I'm +1 on that, because currently only tripleo is using this
>> stable branch, our own gates are using tarball from master
>>  s/tarball/prebuilt image/
>>  cool, from RDO perspective, I would prefer to have master package in
>> our liberty delorean server, but I cant do that (without major hacks) if
>> there is a stable/liberty branch
>>  LIO support being the main reason
>>  fwiw, I have tested master IPA on liberty and it works great
>>
>> So I suggest we drop stable branches from IPA. This won't affect the Ironic
>> gate in any regard, as we don't use stable IPA there anyway, as I mentioned
>> before. As we do know already, we'll keep IPA compatible with all supported
>> Ironic and Inspector versions.
>>
>> Opinions?
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Should we rename "RDO Manager" to "TripleO" ?

2016-02-17 Thread John Trowbridge
+1. I will also add, for reference, that there is no other project in RDO
that is renamed/rebranded. In fact, even the TripleO packages in RDO
have the same naming as the upstream projects.

On 02/17/2016 11:27 AM, David Moreau Simard wrote:
> Greetings,
> 
> (Note: cross-posted between rdo-list and openstack-dev to reach a
> larger audience)
> 
> Today, because of the branding and the name "RDO Manager", you might
> think that it's something other than TripleO - either something
> entirely different or perhaps with downstream patches baked in.
> You would not be the only one because the community, the users and the
> developers alike have shared their confusion on that topic.
> 
> The truth is, as it stands right now, "RDO Manager" really is "TripleO".
> There is no code or documentation differences.
> 
> I feel the only thing that is different is the strategy around how we
> test TripleO to ensure the stability of RDO packages but it's already
> in the process of being sent upstream [1] because we're convinced it's
> the best way forward.
> 
> Historically, RDO Manager and TripleO were different things.
> Today this is no longer the case and we plan on keeping it that way.
> 
> With this in mind, we would like to drop the RDO manager branding and
> use TripleO instead.
> Not only would we clear the confusion on the topic of what RDO Manager
> really is but it would also strengthen the TripleO name.
> 
> We would love the RDO community to chime in on this and give their
> feedback as to whether or not this is a good initiative.
> We will proceed to a formal vote on $subject at the next RDO meeting
> on Wednesday, 24th Feb, 2016 1500 UTC [2]. Feel free to join us on
> #rdo on freenode.
> 
> Thanks,
> 
> [1]: https://review.openstack.org/#/c/276810/
> [2]: https://etherpad.openstack.org/p/RDO-Meeting
> 
> David Moreau Simard
> Senior Software Engineer | Openstack RDO
> 
> dmsimard = [irc, github, twitter]
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [ironic] [tripleo] [stable] Phasing out old Ironic ramdisk and its gate jobs

2016-02-17 Thread John Trowbridge


On 02/17/2016 08:30 AM, Dmitry Tantsur wrote:
> On 02/17/2016 02:22 PM, John Trowbridge wrote:
>>
>>
>> On 02/17/2016 06:27 AM, Dmitry Tantsur wrote:
>>> Hi everyone!
>>>
>>> Yesterday on the Ironic midcycle we agreed that we would like to remove
>>> support for the old bash ramdisk from our code and gate. This, however,
>>> poses a problem, since we still support Kilo and Liberty. Meaning:
>>>
>>> 1. We can't remove gate jobs completely, as they still run on
>>> Kilo/Liberty.
>>> 2. Then we should continue to run our job on DIB, as DIB does not have
>>> stable branches.
>>> 3. Then we can't remove support from Ironic master as well, as it would
>>> break DIB job :(
>>>
>>> I see the following options:
>>>
>>> 1. Wait for Kilo end-of-life (April?) before removing jobs and code.
>>> This means that the old ramdisk will essentially be supported in Mitaka,
>>> but we'll remove gating on stable/liberty and stable/mitaka very soon.
>>> Pros: it will happen soon. Cons: in theory we do support the old ramdisk
>>> on Liberty, so removing gates will end this support prematurely.
>>>
>>> 2. Wait for Liberty end-of-life. This means that the old ramdisk will
>>> essentially be supported in Mitaka and Newton. We should somehow
>>> communicate that it's not official and can be dropped at any moment
>>> during stable branches life time. Pros: we don't drop support of the
>>> bash ramdisk on any branch where we promised to support it. Cons: people
>>> might assume we still support the old ramdisk on Mitaka/Newton; it will
>>> also take a lot of time.
>>>
>>> 3. Do it now, recommend Kilo users to switch to IPA too. Pros: it
>>> happens now, no confusing around old ramdisk support in Mitaka and
>>> later. Cons: probably most Kilo users (us included) are using the bash
>>> ramdisk, meaning we can potentially break them when landing changes on
>>> stable/kilo.
>>>
>>
>> I think if we were to do this, then we need to backport LIO support in
>> IPA to liberty and kilo. While the bash ramdisk is not awesome to
>> troubleshoot, tgtd is not great either, and the bash ramdisk has
>> supported LIO since Kilo. However, there is no stable/kilo branch in
>> IPA, so that backport is impossible. I have not looked at how hard the
>> stable/liberty backport would be, but I imagine not very.
>>
>>> 4. Upper-cap DIB in stable/{kilo,liberty} to the current release, then
>>> remove gates from Ironic master and DIB, leaving them on Kilo and
>>> Liberty. Pros: we can remove old ramdisk support right now. Cons: DIB
>>> bug fixes won't affect kilo and liberty any more.
>>>
>>> 5. The same as #4, but only on Kilo.
>>>
>>> As gate on stable/kilo is not working right now, and end-of-life is
>>> quickly approaching, I see number 3 as a pretty viable option anyway. We
>>> probably won't land any more changes on Kilo, so no use in keeping gates
>>> on it. Liberty is still a concern though, as the old ramdisk was only
>>> deprecated in Liberty.
>>>
>>> What do you all think? Did I miss any options?
>>
>> My favorite option would be 5 with backport of LIO support to liberty
>> (since backport to kilo is not possible). That is the only benefit of
>> the current bash ramdisk over the liberty/kilo IPA ramdisk. This is not
>> just for RHEL, but RHEL derivatives like CentOS which the RDO distro is
>> based on. (technically tgt can still be installed from EPEL, but there
>> is a reason it is not included in the base repos)
> 
> Oh, that's a good catch, IPA is usable on RHEL starting with Mitaka... I
> wonder if having stable branches for IPA was a good idea at all,
> especially provided that our gate is using git master on all branches.
> 

Interesting, I did not know that master is used for all gates. Maybe RDO
should just build liberty IPA from master. That would solve my only
concern with option 3.

>>
>> Other than that, I think 4 is the next best option.
>>>
>>> Cheers,
>>> Dmitry
>>>
>>> __
>>>
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe:
>>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>> 

Re: [openstack-dev] [ironic] [tripleo] [stable] Phasing out old Ironic ramdisk and its gate jobs

2016-02-17 Thread John Trowbridge


On 02/17/2016 06:27 AM, Dmitry Tantsur wrote:
> Hi everyone!
> 
> Yesterday on the Ironic midcycle we agreed that we would like to remove
> support for the old bash ramdisk from our code and gate. This, however,
> poses a problem, since we still support Kilo and Liberty. Meaning:
> 
> 1. We can't remove gate jobs completely, as they still run on Kilo/Liberty.
> 2. Then we should continue to run our job on DIB, as DIB does not have
> stable branches.
> 3. Then we can't remove support from Ironic master as well, as it would
> break DIB job :(
> 
> I see the following options:
> 
> 1. Wait for Kilo end-of-life (April?) before removing jobs and code.
> This means that the old ramdisk will essentially be supported in Mitaka,
> but we'll remove gating on stable/liberty and stable/mitaka very soon.
> Pros: it will happen soon. Cons: in theory we do support the old ramdisk
> on Liberty, so removing gates will end this support prematurely.
> 
> 2. Wait for Liberty end-of-life. This means that the old ramdisk will
> essentially be supported in Mitaka and Newton. We should somehow
> communicate that it's not official and can be dropped at any moment
> during stable branches life time. Pros: we don't drop support of the
> bash ramdisk on any branch where we promised to support it. Cons: people
> might assume we still support the old ramdisk on Mitaka/Newton; it will
> also take a lot of time.
> 
> 3. Do it now, recommend Kilo users to switch to IPA too. Pros: it
> happens now, no confusion around old ramdisk support in Mitaka and
> later. Cons: probably most Kilo users (us included) are using the bash
> ramdisk, meaning we can potentially break them when landing changes on
> stable/kilo.
> 

I think if we were to do this, then we need to backport LIO support in
IPA to liberty and kilo. While the bash ramdisk is not awesome to
troubleshoot, tgtd is not great either, and the bash ramdisk has
supported LIO since Kilo. However, there is no stable/kilo branch in
IPA, so that backport is impossible. I have not looked at how hard the
stable/liberty backport would be, but I imagine not very.

> 4. Upper-cap DIB in stable/{kilo,liberty} to the current release, then
> remove gates from Ironic master and DIB, leaving them on Kilo and
> Liberty. Pros: we can remove old ramdisk support right now. Cons: DIB
> bug fixes won't affect kilo and liberty any more.
> 
> 5. The same as #4, but only on Kilo.
> 
> As gate on stable/kilo is not working right now, and end-of-life is
> quickly approaching, I see number 3 as a pretty viable option anyway. We
> probably won't land any more changes on Kilo, so no use in keeping gates
> on it. Liberty is still a concern though, as the old ramdisk was only
> deprecated in Liberty.
> 
> What do you all think? Did I miss any options?

My favorite option would be 5 with backport of LIO support to liberty
(since backport to kilo is not possible). That is the only benefit of
the current bash ramdisk over the liberty/kilo IPA ramdisk. This is not
just for RHEL, but RHEL derivatives like CentOS which the RDO distro is
based on. (technically tgt can still be installed from EPEL, but there
is a reason it is not included in the base repos)

Other than that, I think 4 is the next best option.
> 
> Cheers,
> Dmitry
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Thoughts about the relationship between RDO and TripleO

2016-02-15 Thread John Trowbridge


On 02/15/2016 02:50 PM, James Slagle wrote:
> On Mon, Feb 15, 2016 at 2:05 PM, John Trowbridge  wrote:
>> Howdy,
>>
>> The spec to replace instack-virt-setup[1] got me thinking about the
>> relationship between RDO and TripleO. Specifically, when thinking about
>> where to store/create an undercloud.qcow2 image, and if this effort is
>> worth duplicating.
>>
>> Originally, I agreed with the comments on the spec wrt the fact that we
>> do not want to rely on RDO artifacts for TripleO CI. However, we do
>> exactly that already. Delorean packages are 100% a RDO artifact. So it
>> seems a bit odd to say we do not want to rely on an image that is really
>> just a bunch of those other artifacts, that we already rely on, rolled
>> up into a qcow.
> 
> That's fair I suppose. It just felt a bit like adding dependencies in
> the wrong direction, and I would prefer to see the undercloud image
> generated directly via tripleo-ci.
> 
> Isn't the plan to push more of the Delorean package git repo sources
> under the OpenStack namespace in the future? If so, and things are
> moving in that direction then I figured it would be better to have
> less dependencies on the RDO side, and use as much of the existing
> tripleo-ci as possible.
>

There has been some talk of doing this, but I do not think anyone
currently working on RDO has time to be PTL of an openstack project. I'm
also not sure this would just be delorean packaging branches. My
understanding was that we would move all the branches upstream. In any
case, there is no movement on this right now.

> I was advocating for using tripleo-ci to build the undercloud image,
> because we know it can and already does (we just doesn't save it), so
> we should default to using tripleo-ci instead of RDO CI. tripleo-ci
> doesn't build a whole delorean snapshot of master today (we only build
> a small subset of the packages we're testing for that CI run), so in
> that case, we use the RDO hosted one. I'm not saying tripleo-ci should
> build a whole deloean repo by any means. Just that in the specific
> case of the undercloud image, tripleo-ci is more or less already
> building that, so we should save and reuse it.
>

Right I agree that it would be great if tripleo-ci was producing this
image. Then RDO could just be a consumer of it. Having solved this
problem for RDO though, I can say there are some pieces of it that are
non-trivial. Where to store the images is a big one. Does TripleO have
somewhere we can store images that would be used by both TripleO and
downstream consumers?

Efficiently creating the image is also not as simple as just taking what
we currently have in tripleo-ci and saving it off before `openstack
undercloud install` is run. We build the overcloud images and install
all of the packages after that in tripleo-ci currently. Those two steps
are the majority of the time, so saving an image before then does not
buy much.

It might be worth splitting out a second spec solely dedicated to how we
can efficiently build an undercloud.qcow2. This could be more process
oriented, ie what an efficient list of steps is, rather than a specific
tool to implement that process.
>>
>> On the other hand, it seems a bit odd that we rely on delorean packages
>> at all. This creates a bit of a sticky situation for RDO. Take the case
>> where RDO has identified all issues that need to be fixed to work with
>> HEAD of master, but some patches have not merged yet. It should be ok
>> for RDO to put a couple .patch files in the packaging, and be on our
>> merry way until those are merged upstream and can be removed.
> 
> I actually think that the trunk packaging shouldn't ever do that. It
> should be 100% straight from trunk with no patches.
> 
I agree in principle, but it seems weird that it is not even really the
RDO community's decision to make. This example was really just meant to
show that for better or worse the two projects are co-dependent in both
directions, which then made me wonder: where is the line?

It seems like if the undercloud.qcow2 is simply a collection of packages
pre-installed in an image that allow deploying TripleO, it is not all
that different from an RPM repository. Note, there are some things in the
current image build that fall outside of that, but I think it might be a
better effort to fix those things in RDO than it would be to reinvent
what is already in RDO in TripleO.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] Thoughts about the relationship between RDO and TripleO

2016-02-15 Thread John Trowbridge
Howdy,

The spec to replace instack-virt-setup[1] got me thinking about the
relationship between RDO and TripleO. Specifically, when thinking about
where to store/create an undercloud.qcow2 image, and if this effort is
worth duplicating.

Originally, I agreed with the comments on the spec wrt the fact that we
do not want to rely on RDO artifacts for TripleO CI. However, we do
exactly that already. Delorean packages are 100% a RDO artifact. So it
seems a bit odd to say we do not want to rely on an image that is really
just a bunch of those other artifacts, that we already rely on, rolled
up into a qcow.

On the other hand, it seems a bit odd that we rely on delorean packages
at all. This creates a bit of a sticky situation for RDO. Take the case
where RDO has identified all issues that need to be fixed to work with
HEAD of master, but some patches have not merged yet. It should be ok
for RDO to put a couple .patch files in the packaging, and be on our
merry way until those are merged upstream and can be removed. However,
if we did this today, it would break TripleO CI since TripleO CI would
then pick up these patched RPMs from delorean.

I am not sure what the best path to resolve this is. Ideally, the above
need for .patch files is not there, but that is another topic.

-trown


[1]
https://review.openstack.org/#/c/276810/2/specs/mitaka/tripleo-quickstart.rst

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Stable branch policy for Mitaka

2016-02-15 Thread John Trowbridge


On 02/15/2016 03:59 AM, Steven Hardy wrote:
> On Wed, Feb 10, 2016 at 07:05:41PM +0100, James Slagle wrote:
>>On Wed, Feb 10, 2016 at 4:57 PM, Steven Hardy  wrote:
>>
>>  Hi all,
>>
>>  We discussed this in our meeting[1] this week, and agreed a ML
>>  discussion
>>  to gain consensus and give folks visibility of the outcome would be a
>>  good
>>  idea.
>>
>>  In summary, we adopted a more permissive "release branch" policy[2] for
>>  our
>>  stable/liberty branches, where feature backports would be allowed,
>>  provided
>>  they worked with liberty and didn't break backwards compatibility.
>>
>>  The original idea was really to provide a mechanism to "catch up" where
>>  features are added e.g to liberty OpenStack components late in the cycle
>>  and TripleO requires changes to integrate with them.
>>
>>  However, the reality has been that the permissive backport policy has
>>  been
>>  somewhat abused (IMHO) with a large number of major features being
>>  proposed
>>  for backport, and in a few cases this has broken downstream (RDO)
>>  consumers
>>  of TripleO.
>>
>>  Thus, I would propose that from Mitaka, we revise our backport policy to
>>  simply align with the standard stable branch model observed by all
>>  projects[3].
>>
>>  Hopefully this will allow us to retain the benefits of the stable branch
>>  process, but provide better stability for downstream consumers of these
>>  branches, and minimise confusion regarding what is a permissable
>>  backport.
>>
>>  If we do this, only backports that can reasonably be considered
>>  "Appropriate fixes"[4] will be valid backports - in the majority of
>>  cases
>>  this will mean bugfixes only, and large features where the risk of
>>  regression is significant will not be allowed.
>>
>>  What are peoples thoughts on this?
>>
>>I'm in agreement. I think this change is needed and will help set
>>better expectations around what will be included in which release.
>>
>>If we adopt this as the new policy, then the immediate followup is to set
>>and communicate when we'll be cutting the stable branches, so that it's
>>understood when the features have to be done/committed. I'd suggest that
>>we more or less completely adopt the integrated release schedule[1]. Which
>>I believe means the week of RC1 for cutting the stable/mitaka branches,
>>which is March 14th-18th.
>>
>>It seems to follow logically then that we'd then want to also be more
>>aggresively aligned with other integrated release events such as the
>>feature freeze date, Feb 29th - March 4th.
> 
> Yes, agreeing a backport policy is the first step, and aligning all our
> release policies with the rest of OpenStack is the logical next step.
> 
>>An alternative to strictly following the schedule, would be to say that
>>TripleO lags the integrated release dates by some number of weeks (1 or 2
>>I'd think), to allow for some "catchup" time since TripleO is often
>>consuming features from projects part of the integrated release.
> 
> The risk with this approach is there remains some confusion about our
> deadlines, and there is an increased risk that our 1-2 weeks window slips
> and we end up with a similar problem to that which we have now.
> 
From a packaging POV, I am also -1 on lagging the integrated release.
This creates a situation where TripleO can not be used as the method to
test the integrated release packaging. This means relying on other
installers (Packstack), which means less use of TripleO in the RDO
community.

Any big feature that needs support in TripleO, that is in the integrated
release, would have a spec landed in advance. So, I do not think it is
all that burdensome to land TripleO support for the features on the same
schedule.

> I'd propose we align with whatever schedule the puppet community observes,
> given that (with out current implementation at least), it's unlikely we can
> land any features actually related to new-feature-in-$service type patches
> without that feature already having support in the puppet modules?
> 

+1 to following puppet module lead. It seems like any new feature type
patch we wanted to support in TripleO should be implemented very close
to the patch in the puppet module which enables it. Ideally, we could
land TripleO support at the same time as the feature is enabled in
puppet using depends on.

> Perhaps we can seek out some guidance from Emilien, as I'm not 100% sure of
> the release model observed for the puppet modules?
> 
> If you look at the features we're backporting, most of them aren't related
> to features requiring "catchup", e.g IPv6, SSL, Upgrades - these are all
> cross-project TripleO features and there are very few (if any?) "catchup"
> type requirements AFAICT.
> 
> Also, if you look at other projects, such as Heat and Mistral, which

Re: [openstack-dev] [tripleo] use mitaka CI tested repo

2016-01-29 Thread John Trowbridge


On 01/29/2016 09:40 AM, Dan Prince wrote:
> On Fri, 2016-01-29 at 08:17 -0500, Emilien Macchi wrote:
>> Hi,
>>
>> I'm wondering why don't we use Mitaka CI tested repository [1].
>> IIRC, TripleO is currently using a snapshot which is updated
>> asynchronously by TripleO CI folks.
>> The problem is that we're not consistent with what RDO CI is testing.
>> In
>> my memory, and tell me if I'm wrong, but it happens that we're using an old
>> snapshot of packages which is older than what is actually tested &
>> verified
>> by RDO CI.
>>
>> The benefit of using this tested repo would be:
>> * this repository is already gated by Weirdo [2] which is running the
>> same jobs as Puppet OpenStack CI.
>> * you would not have less jobs failures, because RDO CI would have
>> detected bugs before.
>> * tripleo folks could focus a bit more on features & bugfixes in
>> TripleO
>> itself, rather than debugging CI issues and slowing down the review
>> process.
>> * Puppet OpenStack CI became really stable since we're using this
>> repository. We have a very few number of issues since then.
>>
>> Though, something I don't like in my proposal:
>> * tripleo would not bring short feedback to RDO folks if something is
>> broken
>>
>> But is TripleO supposed to debug other projects at the same time?
>> Don't
>> we have enough challenges in our project?
>>
>> This would be a temporary solution I think, until TripleO would be
>> part
>> of other upstream project gate (nova, etc) maybe one day.
>> But at this time, I honestly think TripleO (which is an installer)
>> folks spend too much time debugging CI for reasons that are
>> related to projects outside tripleo (puppet modules, openstack bugs,
>> packaging issues, etc).
>>
>> This is just a proposal and an idea to help TripleO folks to focus on
>> their review process, instead of debugging CI failures every week.
> 
> The problem I think is largly due to the fact that the RDO CI doesn't
> test the same things we do in the TripleO CI. It is essentially a
> different CI system.
> 
> Because the CI systems are different different breakages happen when
> you try to update one component vs. another. This is why we have to
> maintain 'current-tripleo' separately between RDO and TripleO.
> 
> I agree it would be nice if we could consolidate on a single repository
> across these projects. It would allow us to focus on review more...
> (less CI fixing).
> 
> Perhaps a test matrix comparing the two CI systems would help us get
> them closer to parity with what is covered. Even better would be just
> simply contributing to the same CI systems and scripts.
>

I think contributing to the same scripts would be a big win. From the
'undercloud install' on, it is totally feasible for both CI systems to
use the same script right now.

I am working on using tripleo.sh in rdoci, which, modulo the different
environment setups, would get us to the above goal.

If we could then get tripleoci consuming the same pre-built
undercloud.qcow2 that rdoci is using[1], I think the environment
differences would be negligible.

[1]
https://ci.centos.org/artifacts/rdo/images/mitaka/delorean/stable/undercloud.qcow2

> Dan
> 
>>
>>
>> Thanks for reading so far.
>> Your feedback and comments are more than welcome.
>>
>> [1] http://trunk.rdoproject.org/centos7/current-passed-ci/delorean.re
>> po
>> [2] https://github.com/redhat-openstack/weirdo
>> _
>> _
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubs
>> cribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] changes in irc alerts

2016-01-28 Thread John Trowbridge


On 01/27/2016 05:02 PM, James Slagle wrote:
> On Tue, Jan 26, 2016 at 7:15 PM, Derek Higgins  wrote:
> 
>> Hi All,
>>
>> For the last few months we've been alerting the #tripleo irc channel when
>> a card is open on the tripleo trello org, in the urgent list.
>>
>> When used I think it served a good purpose to alert people to the fact
>> that deploying master is currently broken, but it hasn't been used as much
>> as I hoped (not to mention the duplication of sometimes needing an LB bug
>> anyways). As most people are more accustomed to creating LP bugs when
>> things are broken and to avoid duplication perhaps it would have been
>> better to use LaunchPad to drive the alerts instead.
>>
>> I've changed the bot that was looking at trello to now instead look for
>> bugs on launchpad (hourly), it will alert the #tripleo channel if it finds
>> a bug that matches
>>
>> is filed against the tripleo project  AND
>> has an Importance of "Critical"AND
>> has the tag "alert" applied to it
>>
>> I brought this up in today's meeting and people were +1 on the idea, do the
>> rules above work for people? if not I can change them to something more
>> suitable.
>>
> 
> ​WFM, I just filed a new critical bug[1] and added the tag, so we can see
> if it works :)​
> 

Looks like it worked:

[17:10:24]  URGENT TRIPLEO TASKS NEED ATTENTION
[17:10:33]  https://bugs.launchpad.net/tripleo/+bug/1538761
[17:10:34]  Launchpad bug 1538761 in tripleo "stable/liberty
HA: mysqld on overcloud failing to start with /usr/libexec/mysqld:
option '--wsrep_notify_cmd' requires an argument" [Critical,In progress]
- Assigned to James Slagle (james-slagle)
[17:15:23]  slagle: I think we probably need to just merge
https://review.openstack.org/#/c/272194/

This is great! Thanks Derek.
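
For anyone curious, the matching rules above translate roughly into a
launchpadlib query like the one below. This is a hand-written sketch just to
illustrate the criteria, not the actual bot code, and I'm going from the
launchpadlib docs from memory, so treat the argument names as assumptions:

# Rough sketch of the alert query described above (not the real bot code).
from launchpadlib.launchpad import Launchpad

# Anonymous, read-only access is enough for searching public bugs.
lp = Launchpad.login_anonymously('tripleo-alert-bot', 'production')
tripleo = lp.projects['tripleo']

# Open bugs filed against tripleo, Importance Critical, tagged "alert".
tasks = tripleo.searchTasks(importance='Critical',
                            tags=['alert'],
                            status=['New', 'Confirmed', 'Triaged', 'In Progress'])
for task in tasks:
    print('URGENT TRIPLEO TASK NEEDS ATTENTION: %s' % task.web_link)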
> 
> ​[1] ​
>  https://bugs.launchpad.net/tripleo/+bug/1538761
> 
> thanks,
>> Derek.
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
> 
> 
> 
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] Mitaka Milestone 2

2016-01-18 Thread John Trowbridge
Hola Otripletas,

According to the official schedule[1], this week is scheduled for Mitaka
milestone 2. Do we intend to have something working for this milestone?

I have been tracking the issues when deploying with current from all
other projects[2] (as opposed to using an old pinned delorean for
non-tripleo projects). This has been a rapidly moving target (that we
have yet to hit for Mitaka), since we are not doing this in CI (which I
totally get the reasons for).

I think if we want to have TripleO working with the rest of OpenStack's
Mitaka milestone 2, we will need to prioritize resolving the outstanding
issues this week. I would love to see us not merge anything that is not
related to either adding some validation of the deployed overcloud or
fixing some issue related to deploying with delorean current for all
packages.

One huge benefit to TripleO of doing this prioritization would be the
free testing we could get in RDO next week. Otherwise, I will have to do
my best to hack around the known issues for the RDO test day, which will
not be a true test of TripleO.

Another benefit would be that if we get RDO CI testing TripleO as part
of our delorean promotion process, TripleO will be able to use the
automated current-passed-ci link instead of the manual current-tripleo
link. It will then be much easier to trace issues close to when they are
introduced rather than having a huge number of commits to comb through,
with many issues happening concurrently.

- trown

[1] http://docs.openstack.org/releases/schedules/mitaka.html
[2] https://etherpad.openstack.org/p/delorean_master_current_issues

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] workflow

2015-12-06 Thread John Trowbridge


On 12/03/2015 03:47 PM, Dan Prince wrote:
> On Tue, 2015-11-24 at 15:25 +, Dougal Matthews wrote:
>> On 23 November 2015 at 14:37, Dan Prince  wrote:
>>> There are lots of references to "workflow" within TripleO
>>> conversations
>>> these days. We are at (or near) the limit of what we can do within
>>> Heat
>>> with regards to upgrades. We've got a new TripleO API in the works
>>> (a
>>> new version of Tuskar basically) that is specifically meant to
>>> encapsulates business logic workflow around deployment. And, Lots
>>> of
>>> interest in using Ansible for this and that.
>>>
>>> So... Last week I spent a bit of time tinkering with the Mistral
>>> workflow service that already exists in OpenStack and after a few
>>> patches got it integrated into my undercloud:
>>>
>>> https://etherpad.openstack.org/p/tripleo-undercloud-workflow
>>>
>>> One could imagine us coming up with a set of useful TripleO
>>> workflows
>>> (something like this):
>>>
>>>  tripleo.deploy 
>>>  tripleo.update 
>>>  tripleo.run_ad_hoc_whatever_on_specific_roles <>
>>>
>>> Since Mistral (the OpenStack workflow service) can already interact
>>> w/
>>> keystone and has a good many hooks to interact with core OpenStack
>>> services like Swift, Heat, and Nova we might get some traction very
>>> quickly here. Perhaps we add some new Mistral Ironic actions? Or
>>> imagine smaller more focused Heat configuration stacks that we
>>> drive
>>> via Mistral? Or perhaps we tie in Zaqar (which already has some
>>> integration into os-collect-config) to run ad-hoc deployment
>>> snippets
>>> on specific roles in an organized fashion?  Or wrapping mistral w/
>>> tripleoclient to allow users to more easily call TripleO specific
>>> workflows (enhancing the user feedback like we do with our
>>> heatclient
>>> wrapping already)?
>>>
>>> Where all this might lead... I'm not sure. But I feel like we might
>>> benefit by adding a few extra options to our OpenStack deployment
>>> tool
>>> chain.
>> I think this sounds promising. Lots of the code in the CLI is about
>> managing workflows. For example when doing introspection we change
>> the node state, poll for the result, start introspection, poll for
>> the result, change the node state back and poll for the result. If
>> mistral can help here I expect it could give us a much more robust
>> solution.
> 
> Hows this look:
> 
> https://github.com/dprince/tripleo-mistral-
> workflows/blob/master/tripleo/baremetal.yaml
> 

This is a really good starter example because the bulk inspection
command is particularly problematic. I like this a lot. One really nice
thing here is that we get a REST API for free by using Mistral.
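
Just to illustrate the "REST API for free" point: kicking one of these
workflows off from Python would be something like the sketch below. The
workflow name and input are made up, and I'm going from memory of
python-mistralclient, so consider every call here an assumption rather
than working code:

# Sketch only: trigger a hypothetical TripleO workflow through Mistral.
from mistralclient.api import client as mistral_client

# Credentials and endpoint are placeholders.
mistral = mistral_client.client(username='admin',
                                api_key='password',
                                project_name='admin',
                                auth_url='http://192.0.2.1:5000/v2.0')

# 'tripleo.baremetal.introspect_all' is a made-up workflow name.
execution = mistral.executions.create('tripleo.baremetal.introspect_all',
                                      workflow_input={'node_uuids': []})
print('Execution %s is %s' % (execution.id, execution.state))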

>>
>>>  Dan
>>>
>>> ___
>>> ___
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsu
>>> bscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>> _
>> _
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubs
>> cribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] Location of TripleO REST API

2015-11-10 Thread John Trowbridge


On 11/10/2015 10:08 AM, Tzu-Mainn Chen wrote:
> Hi all,
> 
> At the last IRC meeting it was agreed that the new TripleO REST API
> should forgo the Tuskar name, and simply be called... the TripleO
> API.  There's one more point of discussion: where should the API
> live?  There are two possibilities:
> 
> a) Put it in tripleo-common, where the business logic lives.  If we
> do this, it would make sense to rename tripleo-common to simply
> tripleo.

This option makes the most sense to me now that I understand the
tripleoclient will also consume the REST API and not the underlying
python library with the "business logic".

From a packaging point of view, I think this is pretty painless. I think
we would have a "python-tripleo" package for the library that
provides/obsoletes the current tripleo-common package. As well as a
"tripleo-api" package for the REST API service. We would not need to
touch the tripleo-incubator package that currently is named just "tripleo".

> 
> b) Put it in its own repo, tripleo-api
> 
> 
> The first option made a lot of sense to people on IRC, as the proposed
> API is a very thin layer that's bound closely to the code in tripleo-
> common.  The major objection is that renaming is not trivial; however
> it was mentioned that renaming might not be *too* bad... as long as
> it's done sooner rather than later.
> 
> What do people think?
> 
> 
> Thanks,
> Tzu-Mainn Chen
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] [Ironic] Let's stop hijacking other projects' OSC namespaces

2015-11-10 Thread John Trowbridge


On 11/10/2015 10:43 AM, Ben Nemec wrote:
> On 11/10/2015 05:26 AM, John Trowbridge wrote:
>>
>>
>> On 11/09/2015 07:44 AM, Dmitry Tantsur wrote:
>>> Hi OOO'ers, hopefully the subject caught your attentions :)
>>>
>>> Currently, tripleoclient exposes several commands in "openstack
>>> baremetal" and "openstack baremetal introspection" namespaces belonging
>>> to ironic and ironic-inspector accordingly. TL;DR of this email is to
>>> deprecate them and move to TripleO-specific namespaces. Read on to know
>>> why.
>>>
>>> Problem
>>> ===
>>>
>>> I realized that we're doing a wrong thing when people started asking me
>>> why "baremetal introspection start" and "baremetal introspection bulk
>>> start" behave so differently (the former is from ironic-inspector, the
>>> latter is from tripleoclient). The problem with TripleO commands is that
>>> they're highly opinionated workflow commands, but there's no way a user
>>> can distinguish them from general-purpose ironic/ironic-inspector
>>> commands. The way some of them work is not generic enough ("baremetal
>>> import"), or uses different defaults from an upstream project
>>> ("configure boot"), or does something completely unacceptable upstream
>>> (e.g. the way "introspection bulk start" deals with node states).
>>>
>>> So, here are commands that tripleoclient exposes with my comments:
>>>
>>> 1. baremetal instackenv validate
>>>
>>>  This command assumes there's an "baremetal instackenv" object, while
>>> instackenv is a tripleo-specific file format.
>>>
>>> 2. baremetal import
>>>
>>>  This command supports a limited subset of ironic drivers and driver
>>> properties, only those known to os-cloud-config.
>>>
>>> 3. baremetal introspection bulk start
>>>
>>>  This command does several bad (IMO) things:
>>>  a. Messes with ironic node states
>>>  b. Operates implicitly on all nodes (in a wrong state)
>>>  c. Defaults to polling
>>>
>>
>> I have considered this whole command as a bug for a while now. I
>> understand what we were trying to do and why, but it is pretty bad to
>> hijack another project's namespace with a command that would get a firm
>> -2 there.
>>
>>> 4. baremetal show capabilities
>>>
>>>  This is the only commands that is generic enough and could actually
>>> make it to ironicclient itself.
>>>
>>> 5. baremetal introspection bulk status
>>>
>>>  See "bulk start" above.
>>>
>>> 6. baremetal configure ready state
>>>
>>>  First of all, this and the next command use "baremetal configure"
>>> prefix. I would not promise we'll never start using it in ironic,
>>> breaking the whole TripleO.
>>>
>>>  Second, it's actually DELL-specific.
>>>
>>> 7. baremetal configure boot
>>>
>>>  This one is nearly ok, but it defaults to local boot, which is not an
>>> upstream default. Default values for images may not work outside of
>>> TripleO as well.
>>>
>>> Proposal
>>> 
>>>
>>> As we already have "openstack undercloud" and "openstack overcloud"
>>> prefixes for TripleO, I suggest we move these commands under "openstack
>>> overcloud nodes" namespace. So we end up with:
>>>
>>>  overcloud nodes import
>>>  overcloud nodes configure ready state --drac
>>>  overcloud nodes configure boot
>>>
>>> As you see, I require an explicit --drac argument for "ready state"
>>> command. As to the remaining commands:
>>>
>>> 1. baremetal introspection status --all
>>>
>>>   This is fine to move to inspector-client, as inspector knows which
>>> nodes are/were on introspection. We'll need a new API though.
>>>
>>> 2. baremetal show capabilities
>>>
>>>   We'll have this or similar command in ironic, hopefully this cycle.
>>>
>>> 3. overcloud nodes introspect --poll --allow-available
>>>
>>>   I believe that we need to make 2 things explicit in this replacement
>>> for "introspection bulk status": polling and operating on "available"
>>> nodes.
>>
>> I am not totally convince

Re: [openstack-dev] [tripleo] Location of TripleO REST API

2015-11-10 Thread John Trowbridge


On 11/10/2015 10:37 AM, Giulio Fidente wrote:
> On 11/10/2015 04:16 PM, Dmitry Tantsur wrote:
>> On 11/10/2015 04:08 PM, Tzu-Mainn Chen wrote:
>>> Hi all,
>>>
>>> At the last IRC meeting it was agreed that the new TripleO REST API
>>> should forgo the Tuskar name, and simply be called... the TripleO
>>> API.  There's one more point of discussion: where should the API
>>> live?  There are two possibilities:
>>>
>>> a) Put it in tripleo-common, where the business logic lives.  If we
>>> do this, it would make sense to rename tripleo-common to simply
>>> tripleo.
>>
>> +1 for both
>>
>>>
>>> b) Put it in its own repo, tripleo-api
> 
> if both the api (coming) and the cli (currently python-tripleoclient)
> are meant to consume the shared code (business logic) from
> tripleo-common, then I think it makes sense to keep each in its own repo
> ... so that we avoid renaming tripleo-common as well

I generally agree with this logic... it is a bit weird to have one
consumer of the shared code in tree, while the other is out of tree.
That said, I am ok with wherever we put the api and whatever we call it
so long as we do it soon and commit to not changing it once decided.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] [Ironic] Let's stop hijacking other projects' OSC namespaces

2015-11-10 Thread John Trowbridge


On 11/09/2015 07:44 AM, Dmitry Tantsur wrote:
> Hi OOO'ers, hopefully the subject caught your attentions :)
> 
> Currently, tripleoclient exposes several commands in "openstack
> baremetal" and "openstack baremetal introspection" namespaces belonging
> to ironic and ironic-inspector accordingly. TL;DR of this email is to
> deprecate them and move to TripleO-specific namespaces. Read on to know
> why.
> 
> Problem
> ===
> 
> I realized that we're doing a wrong thing when people started asking me
> why "baremetal introspection start" and "baremetal introspection bulk
> start" behave so differently (the former is from ironic-inspector, the
> latter is from tripleoclient). The problem with TripleO commands is that
> they're highly opinionated workflow commands, but there's no way a user
> can distinguish them from general-purpose ironic/ironic-inspector
> commands. The way some of them work is not generic enough ("baremetal
> import"), or uses different defaults from an upstream project
> ("configure boot"), or does something completely unacceptable upstream
> (e.g. the way "introspection bulk start" deals with node states).
> 
> So, here are commands that tripleoclient exposes with my comments:
> 
> 1. baremetal instackenv validate
> 
>  This command assumes there's an "baremetal instackenv" object, while
> instackenv is a tripleo-specific file format.
> 
> 2. baremetal import
> 
>  This command supports a limited subset of ironic drivers and driver
> properties, only those known to os-cloud-config.
> 
> 3. baremetal introspection bulk start
> 
>  This command does several bad (IMO) things:
>  a. Messes with ironic node states
>  b. Operates implicitly on all nodes (in a wrong state)
>  c. Defaults to polling
> 

I have considered this whole command as a bug for a while now. I
understand what we were trying to do and why, but it is pretty bad to
hijack another project's namespace with a command that would get a firm
-2 there.

> 4. baremetal show capabilities
> 
>  This is the only commands that is generic enough and could actually
> make it to ironicclient itself.
> 
> 5. baremetal introspection bulk status
> 
>  See "bulk start" above.
> 
> 6. baremetal configure ready state
> 
>  First of all, this and the next command use "baremetal configure"
> prefix. I would not promise we'll never start using it in ironic,
> breaking the whole TripleO.
> 
>  Second, it's actually DELL-specific.
> 
> 7. baremetal configure boot
> 
>  This one is nearly ok, but it defaults to local boot, which is not an
> upstream default. Default values for images may not work outside of
> TripleO as well.
> 
> Proposal
> 
> 
> As we already have "openstack undercloud" and "openstack overcloud"
> prefixes for TripleO, I suggest we move these commands under "openstack
> overcloud nodes" namespace. So we end up with:
> 
>  overcloud nodes import
>  overcloud nodes configure ready state --drac
>  overcloud nodes configure boot
> 
> As you see, I require an explicit --drac argument for "ready state"
> command. As to the remaining commands:
> 
> 1. baremetal introspection status --all
> 
>   This is fine to move to inspector-client, as inspector knows which
> nodes are/were on introspection. We'll need a new API though.
> 
> 2. baremetal show capabilities
> 
>   We'll have this or similar command in ironic, hopefully this cycle.
> 
> 3. overcloud nodes introspect --poll --allow-available
> 
>   I believe that we need to make 2 things explicit in this replacement
> for "introspection bulk status": polling and operating on "available"
> nodes.

I am not totally convinced that we gain a huge amount by hiding the
state manipulation in this command. We need to move that logic to
tripleo-common anyways, so I think it is worth considering splitting it
from the introspect command.

Dmitry and I discussed briefly at summit having the ability to pass a
list of nodes to the inspector client for introspection as well. So if
we separated out the bulk state manipulation bit, we could just use that.

I get that this is going in the opposite direction of the original
intention of lowering the number of commands needed to get a functional
deployment. However, I think that goal is better solved elsewhere
(tripleo.sh, some ansible playbooks, etc.). Instead it would be nice if
the tripleoclient was more transparent.

Thanks Dmitry for starting this discussion.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] RFC: profile matching

2015-11-09 Thread John Trowbridge
In general, I think this is a good idea. Rather than putting the logic
for this in tripleoclient, it would be better to put it in
tripleo-common. Then the --assign-profiles just calls the function
imported from tripleo-common. This way the GUI could consume the same logic.
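
To make that a bit more concrete, the kind of helper I'm imagining in
tripleo-common would look roughly like the sketch below. This is made up
purely for illustration (the data structures are simplified and nothing
here is real code); it just shows the greedy assignment step described in
(2) below:

# Illustrative sketch only: a profile-assignment helper that could live
# in tripleo-common (data structures simplified, not real code).
def assign_profiles(nodes, flavor_counts):
    """Pick a 'profile' capability for nodes that do not have one yet.

    nodes: list of dicts like {'uuid': ..., 'capabilities': {...}}
    flavor_counts: how many nodes each profile needs, e.g.
                   {'control': 1, 'compute': 3}
    """
    assigned = {}
    needed = dict(flavor_counts)

    # Nodes that already carry an explicit profile are left untouched.
    for node in nodes:
        profile = node['capabilities'].get('profile')
        if profile in needed:
            needed[profile] -= 1

    # Greedily assign the rest, based on the "<profile>_profile"
    # capabilities set during introspection.
    for node in nodes:
        caps = node['capabilities']
        if caps.get('profile'):
            continue
        for profile in needed:
            if needed[profile] > 0 and caps.get('%s_profile' % profile) == '1':
                assigned[node['uuid']] = profile
                needed[profile] -= 1
                break

    missing = dict((p, c) for p, c in needed.items() if c > 0)
    if missing:
        raise RuntimeError('Not enough nodes for profiles: %s' % missing)
    return assigned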

On 11/09/2015 09:51 AM, Dmitry Tantsur wrote:
> Hi folks!
> 
> I spent some time thinking about bringing profile matching back in, so
> I'd like to get your comments on the following near-future plan.
> 
> First, the scope of the problem. What we do is essentially kind of
> capability discovery. We'll help nova scheduler with doing the right
> thing by assigning a capability like "suits for compute", "suits for
> controller", etc. The most obvious path is to use inspector to assign
> capabilities like "profile=1" and then filter nodes by it.
> 
> A special care, however, is needed when some of the nodes match 2 or
> more profiles. E.g. if we have all 4 nodes matching "compute" and then
> only 1 matching "controller", nova can select this one node for
> "compute" flavor, and then complain that it does not have enough hosts
> for "controller".
> 
> We also want to conduct some sanity check before even calling to
> heat/nova to avoid cryptic "no valid host found" errors.
> 
> (1) Inspector part
> 
> During the liberty cycle we've landed a whole bunch of API's to
> inspector that allow us to define rules on introspection data. The plan
> is to have rules saying, for example:
> 
>  rule 1: if memory_mb >= 8192, add capability "compute_profile=1"
>  rule 2: if local_gb >= 100, add capability "controller_profile=1"
> 
> Note that these rules are defined via inspector API using a JSON-based
> DSL [1].
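
For concreteness, rule 1 above would be expressed as something like the
payload below, created through inspector's rules endpoint. This is an
untested sketch written from the documentation in [1]; the endpoint URL
and the exact request details are assumptions on my part:

# Untested sketch of "rule 1" above as an inspector introspection rule.
import json
import requests

rule = {
    'description': 'nodes with >= 8 GiB RAM are compute candidates',
    'conditions': [
        {'op': 'ge', 'field': 'memory_mb', 'value': 8192},
    ],
    'actions': [
        {'action': 'set-capability', 'name': 'compute_profile', 'value': '1'},
    ],
}

# Default inspector port is assumed; a real call would also pass a token.
resp = requests.post('http://127.0.0.1:5050/v1/rules',
                     headers={'Content-Type': 'application/json'},
                     data=json.dumps(rule))
resp.raise_for_status()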
> 
> As you see, one node can receive 0, 1 or many such capabilities. So we
> need the next step to make a final decision, based on how many nodes we
> need of every profile.
> 
> (2) Modifications of `overcloud deploy` command: assigning profiles
> 
> New argument --assign-profiles will be added. If it's provided,
> tripleoclient will fetch all ironic nodes, and try to ensure that we
> have enough nodes with all profiles.
> 
> Nodes with existing "profile:xxx" capability are left as they are. For
> nodes without a profile it will look at "xxx_profile" capabilities
> discovered on the previous step. One of the possible profiles will be
> chosen and assigned to "profile" capability. The assignment stops as
> soon as we have enough nodes of a flavor as requested by a user.
> 
> (3) Modifications of `overcloud deploy` command: validation
> 
> To avoid 'no valid host found' errors from nova, the deploy command will
> fetch all flavors involved and look at the "profile" capabilities. If
> they are set for any flavors, it will check if we have enough ironic
> nodes with a given "profile:xxx" capability. This check will happen
> after profile assignment, if --assign-profiles is used.
> 
> Please let me know what you think.
> 
> [1] https://github.com/openstack/ironic-inspector#introspection-rules
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] Introspection rules aka advances profiles replacement: next steps

2015-10-14 Thread John Trowbridge


On 10/14/2015 10:57 AM, Ben Nemec wrote:
> On 10/14/2015 06:38 AM, Dmitry Tantsur wrote:
>> Hi OoO'ers :)
>>
>> It's going to be a long letter, fasten your seat-belts (and excuse my 
>> bad, as usual, English)!
>>
>> In RDO Manager we used to have a feature called advanced profiles 
>> matching. It's still there in the documentation at 
>> http://docs.openstack.org/developer/tripleo-docs/advanced_deployment/profile_matching.html
>>  
>> but the related code needed reworking and didn't quite make it upstream 
>> yet. This mail is an attempt to restart the discussion on this topic.
>>
>> Short explanation for those unaware of this feature: we used detailed 
>> data from introspection (acquired using hardware-detect utility [1]) to 
>> provide scheduling hints, which we called profiles. A profile is 
>> essentially a flavor, but calculated using much more data. E.g. you 
>> could say that a profile "foo" will be assigned to nodes with 1024 <= 
>> RAM <= 4096 and with GPU devices present (an artificial example). 
>> Profile was put on an Ironic as a capability as a result of 
>> introspection. Please read the documentation linked above for more details.
>>
>> This feature had a bunch of problems with it, to name a few:
>> 1. It didn't have an API
>> 2. It required a user to modify files by hand to use it
>> 3. It was tied to a pretty specific syntax of the hardware [1] library
>>
>> So we decided to split this thing into 2 parts, which are of value one 
>> their own:
>>
>> 1. Pluggable introspection ramdisk - so that we don't force dependency 
>> on hardware-detect on everyone.
>> 2. User-defined introspection rules - some DSL that will allow a user to 
>> define something like a specs file (see link above) via an API. The 
>> outcome would be something, probably capabilit(y|ies) set on a node.
>> 3. Scheduler helper - an utility that will take capabilities set by the 
>> previous step, and turn them into exactly one profile to use.
>>
>> Long story short, we got 1 and 2 implemented in appropriate projects 
>> (ironic-python-agent and ironic-inspector) during the Liberty time 
>> frame. Now it's time to figure out what we do in TripleO about this, namely:
>>
>> 1. Do we need some standard way to define introspection rules for 
>> TripleO? E.g. a JSON file like we have for ironic nodes?
> 
> Yes, please.
> 
>>
>> 2. Do we need a scheduler helper at all? We could use only capabilities 
>> for scheduling, but then we can end up with the following situation: 
>> node1 has capabilities C1 and C2, node2 has capability C1. First we 
>> deploy a flavor with capability C1, it goes to node1. Then we deploy a 
>> flavor with capability C2 and it fails, despite us having 2 correct 
>> nodes initially. This is what state files were solving in [1] (again, 
>> please refer to the documentation).
> 
> It sounds like the answer is yes.  If the existing scheduler can't
> handle a valid use case then we need some sort of solution.
> 
>>
>> 3. If we need, where does it go? tripleo-common? Do we need an HTTP API 
>> for it, or do we just do it in place where we need it? After all, it's a 
>> pretty trivial manipulation with ironic nodes..
> 
> I think that would depend on what the helper ends up being.  I can't see
> it needing a REST API, but presumably it will have to plug into Nova
> somehow.  If it's something that would be generally useful (which it
> sounds like it might be - Ironic capabilities aren't a TripleO-specific
> thing), then it belongs in Nova itself IMHO.
> 
>>
>> 4. Finally, we need an option to tell introspection to use 
>> python-hardware. I don't think it should be on by default, but it will 
>> require rebuilding of IPA (due to a new dependency).
> 
> Can we not just build it in always, but only use it when desired?  Is
> the one extra dependency that much of a burden?

It pulls in python-numpy and python-pandas, which are pretty large.

> 
>>
>> Looking forward to your opinions.
>> Dmitry.
>>
>> [1] https://github.com/redhat-cip/hardware
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
> 
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [api] [wsme] [ceilometer] Replacing WSME with _____ ?

2015-08-28 Thread John Trowbridge


On 08/28/2015 10:36 AM, Lucas Alvares Gomes wrote:
> Hi,
> 
>> If you just want to shoot the breeze please respond here. If you
>> have specific comments on the spec please response there.
>>
> 
> I have been thinking about doing it for Ironic as well so I'm looking
> for options. IMHO after using WSME I would think that one of the most
> important criteria we should start looking at is if the project has a
> health, sizable and active community around it. It's crucial to use
> libraries that are being maintained.
> 
> So at the present moment the [micro]framework that comes to my mind -
> without any testing or prototype of any sort - is Flask.

I personally find Flask to be super nice to work with. It is easy to
"visualize" what the API looks like just from reading the code. It also
has good documentation and a fairly large community.
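
For example, even a toy endpoint (completely made up, nothing to do with
Ceilometer's actual API) pretty much documents itself:

# Toy Flask app, only to show how readable the routing is.
from flask import Flask, jsonify

app = Flask(__name__)

SAMPLES = {'cpu_util': [42.0, 38.5, 40.1]}

@app.route('/v1/meters/<name>', methods=['GET'])
def get_meter(name):
    if name not in SAMPLES:
        return jsonify(error='no such meter'), 404
    return jsonify(meter=name, samples=SAMPLES[name])

if __name__ == '__main__':
    app.run(port=8080)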

> 
> Cheers,
> Lucas
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [Ironic] Proposal to add a new repository

2015-06-22 Thread John Trowbridge


On 06/22/2015 10:40 AM, Dmitry Tantsur wrote:
> On 06/22/2015 04:19 PM, Devananda van der Veen wrote:
>> Hi John,
>>
>> Thanks for the excellent summary! I found it very helpful to get caught
>> up. I'd like to make sure I understand the direction ahc is going. A
>> couple questions...

Thanks for your interest.

> 
> Let me add my $0.5 :)
> 
>>
>> I see that ahc is storing its information in swift. That's clever, but
>> if Ironic provided a blob store for each node, would that be better?

If the blob is large enough, this would be better. Originally we stored
the data in the extra column of the Ironic db, but that proved disastrous:

https://bugs.launchpad.net/ironic-inspector/+bug/1461252

>>
>> We discussed adding a search API to Ironic at the Vancouver summit,
>> though no work has been done on that yet, afaik. If ahc is going to grow
>> a REST API for searching for nodes based on specific criteria that it
>> discovered, could/should we combine these within Ironic's API?
> 
> I think John meant having API to replace scripts, so I guess search
> won't help. When we're talking about advanced matching, we're talking
> about the following:
> 1. We have a ramdisk tool (based on [8]) to get an insane amount of facts
> from within the ramdisk (say, 1000 of them)
> 2. We have an Inspector plugin to put them all in Swift (or Ironic blob
> storage as above)
> 3. We have config files (aka rules) written in special JSON-alike DSL to
> do matching (one of the weak points is that these are files - I'd like
> API endpoint to accept these rules instead).
> 4. We have a script to run this DSL and get some output (match/not match
> + some matched variables - similar to what regexps do).
> As I understood it, John wants the latter to become an API endpoint,
> accepting rules (and maybe node UUIDs) and outputting some result.
> 
> Not sure about benchmarking here, but again, it's probably an API
> endpoint that accepts some minimal expectations, and puts failed nodes
> to maintenance mode, if they fail to comply (again, that's how I
> understood it).
> 
> It's not hard to make these API endpoints part of Inspector, but it's
> somewhat undesirable to have them optional...
> 
>>
>>  From a service coupling perspective, I like the approach that ahc is
>> optional, and also that Ironic-inspector is optional, because this keeps
>> the simple use-case for Ironic, well, simple! That said, this seems more
>> like a configuration setting (should inspector do extra things?) than an
>> entirely separate service, and separating them might be unnecessarily
>> complicated.
> 
> We keep thinking about it as well. After all, right now it's just a
> couple of utilities. There are a few more concerns that initially made me
> pull out this code:
> 1. ahc-tools currently depends on the library [8], which I wish would be
> developed much more openly
> 2. it's cool that inspector is pluggable, but it has its cost: there's a
> poor feedback loop from inspector processing plugins to a user - like
> with all highly asynchronous code
> 3. it's also not possible (at least for now) to request a set of
> processing plugins when starting introspection via inspector.
> 
> We solved the latter 2 problems by moving code to scripts. So now
> Inspector only puts some data to Swift, and scripts can do everything else.
> 
> So now we're left with
> 1. dependency on "hardware" library
> 2. not very stable interface, much less stable than that of Inspector
> 
> We still wonder how to solve these 2 without creating one more
> repository. Any ideas are welcome :)

It is a goal of mine to solve issue 1 incrementally over time, either by
improving the library (both in function and in openness), or by slowly
moving the implementation. That does not seem impossible to do within
the inspector tree.

However, issue 2 is a fact. We currently have scripts, and we want to
have a REST API. I do not see a transition between the two that does not
involve a large amount of churn.

I am not sure how to solve issue 2 without a separate repository. I do
think there is a logical separation of concerns though, so we may not
need to completely merge the two in the future. Inspector collects data,
and ahc-tools (or whatever it is eventually named) is used to act on the
data.
> 
>>
>> It sounds like this is the direction you'd like to go, and you took the
>> current approach for expediency. If so, I'd like us to discuss a path to
>> merge the functionality as it matures, and decide whether a separate
>> repository is the right way to go long term.
>>
>> Thanks much,
>> De

[openstack-dev] [Ironic] Proposal to add a new repository

2015-06-22 Thread John Trowbridge
This is a proposal to add a new repository governed by the ironic
inspector subteam. The current repository is named ahc-tools[1], however
there is no attachment to this name. "ironic-inspector-extra" would seem
to fit if this is moved under the Ironic umbrella.

What is AHC?

* AHC as a term comes from the enovance edeploy installation method[2].
* The general concept is that we want to have a very granular picture of
the physical hardware being used in a deployment in order to be able to
match specific hardware to specific roles, as well as the ability to
find poorly performing outliers before we attempt to deploy.
* For example: As a cloud operator, I want to make sure all logical
disks have random read IOPs within 15% variance of each other.
* The huge benefit of this tooling over current inspection is the number
of facts collected (~1000 depending on the hardware), all of which can
be used for matching.
* Another example: As an end user, I would like to request a bare metal
machine with a specific model GPU.

What is ahc-tools?
--
* We first tried to place all of this logic into a plugin in
inspector[3] (discoverd at the time). [4]
* This worked fine for just collecting some of the simple facts, however
we now had a coupling between booting a ramdisk, and matching against
the collected data.
* ahc-tools started as a way to uncouple these two steps[5].
* We also added a wrapper around the enovance report tooling[6], as it
already had the ability to generate reports based on the collected data,
but was designed to read in the data from the filesystem.
* The report tool has two functions.
* First, it can group the systems by category (NICs, Firmware,
Processors, etc.).
* Second, it can use statistical analysis to find performance outliers.
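
To make the outlier idea concrete, the check is conceptually something like
the toy sketch below (the real ahc-tools reporting is more involved; the
numbers and the 15% threshold are just an example):

# Toy sketch of a "within 15% of the mean" style outlier check.
def find_outliers(samples, tolerance=0.15):
    """samples: dict of node uuid -> measured value (e.g. random read IOPS)."""
    mean = sum(samples.values()) / float(len(samples))
    return dict((node, value) for node, value in samples.items()
                if mean and abs(value - mean) / mean > tolerance)

iops = {'node-1': 4800, 'node-2': 4650, 'node-3': 3000, 'node-4': 4700}
print(find_outliers(iops))  # -> {'node-3': 3000}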

Why is ahc-tools useful to Ironic?
--
* If we run benchmarks on hardware whenever it is turned back in by a
tenant, we can easily put nodes into maintenance if the hardware is
performing below some set threshold. This would allow us to have better
certainty that the end user is getting what we promised them.
* The advanced matching could also prove very useful. For VMs, I think
the pets vs cattle analogy holds up very well, however many use cases
for having cloud based bare metal involve access to specific hardware
capabilities. I think advanced matching could help bridge this gap.

Why not just put this code directly into inspector?
---
* Clearly this code is 100% dependent on inspector. However, inspector
is quite stable, and works great without any of this extra tooling.
* ahc-tools is very immature, and will need many breaking changes to get
to the same stability level as inspector.

Why aren't you following the downstream->stackforge->openstack path?

* This was the initial plan[7], however we were told that under the new
"big tent", that the openstack namespace is no longer meant to signify
maturity of a project.
* Instead, we were told we should propose the project directly to
Ironic, or make a new separate project.

What is the plan to make ahc-tools better?
--
* The first major overhaul we would like to do is to put the reporting
and matching functionality behind a REST API.
* Reporting in particular will require significant work, as the current
wrapper script wraps code that was never designed to be a library (Its
output is just a series of print statements). One option is to improve
the library[8] to be more library like, and the other is to reimplement
the logic itself. Personally, while reimplementing the library is a
large amount of work, I think it is probably worth the effort.
* We would also like to add an API endpoint to coordinate distributed
checks. For instance, if we want to confirm that there is physical
network connectivity between a set of nodes, or if we would like to
confirm the bandwidth of those connections.
* The distributed checks and REST API will hopefully be completed in the
Liberty timeframe.
* Overhaul of the reporting will likely be an M target, unless there is
interest from new contributors in working on this feature.
* We are planning a talk for Tokyo on inspector that will also include
details about this project.

Thank you very much for your consideration.

Respectfully,
John Trowbridge

[1] https://github.com/rdo-management/ahc-tools
[2] https://github.com/enovance/edeploy/blob/master/docs/AHC.rst
[3]
https://github.com/openstack/ironic-inspector/commit/22a0e24efbef149377ea1e020f2d81968c10b58c
[4] We can have out-of-tree plugins for the inspector, so some of this
code might become a plugin again, but within the new repository tree.
[5]
https://github.com/openstack/ironic-inspector/commit/eaad7e09b99ab498e080e6e0ab71e69d00275422
[6]
https://github.com/rdo-management/ahc-tools/blob/maste

Re: [openstack-dev] [Ironic] Time to decide something on the vendor tools repo

2015-06-04 Thread John Trowbridge


On 06/04/2015 09:29 AM, Dmitry Tantsur wrote:
> Hi again!
> 
> ~ half an hour has passed since my last email, and now I have one more
> question to discuss and decide!

At this rate, we could match the [all] tag today.

> 
> On the summit we were discussing things like chassis discovery, and
> arrived at rough conclusion that we want it to be somewhere in a
> separate repo. More precisely, we wanted some place for vendor to
> contribute code (aka scripts) that aren't good fit for both standard
> interfaces and existing vendor passthrough (chassis discovery again is a
> good example).
> 
> I suggest to decide something finally to unblock people. A few questions
> follow:
> 
> Should we
> 1. create one repo for all vendors (say, ironic-contrib-tools)

As this is for vendor-specific stuff, I think there is a good chance
that there will not be a lot of cross-vendor reviews.

> 2. create a repo for every vendor appearing
> 3. ask vendors to go for stackforge, at least until their solution
> shapes (like we did with inspector)?

It seems like 2 and 3 are the same except for ownership and location of
the repos. I think it makes more sense for vendors to own their own
repos on stackforge at least until there is enough interest outside of
that vendor to get good external reviews.

> 4. %(your_variant)s
> 
> If we go down 1-2 route, should
> 1. ironic-core team own the new repo(s)?
> 2. or should we form a new team from interested people?
> (1 and 2 and not exclusive actually).
> 
> I personally would go for #3 - stackforge. We already have e.g.
> stackforge/proliantutils as an example of something closely related to
> Ironic, but still independent.
> 
> I'm also fine with #1#1 (one repo, owned by group of interested people).
> 
> What do you think?
> 
> Dmitry
> 
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev