Re: [openstack-dev] [tripleo] CI is broken

2018-11-07 Thread Emilien Macchi
No alerts anymore; the gate is green.
Recheck if needed.

On Wed, Nov 7, 2018 at 2:22 PM Emilien Macchi  wrote:

> I updated the bugs, and so far we have one alert left:
> https://bugs.launchpad.net/tripleo/+bug/1801969
>
> The patch is in the gate; be patient, and then we'll be able to +A/recheck
> stuff again.
>
> On Wed, Nov 7, 2018 at 7:30 AM Juan Antonio Osorio Robles <
> jaosor...@redhat.com> wrote:
>
>> Hello folks,
>>
>>
>> Please do not attempt to merge or recheck patches until we get this
>> sorted out.
>>
>> We are dealing with several issues that have broken all jobs.
>>
>> https://bugs.launchpad.net/tripleo/+bug/1801769
>> https://bugs.launchpad.net/tripleo/+bug/1801969
>> https://bugs.launchpad.net/tripleo/+bug/1802083
>> https://bugs.launchpad.net/tripleo/+bug/1802085
>>
>> Best Regards!
>>
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
>
> --
> Emilien Macchi
>


-- 
Emilien Macchi
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI is broken

2018-11-07 Thread Emilien Macchi
I updated the bugs, and so far we have one alert left:
https://bugs.launchpad.net/tripleo/+bug/1801969

The patch is in the gate; be patient, and then we'll be able to +A/recheck
stuff again.

On Wed, Nov 7, 2018 at 7:30 AM Juan Antonio Osorio Robles <
jaosor...@redhat.com> wrote:

> Hello folks,
>
>
> Please do not attempt to merge or recheck patches until we get this
> sorted out.
>
> We are dealing with several issues that have broken all jobs.
>
> https://bugs.launchpad.net/tripleo/+bug/1801769
> https://bugs.launchpad.net/tripleo/+bug/1801969
> https://bugs.launchpad.net/tripleo/+bug/1802083
> https://bugs.launchpad.net/tripleo/+bug/1802085
>
> Best Regards!
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>


-- 
Emilien Macchi
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI is broken

2018-11-07 Thread Juan Antonio Osorio Robles
Hello folks,


Please do not attempt to merge or recheck patches until we get this
sorted out.

We are dealing with several issues that have broken all jobs.

https://bugs.launchpad.net/tripleo/+bug/1801769
https://bugs.launchpad.net/tripleo/+bug/1801969
https://bugs.launchpad.net/tripleo/+bug/1802083
https://bugs.launchpad.net/tripleo/+bug/1802085

Best Regards!


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][upgrade] New jobs for tripleo Upgrade in the CI.

2018-10-14 Thread Arie Bregman
On Fri, Oct 12, 2018 at 2:10 PM Sofer Athlan-Guyot 
wrote:

> Hi,
>
> Testing and maintaining a green status for upgrade jobs within the 3h
> time limit has proven to be a very difficult job, to say the least.
>
> The net result has been: we don't have anything even touching the
> upgrade code in the CI.
>
> So during the Denver PTG it was decided to give up on running a full
> upgrade job within the 3h time limit and instead to focus on two
> complementary approaches that at least touch the upgrade code:
>  1. run a standalone upgrade: this tests the Ansible upgrade playbook;
>  2. run an N->N upgrade: this tests the upgrade Python code;
>
> And here they are, still not merged but seen working:
>  - tripleo-ci-centos-7-standalone-upgrade:
>https://review.openstack.org/#/c/604706/
>  - tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades:
>https://review.openstack.org/#/c/607848/9
>
> The first is good to merge (though others could disagree); the second could
> be as well (though I tend to disagree :))
>
> The first leverages the standalone deployment and executes a standalone
> upgrade just after it.
>
> The limitation is that it only tests non-HA services (sorry pidone,
> cannot test HA in standalone) and only the upgrade_tasks (i.e. not any
> workflow related to the upgrade CLI).
>
> The main benefits here are:
>  - ~2h to run the upgrade, still a bit long but well below the 3h
>    time limit;
>  - we trigger a yum upgrade so that we can catch problems there as well;
>  - we test the standalone upgrade, which is good in itself;
>  - composable roles are available (as in the standalone/all-in-one
>    deployment), so you can make a specific upgrade test for your project
>    if it fits into the standalone constraint;
>
> For this last point, if standalone-specific roles eventually go into
> project testing (nova, neutron, ...), those projects would also have a way
> to test upgrade tasks.  This would be the best-case scenario.
>
> Now, for the second point, the N->N upgrade.  Its "limitation" is that
> ... well it doesn't run a yum upgrade at all.  We start from master and
> run the upgrade to master.
>
> Its main benefits are:
>  - it takes ~2h20 to run, so well under the 3h time limit;
>  - the tripleoclient upgrade code is run, which is one thing that the
>    standalone upgrade cannot do;
>  - it also tends to exercise the idempotency of all the tasks, as it runs
>    them on an already "upgraded" node;
>  - as an added bonus, it could gate the tripleo-upgrade role as well,
>    since it definitely loads all of the role's tasks[1].
>
> For those who stayed with me to this point, I'm throwing in another CI
> test that has already proved useful (it caught errors): the
> ansible-lint test.  After a standalone deployment we just run
> ansible-lint on all generated playbooks[2].
>
> It produces standalone_ansible_lint.log[3] in the working directory. It
> only takes a couple of minutes to install ansible-lint and run it. It
> definitely gates against typos and the like. It touches hard-to-reach
> code as well; for instance, the fast_forward tasks are linted.
> There are still no pidone tasks in there, but linting could easily be
> added to a job that generates HA tasks.
>
> Note that by default ansible-lint barks, as the generated playbooks hit
> several lint problems, so only checks for syntax errors and misnamed
> tasks or parameters are currently activated.  But all the lint problems
> are logged in the above file and can be fixed later on, at which point
> we could activate full lint gating.
>
> Thanks for this long read; any comments, shouts of victory, cries of
> despair, and reviews are welcome.
>

That's awesome. It's perfect for a project we are working on (Tobiko), where
we want to run tests before the upgrade (setting up resources) and after it
(verifying those resources are still available).

I want to add such a job (standalone upgrade) and I need help:

https://review.openstack.org/#/c/610397/

How do I set a tempest regex for pre-upgrade and another one for
post-upgrade?


> [1] but this has still to be investigated.
> [2] testing review https://review.openstack.org/#/c/604756/ and main code
> https://review.openstack.org/#/c/604757/
> [3] sample output http://paste.openstack.org/show/731960/
> --
> Sofer Athlan-Guyot
> chem on #freenode
> Upgrade DFG.
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][upgrade] New jobs for tripleo Upgrade in the CI.

2018-10-12 Thread Wesley Hayutin
On Fri, Oct 12, 2018 at 5:10 AM Sofer Athlan-Guyot 
wrote:

> Hi,
>
> Testing and maintaining a green status for upgrade jobs within the 3h
> time limit has proven to be a very difficult job, to say the least.
>

Indeed

>
> The net result has been: we don't have anything even touching the
> upgrade code in the CI.
>
> So during the Denver PTG it was decided to give up on running a full
> upgrade job within the 3h time limit and instead to focus on two
> complementary approaches that at least touch the upgrade code:
>  1. run a standalone upgrade: this tests the Ansible upgrade playbook;
>  2. run an N->N upgrade: this tests the upgrade Python code;


> And here they are, still not merged but seen working:
>  - tripleo-ci-centos-7-standalone-upgrade:
>https://review.openstack.org/#/c/604706/
>  - tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades:
>https://review.openstack.org/#/c/607848/9
>
> The first is good to merge (though others could disagree); the second could
> be as well (though I tend to disagree :))
>
> The first leverages the standalone deployment and executes a standalone
> upgrade just after it.
>
> The limitation is that it only tests non-HA services (sorry pidone,
> cannot test HA in standalone) and only the upgrade_tasks (i.e. not any
> workflow related to the upgrade CLI).
>

This can be augmented with 3rd party CI.  The pidone team and the CI team are
putting the final touches on a 3rd party job for HA services.  Looking
forward, I could see a 3rd party upgrade job that runs the pidone
verification tests.


>
> The main benefits here are:
>  - ~2h to run the upgrade, still a bit long but well below the 3h
>    time limit;
>  - we trigger a yum upgrade so that we can catch problems there as well;
>  - we test the standalone upgrade, which is good in itself;
>  - composable roles are available (as in the standalone/all-in-one
>    deployment), so you can make a specific upgrade test for your project
>    if it fits into the standalone constraint;
>

These are all huge benefits over the previous implementation, made available
to us via the standalone deployment.

>
> For this last point, if standalone-specific roles eventually go into
> project testing (nova, neutron, ...), those projects would also have a way
> to test upgrade tasks.  This would be the best-case scenario.
>

!   woot !!!
This is a huge point that TripleO folks need to absorb!!
!   woot !!!

In the next several sprints the TripleO CI team will do our best to focus
on standalone deployments, converting TripleO's upstream jobs over and
paving the way for other projects to start consuming them.  IMHO other
projects would be *very* interested in testing an upgrade of their
individual component w/o all the noise of unrelated services/components.


>
> Now, for the second point, the N->N upgrade.  Its "limitation" is that
> ... well it doesn't run a yum upgrade at all.  We start from master and
> run the upgrade to master.
>
> Its main benefits are:
>  - it takes ~2h20 to run, so well under the 3h time limit;
>  - the tripleoclient upgrade code is run, which is one thing that the
>    standalone upgrade cannot do;
>  - it also tends to exercise the idempotency of all the tasks, as it runs
>    them on an already "upgraded" node;
>  - as an added bonus, it could gate the tripleo-upgrade role as well,
>    since it definitely loads all of the role's tasks[1].
>
> For those who stayed with me to this point, I'm throwing in another CI
> test that has already proved useful (it caught errors): the
> ansible-lint test.  After a standalone deployment we just run
> ansible-lint on all generated playbooks[2].
>

This is nice, thanks chem!


>
> It produces standalone_ansible_lint.log[3] in the working directory. It
> only takes a couple of minutes to install ansible-lint and run it. It
> definitely gates against typos and the like. It touches hard-to-reach
> code as well; for instance, the fast_forward tasks are linted.
> There are still no pidone tasks in there, but linting could easily be
> added to a job that generates HA tasks.
>
> Note that by default ansible-lint barks, as the generated playbooks hit
> several lint problems, so only checks for syntax errors and misnamed
> tasks or parameters are currently activated.  But all the lint problems
> are logged in the above file and can be fixed later on, at which point
> we could activate full lint gating.
>
> Thanks for this long read; any comments, shouts of victory, cries of
> despair, and reviews are welcome.
>
> [1] but this has still to be investigated.
> [2] testing review https://review.openstack.org/#/c/604756/ and main code
> https://review.openstack.org/#/c/604757/
> [3] sample output http://paste.openstack.org/show/731960/
> --
> Sofer Athlan-Guyot
> chem on #freenode
> Upgrade DFG.
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: 

[openstack-dev] [tripleo][ci][upgrade] New jobs for tripleo Upgrade in the CI.

2018-10-12 Thread Sofer Athlan-Guyot
Hi,

Testing and maintaining a green status for upgrade jobs within the 3h
time limit has proven to be a very difficult job, to say the least.

The net result has been: we don't have anything even touching the
upgrade code in the CI.

So during the Denver PTG it was decided to give up on running a full
upgrade job within the 3h time limit and instead to focus on two
complementary approaches that at least touch the upgrade code:
 1. run a standalone upgrade: this tests the Ansible upgrade playbook;
 2. run an N->N upgrade: this tests the upgrade Python code;

And here they are, still not merged but seen working:
 - tripleo-ci-centos-7-standalone-upgrade:
   https://review.openstack.org/#/c/604706/
 - tripleo-ci-centos-7-scenario000-multinode-oooq-container-upgrades:
   https://review.openstack.org/#/c/607848/9

The first is good to merge (though others could disagree); the second could
be as well (though I tend to disagree :))

The first leverages the standalone deployment and executes a standalone
upgrade just after it.

The limitation is that it only tests non-HA services (sorry pidone,
cannot test HA in standalone) and only the upgrade_tasks (i.e. not any
workflow related to the upgrade CLI).

The main benefits here are:
 - ~2h to run the upgrade, still a bit long but well below the 3h
   time limit;
 - we trigger a yum upgrade so that we can catch problems there as well;
 - we test the standalone upgrade, which is good in itself;
 - composable roles are available (as in the standalone/all-in-one
   deployment), so you can make a specific upgrade test for your project
   if it fits into the standalone constraint;

For this last point, if standalone-specific roles eventually go into
project testing (nova, neutron, ...), those projects would also have a way
to test upgrade tasks.  This would be the best-case scenario.

Now, for the second point, the N->N upgrade.  Its "limitation" is that
... well it doesn't run a yum upgrade at all.  We start from master and
run the upgrade to master.

Its main benefits are:
 - it takes ~2h20 to run, so well under the 3h time limit;
 - the tripleoclient upgrade code is run, which is one thing that the
   standalone upgrade cannot do;
 - it also tends to exercise the idempotency of all the tasks, as it runs
   them on an already "upgraded" node (a sketch of such a check follows
   below);
 - as an added bonus, it could gate the tripleo-upgrade role as well,
   since it definitely loads all of the role's tasks[1].

For those who stayed with me to this point, I'm throwing in another CI
test that has already proved useful (it caught errors): the
ansible-lint test.  After a standalone deployment we just run
ansible-lint on all generated playbooks[2].

It produces standalone_ansible_lint.log[3] in the working directory. It
only takes a couple of minutes to install ansible-lint and run it. It
definitely gates against typos and the like. It touches hard-to-reach
code as well; for instance, the fast_forward tasks are linted.
There are still no pidone tasks in there, but linting could easily be
added to a job that generates HA tasks.

Note that by default ansible-lint barks, as the generated playbooks hit
several lint problems, so only checks for syntax errors and misnamed
tasks or parameters are currently activated.  But all the lint problems
are logged in the above file and can be fixed later on, at which point
we could activate full lint gating.
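To picture the lint step, here is a minimal sketch (not the actual CI
script; see [2] for that). The glob pattern is a placeholder for wherever
the deployment writes its playbooks, and the restriction to syntax/naming
rules described above is omitted:

    # Run ansible-lint on each generated playbook and collect all output
    # into standalone_ansible_lint.log.
    import glob
    import subprocess

    report = []
    for playbook in sorted(glob.glob("generated-playbooks/*.yaml")):
        result = subprocess.run(["ansible-lint", playbook],
                                capture_output=True, text=True)
        report.append(f"==> {playbook}\n{result.stdout}{result.stderr}")

    with open("standalone_ansible_lint.log", "w") as log:
        log.write("\n".join(report))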

Thanks for this long read; any comments, shouts of victory, cries of
despair, and reviews are welcome.

[1] but this has still to be investigated.
[2] testing review https://review.openstack.org/#/c/604756/ and main code 
https://review.openstack.org/#/c/604757/
[3] sample output http://paste.openstack.org/show/731960/
--
Sofer Athlan-Guyot
chem on #freenode
Upgrade DFG.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci] Having more that one queue for gate pipeline at tripleo

2018-10-11 Thread Clark Boylan
On Thu, Oct 11, 2018, at 7:17 AM, Ben Nemec wrote:
> 
> 
> On 10/11/18 8:53 AM, Felix Enrique Llorente Pastora wrote:
> > So for example, I don't see why changes at tripleo-quickstart can be
> > reset if tripleo-ui fails; this is the kind of thing that maybe can be
> > optimized.
> 
> Because if two incompatible changes are proposed to tripleo-quickstart 
> and tripleo-ui and both end up in parallel gate queues at the same time, 
> it's possible both queues could get wedged. Quickstart and the UI are 
> not completely independent projects. Quickstart has roles for deploying 
> the UI, which means there is a connection there.
> 
> I think the only way you could have independent gate queues is if you 
> had two disjoint sets of projects that could be gated without any use of 
> projects from the other set. I don't think it's possible to divide 
> TripleO in that way, but if I'm wrong then maybe you could do multiple 
> queues.

To follow up on this, the Gate pipeline queue that your projects belong to is
how you indicate to Zuul that there is coupling between these projects. Having
things set up in this way allows you to ensure (through the Gate and Zuul's
speculative future states) that a change to one project in the queue can't
break another, because they are tested together.

If your concern is "time to merge", splitting queues won't help all that much
unless you put all of the unreliable broken code with broken tests in one queue
and have the reliable code in another queue. Zuul tests everything in parallel 
within a queue. This means that if your code base and its tests are reliable 
you can merge 20 changes all at once and the time to merge for all 20 changes 
is the same as a single change. Problems arise when tests fail and these future 
states have to be updated and retested. This will affect one or many queues.

The fix here is to work on making reliable test jobs so that you can merge all 
20 changes in the span of time it takes to merge a single change.  This isn't 
necessarily easy, but helps you merge more code and be confident it works too.
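To make the arithmetic concrete, here is a toy model (an editorial sketch,
not Zuul's actual algorithm): assume each change independently fails its
gate jobs with some probability, and any failure forces the whole window
of changes to be retested.

    # Toy model of speculative gating: a window of k changes is tested in
    # parallel; any single failure resets and retests the whole window.
    def expected_rounds(window: int, fail_rate: float) -> float:
        p_all_pass = (1.0 - fail_rate) ** window
        return 1.0 / p_all_pass  # mean of a geometric distribution

    for fail_rate in (0.01, 0.05, 0.20):
        rounds = expected_rounds(20, fail_rate)
        print(f"per-change fail rate {fail_rate:.0%}: "
              f"~{rounds:.1f} rounds to merge 20 changes")

With reliable jobs (1%) the window of 20 merges in roughly one round; at a
20% failure rate the same window needs dozens of rounds, which is exactly
the point about fixing test reliability rather than splitting queues.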

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci] Having more that one queue for gate pipeline at tripleo

2018-10-11 Thread Ben Nemec



On 10/11/18 8:53 AM, Felix Enrique Llorente Pastora wrote:
So for example, I don't see why changes at tripleo-quickstart can be
reset if tripleo-ui fails; this is the kind of thing that maybe can be
optimized.


Because if two incompatible changes are proposed to tripleo-quickstart 
and tripleo-ui and both end up in parallel gate queues at the same time, 
it's possible both queues could get wedged. Quickstart and the UI are 
not completely independent projects. Quickstart has roles for deploying 
the UI, which means there is a connection there.


I think the only way you could have independent gate queues is if you 
had two disjoint sets of projects that could be gated without any use of 
projects from the other set. I don't think it's possible to divide 
TripleO in that way, but if I'm wrong then maybe you could do multiple 
queues.
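One way to picture this criterion: treat projects as nodes and "gated
together" relationships as edges; independent gate queues correspond to
connected components of that graph. A minimal sketch follows (the project
names and edges are illustrative, not TripleO's real dependency graph):

    # Projects can live in separate gate queues only if the dependency
    # graph splits into disjoint connected components.
    from collections import defaultdict

    def queue_partitions(edges):
        """Return the connected components of an undirected graph."""
        graph = defaultdict(set)
        nodes = set()
        for a, b in edges:
            graph[a].add(b)
            graph[b].add(a)
            nodes.update((a, b))
        seen, components = set(), []
        for node in nodes:
            if node in seen:
                continue
            stack, component = [node], set()
            while stack:
                cur = stack.pop()
                if cur in component:
                    continue
                component.add(cur)
                stack.extend(graph[cur] - component)
            seen |= component
            components.append(component)
        return components

    # quickstart deploys the UI, so the two end up in one component,
    # i.e. one shared queue.
    edges = [("tripleo-quickstart", "tripleo-ui"),
             ("tripleo-quickstart", "tripleo-ci")]
    print(queue_partitions(edges))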




On Thu, Oct 11, 2018 at 1:17 PM Emilien Macchi wrote:



On Thu, Oct 11, 2018 at 10:01 AM Felix Enrique Llorente Pastora
<ellor...@redhat.com> wrote:

Hello there,

    After suffering a lot from Zuul's TripleO gate pipeline
queue resetting after failures on patches, I have asked myself what
would happen if we had more than one queue for gating TripleO.

    After a quick read here
https://zuul-ci.org/docs/zuul/user/gating.html, I found the
following:

"If changes with cross-project dependencies do not share a
change queue then Zuul is unable to enqueue them together, and
the first will be required to merge before the second is enqueued."

    So it makes sense to share a Zuul queue, but maybe only one
queue for all TripleO projects is too much; for example, sharing a
queue between tripleo-ui and tripleo-quickstart. Maybe we need
two queues, one for product stuff and one for CI, so
product does not get reset if CI fails in a patch.

    What do you think?

Probably a wrong example, as the TripleO UI gate is using CI jobs
running tripleo-quickstart scenarios.
We could create more queues for projects which are really
independent from each other, but we need to be very careful about it.
-- 
Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe:
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



--
Quique Llorente

Openstack TripleO CI

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci] Having more that one queue for gate pipeline at tripleo

2018-10-11 Thread Felix Enrique Llorente Pastora
So for example, I don't see why changes at tripleo-quickstart can be reset
if tripleo-ui fails; this is the kind of thing that maybe can be optimized.

On Thu, Oct 11, 2018 at 1:17 PM Emilien Macchi  wrote:

>
>
> On Thu, Oct 11, 2018 at 10:01 AM Felix Enrique Llorente Pastora <
> ellor...@redhat.com> wrote:
>
>> Hello there,
>>
>>    After suffering a lot from Zuul's TripleO gate pipeline queue
>> resetting after failures on patches, I have asked myself what would happen
>> if we had more than one queue for gating TripleO.
>>
>>    After a quick read here https://zuul-ci.org/docs/zuul/user/gating.html,
>> I found the following:
>>
>> "If changes with cross-project dependencies do not share a change queue
>> then Zuul is unable to enqueue them together, and the first will be
>> required to merge before the second is enqueued."
>>
>>    So it makes sense to share a Zuul queue, but maybe only one queue for
>> all TripleO projects is too much; for example, sharing a queue between
>> tripleo-ui and tripleo-quickstart. Maybe we need two queues, one for
>> product stuff and one for CI, so product does not get reset if CI fails
>> in a patch.
>>
>>    What do you think?
>>
>
> Probably a wrong example, as the TripleO UI gate is using CI jobs running
> tripleo-quickstart scenarios.
> We could create more queues for projects which are really independent from
> each other, but we need to be very careful about it.
> --
> Emilien Macchi
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>


-- 
Quique Llorente

Openstack TripleO CI
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci] Having more that one queue for gate pipeline at tripleo

2018-10-11 Thread Emilien Macchi
On Thu, Oct 11, 2018 at 10:01 AM Felix Enrique Llorente Pastora <
ellor...@redhat.com> wrote:

> Hello there,
>
>    After suffering a lot from Zuul's TripleO gate pipeline queue resetting
> after failures on patches, I have asked myself what would happen if we had
> more than one queue for gating TripleO.
>
>    After a quick read here https://zuul-ci.org/docs/zuul/user/gating.html,
> I found the following:
>
> "If changes with cross-project dependencies do not share a change queue
> then Zuul is unable to enqueue them together, and the first will be
> required to merge before the second is enqueued."
>
>    So it makes sense to share a Zuul queue, but maybe only one queue for
> all TripleO projects is too much; for example, sharing a queue between
> tripleo-ui and tripleo-quickstart. Maybe we need two queues, one for
> product stuff and one for CI, so product does not get reset if CI fails
> in a patch.
>
>    What do you think?
>

Probably a wrong example, as the TripleO UI gate is using CI jobs running
tripleo-quickstart scenarios.
We could create more queues for projects which are really independent from
each other, but we need to be very careful about it.
-- 
Emilien Macchi
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo][ci] Having more that one queue for gate pipeline at tripleo

2018-10-11 Thread Felix Enrique Llorente Pastora
Hello there,

   After suffering a lot from Zuul's TripleO gate pipeline queue resetting
after failures on patches, I have asked myself what would happen if we had
more than one queue for gating TripleO.

   After a quick read here https://zuul-ci.org/docs/zuul/user/gating.html,
I found the following:

"If changes with cross-project dependencies do not share a change queue
then Zuul is unable to enqueue them together, and the first will be
required to merge before the second is enqueued."

   So it makes sense to share a Zuul queue, but maybe only one queue for
all TripleO projects is too much; for example, sharing a queue between
tripleo-ui and tripleo-quickstart. Maybe we need two queues, one for
product stuff and one for CI, so product does not get reset if CI fails
in a patch.

   What do you think?

-- 
Quique Llorente

Openstack TripleO CI
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI is blocked

2018-08-16 Thread Wesley Hayutin
On Wed, Aug 15, 2018 at 10:13 PM Wesley Hayutin  wrote:

> On Wed, Aug 15, 2018 at 7:13 PM Alex Schultz  wrote:
>
>> Please do not approve or recheck anything until further notice. We've
>> got a few issues that have basically broken all the jobs.
>>
>> https://bugs.launchpad.net/tripleo/+bug/1786764
>
>
fix posted: https://review.openstack.org/#/c/592577/


>
>> https://bugs.launchpad.net/tripleo/+bug/1787226
>
>
Dupe of 1786764 


>
>> https://bugs.launchpad.net/tripleo/+bug/1787244
>
>
Fixed Released: https://review.openstack.org/592146


>
>> https://bugs.launchpad.net/tripleo/+bug/1787268
>
>
Proposed:
https://review.openstack.org/#/c/592233/
https://review.openstack.org/#/c/592275/



> https://bugs.launchpad.net/tripleo/+bug/1736950
>
> w
>

Will post a patch to skip the above tempest test.

Also, the patch to re-enable build-test-packages (the code that injects your
change into an RPM) is about to merge.
https://review.openstack.org/#/c/592218/

Thanks Steve, Alex, Jistr and others :)


>
>>
>> Thanks,
>> -Alex
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
> --
>
> Wes Hayutin
>
> Associate MANAGER
>
> Red Hat
>
> 
>
> whayu...@redhat.com  T: +1919 4232509  IRC: weshay
> 
>
> View my calendar and check my availability for meetings HERE
> 
>
-- 

Wes Hayutin

Associate MANAGER

Red Hat



whayu...@redhat.com  T: +1919 4232509  IRC: weshay


View my calendar and check my availability for meetings HERE

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI is blocked

2018-08-15 Thread Wesley Hayutin
On Wed, Aug 15, 2018 at 7:13 PM Alex Schultz  wrote:

> Please do not approve or recheck anything until further notice. We've
> got a few issues that have basically broken all the jobs.
>
> https://bugs.launchpad.net/tripleo/+bug/1786764
> https://bugs.launchpad.net/tripleo/+bug/1787226
> https://bugs.launchpad.net/tripleo/+bug/1787244
> https://bugs.launchpad.net/tripleo/+bug/1787268


https://bugs.launchpad.net/tripleo/+bug/1736950

w

>
>
> Thanks,
> -Alex
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-- 

Wes Hayutin

Associate MANAGER

Red Hat



whayu...@redhat.com  T: +1919 4232509  IRC: weshay


View my calendar and check my availability for meetings HERE

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI is blocked

2018-08-15 Thread Alex Schultz
Please do not approve or recheck anything until further notice. We've
got a few issues that have basically broken all the jobs.

https://bugs.launchpad.net/tripleo/+bug/1786764
https://bugs.launchpad.net/tripleo/+bug/1787226
https://bugs.launchpad.net/tripleo/+bug/1787244
https://bugs.launchpad.net/tripleo/+bug/1787268

Thanks,
-Alex

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][metrics] Stucked in the middle of work because of RDO CI

2018-08-01 Thread Ben Nemec



On 07/31/2018 04:51 PM, Wesley Hayutin wrote:



On Tue, Jul 31, 2018 at 7:41 AM Sagi Shnaidman wrote:


Hi, Martin

I see master OVB jobs are passing now [1], please recheck.

[1] http://cistatus.tripleo.org/


Things have improved and I see a lot of jobs passing; however, at the same
time I see too many jobs failing due to node_failures.  We are tracking
the data from [1].  Certainly the issue is NOT ideal for development, and
we need to remain focused on improving the situation.


I assume you're aware, but just to update the thread it looks like the 
OVB jobs are failing at a 50%+ rate again today (mostly unknown failures 
according to the tracking app).  Even with only two jobs that means your 
odds of getting them both to pass are pretty bad.




Thanks

[1] https://softwarefactory-project.io/zuul/api/tenant/rdoproject.org/builds



On Tue, Jul 31, 2018 at 12:24 PM, Martin Magr <mm...@redhat.com> wrote:

Greetings guys,

   it is pretty obvious that RDO CI jobs in TripleO projects are
broken [0]. Once the Zuul CI jobs pass, would it be possible to
have the AMQP/collectd patches ([1],[2],[3]) merged even though
the RDO CI jobs show negative results? Half of the patches for
this feature are merged, and the other half is stuck in this
situation where nobody reviews these patches because there is a
red -1. Those patches have passed the Zuul jobs several times
already and were manually tested too.

Thanks in advance for consideration of this situation,
Martin

[0]

https://trello.com/c/hkvfxAdX/667-cixtripleoci-rdo-software-factory-3rd-party-jobs-failing-due-to-instance-nodefailure
[1] https://review.openstack.org/#/c/578749
[2] https://review.openstack.org/#/c/576057/
[3] https://review.openstack.org/#/c/572312/

-- 
Martin Mágr

Senior Software Engineer
Red Hat Czech


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe:
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




-- 
Best regards

Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe:
openstack-dev-requ...@lists.openstack.org?subject:unsubscribe

http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

--

Wes Hayutin

Associate MANAGER

Red Hat



whayu...@redhat.com  T: +1919 4232509  IRC: weshay

View my calendar and check my availability for meetings HERE




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][metrics] FFE request for QDR integration in TripleO (Was: Stucked in the middle of work because of RDO CI)

2018-07-31 Thread Alex Schultz
On Tue, Jul 31, 2018 at 11:31 AM, Pradeep Kilambi  wrote:
> Hi Alex:
>
> Can you consider this our FFE for the QDR patches? It's mainly blocked on CI
> issues. Half the patches for QDR integration are already merged. The other 3
> referenced need to get merged once CI passes. Please consider this our
> formal request for an FFE for QDR integration in TripleO.
>

Ok, if it's just these patches and there is no further work, it should
be OK. I did point out (prior to the CI issues) that the patch[0] actually
broke the OVB jobs back in June. It seemed to be related to missing
containers or something to that effect.  So we'll need to be extra
careful when merging this to ensure it does not break anything.  If we
get clean jobs prior to rc1, we can merge it. If not, I'd say we
need to hold off.  I don't consider this a blocking feature.

Thanks,
-Alex

[0] https://review.openstack.org/#/c/578749/

> Cheers,
> ~ Prad
>
> On Tue, Jul 31, 2018 at 7:40 AM Sagi Shnaidman  wrote:
>>
>> Hi, Martin
>>
>> I see master OVB jobs are passing now [1], please recheck.
>>
>> [1] http://cistatus.tripleo.org/
>>
>> On Tue, Jul 31, 2018 at 12:24 PM, Martin Magr  wrote:
>>>
>>> Greetings guys,
>>>
>>>   it is pretty obvious that RDO CI jobs in TripleO projects are broken
>>> [0]. Once the Zuul CI jobs pass, would it be possible to have the
>>> AMQP/collectd patches ([1],[2],[3]) merged even though the RDO CI jobs
>>> show negative results? Half of the patches for this feature are merged,
>>> and the other half is stuck in this situation where nobody reviews these
>>> patches because there is a red -1. Those patches have passed the Zuul
>>> jobs several times already and were manually tested too.
>>>
>>> Thanks in advance for consideration of this situation,
>>> Martin
>>>
>>> [0]
>>> https://trello.com/c/hkvfxAdX/667-cixtripleoci-rdo-software-factory-3rd-party-jobs-failing-due-to-instance-nodefailure
>>> [1] https://review.openstack.org/#/c/578749
>>> [2] https://review.openstack.org/#/c/576057/
>>> [3] https://review.openstack.org/#/c/572312/
>>>
>>> --
>>> Martin Mágr
>>> Senior Software Engineer
>>> Red Hat Czech
>>>
>>>
>>> __
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe:
>>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>>
>>
>> --
>> Best regards
>> Sagi Shnaidman
>
>
>
> --
> Cheers,
> ~ Prad

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][metrics] Stucked in the middle of work because of RDO CI

2018-07-31 Thread Wesley Hayutin
On Tue, Jul 31, 2018 at 7:41 AM Sagi Shnaidman  wrote:

> Hi, Martin
>
> I see master OVB jobs are passing now [1], please recheck.
>
> [1] http://cistatus.tripleo.org/
>

Things have improved and I see a lot of jobs passing; however, at the same
time I see too many jobs failing due to node_failures.  We are tracking the
data from [1].  Certainly the issue is NOT ideal for development, and we
need to remain focused on improving the situation.

Thanks

[1] https://softwarefactory-project.io/zuul/api/tenant/rdoproject.org/builds



>
>
> On Tue, Jul 31, 2018 at 12:24 PM, Martin Magr  wrote:
>
>> Greetings guys,
>>
>>   it is pretty obvious that RDO CI jobs in TripleO projects are broken
>> [0]. Once the Zuul CI jobs pass, would it be possible to have the
>> AMQP/collectd patches ([1],[2],[3]) merged even though the RDO CI jobs
>> show negative results? Half of the patches for this feature are merged,
>> and the other half is stuck in this situation where nobody reviews these
>> patches because there is a red -1. Those patches have passed the Zuul
>> jobs several times already and were manually tested too.
>>
>> Thanks in advance for consideration of this situation,
>> Martin
>>
>> [0]
>> https://trello.com/c/hkvfxAdX/667-cixtripleoci-rdo-software-factory-3rd-party-jobs-failing-due-to-instance-nodefailure
>> [1] https://review.openstack.org/#/c/578749
>> [2] https://review.openstack.org/#/c/576057/
>> [3] https://review.openstack.org/#/c/572312/
>>
>> --
>> Martin Mágr
>> Senior Software Engineer
>> Red Hat Czech
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>
>
> --
> Best regards
> Sagi Shnaidman
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
-- 

Wes Hayutin

Associate MANAGER

Red Hat



whayu...@redhat.com  T: +1919 4232509  IRC: weshay


View my calendar and check my availability for meetings HERE

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][metrics] FFE request for QDR integration in TripleO (Was: Stucked in the middle of work because of RDO CI)

2018-07-31 Thread Pradeep Kilambi
Hi Alex:

Can you consider this our FFE for the QDR patches? It's mainly blocked on CI
issues. Half the patches for QDR integration are already merged. The other
3 referenced need to get merged once CI passes. Please consider this our
formal request for an FFE for QDR integration in TripleO.

Cheers,
~ Prad

On Tue, Jul 31, 2018 at 7:40 AM Sagi Shnaidman  wrote:

> Hi, Martin
>
> I see master OVB jobs are passing now [1], please recheck.
>
> [1] http://cistatus.tripleo.org/
>
> On Tue, Jul 31, 2018 at 12:24 PM, Martin Magr  wrote:
>
>> Greetings guys,
>>
>>   it is pretty obvious that RDO CI jobs in TripleO projects are broken
>> [0]. Once the Zuul CI jobs pass, would it be possible to have the
>> AMQP/collectd patches ([1],[2],[3]) merged even though the RDO CI jobs
>> show negative results? Half of the patches for this feature are merged,
>> and the other half is stuck in this situation where nobody reviews these
>> patches because there is a red -1. Those patches have passed the Zuul
>> jobs several times already and were manually tested too.
>>
>> Thanks in advance for consideration of this situation,
>> Martin
>>
>> [0]
>> https://trello.com/c/hkvfxAdX/667-cixtripleoci-rdo-software-factory-3rd-party-jobs-failing-due-to-instance-nodefailure
>> [1] https://review.openstack.org/#/c/578749
>> [2] https://review.openstack.org/#/c/576057/
>> [3] https://review.openstack.org/#/c/572312/
>>
>> --
>> Martin Mágr
>> Senior Software Engineer
>> Red Hat Czech
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe:
>> openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>
>
> --
> Best regards
> Sagi Shnaidman
>


-- 
Cheers,
~ Prad
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][metrics] Stucked in the middle of work because of RDO CI

2018-07-31 Thread Sagi Shnaidman
Hi, Martin

I see master OVB jobs are passing now [1], please recheck.

[1] http://cistatus.tripleo.org/

On Tue, Jul 31, 2018 at 12:24 PM, Martin Magr  wrote:

> Greetings guys,
>
>   it is pretty obvious that RDO CI jobs in TripleO projects are broken
> [0]. Once the Zuul CI jobs pass, would it be possible to have the
> AMQP/collectd patches ([1],[2],[3]) merged even though the RDO CI jobs
> show negative results? Half of the patches for this feature are merged,
> and the other half is stuck in this situation where nobody reviews these
> patches because there is a red -1. Those patches have passed the Zuul
> jobs several times already and were manually tested too.
>
> Thanks in advance for consideration of this situation,
> Martin
>
> [0] https://trello.com/c/hkvfxAdX/667-cixtripleoci-rdo-software-factory-3rd-party-jobs-failing-due-to-instance-nodefailure
> [1] https://review.openstack.org/#/c/578749
> [2] https://review.openstack.org/#/c/576057/
> [3] https://review.openstack.org/#/c/572312/
>
> --
> Martin Mágr
> Senior Software Engineer
> Red Hat Czech
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>


-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo][ci][metrics] Stucked in the middle of work because of RDO CI

2018-07-31 Thread Martin Magr
Greetings guys,

  it is pretty obvious that RDO CI jobs in TripleO projects are broken [0].
Once the Zuul CI jobs pass, would it be possible to have the AMQP/collectd
patches ([1],[2],[3]) merged even though the RDO CI jobs show negative
results? Half of the patches for this feature are merged, and the other half
is stuck in this situation where nobody reviews these patches because there
is a red -1. Those patches have passed the Zuul jobs several times already
and were manually tested too.

Thanks in advance for consideration of this situation,
Martin

[0]
https://trello.com/c/hkvfxAdX/667-cixtripleoci-rdo-software-factory-3rd-party-jobs-failing-due-to-instance-nodefailure
[1] https://review.openstack.org/#/c/578749
[2] https://review.openstack.org/#/c/576057/
[3] https://review.openstack.org/#/c/572312/

-- 
Martin Mágr
Senior Software Engineer
Red Hat Czech
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo][ci] PTG Stein topics

2018-07-11 Thread Wesley Hayutin
Greetings,

Starting to collect thoughts and comments here,
https://etherpad.openstack.org/p/tripleoci-ptg-stein

Thanks
-- 

Wes Hayutin

Associate MANAGER

Red Hat



whayu...@redhat.com  T: +1919 4232509  IRC: weshay


View my calendar and check my availability for meetings HERE

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI is down stop workflowing

2018-06-19 Thread Alex Schultz
On Tue, Jun 19, 2018 at 1:45 PM, Wesley Hayutin  wrote:
> Check and gate jobs look clear.
> More details in a bit.
>


So for a recap of the last 24 hours or so...

Mistral auth problems - https://bugs.launchpad.net/tripleo/+bug/1777541
 - caused by https://review.openstack.org/#/c/574878/
 - fixed by https://review.openstack.org/#/c/576336/

Undercloud install failure - https://bugs.launchpad.net/tripleo/+bug/1777616
- caused by https://review.openstack.org/#/c/570307/
- fixed by https://review.openstack.org/#/c/576428/

Keystone duplicate role - https://bugs.launchpad.net/tripleo/+bug/1777451
- caused by https://review.openstack.org/#/c/572243/
- fixed by https://review.openstack.org/#/c/576356 and
https://review.openstack.org/#/c/576393/

The puppet issues should be prevented in the future by adding TripleO
undercloud jobs back into the appropriate modules; see
https://review.openstack.org/#/q/topic:tripleo-ci+(status:open)
I recommended the undercloud jobs because they give us some basic
coverage, and the instack-undercloud job still uses puppet without
containers.  We'll likely want to replace these jobs with standalone
versions at some point as that configuration gets more mature.

We've restored any patches that were abandoned in the gate and it
should be ok to recheck.

Thanks,
-Alex

> Thanks
>
> Sent from my mobile
>
> On Tue, Jun 19, 2018, 07:33 Felix Enrique Llorente Pastora
>  wrote:
>>
>> Hi,
>>
>>We have the following bugs with fixes that need to land to unblock
>> check/gate jobs:
>>
>>https://bugs.launchpad.net/tripleo/+bug/1777451
>>https://bugs.launchpad.net/tripleo/+bug/1777616
>>
>>You can check them out at #tripleo ooolpbot.
>>
>> Please stop workflowing temporarily until they get merged.
>>
>> BR.
>>
>> --
>> Quique Llorente
>>
>> Openstack TripleO CI
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI is down stop workflowing

2018-06-19 Thread Wesley Hayutin
Check and gate jobs look clear.
More details in a bit.

Thanks

Sent from my mobile

On Tue, Jun 19, 2018, 07:33 Felix Enrique Llorente Pastora <
ellor...@redhat.com> wrote:

> Hi,
>
>We have the following bugs with fixes that need to land to unblock
> check/gate jobs:
>
>https://bugs.launchpad.net/tripleo/+bug/1777451
>https://bugs.launchpad.net/tripleo/+bug/1777616
>
>You can check them out at #tripleo ooolpbot.
>
>    Please stop workflowing temporarily until they get merged.
>
> BR.
>
> --
> Quique Llorente
>
> Openstack TripleO CI
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI is down stop workflowing

2018-06-19 Thread Felix Enrique Llorente Pastora
Hi,

   We have the following bugs with fixes that need to land to unblock
check/gate jobs:

   https://bugs.launchpad.net/tripleo/+bug/1777451
   https://bugs.launchpad.net/tripleo/+bug/1777616

   You can check them out at #tripleo ooolpbot.

   Please stop workflowing temporarily until they get merged.

BR.

-- 
Quique Llorente

Openstack TripleO CI
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI Team Sprint 13 Summary

2018-05-30 Thread Matt Young
Greetings,

The TripleO CI team has just completed Sprint 13 (5/3 - 5/23).  The
following is a summary of activities during our sprint.  Details on our
team structure can be found in the spec [1].


# Sprint 13 Epic (CI Squad): Upgrade Support and Refactoring

- Epic Card: https://trello.com/c/cuKevn28/728-sprint-13-upgrades-goals
- Tasks: http://ow.ly/L86Y30kg75L

This sprint was spent with the CI squad focused on Upgrades.

We wanted to be able to use existing, working, tested CI collateral (Ansible
playbooks and roles) used in CI today.  Throughout many of these are
references to "{{ release }}" (e.g. 'queens', 'pike').  In order not to
retrofit the bulk of these with "upgrade aware" conditionals and/or logic,
we needed a tool/module that could generate the inputs for the 'release'
variable (and other similar inputs).  This allows us to reuse our common
roles and playbooks by decoupling the specifics of {upgrades, updates, FFU}
* {pike, queens, rocky, ...}.  We've created this tool, and also put linting
and unit tests for it into place.  We also made a few of the jobs that had
been prototyped in previous sprints voting, then used them to validate the
changes that wire the new workflow/tool into said jobs.

We are optimistic that work done in sprint 13 will prove useful in future
sprints.  A table to describe some of the problem set and our thinking
around variables used in CI is at [2].  The tool and tests are at [3].
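To illustrate the idea (the real tool and its tests live at [3]; the sketch
below is hypothetical, including the RELEASES list and the function name):

    # Hypothetical sketch: given the branch under test and the job type,
    # emit the install/target release names that the shared playbooks
    # consume as the `release` variable.
    RELEASES = ["newton", "ocata", "pike", "queens", "rocky", "master"]

    def compute_releases(branch: str, job_type: str) -> dict:
        idx = RELEASES.index(branch)
        if job_type == "upgrade":      # N-1 -> N upgrade
            return {"install": RELEASES[idx - 1], "target": branch}
        if job_type == "ffu_upgrade":  # fast-forward upgrade, N-3 -> N
            return {"install": RELEASES[idx - 3], "target": branch}
        return {"install": branch, "target": branch}

    print(compute_releases("queens", "upgrade"))
    # -> {'install': 'pike', 'target': 'queens'}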


# Sprint 13 Epic (Tempest Squad):

- Epic Card:
https://trello.com/c/efqE5XMr/82-sprint-13-refactor-python-tempestconf
- Tasks: http://ow.ly/LH8Q30kgd1C

In Sprint 13 the tempest squad was focused on refactoring
python-tempestconf.  It is the primary tool used by tempest users to
generate tempest.conf automatically so that users can easily run tempest
tests. Currently in TripleO and Infrared CI, we pass numerous parameters
manually via CLI.  This is cumbersome and error prone.

The high level goals were to reduce the number of default CLI overrides
used today, and to prepare python-tempestconf enabling better integration
with refstack-client.  This entailed service discoverability work.  We
added support for keystone, glance, cinder, swift, and neutron.  Additional
service support is planned for future sprints.  We also improved existing
documentation for python-tempestconf.


# Ruck & Rover (Sprint 13)

Sagi Shnaidman (sshnaidm), Matt Young (myoung)
https://review.rdoproject.org/etherpad/p/ruckrover-sprint13

A few notable issues where substantial time was spent are below.  Note that
this is not an exhaustive list:

- When CentOS 7.5 was released, it caused a number of issues that
impacted gates.  These included deltas between package versions in BM vs.
container images, changes to CentOS that caused failures when modifying
images (e.g. IPA) in gates, and the like.
- We experienced issues with our promoter server, and with the tripleo-infra
tenant generally, around DNS and networking throughput, which impacted our
ability to process promotions.
- RHOS-13 jobs were created, and will eventually be used to gate changes to
TQ/TQE.
- Numerous patches/fixes to RDO Phase 2 jobs and CI infra. We had
accumulated technical debt.  While we have additional work to do,
particularly around some of the BM configs, we made good progress in
bringing various jobs back online.  We are still working on this in sprint
14 and moving forward.

Thanks,

The Tripleo CI team

[1]
https://specs.openstack.org/openstack/tripleo-specs/specs/policy/ci-team-structure.html
[2] https://wiki.openstack.org/wiki/Tripleo-upgrades-fs-variables
[3]
https://github.com/openstack-infra/tripleo-ci/blob/master/scripts/emit_releases_file
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-24 Thread Bogdan Dobrelya

On 5/23/18 6:49 PM, Sagi Shnaidman wrote:

Alex,

the problem is that you're working and focusing mostly on release-specific
code like featuresets and some scripts. But
tripleo-quickstart(-extras) and tripleo-ci are much *much* more than a set
of featuresets. Only 10% of the code may be related to releases and
branches, while the other 90% is completely independent and not related to
releases.


So for 90% of the code we WOULD need to backport every change; take for
example the latest patch to extras: https://review.openstack.org/#/c/570167/,
which fixes the reproducer. If oooq-extras were branched, we would need to
backport this fix to each and every branch, and the same for all the other
90% of the code, which is complete nonsense.
Just to avoid the "{% if release %}" construct, should we block the whole
work of the CI team and make the CI code absolutely unmaintainable?


Some of the release-related templates we recently moved from tripleo-ci to
the THT repo, like scenarios, OC templates, etc. If we discover other
things in oooq that could be moved to branched THT, I'd only be happy about
that.


Sometimes it can be hard to maintain one file in the extras templates with
different logic for releases, as we have in the tempest configuration for
example. The solution is to create a few release-related templates and use
the one that matches the current branch. It doesn't affect 90% of the code
and is still a "branch-like" approach. But I haven't seen other scripts that
are so release-dependent. If we get some, we could do the same. For now I
see the "{% if release %}" construct working very well.


I still haven't seen any advantage to branching the CI code, except
slightly nicer Jinja templates without "{% if release ", but the number of
disadvantages is so huge that it would literally block all current work in CI.


[tl;dr] branching allows us to not run cloned branched jobs against master
patches. Otherwise patches will wait longer in queues and fail more often
because of intermittent infra issues. See the explanation and some
calculations below.


So my main concern against additional stable-release cloned jobs
executed for master branches is that there is an "infra failure fee",
which is a failure unrelated to the patch under check or gate, like an
intermittent connectivity- or timeout-induced failure. This normally is
followed by a 'recheck' comment posted by an engineer, and sometimes is
noticed by the elastic recheck bot as well. Say that sort of failure
has a probability of N, and the real "product failure", which is related
to the subject patch and not infra, has a probability of P. So the chance
for a job to fail is

F = 1 - (1 - N) * (1 - P).

Now that we have added two more "branched clones" for the RDO CI OVB jobs
and two more Zuul jobs, the equation becomes

F = 1 - (1 - N)^4 * (1 - P).

(I assumed the chances of facing a product defect for the cloned branched
jobs remain unchanged.)


This might bring significantly increased chances of failure (see some
examples [0] for the N/P distribution cases). So folks will start
posting 'recheck' comments even more often, like 2x more often, which
would make the Zuul and RDO CI queues larger and leave patches sitting
there longer, ending up with more time waiting for jobs to start their
check/gate pipelines. That's what I call 'recheck storms'. And w/o
branched quickstart/extras we might have those storms amplified, though
that fully depends on the real N/P distributions.


[0] https://pastebin.com/ckG5G7NG
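For anyone who wants to reproduce the numbers, a small sketch of the
formulas above (the N/P values below are illustrative, not measured rates):

    # N: probability of an unrelated infra failure per job;
    # P: probability of a real product failure.
    # One job:          F = 1 - (1 - N) * (1 - P)
    # Four cloned jobs: F = 1 - (1 - N)^4 * (1 - P)
    def fail_prob(infra: float, product: float, jobs: int = 1) -> float:
        return 1.0 - (1.0 - infra) ** jobs * (1.0 - product)

    for n, p in ((0.05, 0.02), (0.10, 0.02)):
        print(f"N={n:.0%} P={p:.0%}: "
              f"1 job -> {fail_prob(n, p):.1%}, "
              f"4 jobs -> {fail_prob(n, p, jobs=4):.1%}")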



Thanks



On Wed, May 23, 2018 at 7:04 PM, Alex Schultz wrote:


On Wed, May 23, 2018 at 8:30 AM, Sagi Shnaidman wrote:
> Hi, Sergii
>
> thanks for the question. It's not the first time this topic has been raised,
> and at first view it could seem that branching would help with that sort of
> issue.
>
> Although it's not the case. Tripleo-quickstart(-extras) is part of the CI
> code, as is the tripleo-ci repo, which has never been branched. The reason
> for that is the relatively small impact of product branching on CI code.
> Think about backporting almost *every* patch to oooq and extras to all
> supported branches, down to newton at least. That would be a really *huge*
> price and unreasonable work. Just think about actively maintaining 3-4
> versions of CI code in each of 3 repositories. It would take all of the CI
> team's time, with almost zero value from this work.
>

So I'm not sure I completely agree with this assessment as there is a
price paid for every {%if release in [...]%} that we have to carry in
oooq{,-extras}.  These go away if we branch because we don't have to
worry about breaking previous releases or current release (which may
or may not actually have CI results).

> As for the patch you listed, we would have had to backport this change to
> *every* branch, and it wouldn't really have helped avoid the issue. The
> source of the problem here is not the branchless repo.

Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Sergii Golovatiuk
Hi,

On Wed, May 23, 2018 at 8:20 PM, Sagi Shnaidman  wrote:
>
>>
>> to reduce the impact of a change. From my original reply:
>>
>> > If there's a high maintenance cost, we haven't properly identified the
>> > optimal way to separate functionality between tripleo/quickstart.
>>
>> IMHO this is a side effect of having a whole bunch of roles in a
>> single repo.  oooq-extras has a mix of tripleo and non-tripleo related
>> content. The reproducer IMHO is related to provisioning and could fall
>> in the oooq repo and not oooq-extras.  This is a structure problem
>> with quickstart.  If it's not version specific, then don't put it in a
>> version specific repo. But that doesn't mean don't use version
>> specific repos at all.
>>
>> This is one of the reasons why we're opting not to use this pattern of
>> a bunch of roles in a single repo for tripleo itself[0][1][2].  We
>> learned with the puppet modules that carrying all this stuff in a
>> single repo has a huge maintenance cost and if you split them out you
>> can identify re-usability and establish proper patterns for moving
>> functionality into a shared place[3].  Yes there is a maintenance cost
>> of maintaining independent repos, but at the same time there's a
>> benefit of re-usability by other projects/groups when you expose
>> important pieces of functionality as a standalone. You can establish
>> clear ways to interact with each piece, test items, and release
>> independently.  For example the ansible-role-container-registry is not
>> tripleo specific and anyone looking to manage a standalone docker
>> registry can use it & contribute.
>>
>
> We have moved between having all roles in one repo and having a separate
> repo for each role a few times. Each case has its advantages and
> disadvantages. The last time, we moved to having roles in 2 repos - quickstart
> and extras - about a year ago, I think. So far IMHO it's the best approach.
> There will be a mechanism to install additional roles, like we have for
> tripleo-upgrade, ops-tools, etc.

But at the moment we don't have that mechanism so we should live
somehow until it's implemented.

> It may be a much broader topic to discuss, although I think having some
> roles branched and some not branched is much more of a headache.
> Tripleo-upgrade is a good example of it.
>
>>
>> > So for 90% of the code we DO need to backport every change; take for
>> > example the latest patch to extras: https://review.openstack.org/#/c/570167/,
>> > which fixes the reproducer. If oooq-extras were branched, we would need to
>> > backport this fix to each and every branch. And the same goes for the rest
>> > of that 90% of the code, which is complete nonsense.
>> > Just to avoid the "{% if release %}" construct, should we block the whole
>> > work of the CI team and make the CI code absolutely unmaintainable?
>> >
>>
>> And you're saying what we currently have is maintainable?  We keep
>> breaking ourselves, there's big gaps in coverage and it takes
>> time[4][5] to identify breakages. I don't consider that maintainable,
>> because this is a recurring topic and we clearly haven't fixed it
>> with the current setup.  It's time to re-evaluate what we have and see
>> if there's room for improvement.  I know I wasn't proposing to branch
>> all the repositories, but it might make sense to figure out if there's
>> a way to reduce our recurring issues with stable branches or
>> independent modules for some of the functions in CI.
>
>
>> Considering this is how we broke Queens, I'm not sure I agree.

We broke Queens, Pike, Newton by merging [1] without testing against
these releases.

>>
>
> First of all, I don't see any connection between maintenance and CI
> breakages; they are different topics. And yes, the CI we have now IS
> maintainable, and I have something to compare it with: I remember the
> tripleo.sh-based approach very well, and the almost-green dashboards
> recently support my statement. CI is not ideal now, but it's definitely
> much better than 1-2 years ago.
>
>
> Of course we have breakages; the CI is actually a history of breakages and
> fixes, like any other product. Wrt the queens issue, it took about a week to
> solve not because it was so hard, but because we were having very difficult
> weeks trying to fix all the CentOS 7.5 issues, and the queens branch was
> second priority. And by the way, we fixed everything much faster than we
> did with CentOS 7.4.  The negative attitude that every CI breakage is proof
> of a wrong CI structure is not correct and doesn't help. Even if branching
> had helped in this case, it would have created much bigger problems in all
> other cases.

I would like to set feelings aside and discuss the technical side of the
two solutions, and the cost for every team and the product in general, to
find the solution that fits all.

>
> Anyway, we saw that having branch jobs in OVB only didn't catch the queens
> issue (why - you know better), so we added multinode branch-specific ones,
> which will catch such issues in the future.

Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Sagi Shnaidman
> to reduce the impact of a change. From my original reply:
>
> > If there's a high maintenance cost, we haven't properly identified the
> optimal way to separate functionality between tripleo/quickstart.
>
> IMHO this is a side effect of having a whole bunch of roles in a
> single repo.  oooq-extras has a mix of tripleo and non-tripleo related
> content. The reproducer IMHO is related to provisioning and could fall
> in the oooq repo and not oooq-extras.  This is a structure problem
> with quickstart.  If it's not version specific, then don't put it in a
> version specific repo. But that doesn't mean don't use version
> specific repos at all.
>
> This is one of the reasons why we're opting not to use this pattern of
> a bunch of roles in a single repo for tripleo itself[0][1][2].  We
> learned with the puppet modules that carrying all this stuff in a
> single repo has a huge maintenance cost and if you split them out you
> can identify re-usability and establish proper patterns for moving
> functionality into a shared place[3].  Yes there is a maintenance cost
> of maintaining independent repos, but at the same time there's a
> benefit of re-usability by other projects/groups when you expose
> important pieces of functionality as a standalone. You can establish
> clear ways to interact with each piece, test items, and release
> independently.  For example the ansible-role-container-registry is not
> tripleo specific and anyone looking to manage a standalone docker
> registry can use it & contribute.
>
>
We have moved between having all roles in one repo and having a separate
repo for each role a few times. Each case has its advantages and
disadvantages. The last time, we moved to having roles in 2 repos - quickstart
and extras - about a year ago, I think. So far IMHO it's the best approach.
There will be a mechanism to install additional roles, like we have for
tripleo-upgrade, ops-tools, etc.
It may be a much broader topic to discuss, although I think having some
roles branched and some not branched is much more of a headache.
Tripleo-upgrade is a good example of it.


> > So for 90% of the code we DO need to backport every change; take for
> > example the latest patch to extras: https://review.openstack.org/#/c/570167/,
> > which fixes the reproducer. If oooq-extras were branched, we would need to
> > backport this fix to each and every branch. And the same goes for the rest
> > of that 90% of the code, which is complete nonsense.
> > Just to avoid the "{% if release %}" construct, should we block the whole
> > work of the CI team and make the CI code absolutely unmaintainable?
> >
>
> And you're saying what we currently have is maintainable?  We keep
> breaking ourselves, there's big gaps in coverage and it takes
> time[4][5] to identify breakages. I don't consider that maintainable,
> because this is a recurring topic and we clearly haven't fixed it
> with the current setup.  It's time to re-evaluate what we have and see
> if there's room for improvement.  I know I wasn't proposing to branch
> all the repositories, but it might make sense to figure out if there's
> a way to reduce our recurring issues with stable branches or
> independent modules for some of the functions in CI.
>

Considering this is how we broke Queens, I'm not sure I agree.
>
>
First of all, I don't see any connection between maintenance and CI
breakages; they are different topics. And yes, the CI we have now IS
maintainable, and I have something to compare it with: I remember the
tripleo.sh-based approach very well, and the almost-green dashboards
recently support my statement. CI is not ideal now, but it's definitely
much better than 1-2 years ago.

Of course we have breakages; the CI is actually a history of breakages and
fixes, like any other product. Wrt the queens issue, it took about a week
to solve not because it was so hard, but because we were having very
difficult weeks trying to fix all the CentOS 7.5 issues, and the queens
branch was second priority. And by the way, we fixed everything much faster
than we did with CentOS 7.4.  The negative attitude that every CI breakage
is proof of a wrong CI structure is not correct and doesn't help. Even if
branching had helped in this case, it would have created much bigger
problems in all other cases.

Anyway, we saw that having branch jobs in OVB only didn't catch the queens
issue (why - you know better), so we added multinode branch-specific ones,
which will catch such issues in the future. We hit the problem, solved it,
put preventive measures in place, and are ready to catch it next time. This
is a normal CI workflow and I don't see any problem with it. Having multinode
branch jobs is actually pretty similar to "branching" repos, but without the
maintenance nightmare.

Thanks

Thanks,
> -Alex
>
> [0] http://git.openstack.org/cgit/openstack/ansible-role-
> container-registry/
> [1] http://git.openstack.org/cgit/openstack/ansible-role-redhat-
> subscription/
> [2] http://git.openstack.org/cgit/openstack/ansible-role-tripleo-keystone/
> [3] 

Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Alex Schultz
On Wed, May 23, 2018 at 10:49 AM, Sagi Shnaidman  wrote:
> Alex,
>
> the problem is that you're working and focusing mostly on release-specific
> code like featuresets and some scripts. But tripleo-quickstart(-extras) and
> tripleo-ci are much *much* more than a set of featuresets. Only 10% of the
> code may be related to releases and branches, while the other 90% is
> completely independent and not related to releases.
>

It is not necessarily about release specific code, it's about being
able to reduce the impact of a change. From my original reply:

> If there's a high maintenance cost, we haven't properly identified the 
> optimal way to separate functionality between tripleo/quickstart.

IMHO this is a side effect of having a whole bunch of roles in a
single repo.  oooq-extras has a mix of tripleo and non-tripleo related
content. The reproducer IMHO is related to provisioning and could fall
in the oooq repo and not oooq-extras.  This is a structure problem
with quickstart.  If it's not version specific, then don't put it in a
version specific repo. But that doesn't mean don't use version
specific repos at all.

This is one of the reasons why we're opting not to use this pattern of
a bunch of roles in a single repo for tripleo itself[0][1][2].  We
learned with the puppet modules that carrying all this stuff in a
single repo has a huge maintenance cost and if you split them out you
can identify re-usability and establish proper patterns for moving
functionality into a shared place[3].  Yes there is a maintenance cost
of maintaining independent repos, but at the same time there's a
benefit of re-usability by other projects/groups when you expose
important pieces of functionality as a standalone. You can establish
clear ways to interact with each piece, test items, and release
independently.  For example the ansible-role-container-registry is not
tripleo specific and anyone looking to manage a standalone docker
registry can use it & contribute.

> So for 90% of the code we DO need to backport every change; take for
> example the latest patch to extras: https://review.openstack.org/#/c/570167/,
> which fixes the reproducer. If oooq-extras were branched, we would need to
> backport this fix to each and every branch. And the same goes for the rest
> of that 90% of the code, which is complete nonsense.
> Just to avoid the "{% if release %}" construct, should we block the whole
> work of the CI team and make the CI code absolutely unmaintainable?
>

And you're saying what we currently have is maintainable?  We keep
breaking ourselves, there's big gaps in coverage and it takes
time[4][5] to identify breakages. I don't consider that maintainable,
because this is a recurring topic and we clearly haven't fixed it
with the current setup.  It's time to re-evaluate what we have and see
if there's room for improvement.  I know I wasn't proposing to branch
all the repositories, but it might make sense to figure out if there's
a way to reduce our recurring issues with stable branches or
independent modules for some of the functions in CI.

> We recently moved some of the release-related templates from tripleo-ci to
> the THT repo, like scenarios, OC templates, etc. If we discover other things
> in oooq that could be moved to branched THT, I'd be only happy to do that.
>
> Sometimes it can be hard to maintain one file in the extras templates with
> different logic per release, as we have in the tempest configuration for
> example. The solution is to create a few release-specific templates and use
> the one that matches the current branch. It doesn't affect the other 90% of
> the code and is still a "branch-like" approach. But I haven't seen other
> scripts that are so release dependent; if we find some, we can do the same.
> For now I see the "{% if release %}" construct working very well.

Considering this is how we broke Queens, I'm not sure I agree.

>
> I still don't see any advantage to branching the CI code, except for
> slightly nicer jinja templates without "{% if release %}", while the
> disadvantages are so numerous that branching would literally block all
> current work in CI.
>

It's about reducing our risk with test coverage. We do not properly
test all jobs and all configurations when we make these changes. This
is a repeated problem and when we have to add version specific logic,
unless we're able to identify what this is actually impacting and
verify with jobs we have a risk of breaking ourselves.  We've seen
that code review is not sufficient for these changes as we merge
things and only find out after they've been merged that we broke
stable branches. Then it takes folks tracking down changes to decipher
what we broke. For example the original patch[4] broke Queens for
about a week.  That's 7 days of nothing being able to be merged,
that's not OK.

Thanks,
-Alex

[0] http://git.openstack.org/cgit/openstack/ansible-role-container-registry/
[1] http://git.openstack.org/cgit/openstack/ansible-role-redhat-subscription/
[2] http://git.openstack.org/cgit/openstack/ansible-role-tripleo-keystone/

Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Sagi Shnaidman
Alex,

the problem is that you're working and focusing mostly on release-specific
code like featuresets and some scripts. But tripleo-quickstart(-extras) and
tripleo-ci are much *much* more than a set of featuresets. Only 10% of the
code may be related to releases and branches, while the other 90% is
completely independent and not related to releases.

So for 90% of the code we DO need to backport every change; take for example
the latest patch to extras: https://review.openstack.org/#/c/570167/, which
fixes the reproducer. If oooq-extras were branched, we would need to backport
this fix to each and every branch. And the same goes for the rest of that 90%
of the code, which is complete nonsense.
Just to avoid the "{% if release %}" construct, should we block the whole
work of the CI team and make the CI code absolutely unmaintainable?

We recently moved some of the release-related templates from tripleo-ci to
the THT repo, like scenarios, OC templates, etc. If we discover other things
in oooq that could be moved to branched THT, I'd be only happy to do that.

Sometimes it can be hard to maintain one file in the extras templates with
different logic per release, as we have in the tempest configuration for
example. The solution is to create a few release-specific templates and use
the one that matches the current branch. It doesn't affect the other 90% of
the code and is still a "branch-like" approach. But I haven't seen other
scripts that are so release dependent; if we find some, we can do the same.
For now I see the "{% if release %}" construct working very well.

I still don't see any advantage to branching the CI code, except for slightly
nicer jinja templates without "{% if release %}", while the disadvantages are
so numerous that branching would literally block all current work in CI.

Thanks



On Wed, May 23, 2018 at 7:04 PM, Alex Schultz  wrote:

> On Wed, May 23, 2018 at 8:30 AM, Sagi Shnaidman 
> wrote:
> > Hi, Sergii
> >
> > thanks for the question. It's not the first time this topic has been
> > raised, and at first glance it could seem that branching would help with
> > that sort of issue.
> >
> > Although it's not the case. Tripleo-quickstart(-extras) is part of the CI
> > code, as is the tripleo-ci repo, and they have never been branched. The
> > reason for that is the relatively small impact of product branching on CI
> > code. Think about backporting almost *every* patch to oooq and extras to
> > all supported branches, down to newton at least. That would be a really
> > *huge* price and unreasonable work. Just think about actively maintaining
> > 3-4 versions of CI code in each of 3 repositories. It would take all of
> > the CI team's time, with almost zero value from this work.
> >
>
> So I'm not sure I completely agree with this assessment as there is a
> price paid for every {%if release in [...]%} that we have to carry in
> oooq{,-extras}.  These go away if we branch because we don't have to
> worry about breaking previous releases or current release (which may
> or may not actually have CI results).
>
> > As for the patch you listed, we would have had to backport this change
> > to *every* branch, and it wouldn't really have helped avoid the issue.
> > The source of the problem here is not the branchless repo.
> >
>
> No we shouldn't be backporting every change.  The logic in oooq-extras
> should be version specific and if we're changing an interface in
> tripleo in a breaking fashion we're doing it wrong in tripleo. If
> we're backporting things to work around tripleo issues, we're doing it
> wrong in quickstart.
>
> > Regarding catching such issues, and Bogdan's point: that's right, we
> > added a few jobs to catch such issues in the future and prevent
> > breakages, and a few running jobs is a reasonable price to keep the
> > configuration working in all branches. Compared to the maintenance
> > nightmare of branched CI code, it's really a *zero* price.
> >
>
> Nothing is free. If there's a high maintenance cost, we haven't
> properly identified the optimal way to separate functionality between
> tripleo/quickstart.  I have repeatedly said that the provisioning
> parts of quickstart should be separate because those aren't tied to a
> tripleo version and this along with the scenario configs should be the
> only unbranched repo we have. Any roles related to how to
> configure/work with tripleo should be branched and tied to a stable
> branch of tripleo. This would actually be beneficial for tripleo as
> well because then we can see when we are introducing backwards
> incompatible changes.
>
> Thanks,
> -Alex
>
> > Thanks
> >
> >
> > On Wed, May 23, 2018 at 3:43 PM, Sergii Golovatiuk 
> > wrote:
> >>
> >> Hi,
> >>
> >> Looking at [1], I am thinking about the price we paid for not
> >> branching tripleo-quickstart. Can we discuss the options to prevent
> >> the issues such as [1]? Thank you in advance.
> >>
> >> [1] https://review.openstack.org/#/c/569830/4
> >>
> >> --
> >> Best Regards,
> >> Sergii Golovatiuk
> >>
> >> 

Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Alex Schultz
On Wed, May 23, 2018 at 8:30 AM, Sagi Shnaidman  wrote:
> Hi, Sergii
>
> thanks for the question. It's not the first time this topic has been raised,
> and at first glance it could seem that branching would help with that sort
> of issue.
>
> Although it's not the case. Tripleo-quickstart(-extras) is part of the CI
> code, as is the tripleo-ci repo, and they have never been branched. The
> reason for that is the relatively small impact of product branching on CI
> code. Think about backporting almost *every* patch to oooq and extras to
> all supported branches, down to newton at least. That would be a really
> *huge* price and unreasonable work. Just think about actively maintaining
> 3-4 versions of CI code in each of 3 repositories. It would take all of the
> CI team's time, with almost zero value from this work.
>

So I'm not sure I completely agree with this assessment as there is a
price paid for every {%if release in [...]%} that we have to carry in
oooq{,-extras}.  These go away if we branch because we don't have to
worry about breaking previous releases or current release (which may
or may not actually have CI results).

> As for the patch you listed, we would have had to backport this change to
> *every* branch, and it wouldn't really have helped avoid the issue. The
> source of the problem here is not the branchless repo.
>

No we shouldn't be backporting every change.  The logic in oooq-extras
should be version specific and if we're changing an interface in
tripleo in a breaking fashion we're doing it wrong in tripleo. If
we're backporting things to work around tripleo issues, we're doing it
wrong in quickstart.

> Regarding catching such issues, and Bogdan's point: that's right, we added a
> few jobs to catch such issues in the future and prevent breakages, and a few
> running jobs is a reasonable price to keep the configuration working in all
> branches. Compared to the maintenance nightmare of branched CI code, it's
> really a *zero* price.
>

Nothing is free. If there's a high maintenance cost, we haven't
properly identified the optimal way to separate functionality between
tripleo/quickstart.  I have repeatedly said that the provisioning
parts of quickstart should be separate because those aren't tied to a
tripleo version and this along with the scenario configs should be the
only unbranched repo we have. Any roles related to how to
configure/work with tripleo should be branched and tied to a stable
branch of tripleo. This would actually be beneficial for tripleo as
well because then we can see when we are introducing backwards
incompatible changes.

Thanks,
-Alex

> Thanks
>
>
> On Wed, May 23, 2018 at 3:43 PM, Sergii Golovatiuk 
> wrote:
>>
>> Hi,
>>
>> Looking at [1], I am thinking about the price we paid for not
>> branching tripleo-quickstart. Can we discuss the options to prevent
>> the issues such as [1]? Thank you in advance.
>>
>> [1] https://review.openstack.org/#/c/569830/4
>>
>> --
>> Best Regards,
>> Sergii Golovatiuk
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
> --
> Best regards
> Sagi Shnaidman
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Sagi Shnaidman
Hi, Sergii

thanks for the question. It's not the first time this topic has been raised,
and at first glance it could seem that branching would help with that sort of
issue.

Although it's not the case. Tripleo-quickstart(-extras) is part of the CI code,
as is the tripleo-ci repo, and they have never been branched. The reason for
that is the relatively small impact of product branching on CI code. Think
about backporting almost *every* patch to oooq and extras to all supported
branches, down to newton at least. That would be a really *huge* price and
unreasonable work. Just think about actively maintaining 3-4 versions of
CI code in each of 3 repositories. It would take all of the CI team's time,
with almost zero value from this work.

As for the patch you listed, we would have had to backport this change to
*every* branch, and it wouldn't really have helped avoid the issue. The source
of the problem here is not the branchless repo.

Regarding catching such issues, and Bogdan's point: that's right, we added a
few jobs to catch such issues in the future and prevent breakages, and a
few running jobs is a reasonable price to keep the configuration working in
all branches. Compared to the maintenance nightmare of branched CI code, it's
really a *zero* price.

Thanks


On Wed, May 23, 2018 at 3:43 PM, Sergii Golovatiuk 
wrote:

> Hi,
>
> Looking at [1], I am thinking about the price we paid for not
> branching tripleo-quickstart. Can we discuss the options to prevent
> the issues such as [1]? Thank you in advance.
>
> [1] https://review.openstack.org/#/c/569830/4
>
> --
> Best Regards,
> Sergii Golovatiuk
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Bogdan Dobrelya

On 5/23/18 2:43 PM, Sergii Golovatiuk wrote:

Hi,

Looking at [1], I am thinking about the price we paid for not
branching tripleo-quickstart. Can we discuss the options to prevent
the issues such as [1]? Thank you in advance.

[1] https://review.openstack.org/#/c/569830/4



That was only half of the full price, actually; see also the additional
multinode containers check/gate jobs [0],[1], from now on executed
against the master branches of all tripleo repos (IIUC), for releases -2
and -1 from master.

[0] https://review.openstack.org/#/c/569932/
[1] https://review.openstack.org/#/c/569854/


--
Best regards,
Bogdan Dobrelya,
Irc #bogdando

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo][ci][infra] Quickstart Branching

2018-05-23 Thread Sergii Golovatiuk
Hi,

Looking at [1], I am thinking about the price we paid for not
branching tripleo-quickstart. Can we discuss the options to prevent
the issues such as [1]? Thank you in advance.

[1] https://review.openstack.org/#/c/569830/4

-- 
Best Regards,
Sergii Golovatiuk

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI Squads’ Sprint 12 Summary: libvirt-reproducer, python-tempestconf

2018-05-09 Thread Bogdan Dobrelya

On 5/9/18 4:24 AM, Matt Young wrote:

Greetings,

The TripleO squads for CI and Tempest have just completed Sprint 12.  
The following is a summary of activities during this sprint.   Details 
on our team structure can be found in the spec [1].


---

# Sprint 12 Epic (CI): Libvirt Reproducer

* Epic Card: https://trello.com/c/JEGLSVh6/51-reproduce-ci-jobs-with-libvirt
* Tasks: http://ow.ly/O1vZ30jTSc3

"Allow developers to reproduce a multinode CI job on a bare metal host 
using libvirt"
"Enable the same workflows used in upstream CI / reproducer using 
libvirt instead of OVB as the provisioning mechanism"


The CI Squad prototyped, designed, and implemented new functionality for 
our CI reproducer.   “Reproducers” are scripts generated by each CI job 
that allow the job/test to be recreated.  These are useful to both CI 
team members when investigating failures, as well as developers creating 
failures with the intent to iteratively debug and/or fix issues.  Prior 
to this sprint, the reproducer scripts supported reproduction of 
upstream CI jobs using OVB, typically on RDO Cloud.  This sprint we 
extended this capability to support reproduction of jobs in libvirt.


This work was done for a few reasons:

* (short term) enable the team to work on upgrades and other CI team 
tasks more efficiently by mitigating recurring RDO Cloud infrastructure 
issues.  This was the primary motivator for doing this work at this time.
* (mid-longer term) enhance / enable iterative workflows such as THT 
development, debugging deployment scenarios, etc.  Snapshots in 
particular have proven quite useful.  As we look towards a future with a 
viable single-node deployment capability, libvirt has clear benefits for 
common developer scenarios.


Thank you for that, a really cool feature for tripleo development!



It is expected that further iteration and refinement of this initial 
implementation will be required before the tripleo-ci team is able to 
support this broadly.  What we’ve done works as designed.  While we 
welcome folks to explore, please note that we are not announcing a 
supported libvirt reproducer meant for use outside the tripleo-ci team 
at this time.  We expect some degree of change, and have a number of 
RFE’s resulting from our testing as well as documentation patches that 
we’re iterating on.


That said, we think it’s really cool, works well in its current form, 
and are optimistic about its future.


## We did the following (CI):

* Add support to the reproducer script [2,3] generated by CI to enable 
libvirt.

* Basic snapshot create/restore [4] capability (a rough sketch of the 
underlying libvirt calls follows this list).
* Tested Scenarios: featureset 3 (UC idem), 10 (multinode containers), 
37 (min OC + minor update).  See sprint cards for details.
* 14-18 RFE’s identified as part of testing for future work 
http://ow.ly/J2u830jTSLG
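As background for the snapshot item above, here is a minimal libvirt-python
sketch of a create/restore cycle; the domain and snapshot names are invented,
and this is only an illustration, not the reproducer's actual code:

    # Illustration only: 'undercloud' and 'clean-deploy' are made-up names.
    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('undercloud')

    # Save the current VM state under a named snapshot.
    dom.snapshotCreateXML(
        "<domainsnapshot><name>clean-deploy</name></domainsnapshot>", 0)

    # ... iterate on a deployment, break something ...

    # Roll back to the saved state and try again.
    snap = dom.snapshotLookupByName('clean-deploy')
    dom.revertToSnapshot(snap, 0)

This is what makes iterative THT debugging cheap: a failed attempt costs a
revert rather than a full redeploy.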


---

# Sprint 12 Epic (Tempest):

* Epic Card: https://trello.com/c/ifIYQsxs/75-sprint-12-undercloud-tempest
* Tasks: http://ow.ly/GGvc30jTSfV

“Run tempest on undercloud by using containerized and packaged tempest”
“Complete work items carried from sprint 11 or another side work going on.”

## We did the following (Tempest):

* Create tripleo-ci jobs that run containerized tempest on all stable 
branches.
* Create documentation for configuring and running tempest using 
containerized tempest on UC @tripleo.org, and blog 
posts. [5,6,7]

* Run certification tests via new Jenkins job using ansible role [8]
* Refactor validate-tempest CI role for UC and containers

---

# Ruck and Rover

Each sprint two of the team members assume the roles of Ruck and Rover 
(each for half of the sprint).


* Ruck is responsible for monitoring the CI, checking for failures, 
opening bugs, and participating in meetings; this is your focal point for 
any CI issues.
* Rover is responsible for working on these bugs and fixing problems while 
the rest of the team stays focused on the sprint. For more information about 
our structure, check [1]


## Ruck & Rover (Sprint 12), Etherpad [9,10]:

* Quique Llorente(quiquell)
* Gabriele Cerami (panda)

A few notable issues where substantial time was spent were:

1767099 
periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset030-master vxlan 
tunnel fails randomly

1758899 reproducer-quickstart.sh building wrong gating package.
1767343 gate tripleo-ci-centos-7-containers-multinode fails to update 
packages in cron container
1762351 
periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-queens-upload 
is timeout Depends on https://bugzilla.redhat.com/show_bug.cgi?id=1565179

1766873 quickstart on ovb doesn't yield a deployment
1767049 Error during test discovery : 'must specify exactly one of host 
or intercept' Depends on https://bugzilla.redhat.com/show_bug.cgi?id=1434385
1767076 Creating pingtest_sack fails: Failed to schedule instances: 
NoValidHost_Remote: No valid host was found

1763634 devmode.sh --ovb fails to deploy overcloud
1765680 Incorrect branch used for not gated tripleo-upgrade repo


[openstack-dev] [tripleo] CI Squads’ Sprint 12 Summary: libvirt-reproducer, python-tempestconf

2018-05-08 Thread Matt Young
Greetings,

The TripleO squads for CI and Tempest have just completed Sprint 12.  The
following is a summary of activities during this sprint.   Details on our
team structure can be found in the spec [1].

---

# Sprint 12 Epic (CI): Libvirt Reproducer

* Epic Card: https://trello.com/c/JEGLSVh6/51-reproduce-ci-jobs-with-libvirt
* Tasks: http://ow.ly/O1vZ30jTSc3

"Allow developers to reproduce a multinode CI job on a bare metal host
using libvirt"
"Enable the same workflows used in upstream CI / reproducer using libvirt
instead of OVB as the provisioning mechanism"

The CI Squad prototyped, designed, and implemented new functionality for
our CI reproducer.   “Reproducers” are scripts generated by each CI job
that allow the job/test to be recreated.  These are useful to both CI team
members when investigating failures, as well as developers creating
failures with the intent to iteratively debug and/or fix issues.  Prior to
this sprint, the reproducer scripts supported reproduction of upstream CI
jobs using OVB, typically on RDO Cloud.  This sprint we extended this
capability to support reproduction of jobs in libvirt.

This work was done for a few reasons:

* (short term) enable the team to work on upgrades and other CI team tasks
more efficiently by mitigating recurring RDO Cloud infrastructure issues.
This was the primary motivator for doing this work at this time.
* (mid-longer term) enhance / enable iterative workflows such as THT
development, debugging deployment scenarios, etc.  Snapshots in particular
have proven quite useful.  As we look towards a future with a viable
single-node deployment capability, libvirt has clear benefits for common
developer scenarios.

It is expected that further iteration and refinement of this initial
implementation will be required before the tripleo-ci team is able to
support this broadly.  What we’ve done works as designed.  While we welcome
folks to explore, please note that we are not announcing a supported
libvirt reproducer meant for use outside the tripleo-ci team at this time.
We expect some degree of change, and have a number of RFE’s resulting from
our testing as well as documentation patches that we’re iterating on.

That said, we think it’s really cool, works well in its current form, and
are optimistic about its future.

## We did the following (CI):

* Add support to the reproducer script [2,3] generated by CI to enable
libvirt.
* Basic snapshot create/restore [4] capability.
* Tested Scenarios: featureset 3 (UC idem), 10 (multinode containers), 37
(min OC + minor update).  See sprint cards for details.
* 14-18 RFE’s identified as part of testing for future work
http://ow.ly/J2u830jTSLG

---

# Sprint 12 Epic (Tempest):

* Epic Card: https://trello.com/c/ifIYQsxs/75-sprint-12-undercloud-tempest
* Tasks: http://ow.ly/GGvc30jTSfV

“Run tempest on undercloud by using containerized and packaged tempest”
“Complete work items carried from sprint 11 or another side work going on.”

## We did the following (Tempest):

* Create tripleo-ci jobs that run containerized tempest on all stable
branches.
* Create documentation for configuring and running tempest using
containerized tempest on UC @tripleo.org, and blog posts. [5,6,7]
* Run certification tests via new Jenkins job using ansible role [8]
* Refactor validate-tempest CI role for UC and containers

---

# Ruck and Rover

Each sprint two of the team members assume the roles of Ruck and Rover
(each for half of the sprint).

* Ruck is responsible for monitoring the CI, checking for failures, opening
bugs, and participating in meetings; this is your focal point for any CI
issues.
* Rover is responsible for working on these bugs and fixing problems while
the rest of the team stays focused on the sprint. For more information about
our structure, check [1]

## Ruck & Rover (Sprint 12), Etherpad [9,10]:

* Quique Llorente(quiquell)
* Gabriele Cerami (panda)

A few notable issues where substantial time was spent were:

1767099 periodic-tripleo-ci-centos-7-multinode-1ctlr-featureset030-master
vxlan tunnel fails randomly
1758899 reproducer-quickstart.sh building wrong gating package.
1767343 gate tripleo-ci-centos-7-containers-multinode fails to update
packages in cron container
1762351
periodic-tripleo-ci-centos-7-ovb-1ctlr_1comp-featureset002-queens-upload is
timeout Depends on https://bugzilla.redhat.com/show_bug.cgi?id=1565179
1766873 quickstart on ovb doesn't yield a deployment
1767049 Error during test discovery : 'must specify exactly one of host or
intercept' Depends on https://bugzilla.redhat.com/show_bug.cgi?id=1434385
1767076 Creating pingtest_sack fails: Failed to schedule instances:
NoValidHost_Remote: No valid host was found
1763634 devmode.sh --ovb fails to deploy overcloud
1765680 Incorrect branch used for not gated tripleo-upgrade repo

If you have any questions and/or suggestions, please contact us in #oooq or
#tripleo

Thanks,

Matt


tq: https://github.com/openstack/tripleo-quickstart
tqe: 

[openstack-dev] [tripleo] CI & Tempest squad planning summary: Sprint 13

2018-05-07 Thread Matt Young
Greetings,

The TripleO CI & Tempest squads have begun work on Sprint 13.  Like most of
our sprints these are three weeks long and are planned on a Thursday or
Friday (depending on squad) and have a retrospective on Wednesday.  Sprint
13 runs from 2018-05-03 thru 2018-05-23.

More information regarding our process is available in the tripleo-specs
repository [1]. Ongoing meeting notes and other detail are always available
in the Squad Etherpads [2,3].

This sprint the CI squad is working on the Upgrades epic, and the Tempest
squad is refactoring python-tempestconf, in part to enable the upstream
refstack group.


## Ruck / Rover:

* Matt Young (myoung) and Sagi Shnaidman (sshnaidm)
* https://review.rdoproject.org/etherpad/p/ruckrover-sprint13


## CI Squad

* Put in place voting update jobs (
https://review.openstack.org/#/q/topic:gate_update)
* Add additional check/gate jobs to gate changes made this sprint.
* Refine the design for how we model releases in CI, taking into account
feedback from a collaborative design session with the Upgrades team
(etherpad http://ow.ly/da5L30jSeo8).

Epic: https://trello.com/c/cuKevn28/728-sprint-13-upgrades-goals
Tasks: http://ow.ly/yeIf30jScyj


## Tempest Squad

* Refactor python-tempestconf tempest config by dynamically discovering
resources (see the discovery sketch after the scope lists below)

In Scope: Keystone, Nova, Glance, Neutron, Cinder, Swift.

The following are specifically NOT in scope for Sprint 13. They are
tentatively planned for future sprints: Heat, Ironic, ec2api, Zaqar,
Mistral, Manila, Octavia, Horizon, Ceilometer.
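As a rough illustration of what "dynamically discovering resources" can mean,
the sketch below asks the Keystone service catalog which services a cloud
actually offers, using keystoneauth1. Credentials are assumed to be in the
usual OS_* environment variables, and this is not python-tempestconf's actual
code:

    # Sketch only: assumes OS_* credentials are set in the environment.
    import os
    from keystoneauth1 import session
    from keystoneauth1.identity import v3

    auth = v3.Password(auth_url=os.environ['OS_AUTH_URL'],
                       username=os.environ['OS_USERNAME'],
                       password=os.environ['OS_PASSWORD'],
                       project_name=os.environ['OS_PROJECT_NAME'],
                       user_domain_name='Default',
                       project_domain_name='Default')
    sess = session.Session(auth=auth)

    # The service catalog lists the service types the cloud exposes, which
    # tells a config generator what it can and cannot enable.
    catalog = sess.auth.get_access(sess).service_catalog
    for service_type in sorted(catalog.get_endpoints()):
        print(service_type)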

Epic:
https://trello.com/c/efqE5XMr/734-sprint-13-refactor-python-tempestconf
Tasks: http://ow.ly/YOXh30jScEw


For any questions please find us in #tripleo

Thanks,

Matt

[1]
https://github.com/openstack/tripleo-specs/blob/master/specs/policy/ci-team-structure.rst
[2] https://etherpad.openstack.org/p/tripleo-ci-squad-meeting
[3] https://etherpad.openstack.org/p/tripleo-tempest-squad-meeting
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI Community Meeting tomorrow (2018-04-24)

2018-04-23 Thread Matt Young
Greetings,

Tomorrow the CI team will be hosting its weekly Community Meeting. We
welcome any/all to join.  The meeting is a place to discuss any concerns /
questions / issues from the community regarding CI.

It will (as usual) be held immediately following the general #tripleo
meeting on BlueJeans [2], typically ~14:30 UTC.  Please feel free to add
items to the agenda [2] or simply come and chat.

Thanks,

Matt

[1] https://bluejeans.com/7050859455

[2] https://etherpad.openstack.org/p/tripleo-ci-squad-meeting
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][ci][ceph] switching to config-download by default

2018-04-20 Thread James Slagle
On Thu, Apr 5, 2018 at 10:38 AM, James Slagle  wrote:
> I've pushed up for review a set of patches to switch us over to using
> config-download by default:
>
> https://review.openstack.org/#/q/topic:bp/config-download-default
>
> I believe I've come up with the proper series of steps to switch
> things over. Let me know if you have any feedback or foresee any
> issues:
>
> First, we update remaining multinode jobs
> (https://review.openstack.org/558965) and ovb jobs
> (https://review.openstack.org/559067) that run against master to
> opt-in to config-download. This will expose any issues with these jobs
> and config-download and let us fix those issues.
>
> We can then switch tripleoclient (https://review.openstack.org/558925)
> over to use config-download by default. Since this also requires a
> Heat environment, we must forcibly inject that environment via
> tripleoclient.

FYI, the above work is completed and config-download is now the
default with tripleoclient.

>
> Once the tripleoclient patch lands, we can update
> tripleo-heat-templates to use the mappings from config-download in the
> default resource registry (https://review.openstack.org/558927).
>
> We can then remove the forcibly injected environment from
> tripleoclient (https://review.openstack.org/558931)

We're now moving forward with the above 2 patches. jtomasek is making
good progress with the UI and support for config-download should be
landing there soon.

>
> Finally, we can go back and update the multinode/ovb jobs on master to
> not be opt-in for config-download since it would now be the default
> (no patch yet).
>
> Now...for Ceph it will be slightly different:

It took some CI wrangling, but Ceph is now switched over to use
external_deploy_tasks. There are patches in progress to clean up the
old workflow_tasks:

https://review.openstack.org/563040
https://review.openstack.org/563113

There will be some further patches for CI to remove other explicit
opt-in's for config-download since it's now the default.

Feel free to ping me directly if you think you've found any issues
related to any of the config-download work, or file bugs in launchpad
using the official "config-download" tag.

-- 
-- James Slagle
--

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI & Tempest squad planning summary: Sprint 12

2018-04-18 Thread Matt Young
Greetings,

The TripleO CI & Tempest squads have begun work on Sprint 12.  This is a 3
week sprint.

The Ruck & Rover for this sprint are quiquell and panda.

## CI Squad

Goals:

"As a developer, I want reproduce a multinode CI job on a bare metal host
using libvirt"
"Enable the same workflows used in upstream CI / reproducer using libvirt
instead of OVB"

Epic:  https://trello.com/c/JEGLSVh6/323-reproduce-ci-jobs-with-libvirt
Tasks: https://tinyurl.com/yd93nz8p

## Tempest Squad

Goals:

"Run tempest on undercloud by using containerized and packaged tempest as
well as against Heat, Mistral, Ironic, Tempest and python-tempestconf
upstream"
"Finish work items carried from sprint 11 or other side work going on."

Epic:  https://trello.com/c/ifIYQsxs/680-sprint-12-undercloud-tempest
Tasks: https://tinyurl.com/y8k6yvbm

For any questions please find us in #tripleo

Thanks,

Matt
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI / Tempest Sprint 11 Summary

2018-04-13 Thread Matt Young
Greetings,

The TripleO squads for CI and Tempest have just completed Sprint 11.  The
following is a summary of activities during this sprint.  The newly formed
Tempest Squad has completed its first sprint.  Details on the team
structure can be found in the spec [1].

Sprint 11 Epic (CI Squad): Upgrades
Epic Card: https://trello.com/c/8pbRwBps/549-upstream-upgrade-ci

This is the second sprint that the team focused on CI for upgrades.  We
expect additional sprints will be needed focused on upgrades, and have a
number of backlog tasks remaining as well [2]

We did the following:
* Prune and remove old / irrelevant jobs from CI
* Assess the current state of existing jobs to determine status and issues.
* Ensure the reproducer script enables the correct branches of
tripleo-upgrade
* Implement “Keystone Only” CI job.  This is a minimal deployment with the
smallest set of services (keystone + deps) in play.
   * tripleo-ci-centos-7-scenario000-multinode-oooq-container-updates
* Consolidate docker namespaces between docker.io and rdoproject.org


Sprint 11 Epic (Tempest Squad): Containerize Tempest
Epic Card: https://trello.com/c/066JFJjf/537-epic-containerize-tempest


As noted above, this is the first sprint for the newly formed Tempest
Squad.  The work was a combination of the sprint epic and team members’
pre-existing work that is nearing completion.

We did the following:
* Fix tempest plugins upgrade issue (RHOS 10>11>12>13)
* Switch to stestr to run tempest beginning with queens
* Move neutron CLI calls to openstack CLI
* Containerize tempest on featureset027 (UC idempotency)

We made progress on the following, but work remains and continues in Sprint
12
* Refactor validate-tempest CI role for UC and containers (reviews in
flight)
* Updates to ansible-role-openstack-certification playbook & CI jobs that
use it.
* Upstream documentation covering above work

Note:
We have added a new trello board [3] to archive completed sprint cards.
Previously we were archiving (trello operation) the cards, making it
difficult to analyze/search the past.

Ruck and Rover

Each sprint two of the team members assume the roles of Ruck and Rover
(each for half of the sprint).
* Ruck is responsible to monitoring the CI, checking for failures, opening
bugs, participate on meetings, and this is your focal point to any CI
issues.
* Rover is responsible to work on these bugs, fix problems and the rest of
the team are focused on the sprint. For more information about our
structure, check [1]

Ruck & Rover (Sprint 11), Etherpad [4]:
* Arx Cruz (arxcruz)
* Rafael Folco (rfolco)

Two issues in particular where substantial time was spent were:

http://bugs.launchpad.net/bugs/1757556 (SSH timeouts)
https://bugs.launchpad.net/tripleo/+bug/1760189 (AMQP issues)

The full list of bugs open or worked on were:

https://bugs.launchpad.net/tripleo/+bug/1763009
https://bugs.launchpad.net/tripleo/+bug/1762419
https://bugs.launchpad.net/tripleo/+bug/1762351
https://bugs.launchpad.net/tripleo/+bug/1761171
https://bugs.launchpad.net/tripleo/+bug/1760189
https://bugs.launchpad.net/bugs/1757556
https://bugs.launchpad.net/tripleo/+bug/1759868
https://bugs.launchpad.net/tripleo/+bug/1759876
https://bugs.launchpad.net/tripleo/+bug/1759583
https://bugs.launchpad.net/tripleo/+bug/1758143
https://bugs.launchpad.net/tripleo/+bug/1757134
https://bugs.launchpad.net/tripleo/+bug/1755485
https://bugs.launchpad.net/tripleo/+bug/1758932
https://bugs.launchpad.net/tripleo/+bug/1751180

If you have any questions and/or suggestions, please contact us in #tripleo

Thanks,

Matt

[1]
https://specs.openstack.org/openstack/tripleo-specs/specs/policy/ci-team-structure.html
[2]
https://trello.com/b/U1ITy0cu/tripleo-and-rdo-ci?menu=filter=label:upgrades

[3] https://trello.com/b/BjcIIp0f/tripleo-and-rdo-ci-archive
[4] https://review.rdoproject.org/etherpad/p/ruckrover-sprint11
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci] use of tags in launchpad bugs

2018-04-06 Thread Rafael Folco
Thanks for the clarifications about official tags. I was the one creating
random/non-official tags for tripleo bugs.
Although this may be annoying for some people, it helped me, while
ruckering/rovering CI for the first time(s), to open unique bugs and avoid
duplicates.
There isn't a standard way of filing a bug. People open bugs using
different/non-standard wording in summary and description.
I just thought it was a good idea to tag featuresetXXX, ovb, branch, etc.,
so when somebody asks me if there is a bug for the job XYZ, the bug could
be found more easily.

Since sprint 10, ruck/rover have been recording notes [1], and this helps
keep track of the issues.
Perhaps the CI team could implement something in CI monitoring that links a
bug to the failing job(s), e.g.: [LP XX].
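One rough way such a linkage could start is a launchpadlib query that pulls
open tripleo bugs by tag; the consumer name and tag values below are
hypothetical, and this is just one possible approach, not an agreed design:

    # Sketch only: assumes launchpadlib is installed; tag names are made up.
    from launchpadlib.launchpad import Launchpad

    lp = Launchpad.login_anonymously('ci-monitor-sketch', 'production')
    tripleo = lp.projects['tripleo']

    # Find open bugs carrying CI-related tags, one lookup per failing job.
    for task in tripleo.searchTasks(tags=['ci', 'alert'],
                                    status=['New', 'Triaged', 'In Progress']):
        print(task.bug.id, task.status, task.bug.title)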

I'm doing a cleanup for the open bugs removing the non-official tags.

Thanks,

--Folco

[1] https://review.rdoproject.org/etherpad/p/ruckrover-sprint11


On Fri, Apr 6, 2018 at 6:09 AM, Jiří Stránský  wrote:

> On 5.4.2018 21:04, Alex Schultz wrote:
>
>> On Thu, Apr 5, 2018 at 12:55 PM, Wesley Hayutin 
>> wrote:
>>
>>> FYI...
>>>
>>> This is news to me so thanks to Emilien for pointing it out [1].
>>> There are official tags for tripleo launchpad bugs.  Personally, I like
>>> what
>>> I've seen recently with some extra tags as they could be helpful in
>>> finding
>>> the history of particular issues.
>>> So hypothetically, would it be "wrong" to create an official tag for each
>>> featureset config number upstream?  I ask because that is adding a lot of
>>> tags but also serves as a good test case for what is good/bad use of
>>> tags.
>>>
>>>
>> We list official tags over in the specs repo[0].   That being said as
>> we investigate switching over to storyboard, we'll probably want to
>> revisit tags as they will have to be used more to replace some of the
>> functionality we had with launchpad (e.g. milestones).  You could
>> always add the tags without being an official tag. I'm not sure I
>> would really want all the featuresets as tags.  I'd rather see us
>> actually figure out what component is actually failing than relying on
>> a featureset (and the Rosetta stone for decoding featuresets to
>> functionality[1]).
>>
>
> We could also use both alongside. Component-based tags better relate to
> the actual root cause of the bug, while featureset-based tags are useful in
> relation to CI.
>
> E.g. "I see fs037 failing, i wonder if anyone already reported a bug for
> it" -- if the reporter tagged the bug, it would be really easy to figure
> out the answer.
>
> This might also again bring up the question of better job names to allow
> easier mapping to featuresets. IMO:
>
> tripleo-ci-centos-7-containers-multinode  -- not great
> tripleo-ci-centos-7-featureset010  -- not great
> tripleo-ci-centos-7-containers-mn-fs010  -- *happy face*
>
> Jirka
>
>
>
>>
>> Thanks,
>> -Alex
>>
>>
>> [0] http://git.openstack.org/cgit/openstack/tripleo-specs/tree/s
>> pecs/policy/bug-tagging.rst#n30
>> [1] https://git.openstack.org/cgit/openstack/tripleo-quickstart/
>> tree/doc/source/feature-configuration.rst#n21
>>
>>> Thanks
>>>
>>> [1] https://bugs.launchpad.net/tripleo/+manage-official-tags
>>>
>>> 
>>> __
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: openstack-dev-requ...@lists.op
>>> enstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>>
>> 
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscrib
>> e
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Rafael Folco
Senior Software Engineer
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci] use of tags in launchpad bugs

2018-04-06 Thread Jiří Stránský

On 5.4.2018 21:04, Alex Schultz wrote:

On Thu, Apr 5, 2018 at 12:55 PM, Wesley Hayutin  wrote:

FYI...

This is news to me so thanks to Emilien for pointing it out [1].
There are official tags for tripleo launchpad bugs.  Personally, I like what
I've seen recently with some extra tags as they could be helpful in finding
the history of particular issues.
So hypothetically, would it be "wrong" to create an official tag for each
featureset config number upstream?  I ask because that is adding a lot of
tags but also serves as a good test case for what is good/bad use of tags.



We list official tags over in the specs repo[0].   That being said as
we investigate switching over to storyboard, we'll probably want to
revisit tags as they will have to be used more to replace some of the
functionality we had with launchpad (e.g. milestones).  You could
always add the tags without being an official tag. I'm not sure I
would really want all the featuresets as tags.  I'd rather see us
actually figure out what component is actually failing than relying on
a featureset (and the Rosetta stone for decoding featuresets to
functionality[1]).


We could also use both alongside. Component-based tags better relate to 
the actual root cause of the bug, while featureset-based tags are useful 
in relation to CI.


E.g. "I see fs037 failing, i wonder if anyone already reported a bug for 
it" -- if the reporter tagged the bug, it would be really easy to figure 
out the answer.


This might also again bring up the question of better job names to allow 
easier mapping to featuresets. IMO:


tripleo-ci-centos-7-containers-multinode  -- not great
tripleo-ci-centos-7-featureset010  -- not great
tripleo-ci-centos-7-containers-mn-fs010  -- *happy face*

Jirka




Thanks,
-Alex


[0] 
http://git.openstack.org/cgit/openstack/tripleo-specs/tree/specs/policy/bug-tagging.rst#n30
[1] 
https://git.openstack.org/cgit/openstack/tripleo-quickstart/tree/doc/source/feature-configuration.rst#n21

Thanks

[1] https://bugs.launchpad.net/tripleo/+manage-official-tags

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo][ci] use of tags in launchpad bugs

2018-04-05 Thread Alex Schultz
On Thu, Apr 5, 2018 at 12:55 PM, Wesley Hayutin  wrote:
> FYI...
>
> This is news to me so thanks to Emilien for pointing it out [1].
> There are official tags for tripleo launchpad bugs.  Personally, I like what
> I've seen recently with some extra tags as they could be helpful in finding
> the history of particular issues.
> So hypothetically would it be "wrong" to create an official tag for each
> featureset config number upstream.  I ask because that is adding a lot of
> tags but also serves as a good test case for what is good/bad use of tags.
>

We list official tags over in the specs repo[0].   That being said as
we investigate switching over to storyboard, we'll probably want to
revisit tags as they will have to be used more to replace some of the
functionality we had with launchpad (e.g. milestones).  You could
always add the tags without being an official tag. I'm not sure I
would really want all the featuresets as tags.  I'd rather see us
actually figure out what component is actually failing than relying on
a featureset (and the Rosetta stone for decoding featuresets to
functionality[1]).


Thanks,
-Alex


[0] 
http://git.openstack.org/cgit/openstack/tripleo-specs/tree/specs/policy/bug-tagging.rst#n30
[1] 
https://git.openstack.org/cgit/openstack/tripleo-quickstart/tree/doc/source/feature-configuration.rst#n21
> Thanks
>
> [1] https://bugs.launchpad.net/tripleo/+manage-official-tags


[openstack-dev] [tripleo][ci] use of tags in launchpad bugs

2018-04-05 Thread Wesley Hayutin
FYI...

This is news to me so thanks to Emilien for pointing it out [1].
There are official tags for tripleo launchpad bugs.  Personally, I like
what I've seen recently with some extra tags as they could be helpful in
finding the history of particular issues.
So hypothetically would it be "wrong" to create an official tag for each
featureset config number upstream.  I ask because that is adding a lot of
tags but also serves as a good test case for what is good/bad use of tags.

Thanks

[1] https://bugs.launchpad.net/tripleo/+manage-official-tags


Re: [openstack-dev] [TripleO][ci][ceph] switching to config-download by default

2018-04-05 Thread James Slagle
On Thu, Apr 5, 2018 at 10:38 AM, James Slagle  wrote:
> I've pushed up for review a set of patches to switch us over to using
> config-download by default:
>
> https://review.openstack.org/#/q/topic:bp/config-download-default
>
> I believe I've come up with the proper series of steps to switch
> things over. Let me know if you have any feedback or foresee any
> issues:
>
> First, we update remaining multinode jobs
> (https://review.openstack.org/558965) and ovb jobs
> (https://review.openstack.org/559067) that run against master to
> opt-in to config-download. This will expose any issues with these jobs
> and config-download and let us fix those issues.
>
> We can then switch tripleoclient (https://review.openstack.org/558925)
> over to use config-download by default. Since this also requires a
> Heat environment, we must forcibly inject that environment via
> tripleoclient.
>
> Once the tripleoclient patch lands, we can update
> tripleo-heat-templates to use the mappings from config-download in the
> default resource registry (https://review.openstack.org/558927).

I forgot to mention that at this point the UI would have to be working
with config-download before we land that tripleo-heat-templates patch.
Or, the UI could opt-in to the
disable-config-download-environment.yaml that I'm providing with that
patch.


-- 
-- James Slagle
--



[openstack-dev] [TripleO][ci][ceph] switching to config-download by default

2018-04-05 Thread James Slagle
I've pushed up for review a set of patches to switch us over to using
config-download by default:

https://review.openstack.org/#/q/topic:bp/config-download-default

I believe I've come up with the proper series of steps to switch
things over. Let me know if you have any feedback or foresee any
issues:

First, we update remaining multinode jobs
(https://review.openstack.org/558965) and ovb jobs
(https://review.openstack.org/559067) that run against master to
opt-in to config-download. This will expose any issues with these jobs
and config-download and let us fix those issues.

We can then switch tripleoclient (https://review.openstack.org/558925)
over to use config-download by default. Since this also requires a
Heat environment, we must forcibly inject that environment via
tripleoclient.
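
(For context, opting in today is just an extra environment on the deploy
command line. The flag and file names below are a rough sketch from my
reading of the current tree -- check the patches above for the real thing:

$ # Opt in to config-download before it becomes the default:
$ openstack overcloud deploy --templates \
    --config-download \
    -e /usr/share/openstack-tripleo-heat-templates/environments/config-download-environment.yaml \
    -e <your usual environments>

Once it is the default, none of that will be needed.)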

Once the tripleoclient patch lands, we can update
tripleo-heat-templates to use the mappings from config-download in the
default resource registry (https://review.openstack.org/558927).

We can then remove the forcibly injected environment from
tripleoclient (https://review.openstack.org/558931)

Finally, we can go back and update the multinode/ovb jobs on master to
not be opt-in for config-download since it would now be the default
(no patch yet).

Now...for Ceph it will be slightly different:

We have a patch that migrates from workflow_tasks to
external_deploy_tasks (https://review.openstack.org/#/c/546966/) and
that depends on a quickstart patch to update the Ceph scenarios to use
config-download (https://review.openstack.org/#/c/548306/). These
patches are co-dependencies and present a problem in that they both
must land at the same time.

To work around that, I think we need to update the
tripleo-heat-templates patch to include both the existing
workflow_tasks *and* the new external_deploy_tasks. Once we've proven
the external_deploy_tasks work, we remove the depends-on and land the
tripleo-heat-templates patch. It will pass the existing Ceph scenario
jobs b/c they will be using workflow_tasks.

We then land the quickstart patch to switch those scenario jobs to use
external_deploy_tasks. Then we can circle back and remove
workflow_tasks from the ceph templates in tripleo-heat-templates.

I think this will allow everything to land and keep CI green along the
way. Please let me know any feedback as we plan to try and push on
this work over the next couple of weeks.

-- 
-- James Slagle
--



[openstack-dev] [tripleo][ci] FYI.. the full tempest execution removed from promotion criteria temporarily

2018-03-27 Thread Wesley Hayutin
Greetings,

The upstream packages for master and queens have not been updated in
TripleO in 22 days.  We have come very close to a package promotion a
number of times, but failed for several different reasons.  In this latest
case, the full tempest job (featureset020) was discussed with both Alex and
Emilien, and we are temporarily removing it from the promotion criteria.
There are several performance issues at the moment that we are still
gathering details on, regarding the number of httpd processes on the
controller and the CPU usage of the Open vSwitch agents.

The full tempest job is very useful in discovering issues like this one
that might otherwise go undetected.  Removing it temporarily is a safe
operation because none of the upstream tripleo check or gate jobs run
full tempest.

As soon as the promotion is complete, with the containers, images, and repo
promoted, I will revert the patches that removed the full tempest run from
criteria.

Note that the tempest jobs are still running as I write this email and may
still pass; however, to ensure upstream gets promoted packages, the job has
been removed from criteria as a precaution.

Thank you


Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-15 Thread Sam P
Hi All,
 Sorry, Late to the party...
 I have added myself.

--- Regards,
Sampath


On Fri, Mar 16, 2018 at 9:31 AM, Ghanshyam Mann 
wrote:

> On Thu, Mar 15, 2018 at 9:45 PM, Adam Spiers  wrote:
> > Raoul Scarazzini  wrote:
> >>
> >> On 15/03/2018 01:57, Ghanshyam Mann wrote:
> >>>
> >>> Thanks all for starting the collaboration on this which is long pending
> >>> things and we all want to have some start on this.
> >>> Myself and SamP talked about it during OPS meetup in Tokyo and we
> talked
> >>> about below draft plan-
> >>> - Update the Spec - https://review.openstack.org/#/c/443504/. which is
> >>> almost ready as per SamP and his team is working on that.
> >>> - Start the technical debate on tooling we can use/reuse like Yardstick
> >>> etc, which is more this mailing thread.
> >>> - Accept the new repo for Eris under QA and start at least something in
> >>> Rocky cycle.
> >>> I am in for having meeting on this which is really good idea. non-IRC
> >>> meeting is totally fine here. Do we have meeting place and time setup ?
> >>> -gmann
> >>
> >>
> >> Hi Ghanshyam,
> >> as I wrote earlier in the thread it's no problem for me to offer my
> >> bluejeans channel, let's sort out which timeslice can be good. I've
> >> added to the main etherpad [1] my timezone (line 53), let's do all that
> >> so that we can create the meeting invite.
> >>
> >> [1] https://etherpad.openstack.org/p/extreme-testing-contacts
> >
> >
> > Good idea!  I've added mine.  We're still missing replies from several
> > key stakeholders though (lines 62++) - probably worth getting buy-in
> > from a few more people before we organise anything.  I'm pinging a few
> > on IRC with reminders about this.
> >
>
> Thanks rasca, aspiers. I have added myself there, and yeah, good idea to
> ping the remaining ones on IRC.
>
> -gmann
>


Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-15 Thread Ghanshyam Mann
On Thu, Mar 15, 2018 at 9:45 PM, Adam Spiers  wrote:
> Raoul Scarazzini  wrote:
>>
>> On 15/03/2018 01:57, Ghanshyam Mann wrote:
>>>
>>> Thanks all for starting the collaboration on this which is long pending
>>> things and we all want to have some start on this.
>>> Myself and SamP talked about it during OPS meetup in Tokyo and we talked
>>> about below draft plan-
>>> - Update the Spec - https://review.openstack.org/#/c/443504/. which is
>>> almost ready as per SamP and his team is working on that.
>>> - Start the technical debate on tooling we can use/reuse like Yardstick
>>> etc, which is more this mailing thread.
>>> - Accept the new repo for Eris under QA and start at least something in
>>> Rocky cycle.
>>> I am in for having meeting on this which is really good idea. non-IRC
>>> meeting is totally fine here. Do we have meeting place and time setup ?
>>> -gmann
>>
>>
>> Hi Ghanshyam,
>> as I wrote earlier in the thread it's no problem for me to offer my
>> bluejeans channel, let's sort out which timeslice can be good. I've
>> added to the main etherpad [1] my timezone (line 53), let's do all that
>> so that we can create the meeting invite.
>>
>> [1] https://etherpad.openstack.org/p/extreme-testing-contacts
>
>
> Good idea!  I've added mine.  We're still missing replies from several
> key stakeholders though (lines 62++) - probably worth getting buy-in
> from a few more people before we organise anything.  I'm pinging a few
> on IRC with reminders about this.
>

Thanks rasca, aspiers. I have added myself there, and yeah, good idea to
ping the remaining ones on IRC.

-gmann



Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-15 Thread Adam Spiers

Raoul Scarazzini  wrote:

On 15/03/2018 01:57, Ghanshyam Mann wrote:

Thanks all for starting the collaboration on this which is long pending
things and we all want to have some start on this.
Myself and SamP talked about it during OPS meetup in Tokyo and we talked
about below draft plan-
- Update the Spec - https://review.openstack.org/#/c/443504/. which is
almost ready as per SamP and his team is working on that.
- Start the technical debate on tooling we can use/reuse like Yardstick
etc, which is more this mailing thread. 
- Accept the new repo for Eris under QA and start at least something in
Rocky cycle.
I am in for having meeting on this which is really good idea. non-IRC
meeting is totally fine here. Do we have meeting place and time setup ?
-gmann


Hi Ghanshyam,
as I wrote earlier in the thread it's no problem for me to offer my
bluejeans channel, let's sort out which timeslice can be good. I've
added to the main etherpad [1] my timezone (line 53), let's do all that
so that we can create the meeting invite.

[1] https://etherpad.openstack.org/p/extreme-testing-contacts


Good idea!  I've added mine.  We're still missing replies from several
key stakeholders though (lines 62++) - probably worth getting buy-in
from a few more people before we organise anything.  I'm pinging a few
on IRC with reminders about this.



Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-15 Thread Raoul Scarazzini
On 15/03/2018 01:57, Ghanshyam Mann wrote:
> Thanks all for starting the collaboration on this which is long pending
> things and we all want to have some start on this.
> Myself and SamP talked about it during OPS meetup in Tokyo and we talked
> about below draft plan-
> - Update the Spec - https://review.openstack.org/#/c/443504/. which is
> almost ready as per SamP and his team is working on that.
> - Start the technical debate on tooling we can use/reuse like Yardstick
> etc, which is more this mailing thread. 
> - Accept the new repo for Eris under QA and start at least something in
> Rocky cycle.
> I am in for having meeting on this which is really good idea. non-IRC
> meeting is totally fine here. Do we have meeting place and time setup ?
> -gmann

Hi Ghanshyam,
as I wrote earlier in the thread it's no problem for me to offer my
bluejeans channel, let's sort out which timeslice can be good. I've
added to the main etherpad [1] my timezone (line 53), let's do all that
so that we can create the meeting invite.

[1] https://etherpad.openstack.org/p/extreme-testing-contacts

-- 
Raoul Scarazzini
ra...@redhat.com



Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-14 Thread Ghanshyam Mann
Thanks all for starting the collaboration on this which is long pending
things and we all want to have some start on this.

Myself and SamP talked about it during OPS meetup in Tokyo and we talked
about below draft plan-

- Update the Spec - https://review.openstack.org/#/c/443504/. which is
almost ready as per SamP and his team is working on that.
- Start the technical debate on tooling we can use/reuse like Yardstick
etc, which is more this mailing thread.
- Accept the new repo for Eris under QA and start at least something in
Rocky cycle.

I am in for having meeting on this which is really good idea. non-IRC
meeting is totally fine here. Do we have meeting place and time setup ?

-gmann

On Fri, Mar 9, 2018 at 8:16 PM, Bogdan Dobrelya  wrote:

> On 3/8/18 6:44 PM, Raoul Scarazzini wrote:
>
>> On 08/03/2018 17:03, Adam Spiers wrote:
>> [...]
>>
>>> Yes agreed again, this is a strong case for collaboration between the
>>> self-healing and QA SIGs.  In Dublin we also discussed the idea of the
>>> self-healing and API SIGs collaborating on the related topic of health
>>> check APIs.
>>>
>>
>> Guys, thanks a ton for your involvement in the topic, I am +1 to any
>> kind of meeting we can have to discuss this (like it was proposed by
>>
>
> Please count me in as well. I can't stop dreaming of Jepsen's Nemesis [0]
> hammering openstack to make it stronger :D
> Jokes aside, let's do our best to consolidate on frameworks and tools and
> ditch the NIH syndrome!
>
> [0] https://github.com/jepsen-io/jepsen/blob/master/jepsen/src/j
> epsen/nemesis.clj
>
> Adam) so I'll offer my bluejeans channel for whatever kind of meeting we
>> want to organize.
>> About the best practices part Georg was mentioning I'm 100% in
>> agreement, the testing methodologies are the first thing we need to care
>> about, starting from what we want to achieve.
>> That said, I'll keep studying Yardstick.
>>
>> Hope to hear from you soon, and thanks again!
>>
>>
>
> --
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando
>


Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-09 Thread Bogdan Dobrelya

On 3/8/18 6:44 PM, Raoul Scarazzini wrote:

On 08/03/2018 17:03, Adam Spiers wrote:
[...]

Yes agreed again, this is a strong case for collaboration between the
self-healing and QA SIGs.  In Dublin we also discussed the idea of the
self-healing and API SIGs collaborating on the related topic of health
check APIs.


Guys, thanks a ton for your involvement in the topic, I am +1 to any
kind of meeting we can have to discuss this (like it was proposed by


Please count me in as well. I can't stop dreaming of Jepsen's Nemesis 
[0] hammering openstack to make it stronger :D
Jokes aside, let's do our best to consolidate on frameworks and tools and 
ditch the NIH syndrome!


[0] 
https://github.com/jepsen-io/jepsen/blob/master/jepsen/src/jepsen/nemesis.clj



Adam) so I'll offer my bluejeans channel for whatever kind of meeting we
want to organize.
About the best practices part Georg was mentioning I'm 100% in
agreement, the testing methodologies are the first thing we need to care
about, starting from what we want to achieve.
That said, I'll keep studying Yardstick.

Hope to hear from you soon, and thanks again!




--
Best regards,
Bogdan Dobrelya,
Irc #bogdando



Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-08 Thread Adam Spiers

Raoul Scarazzini  wrote:

On 08/03/2018 17:03, Adam Spiers wrote:
[...]

Yes agreed again, this is a strong case for collaboration between the
self-healing and QA SIGs.  In Dublin we also discussed the idea of the
self-healing and API SIGs collaborating on the related topic of health
check APIs.


Guys, thanks a ton for your involvement in the topic, I am +1 to any
kind of meeting we can have to discuss this (like it was proposed by
Adam) so I'll offer my bluejeans channel for whatever kind of meeting we
want to organize.


Awesome, thanks - bluejeans would be great.


About the best practices part Georg was mentioning I'm 100% in
agreement, the testing methodologies are the first thing we need to care
about, starting from what we want to achieve.
That said, I'll keep studying Yardstick.

Hope to hear from you soon, and thanks again!


Yep - let's wait for people to catch up with the thread and hopefully
we'll get enough volunteers on

 https://etherpad.openstack.org/p/extreme-testing-contacts

for critical mass and then we can start discussing!  I think it's
especially important that we have the Eris folks on board since they
have already been working on this for a while.



Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-08 Thread Raoul Scarazzini
On 08/03/2018 17:03, Adam Spiers wrote:
[...]
> Yes agreed again, this is a strong case for collaboration between the
> self-healing and QA SIGs.  In Dublin we also discussed the idea of the
> self-healing and API SIGs collaborating on the related topic of health
> check APIs.

Guys, thanks a ton for your involvement in the topic, I am +1 to any
kind of meeting we can have to discuss this (like it was proposed by
Adam) so I'll offer my bluejeans channel for whatever kind of meeting we
want to organize.
About the best practices part Georg was mentioning I'm 100% in
agreement, the testing methodologies are the first thing we need to care
about, starting from what we want to achieve.
That said, I'll keep studying Yardstick.

Hope to hear from you soon, and thanks again!

-- 
Raoul Scarazzini
ra...@redhat.com



Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-08 Thread Adam Spiers
Georg Kunz  wrote: 

Hi Adam,

Raoul Scarazzini  wrote: 
In the meantime, I'll check yardstick to see which kind of bridge we 
can build to avoid reinventing the wheel. 


Great, thanks!  I wish I could immediately help with this, but I haven't had the 
chance to learn yardstick myself yet.  We should probably try to recruit 
someone from OPNFV to provide advice.  I've cc'd Georg who IIRC was the 
person who originally told me about yardstick :-)  He is an NFV expert and is 
also very interested in automated testing efforts: 

http://lists.openstack.org/pipermail/openstack-dev/2017-November/124942.html 

so he may be able to help with this architectural challenge. 


Thank you for bringing this up here. Better collaboration and sharing of knowledge, methodologies and tools across the communities is really what I'd like to see and facilitate. Hence, I am happy to help. 

I have already started to advertise the newly proposed QA SIG in the OPNFV test WG and I'll happily do the same for the self-healing SIG and any HA testing efforts in general. There is certainly some overlapping interest in these testing aspects between the QA SIG and the self-healing SIG and hence collaboration between both SIGs is crucial. 


That's fantastic - thank you so much! 

One remark regarding tools and frameworks: I consider the true value of a SIG to be a place for talking about methodologies and best practices: What do we need to test? What are the challenges? How can we approach this across communities? The tools and frameworks are important and we should investigate which tools are available, how good they are, how much they fit a given purpose, but at the end of the day they are tools meant to enable well designed testing methodologies. 


Agreed 100%. 


[snipped]

I'm beginning to think that maybe we should organise a video conference call 
to coordinate efforts between the various interested parties.  If there is 
appetite for that, the first question is: who wants to be involved?  To answer 
that, I have created an etherpad where interested people can sign up: 

https://etherpad.openstack.org/p/extreme-testing-contacts 

and I've cc'd people who I think would probably be interested.  Does this 
sound like a good approach? 


We discussed a very similar idea in Dublin in the context of the QA SIG. I very much like the idea of a cross-community, cross-team, and apparently even cross-SIG approach. 


Yes agreed again, this is a strong case for collaboration between the 
self-healing and QA SIGs.  In Dublin we also discussed the idea of the 
self-healing and API SIGs collaborating on the related topic of health 
check APIs. 




Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-07 Thread Georg Kunz
Hi Adam,

> Raoul Scarazzini  wrote:
> >On 06/03/2018 13:27, Adam Spiers wrote:
> >> Hi Raoul and all,
> >> Sorry for joining this discussion late!
> >[...]
> >> I do not work on TripleO, but I'm part of the wider OpenStack
> >> sub-communities which focus on HA[0] and more recently,
> >> self-healing[1].  With that hat on, I'd like to suggest that maybe
> >> it's possible to collaborate on this in a manner which is agnostic to
> >> the deployment mechanism.  There is an open spec on this>
> >> https://review.openstack.org/#/c/443504/
> >> which was mentioned in the Denver PTG session on destructive testing
> >> which you referenced[2].
> >[...]
> >>    https://www.opnfv.org/community/projects/yardstick
> >[...]
> >> Currently each sub-community and vendor seems to be reinventing HA
> >> testing by itself to some extent, which is easier to accomplish in
> >> the short-term, but obviously less efficient in the long-term.  It
> >> would be awesome if we could break these silos down and join efforts!
> >> :-)
> >
> >Hi Adam,
> >First of all thanks for your detailed answer. Then let me be honest
> >while saying that I didn't know yardstick.
> 
> Neither did I until Sydney, despite being involved with OpenStack HA for
> many years ;-)  I think this shows that either a) there is room for improved
> communication between the OpenStack and OPNFV communities, or b) I
> need to take my head out of the sand more often ;-)
> 
> >I need to start from scratch
> >here to understand what this project is. In any case, the exact meaning
> >of this thread is to involve people and have a more comprehensive look
> >at what's around.
> >The point here is that, as you can see from the tripleo-ha-utils spec
> >[1] I've created, the project is meant for TripleO specifically. On one
> >side this is a significant limitation, but on the other one, due to the
> >pluggable nature of the project, I think that integrations with other
> >software like you are proposing is not impossible.
> 
> Yep.  I totally sympathise with the tension between the need to get
> something working quickly, vs. the need to collaborate with the community
> in the most efficient way.
> 
> >Feel free to add your comments to the review.
> 
> The spec looks great to me; I don't really have anything to add, and I don't
> feel comfortable voting in a project which I know very little about.
> 
> >In the meantime, I'll check yardstick to see which kind of bridge we
> >can build to avoid reinventing the wheel.
> 
> Great, thanks!  I wish I could immediately help with this, but I haven't had 
> the
> chance to learn yardstick myself yet.  We should probably try to recruit
> someone from OPNFV to provide advice.  I've cc'd Georg who IIRC was the
> person who originally told me about yardstick :-)  He is an NFV expert and is
> also very interested in automated testing efforts:
> 
> http://lists.openstack.org/pipermail/openstack-dev/2017-November/124942.html
> 
> so he may be able to help with this architectural challenge.

Thank you for bringing this up here. Better collaboration and sharing of 
knowledge, methodologies and tools across the communities is really what I'd 
like to see and facilitate. Hence, I am happy to help.

I have already started to advertise the newly proposed QA SIG in the OPNFV test 
WG and I'll happily do the same for the self-healing SIG and any HA testing 
efforts in general. There is certainly some overlapping interest in these 
testing aspects between the QA SIG and the self-healing SIG and hence 
collaboration between both SIGs is crucial.

One remark regarding tools and frameworks: I consider the true value of a SIG 
to be a place for talking about methodologies and best practices: What do we 
need to test? What are the challenges? How can we approach this across 
communities? The tools and frameworks are important and we should investigate 
which tools are available, how good they are, how much they fit a given 
purpose, but at the end of the day they are tools meant to enable well designed 
testing methodologies.

> Also you should be aware that work has already started on Eris, the extreme
> testing framework proposed in this user story:
> 
> http://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/openstack_extreme_testing.html
> 
> and in the spec you already saw:
> 
> https://review.openstack.org/#/c/443504/
> 
> You can see ongoing work here:
> 
> https://github.com/LCOO/eris
> https://openstack-lcoo.atlassian.net/wiki/spaces/LCOO/pages/13393034/Eris+-+Extreme+Testing+Framework+for+OpenStack
> 
> It looks like there is a plan to propose a new SIG for this, although 
> personally I
> would be very happy to see it adopted by the self-healing SIG, since this
> framework is exactly what is needed for testing any self-healing mechanism.
> 
> I'm hoping that Sampath and/or Gautum will chip in here, since I think they're
> currently the main drivers for Eris.
> 
> 

Re: [openstack-dev] [TripleO][CI][QA][HA][Eris][LCOO] Validating HA on upstream

2018-03-07 Thread Adam Spiers

Raoul Scarazzini  wrote:

On 06/03/2018 13:27, Adam Spiers wrote:

Hi Raoul and all,
Sorry for joining this discussion late!

[...]

I do not work on TripleO, but I'm part of the wider OpenStack
sub-communities which focus on HA[0] and more recently,
self-healing[1].  With that hat on, I'd like to suggest that maybe
it's possible to collaborate on this in a manner which is agnostic to
the deployment mechanism.  There is an open spec on this:
https://review.openstack.org/#/c/443504/
which was mentioned in the Denver PTG session on destructive testing
which you referenced[2].

[...]

   https://www.opnfv.org/community/projects/yardstick

[...]

Currently each sub-community and vendor seems to be reinventing HA
testing by itself to some extent, which is easier to accomplish in the
short-term, but obviously less efficient in the long-term.  It would
be awesome if we could break these silos down and join efforts! :-)


Hi Adam,
First of all thanks for your detailed answer. Then let me be honest
while saying that I didn't know yardstick.


Neither did I until Sydney, despite being involved with OpenStack HA
for many years ;-)  I think this shows that either a) there is room
for improved communication between the OpenStack and OPNFV
communities, or b) I need to take my head out of the sand more often ;-)


I need to start from scratch
here to understand what this project is. In any case, the exact meaning
of this thread is to involve people and have a more comprehensive look
at what's around.
The point here is that, as you can see from the tripleo-ha-utils spec
[1] I've created, the project is meant for TripleO specifically. On one
side this is a significant limitation, but on the other one, due to the
pluggable nature of the project, I think that integrations with other
software like you are proposing is not impossible.


Yep.  I totally sympathise with the tension between the need to get
something working quickly, vs. the need to collaborate with the
community in the most efficient way.


Feel free to add your comments to the review.


The spec looks great to me; I don't really have anything to add, and I
don't feel comfortable voting in a project which I know very little
about.


In the meantime, I'll check yardstick to see which kind of bridge we
can build to avoid reinventing the wheel.


Great, thanks!  I wish I could immediately help with this, but I
haven't had the chance to learn yardstick myself yet.  We should
probably try to recruit someone from OPNFV to provide advice.  I've
cc'd Georg who IIRC was the person who originally told me about
yardstick :-)  He is an NFV expert and is also very interested in
automated testing efforts:

   http://lists.openstack.org/pipermail/openstack-dev/2017-November/124942.html

so he may be able to help with this architectural challenge.

Also you should be aware that work has already started on Eris, the
extreme testing framework proposed in this user story:

   
http://specs.openstack.org/openstack/openstack-user-stories/user-stories/proposed/openstack_extreme_testing.html

and in the spec you already saw:

   https://review.openstack.org/#/c/443504/

You can see ongoing work here:

   https://github.com/LCOO/eris
   
https://openstack-lcoo.atlassian.net/wiki/spaces/LCOO/pages/13393034/Eris+-+Extreme+Testing+Framework+for+OpenStack

It looks like there is a plan to propose a new SIG for this, although
personally I would be very happy to see it adopted by the self-healing
SIG, since this framework is exactly what is needed for testing any
self-healing mechanism.

I'm hoping that Sampath and/or Gautum will chip in here, since I think
they're currently the main drivers for Eris.

I'm beginning to think that maybe we should organise a video
conference call to coordinate efforts between the various interested
parties.  If there is appetite for that, the first question is: who
wants to be involved?  To answer that, I have created an etherpad
where interested people can sign up:

   https://etherpad.openstack.org/p/extreme-testing-contacts

and I've cc'd people who I think would probably be interested.  Does
this sound like a good approach?

Cheers,
Adam



Re: [openstack-dev] [TripleO][CI][QA][HA] Validating HA on upstream

2018-03-06 Thread Raoul Scarazzini
On 06/03/2018 13:27, Adam Spiers wrote:
> Hi Raoul and all,
> Sorry for joining this discussion late!
[...]
> I do not work on TripleO, but I'm part of the wider OpenStack
> sub-communities which focus on HA[0] and more recently,
> self-healing[1].  With that hat on, I'd like to suggest that maybe
> it's possible to collaborate on this in a manner which is agnostic to
> the deployment mechanism.  There is an open spec on this>    
> https://review.openstack.org/#/c/443504/
> which was mentioned in the Denver PTG session on destructive testing
> which you referenced[2].
[...]
>    https://www.opnfv.org/community/projects/yardstick
[...]
> Currently each sub-community and vendor seems to be reinventing HA
> testing by itself to some extent, which is easier to accomplish in the
> short-term, but obviously less efficient in the long-term.  It would
> be awesome if we could break these silos down and join efforts! :-)

Hi Adam,
First of all thanks for your detailed answer. Then let me be honest
while saying that I didn't know yardstick. I need to start from scratch
here to understand what this project is. In any case, the exact meaning
of this thread is to involve people and have a more comprehensive look
at what's around.
The point here is that, as you can see from the tripleo-ha-utils spec
[1] I've created, the project is meant for TripleO specifically. On one
side this is a significant limitation, but on the other one, due to the
pluggable nature of the project, I think that integrations with other
software like you are proposing is not impossible.
Feel free to add your comments to the review. In the meantime, I'll
check yardstick to see which kind of bridge we can build to avoid
reinventing the wheel.

Thanks a lot again for your involvement,

[1] https://review.openstack.org/#/c/548874/

-- 
Raoul Scarazzini
ra...@redhat.com



Re: [openstack-dev] [TripleO][CI][QA][HA] Validating HA on upstream

2018-03-06 Thread Adam Spiers

Hi Raoul and all,

Sorry for joining this discussion late!

Raoul Scarazzini  wrote:

TL;DR: we would like to change the way HA is tested upstream to avoid
being hit by avoidable bugs that the CI process should discover.

Long version:

Today, HA testing upstream consists only of verifying that a
three-controller setup comes up correctly and can spawn an instance. That's
something, but it's far from enough, since we continuously see "day
two" bugs.
We started covering this more than a year ago in internal CI and today
also on rdocloud using a project named tripleo-quickstart-utils [1].
Apart from its name, the project is not limited to tripleo-quickstart;
it covers three principal roles:

1 - stonith-config: a playbook that can be used to automate the creation
of fencing devices in the overcloud;
2 - instance-ha: a playbook that automates the seventeen manual steps
needed to configure instance HA in the overcloud, tests the result via
Rally, and verifies that instance HA works;
3 - validate-ha: a playbook that runs a series of disruptive actions in
the overcloud and verifies it always behaves correctly by deploying a
heat-template that involves all the overcloud components;


Yes, a more rigorous approach to HA testing obviously has huge value,
not just for TripleO deployments, but also for any type of OpenStack
deployment.


To make this usable upstream, we need to understand where to put this
code. Here are some choices:


[snipped]

I do not work on TripleO, but I'm part of the wider OpenStack
sub-communities which focus on HA[0] and more recently,
self-healing[1].  With that hat on, I'd like to suggest that maybe
it's possible to collaborate on this in a manner which is agnostic to
the deployment mechanism.  There is an open spec on this:

   https://review.openstack.org/#/c/443504/

which was mentioned in the Denver PTG session on destructive testing
which you referenced[2].

As mentioned in the self-healing SIG's session in Dublin[3], the OPNFV
community has already put a lot of effort into testing HA scenarios,
and it would be great if this work was shared across the whole
OpenStack community.  In particular they have a project called
Yardstick:

   https://www.opnfv.org/community/projects/yardstick

which contains a bunch of HA test cases:

   
http://docs.opnfv.org/en/latest/submodules/yardstick/docs/testing/user/userguide/15-list-of-tcs.html#h-a
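
(If you've never tried it: once yardstick is installed and pointed at the
cloud under test, running one of those HA cases is a one-liner.  The test
case file below is only illustrative -- pick a real one from the list above:

$ # Sketch, assuming a yardstick install and an openrc for the target cloud:
$ yardstick task start tests/opnfv/test_cases/opnfv_yardstick_tc019.yaml

Each case injects a fault, e.g. killing a controller service, and then
checks the SLA defined in the task file.)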

Currently each sub-community and vendor seems to be reinventing HA
testing by itself to some extent, which is easier to accomplish in the
short-term, but obviously less efficient in the long-term.  It would
be awesome if we could break these silos down and join efforts! :-)

Cheers,
Adam

[0] #openstack-ha on Freenode IRC
[1] https://wiki.openstack.org/wiki/Self-healing_SIG
[2] https://etherpad.openstack.org/p/qa-queens-ptg-destructive-testing
[3] https://etherpad.openstack.org/p/self-healing-ptg-rocky



Re: [openstack-dev] [TripleO][CI][QA] Validating HA on upstream

2018-03-02 Thread Raoul Scarazzini
On 02/03/2018 15:19, Emilien Macchi wrote:
> Talking with clarkb during PTG, we'll need to transform
> tripleo-quickstart-utils into a non-forked repo - or move the roles to
> an existing repo. But we can't continue to maintain this fork.
> Raoul, let us know what you think is best (move repo to OpenStack or
> move modules to an existing upstream repo).
> Thanks,

Hey Emilien,
I prepared this [1], which some folks have started to look at; maybe
it's what we need to move forward on this.
If you think something else needs to be done, let me know, I'll work on it.

Thanks,

[1] https://review.openstack.org/#/c/548874

-- 
Raoul Scarazzini
ra...@redhat.com



Re: [openstack-dev] [TripleO][CI][QA] Validating HA on upstream

2018-03-02 Thread Emilien Macchi
Talking with clarkb during PTG, we'll need to transform
tripleo-quickstart-utils into a non-forked repo - or move the roles to an
existing repo. But we can't continue to maintain this fork.

Raoul, let us know what you think is best (move repo to OpenStack or move
modules to an existing upstream repo).


Thanks,

On Fri, Feb 16, 2018 at 3:12 PM, Raoul Scarazzini  wrote:

> On 16/02/2018 15:41, Wesley Hayutin wrote:
> [...]
> > Using galaxy is an option; however, we would need to make sure that galaxy
> > is proxied across the upstream clouds.
> > Another option would be to follow the currently established pattern of
> > adding it to the requirements file [1]
> > Thanks Bogdan, Raoul!
> > [1] https://github.com/openstack/tripleo-quickstart/blob/master/quickstart-extras-requirements.txt
>
> This is how we're using it today in the internal pipelines, so once we
> have tripleo-ha-utils (or whatever it will be called) it will
> be just a matter of adding it to the file. In the end, I think that
> once the project is created, either way of using it will be fine.
>
> Thanks for your involvement on this folks!
>
> --
> Raoul Scarazzini
> ra...@redhat.com
>



-- 
Emilien Macchi


Re: [openstack-dev] [TripleO][CI][QA] Validating HA on upstream

2018-02-16 Thread Raoul Scarazzini
On 16/02/2018 15:41, Wesley Hayutin wrote:
[...]
> Using galaxy is an option; however, we would need to make sure that galaxy
> is proxied across the upstream clouds.
> Another option would be to follow the currently established pattern of
> adding it to the requirements file [1]
> Thanks Bogdan, Raoul!
> [1] 
> https://github.com/openstack/tripleo-quickstart/blob/master/quickstart-extras-requirements.txt

This is how we're using it today in the internal pipelines, so once we
have tripleo-ha-utils (or whatever it will be called) it will
be just a matter of adding it to the file. In the end, I think that
once the project is created, either way of using it will be fine.

Thanks for your involvement on this folks!

-- 
Raoul Scarazzini
ra...@redhat.com



Re: [openstack-dev] [TripleO][CI][QA] Validating HA on upstream

2018-02-16 Thread Wesley Hayutin
On Fri, Feb 16, 2018 at 9:16 AM, Bogdan Dobrelya 
wrote:

> On 2/16/18 2:59 PM, Raoul Scarazzini wrote:
>
>> On 16/02/2018 10:24, Bogdan Dobrelya wrote:
>> [...]
>>
>>> +1 this looks like a perfect fit. Would it be possible to install that
>>> tripleo-ha-utils/tripleo-quickstart-utils with ansible-galaxy, alongside
>>> the quickstart, then apply destructive-testing playbooks with either the
>>> quickstart's static inventory [0] (from your admin/control node) or
>>> maybe via dynamic inventory [1] (from undercloud managing the overcloud
>>> under test via config-download and/or external ansible deployment
>>> mechanisms)?
>>> [0]
>>> https://git.openstack.org/cgit/openstack/tripleo-quickstart/tree/roles/tripleo-inventory
>>> [1]
>>> https://git.openstack.org/cgit/openstack/tripleo-validations/tree/scripts/tripleo-ansible-inventory
>>>
>>
>> Hi Bogdan,
>> thanks for your answer. On the inventory side of things these playbooks
>> work on any kind of inventory; we're using them at the moment with both
>> manual and quickstart-generated environments, or even infrared ones.
>> We're able to run them at the same time the environment gets deployed, or
>> later on as a day-two action.
>> What is not clear to me is the ansible-galaxy part you're mentioning;
>> today we rely on the github.com/redhat-openstack git repo, so we clone
>> it and then launch the playbooks via the ansible-playbook command. How do
>> you see ansible-galaxy fitting into the picture?
>>
>
> Git clone just works as well... Though, I was thinking of some minimal
> integration via *playbooks* (not roles) in quickstart/tripleo-validations
> and *external* roles. So the in-repo playbooks will be referencing those
> external destructive testing roles. While the roles are installed with
> galaxy, like:
>
> $ ansible-galaxy install git+https://$repo_name,master -p
> $external_roles_path
>
> or prolly adding the $repo_name and $release (master or a tag) into some
> galaxy-requirements.yaml file and install from it:
>
> $ ansible-galaxy install --force -r quickstart-extras/playbooks/external/galaxy-requirements.yaml -p $external_roles_path
>
> Then invoked for quickstart-extras/tripleo-validations like:
>
> $ ansible-playbook -i inventory quickstart-extras/playbooks/external/destructive-tests.yaml
>
>
>> Thanks!
>>
>>
Using galaxy is an option; however, we would need to make sure that galaxy is
proxied across the upstream clouds.
Another option would be to follow the currently established pattern of adding
it to the requirements file [1]

Thanks Bogdan, Raoul!

[1]
https://github.com/openstack/tripleo-quickstart/blob/master/quickstart-extras-requirements.txt
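
For the record, that pattern is just one pip-style line per role repo, so
the new project would be a single extra entry.  Hypothetical egg name below,
assuming the repo lands under the openstack namespace:

$ # Sketch of the one-line addition to quickstart-extras-requirements.txt:
$ echo 'git+https://git.openstack.org/openstack/tripleo-ha-utils/#egg=tripleo-ha-utils' \
    >> quickstart-extras-requirements.txt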


>
> --
> Best regards,
> Bogdan Dobrelya,
> Irc #bogdando
>


Re: [openstack-dev] [TripleO][CI][QA] Validating HA on upstream

2018-02-16 Thread Bogdan Dobrelya

On 2/16/18 2:59 PM, Raoul Scarazzini wrote:

On 16/02/2018 10:24, Bogdan Dobrelya wrote:
[...]

+1 this looks like a perfect fit. Would it be possible to install that
tripleo-ha-utils/tripleo-quickstart-utils with ansible-galaxy, alongside
the quickstart, then apply destructive-testing playbooks with either the
quickstart's static inventory [0] (from your admin/control node) or
maybe via dynamic inventory [1] (from undercloud managing the overcloud
under test via config-download and/or external ansible deployment
mechanisms)?
[0]
https://git.openstack.org/cgit/openstack/tripleo-quickstart/tree/roles/tripleo-inventory
[1]
https://git.openstack.org/cgit/openstack/tripleo-validations/tree/scripts/tripleo-ansible-inventory


Hi Bogdan,
thanks for your answer. On the inventory side of things these playbooks
work on any kind of inventory; we're using them at the moment with both
manual and quickstart-generated environments, or even infrared ones.
We're able to run them at the same time the environment gets deployed, or
later on as a day-two action.
What is not clear to me is the ansible-galaxy part you're mentioning;
today we rely on the github.com/redhat-openstack git repo, so we clone
it and then launch the playbooks via the ansible-playbook command. How do
you see ansible-galaxy fitting into the picture?


Git clone just works as well... Though, I was thinking of some minimal 
integration via *playbooks* (not roles) in 
quickstart/tripleo-validations and *external* roles. So the in-repo 
playbooks will be referencing those external destructive testing roles. 
While the roles are installed with galaxy, like:


$ ansible-galaxy install git+https://$repo_name,master -p 
$external_roles_path


or prolly adding the $repo_name and $release (master or a tag) into some 
galaxy-requirements.yaml file and install from it:


$ ansible-galaxy install --force -r 
quickstart-extras/playbooks/external/galaxy-requirements.yaml -p 
$external_roles_path


Then invoked for quickstart-extras/tripleo-validations like:

$ ansible-playbook -i inventory 
quickstart-extras/playbooks/external/destructive-tests.yaml
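
And for completeness, the requirements file itself would be tiny. A
hypothetical example in ansible-galaxy's standard role-requirements format
(src/version/name below are illustrative):

$ cat > quickstart-extras/playbooks/external/galaxy-requirements.yaml <<'EOF'
# external roles to pull in at runtime:
- src: https://github.com/redhat-openstack/tripleo-quickstart-utils
  version: master
  name: tripleo-ha-utils
EOF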




Thanks!




--
Best regards,
Bogdan Dobrelya,
Irc #bogdando



Re: [openstack-dev] [TripleO][CI][QA] Validating HA on upstream

2018-02-16 Thread Raoul Scarazzini
On 16/02/2018 10:24, Bogdan Dobrelya wrote:
[...]
> +1 this looks like a perfect fit. Would it be possible to install that
> tripleo-ha-utils/tripleo-quickstart-utils with ansible-galaxy, alongside
> the quickstart, then apply destructive-testing playbooks with either the
> quickstart's static inventory [0] (from your admin/control node) or
> maybe via dynamic inventory [1] (from undercloud managing the overcloud
> under test via config-download and/or external ansible deployment
> mechanisms)?
> [0]
> https://git.openstack.org/cgit/openstack/tripleo-quickstart/tree/roles/tripleo-inventory
> [1]
> https://git.openstack.org/cgit/openstack/tripleo-validations/tree/scripts/tripleo-ansible-inventory

Hi Bogdan,
thanks for your answer. On the inventory side of things these playbooks
work on any kind of inventory; we're using them at the moment with both
manual and quickstart-generated environments, or even infrared ones.
We're able to run them at the same time the environment gets deployed, or
later on as a day-two action.
What is not clear to me is the ansible-galaxy part you're mentioning;
today we rely on the github.com/redhat-openstack git repo, so we clone
it and then launch the playbooks via the ansible-playbook command. How do
you see ansible-galaxy fitting into the picture?

Thanks!

-- 
Raoul Scarazzini
ra...@redhat.com



Re: [openstack-dev] [TripleO][CI] Validating HA on upstream

2018-02-16 Thread Bogdan Dobrelya

On 2/15/18 8:22 PM, Raoul Scarazzini wrote:

TL;DR: we would like to change the way HA is tested upstream to avoid
being hit by avoidable bugs that the CI process should discover.

Long version:

Today, HA testing upstream consists only of verifying that a
three-controller setup comes up correctly and can spawn an instance. That's
something, but it's far from enough, since we continuously see "day
two" bugs.
We started covering this more than a year ago in internal CI and today
also on rdocloud using a project named tripleo-quickstart-utils [1].
Apart from its name, the project is not limited to tripleo-quickstart;
it covers three principal roles:

1 - stonith-config: a playbook that can be used to automate the creation
of fencing devices in the overcloud;
2 - instance-ha: a playbook that automates the seventeen manual steps
needed to configure instance HA in the overcloud, tests the result via
Rally, and verifies that instance HA works;
3 - validate-ha: a playbook that runs a series of disruptive actions in
the overcloud and verifies it always behaves correctly by deploying a
heat-template that involves all the overcloud components;

To make this usable upstream, we need to understand where to put this
code. Here are some choices:

1 - tripleo-validations: the most logical place to put this, at least
looking at the name, would be tripleo-validations. I've talked with some
of the folks working on it, and it turned out that the purpose of the
tripleo-validations project is not to do disruptive tests. Integrating
this stuff would be out of scope.

2 - tripleo-quickstart-extras: apart from the fact that this is not
something meant just for quickstart (the project supports infrared and
"plain" environments as well) even if we initially started there, in the
end, it came out that nobody was looking at the patches since nobody was
able to verify them. The result was a series of reviews stuck forever.
So moving back to extras would be a step backward.

3 - Dedicated project (tripleo-ha-utils or just tripleo-utils): like for
tripleo-upgrades or tripleo-validations, it would be ideal to have all of
this grouped and usable as a standalone thing. Any integration is
possible inside the playbook for whatever kind of test. Today we're


+1 this looks like a perfect fit. Would it be possible to install that 
tripleo-ha-utils/tripleo-quickstart-utils with ansible-galaxy, alongside 
the quickstart, then apply destructive-testing playbooks with either the 
quickstart's static inventory [0] (from your admin/control node) or 
maybe via dynamic inventory [1] (from undercloud managing the overcloud 
under test via config-download and/or external ansible deployment 
mechanisms)?


[0] 
https://git.openstack.org/cgit/openstack/tripleo-quickstart/tree/roles/tripleo-inventory
[1] 
https://git.openstack.org/cgit/openstack/tripleo-validations/tree/scripts/tripleo-ansible-inventory
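
To illustrate the dynamic inventory path (a sketch from memory, assuming
tripleo-validations is installed on the undercloud -- the script reads the
stackrc credentials and emits the overcloud inventory on the fly):

$ source ~/stackrc
$ ansible-playbook -i /usr/bin/tripleo-ansible-inventory \
    path/to/destructive-tests.yaml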



using the bash framework to interact with the cluster, rally to test
instance-ha and Ansible itself to simulate full power outage scenarios.

There's been a lot of talk about this during the last PTG [2], and
unfortunately, I'll not be part of the next one, but I would like to see
things moving on this side.
Everything I wrote is of course up for discussion; that's precisely the
point of this mail.

Thanks to all who'll give advice, suggestions, and thoughts about all
this stuff.

[1] https://github.com/redhat-openstack/tripleo-quickstart-utils
[2] https://etherpad.openstack.org/p/qa-queens-ptg-destructive-testing




--
Best regards,
Bogdan Dobrelya,
Irc #bogdando



[openstack-dev] [TripleO][CI] Validating HA on upstream

2018-02-15 Thread Raoul Scarazzini
TL;DR: we would like to change the way HA is tested upstream to avoid
being hit by avoidable bugs that the CI process should discover.

Long version:

Today, HA testing upstream consists only of verifying that a
three-controller setup comes up correctly and can spawn an instance. That's
something, but it's far from enough, since we continuously see "day
two" bugs.
We started covering this more than a year ago in internal CI and today
also on rdocloud using a project named tripleo-quickstart-utils [1].
Apart from its name, the project is not limited to tripleo-quickstart;
it covers three principal roles:

1 - stonith-config: a playbook that can be used to automate the creation
of fencing devices in the overcloud;
2 - instance-ha: a playbook that automates the seventeen manual steps
needed to configure instance HA in the overcloud, tests the result via
Rally, and verifies that instance HA works;
3 - validate-ha: a playbook that runs a series of disruptive actions in
the overcloud and verifies it always behaves correctly by deploying a
heat-template that involves all the overcloud components;
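
As a sketch, running the validate-ha piece against an existing
deployment would look roughly like this (inventory path and playbook
name are illustrative, not necessarily the repo's exact layout):

  # from a node that already has SSH access to the overcloud
  ansible-playbook -i hosts playbooks/overcloud-validate-ha.yml \
      -e local_working_dir=$PWD

(the role then runs its disruptive actions and checks that the cluster
recovers after each one)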

To make this usable upstream, we need to understand where to put this
code. Here are some choices:

1 - tripleo-validations: the most logical place to put this, at least
looking at the name, would be tripleo-validations. I've talked with some
of the folks working on it, and it came out that the tripleo-validations
project is not meant to run disruptive tests. Integrating this stuff
would be out of scope.

2 - tripleo-quickstart-extras: apart from the fact that this is not
something meant just for quickstart (the project supports infrared and
"plain" environments as well) even if we initially started there, in the
end, it came out that nobody was looking at the patches since nobody was
able to verify them. The result was a series of reviews stuck forever.
So moving back to extras would be a step backward.

3 - Dedicated project (tripleo-ha-utils or just tripleo-utils): as with
tripleo-upgrades or tripleo-validations, it would be ideal to have all
of this grouped and usable as a standalone thing. Any integration is
possible inside the playbook for whatever kind of test. Today we're
using the bash framework to interact with the cluster, rally to test
instance-ha and Ansible itself to simulate full power outage scenarios.

There's been a lot of talk about this during the last PTG [2], and
unfortunately, I'll not be part of the next one, but I would like to see
things moving on this side.
Everything I wrote is of course open to discussion; that's precisely the
point of this mail.

Thanks to all who'll give advice, suggestions, and thoughts about all
this stuff.

[1] https://github.com/redhat-openstack/tripleo-quickstart-utils
[2] https://etherpad.openstack.org/p/qa-queens-ptg-destructive-testing

-- 
Raoul Scarazzini
ra...@redhat.com

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][CI] Which network templates to use in CI (with and without net isolation)?

2018-01-04 Thread James Slagle
On Thu, Jan 4, 2018 at 5:26 PM, Sagi Shnaidman  wrote:
> Hi, all
>
> we have now network templates in tripleo-ci repo[1] and we'd like to move
> them to tht repo[2] and to use them from there.

They've already been moved from tripleo-ci to tripleo-heat-templates:
https://review.openstack.org/#/c/476708/

> We have also default
> templates defined in overcloud-deploy role[3].
> So the question is - which templates should we use and how to configure
> them?

We should use the ones for ci, not the examples under
tripleo-heat-templates/network/config. Those examples (especially for
multiple-nics) are meant to be clear and orderly so that users can
easily understand how to adapt them to their own environments.
Especially for multiple-nics, there isn't really a sane default, and I
don't think we should make our examples match what we use in ci.

It may be possible to update ovb so that it deploys virt environments
such that the examples work. That feels like a lot of unnecessary churn
though. But even then ci is using mtu:1350, which we don't want in the
examples.
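
For contrast, a CI job would pass the CI-specific environments rather
than those examples, along the lines of (paths illustrative):

  openstack overcloud deploy --templates \
      -e tripleo-heat-templates/environments/network-isolation.yaml \
      -e tripleo-heat-templates/ci/environments/network/multiple-nics/network-environment.yaml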

> One option for configuration is set network args (incl. isolation) in
> overcloud-deploy role[3] depending on other features (like docker, ipv6,
> etc).
> The other is to set them in featureset[4] files for each job.
> The question is also which network templates we want to gate in CI and
> should it be the same we have by default in tripleo-quickstart-extras?
>
> We have a few patches from James (@slagle) to address this topic[5]

What I'm trying to do in these patches is just use the templates and
environments from tripleo-heat-templates that were copied from
tripleo-ci in 476708. I gathered that was the intent since they were
copied into tripleo-heat-templates. Otherwise, why do we need them
there at all?

-- 
-- James Slagle
--

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] Which network templates to use in CI (with and without net isolation)?

2018-01-04 Thread Sagi Shnaidman
Hi, all

we now have network templates in the tripleo-ci repo[1] and we'd like to
move them to the tht repo[2] and use them from there. We also have
default templates defined in the overcloud-deploy role[3].
So the question is - which templates should we use and how to configure
them?
One option for configuration is to set network args (incl. isolation) in
the overcloud-deploy role[3] depending on other features (like docker,
ipv6, etc).
The other is to set them in featureset[4] files for each job.
The question is also which network templates we want to gate in CI, and
whether they should be the same ones we have by default in
tripleo-quickstart-extras.
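
For the featureset option, that would mean carrying the choice in each
job's config file, e.g. something like this (keys illustrative, cf. the
featureset001 lines linked at [4] below):

  cat >> config/general_config/featureset00X.yml <<'EOF'
  network_isolation: true
  network_isolation_type: 'multiple-nics'
  EOF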

We have a few patches from James (@slagle) to address this topic[5] and
from Arx for this issue[6].

Please feel free to share your thoughts on what should be tested in CI
from the network templates, and where.

Thanks

[1]
https://github.com/openstack-infra/tripleo-ci/tree/821d84f34c851a79495f0205ad3c8dac928c286f/test-environments

[2]
https://github.com/openstack/tripleo-heat-templates/tree/master/ci/environments/network

[3]
https://github.com/openstack/tripleo-quickstart-extras/blob/master/roles/overcloud-deploy/tasks/pre-deploy.yml#L21-L51

[4]
https://github.com/openstack/tripleo-quickstart/blob/cf793bbb8368f89cd28214fe21adca2df48ef7f3/config/general_config/featureset001.yml#L26-L28

[5] https://review.openstack.org/#/c/531224/
https://review.openstack.org/#/c/525331
https://review.openstack.org/#/c/531221

[6] https://review.openstack.org/#/c/512225/

-- 
Best regards
Sagi Shnaidman
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI promotion blockers

2018-01-02 Thread Julie Pichon
On 2 January 2018 at 16:30, Alex Schultz  wrote:
> On Tue, Jan 2, 2018 at 9:08 AM, Julie Pichon  wrote:
>> Hi!
>>
>> On 27 December 2017 at 16:48, Emilien Macchi  wrote:
>>> - Keystone removed _member_ role management, so we stopped using it
>>> (only Member is enough): https://review.openstack.org/#/c/529849/
>>
>> There's been so many issues with the default member role and Horizon
>> over the years, that one got my attention. I can see that
>> puppet-horizon still expects '_member_' for role management [1].
>> However trying to understand the Keystone patch linked to in the
>> commit, it looks like there's total freedom in which role name to use
>> so we can't just change the default in puppet-horizon to use 'Member'
>> as other consumers may expect and settle on '_member_' in their
>> environment. (Right?)
>>
>> In this case, the proper way to fix this for TripleO deployments may
>> be to make the change in instack-undercloud (I presume in [2]) so that
>> the default role is explicitly set to 'Member' for us? Does that sound
>> like the correct approach to get to a working Horizon?
>>
>
> We probably should at least change _member_ to Member in
> puppet-horizon. That fixes both projects for the default case.

Oh, I thought there was no longer a default and that TripleO was
creating the 'Member' role by itself? Fixing it directly in
puppet-horizon sounds ideal in general, if changing the default value
isn't expected to cause other issues.

Thanks,

Julie

>
> Thanks,
> -Alex
>
>> Julie
>>
>> [1] 
>> https://github.com/openstack/puppet-horizon/blob/master/manifests/init.pp#L458
>> [2] 
>> https://github.com/openstack/instack-undercloud/blob/master/elements/puppet-stack-config/puppet-stack-config.yaml.template#L622

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI promotion blockers

2018-01-02 Thread Alex Schultz
On Tue, Jan 2, 2018 at 9:08 AM, Julie Pichon  wrote:
> Hi!
>
> On 27 December 2017 at 16:48, Emilien Macchi  wrote:
>> - Keystone removed _member_ role management, so we stopped using it
>> (only Member is enough): https://review.openstack.org/#/c/529849/
>
> There's been so many issues with the default member role and Horizon
> over the years, that one got my attention. I can see that
> puppet-horizon still expects '_member_' for role management [1].
> However trying to understand the Keystone patch linked to in the
> commit, it looks like there's total freedom in which role name to use
> so we can't just change the default in puppet-horizon to use 'Member'
> as other consumers may expect and settle on '_member_' in their
> environment. (Right?)
>
> In this case, the proper way to fix this for TripleO deployments may
> be to make the change in instack-undercloud (I presume in [2]) so that
> the default role is explicitly set to 'Member' for us? Does that sound
> like the correct approach to get to a working Horizon?
>

We probably should at least change _member_ to Member in
puppet-horizon. That fixes both projects for the default case.

Thanks,
-Alex

> Julie
>
> [1] 
> https://github.com/openstack/puppet-horizon/blob/master/manifests/init.pp#L458
> [2] 
> https://github.com/openstack/instack-undercloud/blob/master/elements/puppet-stack-config/puppet-stack-config.yaml.template#L622
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI promotion blockers

2018-01-02 Thread Julie Pichon
Hi!

On 27 December 2017 at 16:48, Emilien Macchi  wrote:
> - Keystone removed _member_ role management, so we stopped using it
> (only Member is enough): https://review.openstack.org/#/c/529849/

There have been so many issues with the default member role and Horizon
over the years that this one got my attention. I can see that
puppet-horizon still expects '_member_' for role management [1].
However, trying to understand the Keystone patch linked to in the
commit, it looks like there's total freedom in which role name to use,
so we can't just change the default in puppet-horizon to use 'Member',
as other consumers may expect and settle on '_member_' in their
environment. (Right?)

In this case, the proper way to fix this for TripleO deployments may
be to make the change in instack-undercloud (I presume in [2]) so that
the default role is explicitly set to 'Member' for us? Does that sound
like the correct approach to get to a working Horizon?
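
If instack-undercloud is the right place, a minimal sketch of the
workaround could be a hieradata override (assuming puppet-horizon's
keystone_default_role parameter and undercloud.conf's hieradata_override
hook; untested):

  cat > ~/horizon-role.yaml <<'EOF'
  horizon::keystone_default_role: 'Member'
  EOF
  # then in undercloud.conf:
  #   hieradata_override = /home/stack/horizon-role.yaml
  openstack undercloud install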

Julie

[1] 
https://github.com/openstack/puppet-horizon/blob/master/manifests/init.pp#L458
[2] 
https://github.com/openstack/instack-undercloud/blob/master/elements/puppet-stack-config/puppet-stack-config.yaml.template#L622

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI promotion blockers

2018-01-01 Thread Emilien Macchi
We've got promotion today, thanks Wes & Sagi for your help.

On Sun, Dec 31, 2017 at 9:06 AM, Emilien Macchi  wrote:
> Here's an update on what we did the last days after merging these
> blockers mentioned in the previous email:
>
> - Ignore a failing test in tempest (workaround)
> https://review.rdoproject.org/r/#/c/8/ until
> https://review.openstack.org/#/c/526647/ is merged. It allowed RDO
> repos to be consistent again, so we could have the latest patches in
> TripleO, tested by Promotion jobs.
> - scenario001 is timeouting a lot, we moved tacker/congress to
> scenario007, and also removed MongoDB that was running for nothing.
> - tripleo-ci-centos-7-containers-multinode was timeouting a lot, we
> removed cinder and some other services already covered by scenarios,
> so tripleo-ci-centos-7-containers-multinode is like ovb, testing the
> minimum set of services (which is why we created this job).
> - fixing an ipv6 issue in puppet-tripleo:
> https://review.openstack.org/#/c/530219/
>
> All of the above is merged.
> Now the remaining blocker is to update the RDO CI layout for promotion jobs:
> See https://review.rdoproject.org/r/#/c/9/ and
> https://review.rdoproject.org/r/#/c/11120/
> Once it merges and job runs, we should get a promotion.
>
> Let me know any question,
>
> On Wed, Dec 27, 2017 at 8:48 AM, Emilien Macchi  wrote:
>> Just a heads-up about what we've done the last days to make progress
>> and hopefully get a promotion this week:
>>
>> - Disabling voting on scenario001, 002 and 003. They timeout too much,
>> we haven't figured out why yet but we'll look at it this week and next
>> week. Hopefully we can re-enable voting today or so.
>> - Kolla added Sensu support and it broke our container builds. It
>> should be fixed by https://review.openstack.org/#/c/529890/ and
>> https://review.openstack.org/530232
>> - Keystone removed _member_ role management, so we stopped using it
>> (only Member is enough): https://review.openstack.org/#/c/529849/
>> - Fixup MTU configuration for CI envs: 
>> https://review.openstack.org/#/c/527249
>> - Reduce memory for undercloud image convert:
>> https://review.openstack.org/#/c/530137/
>> - Remove policy.json default rules from Heat in THT:
>> https://review.openstack.org/#/c/530225
>>
>> That's pretty all. Due to the lack of reviewers during the Christmas
>> time, we had to land some patches ourselves. If there is any problem
>> with one of them, please let us know. We're trying to keep CI in
>> good shape this week and it's a bit of a challenge ;-)
>> --
>> Emilien Macchi
>
>
>
> --
> Emilien Macchi



-- 
Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [tripleo] CI promotion blockers

2017-12-31 Thread Emilien Macchi
Here's an update on what we did the last days after merging these
blockers mentioned in the previous email:

- Ignore a failing test in tempest (workaround)
https://review.rdoproject.org/r/#/c/8/ until
https://review.openstack.org/#/c/526647/ is merged. It allowed RDO
repos to be consistent again, so we could have the latest patches in
TripleO, tested by Promotion jobs.
- scenario001 is timing out a lot; we moved tacker/congress to
scenario007, and also removed MongoDB, which was running for nothing.
- tripleo-ci-centos-7-containers-multinode was timing out a lot; we
removed cinder and some other services already covered by scenarios,
so tripleo-ci-centos-7-containers-multinode is now like ovb, testing the
minimum set of services (which is why we created this job).
- fixing an ipv6 issue in puppet-tripleo:
https://review.openstack.org/#/c/530219/

All of the above is merged.
Now the remaining blocker is to update the RDO CI layout for promotion jobs:
See https://review.rdoproject.org/r/#/c/9/ and
https://review.rdoproject.org/r/#/c/11120/
Once it merges and job runs, we should get a promotion.

Let me know if you have any questions,

On Wed, Dec 27, 2017 at 8:48 AM, Emilien Macchi  wrote:
> Just a heads-up about what we've done the last days to make progress
> and hopefully get a promotion this week:
>
> - Disabling voting on scenario001, 002 and 003. They timeout too much,
> we haven't figured out why yet but we'll look at it this week and next
> week. Hopefully we can re-enable voting today or so.
> - Kolla added Sensu support and it broke our container builds. It
> should be fixed by https://review.openstack.org/#/c/529890/ and
> https://review.openstack.org/530232
> - Keystone removed _member_ role management, so we stopped using it
> (only Member is enough): https://review.openstack.org/#/c/529849/
> - Fixup MTU configuration for CI envs: https://review.openstack.org/#/c/527249
> - Reduce memory for undercloud image convert:
> https://review.openstack.org/#/c/530137/
> - Remove policy.json default rules from Heat in THT:
> https://review.openstack.org/#/c/530225
>
> That's pretty all. Due to the lack of reviewers during the Christmas
> time, we had to land some patches ourselves. If there is any problem
> with one of them, please let us know. We're trying to keep CI in
> good shape this week and it's a bit of a challenge ;-)
> --
> Emilien Macchi



-- 
Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI promotion blockers

2017-12-27 Thread Emilien Macchi
Just a heads-up about what we've done the last days to make progress
and hopefully get a promotion this week:

- Disabling voting on scenario001, 002 and 003. They time out too much;
we haven't figured out why yet, but we'll look at it this week and next
week. Hopefully we can re-enable voting today or so.
- Kolla added Sensu support and it broke our container builds. It
should be fixed by https://review.openstack.org/#/c/529890/ and
https://review.openstack.org/530232
- Keystone removed _member_ role management, so we stopped using it
(only Member is enough): https://review.openstack.org/#/c/529849/
- Fixup MTU configuration for CI envs: https://review.openstack.org/#/c/527249
- Reduce memory for undercloud image convert:
https://review.openstack.org/#/c/530137/
- Remove policy.json default rules from Heat in THT:
https://review.openstack.org/#/c/530225

That's pretty much all. Due to the lack of reviewers during the
Christmas period, we had to land some patches ourselves. If there is any
problem with one of them, please let us know. We're trying to keep CI in
good shape this week and it's a bit of a challenge ;-)
-- 
Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo][ci] log collection in upstream jobs

2017-11-10 Thread Wesley Hayutin
Greetings,

Infra asked the tripleo team to cut down on the logs we're producing
upstream.  We are using a lot of space on their servers and also it's
taking too long to collect the logs themselves.

We need to compromise and be flexible here, so I'd like the tripleo-ci
and tripleo cores to take another pass at this review w/ fresh eyes.  I
would ask that anything which would justify a -2 be called out and
worked on in this thread.  Please be specific; I don't want to hear that
we need all the logs to do our job, as that is not possible.  Thanks all!

https://review.openstack.org/#/c/511526/
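
If it helps reviewers, a whitelist along those lines would presumably
look something like this (variable name taken from the oooq-extras
collect-logs role; the actual list is in the review):

  # e.g. passed as extra vars to the collect-logs role
  cat > collect-vars.yml <<'EOF'
  artcl_collect_list:
    - /var/log/
    - /etc/nova/
    - /etc/neutron/
  EOF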
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI Status & Where to find your patch's status in CI

2017-10-19 Thread Alex Schultz
Hey Folks,

So the gate queue is quite backed up for various reasons. If your
patch has been approved but you're uncertain of the CI status please,
please, please check the dashboard[0] before doing anything.  Do not
rebase or recheck things currently in a queue somewhere. When you
rebase a patch that's in the gate queue it will reset every job behind
it and restart the jobs for that change.

I've noticed that due to various restarts we did lose some comments on
things that are actually in the gate but there was no update in
gerrit. So please take some time and check out the dashboard if you are
not certain whether a change is currently being tested.
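
As a quick alternative to loading the full dashboard, you can also query
the raw status feed for your change number (assuming the v3 dashboard
still exposes status.json the way the v2 one did):

  # change 530225 used purely as an example
  curl -s http://zuulv3.openstack.org/status.json \
      | python -m json.tool | grep -B2 -A8 '530225'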

Thanks,
-Alex


[0] http://zuulv3.openstack.org/

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] TripleO CI end of sprint status

2017-10-11 Thread Arx Cruz
Hello,

On October 10 we reached the end of our first sprint using our new team
structure [1], and here are the highlights:

TripleO CI Infra meeting notes:

- Zuul v3 related patch:
  - The new Zuul v3 doesn't have the cirros image cached, so we have a
    patch to change the tempest image back to the default value, i.e.
    downloading the image from the cirros website.
    - https://review.openstack.org/510839
- Zuul migration:
  - There will be an outage in order to fix some issues found during the
    Zuul migration to v3.
    - http://lists.openstack.org/pipermail/openstack-dev/2017-October/123337.html
- Job migration:
  - We are planning to start moving some jobs from rh1 cloud to RDO cloud.
- RDO Software Factory outage:
  - There was an outage on RDO cloud on October 9; some jobs were stalled
    for a long time, but now everything is working.


Sprint Review:

The sprint epic was utilizing the DLRN API across TripleO and RDO [2] to
report job status and promotions; we set out several tasks across 20
cards, and I am glad to report that we were able to complete 19 of them!
Some of these cards generated tech debt, and after a review we got 11
cards in the tech debt list, plus 3 new bugs opened and XYZ bugs closed
by the Ruck and Rover.
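
For the curious, the reporting side of that epic boils down to one call
like this per job run (dlrnapi-client flags from memory, so treat it as
a sketch only):

  dlrnapi --url https://trunk.rdoproject.org/api-centos-master-uc \
      report-result --job-id tripleo-ci-ovb-ha \
      --commit-hash $COMMIT_HASH --distro-hash $DISTRO_HASH \
      --info-url $LOG_URL --timestamp $(date +%s) --success true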

One can see the results of the sprint via https://tinyurl.com/yblqs5z2

Below is the list of new bugs related to the work completed in the sprint:

- https://bugs.launchpad.net/tripleo/+bug/1722552
- https://bugs.launchpad.net/tripleo/+bug/1722554
- https://bugs.launchpad.net/tripleo/+bug/1722558

And here is the list of what was done by the Ruck and Rover:

- https://bugs.launchpad.net/tripleo/+bug/1722640
- https://bugs.launchpad.net/tripleo/+bug/1722621
- https://bugs.launchpad.net/tripleo/+bug/1722596
- https://bugs.launchpad.net/tripleo/+bug/1721790
- https://bugs.launchpad.net/tripleo/+bug/1721366
- https://bugs.launchpad.net/tripleo/+bug/1721134
- https://bugs.launchpad.net/tripleo/+bug/1720556
- https://bugs.launchpad.net/tripleo/+bug/1719902
- https://bugs.launchpad.net/tripleo/+bug/1719421



[1] https://review.openstack.org/#/c/509280/

[2] https://trello.com/c/5FnfGByl
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][CI] FreeIPA Deployment

2017-09-07 Thread Harry Rybacki
On Thu, Aug 31, 2017 at 12:52 PM, Juan Antonio Osorio
 wrote:
> Something that just came to my mind: Another option would be to allocate an
> extra IP Address for the undercloud, that would be dedicated to FreeIPA, and
> that way we MAY be able to deploy the FreeIPA server in the undercloud. If
> folks are OK with this I could experiment on this front. Maybe I could try
> to run FreeIPA on a container [1] (which wasn't available when I started
> working on this).
>
Interesting idea, Ozz! I'm not sure what the security implications of
running them on the same host would be, if any.

I'm cc'ing Toure to discuss possible workflow approaches to this as well.

/R

> [1] https://hub.docker.com/r/freeipa/freeipa-server/
>
> On Sat, Aug 26, 2017 at 2:52 AM, Emilien Macchi  wrote:
>>
>> On Sun, Aug 20, 2017 at 11:45 PM, Juan Antonio Osorio
>>  wrote:
>> > The second option seems like the most viable. Not sure how the TripleO
>> > integration would go though. Care to elaborate on what you had in mind?
>>
>> Trying to reproduce what we did with ceph-ansible and use Mistral to
>> deploy FreeIPA with an external deployment tool.
>> Though I find the solution quite complex, maybe we can come-up with an
>> easier approach this time?
>> --
>> Emilien Macchi
>>
>> __
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
> --
> Juan Antonio Osorio R.
> e-mail: jaosor...@gmail.com
>
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI Squad Meeting Summary (week 35) - better late than never

2017-09-06 Thread Attila Darazs
If the topics below interest you and you want to contribute to the 
discussion, feel free to join the next meeting:


Time: Thursdays, 14:30-15:30 UTC
Place: https://bluejeans.com/4113567798/

Full minutes: https://etherpad.openstack.org/p/tripleo-ci-squad-meeting

Here are the topics discussed from last Thursday:

Downstream (RHEL based) upstream Quickstart gates are down, because we 
had to migrate from QEOS7 to the ci-rhos internal cloud, which is not 
able to support our jobs currently. Ronelle is going to talk to the
responsible people to get the problems there solved.


Tempest is now running in more and more scenario jobs. See Emilien's 
email[1] for details.


There's ongoing work from Emilien to get the upgrades job working on 
stable/pike. Please help with reviews to get it going.


Most of the squad's work is currently focusing on getting the periodic 
promotion pipeline on rdocloud working and uploading containers and images.


That's the short version, join us on the Thursday meeting or read the 
etherpad for more. :)


Best regards,
Attila

[1] 
http://lists.openstack.org/pipermail/openstack-dev/2017-September/121849.html


__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][CI] FreeIPA Deployment

2017-08-31 Thread Juan Antonio Osorio
Something that just came to my mind: Another option would be to allocate an
extra IP Address for the undercloud, that would be dedicated to FreeIPA,
and that way we MAY be able to deploy the FreeIPA server in the undercloud.
If folks are OK with this I could experiment on this front. Maybe I could
try to run FreeIPA on a container [1] (which wasn't available when I
started working on this).

[1] https://hub.docker.com/r/freeipa/freeipa-server/
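
Based on the image's README, the experiment would start out something
like this (flags from memory, so double-check them against the docs):

  docker run --name freeipa-server -ti \
      -h ipa.example.test \
      -e PASSWORD=Secret123 \
      -v /var/lib/ipa-data:/data:Z \
      freeipa/freeipa-server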

On Sat, Aug 26, 2017 at 2:52 AM, Emilien Macchi  wrote:

> On Sun, Aug 20, 2017 at 11:45 PM, Juan Antonio Osorio
>  wrote:
> > The second option seems like the most viable. Not sure how the TripleO
> > integration would go though. Care to elaborate on what you had in mind?
>
> Trying to reproduce what we did with ceph-ansible and use Mistral to
> deploy FreeIPA with an external deployment tool.
> Though I find the solution quite complex, maybe we can come-up with an
> easier approach this time?
> --
> Emilien Macchi
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
Juan Antonio Osorio R.
e-mail: jaosor...@gmail.com
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] CI design session at the PTG

2017-08-28 Thread Clark Boylan


On Mon, Aug 28, 2017, at 07:19 AM, Paul Belanger wrote:
> On Mon, Aug 28, 2017 at 09:42:45AM -0400, David Moreau Simard wrote:
> > Hi,
> > 
> > (cc whom I would at least like to attend)
> > 
> > The PTG would be a great opportunity to talk about CI design/layout
> > and how we see things moving forward in TripleO with Zuul v3, upstream
> > and in review.rdoproject.org.
> > 
> > Can we have a formal session on this scheduled somewhere ?
> > 
> Wednesday onwards likely is best for me, otherwise, I can find time
> during
> Mon-Tues if that is better.

The Zuulv3 stuff may be appropriate during the Infra team helproom on
Monday and Tuesday. There will be an afternoon Zuulv3 for OpenStack devs
session in Vail at 2pm Monday, but I think we generally plan on helping
with Zuulv3 during the entire helproom time.

Clark

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] CI design session at the PTG

2017-08-28 Thread David Moreau Simard
On Mon, Aug 28, 2017 at 10:25 AM, Wesley Hayutin  wrote:
> +1 from me, I'm sure John, Sagi, and Arx are also interested.

Yes, of course; I just went with those I knew were going to the PTG.
Anyone else is welcome to join as well!


David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]


On Mon, Aug 28, 2017 at 10:25 AM, Wesley Hayutin  wrote:
>
>
> On Mon, Aug 28, 2017 at 10:19 AM, Paul Belanger 
> wrote:
>>
>> On Mon, Aug 28, 2017 at 09:42:45AM -0400, David Moreau Simard wrote:
>> > Hi,
>> >
>> > (cc whom I would at least like to attend)
>> >
>> > The PTG would be a great opportunity to talk about CI design/layout
>> > and how we see things moving forward in TripleO with Zuul v3, upstream
>> > and in review.rdoproject.org.
>> >
>> > Can we have a formal session on this scheduled somewhere ?
>> >
>> Wednesday onwards likely is best for me, otherwise, I can find time during
>> Mon-Tues if that is better.
>>
>
> +1 from me, I'm sure John, Sagi, and Arx are also interested.
>
> Thanks
>

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] CI design session at the PTG

2017-08-28 Thread David Moreau Simard
On Mon, Aug 28, 2017 at 10:58 AM, Emilien Macchi  wrote:
> Yeah, this session would be interesting to do.
> Feel free to add it on https://etherpad.openstack.org/p/tripleo-ptg-queens
> We need to work on scheduling before the PTG but it would likely
> happen between Wednesday and Friday morning.

Good idea, I've added it to the etherpad [1] and I've created a pad
for the session as well [2].

[1]: https://etherpad.openstack.org/p/tripleo-ptg-queens
[2]: https://etherpad.openstack.org/p/tripleo-ptg-queens-ci

David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] CI design session at the PTG

2017-08-28 Thread Emilien Macchi
On Mon, Aug 28, 2017 at 6:42 AM, David Moreau Simard  wrote:
[...]
> Can we have a formal session on this scheduled somewhere ?

Yeah, this session would be interesting to do.
Feel free to add it on https://etherpad.openstack.org/p/tripleo-ptg-queens
We need to work on scheduling before the PTG but it would likely
happen between Wednesday and Friday morning.

Thanks,
-- 
Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] CI design session at the PTG

2017-08-28 Thread Wesley Hayutin
On Mon, Aug 28, 2017 at 10:19 AM, Paul Belanger 
wrote:

> On Mon, Aug 28, 2017 at 09:42:45AM -0400, David Moreau Simard wrote:
> > Hi,
> >
> > (cc whom I would at least like to attend)
> >
> > The PTG would be a great opportunity to talk about CI design/layout
> > and how we see things moving forward in TripleO with Zuul v3, upstream
> > and in review.rdoproject.org.
> >
> > Can we have a formal session on this scheduled somewhere ?
> >
> Wednesday onwards likely is best for me, otherwise, I can find time during
> Mon-Tues if that is better.
>
>
+1 from me, I'm sure John, Sagi, and Arx are also interested.

Thanks
__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] CI design session at the PTG

2017-08-28 Thread Paul Belanger
On Mon, Aug 28, 2017 at 09:42:45AM -0400, David Moreau Simard wrote:
> Hi,
> 
> (cc whom I would at least like to attend)
> 
> The PTG would be a great opportunity to talk about CI design/layout
> and how we see things moving forward in TripleO with Zuul v3, upstream
> and in review.rdoproject.org.
> 
> Can we have a formal session on this scheduled somewhere ?
> 
Wednesday onwards is likely best for me; otherwise, I can find time
during Mon-Tues if that is better.

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO] CI design session at the PTG

2017-08-28 Thread Harry Rybacki
On Mon, Aug 28, 2017 at 9:42 AM, David Moreau Simard  wrote:
> Hi,
>
> (cc whom I would at least like to attend)
>
> The PTG would be a great opportunity to talk about CI design/layout
> and how we see things moving forward in TripleO with Zuul v3, upstream
> and in review.rdoproject.org.
>
> Can we have a formal session on this scheduled somewhere ?
>
+1

> David Moreau Simard
> Senior Software Engineer | OpenStack RDO
>
> dmsimard = [irc, github, twitter]
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO] CI design session at the PTG

2017-08-28 Thread David Moreau Simard
Hi,

(cc whom I would at least like to attend)

The PTG would be a great opportunity to talk about CI design/layout
and how we see things moving forward in TripleO with Zuul v3, upstream
and in review.rdoproject.org.

Can we have a formal session on this scheduled somewhere ?

David Moreau Simard
Senior Software Engineer | OpenStack RDO

dmsimard = [irc, github, twitter]

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][CI] FreeIPA Deployment

2017-08-25 Thread Emilien Macchi
On Sun, Aug 20, 2017 at 11:45 PM, Juan Antonio Osorio
 wrote:
> The second option seems like the most viable. Not sure how the TripleO
> integration would go though. Care to elaborate on what you had in mind?

Trying to reproduce what we did with ceph-ansible and use Mistral to
deploy FreeIPA with an external deployment tool.
Though I find the solution quite complex, maybe we can come up with an
easier approach this time?
-- 
Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI Squad Meeting Summary (week 34)

2017-08-25 Thread Attila Darazs
If the topics below interest you and you want to contribute to the 
discussion, feel free to join the next meeting:


Time: Thursdays, 14:30-15:30 UTC
Place: https://bluejeans.com/4113567798/

Full minutes: https://etherpad.openstack.org/p/tripleo-ci-squad-meeting

Topics discussed:

We talked about the balance between using openstack-infra supported vs. 
self-hosted solutions for graphite, logservers, proxies and mirrors.
Paul Belanger joined us and the end result seemed to be that we're going 
to try to keep as many services under infra as we can, but sometimes the 
line is not so clear when we're dealing with 3rd party environments like 
rdocloud.


Ronelle talked about changing the overcommit ratio on rdocloud after
analyzing our usage. This can probably be done without any issue.
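
For context, that ratio is plain nova scheduler configuration on the
rdocloud computes; one way to adjust it, with purely illustrative
numbers:

  crudini --set /etc/nova/nova.conf DEFAULT cpu_allocation_ratio 16.0
  crudini --set /etc/nova/nova.conf DEFAULT ram_allocation_ratio 1.5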


Wes added
"gate-tripleo-ci-centos-7-scenario003-multinode-oooq-container" to the
tripleo-quickstart-extras check and gate jobs to make sure we won't
break containers, and so that we get some feedback on the status of the
container jobs.


RDO packaging changes are now gating with Quickstart 
(multinode-featureset005), though it's non-voting. It might help us 
prevent breakages from the packaging side.


Promotion jobs are still not working fully on RDO Cloud, but we're 
working on it.


That's it for this week, have a nice weekend.

Best regards,
Attila

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [TripleO][CI] FreeIPA Deployment

2017-08-21 Thread Juan Antonio Osorio
On Mon, Aug 21, 2017 at 5:48 PM, Ben Nemec  wrote:

>
>
> On 08/21/2017 01:45 AM, Juan Antonio Osorio wrote:
>
>> The second option seems like the most viable. Not sure how the TripleO
>> integration would go though. Care to elaborate on what you had in mind?
>>
>
> I can't remember if we discussed this when we were first implementing the
> ci job, but could FreeIPA run on the undercloud itself?  We could have the
> undercloud install process install FreeIPA before it does the rest of the
> undercloud install, and then the undercloud by default would talk to that
> local instance of FreeIPA.  We'd provide configuration options to allow use
> of a standalone server too, of course.
>

Right, this would have been the preferred option, and we did try to do
this. However, FreeIPA is not very flexible (it isn't at all) about its
port configuration, and unfortunately there are port conflicts. Hence
why we decided to use a separate node.


> I feel like there was probably a reason we didn't do that in the first
> place (port conflicts?), but it would be the easiest option for deployers
> if we could make it work.
>
>
>> On Fri, Aug 18, 2017 at 9:11 PM, Emilien Macchi > > wrote:
>>
>> On Fri, Aug 18, 2017 at 8:34 AM, Harry Rybacki > > wrote:
>>  > Greetings Stackers,
>>  >
>>  > Recently, I brought up a discussion around deploying FreeIPA via
>>  > TripleO-Quickstart vs TripleO. This is part of a larger discussion
>>  > around expanding security related CI coverage for OpenStack.
>>  >
>>  > A few months back, I added the ability to deploy FreeIPA via
>>  > TripleO-Quickstart through three reviews:
>>  >
>>  > 1) Adding a role to deploy FreeIPA via OOOQ_E[1]
>>  > 2) Providing OOOQ with the ability to deploy a supplemental node
>>  > (alongside the undercloud)[2]
>>  > 3) Update the quickstart-extras playbook to deploy FreeIPA[3]
>>  >
>>  >
>>  > The reasoning behind this is as follows (copied from a conversation
>>  > with jaosorior):
>>  >
>>  >> So the deal is that both the undercloud and the overcloud need
>> to be registered as a FreeIPA client.
>>  >> This is because they need to authenticate to it in order to
>> execute actions.
>>  >>
>>  >> * The undercloud needs to have FreeIPA credentials because it's
>> running novajoin, which in turn
>>  >> executes requests to FreeIPA in order to create service principals
>>  >>  - The service principals are ultimately the service name and
>> the node name entries for which we'll
>>  >> requests the certificates.
>>  >> * The overcloud nodes need to be registered and authenticated to
FreeIPA (which right now happens through a cloud-init script
>> provisioned by nova/nova-metadata) because that's how it requests
>>  >> certificates.
>>  >>
>>  >> So the flow is as follows:
>>  >>
>>  >> * FreeIPA node is provisioned.
>>  >>  - We'll appropriate credentials at this point.
>>  >>  - We register the undercloud as a FreeIPA client and get an OTP
>> (one time password) for it
>>  >> - We add the OTP to the undercloud.conf and enable novajoin.
>>  >> * We trigger the undercloud install.
>>  >>  - after the install, we have novajoin running, which is the
>> service that registers automatically the
>>  >> overcloud nodes to FreeIPA.
>>  >> * We trigger the overcloud deploy
>>  >>  - We need to set up a flag that tells the deploy to pass
>> appropriate nova metadata (which tells
>>  >> novajoin that the nodes should be registered).
>>  >>  - profit!! we can now get certificates from the CA (and do
>> other stuff that FreeIPA allows you to do,
>>  >> such as use kerberos auth, control sudo rights of the nodes'
>> users, etc.)
>>  >>
>>  >> Since the nodes need to be registered to FreeIPA, we can't rely
>> on FreeIPA being installed by
>>  >> TripleO, even if that's possible by doing it through a
>> composable service.
>>  >> If we would use a composable service to install FreeIPA, the
>> flow would be like this:
>>  >>
>>  >> * Install undercloud
>>  >> * Install overcloud with one node (running FreeIPA)
>>  >> * register undercloud node to FreeIPA and modify undercloud.conf
>>  >> * Update undercloud
>>  >> * scale overcloud and register the rest of the nodes to FreeIPA
>> through novajoin.
>>  >>
>>  >> So, while we could install FreeIPA with TripleO. This really
>> complicates the deployment to an
>>  >> unnecessary point.
>>  >>
>>  >> So I suggest keeping the current behavior, which treats FreeIPA
>> as a separate node to be
>>  >> provisioned before the undercloud). And if folks would like to
have a separate FreeIPA node for their overcloud 

Re: [openstack-dev] [TripleO][CI] FreeIPA Deployment

2017-08-21 Thread Ben Nemec



On 08/21/2017 01:45 AM, Juan Antonio Osorio wrote:
The second option seems like the most viable. Not sure how the TripleO 
integration would go though. Care to elaborate on what you had in mind?


I can't remember if we discussed this when we were first implementing 
the ci job, but could FreeIPA run on the undercloud itself?  We could 
have the undercloud install process install FreeIPA before it does the 
rest of the undercloud install, and then the undercloud by default would 
talk to that local instance of FreeIPA.  We'd provide configuration 
options to allow use of a standalone server too, of course.


I feel like there was probably a reason we didn't do that in the first 
place (port conflicts?), but it would be the easiest option for 
deployers if we could make it work.




On Fri, Aug 18, 2017 at 9:11 PM, Emilien Macchi > wrote:


On Fri, Aug 18, 2017 at 8:34 AM, Harry Rybacki > wrote:
 > Greetings Stackers,
 >
 > Recently, I brought up a discussion around deploying FreeIPA via
 > TripleO-Quickstart vs TripleO. This is part of a larger discussion
 > around expanding security related CI coverage for OpenStack.
 >
 > A few months back, I added the ability to deploy FreeIPA via
 > TripleO-Quickstart through three reviews:
 >
 > 1) Adding a role to deploy FreeIPA via OOOQ_E[1]
 > 2) Providing OOOQ with the ability to deploy a supplemental node
 > (alongside the undercloud)[2]
 > 3) Update the quickstart-extras playbook to deploy FreeIPA[3]
 >
 >
 > The reasoning behind this is as follows (copied from a conversation
 > with jaosorior):
 >
 >> So the deal is that both the undercloud and the overcloud need
to be registered as a FreeIPA client.
 >> This is because they need to authenticate to it in order to
execute actions.
 >>
 >> * The undercloud needs to have FreeIPA credentials because it's
running novajoin, which in turn
 >> executes requests to FreeIPA in order to create service principals
 >>  - The service principals are ultimately the service name and
the node name entries for which we'll
 >> requests the certificates.
 >> * The overcloud nodes need to be registered and authenticated to
FreeIPA (which right now happens through a cloud-init script
provisioned by nova/nova-metadata) because that's how it requests
 >> certificates.
 >>
 >> So the flow is as follows:
 >>
 >> * FreeIPA node is provisioned.
 >>  - We'll appropriate credentials at this point.
 >>  - We register the undercloud as a FreeIPA client and get an OTP
(one time password) for it
 >> - We add the OTP to the undercloud.conf and enable novajoin.
 >> * We trigger the undercloud install.
 >>  - after the install, we have novajoin running, which is the
service that registers automatically the
 >> overcloud nodes to FreeIPA.
 >> * We trigger the overcloud deploy
 >>  - We need to set up a flag that tells the deploy to pass
appropriate nova metadata (which tells
 >> novajoin that the nodes should be registered).
 >>  - profit!! we can now get certificates from the CA (and do
other stuff that FreeIPA allows you to do,
 >> such as use kerberos auth, control sudo rights of the nodes'
users, etc.)
 >>
 >> Since the nodes need to be registered to FreeIPA, we can't rely
on FreeIPA being installed by
 >> TripleO, even if that's possible by doing it through a
composable service.
 >> If we would use a composable service to install FreeIPA, the
flow would be like this:
 >>
 >> * Install undercloud
 >> * Install overcloud with one node (running FreeIPA)
 >> * register undercloud node to FreeIPA and modify undercloud.conf
 >> * Update undercloud
 >> * scale overcloud and register the rest of the nodes to FreeIPA
through novajoin.
 >>
 >> So, while we could install FreeIPA with TripleO. This really
complicates the deployment to an
 >> unnecessary point.
 >>
 >> So I suggest keeping the current behavior, which treats FreeIPA
as a separate node to be
 >> provisioned before the undercloud). And if folks would like to
have a separate FreeIPA node for their overcloud deployment (which
could provision certs for the tenants) then we could do that as a
 >> composable service, if people request it.
 >
 > I am now re-raising this to the group at large for discussion about
 > the merits of this approach vs deploying via TripleO itself.

There are 3 approaches here:

- Keep using Quickstart which is of course not the viable option since
TripleO Quickstart is only used by CI and developers right now. Not by
customers neither in production.
- Deploy your own Ansible playbooks or automation tool to deploy

Re: [openstack-dev] [TripleO][CI] FreeIPA Deployment

2017-08-21 Thread Juan Antonio Osorio
The second option seems like the most viable. Not sure how the TripleO
integration would go though. Care to elaborate on what you had in mind?

On Fri, Aug 18, 2017 at 9:11 PM, Emilien Macchi  wrote:

> On Fri, Aug 18, 2017 at 8:34 AM, Harry Rybacki 
> wrote:
> > Greetings Stackers,
> >
> > Recently, I brought up a discussion around deploying FreeIPA via
> > TripleO-Quickstart vs TripleO. This is part of a larger discussion
> > around expanding security related CI coverage for OpenStack.
> >
> > A few months back, I added the ability to deploy FreeIPA via
> > TripleO-Quickstart through three reviews:
> >
> > 1) Adding a role to deploy FreeIPA via OOOQ_E[1]
> > 2) Providing OOOQ with the ability to deploy a supplemental node
> > (alongside the undercloud)[2]
> > 3) Update the quickstart-extras playbook to deploy FreeIPA[3]
> >
> >
> > The reasoning behind this is as follows (copied from a conversation
> > with jaosorior):
> >
> >> So the deal is that both the undercloud and the overcloud need to be
> registered as a FreeIPA client.
> >> This is because they need to authenticate to it in order to execute
> actions.
> >>
> >> * The undercloud needs to have FreeIPA credentials because it's running
> novajoin, which in turn
> >> executes requests to FreeIPA in order to create service principals
> >>  - The service principals are ultimately the service name and the node
> name entries for which we'll
> >> requests the certificates.
> >> * The overcloud nodes need to be registered and authenticated to
> FreeIPA (which right now happens through a cloud-init script provisioned
> by nova/nova-metadata) because that's how it requests
> >> certificates.
> >>
> >> So the flow is as follows:
> >>
> >> * FreeIPA node is provisioned.
> >>  - We'll appropriate credentials at this point.
> >>  - We register the undercloud as a FreeIPA client and get an OTP (one
> time password) for it
> >> - We add the OTP to the undercloud.conf and enable novajoin.
> >> * We trigger the undercloud install.
> >>  - after the install, we have novajoin running, which is the service
> that registers automatically the
> >> overcloud nodes to FreeIPA.
> >> * We trigger the overcloud deploy
> >>  - We need to set up a flag that tells the deploy to pass appropriate
> nova metadata (which tells
> >> novajoin that the nodes should be registered).
> >>  - profit!! we can now get certificates from the CA (and do other stuff
> that FreeIPA allows you to do,
> >> such as use kerberos auth, control sudo rights of the nodes' users,
> etc.)
> >>
> >> Since the nodes need to be registered to FreeIPA, we can't rely on
> FreeIPA being installed by
> >> TripleO, even if that's possible by doing it through a composable
> service.
> >> If we would use a composable service to install FreeIPA, the flow would
> be like this:
> >>
> >> * Install undercloud
> >> * Install overcloud with one node (running FreeIPA)
> >> * register undercloud node to FreeIPA and modify undercloud.conf
> >> * Update undercloud
> >> * scale overcloud and register the rest of the nodes to FreeIPA through
> novajoin.
> >>
> >> So, while we could install FreeIPA with TripleO. This really
> complicates the deployment to an
> >> unnecessary point.
> >>
> >> So I suggest keeping the current behavior, which treats FreeIPA as a
> separate node to be
> >> provisioned before the undercloud). And if folks would like to have a
> separate FreeIPA node for their overcloud deployment (which could
> provision certs for the tenants) then we could do that as a
> >> composable service, if people request it.
> >
> > I am now re-raising this to the group at large for discussion about
> > the merits of this approach vs deploying via TripleO itself.
>
> There are 3 approaches here:
>
> - Keep using Quickstart which is of course not the viable option since
> TripleO Quickstart is only used by CI and developers right now. Not by
> customers neither in production.
> - Deploy your own Ansible playbooks or automation tool to deploy
> FreeIPA and host it wherever you like. Integrate the playbooks in
> TripleO, as an external component (can be deployed manually between
> some steps but will be to be documented).
> - Create a composable service that will deploy FreeIPA service(s),
> part of TripleO Heat Templates. The way it works *now* will require
> you to have a puppet-freeipa module to deploy the bits but we're
> working toward migrating to Ansible at some point.
>

This approach is not ideal and will be quite a burden, as I described
above. I wouldn't consider this an option.


> I hope it helps, let me know if you need further details on a specific
> approach.
> --
> Emilien Macchi
>
> __
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>




Re: [openstack-dev] [TripleO][CI] FreeIPA Deployment

2017-08-18 Thread Emilien Macchi
On Fri, Aug 18, 2017 at 8:34 AM, Harry Rybacki  wrote:
> Greetings Stackers,
>
> Recently, I brought up a discussion around deploying FreeIPA via
> TripleO-Quickstart vs TripleO. This is part of a larger discussion
> around expanding security related CI coverage for OpenStack.
>
> A few months back, I added the ability to deploy FreeIPA via
> TripleO-Quickstart through three reviews:
>
> 1) Adding a role to deploy FreeIPA via OOOQ_E[1]
> 2) Providing OOOQ with the ability to deploy a supplemental node
> (alongside the undercloud)[2]
> 3) Update the quickstart-extras playbook to deploy FreeIPA[3]
>
>
> The reasoning behind this is as follows (copied from a conversation
> with jaosorior):
>
>> So the deal is that both the undercloud and the overcloud need to be 
>> registered as a FreeIPA client.
>> This is because they need to authenticate to it in order to execute actions.
>>
>> * The undercloud needs to have FreeIPA credentials because it's running 
>> novajoin, which in turn
>> executes requests to FreeIPA in order to create service principals
>>  - The service principals are ultimately the service name and the node name 
>> entries for which we'll
>> requests the certificates.
>> * The overcloud nodes need to be registered and authenticated to FreeIPA 
>> (which right now happens through a cloud-init script provisioned by 
>> nova/nova-metadata) because that's how it requests
>> certificates.
>>
>> So the flow is as follows:
>>
>> * FreeIPA node is provisioned.
>>  - We'll appropriate credentials at this point.
>>  - We register the undercloud as a FreeIPA client and get an OTP (one time 
>> password) for it
>> - We add the OTP to the undercloud.conf and enable novajoin.
>> * We trigger the undercloud install.
>>  - after the install, we have novajoin running, which is the service that 
>> registers automatically the
>> overcloud nodes to FreeIPA.
>> * We trigger the overcloud deploy
>>  - We need to set up a flag that tells the deploy to pass appropriate nova 
>> metadata (which tells
>> novajoin that the nodes should be registered).
>>  - profit!! we can now get certificates from the CA (and do other stuff that 
>> FreeIPA allows you to do,
>> such as use kerberos auth, control sudo rights of the nodes' users, etc.)
>>
>> Since the nodes need to be registered to FreeIPA, we can't rely on FreeIPA 
>> being installed by
>> TripleO, even if that's possible by doing it through a composable service.
>> If we would use a composable service to install FreeIPA, the flow would be 
>> like this:
>>
>> * Install undercloud
>> * Install overcloud with one node (running FreeIPA)
>> * register undercloud node to FreeIPA and modify undercloud.conf
>> * Update undercloud
>> * scale overcloud and register the rest of the nodes to FreeIPA through 
>> novajoin.
>>
>> So, while we could install FreeIPA with TripleO. This really complicates the 
>> deployment to an
>> unnecessary point.
>>
>> So I suggest keeping the current behavior, which treats FreeIPA as a 
>> separate node to be
>> provisioned before the undercloud). And if folks would like to have a 
>> separate FreeIPA node for their overcloud deployment (which could 
>> provision certs for the tenants) then we could do that as a
>> composable service, if people request it.
>
> I am now re-raising this to the group at large for discussion about
> the merits of this approach vs deploying via TripleO itself.

There are 3 approaches here:

- Keep using Quickstart, which is of course not a viable option, since
TripleO Quickstart is only used by CI and developers right now, not by
customers nor in production.
- Deploy your own Ansible playbooks or automation tool to deploy
FreeIPA and host it wherever you like. Integrate the playbooks in
TripleO as an external component (it can be deployed manually between
some steps, but that will need to be documented).
- Create a composable service that will deploy FreeIPA service(s),
part of TripleO Heat Templates. The way it works *now* will require
you to have a puppet-freeipa module to deploy the bits but we're
working toward migrating to Ansible at some point.

I hope it helps, let me know if you need further details on a specific approach.
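
To make the quoted flow above concrete, the enrollment sequence is
roughly the following (the ipa CLI call is real; the undercloud.conf
option names are from memory and may differ):

  # on the FreeIPA server: pre-create the undercloud host entry and
  # get a one-time password (OTP) for it
  ipa host-add undercloud.example.com --random
  # on the undercloud: record the OTP and enable novajoin before the
  # install, e.g. in undercloud.conf:
  #   enable_novajoin = true
  #   ipa_otp = <OTP printed by host-add>
  openstack undercloud install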
-- 
Emilien Macchi

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI Squad Meeting Summary (week 33)

2017-08-18 Thread Attila Darazs
If the topics below interest you and you want to contribute to the 
discussion, feel free to join the next meeting:


Time: Thursdays, 14:30-15:30 UTC
Place: https://bluejeans.com/4113567798/

Full minutes: https://etherpad.openstack.org/p/tripleo-ci-squad-meeting

Topics discussed:

We debated whether we should add an upgrades job to
tripleo-quickstart-extras that would allow our IRC bot (hubbot) to report
on the status of the upgrade jobs as well, using gatestatus[1]. The
upgrade jobs are not stable enough for that yet, though.


We had two major infra issues during the week: jobs not using the nodepool
DNS (fixed by Sagi), and DLRN package building in the gates not using the
DLRN & CentOS mirrors. The latter has fixes, but they are not merged yet.


Emilien and Arx are working on adding tempest tests in place of pingtests
in most of our gate jobs where it's useful. We also have quite a few jobs
that don't have any validation yet.


We decided to use a whitelist when collecting log files from /etc on the
upstream jobs. This will reduce the load on the log server; a sketch of
the idea follows.
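
Illustrative only -- the real list lives in the log collection role and
the paths will differ -- but the idea is roughly:

    # Instead of copying all of /etc, rsync only whitelisted entries:
    mkdir -p "$LOG_DIR"
    rsync -a --prune-empty-dirs \
        --include='/etc/' \
        --include='/etc/nova/***' \
        --include='/etc/neutron/***' \
        --include='/etc/yum.repos.d/***' \
        --exclude='*' \
        / "$LOG_DIR/"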


The 3- and 4-node multinode jobs are almost ready; we're trying to merge
those changes, along with the ones for multi-NIC with libvirt.


We're also working hard to get the periodic/promotion jobs working on
rdocloud to increase the cadence of the promotions. Ronelle, Wes, John and
I have daily standups to coordinate the work.


That's it for this week, have a nice weekend.

Best regards,
Attila

[1] https://github.com/adarazs/gate-status

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [TripleO][CI] FreeIPA Deployment

2017-08-18 Thread Harry Rybacki
Greetings Stackers,

Recently, I brought up a discussion around deploying FreeIPA via
TripleO-Quickstart vs TripleO. This is part of a larger discussion
around expanding security-related CI coverage for OpenStack.

A few months back, I added the ability to deploy FreeIPA via
TripleO-Quickstart through three reviews:

1) Adding a role to deploy FreeIPA via OOOQ_E[1]
2) Providing OOOQ with the ability to deploy a supplemental node
(alongside the undercloud)[2]
3) Update the quickstart-extras playbook to deploy FreeIPA[3]
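
For reference, with those three in, a run that also brings up the
supplemental FreeIPA node looks roughly like the sketch below; the exact
flags and the enable_tls_everywhere variable are from memory, so treat
them as assumptions to check against the current quickstart docs:

    # Run quickstart with the extras playbooks; the supplemental node is
    # provisioned alongside the undercloud and configured as the FreeIPA
    # server before the undercloud install:
    ./quickstart.sh --release master \
        --playbook quickstart-extras.yml \
        --extra-vars enable_tls_everywhere=true \
        $VIRTHOST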


The reasoning behind this is as follows (copied from a conversation
with jaosorior):

> So the deal is that both the undercloud and the overcloud need to be
> registered as FreeIPA clients. This is because they need to authenticate
> to it in order to execute actions.
>
> * The undercloud needs to have FreeIPA credentials because it's running
> novajoin, which in turn executes requests to FreeIPA in order to create
> service principals.
>  - The service principals are ultimately the service name and the node
> name entries for which we'll request the certificates.
> * The overcloud nodes need to be registered and authenticated to FreeIPA
> (which right now happens through a cloud-init script provisioned by
> nova/nova-metadata), because that's how they request certificates.
>
> So the flow is as follows:
>
> * FreeIPA node is provisioned.
>  - We'll obtain appropriate credentials at this point.
>  - We register the undercloud as a FreeIPA client and get an OTP
> (one-time password) for it.
>  - We add the OTP to the undercloud.conf and enable novajoin.
> * We trigger the undercloud install.
>  - After the install, we have novajoin running, which is the service
> that automatically registers the overcloud nodes to FreeIPA.
> * We trigger the overcloud deploy.
>  - We need to set a flag that tells the deploy to pass appropriate nova
> metadata (which tells novajoin that the nodes should be registered).
>  - Profit!! We can now get certificates from the CA (and do other stuff
> that FreeIPA allows you to do, such as use kerberos auth, control sudo
> rights of the nodes' users, etc.)
>
> Since the nodes need to be registered to FreeIPA, we can't rely on
> FreeIPA being installed by TripleO, even though that would be possible
> through a composable service. If we were to use a composable service to
> install FreeIPA, the flow would be like this:
>
> * Install the undercloud
> * Install an overcloud with one node (running FreeIPA)
> * Register the undercloud node to FreeIPA and modify undercloud.conf
> * Update the undercloud
> * Scale the overcloud and register the rest of the nodes to FreeIPA
> through novajoin.
>
> So, while we could install FreeIPA with TripleO, this really complicates
> the deployment to an unnecessary point.
>
> So I suggest keeping the current behavior, which treats FreeIPA as a
> separate node to be provisioned before the undercloud. And if folks
> would like to have a separate FreeIPA node for their overcloud
> deployment (which could provision certs for the tenants), then we could
> do that as a composable service, if people request it.
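
For concreteness, the operator-side steps of that recommended flow look
roughly like the sketch below. Hostnames and passwords are placeholders;
the option names match my reading of the novajoin/TLS-everywhere setup,
but treat the exact names and paths as assumptions to verify against your
release:

    # 1. Provision the FreeIPA node, outside of TripleO:
    ipa-server-install --realm EXAMPLE.COM --domain example.com \
        --ds-password "$DM_PASS" --admin-password "$ADMIN_PASS" \
        --setup-dns --auto-forwarders

    # 2. On the FreeIPA node, pre-register the undercloud and get an OTP
    #    (novajoin-ipa-setup ships with novajoin):
    novajoin-ipa-setup --principal admin --password "$ADMIN_PASS" \
        --server ipa.example.com --realm EXAMPLE.COM \
        --domain example.com --hostname undercloud.example.com --precreate

    # 3. On the undercloud, add the OTP to undercloud.conf and install:
    #        enable_novajoin = true
    #        ipa_otp = <OTP from step 2>
    openstack undercloud install

    # 4. Deploy the overcloud, passing nova metadata so novajoin enrolls
    #    the nodes (plus the TLS-everywhere environment files for your
    #    release); enroll-metadata.yaml is a hypothetical env file with:
    #        parameter_defaults:
    #          ServerMetadata:
    #            ipa_enroll: True
    #          CloudDomain: example.com
    openstack overcloud deploy --templates -e enroll-metadata.yaml

Note that everything FreeIPA-related happens before "openstack undercloud
install", which is why folding FreeIPA itself into the overcloud
deployment is awkward.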

I am now re-raising this to the group at large for discussion about
the merits of this approach vs deploying via TripleO itself.


[1] - https://review.openstack.org/#/c/436198/
[2] - https://review.openstack.org/#/c/451523/
[3] - https://review.openstack.org/#/c/453223/

/R

Harry Rybacki

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


[openstack-dev] [tripleo] CI Squad Meeting Summary (week 31)

2017-08-04 Thread Attila Darazs
If the topics below interest you and you want to contribute to the 
discussion, feel free to join the next meeting:


Time: Thursdays, 14:30-15:30 UTC
Place: https://bluejeans.com/4113567798/

Full minutes: https://etherpad.openstack.org/p/tripleo-ci-squad-meeting

There are a lot of people on vacation, so this was a small meeting.

We started by discussing the hash promotions and the ways to track issues.
Whether it's an upstream or RDO promotion issue, just create a Launchpad
bug against tripleo and tag it with "ci" and "alert"; it will
automatically get escalated and receive attention.


Gabriele gave a presentation about his current status with container
building on RDO Cloud. It looks to be in good shape; however, there are
still bugs to iron out.


Arx explained that the scenario001 jobs are now running a tempest test as
well, which is a good way to introduce more testing upstream, and Emilien
added that we should probably do more tempest testing on container jobs
too.


Wes brought up an issue about collecting logs during the image building
process that needs attention.


That's it for this week, have a nice weekend.

Best regards,
Attila

__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


  1   2   3   4   >