On 03/25/2016 03:52 PM, Jeremy Stanley wrote:
On 2016-03-25 16:33:44 -0400 (-0400), Jay Pipes wrote:
[...]
What I'm proposing isn't using or needing a custom OpenStack
deployment. There's nothing non-standard at all about the PCI or
NFV stuff besides the hardware required to functionally test it.

What you _are_ talking about though is maintaining physical servers
in a data center running an OpenStack environment (and if you want
it participating in gating/preventing changes from merging you need
more than one environment so we don't completely shut down
development when one of them collapses). This much has been a
challenge for the TripleO team, such that the jobs running for them
are still not voting on their changes.

What we're talking about here is using the same upstream Infra
Puppet modules, installed on a long-running server in a lab that
can interface with upstream Gerrit, respond to new change events
in the Gerrit stream, and trigger devstack-gate[-like] builds on
some bare-metal gear.

It's possible I'm misunderstanding you... you're talking about
maintaining a deployment of OpenStack with specific hardware to be
able to run these jobs in, right? That's not as trivial an effort as
it sounds, and I'm skeptical "a couple of operators" is sufficient
to sustain such an endeavor.

Two things:

- There is no current concept of "a long-lived machine that we run devstack on from time to time" - everything in Infra is designed around using OpenStack APIs to get compute resources. So if we want to run jobs on hardware in this lab, as it stands right now, that hardware would need to be provided by Ironic+Nova.

Last time we did the math (and Jim can maybe correct my numbers), in order to keep up with demand similar to our VM environments, I believe such an env would need at least 83 Ironic nodes. And as Jeremy said, we'd need at least 2 envs for redundancy - so when looking at getting this funded, approximately 200 machines is likely about right.

- zuul v3 does introduce the concept of statically available resources that can be checked out of nodepool - specifically to address people wanting to use long-lived servers as test resources. The machine count is still likely to remain the same - but once we have zuul v3 out, it might reduce the need for the operators to operate 2 100-node Ironic-based OpenStack clouds. (This implies that help with zuul v3 might be seen as an accelerant for this project.)

Also keep in mind, if/when resources are sought out, that each additional underlying OS config adds another full set of resources. So if we got 2 sets of 100 nodes to start with, and started running NFV-configured devstack tests on them on Ubuntu Trusty, and then our friends at Red Hat requested that we test the same on a RH-based distro, the cost for that would be an additional 100 nodes in each DC.
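
To make the scaling explicit, here's the back-of-the-envelope math using the figures above (all of these are rough estimates from this thread, not commitments - the per-env count is the ~100 nodes implied by the ~200-machine estimate):

# Rough capacity math using the numbers discussed in this thread.
nodes_per_env = 100      # ~83 Ironic nodes minimum, rounded up for headroom
environments = 2         # redundancy across data centers
os_configs = 2           # e.g. Ubuntu Trusty plus a Red Hat-based distro

total_nodes = nodes_per_env * environments * os_configs
print(total_nodes)       # 400 machines once a second OS config is added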

Is that something that is totally out of the question for the
upstream Infra team to be a guide for?

We've stated in the past that we're willing to accept this level of
integration as long as our requirements for redundancy/uptime are
met. We mostly just don't want to see issues with the environment
block development for projects relying on it because it's the only
place those jobs can run, so multiple environments in different data
centers would be a necessity (right now our gating jobs are able to
run in any of 9 regions from 6 providers, which mitigates this
risk).
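
To make the workflow piece concrete: the loop Jay describes above - listen to the upstream Gerrit event stream and trigger devstack-gate[-like] builds on lab gear - is roughly the following. This is a purely illustrative sketch; the host, CI account, project filter and run_job() are placeholders, not the actual Infra Puppet-managed tooling.

#!/usr/bin/env python
# Illustrative third-party-CI-style listener (not the real Infra tooling):
# follow Gerrit's event stream over SSH and kick off a devstack-gate[-like]
# build for each new patchset on watched projects.
import json
import subprocess

# Placeholder CI account/host; a real deployment would use its own account.
STREAM_CMD = ["ssh", "-p", "29418", "nfv-ci@review.openstack.org",
              "gerrit", "stream-events"]
WATCHED_PROJECTS = {"openstack/nova"}  # illustrative project filter


def run_job(project, ref):
    """Placeholder: reserve a bare-metal lab node and run the build there."""
    print("would run NFV devstack-gate job for %s at %s" % (project, ref))


def main():
    proc = subprocess.Popen(STREAM_CMD, stdout=subprocess.PIPE)
    for line in iter(proc.stdout.readline, b""):
        event = json.loads(line.decode("utf-8"))
        if event.get("type") != "patchset-created":
            continue
        change = event["change"]
        if change["project"] in WATCHED_PROJECTS:
            run_job(change["project"], event["patchSet"]["ref"])


if __name__ == "__main__":
    main()

The hard part, as discussed above, isn't that loop - it's keeping the hardware behind run_job() redundant and healthy enough that the jobs can vote.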


