Jim, great stuff. A couple suggestions inline :)
On 02/26/2015 09:59 AM, James E. Blair wrote:
A tenant may optionally specify repos from which it may derive its
configuration. In this manner, a project may keep its Zuul configuration
within its own repo. This would only happen if the main configuration
file specifies that it is permitted::
### main.yaml (continued)
- tenant:
    name: random-stackforge-project
    include:
      - global_config.yaml
    repos:
      - stackforge/random  # Specific project config is in-repo
Might I suggest that, instead of a repos: YAML block, the include:
YAML block allow URIs? To support some random Zuul config in a
stackforge repo, you could do::

include:
  - global_config.yaml
  - https://git.openstack.org/stackforge/random/tools/zuul.yml

That would make the configuration simpler, I think.
Jobs defined in-repo may not have access to the full feature set
(including some authorization features). They also may not override
existing jobs.
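To illustrate, here is a hypothetical in-repo snippet showing the kinds
of stanzas those rules would reject (the file and stanzas are made up
for this example)::

### stackforge/random/.zuul.yaml (hypothetical)
- job:
    name: base   # Rejected: may not override an existing central job
    auth:        # Rejected: auth may only be defined centrally
      swift:
        - container: logs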
Job definitions continue to have the features in the current Zuul
layout, but they also take on some of the responsibilities currently
handled by the Jenkins (or other worker) definition::
### global_config.yaml
# Every tenant in the system has access to these jobs (because their
# tenant definition includes it).
- job:
    name: base
    timeout: 30m
    node: precise  # Just a variable for later use
    nodes:  # The operative list of nodes
      - name: controller
        image: {node}  # Substitute the variable
    auth:  # Auth may only be defined in central config, not in-repo
      swift:
        - container: logs
    pre-run:  # These specify what to run before and after the job
      - zuul-cloner
    post-run:
      - archive-logs
++
Jobs have inheritance, and the above definition provides a base level
of functionality for all jobs. It sets a default timeout, requests a
single node (of type precise), and requests swift credentials to
upload logs. Further jobs may extend and override these parameters::
### global_config.yaml (continued)
# The python 2.7 unit test job
- job:
    name: python27
    parent: base
    node: trusty
Yes, this is great :)
Our use of job names specific to projects is a holdover from when we
wanted long-lived slaves on jenkins to efficiently re-use workspaces.
This hasn't been necessary for a while, though we have used it to
our advantage when collecting stats and reports. However, job
configuration can be simplified greatly if we simply have a job that
runs the python 2.7 unit tests and can be used for any project. To
the degree that we want to know how often this job failed on nova, we
can add that information back in when reporting statistics. Jobs may
have multiple aspects to accommodate differences among branches, etc.::
### global_config.yaml (continued)
# Version that is run for changes on stable/icehouse
- job:
    name: python27
    parent: base
    branch: stable/icehouse
    node: precise

# Version that is run for changes on stable/juno
- job:
    name: python27
    parent: base
    branch: stable/juno  # Could be combined into previous with regex
    node: precise        # if concept of "best match" is defined
Jobs may specify that they require more than one node::
### global_config.yaml (continued)
- job:
    name: devstack-multinode
    parent: base
    node: trusty  # could do same branch mapping as above
    nodes:
      - name: controller
        image: {node}
      - name: compute
        image: {node}
Jobs defined centrally (i.e., not in-repo) may specify auth info::
### global_config.yaml (continued)
- job:
    name: pypi-upload
    parent: base
    auth:
      password:
        pypi-password: pypi-password
        # This looks up 'pypi-password' from an encrypted yaml file
        # and adds it into variables for the job
Pipeline definitions are similar to the current syntax, except that
they support specifying additional information for jobs in the context
of a given project and pipeline. For instance, rather than specifying
that a job is globally non-voting, you may specify that it is
non-voting for a given project in a given pipeline::
### openstack.yaml
- project:
    name: openstack/nova
    gate:
      queue: integrated  # Shared queues are manually built
      jobs:
        - python27  # Runs version of job appropriate to branch
        - devstack
        - devstack-deprecated-feature:
            branch: stable/juno  # Only run on stable/juno changes
            voting: false        # Non-voting
    post:
      jobs:
        - tarball:
            jobs:
              - pypi-upload
Currently unique job names are used to build shared change queues.
Since job names will no longer be unique, shared queues must be
manually constructed by assigning them a name. Projects with the same
queue name for the same pipeline will have a shared queue.
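For example, another project could join nova's shared gate queue above
simply by using the same queue name (a sketch; the cinder entry is
illustrative)::

### openstack.yaml (continued)
- project:
    name: openstack/cinder
    gate:
      queue: integrated  # Same queue name as nova; shares the queue
      jobs:
        - python27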
A subset of functionality is available to projects that are permitted
to use in-repo configuration::
### stackforge/random/.zuul.yaml
- job:
    name: random-job
    parent: base  # From global config; gets us logs
    node: precise

- project:
    name: stackforge/random
    gate:
      jobs:
        - python27    # From global config
        - random-job  # From local config
Again, here I would support URI-based job config directives. Why? Well,
let's say that a project has a separate repository that contains job and
test configuration files. You'd be able to set a URI here and continue
to keep your job and test configurations separate from the code base...
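For instance, something like this (the separate config repo and file
path here are hypothetical)::

### stackforge/random/.zuul.yaml
include:
  - https://git.openstack.org/stackforge/random-ci-config/zuul/jobs.yaml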
The executable content of jobs should be defined as ansible playbooks.
Playbooks can be fairly simple and might consist of little more than
"run this shell script" for those who are not otherwise interested in
ansible::
### stackforge/random/playbooks/random-job.yaml
---
- hosts: controller
  tasks:
    - shell: run_some_tests.sh
Global jobs may define ansible roles for common functions::
### openstack-infra/zuul-playbooks/python27.yaml
---
- hosts: controller
  roles:
    - role: tox
      env: py27
Because ansible has well-articulated multi-node orchestration
features, this permits very expressive job definitions for multi-node
tests. A playbook can specify different roles to apply to the
different nodes that the job requested::
### openstack-infra/zuul-playbooks/devstack-multinode.yaml
---
- hosts: controller
  roles:
    - devstack

- hosts: compute
  roles:
    - devstack-compute
Additionally, if a project already defines ansible roles for its
deployment, then those roles may be easily applied in testing, making
CI even closer to CD. Finally, to make Zuul more useful for CD, Zuul
may be configured to run a job (i.e., an ansible role) on a specific
node.
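For example, a CD-style playbook could apply the same role to a
long-lived host (a sketch; the host and role names are invented)::

### hypothetical CD playbook
---
- hosts: production-controller
  roles:
    - deploy-app  # the same role the project uses in testing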
The pre- and post-run entries in the job definition might also apply
to ansible playbooks and can be used to simplify job setup and
cleanup::
### openstack-infra/zuul-playbooks/zuul-cloner.yaml
---
- hosts: all
  roles:
    - role: zuul-cloner
      zuul: "{{ zuul }}"
Where the zuul variable is a dictionary containing all the information
currently transmitted in the ZUUL_* environment variables. Similarly,
the log archiving script can copy logs from the host to swift.
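As a sketch, the zuul dictionary might contain something like the
following (key names mirror today's ZUUL_* environment variables; the
exact schema is not defined here)::

zuul:
  project: stackforge/random
  branch: master
  change: '123456'
  patchset: '2'
  pipeline: gate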
A new Zuul component would be created to execute jobs. Rather than
running a worker process on each node (which requires installing
software on the test node, establishing and maintaining network
connectivity back to Zuul, and coordinating actions across nodes for
multi-node tests), this new component will accept jobs from Zuul and,
for each one, write an ansible inventory file with the node and
variable information, then execute the ansible playbook for that job.
This means that the new Zuul component will maintain ssh connections
to all hosts currently running a job. This could become a bottleneck,
but ansible and ssh have been known to scale to a large number of
simultaneous hosts, and this component may be scaled horizontally. It
should be simple enough that it could even be automatically scaled if
needed. In turn, this makes node configuration simpler (test nodes
need only have an ssh public key installed) and makes tests behave
more like deployment.
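For illustration, the inventory written for the devstack-multinode job
above might look something like this (hostnames and addresses are made
up)::

# hypothetical inventory written by the new component
controller ansible_ssh_host=203.0.113.10
compute ansible_ssh_host=203.0.113.11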
+100 on the Ansible-related suggested changes. :)
Thanks!
-jay