Hi FreeIPA contributors,

First, I apologize for such a long mail.

In the team, we have been discussing what upstream CI should look like
and what to expect from it. Various proposals were discussed. In this
email I'd like to write down what I believe the expectations/goals of
it, or the scope of those expectations, should be. Most are based on
Fedora CI and Cockpit Project ideals. The intention of this mail is to
convey this information, to create space for feedback, and to start a
specific discussion on the individual points.

In the first part, I'm mostly focusing on the ideals, not the
implementation (the use of some specific CI). This is the core part to
focus on. When we agree on ideals/goals, we can discuss
implementation.

In the second part, I mention some aspects of PR-CI and its
capabilities and why I think it fulfills the ideas above.

I don't want to take much of your time and I hope that I'm not
repeating myself much, but I have a feeling that we lack a write-up
like this, i.e. that not everybody knows the motivation behind PR-CI
or some parts of it. I did not cover everything. Also, there is an
idea of how to turn the CI runner into a bot for other non-test tasks,
but that is for another mail, another discussion.

# Upstream CI expectations:

Upstream CI should be upstream, so that upstream contributors can use
it. It should also be simple enough to modify.

Being upstream means:
* an upstream contributor is able to add a test
* an upstream contributor is able to modify the way tests are run (e.g.
install Selenium and Firefox on a host for Web UI tests)
* an upstream contributor is able to inspect the results and logs of a
test run
* a test run can be referenced in public resources like Pagure
* doing the above should not require any action outside of upstream
channels

## Goal of CI:
Keep good enough quality to release often without regressions.

So in essence, I see 2 goals: testing (assessment of quality) and being
upstream friendly.

## What to test:
* ideally everything on each pull request

## Complexity of the test
* "test everything" might raise questions like: OK, what about
performance tests or big infra tests? My personal opinion is that such
tests might place quite a high demand on the testing infra and add too
much unnecessary complexity. Upstream CI could support tests with e.g.
6 hosts, but the number of such tests could be limited. In my opinion,
this could be the threshold between what upstream tests and what
downstream providers of FreeIPA test. So in practice, upstream CI
should test "almost everything". ;)

## How quickly to test it:
* in a reasonable time after opening/updating a PR. Reasonable in this
context means a time in which it still provides useful feedback but is
also able to run something non-trivial. 1.5h was chosen as an
arbitrary value which can fulfill this.

## What tests should look like
Ideally, tests should be in a form that can be reused easily on OSes
other than Fedora.

## How many parallel Pull Requests to count on with these constraints:
* when designing PR-CI, 3 were chosen. It seemed like a value that
might be close to the usual real-world peak.

## Other expectations on test infra:
* be able to divide resource costs among more entities/companies (the
concept of a trusted runner)
* an upstream contributor should be able to write and debug tests in
the same fashion as tests are run in the CI


All testing, and especially resource-demanding testing like the
multihost tests which FreeIPA has, requires significant resources.
Currently, those are provided by Red Hat. If somebody else would like
to join the party, or if there are other available resources, then why
not utilize them.

Being able to reproduce the test environment on private hardware
allows contributors to write and debug tests. If this is not possible,
contributing tests becomes difficult. So ideally the majority of tests
should be runnable on a well-equipped laptop (e.g. with 12-16 GB of
memory) while still being able to do some work on it.

## What results should look like
Up to the team. For me, personally, even the current state is good
enough (meaning the GitHub interface). But the decentralized and
public aspect could support other views like Michal's dashboard POC,
which is quite nice. Or it could support sending results to DBs like
ResultsDB <https://fedoraproject.org/wiki/ResultsDB> for future
analysis and processing. Such a DB can have its own UI and, mainly,
doesn't need to be developed by us.

## Compromises and release cadence

Is it possible to test everything in 1.5h while having 3 parallel PRs?
Maybe. But probably not, or maybe not at the beginning. If it is, then
let us explore how. If not, then let's come up with reasonable
compromises.
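
To make the question above a bit more concrete, here is a minimal
sketch of how one could estimate it. It is not part of PR-CI; the job
durations and runner counts are made-up placeholders, and the
longest-job-first heuristic only gives an estimate (a smarter schedule
can sometimes do better).

```python
import heapq


def makespan_minutes(job_minutes, runners):
    """Estimate wall-clock minutes using greedy longest-job-first scheduling."""
    if runners < 1:
        raise ValueError("need at least one runner")
    # Each heap entry is the minute at which one runner becomes free.
    free_at = [0] * runners
    heapq.heapify(free_at)
    for minutes in sorted(job_minutes, reverse=True):
        start = heapq.heappop(free_at)  # earliest available runner
        heapq.heappush(free_at, start + minutes)
    return max(free_at)


def fits_budget(job_minutes, runners, parallel_prs=3, budget_minutes=90):
    """Check whether `parallel_prs` copies of the job set finish in the budget."""
    all_jobs = list(job_minutes) * parallel_prs
    return makespan_minutes(all_jobs, runners) <= budget_minutes


# Hypothetical per-PR job durations in minutes, not measured values.
example_jobs = [30, 45, 60, 20, 25]

for runner_count in (3, 6, 9):
    print(runner_count, fits_budget(example_jobs, runner_count))
```

Plugging in real job durations and the real number of runners would
show how far we are from the 1.5h target and which compromise (fewer
jobs, more runners, a longer budget) closes the gap.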

Being able to test everything on each PR allows us to release at any moment.

If we make compromises then we also compromise on how often we are
able to release. This might still be OK. It is up to the team to decide.

I'd prefer that we aim to release upstream FreeIPA every 14 days. That
might be too ambitious at the beginning, so e.g. 1 month could be OK
at the start.


# PR-CI specifics and why it fulfills the goals:

It's completely upstream: code, test definition, implementation and
deployment specification. The only private part is the actual running
deployment of a runner. But if the upstream core team has a way to
"bless" runners as trusted then this is fine, and more entities can
deploy such runners. In practice, this blessing is done with a
fedorahosted private key and a GitHub token.

## Nightly tests

Nightly tests are a compromise which basically says: we are not able to
test everything in every PR, so let's at least test the rest nightly.
It was easy to set up without much work. But that doesn't mean that a
different approach cannot be taken.

## Alternatives to nightly tests

In last week's discussion, Lex was talking about testing more stuff and
utilizing runners when they don't do anything. I can imagine that
there might be a set of jobs with higher priority to give fast
feedback and then another set of jobs (currently the ones in
nightlies) with lower priority. Thus it would still provide good
feedback soon enough and full feedback a bit later.
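
The two-priority idea above could be sketched roughly like this. It is
only an illustration, not how PR-CI schedules jobs today; the class,
the priority names and the job names are all hypothetical.

```python
import heapq
import itertools


class JobQueue:
    """Pending jobs; runners always take the most urgent one first."""

    def __init__(self):
        self._heap = []
        # Counter breaks ties so jobs of equal priority stay in FIFO order.
        self._order = itertools.count()

    def add(self, name, priority):
        # Lower number means more urgent, because heapq is a min-heap.
        heapq.heappush(self._heap, (priority, next(self._order), name))

    def next_job(self):
        """Return the next job name, or None when nothing is pending."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


FAST, FULL = 0, 1  # hypothetical priority levels

queue = JobQueue()
queue.add("full/test_replication", FULL)  # would otherwise run nightly
queue.add("fast/build", FAST)
queue.add("fast/test_simple", FAST)
```

An idle runner calling `queue.next_job()` picks up the fast-feedback
jobs first and reaches the lower-priority ones only when nothing more
urgent is waiting.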

## Test stability
We often say PR-CI while meaning tests on Pull Requests. But the hidden
jewel of the project is the way things are run on a runner. It uses
local virtualization via Vagrant and test configuration via Ansible,
with predefined, regularly updated host images.
* local virtualization does not tie us to a specific provisioning
system. It can work on OpenStack, Beaker, bare metal and maybe also
OpenShift or other cloud providers like AWS.
* Ansible simply makes the setup easy
* custom images are the key part here. The core idea is that we prepare
an image with dependencies, test it first, and use it only if it does
not regress. So bugs in dependencies won't stop development. For
FreeIPA, which has a huge dependency tree, it is really a game changer.
Regular updates and pre-testing of images before use are the key to
catching and fixing such bugs while keeping images relatively
up-to-date.
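
The "test the image first, use it only if it does not regress" rule
from the last bullet can be written down as a tiny gate. This is a
sketch of the idea only; the `Image` type, the image name, the version
strings and the `run_tests` callback are assumptions for illustration,
not the actual PR-CI tooling.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Image:
    name: str
    version: str


def promote_if_green(
    current: Optional[Image],
    candidate: Image,
    run_tests: Callable[[Image], bool],
) -> Optional[Image]:
    """Return the image PR jobs should use after evaluating `candidate`."""
    if run_tests(candidate):
        return candidate  # no regression: switch to the fresh image
    return current  # regression in dependencies: keep the known-good image


# Hypothetical images and a fake test run that rejects version 0.2.0.
known_good = Image("ci-image", "0.1.0")
broken = Image("ci-image", "0.2.0")
fixed = Image("ci-image", "0.2.1")


def fake_tests(image: Image) -> bool:
    return image.version != "0.2.0"
```

With this gate a broken dependency update only delays the image
refresh; pull requests keep running against the last image that passed.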

## Optimizations:

Not every job has the same size. There are tests which require 1 host,
there are tests which need 5. Currently, all runners have the same
size and thus small tasks don't utilize resources well. Having support
for different sizes will allow us to run more stuff in parallel with
the same resources.
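
One way to picture the gain is as a packing problem: if a runner can
hold several small jobs at once, fewer runners are needed for the same
set of jobs. The sketch below uses a first-fit-decreasing heuristic;
the job names, host counts and the runner capacity are invented for
the example and do not describe the current runners.

```python
def pack_jobs(job_hosts, runner_capacity):
    """Group jobs onto runners with first-fit-decreasing by host count."""
    runners = []  # each entry: [remaining_capacity, [job names]]
    by_size = sorted(job_hosts.items(), key=lambda kv: kv[1], reverse=True)
    for name, hosts in by_size:
        if hosts > runner_capacity:
            raise ValueError(
                f"{name} needs {hosts} hosts, runner holds {runner_capacity}"
            )
        for runner in runners:
            if runner[0] >= hosts:  # first runner with enough room
                runner[0] -= hosts
                runner[1].append(name)
                break
        else:
            # No existing runner has room, so start a new one.
            runners.append([runner_capacity - hosts, [name]])
    return [names for _, names in runners]


# Hypothetical jobs mapped to the number of hosts each one needs.
example_job_hosts = {
    "build": 1,
    "test_simple": 1,
    "test_replica": 2,
    "test_topology": 5,
}

packed = pack_jobs(example_job_hosts, runner_capacity=5)
```

Here the four jobs fit on two runners instead of occupying one runner
each.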

## Test definition yaml

The key benefit is that adding a test to run on PRs, or in the future
nightly, can be done by a simple modification of `.freeipa-pr-ci.yaml`.

Kudos if you got here.

Regards
-- 
Petr Vobornik
_______________________________________________
FreeIPA-devel mailing list -- freeipa-devel@lists.fedorahosted.org
To unsubscribe send an email to freeipa-devel-le...@lists.fedorahosted.org
