On Tue, May 30, 2023 at 10:36:36AM +0300, Michael Tokarev wrote:
> 26.05.2023 13:19, Daniel P. Berrangé wrote:
> > We just (re)discovered that our gitlab rules don't work nicely with
> > pipelines running from stable staging branches. Every pipeline gets
> > published with the 'latest' tag, whether it's the main staging branch
> > or one of the stable staging branches. If pipelines for multiple
> > staging branches run concurrently they'll get very confused and end
> > up using the wrong container content, e.g. an 8.0 stable job will get
> > run with a container from the development branch, or vice-versa.
> > 
> > With this series we dynamically change the tag so that the 'staging'
> > branch still uses 'latest', but the stable 'staging-X.Y' branches
> > use a 'staging-X-Y' container tag.
> > 
> > We also let the container tag be set explicitly via the new variable
> > 
> >    QEMU_CI_CONTAINER_TAG
> > 
> > to facilitate CI testing, the new variable
> > 
> >    QEMU_CI_UPSTREAM
> > 
> > can be set to the fork namespace, to allow contributors to run a
> > pipeline as if their fork were upstream.
> 
> Daniel, can you describe in a bit more detail (or refer to some text
> to read) about how this whole thing works, aka the "big picture"?

What docs we have are at docs/devel/ci*.rst, but they're by no means
complete.

> It smells like we're doing huge wasteful job here, but it might be
> just because I don't understand how it works.
> 
> Can't we prepare all containers separately and independently of regular
> qemu commits, and just use the prepared container images every time
> we run tests?

Contributor branches, contributor patch series submissions, and pull
requests periodically contain updates to the dockerfiles. When we
build such code branches, we need to ensure that the containers we
build inside match the contents of those dockerfiles, otherwise the
build will fail or, perhaps worse, silently fail to test the changes
in the correct way.

Also, we're creating containers in a staging branch, and there's no
guarantee that the staging branch will actually get merged to master.
It might get rejected if CI fails, leaving us with containers that
reflect a discarded pull request.

A final point is that the distro base images change periodically and
we want to pick up this content.

We don't want people triggering the pipelines to have to think about
any of this to figure out whether a container rebuild is needed or
not, as that is inherently error prone. We need CI to "do the right
thing" at all times.

Thus we will always build the containers in stage 1 of the pipeline.
Stage 2 will then do the QEMU builds inside the just-refreshed
containers.
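As a rough sketch (the job names, image name, and dockerfile path
below are illustrative assumptions, not copied from QEMU's actual
.gitlab-ci.yml), the two-stage layout looks something like:

```yaml
# Illustrative two-stage pipeline; names and paths are assumptions.
stages:
  - containers
  - build

container-debian11:
  stage: containers
  script:
    - docker build
        -t "$CI_REGISTRY_IMAGE/debian11:$QEMU_CI_CONTAINER_TAG"
        -f tests/docker/dockerfiles/debian11.docker .
    - docker push "$CI_REGISTRY_IMAGE/debian11:$QEMU_CI_CONTAINER_TAG"

build-system-debian:
  stage: build
  image: "$CI_REGISTRY_IMAGE/debian11:$QEMU_CI_CONTAINER_TAG"
  script:
    - ./configure && make
```

The key property is that the stage 2 job consumes exactly the tag
that stage 1 just pushed, so the build always runs against containers
matching the branch's dockerfiles.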

This is indeed wasteful if the patch series being tested did NOT
have any container changes.

To mitigate this waste, however, we tell docker to use the previously
published containers as a cache. docker build will then compare each
command in the dockerfile against the cache and, if they match, it will
just copy across the container layer. This is a major performance win,
but even the act of checking the cache has some overhead.
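Concretely, the cache reuse looks something like this (illustrative
only; the registry path and dockerfile name are assumptions, not
QEMU's exact invocation):

```shell
# Pull the previously published image so its layers can seed the build
# cache; '|| true' covers the very first run when no image exists yet.
docker pull "$CI_REGISTRY_IMAGE/debian11:latest" || true
docker build \
    --cache-from "$CI_REGISTRY_IMAGE/debian11:latest" \
    -t "$CI_REGISTRY_IMAGE/debian11:latest" \
    -f tests/docker/dockerfiles/debian11.docker .
```

If no dockerfile command changed, every layer is taken from the cache
and the "rebuild" is little more than a metadata check.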

Essentially, with CI, reliability is king and generally overrules other
considerations. A reliable but computationally wasteful CI is more
usable than an unreliable but computationally efficient CI.

Obviously the ideal is computationally efficient *and* reliable, and
that's what we constantly want to strive towards.

> Also, why can't several tests (from several different pipelines, maybe
> run for different branches) use the same images (if we don't re-prepare
> them on each and every test run)?

The cache should mostly address this, within the scope of our release
stream.

> I understand different branches might have different requirements for
> containers, like using older version of some OS, etc, - this is done
> by naming the container images appropriately, like debian-latest (for
> master) vs debian-bullseye (for stable-7.2) etc.

The package dependencies have changed frequently enough that each
release of QEMU needs distinct containers. So with this patch series,
we set container tag names based on the staging branch, so we'll get

   debian11:latest       (for master / staging)
   debian11:staging-8-0  (for 8.0.x release staging)
   debian11:staging-7-2  (for 7.2.x release staging)
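A minimal sketch of that branch-to-tag mapping (the helper name
container_tag is hypothetical, not the actual logic in the series;
only QEMU_CI_CONTAINER_TAG comes from the cover letter):

```shell
#!/bin/sh
# Hypothetical helper mirroring the mapping described above:
# 'staging-X.Y' branches get a 'staging-X-Y' tag, everything else
# uses 'latest'. QEMU_CI_CONTAINER_TAG, if set, overrides the result.
container_tag() {
    if [ -n "$QEMU_CI_CONTAINER_TAG" ]; then
        echo "$QEMU_CI_CONTAINER_TAG"
        return
    fi
    case "$1" in
        staging-[0-9]*) echo "$1" | tr '.' '-' ;;
        *)              echo "latest" ;;
    esac
}

container_tag staging-8.0   # prints: staging-8-0
container_tag staging       # prints: latest
container_tag master        # prints: latest
```

Deriving the tag purely from the branch name means concurrent
pipelines on different staging branches can no longer clobber each
other's 'latest' images.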



Finally, I have at last figured out a way we can improve this that will
probably let us remove the redundant container rebuilds for patch series
that /don't/ include dockerfile changes. IOW, we may finally be able to
achieve a computationally efficient and reliable CI, that doesn't require
maintainers to figure out when to rebuild containers. It is on my
to-do list to try it out....



With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

