On Tue, May 30, 2023 at 10:36:36AM +0300, Michael Tokarev wrote:
> 26.05.2023 13:19, Daniel P. Berrangé wrote:
> > We just (re)discovered that our gitlab rules don't work nicely with
> > pipelines running from stable staging branches. Every pipeline gets
> > published with the 'latest' tag, whether it's the main staging branch
> > or one of the stable staging branches. If pipelines for multiple
> > staging branches run concurrently they'll get very confused and end
> > up using the wrong container content. e.g. an 8.0 stable job will get
> > run with a container from the development branch, or vice versa.
> >
> > With this series we dynamically change the tag so that the 'staging'
> > branch still uses 'latest', but the stable 'staging-X.Y' branches
> > use a 'staging-X-Y' container tag.
> >
> > We also let the container tag be set explicitly via the new variable
> >
> >   QEMU_CI_CONTAINER_TAG
> >
> > to facilitate CI testing, the new variable
> >
> >   QEMU_CI_UPSTREAM
> >
> > can be set to the fork namespace, to allow contributors to run a
> > pipeline as if their fork were upstream.
>
> Daniel, can you describe in a bit more detail (or refer to some text
> to read) about how this whole thing works, aka the "big picture"?

What docs we have are at docs/devel/ci*rst but they're by no means
complete.

> It smells like we're doing a huge wasteful job here, but it might be
> just because I don't understand how it works.
>
> Can't we prepare all containers separately and independently of regular
> qemu commits, and just use the prepared container images every time
> we run tests?

Contributor branches, contributor patch series submissions, and
pull requests periodically contain updates to the dockerfiles.
When we build such code branches, we need to ensure that the
containers we are building inside match the contents of those
dockerfiles, otherwise the build will fail or, perhaps worse,
silently not test the changes in the correct way.

Also we're creating containers in a staging branch and there's no
guarantee that the staging branch will actually get merged to
master. It might get rejected if CI fails, so we're left with
containers that might reflect a discarded pull request.

A final point is that the distro base images change periodically
and we want to pick up this content.

We don't want people triggering the pipelines to have to think
about any of this to figure out whether a container rebuild is
needed or not, as that is inherently error prone. We need CI to
"do the right thing" at all times.

Thus we will always build the containers in stage 1 of the
pipeline. Stage 2 will then do the QEMU builds inside the
just refreshed containers.

This is indeed wasteful if the patch series being tested did NOT
have any container changes.

To mitigate this wastage, however, we tell docker to use the
previously published containers as a cache. So docker build will
compare each command in the dockerfile against the cache and,
if they match, it will just copy across the container layer.

This is a major performance win, but even this act of checking
the cache does have some wastage.

Essentially, with CI, reliability is king and generally overrules
other considerations.
A reliable, but computationally wasteful CI, is more usable than an
unreliable, but computationally efficient CI. Obviously the ideal is
computationally efficient *and* reliable, and that's what we
constantly want to strive towards.

> Also, why can't several tests (from several different pipelines, maybe
> run for different branches) use the same images (if we don't re-prepare
> them on each and every test run)?

The cache should mostly address this, within the scope of our
release stream.

> I understand different branches might have different requirements for
> containers, like using older version of some OS, etc, - this is done
> by naming the container images appropriately, like debian-latest (for
> master) vs debian-bullseye (for stable-7.2) etc.

The package dependencies have changed frequently enough that each
release of QEMU needs distinct containers.

So with this patch series, we set container tag names based on the
staging branch, so we'll get

   debian11:latest       (for master / staging)
   debian11:staging-8-0  (for 8.0.x release staging)
   debian11:staging-7-2  (for 7.2.x release staging)

Finally, I have at last figured out a way we can improve this that
will probably let us remove the redundant container rebuilds for
patch series that /don't/ include dockerfile changes. IOW, we may
finally be able to achieve a computationally efficient and reliable
CI that doesn't require maintainers to figure out when to rebuild
containers. It is on my to do list to try it out....

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
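
P.S. To make the branch-to-tag mapping described above concrete, here
is a rough sketch in Python. It is only an illustration: the real logic
lives in the gitlab CI rules, not in Python, and the function name
container_tag and its override parameter are hypothetical. The
fallback to 'latest' for branches other than 'staging' is also an
assumption made for the sketch.

```python
import re
from typing import Optional

# Matches stable staging branches of the form 'staging-X.Y'.
_STABLE_STAGING = re.compile(r"^staging-(\d+)\.(\d+)$")


def container_tag(branch: str, override: Optional[str] = None) -> str:
    """Pick a container tag for a branch (hypothetical helper)."""
    # An explicitly set tag (cf. QEMU_CI_CONTAINER_TAG) takes priority.
    if override:
        return override
    match = _STABLE_STAGING.match(branch)
    if match:
        major, minor = match.groups()
        # 'staging-X.Y' becomes 'staging-X-Y'.
        return f"staging-{major}-{minor}"
    # 'staging' keeps using 'latest'; other branches falling back to
    # 'latest' too is an assumption of this sketch.
    return "latest"
```

With this, container_tag("staging-8.0") gives the tag used for the
8.0.x release staging containers, while container_tag("staging")
stays on 'latest', so concurrent pipelines no longer publish over
each other's images.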