On 2/16/21 1:43 PM, Daniel P. Berrangé wrote:
> On Wed, Feb 10, 2021 at 11:17:00AM +0000, Daniel P. Berrangé wrote:
>> On Tue, Feb 09, 2021 at 09:58:29AM +0000, Daniel P. Berrangé wrote:
>>> On Tue, Feb 09, 2021 at 07:37:51AM +0100, Thomas Huth wrote:
>>>> On 08/02/2021 17.33, Daniel P. Berrangé wrote:
>>>> [...]
>>>>> For example, consider pushing 5 commits, one of which contains a
>>>>> dockerfile change. This will trigger a CI pipeline for the
>>>>> containers. Now consider you do some more work on the branch and push 3
>>>>> further commits, so you now have a branch of 8 commits. For the second
>>>>> push GitLab will only look at the 3 most recent commits, the other 5
>>>>> were already present. Thus GitLab will not realize that the branch has
>>>>> dockerfile changes that need to trigger the container build.
>>>>>
>>>>> This can cause real world problems:
>>>>>
>>>>> - Push 5 commits to branch "foo", including a dockerfile change
>>>>>
>>>>>    => rebuilds the container images with content from "foo"
>>>>>    => build jobs run against containers from "foo"
>>>>>
>>>>> - Refresh your master branch with latest upstream master
>>>>>
>>>>>    => rebuilds the container images with content from "master"
>>>>>    => build jobs run against containers from "master"
>>>>>
>>>>> - Push 3 more commits to branch "foo", with no dockerfile change
>>>>>
>>>>>    => no container rebuild triggers
>>>>>    => build jobs run against containers from "master"
>>>>>
>>>>> The "changes" conditional in gitlab is OK, *provided* your build
>>>>> jobs are not relying on any external state from previous builds.
>>>>>
>>>>> This is NOT the case in QEMU, because we are building container
>>>>> images and these are cached. This is a scenario in which the
>>>>> "changes" conditional is not usable.
>>>>>
>>>>> The only other way to avoid this problem would be to use the git
>>>>> branch name as the container image tag, instead of always using
>>>>> "latest".
>>>>
>>>> I'm basically fine with your patch, but let me ask one more thing: Won't we
>>>> still have the problem if the user pushes to different branches
>>>> simultaneously? E.g. the user pushes to "foo" with changes to dockerfiles,
>>>> containers start to get rebuilt, then pushes to master without waiting for
>>>> the previous CI to finish, then the containers get rebuilt from the
>>>> "master" job without the local changes to the dockerfiles. Then in the
>>>> "foo" CI pipelines the following jobs might run with the containers that
>>>> have been built by the "master" job...
>>>
>>> Yes, this is the issue I describe in the cover letter.
>>>
>>>> So if we really want to get it bulletproof, do we have to use the git
>>>> branch name as the container image tag?
>>>
>>> That is possible, but I'm somewhat loath to do that, as it means the
>>> container registry in developers' forks will accumulate a growing list
>>> of image tags. I know gitlab will force expire once it gets beyond a
>>> certain number of tags, but it still felt pretty wasteful of space
>>> to create so many tags.
>>>
>>> Having said that, maybe this is not actually wasteful if we always
>>> use the "master" as a cache for docker, then the "new" images we
>>> build on each branch will just re-use existing docker layers and
>>> thus not add to disk usage. We'd only see extra usage if the branch
>>> contained changes to dockerfiles.
>>
>> The challenge here is that I need the docker tag name to be in an env
>> variable in the gitlab-ci.yml file.
>>
>> I can directly use $CI_COMMIT_REF_NAME to get the branch name but
>> the list of valid characters for a git branch is way more permissive
>> than valid characters for a docker tag.
>>
>> So we need to filter the git branch name to form a valid docker tag,
>> and AFAICT, there's no way to do that when setting a global env variable
>> in the gitlab-ci.yml.
>> I can only do filtering once in the before_script:
>> stage, and that's too late to use it in the image name for the job.
>
> I've thought of a solution here.
>
> We can tag the images with $CI_COMMIT_SHORT_SHA, and the build jobs
> can reference them with
>
>    image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:$CI_COMMIT_SHORT_SHA
>
> In the container build script, we then *also* tag them with a sanitized
> version of $CI_COMMIT_REF_NAME, and also use this as the cache to pull
> from when building the image.
>
> The main downside here is that we'll end up creating a lot of tags, but
> most will have the same content so shouldn't be too bad.

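For the sanitizing and double-tagging steps described above, here is a
rough sketch, assuming a POSIX shell in the container build job. The
helper name sanitize_tag, the fallback to "latest", the placeholder
defaults and the <dockerfile> argument are my own assumptions for
illustration, not taken from the series. Docker tags may only contain
[A-Za-z0-9_.-], must not start with '.' or '-', and are limited to
128 characters.

```shell
#!/bin/sh
# Hypothetical helper: map a git ref name onto a valid docker tag.
sanitize_tag() {
    # Replace every character outside [A-Za-z0-9_.-] with '-', strip
    # leading '.' and '-' (not allowed at the start), cap at 128 chars.
    tag=$(printf '%s' "$1" \
        | tr -c 'A-Za-z0-9_.-' '-' \
        | sed -e 's/^[.-]*//' \
        | cut -c1-128)
    # Assumption: fall back to "latest" if nothing usable is left.
    printf '%s\n' "${tag:-latest}"
}

# Placeholder defaults so this can be dry-run outside GitLab CI;
# inside a pipeline GitLab sets the CI_* variables itself.
CI_REGISTRY_IMAGE=${CI_REGISTRY_IMAGE:-registry.example.com/user/qemu}
CI_COMMIT_SHORT_SHA=${CI_COMMIT_SHORT_SHA:-0123abcd}
CI_COMMIT_REF_NAME=${CI_COMMIT_REF_NAME:-feature/foo bar}
IMAGE=${IMAGE:-fedora}

SHA_TAG="$CI_REGISTRY_IMAGE/qemu/$IMAGE:$CI_COMMIT_SHORT_SHA"
REF_TAG="$CI_REGISTRY_IMAGE/qemu/$IMAGE:$(sanitize_tag "$CI_COMMIT_REF_NAME")"

# Dry run: print the docker commands instead of executing them.
echo "docker pull $REF_TAG || true"
echo "docker build --cache-from $REF_TAG -t $SHA_TAG -t $REF_TAG -f <dockerfile> ."
echo "docker push $SHA_TAG"
echo "docker push $REF_TAG"
```

The build jobs would keep referencing the $CI_COMMIT_SHORT_SHA tag, so
the sanitized name is only ever computed inside the script and never
has to appear in a global variable in gitlab-ci.yml.
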
Cleaning up the extra tags could be automated (for forks):
https://docs.gitlab.com/ee/user/packages/container_registry/#delete-images-by-using-a-cleanup-policy

This is not yet possible for the qemu-project registry, because the
documentation says:

  Cleanup policies can be run on all projects, with these exceptions:

    For GitLab.com, the project must have been created after 2020-02-22.

Regards,

Phil.