Re: [openstack-dev] [TripleO] overcloud_containers.yaml: container versioning, tags and use cases ?
On Sat, May 27, 2017 at 7:07 AM, David Moreau Simard wrote:
> Hi,
>
> Today we discussed a challenge around image tags that mostly boils
> down to limitations in how overcloud_containers.yaml is constructed
> and used.
>
> TL;DR, we need a smart and easy way to work with the
> overcloud_containers.yaml file (especially tags).
>
> Let's highlight a few use cases that we need to work through:
>
> #1. Building containers
> For building containers, all we really care is the name of the
> images we need to build.
> Today, we install a trunk repository and then install
> tripleo-common-containers from that trunk repository.
> We then mostly grep/sed/awk our way from overcloud_containers.yaml
> to a clean list of images to build and then build those.
> Relatively okay with this but prone to things breaking -- a clean
> way to get just the list of images out of there would be nice.
>

The command "openstack overcloud container image build" also has some string matching logic, but then invokes kolla-build directly. Can I suggest that we add a --list-images option to this command so that it just returns a list of images for other image building tools to consume?

> #2. Testing and promoting containers
> This comes right after use case #1 where we build containers in the
> pipeline.
> For those familiar with the CI pipeline to do promotions [1], this
> would look a bit like this [2].
>
> In practice, this works basically the same way as we build, test and
> promote tripleo-quickstart images.
> We pick a particular trunk repository hash and build containers for
> that hash. These are then pushed with both the tags ":latest" and
> ":".
> We're then supposed to test those containers in the pipeline but to
> do that, we need to be pulling from :, not :latest...
> although they are in theory equivalent at that given time, this might
> not always be true.
> So the testing job(s) need a way to easily customize/pull from that
> particular hash instead of the hardcoded latest we have right now.
>

I would like to see another "openstack overcloud container image ..." command which is pointed at an image registry and a canonical overcloud_containers.yaml file, then generates another overcloud_containers.yaml (and heat environment file) which contains the proper latest tags. This tool could work too for stable version-style tags. How about "openstack overcloud container image discover"?

This would be easier to implement if the canonical overcloud_containers.yaml file was a template rather than a file with hard-coded namespace and tags.

> #3. Upstream gate jobs
> Ideally, gate jobs should never use ":latest". This is in trunk/dlrn
> terms the equivalent of "/current/" or "/consistent/".
> They'd use something like ":latest-passed-ci" which would be the
> proper equivalent of "/current-passed-ci/" or "/current-tripleo/".
>

There is nothing special about the word latest. Can we give these images the same tag as the name of the package repo they came from? so :current-passed-ci :current-tripleo?

> This brings an interesting challenge around how we currently add new
> images to overcloud_containers.yaml (example [3]).
> It is expected that, when you add the image, the image is already
> present on the registry because otherwise the container jobs will fail
> since this new image cannot be pulled (example [4]).
> My understanding currently is that humans may build and push images
> to the registry ahead of time so that this works.
> We can keep a similar approach if that's what we really want with
> the new private registry, the job that builds container is made
> generic exactly to be able to build just a specific set of image(s) if
> we want.
> Here's the catch, though: this new container image will have the
> ":latest" tag, it will not have ":latest-passed-ci" because it hasn't
> passed CI yet, it's being added just now.
> So how do we address this ?
>

Here is an idea: the "discover" command mentioned above could filter images based on their presence in the registry with the required tags, so the resulting generated overcloud_containers.yaml would have fewer entries if there is no image with the requested tag.

> Note:
> We've already discussed that some containers need to pick up the
> latest and the greatest from the "/current/" repository, either
> because they are "direct" tripleo packages or if "Depends-On" is used.
> So far, the solution we seem to be going towards is to pick up the
> containers from ":latest-passed-ci" and then more or less add a 'yum
> update' layer to the images needing an update.
> This is the option that is in the best interest of time, we'd
> otherwise be spending too much time building containers in jobs that
> are already taking way too long to run.
>

That is a shame, I have no suggestions to avoid this though.

> #4. Test days
> When doing test days, we know to point testers to
> /current-passed-ci/ as well as tested
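
As a rough illustration of the --list-images and "discover" ideas in this reply, here is a minimal Python sketch. It is not tripleo-common code: the function names, the sample image references and the namespace are all hypothetical, it assumes each entry boils down to an image reference string of the form "namespace/name:tag", and the registry lookup is replaced by an in-memory set so the example stays self-contained.

```python
"""Sketch only: names and sample data are hypothetical, not tripleo-common code."""


def split_image_ref(ref):
    """Split 'registry:port/namespace/name:tag' into (repository, tag)."""
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    # A colon before the last slash belongs to a registry port, not a tag.
    if colon > last_slash:
        return ref[:colon], ref[colon + 1:]
    return ref, None


def list_images(refs):
    """Use case #1: bare image names, without registry, namespace or tag."""
    return [split_image_ref(ref)[0].rsplit("/", 1)[-1] for ref in refs]


def retag(refs, tag):
    """Use case #2: point every image at one tag instead of a hardcoded one."""
    return ["%s:%s" % (split_image_ref(ref)[0], tag) for ref in refs]


def discover(refs, tag, available):
    """Keep only the images that already carry `tag`.

    `available` stands in for a registry query (an assumption made for
    this sketch): a set of full references known to exist.
    """
    return [ref for ref in retag(refs, tag) if ref in available]


if __name__ == "__main__":
    # Hypothetical canonical list and registry contents.
    canonical = [
        "example/centos-binary-nova-api:latest",
        "example/centos-binary-new-service:latest",
    ]
    in_registry = {
        "example/centos-binary-nova-api:latest",
        "example/centos-binary-nova-api:current-passed-ci",
        "example/centos-binary-new-service:latest",
    }
    print(list_images(canonical))
    print(discover(canonical, "current-passed-ci", in_registry))
```

With this shape, a newly added image that only exists as ":latest" simply drops out of the generated list when a job asks for ":current-passed-ci", which is the filtering behaviour suggested for use case #3.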
[openstack-dev] [TripleO] overcloud_containers.yaml: container versioning, tags and use cases ?
Hi,

Today we discussed a challenge around image tags that mostly boils down to limitations in how overcloud_containers.yaml is constructed and used.

TL;DR, we need a smart and easy way to work with the overcloud_containers.yaml file (especially tags).

Let's highlight a few use cases that we need to work through:

#1. Building containers
For building containers, all we really care is the name of the images we need to build.
Today, we install a trunk repository and then install tripleo-common-containers from that trunk repository.
We then mostly grep/sed/awk our way from overcloud_containers.yaml to a clean list of images to build and then build those.
Relatively okay with this but prone to things breaking -- a clean way to get just the list of images out of there would be nice.

#2. Testing and promoting containers
This comes right after use case #1 where we build containers in the pipeline.
For those familiar with the CI pipeline to do promotions [1], this would look a bit like this [2].

In practice, this works basically the same way as we build, test and promote tripleo-quickstart images.
We pick a particular trunk repository hash and build containers for that hash. These are then pushed with both the tags ":latest" and ":".
We're then supposed to test those containers in the pipeline but to do that, we need to be pulling from :, not :latest... although they are in theory equivalent at that given time, this might not always be true.
So the testing job(s) need a way to easily customize/pull from that particular hash instead of the hardcoded latest we have right now.

#3. Upstream gate jobs
Ideally, gate jobs should never use ":latest". This is in trunk/dlrn terms the equivalent of "/current/" or "/consistent/".
They'd use something like ":latest-passed-ci" which would be the proper equivalent of "/current-passed-ci/" or "/current-tripleo/".

This brings an interesting challenge around how we currently add new images to overcloud_containers.yaml (example [3]).
It is expected that, when you add the image, the image is already present on the registry because otherwise the container jobs will fail since this new image cannot be pulled (example [4]).
My understanding currently is that humans may build and push images to the registry ahead of time so that this works.
We can keep a similar approach if that's what we really want with the new private registry, the job that builds container is made generic exactly to be able to build just a specific set of image(s) if we want.
Here's the catch, though: this new container image will have the ":latest" tag, it will not have ":latest-passed-ci" because it hasn't passed CI yet, it's being added just now.
So how do we address this ?

Note:
We've already discussed that some containers need to pick up the latest and the greatest from the "/current/" repository, either because they are "direct" tripleo packages or if "Depends-On" is used.
So far, the solution we seem to be going towards is to pick up the containers from ":latest-passed-ci" and then more or less add a 'yum update' layer to the images needing an update.
This is the option that is in the best interest of time, we'd otherwise be spending too much time building containers in jobs that are already taking way too long to run.

#4. Test days
When doing test days, we know to point testers to /current-passed-ci/ as well as tested quickstart images.
How can we make it easy for containers ?
If the container list from tripleo-common is hardcoded to latest, that won't work.
If it's hardcoded to :latest-passed-ci, it won't work for other use cases.
Ideally this would be super easy for end users as well as developers so that they can get started easily.

#5 Stable releases, users, operators & co
The packaging workflow is not the same for trunk (dlrn out of git source on trunk.rdoproject.org) and for stable releases (CentOS build system out of released tarballs on mirror.centos.org).
It's also going to be different for containers.
For trunk, we'll be building containers with trunk repositories and publishing them to a private registry analogous to trunk.rdoproject.org repositories.
For stable releases, while still hand-wavy and foggy, we seem to be headed in the direction of the CentOS official registry which takes Dockerfiles in pseudo dist-git repositories and builds/publishes the containers through Jenkins jobs.
This is sort of similar to how downstream would work if you replace Jenkins by Brew/OSBS.

So here, we want to use the overcloud_containers.yaml file to "compile" Dockerfiles which will be shipped to git repositories and then built by another process.
These containers will be published somewhere that tripleo-common is sort of expected to know ahead of time because users, customers and developers need to be pulling from that "stable" source and from a "stable" tag.

So... what do we do ?

[1]: