> On 27 Apr 2020, at 18:37, Nir Soffer <[email protected]> wrote:
>
> On Mon, Apr 27, 2020 at 7:21 PM Barak Korren <[email protected]> wrote:
>>
>> On Mon, 27 Apr 2020 at 17:15, Marcin Sobczyk <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>> recently I've been working on a PoC for OST that replaces the usage
>>> of lago templates with pre-built, layered VM images packed in RPMs [2][7].
>>>
>>>
>>> What's the motivation?
>>>
>>> There are two big pains around OST - the first is that it's slow,
>>> and the second is that it uses lago, which is unmaintained.
>>>
>>>
>>> How does OST work currently?
>>>
>>> Lago launches VMs based on templates. It actually has its own mechanism
>>> for VM templating - you can find the ones that we currently use here [1].
>>> How are these templates created? There is a multiple-page doc somewhere
>>> that describes the process, but few are familiar with it. These templates
>>> are nothing special really - just an xzipped qcow with some metadata
>>> attached. The proposition here is to replace those templates with RPMs
>>> that have qcows inside. The RPMs themselves would be built by a CI
>>> pipeline. An example of a pipeline like this can be found here [2].
>>>
>>>
>>> Why RPMs?
>>>
>>> It ticks all the boxes really. RPMs provide:
>>> - tried and well-known mechanisms for packaging, versioning, and
>>>   distribution instead of lago's custom ones
>>> - dependencies, which allow layering the VM images in a controllable way
>>> - we already install RPMs when running OST, so using the new ones is a
>>>   matter of adding some dependencies
>>>
>>>
>>> How does the image-building pipeline work? [3]
>>>
>>> - we download a DVD iso for installation of the distro
>>> - we use 'virt-install' with the DVD iso + a kickstart file to build a
>>>   'base' layer qcow image
>>> - we create another qcow image that has the 'base' image as the backing
>>>   one. In this image we use 'virt-customize' to run 'dnf upgrade'. This
>>>   is our 'upgrade' layer.
>>> - we create two more qcow images that have the 'upgrade' image as the
>>>   backing one. On one of them we install the 'ovirt-host' package and on
>>>   the other the 'ovirt-engine'. These are our 'host-installed' and
>>>   'engine-installed' layers.
>>> - we create 4 RPMs for these qcows:
>>>   * ost-images-base
>>>   * ost-images-upgrade
>>>   * ost-images-host-installed
>>>   * ost-images-engine-installed
>>> - we publish the RPMs to the templates.ovirt.org/yum/ DNF repository
>>>   (not implemented yet)
>>>
>>> Each of those RPMs holds its respective qcow image. They also have proper
>>> dependencies set up - since the 'upgrade' layer requires the 'base' layer
>>> to be functional, it has an RPM requirement on that package. The same
>>> goes for the '*-installed' packages, which depend on the 'upgrade'
>>> package.
>>>
>>> Since this is only a PoC, there's still a lot of room for improvement
>>> around the pipeline. The 'base' RPM would actually be built very rarely,
>>> since it's a bare distro, while the 'upgrade' and '*-installed' RPMs
>>> would be built nightly. This would allow us to simply type 'dnf upgrade'
>>> on any machine and have a fresh set of VMs ready to be used with OST.
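>>>
>>> Roughly, the layering boils down to something like this (a simplified
>>> sketch - file, VM and disk names here are only illustrative, the real
>>> commands live in the Makefile [3]):
>>>
>>>   # 'base' layer: unattended install from the DVD iso via a kickstart file
>>>   virt-install --name ost-base \
>>>       --memory 2048 --vcpus 2 \
>>>       --disk path=base.qcow2,size=10,format=qcow2 \
>>>       --location CentOS-8-x86_64-dvd.iso \
>>>       --initrd-inject base.ks \
>>>       --extra-args "inst.ks=file:/base.ks console=ttyS0" \
>>>       --graphics none --noreboot
>>>
>>>   # 'upgrade' layer: thin qcow2 on top of 'base', packages updated offline
>>>   qemu-img create -f qcow2 -b base.qcow2 -F qcow2 upgrade.qcow2
>>>   virt-customize -a upgrade.qcow2 --update
>>>
>>>   # '*-installed' layers: thin qcow2s on top of 'upgrade'
>>>   qemu-img create -f qcow2 -b upgrade.qcow2 -F qcow2 host-installed.qcow2
>>>   virt-customize -a host-installed.qcow2 --install ovirt-host
>>>   qemu-img create -f qcow2 -b upgrade.qcow2 -F qcow2 engine-installed.qcow2
>>>   virt-customize -a engine-installed.qcow2 --install ovirt-engine
>>>
>>> Each resulting qcow2 then gets packaged into its corresponding
>>> ost-images-* RPM, with the RPM dependencies mirroring the backing chain.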
>>>
>>>
>>> Advantages:
>>>
>>> - we have CI for building OST images instead of the current, obscure
>>>   template-creation process
>>> - we get rid of lots of unnecessary preparations that are done during
>>>   each OST run by moving stuff from 'deploy scripts' [4] to the
>>>   image-building pipeline - this should speed up the runs a lot
>>> - if the nightly pipeline for building images is not successful, the RPMs
>>>   won't be published - OST will use the older ones. This makes a nice
>>>   "early error detection" mechanism and can partially mitigate situations
>>>   where everything is blocked because of, e.g., some dependency issue
>>> - it's another step towards removing responsibilities from lago
>>> - the pre-built VM images can be used for much more than OST - functional
>>>   testing of vdsm/engine on a VM? We have an image for that
>>> - we can easily build images for multiple distros, both u/s and d/s
>>>
>>>
>>> Caveats:
>>>
>>> - we have to download the RPMs before running OST and that takes time,
>>>   since they're big. This can be handled by having them cached on the CI
>>>   slaves though.
>>> - current limitations of CI and lago force us to make a copy of the
>>>   images after installation so they can be seen both by the processes in
>>>   the chroot and by libvirt, which is running outside of the chroot.
>>>   Right now they're placed in '/dev/shm' (which would actually make some
>>>   sense if they could be shared among all OST runs on the slave, but
>>>   that's another story). There are some possible workarounds for that
>>>   problem too (like running pipelines on bare-metal machines with libvirt
>>>   running inside the chroot)
>>> - multiple qcow layers can slow down the runs because there's a lot of
>>>   jumping around. This can be handled by, e.g., introducing a meta
>>>   package that squashes all the layers into one (a sketch follows after
>>>   the numbers below)
>>> - we need a way to run OST with custom-built artifacts. There are
>>>   multiple ways we can approach it:
>>>   * use the 'upgrade' layer and not the '*-installed' one
>>>   * first build your artifacts, then build VM image RPMs that have your
>>>     artifacts installed, and pass those RPMs to the OST run
>>>   * add 'ci build vms' that will do both of the above steps for you
>>>   Even here we can still benefit from using pre-built images - we can
>>>   create a 'deps-installed' layer that sits between 'upgrade' and
>>>   '*-installed' and contains all of vdsm's/engine's dependencies.
>>>
>>>
>>> Some numbers
>>>
>>> Let's take a look at two OST runs - one that uses the old way of
>>> working [5] and one that uses the new pre-built VM images [6]. The hacky
>>> change that allows us to use the pre-built images is here [7]. Here are
>>> some running times:
>>>
>>> - chroot init: 00:34 for the old way vs 14:03 for pre-built images
>>>
>>> This happens because the slave didn't have the new RPMs and chroot
>>> cached, so it took a lot of time to download them - the RPMs are ~2GB
>>> currently. Once they are available in the cache, this will get much
>>> closer to the old-way timing.
>>>
>>> - deployment times:
>>>   * engine: 08:09 for the old way vs 03:31 for pre-built images
>>>   * host-1: 05:05 for the old way vs 02:00 for pre-built images
>>>
>>> Here we can clearly see the benefits. This is without any special fine
>>> tuning really - even when using pre-built images there's still some
>>> deployment done, which can be moved to the image-creating pipeline.
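>>>
>>> (Regarding the qcow-layering caveat above: one way such a meta package
>>> could produce its payload - just an illustrative sketch, the file names
>>> are made up - is to flatten the whole backing chain into a single,
>>> standalone image at build time:)
>>>
>>>   # 'qemu-img convert' reads through the backing chain and writes one
>>>   # self-contained qcow2, so runtime I/O no longer hops between layers
>>>   qemu-img convert -O qcow2 engine-installed.qcow2 engine-squashed.qcow2
>>>   qemu-img convert -O qcow2 host-installed.qcow2 host-squashed.qcow2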
>>>
>>>
>>> Further improvements
>>>
>>> We could probably get rid of all the funny custom repository stuff that
>>> we're doing right now, because we can put everything that's necessary
>>> into the pre-built VM images.
>>>
>>> We can ship the images with an ssh key already injected - currently lago
>>> injects an ssh key for the root user on each run, which requires selinux
>>> relabeling, and that takes a lot of time (a sketch follows below).
>>>
>>> We can try creating 'ovirt-deployed' images, where the whole oVirt
>>> solution would already be deployed for some tests.
>>>
>>> WDYT?
>>
>> We should not reinvent packer.io. It's bad enough we're reinventing
>> Vagrant with Lago.
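>>>
>>> (As for the ssh key point above - an illustrative sketch, the key path is
>>> made up - baking the key in with virt-customize at image-build time would
>>> also let the selinux relabeling happen once, in the pipeline, instead of
>>> on every run:)
>>>
>>>   # inject the public key for root and relabel once, while building the image
>>>   virt-customize -a engine-installed.qcow2 \
>>>       --ssh-inject root:file:ost_id_rsa.pub \
>>>       --selinux-relabel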
>
> Yes, this looks promising:
> https://www.packer.io/docs/builders/qemu.html

It's not about reinventing, but rather about avoiding unnecessary
packages/dependencies. We considered that as well, but other than adding
complexity on top of virt-install it doesn't really do anything more.

>>> Regards, Marcin
>>>
>>> [1] https://templates.ovirt.org/repo/
>>> [2] https://gerrit.ovirt.org/#/c/108430/
>>> [3] https://gerrit.ovirt.org/#/c/108430/6/ost-images/Makefile.am
>>> [4] https://github.com/oVirt/ovirt-system-tests/tree/master/common/deploy-scripts
>>> [5] https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/6793/consoleFull
>>> [6] https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/9027/consoleFull
>>> [7] https://gerrit.ovirt.org/#/c/108610/

_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: https://www.ovirt.org/community/about/community-guidelines/
List Archives: https://lists.ovirt.org/archives/list/[email protected]/message/RNYG2A4LQNOCRHBKWZ2RVYOIIW3TRPBC/
