> On 27 Apr 2020, at 18:37, Nir Soffer <[email protected]> wrote:
> 
> On Mon, Apr 27, 2020 at 7:21 PM Barak Korren <[email protected]> wrote:
>> 
>> 
>> 
>> On Mon, 27 Apr 2020 at 17:15, Marcin Sobczyk
>> <[email protected]> wrote:
>>> 
>>> Hi,
>>> 
>>> recently I've been working on a PoC for OST that replaces the use
>>> of lago templates with pre-built, layered VM images packaged as RPMs [2][7].
>>> 
>>> 
>>> What's the motivation?
>>> 
>>> There are two big pains around OST: the first is that it's slow,
>>> and the second is that it uses lago, which is unmaintained.
>>> 
>>> 
>>> How does OST work currently?
>>> 
>>> Lago launches VMs based on templates. It actually has its own mechanism for
>>> VM templating - you can find the ones that we currently use here [1]. How are
>>> these templates created? There is a multiple-page doc somewhere that describes
>>> the process, but few are familiar with it. These templates are nothing special
>>> really - just an xz-compressed qcow with some metadata attached. The proposal
>>> here is to replace those templates with RPMs that contain the qcows. The RPMs
>>> themselves would be built by a CI pipeline. An example of such a pipeline can
>>> be found here [2].
>>> 
>>> 
>>> Why RPMs?
>>> 
>>> RPMs tick all the boxes really. They provide:
>>> - tried and well-known mechanisms for packaging, versioning, and distribution
>>>  instead of lago's custom ones
>>> - dependencies, which allow the VM images to be layered in a controllable way
>>> - we already install RPMs when running OST, so using the new ones is just a
>>>  matter of adding some dependencies
>>> 
>>> 
>>> How does the image building pipeline work? [3] (a sketch follows the list below)
>>> 
>>> - we download a DVD ISO for installing the distro
>>> - we use 'virt-install' with the DVD ISO + a kickstart file to build a
>>>  'base' layer qcow image
>>> - we create another qcow image that has the 'base' image as the backing 
>>> one. In this
>>>  image we use 'virt-customize' to run 'dnf upgrade'.  This is our 'upgrade' 
>>> layer.
>>> - we create two more qcow images that have the 'upgrade' image as the 
>>> backing one. On one
>>>  of them we install the 'ovirt-host' package and on the other the 
>>> 'ovirt-engine'. These are
>>>  our 'host-installed' and 'engine-installed' layers.
>>> - we create 4 RPMs for these qcows:
>>>  * ost-images-base
>>>  * ost-images-upgrade
>>>  * ost-images-host-installed
>>>  * ost-images-engine-installed
>>> - we publish the RPMs to the templates.ovirt.org/yum/ DNF repository
>>>  (not implemented yet)
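>>> 
>>> (To make the above more concrete, here is a minimal shell sketch of the
>>> layering steps. The file names, kickstart path and virt-install options are
>>> illustrative assumptions, not the exact ones the pipeline uses.)
>>> 
>>>   # 'base' layer: unattended install from the DVD ISO using a kickstart
>>>   virt-install --name ost-base --memory 2048 \
>>>     --disk path=base.qcow2,size=10,format=qcow2 \
>>>     --location distro.iso --initrd-inject base.ks \
>>>     --extra-args "inst.ks=file:/base.ks" --graphics none --noreboot
>>> 
>>>   # 'upgrade' layer: thin overlay backed by 'base', updated in place
>>>   qemu-img create -f qcow2 -b base.qcow2 -F qcow2 upgrade.qcow2
>>>   virt-customize -a upgrade.qcow2 --run-command 'dnf upgrade -y'
>>> 
>>>   # '*-installed' layers: overlays backed by 'upgrade' with the product
>>>   # packages installed (shown here for the engine image)
>>>   qemu-img create -f qcow2 -b upgrade.qcow2 -F qcow2 engine-installed.qcow2
>>>   virt-customize -a engine-installed.qcow2 --install ovirt-engine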
>>> 
>>> Each of those RPMs holds its respective qcow image. They also have proper
>>> dependencies set up - since the 'upgrade' layer requires the 'base' layer to
>>> be functional, it has an RPM requirement on that package. The same goes for
>>> the '*-installed' packages, which depend on the 'upgrade' package.
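>>> 
>>> (For illustration, that dependency chain could look roughly like this in the
>>> spec files - the file path is an assumption, only the package names come from
>>> the list above:)
>>> 
>>>   # ost-images-upgrade.spec (sketch)
>>>   Name:     ost-images-upgrade
>>>   Requires: ost-images-base
>>>   %files
>>>   /usr/share/ost-images/upgrade.qcow2
>>> 
>>>   # ost-images-engine-installed.spec (sketch)
>>>   Name:     ost-images-engine-installed
>>>   Requires: ost-images-upgrade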
>>> 
>>> Since this is only a PoC, there's still a lot of room for improvement around
>>> the pipeline. The 'base' RPM would actually be built very rarely, since it's
>>> a bare distro, while the 'upgrade' and '*-installed' RPMs would be built
>>> nightly. This would allow us to simply type 'dnf upgrade' on any machine and
>>> have a fresh set of VMs ready to be used with OST.
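>>> 
>>> (Assuming the repo is configured on the machine, that would be something
>>> like:)
>>> 
>>>   # first run: pull the images (lower layers come in via RPM dependencies)
>>>   dnf install ost-images-engine-installed ost-images-host-installed
>>>   # afterwards: refresh to the latest nightly build
>>>   dnf upgrade 'ost-images-*'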
>>> 
>>> 
>>> Advantages:
>>> 
>>> - we have CI for building OST images instead of the current, obscure
>>>  template-creation process
>>> - we get rid of lots of unnecessary preparations that are done during each
>>>  OST run by moving stuff from 'deploy scripts' [4] to the image-building
>>>  pipeline - this should speed up the runs a lot
>>> - if the nightly pipeline for building images is not successful, the RPMs
>>>  won't be published - OST will use the older ones. This makes a nice "early
>>>  error detection" mechanism and can partially mitigate situations where
>>>  everything is blocked because of e.g. dependency issues
>>> - it's another step towards removing responsibilities from lago
>>> - the pre-built VM images can be used for much more than OST - functional
>>>  testing of vdsm/engine on a VM? We have an image for that
>>> - we can build images for multiple distros, both u/s and d/s, easily
>>> 
>>> 
>>> Caveats:
>>> 
>>> - we have to download the RPMs before running OST and that takes time, 
>>> since they're big.
>>>  This can be handled by having them cached on the CI slaves though.
>>> - current limitations of CI and lago force us to make a copy of the images
>>>  after installation so they can be seen both by the processes in the chroot
>>>  and by libvirt, which runs outside of the chroot. Right now they're placed
>>>  in '/dev/shm' (which would actually make some sense if they could be shared
>>>  among all OST runs on the slave, but that's another story). There are some
>>>  possible workarounds for that problem too (like running pipelines on
>>>  bare-metal machines with libvirt running inside the chroot)
>>> - multiple qcow layers can slow down the runs because there's a lot of
>>>  jumping around. This can be handled by e.g. introducing a meta package that
>>>  squashes all the layers into one (see the sketch after this list).
>>> - we need a way to run OST with custom-built artifacts. There are multiple
>>>  ways we can approach it:
>>>  * use the 'upgrade' layer and not the '*-installed' one
>>>  * first build your artifacts, then build VM image RPMs that have your
>>>    artifacts installed, and pass those RPMs to the OST run
>>>  * add a 'ci build vms' command that will do both of the above steps for you
>>>  Even here we can still benefit from using pre-built images - we can create
>>>  a 'deps-installed' layer that sits between 'upgrade' and '*-installed' and
>>>  contains all of vdsm's/engine's dependencies.
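>>> 
>>> (Regarding the layer squashing mentioned above, a minimal sketch - the
>>> output file name is just an example - could be:)
>>> 
>>>   # flatten the whole backing chain of the top layer into one standalone image
>>>   qemu-img convert -O qcow2 engine-installed.qcow2 engine-installed-flat.qcow2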
>>> 
>>> 
>>> Some numbers
>>> 
>>> So let's take a look at two OST runs - one that uses the old way of
>>> working [5] and one that uses the new pre-built VM images [6]. The hacky
>>> change that allows us to use the pre-built images is here [7]. Here are
>>> some running times:
>>> 
>>> - chroot init: 00:34 for the old way vs 14:03 for pre-built images
>>> 
>>> This happens because the slave didn't have the new RPMs and chroot cached,
>>> so it took a lot of time to download them - the RPMs are ~2GB currently.
>>> Once they're available in the cache, it should get much closer to the
>>> old-way timing.
>>> 
>>> - deployment times:
>>>  * engine 08:09 for the old way vs 03:31 for pre-built images
>>>  * host-1 05:05 for the old way vs 02:00 for pre-built images
>>> 
>>> Here we can clearly see the benefits. This is without any special
>>> fine-tuning really - even when using pre-built images there's still some
>>> deployment being done, which can be moved to the image-creating pipeline.
>>> 
>>> 
>>> Further improvements
>>> 
>>> We could probably get rid of all the funny custom repository stuff that
>>> we're doing right now, because we can put everything that's necessary into
>>> the pre-built VM images.
>>> 
>>> We can ship the images with an ssh key already injected - currently lago
>>> injects an ssh key for the root user in each run, which requires SELinux
>>> relabeling and takes a lot of time.
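>>> 
>>> (Baking the key in at image-build time could be as simple as the following -
>>> the key path is just an example:)
>>> 
>>>   # inject the public key and relabel once, during the image build
>>>   virt-customize -a engine-installed.qcow2 \
>>>     --ssh-inject root:file:/path/to/ost_key.pub --selinux-relabel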
>>> 
>>> We can try creating 'ovirt-deployed' images, where the whole oVirt solution
>>> would already be deployed for some tests.
>>> 
>>> WDYT?
>> 
>> 
>> We should not reinvent packer.io. It's bad enough we're reinventing Vagrant 
>> with Lago.

> 
> Yes, this looks promising:
> https://www.packer.io/docs/builders/qemu.html

It's not about reinventing, but rather about avoiding unnecessary
packages/dependencies. We considered that as well, but other than adding
complexity on top of virt-install, it doesn't really do anything more.

> 
>>> Regards, Marcin
>>> 
>>> [1] https://templates.ovirt.org/repo/
>>> [2] https://gerrit.ovirt.org/#/c/108430/
>>> [3] https://gerrit.ovirt.org/#/c/108430/6/ost-images/Makefile.am
>>> [4] 
>>> https://github.com/oVirt/ovirt-system-tests/tree/master/common/deploy-scripts
>>> [5] 
>>> https://jenkins.ovirt.org/view/oVirt%20system%20tests/job/ovirt-system-tests_manual/6793/consoleFull
>>> [6] 
>>> https://jenkins.ovirt.org/job/ovirt-system-tests_standard-check-patch/9027/consoleFull
>>> [7] https://gerrit.ovirt.org/#/c/108610/
_______________________________________________
Devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Privacy Statement: https://www.ovirt.org/privacy-policy.html
oVirt Code of Conduct: 
https://www.ovirt.org/community/about/community-guidelines/
List Archives: 
https://lists.ovirt.org/archives/list/[email protected]/message/RNYG2A4LQNOCRHBKWZ2RVYOIIW3TRPBC/
