After the cloud sprint I ended up exploring moving the internal image
creation at Hadron Industries (my employer) from vmdebootstrap plus a
bunch of custom code to FAI.
Summary: FAI reminds me of the best of live-build, except that I think
I'll be able to extend and debug it more easily.
We produce images for a derivative of stretch designed for government
environments with a focus on some proprietary spatial computing
Much of what we deliver is real hardware, although it's cloud-like in
that the systems should be identical with little state. You can change
what "slot" a system is assigned to in our provisioning system, push a
new config, and any system should replace any other system. Caveats and
reality apply. However, these images are also used for VMs and to a
lesser extent containers. There's currently no use of these images on
public clouds, although some of our customers are investigating
technology based on public clouds.
For a variety of reasons we use btrfs for some things.
Our customization and configuration includes:
* UEFI for boot (moving to a complex secure boot infrastructure in a
bit, but we're not there yet)
* On initial boot, images resize themselves, growing the partition and
resizing the filesystem to the maximum size.
* Because we use btrfs, and because btrfs is more concerned with the
filesystem's UUID than most filesystems are, it's important that the
image choose a unique filesystem UUID on first boot. Unfortunately, you
cannot change the UUID of a mounted filesystem, so we need an
initramfs hook to do this.
* Installing KDE
* Various forms of serial console support
* Installing a couple of proprietary packages that the image will not be
able to access repositories for. (The repositories require client
certificates; the initial image is not permitted to have keys on it)
* Adding some users, populating authorized_keys, etc.
* Configuring networking
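The first-boot steps in the list above have an ordering constraint that
is easy to get wrong, so here is a minimal sketch of it. It only builds
the command lists and runs nothing. The choice of growpart, the function
name, and the `/dev/sdXN` partition naming are assumptions for
illustration, not what our hook actually does; a real initramfs hook
would be shell rather than Python. `btrfstune -u` assigns a random UUID
and needs `-f` to skip its confirmation prompt.

```python
def first_boot_commands(disk, part_num, mountpoint="/"):
    """Return (before_mount, after_mount) command lists for first boot.

    Hypothetical helper: assumes partitions are named disk + number
    (true for /dev/sda1, not for NVMe-style /dev/nvme0n1p1).
    """
    partition = f"{disk}{part_num}"
    before_mount = [
        # Grow the partition to fill the disk.
        ["growpart", disk, str(part_num)],
        # Change the filesystem UUID while it is still unmounted;
        # this is why the step has to live in the initramfs.
        ["btrfstune", "-f", "-u", partition],
    ]
    after_mount = [
        # btrfs is resized online, so this runs once the root is mounted.
        ["btrfs", "filesystem", "resize", "max", mountpoint],
    ]
    return before_mount, after_mount


if __name__ == "__main__":
    before, after = first_boot_commands("/dev/sda", 2)
    for cmd in before + after:
        print(" ".join(cmd))
```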
Detailed review of FAI:
In general I've found that FAI does more of the tasks that I'm hoping
for out of an image creation tool than vmdebootstrap. I've also found
that FAI is hugely faster than vmdebootstrap. With vmdebootstrap, it
takes long enough to create an image that the build-test-debug cycle
significantly negatively impacts my developer experience. FAI's caching
and performance optimizations are good enough that I find my time being
used more efficiently.
I found myself taking advantage of classes in FAI to make it easier to
produce related images. With the previous approach I had been focused
on minimizing variance between images because I had to manage deciding
what was shared between images and what differed.
However, I found that it was worth having a couple of different types of
image once I had classes available.
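To make the class idea concrete: as far as I understand FAI, each image
gets an ordered list of classes, and when several variants of a file
exist, the variant named after the highest-priority class (the last
matching one in the list) wins. The sketch below models only that
selection rule. The function name and the class names other than DEFAULT
are hypothetical, and this is not FAI's own code.

```python
def pick_class_variant(available, classes):
    """Pick which per-class variant of a file applies to an image.

    available: set of class names for which a variant exists.
    classes:   the image's classes, ordered from low to high priority.
    Returns the winning class name, or None if nothing matches.
    """
    for cls in reversed(classes):
        if cls in available:
            return cls
    return None


if __name__ == "__main__":
    variants = {"DEFAULT", "CONTAINER"}
    # A hardware image falls back to the shared default...
    print(pick_class_variant(variants, ["DEFAULT", "DEBIAN", "UEFI"]))
    # ...while a container image picks up its own variant.
    print(pick_class_variant(variants, ["DEFAULT", "DEBIAN", "CONTAINER"]))
```

This is what removes the bookkeeping: shared files live under a common
class, and only the differences need a class of their own.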
FAI does a clean job of handling package installation. You can't really
do package installation in the bootstrap phase. You may need multiple
repositories, and you cannot depend on debootstrap being able to act as
a package resolver. There are too many things it does not handle. FAI
does this well.
For vmdebootstrap I did all my customization in Python. I found that
shell, especially with abstractions like ainsl and fcopy, significantly
simplified my code. Yes, I could have written abstractions myself.
However, the neat thing about a good image tool is that, over a number
of years, it ought to do a better job of knowing what abstractions I'll
need than I would at first guess. FAI met this goal.
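For readers who have not met ainsl ("append if no such line"), here is
roughly what I would have had to write myself in Python. It is a
simplified sketch: it only does exact line matching, whereas the real
ainsl also accepts a pattern, and the function name is my own.

```python
from pathlib import Path


def append_if_no_such_line(path, line):
    """Append line to the file at path unless an identical line exists.

    Returns True if the file was changed, False otherwise.
    Running it twice with the same arguments is harmless.
    """
    target = Path(path)
    text = target.read_text() if target.exists() else ""
    if line in text.splitlines():
        return False
    # Avoid gluing the new line onto an unterminated last line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    with target.open("a") as handle:
        handle.write(f"{separator}{line}\n")
    return True
```

The value is in the idempotence: a customization script can be rerun
during debugging without piling up duplicate lines.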
I ended up reporting a number of bugs in FAI. None of them were
blockers, but they were promptly addressed. I find the FAI code
relatively easy to review and patch.
I found my FAI-based work easier to extend than my previous work.
During the project an urgent requirement appeared both for producing
container images and for producing upgrade images for some older
non-UEFI machines. These requirements were easier to meet than
requirements of similar complexity that came up with the previous round
of the project. I'm not quite sure I've hit the break-even point for
saving time with the rewrite and technology change, but I expect to hit
FAI is of course not perfect. Here are the things that I noticed:
FAI combines code and user configuration data in a manner that does not
thrill me. As an example, the script for installing grub lives in the
configuration space. It's not entirely trivial. Part of me wishes that
it lived in a library that was included from the configuration space.
On the other hand, I found that it didn't quite work for non-UEFI btrfs
without a boot partition, and I was glad that it was easy to customize.
Getting this separation right is hard, and I guess if pushed I'd agree
that having more code in the configuration space is a better direction
to err in than the other way around.
There are a lot of odd defaults in the configuration space. It's
Thomas's software, and he promotes some choices that seem different from
the rest of the project. As an example, FAI leaves installed systems
mounting disks nobarrier with unsafe dpkg IO enabled. That's really
neat for the initial install: it performs much better. I'm less
convinced that's a reasonable default to leave a system in, though. "But
it's only configuration, Sam." Perhaps, but the configuration is
complex enough I bet a lot of people leave it that way.
I find that I need a bit of a driver script to run and build all the
image classes I need. I'm not actually sure there's any way around
this. I think I can reduce complexity when I migrate to FAI 5.3 and
use the -N argument to fai-diskimage (thanks Thomas!).
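A driver of this kind can be quite small. The sketch below is not my
actual script: the image names, class lists, and sizes are made up, and
it only uses the fai-diskimage options I'm sure of (-v, -u for the
hostname, -S for the image size, -c for the class list). It does not use
-N. By default it only returns the commands; pass run=True to execute
them.

```python
import subprocess

# Hypothetical image definitions; the names and classes are placeholders.
IMAGES = {
    "hardware": {"classes": ["DEBIAN", "STRETCH", "UEFI"], "size": "8G"},
    "container": {"classes": ["DEBIAN", "STRETCH", "CONTAINER"], "size": "2G"},
}


def diskimage_command(hostname, classes, size, output):
    """Build one fai-diskimage command line as an argument list."""
    return [
        "fai-diskimage", "-v",
        "-u", hostname,
        "-S", size,
        "-c", ",".join(classes),
        output,
    ]


def build_all(images, run=False):
    """Build (and optionally run) one command per image definition."""
    commands = []
    for name, spec in images.items():
        cmd = diskimage_command(name, spec["classes"], spec["size"],
                                f"{name}.raw")
        commands.append(cmd)
        if run:
            # Stop at the first failing image build.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    for cmd in build_all(IMAGES):
        print(" ".join(cmd))
```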
fai-server has a lot of dependencies that it doesn't need just for
diskimage. Well, a lot of recommends at least.
All in all, I'm very pleased.