Hi folks.
After the cloud sprint I ended up exploring moving internal image creation at Hadron Industries (my employer) from vmdebootstrap plus a bunch of custom code to FAI.

Summary: FAI reminds me of the best of live-build, except that I think I'll be able to extend and debug it more easily.

We produce images for a derivative of stretch designed for government environments, with a focus on some proprietary spatial computing applications. Much of what we deliver is real hardware, although it's cloud-like in that the systems should be identical with little state. You can change what "slot" a system is assigned to in our provisioning system, push a new config, and any system should be able to replace any other system. Caveats and reality apply. However, these images are also used for VMs and, to a lesser extent, containers. There's currently no use of these images on public clouds, although some of our customers are investigating technology based on public clouds. For a variety of reasons we use btrfs for some things.

Our customization and configuration includes:

* UEFI for boot (moving to a complex secure boot infrastructure in a bit, but we're not there yet)
* On initial boot, images resize themselves, growing the partition and resizing the filesystem to the maximum size.
* Because we use btrfs, and because btrfs is more concerned with the filesystem's UUID than most filesystems, it's important that the image choose a unique filesystem UUID on first boot. Unfortunately, you cannot change the UUID of a mounted filesystem, so we need an initramfs hook to do this.
* Installing KDE
* Various forms of serial console support
* Installing a couple of proprietary packages that the image will not be able to access repositories for. (The repositories require client certificates; the initial image is not permitted to have keys on it.)
* Adding some users, populating authorized_keys, etc.
* Configuring networking

Detailed review of FAI:

In general I've found that FAI does more of the tasks that I'm hoping for out of an image creation tool than vmdebootstrap does. I've also found that FAI is much faster than vmdebootstrap. With vmdebootstrap, it takes long enough to create an image that the build-test-debug cycle significantly hurts my developer experience. FAI's caching and performance optimizations are good enough that I find my time being used more efficiently.

I found myself taking advantage of classes in FAI to make it easier to produce related images. With the previous approach I had focused on minimizing variance between images, because I had to manage deciding what was shared between images and what differed. However, I found that it was worth having a couple of different types of image once I had classes available.

FAI does a clean job of handling package installation. You can't really do package installation in the bootstrap phase. You may need multiple repositories, and you cannot depend on debootstrap being able to act as a package resolver. There are too many things it does not handle. FAI does this well.

For vmdebootstrap I did all my customization in Python. I found that shell, especially with abstractions like ainsl and fcopy, significantly simplified my code. Yes, I could have written abstractions myself. However, the neat thing about a good image tool is that, over a number of years, it ought to do a better job of knowing what abstractions I'll need than I'll come up with at first guess. FAI met this goal.

I ended up reporting a number of bugs in FAI. None of them were blockers, but they were promptly addressed. I find the FAI code relatively easy to review and patch.

I found my FAI-based work easier to extend than my previous work. During the project, urgent requirements appeared both for producing container images and for producing upgrade images for some older non-UEFI machines.
These requirements were easier to meet than requirements of similar complexity that came up in the previous round of the project. I'm not quite sure I've hit the break-even point for saving time with the rewrite and technology change, but I expect to hit that soon.

FAI is of course not perfect. Here are the things that I noticed:

FAI combines code and user configuration data in a manner that does not thrill me. As an example, the script for installing grub lives in the configuration space. It's not entirely trivial. Part of me wishes that it lived in a library that was included from the configuration space. On the other hand, I found that it didn't quite work for non-UEFI btrfs without a boot partition, and I was glad that it was easy to customize. Getting this separation right is hard, and I guess if pushed I'd agree that having more code in the configuration space is a better direction to err in than the other way around.

There are a lot of odd defaults in the configuration space. It's Thomas's software, and he promotes some choices that seem different from the rest of the project. As an example, FAI leaves installed systems mounting disks nobarrier with unsafe dpkg IO enabled. That's really neat for the initial install: it performs much better. I'm less convinced that's a reasonable default to leave a system in, though. "But it's only configuration, Sam." Perhaps, but the configuration is complex enough that I bet a lot of people leave it that way.

I find that I need a bit of a driver script to run and build all the image classes I need. I'm not actually sure there's any way around this. I think I can reduce complexity when I migrate to fai 5.3 and use the -N argument to fai-diskimage (thanks Thomas!)

fai-server has a lot of dependencies that it doesn't need just for diskimage. Well, a lot of recommends at least.

All in all, I'm very pleased.
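
P.S. To give a rough idea of what the driver script mentioned above might look like, here is a minimal Python sketch. It is an illustration only, not the script actually in use: the image names, class lists, sizes, and config space location are all hypothetical, and the fai-diskimage options used (-v, -u, -S, -s, -c) are as I understand them from the fai-diskimage manual page, so check them against your FAI version. By default it only prints the commands rather than running them.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass
class ImageSpec:
    """One image to build. Every value used below is hypothetical."""
    hostname: str        # passed to fai-diskimage -u
    classes: List[str]   # passed to fai-diskimage -c, comma separated
    size: str            # passed to fai-diskimage -S, e.g. "4G"
    output: str          # path of the raw image to create


def diskimage_command(spec: ImageSpec, config_space: str) -> List[str]:
    """Build the fai-diskimage argument list for one image without running it."""
    return [
        "fai-diskimage",
        "-v",
        "-u", spec.hostname,
        "-S", spec.size,
        "-s", config_space,
        "-c", ",".join(spec.classes),
        spec.output,
    ]


def build_all(specs: List[ImageSpec], config_space: str,
              dry_run: bool = True) -> List[List[str]]:
    """Print each command; run it only when dry_run is False (needs root)."""
    commands = [diskimage_command(spec, config_space) for spec in specs]
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            # Stop at the first failing build instead of carrying on.
            subprocess.run(argv, check=True)
    return commands


# Hypothetical image definitions; the class names are placeholders for
# whatever classes exist in your own configuration space.
SPECS = [
    ImageSpec("uefi-box", ["DEBIAN", "FAIBASE", "SITE_UEFI"], "8G", "uefi-box.raw"),
    ImageSpec("legacy-box", ["DEBIAN", "FAIBASE", "SITE_BIOS"], "8G", "legacy-box.raw"),
]

if __name__ == "__main__":
    # Hypothetical config space location; dry run, so nothing is built.
    build_all(SPECS, "file:///srv/fai/config")
```

Keeping the per-image differences in a small table like SPECS means that adding another image type is a one-line change, with everything else living in FAI classes.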