Re: reproducible builds with btrfs seed feature
On Sat, Oct 13, 2018 at 4:28 PM, Chris Murphy wrote:
> Is it practical and desirable to make Btrfs based OS installation
> images reproducible? Or is Btrfs simply too complex and
> non-deterministic? [1]
>
> The main three problems with Btrfs right now for reproducibility are:
> a. many objects have uuids other than the volume uuid; and mkfs only
> lets us set the volume uuid
> b. atime, ctime, mtime, otime; and no way to make them all the same
> c. non-deterministic allocation of file extents, compression, inode
> assignment, logical and physical address allocation

d. generation: just pick a consistent default. Because the entire image
is made with mkfs and then never mounted rw, it's not a problem.

> - Possibly disallow subvolumes and snapshots

There's no actual mechanism to do either of these with mkfs, so it's
not a problem. And if a sprout is created, it's fine for newly created
subvolumes to follow the usual behavior of having a unique UUID and an
incrementing generation.

Thing is, the sprout will inherit the seed's preset chunk uuid. While
that shouldn't cause a problem, it is a kind of violation of uuid
uniqueness; ultimately I'm not sure how big of a problem it is for such
uuids to spread.

--
Chris Murphy
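To make the "defined set of uuids" idea from the original proposal a bit more concrete, here is a minimal Python sketch of how a fixed UUID per object could be derived from a build identifier. Everything in it is an assumption for illustration: the namespace string, the object labels and the build_id format are hypothetical, and mkfs currently only lets us set the volume uuid, so nothing consumes these values today.

```python
import uuid

# Hypothetical namespace for this illustration; not defined by Btrfs or btrfs-progs.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_DNS, "reproducible-btrfs.example")

# Hypothetical labels for on-disk objects that carry a UUID.
OBJECTS = ("fsid", "chunk-tree", "device", "root-subvolume")


def preset_uuids(build_id):
    """Derive one stable (version 5, name-based) UUID per object label."""
    return {name: uuid.uuid5(NAMESPACE, f"{build_id}/{name}") for name in OBJECTS}


if __name__ == "__main__":
    # The same build_id always yields the same set of UUIDs.
    for name, value in preset_uuids("example-os-1.0").items():
        print(f"{name:15} {value}")
```

The same build identifier always gives the same UUIDs, which is what reproducibility needs, but it also means every copy of that image shares them. That is the uniqueness concern above once a sprout inherits them.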
reproducible builds with btrfs seed feature
Is it practical and desirable to make Btrfs based OS installation
images reproducible? Or is Btrfs simply too complex and
non-deterministic? [1]

The main three problems with Btrfs right now for reproducibility are:
a. many objects have uuids other than the volume uuid; and mkfs only
lets us set the volume uuid
b. atime, ctime, mtime, otime; and no way to make them all the same
c. non-deterministic allocation of file extents, compression, inode
assignment, logical and physical address allocation

I'm imagining reproducible image creation would be a mkfs feature that
builds on Btrfs seed and --rootdir concepts to constrain Btrfs features
to maybe make reproducible Btrfs volumes possible:

- No raid
- Either all objects needing uuids can have those uuids specified by
switch, or possibly a defined set of uuids expressly for this use case,
or possibly all of them can just be zeros (eek? not sure)
- A flag to set all times the same
- Possibly require that target block device is zero filled before
creation of the Btrfs
- Possibly disallow subvolumes and snapshots
- Require the resulting image is seed/ro and maybe also a new compat_ro
flag to enforce that such Btrfs file systems cannot be modified after
the fact.
- Enforce a consistent means of allocation and compression

The end result is that creating two Btrfs volumes from the same input
would yield image files with matching hashes.

If I had to guess, the biggest challenge would be allocation. But it's
also possible that such an image may have problems with "sprouts". A
non-removable sprout seems fairly straightforward and safe; but if a
"reproducible build" type of seed is removed, it seems like removal
needs to be smart enough to refresh *all* uuids found in the sprout: a
hard break from the seed.

The competing file systems are ext4 with make_ext4 fork, and squashfs.
At the moment I'm thinking it might be easier to teach squashfs
integrity checking than to make Btrfs reproducible.
But then I also think restricting Btrfs features, and applying some
requirements to constrain Btrfs to make it reproducible, really
enhances the Btrfs seed-sprout feature.

Any thoughts? Useful? Difficult to implement?

Squashfs might be a better fit for this use case *if* it can be taught
about integrity checking. It does per file checksums for the purpose of
deduplication but those checksums aren't retained for later integrity
checking.

[1] problems of reproducible system images
https://reproducible-builds.org/docs/system-images/
[2] purpose and motivation for reproducible builds
https://reproducible-builds.org/
[3] who is involved?
https://reproducible-builds.org/who/#Qubes%20OS

--
Chris Murphy
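The "matching hashes" goal stated earlier is easy to check once two images exist. Below is a rough Python sketch, not tied to Btrfs in any way: it hashes two image files and, if they differ, reports the byte offset of the first difference, which may help tell a stray uuid or timestamp from an allocation difference. The function names and the 1 MiB read size are my own choices for illustration.

```python
import hashlib
import sys

CHUNK = 1 << 20  # read images 1 MiB at a time


def sha256_of(path):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(CHUNK), b""):
            digest.update(block)
    return digest.hexdigest()


def first_difference(path_a, path_b):
    """Return the byte offset of the first difference, or None if identical."""
    offset = 0
    with open(path_a, "rb") as a, open(path_b, "rb") as b:
        while True:
            block_a, block_b = a.read(CHUNK), b.read(CHUNK)
            if block_a != block_b:
                for i, (x, y) in enumerate(zip(block_a, block_b)):
                    if x != y:
                        return offset + i
                # One file is a prefix of the other.
                return offset + min(len(block_a), len(block_b))
            if not block_a:
                return None  # both files ended with no difference
            offset += len(block_a)


if __name__ == "__main__" and len(sys.argv) == 3:
    img_a, img_b = sys.argv[1], sys.argv[2]
    print(sha256_of(img_a), img_a)
    print(sha256_of(img_b), img_b)
    diff = first_difference(img_a, img_b)
    print("identical" if diff is None else f"first difference at byte {diff}")
```

Run it with the two image paths as arguments. Mapping a reported offset back to a particular Btrfs structure is a separate exercise that this sketch does not attempt.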