On Fri, Oct 12, 2018 at 5:26 PM, Marek Marczykowski-Górecki
<marma...@invisiblethingslab.com> wrote:
> On Fri, Oct 12, 2018 at 03:44:38PM -0600, Chris Murphy wrote:

>> mkfs.btrfs has --rootdir and --shrink features to pre-allocate a
>> volume with files at mkfs time; I have no idea to what degree it
>> depends on kernel code.
>
> Probably not at all, given it works as non-root user too.
> I've tried to run it twice on the same directory (and with the same
> --uuid) on 32MB of data and got different images (~2000 lines of hexdump
> diff). Could be some timestamps, could be something else.

The volume UUID is what --uuid affects. But there are other UUIDs,
including the chunk UUID, which is repeated in every leaf and node
along with the volume UUID; the device UUID; a UUID for each file tree
(subvolume); and so on. Timestamps include atime, otime, mtime, and
ctime. Some objects have all zeros for the UUID, and some items have
only 0.0 for times. I'll float the reproducibility question on the
Btrfs list: whether it's desirable and useful, and how difficult it
would be. I think subsetting Btrfs features to reduce complexity
generally, and thereby improve reproducibility, has merit.
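
To narrow down where two images built from the same directory differ,
comparing them block by block may be easier to read than a hexdump
diff. A minimal sketch, assuming both images are already loaded as
bytes and assuming a 4096-byte block size; the function name is
hypothetical and none of this comes from btrfs-progs:

```python
def differing_blocks(a: bytes, b: bytes, block_size: int = 4096) -> list[int]:
    """Return the byte offsets of the blocks that differ between two images."""
    length = max(len(a), len(b))
    offsets = []
    for off in range(0, length, block_size):
        # Slices past the end of the shorter image compare as different.
        if a[off:off + block_size] != b[off:off + block_size]:
            offsets.append(off)
    return offsets


if __name__ == "__main__":
    img1 = bytes(16384)
    img2 = bytearray(img1)
    img2[5000] = 0xFF  # stand-in for one differing uuid or timestamp byte
    print(differing_blocks(img1, bytes(img2)))  # prints [4096]
```

The offsets could then be matched against the tree blocks reported by a
dump tool to see whether only UUID and timestamp fields differ.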


>> It's also
>> possible with dm-verity or dm-integrity but then that adds back the dm
>> complexity.
>
> Oh, please, no...

Haha...

>
> There are two almost separate aspects here:
>  - image layout (squashfs+ext4, squashfs alone, squashfs+btrfs)
>  - how copy-on-write is achieved (dm-snapshot, overlay fs)

ext4 alone and btrfs alone are also viable. But since ext4 has no
compression, image size grows by maybe a factor of 2. Btrfs has
supported lzo and zlib compression since forever, and zstd since
kernel 4.14, same as squashfs. What's been missing is mksquashfs with
zstd support, which I imagine will be in 5.0. The zstd compression
ratio compares well with the xz currently used by mksquashfs in Fedora
composes, but it takes much less CPU to compress and decompress. So
I'd say go with zstd in any case.
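
The ratio versus CPU trade-off can be measured on sample data before
committing to a compressor. A rough sketch under stated assumptions:
zstd is not in the Python standard library on most installed versions,
so this only compares zlib and xz (via lzma); a zstd binding would be
one more entry in the tuple. The function name is hypothetical and the
input must be non-empty.

```python
import lzma
import time
import zlib


def compare(data: bytes) -> dict[str, tuple[float, float]]:
    """Return {codec: (compressed size / original size, seconds to compress)}."""
    results = {}
    for name, compress in (("zlib", zlib.compress), ("xz", lzma.compress)):
        start = time.perf_counter()
        out = compress(data)
        elapsed = time.perf_counter() - start
        results[name] = (len(out) / len(data), elapsed)
    return results


if __name__ == "__main__":
    sample = b"fedora live image " * 10000
    for codec, (ratio, seconds) in compare(sample).items():
        print(f"{codec}: ratio={ratio:.4f} time={seconds:.4f}s")
```

Numbers from a synthetic sample like this say little about a real root
filesystem; the same function would need to be fed actual image data.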


>
> For reproducibility, squashfs alone is the best option, but does not
> improve integrity checking (but also doesn't make it worse).

I'm not able to estimate how much work it would be to add a file hash
manifest to squashfs, always check it on reads, and return EIO on any
mismatch. But yeah, it'd need user space code in mksquashfs and also
kernel code to support it.
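
The idea can be modeled in user space to show the intended behavior.
This is only a toy sketch, not how squashfs or mksquashfs work today:
it assumes SHA-256, keeps the "image" as an in-memory dict of path to
bytes, and both function names are hypothetical.

```python
import errno
import hashlib


def build_manifest(files: dict[str, bytes]) -> dict[str, str]:
    """Image-build side: record one SHA-256 digest per file."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def verified_read(files: dict[str, bytes], manifest: dict[str, str], path: str) -> bytes:
    """Read side: return the data only if it matches the manifest, else fail with EIO."""
    data = files[path]
    if hashlib.sha256(data).hexdigest() != manifest[path]:
        raise OSError(errno.EIO, "checksum mismatch", path)
    return data


if __name__ == "__main__":
    image = {"etc/hostname": b"live\n"}
    manifest = build_manifest(image)
    image["etc/hostname"] = b"l1ve\n"  # simulate a flipped bit on the medium
    try:
        verified_read(image, manifest, "etc/hostname")
    except OSError as exc:
        print("read failed with errno", exc.errno)
```

A per-file hash means a whole file must be read before any of it can be
trusted, which is one reason block-level checksums are the more common
design.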


> As for copy-on-write, dm-snapshot is quite complex to setup and require
> underlying FS to support write. Also, doesn't allow to write more data
> than original image size (may be an issue for persistent partition
> case). Overlay fs on the other hand works with any underlying fs, you
> can write as much data as you want. And in case of persistent partition,
> you can access that data even if base image (the lower layer) is
> unavailable/broken. I think the only downside of overlay fs is when you
> modify large file it gets copied in full to the upper layer. But I don't
> think that's an issue in this use case.
>
> For me, overlay fs is a clear winner here.
> But as for image layout, it isn't that simple. For reproducibility,
> squashfs alone is better. But if the goal of this change would be also
> improving read errors detection, then it isn't that clear anymore. It
> may be that it takes a simple mkfs.btrfs patch to make it reproducible,
> but it isn't obvious for me at this stage. Also, keeping two layers
> looks like unnecessary complexity.

I agree. Overlayfs works fine with any of the discussed filesystems.
I'd give a slight edge to Btrfs seed+sprout as the overlay mechanism
in the case of persistence on a USB stick: a) checksumming; b)
compression helps improve performance of USB flash drives and reduces
wear; c) the kernel discovers both seed and sprout in early boot by
sprout UUID alone, with no special mount options needed for setup. But
it's a really minor point because a) and b) are still possible with
overlayfs with a new independent btrfs as the upperdir.
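
For reference, the overlayfs variant comes down to one mount call with
lowerdir, upperdir, and workdir options. A small sketch that only
builds the argument list and does not run it; the helper name and all
paths are hypothetical, and the upper and work directories are assumed
to live on the same (for example btrfs) persistent filesystem, which
overlayfs requires:

```python
def overlay_mount_argv(lowerdir: str, upperdir: str, workdir: str, target: str) -> list[str]:
    """Build the argv for an overlay mount; the caller decides whether to run it."""
    options = f"lowerdir={lowerdir},upperdir={upperdir},workdir={workdir}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, target]


if __name__ == "__main__":
    argv = overlay_mount_argv(
        lowerdir="/run/rootfsbase",        # read-only base image (hypothetical path)
        upperdir="/run/persist/upper",     # writable layer on the persistent fs
        workdir="/run/persist/work",       # must be on the same fs as upperdir
        target="/sysroot",
    )
    print(" ".join(argv))
```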


> What do you think about sidestepping this discussion a little and
> replacing dm-snapshot with overlay fs regardless of other changes here?
> That should be doable without any change to image format and will give
> more flexibility there.

Agreed. What I can't tell you offhand is whether livecd-iso-to-disk
would be affected by this in some way, or whether the change policy
applies. But I think it's better to file the change so there's
awareness and coordination: the installer team would have to sign off
on the pull request for lorax; the releng team probably should know
about it because they define their own compose settings (I guess they
often use upstream's defaults but they don't have to); QA might want a
heads up so that if things blow up they know who to ask; and it's also
a good idea to let the SOAS folks know about it. A central point of
filing changes is coordination.

https://fedoraproject.org/wiki/Changes/Policy



-- 
Chris Murphy
