On Mon, Dec 21, 2020, at 11:28 AM, Ben Cotton wrote:
> https://fedoraproject.org/wiki/Changes/RPMCoW
> 
> 
> == Summary ==
> 
> RPM Copy on Write provides a better experience for Fedora Users as it
> reduces the amount of I/O and offsets CPU cost of package
> decompression. RPM Copy on Write uses reflinking capabilities in
> btrfs, which is the default filesystem in Fedora 33.
> 
> == Owners ==
> 
> * Name: [[User:malmond|Matthew Almond]], [[User:dcavalca|Davide Cavalca]]
> * Email: malm...@fb.com, dcava...@fb.com
> 
> 
> == Detailed description ==
> 
> Installing and upgrading software packages is a standard part of
> managing the lifecycle of any operating system. For the entire
> lifecycle of Fedora, all software is packaged and distributed using
> the RPM file format. This proposal changes how software is downloaded
> and installed, leaving the distribution process unmodified.
> 
> === Current process ===
> 
> # Resolve packaging request into a list of packages and operations
> # Download and verify new packages
> # Install and/or upgrade packages sequentially using RPM files,
> decompressing, and writing a copy of the new files to storage.
> 
> === New process ===
> 
> # Resolve packaging request into a list of packages and operations
> # Download and '''decompress''' packages into a '''locally optimized''' rpm 
> file

Please verify the signature on the downloaded RPM before decompressing it.  (Do 
we do this already?)

> # Install and/or upgrade packages sequentially using RPM files, using
> '''reference linking''' (reflinking) to reuse data already on disk.

Sounds like a great improvement!  Is there any real-world data on how much time 
it saves, how it changes disk usage, or how many SSD writes it avoids?

> 
> The outcome is intended to be the same, but the order of operations is
> different.
> 
> # Decompression happens inline with download. This has a positive
> effect on resource usage: downloads are typically limited by
> bandwidth. Decompression and writing the full data into a single file
> per rpm is essentially free. Additionally: if there is more than one
> download at a time, a multi-CPU system can be better utilized. All
> compression types supported in RPM work because this uses the rpm I/O
> functions.

As I mentioned above, I think each chunk should also be verified before it is 
decompressed.
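
To illustrate the ordering I have in mind, here is a minimal sketch in Python. 
It is not how dnf or rpm2extents work today. The function name is made up, zlib 
stands in for whichever payload compression the RPM uses, and a plain SHA-256 
digest stands in for a real signature check.

```python
import hashlib
import zlib


def verify_then_decompress(chunks, expected_sha256):
    """Hash downloaded chunks as they arrive; decompress only if the digest matches.

    Hypothetical helper: `chunks` is any iterable of bytes objects and
    `expected_sha256` is a hex digest obtained from a trusted source.
    """
    hasher = hashlib.sha256()
    buffered = bytearray()
    for chunk in chunks:
        hasher.update(chunk)   # hashing happens inline with the download
        buffered += chunk      # nothing is decompressed yet
    if hasher.hexdigest() != expected_sha256:
        raise ValueError("digest mismatch; refusing to decompress")
    return zlib.decompress(bytes(buffered))
```

The point is only that untrusted bytes never reach the decompressor. Verifying 
chunk by chunk would additionally need per-chunk digests, which this sketch 
does not assume exist.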

> # RPMs are cached on local storage between downloading and
> installation time as normal. This allows DNF to defer actual RPM
> installation to when all the RPMs are available. This is unchanged.
> # The file format for RPMs is different with Copy on Write. The
> headers are identical, but the payload is different. There is also a
> footer.
> ## Files are converted (“transcoded”) locally during download using
> <code>/usr/bin/rpm2extents</code> (part of rpm codebase). The format
> is not intended to be “portable” - i.e. copying the files from the
> cache is not supported.

I think these should be made to be portable.  How many variants of these are 
there?  Would it be difficult to make the transcoder also understand RPMs 
transcoded for a different platform/setup?  Eventually, I'd like to see 
additional signatures added to the RPM for each of the variants so RPM itself 
can do the verification at install time, avoiding a transcode to the 
"canonical" format.  (I suppose this might require a build-time or sign-time 
transcode to each of the other variants.)  Until then, I'd like to ensure that 
the package signatures are being verified in a secure manner, which would be 
necessary for the plugin to be able to install packages not built with multiple 
signatures/digests.

Would it be practical to just have a single format aligned to the largest page 
size known, leaving fs holes as necessary on systems with smaller page sizes?


> ## Regular RPMs use a compressed .cpio based payload. In contrast,
> extent based RPMs contain uncompressed data aligned to the fundamental
> page size of the architecture, e.g. 4KiB on x86_64. This alignment is
> required for <code>FICLONERANGE</code> to work. Only files are
> represented in the payload, other directory entries like symlinks,
> device nodes etc are constructed entirely from rpm header information.
> Files are referenced by their digest, so identical files are
> de-duplicated.

How are hardlinks in an RPM handled?  Do they stay as hardlinks or become 
reflinks only, losing the hardlink status?  They should stay hardlinks, in my 
opinion.
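
To check my reading of the payload layout, here is a rough sketch in Python. 
The names and return shape are my own invention, not taken from rpm2extents, 
and I am assuming SHA-256 file digests. It places each unique file at a 
page-aligned offset and keys it by digest, so identical files are stored once.

```python
import hashlib


def align_up(n, page_size):
    """Round n up to the next multiple of page_size."""
    return (n + page_size - 1) // page_size * page_size


def layout(files, page_size=4096):
    """Lay out (path, data) pairs as page-aligned, de-duplicated extents.

    Returns (path -> digest, digest -> (offset, length), total payload size).
    """
    digest_by_path = {}
    extent_by_digest = {}
    cursor = 0
    for path, data in files:
        digest = hashlib.sha256(data).hexdigest()
        digest_by_path[path] = digest
        if digest not in extent_by_digest:      # identical content is stored once
            extent_by_digest[digest] = (cursor, len(data))
            cursor = align_up(cursor + len(data), page_size)
    return digest_by_path, extent_by_digest, cursor
```

With this model, every offset produced for a 64KiB page size is also a 
multiple of 4KiB, which is why I asked about a single format aligned to the 
largest page size. The cost is extra padding (or holes) after small files.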

> ## The footer currently has three sections
> ### Table of original (rpm) file digests, used to validate the
> integrity of the download in dnf.
> ### Table of digest → offset used when actually installing files.
> ### An 8-byte signature at the end of the file, used to differentiate
> between traditional RPMs and extent based.

I think this magic number "signature" should vary based on the items that cause 
the format to change.

What happens if you try to use a transcoded RPM on a non-compatible system?
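
As a sketch of what I mean, in Python: the trailer below is hypothetical (a 
4-byte tag I made up plus the page size as a little-endian 32-bit integer) and 
is not the magic number the current implementation writes. It shows how a 
reader could tell a traditional RPM from an extent-based one, and refuse an 
extent-based one built for a different alignment.

```python
import os
import struct

# Hypothetical 8-byte trailer: 4-byte tag + page size (little-endian u32).
TAG = b"XTNT"


def make_magic(page_size):
    """Build the assumed trailer for a given alignment."""
    return struct.pack("<4sI", TAG, page_size)


def classify(path, page_size):
    """Return 'traditional', 'extent' or 'extent-incompatible' from the last 8 bytes."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() < 8:
            return "traditional"
        f.seek(-8, os.SEEK_END)
        tag, size = struct.unpack("<4sI", f.read(8))
    if tag != TAG:
        return "traditional"
    return "extent" if size == page_size else "extent-incompatible"
```

A reader that gets "extent-incompatible" could then fail cleanly or fall back 
to re-downloading, rather than attempting a reflink at a misaligned offset.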

> 
> === Notes ===
> 
> # The headers are preserved bit for bit during transcoding. This
> preserves signatures. The signatures cover the main header blob, and
> the main header blob ensures the integrity of data in two ways:
> ## Each file with content has a digest. Originally this was md5, but
> today it’s usually sha256. In normal RPM this is only used to verify
> the integrity of files, e.g. <code>rpm -V</code>. With CoW we use this
> as a content key.
> ## There is/are one or two digests (<code>PAYLOADDIGEST</code> and
> <code>PAYLOADDIGESTALT</code>) covering the payload archive
> (compressed cpio). The header value is preserved, but transcoded RPMs
> do not preserve the original structure so RPM’s pre-installation
> verification (controlled by <code>%_pkgverify_level</code>) will fail.
> <code>dnf-plugin-cow</code> disables this check in dnf because it
> verifies the whole file digest which is captured during
> download/transcoding. The second one is likely used for delta rpm.
> # This is untested, and possibly incompatible with delta RPM (drpm).
> The process for reconstructing an rpm to install from a delta is
> expensive from both a CPU and I/O perspective, while only providing
> marginal benefits on download size. It is expected that having delta
> rpm enabled (which is the default) will be handled gracefully.

https://github.com/rpm-software-management/rpm/pull/880 added DIGESTALT, 
apparently to help reduce this CPU usage problem.  I don't know if it's 
actually used by anything, but it is much newer than I'd have guessed (2019 
October).

> # Disk space requirements are expected to be marginally higher than
> before: all new packages or updates will consume their installed size
> before installation instead of about half their size (regular rpms
> with payloads still cost space).
> # <code>rpm-plugin-reflink</code> will fall back to simple file
> copying when the destination path is not on the same
> filesystem/subvolume. A common example is <code>/boot</code> and/or
> <code>/boot/efi</code>.
> # The system will still work on other filesystem types, but will
> ''always'' fall back to simple copying. This is expected to be
> slightly slower than not enabling CoW because the source for copying
> will be the decompressed data.

Any testing to see the speed impact?
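
For anyone who wants to measure it, the fallback behaviour described above can 
be approximated with a small Python sketch. This is my own illustration, not 
the code in rpm-plugin-reflink. The ioctl number is the FICLONERANGE value from 
linux/fs.h as computed for x86_64 (I am assuming it matches on your 
architecture), and the function name is made up.

```python
import fcntl
import os
import struct

# FICLONERANGE = _IOW(0x94, 13, struct file_clone_range); value for x86_64.
# Assumption: other architectures may encode the ioctl number differently.
FICLONERANGE = 0x4020940D


def clone_or_copy(src_fd, src_off, length, dst_fd, dst_off):
    """Try to reflink a byte range; fall back to a plain copy on any OSError.

    Returns "reflink" or "copy" so the caller can see which path was taken.
    """
    # struct file_clone_range { s64 src_fd; u64 src_offset, src_length, dest_offset; }
    arg = struct.pack("=qQQQ", src_fd, src_off, length, dst_off)
    try:
        fcntl.ioctl(dst_fd, FICLONERANGE, arg)
        return "reflink"
    except OSError:
        pass  # unsupported filesystem, different filesystem, misalignment, ...
    done = 0
    while done < length:
        chunk = os.pread(src_fd, min(length - done, 1 << 20), src_off + done)
        if not chunk:
            raise IOError("unexpected end of source file")
        view = memoryview(chunk)
        while view:  # pwrite may write fewer bytes than requested
            written = os.pwrite(dst_fd, view, dst_off + done)
            view = view[written:]
            done += written
    return "copy"
```

Timing this against a large file on btrfs and on ext4 should give a rough idea 
of the cost of the copy path compared with the reflink path.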

> # For systems that enable transparent filesystem compression: every
> file will continue to be decompressed from the original rpm, and then
> transparently re-compressed by the filesystem. There is no effective
> change here. There is a future project to investigate alternate
> distribution mechanics to provide parallel versions of file content
> pre-compressed in a filesystem specific format, reducing both CPU
> costs and I/O. It is expected that this will result in slightly higher
> network utilization because filesystem compression is purposely
> restricted to allow random I/O.
> # Current implementation of <code>dnf-plugin-cow</code> is in Python,
> but it looks possible to implement this in <code>libdnf</code> instead
> which would make it work in <code>packagekit</code>.
> 
> === Performance Metrics ===
> 
> Ballpark performance difference is about half the duration for file
> download+install time. A lot of rpms are very small, so it’s difficult
> to see/measure. Larger RPMs give much clearer signal.
> 
> (Actual numbers/charts will be supplied in Jan 2021)

Seems like a very nice optimization!  Thanks for working on it!


V/r,
James Cassell