On Thu, Dec 11, 2025 at 6:48 AM Zbigniew Jędrzejewski-Szmek <[email protected]> wrote:
>
> On Fri, Dec 05, 2025 at 04:12:13PM -0500, Dave Cantrell wrote:
> > On my workstation I made a copy of /usr/share/man and removed all of the
> > symlinks in that tree. There are 39163 man pages in that directory. I
> > made two copies: the first to uncompress the pages and the second to
> > compress them all with zstd. Here are the storage results I gathered
> > from 'du -s -h':
> >
> > default/       182M
> > uncompressed/  294M
> > zstd/          182M
>
> I get:
>   uncompressed/ 390M
>   default/      187M
>   zstd/         182M
>
> I used 'zstd -19' to match the max gzip compression we're using.
>
> It seems clear that the change in size is negligible.
>
> We should also measure the time required for compression and
> decompression. I'd posit that decompression is actually more important,
> because that happens on user systems and snappiness makes users happy.
>
> zstd is clearly better here:
>
> $ time zcat man/*.gz >/dev/null
> zcat man/*.gz >/dev/null  1.65s user 0.11s system 99% cpu 1.769 total
> zcat man/*.gz >/dev/null  1.63s user 0.11s system 99% cpu 1.748 total
> zcat man/*.gz >/dev/null  1.67s user 0.12s system 99% cpu 1.792 total
> $ time zstdcat man-zstd/*.zst >/dev/null
> zstdcat man-zstd/*.zst >/dev/null  0.42s user 0.15s system 97% cpu 0.580 total
> zstdcat man-zstd/*.zst >/dev/null  0.39s user 0.15s system 99% cpu 0.545 total
> zstdcat man-zstd/*.zst >/dev/null  0.39s user 0.15s system 99% cpu 0.543 total
>
> Nevertheless, unless somebody is searching over man pages, the decompression
> time of a single page is going to be hard to see.
>
We do have graphical tools that do this sort of thing (KDE's help center,
for example, searches and indexes man pages), and some console shells
(e.g. fish) or shell extensions (oh-my-zsh) implement similar
functionality. So making this faster for them *would* be valuable.

> For compression:
> $ time parallel gzip --best -q -k ::: *
> parallel gzip --best -q -k ::: *  57.36s user 120.09s system 197% cpu 1:29.99 total
> $ time parallel zstd -q -19 ::: *
> parallel zstd -q -19 ::: *  235.64s user 166.12s system 378% cpu 1:46.09 total
>
> Gzip comes out a little bit ahead here. (Though in both cases, the CPU
> doesn't seem to be saturated. Since the IO is negligible, I'd expect the
> CPUs to all be running at 100%, so maybe some tweaking in how the
> compression is invoked could bring this down.) But since this happens
> during package build time, any package that has enough man pages for this
> to be noticeable is probably already taking hours to build, so this is
> not going to matter.

We'd probably want to tell zstd to compress using all available CPU cores,
which it doesn't do by default. This can be done with "zstdmt" or
"zstd -T0" (or "zstd -T<number-of-threads>"). That may improve compression
performance.

-- 
真実はいつも一つ!/ Always, there's only one truth!
_______________________________________________
devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/[email protected]
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
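[Editor's note: a minimal sketch of the "zstd -T0" suggestion discussed
above, assuming zstd is installed; the file names and test input are made
up for illustration, not taken from the benchmarks in this thread.]

```shell
#!/bin/sh
# Sketch: single-threaded vs multithreaded zstd on one input.
set -e
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT

# Stand-in input (a real run would use actual man pages).
seq 1 200000 > "$tmp/page.txt"

# Default invocation: one worker thread.
zstd -q -19 -k -o "$tmp/page.st.zst" "$tmp/page.txt"

# -T0: zstd picks a worker count matching the number of online cores.
# "zstdmt" is equivalent to "zstd -T0".
zstd -q -19 -T0 -k -o "$tmp/page.mt.zst" "$tmp/page.txt"

# Both forms decompress back to identical content.
zstd -d -q -o "$tmp/roundtrip.txt" "$tmp/page.mt.zst"
cmp "$tmp/page.txt" "$tmp/roundtrip.txt" && echo "roundtrip OK"
```

One caveat: -T0 parallelizes within a single compression job, so it helps
most on large inputs. Man pages are small, so the per-file parallelism the
benchmarks above already use (one zstd process per file via parallel, or
xargs -P) is likely what actually saturates the cores.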
