On 12/5/25 16:46, Neal Gompa wrote:
> On Fri, Dec 5, 2025 at 4:29 PM Dave Cantrell <[email protected]> wrote:
>>
>> On 12/5/25 16:14, Neal Gompa wrote:
>>> On Fri, Dec 5, 2025 at 4:11 PM Dave Cantrell <[email protected]> wrote:
>>>>
>>>> On 12/5/25 15:47, Neal Gompa wrote:
>>>>> On Fri, Dec 5, 2025 at 2:20 PM Dave Cantrell <[email protected]> wrote:
>>>>>>
>>>>>> On 11/28/25 13:07, Neal Gompa wrote:
>>>>>>> Hey folks,
>>>>>>>
>>>>>>> This has been percolating in my mind for a little while now, but we've
>>>>>>> been going through and switching things to zstd across the board (most
>>>>>>> recently initramfs, but we did btrfs compression, zram, and even rpm
>>>>>>> previously), and I wonder if we want to also switch man and info pages
>>>>>>> from gzip to zstd.
>>>>>>>
>>>>>>> This has already been done at least once with OpenMandriva, so I know
>>>>>>> it is possible with an RPM distribution in a reasonable fashion.
>>>>>>>
>>>>>>> At least man-db has support for zstd compression and just needs to be
>>>>>>> told to prefer it at build time, and OpenMandriva has a patch for
>>>>>>> texinfo[1]. If we want to do it, it's not hard to accomplish
>>>>>>>
>>>>>>> Is there a compelling reason that we shouldn't consider migrating to
>>>>>>> zstd in an upcoming Fedora release?
>>>>>>>
>>>>>>>
>>>>>>> [1]:
>>>>>>> https://github.com/OpenMandrivaAssociation/texinfo/blob/master/texinfo-6.7-zstd-compression.patch
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> I personally do not see the point in compressed man pages anymore. A
>>>>>> long time ago we would preformat man pages too since doing that was
>>>>>> resource intensive and even Linux systems had both man pages and catman
>>>>>> pages. But we don't need any of that these days.
>>>>>
>>>>> It is tempting, but the usage of Linux on really constrained systems
>>>>> (such as single board computers) are still extremely common, so I
>>>>> think there's still value in compression.
>>>>> And it is also becoming more
>>>>> common to make larger and wordier man pages, rather than splitting
>>>>> them into separate help tooling. The compression helps a little bit
>>>>> with that. I think it's definitely valuable for info pages, which are
>>>>> often huge.
>>>>>
>>>>> (As an aside, I *barely* remember the time when we preformatted man
>>>>> pages, that was already going away by the time I started to use
>>>>> Linux...)
>>>> Constrained systems are a reasonable concern, but which ones out there now
>>>> have storage constraints where changing the compression used on man pages
>>>> would help? I ask because I really don't know.
>>>>
>>>> On my workstation I made a copy of /usr/share/man and removed all of the
>>>> symlinks in that tree. There are 39163 man pages in that directory. I
>>>> made two copies. The first to uncompress the pages and the second to
>>>> compress them all with zstd. Here's the storage results I gathered from
>>>> 'du -s -h':
>>>>
>>>>     default/         182M
>>>>     uncompressed/    294M
>>>>     zstd/            182M
>>>>
>>>> Other comparisons:
>>>>     /usr/share/doc         439M
>>>>     /usr/share/licenses     47M
>>>>
>>>> At least on my system, changing the man pages to zstd makes no difference.
>>>> But keeping them compressed saves me 112M. Are there really systems
>>>> where something like 112M is a concern? I also have 3504 packages
>>>> installed on my workstation, so I sort of expect to be using storage space.
>>>>
>>>
>>> Yes, unfortunately. Especially when you multiply it with containers
>>> and other things that have become more common now. At least Fedora's
>>> default setup doesn't have space contention issues like it used to,
>>> but people still put Fedora on small SD cards for these things...
>>
>> I don't have an SD card around here smaller 8GB. I know they have to exist
>> (or did). I had CompactFlash cards that were 32M, but that's now ancient
>> history.
>>
>> Sorry, I'm just having trouble seeing this as a real problem.
>> But I don't
>> work on single board systems all the time so I don't know what that
>> landscape is like now. Are there actual real targets we are trying to or
>> want to support that this is a real problem? Or is this more of a
>> hypothetical concern or an aspirational goal? Either outcome is fine.
>>
>
> It's still an issue in some cases where 4GB SD cards are still a
> thing, and even 8GB ones. The trend was to make it less of a thing,
> but with flash memory scarcity due to AI, I think it's going to come
> back in a big way. :(

That's a concern, but that's still not what I was looking for. I wanted to
know whether we have any real examples of users hitting this problem, or
targets we want to support where it is a problem, or whether this is just a
concern we think will happen.

>>>> If we make this change, we run the risk of breaking man page symlinks.
>>>> Rather than using ".so" man page references that we could compress the
>>>> same way as man pages, we have symlinks to the compressed form of the
>>>> target man page. So, take gawk for example. We have
>>>> /usr/share/man/man1/awk.1.gz which is a symlink to
>>>> /usr/share/man/man1/gawk.1.gz. A cleaner method to me is to install
>>>> /usr/share/man/man1/gawk.1 and then do this for awk.1:
>>>>
>>>>     echo ".so man1/gawk.1" > /usr/share/man/man1/awk.1
>>>>
>>>> Then you can compress *.1 in the package building process and not have to
>>>> have a more complicated loop that figures out the symlink farm.
>>>>
>>>> I would still prefer uncompressed man pages. If we were to still compress
>>>> them, I would prefer to keep the file ending consistent regardless of the
>>>> compression format used and make man(1) deal with that transparently.
>>>>
>>>> But all of that aside, if space constrained systems is a concern, having a
>>>> way to keep man pages but exclude /usr/share/doc would save more space.
>>>> Finer-grain control on --nodocs.
>>>>
>>>
>>> To be fair, we should probably document (or create a macro) for man
>>> page symlinks. I do it correctly for pkgconf and other packages I
>>> maintain where it's needed, but not many people know about this.
>> That would be a nice improvement to have. I think both documenting it and
>> making a macro would be useful. I'm not a fan of burying everything in a
>> spec file macro because it makes spec files unreadable--or at least not
>> useful in a standalone way. You have to then go learn all of hyperspecific
>> macros we've defined or preprocess the spec file to see what's happening.
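
For illustration, here is a rough sketch of the ".so" stub approach quoted
above, run in a scratch directory rather than a real buildroot. The mktemp
tree and the stand-in gawk.1 content are made up for the example. It also
assumes man(1) locates the compressed target of a ".so" request, which I
believe man-db does but have not checked for every compressor.

```shell
# Hypothetical scratch tree standing in for a package buildroot;
# nothing here touches the real /usr/share/man.
root=$(mktemp -d)
mandir="$root/usr/share/man"
mkdir -p "$mandir/man1"

# Stand-in for the real gawk page (placeholder content only).
printf '.TH GAWK 1\n.SH NAME\ngawk - pattern scanning and processing language\n' \
    > "$mandir/man1/gawk.1"

# Instead of a symlink awk.1.gz -> gawk.1.gz, install a one-line stub
# that sources the real page. The stub names no compression suffix.
echo ".so man1/gawk.1" > "$mandir/man1/awk.1"

# Compress every page the same way. Switching compressors would only
# change this one command, not the stub contents.
find "$mandir" -type f -name '*.1' -exec gzip -9 {} +

# Both entries are now regular compressed files, with no symlink farm.
ls "$mandir/man1"
```

The point is that the stub stays valid whichever suffix the compression
step produces, so the packaging loop never has to rewrite link targets.
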
>>
>
> Do you happen to know if there's an info page equivalent?

Not that I am aware of. Info has cross references, but the whole info
system is a little different. Info pages are generated from texinfo files
so in a way they are like catman pages, but in other ways they are not.
Texinfo documentation is generated and cross references go to either
different files or to different URLs if you are generating online
documentation.

On my various systems there are no symlinks in /usr/share/info, but that
doesn't mean they don't exist.

The .so thing for man pages is basically #include. I have not seen
something like that for info pages.

--
Dave Cantrell <[email protected]>
Red Hat, Inc. | Boston, MA | EST5EDT
--
_______________________________________________
devel mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/[email protected]
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
