Hello, developers.

I'd like to raise the following item for discussion: making xz the
default compressor used by Portage for documentation, man pages and
info files. That is, the equivalent of:
PORTAGE_COMPRESS=xz
in make.globals.
Rationale: xz-utils is quite widespread nowadays and it is part of
the @system set. It can achieve a better compression ratio than bzip2,
with faster decompression at the same time.
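
A rough way to check the ratio and decompression-speed claim on a given
file is sketched below. It assumes Python's standard bz2 and lzma modules
as stand-ins for the bzip2 and xz binaries; the function name compare()
and the sample text are made up for illustration, and timings on small
inputs are noisy.

```python
import bz2
import lzma
import time


def compare(data: bytes) -> dict:
    """Return compressed size and decompression time for bzip2 and xz."""
    results = {}
    codecs = {
        "bzip2": (bz2.compress, bz2.decompress),
        "xz": (lzma.compress, lzma.decompress),  # .xz container, default preset
    }
    for name, (compress, decompress) in codecs.items():
        blob = compress(data)
        start = time.perf_counter()
        restored = decompress(blob)
        elapsed = time.perf_counter() - start
        assert restored == data  # sanity check: lossless round trip
        results[name] = {"size": len(blob), "seconds": elapsed}
    return results


if __name__ == "__main__":
    # Hypothetical stand-in for a man page; use a real file for real numbers.
    sample = b".TH EXAMPLE 1\n.SH NAME\nexample \\- sample text\n" * 2000
    for name, stats in compare(sample).items():
        print(name, stats["size"], "bytes,", f"{stats['seconds']:.6f}", "s")
```

For very small files the fixed overhead of the .xz container can outweigh
the better ratio, so it is worth running this on real man pages rather
than relying on the sample.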
I have confirmed that both sys-apps/man and sys-apps/man-db can
handle .xz compressed man pages, and sys-apps/texinfo can handle .xz
compressed info pages. Major text editors and pagers support .xz
just like .bz2 (i.e. usually they support both or neither :)).
The additional question is: what preset to use? To help discuss
this, I'd like to quote the tables from 'man xz':

    Preset   DictSize   CompCPU   CompMem   DecMem
      -0     256 KiB       0        3 MiB    1 MiB
      -1       1 MiB       1        9 MiB    2 MiB
      -2       2 MiB       2       17 MiB    3 MiB
      -3       4 MiB       3       32 MiB    5 MiB
      -4       4 MiB       4       48 MiB    5 MiB
      -5       8 MiB       5       94 MiB    9 MiB
      -6       8 MiB       6       94 MiB    9 MiB
      -7      16 MiB       6      186 MiB   17 MiB
      -8      32 MiB       6      370 MiB   33 MiB
      -9      64 MiB       6      674 MiB   65 MiB

    Preset   DictSize   CompCPU   CompMem   DecMem
     -0e     256 KiB       8        4 MiB    1 MiB
     -1e       1 MiB       8       13 MiB    2 MiB
     -2e       2 MiB       8       25 MiB    3 MiB
     -3e       4 MiB       7       48 MiB    5 MiB
     -4e       4 MiB       8       48 MiB    5 MiB
     -5e       8 MiB       7       94 MiB    9 MiB
     -6e       8 MiB       8       94 MiB    9 MiB
     -7e      16 MiB       8      186 MiB   17 MiB
     -8e      32 MiB       8      370 MiB   33 MiB
     -9e      64 MiB       8      674 MiB   65 MiB

I'd like to note here that increasing the dictionary size beyond the
file size does not improve compression. However, the options reflected
in CompCPU may.
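
To see the effect of dictionary size versus the CompCPU-related options
on a particular document, something like the following sketch could be
used. It relies on Python's standard lzma module instead of the xz
binary; the helper name xz_size() and the sample data are assumptions
made for illustration only.

```python
import lzma


def xz_size(data: bytes, preset: int, dict_size: int, extreme: bool = False) -> int:
    """Size of data compressed as .xz with an explicit LZMA2 dictionary size."""
    if extreme:
        preset |= lzma.PRESET_EXTREME  # equivalent of the 'e' suffix
    # The preset supplies defaults; dict_size overrides the preset's dictionary.
    filters = [{"id": lzma.FILTER_LZMA2, "preset": preset, "dict_size": dict_size}]
    return len(lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters))


if __name__ == "__main__":
    MIB = 1024 * 1024
    # Hypothetical document well below 1 MiB; substitute a real man page.
    doc = b"".join(b"line %d of some documentation text\n" % i for i in range(3000))
    for dict_size in (1 * MIB, 8 * MIB, 64 * MIB):
        print("dict", dict_size // MIB, "MiB:",
              xz_size(doc, 6, dict_size), "bytes plain,",
              xz_size(doc, 6, dict_size, extreme=True), "bytes extreme")
```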
Depending on how much complexity we are willing to accept, I'd go for
either:
1) -6e (or -6, the default) -- max CompCPU, reasonable memory use, and
a dictionary larger than most (or all?) documents that are going to be
compressed,
2) -Ne with the minimal 'N' for which CompCPU==8 and DictSize > filesize
-- still the max compression ratio while keeping memory requirements as
low as possible.
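
Option 2 could be sketched roughly as follows. The table is copied from
the -Ne rows quoted above; the function name pick_preset() and the
fallback to -9e for files of 64 MiB or more are my assumptions, not
anything Portage currently implements.

```python
KIB = 1024
MIB = 1024 * KIB

# (preset number, DictSize in bytes, CompCPU) for the -Ne rows quoted above.
EXTREME_PRESETS = [
    (0, 256 * KIB, 8),
    (1, 1 * MIB, 8),
    (2, 2 * MIB, 8),
    (3, 4 * MIB, 7),
    (4, 4 * MIB, 8),
    (5, 8 * MIB, 7),
    (6, 8 * MIB, 8),
    (7, 16 * MIB, 8),
    (8, 32 * MIB, 8),
    (9, 64 * MIB, 8),
]


def pick_preset(file_size: int) -> str:
    """Smallest -Ne preset with CompCPU == 8 and DictSize > file_size."""
    for n, dict_size, comp_cpu in EXTREME_PRESETS:
        if comp_cpu == 8 and dict_size > file_size:
            return f"-{n}e"
    return "-9e"  # assumption: no dictionary is large enough, use the biggest


if __name__ == "__main__":
    print(pick_preset(40 * KIB))  # -0e
    print(pick_preset(3 * MIB))   # -4e (-3e is skipped, its CompCPU is 7)
```

Since most man pages are far below 256 KiB, this would pick -0e for
nearly every file.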
Your thoughts?
--
Best regards,
Michał Górny
