I played with the lzma2 options and found two curious things.

First, a simple run as a reference (everything was tested with
xz-5.0.3):
xz -vv  -8e < coreutils-8.15.tar >/dev/null
xz: Filter chain: --lzma2=dict=32MiB,lc=3,lp=0,pb=2,mode=normal,nice=273,mf=bt4,depth=512
xz: 370 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 33 MiB of memory.
  100 %       4832.6 KiB / 44.2 MiB = 0.107   361 KiB/s       2:05

Then the goal of the game is:
- to remove -8e, so that the same command can be applied to files of
different sizes, using less memory when possible
- to still get a result as small as possible (no more than 1% larger than
the maximum compression)
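
The 1% criterion can be checked with integer arithmetic only. This is a
minimal sketch; within_one_percent is a hypothetical helper name, and both
sizes are assumed to be integers in the same unit (bytes, for example).

```shell
# Hypothetical helper: succeeds when a candidate size is at most 1% larger
# than the best known size. Both arguments are integers in the same unit.
within_one_percent() {
    candidate=$1
    best=$2
    # candidate <= best * 1.01, written without floating point
    [ $(( candidate * 100 )) -le $(( best * 101 )) ]
}
```

The two sizes could come, for instance, from piping each xz run into
wc -c instead of /dev/null.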

I tried using
xz -vv --lzma2=dict=$(du -sk coreutils-8.15.tar | awk '{ printf "%dKiB", $1 * 3 / 4 }'),nice=273,depth=512 < coreutils-8.15.tar >/dev/null
xz: Filter chain: --lzma2=dict=33993KiB,lc=3,lp=0,pb=2,mode=normal,nice=273,mf=bt4,depth=512
xz: 381 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 34 MiB of memory.
  100 %       4833.4 KiB / 44.2 MiB = 0.107   358 KiB/s       2:06

Setting the lzma2 options that way only loses 0.8 KiB here and should work
with inputs of various sizes.
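
The size computation can also be written as a small shell function. This is
only a sketch: dict_kib is a hypothetical helper name, the 4 KiB and
1536 MiB bounds are the dictionary limits given in the xz man page (an
assumption for other versions), and it takes the file length in bytes
rather than the du -sk figure, which counts allocated disk blocks.

```shell
# Hypothetical helper: print a dictionary size in KiB equal to num/den of
# a byte count, clamped to the range xz is assumed to accept for dict=.
dict_kib() {
    bytes=$1
    num=$2
    den=$3
    # Round the byte count up to whole KiB, then apply the fraction.
    kib=$(( (bytes + 1023) / 1024 * num / den ))
    if [ "$kib" -lt 4 ]; then
        kib=4            # assumed lower limit: 4 KiB
    fi
    if [ "$kib" -gt 1572864 ]; then
        kib=1572864      # assumed upper limit: 1536 MiB
    fi
    echo "$kib"
}
```

It would be used as
--lzma2=dict=$(dict_kib "$(wc -c < coreutils-8.15.tar)" 3 4)KiB,nice=273,depth=512
in place of the du and awk pipeline.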

First question:
I find it strange that with a dictionary slightly bigger than the one bare
-8e uses (33993 KiB versus 32 MiB), the compressed file is slightly bigger.
Adding -e doesn't change the result (or the compression time) when
nice=273 and depth=512 are set.
The 3 / 4 factor is a pure guess of mine, so I checked whether the result
is better with a bigger dictionary.

xz -vv --lzma2=dict=$(du -sk coreutils-8.15.tar | awk '{ printf "%dKiB", $1 }'),nice=273,depth=512 < coreutils-8.15.tar >/dev/null
xz: Filter chain: --lzma2=dict=45324KiB,lc=3,lp=0,pb=2,mode=normal,nice=273,mf=bt4,depth=512
xz: 486 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 45 MiB of memory.
  100 %       4832.4 KiB / 44.2 MiB = 0.107   358 KiB/s       2:06

I get practically the same size as with bare -8e (4832.4 KiB versus
4832.6 KiB). That's fine.

As I am adventurous, I tried to reduce the memory requirement by setting a
slightly smaller dictionary size:
xz -vv --lzma2=dict=$(du -sk coreutils-8.15.tar | awk '{ printf "%dKiB", $1 * 4 / 5 }'),nice=273,depth=512 < coreutils-8.15.tar >/dev/null
xz: Filter chain: --lzma2=dict=36259KiB,lc=3,lp=0,pb=2,mode=normal,nice=273,mf=bt4,depth=512
xz: 402 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 36 MiB of memory.
  100 %       4831.7 KiB / 44.2 MiB = 0.107   357 KiB/s       2:06

The memory requirement follows the dictionary size setting, as usual.
But this time the compressed file is smaller too (even smaller than when
using -9e).
Isn't that strange?

Gilles

