----- Original Message -----
From: "Jim Meyering" <[email protected]>
To: "GNU" <[email protected]>
Sent: Saturday, March 03, 2012 11:14 AM
Subject: [PATCH 2/2] maint: use an optimal-for-grep xz compression setting
...
> From 4b2224681fbc297bf585630b679d8540a02b78d3 Mon Sep 17 00:00:00 2001
> From: Jim Meyering <[email protected]>
> Date: Sat, 3 Mar 2012 10:51:11 +0100
> Subject: [PATCH 2/2] maint: use an optimal-for-grep xz compression setting
>
> * cfg.mk (XZ_OPT): Use -6e (determined empirically, see comments).
> This sacrifices a meager 60 bytes of compressed tarball size for a
> 55-MiB decrease in the memory required during decompression. I.e.,
> using -9e would shave off only 60 bytes from the tar.xz file, yet
> would force every decompression process to use 55 MiB more memory.
> ---
...
> +export XZ_OPT = -6e
> +
> old_NEWS_hash = 347e90ee0ec0489707df139ca3539934

-9 should be used only when the file to compress is really big. -6 is xz's
default compression setting, and -6e approximately doubles the required
compression time for about a 1% size gain. -6{,e} works well for a file of
approximately the same size as grep-2.11.tar, but compressing a much bigger
.tar may not give a good compression result:

rm -f dummy
for i in 1 2 3 4 5; do
  echo " $i x grep-2.11.tar size"
  cat grep-2.11.tar >> dummy
  xz -vv -6 < dummy > /dev/null
done
rm dummy

 1 x grep-2.11.tar size
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 9 MiB of memory.
100 %  1112.9 KiB / 9240.0 KiB = 0.120  746 KiB/s  0:12
 2 x grep-2.11.tar size
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 9 MiB of memory.
100 %  2130.4 KiB / 18.0 MiB = 0.115  721 KiB/s  0:25
 3 x grep-2.11.tar size
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 9 MiB of memory.
100 %  3147.8 KiB / 27.1 MiB = 0.114  708 KiB/s  0:39
 4 x grep-2.11.tar size
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 9 MiB of memory.
100 %  4165.2 KiB / 36.1 MiB = 0.113  709 KiB/s  0:52
 5 x grep-2.11.tar size
xz: Filter chain: --lzma2=dict=8MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 94 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 9 MiB of memory.
100 %  5182.4 KiB / 45.1 MiB = 0.112  707 KiB/s  1:05

So with -6 the decompression memory requirement stays constant, but the
compression ratio barely improves as the input grows. Contrast that with
setting the dictionary size explicitly (I know this benchmark is extreme).
The useful upper limit for the dictionary size is theoretically the size of
the file to be compressed; the 3/4 factor here is fully arbitrary and only
decreases the memory requirement a bit:

for i in 1 2 3 4 5; do
  cat grep-2.11.tar >> dummy
  XZ_OPT=--lzma2=dict=$(du -h dummy | awk '{ printf "%dMiB", $1 / 4 * 3 }') \
    xz -vv < dummy > /dev/null
done
rm dummy

xz: Filter chain: --lzma2=dict=6MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 75 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 7 MiB of memory.
100 %  1118.2 KiB / 9240.0 KiB = 0.121  748 KiB/s  0:12
xz: Filter chain: --lzma2=dict=14MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 167 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 15 MiB of memory.
100 %  1114.1 KiB / 18.0 MiB = 0.060  752 KiB/s  0:24
xz: Filter chain: --lzma2=dict=21MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 265 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 22 MiB of memory.
100 %  1115.5 KiB / 27.1 MiB = 0.040  739 KiB/s  0:37
xz: Filter chain: --lzma2=dict=27MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 322 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 28 MiB of memory.
100 %  1116.8 KiB / 36.1 MiB = 0.030  752 KiB/s  0:49
xz: Filter chain: --lzma2=dict=34MiB,lc=3,lp=0,pb=2,mode=normal,nice=64,mf=bt4,depth=0
xz: 389 MiB of memory is required. The limit is 17592186044416 MiB.
xz: Decompression will need 35 MiB of memory.
100 %  1118.2 KiB / 45.1 MiB = 0.024  751 KiB/s  1:01

Here, quintupling the input gives the same compressed file size (within 1%).
Perhaps tar should learn to set the xz dictionary size to the size of the
.tar when using -J? That would be the most efficient way to compress without
wasting memory.

Gilles
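To sketch that last suggestion: a small wrapper could pick a dictionary just
large enough for the input before invoking xz. This is only a sketch under
assumptions, not part of tar or xz: `dict_for` is a hypothetical helper, and
the 64 MiB cap is arbitrary (it mirrors the -9 preset's dictionary).

```shell
#!/bin/sh
# Hypothetical helper (not part of tar or xz): round the input size up to
# whole MiB and clamp it, so the dictionary is never larger than the data
# it has to cover.
dict_for() {
  size=$1                                 # input size in bytes
  mib=$(( (size + 1048575) / 1048576 ))   # round up to whole MiB
  [ "$mib" -lt 1 ] && mib=1               # keep a non-trivial minimum
  [ "$mib" -gt 64 ] && mib=64             # arbitrary cap: the -9 preset's dict
  echo "${mib}MiB"
}

# Example: a 27 MiB tarball gets a 27 MiB dictionary.
dict_for $((27 * 1048576))
```

One could then compress with something like
`tar -cf - dir | xz --lzma2=dict=$(dict_for "$bytes") > dir.tar.xz`,
where $bytes comes from du -sb or stat on the input.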

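As an aside on the figures quoted in the patch: xz's decompression memory is
essentially the LZMA2 dictionary size plus roughly 1 MiB of overhead. The
sketch below hardcodes the -6 and -9 dictionary sizes (8 MiB and 64 MiB, per
the xz(1) manual); `decomp_mem_mib` is a hypothetical name, and the +1 MiB
overhead is an approximation consistent with the logs above (dict=8MiB
reports 9 MiB to decompress).

```shell
#!/bin/sh
# Rough model (an approximation, not an xz API): decompression memory in MiB
# is about the preset's dictionary size plus ~1 MiB of overhead.
decomp_mem_mib() {
  case $1 in
    6) dict=8  ;;   # -6 preset: 8 MiB dictionary (xz's default)
    9) dict=64 ;;   # -9 preset: 64 MiB dictionary
    *) echo "unknown preset" >&2; return 1 ;;
  esac
  echo $((dict + 1))
}

echo "-6 needs ~$(decomp_mem_mib 6) MiB, -9 needs ~$(decomp_mem_mib 9) MiB"
```

The difference, 65 - 9 = 56 MiB, roughly matches the 55-MiB decrease cited in
the patch.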