I decided to benchmark some compression algorithms on our actual data.
Results are attached.
Summary:
. LZO is the fastest.
. 7-zip (p7zip) produces the best compression ratio. (Overall, it's
the most impressive.) It doesn't seem possible to use it as an in-line
filter, though.
. gzip --fast is the most widely available. Its speed is the same order
of magnitude as LZO's, and its compression is the same order of magnitude
as 7-zip's, but it's unquestionably beaten by each of them in its natural
habitat. So for general-purpose use, gzip --fast is the one most likely to
be used.
. bzip2 would be soundly destroyed by 7-zip, if only 7-zip were
available as an in-line filter. Presently, in-line filtering is the only
reason to ever use bzip2 instead of 7-zip.
------------------
no compression (copy with cp) (cache is cold; this warms cache for
everything else)
207M 0m3.144s
after cache warm:
207M 0m0.658s
------------------
Note: In all these tests, I watched "top" to ensure the benchmark is
"fair." No processes are multithreading or using multiple cores. The
default for p7zip was to use multiple cores, but I gave it the switch to
disable that. All the others are single-threaded by default: unless you
download specialized packages that are meant for parallelization, you
cannot parallelize. You would need something like pbzip2, or pigz, or
threadzip, in order to run the other algorithms in parallel.
LZO is so light, even cat /dev/zero | lzop > /dev/null cannot make lzop
consume 100% of the CPU. That's pretty amazing. I thought maybe it was
because it's all zeros, so I also tried
(while true ; do cat somefile ; done) | lzop > /dev/null
and the result was the same. VERY light compression.
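
For anyone who wants to repeat this kind of single-threaded, in-line
comparison, here is a minimal sketch. It is not the script behind the
numbers in this message: the function name bench, the choice of levels,
and timing with date +%s (whole seconds only) are all assumptions made
for illustration. 7za is left out because, as noted above, it does not
seem usable as an in-line filter.

```shell
#!/bin/sh
# Sketch of a single-threaded benchmark harness. The function name `bench`,
# the level list, and the use of `date +%s` are illustrative assumptions,
# not the commands that produced the results in this message.

# bench LABEL INPUT FILTER [ARGS...]
# Feeds INPUT to the filter on stdin, counts the compressed bytes on
# stdout, and prints the size plus elapsed wall-clock time in whole seconds.
bench() {
    label=$1
    input=$2
    shift 2
    if ! command -v "$1" >/dev/null 2>&1; then
        echo "$label: $1 not installed, skipped"
        return 0
    fi
    start=$(date +%s)
    size=$("$@" < "$input" | wc -c | tr -d ' ')
    end=$(date +%s)
    echo "$label: $size bytes, $((end - start))s"
}

# Run the comparison only when a readable input file is given as $1.
input=${1:-}
if [ -n "$input" ] && [ -r "$input" ]; then
    cat "$input" > /dev/null            # warm the cache before timing
    for level in 1 5 9; do
        bench "lzop -$level"  "$input" lzop  "-$level" -c
        bench "gzip -$level"  "$input" gzip  "-$level" -c
        bench "bzip2 -$level" "$input" bzip2 "-$level" -c
    done
fi
```

Saved as, say, bench.sh (a hypothetical name), it would be run as
sh bench.sh somefile. For sub-second resolution, wrapping each filter in
the shell's time facility is the more precise choice; it is still worth
watching "top" to confirm that only one core is busy.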
------------------
LZO (lzop) compression level 1
77M 0m1.913s
------------------
LZO (lzop) compression level 5
77M 0m1.913s
------------------
LZO (lzop) compression level 9
77M 0m1.913s
------------------
compress (ncompress, Lempel-Ziv)
81M 0m6.486s
------------------
7-zip (p7zip, 7za) compression level 1
17M 0m13.995s
------------------
7-zip (p7zip, 7za) compression level 3
17M 0m19.222s
------------------
7-zip (p7zip, 7za) compression level 5
13M 1m53.736s
------------------
7-zip (p7zip, 7za) compression level 7
9.1M 2m16.036s
------------------
7-zip (p7zip, 7za) compression level 9
8.7M 2m19.534s
------------------
zlib (gzip) compression level 1
42M 0m5.883s
------------------
zlib (gzip) compression level 2
42M 0m5.643s
------------------
zlib (gzip) compression level 3
42M 0m6.733s
------------------
zlib (gzip) compression level 4
41M 0m7.293s
------------------
zlib (gzip) compression level 5
41M 0m9.257s
------------------
zlib (gzip) compression level 6
41M 0m14.189s
------------------
zlib (gzip) compression level 7
41M 0m15.573s
------------------
zlib (gzip) compression level 8
42M 0m21.628s
------------------
zlib (gzip) compression level 9
42M 0m28.298s
------------------
bzip2 compression level 1
29M 0m34.042s
------------------
bzip2 compression level 2
30M 0m34.244s
------------------
bzip2 compression level 3
30M 0m34.588s
------------------
bzip2 compression level 4
31M 0m34.724s
------------------
bzip2 compression level 5
31M 0m34.254s
------------------
bzip2 compression level 6
31M 0m36.000s
------------------
bzip2 compression level 7
31M 0m36.798s
------------------
bzip2 compression level 8
31M 0m39.771s
------------------
bzip2 compression level 9
32M 0m39.351s
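------------------
To compare entries on a common scale, the size and time pairs above can
be turned into a compression ratio and an input throughput. The sketch
below assumes the "M" figures can be treated as approximate megabytes
and that 207M is the uncompressed input size for every run; the helper
names to_seconds and summarize are made up for this example.

```shell
#!/bin/sh
# Convert a `time`-style duration such as "1m53.736s" into seconds.
to_seconds() {
    echo "$1" | awk -F'm' '{ sub(/s$/, "", $2); printf "%.3f\n", $1 * 60 + $2 }'
}

# summarize ORIG_MB OUT_MB DURATION
# Prints "<compression ratio> <input MB processed per second>".
summarize() {
    secs=$(to_seconds "$3")
    awk -v orig="$1" -v out="$2" -v secs="$secs" \
        'BEGIN { printf "%.2f %.1f\n", orig / out, orig / secs }'
}

summarize 207 77  0m1.913s     # LZO level 1
summarize 207 42  0m5.883s     # gzip level 1
summarize 207 8.7 2m19.534s    # 7-zip level 9
```

Because the sizes are rounded to two significant figures, the ratios are
only rough, but they are enough to rank the tools against each other.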
_______________________________________________
Tech mailing list
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/