Decompression speed for gzip/deflate is approximately the same at every compression level. Compression speed, however, varies by a factor of 5 or so between the fastest level (1) and the slowest (9).
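
If you want to check this on your own machine, here is a rough sketch that uses only java.util.zip. The class name DeflateLevels, the sampleLog helper and the synthetic access-log lines are made up for illustration and are not from Avro or Pig. It compresses the same buffer at levels 1, 6 and 9 and times each direction. Timings from a single un-warmed run are only indicative.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class DeflateLevels {

    // Compress a buffer with the JDK Deflater at the given level (1 = fastest, 9 = smallest).
    static byte[] compress(byte[] data, int level) {
        Deflater deflater = new Deflater(level);
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Decompress a buffer produced by compress(); the level is not needed to inflate.
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    break; // truncated input, stop instead of spinning
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt deflate stream", e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // Hypothetical stand-in for real data: repetitive lines shaped like an HTTP access log.
    static byte[] sampleLog(int lines) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < lines; i++) {
            sb.append("10.0.0.").append(i % 256)
              .append(" - - \"GET /page/").append(i * 31 % 1000)
              .append(" HTTP/1.1\" 200 ").append(1000 + i % 977).append('\n');
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] data = sampleLog(200_000);
        for (int level : new int[] {Deflater.BEST_SPEED, 6, Deflater.BEST_COMPRESSION}) {
            long t0 = System.nanoTime();
            byte[] packed = compress(data, level);
            long t1 = System.nanoTime();
            byte[] unpacked = decompress(packed);
            long t2 = System.nanoTime();
            System.out.printf("level=%d in=%d out=%d compress=%d ms decompress=%d ms%n",
                    level, unpacked.length, packed.length,
                    (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        }
    }
}
```

Run it with your own log lines in place of sampleLog to get numbers that mean something for your data.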
This is a useful link for gzip performance characteristics:
http://tukaani.org/lzma/benchmarks.html

On 3/4/11 9:25 AM, "Doug Cutting" <[email protected]> wrote:

>On 03/01/2011 09:05 PM, felix gao wrote:
>> I am running some comparison tests on a data set that I converted to
>> avro with deflator set to level 6. The original logs consists of 2880
>> uncompressed http access logs with a total size of 1.4TB. The Compressed
>> avro log is about 2/3 of the size. However, when I ran the same pig job
>> on the raw logs, it is blazing fast during the initial map phase.
>> Finished in under 40 min. When I ran the same pig job with avro files,
>> the initial map phase took 8 minutes to only finish 10%. I am wondering
>> is there any way to figure out what is slowing down the map?
>
>What version of Avro are you using? How are you integrating Avro with
>Pig?
>
>Also, for speed, you might try level=1 (Deflater.BEST_SPEED).
>
>Doug
