> #2 Compressed logs in textfile tables: 60sec (filesize of 736 MB over 8
> compressed files)
> #3 Compressed logs in sequencefile tables: 101sec (filesize of 4,773 MB
> over 126 compressed files)
>

Why is there such a *big* difference in compression ratios between the gzip
utility and Hive?

Uncompressed file size: approx 3500 MB
Gzip utility: approx 250 MB
org.apache.hadoop.io.compress.GzipCodec (BLOCK): approx 1600 MB
org.apache.hadoop.io.compress.DefaultCodec (BLOCK): approx 1700 MB
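
To put numbers on the gap, here is a rough Python sketch that turns the sizes above into compression ratios. The variable and function names are only illustrative, and the inputs are the approximate sizes listed above, so treat the results as ballpark values.

```python
# Sizes in MB, copied from the approximate figures listed above.
UNCOMPRESSED_MB = 3500

compressed_mb = {
    "gzip utility": 250,
    "GzipCodec (BLOCK)": 1600,
    "DefaultCodec (BLOCK)": 1700,
}


def compression_ratio(original_mb, compressed_size_mb):
    """Return the original size divided by the compressed size."""
    return original_mb / compressed_size_mb


# Ratio for each method: higher means better compression.
ratios = {
    name: compression_ratio(UNCOMPRESSED_MB, size)
    for name, size in compressed_mb.items()
}

for name, ratio in ratios.items():
    print(f"{name}: about {ratio:.1f}x smaller than the uncompressed data")
```

With these inputs the gzip utility comes out at roughly 14x, while both Hadoop codecs in BLOCK mode come out at roughly 2x.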

Saurabh.
-- 
http://nandz.blogspot.com
http://foodieforlife.blogspot.com
