Hello All, I think I figured out where I goofed up.
I was flushing on every record, so basically each record was compressed on its own and carried its own metadata. This added more data to the output compared to plain Avro.

So now I have better figures. They at least look realistic, though I still need to find out whether the output is map-reduceable.

Avro = 12G
Avro+Deflate = 4.5G
Avro+Snappy = 5.5G

Have others tried Avro + LZO?

Thanks,
Nikhil

On 3/30/12 12:54 AM, "Shirahatti, Nikhil" <[email protected]> wrote:

>The original data file (a text file) is 40GB, the avro file is about 12GB,
>avro snappy is 13GB!
>
>Thanks,
>Nikhil
>
>--
>View this message in context:
>http://apache-avro.679487.n3.nabble.com/avro-compression-using-snappy-and-deflate-tp3870167p3870184.html
>Sent from the Avro - Users mailing list archive at Nabble.com.
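
For anyone who hits the same thing, here is a rough sketch of why flushing on every record can make the compressed file bigger than the uncompressed one. It is not Avro's writer: it uses Python's zlib as a stand-in for the deflate codec, the records are made up, and the only per-block overhead it counts is the 16-byte sync marker that an Avro container file writes after each block (the real format also stores a record count and a byte size per block). The function names and the 1000-records-per-block figure are assumptions for illustration.

```python
import zlib

# Hypothetical records, made up purely for illustration.
records = [
    f'{{"id": {i}, "user": "user{i % 50}", "status": "ok"}}'.encode()
    for i in range(10_000)
]

# An Avro container file ends every block with a 16-byte sync marker.
# Other per-block fields (record count, byte size) are ignored here.
SYNC_MARKER_BYTES = 16


def size_flush_per_record(recs):
    """Approximate output size when every record becomes its own block."""
    return sum(len(zlib.compress(r)) + SYNC_MARKER_BYTES for r in recs)


def size_batched(recs, per_block=1000):
    """Approximate output size when many records share one block."""
    total = 0
    for start in range(0, len(recs), per_block):
        block = b"".join(recs[start:start + per_block])
        total += len(zlib.compress(block)) + SYNC_MARKER_BYTES
    return total


raw = sum(len(r) for r in records)
per_record = size_flush_per_record(records)
batched = size_batched(records)

print("raw bytes:          ", raw)
print("flush per record:   ", per_record)
print("1000 records/block: ", batched)
```

With tiny blocks the compressor has almost no repetition to exploit and the fixed per-block cost is paid for every record, so the per-record figure comes out larger than the raw size, while the batched figure comes out much smaller.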
