Hi,

2011/1/31 Sean Bigdatafun <sean.bigdata...@gmail.com>:
> GZIP is not splittable.

Correct. gzip is a stream compression format, which effectively means
you can only start decompressing at the beginning of the data.
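
To illustrate, here is a minimal Python sketch (not Hadoop code; the helper
name `can_decompress_from` and the sample payload are made up for this
example). It compresses some records with gzip and then tries to start
decompressing at the beginning and in the middle of the stream:

```python
import gzip
import zlib

def can_decompress_from(data: bytes, offset: int) -> bool:
    """Return True if gzip decompression succeeds when started at `offset`."""
    try:
        # wbits=31 tells zlib to expect a gzip header and trailer.
        zlib.decompressobj(wbits=31).decompress(data[offset:])
        return True
    except zlib.error:
        # No valid gzip header at this position, so decoding cannot start here.
        return False

# Hypothetical sample data: 10000 small text records in one gzip stream.
payload = b"".join(b"record %d\n" % i for i in range(10000))
compressed = gzip.compress(payload)

print(can_decompress_from(compressed, 0))
print(can_decompress_from(compressed, len(compressed) // 2))
```

Starting at offset 0 works; starting halfway fails because the decoder finds
no gzip header there and has none of the state built up from the earlier bytes.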

> Does that mean a GZIP block compressed sequencefile can't take advantage of 
> MR parallelism?

AFAIK it should be splittable at the boundaries of the blocks in which the
compression was done.
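
A rough Python sketch of the idea, assuming each block is compressed
independently and preceded by a sync marker. This is NOT the real
SequenceFile on-disk format: the marker value, the length prefix and the
function names `write_blocks` and `read_split` are all invented for
illustration, and zlib (DEFLATE, the algorithm gzip uses) stands in for the
codec:

```python
import hashlib
import struct
import zlib

# Hypothetical 16-byte sync marker placed in front of every block.
SYNC = hashlib.md5(b"hypothetical-sync-marker").digest()

def write_blocks(records, records_per_block):
    """Compress records in independent blocks: SYNC + length + compressed bytes."""
    out = bytearray()
    for i in range(0, len(records), records_per_block):
        block = zlib.compress(b"".join(records[i:i + records_per_block]))
        out += SYNC + struct.pack(">I", len(block)) + block
    return bytes(out)

def read_split(data, start, end):
    """Decompress every block whose sync marker begins in [start, end)."""
    result = []
    # Skip forward from an arbitrary offset to the next block boundary.
    pos = data.find(SYNC, start)
    while pos != -1 and pos < end:
        (size,) = struct.unpack_from(">I", data, pos + len(SYNC))
        body_start = pos + len(SYNC) + 4
        result.append(zlib.decompress(data[body_start:body_start + size]))
        pos = data.find(SYNC, body_start + size)
    return b"".join(result)

records = [b"record %d\n" % i for i in range(1000)]
data = write_blocks(records, records_per_block=100)

# Two "mappers" each take half of the byte range, cut at an arbitrary offset.
mid = len(data) // 2
first_half = read_split(data, 0, mid)
second_half = read_split(data, mid, len(data))
print(first_half + second_half == b"".join(records))
```

Each reader only needs to find the next marker after its start offset, so
the two halves can be processed in parallel even though every block on its
own is a non-splittable compressed stream.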

> How to control the size of block to be compressed in SequenceFile?

Can't help you with that one.

-- 
Kind regards,

Niels Basjes
