Hi all, I'm using Hadoop 1.0.4 and using gzip to compress the logs processed by Hadoop (the logs are gzipped into block-size files). I read that bzip2 is splittable. Is that the case in Hadoop 1.0.4? Does that mean that any input file bigger than the block size will be split between maps? What are the tradeoffs between the two?
Thanks.