compressed input splits to Map tasks

abhishek sharma Wed, 14 Apr 2010 18:27:10 -0700

Hi all,

I created some data using the randomwriter utility and compressed the
map task outputs using the options
-D mapred.output.compress=true
-D mapred.map.output.compression.type=BLOCK


I set the bytes per map to 128 MB, but because of compression the final
size of each map task's output is around 75 MB.
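
A sketch of what such an invocation might look like, for reference. The jar
name and output directory are placeholders, and the property name
test.randomwrite.bytes_per_map is an assumption about how randomwriter takes
the bytes-per-map setting; only the two compression options come from this
message. The command is printed rather than run so it can be checked first.

```shell
#!/bin/sh
# 128 MB expressed in bytes.
BYTES_PER_MAP=$((128 * 1024 * 1024))

# Placeholders (hypothetical): adjust to the local installation.
EXAMPLES_JAR="hadoop-examples.jar"
OUT_DIR="random-data-out"

# Assemble the randomwriter command with the compression options above.
# test.randomwrite.bytes_per_map is an assumed property name.
CMD="hadoop jar $EXAMPLES_JAR randomwriter \
-D test.randomwrite.bytes_per_map=$BYTES_PER_MAP \
-D mapred.output.compress=true \
-D mapred.map.output.compression.type=BLOCK \
$OUT_DIR"

# Print the command for inspection instead of executing it.
echo "$CMD"
```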

I want to use these individual 75MB compressed files as input to
another Map task.
How do I get Hadoop to first decompress the files before computing the
input splits for the map tasks?

Thanks,
Abhishek
