Hi all,

I generated some data with the randomwriter utility and compressed the
map task outputs using these options:
-D mapred.output.compress=true
-D mapred.map.output.compression.type=BLOCK

I set the bytes per map to 128 MB, but because of compression the final
size of each map task's output is around 75 MB.

I want to use these individual 75 MB compressed files as input to
another map task.
How do I get Hadoop to decompress the files before it computes the
input splits for the map tasks?

Thanks,
Abhishek
