Use the following knobs (in mapred-site.xml to make them the default,
or in the configuration of an individual job):
mapred.compress.map.output = true
mapred.map.output.compression.codec = com.hadoop.compression.lzo.LzoCodec
or call
jobConf.setMapOutputCompressorClass(LzoCodec.class);
You will need the native hadoop-gpl-compression library, available from
http://code.google.com/p/hadoop-gpl-compression/, installed on all
machines.
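
A minimal sketch of the programmatic route, in case it helps. It assumes the old org.apache.hadoop.mapred API and that the codec class shipped by hadoop-gpl-compression is com.hadoop.compression.lzo.LzoCodec; the class name MapOutputLzoExample and the helper withLzoMapOutput are made up for illustration. The codec is set by name so the example compiles with only the Hadoop core jar on the classpath.

```java
import org.apache.hadoop.mapred.JobConf;

public class MapOutputLzoExample {
    // Assumption: codec class name as packaged by hadoop-gpl-compression.
    static final String LZO_CODEC = "com.hadoop.compression.lzo.LzoCodec";

    /** Switches on map-output compression and names the codec to use. */
    public static JobConf withLzoMapOutput(JobConf conf) {
        // Equivalent to mapred.compress.map.output = true
        conf.setCompressMapOutput(true);
        // Equivalent to setting mapred.map.output.compression.codec;
        // with the GPL jar on the compile classpath you could instead call
        // conf.setMapOutputCompressorClass(LzoCodec.class).
        conf.set("mapred.map.output.compression.codec", LZO_CODEC);
        return conf;
    }

    public static void main(String[] args) {
        JobConf conf = withLzoMapOutput(new JobConf());
        // Print the two settings back to confirm they were applied.
        System.out.println(conf.getCompressMapOutput());
        System.out.println(conf.get("mapred.map.output.compression.codec"));
    }
}
```

The codec class is only loaded when the job runs, so the native library still has to be present on every node. If you use the newer org.apache.hadoop.mapreduce API, setting the same two keys on the job's Configuration should have the same effect, though I have not checked that against 0.20.1.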
Arun
On Feb 16, 2010, at 9:26 PM, jiang licht wrote:
I am new to Hadoop (currently using 0.20.1) and want to know how to
choose and set up a compression method for Map output, in particular
how to configure and use LZO compression.
Specifically, please share your experience with the following two
scenarios. Thanks!
(1) Is there a global setting in one of the Hadoop configuration files
that names a compression method (e.g. LZO) to be used for Map output
by default? If so, how is it set?
(2) How do I use a compression method (e.g. LZO) in Java code? (I
noticed that in the javadoc, org.apache.hadoop.mapred is labeled
Deprecated.)
Thanks!
--
Michael