Arun C Murthy wrote:
Hi Johan,
On Tue, Mar 13, 2007 at 05:50:21PM +0000, Johan Oskarsson wrote:
Hi.
I can't seem to find out how to set the compression codec in a
SequenceFile if it's created when a program runs with the output format
set to SequenceFileOutputFormat.
Use these knobs:
  mapred.output.compress (set this to 'true')
  io.seqfile.compression.type (NONE, RECORD, BLOCK)
  mapred.output.compression.codec (zlib, lzo etc.)
@see SequenceFileOutputFormat.getRecordWriter() for more info...
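A rough sketch of how the three knobs above fit together, in plain Java so it
runs without Hadoop on the classpath. The class SeqFileCompressionSettings and
its build() method are hypothetical names made up for this illustration. It
also assumes the codec value is given as a fully qualified class name such as
org.apache.hadoop.io.compress.DefaultCodec (the zlib-based codec); check that
against your Hadoop version. In a real job each pair would presumably be
applied with JobConf.set(key, value).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop: collects the three settings
// named above into a map of property name -> value.
class SeqFileCompressionSettings {
    // Compression types listed for io.seqfile.compression.type.
    private static final List<String> TYPES =
            Arrays.asList("NONE", "RECORD", "BLOCK");

    static Map<String, String> build(String type, String codecClassName) {
        if (!TYPES.contains(type)) {
            throw new IllegalArgumentException(
                    "compression type must be one of " + TYPES + ", got: " + type);
        }
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("mapred.output.compress", "true");
        settings.put("io.seqfile.compression.type", type);
        // Assumption: the codec is identified by its class name.
        settings.put("mapred.output.compression.codec", codecClassName);
        return settings;
    }

    public static void main(String[] args) {
        // Print the pairs; in a job you would call conf.set(k, v) instead.
        build("BLOCK", "org.apache.hadoop.io.compress.DefaultCodec")
                .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

Keeping the settings in a plain map makes it easy to switch the type or codec
between benchmark runs without touching the rest of the job setup.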
hth,
Arun
Ah, thanks. Works great. The javadoc and name of the JobConf method
setMapOutputCompressorClass, which sets "mapred.output.compression.codec",
threw me off: it sounded like it applied only to the map -> reduce stage
and not to the actual reduce output as well.
One final question: does this also work for MapFileOutputFormat? I'm
running a benchmark, and switching between BLOCK and RECORD compression
seems to make a difference, but I believe it's only using the default
compression codec (the output file size stays the same even when I change
the codec with OutputFormatBase.setOutputCompressorClass). It works just
fine with SequenceFileOutputFormat.
/Johan