LZO Compression Libraries don't appear to work properly with MultipleOutputs

ed Thu, 21 Oct 2010 14:52:56 -0700

Hello everyone,

I am having problems using MultipleOutputs with LZO compression (could be a
bug or something wrong in my own code).


In my driver I set

     MultipleOutputs.addNamedOutput(job, "test", TextOutputFormat.class,
             NullWritable.class, Text.class);

In my reducer I have:

     MultipleOutputs<NullWritable, Text> mOutput =
             new MultipleOutputs<NullWritable, Text>(context);

     public String generateFileName(Key key){
        return "custom_file_name";
     }

Then in the reduce() method I have:

     mOutput.write(mNullWritable, mValue, generateFileName(key));

This creates LZO files that do not decompress properly: lzop -d fails
with the error "lzop: unexpected end of file: outputFile.lzo".

If I switch back to the regular context.write(mNullWritable, mValue);
everything works fine.
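
For illustration only: an "unexpected end of file" error from a
decompressor is generally what a compressed stream looks like when it was
never finalized. The sketch below reproduces that symptom with the JDK's
gzip classes as a stand-in for LZO (an assumption, since the LZO codec is
not part of the JDK). The class and method names are hypothetical, and
whether this is actually what happens with MultipleOutputs here is not
confirmed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; gzip stands in for LZO (assumption).
public class TruncatedStreamDemo {

    // Compress data in memory. When close is false, the stream is never
    // closed, so the final compressed block and trailer are never written.
    public static byte[] compress(byte[] data, boolean close) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            GZIPOutputStream gz = new GZIPOutputStream(sink);
            gz.write(data);
            if (close) {
                gz.close(); // finishes the stream and writes the trailer
            }
            return sink.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true only if the bytes decompress cleanly to end of stream.
    public static boolean decompresses(byte[] compressed) {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            while (in.read(buf) != -1) {
                // Drain the stream; the contents are not needed here.
            }
            return true;
        } catch (IOException e) {
            // A truncated stream surfaces as an EOFException, a kind of IOException.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] data = "some reducer output\n".getBytes(StandardCharsets.UTF_8);
        System.out.println("closed:   " + decompresses(compress(data, true)));
        System.out.println("unclosed: " + decompresses(compress(data, false)));
    }
}
```

If the same thing applies here, the place to look would be whether the
MultipleOutputs instance is closed (for example in the reducer's cleanup()
method) before the task finishes.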

Am I forgetting a step needed when using MultipleOutputs, or is this a
bug/non-feature of using LZO compression in Hadoop?

Thank you!


~Ed
