Hello everyone,
I am having problems using MultipleOutputs with LZO compression (could be a
bug or something wrong in my own code).
In my driver I set:
MultipleOutputs.addNamedOutput(job, "test", TextOutputFormat.class,
NullWritable.class, Text.class);
In my reducer I have:
MultipleOutputs<NullWritable, Text> mOutput = new
MultipleOutputs<NullWritable, Text>(context);
public String generateFileName(Key key){
return "custom_file_name";
}
Then in the reduce() method I have:
mOutput.write(mNullWritable, mValue, generateFileName(key));
This produces LZO files that do not decompress properly (lzop -d fails
with the error "lzop: unexpected end of file: outputFile.lzo").
If I switch back to the regular context.write(mNullWritable, mValue);
everything works fine.
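For what it's worth, an "unexpected end of file" error is what a compressed
file typically looks like when its stream was never closed, so the final block
and trailer were never written. The sketch below is not Hadoop or LZO code: it
uses GZIP from java.util.zip as a stand-in, and the class and method names are
made up for illustration. Whether the same thing happens with MultipleOutputs
in my job is an assumption, not something I have confirmed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in demo: GZIP instead of LZO, no Hadoop involved.
public class UnclosedStreamDemo {

    // Compresses data into memory; the stream is closed only if closeStream is true.
    static byte[] compress(byte[] data, boolean closeStream) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            GZIPOutputStream gz = new GZIPOutputStream(sink);
            gz.write(data);
            if (closeStream) {
                gz.close(); // finishes the deflate stream and writes the trailer
            }
            return sink.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true only if the bytes decompress without error to expectedLength bytes.
    static boolean decompressesCleanly(byte[] compressed, int expectedLength) {
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int total = 0;
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total == expectedLength;
        } catch (IOException e) {
            // A truncated stream surfaces here as an end-of-file error.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] data = "one line of reducer output\n".getBytes(StandardCharsets.UTF_8);
        System.out.println("closed stream ok:   "
            + decompressesCleanly(compress(data, true), data.length));
        System.out.println("unclosed stream ok: "
            + decompressesCleanly(compress(data, false), data.length));
    }
}
```

If this is the cause, the thing to check would be whether
MultipleOutputs.close() is called on mOutput from the reducer's cleanup()
method.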
Am I forgetting a step that is required when using MultipleOutputs, or is
this a bug/non-feature of LZO compression in Hadoop?
Thank you!
~Ed