Hi,
On Tue, Sep 13, 2011 at 12:27 PM, Vivek K wrote:
> Hi all,
>
> I am trying to build a Hadoop/MR application in c++ using hadoop-pipes. I
> have been able to successfully work with my own mappers and reducers, but
> now I need to generate output (from reducer) in a format different from the
> default TextOutputFormat. I have a few questions:
>
> (1) Similar to Hadoop streaming, is there an option to set OutputFormat in
> HadoopPipes (in order to use say org.apache.hadoop.io.SequenceFile.Writer) ?
> I am using Hadoop version 0.20.2.
>
> (2) For a simple test on how to use an in-built non-default writer, I tried
> the following:
>
> hadoop pipes -D hadoop.pipes.java.recordreader=true -D
> hadoop.pipes.java.recordwriter=false -input input.seq -output output
> -inputformat org.apache.hadoop.mapred.SequenceFileInputFormat -writer
> org.apache.hadoop.io.SequenceFile.Writer -program my_test_program
-writer wants an OutputFormat class, not a writer class:
    if (results.hasOption("writer")) {
      setIsJavaRecordWriter(job, true);
      job.setOutputFormat(getClass(results, "writer", job,
                                   OutputFormat.class));
    }
As such I think you want:
-writer org.apache.hadoop.mapred.SequenceFileOutputFormat
SequenceFile.Writer simply writes sequence files; it has nothing to do with
MapReduce.
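This is also wrong:
hadoop.pipes.java.recordwriter=false
With -writer, a Java OutputFormat does the writing (note the
setIsJavaRecordWriter(job, true) call above), so this should be true.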
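Putting both fixes together, I would expect something like the command below to be closer to what you want. This is an untested sketch: the paths, input format, and program name are copied from your message, and only the -writer class and the recordwriter flag are changed.

```shell
# Untested sketch: same command as in the question, with two changes:
#   1. hadoop.pipes.java.recordwriter is true (Java side writes the output)
#   2. -writer names an OutputFormat rather than SequenceFile.Writer
hadoop pipes \
  -D hadoop.pipes.java.recordreader=true \
  -D hadoop.pipes.java.recordwriter=true \
  -input input.seq \
  -output output \
  -inputformat org.apache.hadoop.mapred.SequenceFileInputFormat \
  -writer org.apache.hadoop.mapred.SequenceFileOutputFormat \
  -program my_test_program
```

Going by the code above, -writer already turns the Java record writer on, so the explicit -D for recordwriter is probably redundant, but setting it to true at least keeps the command consistent.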
Brock