On Tue, Jul 22, 2008 at 8:04 PM, Lincoln Ritter
<[EMAIL PROTECTED]> wrote:
> I have what I think is a pretty straight-forward, noobie question.  I
> would like to write one file per key in the reduce (or map) phase of a
> mapreduce job.  I have looked at the documentation for
> FileOutputFormat and MultipleTextOutputFormat but am a bit unclear on
> how to use it/them.  Can anybody give me a quick pointer?

Hi Lincoln,

I do something like this to dump my records out, one per file, for
debugging.  This may not be "correct" because it writes the files as
side-effects of the job, but hey, it works.  It looks something like
this:

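    import java.io.IOException;
    import java.io.OutputStream;

    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.NullWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.io.VIntWritable;
    import org.apache.hadoop.mapred.*;

    // Nested inside the job's driver class.
    public static class MyMap extends MapReduceBase
        implements Mapper<VIntWritable, Text, NullWritable, NullWritable> {

        private JobConf conf;

        // Keep the JobConf so map() can find the filesystem and output dir.
        public void configure(JobConf conf) {
            this.conf = conf;
        }

        public void map(VIntWritable key, Text value,
                        OutputCollector<NullWritable, NullWritable> output,
                        Reporter reporter) throws IOException {

            FileSystem fs = FileSystem.get(conf);
            // Per-task work directory for side-effect files.
            Path workPath = FileOutputFormat.getWorkOutputPath(conf);
            // One file per record, named after the key. fs.create()
            // overwrites by default, so keys are assumed to be unique.
            Path filePath = new Path(workPath, key.toString());
            OutputStream out = fs.create(filePath);
            try {
                /* ... write value to out ... */
            } finally {
                out.close();  // close even if the write throws
            }
        }
    }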
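
To try the one-file-per-record idea without a cluster, a minimal
plain-Java sketch might look like the following.  It uses java.nio
instead of Hadoop's FileSystem, and the PerKeyFiles class, the
writePerKey method, and the sample records are all made up for
illustration; it is an analogy to the mapper above, not a replacement
for it.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

public class PerKeyFiles {

    // Writes each record to its own file under workDir, named after the key.
    // Files.newOutputStream() truncates an existing file by default, so
    // calling this again with the same key replaces the earlier contents.
    public static void writePerKey(Path workDir, Map<String, String> records)
            throws IOException {
        Files.createDirectories(workDir);
        for (Map.Entry<String, String> record : records.entrySet()) {
            Path filePath = workDir.resolve(record.getKey());
            OutputStream out = Files.newOutputStream(filePath);
            try {
                out.write(record.getValue().getBytes(StandardCharsets.UTF_8));
            } finally {
                out.close();  // close even if the write throws
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical sample data: key -> record text.
        Path workDir = Files.createTempDirectory("per-key-demo");
        Map<String, String> records = new LinkedHashMap<String, String>();
        records.put("1", "first record");
        records.put("2", "second record");
        writePerKey(workDir, records);
        System.out.println("wrote " + records.size() + " files to " + workDir);
    }
}
```

As with fs.create() in the mapper, an existing file of the same name
is replaced, so records that share a key would clobber each other.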
