On Tue, Jul 22, 2008 at 8:04 PM, Lincoln Ritter
<[EMAIL PROTECTED]> wrote:
> I have what I think is a pretty straight-forward, noobie question. I
> would like to write one file per key in the reduce (or map) phase of a
> mapreduce job. I have looked at the documentation for
> FileOutputFormat and MultipleTextOutputFormat but am a bit unclear on
> how to use it/them. Can anybody give me a quick pointer?
Hi Lincoln,
I do something like this to dump my records out, one per file, for
debugging. This may not be "correct" because it writes the files as
side-effects of the job, but hey, it works. It looks something like
this:
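import java.io.IOException;
import java.io.OutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

// Declared static, so this sits inside an enclosing (job) class.
public static class MyMap extends MapReduceBase
    implements Mapper<VIntWritable, Text, NullWritable, NullWritable> {

  private JobConf conf;

  // Keep the JobConf so map() can look up the file system and work path.
  public void configure(JobConf conf) {
    this.conf = conf;
  }

  public void map(VIntWritable key, Text value,
      OutputCollector<NullWritable, NullWritable> output,
      Reporter reporter) throws IOException {
    FileSystem fs = FileSystem.get(conf);
    // Per-task work directory; files written here are promoted to the
    // job output directory when the task completes successfully.
    Path workPath = FileOutputFormat.getWorkOutputPath(conf);
    // One file per record, named after the key.
    Path filePath = new Path(workPath, key.toString());
    OutputStream out = fs.create(filePath);
    try {
      /* ... write value to out ... */
    } finally {
      out.close();
    }
  }
}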
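
If you want to try the file-per-key idea without a cluster, here is a rough standalone sketch. It is not Hadoop code: it assumes plain java.nio, and the class OneFilePerKey and the helper writePerKey are hypothetical names made up for illustration. The map of key to values stands in for what a reducer sees, where the values for each key arrive already grouped. It also assumes every key is safe to use as a file name (no path separators).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OneFilePerKey {

    // Hypothetical helper: writes the values of each key to a file
    // named after that key, one value per line.
    static void writePerKey(Path dir, Map<String, List<String>> grouped)
            throws IOException {
        Files.createDirectories(dir);
        for (Map.Entry<String, List<String>> entry : grouped.entrySet()) {
            // Assumes the key is a valid file name.
            Path file = dir.resolve(entry.getKey());
            // Creates the file, or truncates it if it already exists.
            Files.write(file, entry.getValue(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("per-key");
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        grouped.put("1", Arrays.asList("a", "b"));
        grouped.put("2", Arrays.asList("c"));
        writePerKey(dir, grouped);
        System.out.println("Wrote " + grouped.size() + " files under " + dir);
    }
}
```

Note that the file is truncated if it already exists, so if the same key can turn up more than once (as it can in the map phase), the later write replaces the earlier one. Doing this in the reduce phase, where each key is seen once, avoids that.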