I would like to create a hierarchy of output files based on the keys passed to the reducer. The first folder level is the first few digits of the key, the next level is the next few digits, and so on. I had written a very ugly hack that achieved this by passing a filesystem object into the record writer. It seems, however, that this use case is exactly what the MultipleOutputs API was designed to handle. I began to implement it based on examples I have found, but I am getting stuck.
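For concreteness, this is a sketch of the kind of key-to-path mapping I mean. The digit widths (2, then 3) and the class/method names are just illustrative, not my real values:

```java
// Illustrative only: map a numeric key string to a nested base path,
// e.g. "1234567" -> "12/345/1234567" (first level = first 2 digits,
// second level = next 3 digits).
public class KeyPath {
    static String basePathFor(String key) {
        // First folder level: up to the first 2 digits of the key.
        String level1 = key.substring(0, Math.min(2, key.length()));
        // Second folder level: up to the next 3 digits, if any.
        String level2 = key.length() > 2
                ? key.substring(2, Math.min(5, key.length()))
                : "";
        StringBuilder path = new StringBuilder(level1);
        if (!level2.isEmpty()) {
            path.append('/').append(level2);
        }
        // The leaf file name is the full key itself.
        path.append('/').append(key);
        return path.toString();
    }

    public static void main(String[] args) {
        System.out.println(basePathFor("1234567")); // 12/345/1234567
    }
}
```

A string like this is what I would pass as the `baseOutputPath` argument of `MultipleOutputs.write(...)` below.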
In my Tool I have the following:
----------------
MultipleOutputs.addNamedOutput(job, "namedOutput",
        SlightlyModifiedTextOutputFormat.class, keyClass, valueClass);
----------------
In my Reducer I have the following:
----------------
private MultipleOutputs<Key, Value> mo_context;

@Override
public void setup(Context context) {
    mo_context = new MultipleOutputs<Key, Value>(context);
}

@Override
protected void reduce(Key key, Iterable<Value> values, Context context)
        throws IOException, InterruptedException {
    for (Value value : values) {
        //context.write(key, value);
        // I can change key.toString() to include the folder tree if needed
        mo_context.write(key, value, key.toString());
        context.progress();
    }
}

@Override
public void cleanup(Context context) throws IOException, InterruptedException {
    if (mo_context != null) {
        mo_context.close();
    }
}
----------------
When I run it I receive the following stack trace just as reducing begins:
----------------
java.lang.NoSuchMethodError: org.apache.hadoop.mapreduce.lib.output.FileOutputFormat.setOutputName(Lorg/apache/hadoop/mapreduce/JobContext;Ljava/lang/String;)V
    at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.getRecordWriter(MultipleOutputs.java:439)
    at org.apache.hadoop.mapreduce.lib.output.MultipleOutputs.write(MultipleOutputs.java:408)
    at xxxxxx.xxxxxxxxxxx.xxxx.xxxxx.xxxxxxxxxxxxxxxxxReducer.reduce(xxxxxxxxxxxxxxxxxReducer.java:54)
    at xxxxxx.xxxxxxxxxxx.xxxx.xxxxx.xxxxxxxxxxxxxxxxxReducer.reduce(xxxxxxxxxxxxxxxxxReducer.java:27)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:566)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
----------------
I must be setting this up incorrectly somehow. Does anyone have a solid example of using MultipleOutputs that shows the job setup, the reduce step, and ideally the output format, on a version around 0.20.205.0?