In this case, don't bother with MultipleOutputs. Specify 2 reducers and a custom partitioner that sends 'even' records to partition 0 and 'odd' records to partition 1.
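A rough sketch of the partition choice, kept free of Hadoop imports so it can be tried on its own. The class and method names (ParityPartition, partitionFor) are made up for illustration. It assumes the number that decides even or odd is already carried in the map output value; a partitioner runs before the reduce, so it cannot see a sum that is only computed there.

```java
// Hypothetical helper: picks the partition for a record from its integer value.
final class ParityPartition {
    private ParityPartition() {}

    /** Returns 0 for even values and 1 for odd values, kept within numPartitions. */
    static int partitionFor(int value, int numPartitions) {
        // floorMod gives 0 or 1 even for negative values (plain % would give -1).
        int parity = Math.floorMod(value, 2);
        // Stay in range if the job happens to run with a single reducer.
        return parity % numPartitions;
    }
}
```

To wire this in with the old API, a class implementing org.apache.hadoop.mapred.Partitioner<Text, IntWritable> would return ParityPartition.partitionFor(value.get(), numPartitions) from getPartition(key, value, numPartitions), and the driver would call conf.setPartitionerClass(...) with that class and conf.setNumReduceTasks(2).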
You will have two output files named 'part-00000' and 'part-00001', corresponding to even and odd respectively.

On Mon, Aug 16, 2010 at 2:55 AM, rajgopalv <[email protected]> wrote:
>
> Hi. I'm a newbie in Hadoop. I'm trying out the Wordcount program.
>
> Now to try out multiple output files, i use MultipleOutputFormat. this link
> helped me in doing it.
>
> http://hadoop.apache.org/common/docs/r0.19.0/api/org/apache/hadoop/mapred/lib/MultipleOutputs.html
>
> in my driver class i had
>
>     MultipleOutputs.addNamedOutput(conf, "even",
>         org.apache.hadoop.mapred.TextOutputFormat.class, Text.class,
>         IntWritable.class);
>
>     MultipleOutputs.addNamedOutput(conf, "odd",
>         org.apache.hadoop.mapred.TextOutputFormat.class, Text.class,
>         IntWritable.class);
>
> and my reduce class became this
>
>     public static class Reduce extends MapReduceBase implements
>             Reducer<Text, IntWritable, Text, IntWritable> {
>         MultipleOutputs mos = null;
>
>         public void configure(JobConf job) {
>             mos = new MultipleOutputs(job);
>         }
>
>         public void reduce(Text key, Iterator<IntWritable> values,
>                 OutputCollector<Text, IntWritable> output, Reporter reporter)
>                 throws IOException {
>             int sum = 0;
>             while (values.hasNext()) {
>                 sum += values.next().get();
>             }
>             if (sum % 2 == 0) {
>                 mos.getCollector("even", reporter).collect(key, new IntWritable(sum));
>             } else {
>                 mos.getCollector("odd", reporter).collect(key, new IntWritable(sum));
>             }
>             //output.collect(key, new IntWritable(sum));
>         }
>
>         @Override
>         public void close() throws IOException {
>             // TODO Auto-generated method stub
>             mos.close();
>         }
>     }
>
> Things worked , but i get LOT of files, (one odd and one even for every
> map-reduce)
>
> Question is : How can i have just 2 output files (odd & even) so that every
> odd output of every reduce gets written into that odd file, and same for
> even.
>
> --
> View this message in context:
> http://old.nabble.com/MultipleOutputFormat-tp29447204p29447204.html
> Sent from the Hadoop core-user mailing list archive at Nabble.com.
