Hello everyone,

I would like to know how to emit an array as the output value in MapReduce.

I modified only the reducer of the WordCount example, as shown below:

public static class IntSumReducer extends
                Reducer<Text, IntWritable, Text, ArrayWritable> {
        private IntWritable[] iw = new IntWritable[2];
        private ArrayWritable result = new ArrayWritable(IntWritable.class, iw);

        public void reduce(Text key, Iterable<IntWritable> values,
                        Context context) throws IOException, InterruptedException {
                iw[0] = new IntWritable();  // initialize
                iw[1] = new IntWritable();  // initialize
                int sum = 0;
                for (IntWritable val : values) {
                        sum += val.get();
                }
                iw[0].set(sum);
                iw[1].set(sum);
                result.set(iw);
                context.write(key, result);
        }
}



Map outputs <Text, IntWritable>.

Reduce takes <Text, Iterable<IntWritable>> as input and should output <Text, ArrayWritable>.

As soon as I use ArrayWritable in reduce to store the results, I get the
following error:

11/03/08 17:03:07 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
11/03/08 17:03:07 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
11/03/08 17:03:07 INFO input.FileInputFormat: Total input paths to process : 1
11/03/08 17:03:07 INFO mapred.JobClient: Running job: job_local_0001
11/03/08 17:03:07 INFO input.FileInputFormat: Total input paths to process : 1
11/03/08 17:03:07 INFO mapred.MapTask: io.sort.mb = 100
11/03/08 17:03:07 INFO mapred.MapTask: data buffer = 79691776/99614720
11/03/08 17:03:07 INFO mapred.MapTask: record buffer = 262144/327680
11/03/08 17:03:08 INFO mapred.MapTask: Starting flush of map output
11/03/08 17:03:08 WARN mapred.LocalJobRunner: job_local_0001
java.io.IOException: wrong value class: class org.apache.hadoop.io.ArrayWritable is not class org.apache.hadoop.io.IntWritable
        at org.apache.hadoop.mapred.IFile$Writer.append(IFile.java:167)
        at org.apache.hadoop.mapred.Task$CombineOutputCollector.collect(Task.java:880)
        at org.apache.hadoop.mapred.Task$NewCombinerRunner$OutputConverter.write(Task.java:1201)
        at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
        at org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:59)
        at org.apache.hadoop.examples.WordCount$IntSumReducer.reduce(WordCount.java:1)
        at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:176)
        at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1222)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1265)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1129)
        at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:549)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:623)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
11/03/08 17:03:08 INFO mapred.JobClient:  map 0% reduce 0%
11/03/08 17:03:08 INFO mapred.JobClient: Job complete: job_local_0001
11/03/08 17:03:08 INFO mapred.JobClient: Counters: 0
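
One reading of the trace: the exception is thrown under
Task$NewCombinerRunner.combine, so IntSumReducer is being run as a combiner
during the map-side spill, not as the final reducer. The stock WordCount driver
registers it with job.setCombinerClass(IntSumReducer.class); assuming that line
is still in the driver here (the driver is not shown), the combiner's output is
written back into the map output, whose value class is IntWritable, and an
ArrayWritable is rejected. The plain-Java sketch below models that kind of
exact-class check without any Hadoop dependency. CheckedWriter and tryAppend
are hypothetical names, and Integer and int[] merely stand in for IntWritable
and ArrayWritable; it illustrates the check and is not Hadoop's implementation.

```java
import java.io.IOException;

public class ValueClassCheckDemo {

    /** Hypothetical stand-in for a writer that accepts exactly one value class. */
    static final class CheckedWriter {
        private final Class<?> valueClass;
        private int written = 0;

        CheckedWriter(Class<?> valueClass) {
            this.valueClass = valueClass;
        }

        /** Rejects any value whose runtime class is not exactly valueClass. */
        void append(Object value) throws IOException {
            if (value.getClass() != valueClass) {
                throw new IOException("wrong value class: " + value.getClass()
                        + " is not " + valueClass);
            }
            written++;
        }

        int written() {
            return written;
        }
    }

    /** Returns "ok" on success, or the exception message on a class mismatch. */
    static String tryAppend(CheckedWriter writer, Object value) {
        try {
            writer.append(value);
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // The "map output" is declared with Integer values (stand-in for IntWritable).
        CheckedWriter mapOutput = new CheckedWriter(Integer.class);

        // A combiner that emits the declared class is accepted.
        System.out.println(tryAppend(mapOutput, 3));

        // A combiner that emits an array (stand-in for ArrayWritable) is rejected.
        System.out.println(tryAppend(mapOutput, new int[] {3, 3}));
    }
}
```

If that reading is right, removing the setCombinerClass call (or using a
separate combiner that still emits IntWritable) should avoid this particular
exception. The driver would also need job.setOutputValueClass to match the
reducer's output type.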

