Confused about Reduce functions

Kylie McCormick Wed, 23 Jul 2008 14:52:15 -0700

Hello!
I have been getting NullPointerExceptions in my reduce() function, with the
code below. (I have removed all the "check for null pointer" if-statements,
but they are there for every object.)


I based my code off of the Word Count example. Essentially, the reduce
function is to rescore the DocumentWritable[] part of the
ResultSetWritable[] object and then let the OutputCollector have it.

However, I discovered that the iterator is returning empty instances
(i.e., the initialized value when ResultSetWritable() is called). When I
commented out this function to see what would happen, I got the following
error:

java.lang.RuntimeException: java.lang.NoSuchMethodException: edu.arsc.multisearch.ResultSetWritable.<init>()
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:80)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:62)
        at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:40)
        at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.readNextValue(ReduceTask.java:291)
        at org.apache.hadoop.mapred.ReduceTask$ValuesIterator.next(ReduceTask.java:232)
        at org.apache.hadoop.mapred.ReduceTask$ReduceValuesIterator.next(ReduceTask.java:311)
        at edu.arsc.multisearch.MergeReduce.reduce(MergeReduce.java:38)
        at edu.arsc.multisearch.MergeReduce.reduce(MergeReduce.java:18)
        at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:391)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:201)
Caused by: java.lang.NoSuchMethodException: edu.arsc.multisearch.ResultSetWritable.<init>()
        at java.lang.Class.getConstructor0(Class.java:2706)
        at java.lang.Class.getDeclaredConstructor(Class.java:1985)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:74)
        ... 9 more

...So I am confused about what the Iterator is doing. Is it generating new
ResultSetWritable objects through reflection? Why? I thought Iterator had
all the values associated with the given Key from the OutputCollector of the
Map function...

Thanks,
Kylie

--------CODE-------
    public void reduce(Text text, Iterator<ResultSetWritable> iterator,
            OutputCollector<Text, ResultSetWritable> outputCol, Reporter reporter) {

    //create the appropriate kind of final set (naive merge for now)
    NaiveMergeSet nms = new NaiveMergeSet();

    //iterate through the ServiceWritable, taking the output and merging it
    while(iterator.hasNext()) {

        //grab the output result set
        ResultSetWritable rsw = iterator.next();
        DocumentWritable[] docs = rsw.getResults();

        //merge them together
        DocumentWritable[] newScores = nms.merge(docs);

        //set the new docs to old RSW
        rsw.setResults(newScores);

        Text newKey = new Text(Multisearch.getQuery());

....

  }

}
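
--------SKETCH-------
Going by the stack trace, the values iterator deserializes each value by
creating an instance through ReflectionUtils.newInstance (which looks up a
no-argument constructor) inside WritableDeserializer.deserialize. In Hadoop's
Writable interface the fields are then filled in by readFields(DataInput), so
an instance that comes back "empty" may mean write()/readFields() are not
carrying the fields across. Below is a minimal sketch of that pattern using
only java.io, not Hadoop itself. ScoresWritable, WritableSketch and roundTrip
are hypothetical names for illustration; the real ResultSetWritable is not
shown in this post, so its fields are an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

public class WritableSketch {

    // Hypothetical stand-in for ResultSetWritable (assumed shape, not the real class).
    static class ScoresWritable {
        private double[] scores = new double[0];

        // No-argument constructor: needed so an empty instance can be
        // created reflectively before its fields are read in.
        public ScoresWritable() {}

        public ScoresWritable(double[] scores) { this.scores = scores; }

        public double[] getScores() { return scores; }

        // Serialize every field that must survive the map -> reduce hop.
        public void write(DataOutput out) throws IOException {
            out.writeInt(scores.length);
            for (double s : scores) {
                out.writeDouble(s);
            }
        }

        // Read the fields back in the same order write() emitted them.
        public void readFields(DataInput in) throws IOException {
            scores = new double[in.readInt()];
            for (int i = 0; i < scores.length; i++) {
                scores[i] = in.readDouble();
            }
        }
    }

    // Mimics the deserialization step: reflective construction, then readFields().
    static ScoresWritable roundTrip(ScoresWritable original) throws Exception {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        original.write(new DataOutputStream(bytes));

        ScoresWritable copy = ScoresWritable.class.getDeclaredConstructor().newInstance();
        copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
        return copy;
    }

    public static void main(String[] args) throws Exception {
        ScoresWritable copy = roundTrip(new ScoresWritable(new double[] {0.9, 0.4}));
        System.out.println(copy.getScores().length); // prints 2
    }
}
```

If readFields() in this sketch were left empty, the copy would come back with
the default zero-length array, which resembles the "empty instances" symptom
described above.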
