On page 243:
As I understand it, the reducer is supposed to output the first value (the
maximum) for each year, but I don't understand how it works.
Suppose we have the data
1901 200
1901 300
1901 400
Since grouping is done by year, we have only one group, but we have 3
different keys, because the key is a combination of year and temperature.
The reduce step works on (key, list(value)) pairs. Since we have 3 keys, we
should output 3 rows, but since we have only one group, we output only 1
row. So where is the conflict? What am I misunderstanding?
public static class GroupComparator extends WritableComparator {
  protected GroupComparator() {
    super(IntPair.class, true);
  }

  @Override
  public int compare(WritableComparable w1, WritableComparable w2) {
    IntPair ip1 = (IntPair) w1;
    IntPair ip2 = (IntPair) w2;
    // Compare on the first field (the year) only; the temperature is ignored.
    return IntPair.compare(ip1.getFirst(), ip2.getFirst());
  }
}

static class MaxTemperatureReducer extends MapReduceBase
    implements Reducer<IntPair, NullWritable, IntPair, NullWritable> {

  public void reduce(IntPair key, Iterator<NullWritable> values,
      OutputCollector<IntPair, NullWritable> output, Reporter reporter)
      throws IOException {
    // Emit the key passed to reduce(); the values are never iterated.
    output.collect(key, NullWritable.get());
  }
}
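
To make the three-record scenario concrete, here is a minimal plain-Java sketch of how the grouping appears to play out. It does not use Hadoop at all. It rests on two assumptions that are not shown in the code above: the sort comparator orders keys by year ascending and then temperature descending, and the framework calls reduce() once per group as defined by the grouping comparator, passing the first key of that group in sort order. The class GroupingSketch, the method run, and the use of int[] {year, temperature} in place of IntPair are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class GroupingSketch {

    // Assumed sort order: year ascending, then temperature descending.
    static final Comparator<int[]> SORT = (a, b) -> {
        int byYear = Integer.compare(a[0], b[0]);
        return byYear != 0 ? byYear : Integer.compare(b[1], a[1]);
    };

    // Mirrors GroupComparator: only the year (first field) is compared.
    static final Comparator<int[]> GROUP = (a, b) -> Integer.compare(a[0], b[0]);

    // Returns one line per simulated reduce() call. Each call emits the
    // first key of its group, as MaxTemperatureReducer does.
    static List<String> run(List<int[]> mapOutputKeys) {
        List<int[]> sorted = new ArrayList<>(mapOutputKeys);
        sorted.sort(SORT);

        List<String> output = new ArrayList<>();
        int[] previous = null;
        for (int[] key : sorted) {
            // A new group starts whenever the grouping comparator
            // says this key differs from the previous one.
            if (previous == null || GROUP.compare(previous, key) != 0) {
                output.add(key[0] + " " + key[1]);
            }
            previous = key;
        }
        return output;
    }

    public static void main(String[] args) {
        List<int[]> keys = Arrays.asList(
                new int[] {1901, 200},
                new int[] {1901, 300},
                new int[] {1901, 400});
        System.out.println(run(keys)); // prints [1901 400]
    }
}
```

Under those assumptions, the three distinct keys still exist and are sorted individually, but they fall into a single group, so there is one simulated reduce() call and one output row. The number of rows follows the number of groups, not the number of distinct keys.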