On Tue, 2 Aug 2011 21:49:22 +0800 (CST), "Daniel,Wu" <[email protected]>
wrote:
> we usually use something like values.next()  to loop every rows in a
> specific group, but I didn't see any code to loop the list, at least it
> need to get the first row in the list, which is something like
> values.get().   
> or will NullWritable.get() get the first row in the group?

No. NullWritable.get() only returns the singleton placeholder; as you said
before, the value is now in the key.

The grouping comparator receives (1900,35),(1900,34),(1900,34), and so on.
Due to the line

return -IntPair.compare(ip1.getSecond(),ip2.getSecond());

in the KeyComparator, keys with the same year are guaranteed to arrive in
descending order of the second slot.  That is, if 35 is the maximum
temperature then (1900,35) will come before ANY other (1900,t).  The
GroupComparator compares only the first slot (the year), so any time
another (1900,t) comes up it gets compared AND FOUND EQUAL TO (1900,35),
and thus its (null) value is added to the (1900,35) group.

The reducer then gets a (1900,35) key with an Iterable of null values,
which it pretty much discards and just emits the key, which contains the
maximum value.
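
To make the mechanics concrete, here is a minimal plain-Java sketch of the
same idea with no Hadoop classes involved.  The IntPair class, the two
comparators, and reduceAll() below are hypothetical stand-ins written for
illustration, not the classes from the actual job; the loop only mimics, in
a simplified way, how the framework sorts keys and then decides where one
group ends and the next begins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SecondarySortSketch {

    // Hypothetical stand-in for the composite key: (year, temperature).
    static final class IntPair {
        final int first;   // year
        final int second;  // temperature

        IntPair(int first, int second) {
            this.first = first;
            this.second = second;
        }

        @Override
        public String toString() {
            return "(" + first + "," + second + ")";
        }
    }

    // Sort order: year ascending, then temperature DESCENDING
    // (the negation is the same trick as in the quoted line).
    static final Comparator<IntPair> KEY_COMPARATOR = (a, b) -> {
        int cmp = Integer.compare(a.first, b.first);
        if (cmp != 0) {
            return cmp;
        }
        return -Integer.compare(a.second, b.second);
    };

    // Grouping: only the year is compared, so (1900,35) "equals" (1900,34).
    static final Comparator<IntPair> GROUP_COMPARATOR =
            (a, b) -> Integer.compare(a.first, b.first);

    // Simplified imitation of sort + group + reduce: emit the first key
    // of every group and ignore the (null) values that follow it.
    static List<String> reduceAll(List<IntPair> mapOutputKeys) {
        List<IntPair> sorted = new ArrayList<>(mapOutputKeys);
        sorted.sort(KEY_COMPARATOR);

        List<String> emitted = new ArrayList<>();
        IntPair groupKey = null;
        for (IntPair key : sorted) {
            if (groupKey == null || GROUP_COMPARATOR.compare(groupKey, key) != 0) {
                // A new group starts here; its first key carries the maximum.
                groupKey = key;
                emitted.add(groupKey.toString());
            }
            // Otherwise the key belongs to the current group and is skipped.
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<IntPair> keys = Arrays.asList(
                new IntPair(1900, 34), new IntPair(1901, 12),
                new IntPair(1900, 35), new IntPair(1900, 34),
                new IntPair(1901, 20));
        System.out.println(reduceAll(keys)); // prints [(1900,35), (1901,20)]
    }
}
```

Note that nothing ever iterates over the values: the maximum falls out of
the sort order alone, which is why the reducer can ignore its Iterable.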

I admit, it's a pretty subtle trick, and I'm actually glad you brought it
up since I think I may be able to use it to solve a problem I've been
having...
