Or perhaps I should ask: should the reducer's input for the year-1900 group 
look like this
(key, value pairs)
(1900,35), null
(1900,34),null
(1900,33),null


or like this
(1900,35), null
(1900,35), null    ==> since (1900,34) is in the same group as (1900,35), 
it uses (1900,35) as the key.
(1900,35), null


At 2011-08-03 10:35:51,"Daniel,Wu" <[email protected]> wrote:
>
>So the key of a group is determined by the first record to arrive in the 
>group. If we have 3 records in a group:
>1: (1900,35)
>2: (1900,34)
>3: (1900,33)
>
>If (1900,35) comes in as the first row, then the resulting key will be 
>(1900,35). When the second row (1900,34) comes in, it won't affect the key 
>of the group, meaning it will not overwrite the key (1900,35) with 
>(1900,34). Is that correct?
>
>>in the KeyComparator, these are guaranteed to come in reverse order in the 
>>second slot.  That is, if 35 is the maximum temperature then (1900,35) will 
>>come before ANY other (1900,t).  Then as the GroupComparator does its 
>>thing, any time (1900,t) comes up it gets compared AND FOUND EQUAL TO 
>>(1900,35), and thus its (null) value is added to the (1900,35) group.
>>
>>The reducer then gets a (1900,35) key with an Iterable of null values, 
>>which it pretty much discards and just emits the key, which contains the 
>>maximum value.
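
To make the quoted explanation concrete, here is a minimal sketch in plain 
Python. It only simulates the sort-then-group behaviour described above and 
is not Hadoop code. The sample records and the names sort_key, group_key and 
reduce_inputs are made up for illustration; sort_key stands in for the 
KeyComparator and group_key for the GroupComparator.

```python
from itertools import groupby

# Hypothetical map output: composite (year, temperature) keys, all values None.
records = [
    ((1900, 33), None),
    ((1900, 35), None),
    ((1901, 20), None),
    ((1900, 34), None),
]

def sort_key(record):
    # Stand-in for the KeyComparator: year ascending, temperature descending.
    (year, temp), _value = record
    return (year, -temp)

def group_key(record):
    # Stand-in for the GroupComparator: only the year is compared,
    # so (1900,34) is "found equal to" (1900,35).
    (year, _temp), _value = record
    return year

def reduce_inputs(records):
    """Return one (first_key, values) pair per group, as the quoted text describes."""
    ordered = sorted(records, key=sort_key)
    result = []
    for _year, group in groupby(ordered, key=group_key):
        group = list(group)
        first_key = group[0][0]                 # key of the first record in the group
        values = [value for _key, value in group]
        result.append((first_key, values))
    return result

calls = reduce_inputs(records)
for key, values in calls:
    print(key, values)
```

Under these assumptions the 1900 group yields a single entry: the key 
(1900, 35) with three None values. That is, one key for the whole group 
rather than one key/value pair handed over per record.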
