The output of a single reducer may be sorted and, indeed, you can even
guarantee that it is sorted, but *only* by the reduce key.  The order in
which values appear is not deterministic.
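
As a rough illustration of the point above, here is a minimal plain-Python
sketch (no Hadoop involved) that imitates a single word-count reducer.  The
function name and the sample words are made up for the example; the only
behaviour it models is that the reducer sees its keys in sorted order.

```python
from collections import defaultdict

def run_word_count(words):
    """Imitate one word-count reducer: keys arrive in sorted order."""
    groups = defaultdict(list)
    for word in words:               # map phase: emit (word, 1)
        groups[word].append(1)
    output = []
    for word in sorted(groups):      # reducer is fed keys in sorted order
        output.append((word, sum(groups[word])))
    return output

# Hypothetical input: xyz appears 100 times, abc 10 times, pqr twice.
result = run_word_count(["xyz"] * 100 + ["abc"] * 10 + ["pqr"] * 2)
for word, count in result:
    print(word, count)
```

The words come out in key order, while the counts follow no particular
order.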

To sort by value, you need to run another MR job with the count from the
first step as the key and the old reducer's output key as the value.  You
will only need an identity mapper.  If you instead use the count together
with the old key as a composite key, with an empty value, then you get a
two-level sort in one step.
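
The second step could be sketched like this, again in plain Python rather
than real Hadoop code.  The function name and sample data are hypothetical,
the list sort stands in for the framework's sort on the key, and a single
reducer is assumed so that the result is one totally ordered file.
Negating the count to get decreasing order is a choice made for this
sketch only.

```python
def second_job(first_job_output):
    """Re-key (word, count) records as ((-count, word), "") and sort by key."""
    # Map phase: composite key, empty value.  Negating the count makes an
    # ascending sort produce the largest counts first.
    records = [((-count, word), "") for word, count in first_job_output]
    # Stand-in for the sort the framework does on keys before the reduce.
    records.sort(key=lambda record: record[0])
    # Reduce phase: unpack the composite key back into (word, count).
    return [(word, -negated) for (negated, word), _ in records]

# Hypothetical output of the first (word count) job.
first = [("abc", 10), ("pqr", 2), ("xyz", 100), ("def", 10)]
ranked = second_job(first)
for word, count in ranked:
    print(word, count)
```

Ties on the count (abc and def here) are broken by the word, which is the
second level of the sort.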

Hadoop isn't magic.  If you want something sorted according to a new
ordering, *something* will have to do the work.


On 2/21/08 5:38 PM, "Tarandeep Singh" <[EMAIL PROTECTED]> wrote:

> On Thu, Feb 21, 2008 at 5:34 PM, Ted Dunning <[EMAIL PROTECTED]> wrote:
>> 
>>  Use another job step to get the sort done.
>> 
> 
> but isn't the output of reduce step sorted ?
> Also can I specify that sort be done in reverse order ?
> 
>> 
>> 
>>  On 2/21/08 5:11 PM, "Tarandeep Singh" <[EMAIL PROTECTED]> wrote:
>> 
>>> On Thu, Feb 21, 2008 at 3:46 PM, Tarandeep Singh <[EMAIL PROTECTED]>
>>> wrote:
>>>> hi,
>>>> 
>>>>  Can I sort the output of reducer based on the value instead of key.
>>>>  Also can I specify that the output should be sorted in decreasing order ?
>>>> 
>>>>  Mapper output -
>>>>   <aWord, 1>
>>>> 
>>>>  Reducer gets-
>>>>   <aWord, (1,1,...)>
>>>> 
>>>>  and outputs -
>>>>  <aWord, count>
>>>> 
>>>>  e.g abc 10
>>>>       xyz  100
>>>> 
>>>>  I want the output to be sorted based on the value and that too in
>>>>  decreasing order -
>>>>      xyz 100
>>>>      abc  10
>>>> 
>>>>  Any suggestions ?
>>>> 
>>> 
>>> I set the output format to Text and then converted the count into text
>>> and wrote this as key and the aWord as value. I was expecting an
>>> output sorted on the count now but it didn't work that way ? Could
>>> anyone explain why so ?
>>> 
>>> reducer output -
>>>   <000001, abc>
>>>   <000005, xyz>
>>>   <000002, pqr>
>>> 
>>> thanks,
>>> Taran
>>> 
>>> 
>>>>  thanks,
>>>>  Taran
>>>> 
>> 
>> 
