Hi Rob,

Sorting is an internal mechanism in Hadoop: the framework always sorts the map output by key before handing it to the reduce step, which is why nothing in the reduce method specifies the ordering. If you want to sort the result by count, you could start a second job that takes the output of the first job as its input and uses the count as the key and the word as the value.
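To make the idea concrete, here is a rough sketch in plain Java. It does not use the Hadoop API: the class and method names (`SortByCountSketch`, `countWords`, `sortByCount`) are made up for illustration, and a `TreeMap` stands in for the framework's sort-by-key step. It only shows what swapping the key and value in the second pass does to the ordering.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: plain Java, no Hadoop classes involved.
class SortByCountSketch {

    // Stand-in for the first job (WordCount): word -> count.
    // The TreeMap keeps the words in alphabetical order, like sorting by key.
    static Map<String, Integer> countWords(String text) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String token : text.split("\\s+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    // Stand-in for the second job: each (word, count) record is re-keyed
    // as (count, word). The TreeMap plays the role of the sort on keys,
    // so the records come out ordered by count (ascending).
    static List<String> sortByCount(Map<String, Integer> wordCounts) {
        TreeMap<Integer, List<String>> byCount = new TreeMap<>();
        for (Map.Entry<String, Integer> e : wordCounts.entrySet()) {
            // "map" step: swap key and value
            byCount.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                   .add(e.getKey());
        }
        List<String> output = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> e : byCount.entrySet()) {
            // "reduce" step: write one line per word that has this count
            for (String word : e.getValue()) {
                output.add(e.getKey() + "\t" + word);
            }
        }
        return output;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = countWords("b a b c b a");
        for (String line : sortByCount(counts)) {
            System.out.println(line);
        }
    }
}
```

For the sample input this lists `c` (1) first, then `a` (2), then `b` (3). The order is ascending because that is the natural ordering of the integer keys; to get the most frequent words first, the second job would need a reversed comparator on its keys.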
On Fri, Jan 15, 2010 at 2:42 PM, Rob Stewart <[email protected]> wrote:

> Hi,
>
> I am having a look at the WordCount java example here:
>
> http://hadoop.apache.org/common/docs/current/mapred_tutorial.html#Walk-through
>
> I am wanting a word count application that, instead of sorting by key
> (alphabetically by word), I want to sort by the count (frequency) of the
> words.
>
> I can't see in the reduce method in the above example where exactly the
> key/values get specified to order by key alphabetically? Or how I can
> override this to state to sort by the value of the final reduce (i.e. by the
> frequency).
>
> Thanks,
>
> Rob Stewart

--
Best Regards

Jeff Zhang
