You can specify a custom sort order through the JobConf. One way is to provide a custom class that implements WritableComparable and use it as the key class. Another way is to specify a custom comparator in the job configuration via the setOutputKeyComparatorClass() method on the JobConf object.
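As a rough illustration of the first approach, here is a minimal sketch of a composite key that orders by the number first and breaks ties with the extra information. The names TaggedKey, value and tag are made up for this example. To keep it runnable with the JDK alone it implements only Comparable; in an actual job the class would implement WritableComparable and also provide write() and readFields().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical composite key: 'value' is the number being sorted and
// 'tag' stands in for the extra information that decides the order of
// keys whose values are equal (b0 vs. b1 in the question).
// Assumption: a real Hadoop key would implement WritableComparable and
// add write()/readFields(); that part is left out of this sketch.
final class TaggedKey implements Comparable<TaggedKey> {
    final long value;
    final int tag;

    TaggedKey(long value, int tag) {
        this.value = value;
        this.tag = tag;
    }

    @Override
    public int compareTo(TaggedKey other) {
        // Primary order: the numeric value.
        int byValue = Long.compare(value, other.value);
        if (byValue != 0) {
            return byValue;
        }
        // Tie-break on the extra information so equal values get a fixed order.
        return Integer.compare(tag, other.tag);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof TaggedKey)) {
            return false;
        }
        TaggedKey k = (TaggedKey) o;
        return value == k.value && tag == k.tag;
    }

    @Override
    public int hashCode() {
        return 31 * Long.hashCode(value) + tag;
    }

    @Override
    public String toString() {
        return value + "#" + tag;
    }
}

class TaggedKeyDemo {
    public static void main(String[] args) {
        // Two keys share the value 2 and differ only in their tag.
        List<TaggedKey> keys = new ArrayList<>(Arrays.asList(
                new TaggedKey(2, 1), new TaggedKey(3, 0),
                new TaggedKey(1, 0), new TaggedKey(2, 0)));
        Collections.sort(keys);
        System.out.println(keys);
    }
}
```

Note that once the extra information is part of the key, keys with the same number no longer compare as equal, so you will probably also want a partitioner and a grouping comparator that look only at the number if those records should still reach the same reduce call.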
- Sudhir

On Aug/24/ 5:08 AM, "[email protected]" <[email protected]> wrote:

> From: Teodor Macicas <[email protected]>
> Date: Tue, 24 Aug 2010 11:21:39 +0200
> To: "[email protected]" <[email protected]>
> Subject: Hadoop sorting algorithm on equal keys
>
> Hello,
>
> Let's say that we have two maps outputs which will be sorted before the
> reducer will start. Doesn't matter what {a,b0,b1,c} mean, but let's
> assume that b0=b1.
> Map output1 : a, b0
> Map output2: c, b1
> In this case we can have 2 different sets of sorted data:
> 1. {a,b0,b1,c} and
> 2. {a,b1,b0,c} since b0=b1 .
>
> In my particular problem I want to distinguish between b0 and b1.
> Basically, they are numbers but I have extra-info on which my comparison
> will be made.
> Now, the question is: how can I change Hadoop default behaviour in order
> to control the sorting algorithm on equal keys ?
>
> Thank you in advance.
> Best,
> Teodor
