It seems you have to insert a tag into each map output tuple that tells which 
map output the tuple came from. On the reduce side, you write your own sort 
that takes the tag into account.
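
A minimal sketch of that idea, written as a plain Python simulation rather than 
real Hadoop code. Each tuple carries a hypothetical tag naming the map output it 
came from, and the comparator falls back to the tag when the keys are equal. The 
numeric keys, the tag values 1 and 2, and the names compare_tagged and 
shuffle_sort are all made up for illustration.

```python
from functools import cmp_to_key

# Hypothetical map outputs as (key, tag, value) tuples. The tag records
# which map output the tuple came from; it is an illustrative field that
# the mapper would have to emit itself.
map_output1 = [(1, 1, "a"), (5, 1, "b0")]
map_output2 = [(9, 2, "c"), (5, 2, "b1")]


def compare_tagged(left, right):
    """Order by key first; break ties between equal keys with the tag."""
    if left[0] != right[0]:
        return -1 if left[0] < right[0] else 1
    if left[1] != right[1]:
        return -1 if left[1] < right[1] else 1
    return 0


def shuffle_sort(*outputs):
    """Simulate the sort that runs before the reducer sees the tuples."""
    merged = [t for out in outputs for t in out]
    return sorted(merged, key=cmp_to_key(compare_tagged))


# The result no longer depends on the order in which map outputs arrive.
ordered = shuffle_sort(map_output2, map_output1)
print([value for _, _, value in ordered])  # ['a', 'b0', 'b1', 'c']
```

In a real job the tag would probably have to be part of the map output key, and 
a comparator like this would be registered as the job's sort comparator (for 
example through Job.setSortComparatorClass in the newer API).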


-Gang




----- Original Message ----
From: Teodor Macicas <[email protected]>
To: "[email protected]" <[email protected]>
Sent: 2010/8/24 (Tue) 5:21:39 AM
Subject: Hadoop sorting algorithm on equal keys

Hello,

Let's say that we have two map outputs which will be sorted before the reducer 
starts. It doesn't matter what {a,b0,b1,c} mean, but let's assume that b0=b1.
Map output1 : a, b0
Map output2:  c, b1
In this case we can have 2 different sets of sorted data:
1. {a,b0,b1,c}  and
2. {a,b1,b0,c}  since b0=b1.

In my particular problem I want to distinguish between b0 and b1. Basically, 
they are numbers, but I have extra info on which my comparison will be made.
Now, the question is: how can I change Hadoop's default behaviour in order to 
control the sorting algorithm on equal keys?

Thank you in advance.
Best,
Teodor



