Thanks to Ed and Prateek, who pointed this out in the earlier replies. Yes, I was using Text instead of IntWritable. The output makes sense if the keys are sorted in lexicographical order.
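For anyone who finds this thread later, here is a minimal sketch of the difference in plain Java, with no Hadoop classes involved. It assumes String can stand in for Text and Integer for IntWritable: Text compares the UTF-8 bytes of its contents, which for ASCII digit strings gives the same order as String.compareTo. The class and method names (KeyOrderDemo, sortAsText, sortAsInt) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class KeyOrderDemo {

    // Sorts the keys as strings, character by character.
    // Assumption: this mimics how Text keys are ordered for ASCII digits.
    static List<String> sortAsText(List<String> keys) {
        List<String> sorted = new ArrayList<>(keys);
        Collections.sort(sorted);
        return sorted;
    }

    // Parses the keys as integers and sorts them numerically.
    // Assumption: this mimics how IntWritable keys are ordered.
    static List<Integer> sortAsInt(List<String> keys) {
        List<Integer> sorted = new ArrayList<>();
        for (String key : keys) {
            sorted.add(Integer.parseInt(key));
        }
        Collections.sort(sorted);
        return sorted;
    }

    public static void main(String[] args) {
        // The five keys from my reducer output, in shuffled order.
        List<String> keys = Arrays.asList("18172", "1817", "18166", "18171", "18169");

        // prints [18166, 18169, 1817, 18171, 18172]
        System.out.println(sortAsText(keys));

        // prints [1817, 18166, 18169, 18171, 18172]
        System.out.println(sortAsInt(keys));
    }
}
```

In the string comparison, "18169" and "1817" first differ at the fourth character, and '6' sorts before '7', so "1817" lands after "18169". That is exactly the order I saw.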
-Gang

----- Original Message -----
From: Ed Mazur <[email protected]>
To: [email protected]
Sent: 2010/2/28 (Sun) 4:28:46 PM
Subject: Re: no complete sort

Hi Gang,

What's your reduce output key type? It looks like you're using Text instead of IntWritable, causing your keys to be sorted lexicographically instead of numerically.

Sorting is done with a comparator that defines how an arbitrary element compares to another. Hashing serves a different purpose.

Ed

On Sun, Feb 28, 2010 at 4:23 PM, Gang Luo <[email protected]> wrote:
> Hi all,
> here is a weird observation. The keys in the result of *ONE* reducer are
> ordered like this:
> 18166
> 18169
> 1817
> 18171
> 18172
>
> Why does key "1817" come after "18169"? It would make sense if that key were
> "18170", but it isn't! Why does this happen and, basically, how does Hadoop
> tell that key1 is larger than key2? Does it compare their hash codes?
>
> Thanks.
> -Gang
