There is the largely undocumented record stream stuff. You define your records in an IDL-like language that compiles to Java code. I haven't used it, but it doesn't look particularly hard.
I believe that the record stuff includes definitions of comparators.

Also, if you simply concatenate your keys into the single key that the mapper outputs, you effectively get multi-key sorting.

If you really mean that you want to sort the values that your reduce functions get, that is also possible. The trick is that you need to define a key that includes both the partitioning data (to determine which records get grouped together for reducing) and the sort key (to determine what order the reduce sees the data in). This means that you have to define two functions in your job config: one that groups records using only the partitioning part of the key, and one that orders them using the sort part. I don't have sample code off-hand for this, but it isn't hard to figure out from the javadocs.

On 12/3/07 5:10 PM, "Rui Shi" <[EMAIL PROTECTED]> wrote:

> Hi,
>
> I need to sort the data by multiple keys. Is there any built-in support in
> Hadoop?
>
> Thanks,
>
> Rui
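
A minimal sketch of the composite-key trick described in the reply, written in plain Java with no Hadoop dependency so it can be run on its own. It only simulates what the framework would do between map and reduce. The names SecondarySortSketch, CompositeKey, partition, SORT_ORDER and reduceInput are made up for illustration and are not part of any Hadoop API. In a real job, the two functions would be registered through the job config, and the javadocs are the place to check the exact hooks.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java simulation of sorting reduce values with a composite key.
// All names here are hypothetical; nothing below is a Hadoop class.
public class SecondarySortSketch {

    // Composite key: 'group' is the partitioning data,
    // 'order' is the sort key within a group.
    static final class CompositeKey {
        final String group;
        final int order;

        CompositeKey(String group, int order) {
            this.group = group;
            this.order = order;
        }
    }

    // Function 1: choose a partition from the 'group' part only, so every
    // record of a group ends up at the same reducer regardless of 'order'.
    static int partition(CompositeKey key, int numPartitions) {
        return (key.group.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Function 2: sort on the full key, first by group, then by order.
    static final Comparator<CompositeKey> SORT_ORDER =
            Comparator.comparing((CompositeKey k) -> k.group)
                      .thenComparingInt(k -> k.order);

    // Simulates what a single reducer would see: records sorted by the
    // full key, then grouped on the 'group' part alone.
    static Map<String, List<Integer>> reduceInput(List<CompositeKey> mapperOutput) {
        List<CompositeKey> sorted = new ArrayList<>(mapperOutput);
        sorted.sort(SORT_ORDER);

        Map<String, List<Integer>> groups = new LinkedHashMap<>();
        for (CompositeKey key : sorted) {
            groups.computeIfAbsent(key.group, g -> new ArrayList<>()).add(key.order);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<CompositeKey> mapperOutput = Arrays.asList(
                new CompositeKey("b", 3),
                new CompositeKey("a", 2),
                new CompositeKey("b", 1),
                new CompositeKey("a", 1));

        // Each group's values come out in sorted order.
        System.out.println(reduceInput(mapperOutput));
    }
}
```

The point of the design is that partitioning ignores the sort part of the key while ordering uses all of it, so each reduce call still receives one logical group but sees its values already sorted.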
