I think you can do this by creating your own key type that extends IntWritable
and overrides the compareTo method to reverse the sort order.
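
A minimal sketch of that idea, written in plain Java so it runs without Hadoop on the classpath. DescendingIntKey and DescendingKeyDemo are hypothetical names, and the key class here is only a stand-in: in a real job it would extend org.apache.hadoop.io.IntWritable and override compareTo in the same way, rather than holding the int itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a custom key type. In a Hadoop job this class
// would extend org.apache.hadoop.io.IntWritable instead of wrapping an int.
final class DescendingIntKey implements Comparable<DescendingIntKey> {
    private final int value;

    DescendingIntKey(int value) {
        this.value = value;
    }

    int get() {
        return value;
    }

    // Reverse the natural order: larger values compare as "smaller",
    // so an ordinary ascending sort yields descending values.
    @Override
    public int compareTo(DescendingIntKey other) {
        return Integer.compare(other.value, this.value);
    }
}

class DescendingKeyDemo {
    // Sorts the given ints using DescendingIntKey.compareTo.
    static List<Integer> sortKeys(int... values) {
        List<DescendingIntKey> keys = new ArrayList<>();
        for (int v : values) {
            keys.add(new DescendingIntKey(v));
        }
        Collections.sort(keys);

        List<Integer> sorted = new ArrayList<>();
        for (DescendingIntKey k : keys) {
            sorted.add(k.get());
        }
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(sortKeys(3, 10, 1, 7)); // prints [10, 7, 3, 1]
    }
}
```

Integer.compare is used instead of subtracting the two values, since subtraction can overflow for large ints. Hadoop also lets a job register a separate sort comparator (JobConf.setOutputKeyComparatorClass in the old API), which may be an alternative to subclassing the key.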
Cheers

Tim




On Wed, Jun 17, 2009 at 6:34 PM, Kunsheng Chen <ke...@yahoo.com> wrote:

>
> Thanks, Alex! That is really helpful; at least I know the keys are sorted in
> some way.
>
> Furthermore, could I control whether the order is ascending or descending?
> Say my keys are integers and I want them in descending order; is that easy
> to do?
>
>
> Thanks again,
>
> -Kun
>
> --- On Mon, 6/15/09, Alex Loddengaard <a...@cloudera.com> wrote:
>
> > From: Alex Loddengaard <a...@cloudera.com>
> > Subject: Re: Anyway to sort "keys" before Reduce function in Hadoop ?
> > To: core-user@hadoop.apache.org
> > Date: Monday, June 15, 2009, 11:53 PM
> > Hey Kun,
> >
> > Keys are passed to a given reducer instance in sorted order.  That is,
> > for a given reducer JVM instance, the reduce function will be called
> > several times, once for each key, and the keys arrive in sorted order.
> > The sorting happens in the shuffle phase, which is basically partitioning
> > and sorting.  That said, if you have one reducer (which isn't practical
> > in large jobs), all keys will be given to you in sorted order.
> >
> > You may be interested in the combiner phase, which is essentially a mini
> > reduce that happens before data is transferred between mapper and reducer:
> >
> > <http://wiki.apache.org/hadoop/HadoopMapReduce> (grep for "combine")
> >
> > You may also find these videos useful:
> > <http://www.cloudera.com/hadoop-training-mapreduce-hdfs>
> > <http://www.cloudera.com/hadoop-training-programming-with-hadoop>
> >
> > Hope this helps.  Let me know if I misunderstood your question.
> >
> > Alex
> >
> > On Mon, Jun 15, 2009 at 4:22 PM, Kunsheng Chen <ke...@yahoo.com> wrote:
> >
> > >
> > > Hi everyone,
> > >
> > > Is there any way to sort the "keys" before Reduce but after Map?
> > >
> > >
> > > I also thought of sorting the keys myself in the Reduce function, but
> > > it might take too much memory once the number of results gets large.
> > >
> > > I am thinking of using some numeric value as "keys" in Reduce (which was
> > > calculated by Map). If that is possible, I could easily output my
> > > results in some order.
> > >
> > >
> > > Thanks in advance,
> > >
> > > -Kun
> > >
> > >
> > >
> > >
> >
>
>
>
>
