sort the values in reduce side

exception Sun, 30 Jan 2011 02:06:55 -0800

Hi,

I am running a simple inverted-index program in Hadoop that emits every word 
in a text file along with its offsets.
The output key is Text and the output value is a list of LongWritable.


What I am trying to do is sort the offsets in the reduce function. For each 
key, I put every value into a List and sort it using Collections.sort().

This is the code snippet:
            offsetList.clear();
            for (LongWritable val : values)
            {
                offsetList.add(val);
            }
            Collections.sort(offsetList);


            for (LongWritable offset : offsetList)
            {
                ......
            }

But it doesn't work. It looks like all the elements in offsetList have been 
overwritten by the smallest value in values, even though offsetList and values 
have the same size.
Can I sort the data this way?
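
One possible explanation, sketched in plain Java without Hadoop: as far as I 
understand, the reduce-side iterator reuses a single LongWritable instance and 
only changes its contents on each step, so adding val to the list stores the 
same reference every time. In the sketch below, MutableLong, reusingValues, 
sortedReferences and sortedCopies are hypothetical names made up for 
illustration (MutableLong stands in for LongWritable); none of them is part of 
Hadoop.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

final class ReuseSketch {
    // Hypothetical stand-in for LongWritable: a mutable, comparable box.
    static final class MutableLong implements Comparable<MutableLong> {
        private long value;
        MutableLong(long value) { this.value = value; }
        long get() { return value; }
        void set(long value) { this.value = value; }
        @Override
        public int compareTo(MutableLong other) {
            return Long.compare(value, other.value);
        }
    }

    // Assumption: mimics an iterator that hands out the SAME object on every
    // step and only overwrites its contents.
    static Iterable<MutableLong> reusingValues(long[] raw) {
        return () -> new Iterator<MutableLong>() {
            private final MutableLong shared = new MutableLong(0);
            private int index = 0;
            @Override
            public boolean hasNext() { return index < raw.length; }
            @Override
            public MutableLong next() {
                if (!hasNext()) throw new NoSuchElementException();
                shared.set(raw[index++]);
                return shared;
            }
        };
    }

    // Stores the reference, as in the snippet above: every list slot ends up
    // pointing at the one shared object.
    static List<Long> sortedReferences(long[] raw) {
        List<MutableLong> list = new ArrayList<>();
        for (MutableLong val : reusingValues(raw)) list.add(val);
        Collections.sort(list);
        return unwrap(list);
    }

    // Stores a fresh copy of each value before sorting.
    static List<Long> sortedCopies(long[] raw) {
        List<MutableLong> list = new ArrayList<>();
        for (MutableLong val : reusingValues(raw)) list.add(new MutableLong(val.get()));
        Collections.sort(list);
        return unwrap(list);
    }

    private static List<Long> unwrap(List<MutableLong> boxes) {
        List<Long> out = new ArrayList<>();
        for (MutableLong box : boxes) out.add(box.get());
        return out;
    }

    public static void main(String[] args) {
        long[] raw = {30, 10, 20};
        System.out.println(sortedReferences(raw)); // same object in every slot
        System.out.println(sortedCopies(raw));     // independent, sorted values
    }
}
```

If that assumption holds for the real job, the equivalent change in the 
snippet would be offsetList.add(new LongWritable(val.get())) instead of 
offsetList.add(val).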

Thanks.
