On Jan 15, 2008, at 17:56, Peter W. wrote:

That would output the last 10 values for each key. I need
to do this across all the keys in the set.

Vadim

Hello,

Try using a Java collection.

Untested code follows...

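import java.io.IOException;
import java.util.Iterator;
import java.util.Stack;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.WritableComparable;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// nested inside the job class
public static class R extends MapReduceBase implements Reducer
  {
  public void reduce(WritableComparable wc,Iterator it,
     OutputCollector out,Reporter r)throws IOException
     {
     Stack s=new Stack();
     int cnt=0;

     // everything: copy each value into its own IntWritable so the
     // stack holds IntWritable objects (not boxed ints) and the cast
     // on pop() is valid
     while(it.hasNext())
        {
        s.push(new IntWritable(((IntWritable)it.next()).get()));
        }

     // last 10, most recent first (cnt<10, not cnt<=10, which emits 11)
     while((!s.empty())&&(cnt<10))
        {
        out.collect(wc,(IntWritable)s.pop());
        cnt++;
        }
     }
  }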

Good Luck,

Peter W.

Vadim Zaliva wrote:


Rui Shi wrote:

As far as I understand, having each mapper produce the top N records
does not work, because each mapper has only partial knowledge of the
data, which will not lead to a global optimum... I think your mapper
needs to output all records (combined) and let the reducer pick the
top N values.
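
The reducer-side selection described above can be sketched in plain Java. This is only an illustration of picking the N largest values from a stream with a bounded min-heap; the class name TopN and the method largest() are hypothetical, and nothing here touches the Hadoop API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of Hadoop.
final class TopN {
    private TopN() {}

    // Returns the n largest values seen, largest first.
    static List<Integer> largest(Iterable<Integer> values, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        // Min-heap: the smallest of the current candidates sits at the head.
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (Integer v : values) {
            if (heap.size() < n) {
                heap.add(v);
            } else if (v > heap.peek()) {
                // v beats the weakest candidate, so replace it.
                heap.poll();
                heap.add(v);
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }
}
```

Memory stays bounded by N regardless of how many values the reducer sees, which matters when all records are funnelled to one place.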

The question remains: how do I return, say, the last 10 records from the Reducer?
I need to know when the last record has been processed.

Vadim
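
One possible way around needing to detect the final record is to keep a bounded buffer of the most recent records and write it out only once the input is exhausted. In the old org.apache.hadoop.mapred API that would presumably mean running a single reduce task, saving the OutputCollector in a field during reduce(), and emitting the buffer from close(); that wiring is an assumption and is not shown. The sketch below is plain Java and covers only the buffer itself; LastN is a hypothetical name.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical helper, not part of Hadoop: remembers only the most
// recent `capacity` items it has been given, in arrival order.
final class LastN<T> {
    private final int capacity;
    private final Deque<T> buffer = new ArrayDeque<>();

    LastN(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    // Call once per record; evicts the oldest item when the buffer is full.
    void add(T item) {
        if (buffer.size() == capacity) {
            buffer.removeFirst();
        }
        buffer.addLast(item);
    }

    // Call once at the end; returns a copy with the oldest item first.
    List<T> snapshot() {
        return new ArrayList<>(buffer);
    }
}
```

If the buffered items are Writable values taken from the reducer's iterator, it is safer to store copies rather than the iterator's own objects, since the framework may reuse them between calls.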

