On Jan 15, 2008, at 17:02, Rui Shi wrote:
As far as I understand, letting each mapper produce its top N records does not work, as each mapper only has partial knowledge of the data, which will not lead to a global optimum... I think your mapper needs to output all records (combined) and let the reducer pick the top N values.
The question remains how to return, say, the last 10 records from the Reducer. I need to know when the last record has been processed. Vadim
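
One possible way around this, sketched in plain Python rather than against the real Hadoop API: instead of detecting the last record, keep a bounded buffer of the most recent N records and emit it only when the framework signals end of input. The names LastNReducer, run_reducer, reduce() and close() here are hypothetical stand-ins. As far as I know the org.apache.hadoop.mapred Reducer has a close() hook that is called after the final reduce() call, but it does not receive the output collector, so the collector would have to be saved from reduce(). This also assumes a single reducer, otherwise each reducer only sees its own partition.

```python
from collections import deque


class LastNReducer:
    """Hypothetical reducer that remembers only the last n records it sees."""

    def __init__(self, n):
        # A deque with maxlen silently drops the oldest entry once it is full.
        self.buffer = deque(maxlen=n)

    def reduce(self, key, values):
        # Called once per key, with keys arriving in sorted order.
        # Nothing is emitted here; records are only buffered.
        for value in values:
            self.buffer.append((key, value))

    def close(self):
        # Called once after the final reduce() call, so the buffer
        # now holds exactly the last n records of the input.
        return list(self.buffer)


def run_reducer(reducer, grouped_records):
    """Stand-in for the framework: feed sorted groups, then signal end of input."""
    for key, values in grouped_records:
        reducer.reduce(key, values)
    return reducer.close()


if __name__ == "__main__":
    groups = [(k, ["record-%d" % k]) for k in range(1, 26)]
    for key, value in run_reducer(LastNReducer(10), groups):
        print(key, value)
```

Memory use stays bounded by N no matter how many records pass through, since the reducer never needs to know in advance which record is the last one.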