Welcome to the land of the fuzzy elephant!
Of course there are many ways to do it. Here is one; it might not be brilliant
or the right way, but I am sure you will get more :)
Use the identity mapper...
job.setMapperClass(Mapper.class);
then have one reducer....
job.setNumReduceTasks(1);
then have a reducer that has something like this around your reducing code...
Counter counter = context.getCounter("ME", "total output records");
if (counter.getValue() < LIMIT) {
    // <do your reducey stuff here>
    context.write(key, value);
    counter.increment(1);
}
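One caveat: the guard above stops after the first LIMIT records the reducer sees, and those arrive in key order, not in order of count. If you want the N items with the largest summed counts, the selection step might look roughly like the sketch below. It is plain Java with no Hadoop dependency, so you can try it on its own. The class name TopNSketch, the method topN and the tab-separated "item<TAB>count" line format are all my assumptions, not anything from your data or from Hadoop. In a real job the summing would sit in reduce() and the heap could be written out from the reducer's cleanup().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical helper (not part of Hadoop): sums counts per item
// and keeps only the N largest totals.
class TopNSketch {

    // Assumption: each input line looks like "item<TAB>count".
    static List<Map.Entry<String, Long>> topN(List<String> lines, int n) {
        // Step 1: total the counts for each item, since items can repeat.
        Map<String, Long> totals = new HashMap<>();
        for (String line : lines) {
            String[] parts = line.split("\t");
            totals.merge(parts[0].trim(), Long.parseLong(parts[1].trim()), Long::sum);
        }

        // Step 2: min-heap holding at most n entries; the smallest
        // retained total sits on top and is evicted first.
        PriorityQueue<Map.Entry<String, Long>> heap = new PriorityQueue<>(
                Comparator.comparingLong((Map.Entry<String, Long> e) -> e.getValue()));
        for (Map.Entry<String, Long> e : totals.entrySet()) {
            heap.add(e);
            if (heap.size() > n) {
                heap.poll(); // drop the current smallest total
            }
        }

        // Step 3: sort the survivors, largest total first.
        List<Map.Entry<String, Long>> result = new ArrayList<>(heap);
        result.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("apple\t3", "pear\t5", "apple\t4", "fig\t1");
        for (Map.Entry<String, Long> e : topN(lines, 2)) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

The heap never holds more than N entries, so memory stays small even when there are many distinct items. That is also why a single reducer can cope with this step.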
Cheers
James.
On 2010-09-10, at 3:04 PM, Neil Ghosh wrote:
Hello,
I am new to Hadoop. Can anybody suggest an example or procedure for
outputting the top N items with the maximum total count, where the input file has
an (item, count) pair on each line?
Items can repeat.
Thanks
Neil
http://neilghosh.com