Top K words problem

Zhige Xin Sat, 09 Aug 2014 09:50:31 -0700

I have a question about Hadoop: how can the WordCount program be modified
to output the top K words by number of occurrences?


The naive method is to count all the words and then sort them by count, but
it needs too many lines of code and is not elegant. Another approach uses a
TreeMap to keep only the K most frequent words seen so far, which takes only
about 100 lines and reduces the time complexity, since the full list of
words never has to be sorted.
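
To make the TreeMap idea concrete, here is a minimal sketch in plain Java,
without any Hadoop classes. The class name TopKWords and the method topK are
made up for illustration, and the sketch assumes the word counts have already
been collected into a Map. Grouping words by count and breaking ties
alphabetically is my own choice, not part of any existing program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

public class TopKWords {

    // Hypothetical helper: returns up to k words with the highest counts,
    // ordered by count (descending), ties ordered alphabetically.
    public static List<String> topK(Map<String, Integer> counts, int k) {
        List<String> result = new ArrayList<>();
        if (k <= 0) {
            return result;
        }

        // count -> words with that count. TreeMap keeps the counts sorted,
        // so the smallest count is always available via firstEntry().
        TreeMap<Integer, TreeSet<String>> byCount = new TreeMap<>();
        int size = 0; // number of words currently kept

        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            byCount.computeIfAbsent(e.getValue(), c -> new TreeSet<>())
                   .add(e.getKey());
            size++;

            // Keep at most k words: evict one word from the lowest bucket.
            if (size > k) {
                Map.Entry<Integer, TreeSet<String>> lowest = byCount.firstEntry();
                lowest.getValue().pollLast(); // drop alphabetically last word
                if (lowest.getValue().isEmpty()) {
                    byCount.remove(lowest.getKey());
                }
                size--;
            }
        }

        // Walk the buckets from the highest count down to the lowest.
        for (TreeSet<String> words : byCount.descendingMap().values()) {
            result.addAll(words);
        }
        return result;
    }
}
```

Because the map never holds more than K words, each insertion costs roughly
O(log K) instead of sorting every distinct word. For example, calling
TopKWords.topK(counts, 2) on the counts the=5, hadoop=4, a=3 should return
the list [the, hadoop].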

Are there any other ways? Any ideas are welcome.




Best,
Isaiah
