I have a question about Hadoop: how can I modify the WordCount program to
output the top K words by number of occurrences?

The naive method is to count all the words and then sort them, but it needs
too many lines of code and is not elegant. Another approach uses a data
structure called TreeMap, which takes only 100 lines and reduces the time
complexity.
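
To make the TreeMap idea concrete, here is a minimal sketch in plain Java with no Hadoop dependencies. It assumes the word counts are already available as a Map; the class name TopKWords and the method topK are hypothetical names chosen for illustration, not part of any Hadoop API. The map is keyed by count and never holds more than K words, so each insertion costs roughly O(log K) instead of the cost of sorting everything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TopKWords {
    /**
     * Returns up to k words with the highest counts, most frequent first.
     * Ties at the cutoff are broken arbitrarily.
     */
    public static List<String> topK(Map<String, Integer> counts, int k) {
        // Key: count. Value: all words seen with that count, so that
        // words with equal counts do not overwrite each other.
        TreeMap<Integer, List<String>> byCount = new TreeMap<>();
        int held = 0; // number of words currently kept

        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            byCount.computeIfAbsent(e.getValue(), c -> new ArrayList<>())
                   .add(e.getKey());
            held++;

            if (held > k) {
                // Evict one word from the bucket with the smallest count.
                Map.Entry<Integer, List<String>> lowest = byCount.firstEntry();
                List<String> words = lowest.getValue();
                words.remove(words.size() - 1);
                if (words.isEmpty()) {
                    byCount.remove(lowest.getKey());
                }
                held--;
            }
        }

        // Walk the buckets from the highest count to the lowest.
        List<String> result = new ArrayList<>();
        for (List<String> words : byCount.descendingMap().values()) {
            result.addAll(words);
        }
        return result;
    }
}
```

In a MapReduce job, one common arrangement (again, only an assumption about how this would be wired in) is to keep such a bounded map in each task, emit its contents from cleanup(), and merge the partial results in a single reducer.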

Are there any other ways? Any ideas are welcome.




Best,
Isaiah
