Hi Sebastien,

On 8/14/07, Sebastien Rainville <[EMAIL PROTECTED]> wrote:
>
> I am new to Hadoop. Looking at the documentation, I figured out how to
> write map and reduce functions but now I'm stuck... How do we work with
> the output file produced by the reducer? For example, the word count
> example produces a file with words as keys and the number of occurrences
> of each word as the values. Now, let's say I want to get the total
> number of words by analyzing the output file... how I am supposed to do
> it?

I asked a similar question some time ago and haven't had any response so far, so I hope you will get one.

Regarding your particular question: assuming each line in the output files contains exactly one word and its count, counting the number of lines in the output files gives the number of distinct words, and summing the counts on those lines gives the total number of word occurrences. But if you're looking for the count of a particular word, I wonder whether scanning through the output files for a line that starts with that word is really an efficient solution.

--
regards,
Jeroen
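
P.S. Here is a minimal sketch of the line counting and scanning described above, in Python rather than as another MapReduce job. It assumes the reducer output has been copied to a local directory, that the files are named part-* and that each line has the form word<TAB>count. The function names (read_counts, summarize, lookup) are made up for illustration and are not part of Hadoop.

```python
import glob
import os


def read_counts(output_dir):
    """Yield (word, count) pairs from every part-* file in output_dir.

    Assumes each non-empty line is "word<TAB>count".
    """
    for path in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if not line:
                    continue
                # Split on the last tab so the count is always the final field.
                word, _, count = line.rpartition("\t")
                yield word, int(count)


def summarize(output_dir):
    """Return (distinct_words, total_occurrences) for the whole output."""
    distinct = 0
    total = 0
    for _, count in read_counts(output_dir):
        distinct += 1   # one line per distinct word
        total += count  # sum of the per-word counts
    return distinct, total


def lookup(output_dir, target):
    """Linear scan for a single word; returns its count, or 0 if absent."""
    for word, count in read_counts(output_dir):
        if word == target:
            return count
    return 0
```

The lookup is the plain scan I was doubtful about: it reads every line until it finds the word, so it is only reasonable for small outputs or one-off queries.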
