Re: Working with the output files of a hadoop application

Jeroen Verhagen Wed, 15 Aug 2007 05:16:37 -0700

Hi Sebastien,

On 8/14/07, Sebastien Rainville <[EMAIL PROTECTED]> wrote:
>
> I am new to Hadoop. Looking at the documentation, I figured out how to
> write map and reduce functions but now I'm stuck... How do we work with
> the output file produced by the reducer? For example, the word count
> example produces a file with words as keys and the number of occurrences
> of each word as the values. Now, let's say I want to get the total
> number of words by analyzing the output file... how I am supposed to do
> it?


I asked a similar question some time ago and haven't had any response
so far, so I hope you will get one.

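Regarding your particular question, assuming each line in the output
files contains exactly one word, counting the number of lines in the
output files would give you the number of distinct words. If you want
the total number of word occurrences instead, you would have to add up
the counts on those lines.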
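
To make that concrete, here is a rough sketch in Python. It assumes the
output directory has already been copied to the local filesystem, that
the reducers wrote files named part-*, and that each line has the form
word<TAB>count. The function name summarize_wordcount_output is only
made up for this illustration.

```python
import glob
import os


def summarize_wordcount_output(output_dir):
    """Return (distinct_words, total_occurrences) for a local copy of the output.

    Assumes one "word<TAB>count" record per line in files named part-*.
    """
    distinct = 0
    total = 0
    # Each reducer writes its own part file, so all of them must be read.
    for path in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.rstrip("\n")
                if not line:
                    continue  # skip blank lines
                # Split on the last tab so the count is always the final field.
                word, _, count = line.rpartition("\t")
                distinct += 1
                total += int(count)
    return distinct, total
```

Calling summarize_wordcount_output("wordcount-output") would then return
both numbers in one pass, where "wordcount-output" stands in for
wherever you put the local copy.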

But if you're looking for the count of a particular word, I doubt that
scanning through the output files for a line that starts with that word
is a very efficient solution.
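
For comparison, a small Python sketch of the two obvious approaches:
scanning the files on every lookup, or reading them once into a
dictionary and looking words up from memory. It makes the same
assumptions as before (a local copy of the output, files named part-*,
lines of the form word<TAB>count), and the function names are invented
for the example. The dictionary only helps if the whole output fits in
memory.

```python
import glob
import os


def find_count_by_scan(output_dir, word):
    """Scan every part file until a line starting with word<TAB> is found."""
    prefix = word + "\t"
    for path in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.startswith(prefix):
                    # int() tolerates the trailing newline.
                    return int(line[len(prefix):])
    return None  # word does not occur in the output


def load_counts(output_dir):
    """Read all part files once and return a {word: count} dictionary."""
    counts = {}
    for path in sorted(glob.glob(os.path.join(output_dir, "part-*"))):
        with open(path, encoding="utf-8") as f:
            for line in f:
                word, sep, count = line.rstrip("\n").rpartition("\t")
                if sep:  # ignore lines without a tab
                    counts[word] = int(count)
    return counts
```

The scan reads, in the worst case, all of the output for every single
lookup, while load_counts pays that cost once and answers later lookups
with counts.get(word).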

-- 

regards,

Jeroen
