The manual way is to copy the split files to your local filesystem using 'hadoop fs -copyToLocal'. You could also write code to read that data directly from HDFS.
What I do is set the reduce output to be in SequenceFile format, and then create a new SequenceFile.Reader to read the split files from HDFS.

Calvin

On 8/15/07, Jeroen Verhagen <[EMAIL PROTECTED]> wrote:
> Hi Sebastien,
>
> On 8/14/07, Sebastien Rainville <[EMAIL PROTECTED]> wrote:
> >
> > I am new to Hadoop. Looking at the documentation, I figured out how to
> > write map and reduce functions, but now I'm stuck... How do we work with
> > the output file produced by the reducer? For example, the word count
> > example produces a file with words as keys and the number of occurrences
> > of each word as the values. Now, let's say I want to get the total
> > number of words by analyzing the output file... how am I supposed to do
> > it?
>
> I asked a similar question some time ago and haven't had any response
> so far, so I hope you will get one.
>
> Regarding your particular question, assuming each line in the output
> files contains exactly one word, counting the number of lines in the
> output files would give you the number of distinct words.
>
> But if you're looking for the count of a particular word, I wonder
> whether scanning through the output files for a line that starts with
> the word you're looking for is an efficient solution.
>
> --
>
> regards,
>
> Jeroen
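If the job writes its output with the default text output format rather than as a SequenceFile, each line of the reducer output is a key, a tab, and a value. Summing the count column then gives the total number of word occurrences, while counting lines gives the number of distinct words, as Jeroen notes. Below is a minimal sketch, not from the thread: plain Java with no Hadoop dependency, assuming the output files have already been copied to the local filesystem (or any Reader) and that the "word<TAB>count" line format holds. The class and method names are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

public class TotalWords {
    // Sum the counts from word-count output lines of the
    // assumed form "word<TAB>count", one entry per line.
    static long totalCount(BufferedReader reader) throws IOException {
        long total = 0;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank lines defensively
            }
            String[] parts = line.split("\t");
            total += Long.parseLong(parts[1]);
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the reducer output; in practice, wrap a
        // FileReader over each part-NNNNN file and sum the results.
        String output = "hadoop\t3\nthe\t5\nword\t2\n";
        BufferedReader reader = new BufferedReader(new StringReader(output));
        System.out.println(totalCount(reader)); // prints 10
    }
}
```

For large outputs it would be preferable to do this summing as a second, trivial MapReduce job rather than pulling all the part files to one machine.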
