Re: Input and Output types?

Owen O'Malley Fri, 18 Apr 2008 08:44:41 -0700


On Apr 17, 2008, at 11:20 PM, Sridhar Raman wrote:

I am new to MapReduce and Hadoop, and I have managed to find my way through
with a few programs.  But I still have some doubts that are constantly
clinging onto me. I am not too sure whether these are basic doubts, or just
some documentation that I missed somewhere.


Take a look at http://tinyurl.com/4y7776 under InputFormats.

1) Should my input _always_ be text files? What if my input is in the form
of Java objects?  Where do I handle this conversion?

You can define your own InputFormat that reads an arbitrary format, or use SequenceFileInputFormat that reads SequenceFiles. SequenceFiles are a file format defined by Hadoop to hold binary data consisting of Writable keys and values.
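
To give a feel for the "Java objects" part, here is a minimal sketch of a Writable-style value type. Assumptions: the Writable interface is re-declared locally (with the same two methods as org.apache.hadoop.io.Writable) so the example compiles without Hadoop on the classpath, and PointWritable and WritableRoundTrip are hypothetical names made up for illustration. In a real job you would implement Hadoop's own interface and let SequenceFiles do the reading and writing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in with the same two methods as org.apache.hadoop.io.Writable,
// declared here only so the sketch is self-contained.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical value type: a 2-D point stored as two ints.
class PointWritable implements Writable {
    int x;
    int y;

    PointWritable() {}  // no-arg constructor so an empty instance can be filled by readFields

    PointWritable(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeInt(x);
        out.writeInt(y);
    }

    // Read the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        x = in.readInt();
        y = in.readInt();
    }
}

// Hypothetical helper that round-trips a point through a byte array,
// standing in for what a binary file format does on disk.
class WritableRoundTrip {
    static byte[] toBytes(Writable w) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            w.write(new DataOutputStream(buf));
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static PointWritable fromBytes(byte[] bytes) {
        try {
            PointWritable p = new PointWritable();
            p.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
            return p;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The conversion between your objects and bytes lives in write and readFields; the input and output formats only decide how those bytes are laid out in files.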

2) How do I control how the output is written? For example, if I want to
output in a format that is my own, how do I do it?

That is controlled by the OutputFormat. It defaults to TextOutputFormat, but you can either use SequenceFileOutputFormat or make your own.
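
If you make your own, most of the work is in the record writer that the OutputFormat hands back, since that is where each key/value pair gets turned into output. Below is a rough sketch of just that piece in plain Java. Assumptions: SeparatorRecordWriter is a hypothetical class, not a Hadoop one, and it writes to a java.io.Writer instead of a file in the job's output directory. It mimics the line-per-record layout of TextOutputFormat (key, separator, value, newline) but with a separator of your choosing.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

// Hypothetical stand-in for the record-writing half of a custom OutputFormat.
// It is not a Hadoop class; it only illustrates where the formatting decision lives.
class SeparatorRecordWriter<K, V> {
    private final Writer out;
    private final String separator;

    SeparatorRecordWriter(Writer out, String separator) {
        this.out = out;
        this.separator = separator;
    }

    // Write one record as: key, separator, value, newline.
    void write(K key, V value) {
        try {
            out.write(String.valueOf(key));
            out.write(separator);
            out.write(String.valueOf(value));
            out.write('\n');
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Flush and release the underlying writer once all records are written.
    void close() {
        try {
            out.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For example, writing ("apple", 3) and ("pear", 5) through a writer built with "|" as the separator produces two lines, one per record. A real implementation would wrap this logic in Hadoop's OutputFormat and RecordWriter types and be set on the job configuration.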


-- Owen
