Hi,

On Sat, Mar 5, 2011 at 9:03 AM, maha <[email protected]> wrote:
> Hi,
>
> I have 2 questions:
>
> 1) Is a SequenceFile more efficient than TextFiles for input? ... I think 
> TextFiles will be processed by TextInputFormat into SequenceFiles inside 
> Hadoop. So will SequenceFiles (i.e. binary input files) be more efficient?

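That depends on your scenario. Note that TextInputFormat does not convert
text into SequenceFiles internally; it hands each line to the mapper as a
(byte offset, line) record.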

> 2) If I decide to use SequenceFiles as the InputFormat, do I need to stick to 
> the header protocol defined in 
> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.html
>  ?

No. You would use the SequenceFileInputFormat and SequenceFileOutputFormat
classes, which take care of the header for you.
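
To show that the header is just an on-disk detail the library manages,
here is a rough sketch that only peeks at it. Per the SequenceFile
javadoc linked above, a file starts with the three magic bytes "SEQ"
followed by one version byte. The class and method names below
(SeqMagicCheck, sequenceFileVersion) are made up for illustration and
are not part of Hadoop; the sketch uses only the JDK and inspects
nothing beyond those first four bytes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical helper, not a Hadoop class: checks the first bytes of a file
// for the "SEQ" magic that the SequenceFile javadoc describes.
class SeqMagicCheck {

    // Returns the version byte (0-255) if the data starts with 'S','E','Q'
    // plus a version byte, or -1 if it does not look like a SequenceFile.
    static int sequenceFileVersion(byte[] head) {
        if (head == null || head.length < 4) {
            return -1;
        }
        if (head[0] != 'S' || head[1] != 'E' || head[2] != 'Q') {
            return -1;
        }
        return head[3] & 0xff;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java SeqMagicCheck <local file>");
            return;
        }
        // Read at most the first four bytes of the given local file.
        byte[] head = new byte[4];
        int len = 0;
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            int n;
            while (len < 4 && (n = in.read(head, len, 4 - len)) > 0) {
                len += n;
            }
        }
        int version = sequenceFileVersion(Arrays.copyOf(head, len));
        if (version < 0) {
            System.out.println("no SEQ magic found");
        } else {
            System.out.println("SEQ magic found, version byte " + version);
        }
    }
}
```

In a real job you would not do any of this by hand; the reader and
writer behind the two classes named above deal with the header.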

May I suggest reading a good Hadoop book that covers the little,
scattered stuff like this, neatly? I like Tom White's Hadoop: The
Definitive Guide :)

-- 
Harsh J
www.harshj.com
