Thanks again, Harsh. I actually got the book 2 days ago, but haven't had time to read it yet.

Maha
On Mar 4, 2011, at 7:54 PM, Harsh J wrote:

> Hi,
>
> On Sat, Mar 5, 2011 at 9:03 AM, maha <[email protected]> wrote:
>> Hi,
>>
>> I have 2 questions:
>>
>> 1) Is a SequenceFile more efficient than TextFiles for input? ... I think
>> TextFiles will be processed by TextInputFormat into sequenceFiles inside
>> hadoop. So will SequenceFiles (ie.binary input Files) be more efficient ?
>
> Depends on what your scenario is.
>
>> 2) If I decided to use SequenceFiles as InputFormat, Do I need to stick to
>> the header protocol defined in
>> http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/io/SequenceFile.html
>> ?
>
> No. You would use SequenceFileInputFormat and SequenceFileOutputFormat
> classes.
>
> May I suggest reading a good Hadoop book that covers the little,
> scattered stuff like this, neatly? I like Tom White's Hadoop: The
> Definitive Guide :)
>
> --
> Harsh J
> www.harshj.com
