Hello,

On Sun, Apr 3, 2011 at 6:49 AM, maha <m...@umail.ucsb.edu> wrote:
> Hi Harsh,
>
> My job is for a Similarity Search application. But, my aim for now is to
> measure the IO overhead if my mapper.map() opened a sequence file and started
> to read it record by record with:
>
> SequenceFile.Reader.next(key,value);
>
> I want to make sure that "next" here is IO efficient. Otherwise, I will
> need to write it myself to be block read then parsed in my program using the
> "sync" hints.

You could have a look at the source code of the SequenceFile.Reader class; it should clear up any doubts you have.

> what parameter is used for the buffer size?

Records are not preloaded into memory. Each record is read off a buffered input stream using its key/value size information. You can specify a buffer size when constructing a Reader object for a SequenceFile; otherwise the "io.file.buffer.size" value is used as the default.

-- 
Harsh J
http://harshj.com
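
A rough sketch of the idea described above, reading size-prefixed records one at a time off a buffered input stream. It uses only the Java standard library and is not Hadoop's code: the class name LengthPrefixedRecords, the methods writeRecords/readRecords, and the record layout (key length, value length, key bytes, value bytes) are all assumptions made up for illustration, not the actual SequenceFile on-disk format. The bufferSize argument only plays the role that "io.file.buffer.size" plays for a SequenceFile.Reader.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; NOT the SequenceFile format or Hadoop's reader.
public class LengthPrefixedRecords {

    // Writes each {key, value} pair as: int keyLength, int valueLength, key bytes, value bytes.
    public static byte[] writeRecords(String[][] records) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            for (String[] record : records) {
                byte[] key = record[0].getBytes(StandardCharsets.UTF_8);
                byte[] value = record[1].getBytes(StandardCharsets.UTF_8);
                out.writeInt(key.length);
                out.writeInt(value.length);
                out.write(key);
                out.write(value);
            }
        }
        return bytes.toByteArray();
    }

    // Reads records one by one. The BufferedInputStream fetches up to bufferSize
    // bytes at a time from the underlying stream, so the small per-record reads
    // below are served from memory rather than each hitting the source.
    public static List<String[]> readRecords(byte[] data, int bufferSize) throws IOException {
        List<String[]> result = new ArrayList<>();
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(new ByteArrayInputStream(data), bufferSize))) {
            while (true) {
                int keyLength;
                try {
                    keyLength = in.readInt();
                } catch (EOFException endOfStream) {
                    break; // no more records
                }
                int valueLength = in.readInt();
                byte[] key = new byte[keyLength];
                byte[] value = new byte[valueLength];
                in.readFully(key);   // only this record's bytes are materialized
                in.readFully(value);
                result.add(new String[] {
                    new String(key, StandardCharsets.UTF_8),
                    new String(value, StandardCharsets.UTF_8)
                });
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = writeRecords(new String[][] {{"doc1", "alpha"}, {"doc2", "beta"}});
        for (String[] record : readRecords(data, 4096)) {
            System.out.println(record[0] + " -> " + record[1]);
        }
    }
}
```

The point of the sketch is that a record-by-record loop like this does not imply one physical read per record: the buffer size controls how much is fetched per underlying read, which is why tuning it is usually tried before writing a custom block reader.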