Hey Maninder,

In some ways the TFile is close to SequenceFiles.

On Fri, Apr 20, 2012 at 8:19 PM, maninder batth <batth.manin...@gmail.com> wrote:
> My requirements are to save variable sized binary records and ability to
> query them later on. So i was looking at Tfile and had some doubts.
>
> 1. Is the datablock in the tfile a fixed size or variable size? If it is
> fixed, what happens when a record cannot fit in the datablock? Would you
> normally fill the empty space with zeros or spread the record over 2
> datablocks?
>
> 2. Is there any downside of having a variable sized datablocks?

A data block is completed only when its current size, checked at the end of an append, is >= min-size-of-block. Hence the data block isn't "fixed" in size. If the block is still below that minimum, another record is written and the condition is checked again (which would then trigger a block completion).

> 3. Are the records synced with file at the boundary of a datablock or they
> just written to file system. The question is like write() call in linux vs
> fsync()?

I'm unsure what you mean by a "datablock" here. TFiles don't work at the FS level, and the "datablocks" in them are logical. Could you clarify this question given (1) and (2)?

--
Harsh J
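
P.S. In case it helps with (1) and (2), here is a rough sketch of the block-completion rule described above. It is a plain Python simulation, not the Hadoop TFile API: the function `pack_records` and the 16-byte `MIN_BLOCK_SIZE` are made up for illustration, and the final flush of a partial block on close is my assumption. The point is only that the size check happens after each append, so blocks end up variable in size.

```python
# Hypothetical simulation of the rule "close the block once its size,
# checked at the end of an append, is >= the minimum block size".
# This is NOT the Hadoop TFile API; all names here are illustrative.

MIN_BLOCK_SIZE = 16  # assumed value in bytes, chosen only for the example


def pack_records(records, min_block_size=MIN_BLOCK_SIZE):
    """Group variable-sized byte records into logical blocks."""
    blocks = []
    current = []
    current_size = 0
    for record in records:
        # The record is always appended first ...
        current.append(record)
        current_size += len(record)
        # ... and only then is the block size compared to the minimum.
        if current_size >= min_block_size:
            blocks.append(current)
            current = []
            current_size = 0
    # Assumption: whatever is left is flushed as a final, smaller block.
    if current:
        blocks.append(current)
    return blocks


records = [b"a" * 10, b"b" * 10, b"c" * 40, b"d" * 3]
blocks = pack_records(records)
block_sizes = [sum(len(r) for r in block) for block in blocks]
print(block_sizes)
```

With these inputs the first block closes only after the second record pushes it past the minimum, and the 40-byte record gets a block of its own, so no record is padded or split in this sketch.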