Wednesday, September 5, 2012, 4:16:42 PM, you wrote:

> Are all Doc structures of the same size?

Mostly, but you don't need their actual size, only the size of the content
per Doc instance, which is implicit in the fragment offset table. Given a
range request we can walk the fragment offset table (which was read with the
First/Earliest Doc) until we find the fragment which contains the starting
byte of the range and begin loading fragments with that one.

> So the fragments table entry will be augmented with a new uint64_t field
> which holds the chunk validity information.

Right.

> But after computing all the necessary "Doc" structures for satisfying a
> range request they still have to be read from disk (not the content, just
> the first part containing the fragments table) to check for chunk validity.

No. The fragment offset table will be moved into the alternate header, which
is stored in First Doc. Once that is read you have all of the fragment offset
and chunk validity data in memory. The entire table, with all offsets and
validity bitmaps, is a single chunk of data in the alternate header.

Again, this is just a proposal to stimulate discussion; it is *not* a
committed plan of action.

> Only when we know we have the complete range valid we may serve the range
> from the cache.

Yes, and that can be computed from in-memory data as soon as the alternate is
selected.
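
To make the lookup concrete, here is a rough sketch of how the range check
could work purely from the in-memory table. It is only an illustration of the
proposal, not existing code: the FragEntry layout, the function names, the
fixed chunk_size parameter, and the "one bit per chunk, at most 64 chunks per
fragment" reading of the uint64_t field are all assumptions of mine.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical table entry (not the real on-disk layout).
struct FragEntry {
  uint64_t offset; // content offset of the first byte of this fragment
  uint64_t valid;  // assumed: bit i set => chunk i of this fragment is cached
};

// Walk the fragment offset table until we reach the fragment that contains
// content byte `pos`. Returns table.size() if no fragment can contain it.
// Assumes the offsets are strictly increasing.
std::size_t find_fragment(const std::vector<FragEntry> &table, uint64_t pos)
{
  if (table.empty() || pos < table[0].offset)
    return table.size();
  std::size_t i = 0;
  while (i + 1 < table.size() && table[i + 1].offset <= pos)
    ++i;
  return i;
}

// True only if every chunk overlapping the inclusive byte range
// [first, last] is marked valid. Uses nothing but the in-memory table.
bool range_is_valid(const std::vector<FragEntry> &table, uint64_t total_len,
                    uint64_t chunk_size, uint64_t first, uint64_t last)
{
  if (chunk_size == 0 || first > last || last >= total_len)
    return false;
  std::size_t i = find_fragment(table, first);
  if (i == table.size())
    return false;
  for (; i < table.size() && table[i].offset <= last; ++i) {
    const uint64_t frag_begin = table[i].offset;
    // Fragment size is implicit: it ends where the next one begins.
    const uint64_t frag_end =
      (i + 1 < table.size()) ? table[i + 1].offset : total_len;
    // Part of the requested range that falls inside this fragment,
    // expressed relative to the start of the fragment.
    const uint64_t lo = std::max(first, frag_begin) - frag_begin;
    const uint64_t hi = std::min(last, frag_end - 1) - frag_begin;
    const uint64_t c_lo = lo / chunk_size;
    const uint64_t c_hi = hi / chunk_size;
    if (c_hi >= 64)
      return false; // more chunks than a 64-bit bitmap can describe
    for (uint64_t c = c_lo; c <= c_hi; ++c) {
      if ((table[i].valid & (uint64_t{1} << c)) == 0)
        return false;
    }
  }
  return true;
}
```

The walk is linear to match the description above; since the offsets are
sorted, a binary search would work just as well for objects with many
fragments.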