>
> > 1. Is there a one to many relation between a cached object and a Doc
> > structure?
>
> Yes, that's the chained Docs in the lower right. Each Doc represents a
> fragment and we have discussed previously how objects are stored as an
> ordered set of fragments.
>

Just for clarification: I did some tests and it seems the entries in the
"alternate vector" stored in the first "Doc" are different versions of the
same cached object? (I see that these entries are scanned for the best
quality match, and the most recent entry with the highest score is used.)
So is "AltVec" actually a vector of linked lists, each linked list being a
different version of a cached object?
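
To make sure I am describing the selection correctly, here is a rough
Python sketch of what I think I observed. The Alternate class, its fields
and select_alternate are names I made up for illustration; they are not
taken from the Traffic Server source, and the real scoring is certainly
more involved.

```python
from dataclasses import dataclass


@dataclass
class Alternate:
    name: str       # hypothetical label for this version of the object
    quality: float  # hypothetical match score against the request
    created: int    # hypothetical timestamp; larger means more recent


def select_alternate(alternates):
    """Pick the highest-scoring alternate, breaking ties by recency.

    Returns None when the alternate vector is empty.
    """
    return max(alternates, key=lambda a: (a.quality, a.created), default=None)


alt_vec = [
    Alternate("gzip-old", 1.0, 10),
    Alternate("gzip-new", 1.0, 20),
    Alternate("plain", 0.5, 30),
]
print(select_alternate(alt_vec).name)
```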

As the cached object is represented as a linked list of fragments, does
that mean that in order to serve a specific range from it we have to read
every fragment (presumably from disk) to follow next_key until we reach the
fragment that holds the range? Or maybe I'm missing something again.
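
To put a number on that worry, here is a small sketch of the cost if the
chain really had to be walked from the first fragment. It only models my
assumption (one disk read per fragment visited, bytes numbered from 1);
reads_following_chain is a hypothetical helper, not Traffic Server code.

```python
def reads_following_chain(fragment_sizes, target_byte):
    """Count fragment reads needed to reach the fragment holding target_byte.

    Assumes each fragment must be read in order to learn where the next one
    is, and that bytes are numbered starting at 1.
    """
    end_of_fragment = 0
    for reads, size in enumerate(fragment_sizes, start=1):
        end_of_fragment += size
        if target_byte <= end_of_fragment:
            return reads
    raise ValueError("target_byte is past the end of the object")


# A 390-byte object stored as four fragments of at most 100 bytes.
print(reads_following_chain([100, 100, 100, 90], 220))
```

With a 20M+ object the same walk would touch every fragment before the
requested range, which is what I would like to avoid.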


> > 2. A Doc structure contains a table of fragments which are represented as
> > uint64_t offsets.
>
> The fragment offsets are the location in the object of the fragment data.
> frag_offset[i] is the address of the first byte past the end of fragment i.
> To simplify, presume fragments are at most 100 bytes long and we have an
> object that is 390 bytes long in four fragments (0,1,2,3). Then
> frag_offset[1] could be 201 which would mean the next byte past the end of
> fragment 1 is byte 201 of 390 in the original object (or equivalently that
> the first byte of fragment 2 is byte 201 of 390 in the object). This data
> is used to accelerate range requests so that if you had a request for bytes
> 220-300 of the object you could skip immediately to fragment 2 without
> reading fragment 1. In real life there are some very large (20M+) files out
> there and being able to skip reading the first 10M or so from disk is an
> enormous performance improvement. Note the earliest doc for an alternate is
> always read because of the way the logic is done now, although in theory
> that could be skipped because the fragment data you need is in the
> First Doc which doesn't contain any object data for objects with multiple
> alternates or that are more than 1 fragment long.
>
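
As a check on my reading of frag_offset, here is a minimal Python sketch of
the lookup described above. It reuses the 390-byte example with bytes
numbered from 1, which is how I read "frag_offset[1] could be 201". The
name fragments_for_range is mine and the offset values are only the ones
implied by the example, not real cache data.

```python
from bisect import bisect_right


def fragments_for_range(frag_offset, first_byte, last_byte):
    """Return (first_fragment, last_fragment) for an inclusive byte range.

    frag_offset[i] is the position of the first byte past the end of
    fragment i, so the number of offsets <= b is the index of the fragment
    that contains byte b.
    """
    if first_byte > last_byte:
        raise ValueError("empty range")
    if last_byte >= frag_offset[-1]:
        raise ValueError("range extends past the end of the object")
    first = bisect_right(frag_offset, first_byte)
    last = bisect_right(frag_offset, last_byte)
    return first, last


# Four fragments of at most 100 bytes, 390 bytes in total, bytes numbered from 1.
frag_offset = [101, 201, 301, 391]
print(fragments_for_range(frag_offset, 220, 300))  # (2, 2)
```

So a request for bytes 220-300 would go straight to fragment 2 and never
touch fragments 0 and 1, if I understood the explanation correctly.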
From this description it seems to me that these "fragments inside a
fragment" resemble the chunks you describe in your original e-mail.
The differences are:
- currently these chunks may be of different sizes as opposed to the
proposed new model of having equally sized chunks
- there is currently no bitfield to say which chunks are valid
Are these "fragments inside a fragment" currently used for anything other
than range request acceleration?
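
For comparison, this is how I picture the proposed model with equally
sized chunks and a validity bitfield. It is only my guess at the idea:
the function names, the 0-based byte numbering and the use of a plain
integer as the bitfield are all assumptions on my side.

```python
def chunks_for_range(first_byte, last_byte, chunk_size):
    """Indices of the fixed-size chunks touched by an inclusive 0-based range."""
    return range(first_byte // chunk_size, last_byte // chunk_size + 1)


def range_is_cached(valid_bits, first_byte, last_byte, chunk_size):
    """True when every chunk touched by the range has its bit set.

    valid_bits is an integer used as a bitfield: bit i set means chunk i
    is present in the cache.
    """
    return all(
        (valid_bits >> i) & 1
        for i in chunks_for_range(first_byte, last_byte, chunk_size)
    )


valid_bits = 0b0110  # only chunks 1 and 2 are valid
print(range_is_cached(valid_bits, 100, 299, 100))  # True
print(range_is_cached(valid_bits, 50, 150, 100))   # False
```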

Best regards,
Bogdan Graur
