> > Can we also consider the other way: make the request to the origin server
> > with a larger range so that we may 'join' two disjoint parts of the data?
> > (try to avoid having many empty chunks in between filled chunks)
>
> We can consider it but I think it won't work. It would require, among
> other things, cleverness in changing the outgoing request *and* tweaking
> the response so that the client gets the originally requested range.
> Unlikely to be worth it - better to do something in the nature of the other
> bug you mentioned where, in a plugin, completely synthetic range requests
> are generated to fill gaps.
>
You're right. Rewriting both the outgoing request and the response is not
the safest thing to do.
The idea of generating synthetic range requests from a plugin to fill in the
gaps sounds very good if you say it can be done.


> > The solution with the chunk validity bitmap has the advantage that it
> works
> > fine without throwing away cached data.
> > Does throwing away parts of a cached object while updating the cache for
> > the same object raise any synchronization issues?
>
> No, because you don't really throw away the data, you simply change the
> valid span values in the fragment. Reading and writing those have to be
> synchronized for other reasons so it's no additional cost.
>
> Which is better may well depend on our anticipated access pattern. It's
> easy to think of a plausible pattern where a client "rolls through" an
> object via range requests (a very common pattern in practice) and each
> range request spans chunks without ever fully covering one, so you don't
> actually store anything from the entire set of requests, even though ATS
> saw the entire object. The span scheme, OTOH, would end up capturing the
> entire object.
>

This "roll through" use case might indeed be the common pattern.

The chunk validity solution's behavior for this case:
In the worst case, as you commented, after the client reads the whole object
we will not be able to cache anything from it.
In the average case (when a range request is larger than the chunk size) the
cached object will have many unfilled chunks, and we might have to issue a
lot of small range requests to fill these gaps.
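To make that concrete, here is a small simulation (illustrative chunk and
request sizes, not ATS code; the function names are mine) of the bitmap
scheme under a "roll through": a chunk is only marked valid when a single
request fully covers it, so boundary chunks straddled by two requests are
never marked.

```python
CHUNK = 4          # chunk size, in illustrative units
OBJECT = 32        # object size

def marked_chunks(requests, chunk=CHUNK):
    """Chunks whose full extent is covered by a single request."""
    marked = set()
    for start, end in requests:
        k = (start + chunk - 1) // chunk    # first chunk starting at/after start
        while (k + 1) * chunk <= end:       # chunk fits entirely in the request
            marked.add(k)
            k += 1
    return marked

def roll(step, size=OBJECT):
    """Contiguous 'roll through' of the object in fixed-size requests."""
    return [(s, min(s + step, size)) for s in range(0, size, step)]

# Worst case: every request is shorter than a chunk -> nothing cacheable,
# even though the client read the whole object.
print(marked_chunks(roll(3)))   # -> set()

# Average case: requests larger than a chunk still leave the straddled
# boundary chunks unfilled (here chunks 1, 4 and 7).
print(marked_chunks(roll(6)))   # -> {0, 2, 3, 5, 6}
```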

The span scheme's behavior is ideal in both the average and the worst case.
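The same worst-case roll-through under a single valid span per fragment, as
a sketch (again my own toy model, assuming one span that grows when new data
overlaps or abuts it): each request extends the span, and the whole object
ends up cached.

```python
def extend_span(span, start, end):
    """Grow a single valid (start, end) span when new data overlaps or
    abuts it; otherwise keep the larger of the two pieces, since this
    model allows only one valid span per fragment."""
    if span is None:
        return (start, end)
    s, e = span
    if start <= e and end >= s:             # overlapping or touching: merge
        return (min(s, start), max(e, end))
    return span if e - s >= end - start else (start, end)

# The worst-case roll-through from before: requests of 3 units over a
# 32-unit object. Every request abuts the current span, so it just grows.
span = None
for s in range(0, 32, 3):
    span = extend_span(span, s, min(s + 3, 32))
print(span)   # -> (0, 32): the entire object is captured
```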

Other use cases:
- "roll through" a part of the object then skip a portion of a file and
then "roll through" again (and so on): the span scheme is again the better
one as this is a generalization of the common pattern.
- make many "disjoint" range requests: the two solutions would probably be
equally good with a plus on the side of the chunk validity bitmap solution
as it might cache more data.
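A rough sketch of that last point (toy model, my own helper names; it
assumes the requests are chunk-aligned and non-overlapping, and that the
span scheme keeps only its largest contiguous piece when spans cannot
merge): the bitmap retains every fully covered chunk, while the single span
retains only one request's worth.

```python
CHUNK = 4

def bitmap_bytes(requests, chunk=CHUNK):
    """Bytes cacheable under the bitmap scheme: all fully covered chunks
    (requests assumed disjoint, so no double counting)."""
    total = 0
    for start, end in requests:
        first = (start + chunk - 1) // chunk
        last = end // chunk                 # one past the last full chunk
        if last > first:
            total += (last - first) * chunk
    return total

def span_bytes(requests):
    """Bytes cacheable under a single valid span: with disjoint requests
    nothing merges, so only the largest piece survives."""
    return max((end - start for start, end in requests), default=0)

disjoint = [(0, 8), (12, 20), (24, 32)]     # three disjoint, chunk-aligned reads
print(bitmap_bytes(disjoint), span_bytes(disjoint))   # -> 24 8
```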

Any other use cases I have missed?
