> > Can we also consider the other way: make the request to the origin server
> > with a larger range so that we may 'join' two disjoint parts of the data?
> > (try to avoid having many empty chunks in between filled chunks)
>
> We can consider it but I think it won't work. It would require, among
> other things, cleverness in changing the outgoing request *and* tweaking
> the response so that the client gets the originally requested range.
> Unlikely to be worth it - better to do something in the nature of the other
> bug you mentioned where, in a plugin, completely synthetic range requests
> are generated to fill gaps.

You're right. Changing both the request and the response is not the safest
thing to do. The idea of generating synthetic range requests from a plugin to
fill in gaps sounds very good, if you say it can be done.
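
To make the gap-filling idea concrete, here is a rough sketch of how the missing byte ranges could be derived from what is already cached. It is only an illustration in Python, not ATS plugin code: `missing_ranges` and `range_headers` are hypothetical names, and the cached data is assumed to be available as a list of inclusive (first, last) byte spans. The only real-world detail is the inclusive `bytes=first-last` form of the HTTP Range header.

```python
def missing_ranges(cached_spans, object_length):
    """Return the inclusive byte ranges of the object not covered by cached_spans.

    cached_spans is a list of inclusive (first, last) byte offsets; this
    representation is an assumption made for the sketch.
    """
    gaps = []
    next_needed = 0  # lowest byte offset not yet known to be cached
    for first, last in sorted(cached_spans):
        if first > next_needed:
            gaps.append((next_needed, first - 1))
        next_needed = max(next_needed, last + 1)
    if next_needed < object_length:
        gaps.append((next_needed, object_length - 1))
    return gaps


def range_headers(gaps):
    """Format each gap as the value of an HTTP Range header."""
    return [f"bytes={first}-{last}" for first, last in gaps]


# Hypothetical 500-byte object with two cached spans.
gaps = missing_ranges([(0, 99), (300, 399)], 500)
headers = range_headers(gaps)
```

Each resulting header value would correspond to one synthetic request to the origin; whether adjacent small gaps should be coalesced into a single request is a separate tuning decision.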
> > The solution with the chunk validity bitmap has the advantage that it
> > works fine without throwing away cached data.
> > Does throwing away parts of a cached object while updating the cache for
> > the same object raise any synchronization issues?
>
> No, because you don't really throw away the data, you simply change the
> valid span values in the fragment. Reading and writing those have to be
> synchronized for other reasons so it's no additional cost.
>
> Which is better may well depend on our anticipated access pattern. It's
> easy to think of a plausible pattern where a client "rolls through" an
> object via range requests (a very common pattern in practice) and each
> range request spans chunks without ever covering a chunk so you don't
> actually store anything from the entire set of requests, even though ATS
> saw the entire object. The span scheme, OTOH, would end up capturing the
> entire object.

This "roll through" use case might indeed be the common pattern.

The behavior of the chunk validity solution in this case: in the worst case,
as you commented, after reading the whole object we will not be able to cache
anything from it. In the average case (when a range request is bigger than the
chunk size) the cached object will have many unfilled chunks, and we might
have to make a lot of small range requests to fill these gaps.

The span scheme behavior is ideal in both the average and the worst case.

Other use cases:
- "roll through" a part of the object, then skip a portion of the file and
  "roll through" again (and so on): the span scheme is again the better one,
  as this is a generalization of the common pattern.
- make many "disjoint" range requests: the two solutions would probably be
  equally good, with a slight advantage for the chunk validity bitmap
  solution, as it might cache more data.

Any other use cases I have missed?
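
To put rough numbers on the "roll through" comparison, here is a small simulation. It is a sketch under simplifying assumptions, not a model of the actual cache code. The function names, the 100-byte chunk size and the 1000-byte object are all made up. The bitmap scheme is assumed to store a chunk only when a single request covers it entirely. The span scheme is idealized as the union of all requested byte ranges, which fits a contiguous roll-through but ignores any limit on the number of valid spans per fragment.

```python
def chunk_bitmap_cached(requests, object_length, chunk_size):
    """Bytes cached if a chunk is kept only when one request covers all of it."""
    valid = set()
    for first, last in requests:
        for idx in range(first // chunk_size, last // chunk_size + 1):
            c_first = idx * chunk_size
            c_last = min(c_first + chunk_size, object_length) - 1
            if first <= c_first and c_last <= last:
                valid.add(idx)
    return sum(min((i + 1) * chunk_size, object_length) - i * chunk_size
               for i in valid)


def span_cached(requests):
    """Bytes cached if valid spans simply merge (union of inclusive ranges)."""
    total = 0
    covered_to = -1  # highest byte offset counted so far
    for first, last in sorted(requests):
        if last > covered_to:
            total += last - max(first, covered_to + 1) + 1
            covered_to = last
    return total


# Worst case: chunk-sized requests offset by half a chunk, so no request
# ever covers a whole chunk even though the whole object is read.
worst = [(0, 49)] + [(s, s + 99) for s in range(50, 950, 100)] + [(950, 999)]

# Average case: requests larger than a chunk but not chunk-aligned.
average = [(0, 249), (250, 499), (500, 749), (750, 999)]

worst_bitmap = chunk_bitmap_cached(worst, 1000, 100)      # 0 bytes
worst_span = span_cached(worst)                           # 1000 bytes
average_bitmap = chunk_bitmap_cached(average, 1000, 100)  # 800 bytes
average_span = span_cached(average)                       # 1000 bytes
```

In the average case the bitmap scheme keeps 8 of the 10 chunks and leaves two isolated holes (chunks 2 and 7), which is exactly the kind of gap that would later need small fill-in range requests.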