Julian Sedding created OAK-11934:
------------------------------------

             Summary: segment prefetching for segmentstore cache
                 Key: OAK-11934
                 URL: https://issues.apache.org/jira/browse/OAK-11934
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: segment-tar
    Affects Versions: 1.84.0
            Reporter: Julian Sedding
            Assignee: Julian Sedding


Particularly for remote segment stores, IO can be a limiting factor. 
Processes that traverse the repository, such as compaction, often alternate 
between processing segments and loading segments.

IO could be parallelized by enhancing the {{SegmentCache}} to asynchronously 
prefetch segments that are referenced by a newly loaded segment. That is, if 
the "main" thread requests a segment from the cache and the segment has to be 
loaded from the persistence, then all segments referenced by the newly loaded 
segment are prefetched asynchronously and placed into the cache. When the 
"main" thread requests the next segment, it is likely already in the cache.
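
The behaviour described above could look roughly like the following sketch. It 
is not Oak's {{SegmentCache}} API: the class {{PrefetchingCacheSketch}}, the 
{{Segment}} stand-in, the {{loader}} function and the unbounded map are all 
assumptions made for illustration, and eviction is left out entirely.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Function;

/** Hypothetical sketch of prefetch-on-miss; not the real SegmentCache. */
public class PrefetchingCacheSketch {

    /** Stand-in for a segment: its id plus the ids of the segments it references. */
    public static final class Segment {
        public final String id;
        public final List<String> references;

        public Segment(String id, List<String> references) {
            this.id = id;
            this.references = references;
        }
    }

    // Unbounded for brevity; a real cache would evict entries.
    private final Map<String, CompletableFuture<Segment>> cache = new ConcurrentHashMap<>();
    private final Function<String, Segment> loader; // stand-in for a read from the persistence
    private final Executor prefetchExecutor;

    public PrefetchingCacheSketch(Function<String, Segment> loader, Executor prefetchExecutor) {
        this.loader = loader;
        this.prefetchExecutor = prefetchExecutor;
    }

    /** Called by the "main" thread. Prefetching is triggered only on a cache miss. */
    public Segment get(String id) {
        CompletableFuture<Segment> mine = new CompletableFuture<>();
        CompletableFuture<Segment> existing = cache.putIfAbsent(id, mine);
        if (existing != null) {
            // Cached, or a load is already in flight: wait for it, no prefetch work.
            return existing.join();
        }
        try {
            Segment segment = loader.apply(id);
            mine.complete(segment);
            // Newly loaded segment: schedule its references for asynchronous loading.
            for (String ref : segment.references) {
                prefetch(ref);
            }
            return segment;
        } catch (RuntimeException e) {
            cache.remove(id, mine);
            mine.completeExceptionally(e);
            throw e;
        }
    }

    private void prefetch(String id) {
        CompletableFuture<Segment> future = new CompletableFuture<>();
        if (cache.putIfAbsent(id, future) != null) {
            return; // already cached or already being loaded
        }
        prefetchExecutor.execute(() -> {
            try {
                future.complete(loader.apply(id));
            } catch (RuntimeException e) {
                cache.remove(id, future);
                future.completeExceptionally(e);
            }
        });
    }
}
```

Storing futures rather than segments lets the "main" thread wait on a prefetch 
that is still in flight instead of issuing a second read for the same segment.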

Prefetching could follow references up to a configurable "depth". A depth of 
1 or 2 presumably strikes a good balance between parallelizing IO efficiently 
and preloading too aggressively.
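
To illustrate what "depth" could mean here, the following sketch computes the 
set of segment ids within a given number of reference hops of a newly loaded 
segment. The class and method names are hypothetical, and {{referencesOf}} is 
an assumed stand-in: in practice the references of a segment are only known 
once that segment has itself been loaded, so each level would be scheduled as 
the previous one arrives.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical sketch of a depth-limited walk over segment references. */
public class PrefetchDepthSketch {

    /**
     * Returns the ids reachable from root within maxDepth reference hops,
     * excluding root itself, in breadth-first order.
     */
    public static Set<String> idsToPrefetch(
            String root, int maxDepth, Function<String, List<String>> referencesOf) {
        Set<String> visited = new LinkedHashSet<>();
        visited.add(root);
        List<String> frontier = Collections.singletonList(root);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String id : frontier) {
                for (String ref : referencesOf.apply(id)) {
                    // Set.add returns false for ids already seen, so cycles terminate.
                    if (visited.add(ref)) {
                        next.add(ref);
                    }
                }
            }
            frontier = next;
        }
        visited.remove(root);
        return visited;
    }
}
```

Each additional level can multiply the number of segments fetched, which is 
why a small depth is the more cautious default.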

If references are prefetched only for newly loaded segments, the prefetch 
mechanism should add little or no overhead while only cached segments are 
read.

cc [~miroslav], [~nuno.santos]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)