Julian Sedding created OAK-11934:
------------------------------------
Summary: segment prefetching for segmentstore cache
Key: OAK-11934
URL: https://issues.apache.org/jira/browse/OAK-11934
Project: Jackrabbit Oak
Issue Type: Improvement
Components: segment-tar
Affects Versions: 1.84.0
Reporter: Julian Sedding
Assignee: Julian Sedding
IO can be a limiting factor, particularly for remote segment stores.
Processes that traverse the repository, such as compaction, often alternate
between processing segments and loading segments.
IO could be parallelized by enhancing the {{SegmentCache}} to asynchronously
prefetch the segments referenced by a newly loaded segment. That is, if the
"main" thread requests a segment from the cache and the segment has to be
loaded from the persistence, then all segments referenced by the newly loaded
segment are prefetched asynchronously and placed into the cache. When the
"main" thread requests the next segment, it is likely to be in the cache already.
Prefetching could follow references up to a configurable "depth". Presumably,
a depth of 1 or 2 usually strikes a good balance between parallelizing IO
efficiently and preloading too aggressively.
If references are prefetched only for newly loaded segments, the overhead of
the prefetch mechanism should be minimal to non-existent as long as only
cached segments are read.
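
The idea above could be sketched roughly as follows. This is only an
illustration under assumptions, not Oak's actual {{SegmentCache}}: the class
{{PrefetchingCacheSketch}}, the {{Segment}} stand-in, the {{loader}} function
and the {{prefetchDepth}} parameter are hypothetical names, and cache eviction
is left out entirely.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executor;
import java.util.function.Function;

/** Hypothetical sketch of a prefetching cache; not part of Oak's API. */
final class PrefetchingCacheSketch {

    /** Stand-in for a segment: its id plus the ids of the segments it references. */
    static final class Segment {
        final String id;
        final List<String> references;

        Segment(String id, List<String> references) {
            this.id = id;
            this.references = references;
        }
    }

    // Holds loaded segments as well as loads that are still in flight.
    private final Map<String, CompletableFuture<Segment>> cache = new ConcurrentHashMap<>();
    private final Function<String, Segment> loader; // stand-in for the persistence
    private final Executor prefetchExecutor;        // runs the asynchronous prefetches
    private final int prefetchDepth;                // 0 disables prefetching

    PrefetchingCacheSketch(Function<String, Segment> loader,
                           Executor prefetchExecutor, int prefetchDepth) {
        this.loader = loader;
        this.prefetchExecutor = prefetchExecutor;
        this.prefetchDepth = prefetchDepth;
    }

    /** Called by the "main" thread. Prefetching is only triggered on a cache miss. */
    Segment get(String id) {
        CompletableFuture<Segment> mine = new CompletableFuture<>();
        CompletableFuture<Segment> existing = cache.putIfAbsent(id, mine);
        if (existing != null) {
            // Cached, or a prefetch is in flight: wait for it, no prefetch work here.
            return existing.join();
        }
        try {
            Segment segment = loader.apply(id);
            mine.complete(segment);
            prefetchReferences(segment, prefetchDepth);
            return segment;
        } catch (RuntimeException e) {
            cache.remove(id, mine); // do not keep failed loads in the cache
            mine.completeExceptionally(e);
            throw e;
        }
    }

    /** Schedules loads for referenced segments, following references up to 'depth' levels. */
    private void prefetchReferences(Segment segment, int depth) {
        if (depth <= 0) {
            return;
        }
        for (String ref : segment.references) {
            CompletableFuture<Segment> pending = new CompletableFuture<>();
            if (cache.putIfAbsent(ref, pending) != null) {
                continue; // already cached or already being loaded
            }
            prefetchExecutor.execute(() -> {
                try {
                    Segment loaded = loader.apply(ref);
                    pending.complete(loaded);
                    prefetchReferences(loaded, depth - 1);
                } catch (RuntimeException e) {
                    cache.remove(ref, pending);
                    pending.completeExceptionally(e);
                }
            });
        }
    }
}
```

In this sketch the executor would typically be a small thread pool, for example
{{Executors.newFixedThreadPool(4)}}, so that prefetch loads overlap with the
processing done by the "main" thread. A cache hit costs a single map lookup,
which matches the expectation that the mechanism adds little overhead while
only cached segments are read.
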
cc [~miroslav], [~nuno.santos]