[
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226764#comment-16226764
]
Michael Dürig commented on OAK-5655:
------------------------------------
In another analysis I ran offline compaction on a repository (17.5GB footprint
compacting to 564MB, 4M nodes). The process took 20min to complete. Running
offline compaction again on the result took just 50sec.
While this test is a bit artificial, as the repository consists of completely
random content created by {{SegmentCompactionIT}}, it still indicates that the
process is thrashing in reads caused by bad locality.
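The effect of access order on read cost can be illustrated in isolation with a minimal sketch. This is not Oak code: the class {{ReadOrderDemo}}, the 4KB block size and the temporary file are assumptions made for illustration only. It reads the same set of blocks once in file order and once in shuffled order through {{java.io.RandomAccessFile}} and prints the elapsed time for each.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Illustrative only: compares sequential and shuffled block reads on a scratch file.
final class ReadOrderDemo {
    // Hypothetical block size, chosen for the demo; not Oak's segment size.
    static final int BLOCK_SIZE = 4096;

    /** Reads the given blocks in the given order and returns the number of bytes read. */
    static long readBlocks(Path file, List<Integer> blockOrder) throws IOException {
        byte[] buffer = new byte[BLOCK_SIZE];
        long total = 0;
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            for (int block : blockOrder) {
                raf.seek((long) block * BLOCK_SIZE);
                raf.readFully(buffer);
                total += buffer.length;
            }
        }
        return total;
    }

    /** Creates a temporary file holding the given number of blocks of pseudo-random bytes. */
    static Path createTestFile(int blocks) throws IOException {
        Path file = Files.createTempFile("locality-demo", ".bin");
        byte[] data = new byte[blocks * BLOCK_SIZE];
        new Random(42).nextBytes(data);
        Files.write(file, data);
        return file;
    }

    public static void main(String[] args) throws IOException {
        int blocks = 16 * 1024; // 64MB scratch file
        Path file = createTestFile(blocks);
        try {
            List<Integer> sequential = new ArrayList<>();
            for (int i = 0; i < blocks; i++) {
                sequential.add(i);
            }
            List<Integer> shuffled = new ArrayList<>(sequential);
            Collections.shuffle(shuffled, new Random(1));

            long t0 = System.nanoTime();
            readBlocks(file, sequential);
            long t1 = System.nanoTime();
            readBlocks(file, shuffled);
            long t2 = System.nanoTime();

            System.out.printf("sequential: %d ms, shuffled: %d ms%n",
                    (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
        } finally {
            Files.delete(file);
        }
    }
}
```

A file this small will usually be served from the operating system's page cache, so the difference between the two orders may be small. The gap is expected to become visible only once the data set no longer fits in memory.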
To better understand the connection between repository size and compaction time,
I ran offline compaction with memory mapped files on and off, graphing
compaction time against compacted repository size:
!compaction-time-vs.reposize.png|width=400!
Compaction times increase superlinearly and {{mmap=on}} is clearly superior to
{{mmap=off}}.
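For readers unfamiliar with the two access modes, the following minimal sketch contrasts them outside of Oak. It assumes a scratch file and a hypothetical 4KB block size, and the class {{MmapVsReadDemo}} and its methods are illustrative names, not part of segment-tar. The same random blocks are read once through {{java.io.RandomAccessFile}} and once through a {{java.nio.MappedByteBuffer}}; the checksums confirm that both paths see identical data.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Random;

// Illustrative only: random block reads via explicit read calls versus a memory mapping.
final class MmapVsReadDemo {
    // Hypothetical block size, chosen for the demo; not Oak's segment size.
    static final int BLOCK_SIZE = 4096;

    /** Sums the bytes of the given blocks using seek() and readFully(). */
    static long sumViaRandomAccessFile(Path file, int[] blocks) throws IOException {
        byte[] buffer = new byte[BLOCK_SIZE];
        long sum = 0;
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            for (int block : blocks) {
                raf.seek((long) block * BLOCK_SIZE);
                raf.readFully(buffer);
                for (byte b : buffer) {
                    sum += b;
                }
            }
        }
        return sum;
    }

    /** Sums the bytes of the given blocks through a read-only mapping (file must be < 2GB). */
    static long sumViaMmap(Path file, int[] blocks) throws IOException {
        long sum = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer map =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            for (int block : blocks) {
                int start = block * BLOCK_SIZE;
                for (int i = 0; i < BLOCK_SIZE; i++) {
                    sum += map.get(start + i);
                }
            }
        }
        return sum;
    }

    /** Creates a temporary file of pseudo-random bytes that is removed on JVM exit. */
    static Path createTestFile(int blocks) throws IOException {
        Path file = Files.createTempFile("mmap-demo", ".bin");
        file.toFile().deleteOnExit();
        byte[] data = new byte[blocks * BLOCK_SIZE];
        new Random(42).nextBytes(data);
        Files.write(file, data);
        return file;
    }

    public static void main(String[] args) throws IOException {
        int blocks = 16 * 1024; // 64MB scratch file
        Path file = createTestFile(blocks);
        int[] order = new Random(1).ints(100_000, 0, blocks).toArray();

        long t0 = System.nanoTime();
        long a = sumViaRandomAccessFile(file, order);
        long t1 = System.nanoTime();
        long b = sumViaMmap(file, order);
        long t2 = System.nanoTime();

        System.out.printf("read(): %d ms, mmap: %d ms, checksums equal: %b%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, a == b);
    }
}
```

With explicit reads every block access goes through a system call and a copy into a Java buffer, whereas the mapping lets the operating system page data in on demand. Timings from a small, fully cached file like this one say nothing about the compaction times measured above.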
To validate the hypothesis that the process is (read) IO bound, I took a JMC
flight recording from an offline compaction of the same repository with
{{mmap=false}}. The flight recording shows that the process spends almost 99% of
its time in {{java.io.RandomAccessFile.read()}}, with all of these calls
originating from segment reads. Furthermore, the segment reads are spread more
or less evenly across time and across all 50 tar files.
> TarMK: Analyse locality of reference
> -------------------------------------
>
> Key: OAK-5655
> URL: https://issues.apache.org/jira/browse/OAK-5655
> Project: Jackrabbit Oak
> Issue Type: Task
> Components: segment-tar
> Reporter: Michael Dürig
> Labels: scalability
> Fix For: 1.8
>
> Attachments: compaction-time-vs.reposize.png, offrc.jfr,
> segment-per-path-compacted-nocache.png,
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png,
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK:
> * How is related content spread over segments?
> * What content do we consider related?
> * How does locality of related content develop over time when changes are
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality?
> * What is the impact of the deduplication caches on locality (during normal
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...