[ 
https://issues.apache.org/jira/browse/OAK-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226764#comment-16226764
 ] 

Michael Dürig edited comment on OAK-5655 at 10/31/17 1:17 PM:
--------------------------------------------------------------

In another analysis I ran offline compaction on a repository (17.5GB footprint 
compacting to 564MB, 4M nodes). The process took 20min to complete. Running 
offline compaction again on the result took just 50sec. While this test is 
somewhat artificial, as the repository consists of completely random content 
created by {{SegmentCompactionIT}}, it still indicates that the process is 
thrashing on reads caused by bad locality. 
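
For reference, a minimal sketch of how such a timed offline compaction run can look against the oak-segment-tar API. The {{FileStoreBuilder}} calls exist in oak-segment-tar; the {{compact()}}/{{cleanup()}} sequence is assumed to match the entry points used by oak-run's {{compact}} command in this version and may be named differently elsewhere:

{code:java}
import java.io.File;
import java.util.concurrent.TimeUnit;

import org.apache.jackrabbit.oak.segment.file.FileStore;
import org.apache.jackrabbit.oak.segment.file.FileStoreBuilder;

public class TimedOfflineCompaction {

    public static void main(String[] args) throws Exception {
        File segmentStore = new File(args[0]);

        long start = System.nanoTime();
        // Assumption: compact() rewrites the current head state and cleanup()
        // removes the segments made reclaimable, mirroring oak-run compact.
        try (FileStore store = FileStoreBuilder.fileStoreBuilder(segmentStore)
                .build()) {
            store.compact();
            store.cleanup();
        }

        System.out.printf("offline compaction took %ds%n",
                TimeUnit.NANOSECONDS.toSeconds(System.nanoTime() - start));
    }
}
{code}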

To better understand the connection between repository size and compaction time, 
I ran offline compaction with memory mapped files turned on and off, graphing 
compaction time against compacted repository size:

!compaction-time-vs.reposize.png|width=400!

Compaction times increase super-linearly, and {{mmap=on}} is clearly superior to 
{{mmap=off}}. 
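
The only difference between the two curves is the memory mapping flag passed to the store builder; a sketch of the toggle, again assuming the oak-segment-tar {{FileStoreBuilder}} API, where the flag is called {{withMemoryMapping}}:

{code:java}
import java.io.File;

import org.apache.jackrabbit.oak.segment.file.FileStore;
import org.apache.jackrabbit.oak.segment.file.FileStoreBuilder;

class StoreFactory {

    // Open the segment store with memory mapped tar access either on or off;
    // the compaction code run against the store is identical in both cases.
    static FileStore newStore(File segmentStore, boolean mmap) throws Exception {
        return FileStoreBuilder.fileStoreBuilder(segmentStore)
                .withMemoryMapping(mmap)   // mmap=on vs. mmap=off in the graph
                .build();
    }
}
{code}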

To validate the hypothesis that the process is (read) IO bound, I took a JMC 
[flight recording|^offrc.jfr] of an offline compaction of the same repository 
with {{mmap=false}}. The flight recording shows that the process spends almost 
99% of its time in {{java.io.RandomAccessFile.read()}}, and all of these calls 
originate from segment reads. Furthermore, the segment reads are spread more 
or less evenly across time and across all 50 tar files.
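
This is consistent with how the tar files are accessed in the two modes: with {{mmap=false}} every segment read ends up in a blocking {{RandomAccessFile.read()}} call, whereas with {{mmap=true}} reads are served from a {{MappedByteBuffer}} and, for resident pages, never leave user space. A JDK-only illustration of the two access paths (not the actual Oak reader code):

{code:java}
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class SegmentReadPaths {

    // mmap=false style: every read is an explicit, blocking call into
    // RandomAccessFile, which is what dominates the flight recording.
    static ByteBuffer readExplicit(File tar, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(tar, "r")) {
            byte[] data = new byte[length];
            raf.seek(offset);
            raf.readFully(data);
            return ByteBuffer.wrap(data);
        }
    }

    // mmap=true style: the region is mapped once and subsequent accesses are
    // plain memory reads, served by the page cache when the pages are resident.
    static ByteBuffer readMapped(File tar, long offset, int length) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(tar, "r");
                FileChannel channel = raf.getChannel()) {
            return channel.map(FileChannel.MapMode.READ_ONLY, offset, length);
        }
    }
}
{code}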





> TarMK: Analyse locality of reference 
> -------------------------------------
>
>                 Key: OAK-5655
>                 URL: https://issues.apache.org/jira/browse/OAK-5655
>             Project: Jackrabbit Oak
>          Issue Type: Task
>          Components: segment-tar
>            Reporter: Michael Dürig
>              Labels: scalability
>             Fix For: 1.8
>
>         Attachments: compaction-time-vs.reposize.png, offrc.jfr, 
> segment-per-path-compacted-nocache.png, 
> segment-per-path-compacted-nostringcache.png, segment-per-path-compacted.png, 
> segment-per-path.png
>
>
> We need to better understand the locality aspects of content stored in TarMK: 
> * How is related content spread over segments?
> * What content do we consider related? 
> * How does locality of related content develop over time when changes are 
> applied?
> * What changes do we consider typical?
> * What is the impact of compaction on locality? 
> * What is the impact of the deduplication caches on locality (during normal 
> operation and during compaction)?
> * How well are checkpoints deduplicated? Can we monitor this online?
> * ...



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
