[ 
https://issues.apache.org/jira/browse/OAK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984862#comment-15984862
 ] 

Michael Dürig commented on OAK-5790:
------------------------------------

The following graph shows the size of the segment store over time, comparing 
the baselines with the patched versions in a longevity test that ran for almost 
a month. The test compared four configurations: unpatched, patched, unpatched 
with NODE_DEDUPLICATION_CACHE_SIZE = 128k, and patched with 
NODE_DEDUPLICATION_CACHE_SIZE = 128k.

!size over time.png|width=500!

It clearly shows that the patched versions are more space efficient than the 
unpatched ones, even when running with much smaller caches. Compaction times 
are comparable across all approaches. However, the overall volume written 
differs greatly: 525GB, 348GB, 620GB and 405GB for the base, patched, base with 
small cache and patched with small cache approaches, respectively. These 
numbers also reflect the positive effect of deduplicating via rebasing: patched 
numbers are better than unpatched ones, and standard cache size numbers are 
better than small cache size numbers. 
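
To make the comparison concrete, the relative savings can be computed directly 
from the four volumes quoted above. This is only a small arithmetic sketch; the 
class and method names are made up for illustration and are not part of Oak.

```java
import java.util.Locale;

public class WriteVolumeComparison {

    /** Percentage by which patchedGb is smaller than baseGb. */
    static double savingsPercent(double baseGb, double patchedGb) {
        return 100.0 * (baseGb - patchedGb) / baseGb;
    }

    public static void main(String[] args) {
        // Volumes written in GB, taken from the longevity test figures above.
        double base = 525, patched = 348;
        double baseSmallCache = 620, patchedSmallCache = 405;

        System.out.println(String.format(Locale.ROOT,
                "standard cache: patched writes %.1f%% less than base",
                savingsPercent(base, patched)));
        System.out.println(String.format(Locale.ROOT,
                "128k cache:     patched writes %.1f%% less than base",
                savingsPercent(baseSmallCache, patchedSmallCache)));
    }
}
```

In both cache configurations the patched run writes roughly a third less data 
than the corresponding unpatched run.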



> Chronologically rebase checkpoints on top of each other during compaction
> -------------------------------------------------------------------------
>
>                 Key: OAK-5790
>                 URL: https://issues.apache.org/jira/browse/OAK-5790
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: segment-tar
>            Reporter: Michael Dürig
>            Assignee: Michael Dürig
>              Labels: compaction, gc, performance
>             Fix For: 1.8, 1.7.1
>
>         Attachments: size over time.png
>
>
> Currently the compactor does just a rewrite of the super root node without 
> any special handling of the checkpoints. It just relies on the node 
> de-duplication cache to avoid fully exploding the checkpoints. 
> I think this can be improved by successively rebasing checkpoints on top of 
> each other during compaction (very much like the way checkpoints are handled 
> in migration). 
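
The idea in the description above can be illustrated with a toy model. The 
following is a sketch under strong assumptions, not Oak's compactor: repository 
states are modelled as plain maps from path to value, "entries written" stands 
in for records written to the segment store, and all class and method names are 
hypothetical. It requires Java 9 or later for Map.of and List.of.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

public class CheckpointRebaseSketch {

    /** Entries that were added or changed going from before to after. */
    static Map<String, String> changed(Map<String, String> before,
                                       Map<String, String> after) {
        Map<String, String> diff = new HashMap<>();
        for (Map.Entry<String, String> e : after.entrySet()) {
            if (!Objects.equals(before.get(e.getKey()), e.getValue())) {
                diff.put(e.getKey(), e.getValue());
            }
        }
        return diff;
    }

    /** Naive compaction: every state is rewritten in full. */
    static int entriesWrittenNaively(List<Map<String, String>> states) {
        int written = 0;
        for (Map<String, String> state : states) {
            written += state.size();
        }
        return written;
    }

    /**
     * Rebasing compaction: states are processed in chronological order and
     * only the difference to the previous state is written on top of the
     * previous compacted state. The compacted copies are added to compactedOut.
     */
    static int entriesWrittenWithRebase(List<Map<String, String>> states,
                                        List<Map<String, String>> compactedOut) {
        int written = 0;
        Map<String, String> previousOriginal = new HashMap<>();
        Map<String, String> previousCompacted = new HashMap<>();
        for (Map<String, String> state : states) {
            // Copying the map stands in for sharing already compacted content.
            Map<String, String> compacted = new HashMap<>(previousCompacted);
            compacted.keySet().retainAll(state.keySet()); // drop removed entries
            Map<String, String> diff = changed(previousOriginal, state);
            compacted.putAll(diff);                       // write only the changes
            written += diff.size();
            compactedOut.add(compacted);
            previousOriginal = state;
            previousCompacted = compacted;
        }
        return written;
    }

    public static void main(String[] args) {
        // Two checkpoints followed by the head state, oldest first.
        List<Map<String, String>> states = List.of(
                Map.of("/a", "1", "/b", "2", "/c", "3"),
                Map.of("/a", "1", "/b", "20", "/c", "3"),
                Map.of("/a", "1", "/b", "20", "/c", "3", "/d", "4"));

        List<Map<String, String>> compacted = new ArrayList<>();
        System.out.println("naive:   " + entriesWrittenNaively(states));
        System.out.println("rebased: " + entriesWrittenWithRebase(states, compacted));
    }
}
```

In this toy example the naive approach writes 10 entries while the rebasing 
approach writes 5, and each compacted state is still equal to its original. 
The real compactor works on node states and segment records rather than maps, 
so the sketch only conveys why rebasing avoids rewriting content that 
consecutive checkpoints have in common.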



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
