[
https://issues.apache.org/jira/browse/OAK-5790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15984862#comment-15984862
]
Michael Dürig commented on OAK-5790:
------------------------------------
The following graph shows the size of the segment store over time, comparing
the baseline with the patched version in a longevity test that ran for almost a
month. The test compared four setups: unpatched, patched, unpatched with
NODE_DEDUPLICATION_CACHE_SIZE = 128k, and patched with
NODE_DEDUPLICATION_CACHE_SIZE = 128k.
!size over time.png|width=500!
The graph clearly shows that the patched versions are more space efficient than
the unpatched ones, even when running with much smaller caches. Compaction times
are comparable across all setups. However, the overall volume written differs
greatly: 525GB, 348GB, 620GB and 405GB for the base, patched, base with small
cache and patched with small cache setups, respectively. These numbers also
reflect the positive effect of deduplicating via rebasing: the patched numbers
are better than the unpatched ones, and the standard cache size numbers are
better than the small cache size numbers.
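As a quick sanity check on these figures, the relative differences can be
worked out directly. The sketch below uses only the four volumes quoted above;
the dictionary labels and the reduction_pct helper are illustrative names, not
part of Oak.

```python
# Total volume written (GB) per setup, as quoted in the comment above.
written_gb = {
    "base": 525,
    "patched": 348,
    "base, small cache": 620,
    "patched, small cache": 405,
}


def reduction_pct(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return 100.0 * (before - after) / before


# Effect of the patch at each cache size.
patch_gain_std = reduction_pct(written_gb["base"], written_gb["patched"])
patch_gain_small = reduction_pct(
    written_gb["base, small cache"], written_gb["patched, small cache"]
)

# Effect of the standard cache size compared to the small one.
cache_gain_base = reduction_pct(written_gb["base, small cache"], written_gb["base"])
cache_gain_patched = reduction_pct(
    written_gb["patched, small cache"], written_gb["patched"]
)

print(f"patch, standard cache: {patch_gain_std:.1f}% less written")
print(f"patch, small cache:    {patch_gain_small:.1f}% less written")
print(f"standard cache, base:    {cache_gain_base:.1f}% less written")
print(f"standard cache, patched: {cache_gain_patched:.1f}% less written")
```

By this calculation the patch cuts the written volume by roughly a third at
either cache size (about 33.7% and 34.7%), while the standard cache size saves
about 15.3% (base) and 14.1% (patched) relative to the small one.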
> Chronologically rebase checkpoints on top of each other during compaction
> -------------------------------------------------------------------------
>
> Key: OAK-5790
> URL: https://issues.apache.org/jira/browse/OAK-5790
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: segment-tar
> Reporter: Michael Dürig
> Assignee: Michael Dürig
> Labels: compaction, gc, performance
> Fix For: 1.8, 1.7.1
>
> Attachments: size over time.png
>
>
> Currently the compactor does just a rewrite of the super root node without
> any special handling of the checkpoints. It just relies on the node
> de-duplication cache to avoid fully exploding the checkpoints.
> I think this can be improved by successively rebasing checkpoints on top of
> each other during compaction, very much like checkpoints are handled in
> migration.
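The idea in the description above can be illustrated with a toy model. This is
only a sketch under stated assumptions, not the Oak implementation: Node,
Writer, compact_naive and compact_rebased are hypothetical names, object
identity stands in for record ids, and the "cache" is a plain LRU map. It
compares rewriting each state independently with a bounded de-duplication
cache against rebasing each state on top of the previously compacted one.

```python
from collections import OrderedDict


class Node:
    """Immutable toy tree node; object identity stands in for a record id."""

    def __init__(self, children=None, value=None):
        self.children = dict(children or {})
        self.value = value


class Writer:
    """Counts how many nodes are written to the (toy) compacted store."""

    def __init__(self):
        self.written = 0

    def write(self, children, value):
        self.written += 1
        return Node(children, value)


def compact_naive(node, writer, cache, capacity):
    """Rewrite a tree, relying only on a bounded LRU de-duplication cache."""
    key = id(node)
    if key in cache:
        cache.move_to_end(key)
        return cache[key]
    children = {
        name: compact_naive(child, writer, cache, capacity)
        for name, child in node.children.items()
    }
    out = writer.write(children, node.value)
    cache[key] = out
    if len(cache) > capacity:
        cache.popitem(last=False)  # evict the least recently used entry
    return out


def compact_rebased(before, after, before_compacted, writer):
    """Rewrite `after` on top of the already compacted version of `before`."""
    if after is before:
        return before_compacted  # unchanged subtree: reuse, nothing written
    children = {}
    for name, child in after.children.items():
        old = before.children.get(name) if before is not None else None
        old_compacted = before_compacted.children.get(name) if old is not None else None
        children[name] = compact_rebased(old, child, old_compacted, writer)
    return writer.write(children, after.value)


def with_changed_leaf(root, name, value):
    """Return a new root that shares all children with `root` except one."""
    children = dict(root.children)
    children[name] = Node(value=value)
    return Node(children)


def run_naive(states, capacity):
    writer, cache = Writer(), OrderedDict()
    for state in states:
        compact_naive(state, writer, cache, capacity)
    return writer.written


def run_rebased(states):
    writer = Writer()
    before, before_compacted = None, None
    for state in states:  # states must be in chronological order
        before_compacted = compact_rebased(before, state, before_compacted, writer)
        before = state
    return writer.written


# Two checkpoints and the head, each differing from its predecessor by one leaf.
cp1 = Node({f"n{i}": Node(value=i) for i in range(100)})
cp2 = with_changed_leaf(cp1, "n0", "a")
head = with_changed_leaf(cp2, "n1", "b")
states = [cp1, cp2, head]

print("naive, small cache:", run_naive(states, capacity=8))
print("naive, large cache:", run_naive(states, capacity=1000))
print("rebased, no cache: ", run_rebased(states))
```

In this toy setup the naive run with a small cache rewrites the shared nodes
for every state, whereas the rebased run writes only the nodes that changed
between consecutive states, independent of any cache size. The real compactor
is considerably more involved; the sketch only shows why rebasing makes the
result less sensitive to the de-duplication cache.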