[
https://issues.apache.org/jira/browse/OAK-2713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388775#comment-14388775
]
Michael Dürig commented on OAK-2713:
------------------------------------
Looking at {{CompactionMap}} there is not much to squeeze out. Given an id
{{a}} compacted to {{b}}, then to {{c}} and finally to {{d}}, this is currently
stored as a series of mappings {{a -> b, b -> c, c -> d}}. Each id is
asymptotically stored twice, so we could roughly halve memory consumption
through value sharing of ids.
On the implementation side this would need a complete rewrite, possibly
switching to a representative based implementation of the equivalence relation
(i.e. {{a -> a, b -> a, c -> a, d -> a}}) as the pointer arithmetic would be
simpler here.
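To illustrate the representative based idea, here is a minimal sketch. It is not the actual {{CompactionMap}} code: the class name, the method names and the use of plain strings as ids are all assumptions made for the example. Every id points directly at the first id of its chain, so each id is stored once as a key and the representative can be shared.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of a representative based mapping; not the Oak CompactionMap API. */
public class RepresentativeMapSketch {
    // Maps every known id to the representative (the first id) of its equivalence class.
    private final Map<String, String> representative = new HashMap<>();

    /** Records that {@code before} was compacted to {@code after}. */
    public void compacted(String before, String after) {
        // An id seen for the first time becomes its own representative (a -> a).
        String root = representative.computeIfAbsent(before, id -> id);
        representative.put(after, root);
    }

    /** Two ids denote the same record if they share a representative. */
    public boolean sameRecord(String x, String y) {
        String rx = representative.get(x);
        return rx != null && rx.equals(representative.get(y));
    }

    /** Number of ids currently tracked. */
    public int size() {
        return representative.size();
    }

    public static void main(String[] args) {
        RepresentativeMapSketch map = new RepresentativeMapSketch();
        map.compacted("a", "b");
        map.compacted("b", "c");
        map.compacted("c", "d");
        // Internally this yields a -> a, b -> a, c -> a, d -> a.
        System.out.println(map.sameRecord("a", "d"));
        System.out.println(map.size());
    }
}
```

Checking whether two ids are equivalent then takes two lookups regardless of how many compaction cycles lie between them, whereas the chained form has to follow one mapping per cycle.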
However, given that this at most halves memory consumption and that memory
demand grows linearly with each compaction cycle, this would only push the
problem out from the {{n}}-th to the {{2n}}-th cycle.
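The arithmetic behind this can be spelled out as follows. The symbols are assumptions introduced for the example: {{m}} is the memory added to the map per compaction cycle, {{B}} is the memory budget, and {{M(k)}} is the memory held after {{k}} cycles.

```latex
\begin{align*}
M(k)  &= k \cdot m             && \text{linear growth, } m \text{ per cycle} \\
M(n)  &= B \;\Rightarrow\; n = B/m && \text{cycle at which the budget is exhausted} \\
M'(k) &= k \cdot \tfrac{m}{2}  && \text{best case with shared ids} \\
M'(k) &= B \;\Rightarrow\; k = 2B/m = 2n
\end{align*}
```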
From this I conclude we have two options here: a) either forget mappings (i.e.
make the compaction map more cache like) or b) persist the compaction map.
AFAICS option a) would mostly be trading CPU for memory, with a certain risk of
running into {{SegmentNotFoundException}}s when running 'really old diffs'.
Option b) OTOH would be on the safe side but would require some additional disk
space that could only be reclaimed by offline compaction.
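As a rough sketch of what option a) could look like, the following bounds the number of mappings and forgets the oldest ones first. This is not Oak code: the class, its methods, the string ids and the size based eviction policy are all assumptions for illustration, built on the standard {{LinkedHashMap#removeEldestEntry}} hook.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a cache like compaction map (option a); not Oak code. */
public class BoundedCompactionMapSketch {
    private final Map<String, String> mappings;

    public BoundedCompactionMapSketch(final int maxEntries) {
        // Insertion ordered map that drops its eldest entry once the bound is exceeded.
        this.mappings = new LinkedHashMap<String, String>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Records that {@code before} was compacted to {@code after}. */
    public void compacted(String before, String after) {
        mappings.put(before, after);
    }

    /** Returns the compacted id, or null if the mapping is unknown or was forgotten. */
    public String get(String before) {
        return mappings.get(before);
    }

    /** Number of mappings currently retained. */
    public int size() {
        return mappings.size();
    }

    public static void main(String[] args) {
        BoundedCompactionMapSketch map = new BoundedCompactionMapSketch(2);
        map.compacted("a", "b");
        map.compacted("b", "c");
        map.compacted("c", "d"); // exceeds the bound, so a -> b is forgotten
        System.out.println(map.get("a"));
        System.out.println(map.get("c"));
    }
}
```

A lookup for a forgotten mapping returns nothing here. In the real store that is the situation in which a 'really old diff' could end up referring to a segment that is no longer available.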
cc [~alexparvulescu], [~mmarth]
> High memory usage of CompactionMap
> ----------------------------------
>
> Key: OAK-2713
> URL: https://issues.apache.org/jira/browse/OAK-2713
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: segmentmk
> Reporter: Michael Dürig
> Assignee: Michael Dürig
> Labels: compaction, gc
> Fix For: 1.3.0
>
>
> In environments with a lot of volatile content the {{CompactionMap}} can end
> up eating a lot of memory. From
> {{CompactionStrategyMBean#getCompactionMapStats}}:
> {noformat}
> [Estimated Weight: 317,5 MB, Records: 39500094, Segments: 36698],
> [Estimated Weight: 316,4 MB, Records: 39374593, Segments: 36660],
> [Estimated Weight: 315,4 MB, Records: 39253205, Segments: 36620],
> [Estimated Weight: 315,1 MB, Records: 39221882, Segments: 36614],
> [Estimated Weight: 314,9 MB, Records: 39195490, Segments: 36604],
> [Estimated Weight: 315,0 MB, Records: 39182753, Segments: 36602],
> [Estimated Weight: 360 B, Records: 0, Segments: 0],
> {noformat}
> This causes compaction to be skipped:
> {noformat}
> 2015-03-30:30.03.2015 02:00:00.038 *INFO* [] [TarMK compaction thread
> [/foo/bar/crx-quickstart/repository/segmentstore], active since Mon Mar 30
> 02:00:00 CEST 2015, previous max duration 3854982ms]
> org.apache.jackrabbit.oak.plugins.segment.file.FileStore Not enough available
> memory 5,5 GB, needed 6,3 GB, last merge delta 1,3 GB, so skipping compaction
> for now
> {noformat}