[ https://issues.apache.org/jira/browse/OAK-4493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15343976#comment-15343976 ]
Alex Parvulescu commented on OAK-4493:
--------------------------------------
Pasting in a cleaned-up sample from a failed compaction run. This is a single
node state that holds only 8M nodes (I think compaction was actually processing
the content node with the 55M child nodes):
{noformat}
java.util.HashMap$Entry[8388608] 1.9Gb
table java.util.HashMap 1.9Gb
nodes org.apache.jackrabbit.oak.plugins.memory.MutableNodeState 1.9Gb
state org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder$RootHead 1.9Gb
head, rootHead org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder 1.9Gb
{noformat}
I have also seen this fail at 33M nodes (with a bigger max heap value).
> Offline compaction persisted mode
> ---------------------------------
>
> Key: OAK-4493
> URL: https://issues.apache.org/jira/browse/OAK-4493
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: segment-tar, segmentmk
> Reporter: Alex Parvulescu
> Assignee: Alex Parvulescu
> Labels: compaction, gc
>
> I'm investigating a case where offline compaction is unable to finish, and
> crashes with OOMEs because of the content structure, namely large child node
> lists. The biggest issue is with the UUID index which has 55M nodes.
> In the current implementation, the compactor uses an in-memory node state
> to collect all the data, and persists it only at the very end, once
> compaction is done [0].
> This is prone to OOMEs once the size of the data part (no binaries involved)
> grows beyond a certain point (in this case I have 350GB, but there's a fair
> amount of garbage because compaction has not been running).
> My proposal is to add a special flag {{oak.compaction.eagerFlush=true}} that
> should be enabled only when the size of the repo does not allow running
> offline compaction with the available heap size. This will turn the in-memory
> compaction transaction into one based on a persisted SegmentNodeState,
> meaning we're trading disk space (and IO) for memory.
> [0]
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-segment/src/main/java/org/apache/jackrabbit/oak/plugins/segment/Compactor.java#L248
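
To make the memory-for-IO trade-off in the description concrete, here is a minimal sketch of the difference between collecting everything in memory and flushing eagerly. It is only an illustration under stated assumptions: the class and method names ({{EagerFlushSketch}}, {{CountingStore}}, {{compact}}) and the flush threshold are hypothetical, none of them are Oak APIs, and the real Compactor works on node builders and segments rather than a plain map.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration only; none of these names are Oak APIs. */
public class EagerFlushSketch {

    /** Stand-in for a persisted store; it only counts what was written. */
    static final class CountingStore {
        long persistedNodes = 0;
        int flushes = 0;

        void persist(Map<String, String> batch) {
            persistedNodes += batch.size();
            flushes++;
        }
    }

    /** Peak number of child entries held in memory, plus store totals. */
    static final class Result {
        final int peakBuffered;
        final long persistedNodes;
        final int flushes;

        Result(int peakBuffered, long persistedNodes, int flushes) {
            this.peakBuffered = peakBuffered;
            this.persistedNodes = persistedNodes;
            this.flushes = flushes;
        }
    }

    /**
     * Copies childCount child entries into a buffer and writes the buffer to
     * the store whenever it reaches flushThreshold entries. Passing
     * Integer.MAX_VALUE models the "persist once at the very end" behaviour.
     */
    static Result compact(int childCount, int flushThreshold) {
        CountingStore store = new CountingStore();
        Map<String, String> buffer = new HashMap<>();
        int peak = 0;
        for (int i = 0; i < childCount; i++) {
            buffer.put("child-" + i, "state-" + i);
            peak = Math.max(peak, buffer.size());
            if (buffer.size() >= flushThreshold) {
                store.persist(buffer); // trade IO for heap
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {
            store.persist(buffer); // final flush of the remainder
            buffer.clear();
        }
        return new Result(peak, store.persistedNodes, store.flushes);
    }

    public static void main(String[] args) {
        Result inMemory = compact(100_000, Integer.MAX_VALUE);
        Result eager = compact(100_000, 1_000);
        System.out.println("in-memory: peak=" + inMemory.peakBuffered
                + " flushes=" + inMemory.flushes);
        System.out.println("eager:     peak=" + eager.peakBuffered
                + " flushes=" + eager.flushes);
    }
}
```

In this toy model the peak number of buffered entries is bounded by the threshold instead of by the size of the child node list, at the cost of many more writes. Whether the same bound holds for the actual compactor depends on how the persisted SegmentNodeState is used.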