[ 
https://issues.apache.org/jira/browse/JCRVLT-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16938864#comment-16938864
 ] 

Mark Adamcin commented on JCRVLT-374:
-------------------------------------

[~joerghoh] I think the memory issue is pretty straightforward... simply a case 
where too much content is buffered before serialization. Two conceptual 
solutions: 1. buffer less before write or 2. don't buffer/write that stuff at 
all. I would like to know if /jcr:system resources should simply be excluded 
from the vault-exported view of the repository, since the node type indicates 
that the immediate children are all protected, and therefore might not behave 
well when an exported content package is installed in a different repository. 
On the other hand, exporting version history along with content is obviously 
a desirable feature, though I'm not sure that explicitly including the 
/jcr:system/jcr:versionStorage path is the right way to 1) capture version 
history during assembly, or 2) serialize it in FileVault.
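
The two conceptual solutions above can be illustrated with a minimal Java 
sketch. It does not use FileVault's API: TraversalSketch, Node, collectAll, 
walk and sampleTree are hypothetical names made up for this example, and the 
in-memory tree is only a stand-in for a repository. collectAll shows the 
problematic pattern (everything is buffered first), while walk passes each 
path to a sink immediately and skips excluded subtrees without descending 
into them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical illustration only; none of these names exist in FileVault. */
final class TraversalSketch {

    /** Minimal stand-in for a repository node: a path plus its children. */
    static final class Node {
        final String path;
        final List<Node> children = new ArrayList<>();

        Node(String path) {
            this.path = path;
        }

        /** Adds a child with the given name and returns it. */
        Node add(String name) {
            Node child = new Node(path.equals("/") ? "/" + name : path + "/" + name);
            children.add(child);
            return child;
        }
    }

    /** The problematic pattern: buffer every path before anything is written. */
    static List<String> collectAll(Node root) {
        List<String> buffer = new ArrayList<>();
        walk(root, path -> true, buffer::add);
        return buffer; // grows linearly with the number of nodes
    }

    /**
     * Depth-first walk that hands each included path to the sink right away.
     * A subtree whose root is rejected by the predicate is never descended into.
     */
    static void walk(Node node, Predicate<String> include, Consumer<String> sink) {
        if (!include.test(node.path)) {
            return;
        }
        sink.accept(node.path);
        for (Node child : node.children) {
            walk(child, include, sink);
        }
    }

    /** Small sample tree loosely modelled on the paths in this issue. */
    static Node sampleTree() {
        Node root = new Node("/");
        Node versionStorage = root.add("jcr:system").add("jcr:versionStorage");
        versionStorage.add("ee");
        root.add("content").add("page");
        return root;
    }
}
```

With a sink that writes each path out as it arrives, the traversal itself 
only needs memory proportional to the depth of the tree. A predicate such as 
path -> !path.startsWith("/jcr:system") corresponds to the second option, 
where the subtree is never buffered or written at all.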

> assembling a content-package consumes much memory
> -------------------------------------------------
>
>                 Key: JCRVLT-374
>                 URL: https://issues.apache.org/jira/browse/JCRVLT-374
>             Project: Jackrabbit FileVault
>          Issue Type: Improvement
>          Components: Packaging
>    Affects Versions: 3.2.8
>            Reporter: Jörg Hoh
>            Priority: Major
>         Attachments: JCRVLT-374-proto.patch, filevault.log.gz
>
>
> I came across a situation where packaging a huge subtree 
> (/jcr:system/jcr:versionStorage) (bad idea, I know) caused a huge spike in 
> memory usage, which in turn caused lots of full GCs (due to allocation failures). 
> I have several stacktraces from that time, which all look very similar to 
> this one:
> {noformat}
> "qtp1597826410-38130" prio=5 tid=0x94f2 nid=0xffffffff runnable
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.jackrabbit.oak.segment.SegmentNodeBuilder.createChildBuilder(SegmentNodeBuilder.java:147)
>         at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.getChildNode(MemoryNodeBuilder.java:330)
>         at org.apache.jackrabbit.oak.core.SecureNodeBuilder.<init>(SecureNodeBuilder.java:110)
>         at org.apache.jackrabbit.oak.core.SecureNodeBuilder.getChildNode(SecureNodeBuilder.java:327)
>         at org.apache.jackrabbit.oak.core.MutableTree.getTree(MutableTree.java:288)
>         at org.apache.jackrabbit.oak.core.MutableRoot.getTree(MutableRoot.java:220)
>         at org.apache.jackrabbit.oak.core.MutableRoot.getTree(MutableRoot.java:69)
>         at org.apache.jackrabbit.oak.jcr.session.WorkspaceImpl$1.getTypes(WorkspaceImpl.java:85)
>         at org.apache.jackrabbit.oak.plugins.nodetype.ReadOnlyNodeTypeManager.isNodeType(ReadOnlyNodeTypeManager.java:293)
>         at org.apache.jackrabbit.oak.jcr.session.NodeImpl$24.perform(NodeImpl.java:931)
>         at org.apache.jackrabbit.oak.jcr.session.NodeImpl$24.perform(NodeImpl.java:926)
>         at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:207)
>         at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:112)
>         at org.apache.jackrabbit.oak.jcr.session.NodeImpl.isNodeType(NodeImpl.java:926)
>         at org.apache.jackrabbit.vault.fs.impl.aggregator.FileAggregator.matches(FileAggregator.java:66)
>         at org.apache.jackrabbit.vault.fs.impl.AggregatorProvider.getAggregator(AggregatorProvider.java:68)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateManagerImpl.getAggregator(AggregateManagerImpl.java:455)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:720)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:733)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.collect(AggregateImpl.java:684)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.prepare(AggregateImpl.java:747)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.load(AggregateImpl.java:657)
>         at org.apache.jackrabbit.vault.fs.impl.AggregateImpl.getArtifacts(AggregateImpl.java:259)
>         at org.apache.jackrabbit.vault.fs.impl.VaultFileImpl.<init>(VaultFileImpl.java:101)
>         at org.apache.jackrabbit.vault.fs.impl.VaultFileSystemImpl.<init>(VaultFileSystemImpl.java:120)
>         at org.apache.jackrabbit.vault.fs.Mounter.mount(Mounter.java:64)
>         at org.apache.jackrabbit.vault.packaging.impl.PackageManagerImpl.assemble(PackageManagerImpl.java:141)
>         at org.apache.jackrabbit.vault.packaging.impl.PackageManagerImpl.assemble(PackageManagerImpl.java:102)
>         at org.apache.jackrabbit.vault.packaging.impl.JcrPackageManagerImpl.assemble(JcrPackageManagerImpl.java:358)
>         at org.apache.jackrabbit.vault.packaging.impl.JcrPackageManagerImpl.assemble(JcrPackageManagerImpl.java:324)
> {noformat}
> It seems to me that vault is traversing the complete tree and also storing 
> some information about every traversed node in memory.
> To validate this, I enabled trace logging for {{org.apache.jackrabbit.vault.fs}} 
> and reproduced the problem locally by packaging the complete 
> {{/jcr:system/jcr:versionStorage}} subtree. 
> {noformat}
> [...]
> 19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl Create Aggregate /jcr:system
> 19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl Collecting /jcr:system
> 19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl descending into /jcr:system (descend=false)
> 19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:primaryType
> 19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system
> 19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:mixinTypes
> 19.09.2019 20:06:08.792 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage
> 19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl descending into /jcr:system/jcr:versionStorage (descend=true)
> 19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage/jcr:primaryType
> 19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage/ee
> 19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl descending into /jcr:system/jcr:versionStorage/ee (descend=true)
> 19.09.2019 20:06:08.793 *TRACE* [qtp681943839-1771] org.apache.jackrabbit.vault.fs.impl.AggregateImpl including /jcr:system -> /jcr:system/jcr:versionStorage/ee/jcr:primaryType
> [...]
> {noformat}
> I found a lot of these "including /jcr:system -> ..." statements in the log:
> {noformat}
> $ grep -c "AggregateImpl including" filevault.log
> 174425
> $
> {noformat}
> These statements are logged at [1]. At [2], something is unconditionally 
> added to a global variable, and I think this is the problematic piece.
> I don't know the details of vault well enough to propose a solution, but I 
> would love to have a less memory-intensive algorithm whose memory usage 
> does not grow linearly with the number of nodes covered by the package 
> rules.
> [1] 
> https://github.com/apache/jackrabbit-filevault/blob/jackrabbit-filevault-3.2.8/vault-core/src/main/java/org/apache/jackrabbit/vault/fs/impl/AggregateImpl.java#L502
> [2] 
> https://github.com/apache/jackrabbit-filevault/blob/jackrabbit-filevault-3.2.8/vault-core/src/main/java/org/apache/jackrabbit/vault/fs/impl/AggregateImpl.java#L507



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
