[ 
https://issues.apache.org/jira/browse/OAK-1768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14129806#comment-14129806
 ] 

Marcel Reutegger commented on OAK-1768:
---------------------------------------

To tackle the remaining problem, I think we need to move more of the processing 
to MongoDB. So far we keep all affected paths of a branch in DocumentNodeStore. 
This is required because the _lastRev fields of all documents modified in a 
branch must be updated with the merge commit revision. At the moment this
eventually happens in the background operation, which uses
UnsavedModifications and DocumentStore.update() calls. For large transactions
this is not just a memory problem: updating the _lastRevs also takes a long
time with the current implementation. That in turn delays propagation of
changes across cluster nodes, which is bad as well.

The idea I have in mind is to make the _lastRevs update more intelligent for a 
branch merge. Instead of passing individual keys of documents to MongoDB to 
update the _lastRev, we could pass a list of branch commit revisions that 
identifies the documents to update. This reduces the required memory to track a 
branch in DocumentNodeStore significantly and would also speed up the 
background update.
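
To make the difference concrete, here is a minimal in-memory sketch in plain
Java. It is not Oak code: the Doc class, the method names, and the use of a
Map in place of MongoDB are all assumptions made for illustration. It only
shows that a key-based update needs every affected path on the caller side,
while a revision-based update needs just the (much smaller) set of branch
commit revisions.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in Oak.
class LastRevSketch {

    // Stand-in for a stored document (not Oak's NodeDocument).
    static final class Doc {
        // Revisions of the commits that modified this document.
        final Set<String> commitRevs = new HashSet<>();
        // Stand-in for the _lastRev field.
        String lastRev;
    }

    // Key-based update: the caller has to keep every affected path in memory.
    static int updateByKeys(Map<String, Doc> store,
                            Collection<String> paths,
                            String mergeRev) {
        int updated = 0;
        for (String path : paths) {
            Doc doc = store.get(path);
            if (doc != null) {
                doc.lastRev = mergeRev;
                updated++;
            }
        }
        return updated;
    }

    // Revision-based update: the store selects the documents itself, based on
    // the branch commit revisions recorded on each document.
    static int updateByBranchCommits(Map<String, Doc> store,
                                     Set<String> branchRevs,
                                     String mergeRev) {
        int updated = 0;
        for (Doc doc : store.values()) {
            if (!Collections.disjoint(doc.commitRevs, branchRevs)) {
                doc.lastRev = mergeRev;
                updated++;
            }
        }
        return updated;
    }
}
```

In a real store the loop in updateByBranchCommits would have to run on the
MongoDB side as a query-driven update; how to express that query is exactly
the open part of the idea.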

The tricky part is how to keep the existing branch checks in DocumentNodeStore 
and NodeDocument. It is also not clear to me yet how to handle ancestors not
directly affected by the branch commits. Let's say a huge tree is added under
/foo/bar/baz. A merge will have to update the _lastRev of /foo, /foo/bar and 
/foo/bar/baz as well even though they were not directly touched by the branch. 
I guess for those nodes we'd still have to track the paths and _lastRevs as we 
do now.
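
For the ancestor case, the set of paths that would still need explicit
tracking is small and can be derived from the root of the added subtree. A
minimal sketch in plain Java (the class and method names are hypothetical,
and the root path "/" is left out only to match the example above; a real
implementation may need to include it):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Oak.
class AncestorPaths {

    // Returns the given absolute path followed by its ancestors, deepest
    // first, stopping before the root path "/".
    static List<String> selfAndAncestors(String path) {
        List<String> result = new ArrayList<>();
        String current = path;
        while (current.length() > 1) {
            result.add(current);
            int idx = current.lastIndexOf('/');
            // Index 0 means the parent is the root, which ends the loop.
            current = idx <= 0 ? "/" : current.substring(0, idx);
        }
        return result;
    }
}
```

For /foo/bar/baz this yields /foo/bar/baz, /foo/bar and /foo, so the tracked
set grows with the depth of the subtree root rather than with the size of the
added tree.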

> DocumentNodeBuilder.setChildNode() runs OOM with large tree
> -----------------------------------------------------------
>
>                 Key: OAK-1768
>                 URL: https://issues.apache.org/jira/browse/OAK-1768
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.0
>            Reporter: Marcel Reutegger
>            Assignee: Marcel Reutegger
>             Fix For: 1.1
>
>         Attachments: 
> 0001-OAK-1768-DocumentNodeBuilder.setChildNode-runs-OOM-w.patch
>
>
> This is a follow up issue for OAK-1760. There we implemented a workaround in 
> the upgrade code to prevent the OOME. For the 1.1 release we should implement 
> a proper fix in the DocumentNodeStore as suggested by Jukka in OAK-1760.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
