[ 
https://issues.apache.org/jira/browse/OAK-591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13578292#comment-13578292
 ] 

Marcel Reutegger commented on OAK-591:
--------------------------------------

Meanwhile I thought about the MK again and how it addresses nodes (primarily by 
path+revision). It would be nice if we allowed the MK to annotate the children 
in a getNodes() call with revision information. This is similar to the recently 
added :id and :hash, but has the advantage that it does not introduce yet 
another way to address nodes. The returned path+revision for children could 
then be used in any of the other methods of the MK. The only problem I see with 
this approach is that our current MK implementations cannot easily provide the 
revision information. Linking back to a previous revision of an unmodified 
subtree is done with hashes (in MKH2), with IDs (in SegmentMK) or not at all 
(in MongoMK). I therefore dropped the idea again.

What I implemented instead is the :hash lookup approach after the read.
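
To illustrate the idea, here is a minimal sketch of a cache that keeps a
second index on :hash next to the path+revision key. It is not the code from
OAK-591.patch. NodeState, FakeKernel and NodeCache are hypothetical stand-ins
invented for this example, and the assumption that a parent read reports the
:hash of each child is a simplification of what getNodes() returns.

```java
import java.util.HashMap;
import java.util.Map;

public class HashLookupSketch {

    /** Hypothetical node state: its own :hash plus child name -> child :hash. */
    static final class NodeState {
        final String hash;
        final Map<String, String> childHashes;

        NodeState(String hash, Map<String, String> childHashes) {
            this.hash = hash;
            this.childHashes = childHashes;
        }
    }

    /** Hypothetical in-memory stand-in for the MK that counts node reads. */
    static final class FakeKernel {
        private final Map<String, NodeState> nodes = new HashMap<>();
        int reads = 0;

        void put(String path, long revision, NodeState state) {
            nodes.put(path + "@" + revision, state);
        }

        NodeState getNode(String path, long revision) {
            reads++;
            return nodes.get(path + "@" + revision);
        }
    }

    /** Cache keyed by path+revision, optionally with a second index on :hash. */
    static final class NodeCache {
        private final Map<String, NodeState> byPathRevision = new HashMap<>();
        private final Map<String, NodeState> byHash = new HashMap<>();
        private final FakeKernel kernel;
        private final boolean useHash;

        NodeCache(FakeKernel kernel, boolean useHash) {
            this.kernel = kernel;
            this.useHash = useHash;
        }

        /** knownHash is the child's :hash as seen when reading its parent, or null. */
        NodeState get(String path, long revision, String knownHash) {
            String key = path + "@" + revision;
            NodeState state = byPathRevision.get(key);
            if (state == null && useHash && knownHash != null) {
                // Same hash means same subtree, whatever the revision.
                state = byHash.get(knownHash);
            }
            if (state == null) {
                state = kernel.getNode(path, revision);
                if (state == null) {
                    return null;
                }
            }
            byPathRevision.put(key, state);
            byHash.put(state.hash, state);
            return state;
        }
    }

    /** Reads / and /test at revisions 1 and 2; returns the number of kernel reads. */
    static int readsFor(boolean useHash) {
        FakeKernel kernel = new FakeKernel();
        NodeState test = new NodeState("h-test", Map.of());
        kernel.put("/", 1, new NodeState("h-root-1",
                Map.of("test", "h-test", "other", "h-other-1")));
        kernel.put("/test", 1, test);
        // Revision 2: only /other changed, so /test keeps its hash.
        kernel.put("/", 2, new NodeState("h-root-2",
                Map.of("test", "h-test", "other", "h-other-2")));
        kernel.put("/test", 2, test);

        NodeCache cache = new NodeCache(kernel, useHash);
        for (long revision = 1; revision <= 2; revision++) {
            NodeState root = cache.get("/", revision, null);
            cache.get("/test", revision, root.childHashes.get("test"));
        }
        return kernel.reads;
    }

    public static void main(String[] args) {
        System.out.println("path+revision only: " + readsFor(false) + " reads");
        System.out.println("with :hash lookup:  " + readsFor(true) + " reads");
    }
}
```

In this toy setup the path+revision cache alone needs 4 kernel reads, while
the :hash lookup needs 3: the root is still read at the new revision, but the
unchanged /test node is served from the cache.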
                
> Improve KernelNodeStore cache efficiency
> ----------------------------------------
>
>                 Key: OAK-591
>                 URL: https://issues.apache.org/jira/browse/OAK-591
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core
>    Affects Versions: 0.6
>            Reporter: Marcel Reutegger
>         Attachments: mk.log.gz, OAK-591.patch
>
>
> The cache in KernelNodeStore references entries with a path+revision combo. 
> This mapping quickly becomes inefficient when there are writes on the 
> repository. Whenever something is changed, the complete cache basically 
> becomes invalid and oak-core needs to re-fetch nodes, even though they 
> didn't change. The attached test shows this behaviour. The test initially 
> creates 10 nodes and lets a thread read those nodes repeatedly. To make the 
> test somewhat realistic the reader acquires a new session in every run 
> through the loop. This is to simulate e.g. a request which acquires a new 
> session every time (Apache Sling does it that way). At the same time writes 
> occur but in a separate part of the repository. As can be seen in the logs, 
> the nodes are read from the MicroKernel whenever something changes anywhere 
> in the repository. Obviously this is not limited to the test nodes. The log 
> also shows repeated reads of node type, user and index nodes. None of them 
> change while the test runs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
