[jira] [Comment Edited] (OAK-4176) Persistent Cache improvements

Thomas Mueller (JIRA) Thu, 14 Apr 2016 06:34:45 -0700

    [ 
https://issues.apache.org/jira/browse/OAK-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241152#comment-15241152
 ]


Thomas Mueller edited comment on OAK-4176 at 4/14/16 1:34 PM:
--------------------------------------------------------------

Results so far:

* I found a performance problem in the LIRS cache: notifyAll() was called far 
more often than needed. Avoiding the unnecessary calls makes the persistent 
cache a bit faster.
* The persistent cache currently flushes once per second, with a write buffer 
of about 2 MB. This could be changed to flush every 10 seconds with a larger 
write buffer (let's say 16 MB), which reduces the persistent file size from 
about 600 MB to about 500 MB.
* Traversing the whole repository with the document store takes about 5 seconds 
(with everything in the persistent cache), versus 1.5 seconds with the segment 
store. One possible cause is that the document store reads all properties into 
memory (it deserializes everything eagerly), while the segment store only does 
that on demand, so the test may be somewhat "unfair". JSON deserialization 
overhead is about 15%; we should consider using lazy deserialization.
* Changing the data model in the persistent cache from "key+revision" to 
"revision+key" didn't change much: both the performance and the disk space 
used are about the same.
* Using the segment store to persist is work in progress; this will take some 
more time.
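
As a rough illustration of the notifyAll() point above: one way to avoid 
unneeded calls is to count the waiting threads and only notify when at least 
one is blocked. This is a sketch only, not the actual LIRS cache code; the 
class LoadGate and all of its members are made-up names.

```java
/**
 * Hypothetical gate that threads wait on until a cache entry is loaded.
 * notifyAll() is only called when at least one thread is actually waiting.
 */
final class LoadGate {
    private int waiters;      // threads currently blocked in awaitDone()
    private boolean done;     // set once the entry has been loaded
    private int notifyCalls;  // counts real notifyAll() calls (illustration only)

    /** Blocks until markDone() has been called. */
    synchronized void awaitDone() throws InterruptedException {
        waiters++;
        try {
            while (!done) {
                wait();
            }
        } finally {
            waiters--;
        }
    }

    /** Marks the entry as loaded and wakes up waiters, if there are any. */
    synchronized void markDone() {
        done = true;
        if (waiters > 0) {
            notifyCalls++;
            notifyAll();
        }
    }

    synchronized int notifyCalls() {
        return notifyCalls;
    }
}
```

In the common case where nobody is waiting for the entry, markDone() only 
sets the flag and returns.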

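The lazy deserialization idea from the traversal bullet could look roughly 
like the following. This is a minimal sketch under assumptions: LazyValue is a 
hypothetical helper, not an existing Oak class, and the parser function stands 
in for whatever JSON parsing the document store does today.

```java
import java.util.function.Function;

/**
 * Hypothetical holder that keeps the serialized form of a value and
 * parses it only when the value is first requested.
 */
final class LazyValue<T> {
    private final String serialized;
    private final Function<String, T> parser;
    private T value;         // parsed form, unset until the first get()
    private boolean parsed;  // separate flag so a null result is cached too

    LazyValue(String serialized, Function<String, T> parser) {
        this.serialized = serialized;
        this.parser = parser;
    }

    /** Parses on the first call, then returns the cached result. */
    synchronized T get() {
        if (!parsed) {
            value = parser.apply(serialized);
            parsed = true;
        }
        return value;
    }

    synchronized boolean isParsed() {
        return parsed;
    }
}
```

A traversal that never reads a property would then never pay for parsing it, 
which is closer to what the segment store does on demand.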


was (Author: tmueller):
Results so far:

* I found a performance problem in the LIRS cache: notifyAll() was called far 
more often than needed. Avoiding the unnecessary calls makes the persistent 
cache a bit faster.
* Traversing the whole repository with the document store takes about 5 seconds 
(with everything in the persistent cache), versus 1.5 seconds with the segment 
store. One possible cause is that the document store reads all properties into 
memory (it deserializes everything eagerly), while the segment store only does 
that on demand, so the test may be somewhat "unfair". JSON deserialization 
overhead is about 15%; we should consider using lazy deserialization.
* Changing the data model in the persistent cache from "key+revision" to 
"revision+key" didn't change much: both the performance and the disk space 
used are about the same.
* Using the segment store to persist is work in progress; this will take some 
more time.

> Persistent Cache improvements
> -----------------------------
>
>                 Key: OAK-4176
>                 URL: https://issues.apache.org/jira/browse/OAK-4176
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: core, documentmk
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>
> We want to analyze and improve the persistent cache (especially the node 
> cache):
> * Measure how much slower / faster it is compared to the segment store, for 
> example for traversal of the whole repository.
> * Measure the cache miss overhead, and find ways to reduce it.
> * Try to improve the data model, especially the ordering of the data and 
> its granularity.
> * Try to reduce the usage of third-party libraries where it makes sense.
> * Hierarchy info is lost in the persistent cache API (currently it's just a 
> key-value store), so possibly change the API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
