[
https://issues.apache.org/jira/browse/OAK-4176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15241152#comment-15241152
]
Thomas Mueller edited comment on OAK-4176 at 4/14/16 1:34 PM:
--------------------------------------------------------------
Results so far:
* I found a performance problem in the LIRS cache: notifyAll was called quite a
lot. This can be avoided, making the persistent cache a bit faster.
* The persistent cache flushes once per second, and the write buffer is about
2 MB. This could be changed to flush every 10 seconds and use a larger write
buffer (let's say 16 MB). This reduces the persistent cache file size from
about 600 MB to about 500 MB.
* Traversing the whole repository with the document store takes about 5 seconds
(everything in the persistent cache), versus 1.5 seconds with the segment
store. One of the problems might be that the document store reads all
properties into memory (de-serializing everything eagerly), while the segment
store only does that on demand. So maybe the test is somewhat "unfair". JSON
de-serialization overhead is about 15%, so we should consider using lazy
de-serialization.
* Changing the data model in the persistent cache from "key+revision" to
"revision+key" didn't change much: the performance is about the same, the disk
space used is about the same.
* Using the segment store to persist is work in progress; this will take some
more time.
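A minimal sketch of the lazy de-serialization idea mentioned above. This is
not Oak code: the class LazyProperties, its methods, and the pluggable parser
function are hypothetical names used only for illustration. The serialized
form is kept as-is and parsed only when a property is first read, so a
traversal that never looks at properties pays no parsing cost.

```java
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of lazy de-serialization; not part of the Oak API. */
final class LazyProperties {
    // Serialized form (for example JSON) exactly as read from the cache.
    private final String raw;
    // Assumed parser that turns the serialized form into a property map.
    private final Function<String, Map<String, String>> parser;
    // Stays null until a property is requested for the first time.
    private Map<String, String> parsed;

    LazyProperties(String raw, Function<String, Map<String, String>> parser) {
        this.raw = raw;
        this.parser = parser;
    }

    /** Returns the property value, parsing the raw form on first use only. */
    synchronized String get(String name) {
        if (parsed == null) {
            parsed = parser.apply(raw);
        }
        return parsed.get(name);
    }

    /** True once the raw form has been parsed. */
    synchronized boolean isParsed() {
        return parsed != null;
    }
}
```

The trade-off is that the first property access on each node becomes slower,
and the raw string has to stay in memory until it is parsed.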
> Persistent Cache improvements
> -----------------------------
>
> Key: OAK-4176
> URL: https://issues.apache.org/jira/browse/OAK-4176
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: core, documentmk
> Reporter: Thomas Mueller
> Assignee: Thomas Mueller
>
> We want to analyze and improve the persistent cache (especially the node
> cache):
> * Measure how much slower / faster it is compared to the segment store, for
> example for traversal of the whole repository.
> * Measure the cache miss overhead, and find ways to reduce it.
> * Try to improve the data model, especially ordering of the data and
> granularity.
> * Try to reduce usage of 3rd party libraries if it makes sense.
> * Hierarchy info is lost in the persistent cache API (currently it's just a
> key-value store), so possibly change the API.