[
https://issues.apache.org/jira/browse/OAK-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779857#comment-13779857
]
Jukka Zitting commented on OAK-1042:
------------------------------------
bq. negative cache
Actually we shouldn't need one at this level, or if we do we have a big
problem. :-) All cache lookups result from following existing references,
so unless there's an inconsistency each lookup should always find a
match.
That said, there probably is room for a Bloom filter (yay!) within the TarMK.
Currently all the tar files are scanned at startup, and in-memory maps of the
locations of all segments are kept. That will be quite expensive with large
repositories, so I was thinking of adding a small Bloom filter for each tar file
to quickly check whether a given segment can possibly exist within that
file.
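
To make the per-file Bloom filter idea concrete, here is a minimal sketch using only java.util. It is not Oak's actual code: the class name SegmentBloomFilter, the bit-set size, the number of hash probes and the mixing function are all assumptions chosen for illustration. A negative answer means the segment is definitely not in that tar file; a positive answer only means it may be there.

```java
import java.util.BitSet;
import java.util.UUID;

/** Hypothetical per-tar-file Bloom filter over segment UUIDs (illustration only). */
final class SegmentBloomFilter {
    private final BitSet bits;
    private final int size;   // number of bits in the filter
    private final int hashes; // number of bit positions probed per UUID

    SegmentBloomFilter(int size, int hashes) {
        if (size <= 0 || hashes <= 0) {
            throw new IllegalArgumentException("size and hashes must be positive");
        }
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    /** Records a segment id as present in this tar file. */
    void add(UUID id) {
        for (int i = 0; i < hashes; i++) {
            bits.set(index(id, i));
        }
    }

    /** Returns false only if the id was never added; true may be a false positive. */
    boolean mightContain(UUID id) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(id, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: the i-th probe is h1 + i * h2, reduced to the bit range.
    private int index(UUID id, int i) {
        long h1 = mix(id.getMostSignificantBits());
        long h2 = mix(id.getLeastSignificantBits()) | 1L;
        return (int) Math.floorMod(h1 + i * h2, (long) size);
    }

    // 64-bit mixing step (splitmix64 finalizer) to spread the UUID bits.
    private static long mix(long x) {
        x ^= x >>> 30;
        x *= 0xbf58476d1ce4e5b9L;
        x ^= x >>> 27;
        x *= 0x94d049bb133111ebL;
        x ^= x >>> 31;
        return x;
    }
}
```

With one such filter per tar file, a lookup would only need to open the files whose filter answers true, at the cost of an occasional wasted read on a false positive.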
> Segment node store caching
> --------------------------
>
> Key: OAK-1042
> URL: https://issues.apache.org/jira/browse/OAK-1042
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Reporter: Thomas Mueller
> Assignee: Thomas Mueller
>
> Segment node store caching seems to use quite a lot of CPU. According to my
> test, the oak-run SimpleSearchTest spends about 50% of its CPU time in Segment
> node store caching, as measured with the built-in profiler:
> {code}
> java -mx1g -Dwarmup=3 -Druntime=15 -jar target/oak-run-*.jar benchmark SimpleSearchTest Oak-Tar
> packages:
> 48%: com.google.common.cache <== cache
> 16%: org.apache.jackrabbit.oak.plugins.segment
> 8%: org.apache.jackrabbit.oak.plugins.memory
> 4%: org.apache.jackrabbit.oak.util
> 3%: org.apache.jackrabbit.oak.core
> 2%: org.apache.jackrabbit.oak.benchmark
> 2%: com.google.common.base <== cache
> .
> Oak-Tar 308 310 313 324 344 48
> {code}
> The problem seems to be the cache in the FileStore. As far as I can see, the
> cache limit is 1000 <UUID, Segment> entries (size based, not weight based).
> I wonder if there is a simple way to reduce CPU usage. I will try with the
> LIRS cache.
> I also wonder whether this cache should really be size limited rather than
> weight limited (segments can have different sizes as far as I know).
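
The size-versus-weight question in the quoted description can be illustrated with a small sketch. This is neither the FileStore cache nor the LIRS cache mentioned above: WeightedSegmentCache is a hypothetical, non-thread-safe LRU cache built on java.util.LinkedHashMap, and it assumes a segment can be represented as a byte array whose length is its weight. Instead of capping the number of entries, it evicts least recently used entries until the total byte count fits the limit.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical LRU cache bounded by total byte weight rather than entry count. */
final class WeightedSegmentCache {
    private final long maxWeight; // upper bound on the sum of cached segment lengths
    private long currentWeight;
    // accessOrder = true: iteration starts at the least recently used entry.
    private final LinkedHashMap<UUID, byte[]> entries =
            new LinkedHashMap<>(16, 0.75f, true);

    WeightedSegmentCache(long maxWeight) {
        this.maxWeight = maxWeight;
    }

    /** Returns the cached segment or null; a hit marks the entry as recently used. */
    byte[] get(UUID id) {
        return entries.get(id);
    }

    /** Adds a segment, then evicts least recently used entries until the weight fits. */
    void put(UUID id, byte[] segment) {
        byte[] old = entries.put(id, segment);
        if (old != null) {
            currentWeight -= old.length;
        }
        currentWeight += segment.length;
        Iterator<Map.Entry<UUID, byte[]>> it = entries.entrySet().iterator();
        while (currentWeight > maxWeight && it.hasNext()) {
            currentWeight -= it.next().getValue().length;
            it.remove();
        }
    }

    int size() {
        return entries.size();
    }

    long weight() {
        return currentWeight;
    }
}
```

With a weight limit, a few large segments displace many small ones, so memory use stays predictable even when segment sizes vary. A count limit of 1000 entries gives no such guarantee.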