[
https://issues.apache.org/jira/browse/JENA-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778647#comment-13778647
]
Andy Seaborne edited comment on JENA-524 at 9/26/13 10:58 AM:
--------------------------------------------------------------
I would appreciate a guide as to how concurrent access to the cache is
managed as a design. (I found a commented-out {{synchronized}}!) Concurrency
is tricky, and to be sure it is working properly I need more than the code:
I find it hard to intuit the requirements and assumptions from the code alone.
was (Author: andy.seaborne):
I would appreciate a guide as to how concurrent access to the cache is
managed as a design. (I found a commented-out {{synchronized}}!) I need to
see concurrency as a design view as well as code where the cache is doing
concurrency control.
> Global Cache for servers hosting a large number of TDB stores
> -------------------------------------------------------------
>
> Key: JENA-524
> URL: https://issues.apache.org/jira/browse/JENA-524
> Project: Apache Jena
> Issue Type: New Feature
> Components: TDB
> Affects Versions: TDB 0.10.1
> Reporter: André Lanka
> Priority: Minor
> Labels: patch
> Attachments: patch_hojoki_global_cache.txt
>
>
> Hello,
> we (namely Hojoki) have been using Jena/TDB for a couple of years. In 2011 we
> started to implement a global cache shared by all TDB stores currently open
> on a server. The motivation was that we need many TDB stores on a single
> machine to provide parallel write access to the different graphs. Our goal
> was to have more than 2000 stores on a single machine. As we have only 8GB of
> memory for the JVM, we can't use appropriately sized local caches for each
> store. So, we decided to implement a global shared cache for both
> Nodes/NodeIDs and Blocks. We have tested our changes intensively with the
> current TDB version 0.10.1 since its release, and they work well. Currently
> we host more than 5000 stores on each server, containing more than a billion
> triples on each server (stored in roughly 150-200 GB of TDB data). The cache
> has a size of approximately 500 MB.
> We would be very happy to integrate our changes into the official TDB
> branch. Our cache can be turned on by calling SystemTDB.useGlobalCache(true).
> If this method is not called, the factories use the original NodeTableCache
> and the original BlockMgrCache. If it is called, our table and our manager
> are used. Of course, it has some overhead, but at least it makes it possible
> to have this large number of stores on a single machine.
> We only tested it with FileMode.direct, as this is the only mode we use (for
> smaller file sizes, and we know for sure when changes are written to disk --
> important for our backup mechanism). The cache applies only to the big data
> files on disk, not to the journal files.
> I can provide a patch I created yesterday against the current snapshot
> version (I can't find an upload field in this "Create issue" mask). The patch
> still contains a few tests that are Hojoki-specific, and some parts could be
> made more general (configuration by config files instead of code constants,
> and such things).
> Anyway, if you allow us to integrate our changes, I'll improve these parts.
> What do you think?
> Best wishes
> André
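
To make the concurrency question above concrete, here is a minimal sketch of one way a single bounded cache could be shared by many stores. It is not taken from the attached patch. The names SharedBlockCache and Key are hypothetical, and the single lock around an access-ordered LinkedHashMap is an assumption chosen for simplicity, not a description of how the Hojoki cache works.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch: one bounded LRU cache shared by all stores in a JVM. */
final class SharedBlockCache {

    /** Cache key: the owning store plus the block id within that store. */
    static final class Key {
        final String storeId;
        final long blockId;

        Key(String storeId, long blockId) {
            this.storeId = Objects.requireNonNull(storeId);
            this.blockId = blockId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return blockId == k.blockId && storeId.equals(k.storeId);
        }

        @Override
        public int hashCode() {
            return 31 * storeId.hashCode() + Long.hashCode(blockId);
        }
    }

    // Access-ordered map: iteration order runs from least to most recently used.
    private final Map<Key, byte[]> lru;

    SharedBlockCache(final int maxEntries) {
        this.lru = new LinkedHashMap<Key, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Key, byte[]> eldest) {
                // Evict the least recently used entry once the global bound is exceeded.
                return size() > maxEntries;
            }
        };
    }

    // Every method takes the same lock. Even get() must, because an
    // access-ordered LinkedHashMap reorders its entries on reads.
    synchronized byte[] get(String storeId, long blockId) {
        return lru.get(new Key(storeId, blockId));
    }

    synchronized void put(String storeId, long blockId, byte[] block) {
        lru.put(new Key(storeId, blockId), block);
    }

    synchronized int size() {
        return lru.size();
    }
}
```

The design statement here is short: all state is guarded by one monitor, and the memory bound is global rather than per store. A single lock is easy to reason about but serializes all stores, so a real implementation would likely need something finer-grained, which is exactly where a written description of the requirements and assumptions helps.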