Should we look at using the JCache (JSR-107) interface for the caching operations? That would allow pluggable cache implementations.
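
To make the "pluggable" idea concrete, here is a rough sketch in plain Java. It is only an illustration: PluggableCache is a hypothetical stand-in loosely modelled on the get/put/remove style of JCache (it is not javax.cache.Cache), and StoreKey and LruCache are names invented for this example, not anything in TDB or in the attached patch. The point is that one bounded cache can be shared by many stores if the key carries the store identity, and that the eviction strategy sits behind an interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical cache contract (assumption: NOT javax.cache.Cache, only similar in spirit). */
interface PluggableCache<K, V> {
    V get(K key);
    void put(K key, V value);
    boolean remove(K key);
    int size();
}

/** Hypothetical key: scopes an id to one store so that many stores can share one cache. */
final class StoreKey {
    private final String store;
    private final Object id;   // object key, since ids may not always fit a long

    StoreKey(String store, Object id) {
        this.store = Objects.requireNonNull(store);
        this.id = Objects.requireNonNull(id);
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof StoreKey)) return false;
        StoreKey that = (StoreKey) other;
        return store.equals(that.store) && id.equals(that.id);
    }

    @Override
    public int hashCode() {
        return Objects.hash(store, id);
    }
}

/** One possible plug-in: a bounded LRU cache built on an access-ordered LinkedHashMap. */
final class LruCache<K, V> implements PluggableCache<K, V> {
    private final Map<K, V> map;

    LruCache(final int maxEntries) {
        // accessOrder = true makes iteration order least-recently-used first
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxEntries;   // evict once the bound is exceeded
            }
        };
    }

    @Override public synchronized V get(K key) { return map.get(key); }
    @Override public synchronized void put(K key, V value) { map.put(key, value); }
    @Override public synchronized boolean remove(K key) { return map.remove(key) != null; }
    @Override public synchronized int size() { return map.size(); }
}

class GlobalCacheSketch {
    public static void main(String[] args) {
        // A single cache instance shared by all stores on the server.
        PluggableCache<StoreKey, String> global = new LruCache<>(1000);
        global.put(new StoreKey("storeA", 42L), "node from store A");
        global.put(new StoreKey("storeB", 42L), "node from store B");
        System.out.println(global.get(new StoreKey("storeA", 42L)));
    }
}
```

With something like this in place, the existing cache code, Guava's cache, or a real JSR-107 provider could each be wrapped behind the same interface, and the cache design discussion stays separate from the choice of implementation.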

On Tue, Sep 17, 2013 at 10:48 AM, Andy Seaborne (JIRA) <[email protected]> wrote:

>
>     [ https://issues.apache.org/jira/browse/JENA-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769372#comment-13769372 ]
>
> Andy Seaborne commented on JENA-524:
> ------------------------------------
>
> Within TDB, long and Long both get used. It is likely that object map
> keys will win out because Long is too short for some ids in TDB in the
> long term. We can use the existing cache code or Google's guava cache
> code and focus on the cache design for now.
>
> > Global Cache for servers hosting a large number of TDB stores
> > -------------------------------------------------------------
> >
> >                 Key: JENA-524
> >                 URL: https://issues.apache.org/jira/browse/JENA-524
> >             Project: Apache Jena
> >          Issue Type: New Feature
> >          Components: TDB
> >    Affects Versions: TDB 0.10.1
> >            Reporter: André Lanka
> >            Priority: Minor
> >              Labels: patch
> >             Fix For: TDB 1.0.0
> >
> >         Attachments: patch_hojoki_global_cache.txt
> >
> >
> > Hello,
> > we (namely Hojoki) have used Jena/TDB for a couple of years. In 2011 we
> > started to implement a global cache shared over all TDB stores
> > currently opened on a server. The motivation was that we need to have
> > many TDB stores on a single machine to provide parallel write access
> > to the different graphs. Our goal was to have more than 2000 stores on
> > a single machine. As we have only 8GB of memory for the JVM, we can't
> > use appropriately sized local caches for each store.
> > So, we decided to implement a global shared cache for both
> > Nodes/NodeIDs and Blocks. We have intensively tested our changes with
> > the current TDB version 0.10.1 since it came out, and it works well.
> > Currently we host more than 5000 stores on each server, containing
> > more than a billion triples on each server (stored in roughly 150-200
> > GB of TDB data). The cache has a size of approximately 500 MB.
> > We will be very happy if we can integrate our changes in the official
> > tdb branch. Our cache can be turned on by calling
> > SystemTDB.useGlobalCache(true). If this method is not called, the
> > factories use the original NodeTableCache and the original
> > BlockMgrCache. If it's called, our table and our manager are used. Of
> > course, it has some overhead, but at least it's possible to have this
> > large number of stores on a single machine.
> > We only tested it with FileMode.direct as we only use this mode (for
> > smaller file sizes, and we know for sure when changes are written to
> > disk -- important for our backup mechanism). The cache applies only to
> > the big data files on disk, not to the journal files.
> > I can provide a patch I created yesterday against the current snapshot
> > version (I can't find an upload field in this "Create issue" mask).
> > The patch still contains a few tests that are merely Hojoki specific,
> > and it could use a few more general approaches (configuration by
> > config files instead of code constants, and such things).
> > Anyway, if you allow us to integrate our changes, I'll improve these
> > parts.
> > What do you think?
> > Best wishes
> > André
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>

-- 
I like: Like Like - The likeliest place on the web <http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
