Re: [jira] [Commented] (JENA-524) Global Cache for servers hosting a large number of TDB stores

Claude Warren Tue, 17 Sep 2013 03:00:05 -0700

Should we examine using the JCache interface for caching operations?  This
would allow for pluggable implementations.
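
To make the idea concrete, here is a rough sketch in plain Java of what a
pluggable, store-spanning cache could look like. It is not the JCache API
itself and not the code from the attached patch: PluggableCache, StoreKey and
LruCache are hypothetical names made up for illustration. The interface only
mirrors the get/put style that javax.cache.Cache offers, and the LRU policy is
an assumption, since the thread does not say which eviction policy the Hojoki
cache uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

public class GlobalCacheSketch {

    /** Hypothetical minimal cache contract; a JCache-backed class could implement it too. */
    interface PluggableCache<K, V> {
        V get(K key);
        void put(K key, V value);
        int size();
    }

    /** Hypothetical key: store name plus a per-store id, so one cache can serve many stores. */
    static final class StoreKey {
        final String storeId;
        final long nodeId;

        StoreKey(String storeId, long nodeId) {
            this.storeId = storeId;
            this.nodeId = nodeId;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof StoreKey)) return false;
            StoreKey other = (StoreKey) o;
            return nodeId == other.nodeId && storeId.equals(other.storeId);
        }

        @Override
        public int hashCode() {
            return Objects.hash(storeId, nodeId);
        }
    }

    /** One possible implementation: a bounded LRU built on an access-ordered LinkedHashMap. */
    static final class LruCache<K, V> implements PluggableCache<K, V> {
        private final Map<K, V> map;

        LruCache(final int maxEntries) {
            // accessOrder = true moves an entry to the end on every get/put.
            this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                    return size() > maxEntries;
                }
            };
        }

        @Override public synchronized V get(K key) { return map.get(key); }
        @Override public synchronized void put(K key, V value) { map.put(key, value); }
        @Override public synchronized int size() { return map.size(); }
    }

    public static void main(String[] args) {
        // A single cache instance shared by entries from three different stores.
        PluggableCache<StoreKey, String> cache = new LruCache<>(2);
        cache.put(new StoreKey("storeA", 1L), "node-a1");
        cache.put(new StoreKey("storeB", 1L), "node-b1");
        cache.get(new StoreKey("storeA", 1L));            // touch storeA's entry
        cache.put(new StoreKey("storeC", 1L), "node-c1"); // evicts storeB's entry
        System.out.println(cache.get(new StoreKey("storeB", 1L)));
        System.out.println(cache.size());
    }
}
```

Code in TDB would then depend only on the small interface, and an adapter
around a JCache provider, Guava, or the existing cache code could be swapped
in behind it.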



On Tue, Sep 17, 2013 at 10:48 AM, Andy Seaborne (JIRA) <[email protected]> wrote:

>
>     [
> https://issues.apache.org/jira/browse/JENA-524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769372#comment-13769372]
>
> Andy Seaborne commented on JENA-524:
> ------------------------------------
>
> Within TDB, long and Long both get used.  It is likely that object map
> keys will win out because Long is too short for some ids in TDB in the long
> term.  We can use the existing cache code or Google's Guava cache code and
> focus on the cache design for now.
>
> > Global Cache for servers hosting a large number of TDB stores
> > -------------------------------------------------------------
> >
> >                 Key: JENA-524
> >                 URL: https://issues.apache.org/jira/browse/JENA-524
> >             Project: Apache Jena
> >          Issue Type: New Feature
> >          Components: TDB
> >    Affects Versions: TDB 0.10.1
> >            Reporter: André Lanka
> >            Priority: Minor
> >              Labels: patch
> >             Fix For: TDB 1.0.0
> >
> >         Attachments: patch_hojoki_global_cache.txt
> >
> >
> > Hello,
> > we (namely Hojoki) have been using Jena/TDB for a couple of years. In
> > 2011 we started to implement a global cache shared across all TDB stores
> > currently open on a server. The motivation was that we need many TDB
> > stores on a single machine to provide parallel write access to the
> > different graphs. Our goal was to have more than 2000 stores on a single
> > machine. As we have only 8 GB of memory for the JVM, we can't use
> > appropriately sized local caches for each store.
> > So we decided to implement a global shared cache for both Nodes/NodeIDs
> > and Blocks. We have tested our changes intensively with the current TDB
> > version 0.10.1 since it came out, and it works well. Currently we host
> > more than 5000 stores on each server, containing more than a billion
> > triples per server (stored in roughly 150-200 GB of TDB data). The cache
> > has a size of approximately 500 MB.
> > We would be very happy to integrate our changes into the official TDB
> > branch. Our cache can be turned on by calling
> > SystemTDB.useGlobalCache(true). If this method is not called, the
> > factories use the original NodeTableCache and the original BlockMgrCache.
> > If it is called, our table and our manager are used. Of course, it has
> > some overhead, but at least it makes it possible to have this large
> > number of stores on a single machine.
> > We have only tested it with FileMode.direct, as that is the only mode we
> > use (for smaller file sizes, and because we know for sure when changes
> > are written to disk -- important for our backup mechanism). The cache
> > applies only to the big data files on disk, not to the journal files.
> > I can provide a patch I created yesterday against the current snapshot
> > version (I can't find an upload field in this "Create issue" mask). The
> > patch still contains a few tests that are Hojoki-specific, and it could
> > use some more general approaches (configuration by config files instead
> > of code constants, and such things).
> > Anyway, if you allow us to integrate our changes, I'll improve these
> > parts.
> > What do you think?
> > Best wishes
> > André
>
> --
> This message is automatically generated by JIRA.
> If you think it was sent incorrectly, please contact your JIRA
> administrators
> For more information on JIRA, see: http://www.atlassian.com/software/jira
>



-- 
I like: Like Like - The likeliest place on the web<http://like-like.xenei.com>
LinkedIn: http://www.linkedin.com/in/claudewarren
