[
https://issues.apache.org/jira/browse/JENA-902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361956#comment-14361956
]
Andy Seaborne edited comment on JENA-902 at 3/14/15 6:34 PM:
-------------------------------------------------------------
[~jdbeer] One question about your setup - could you put all the graphs in one
dataset?
Could you try out the feature described below? As with any disk-format-sensitive
feature, testing and compatibility are especially vital.
There is now a proper mechanism, which replaces the properties file.
The {{StoreParams}} machinery is flexible as of Jena 2.13.0 (TDB 1.1.2),
the first release that exposes it.
The system looks for a file "tdb.cfg" ({{StoreParamsCodec}}) in the database
location; alternatively, parameters can be set by pre-populating the
DatasetGraph system cache using {{StoreConnection.make}}. Non-standard settings
are written to disk.
Some parameters must be the same as when the dataset was created; otherwise,
the dataset will be corrupted. That is, don't edit static parameters in
"tdb.cfg" after the database is created.
Some parameters can be varied on a per-run basis; this is captured by
{{StoreParamsDynamic}}. If attaching to an existing database, static parameters
set in {{StoreConnection.make}} are ignored.
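As a rough illustration of the "tdb.cfg" route, the sketch below writes a file that overrides only the two node cache sizes from the issue description, which are per-run settings. It rests on several assumptions that should be checked against {{StoreParamsCodec}} before use: the JSON key names, that keys left out of the file fall back to the defaults, and that the file is in place before the JVM first connects to the database. The class name {{TdbCfgSketch}}, the location "DB" and the sizes are made up for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Sketch: write a tdb.cfg that overrides only the two node cache sizes. */
class TdbCfgSketch {

    // Assumed key names; verify against StoreParamsCodec before relying on them.
    static final String NODE2ID_KEY = "tdb.node2nodeid_cache_size";
    static final String ID2NODE_KEY = "tdb.nodeid2node_cache_size";

    /** Render the JSON body for the two per-run (dynamic) cache settings. */
    static String render(int node2NodeIdCacheSize, int nodeId2NodeCacheSize) {
        return "{\n"
             + "  \"" + NODE2ID_KEY + "\" : " + node2NodeIdCacheSize + " ,\n"
             + "  \"" + ID2NODE_KEY + "\" : " + nodeId2NodeCacheSize + "\n"
             + "}\n";
    }

    /** Write tdb.cfg into the database directory, creating the directory if needed. */
    static Path write(Path dbDir, int node2NodeIdCacheSize, int nodeId2NodeCacheSize)
            throws IOException {
        Files.createDirectories(dbDir);
        Path cfg = dbDir.resolve("tdb.cfg");
        Files.write(cfg, render(node2NodeIdCacheSize, nodeId2NodeCacheSize)
                .getBytes(StandardCharsets.UTF_8));
        return cfg;
    }

    public static void main(String[] args) throws IOException {
        // "DB" is a placeholder location; the sizes are illustrative, not recommendations.
        Path dbDir = Paths.get(args.length > 0 ? args[0] : "DB");
        Path cfg = write(dbDir, 20000, 50000);
        System.out.println("Wrote " + cfg.toAbsolutePath());
    }
}
```

Only dynamic settings appear in the generated file, so the sketch stays clear of the static parameters that must not change after the database is created.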
Exposing this via {{TDBFactory}} would be the wrong approach: different calls
could request different parameters, which is not possible with TDB. Access to
the storage, and hence the caches, is JVM-wide.
was (Author: andy.seaborne):
[~jdbeer]
One question about your setup - could you put all the graphs in one dataset?
This needs documenting but there is a proper mechanism now. This is a
replacement for the properties file.
The {{StoreParams}} machinery is now flexible; Jena 2.13.0 (TDB 1.1.2) is
the first release that exposes the machinery in any way.
The system looks for a file "tdb.cfg" ({{StoreParamsCodec}}) in the database
location; alternatively, parameters can be set by pre-populating the
DatasetGraph system cache using {{StoreConnection.make}}. Non-standard settings
are written to disk.
Some parameters must be the same as when the dataset was created; otherwise,
the dataset will be corrupted. That is, don't edit static parameters in
"tdb.cfg" after the database is created.
Some parameters can be varied on a per-run basis; this is captured by
{{StoreParamsDynamic}}. If attaching to an existing database, static parameters
set in {{StoreConnection.make}} are ignored.
Exposing via {{TDBFactory}} is the wrong way because different calls could
request different parameters but that's not possible with TDB. The access to
the storage, and hence the caches, is JVM-wide.
> Default cache sizes too large / configurability should be more exposed
> ----------------------------------------------------------------------
>
> Key: JENA-902
> URL: https://issues.apache.org/jira/browse/JENA-902
> Project: Apache Jena
> Issue Type: Improvement
> Components: TDB
> Affects Versions: TDB 1.1.0
> Reporter: Jan De Beer
>
> In the class "com.hp.hpl.jena.tdb.sys.SystemTDB" there are two important
> cache size settings, namely "Node2NodeIdCacheSize" and
> "NodeId2NodeCacheSize". On a 64-bit platform, they default to 100000 and
> 500000 entries.
> In our setup of 5 TDB-backed graphs inside a single dataset, these default
> settings led to out-of-memory errors in the Fuseki server even though the
> available heap memory was set to 3GB. It took some heap analysis to reveal
> that these two caches were the main memory consumers. Also, from the public
> documentation of Fuseki / TDB, it wasn't obvious that their sizes can be
> configured. Only by looking into the source code did we find that one can
> use a TDB properties file to override the defaults.
> The proposal is to lower the default settings and to document more
> prominently how to tune the memory settings of TDB.