On 24/02/16 13:56, Jean-Marc Vanel wrote:
Hi
I am trying to load an average-sized RDF file, namely the official French cities
document from here:
http://rdf.insee.fr/geo/index.html
wc cog-2014.ttl
341283 1041471 19900497 cog-2014.ttl
ls -l cog-2014.ttl
-rw-rw-r-- 1 jmv jmv 19900497 févr. 14 2015 cog-2014.ttl
So there are about 300,000 triples, which is not much.
But with tdbloader it fails because of memory.
What exactly is the error?
Which Jena version? 2.13.0?
On a 32-bit Java, all caching has to be in RAM.
JVM_ARGS allows you to tune the heap size, but running out of memory at
300K triples suggests the data has large literals.
The default heap size set by the script is 1G (IIRC) unless you've overridden
that. The cache sizes are not heap-size sensitive. What is unlikely to
work is the default Java heap size, because on a 2G machine that's really
small for a database (512M? less?).
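
As a rough sketch of the heap tuning described above: the loader script picks up JVM_ARGS from the environment, so the limit can be set before the loader is called. The 768M figure and the DB directory name below are illustrative assumptions, not values recommended in this thread, and the loader call is left commented out so the snippet runs without Jena installed.

```shell
# Illustrative heap limit for a 2G machine (an assumption; tune it to your system).
export JVM_ARGS="-Xmx768M"

# Show the heap ceiling the JVM would choose by default, if java is on the PATH.
# This is independent of JVM_ARGS, which only the Jena scripts read.
java -XX:+PrintFlagsFinal -version 2>/dev/null | grep -i 'MaxHeapSize' || true

# Hypothetical loader call; DB is a placeholder directory name.
# tdbloader --loc=DB cog-2014.ttl
```

If the default ceiling reported by the JVM is well below 1G, that would be consistent with the concern that the plain Java default is too small for a database.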
You can build databases on any machine and copy the directory around.
https://jena.apache.org/documentation/tdb/store-parameters.html
is a way to fine-tune - dropping the cache size may help depending on
the root cause - but it is a Jena 3 feature.
So I used tdbloader2 with an empty database, but apparently tdbloader2
offers no way to load into a specific named graph. This is annoying.
Convert your data to N-Quads, adding the graph field. But running
tdbloader2 on a small-RAM machine may give its own problems.
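
One possible way to add the graph field, sketched under assumptions: once the data is in N-Triples form (one triple per line), each statement becomes an N-Quads statement by placing the graph IRI before the final dot. The graph IRI and the helper name `nt_to_nq` are hypothetical. The helper assumes no trailing comment after the final dot and no `&` or `|` characters in the graph IRI, since sed would treat those specially.

```shell
# Hypothetical graph IRI; substitute the named graph you actually want.
GRAPH='http://example.org/graph/cog-2014'

# Read N-Triples on stdin, write N-Quads on stdout.
# Comment lines are passed through unchanged; blank lines do not match and
# are also left alone. On every other line the terminating dot is replaced
# by " <graph IRI> .".
nt_to_nq() {
  sed -e '/^[[:space:]]*#/b' \
      -e "s|[[:space:]]*\.[[:space:]]*\$| <$1> .|"
}

# Small demonstration on a single inline triple.
printf '%s\n' '<http://example.org/s> <http://example.org/p> "Paris" .' | nt_to_nq "$GRAPH"
```

With Jena's riot on the PATH, something along the lines of `riot --output=ntriples cog-2014.ttl | nt_to_nq "$GRAPH" > cog-2014.nq` should then give a file that tdbloader2 can load; check `riot --help` for the exact option names in your version.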
Andy
java -version
java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b07)
Java HotSpot(TM) Client VM (build 25.51-b07, mixed mode)
uname -a
Linux c1-10-1-34-165 3.2.34-30 #17 SMP Mon Apr 13 15:53:45 UTC 2015 armv7l
armv7l armv7l GNU/Linux