On 24/02/16 13:56, Jean-Marc Vanel wrote:
Hi

I am trying to load an average-sized RDF file, namely the official French cities
document from here:
http://rdf.insee.fr/geo/index.html

wc cog-2014.ttl
   341283  1041471 19900497 cog-2014.ttl
  ls -l cog-2014.ttl
-rw-rw-r-- 1 jmv jmv 19900497 févr. 14  2015 cog-2014.ttl

So there are about 300,000 triples, which is not much.
But tdbloader fails on it because it runs out of memory.

Exactly what's the error?

Version? 2.13.0?

On a 32bit java, all caching has to be in RAM.

JVM_ARGS allows you to tune the heap size, but 300K triples suggests the data has large literals.

The default heap size set by the script is 1G (IIRC) unless you've overridden
that. The cache sizes are not heap-size sensitive. What is unlikely to
work is the default Java heap size, because on a 2G machine that's really small for a database (512M? less?).
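
A minimal sketch of overriding the heap size through JVM_ARGS before running tdbloader. The 768M value, the DB directory name and the graph IRI are assumptions for illustration only; pick a heap that fits in the machine's physical RAM.

```shell
# Heap size is an assumption, not a recommendation: it must fit in physical RAM.
export JVM_ARGS="-Xmx768M"

# DB directory and graph IRI below are placeholders.
if command -v tdbloader >/dev/null 2>&1; then
    tdbloader --loc=DB --graph=http://example.org/graph/cog-2014 cog-2014.ttl
else
    echo "tdbloader not found on PATH" >&2
fi
```

If the load still fails, the exact error message is what shows whether the heap or something else is the limit.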

You can build databases on any machine and copy the directory around.

https://jena.apache.org/documentation/tdb/store-parameters.html
is a way to fine-tune; dropping the cache size may help, depending on the root cause, but it is a Jena 3 feature.

So I used tdbloader2 with an empty database, but apparently tdbloader2
offers no way to load into a specific named graph. This is annoying.

Convert your data to N-Quads, adding the graph field. But running tdbloader2 on a small-RAM machine may give its own problems.
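
One way to add the graph field, sketched under assumptions: the Turtle file is first serialized to N-Triples (for example with Jena's riot, if its --output option is available in your version), then a graph IRI is appended to every line. The function name nt_to_nq and the graph IRI are hypothetical, and the script assumes one triple per line with no trailing comment and no backslash in the graph IRI.

```shell
# Turn N-Triples read from stdin into N-Quads by appending a graph IRI.
# Assumes one triple per line, terminated by "." with no trailing comment.
nt_to_nq() {
    awk -v g="$1" '
        /^[ \t]*$/ || /^[ \t]*#/ { next }        # skip blank and comment lines
        {
            sub(/[ \t]*\.[ \t]*$/, "")           # drop the terminating " ."
            print $0 " <" g "> ."                # re-terminate with the graph field
        }
    '
}

# Usage (file names and graph IRI are placeholders):
#   riot --output=ntriples cog-2014.ttl \
#     | nt_to_nq "http://example.org/graph/cog-2014" > cog-2014.nq
```

The resulting .nq file can then be given to the loader in place of the Turtle file.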

        Andy


java -version
java version "1.8.0_51"
Java(TM) SE Runtime Environment (build 1.8.0_51-b07)
Java HotSpot(TM) Client VM (build 25.51-b07, mixed mode)
uname -a
Linux c1-10-1-34-165 3.2.34-30 #17 SMP Mon Apr 13 15:53:45 UTC 2015 armv7l
armv7l armv7l GNU/Linux

