Hello,

I'm trying to load two files into a TDB store with the command below:

bin/tdbloader2 --loc=tdb-03 d3.nt dc.nt

The files d3.nt and dc.nt have 114,176,368 and 175,984,917 triples,
respectively. The server where I'm running the command has 32 GB of
RAM and enough disk space. I'm using Jena 2.13.0 and the Java version
that comes with Debian:

java version "1.7.0_75"
OpenJDK Runtime Environment (IcedTea 2.5.4) (7u75-2.5.4-1~deb7u1)
OpenJDK 64-Bit Server VM (build 24.75-b04, mixed mode)

However, I got the following error while processing the triples:

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.LinkedHashMap.createEntry(LinkedHashMap.java:442)
        at java.util.HashMap.addEntry(HashMap.java:884)
        at java.util.LinkedHashMap.addEntry(LinkedHashMap.java:427)
        at java.util.HashMap.put(HashMap.java:505)
        at org.apache.jena.atlas.lib.cache.CacheLRU.put(CacheLRU.java:59)
        at com.hp.hpl.jena.tdb.store.nodetable.NodeTableCache.cacheUpdate(NodeTableCache.java:200)
        at com.hp.hpl.jena.tdb.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:127)
        at com.hp.hpl.jena.tdb.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:85)
        at com.hp.hpl.jena.tdb.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:55)
        at com.hp.hpl.jena.tdb.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
        at com.hp.hpl.jena.tdb.solver.stats.StatsCollectorNodeId.convert(StatsCollectorNodeId.java:51)
        at com.hp.hpl.jena.tdb.solver.stats.StatsCollectorBase.results(StatsCollectorBase.java:54)
        at com.hp.hpl.jena.tdb.solver.stats.StatsCollectorNodeId.results(StatsCollectorNodeId.java:30)
        at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.exec(CmdNodeTableBuilder.java:172)
        at arq.cmdline.CmdMain.mainMethod(CmdMain.java:102)
        at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
        at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
        at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.main(CmdNodeTableBuilder.java:80)

I then increased the memory available to Java by setting the line below
in the bin/tdbloader2worker file:

JVM_ARGS=${JVM_ARGS:--Xmx20000M}
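
As far as I understand it, that line uses the shell's default-value
expansion, so the value written in the script only applies when JVM_ARGS
is not already set in the environment. A minimal sketch of that
behaviour, assuming a POSIX-compatible shell (resolve_jvm_args is a
hypothetical helper for illustration only, not part of Jena, and -Xmx24G
is just an example value):

```shell
# Hypothetical helper that mirrors the expansion used in the worker script.
resolve_jvm_args() {
    # Yields the caller's JVM_ARGS if it is set and non-empty,
    # otherwise falls back to the default written after ":-".
    printf '%s\n' "${JVM_ARGS:--Xmx20000M}"
}

# Case 1: JVM_ARGS is not set, so the default is used.
unset JVM_ARGS
default_value=$(resolve_jvm_args)

# Case 2: JVM_ARGS is already set, so the default is ignored.
JVM_ARGS="-Xmx24G"
override_value=$(resolve_jvm_args)

printf 'default:  %s\n' "$default_value"
printf 'override: %s\n' "$override_value"
```

So the same heap setting could presumably also be supplied from the
environment instead of editing the script.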

After that, I ran tdbloader2 again and got the following error message:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.HashMap.resize(HashMap.java:580)
        at java.util.HashMap.addEntry(HashMap.java:879)
        at java.util.HashMap.put(HashMap.java:505)
        at com.hp.hpl.jena.tdb.solver.stats.StatsCollectorNodeId.convert(StatsCollectorNodeId.java:52)
        at com.hp.hpl.jena.tdb.solver.stats.StatsCollectorBase.results(StatsCollectorBase.java:54)
        at com.hp.hpl.jena.tdb.solver.stats.StatsCollectorNodeId.results(StatsCollectorNodeId.java:30)
        at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.exec(CmdNodeTableBuilder.java:172)
        at arq.cmdline.CmdMain.mainMethod(CmdMain.java:102)
        at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
        at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
        at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.main(CmdNodeTableBuilder.java:80)

I do not have much experience with Java memory management. I guess
there is a configuration that would work better for the Jena tdbloader
in this scenario. Is there?

Thanks in advance!
Daniel
