On 2015-04-23 18:12, Andy Seaborne wrote:
Hi there,
It's hard to be sure - what does the load log file say before the
exception occurs?
It was loading data when the error occurred. I tried again, running
export JVM_ARGS=-Xmx10000M before the load, and I got this
error:
INFO Add: 289,750,000 Data (Batch: 120,481 / Avg: 67,007)
INFO Add: 289,800,000 Data (Batch: 117,647 / Avg: 67,012)
INFO Add: 289,850,000 Data (Batch: 155,279 / Avg: 67,018)
INFO Add: 289,900,000 Data (Batch: 151,515 / Avg: 67,025)
INFO Add: 289,950,000 Data (Batch: 156,250 / Avg: 67,031)
INFO Add: 290,000,000 Data (Batch: 155,279 / Avg: 67,038)
INFO Elapsed: 4,325.89 seconds [2015/04/24 12:23:55 UTC]
INFO Add: 290,050,000 Data (Batch: 162,866 / Avg: 67,045)
INFO Add: 290,100,000 Data (Batch: 50,968 / Avg: 67,041)
INFO Add: 290,150,000 Data (Batch: 160,771 / Avg: 67,048)
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at java.util.LinkedHashMap.createEntry(LinkedHashMap.java:442)
at java.util.HashMap.addEntry(HashMap.java:884)
at java.util.LinkedHashMap.addEntry(LinkedHashMap.java:427)
at java.util.HashMap.put(HashMap.java:505)
at org.apache.jena.atlas.lib.cache.CacheLRU.put(CacheLRU.java:59)
at com.hp.hpl.jena.tdb.store.nodetable.NodeTableCache.cacheUpdate(NodeTableCache.java:200)
at com.hp.hpl.jena.tdb.store.nodetable.NodeTableCache._retrieveNodeByNodeId(NodeTableCache.java:127)
at com.hp.hpl.jena.tdb.store.nodetable.NodeTableCache.getNodeForNodeId(NodeTableCache.java:85)
at com.hp.hpl.jena.tdb.store.nodetable.NodeTableWrapper.getNodeForNodeId(NodeTableWrapper.java:55)
at com.hp.hpl.jena.tdb.store.nodetable.NodeTableInline.getNodeForNodeId(NodeTableInline.java:67)
at com.hp.hpl.jena.tdb.solver.stats.StatsCollectorNodeId.convert(StatsCollectorNodeId.java:51)
at com.hp.hpl.jena.tdb.solver.stats.StatsCollectorBase.results(StatsCollectorBase.java:54)
at com.hp.hpl.jena.tdb.solver.stats.StatsCollectorNodeId.results(StatsCollectorNodeId.java:30)
at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.exec(CmdNodeTableBuilder.java:172)
at arq.cmdline.CmdMain.mainMethod(CmdMain.java:102)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:63)
at arq.cmdline.CmdMain.mainRun(CmdMain.java:50)
at com.hp.hpl.jena.tdb.store.bulkloader2.CmdNodeTableBuilder.main(CmdNodeTableBuilder.java:80
On 23/04/15 20:53, Daniel Hernández wrote:
Hello,
I'm trying to load two files into a tdb with the command below:
bin/tdbloader2 --loc=tdb-03 d3.nt dc.nt
Do these files have a lot of literals? A lot of large literals?
I don't think the literals are the problem, because I have
loaded the same data with another schema without problems. I guess
the problem could be the large number of distinct predicates. The first
file has 50 million distinct predicates.
I have increased the memory used by Java by setting the line below in
the bin/tdbloader2worker file:
JVM_ARGS=${JVM_ARGS:--Xmx20000M}
JVM_ARGS is set further out in tdbloader2 as well, so this change
has no effect (JVM_ARGS is already set, so ${:-} returns the existing
value). It's merely a fallback at that point.
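
To illustrate the point about ${:-}, here is a minimal sketch of how that expansion behaves in a POSIX shell. The -Xmx1200M value is hypothetical, standing in for whatever the outer script sets; it is not taken from the tdbloader2 scripts.

```shell
# Case 1: JVM_ARGS was already set by an outer script (hypothetical value).
JVM_ARGS=-Xmx1200M
JVM_ARGS=${JVM_ARGS:--Xmx20000M}   # ":-" only applies when the variable is unset or empty
already_set=$JVM_ARGS
echo "already set: $already_set"

# Case 2: JVM_ARGS is unset, so the default after ":-" is used.
unset JVM_ARGS
JVM_ARGS=${JVM_ARGS:--Xmx20000M}
fallback=$JVM_ARGS
echo "fallback:    $fallback"
```

In the first case the 20000M default is ignored, which matches the behaviour described above.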
The right idiom is to set in the shell environment calling tdbloader2
e.g.
export JVM_ARGS=-Xmx5000M
tdbloader2 ...
or
env JVM_ARGS=-Xmx5000M tdbloader2 ...
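
As a rough sketch of the difference between the two forms, the snippet below uses a hypothetical stand-in for tdbloader2: a child shell that just prints the JVM_ARGS it inherits. The show_args helper is an assumption for illustration, not part of Jena.

```shell
# Hypothetical stand-in for tdbloader2: a child process reporting its JVM_ARGS.
show_args() { sh -c 'printf "%s" "${JVM_ARGS:-unset}"'; }

unset JVM_ARGS

# "env VAR=value command" sets the variable for that one command only.
one_off=$(env JVM_ARGS=-Xmx5000M sh -c 'printf "%s" "${JVM_ARGS:-unset}"')
after_env=$(show_args)      # the calling shell is unchanged afterwards

# "export" makes the variable visible to every later child process.
export JVM_ARGS=-Xmx5000M
exported=$(show_args)

echo "env form:    $one_off"
echo "after env:   $after_env"
echo "export form: $exported"
```

Either form gets the setting into the child process; env simply avoids leaving it set in the calling shell.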
Don't set it too large. Much of the space used during bulk loading is
not in the Java heap.
I used 10GB for the heap the last time, so there are 20GB extra to be
used.
However, I got the error above.
Thanks,
Daniel