Hi Joachim,
The issue is not the 8000 triples being added but the 45 million triples
being deleted. A PUT replaces the target graph: it deletes everything in
the target and then adds the new data.
Deleting large amounts of data in a single transaction is currently a
struggle for TDB (versions newer than 2.12.1 will not make a
difference).
1. Stephen Allen suggested enabling "spill to disk":
http://mail-archives.apache.org/mod_mbox/jena-users/201507.mbox/%3CCAPTxtVOZRzyPxN1njh3WVggsJEUNxeXDJhNvx%2BG4WcRtExxPxg%40mail.gmail.com%3E
Other possible workarounds are:
2. Delete sections of the data in separate transactions with SPARQL Update.
3. Dump the database, remove the graph by text processing, and reload.
4. Large heap, maybe temporarily. This is one of the few occasions when
a larger heap can help.
Workarounds 2-4 are not very transparent to system operation.
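
As an illustration of workaround 2, here is a minimal sketch of deleting the graph in batches, one SPARQL Update request (and so one transaction) per batch. It is not tested against your setup. The endpoint paths /ebstw/update and /ebstw/query are assumptions based on the default Fuseki service names, the batch size of 100000 is arbitrary, and the function names are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoints: dataset name from the log, default Fuseki service names.
UPDATE_URL = "http://localhost:3030/ebstw/update"
QUERY_URL = "http://localhost:3030/ebstw/query"


def batch_delete_update(graph, batch_size):
    """Build a SPARQL Update that deletes at most batch_size triples from graph."""
    return (
        f"DELETE {{ GRAPH <{graph}> {{ ?s ?p ?o }} }}\n"
        f"WHERE {{ {{ SELECT ?s ?p ?o "
        f"WHERE {{ GRAPH <{graph}> {{ ?s ?p ?o }} }} "
        f"LIMIT {batch_size} }} }}"
    )


def graph_has_triples(graph, query_url=QUERY_URL):
    """Ask the server whether the named graph still contains any triple."""
    ask = f"ASK {{ GRAPH <{graph}> {{ ?s ?p ?o }} }}"
    data = urllib.parse.urlencode({"query": ask}).encode()
    req = urllib.request.Request(
        query_url, data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["boolean"]


def send_update(update, update_url=UPDATE_URL):
    """POST one update request; each request runs as its own transaction."""
    data = urllib.parse.urlencode({"update": update}).encode()
    with urllib.request.urlopen(urllib.request.Request(update_url, data=data)) as resp:
        resp.read()


def delete_graph_in_batches(graph, batch_size=100_000,
                            has_triples=graph_has_triples, send=send_update):
    """Delete the graph batch by batch; return the number of batches sent."""
    batches = 0
    while has_triples(graph):
        send(batch_delete_update(graph, batch_size))
        batches += 1
    return batches
```

Calling delete_graph_in_batches("http://zbw.eu/beta/ebds/ng") before the PUT would leave the PUT with an empty graph to clear. A smaller batch size keeps each transaction lighter at the cost of more requests.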
(Long term, there is a solution as part of rearchitecting TDB, but
that's a way off yet.)
Sorry there isn't a simple solution,
Andy
On 24/09/15 20:59, Neubert, Joachim wrote:
I got an OutOfMemoryError while loading a Turtle file with fewer than 8000
triples into a named graph that already contained 45 million triples
(see details below; there are a few other graphs in the TDB store, with
about 100K triples or fewer).
After almost half an hour, I got the error below. The (virtual) machine has 32 GB of
memory; Fuseki2 (see exact versions below) was started with
JAVA_OPTIONS="-Xmx6G"
Any ideas?
20:50:14 INFO [37] PUT http://localhost:3030/ebstw/data?graph=http://zbw.eu/beta/ebds/ng
...
21:16:07 WARN [37] RC = 500 : GC overhead limit exceeded
java.lang.OutOfMemoryError: GC overhead limit exceeded
    at com.hp.hpl.jena.tdb.base.record.RecordFactory.create(RecordFactory.java:87)
    at com.hp.hpl.jena.tdb.base.record.RecordFactory.buildFrom(RecordFactory.java:122)
    at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer._get(RecordBuffer.java:107)
    at com.hp.hpl.jena.tdb.base.buffer.RecordBuffer.getHigh(RecordBuffer.java:67)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeRecords.shiftRight(BPTreeRecords.java:221)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.shiftRight(BPTreeNode.java:1012)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.rebalance(BPTreeNode.java:832)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalDelete(BPTreeNode.java:721)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalDelete(BPTreeNode.java:735)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalDelete(BPTreeNode.java:735)
    at com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.delete(BPTreeNode.java:247)
    at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.deleteAndReturnOld(BPlusTree.java:342)
    at com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.delete(BPlusTree.java:336)
    at com.hp.hpl.jena.tdb.store.tupletable.TupleIndexRecord.performDelete(TupleIndexRecord.java:70)
    at com.hp.hpl.jena.tdb.store.tupletable.TupleIndexBase.delete(TupleIndexBase.java:76)
    at com.hp.hpl.jena.tdb.store.tupletable.TupleTable.delete(TupleTable.java:149)
    at com.hp.hpl.jena.tdb.store.nodetupletable.NodeTupleTableConcrete.deleteRow(NodeTupleTableConcrete.java:110)
    at com.hp.hpl.jena.tdb.store.QuadTable.delete(QuadTable.java:81)
    at com.hp.hpl.jena.tdb.store.DatasetGraphTDB.deleteFromNamedGraph(DatasetGraphTDB.java:112)
    at com.hp.hpl.jena.sparql.core.DatasetGraphTriplesQuads.delete(DatasetGraphTriplesQuads.java:58)
    at com.hp.hpl.jena.sparql.core.DatasetGraphTrackActive.delete(DatasetGraphTrackActive.java:131)
    at com.hp.hpl.jena.sparql.core.DatasetGraphWrapper.delete(DatasetGraphWrapper.java:100)
    at com.hp.hpl.jena.sparql.core.DatasetGraphMonitor.delete$(DatasetGraphMonitor.java:153)
    at com.hp.hpl.jena.sparql.core.DatasetGraphMonitor.delete(DatasetGraphMonitor.java:142)
    at com.hp.hpl.jena.sparql.core.GraphView.performDelete(GraphView.java:153)
    at com.hp.hpl.jena.graph.impl.GraphBase.delete(GraphBase.java:225)
    at com.hp.hpl.jena.graph.GraphUtil.remove(GraphUtil.java:308)
    at com.hp.hpl.jena.graph.impl.GraphBase.clear(GraphBase.java:244)
    at org.apache.jena.fuseki.servlets.SPARQL_REST_RW.clearGraph(SPARQL_REST_RW.java:226)
    at org.apache.jena.fuseki.servlets.SPARQL_REST_RW.addDataIntoTxn(SPARQL_REST_RW.java:129)
    at org.apache.jena.fuseki.servlets.SPARQL_REST_RW.doPutPost(SPARQL_REST_RW.java:102)
    at org.apache.jena.fuseki.servlets.SPARQL_REST_RW.doPut(SPARQL_REST_RW.java:80)
21:16:07 INFO [37] 500 GC overhead limit exceeded (1.552,938 s)
# java -cp /opt/fuseki/fuseki-server.jar tdb.tdbstats --version
Jena: VERSION: 2.12.1
Jena: BUILD_DATE: 2014-10-02T16:36:17+0100
ARQ: VERSION: 2.12.1
ARQ: BUILD_DATE: 2014-10-02T16:36:17+0100
RIOT: VERSION: 2.12.1
RIOT: BUILD_DATE: 2014-10-02T16:36:17+0100
TDB: VERSION: 1.1.1
TDB: BUILD_DATE: 2014-10-02T16:36:17+0100
# java -cp /opt/fuseki/fuseki-server.jar tdb.tdbstats --loc=. --graph=http://zbw.eu/beta/ebds/ng
(stats
(meta
(timestamp "2015-09-24T21:49:07.125+02:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>)
(run@ "2015/09/24 21:49:07 MESZ")
(count 45087163))
(<http://purl.org/dc/elements/1.1/type> 4279019)
(<http://purl.org/dc/elements/1.1/creator> 2505439)
(<http://purl.org/dc/terms/isPartOf> 189117)
(<http://purl.org/dc/terms/source> 4279019)
(<http://purl.org/dc/terms/description> 127946)
(<http://purl.org/dc/elements/1.1/relation> 2367362)
(<http://purl.org/dc/terms/creator> 1907335)
(<http://purl.org/ontology/bibo/issn> 57108)
(<http://purl.org/dc/elements/1.1/subject> 4263296)
(<http://umbel.org/umbel#isLike> 1186896)
(<http://purl.org/dc/terms/title> 4279019)
(<http://purl.org/dc/terms/contributor> 905965)
(<http://purl.org/dc/terms/alternative> 318253)
(<http://purl.org/dc/elements/1.1/date> 4242995)
(<http://purl.org/dc/terms/subject> 7948468)
(<http://purl.org/ontology/bibo/isbn> 617274)
(<http://purl.org/dc/elements/1.1/publisher> 1939890)
(<http://purl.org/dc/elements/1.1/language> 2851709)
(<http://purl.org/dc/elements/1.1/contributor> 821053)
(other 0))