Hi Andy,

So sorry - POST had been intended, not PUT. I wrote it down several times, 
and looked at it intensely, but missed that simple error.

Anyway, thank you for the explanation re. large delete transactions, which I've 
noticed in other places. Normally a deletion of the files and a rebuild of the 
database from scratch is what worked best for me (in a scenario with relatively 
static data).

Cheers, Joachim

-----Original Message-----
From: Andy Seaborne [mailto:[email protected]] 
Sent: Friday, 25 September 2015 13:31
To: [email protected]
Subject: Re: Fuseki2: GC overhead limit exceeded when loading data via HTTP PUT

Hi Joachim,

The issue is not the 8000 triples being added but the 45 million triples being 
deleted.  The PUT operation deletes everything in the target graph and then adds 
the new data.
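
To make the difference concrete, here is a minimal sketch in Python (standard 
library only) of how the two Graph Store requests differ. The helper name 
graph_store_request and the sample triple are hypothetical; the endpoint and 
graph URI are the ones from the log in the original message. The request is 
only built, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint and graph URI as they appear in the log quoted further down.
DATA_ENDPOINT = "http://localhost:3030/ebstw/data"
GRAPH_URI = "http://zbw.eu/beta/ebds/ng"


def graph_store_request(turtle: bytes, method: str) -> Request:
    """Build (but do not send) a Graph Store Protocol request.

    POST adds the triples to the named graph; PUT first removes
    everything in the graph and then adds the triples.
    """
    if method not in ("POST", "PUT"):
        raise ValueError("method must be POST or PUT")
    url = DATA_ENDPOINT + "?" + urlencode({"graph": GRAPH_URI})
    return Request(
        url,
        data=turtle,
        method=method,
        headers={"Content-Type": "text/turtle"},
    )


# Adding data without touching the triples already in the graph:
req = graph_store_request(b"<urn:x:s> <urn:x:p> <urn:x:o> .", "POST")
# urllib.request.urlopen(req) would send it; deliberately left out here.
```

With "PUT" as the method, the same payload would replace the whole graph, 
which is what triggers the large delete.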

Deleting a large amount of data in one single transaction is currently a 
struggle for TDB (versions newer than 2.12.1 will not make a difference).

1. Stephen Allen suggested enabling the "spill to disk" option:

http://mail-archives.apache.org/mod_mbox/jena-users/201507.mbox/%3CCAPTxtVOZRzyPxN1njh3WVggsJEUNxeXDJhNvx%2BG4WcRtExxPxg%40mail.gmail.com%3E

Other possible workarounds are:

2. Delete sections of the data in separate transactions with SPARQL Update.

3. Dump the database, text process to remove the graph and reload.

4. Use a larger heap, maybe temporarily.  This is one of the few occasions when 
a larger heap can help.

Workarounds 2-4 are not very transparent to system operation.
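
As a rough illustration of workaround 2, the sketch below (Python, standard 
library only) empties a named graph in batches, sending each batch as its own 
SPARQL Update request so that no single transaction has to delete everything. 
The service URLs, the batch size and all helper names are assumptions for 
illustration, and this has not been tested against a database of this size.

```python
import json
from typing import Callable
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed service URLs; the real names depend on the Fuseki configuration.
UPDATE_ENDPOINT = "http://localhost:3030/ebstw/update"
QUERY_ENDPOINT = "http://localhost:3030/ebstw/query"


def batch_delete_update(graph_uri: str, batch_size: int) -> str:
    """SPARQL Update that deletes at most batch_size triples from one graph."""
    return (
        f"DELETE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}\n"
        f"WHERE {{ {{ SELECT ?s ?p ?o WHERE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}"
        f" LIMIT {batch_size} }} }}"
    )


def graph_has_triples(graph_uri: str) -> bool:
    """ASK whether the graph still holds at least one triple."""
    ask = f"ASK {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}"
    url = QUERY_ENDPOINT + "?" + urlencode({"query": ask})
    req = Request(url, headers={"Accept": "application/sparql-results+json"})
    with urlopen(req) as resp:
        return bool(json.load(resp)["boolean"])


def send_update(update: str) -> None:
    """POST one update as a form-encoded 'update' parameter."""
    req = Request(
        UPDATE_ENDPOINT,
        data=urlencode({"update": update}).encode("ascii"),
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urlopen(req) as resp:
        resp.read()


def delete_graph_in_batches(
    graph_uri: str,
    batch_size: int = 100_000,
    has_triples: Callable[[str], bool] = graph_has_triples,
    run_update: Callable[[str], None] = send_update,
    max_rounds: int = 10_000,
) -> int:
    """Repeat the batch delete until the graph is empty; return the round count."""
    rounds = 0
    while rounds < max_rounds and has_triples(graph_uri):
        run_update(batch_delete_update(graph_uri, batch_size))
        rounds += 1
    return rounds
```

Calling delete_graph_in_batches("http://zbw.eu/beta/ebds/ng") would issue one 
update per batch; with about 45 million triples and the default batch size 
that is roughly 450 updates. The two callables are parameters so the loop can 
be tried out against stand-ins before pointing it at a live server.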

(Long term, there is a solution as part of rearchitecting TDB, but that's a way 
off yet.)

        Sorry there isn't a simple solution,
        Andy

On 24/09/15 20:59, Neubert, Joachim wrote:
> I got an OutOfMemoryError while loading a Turtle file with fewer than 8000 
> triples into a named graph, which, however, already contained 45 million 
> triples (see details below - there are a few other graphs in the TDB, with 
> about 100K triples or less).
>
> After almost half an hour, I got the error below. The (virtual) machine has 
> 32 GB of memory; Fuseki2 (see exact versions below) was started with 
> JAVA_OPTIONS="-Xmx6G"
>
> Any ideas?
>
> 20:50:14 INFO  [37] PUT 
> http://localhost:3030/ebstw/data?graph=http://zbw.eu/beta/ebds/ng
> ...
> 21:16:07 WARN  [37] RC = 500 : GC overhead limit exceeded
> java.lang.OutOfMemoryError: GC overhead limit exceeded
>          at 
> com.hp.hpl.jena.tdb.base.record.RecordFactory.create(RecordFactory.java:87)
>          at 
> com.hp.hpl.jena.tdb.base.record.RecordFactory.buildFrom(RecordFactory.java:122)
>          at 
> com.hp.hpl.jena.tdb.base.buffer.RecordBuffer._get(RecordBuffer.java:107)
>          at 
> com.hp.hpl.jena.tdb.base.buffer.RecordBuffer.getHigh(RecordBuffer.java:67)
>          at 
> com.hp.hpl.jena.tdb.index.bplustree.BPTreeRecords.shiftRight(BPTreeRecords.java:221)
>          at 
> com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.shiftRight(BPTreeNode.java:1012)
>          at 
> com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.rebalance(BPTreeNode.java:832)
>          at 
> com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalDelete(BPTreeNode.java:721)
>          at 
> com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalDelete(BPTreeNode.java:735)
>          at 
> com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.internalDelete(BPTreeNode.java:735)
>          at 
> com.hp.hpl.jena.tdb.index.bplustree.BPTreeNode.delete(BPTreeNode.java:247)
>          at 
> com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.deleteAndReturnOld(BPlusTree.java:342)
>          at 
> com.hp.hpl.jena.tdb.index.bplustree.BPlusTree.delete(BPlusTree.java:336)
>          at 
> com.hp.hpl.jena.tdb.store.tupletable.TupleIndexRecord.performDelete(TupleIndexRecord.java:70)
>          at 
> com.hp.hpl.jena.tdb.store.tupletable.TupleIndexBase.delete(TupleIndexBase.java:76)
>          at 
> com.hp.hpl.jena.tdb.store.tupletable.TupleTable.delete(TupleTable.java:149)
>          at 
> com.hp.hpl.jena.tdb.store.nodetupletable.NodeTupleTableConcrete.deleteRow(NodeTupleTableConcrete.java:110)
>          at com.hp.hpl.jena.tdb.store.QuadTable.delete(QuadTable.java:81)
>          at 
> com.hp.hpl.jena.tdb.store.DatasetGraphTDB.deleteFromNamedGraph(DatasetGraphTDB.java:112)
>          at 
> com.hp.hpl.jena.sparql.core.DatasetGraphTriplesQuads.delete(DatasetGraphTriplesQuads.java:58)
>          at 
> com.hp.hpl.jena.sparql.core.DatasetGraphTrackActive.delete(DatasetGraphTrackActive.java:131)
>          at 
> com.hp.hpl.jena.sparql.core.DatasetGraphWrapper.delete(DatasetGraphWrapper.java:100)
>          at 
> com.hp.hpl.jena.sparql.core.DatasetGraphMonitor.delete$(DatasetGraphMonitor.java:153)
>          at 
> com.hp.hpl.jena.sparql.core.DatasetGraphMonitor.delete(DatasetGraphMonitor.java:142)
>          at 
> com.hp.hpl.jena.sparql.core.GraphView.performDelete(GraphView.java:153)
>          at com.hp.hpl.jena.graph.impl.GraphBase.delete(GraphBase.java:225)
>          at com.hp.hpl.jena.graph.GraphUtil.remove(GraphUtil.java:308)
>          at com.hp.hpl.jena.graph.impl.GraphBase.clear(GraphBase.java:244)
>          at 
> org.apache.jena.fuseki.servlets.SPARQL_REST_RW.clearGraph(SPARQL_REST_RW.java:226)
>          at 
> org.apache.jena.fuseki.servlets.SPARQL_REST_RW.addDataIntoTxn(SPARQL_REST_RW.java:129)
>          at 
> org.apache.jena.fuseki.servlets.SPARQL_REST_RW.doPutPost(SPARQL_REST_RW.java:102)
>          at 
> org.apache.jena.fuseki.servlets.SPARQL_REST_RW.doPut(SPARQL_REST_RW.java:80)
> 21:16:07 INFO  [37] 500 GC overhead limit exceeded (1.552,938 s)
>
> # java -cp /opt/fuseki/fuseki-server.jar tdb.tdbstats --version
> Jena:       VERSION: 2.12.1
> Jena:       BUILD_DATE: 2014-10-02T16:36:17+0100
> ARQ:        VERSION: 2.12.1
> ARQ:        BUILD_DATE: 2014-10-02T16:36:17+0100
> RIOT:       VERSION: 2.12.1
> RIOT:       BUILD_DATE: 2014-10-02T16:36:17+0100
> TDB:        VERSION: 1.1.1
> TDB:        BUILD_DATE: 2014-10-02T16:36:17+0100
>
> # java -cp /opt/fuseki/fuseki-server.jar tdb.tdbstats --loc=. 
> --graph=http://zbw.eu/beta/ebds/ng
> (stats
>    (meta
>      (timestamp 
> "2015-09-24T21:49:07.125+02:00"^^<http://www.w3.org/2001/XMLSchema#dateTime>)
>      (run@ "2015/09/24 21:49:07 MESZ")
>      (count 45087163))
>    (<http://purl.org/dc/elements/1.1/type> 4279019)
>    (<http://purl.org/dc/elements/1.1/creator> 2505439)
>    (<http://purl.org/dc/terms/isPartOf> 189117)
>    (<http://purl.org/dc/terms/source> 4279019)
>    (<http://purl.org/dc/terms/description> 127946)
>    (<http://purl.org/dc/elements/1.1/relation> 2367362)
>    (<http://purl.org/dc/terms/creator> 1907335)
>    (<http://purl.org/ontology/bibo/issn> 57108)
>    (<http://purl.org/dc/elements/1.1/subject> 4263296)
>    (<http://umbel.org/umbel#isLike> 1186896)
>    (<http://purl.org/dc/terms/title> 4279019)
>    (<http://purl.org/dc/terms/contributor> 905965)
>    (<http://purl.org/dc/terms/alternative> 318253)
>    (<http://purl.org/dc/elements/1.1/date> 4242995)
>    (<http://purl.org/dc/terms/subject> 7948468)
>    (<http://purl.org/ontology/bibo/isbn> 617274)
>    (<http://purl.org/dc/elements/1.1/publisher> 1939890)
>    (<http://purl.org/dc/elements/1.1/language> 2851709)
>    (<http://purl.org/dc/elements/1.1/contributor> 821053)
>    (other 0))
>
