Hello again. I’ve tried upgrading to Fuseki 1.1.2, and it now gives a heap space error after only 6 minutes (instead of 60-odd).
There are a few hundred graphs in the database, but most of them are small (apart from the one I’m trying to delete). Does this mean that using the Graph Protocol is a no-go? I’ll try the batched deletes...

> On 13 Jul 2015, at 22:51, Andy Seaborne <[email protected]> wrote:
>
> On 13/07/15 21:31, Andy Seaborne wrote:
>> Hi Ric,
>>
>> Could you please try Fuseki 1.1.2 or Fuseki 2.0.0?
>>
>> How many datasets does the server host?
>>
>> 1.0.1 was Jan 2014 and IIRC this area has changed, especially DELETE of
>> a graph with the Graph Store Protocol. However, if this is just due to
>> transaction overheads (it's not immediately clear it is or is not), then
>> DELETE {} WHERE { SELECT {...} LIMIT } is the way to go for an immediate
>> solution.
>>
>> TDB1 (i.e. the Jena code) is a bit memory hungry for transactions.
>>
>> TDB2 is not memory bound but it isn't in the Jena codebase. It has been
>> tested with 100 million triple loads in a single Fuseki2 upload.
>>
>> See
>> http://www.sparql.org/validate/update
>
> That's the service point.
>
> http://www.sparql.org/update-validator.html
>
> is the HTML form.
>
>> for checking syntax.
>>
>> Andy
>>
>> On 13/07/15 18:59, Ric Roberts wrote:
>>> Hi. I’m having problems deleting a moderately large graph from a
>>> jena-fuseki-1.0.1 database.
>>>
>>> The graph contains approximately 60 million triples, and the database
>>> contains about 70 million triples in total.
>>>
>>> I’ve started Fuseki with 16G Heap. (JVM_ARGS=${JVM_ARGS:--Xmx16000M}).
>>> The server has 32G RAM.
>>>
>>> When I issue the DELETE command over http, I see this in the fuseki log:
>>>
>>> 16:12:03 INFO [24] DELETE
>>> http://127.0.0.1:3030/stagingdb/data?graph=http://example.com/graph
>>> 17:10:40 WARN [24] RC = 500 : Java heap space
>>> 17:10:40 INFO [24] 500 Java heap space (3,517.614 s)
>>>
>>> i.e. it takes about an hour, and then 500s with an error about heap
>>> space.
>>>
>>> I’ve also tried DROP and CLEAR SPARQL update statements but they
>>> timeout with our default endpoint timeout of 30s.
>>>
>>> I’ve also tried deleting 1000 triples at a time, from the graph by
>>> issuing a sparql update statement like this:
>>>
>>> DELETE {
>>>   GRAPH <http://example.com/graph>
>>>   { ?s ?p ?o }
>>> }
>>> WHERE {
>>>   GRAPH <http://example.com/graph>
>>>   { ?s ?p ?o }
>>> }
>>> LIMIT 1000
>>>
>>> … but this times out too (which surprised me, as I only asked it to
>>> find and DELETE 1000 triples).
>>>
>>> What is the recommended way to delete this graph - I need to replace
>>> its contents fairly urgently on a production system. We loaded it by
>>> loading 10,000 triples at a time, which worked fine, but I’m having
>>> trouble deleting its current contents first.
>>>
>>> Any pointers appreciated.
>>> Thanks, Ric.
>>
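For reference, here is a rough sketch of the batched delete discussed in this thread, with the LIMIT moved into a sub-SELECT as Andy suggested (SPARQL Update does not accept a LIMIT directly after the WHERE clause of a DELETE). The function names, the `max_batches` guard and the idea of passing in `send` and `is_empty` callables are my own assumptions for illustration, not anything from Fuseki. Whether each batch finishes within the 30s endpoint timeout is not something this sketch can establish.

```python
import urllib.request


def build_batched_delete(graph_uri, batch_size=1000):
    """Return a SPARQL Update that deletes at most batch_size triples from graph_uri."""
    return (
        f"DELETE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}\n"
        f"WHERE {{\n"
        f"  SELECT ?s ?p ?o WHERE {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}\n"
        f"  LIMIT {batch_size}\n"
        f"}}"
    )


def post_update(update_url, update, timeout=30):
    """POST one update directly, as described by the SPARQL 1.1 Protocol.

    update_url is an assumption: it must point at the dataset's update service.
    """
    req = urllib.request.Request(
        update_url,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def delete_graph_in_batches(graph_uri, send, is_empty,
                            batch_size=1000, max_batches=100000):
    """Send the batched delete until is_empty() is true; return batches sent.

    send(update) performs one update; is_empty() reports whether the graph
    has no triples left (for example via an ASK query). Both are supplied
    by the caller, so the loop itself makes no network calls.
    """
    update = build_batched_delete(graph_uri, batch_size)
    batches = 0
    while not is_empty() and batches < max_batches:
        send(update)
        batches += 1
    return batches
```

To run this against the server, `send` could be something like `lambda u: post_update("http://127.0.0.1:3030/stagingdb/update", u)`. The `/update` service name is a guess based on the `/data` URL in the log above, so check it against the server configuration first.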
