Thanks, Dave! I was going to try some experiments with different DELETE syntax if no one suggested anything else.
Is deleting in chunks the recommended approach? I was half-hoping someone might be able to tell me a way to get the DB to do an optimised version of DROP / CLEAR (?).

Cheers
Ric

> On 13 Jul 2015, at 20:59, Dave Reynolds <[email protected]> wrote:
>
> Hi Ric,
>
> On 13/07/15 18:59, Ric Roberts wrote:
>> Hi. I’m having problems deleting a moderately large graph from a
>> jena-fuseki-1.0.1 database.
>>
>> The graph contains approximately 60 million triples, and the database
>> contains about 70 million triples in total.
>>
>> I’ve started Fuseki with 16G Heap. (JVM_ARGS=${JVM_ARGS:--Xmx16000M}). The
>> server has 32G RAM.
>>
>> When I issue the DELETE command over http, I see this in the fuseki log:
>>
>> 16:12:03 INFO [24] DELETE
>> http://127.0.0.1:3030/stagingdb/data?graph=http://example.com/graph
>> 17:10:40 WARN [24] RC = 500 : Java heap space
>> 17:10:40 INFO [24] 500 Java heap space (3,517.614 s)
>>
>> i.e. it takes about an hour, and then 500s with an error about heap space.
>>
>> I’ve also tried DROP and CLEAR SPARQL update statements but they time out
>> with our default endpoint timeout of 30s.
>
>> I’ve also tried deleting 1000 triples at a time from the graph by issuing a
>> SPARQL update statement like this:
>>
>> DELETE {
>>   GRAPH <http://example.com/graph>
>>   { ?s ?p ?o }
>> }
>> WHERE {
>>   GRAPH <http://example.com/graph>
>>   { ?s ?p ?o }
>> }
>> LIMIT 1000
>>
>> … but this times out too (which surprised me, as I only asked it to find and
>> DELETE 1000 triples).
>
> Didn't know you could use LIMIT there, suggest trying with a sub-select along
> the lines of:
>
> WITH <http://example.com/graph>
> DELETE { ?s ?p ?o }
> WHERE {
>   SELECT * WHERE {
>     ?s ?p ?o
>   } LIMIT 10000
> }
>
> (untested syntax, might need an extra {} layer)
>
> Dave
>
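
For anyone who wants to script the chunked delete discussed in this thread, here is a minimal Python sketch. It repeats the sub-select DELETE until an ASK query reports the graph is empty. None of this comes from the Fuseki docs: the helper names are made up for illustration, the sketch assumes the server accepts form-encoded POSTs of "update" and "query" parameters as described in the SPARQL 1.1 Protocol, and it has not been run against a 60-million-triple graph, so whether each chunk finishes inside a 30s timeout is an open question.

```python
import json
import urllib.parse
import urllib.request


def build_chunk_delete(graph_uri, chunk_size):
    """Build a SPARQL Update deleting at most chunk_size triples from graph_uri."""
    return (
        f"WITH <{graph_uri}>\n"
        "DELETE { ?s ?p ?o }\n"
        "WHERE {\n"
        f"  SELECT ?s ?p ?o WHERE {{ ?s ?p ?o }} LIMIT {chunk_size}\n"
        "}"
    )


def build_nonempty_check(graph_uri):
    """Build an ASK query that is true while the graph still holds a triple."""
    return f"ASK {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}"


def http_update(endpoint, update, timeout=30):
    """POST an update as a form-encoded 'update' parameter (endpoint is an assumption)."""
    body = urllib.parse.urlencode({"update": update}).encode("utf-8")
    request = urllib.request.Request(endpoint, data=body)
    with urllib.request.urlopen(request, timeout=timeout):
        pass  # any non-2xx status raises urllib.error.HTTPError


def http_ask(endpoint, query, timeout=30):
    """POST an ASK query and return its boolean result (endpoint is an assumption)."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint, data=body, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return bool(json.load(response)["boolean"])


def delete_graph_in_chunks(graph_uri, run_update, run_ask,
                           chunk_size=10000, max_rounds=100000):
    """Issue chunked deletes until the graph is empty; return the number of rounds."""
    rounds = 0
    while rounds < max_rounds and run_ask(build_nonempty_check(graph_uri)):
        run_update(build_chunk_delete(graph_uri, chunk_size))
        rounds += 1
    return rounds
```

To try it, pass two small wrappers around the HTTP helpers, e.g. lambda u: http_update("http://127.0.0.1:3030/stagingdb/update", u) and lambda q: http_ask("http://127.0.0.1:3030/stagingdb/query", q). The /update and /query service names are guesses based on the /stagingdb/data URL in the log above, so adjust them to match the server config. The max_rounds cap is only there to stop a runaway loop if the deletes silently do nothing.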
