Hello again. I’ve tried upgrading to Fuseki 1.1.2, and it now gives a heap 
space error after only 6 minutes (instead of 60-odd).

There are a few hundred graphs in the database, but most of them are small (apart 
from the one I’m trying to delete).

Does this mean that using the Graph Protocol is a no-go? I’ll try the batched 
deletes...
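
For reference, a minimal sketch of the batched form suggested in the quoted reply below, with the LIMIT moved inside a sub-SELECT. The graph URI is the placeholder from the original message, and the batch size of 1000 is an assumption carried over from the earlier attempt, not a tuned value:

```sparql
# Delete at most 1000 triples from the named graph per request.
# The LIMIT applies to the sub-SELECT, which picks the triples to remove.
DELETE {
  GRAPH <http://example.com/graph> { ?s ?p ?o }
}
WHERE {
  SELECT ?s ?p ?o
  WHERE {
    GRAPH <http://example.com/graph> { ?s ?p ?o }
  }
  LIMIT 1000
}
```

The update would be re-issued until the graph is empty. Whether each batch completes inside the 30s endpoint timeout is untested here.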


> On 13 Jul 2015, at 22:51, Andy Seaborne <[email protected]> wrote:
> 
> On 13/07/15 21:31, Andy Seaborne wrote:
>> Hi Ric,
>> 
>> Could you please try Fuseki 1.1.2 or Fuseki 2.0.0?
>> 
>> How many datasets does the server host?
>> 
>> 1.0.1 was Jan 2014 and IIRC this area has changed, especially DELETE of
>> a graph with the Graph Store Protocol.  However, if this is just due to
>> transaction overheads (it's not immediately clear it is or is not), then
>> DELETE {} WHERE { SELECT {...} LIMIT } is the way to go for an immediate
>> solution.
>> 
>> TDB1 (i.e. the Jena code) is a bit memory hungry for transactions.
>> 
>> TDB2 is not memory bound but it isn't in the Jena codebase.  It has been
>> tested with 100 million triple loads in a single Fuseki2 upload.
>> 
>> See
>>   http://www.sparql.org/validate/update
> 
> That's the service point.
> 
> http://www.sparql.org/update-validator.html
> 
> is the HTML form.
> 
>> for checking syntax.
>> 
>>     Andy
>> 
>> On 13/07/15 18:59, Ric Roberts wrote:
>>> Hi. I’m having problems deleting a moderately large graph from a
>>> jena-fuseki-1.0.1 database.
>>> 
>>> The graph contains approximately 60 million triples, and the database
>>> contains about 70 million triples in total.
>>> 
>>> I’ve started Fuseki with 16G Heap. (JVM_ARGS=${JVM_ARGS:--Xmx16000M}).
>>> The server has 32G RAM.
>>> 
>>> When I issue the DELETE command over http, I see this in the fuseki log:
>>> 
>>> 16:12:03 INFO  [24] DELETE
>>> http://127.0.0.1:3030/stagingdb/data?graph=http://example.com/graph
>>> 17:10:40 WARN  [24] RC = 500 : Java heap space
>>> 17:10:40 INFO  [24] 500 Java heap space (3,517.614 s)
>>> 
>>> i.e. it takes about an hour, and then 500s with an error about heap
>>> space.
>>> 
>>> I’ve also tried DROP and CLEAR SPARQL update statements but they
>>> time out with our default endpoint timeout of 30s.
>>> 
>>> I’ve also tried deleting 1000 triples at a time from the graph by
>>> issuing a SPARQL update statement like this:
>>> 
>>> DELETE {
>>>  GRAPH <http://example.com/graph>
>>>    { ?s ?p ?o }
>>> }
>>> WHERE {
>>>   GRAPH <http://example.com/graph>
>>>     { ?s ?p ?o }
>>> }
>>> LIMIT 1000
>>> 
>>> … but this times out too (which surprised me, as I only asked it to
>>> find and DELETE 1000 triples).
>>> 
>>> What is the recommended way to delete this graph - I need to replace
>>> its contents fairly urgently on a production system. We loaded it by
>>> loading 10,000 triples at a time, which worked fine, but I’m having
>>> trouble deleting its current contents first.
>>> 
>>> Any pointers appreciated.
>>> Thanks, Ric.
>> 
> 
