Thanks - I’ll look into that, Stephen.

I’d been trying it in 50k-triple chunks, but the performance isn’t great: it takes about 1 min per request.
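
For reference, each chunk is deleted with a limited subselect along these lines (the graph URI is an example, and 50000 is the batch size I’ve been using):

```sparql
DELETE { GRAPH <http://example.com/graph> { ?s ?p ?o } }
WHERE {
  SELECT ?s ?p ?o
  WHERE { GRAPH <http://example.com/graph> { ?s ?p ?o } }
  LIMIT 50000
}
```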


> On 16 Jul 2015, at 20:37, Stephen Allen <[email protected]> wrote:
> 
> Hi Ric,
> 
> You could try setting two properties for your dataset:
> 
> <#yourdatasetname> rdf:type tdb:DatasetTDB ;
>   ja:context [ ja:cxtName "tdb:transactionJournalWriteBlockMode" ;
>                ja:cxtValue "mapped" ] ;
>   ja:context [ ja:cxtName "arq:spillToDiskThreshold" ;
>                ja:cxtValue 10000 ] .
> 
> The first one will use a temporary memory mapped file for storing
> uncommitted TDB blocks.  By default the blocks are stored in heap memory.
> The second option will cause the update engine to store temporary bindings
> generated during the delete operation to be written out to a temporary file
> on disk after the specified threshold is passed.  For the first option, you
> can also use "direct", which will use process heap instead of JVM heap.
> 
> Both of these options should reduce the heap usage and maybe get you to
> what you are looking for.  Try just the first option (memory mapped blocks)
> first and see if that does it, since the second option will likely reduce
> performance a bit.
> 
> But Andy's suggestion of breaking up the query with limited subselects
> should be working for you, since it also limits the size of the heap.
> 
> -Stephen
> 
> 
> On Wed, Jul 15, 2015 at 7:04 AM, Ric Roberts <[email protected]> wrote:
> 
>> Hello again. I’ve tried upgrading to Fuseki 1.1.2, and it now gives a heap
>> space error after only 6 minutes (instead of 60-odd).
>> 
>> There are a few hundred graphs in the database, but most of them small
>> (apart from the one I’m trying to delete).
>> 
>> Does this mean that using the Graph Protocol is a no-go? I’ll try the
>> batched deletes...
>> 
>> 
>>> On 13 Jul 2015, at 22:51, Andy Seaborne <[email protected]> wrote:
>>> 
>>> On 13/07/15 21:31, Andy Seaborne wrote:
>>>> Hi Ric,
>>>> 
>>>> Could you please try Fuseki 1.1.2 or Fuseki 2.0.0?
>>>> 
>>>> How many datasets does the server host?
>>>> 
>>>> 1.0.1 was Jan 2014 and IIRC this area has changed, especially DELETE of
>>>> a graph with the Graph Store Protocol.  However, if this is just due to
>>>> transaction overheads (it's not immediately clear it is or is not), then
>>>> DELETE {} WHERE { SELECT {...} LIMIT } is the way to go for an immediate
>>>> solution.
>>>> 
>>>> TDB1 (i.e. the Jena code) is a bit memory hungry for transactions.
>>>> 
>>>> TDB2 is not memory bound but it isn't in the Jena codebase.  It has been
>>>> tested with 100 million triple loads in a single Fuseki2 upload.
>>>> 
>>>> See
>>>>  http://www.sparql.org/validate/update
>>> 
>>> That's the service point.
>>> 
>>> http://www.sparql.org/update-validator.html
>>> 
>>> is the HTML form.
>>> 
>>>> for checking syntax.
>>>> 
>>>>    Andy
>>>> 
>>>> On 13/07/15 18:59, Ric Roberts wrote:
>>>>> Hi. I’m having problems deleting a moderately large graph from a
>>>>> jena-fuseki-1.0.1 database.
>>>>> 
>>>>> The graph contains approximately 60 million triples, and the database
>>>>> contains about 70 million triples in total.
>>>>> 
>>>>> I’ve started Fuseki with 16G heap (JVM_ARGS=${JVM_ARGS:--Xmx16000M}).
>>>>> The server has 32G RAM.
>>>>> 
>>>>> When I issue the DELETE command over http, I see this in the fuseki
>> log:
>>>>> 
>>>>> 16:12:03 INFO  [24] DELETE
>>>>> http://127.0.0.1:3030/stagingdb/data?graph=http://example.com/graph
>>>>> 17:10:40 WARN  [24] RC = 500 : Java heap space
>>>>> 17:10:40 INFO  [24] 500 Java heap space (3,517.614 s)
>>>>> 
>>>>> i.e. it takes about an hour, and then 500s with an error about heap
>>>>> space.
>>>>> 
>>>>> I’ve also tried DROP and CLEAR SPARQL update statements but they
>>>>> timeout with our default endpoint timeout of 30s.
>>>>> 
>>>>> I’ve also tried deleting 1000 triples at a time, from the graph by
>>>>> issuing a sparql update statement like this:
>>>>> 
>>>>> DELETE {
>>>>> GRAPH <http://example.com/graph>
>>>>>   { ?s ?p ?o }
>>>>> }
>>>>> WHERE {
>>>>>  GRAPH <http://example.com/graph>
>>>>>    { ?s ?p ?o }
>>>>> }
>>>>> LIMIT 1000
>>>>> 
>>>>> … but this times out too (which surprised me, as I only asked it to
>>>>> find and DELETE 1000 triples).
>>>>> 
>>>>> What is the recommended way to delete this graph - I need to replace
>>>>> its contents fairly urgently on a production system. We loaded it by
>>>>> loading 10,000 triples at a time, which worked fine, but I’m having
>>>>> trouble deleting its current contents first.
>>>>> 
>>>>> Any pointers appreciated.
>>>>> Thanks, Ric.
>>>> 
>>> 
>> 
>> 