Hi Ric,

You could try setting two properties for your dataset:

<#yourdatasetname> rdf:type tdb:DatasetTDB ;
   ja:context [ ja:cxtName "tdb:transactionJournalWriteBlockMode" ;
                ja:cxtValue "mapped" ] ;
   ja:context [ ja:cxtName "arq:spillToDiskThreshold" ;
                ja:cxtValue "10000" ] .

The first setting stores uncommitted TDB blocks in a temporary
memory-mapped file; by default those blocks are held in heap memory.
The second causes the update engine to spill the temporary bindings
generated during the delete operation to a temporary file on disk once
the given threshold is passed.  For the first setting, you can also use
"direct", which allocates from process memory outside the JVM heap.

Both of these options should reduce heap usage and may get you to what
you are looking for.  Try just the first option (memory-mapped blocks)
on its own first and see if that does it, since the second option will
likely cost some performance.

But Andy's suggestion of breaking up the query with limited subselects
should also work for you, since it likewise limits heap usage.
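In case it helps, Andy's DELETE-with-subselect suggestion would look
something like this (graph URI taken from your earlier message); run it
repeatedly until the graph is empty:

    DELETE { GRAPH <http://example.com/graph> { ?s ?p ?o } }
    WHERE {
      SELECT ?s ?p ?o
      WHERE { GRAPH <http://example.com/graph> { ?s ?p ?o } }
      LIMIT 10000
    }

Each run removes at most 10,000 triples in its own transaction, so the
heap needed per request stays bounded.  (A plain DELETE ... WHERE with a
trailing LIMIT, as in your earlier attempt, isn't valid SPARQL Update -
the LIMIT has to go inside a subselect.)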

-Stephen


On Wed, Jul 15, 2015 at 7:04 AM, Ric Roberts <[email protected]> wrote:

> Hello again. I’ve tried upgrading to Fuseki 1.1.2, and it now gives a heap
> space error after only 6 minutes (instead of 60-odd).
>
> There are a few hundred graphs in the database, but most of them small
> (apart from the one I’m trying to delete).
>
> Does this mean that using the Graph Protocol is a no-go? I’ll try the
> batched deletes...
>
>
> > On 13 Jul 2015, at 22:51, Andy Seaborne <[email protected]> wrote:
> >
> > On 13/07/15 21:31, Andy Seaborne wrote:
> >> Hi Ric,
> >>
> >> Could you please try Fuseki 1.1.2 or Fuseki 2.0.0?
> >>
> >> How many datasets does the server host?
> >>
> >> 1.0.1 was Jan 2014 and IIRC this area has changed, especially DELETE of
> >> a graph with the Graph Store Protocol.  However, if this is just due to
> >> transaction overheads (it's not immediately clear it is or is not), then
> >> DELETE {} WHERE { SELECT {...} LIMIT } is the way to go for an immediate
> >> solution.
> >>
> >> TDB1 (i.e. the Jena code) is a bit memory hungry for transactions.
> >>
> >> TDB2 is not memory bound but it isn't in the Jena codebase.  It has been
> >> tested with 100 million triple loads in a single Fuseki2 upload.
> >>
> >> See
> >>   http://www.sparql.org/validate/update
> >
> > That's the service point.
> >
> > http://www.sparql.org/update-validator.html
> >
> > is the HTML form.
> >
> >> for checking syntax.
> >>
> >>     Andy
> >>
> >> On 13/07/15 18:59, Ric Roberts wrote:
> >>> Hi. I’m having problems deleting a moderately large graph from a
> >>> jena-fuseki-1.0.1 database.
> >>>
> >>> The graph contains approximately 60 million triples, and the database
> >>> contains about 70 million triples in total.
> >>>
> >>> I’ve started Fuseki with 16G Heap. (JVM_ARGS=${JVM_ARGS:--Xmx16000M}).
> >>> The server has 32G RAM.
> >>>
> >>> When I issue the DELETE command over http, I see this in the fuseki
> log:
> >>>
> >>> 16:12:03 INFO  [24] DELETE
> >>> http://127.0.0.1:3030/stagingdb/data?graph=http://example.com/graph
> >>> 17:10:40 WARN  [24] RC = 500 : Java heap space
> >>> 17:10:40 INFO  [24] 500 Java heap space (3,517.614 s)
> >>>
> >>> i.e. it takes about an hour, and then 500s with an error about heap
> >>> space.
> >>>
> >>> I’ve also tried DROP and CLEAR SPARQL update statements but they
> >>> timeout with our default endpoint timeout of 30s.
> >>>
> >>> I’ve also tried deleting 1000 triples at a time, from the graph by
> >>> issuing a sparql update statement like this:
> >>>
> >>> DELETE {
> >>>  GRAPH <http://example.com/graph>
> >>>    { ?s ?p ?o }
> >>> }
> >>> WHERE {
> >>>   GRAPH <http://example.com/graph>
> >>>     { ?s ?p ?o }
> >>> }
> >>> LIMIT 1000
> >>>
> >>> … but this times out too (which surprised me, as I only asked it to
> >>> find and DELETE 1000 triples).
> >>>
> >>> What is the recommended way to delete this graph - I need to replace
> >>> its contents fairly urgently on a production system. We loaded it by
> >>> loading 10,000 triples at a time, which worked fine, but I’m having
> >>> trouble deleting its current contents first.
> >>>
> >>> Any pointers appreciated.
> >>> Thanks, Ric.
> >>
> >
>
>
