Thanks so much, Andy. I was thinking that deleting a batch at a time might work, but I wasn't sure whether that was the right plan. Thanks again.
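
For anyone finding this thread later, here is a minimal sketch of the "delete a piece at a time" loop from Andy's reply (quoted below). It is not a tested Fuseki client. It assumes the server exposes standard SPARQL 1.1 Protocol update and query endpoints. The function names, the max_batches guard, and the endpoint URLs in the usage note are made up for illustration. The update and ASK strings are the ones from the quoted reply with the graph URI and limit filled in.

```python
import json
import urllib.parse
import urllib.request


def build_batch_delete(graph_uri, limit):
    """Batched DELETE from the quoted reply, with graph URI and limit filled in."""
    return (
        f"WITH <{graph_uri}>\n"
        "DELETE { ?s ?p ?o }\n"
        "WHERE\n"
        "  { SELECT ?s ?p ?o\n"
        "    WHERE { ?s ?p ?o }\n"
        f"    LIMIT {int(limit)}\n"
        "  }"
    )


def build_ask(graph_uri):
    """ASK query that is true while the named graph still holds any triple."""
    return f"ASK {{ GRAPH <{graph_uri}> {{ ?s ?p ?o }} }}"


def delete_graph_in_batches(run_update, run_ask, graph_uri,
                            limit=100000, max_batches=10000):
    """Apply the batched DELETE until the ASK returns false.

    run_update(update_string) and run_ask(query_string) -> bool are supplied
    by the caller. Each update is its own transaction, so the partly deleted
    graph is visible to other readers while this runs.
    Returns the number of update requests sent.
    """
    ask = build_ask(graph_uri)
    update = build_batch_delete(graph_uri, limit)
    batches = 0
    while run_ask(ask):
        if batches >= max_batches:  # hypothetical safety stop, not from the thread
            raise RuntimeError("graph still not empty after max_batches updates")
        run_update(update)
        batches += 1
    return batches


def http_update(endpoint):
    """Return a run_update callable that POSTs to a SPARQL update endpoint."""
    def run(update):
        req = urllib.request.Request(
            endpoint,
            data=update.encode("utf-8"),
            headers={"Content-Type": "application/sparql-update"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            resp.read()
    return run


def http_ask(endpoint):
    """Return a run_ask callable that POSTs an ASK query and reads the JSON boolean."""
    def run(query):
        data = urllib.parse.urlencode({"query": query}).encode("ascii")
        req = urllib.request.Request(
            endpoint,
            data=data,
            headers={"Accept": "application/sparql-results+json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return bool(json.load(resp)["boolean"])
    return run
```

Usage would look something like delete_graph_in_batches(http_update("http://localhost:3030/ds/update"), http_ask("http://localhost:3030/ds/query"), "http://example.org/my_graph"), where both endpoint URLs and the graph URI are placeholders for your own setup. Passing the two callables in keeps the loop itself independent of HTTP, so the limit can be tuned against a test store first.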

On Mon, Apr 6, 2015 at 6:47 AM, Andy Seaborne wrote:

> The reason this runs into space problems is that the TDB transaction system
> uses RAM for uncommitted data.
>
> Ways around this:
>
> 1/ Larger heap.
>
> 2/ "spill to disk" option
>    See TDB.transactionJournalWriteBlockMode
>    --set tdb:transactionJournalWriteBlockMode=mapped
>    (I confess I have never had to use this feature)
>
> 3/ Delete piece at a time:
>    (best if you can do it this way - but note that
>    this is several transactions hence visible while it happens)
>
>    WITH <named_graph>
>    DELETE { ?s ?p ?o }
>    WHERE
>      { SELECT ?s ?p ?o
>        WHERE { ?s ?p ?o }
>        LIMIT 100000
>      } # Tune the limit to work
>
>    apply until nothing deleted.
>    ASK { GRAPH <named_graph> { ?s ?p ?o } }
>    returns false.
>
>
> Long term futures:
>
> A different version of TDB has arbitrary sized transactions.
> The core technology for this works (copy-on-write B+Trees, new generalised
> transaction coordination).
> It is not ready for deployment yet.
> <advert>
> No timescale - I would love to find a way to get
> some personal resourcing/sponsorship
> to make this happen sooner.
> </advert>
>
> Andy
>
> On 03/04/15 17:15, Trevor Donaldson wrote:
>
>> More info. I tried datasetAccessor.deleteModel(graphURI) as well as
>> DROP GRAPH <graphURI> as well as DELETE {GRAPH <graphURI> {?s ?p ?o}} WHERE
>> {GRAPH <graphURI> {?s ?p ?o}}. These all result in the same GC overhead
>> limit exceeded. Unsure on how to remove a named graph.
>>
>> On Fri, Apr 3, 2015 at 10:15 AM, Trevor Donaldson wrote:
>>
>>> More info. I am running CLEAR GRAPH <Named_GRAPH> from the base dataset
>>> service which is a union of all named graphs. There are 7 named graphs in
>>> total. I didn't think this should matter though. If this will not work, is
>>> there another way for me to clear a named graph without running into the
>>> above error?
>>>
>>> Thanks
>>>
>>> On Fri, Apr 3, 2015 at 9:22 AM, Trevor Donaldson wrote:
>>>
>>>> Let's try that again. smh
>>>>
>>>> Additional info.
>>>> java.lang.OutOfMemoryError : GC overhead limit exceeded
>>>> at java.util.HashMap$KeySet.iterator(HashMap.java:912)
>>>> at org.apache.jena.atlas.lib.Map2.iterator(Map2.java:81)
>>>> at com.hp.hpl.jena.tdb.solver.BindingTDB.calcVars(BindingTDB.java:74)
>>>> ...
>>>> at org.apache.jena.fusek.servlets.SPARQL.update.perform(SPARQL_UPDATE.java:105)
>>>>
>>>> On Fri, Apr 3, 2015 at 9:20 AM, Trevor Donaldson wrote:
>>>>
>>>>> Additional info.
>>>>> java.lang.OutOfMemoryError:GC overhead limit exceeded
>>>>>
>>>>> On Thu, Apr 2, 2015 at 4:33 PM, Trevor Donaldson wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> I have a named graph with 10mil+ quads. I am trying to clear the named
>>>>>> graph. In order to do this I am using CLEAR GRAPH <NAMED_GRAPH>. I am
>>>>>> using Fuseki2 and the Fuseki WAR. Any ideas why I am running into a Java
>>>>>> heap space error?
>>>>>>
>>>>>> Thanks
