Trevor,

See https://issues.apache.org/jira/browse/JENA-804 which discusses some of the reasons why a TDB store can and will grow over time.

add() and putModel() are semantically different:

add() adds the triples in the given model to the given graph if they do not already exist. It is additive ONLY.

putModel() deletes the existing graph and replaces it with the contents of the given model.

Rob

On 12/02/2015 13:49, "Trevor Donaldson" <[email protected]> wrote:

>So I think the difference is in the subtle nuance between putModel and add
>of the datasetAccessor. I ran the following test : (26M)
>
>/******************************************************************************/
>for(int i=0; i<1000;i++){
>    Resource subject = ResourceFactory.createResource("http://example.org/task/"+i);
>    Property predicate = ResourceFactory.createProperty("urn:my:id");
>    // createTypedLiteral returns a Literal, not a Resource
>    Literal object = ResourceFactory.createTypedLiteral(i);
>
>    Statement stmt = model.createStatement(subject,predicate,object);
>    model.add(stmt);
>
>    //I thought this was flushing the queue, boy was I wrong.
>    if(i % 10 == 0){
>        datasetAccessor.putModel(GRAPH_NAME,model);
>    }
>}
>
>datasetAccessor.putModel(GRAPH_NAME,model);
>/******************************************************************************/
>
>Then I ran the following test (568K)
>
>/******************************************************************************/
>
>for(int i=0; i<1000;i++){
>    Resource subject = ResourceFactory.createResource("http://example.org/task/"+i);
>    Property predicate = ResourceFactory.createProperty("urn:my:id");
>    // createTypedLiteral returns a Literal, not a Resource
>    Literal object = ResourceFactory.createTypedLiteral(i);
>
>    Statement stmt = model.createStatement(subject,predicate,object);
>    model.add(stmt);
>
>    //I thought this was flushing the queue, boy was I wrong.
>    if(i % 10 == 0){
>        datasetAccessor.add(GRAPH_NAME,model);
>    }
>}
>
>datasetAccessor.add(GRAPH_NAME,model);
>
>/******************************************************************************/
>
>So my guess is that I need to use add as opposed to putModel. I just need
>to see if add will delete an existing triple.
>
>
>On Thu, Feb 12, 2015 at 7:53 AM, Trevor Donaldson <[email protected]>
>wrote:
>
>> More information....
>>
>> My workflow is as follows :
>> 1. Read data from web service
>> 2. create triples
>> 3. apply some reification
>> 4. after about a 100 or so call datasetAccessor.putModel(<GRAPH>, model);
>>
>> Should I use put model or should I use add? I need to update (delete /
>> insert) triples from time to time that is why I am using put.
>>
>> On Thu, Feb 12, 2015 at 6:32 AM, Trevor Donaldson <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I am in the middle of updating our store from RDB to TDB. I have noticed
>>> a significant size increase in the amount of storage needed. Currently RDB
>>> is able to hold all the data I need (4 third party services and 4 years of
>>> their data) and it equals ~ 12G. I started inserting data from 1 third
>>> party service, only 4 months of their data into TDB and the TDB database
>>> size has already reached 15G. Is this behavior expected? Seems like that is
>>> a lot. I will probably need multiple TBs if this is the expected behavior.
>>>
>>> Thanks,
>>> Trevor
>>>
>>
>>
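
The distinction drawn in this thread between add and putModel can be illustrated with a minimal sketch in plain Java. It does not use Jena. The graph is modelled as a set of strings, and the class GraphStoreSketch and its add/put methods are hypothetical stand-ins for the behaviour described above, not the library's implementation. It shows that an additive update never removes an existing triple, while a put discards whatever the graph held before.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy in-memory model of one named graph. NOT Jena code: triples are plain
// strings and all names here are hypothetical, chosen for illustration only.
final class GraphStoreSketch {

    // "add" semantics: union the incoming triples into the graph.
    // Triples already present are kept; nothing is ever deleted.
    static void add(Set<String> graph, Set<String> incoming) {
        graph.addAll(incoming);
    }

    // "put" semantics: discard the graph's current contents,
    // then load the incoming triples in their place.
    static void put(Set<String> graph, Set<String> incoming) {
        graph.clear();
        graph.addAll(incoming);
    }

    public static void main(String[] args) {
        Set<String> original = Set.of("task/1 id 1", "task/2 id 2");
        Set<String> update = Set.of("task/2 id 2", "task/3 id 3");

        Set<String> afterAdd = new HashSet<>(original);
        add(afterAdd, update);
        System.out.println("add: " + afterAdd.size() + " triples"); // add: 3 triples

        Set<String> afterPut = new HashSet<>(original);
        put(afterPut, update);
        System.out.println("put: " + afterPut.size() + " triples"); // put: 2 triples
    }
}
```

Under this model, a workflow that needs to delete or change individual triples cannot rely on the additive operation alone, and one that only appends new triples has no need to replace the whole graph each time.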
