Trevor

See https://issues.apache.org/jira/browse/JENA-804, which discusses some of
the reasons why a TDB store can and will grow over time.

add() and putModel() are semantically different.

add() adds the triples in the given model to the given graph if they do
not already exist. It is additive ONLY, so it never deletes an existing
triple.

putModel() deletes the existing graph and replaces it with the contents
of the given model.
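
To make the difference concrete, here is a rough sketch of the two
behaviours using a toy in-memory store. This is NOT Jena code: the class
GraphStoreSketch, its methods, and the string "triples" are all made up for
illustration, and only the add-versus-replace semantics are meant to carry
over to the datasetAccessor.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Toy model of a graph store: graph name -> set of triples (plain strings).
// Hypothetical illustration only; this is not how Jena or TDB are implemented.
class GraphStoreSketch {
    private final Map<String, Set<String>> graphs = new HashMap<>();

    // Mirrors the semantics of add(): union the given triples into the graph.
    // Triples already present are left alone and nothing is ever removed.
    void add(String graphName, Set<String> triples) {
        graphs.computeIfAbsent(graphName, k -> new LinkedHashSet<>()).addAll(triples);
    }

    // Mirrors the semantics of putModel(): discard whatever the graph held
    // and store a copy of the given triples in its place.
    void put(String graphName, Set<String> triples) {
        graphs.put(graphName, new LinkedHashSet<>(triples));
    }

    // Returns the triples of the named graph, or an empty set if it is absent.
    Set<String> get(String graphName) {
        return graphs.getOrDefault(graphName, Set.of());
    }

    public static void main(String[] args) {
        GraphStoreSketch store = new GraphStoreSketch();

        store.add("g", Set.of("s1 p o1"));
        store.add("g", Set.of("s1 p o1", "s2 p o2")); // the duplicate is ignored
        System.out.println(store.get("g").size());    // prints 2

        store.put("g", Set.of("s3 p o3"));            // replaces the whole graph
        System.out.println(store.get("g"));           // prints [s3 p o3]
    }
}
```

In your loop the model keeps accumulating, so each putModel() call deletes
the graph and re-inserts everything collected so far, whereas each add()
call only inserts the triples that are not already there.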

Rob

On 12/02/2015 13:49, "Trevor Donaldson" <[email protected]> wrote:

>So I think the difference comes down to putModel versus add on the
>datasetAccessor. I ran the following test (resulting store size: 26M):
>
>/******************************************************************************/
>for(int i=0; i<1000;i++){
>  Resource subject = ResourceFactory.createResource("http://example.org/task/"+i);
>  Property predicate = ResourceFactory.createProperty("urn:my:id");
>  // createTypedLiteral returns a Literal, not a Resource
>  Literal object = ResourceFactory.createTypedLiteral(i);
>
>  Statement stmt = model.createStatement(subject,predicate,object);
>  model.add(stmt);
>
>  //I thought this was flushing the queue, boy was I wrong.
>  if(i % 10 == 0){
>     datasetAccessor.putModel(GRAPH_NAME,model);
>   }
>}
>
>datasetAccessor.putModel(GRAPH_NAME,model);
>/******************************************************************************/
>
>Then I ran the following test (resulting store size: 568K):
>
>/******************************************************************************/
>
>for(int i=0; i<1000;i++){
>  Resource subject = ResourceFactory.createResource("http://example.org/task/"+i);
>  Property predicate = ResourceFactory.createProperty("urn:my:id");
>  // createTypedLiteral returns a Literal, not a Resource
>  Literal object = ResourceFactory.createTypedLiteral(i);
>
>  Statement stmt = model.createStatement(subject,predicate,object);
>  model.add(stmt);
>
>  //I thought this was flushing the queue, boy was I wrong.
>  if(i % 10 == 0){
>     datasetAccessor.add(GRAPH_NAME,model);
>   }
>}
>
>datasetAccessor.add(GRAPH_NAME,model);
>
>/******************************************************************************/
>
>So my guess is that I need to use add as opposed to putModel. I just need
>to see if add will delete an existing triple.
>
>
>On Thu, Feb 12, 2015 at 7:53 AM, Trevor Donaldson <[email protected]>
>wrote:
>
>> More information....
>>
>> My workflow is as follows :
>> 1. Read data from web service
>> 2. create triples
>> 3. apply some reification
>> 4. after about 100 or so, call datasetAccessor.putModel(<GRAPH>, model);
>>
>> Should I use putModel or should I use add? I need to update (delete /
>> insert) triples from time to time, which is why I am using put.
>>
>> On Thu, Feb 12, 2015 at 6:32 AM, Trevor Donaldson <[email protected]>
>> wrote:
>>
>>> Hi,
>>>
>>> I am in the middle of updating our store from RDB to TDB. I have
>>> noticed a significant size increase in the amount of storage needed.
>>> Currently RDB is able to hold all the data I need (4 third party
>>> services and 4 years of their data) and it equals ~ 12G. I started
>>> inserting data from 1 third party service, only 4 months of their data,
>>> into TDB and the TDB database size has already reached 15G. Is this
>>> behavior expected? Seems like that is a lot. I will probably need
>>> multiple TBs if this is the expected behavior.
>>>
>>> Thanks,
>>> Trevor
>>>
>>
>>



