I am using Fuseki2. I thought it manages the transactions for me. Is this
not the case? I was using DatasetFactory to interact with Fuseki.
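
Concretely, the client-side calls look roughly like this (a minimal
sketch, assuming DatasetAccessorFactory handles the HTTP side and Jena
2.x package names; the service URL and graph name are placeholders):

  import com.hp.hpl.jena.query.DatasetAccessor;
  import com.hp.hpl.jena.query.DatasetAccessorFactory;
  import com.hp.hpl.jena.rdf.model.Model;
  import com.hp.hpl.jena.rdf.model.ModelFactory;

  public class HourlyPut {
      public static void main(String[] args) {
          // Fuseki wraps each HTTP request in its own transaction on the
          // server side; the client never calls begin()/commit() itself.
          DatasetAccessor accessor = DatasetAccessorFactory
                  .createHTTP("http://localhost:3030/test_store/data");

          Model m = ModelFactory.createDefaultModel();
          // ... populate m with the hour's data ...

          // HTTP PUT: replaces the previous contents of the named graph.
          accessor.putModel("http://example.org/graph/hourly", m);
      }
  }
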
On Feb 13, 2015 12:10 PM, "Andy Seaborne" <[email protected]> wrote:

> This may be related:
>
> https://issues.apache.org/jira/browse/JENA-804
>
> I say "may" because the exact patterns of use deep affect the outcome. In
> JENA-804 it is across transaction boundaries, which your "putModel" isn't.
>
> (Are you really running without transactions?)
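>
> For comparison, the explicit pattern when writing to a local TDB dataset
> is below (a minimal sketch; the directory path is a placeholder):
>
>   import com.hp.hpl.jena.query.Dataset;
>   import com.hp.hpl.jena.query.ReadWrite;
>   import com.hp.hpl.jena.tdb.TDBFactory;
>
>   public class TxnExample {
>       public static void main(String[] args) {
>           Dataset ds = TDBFactory.createDataset("databases/test_store");
>           ds.begin(ReadWrite.WRITE);
>           try {
>               // ... add/remove triples via ds.getDefaultModel() etc. ...
>               ds.commit();   // make the write durable
>           } finally {
>               ds.end();      // always release the transaction
>           }
>       }
>   }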
>
>         Andy
>
> On 13/02/15 16:56, Andy Seaborne wrote:
>
>> Does the size stabilise?
>> If not, do some files stabilise in size and others not?
>>
>> There are two places for growth:
>>
>> nodes - does the new data have new RDF terms in it?  Old terms are not
>> deleted, just left around to be reused, so if you are adding terms, the
>> node table can grow.  (Terms are not reference counted - that would be
>> very expensive for such a small data item.)
>>
>> TDB (the current version) does not properly reuse freed-up space in
>> indexes across transactions, but it should do so within a transaction.
>> Put is delete-then-add, so some space should be reused.
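>>
>> At the API level the effect of a put is roughly the following (a
>> sketch; the dataset, graph name and replacement model are placeholders):
>>
>>   Model target = dataset.getNamedModel("http://example.org/graph/hourly");
>>   target.removeAll();        // the delete: frees space in the indexes
>>   target.add(replacement);   // the add: can reuse that space in-transaction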
>>
>> A proper fix to reuse space across transactions may require a database
>> format change, but I haven't had time to work out the details. Off the
>> top of my head, though, much of the reuse should be doable by moving
>> the free-chain management onto the main database during a transaction,
>> since the transaction is the single active writer. The code is
>> currently too cautious about old-generation readers, which I now see
>> it need not be.
>>
>>      Andy
>>
>> On 12/02/15 17:51, Trevor Donaldson wrote:
>>
>>> Any thoughts, anyone? I change my model every hour with new data or
>>> data to replace. Let's say, over a period of inserting years' worth of
>>> triples, should I persist potentially millions of triples at one time
>>> using putModel? Committing one time seems to be the only way to keep
>>> the directory from growing exponentially.
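>>>
>>> By "committing one time" I mean something like this (a sketch; the
>>> dataset, graph name, and how the hourly models are built are all
>>> placeholders):
>>>
>>>   dataset.begin(ReadWrite.WRITE);
>>>   try {
>>>       for (Model hourly : hourlyModels) {
>>>           dataset.getNamedModel(graphName).add(hourly); // batch the adds
>>>       }
>>>       dataset.commit();   // a single commit for the whole batch
>>>   } finally {
>>>       dataset.end();
>>>   }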
>>>
>>> On Thu, Feb 12, 2015 at 9:53 AM, Trevor Donaldson <[email protected]>
>>> wrote:
>>>
>>>> Damian,
>>>>
>>>> I am using du -ksh ./* on the databases directory.
>>>>
>>>> I am getting
>>>> 25M      ./test_store
>>>>
>>>> On Thu, Feb 12, 2015 at 9:35 AM, Damian Steer <[email protected]>
>>>> wrote:
>>>>
>>>>> On 12/02/15 13:49, Trevor Donaldson wrote:
>>>>>
>>>>>> On Thu, Feb 12, 2015 at 6:32 AM, Trevor Donaldson <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am in the middle of updating our store from RDB to TDB. I have
>>>>>>> noticed a significant size increase in the amount of storage needed.
>>>>>>> Currently RDB is able to hold all the data I need (4 third party
>>>>>>> services and 4 years of their data) and it equals ~ 12G. I started
>>>>>>> inserting data from 1 third party service, only 4 months of their
>>>>>>> data, into TDB and the TDB database size has already reached 15G.
>>>>>>> Is this behavior expected?
>>>>> Hi Trevor,
>>>>>
>>>>> How are you measuring the space used? TDB files tend to be sparse, so
>>>>> the disk use reported can be unreliable. Example from my system:
>>>>>
>>>>> 6.2M [...] 264M [...] GOSP.dat
>>>>>
>>>>> The first number (6.2M) is essentially the disk space taken, the second
>>>>> (264M!) is the 'length' of the file.
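>>>>>
>>>>> A quick way to dump the logical lengths for comparison with du is
>>>>> below (a minimal sketch using java.nio; the directory path is a
>>>>> placeholder):
>>>>>
>>>>>   import java.nio.file.DirectoryStream;
>>>>>   import java.nio.file.Files;
>>>>>   import java.nio.file.Path;
>>>>>   import java.nio.file.Paths;
>>>>>
>>>>>   public class TdbFileLengths {
>>>>>       public static void main(String[] args) throws Exception {
>>>>>           // Files.size() reports the file 'length' (the 264M figure),
>>>>>           // not the blocks actually allocated on disk (the 6.2M one).
>>>>>           try (DirectoryStream<Path> ds = Files
>>>>>                   .newDirectoryStream(Paths.get("databases/test_store"))) {
>>>>>               for (Path p : ds) {
>>>>>                   System.out.printf("%12d  %s%n", Files.size(p),
>>>>>                           p.getFileName());
>>>>>               }
>>>>>           }
>>>>>       }
>>>>>   }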
>>>>>
>>>>> Damian
>>>>>
>>>>> --
>>>>> Damian Steer
>>>>> Senior Technical Researcher
>>>>> Research IT
>>>>> +44 (0) 117 928 7057
>>>>>
