Hello,
Thank you very much for your comment.
I have now checked the facts: in November we did use tdbloader2 for our
import, and in April I used tdbloader.
Could you please give me some more information on the updates?
If I use the tdbupdate tool after loading with tdbloader2, is the benefit of
the smaller (in theory faster) index lost?
Is there some other way to do incremental updates without losing it?
The requirement is that we update the store after the initial load.
Ewa

---------- Forwarded message ----------
From: bwm-epimorphics <[email protected]>
Date: 2014-05-19 11:41 GMT+01:00
Subject: Re: Freebase data on Jena TDB
To: [email protected]



On 19/05/14 11:26, Ewa Szwed wrote:

> Hi Brian - I was using tdbloader for both November and April imports - I
> have tested it before and for freebase data set it works better than
> tdbloader2.
> tdbloader2 had a faster data importing phase but a much slower indexing
> phase, so the total import time was longer than with tdbloader in my
> case.
>
Yes. For some of mine too.

The reason I asked is that, as Andy mentioned, tdbloader2 tends to
generate a significantly more compact set of files and as a result
tdb can go a bit faster.  That advantage goes away if you then update
the database.  If you are loading a tdb image and then not updating it,
it might be worth the wait for tdbloader2.

Brian
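
As a rough illustration of the compactness point, the figures reported
further down this thread can be turned into load rate and index bytes per
triple. This is only a sketch: it assumes "GB" means 10**9 bytes, the
function and variable names are made up for the example, and the two dumps
differ in content, so the comparison is indicative rather than controlled.

```python
# Back-of-the-envelope comparison of the two loads reported in this thread.
# Assumption: "GB" means 10**9 bytes; the thread does not say which unit was used.

SECONDS_PER_DAY = 86_400
BYTES_PER_GB = 10**9  # assumed decimal gigabytes


def summarize(seconds, triples, size_gb):
    """Return derived figures for one import: days, load rate, index density."""
    return {
        "days": seconds / SECONDS_PER_DAY,
        "triples_per_sec": triples / seconds,
        "bytes_per_triple": size_gb * BYTES_PER_GB / triples,
    }


# Figures copied from the table in the original report.
november = summarize(seconds=262_052, triples=1_826_551_456, size_gb=174)
april = summarize(seconds=716_121, triples=2_489_221_915, size_gb=333)

for name, stats in (("November 2013", november), ("April 2014", april)):
    print(
        f"{name}: {stats['days']:.2f} days, "
        f"{stats['triples_per_sec']:.0f} triples/s, "
        f"{stats['bytes_per_triple']:.0f} bytes/triple"
    )
```

On these numbers the November store works out to roughly 95 bytes per
triple and the April store to roughly 134, while the load rate is about
half. That is in line with a less compact on-disk layout for the April
load, though it does not by itself explain the query slowdown.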



>
> 2014-05-14 10:00 GMT+01:00 bwm-epimorphics <[email protected]>:
>
>> How did you load the TDB store?  Is it possible you used tdbloader2 for
>> the first load and tdbloader for the second?
>>
>> Brian
>>
>>
>> On 13/05/14 14:13, Ewa Szwed wrote:
>>
>>> I have the following problem with my Jena TDB instance.
>>> In November last year I loaded a freebase dump into Jena TDB. I was
>>> able to work with it reasonably well and got quite good performance for
>>> most of my queries.
>>> Recently I updated my Jena TDB store with a dump from April.
>>> Here are some numbers to show the difference between these 2 instances.
>>>
>>>                          November 2013      April 2014
>>> Full time of import      262,052 sec        716,121 sec
>>>                          (3.03 days)        (8.29 days)
>>> Number of triples        1,826,551,456      2,489,221,915
>>> Index size (whole dir)   174 GB             333 GB
>>>
>>> My problem is that my new instance is not performing at all.
>>> Queries that previously ran for a couple of minutes now take a couple of
>>> hours, which is not acceptable for my business. :(
>>> So I would like to ask if there is a practical limit on index size for Jena
>>> TDB. Is there anything I can do to improve its performance?
>>> Is this significant drop in performance something expected, or do I have
>>> something fundamentally wrong in my setup, which I would need to track
>>> down and fix?
>>> Please advise.
>>> Regards,
>>> Ewa Szwed
>>>
>>>
>> --
>> Epimorphics Ltd (http://www.epimorphics.com)
>>
>> Epimorphics Ltd. is a limited company registered in England (number
>> 7016688)
>> Registered address: Court Lodge, 105 High Street, Portishead, Bristol BS20
>> 6PT, UK
>>
>>
>>