Re: Storing a lot of strings in TDB store

ajs6f Fri, 22 Feb 2019 05:17:02 -0800

TDB's design is described in the official documentation here:

https://jena.apache.org/documentation/tdb/architecture.html


ajs6f

> On Feb 22, 2019, at 5:02 AM, Ekaterina Danilova <[email protected]> 
> wrote:
> 
> Thank you, it was exactly what I needed. It is still nice to hear what
> others think about my idea of data storage as resources and I think I will
> stick to that option, but TDB storage logic was quite unclear to me. Would
> be great if it was mentioned in official documentation since I couldn't
> find it.
> Thanks again for your help
> 
> On Tue, 19 Feb 2019 at 20:40, Rob Vesse <[email protected]> wrote:
> 
>> Since I don't think anyone answered your specific original question:
>> 
>> TDB and TDB2 both use dictionary encoding (and in fact most RDF stores use
>> some variation on this).  Basically they map each unique RDF term (whether
>> URI, string, blank node etc) to a consistent internal identifier and use
>> this to refer to the term.  Therefore most data structures internally are
>> implemented in terms of these internal identifiers (which are typically
>> very compact, TDB/TDB2 use 64 bit identifiers) and the system only
>> translates between the internal identifier and the full RDF term when
>> explicitly needed, e.g. when presenting results.
>> 
>> Rob
>> 
>> On 15/02/2019, 06:03, "Ekaterina Danilova" <[email protected]>
>> wrote:
>> 
>>    I would like to ask how TDB2 and Fuseki manage large amounts of string
>> data
>>    (especially repeating data) and what the best practices are. Do they
>>    optimize it somehow?
>> 
>> 
>> 
>> 
>>
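
For readers of the archive, the dictionary encoding described in Rob's reply above can be illustrated with a small sketch. It is written in Python rather than Jena's Java, and it is not TDB's actual implementation: the class names TermDictionary and TripleTable, the in-memory dict and list, and the sequential integer ids are all assumptions made for illustration. The only point it tries to show is that each distinct term is stored once and triples hold compact ids.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple


class TermDictionary:
    """Hypothetical two-way map between RDF terms and integer ids."""

    def __init__(self) -> None:
        self._id_by_term: Dict[str, int] = {}
        self._term_by_id: List[str] = []

    def encode(self, term: str) -> int:
        """Return the id for a term, assigning a new one on first sight."""
        node_id = self._id_by_term.get(term)
        if node_id is None:
            # Sequential ids are an assumption of this sketch; a real
            # store would use fixed-width (e.g. 64 bit) identifiers.
            node_id = len(self._term_by_id)
            self._id_by_term[term] = node_id
            self._term_by_id.append(term)
        return node_id

    def lookup(self, term: str) -> Optional[int]:
        """Return the id for a term without creating one."""
        return self._id_by_term.get(term)

    def decode(self, node_id: int) -> str:
        """Translate an id back to the full term."""
        return self._term_by_id[node_id]

    def __len__(self) -> int:
        return len(self._term_by_id)


class TripleTable:
    """Hypothetical triple store that holds only ids, never raw terms."""

    def __init__(self) -> None:
        self.terms = TermDictionary()
        self.triples: Set[Tuple[int, int, int]] = set()

    def add(self, s: str, p: str, o: str) -> None:
        enc = self.terms.encode
        self.triples.add((enc(s), enc(p), enc(o)))

    def find(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> Iterator[Tuple[str, str, str]]:
        """Match a pattern (None is a wildcard) and decode only the results."""
        pattern: List[Optional[int]] = []
        for term in (s, p, o):
            if term is None:
                pattern.append(None)
                continue
            node_id = self.terms.lookup(term)
            if node_id is None:
                return  # A term never stored cannot match anything.
            pattern.append(node_id)
        for triple in self.triples:
            if all(w is None or w == g for w, g in zip(pattern, triple)):
                yield tuple(self.terms.decode(i) for i in triple)
```

In this sketch, adding two triples that share the same long literal stores that literal once in the dictionary; both triples refer to it by the same id, and the full string is only reconstructed when find() yields a result.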
