Re: Storing a lot of strings in TDB store

Andy Seaborne Thu, 07 Mar 2019 05:06:07 -0800

At the level of that description, they are much the same.


TDB2 differs in the actual inline encoding of literals (it keeps the datatype).

TDB2 B+Trees are "copy-on-write" (MVCC) and TDB2 has a different transaction mechanism, resulting in arbitrarily large transaction changes being supported.

The TDB2 bulk loader is much faster (although it could be backported to TDB1; it is not fundamental to the TDB2 disk layout).


    Andy

On 06/03/2019 12:38, Siddhesh Rane wrote:

That's for TDB1, right? Is there a document for TDB2? I couldn't find one.

Regards
Siddhesh


On Fri, 22 Feb 2019, 8:48 pm Rob Vesse, <[email protected]> wrote:

It's here - http://jena.apache.org/documentation/tdb/architecture.html

Rob

On 22/02/2019, 04:03, "Ekaterina Danilova" <[email protected]>
wrote:

     Thank you, it was exactly what I needed. It is still nice to hear what
     others think about my idea of data storage as resources, and I think I
     will stick to that option, but the TDB storage logic was quite unclear
     to me. It would be great if it was mentioned in the official
     documentation, since I couldn't find it.
     Thanks again for your help

     On Tue, 19 Feb 2019 at 20:40, Rob Vesse <[email protected]> wrote:

     > Since I don't think anyone answered your specific original question:
     >
     > TDB and TDB2 both use dictionary encoding (and in fact most RDF stores
     > use some variation on this).  Basically they map each unique RDF term
     > (whether URI, string, blank node etc) to a consistent internal
     > identifier and use this to refer to the term.  Therefore most data
     > structures internally are implemented in terms of these internal
     > identifiers (which are typically very compact, TDB/TDB2 use 64 bit
     > identifiers) and the system only translates between the internal
     > identifier and the full RDF term when explicitly needed e.g. when
     > presenting results.
     >
     > Rob
     >
     > On 15/02/2019, 06:03, "Ekaterina Danilova" <[email protected]>
     > wrote:
     >
     >     I would like to ask how TDB2 and Fuseki manage large amounts of
     >     string data (especially repeated data) and what the best practices
     >     are. Do they optimize it somehow?
     >
     >
     >
     >
     >
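The dictionary encoding Rob describes above can be illustrated with a minimal sketch. This is not TDB's code or API: the class `TermDictionary`, its method names, and the example.org URIs are all hypothetical, and plain Python integers stand in for the 64-bit identifiers. It only shows the general idea that a repeated string is stored once and referred to by a compact id everywhere else.

```python
class TermDictionary:
    """Toy dictionary encoder (hypothetical, not TDB's implementation).

    Maps each distinct RDF term to one integer id and back.
    """

    def __init__(self):
        self._id_by_term = {}   # term -> id
        self._term_by_id = []   # id -> term (the id is the list index)

    def encode(self, term: str) -> int:
        """Return the id for a term, allocating a new id on first sight."""
        node_id = self._id_by_term.get(term)
        if node_id is None:
            node_id = len(self._term_by_id)
            self._id_by_term[term] = node_id
            self._term_by_id.append(term)
        return node_id

    def decode(self, node_id: int) -> str:
        """Translate an id back to the full term, e.g. when presenting results."""
        return self._term_by_id[node_id]


terms = TermDictionary()
long_text = '"a long, frequently repeated string literal"'

# Two triples that share the same predicate and the same long literal.
triples = [
    ("<http://example.org/doc1>", "<http://example.org/text>", long_text),
    ("<http://example.org/doc2>", "<http://example.org/text>", long_text),
]

# Triples are held as tuples of ids; the long literal is stored only once.
encoded = [tuple(terms.encode(t) for t in triple) for triple in triples]

# Ids are turned back into full terms only when needed.
decoded = [tuple(terms.decode(i) for i in triple) for triple in encoded]
```

In this sketch the six term occurrences in the two triples collapse to four dictionary entries, and the shared literal gets the same id in both triples. That is the sense in which repeated strings are "optimized": the cost of repetition is one extra id per occurrence, not another copy of the string.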
