TDB's design is described in the official documentation: https://jena.apache.org/documentation/tdb/architecture.html
ajs6f

> On Feb 22, 2019, at 5:02 AM, Ekaterina Danilova <[email protected]> wrote:
>
> Thank you, it was exactly what I needed. It is still nice to hear what
> others think about my idea of data storage as resources and I think I will
> stick to that option, but TDB storage logic was quite unclear to me. Would
> be great if it was mentioned in official documentation since I couldn't
> find it.
> Thanks again for your help
>
> On Tue, 19 Feb 2019 at 20:40, Rob Vesse <[email protected]> wrote:
>
>> Since I don't think anyone answered your specific original question
>>
>> TDB and TDB2 both use dictionary encoding (and in fact most RDF stores use
>> some variation on this). Basically they map each unique RDF term (whether
>> URI, string, blank node etc) to a consistent internal identifier and use
>> this to refer to the term. Therefore most data structures internally are
>> implemented in terms of these internal identifiers (which are typically
>> very compact, TDB/TDB2 use 64 bit identifiers) and the system only
>> translates between the internal identifier and the full RDF term when
>> explicitly needed e.g. when presenting results
>>
>> Rob
>>
>> On 15/02/2019, 06:03, "Ekaterina Danilova" <[email protected]> wrote:
>>
>>     I would like to ask how TDB2 and Fuseki manage big amounts of string
>>     data (especially repeating data) and what the best practices are.
>>     Do they optimize it somehow?
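
To make the dictionary encoding Rob describes in the quoted thread more concrete, here is a minimal sketch in Python. It is a toy illustration only, not TDB's or TDB2's actual code: the class names TermDictionary and TripleStore, the sequential integer ids, and the example.org URIs are all assumptions made up for this example. It shows a repeated long string being stored once, with the triples holding only compact ids.

```python
class TermDictionary:
    """Toy dictionary encoder (hypothetical, not a Jena class).

    Maps each distinct RDF term to a small integer id and back.
    """

    def __init__(self):
        self._id_by_term = {}   # term -> id
        self._term_by_id = []   # id -> term (the list index is the id)

    def __len__(self):
        return len(self._term_by_id)

    def lookup(self, term):
        """Return the id of a known term, or None if it was never stored."""
        return self._id_by_term.get(term)

    def encode(self, term):
        """Return the id for a term, allocating a new id on first sight."""
        term_id = self._id_by_term.get(term)
        if term_id is None:
            term_id = len(self._term_by_id)
            self._id_by_term[term] = term_id
            self._term_by_id.append(term)
        return term_id

    def decode(self, term_id):
        """Translate an id back to the full term."""
        return self._term_by_id[term_id]


class TripleStore:
    """Toy store (hypothetical) that keeps triples as tuples of ids only."""

    def __init__(self):
        self.terms = TermDictionary()
        self.triples = set()

    def add(self, s, p, o):
        self.triples.add(tuple(self.terms.encode(t) for t in (s, p, o)))

    def find_by_object(self, o):
        """Match on ids, and decode to full terms only for the results."""
        o_id = self.terms.lookup(o)
        if o_id is None:
            return []
        return [
            tuple(self.terms.decode(i) for i in triple)
            for triple in sorted(self.triples)
            if triple[2] == o_id
        ]


# Two resources share the same long string value.
long_text = "A long description that is repeated on many resources. " * 10
store = TripleStore()
store.add("http://example.org/a", "http://example.org/desc", long_text)
store.add("http://example.org/b", "http://example.org/desc", long_text)

print(len(store.terms), "distinct terms stored for", len(store.triples), "triples")
```

In this sketch the long string appears in the dictionary once, however many triples refer to it, and the full terms are only reconstructed when results are returned.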
